diff options
Diffstat (limited to 'Doc')
324 files changed, 30008 insertions, 9998 deletions
diff --git a/Doc/Makefile b/Doc/Makefile index 13411f2..82f5bef 100644 --- a/Doc/Makefile +++ b/Doc/Makefile @@ -41,11 +41,11 @@ help: checkout: @if [ ! -d tools/sphinx ]; then \ echo "Checking out Sphinx..."; \ - svn checkout $(SVNROOT)/external/Sphinx-1.0.7/sphinx tools/sphinx; \ + svn checkout $(SVNROOT)/external/Sphinx-1.2/sphinx tools/sphinx; \ fi @if [ ! -d tools/docutils ]; then \ echo "Checking out Docutils..."; \ - svn checkout $(SVNROOT)/external/docutils-0.6/docutils tools/docutils; \ + svn checkout $(SVNROOT)/external/docutils-0.11/docutils tools/docutils; \ fi @if [ ! -d tools/jinja2 ]; then \ echo "Checking out Jinja..."; \ @@ -53,7 +53,7 @@ checkout: fi @if [ ! -d tools/pygments ]; then \ echo "Checking out Pygments..."; \ - svn checkout $(SVNROOT)/external/Pygments-1.3.1/pygments tools/pygments; \ + svn checkout $(SVNROOT)/external/Pygments-1.6/pygments tools/pygments; \ fi update: clean checkout @@ -186,6 +186,11 @@ serve: autobuild-dev: make update make dist SPHINXOPTS='-A daily=1 -A versionswitcher=1' + -make suspicious + +# for quick rebuilds (HTML only) +autobuild-html: + make html SPHINXOPTS='-A daily=1 -A versionswitcher=1' # for stable releases: only build if not in pre-release stage (alpha, beta, rc) autobuild-stable: @@ -194,3 +199,4 @@ autobuild-stable: exit 1;; \ esac @make autobuild-dev + diff --git a/Doc/README.txt b/Doc/README.txt index c63fc60..a494c89 100644 --- a/Doc/README.txt +++ b/Doc/README.txt @@ -7,14 +7,13 @@ available at http://docs.python.org/download/. Documentation on the authoring Python documentation, including information about both style and markup, is available in the "Documenting Python" chapter of the -documentation. There's also a chapter intended to point out differences to -those familiar with the previous docs written in LaTeX. +documentation. Building the docs ================= -You need to have Python 2.4 or higher installed; the toolset used to build the +You need to have Python 2 installed; the toolset used to build the docs is written in Python. It is called *Sphinx*, it is not included in this tree, but maintained separately. Also needed are the docutils, supplying the base markup that Sphinx uses, Jinja, a templating engine, and optionally @@ -33,6 +32,9 @@ to check out the necessary toolset in the `tools/` subdirectory and build the HTML output files. To view the generated HTML, point your favorite browser at the top-level index `build/html/index.html` after running "make". +On Windows, we try to emulate the Makefile as closely as possible with a +``make.bat`` file. + To use a Python interpreter that's not called ``python``, use the standard way to set Makefile variables, using e.g. :: @@ -73,43 +75,23 @@ Available make targets are: `tools/sphinxext/pyspecific.py` -- pydoc needs these to show topic and keyword help. + * "suspicious", which checks the parsed markup for text that looks like + malformed and thus unconverted reST. + A "make update" updates the Subversion checkouts in `tools/`. Without make ------------ -You'll need to install the Sphinx package, either by checking it out via :: - - svn co http://svn.python.org/projects/external/Sphinx-1.0.7/sphinx tools/sphinx - -or by installing it from PyPI. - -Then, you need to install Docutils, either by checking it out via :: - - svn co http://svn.python.org/projects/external/docutils-0.6/docutils tools/docutils - -or by installing it from http://docutils.sf.net/. - -You also need Jinja2, either by checking it out via :: - - svn co http://svn.python.org/projects/external/Jinja-2.3.1/jinja2 tools/jinja2 - -or by installing it from PyPI. - -You can optionally also install Pygments, either as a checkout via :: - - svn co http://svn.python.org/projects/external/Pygments-1.3.1/pygments tools/pygments - -or from PyPI at http://pypi.python.org/pypi/Pygments. - +Install the Sphinx package and its dependencies from PyPI. -Then, make an output directory, e.g. under `build/`, and run :: +Then, from the ``Docs`` directory, run :: - python tools/sphinx-build.py -b<builder> . build/<outputdirectory> + sphinx-build -b<builder> . build/<builder> -where `<builder>` is one of html, text, latex, or htmlhelp (for explanations see -the make targets above). +where ``<builder>`` is one of html, text, latex, or htmlhelp (for explanations +see the make targets above). Contributing @@ -135,7 +117,7 @@ The Python source is copyrighted, but you can freely use and copy it as long as you don't change or remove the copyright notice: ---------------------------------------------------------------------- -Copyright (c) 2000-2013 Python Software Foundation. +Copyright (c) 2000-2014 Python Software Foundation. All rights reserved. Copyright (c) 2000 BeOpen.com. diff --git a/Doc/about.rst b/Doc/about.rst index d316f09..678168b 100644 --- a/Doc/about.rst +++ b/Doc/about.rst @@ -7,14 +7,15 @@ These documents are generated from `reStructuredText`_ sources by `Sphinx`_, a document processor specifically written for the Python documentation. .. _reStructuredText: http://docutils.sf.net/rst.html -.. _Sphinx: http://sphinx.pocoo.org/ +.. _Sphinx: http://sphinx-doc.org/ .. In the online version of these documents, you can submit comments and suggest changes directly on the documentation pages. -Development of the documentation and its toolchain takes place on the -docs@python.org mailing list. We're always looking for volunteers wanting -to help with the docs, so feel free to send a mail there! +Development of the documentation and its toolchain is an entirely volunteer +effort, just like Python itself. If you want to contribute, please take a +look at the :ref:`reporting-bugs` page for information on how to do so. New +volunteers are always welcome! Many thanks go to: @@ -26,9 +27,6 @@ Many thanks go to: <http://effbot.org/zone/pyref.htm>`_ project from which Sphinx got many good ideas. -See :ref:`reporting-bugs` for information how to report bugs in this -documentation, or Python itself. - Contributors to the Python Documentation ---------------------------------------- diff --git a/Doc/bugs.rst b/Doc/bugs.rst index 3785ccb..847c010 100644 --- a/Doc/bugs.rst +++ b/Doc/bugs.rst @@ -13,15 +13,17 @@ Documentation bugs ================== If you find a bug in this documentation or would like to propose an improvement, -please send an e-mail to docs@python.org describing the bug and where you found -it. If you have a suggestion how to fix it, include that as well. +please submit a bug report on the :ref:`tracker <using-the-tracker>`. If you +have a suggestion how to fix it, include that as well. -docs@python.org is a mailing list run by volunteers; your request will be -noticed, even if it takes a while to be processed. +If you're short on time, you can also email your bug report to docs@python.org. +'docs@' is a mailing list run by volunteers; your request will be noticed, +though it may take a while to be processed. -Of course, if you want a more persistent record of your issue, you can use the -issue tracker for documentation bugs as well. +.. seealso:: + `Documentation bugs`_ on the Python issue tracker +.. _using-the-tracker: Using the Python issue tracker ============================== @@ -62,9 +64,6 @@ taken on the bug. .. seealso:: - `Python Developer's Guide <http://docs.python.org/devguide/>`_ - Detailed description of the issue workflow and developers tools. - `How to Report Bugs Effectively <http://www.chiark.greenend.org.uk/~sgtatham/bugs.html>`_ Article which goes into some detail about how to create a useful bug report. This describes what kind of information is useful and why it is useful. @@ -73,3 +72,16 @@ taken on the bug. Information about writing a good bug report. Some of this is specific to the Mozilla project, but describes general good practices. + +Getting started contributing to Python yourself +=============================================== + +Beyond just reporting bugs that you find, you are also welcome to submit +patches to fix them. You can find more information on how to get started +patching Python in the `Python Developer's Guide`_. If you have questions, +the `core-mentorship mailing list`_ is a friendly place to get answers to +any and all questions pertaining to the process of fixing issues in Python. + +.. _Documentation bugs: http://bugs.python.org/issue?@filter=status&@filter=components&components=4&status=1&@columns=id,activity,title,status&@sort=-activity +.. _Python Developer's Guide: http://docs.python.org/devguide/ +.. _core-mentorship mailing list: https://mail.python.org/mailman/listinfo/core-mentorship/ diff --git a/Doc/c-api/allocation.rst b/Doc/c-api/allocation.rst index e8f60bf..25a867f 100644 --- a/Doc/c-api/allocation.rst +++ b/Doc/c-api/allocation.rst @@ -32,7 +32,7 @@ Allocating Objects on the Heap Allocate a new Python object using the C structure type *TYPE* and the Python type object *type*. Fields not defined by the Python object header are not initialized; the object's reference count will be one. The size of - the memory allocation is determined from the :attr:`tp_basicsize` field of + the memory allocation is determined from the :c:member:`~PyTypeObject.tp_basicsize` field of the type object. @@ -41,7 +41,7 @@ Allocating Objects on the Heap Allocate a new Python object using the C structure type *TYPE* and the Python type object *type*. Fields not defined by the Python object header are not initialized. The allocated memory allows for the *TYPE* structure - plus *size* fields of the size given by the :attr:`tp_itemsize` field of + plus *size* fields of the size given by the :c:member:`~PyTypeObject.tp_itemsize` field of *type*. This is useful for implementing objects like tuples, which are able to determine their size at construction time. Embedding the array of fields into the same allocation decreases the number of allocations, @@ -52,7 +52,7 @@ Allocating Objects on the Heap Releases memory allocated to an object using :c:func:`PyObject_New` or :c:func:`PyObject_NewVar`. This is normally called from the - :attr:`tp_dealloc` handler specified in the object's type. The fields of + :c:member:`~PyTypeObject.tp_dealloc` handler specified in the object's type. The fields of the object should not be accessed after this call as the memory is no longer a valid Python object. diff --git a/Doc/c-api/apiabiversion.rst b/Doc/c-api/apiabiversion.rst new file mode 100644 index 0000000..890a038 --- /dev/null +++ b/Doc/c-api/apiabiversion.rst @@ -0,0 +1,39 @@ +.. highlightlang:: c + +.. _apiabiversion: + +*********************** +API and ABI Versioning +*********************** + +``PY_VERSION_HEX`` is the Python version number encoded in a single integer. + +For example if the ``PY_VERSION_HEX`` is set to ``0x030401a2``, the underlying +version information can be found by treating it as a 32 bit number in +the following manner: + + +-------+-------------------------+------------------------------------------------+ + | Bytes | Bits (big endian order) | Meaning | + +=======+=========================+================================================+ + | ``1`` | ``1-8`` | ``PY_MAJOR_VERSION`` (the ``3`` in | + | | | ``3.4.1a2``) | + +-------+-------------------------+------------------------------------------------+ + | ``2`` | ``9-16`` | ``PY_MINOR_VERSION`` (the ``4`` in | + | | | ``3.4.1a2``) | + +-------+-------------------------+------------------------------------------------+ + | ``3`` | ``17-24`` | ``PY_MICRO_VERSION`` (the ``1`` in | + | | | ``3.4.1a2``) | + +-------+-------------------------+------------------------------------------------+ + | ``4`` | ``25-28`` | ``PY_RELEASE_LEVEL`` (``0xA`` for alpha, | + | | | ``0xB`` for beta, ``0xC`` for release | + | | | candidate and ``0xF`` for final), in this | + | | | case it is alpha. | + +-------+-------------------------+------------------------------------------------+ + | | ``29-32`` | ``PY_RELEASE_SERIAL`` (the ``2`` in | + | | | ``3.4.1a2``, zero for final releases) | + +-------+-------------------------+------------------------------------------------+ + +Thus ``3.4.1a2`` is hexversion ``0x030401a2``. + +All the given macros are defined in :source:`Include/patchlevel.h`. + diff --git a/Doc/c-api/arg.rst b/Doc/c-api/arg.rst index d4dda7c..28a434e 100644 --- a/Doc/c-api/arg.rst +++ b/Doc/c-api/arg.rst @@ -45,6 +45,7 @@ in any early abort case). Unless otherwise stated, buffers are not NUL-terminated. .. note:: + For all ``#`` variants of formats (``s#``, ``y#``, etc.), the type of the length argument (int or :c:type:`Py_ssize_t`) is controlled by defining the macro :c:macro:`PY_SSIZE_T_CLEAN` before including @@ -70,8 +71,7 @@ Unless otherwise stated, buffers are not NUL-terminated. as *converter*. ``s*`` (:class:`str`, :class:`bytes`, :class:`bytearray` or buffer compatible object) [Py_buffer] - This format accepts Unicode objects as well as objects supporting the - buffer protocol. + This format accepts Unicode objects as well as :term:`bytes-like object`\ s. It fills a :c:type:`Py_buffer` structure provided by the caller. In this case the resulting C string may contain embedded NUL bytes. Unicode objects are converted to C strings using ``'utf-8'`` encoding. @@ -101,14 +101,14 @@ Unless otherwise stated, buffers are not NUL-terminated. contain embedded NUL bytes; if it does, a :exc:`TypeError` exception is raised. -``y*`` (:class:`bytes`, :class:`bytearray` or buffer compatible object) [Py_buffer] - This variant on ``s*`` doesn't accept Unicode objects, only objects - supporting the buffer protocol. **This is the recommended way to accept +``y*`` (:class:`bytes`, :class:`bytearray` or :term:`bytes-like object`) [Py_buffer] + This variant on ``s*`` doesn't accept Unicode objects, only + :term:`bytes-like object`\ s. **This is the recommended way to accept binary data.** ``y#`` (:class:`bytes`) [const char \*, int] - This variant on ``s#`` doesn't accept Unicode objects, only bytes-like - objects. + This variant on ``s#`` doesn't accept Unicode objects, only :term:`bytes-like + object`\ s. ``S`` (:class:`bytes`) [PyBytesObject \*] Requires that the Python object is a :class:`bytes` object, without @@ -146,7 +146,7 @@ Unless otherwise stated, buffers are not NUL-terminated. Like ``u#``, but the Python object may also be ``None``, in which case the :c:type:`Py_UNICODE` pointer is set to *NULL*. -``U`` (:class:`str`) [PyUnicodeObject \*] +``U`` (:class:`str`) [PyObject \*] Requires that the Python object is a Unicode object, without attempting any conversion. Raises :exc:`TypeError` if the object is not a Unicode object. The C variable may also be declared as :c:type:`PyObject\*`. @@ -260,9 +260,12 @@ Numbers ``n`` (:class:`int`) [Py_ssize_t] Convert a Python integer to a C :c:type:`Py_ssize_t`. -``c`` (:class:`bytes` of length 1) [char] - Convert a Python byte, represented as a :class:`bytes` object of length 1, - to a C :c:type:`char`. +``c`` (:class:`bytes` or :class:`bytearray` of length 1) [char] + Convert a Python byte, represented as a :class:`bytes` or + :class:`bytearray` object of length 1, to a C :c:type:`char`. + + .. versionchanged:: 3.3 + Allow :class:`bytearray` objects. ``C`` (:class:`str` of length 1) [int] Convert a Python character, represented as a :class:`str` object of @@ -315,6 +318,15 @@ Other objects .. versionchanged:: 3.1 ``Py_CLEANUP_SUPPORTED`` was added. +``p`` (:class:`bool`) [int] + Tests the value passed in for truth (a boolean **p**\redicate) and converts + the result to its equivalent C true/false integer value. + Sets the int to 1 if the expression was true and 0 if it was false. + This accepts any valid Python value. See :ref:`truth` for more + information about how Python tests values for truth. + + .. versionadded:: 3.3 + ``(items)`` (:class:`tuple`) [*matching-items*] The object must be a Python sequence whose length is the number of format units in *items*. The C arguments must correspond to the individual format units in @@ -336,6 +348,15 @@ inside nested parentheses. They are: :c:func:`PyArg_ParseTuple` does not touch the contents of the corresponding C variable(s). +``$`` + :c:func:`PyArg_ParseTupleAndKeywords` only: + Indicates that the remaining arguments in the Python argument list are + keyword-only. Currently, all keyword-only arguments must also be optional + arguments, so ``|`` must always be specified before ``$`` in the format + string. + + .. versionadded:: 3.3 + ``:`` The list of format units ends here; the string after the colon is used as the function name in error messages (the "associated value" of the exception that @@ -493,7 +514,7 @@ Building values ``None`` is returned. ``y`` (:class:`bytes`) [char \*] - This converts a C string to a Python :func:`bytes` object. If the C + This converts a C string to a Python :class:`bytes` object. If the C string pointer is *NULL*, ``None`` is returned. ``y#`` (:class:`bytes`) [char \*, int] diff --git a/Doc/c-api/buffer.rst b/Doc/c-api/buffer.rst index 740b575..a7de37e 100644 --- a/Doc/c-api/buffer.rst +++ b/Doc/c-api/buffer.rst @@ -12,6 +12,7 @@ Buffer Protocol .. sectionauthor:: Greg Stein <gstein@lyra.org> .. sectionauthor:: Benjamin Peterson +.. sectionauthor:: Stefan Krah Certain objects available in Python wrap access to an underlying memory @@ -22,7 +23,7 @@ as image processing or numeric analysis. While each of these types have their own semantics, they share the common characteristic of being backed by a possibly large memory buffer. It is -then desireable, in some situations, to access that buffer directly and +then desirable, in some situations, to access that buffer directly and without intermediate copying. Python provides such a facility at the C level in the form of the :ref:`buffer @@ -62,8 +63,10 @@ isn't needed anymore. Failure to do so could lead to various issues such as resource leaks. -The buffer structure -==================== +.. _buffer-structure: + +Buffer structure +================ Buffer structures (or simply "buffers") are useful as a way to expose the binary data from another object to the Python programmer. They can also be @@ -80,249 +83,414 @@ allows them to be created and copied very simply. When a generic wrapper around a buffer is needed, a :ref:`memoryview <memoryview-objects>` object can be created. +For short instructions how to write an exporting object, see +:ref:`Buffer Object Structures <buffer-structs>`. For obtaining +a buffer, see :c:func:`PyObject_GetBuffer`. .. c:type:: Py_buffer - .. c:member:: void *buf + .. c:member:: void \*buf + + A pointer to the start of the logical structure described by the buffer + fields. This can be any location within the underlying physical memory + block of the exporter. For example, with negative :c:member:`~Py_buffer.strides` + the value may point to the end of the memory block. + + For contiguous arrays, the value points to the beginning of the memory + block. + + .. c:member:: void \*obj + + A new reference to the exporting object. The reference is owned by + the consumer and automatically decremented and set to *NULL* by + :c:func:`PyBuffer_Release`. The field is the equivalent of the return + value of any standard C-API function. - A pointer to the start of the memory for the object. + As a special case, for *temporary* buffers that are wrapped by + :c:func:`PyMemoryView_FromBuffer` or :c:func:`PyBuffer_FillInfo` + this field is *NULL*. In general, exporting objects MUST NOT + use this scheme. .. c:member:: Py_ssize_t len - :noindex: - The total length of the memory in bytes. + ``product(shape) * itemsize``. For contiguous arrays, this is the length + of the underlying memory block. For non-contiguous arrays, it is the length + that the logical structure would have if it were copied to a contiguous + representation. + + Accessing ``((char *)buf)[0] up to ((char *)buf)[len-1]`` is only valid + if the buffer has been obtained by a request that guarantees contiguity. In + most cases such a request will be :c:macro:`PyBUF_SIMPLE` or :c:macro:`PyBUF_WRITABLE`. .. c:member:: int readonly - An indicator of whether the buffer is read only. + An indicator of whether the buffer is read-only. This field is controlled + by the :c:macro:`PyBUF_WRITABLE` flag. + + .. c:member:: Py_ssize_t itemsize + + Item size in bytes of a single element. Same as the value of :func:`struct.calcsize` + called on non-NULL :c:member:`~Py_buffer.format` values. + + Important exception: If a consumer requests a buffer without the + :c:macro:`PyBUF_FORMAT` flag, :c:member:`~Py_Buffer.format` will + be set to *NULL*, but :c:member:`~Py_buffer.itemsize` still has + the value for the original format. + + If :c:member:`~Py_Buffer.shape` is present, the equality + ``product(shape) * itemsize == len`` still holds and the consumer + can use :c:member:`~Py_buffer.itemsize` to navigate the buffer. - .. c:member:: const char *format - :noindex: + If :c:member:`~Py_Buffer.shape` is *NULL* as a result of a :c:macro:`PyBUF_SIMPLE` + or a :c:macro:`PyBUF_WRITABLE` request, the consumer must disregard + :c:member:`~Py_buffer.itemsize` and assume ``itemsize == 1``. - A *NULL* terminated string in :mod:`struct` module style syntax giving - the contents of the elements available through the buffer. If this is - *NULL*, ``"B"`` (unsigned bytes) is assumed. + .. c:member:: const char \*format + + A *NUL* terminated string in :mod:`struct` module style syntax describing + the contents of a single item. If this is *NULL*, ``"B"`` (unsigned bytes) + is assumed. + + This field is controlled by the :c:macro:`PyBUF_FORMAT` flag. .. c:member:: int ndim - The number of dimensions the memory represents as a multi-dimensional - array. If it is 0, :c:data:`strides` and :c:data:`suboffsets` must be - *NULL*. - - .. c:member:: Py_ssize_t *shape - - An array of :c:type:`Py_ssize_t`\s the length of :c:data:`ndim` giving the - shape of the memory as a multi-dimensional array. Note that - ``((*shape)[0] * ... * (*shape)[ndims-1])*itemsize`` should be equal to - :c:data:`len`. - - .. c:member:: Py_ssize_t *strides - - An array of :c:type:`Py_ssize_t`\s the length of :c:data:`ndim` giving the - number of bytes to skip to get to a new element in each dimension. - - .. c:member:: Py_ssize_t *suboffsets - - An array of :c:type:`Py_ssize_t`\s the length of :c:data:`ndim`. If these - suboffset numbers are greater than or equal to 0, then the value stored - along the indicated dimension is a pointer and the suboffset value - dictates how many bytes to add to the pointer after de-referencing. A - suboffset value that it negative indicates that no de-referencing should - occur (striding in a contiguous memory block). - - Here is a function that returns a pointer to the element in an N-D array - pointed to by an N-dimensional index when there are both non-NULL strides - and suboffsets:: - - void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides, - Py_ssize_t *suboffsets, Py_ssize_t *indices) { - char *pointer = (char*)buf; - int i; - for (i = 0; i < ndim; i++) { - pointer += strides[i] * indices[i]; - if (suboffsets[i] >=0 ) { - pointer = *((char**)pointer) + suboffsets[i]; - } - } - return (void*)pointer; - } + The number of dimensions the memory represents as an n-dimensional array. + If it is 0, :c:member:`~Py_Buffer.buf` points to a single item representing + a scalar. In this case, :c:member:`~Py_buffer.shape`, :c:member:`~Py_buffer.strides` + and :c:member:`~Py_buffer.suboffsets` MUST be *NULL*. + The macro :c:macro:`PyBUF_MAX_NDIM` limits the maximum number of dimensions + to 64. Exporters MUST respect this limit, consumers of multi-dimensional + buffers SHOULD be able to handle up to :c:macro:`PyBUF_MAX_NDIM` dimensions. - .. c:member:: Py_ssize_t itemsize + .. c:member:: Py_ssize_t \*shape + + An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim` + indicating the shape of the memory as an n-dimensional array. Note that + ``shape[0] * ... * shape[ndim-1] * itemsize`` MUST be equal to + :c:member:`~Py_buffer.len`. + + Shape values are restricted to ``shape[n] >= 0``. The case + ``shape[n] == 0`` requires special attention. See `complex arrays`_ + for further information. + + The shape array is read-only for the consumer. + + .. c:member:: Py_ssize_t \*strides + + An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim` + giving the number of bytes to skip to get to a new element in each + dimension. + + Stride values can be any integer. For regular arrays, strides are + usually positive, but a consumer MUST be able to handle the case + ``strides[n] <= 0``. See `complex arrays`_ for further information. + + The strides array is read-only for the consumer. - This is a storage for the itemsize (in bytes) of each element of the - shared memory. It is technically un-necessary as it can be obtained - using :c:func:`PyBuffer_SizeFromFormat`, however an exporter may know - this information without parsing the format string and it is necessary - to know the itemsize for proper interpretation of striding. Therefore, - storing it is more convenient and faster. + .. c:member:: Py_ssize_t \*suboffsets - .. c:member:: void *internal + An array of :c:type:`Py_ssize_t` of length :c:member:`~Py_buffer.ndim`. + If ``suboffsets[n] >= 0``, the values stored along the nth dimension are + pointers and the suboffset value dictates how many bytes to add to each + pointer after de-referencing. A suboffset value that is negative + indicates that no de-referencing should occur (striding in a contiguous + memory block). + + This type of array representation is used by the Python Imaging Library + (PIL). See `complex arrays`_ for further information how to access elements + of such an array. + + The suboffsets array is read-only for the consumer. + + .. c:member:: void \*internal This is for use internally by the exporting object. For example, this might be re-cast as an integer by the exporter and used to store flags about whether or not the shape, strides, and suboffsets arrays must be - freed when the buffer is released. The consumer should never alter this + freed when the buffer is released. The consumer MUST NOT alter this value. +.. _buffer-request-types: -Buffer-related functions -======================== +Buffer request types +==================== +Buffers are usually obtained by sending a buffer request to an exporting +object via :c:func:`PyObject_GetBuffer`. Since the complexity of the logical +structure of the memory can vary drastically, the consumer uses the *flags* +argument to specify the exact buffer type it can handle. -.. c:function:: int PyObject_CheckBuffer(PyObject *obj) +All :c:data:`Py_buffer` fields are unambiguously defined by the request +type. - Return 1 if *obj* supports the buffer interface otherwise 0. When 1 is - returned, it doesn't guarantee that :c:func:`PyObject_GetBuffer` will - succeed. +request-independent fields +~~~~~~~~~~~~~~~~~~~~~~~~~~ +The following fields are not influenced by *flags* and must always be filled in +with the correct values: :c:member:`~Py_buffer.obj`, :c:member:`~Py_buffer.buf`, +:c:member:`~Py_buffer.len`, :c:member:`~Py_buffer.itemsize`, :c:member:`~Py_buffer.ndim`. -.. c:function:: int PyObject_GetBuffer(PyObject *obj, Py_buffer *view, int flags) +readonly, format +~~~~~~~~~~~~~~~~ - Export a view over some internal data from the target object *obj*. - *obj* must not be NULL, and *view* must point to an existing - :c:type:`Py_buffer` structure allocated by the caller (most uses of - this function will simply declare a local variable of type - :c:type:`Py_buffer`). The *flags* argument is a bit field indicating - what kind of buffer is requested. The buffer interface allows - for complicated memory layout possibilities; however, some callers - won't want to handle all the complexity and instead request a simple - view of the target object (using :c:macro:`PyBUF_SIMPLE` for a read-only - view and :c:macro:`PyBUF_WRITABLE` for a read-write view). + .. c:macro:: PyBUF_WRITABLE - Some exporters may not be able to share memory in every possible way and - may need to raise errors to signal to some consumers that something is - just not possible. These errors should be a :exc:`BufferError` unless - there is another error that is actually causing the problem. The - exporter can use flags information to simplify how much of the - :c:data:`Py_buffer` structure is filled in with non-default values and/or - raise an error if the object can't support a simpler view of its memory. + Controls the :c:member:`~Py_buffer.readonly` field. If set, the exporter + MUST provide a writable buffer or else report failure. Otherwise, the + exporter MAY provide either a read-only or writable buffer, but the choice + MUST be consistent for all consumers. - On success, 0 is returned and the *view* structure is filled with useful - values. On error, -1 is returned and an exception is raised; the *view* - is left in an undefined state. + .. c:macro:: PyBUF_FORMAT - The following are the possible values to the *flags* arguments. + Controls the :c:member:`~Py_buffer.format` field. If set, this field MUST + be filled in correctly. Otherwise, this field MUST be *NULL*. - .. c:macro:: PyBUF_SIMPLE - This is the default flag. The returned buffer exposes a read-only - memory area. The format of data is assumed to be raw unsigned bytes, - without any particular structure. This is a "stand-alone" flag - constant. It never needs to be '|'d to the others. The exporter will - raise an error if it cannot provide such a contiguous buffer of bytes. +:c:macro:`PyBUF_WRITABLE` can be \|'d to any of the flags in the next section. +Since :c:macro:`PyBUF_SIMPLE` is defined as 0, :c:macro:`PyBUF_WRITABLE` +can be used as a stand-alone flag to request a simple writable buffer. - .. c:macro:: PyBUF_WRITABLE +:c:macro:`PyBUF_FORMAT` can be \|'d to any of the flags except :c:macro:`PyBUF_SIMPLE`. +The latter already implies format ``B`` (unsigned bytes). - Like :c:macro:`PyBUF_SIMPLE`, but the returned buffer is writable. If - the exporter doesn't support writable buffers, an error is raised. - .. c:macro:: PyBUF_STRIDES +shape, strides, suboffsets +~~~~~~~~~~~~~~~~~~~~~~~~~~ - This implies :c:macro:`PyBUF_ND`. The returned buffer must provide - strides information (i.e. the strides cannot be NULL). This would be - used when the consumer can handle strided, discontiguous arrays. - Handling strides automatically assumes you can handle shape. The - exporter can raise an error if a strided representation of the data is - not possible (i.e. without the suboffsets). +The flags that control the logical structure of the memory are listed +in decreasing order of complexity. Note that each flag contains all bits +of the flags below it. - .. c:macro:: PyBUF_ND +.. tabularcolumns:: |p{0.35\linewidth}|l|l|l| - The returned buffer must provide shape information. The memory will be - assumed C-style contiguous (last dimension varies the fastest). The - exporter may raise an error if it cannot provide this kind of - contiguous buffer. If this is not given then shape will be *NULL*. ++-----------------------------+-------+---------+------------+ +| Request | shape | strides | suboffsets | ++=============================+=======+=========+============+ +| .. c:macro:: PyBUF_INDIRECT | yes | yes | if needed | ++-----------------------------+-------+---------+------------+ +| .. c:macro:: PyBUF_STRIDES | yes | yes | NULL | ++-----------------------------+-------+---------+------------+ +| .. c:macro:: PyBUF_ND | yes | NULL | NULL | ++-----------------------------+-------+---------+------------+ +| .. c:macro:: PyBUF_SIMPLE | NULL | NULL | NULL | ++-----------------------------+-------+---------+------------+ - .. c:macro:: PyBUF_C_CONTIGUOUS - PyBUF_F_CONTIGUOUS - PyBUF_ANY_CONTIGUOUS - These flags indicate that the contiguity returned buffer must be - respectively, C-contiguous (last dimension varies the fastest), Fortran - contiguous (first dimension varies the fastest) or either one. All of - these flags imply :c:macro:`PyBUF_STRIDES` and guarantee that the - strides buffer info structure will be filled in correctly. +contiguity requests +~~~~~~~~~~~~~~~~~~~ - .. c:macro:: PyBUF_INDIRECT +C or Fortran contiguity can be explicitly requested, with and without stride +information. Without stride information, the buffer must be C-contiguous. - This flag indicates the returned buffer must have suboffsets - information (which can be NULL if no suboffsets are needed). This can - be used when the consumer can handle indirect array referencing implied - by these suboffsets. This implies :c:macro:`PyBUF_STRIDES`. +.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|l| - .. c:macro:: PyBUF_FORMAT ++-----------------------------------+-------+---------+------------+--------+ +| Request | shape | strides | suboffsets | contig | ++===================================+=======+=========+============+========+ +| .. c:macro:: PyBUF_C_CONTIGUOUS | yes | yes | NULL | C | ++-----------------------------------+-------+---------+------------+--------+ +| .. c:macro:: PyBUF_F_CONTIGUOUS | yes | yes | NULL | F | ++-----------------------------------+-------+---------+------------+--------+ +| .. c:macro:: PyBUF_ANY_CONTIGUOUS | yes | yes | NULL | C or F | ++-----------------------------------+-------+---------+------------+--------+ +| .. c:macro:: PyBUF_ND | yes | NULL | NULL | C | ++-----------------------------------+-------+---------+------------+--------+ - The returned buffer must have true format information if this flag is - provided. This would be used when the consumer is going to be checking - for what 'kind' of data is actually stored. An exporter should always - be able to provide this information if requested. If format is not - explicitly requested then the format must be returned as *NULL* (which - means ``'B'``, or unsigned bytes). - .. c:macro:: PyBUF_STRIDED +compound requests +~~~~~~~~~~~~~~~~~ - This is equivalent to ``(PyBUF_STRIDES | PyBUF_WRITABLE)``. +All possible requests are fully defined by some combination of the flags in +the previous section. For convenience, the buffer protocol provides frequently +used combinations as single flags. - .. c:macro:: PyBUF_STRIDED_RO +In the following table *U* stands for undefined contiguity. The consumer would +have to call :c:func:`PyBuffer_IsContiguous` to determine contiguity. - This is equivalent to ``(PyBUF_STRIDES)``. +.. tabularcolumns:: |p{0.35\linewidth}|l|l|l|l|l|l| - .. c:macro:: PyBUF_RECORDS ++-------------------------------+-------+---------+------------+--------+----------+--------+ +| Request | shape | strides | suboffsets | contig | readonly | format | ++===============================+=======+=========+============+========+==========+========+ +| .. c:macro:: PyBUF_FULL | yes | yes | if needed | U | 0 | yes | ++-------------------------------+-------+---------+------------+--------+----------+--------+ +| .. c:macro:: PyBUF_FULL_RO | yes | yes | if needed | U | 1 or 0 | yes | ++-------------------------------+-------+---------+------------+--------+----------+--------+ +| .. c:macro:: PyBUF_RECORDS | yes | yes | NULL | U | 0 | yes | ++-------------------------------+-------+---------+------------+--------+----------+--------+ +| .. c:macro:: PyBUF_RECORDS_RO | yes | yes | NULL | U | 1 or 0 | yes | ++-------------------------------+-------+---------+------------+--------+----------+--------+ +| .. c:macro:: PyBUF_STRIDED | yes | yes | NULL | U | 0 | NULL | ++-------------------------------+-------+---------+------------+--------+----------+--------+ +| .. c:macro:: PyBUF_STRIDED_RO | yes | yes | NULL | U | 1 or 0 | NULL | ++-------------------------------+-------+---------+------------+--------+----------+--------+ +| .. c:macro:: PyBUF_CONTIG | yes | NULL | NULL | C | 0 | NULL | ++-------------------------------+-------+---------+------------+--------+----------+--------+ +| .. c:macro:: PyBUF_CONTIG_RO | yes | NULL | NULL | C | 1 or 0 | NULL | ++-------------------------------+-------+---------+------------+--------+----------+--------+ - This is equivalent to ``(PyBUF_STRIDES | PyBUF_FORMAT | - PyBUF_WRITABLE)``. - .. c:macro:: PyBUF_RECORDS_RO +Complex arrays +============== + +NumPy-style: shape and strides +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The logical structure of NumPy-style arrays is defined by :c:member:`~Py_buffer.itemsize`, +:c:member:`~Py_buffer.ndim`, :c:member:`~Py_buffer.shape` and :c:member:`~Py_buffer.strides`. + +If ``ndim == 0``, the memory location pointed to by :c:member:`~Py_buffer.buf` is +interpreted as a scalar of size :c:member:`~Py_buffer.itemsize`. In that case, +both :c:member:`~Py_buffer.shape` and :c:member:`~Py_buffer.strides` are *NULL*. + +If :c:member:`~Py_buffer.strides` is *NULL*, the array is interpreted as +a standard n-dimensional C-array. Otherwise, the consumer must access an +n-dimensional array as follows: + + ``ptr = (char *)buf + indices[0] * strides[0] + ... + indices[n-1] * strides[n-1]`` + ``item = *((typeof(item) *)ptr);`` + + +As noted above, :c:member:`~Py_buffer.buf` can point to any location within +the actual memory block. An exporter can check the validity of a buffer with +this function: + +.. code-block:: python + + def verify_structure(memlen, itemsize, ndim, shape, strides, offset): + """Verify that the parameters represent a valid array within + the bounds of the allocated memory: + char *mem: start of the physical memory block + memlen: length of the physical memory block + offset: (char *)buf - mem + """ + if offset % itemsize: + return False + if offset < 0 or offset+itemsize > memlen: + return False + if any(v % itemsize for v in strides): + return False + + if ndim <= 0: + return ndim == 0 and not shape and not strides + if 0 in shape: + return True + + imin = sum(strides[j]*(shape[j]-1) for j in range(ndim) + if strides[j] <= 0) + imax = sum(strides[j]*(shape[j]-1) for j in range(ndim) + if strides[j] > 0) + + return 0 <= offset+imin and offset+imax+itemsize <= memlen + + +PIL-style: shape, strides and suboffsets +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In addition to the regular items, PIL-style arrays can contain pointers +that must be followed in order to get to the next element in a dimension. +For example, the regular three-dimensional C-array ``char v[2][2][3]`` can +also be viewed as an array of 2 pointers to 2 two-dimensional arrays: +``char (*v[2])[2][3]``. In suboffsets representation, those two pointers +can be embedded at the start of :c:member:`~Py_buffer.buf`, pointing +to two ``char x[2][3]`` arrays that can be located anywhere in memory. + + +Here is a function that returns a pointer to the element in an N-D array +pointed to by an N-dimensional index when there are both non-NULL strides +and suboffsets:: + + void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides, + Py_ssize_t *suboffsets, Py_ssize_t *indices) { + char *pointer = (char*)buf; + int i; + for (i = 0; i < ndim; i++) { + pointer += strides[i] * indices[i]; + if (suboffsets[i] >=0 ) { + pointer = *((char**)pointer) + suboffsets[i]; + } + } + return (void*)pointer; + } - This is equivalent to ``(PyBUF_STRIDES | PyBUF_FORMAT)``. - .. c:macro:: PyBUF_FULL +Buffer-related functions +======================== - This is equivalent to ``(PyBUF_INDIRECT | PyBUF_FORMAT | - PyBUF_WRITABLE)``. +.. c:function:: int PyObject_CheckBuffer(PyObject *obj) - .. c:macro:: PyBUF_FULL_RO + Return 1 if *obj* supports the buffer interface otherwise 0. When 1 is + returned, it doesn't guarantee that :c:func:`PyObject_GetBuffer` will + succeed. - This is equivalent to ``(PyBUF_INDIRECT | PyBUF_FORMAT)``. - .. c:macro:: PyBUF_CONTIG +.. c:function:: int PyObject_GetBuffer(PyObject *exporter, Py_buffer *view, int flags) - This is equivalent to ``(PyBUF_ND | PyBUF_WRITABLE)``. + Send a request to *exporter* to fill in *view* as specified by *flags*. + If the exporter cannot provide a buffer of the exact type, it MUST raise + :c:data:`PyExc_BufferError`, set :c:member:`view->obj` to *NULL* and + return -1. - .. c:macro:: PyBUF_CONTIG_RO + On success, fill in *view*, set :c:member:`view->obj` to a new reference + to *exporter* and return 0. In the case of chained buffer providers + that redirect requests to a single object, :c:member:`view->obj` MAY + refer to this object instead of *exporter* (See :ref:`Buffer Object Structures <buffer-structs>`). - This is equivalent to ``(PyBUF_ND)``. + Successful calls to :c:func:`PyObject_GetBuffer` must be paired with calls + to :c:func:`PyBuffer_Release`, similar to :c:func:`malloc` and :c:func:`free`. + Thus, after the consumer is done with the buffer, :c:func:`PyBuffer_Release` + must be called exactly once. .. c:function:: void PyBuffer_Release(Py_buffer *view) - Release the buffer *view*. This should be called when the buffer is no - longer being used as it may free memory from it. + Release the buffer *view* and decrement the reference count for + :c:member:`view->obj`. This function MUST be called when the buffer + is no longer being used, otherwise reference leaks may occur. + + It is an error to call this function on a buffer that was not obtained via + :c:func:`PyObject_GetBuffer`. .. c:function:: Py_ssize_t PyBuffer_SizeFromFormat(const char *) - Return the implied :c:data:`~Py_buffer.itemsize` from the struct-stype - :c:data:`~Py_buffer.format`. + Return the implied :c:data:`~Py_buffer.itemsize` from :c:data:`~Py_buffer.format`. + This function is not yet implemented. -.. c:function:: int PyBuffer_IsContiguous(Py_buffer *view, char fortran) +.. c:function:: int PyBuffer_IsContiguous(Py_buffer *view, char order) - Return 1 if the memory defined by the *view* is C-style (*fortran* is - ``'C'``) or Fortran-style (*fortran* is ``'F'``) contiguous or either one - (*fortran* is ``'A'``). Return 0 otherwise. + Return 1 if the memory defined by the *view* is C-style (*order* is + ``'C'``) or Fortran-style (*order* is ``'F'``) contiguous or either one + (*order* is ``'A'``). Return 0 otherwise. -.. c:function:: void PyBuffer_FillContiguousStrides(int ndim, Py_ssize_t *shape, Py_ssize_t *strides, Py_ssize_t itemsize, char fortran) +.. c:function:: void PyBuffer_FillContiguousStrides(int ndim, Py_ssize_t *shape, Py_ssize_t *strides, Py_ssize_t itemsize, char order) Fill the *strides* array with byte-strides of a contiguous (C-style if - *fortran* is ``'C'`` or Fortran-style if *fortran* is ``'F'``) array of the + *order* is ``'C'`` or Fortran-style if *order* is ``'F'``) array of the given shape with the given number of bytes per element. -.. c:function:: int PyBuffer_FillInfo(Py_buffer *view, PyObject *obj, void *buf, Py_ssize_t len, int readonly, int infoflags) +.. c:function:: int PyBuffer_FillInfo(Py_buffer *view, PyObject *exporter, void *buf, Py_ssize_t len, int readonly, int flags) + + Handle buffer requests for an exporter that wants to expose *buf* of size *len* + with writability set according to *readonly*. *buf* is interpreted as a sequence + of unsigned bytes. + + The *flags* argument indicates the request type. This function always fills in + *view* as specified by flags, unless *buf* has been designated as read-only + and :c:macro:`PyBUF_WRITABLE` is set in *flags*. + + On success, set :c:member:`view->obj` to a new reference to *exporter* and + return 0. Otherwise, raise :c:data:`PyExc_BufferError`, set + :c:member:`view->obj` to *NULL* and return -1; + + If this function is used as part of a :ref:`getbufferproc <buffer-structs>`, + *exporter* MUST be set to the exporting object. Otherwise, *exporter* MUST + be NULL. + - Fill in a buffer-info structure, *view*, correctly for an exporter that can - only share a contiguous chunk of memory of "unsigned bytes" of the given - length. Return 0 on success and -1 (with raising an error) on error. diff --git a/Doc/c-api/bytearray.rst b/Doc/c-api/bytearray.rst index 95ded96..8202205 100644 --- a/Doc/c-api/bytearray.rst +++ b/Doc/c-api/bytearray.rst @@ -40,7 +40,7 @@ Direct API functions .. c:function:: PyObject* PyByteArray_FromObject(PyObject *o) Return a new bytearray object from any object, *o*, that implements the - buffer protocol. + :ref:`buffer protocol <bufferobjects>`. .. XXX expand about the buffer protocol, at least somewhere diff --git a/Doc/c-api/bytes.rst b/Doc/c-api/bytes.rst index 12ec80c..5666dac 100644 --- a/Doc/c-api/bytes.rst +++ b/Doc/c-api/bytes.rst @@ -62,6 +62,8 @@ called with a non-bytes parameter. .. % because not all compilers support the %z width modifier -- we fake it .. % when necessary via interpolating PY_FORMAT_SIZE_T. + .. tabularcolumns:: |l|l|L| + +-------------------+---------------+--------------------------------+ | Format Characters | Type | Comment | +===================+===============+================================+ diff --git a/Doc/c-api/code.rst b/Doc/c-api/code.rst index 6932bb1..57e8072 100644 --- a/Doc/c-api/code.rst +++ b/Doc/c-api/code.rst @@ -31,11 +31,11 @@ bound into a function. Return true if *co* is a :class:`code` object -.. c:function:: int PyCode_GetNumFree(PyObject *co) +.. c:function:: int PyCode_GetNumFree(PyCodeObject *co) Return the number of free variables in *co*. -.. c:function:: PyCodeObject *PyCode_New(int argcount, int kwonlyargcount, int nlocals, int stacksize, int flags, PyObject *code, PyObject *consts, PyObject *names, PyObject *varnames, PyObject *freevars, PyObject *cellvars, PyObject *filename, PyObject *name, int firstlineno, PyObject *lnotab) +.. c:function:: PyCodeObject* PyCode_New(int argcount, int kwonlyargcount, int nlocals, int stacksize, int flags, PyObject *code, PyObject *consts, PyObject *names, PyObject *varnames, PyObject *freevars, PyObject *cellvars, PyObject *filename, PyObject *name, int firstlineno, PyObject *lnotab) Return a new code object. If you need a dummy code object to create a frame, use :c:func:`PyCode_NewEmpty` instead. Calling @@ -43,7 +43,7 @@ bound into a function. version since the definition of the bytecode changes often. -.. c:function:: int PyCode_NewEmpty(const char *filename, const char *funcname, int firstlineno) +.. c:function:: PyCodeObject* PyCode_NewEmpty(const char *filename, const char *funcname, int firstlineno) Return a new empty code object with the specified filename, function name, and first line number. It is illegal to diff --git a/Doc/c-api/codec.rst b/Doc/c-api/codec.rst index 8207ae0..83252af 100644 --- a/Doc/c-api/codec.rst +++ b/Doc/c-api/codec.rst @@ -52,19 +52,19 @@ and *NULL* returned. .. c:function:: PyObject* PyCodec_IncrementalEncoder(const char *encoding, const char *errors) - Get an :class:`IncrementalEncoder` object for the given *encoding*. + Get an :class:`~codecs.IncrementalEncoder` object for the given *encoding*. .. c:function:: PyObject* PyCodec_IncrementalDecoder(const char *encoding, const char *errors) - Get an :class:`IncrementalDecoder` object for the given *encoding*. + Get an :class:`~codecs.IncrementalDecoder` object for the given *encoding*. .. c:function:: PyObject* PyCodec_StreamReader(const char *encoding, PyObject *stream, const char *errors) - Get a :class:`StreamReader` factory function for the given *encoding*. + Get a :class:`~codecs.StreamReader` factory function for the given *encoding*. .. c:function:: PyObject* PyCodec_StreamWriter(const char *encoding, PyObject *stream, const char *errors) - Get a :class:`StreamWriter` factory function for the given *encoding*. + Get a :class:`~codecs.StreamWriter` factory function for the given *encoding*. Registry API for Unicode encoding error handlers diff --git a/Doc/c-api/conversion.rst b/Doc/c-api/conversion.rst index dfc0a3a..9578f98 100644 --- a/Doc/c-api/conversion.rst +++ b/Doc/c-api/conversion.rst @@ -119,13 +119,13 @@ The following functions provide locale-independent string to number conversions. .. versionadded:: 3.1 -.. c:function:: char* PyOS_stricmp(char *s1, char *s2) +.. c:function:: int PyOS_stricmp(char *s1, char *s2) Case insensitive comparison of strings. The function works almost identically to :c:func:`strcmp` except that it ignores the case. -.. c:function:: char* PyOS_strnicmp(char *s1, char *s2, Py_ssize_t size) +.. c:function:: int PyOS_strnicmp(char *s1, char *s2, Py_ssize_t size) Case insensitive comparison of strings. The function works almost identically to :c:func:`strncmp` except that it ignores the case. diff --git a/Doc/c-api/datetime.rst b/Doc/c-api/datetime.rst index fcd1395..39542bd 100644 --- a/Doc/c-api/datetime.rst +++ b/Doc/c-api/datetime.rst @@ -170,6 +170,31 @@ and the type is not checked: Return the microsecond, as an int from 0 through 999999. +Macros to extract fields from time delta objects. The argument must be an +instance of :c:data:`PyDateTime_Delta`, including subclasses. The argument must +not be *NULL*, and the type is not checked: + +.. c:function:: int PyDateTime_DELTA_GET_DAYS(PyDateTime_Delta *o) + + Return the number of days, as an int from -999999999 to 999999999. + + .. versionadded:: 3.3 + + +.. c:function:: int PyDateTime_DELTA_GET_SECONDS(PyDateTime_Delta *o) + + Return the number of seconds, as an int from 0 through 86399. + + .. versionadded:: 3.3 + + +.. c:function:: int PyDateTime_DELTA_GET_MICROSECOND(PyDateTime_Delta *o) + + Return the number of microseconds, as an int from 0 through 999999. + + .. versionadded:: 3.3 + + Macros for the convenience of modules implementing the DB API: .. c:function:: PyObject* PyDateTime_FromTimestamp(PyObject *args) diff --git a/Doc/c-api/dict.rst b/Doc/c-api/dict.rst index 6df84e0..6bacc32 100644 --- a/Doc/c-api/dict.rst +++ b/Doc/c-api/dict.rst @@ -36,11 +36,11 @@ Dictionary Objects Return a new empty dictionary, or *NULL* on failure. -.. c:function:: PyObject* PyDictProxy_New(PyObject *dict) +.. c:function:: PyObject* PyDictProxy_New(PyObject *mapping) - Return a proxy object for a mapping which enforces read-only behavior. - This is normally used to create a proxy to prevent modification of the - dictionary for non-dynamic class types. + Return a :class:`types.MappingProxyType` object for a mapping which + enforces read-only behavior. This is normally used to create a view to + prevent modification of the dictionary for non-dynamic class types. .. c:function:: void PyDict_Clear(PyObject *p) @@ -209,3 +209,10 @@ Dictionary Objects for key, value in seq2: if override or key not in a: a[key] = value + + +.. c:function:: int PyDict_ClearFreeList() + + Clear the free list. Return the total number of freed items. + + .. versionadded:: 3.3 diff --git a/Doc/c-api/exceptions.rst b/Doc/c-api/exceptions.rst index 6f13c80..0aa892d 100644 --- a/Doc/c-api/exceptions.rst +++ b/Doc/c-api/exceptions.rst @@ -129,6 +129,41 @@ in various ways. There is a separate error indicator for each thread. exception state. +.. c:function:: void PyErr_GetExcInfo(PyObject **ptype, PyObject **pvalue, PyObject **ptraceback) + + Retrieve the exception info, as known from ``sys.exc_info()``. This refers + to an exception that was already caught, not to an exception that was + freshly raised. Returns new references for the three objects, any of which + may be *NULL*. Does not modify the exception info state. + + .. note:: + + This function is not normally used by code that wants to handle exceptions. + Rather, it can be used when code needs to save and restore the exception + state temporarily. Use :c:func:`PyErr_SetExcInfo` to restore or clear the + exception state. + + .. versionadded:: 3.3 + + +.. c:function:: void PyErr_SetExcInfo(PyObject *type, PyObject *value, PyObject *traceback) + + Set the exception info, as known from ``sys.exc_info()``. This refers + to an exception that was already caught, not to an exception that was + freshly raised. This function steals the references of the arguments. + To clear the exception state, pass *NULL* for all three arguments. + For general rules about the three arguments, see :c:func:`PyErr_Restore`. + + .. note:: + + This function is not normally used by code that wants to handle exceptions. + Rather, it can be used when code needs to save and restore the exception + state temporarily. Use :c:func:`PyErr_GetExcInfo` to read the exception + state. + + .. versionadded:: 3.3 + + .. c:function:: void PyErr_SetString(PyObject *type, const char *message) This is the most common way to set the error indicator. The first argument @@ -187,13 +222,19 @@ in various ways. There is a separate error indicator for each thread. when the system call returns an error. -.. c:function:: PyObject* PyErr_SetFromErrnoWithFilename(PyObject *type, const char *filename) +.. c:function:: PyObject* PyErr_SetFromErrnoWithFilenameObject(PyObject *type, PyObject *filenameObject) Similar to :c:func:`PyErr_SetFromErrno`, with the additional behavior that if - *filename* is not *NULL*, it is passed to the constructor of *type* as a third - parameter. In the case of exceptions such as :exc:`IOError` and :exc:`OSError`, - this is used to define the :attr:`filename` attribute of the exception instance. - *filename* is decoded from the filesystem encoding + *filenameObject* is not *NULL*, it is passed to the constructor of *type* as + a third parameter. In the case of exceptions such as :exc:`IOError` and + :exc:`OSError`, this is used to define the :attr:`filename` attribute of the + exception instance. + + +.. c:function:: PyObject* PyErr_SetFromErrnoWithFilename(PyObject *type, const char *filename) + + Similar to :c:func:`PyErr_SetFromErrnoWithFilenameObject`, but the filename + is given as a C string. *filename* is decoded from the filesystem encoding (:func:`sys.getfilesystemencoding`). @@ -215,21 +256,43 @@ in various ways. There is a separate error indicator for each thread. specifying the exception type to be raised. Availability: Windows. +.. c:function:: PyObject* PyErr_SetFromWindowsErrWithFilenameObject(int ierr, PyObject *filenameObject) + + Similar to :c:func:`PyErr_SetFromWindowsErr`, with the additional behavior + that if *filenameObject* is not *NULL*, it is passed to the constructor of + :exc:`WindowsError` as a third parameter. Availability: Windows. + + .. c:function:: PyObject* PyErr_SetFromWindowsErrWithFilename(int ierr, const char *filename) - Similar to :c:func:`PyErr_SetFromWindowsErr`, with the additional behavior that - if *filename* is not *NULL*, it is passed to the constructor of - :exc:`WindowsError` as a third parameter. *filename* is decoded from the - filesystem encoding (:func:`sys.getfilesystemencoding`). Availability: - Windows. + Similar to :c:func:`PyErr_SetFromWindowsErrWithFilenameObject`, but the + filename is given as a C string. *filename* is decoded from the filesystem + encoding (:func:`sys.getfilesystemencoding`). Availability: Windows. + +.. c:function:: PyObject* PyErr_SetExcFromWindowsErrWithFilenameObject(PyObject *type, int ierr, PyObject *filename) -.. c:function:: PyObject* PyErr_SetExcFromWindowsErrWithFilename(PyObject *type, int ierr, char *filename) + Similar to :c:func:`PyErr_SetFromWindowsErrWithFilenameObject`, with an + additional parameter specifying the exception type to be raised. + Availability: Windows. + + +.. c:function:: PyObject* PyErr_SetExcFromWindowsErrWithFilename(PyObject *type, int ierr, const char *filename) Similar to :c:func:`PyErr_SetFromWindowsErrWithFilename`, with an additional parameter specifying the exception type to be raised. Availability: Windows. +.. c:function:: PyObject* PyErr_SetImportError(PyObject *msg, PyObject *name, PyObject *path) + + This is a convenience function to raise :exc:`ImportError`. *msg* will be + set as the exception's message string. *name* and *path*, both of which can + be ``NULL``, will be set as the :exc:`ImportError`'s respective ``name`` + and ``path`` attributes. + + .. versionadded:: 3.3 + + .. c:function:: void PyErr_SyntaxLocationEx(char *filename, int lineno, int col_offset) Set file, line, and offset information for the current exception. If the @@ -238,12 +301,12 @@ in various ways. There is a separate error indicator for each thread. is a :exc:`SyntaxError`. *filename* is decoded from the filesystem encoding (:func:`sys.getfilesystemencoding`). -.. versionadded:: 3.2 + .. versionadded:: 3.2 .. c:function:: void PyErr_SyntaxLocation(char *filename, int lineno) - Like :c:func:`PyErr_SyntaxLocationExc`, but the col_offset parameter is + Like :c:func:`PyErr_SyntaxLocationEx`, but the col_offset parameter is omitted. @@ -311,6 +374,7 @@ in various ways. There is a separate error indicator for each thread. .. versionadded:: 3.2 + .. c:function:: int PyErr_CheckSignals() .. index:: @@ -421,17 +485,18 @@ Exception Objects .. c:function:: PyObject* PyException_GetCause(PyObject *ex) - Return the cause (another exception instance set by ``raise ... from ...``) - associated with the exception as a new reference, as accessible from Python - through :attr:`__cause__`. If there is no cause associated, this returns - *NULL*. + Return the cause (either an exception instance, or :const:`None`, + set by ``raise ... from ...``) associated with the exception as a new + reference, as accessible from Python through :attr:`__cause__`. -.. c:function:: void PyException_SetCause(PyObject *ex, PyObject *ctx) +.. c:function:: void PyException_SetCause(PyObject *ex, PyObject *cause) - Set the cause associated with the exception to *ctx*. Use *NULL* to clear - it. There is no type check to make sure that *ctx* is an exception instance. - This steals a reference to *ctx*. + Set the cause associated with the exception to *cause*. Use *NULL* to clear + it. There is no type check to make sure that *cause* is either an exception + instance or :const:`None`. This steals a reference to *cause*. + + :attr:`__suppress_context__` is implicitly set to ``True`` by this function. .. _unicodeexceptions: @@ -525,7 +590,7 @@ recursion depth automatically). Marks a point where a recursive C-level call is about to be performed. - If :const:`USE_STACKCHECK` is defined, this function checks if the the OS + If :const:`USE_STACKCHECK` is defined, this function checks if the OS stack overflowed using :c:func:`PyOS_CheckStack`. In this is the case, it sets a :exc:`MemoryError` and returns a nonzero value. @@ -542,28 +607,28 @@ recursion depth automatically). Ends a :c:func:`Py_EnterRecursiveCall`. Must be called once for each *successful* invocation of :c:func:`Py_EnterRecursiveCall`. -Properly implementing :attr:`tp_repr` for container types requires +Properly implementing :c:member:`~PyTypeObject.tp_repr` for container types requires special recursion handling. In addition to protecting the stack, -:attr:`tp_repr` also needs to track objects to prevent cycles. The +:c:member:`~PyTypeObject.tp_repr` also needs to track objects to prevent cycles. The following two functions facilitate this functionality. Effectively, these are the C equivalent to :func:`reprlib.recursive_repr`. .. c:function:: int Py_ReprEnter(PyObject *object) - Called at the beginning of the :attr:`tp_repr` implementation to + Called at the beginning of the :c:member:`~PyTypeObject.tp_repr` implementation to detect cycles. If the object has already been processed, the function returns a - positive integer. In that case the :attr:`tp_repr` implementation + positive integer. In that case the :c:member:`~PyTypeObject.tp_repr` implementation should return a string object indicating a cycle. As examples, :class:`dict` objects return ``{...}`` and :class:`list` objects return ``[...]``. The function will return a negative integer if the recursion limit - is reached. In that case the :attr:`tp_repr` implementation should + is reached. In that case the :c:member:`~PyTypeObject.tp_repr` implementation should typically return ``NULL``. - Otherwise, the function returns zero and the :attr:`tp_repr` + Otherwise, the function returns zero and the :c:member:`~PyTypeObject.tp_repr` implementation can continue normally. .. c:function:: void Py_ReprLeave(PyObject *object) @@ -582,65 +647,116 @@ All standard Python exceptions are available as global variables whose names are :c:type:`PyObject\*`; they are all class objects. For completeness, here are all the variables: -+-------------------------------------+----------------------------+----------+ -| C Name | Python Name | Notes | -+=====================================+============================+==========+ -| :c:data:`PyExc_BaseException` | :exc:`BaseException` | \(1) | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_Exception` | :exc:`Exception` | \(1) | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_ArithmeticError` | :exc:`ArithmeticError` | \(1) | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_LookupError` | :exc:`LookupError` | \(1) | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_AssertionError` | :exc:`AssertionError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_AttributeError` | :exc:`AttributeError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_EOFError` | :exc:`EOFError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_EnvironmentError` | :exc:`EnvironmentError` | \(1) | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_FloatingPointError` | :exc:`FloatingPointError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_IOError` | :exc:`IOError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_ImportError` | :exc:`ImportError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_IndexError` | :exc:`IndexError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_KeyError` | :exc:`KeyError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_KeyboardInterrupt` | :exc:`KeyboardInterrupt` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_MemoryError` | :exc:`MemoryError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_NameError` | :exc:`NameError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_NotImplementedError` | :exc:`NotImplementedError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_OSError` | :exc:`OSError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_OverflowError` | :exc:`OverflowError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_ReferenceError` | :exc:`ReferenceError` | \(2) | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_RuntimeError` | :exc:`RuntimeError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_SyntaxError` | :exc:`SyntaxError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_SystemError` | :exc:`SystemError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_SystemExit` | :exc:`SystemExit` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_TypeError` | :exc:`TypeError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_ValueError` | :exc:`ValueError` | | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_WindowsError` | :exc:`WindowsError` | \(3) | -+-------------------------------------+----------------------------+----------+ -| :c:data:`PyExc_ZeroDivisionError` | :exc:`ZeroDivisionError` | | -+-------------------------------------+----------------------------+----------+ ++-----------------------------------------+---------------------------------+----------+ +| C Name | Python Name | Notes | ++=========================================+=================================+==========+ +| :c:data:`PyExc_BaseException` | :exc:`BaseException` | \(1) | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_Exception` | :exc:`Exception` | \(1) | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ArithmeticError` | :exc:`ArithmeticError` | \(1) | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_LookupError` | :exc:`LookupError` | \(1) | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_AssertionError` | :exc:`AssertionError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_AttributeError` | :exc:`AttributeError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_BlockingIOError` | :exc:`BlockingIOError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_BrokenPipeError` | :exc:`BrokenPipeError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ChildProcessError` | :exc:`ChildProcessError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ConnectionError` | :exc:`ConnectionError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ConnectionAbortedError` | :exc:`ConnectionAbortedError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ConnectionRefusedError` | :exc:`ConnectionRefusedError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ConnectionResetError` | :exc:`ConnectionResetError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_FileExistsError` | :exc:`FileExistsError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_FileNotFoundError` | :exc:`FileNotFoundError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_EOFError` | :exc:`EOFError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_FloatingPointError` | :exc:`FloatingPointError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ImportError` | :exc:`ImportError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_IndexError` | :exc:`IndexError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_InterruptedError` | :exc:`InterruptedError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_IsADirectoryError` | :exc:`IsADirectoryError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_KeyError` | :exc:`KeyError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_KeyboardInterrupt` | :exc:`KeyboardInterrupt` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_MemoryError` | :exc:`MemoryError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_NameError` | :exc:`NameError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_NotADirectoryError` | :exc:`NotADirectoryError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_NotImplementedError` | :exc:`NotImplementedError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_OSError` | :exc:`OSError` | \(1) | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_OverflowError` | :exc:`OverflowError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_PermissionError` | :exc:`PermissionError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ProcessLookupError` | :exc:`ProcessLookupError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ReferenceError` | :exc:`ReferenceError` | \(2) | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_RuntimeError` | :exc:`RuntimeError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_SyntaxError` | :exc:`SyntaxError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_SystemError` | :exc:`SystemError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_TimeoutError` | :exc:`TimeoutError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_SystemExit` | :exc:`SystemExit` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_TypeError` | :exc:`TypeError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ValueError` | :exc:`ValueError` | | ++-----------------------------------------+---------------------------------+----------+ +| :c:data:`PyExc_ZeroDivisionError` | :exc:`ZeroDivisionError` | | ++-----------------------------------------+---------------------------------+----------+ + +.. versionadded:: 3.3 + :c:data:`PyExc_BlockingIOError`, :c:data:`PyExc_BrokenPipeError`, + :c:data:`PyExc_ChildProcessError`, :c:data:`PyExc_ConnectionError`, + :c:data:`PyExc_ConnectionAbortedError`, :c:data:`PyExc_ConnectionRefusedError`, + :c:data:`PyExc_ConnectionResetError`, :c:data:`PyExc_FileExistsError`, + :c:data:`PyExc_FileNotFoundError`, :c:data:`PyExc_InterruptedError`, + :c:data:`PyExc_IsADirectoryError`, :c:data:`PyExc_NotADirectoryError`, + :c:data:`PyExc_PermissionError`, :c:data:`PyExc_ProcessLookupError` + and :c:data:`PyExc_TimeoutError` were introduced following :pep:`3151`. + + +These are compatibility aliases to :c:data:`PyExc_OSError`: + ++-------------------------------------+----------+ +| C Name | Notes | ++=====================================+==========+ +| :c:data:`PyExc_EnvironmentError` | | ++-------------------------------------+----------+ +| :c:data:`PyExc_IOError` | | ++-------------------------------------+----------+ +| :c:data:`PyExc_WindowsError` | \(3) | ++-------------------------------------+----------+ + +.. versionchanged:: 3.3 + These aliases used to be separate exception types. + .. index:: single: PyExc_BaseException @@ -649,28 +765,42 @@ the variables: single: PyExc_LookupError single: PyExc_AssertionError single: PyExc_AttributeError + single: PyExc_BlockingIOError + single: PyExc_BrokenPipeError + single: PyExc_ConnectionError + single: PyExc_ConnectionAbortedError + single: PyExc_ConnectionRefusedError + single: PyExc_ConnectionResetError single: PyExc_EOFError - single: PyExc_EnvironmentError + single: PyExc_FileExistsError + single: PyExc_FileNotFoundError single: PyExc_FloatingPointError - single: PyExc_IOError single: PyExc_ImportError single: PyExc_IndexError + single: PyExc_InterruptedError + single: PyExc_IsADirectoryError single: PyExc_KeyError single: PyExc_KeyboardInterrupt single: PyExc_MemoryError single: PyExc_NameError + single: PyExc_NotADirectoryError single: PyExc_NotImplementedError single: PyExc_OSError single: PyExc_OverflowError + single: PyExc_PermissionError + single: PyExc_ProcessLookupError single: PyExc_ReferenceError single: PyExc_RuntimeError single: PyExc_SyntaxError single: PyExc_SystemError single: PyExc_SystemExit + single: PyExc_TimeoutError single: PyExc_TypeError single: PyExc_ValueError - single: PyExc_WindowsError single: PyExc_ZeroDivisionError + single: PyExc_EnvironmentError + single: PyExc_IOError + single: PyExc_WindowsError Notes: diff --git a/Doc/c-api/file.rst b/Doc/c-api/file.rst index c5a4a59..cc190c9 100644 --- a/Doc/c-api/file.rst +++ b/Doc/c-api/file.rst @@ -40,9 +40,9 @@ the :mod:`io` APIs instead. Return the file descriptor associated with *p* as an :c:type:`int`. If the object is an integer, its value is returned. If not, the - object's :meth:`fileno` method is called if it exists; the method must return - an integer, which is returned as the file descriptor value. Sets an - exception and returns ``-1`` on failure. + object's :meth:`~io.IOBase.fileno` method is called if it exists; the + method must return an integer, which is returned as the file descriptor + value. Sets an exception and returns ``-1`` on failure. .. c:function:: PyObject* PyFile_GetLine(PyObject *p, int n) @@ -50,7 +50,8 @@ the :mod:`io` APIs instead. .. index:: single: EOFError (built-in exception) Equivalent to ``p.readline([n])``, this function reads one line from the - object *p*. *p* may be a file object or any object with a :meth:`readline` + object *p*. *p* may be a file object or any object with a + :meth:`~io.IOBase.readline` method. If *n* is ``0``, exactly one line is read, regardless of the length of the line. If *n* is greater than ``0``, no more than *n* bytes will be read from the file; a partial line can be returned. In both cases, an empty string diff --git a/Doc/c-api/function.rst b/Doc/c-api/function.rst index 31805fd..ad98322 100644 --- a/Doc/c-api/function.rst +++ b/Doc/c-api/function.rst @@ -38,6 +38,16 @@ There are a few functions specific to Python functions. object, the argument defaults and closure are set to *NULL*. +.. c:function:: PyObject* PyFunction_NewWithQualName(PyObject *code, PyObject *globals, PyObject *qualname) + + As :c:func:`PyFunction_New`, but also allows to set the function object's + ``__qualname__`` attribute. *qualname* should be a unicode object or NULL; + if NULL, the ``__qualname__`` attribute is set to the same value as its + ``__name__`` attribute. + + .. versionadded:: 3.3 + + .. c:function:: PyObject* PyFunction_GetCode(PyObject *op) Return the code object associated with the function object *op*. diff --git a/Doc/c-api/gcsupport.rst b/Doc/c-api/gcsupport.rst index 3875ff2..9f6ad85 100644 --- a/Doc/c-api/gcsupport.rst +++ b/Doc/c-api/gcsupport.rst @@ -12,10 +12,10 @@ other objects, or which only store references to atomic types (such as numbers or strings), do not need to provide any explicit support for garbage collection. -To create a container type, the :attr:`tp_flags` field of the type object must +To create a container type, the :c:member:`~PyTypeObject.tp_flags` field of the type object must include the :const:`Py_TPFLAGS_HAVE_GC` and provide an implementation of the -:attr:`tp_traverse` handler. If instances of the type are mutable, a -:attr:`tp_clear` implementation must also be provided. +:c:member:`~PyTypeObject.tp_traverse` handler. If instances of the type are mutable, a +:c:member:`~PyTypeObject.tp_clear` implementation must also be provided. .. data:: Py_TPFLAGS_HAVE_GC @@ -57,7 +57,7 @@ Constructors for container types must conform to two rules: Adds the object *op* to the set of container objects tracked by the collector. The collector can run at unexpected times so objects must be valid while being tracked. This should be called once all the fields - followed by the :attr:`tp_traverse` handler become valid, usually near the + followed by the :c:member:`~PyTypeObject.tp_traverse` handler become valid, usually near the end of the constructor. @@ -86,8 +86,8 @@ rules: Remove the object *op* from the set of container objects tracked by the collector. Note that :c:func:`PyObject_GC_Track` can be called again on this object to add it back to the set of tracked objects. The deallocator - (:attr:`tp_dealloc` handler) should call this for the object before any of - the fields used by the :attr:`tp_traverse` handler become invalid. + (:c:member:`~PyTypeObject.tp_dealloc` handler) should call this for the object before any of + the fields used by the :c:member:`~PyTypeObject.tp_traverse` handler become invalid. .. c:function:: void _PyObject_GC_UNTRACK(PyObject *op) @@ -95,19 +95,19 @@ rules: A macro version of :c:func:`PyObject_GC_UnTrack`. It should not be used for extension modules. -The :attr:`tp_traverse` handler accepts a function parameter of this type: +The :c:member:`~PyTypeObject.tp_traverse` handler accepts a function parameter of this type: .. c:type:: int (*visitproc)(PyObject *object, void *arg) - Type of the visitor function passed to the :attr:`tp_traverse` handler. + Type of the visitor function passed to the :c:member:`~PyTypeObject.tp_traverse` handler. The function should be called with an object to traverse as *object* and - the third parameter to the :attr:`tp_traverse` handler as *arg*. The + the third parameter to the :c:member:`~PyTypeObject.tp_traverse` handler as *arg*. The Python core uses several visitor functions to implement cyclic garbage detection; it's not expected that users will need to write their own visitor functions. -The :attr:`tp_traverse` handler must have the following type: +The :c:member:`~PyTypeObject.tp_traverse` handler must have the following type: .. c:type:: int (*traverseproc)(PyObject *self, visitproc visit, void *arg) @@ -119,15 +119,15 @@ The :attr:`tp_traverse` handler must have the following type: object argument. If *visit* returns a non-zero value that value should be returned immediately. -To simplify writing :attr:`tp_traverse` handlers, a :c:func:`Py_VISIT` macro is -provided. In order to use this macro, the :attr:`tp_traverse` implementation +To simplify writing :c:member:`~PyTypeObject.tp_traverse` handlers, a :c:func:`Py_VISIT` macro is +provided. In order to use this macro, the :c:member:`~PyTypeObject.tp_traverse` implementation must name its arguments exactly *visit* and *arg*: .. c:function:: void Py_VISIT(PyObject *o) Call the *visit* callback, with arguments *o* and *arg*. If *visit* returns - a non-zero value, then return it. Using this macro, :attr:`tp_traverse` + a non-zero value, then return it. Using this macro, :c:member:`~PyTypeObject.tp_traverse` handlers look like:: static int @@ -138,7 +138,7 @@ must name its arguments exactly *visit* and *arg*: return 0; } -The :attr:`tp_clear` handler must be of the :c:type:`inquiry` type, or *NULL* +The :c:member:`~PyTypeObject.tp_clear` handler must be of the :c:type:`inquiry` type, or *NULL* if the object is immutable. diff --git a/Doc/c-api/import.rst b/Doc/c-api/import.rst index cf48363..270152e 100644 --- a/Doc/c-api/import.rst +++ b/Doc/c-api/import.rst @@ -30,13 +30,13 @@ Importing Modules .. c:function:: PyObject* PyImport_ImportModuleNoBlock(const char *name) - This version of :c:func:`PyImport_ImportModule` does not block. It's intended - to be used in C functions that import other modules to execute a function. - The import may block if another thread holds the import lock. The function - :c:func:`PyImport_ImportModuleNoBlock` never blocks. It first tries to fetch - the module from sys.modules and falls back to :c:func:`PyImport_ImportModule` - unless the lock is held, in which case the function will raise an - :exc:`ImportError`. + This function is a deprecated alias of :c:func:`PyImport_ImportModule`. + + .. versionchanged:: 3.3 + This function used to fail immediately when the import lock was held + by another thread. In Python 3.3 though, the locking scheme switched + to per-module locks for most purposes, so this function's special + behaviour isn't needed anymore. .. c:function:: PyObject* PyImport_ImportModuleEx(char *name, PyObject *globals, PyObject *locals, PyObject *fromlist) @@ -44,8 +44,7 @@ Importing Modules .. index:: builtin: __import__ Import a module. This is best described by referring to the built-in Python - function :func:`__import__`, as the standard :func:`__import__` function calls - this function directly. + function :func:`__import__`. The return value is a new reference to the imported module or top-level package, or *NULL* with an exception set on failure. Like for @@ -57,7 +56,7 @@ Importing Modules :c:func:`PyImport_ImportModule`. -.. c:function:: PyObject* PyImport_ImportModuleLevel(char *name, PyObject *globals, PyObject *locals, PyObject *fromlist, int level) +.. c:function:: PyObject* PyImport_ImportModuleLevelObject(PyObject *name, PyObject *globals, PyObject *locals, PyObject *fromlist, int level) Import a module. This is best described by referring to the built-in Python function :func:`__import__`, as the standard :func:`__import__` function calls @@ -68,6 +67,16 @@ Importing Modules the return value when a submodule of a package was requested is normally the top-level package, unless a non-empty *fromlist* was given. + .. versionadded:: 3.3 + + +.. c:function:: PyObject* PyImport_ImportModuleLevel(char *name, PyObject *globals, PyObject *locals, PyObject *fromlist, int level) + + Similar to :c:func:`PyImport_ImportModuleLevelObject`, but the name is an + UTF-8 encoded string instead of a Unicode object. + + .. versionchanged:: 3.3 + Negative values for *level* are no longer accepted. .. c:function:: PyObject* PyImport_Import(PyObject *name) @@ -86,7 +95,7 @@ Importing Modules an exception set on failure (the module still exists in this case). -.. c:function:: PyObject* PyImport_AddModule(const char *name) +.. c:function:: PyObject* PyImport_AddModuleObject(PyObject *name) Return the module object corresponding to a module name. The *name* argument may be of the form ``package.module``. First check the modules dictionary if @@ -100,6 +109,14 @@ Importing Modules or one of its variants to import a module. Package structures implied by a dotted name for *name* are not created if not already present. + .. versionadded:: 3.3 + + +.. c:function:: PyObject* PyImport_AddModule(const char *name) + + Similar to :c:func:`PyImport_AddModuleObject`, but the name is a UTF-8 + encoded string instead of a Unicode object. + .. c:function:: PyObject* PyImport_ExecCodeModule(char *name, PyObject *co) @@ -136,25 +153,43 @@ Importing Modules See also :c:func:`PyImport_ExecCodeModuleWithPathnames`. -.. c:function:: PyObject* PyImport_ExecCodeModuleWithPathnames(char *name, PyObject *co, char *pathname, char *cpathname) +.. c:function:: PyObject* PyImport_ExecCodeModuleObject(PyObject *name, PyObject *co, PyObject *pathname, PyObject *cpathname) Like :c:func:`PyImport_ExecCodeModuleEx`, but the :attr:`__cached__` attribute of the module object is set to *cpathname* if it is non-``NULL``. Of the three functions, this is the preferred one to use. + .. versionadded:: 3.3 + + +.. c:function:: PyObject* PyImport_ExecCodeModuleWithPathnames(char *name, PyObject *co, char *pathname, char *cpathname) + + Like :c:func:`PyImport_ExecCodeModuleObject`, but *name*, *pathname* and + *cpathname* are UTF-8 encoded strings. Attempts are also made to figure out + what the value for *pathname* should be from *cpathname* if the former is + set to ``NULL``. + .. versionadded:: 3.2 + .. versionchanged:: 3.3 + Uses :func:`imp.source_from_cache()` in calculating the source path if + only the bytecode path is provided. + .. c:function:: long PyImport_GetMagicNumber() Return the magic number for Python bytecode files (a.k.a. :file:`.pyc` and :file:`.pyo` files). The magic number should be present in the first four bytes - of the bytecode file, in little-endian byte order. + of the bytecode file, in little-endian byte order. Returns -1 on error. + + .. versionchanged:: 3.3 + Return value of -1 upon failure. .. c:function:: const char * PyImport_GetMagicTag() Return the magic tag string for :pep:`3147` format Python bytecode file - names. + names. Keep in mind that the value at ``sys.implementation.cache_tag`` is + authoritative and should be used instead of this function. .. versionadded:: 3.2 @@ -200,7 +235,7 @@ Importing Modules For internal use only. -.. c:function:: int PyImport_ImportFrozenModule(char *name) +.. c:function:: int PyImport_ImportFrozenModuleObject(PyObject *name) Load a frozen module named *name*. Return ``1`` for success, ``0`` if the module is not found, and ``-1`` with an exception set if the initialization @@ -208,6 +243,14 @@ Importing Modules :c:func:`PyImport_ImportModule`. (Note the misnomer --- this function would reload the module if it was already imported.) + .. versionadded:: 3.3 + + +.. c:function:: int PyImport_ImportFrozenModule(char *name) + + Similar to :c:func:`PyImport_ImportFrozenModuleObject`, but the name is a + UTF-8 encoded string instead of a Unicode object. + .. c:type:: struct _frozen @@ -247,13 +290,13 @@ Importing Modules Structure describing a single entry in the list of built-in modules. Each of these structures gives the name and initialization function for a module built - into the interpreter. Programs which embed Python may use an array of these - structures in conjunction with :c:func:`PyImport_ExtendInittab` to provide - additional built-in modules. The structure is defined in - :file:`Include/import.h` as:: + into the interpreter. The name is an ASCII encoded string. Programs which + embed Python may use an array of these structures in conjunction with + :c:func:`PyImport_ExtendInittab` to provide additional built-in modules. + The structure is defined in :file:`Include/import.h` as:: struct _inittab { - char *name; + char *name; /* ASCII encoded string */ PyObject* (*initfunc)(void); }; diff --git a/Doc/c-api/index.rst b/Doc/c-api/index.rst index 2ce7b98..3bfbaf4 100644 --- a/Doc/c-api/index.rst +++ b/Doc/c-api/index.rst @@ -13,6 +13,7 @@ document the API functions in detail. :maxdepth: 2 intro.rst + stable.rst veryhigh.rst refcounting.rst exceptions.rst @@ -22,3 +23,4 @@ document the API functions in detail. init.rst memory.rst objimpl.rst + apiabiversion.rst diff --git a/Doc/c-api/init.rst b/Doc/c-api/init.rst index 7507e3b..4dad2c8 100644 --- a/Doc/c-api/init.rst +++ b/Doc/c-api/init.rst @@ -442,6 +442,9 @@ pointer. standard :mod:`zlib` and :mod:`hashlib` modules release the GIL when compressing or hashing data. + +.. _gilstate: + Non-Python created threads -------------------------- @@ -545,6 +548,7 @@ code, or when embedding the Python interpreter: .. index:: module: _thread .. note:: + When only the main thread exists, no GIL operations are needed. This is a common situation (most Python programs do not use threads), and the lock operations slow the interpreter down a bit. Therefore, the lock is not @@ -646,7 +650,7 @@ with sub-interpreters: :c:func:`PyGILState_Release` on the same thread. -.. c:function:: PyThreadState PyGILState_GetThisThreadState() +.. c:function:: PyThreadState* PyGILState_GetThisThreadState() Get the current thread state for this thread. May return ``NULL`` if no GILState API has been used on the current thread. Note that the main thread @@ -905,41 +909,43 @@ Asynchronous Notifications A mechanism is provided to make asynchronous notifications to the main interpreter thread. These notifications take the form of a function -pointer and a void argument. - -.. index:: single: setcheckinterval() (in module sys) +pointer and a void pointer argument. -Every check interval, when the global interpreter lock is released and -reacquired, Python will also call any such provided functions. This can be used -for example by asynchronous IO handlers. The notification can be scheduled from -a worker thread and the actual call than made at the earliest convenience by the -main thread where it has possession of the global interpreter lock and can -perform any Python API calls. .. c:function:: int Py_AddPendingCall(int (*func)(void *), void *arg) .. index:: single: Py_AddPendingCall() - Post a notification to the Python main thread. If successful, *func* will be - called with the argument *arg* at the earliest convenience. *func* will be - called having the global interpreter lock held and can thus use the full - Python API and can take any action such as setting object attributes to - signal IO completion. It must return 0 on success, or -1 signalling an - exception. The notification function won't be interrupted to perform another - asynchronous notification recursively, but it can still be interrupted to - switch threads if the global interpreter lock is released, for example, if it - calls back into Python code. + Schedule a function to be called from the main interpreter thread. On + success, 0 is returned and *func* is queued for being called in the + main thread. On failure, -1 is returned without setting any exception. - This function returns 0 on success in which case the notification has been - scheduled. Otherwise, for example if the notification buffer is full, it - returns -1 without setting any exception. + When successfully queued, *func* will be *eventually* called from the + main interpreter thread with the argument *arg*. It will be called + asynchronously with respect to normally running Python code, but with + both these conditions met: - This function can be called on any thread, be it a Python thread or some - other system thread. If it is a Python thread, it doesn't matter if it holds - the global interpreter lock or not. + * on a :term:`bytecode` boundary; + * with the main thread holding the :term:`global interpreter lock` + (*func* can therefore use the full C API). - .. versionadded:: 3.1 + *func* must return 0 on success, or -1 on failure with an exception + set. *func* won't be interrupted to perform another asynchronous + notification recursively, but it can still be interrupted to switch + threads if the global interpreter lock is released. + + This function doesn't need a current thread state to run, and it doesn't + need the global interpreter lock. + .. warning:: + This is a low-level function, only useful for very special cases. + There is no guarantee that *func* will be called as quick as + possible. If the main thread is busy executing a system call, + *func* won't be called before the system call returns. This + function is generally **not** suitable for calling Python code from + arbitrary C threads. Instead, use the :ref:`PyGILState API<gilstate>`. + + .. versionadded:: 3.1 .. _profiling: diff --git a/Doc/c-api/iter.rst b/Doc/c-api/iter.rst index 3f0f554..2ba444d 100644 --- a/Doc/c-api/iter.rst +++ b/Doc/c-api/iter.rst @@ -5,7 +5,7 @@ Iterator Protocol ================= -There are only a couple of functions specifically for working with iterators. +There are two functions specifically for working with iterators. .. c:function:: int PyIter_Check(PyObject *o) @@ -14,11 +14,10 @@ There are only a couple of functions specifically for working with iterators. .. c:function:: PyObject* PyIter_Next(PyObject *o) - Return the next value from the iteration *o*. If the object is an iterator, - this retrieves the next value from the iteration, and returns *NULL* with no - exception set if there are no remaining items. If the object is not an - iterator, :exc:`TypeError` is raised, or if there is an error in retrieving the - item, returns *NULL* and passes along the exception. + Return the next value from the iteration *o*. The object must be an iterator + (it is up to the caller to check this). If there are no remaining values, + returns *NULL* with no exception set. If an error occurs while retrieving + the item, returns *NULL* and passes along the exception. To write a loop which iterates over an iterator, the C code should look something like this:: diff --git a/Doc/c-api/list.rst b/Doc/c-api/list.rst index feb9015..5b263a7 100644 --- a/Doc/c-api/list.rst +++ b/Doc/c-api/list.rst @@ -142,3 +142,10 @@ List Objects Return a new tuple object containing the contents of *list*; equivalent to ``tuple(list)``. + + +.. c:function:: int PyList_ClearFreeList() + + Clear the free list. Return the total number of freed items. + + .. versionadded:: 3.3 diff --git a/Doc/c-api/long.rst b/Doc/c-api/long.rst index 72be017..d5430fd 100644 --- a/Doc/c-api/long.rst +++ b/Doc/c-api/long.rst @@ -100,6 +100,20 @@ All integers are implemented as "long" integer objects of arbitrary size. string is first encoded to a byte string using :c:func:`PyUnicode_EncodeDecimal` and then converted using :c:func:`PyLong_FromString`. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyLong_FromUnicodeObject`. + + +.. c:function:: PyObject* PyLong_FromUnicodeObject(PyObject *u, int base) + + Convert a sequence of Unicode digits in the string *u* to a Python integer + value. The Unicode string is first encoded to a byte string using + :c:func:`PyUnicode_EncodeDecimal` and then converted using + :c:func:`PyLong_FromString`. + + .. versionadded:: 3.3 + .. c:function:: PyObject* PyLong_FromVoidPtr(void *p) diff --git a/Doc/c-api/memoryview.rst b/Doc/c-api/memoryview.rst index 6b49cdf..5e50977 100644 --- a/Doc/c-api/memoryview.rst +++ b/Doc/c-api/memoryview.rst @@ -17,16 +17,21 @@ any other object. Create a memoryview object from an object that provides the buffer interface. If *obj* supports writable buffer exports, the memoryview object will be - readable and writable, otherwise it will be read-only. + read/write, otherwise it may be either read-only or read/write at the + discretion of the exporter. +.. c:function:: PyObject *PyMemoryView_FromMemory(char *mem, Py_ssize_t size, int flags) + + Create a memoryview object using *mem* as the underlying buffer. + *flags* can be one of :c:macro:`PyBUF_READ` or :c:macro:`PyBUF_WRITE`. + + .. versionadded:: 3.3 .. c:function:: PyObject *PyMemoryView_FromBuffer(Py_buffer *view) Create a memoryview object wrapping the given buffer structure *view*. - The memoryview object then owns the buffer represented by *view*, which - means you shouldn't try to call :c:func:`PyBuffer_Release` yourself: it - will be done on deallocation of the memoryview object. - + For simple byte buffers, :c:func:`PyMemoryView_FromMemory` is the preferred + function. .. c:function:: PyObject *PyMemoryView_GetContiguous(PyObject *obj, int buffertype, char order) @@ -43,10 +48,16 @@ any other object. currently allowed to create subclasses of :class:`memoryview`. -.. c:function:: Py_buffer *PyMemoryView_GET_BUFFER(PyObject *obj) +.. c:function:: Py_buffer *PyMemoryView_GET_BUFFER(PyObject *mview) + + Return a pointer to the memoryview's private copy of the exporter's buffer. + *mview* **must** be a memoryview instance; this macro doesn't check its type, + you must do it yourself or you will risk crashes. + +.. c:function:: Py_buffer *PyMemoryView_GET_BASE(PyObject *mview) - Return a pointer to the buffer structure wrapped by the given - memoryview object. The object **must** be a memoryview instance; - this macro doesn't check its type, you must do it yourself or you - will risk crashes. + Return either a pointer to the exporting object that the memoryview is based + on or *NULL* if the memoryview has been created by one of the functions + :c:func:`PyMemoryView_FromMemory` or :c:func:`PyMemoryView_FromBuffer`. + *mview* **must** be a memoryview instance. diff --git a/Doc/c-api/module.rst b/Doc/c-api/module.rst index ffd68e3..bd46170 100644 --- a/Doc/c-api/module.rst +++ b/Doc/c-api/module.rst @@ -29,7 +29,7 @@ There are only a few functions special to module objects. :c:data:`PyModule_Type`. -.. c:function:: PyObject* PyModule_New(const char *name) +.. c:function:: PyObject* PyModule_NewObject(PyObject *name) .. index:: single: __name__ (module attribute) @@ -40,6 +40,14 @@ There are only a few functions special to module objects. Only the module's :attr:`__doc__` and :attr:`__name__` attributes are filled in; the caller is responsible for providing a :attr:`__file__` attribute. + .. versionadded:: 3.3 + + +.. c:function:: PyObject* PyModule_New(const char *name) + + Similar to :c:func:`PyImport_NewObject`, but the name is an UTF-8 encoded + string instead of a Unicode object. + .. c:function:: PyObject* PyModule_GetDict(PyObject *module) @@ -52,7 +60,7 @@ There are only a few functions special to module objects. manipulate a module's :attr:`__dict__`. -.. c:function:: char* PyModule_GetName(PyObject *module) +.. c:function:: PyObject* PyModule_GetNameObject(PyObject *module) .. index:: single: __name__ (module attribute) @@ -61,15 +69,13 @@ There are only a few functions special to module objects. Return *module*'s :attr:`__name__` value. If the module does not provide one, or if it is not a string, :exc:`SystemError` is raised and *NULL* is returned. + .. versionadded:: 3.3 -.. c:function:: char* PyModule_GetFilename(PyObject *module) - Similar to :c:func:`PyModule_GetFilenameObject` but return the filename - encoded to 'utf-8'. +.. c:function:: char* PyModule_GetName(PyObject *module) - .. deprecated:: 3.2 - :c:func:`PyModule_GetFilename` raises :c:type:`UnicodeEncodeError` on - unencodable filenames, use :c:func:`PyModule_GetFilenameObject` instead. + Similar to :c:func:`PyModule_GetNameObject` but return the name encoded to + ``'utf-8'``. .. c:function:: PyObject* PyModule_GetFilenameObject(PyObject *module) @@ -81,11 +87,21 @@ There are only a few functions special to module objects. Return the name of the file from which *module* was loaded using *module*'s :attr:`__file__` attribute. If this is not defined, or if it is not a unicode string, raise :exc:`SystemError` and return *NULL*; otherwise return - a reference to a :c:type:`PyUnicodeObject`. + a reference to a Unicode object. .. versionadded:: 3.2 +.. c:function:: char* PyModule_GetFilename(PyObject *module) + + Similar to :c:func:`PyModule_GetFilenameObject` but return the filename + encoded to 'utf-8'. + + .. deprecated:: 3.2 + :c:func:`PyModule_GetFilename` raises :c:type:`UnicodeEncodeError` on + unencodable filenames, use :c:func:`PyModule_GetFilenameObject` instead. + + .. c:function:: void* PyModule_GetState(PyObject *module) Return the "state" of the module, that is, a pointer to the block of memory @@ -99,6 +115,26 @@ There are only a few functions special to module objects. created, or *NULL* if the module wasn't created with :c:func:`PyModule_Create`. +.. c:function:: PyObject* PyState_FindModule(PyModuleDef *def) + + Returns the module object that was created from *def* for the current interpreter. + This method requires that the module object has been attached to the interpreter state with + :c:func:`PyState_AddModule` beforehand. In case the corresponding module object is not + found or has not been attached to the interpreter state yet, it returns NULL. + +.. c:function:: int PyState_AddModule(PyObject *module, PyModuleDef *def) + + Attaches the module object passed to the function to the interpreter state. This allows + the module object to be accessible via + :c:func:`PyState_FindModule`. + + .. versionadded:: 3.3 + +.. c:function:: int PyState_RemoveModule(PyModuleDef *def) + + Removes the module object created from *def* from the interpreter state. + + .. versionadded:: 3.3 Initializing C modules ^^^^^^^^^^^^^^^^^^^^^^ @@ -146,16 +182,22 @@ These functions are usually used in the module initialization function. .. c:member:: Py_ssize_t m_size - If the module object needs additional memory, this should be set to the - number of bytes to allocate; a pointer to the block of memory can be - retrieved with :c:func:`PyModule_GetState`. If no memory is needed, set - this to ``-1``. + Some modules allow re-initialization (calling their ``PyInit_*`` function + more than once). These modules should keep their state in a per-module + memory area that can be retrieved with :c:func:`PyModule_GetState`. This memory should be used, rather than static globals, to hold per-module state, since it is then safe for use in multiple sub-interpreters. It is freed when the module object is deallocated, after the :c:member:`m_free` function has been called, if present. + Setting ``m_size`` to ``-1`` means that the module can not be + re-initialized because it has global state. Setting it to a non-negative + value means that the module can be re-initialized and specifies the + additional amount of memory it requires for its state. + + See :PEP:`3121` for more details. + .. c:member:: PyMethodDef* m_methods A pointer to a table of module-level functions, described by diff --git a/Doc/c-api/object.rst b/Doc/c-api/object.rst index d0d45ad..0aba360 100644 --- a/Doc/c-api/object.rst +++ b/Doc/c-api/object.rst @@ -6,6 +6,19 @@ Object Protocol =============== +.. c:var:: PyObject* Py_NotImplemented + + The ``NotImplemented`` singleton, used to signal that an operation is + not implemented for the given type combination. + + +.. c:macro:: Py_RETURN_NOTIMPLEMENTED + + Properly handle returning :c:data:`Py_NotImplemented` from within a C + function (that is, increment the reference count of NotImplemented and + return it). + + .. c:function:: int PyObject_Print(PyObject *o, FILE *fp, int flags) Print an object *o*, on file *fp*. Returns ``-1`` on error. The flags argument @@ -47,8 +60,8 @@ Object Protocol Generic attribute getter function that is meant to be put into a type object's ``tp_getattro`` slot. It looks for a descriptor in the dictionary of classes in the object's MRO as well as an attribute in the object's - :attr:`__dict__` (if present). As outlined in :ref:`descriptors`, data - descriptors take preference over instance attributes, while non-data + :attr:`~object.__dict__` (if present). As outlined in :ref:`descriptors`, + data descriptors take preference over instance attributes, while non-data descriptors don't. Otherwise, an :exc:`AttributeError` is raised. @@ -72,8 +85,8 @@ Object Protocol object's ``tp_setattro`` slot. It looks for a data descriptor in the dictionary of classes in the object's MRO, and if found it takes preference over setting the attribute in the instance dictionary. Otherwise, the - attribute is set in the object's :attr:`__dict__` (if present). Otherwise, - an :exc:`AttributeError` is raised and ``-1`` is returned. + attribute is set in the object's :attr:`~object.__dict__` (if present). + Otherwise, an :exc:`AttributeError` is raised and ``-1`` is returned. .. c:function:: int PyObject_DelAttr(PyObject *o, PyObject *attr_name) @@ -88,6 +101,22 @@ Object Protocol This is the equivalent of the Python statement ``del o.attr_name``. +.. c:function:: PyObject* PyType_GenericGetDict(PyObject *o, void *context) + + A generic implementation for the getter of a ``__dict__`` descriptor. It + creates the dictionary if necessary. + + .. versionadded:: 3.3 + + +.. c:function:: int PyType_GenericSetDict(PyObject *o, void *context) + + A generic implementation for the setter of a ``__dict__`` descriptor. This + implementation does not allow the dictionary to be deleted. + + .. versionadded:: 3.3 + + .. c:function:: PyObject* PyObject_RichCompare(PyObject *o1, PyObject *o2, int opid) Compare the values of *o1* and *o2* using the operation specified by *opid*, @@ -131,10 +160,10 @@ Object Protocol a string similar to that returned by :c:func:`PyObject_Repr` in Python 2. Called by the :func:`ascii` built-in function. + .. index:: string; PyObject_Str (C function) -.. c:function:: PyObject* PyObject_Str(PyObject *o) - .. index:: builtin: str +.. c:function:: PyObject* PyObject_Str(PyObject *o) Compute a string representation of object *o*. Returns the string representation on success, *NULL* on failure. This is the equivalent of the @@ -160,9 +189,9 @@ Object Protocol be done against every entry in *cls*. The result will be ``1`` when at least one of the checks returns ``1``, otherwise it will be ``0``. If *inst* is not a class instance and *cls* is neither a type object, nor a class object, nor a - tuple, *inst* must have a :attr:`__class__` attribute --- the class relationship - of the value of that attribute with *cls* will be used to determine the result - of this function. + tuple, *inst* must have a :attr:`~instance.__class__` attribute --- the + class relationship of the value of that attribute with *cls* will be used + to determine the result of this function. Subclass determination is done in a fairly straightforward way, but includes a @@ -172,9 +201,9 @@ of. If :class:`A` and :class:`B` are class objects, :class:`B` is a subclass of either is not a class object, a more general mechanism is used to determine the class relationship of the two objects. When testing if *B* is a subclass of *A*, if *A* is *B*, :c:func:`PyObject_IsSubclass` returns true. If *A* and *B* -are different objects, *B*'s :attr:`__bases__` attribute is searched in a -depth-first fashion for *A* --- the presence of the :attr:`__bases__` attribute -is considered sufficient for this determination. +are different objects, *B*'s :attr:`~class.__bases__` attribute is searched in +a depth-first fashion for *A* --- the presence of the :attr:`~class.__bases__` +attribute is considered sufficient for this determination. .. c:function:: int PyObject_IsSubclass(PyObject *derived, PyObject *cls) diff --git a/Doc/c-api/set.rst b/Doc/c-api/set.rst index 66b47c4..7f4d534 100644 --- a/Doc/c-api/set.rst +++ b/Doc/c-api/set.rst @@ -140,7 +140,7 @@ subtypes but not for instances of :class:`frozenset` or its subtypes. Return 1 if found and removed, 0 if not found (no action taken), and -1 if an error is encountered. Does not raise :exc:`KeyError` for missing keys. Raise a - :exc:`TypeError` if the *key* is unhashable. Unlike the Python :meth:`discard` + :exc:`TypeError` if the *key* is unhashable. Unlike the Python :meth:`~set.discard` method, this function does not automatically convert unhashable sets into temporary frozensets. Raise :exc:`PyExc_SystemError` if *set* is an not an instance of :class:`set` or its subtype. @@ -157,3 +157,10 @@ subtypes but not for instances of :class:`frozenset` or its subtypes. .. c:function:: int PySet_Clear(PyObject *set) Empty an existing set of all elements. + + +.. c:function:: int PySet_ClearFreeList() + + Clear the free list. Return the total number of freed items. + + .. versionadded:: 3.3 diff --git a/Doc/c-api/stable.rst b/Doc/c-api/stable.rst new file mode 100644 index 0000000..063f856 --- /dev/null +++ b/Doc/c-api/stable.rst @@ -0,0 +1,38 @@ +.. highlightlang:: c + +.. _stable: + +*********************************** +Stable Application Binary Interface +*********************************** + +Traditionally, the C API of Python will change with every release. Most changes +will be source-compatible, typically by only adding API, rather than changing +existing API or removing API (although some interfaces do get removed after +being deprecated first). + +Unfortunately, the API compatibility does not extend to binary compatibility +(the ABI). The reason is primarily the evolution of struct definitions, where +addition of a new field, or changing the type of a field, might not break the +API, but can break the ABI. As a consequence, extension modules need to be +recompiled for every Python release (although an exception is possible on Unix +when none of the affected interfaces are used). In addition, on Windows, +extension modules link with a specific pythonXY.dll and need to be recompiled to +link with a newer one. + +Since Python 3.2, a subset of the API has been declared to guarantee a stable +ABI. Extension modules wishing to use this API (called "limited API") need to +define ``Py_LIMITED_API``. A number of interpreter details then become hidden +from the extension module; in return, a module is built that works on any 3.x +version (x>=2) without recompilation. + +In some cases, the stable ABI needs to be extended with new functions. +Extension modules wishing to use these new APIs need to set ``Py_LIMITED_API`` +to the ``PY_VERSION_HEX`` value (see :ref:`apiabiversion`) of the minimum Python +version they want to support (e.g. ``0x03030000`` for Python 3.3). Such modules +will work on all subsequent Python releases, but fail to load (because of +missing symbols) on the older releases. + +As of Python 3.2, the set of functions available to the limited API is +documented in PEP 384. In the C API documentation, API elements that are not +part of the limited API are marked as "Not part of the limited API." diff --git a/Doc/c-api/tuple.rst b/Doc/c-api/tuple.rst index 3cbfe5b..184affb 100644 --- a/Doc/c-api/tuple.rst +++ b/Doc/c-api/tuple.rst @@ -108,3 +108,103 @@ Tuple Objects .. c:function:: int PyTuple_ClearFreeList() Clear the free list. Return the total number of freed items. + + +Struct Sequence Objects +----------------------- + +Struct sequence objects are the C equivalent of :func:`~collections.namedtuple` +objects, i.e. a sequence whose items can also be accessed through attributes. +To create a struct sequence, you first have to create a specific struct sequence +type. + +.. c:function:: PyTypeObject* PyStructSequence_NewType(PyStructSequence_Desc *desc) + + Create a new struct sequence type from the data in *desc*, described below. Instances + of the resulting type can be created with :c:func:`PyStructSequence_New`. + + +.. c:function:: void PyStructSequence_InitType(PyTypeObject *type, PyStructSequence_Desc *desc) + + Initializes a struct sequence type *type* from *desc* in place. + + +.. c:type:: PyStructSequence_Desc + + Contains the meta information of a struct sequence type to create. + + +-------------------+------------------------------+------------------------------------+ + | Field | C Type | Meaning | + +===================+==============================+====================================+ + | ``name`` | ``char *`` | name of the struct sequence type | + +-------------------+------------------------------+------------------------------------+ + | ``doc`` | ``char *`` | pointer to docstring for the type | + | | | or NULL to omit | + +-------------------+------------------------------+------------------------------------+ + | ``fields`` | ``PyStructSequence_Field *`` | pointer to *NULL*-terminated array | + | | | with field names of the new type | + +-------------------+------------------------------+------------------------------------+ + | ``n_in_sequence`` | ``int`` | number of fields visible to the | + | | | Python side (if used as tuple) | + +-------------------+------------------------------+------------------------------------+ + + +.. c:type:: PyStructSequence_Field + + Describes a field of a struct sequence. As a struct sequence is modeled as a + tuple, all fields are typed as :c:type:`PyObject\*`. The index in the + :attr:`fields` array of the :c:type:`PyStructSequence_Desc` determines which + field of the struct sequence is described. + + +-----------+---------------+--------------------------------------+ + | Field | C Type | Meaning | + +===========+===============+======================================+ + | ``name`` | ``char *`` | name for the field or *NULL* to end | + | | | the list of named fields, set to | + | | | PyStructSequence_UnnamedField to | + | | | leave unnamed | + +-----------+---------------+--------------------------------------+ + | ``doc`` | ``char *`` | field docstring or *NULL* to omit | + +-----------+---------------+--------------------------------------+ + + +.. c:var:: char* PyStructSequence_UnnamedField + + Special value for a field name to leave it unnamed. + + +.. c:function:: PyObject* PyStructSequence_New(PyTypeObject *type) + + Creates an instance of *type*, which must have been created with + :c:func:`PyStructSequence_NewType`. + + +.. c:function:: PyObject* PyStructSequence_GetItem(PyObject *p, Py_ssize_t pos) + + Return the object at position *pos* in the struct sequence pointed to by *p*. + No bounds checking is performed. + + +.. c:function:: PyObject* PyStructSequence_GET_ITEM(PyObject *p, Py_ssize_t pos) + + Macro equivalent of :c:func:`PyStructSequence_GetItem`. + + +.. c:function:: void PyStructSequence_SetItem(PyObject *p, Py_ssize_t pos, PyObject *o) + + Sets the field at index *pos* of the struct sequence *p* to value *o*. Like + :c:func:`PyTuple_SET_ITEM`, this should only be used to fill in brand new + instances. + + .. note:: + + This function "steals" a reference to *o*. + + +.. c:function:: PyObject* PyStructSequence_SET_ITEM(PyObject *p, Py_ssize_t *pos, PyObject *o) + + Macro equivalent of :c:func:`PyStructSequence_SetItem`. + + .. note:: + + This function "steals" a reference to *o*. diff --git a/Doc/c-api/type.rst b/Doc/c-api/type.rst index b3386ea..5d83254 100644 --- a/Doc/c-api/type.rst +++ b/Doc/c-api/type.rst @@ -37,10 +37,10 @@ Type Objects .. c:function:: long PyType_GetFlags(PyTypeObject* type) - Return the :attr:`tp_flags` member of *type*. This function is primarily + Return the :c:member:`~PyTypeObject.tp_flags` member of *type*. This function is primarily meant for use with `Py_LIMITED_API`; the individual flag bits are guaranteed to be stable across Python releases, but access to - :attr:`tp_flags` itself is not part of the limited API. + :c:member:`~PyTypeObject.tp_flags` itself is not part of the limited API. .. versionadded:: 3.2 @@ -51,13 +51,13 @@ Type Objects modification of the attributes or base classes of the type. -.. c:function:: int PyType_HasFeature(PyObject *o, int feature) +.. c:function:: int PyType_HasFeature(PyTypeObject *o, int feature) Return true if the type object *o* sets the feature *feature*. Type features are denoted by single bit flags. -.. c:function:: int PyType_IS_GC(PyObject *o) +.. c:function:: int PyType_IS_GC(PyTypeObject *o) Return true if the type object includes support for the cycle detector; this tests the type flag :const:`Py_TPFLAGS_HAVE_GC`. @@ -70,13 +70,14 @@ Type Objects .. c:function:: PyObject* PyType_GenericAlloc(PyTypeObject *type, Py_ssize_t nitems) - XXX: Document. - + Generic handler for the :c:member:`~PyTypeObject.tp_alloc` slot of a type object. Use + Python's default memory allocation mechanism to allocate a new instance and + initialize all its contents to *NULL*. .. c:function:: PyObject* PyType_GenericNew(PyTypeObject *type, PyObject *args, PyObject *kwds) - XXX: Document. - + Generic handler for the :c:member:`~PyTypeObject.tp_new` slot of a type object. Create a + new instance using the type's :c:member:`~PyTypeObject.tp_alloc` slot. .. c:function:: int PyType_Ready(PyTypeObject *type) @@ -84,3 +85,15 @@ Type Objects their initialization. This function is responsible for adding inherited slots from a type's base class. Return ``0`` on success, or return ``-1`` and sets an exception on error. + +.. c:function:: PyObject* PyType_FromSpec(PyType_Spec *spec) + + Creates and returns a heap type object from the *spec* passed to the function. + +.. c:function:: PyObject* PyType_FromSpecWithBases(PyType_Spec *spec, PyObject *bases) + + Creates and returns a heap type object from the *spec*. In addition to that, + the created heap type contains all types contained by the *bases* tuple as base + types. This allows the caller to reference other heap types as base types. + + .. versionadded:: 3.3 diff --git a/Doc/c-api/typeobj.rst b/Doc/c-api/typeobj.rst index 68ca9ad..f3089d0 100644 --- a/Doc/c-api/typeobj.rst +++ b/Doc/c-api/typeobj.rst @@ -35,7 +35,7 @@ definition found there: The type object structure extends the :c:type:`PyVarObject` structure. The :attr:`ob_size` field is used for dynamic types (created by :func:`type_new`, usually called from a class statement). Note that :c:data:`PyType_Type` (the -metatype) initializes :attr:`tp_itemsize`, which means that its instances (i.e. +metatype) initializes :c:member:`~PyTypeObject.tp_itemsize`, which means that its instances (i.e. type objects) *must* have the :attr:`ob_size` field. @@ -102,7 +102,7 @@ type objects) *must* have the :attr:`ob_size` field. should be just the type name. If the module is a submodule of a package, the full package name is part of the full module name. For example, a type named :class:`T` defined in module :mod:`M` in subpackage :mod:`Q` in package :mod:`P` - should have the :attr:`tp_name` initializer ``"P.Q.M.T"``. + should have the :c:member:`~PyTypeObject.tp_name` initializer ``"P.Q.M.T"``. For dynamically allocated type objects, this should just be the type name, and the module name explicitly stored in the type dict as the value for key @@ -113,7 +113,7 @@ type objects) *must* have the :attr:`ob_size` field. attribute, and everything after the last dot is made accessible as the :attr:`__name__` attribute. - If no dot is present, the entire :attr:`tp_name` field is made accessible as the + If no dot is present, the entire :c:member:`~PyTypeObject.tp_name` field is made accessible as the :attr:`__name__` attribute, and the :attr:`__module__` attribute is undefined (unless explicitly set in the dictionary, as explained above). This means your type will be impossible to pickle. @@ -127,13 +127,13 @@ type objects) *must* have the :attr:`ob_size` field. These fields allow calculating the size in bytes of instances of the type. There are two kinds of types: types with fixed-length instances have a zero - :attr:`tp_itemsize` field, types with variable-length instances have a non-zero - :attr:`tp_itemsize` field. For a type with fixed-length instances, all - instances have the same size, given in :attr:`tp_basicsize`. + :c:member:`~PyTypeObject.tp_itemsize` field, types with variable-length instances have a non-zero + :c:member:`~PyTypeObject.tp_itemsize` field. For a type with fixed-length instances, all + instances have the same size, given in :c:member:`~PyTypeObject.tp_basicsize`. For a type with variable-length instances, the instances must have an - :attr:`ob_size` field, and the instance size is :attr:`tp_basicsize` plus N - times :attr:`tp_itemsize`, where N is the "length" of the object. The value of + :attr:`ob_size` field, and the instance size is :c:member:`~PyTypeObject.tp_basicsize` plus N + times :c:member:`~PyTypeObject.tp_itemsize`, where N is the "length" of the object. The value of N is typically stored in the instance's :attr:`ob_size` field. There are exceptions: for example, ints use a negative :attr:`ob_size` to indicate a negative number, and N is ``abs(ob_size)`` there. Also, the presence of an @@ -146,20 +146,20 @@ type objects) *must* have the :attr:`ob_size` field. :c:macro:`PyObject_HEAD` or :c:macro:`PyObject_VAR_HEAD` (whichever is used to declare the instance struct) and this in turn includes the :attr:`_ob_prev` and :attr:`_ob_next` fields if they are present. This means that the only correct - way to get an initializer for the :attr:`tp_basicsize` is to use the + way to get an initializer for the :c:member:`~PyTypeObject.tp_basicsize` is to use the ``sizeof`` operator on the struct used to declare the instance layout. The basic size does not include the GC header size. These fields are inherited separately by subtypes. If the base type has a - non-zero :attr:`tp_itemsize`, it is generally not safe to set - :attr:`tp_itemsize` to a different non-zero value in a subtype (though this + non-zero :c:member:`~PyTypeObject.tp_itemsize`, it is generally not safe to set + :c:member:`~PyTypeObject.tp_itemsize` to a different non-zero value in a subtype (though this depends on the implementation of the base type). A note about alignment: if the variable items require a particular alignment, - this should be taken care of by the value of :attr:`tp_basicsize`. Example: - suppose a type implements an array of ``double``. :attr:`tp_itemsize` is + this should be taken care of by the value of :c:member:`~PyTypeObject.tp_basicsize`. Example: + suppose a type implements an array of ``double``. :c:member:`~PyTypeObject.tp_itemsize` is ``sizeof(double)``. It is the programmer's responsibility that - :attr:`tp_basicsize` is a multiple of ``sizeof(double)`` (assuming this is the + :c:member:`~PyTypeObject.tp_basicsize` is a multiple of ``sizeof(double)`` (assuming this is the alignment requirement for ``double``). @@ -175,10 +175,10 @@ type objects) *must* have the :attr:`ob_size` field. destructor function should free all references which the instance owns, free all memory buffers owned by the instance (using the freeing function corresponding to the allocation function used to allocate the buffer), and finally (as its - last action) call the type's :attr:`tp_free` function. If the type is not + last action) call the type's :c:member:`~PyTypeObject.tp_free` function. If the type is not subtypable (doesn't have the :const:`Py_TPFLAGS_BASETYPE` flag bit set), it is permissible to call the object deallocator directly instead of via - :attr:`tp_free`. The object deallocator should be the one used to allocate the + :c:member:`~PyTypeObject.tp_free`. The object deallocator should be the one used to allocate the instance; this is normally :c:func:`PyObject_Del` if the instance was allocated using :c:func:`PyObject_New` or :c:func:`PyObject_VarNew`, or :c:func:`PyObject_GC_Del` if the instance was allocated using @@ -192,26 +192,25 @@ type objects) *must* have the :attr:`ob_size` field. An optional pointer to the instance print function. The print function is only called when the instance is printed to a *real* file; - when it is printed to a pseudo-file (like a :class:`StringIO` instance), the - instance's :attr:`tp_repr` or :attr:`tp_str` function is called to convert it to - a string. These are also called when the type's :attr:`tp_print` field is - *NULL*. A type should never implement :attr:`tp_print` in a way that produces - different output than :attr:`tp_repr` or :attr:`tp_str` would. + when it is printed to a pseudo-file (like a :class:`io.StringIO` instance), the + instance's :c:member:`~PyTypeObject.tp_repr` or :c:member:`~PyTypeObject.tp_str` function is called to convert it to + a string. These are also called when the type's :c:member:`~PyTypeObject.tp_print` field is + *NULL*. A type should never implement :c:member:`~PyTypeObject.tp_print` in a way that produces + different output than :c:member:`~PyTypeObject.tp_repr` or :c:member:`~PyTypeObject.tp_str` would. The print function is called with the same signature as :c:func:`PyObject_Print`: ``int tp_print(PyObject *self, FILE *file, int flags)``. The *self* argument is the instance to be printed. The *file* argument is the stdio file to which it is to be printed. The *flags* argument is composed of flag bits. The only flag bit currently defined is :const:`Py_PRINT_RAW`. When the :const:`Py_PRINT_RAW` - flag bit is set, the instance should be printed the same way as :attr:`tp_str` + flag bit is set, the instance should be printed the same way as :c:member:`~PyTypeObject.tp_str` would format it; when the :const:`Py_PRINT_RAW` flag bit is clear, the instance - should be printed the same was as :attr:`tp_repr` would format it. It should - return ``-1`` and set an exception condition when an error occurred during the - comparison. + should be printed the same way as :c:member:`~PyTypeObject.tp_repr` would format it. It should + return ``-1`` and set an exception condition when an error occurs. - It is possible that the :attr:`tp_print` field will be deprecated. In any case, - it is recommended not to define :attr:`tp_print`, but instead to rely on - :attr:`tp_repr` and :attr:`tp_str` for printing. + It is possible that the :c:member:`~PyTypeObject.tp_print` field will be deprecated. In any case, + it is recommended not to define :c:member:`~PyTypeObject.tp_print`, but instead to rely on + :c:member:`~PyTypeObject.tp_repr` and :c:member:`~PyTypeObject.tp_str` for printing. This field is inherited by subtypes. @@ -221,13 +220,13 @@ type objects) *must* have the :attr:`ob_size` field. An optional pointer to the get-attribute-string function. This field is deprecated. When it is defined, it should point to a function - that acts the same as the :attr:`tp_getattro` function, but taking a C string + that acts the same as the :c:member:`~PyTypeObject.tp_getattro` function, but taking a C string instead of a Python string object to give the attribute name. The signature is the same as for :c:func:`PyObject_GetAttrString`. - This field is inherited by subtypes together with :attr:`tp_getattro`: a subtype - inherits both :attr:`tp_getattr` and :attr:`tp_getattro` from its base type when - the subtype's :attr:`tp_getattr` and :attr:`tp_getattro` are both *NULL*. + This field is inherited by subtypes together with :c:member:`~PyTypeObject.tp_getattro`: a subtype + inherits both :c:member:`~PyTypeObject.tp_getattr` and :c:member:`~PyTypeObject.tp_getattro` from its base type when + the subtype's :c:member:`~PyTypeObject.tp_getattr` and :c:member:`~PyTypeObject.tp_getattro` are both *NULL*. .. c:member:: setattrfunc PyTypeObject.tp_setattr @@ -235,13 +234,13 @@ type objects) *must* have the :attr:`ob_size` field. An optional pointer to the set-attribute-string function. This field is deprecated. When it is defined, it should point to a function - that acts the same as the :attr:`tp_setattro` function, but taking a C string + that acts the same as the :c:member:`~PyTypeObject.tp_setattro` function, but taking a C string instead of a Python string object to give the attribute name. The signature is the same as for :c:func:`PyObject_SetAttrString`. - This field is inherited by subtypes together with :attr:`tp_setattro`: a subtype - inherits both :attr:`tp_setattr` and :attr:`tp_setattro` from its base type when - the subtype's :attr:`tp_setattr` and :attr:`tp_setattro` are both *NULL*. + This field is inherited by subtypes together with :c:member:`~PyTypeObject.tp_setattro`: a subtype + inherits both :c:member:`~PyTypeObject.tp_setattr` and :c:member:`~PyTypeObject.tp_setattro` from its base type when + the subtype's :c:member:`~PyTypeObject.tp_setattr` and :c:member:`~PyTypeObject.tp_setattro` are both *NULL*. .. c:member:: void* PyTypeObject.tp_reserved @@ -275,7 +274,7 @@ type objects) *must* have the :attr:`ob_size` field. objects which implement the number protocol. These fields are documented in :ref:`number-structs`. - The :attr:`tp_as_number` field is not inherited, but the contained fields are + The :c:member:`~PyTypeObject.tp_as_number` field is not inherited, but the contained fields are inherited individually. @@ -285,7 +284,7 @@ type objects) *must* have the :attr:`ob_size` field. objects which implement the sequence protocol. These fields are documented in :ref:`sequence-structs`. - The :attr:`tp_as_sequence` field is not inherited, but the contained fields + The :c:member:`~PyTypeObject.tp_as_sequence` field is not inherited, but the contained fields are inherited individually. @@ -295,7 +294,7 @@ type objects) *must* have the :attr:`ob_size` field. objects which implement the mapping protocol. These fields are documented in :ref:`mapping-structs`. - The :attr:`tp_as_mapping` field is not inherited, but the contained fields + The :c:member:`~PyTypeObject.tp_as_mapping` field is not inherited, but the contained fields are inherited individually. @@ -323,9 +322,9 @@ type objects) *must* have the :attr:`ob_size` field. object raises :exc:`TypeError`. This field is inherited by subtypes together with - :attr:`tp_richcompare`: a subtype inherits both of - :attr:`tp_richcompare` and :attr:`tp_hash`, when the subtype's - :attr:`tp_richcompare` and :attr:`tp_hash` are both *NULL*. + :c:member:`~PyTypeObject.tp_richcompare`: a subtype inherits both of + :c:member:`~PyTypeObject.tp_richcompare` and :c:member:`~PyTypeObject.tp_hash`, when the subtype's + :c:member:`~PyTypeObject.tp_richcompare` and :c:member:`~PyTypeObject.tp_hash` are both *NULL*. .. c:member:: ternaryfunc PyTypeObject.tp_call @@ -363,9 +362,9 @@ type objects) *must* have the :attr:`ob_size` field. convenient to set this field to :c:func:`PyObject_GenericGetAttr`, which implements the normal way of looking for object attributes. - This field is inherited by subtypes together with :attr:`tp_getattr`: a subtype - inherits both :attr:`tp_getattr` and :attr:`tp_getattro` from its base type when - the subtype's :attr:`tp_getattr` and :attr:`tp_getattro` are both *NULL*. + This field is inherited by subtypes together with :c:member:`~PyTypeObject.tp_getattr`: a subtype + inherits both :c:member:`~PyTypeObject.tp_getattr` and :c:member:`~PyTypeObject.tp_getattro` from its base type when + the subtype's :c:member:`~PyTypeObject.tp_getattr` and :c:member:`~PyTypeObject.tp_getattro` are both *NULL*. .. c:member:: setattrofunc PyTypeObject.tp_setattro @@ -376,9 +375,9 @@ type objects) *must* have the :attr:`ob_size` field. convenient to set this field to :c:func:`PyObject_GenericSetAttr`, which implements the normal way of setting object attributes. - This field is inherited by subtypes together with :attr:`tp_setattr`: a subtype - inherits both :attr:`tp_setattr` and :attr:`tp_setattro` from its base type when - the subtype's :attr:`tp_setattr` and :attr:`tp_setattro` are both *NULL*. + This field is inherited by subtypes together with :c:member:`~PyTypeObject.tp_setattr`: a subtype + inherits both :c:member:`~PyTypeObject.tp_setattr` and :c:member:`~PyTypeObject.tp_setattro` from its base type when + the subtype's :c:member:`~PyTypeObject.tp_setattr` and :c:member:`~PyTypeObject.tp_setattro` are both *NULL*. .. c:member:: PyBufferProcs* PyTypeObject.tp_as_buffer @@ -387,7 +386,7 @@ type objects) *must* have the :attr:`ob_size` field. which implement the buffer interface. These fields are documented in :ref:`buffer-structs`. - The :attr:`tp_as_buffer` field is not inherited, but the contained fields are + The :c:member:`~PyTypeObject.tp_as_buffer` field is not inherited, but the contained fields are inherited individually. @@ -396,8 +395,8 @@ type objects) *must* have the :attr:`ob_size` field. This field is a bit mask of various flags. Some flags indicate variant semantics for certain situations; others are used to indicate that certain fields in the type object (or in the extension structures referenced via - :attr:`tp_as_number`, :attr:`tp_as_sequence`, :attr:`tp_as_mapping`, and - :attr:`tp_as_buffer`) that were historically not always present are valid; if + :c:member:`~PyTypeObject.tp_as_number`, :c:member:`~PyTypeObject.tp_as_sequence`, :c:member:`~PyTypeObject.tp_as_mapping`, and + :c:member:`~PyTypeObject.tp_as_buffer`) that were historically not always present are valid; if such a flag bit is clear, the type fields it guards must not be accessed and must be considered to have a zero or *NULL* value instead. @@ -407,13 +406,13 @@ type objects) *must* have the :attr:`ob_size` field. inherited if the extension structure is inherited, i.e. the base type's value of the flag bit is copied into the subtype together with a pointer to the extension structure. The :const:`Py_TPFLAGS_HAVE_GC` flag bit is inherited together with - the :attr:`tp_traverse` and :attr:`tp_clear` fields, i.e. if the + the :c:member:`~PyTypeObject.tp_traverse` and :c:member:`~PyTypeObject.tp_clear` fields, i.e. if the :const:`Py_TPFLAGS_HAVE_GC` flag bit is clear in the subtype and the - :attr:`tp_traverse` and :attr:`tp_clear` fields in the subtype exist and have + :c:member:`~PyTypeObject.tp_traverse` and :c:member:`~PyTypeObject.tp_clear` fields in the subtype exist and have *NULL* values. The following bit masks are currently defined; these can be ORed together using - the ``|`` operator to form the value of the :attr:`tp_flags` field. The macro + the ``|`` operator to form the value of the :c:member:`~PyTypeObject.tp_flags` field. The macro :c:func:`PyType_HasFeature` takes a type and a flags value, *tp* and *f*, and checks whether ``tp->tp_flags & f`` is non-zero. @@ -453,7 +452,7 @@ type objects) *must* have the :attr:`ob_size` field. is set, instances must be created using :c:func:`PyObject_GC_New` and destroyed using :c:func:`PyObject_GC_Del`. More information in section :ref:`supporting-cycle-detection`. This bit also implies that the - GC-related fields :attr:`tp_traverse` and :attr:`tp_clear` are present in + GC-related fields :c:member:`~PyTypeObject.tp_traverse` and :c:member:`~PyTypeObject.tp_clear` are present in the type object. @@ -481,8 +480,8 @@ type objects) *must* have the :attr:`ob_size` field. about Python's garbage collection scheme can be found in section :ref:`supporting-cycle-detection`. - The :attr:`tp_traverse` pointer is used by the garbage collector to detect - reference cycles. A typical implementation of a :attr:`tp_traverse` function + The :c:member:`~PyTypeObject.tp_traverse` pointer is used by the garbage collector to detect + reference cycles. A typical implementation of a :c:member:`~PyTypeObject.tp_traverse` function simply calls :c:func:`Py_VISIT` on each of the instance's members that are Python objects. For example, this is function :c:func:`local_traverse` from the :mod:`_thread` extension module:: @@ -502,15 +501,15 @@ type objects) *must* have the :attr:`ob_size` field. On the other hand, even if you know a member can never be part of a cycle, as a debugging aid you may want to visit it anyway just so the :mod:`gc` module's - :func:`get_referents` function will include it. + :func:`~gc.get_referents` function will include it. Note that :c:func:`Py_VISIT` requires the *visit* and *arg* parameters to :c:func:`local_traverse` to have these specific names; don't name them just anything. - This field is inherited by subtypes together with :attr:`tp_clear` and the - :const:`Py_TPFLAGS_HAVE_GC` flag bit: the flag bit, :attr:`tp_traverse`, and - :attr:`tp_clear` are all inherited from the base type if they are all zero in + This field is inherited by subtypes together with :c:member:`~PyTypeObject.tp_clear` and the + :const:`Py_TPFLAGS_HAVE_GC` flag bit: the flag bit, :c:member:`~PyTypeObject.tp_traverse`, and + :c:member:`~PyTypeObject.tp_clear` are all inherited from the base type if they are all zero in the subtype. @@ -519,17 +518,17 @@ type objects) *must* have the :attr:`ob_size` field. An optional pointer to a clear function for the garbage collector. This is only used if the :const:`Py_TPFLAGS_HAVE_GC` flag bit is set. - The :attr:`tp_clear` member function is used to break reference cycles in cyclic - garbage detected by the garbage collector. Taken together, all :attr:`tp_clear` + The :c:member:`~PyTypeObject.tp_clear` member function is used to break reference cycles in cyclic + garbage detected by the garbage collector. Taken together, all :c:member:`~PyTypeObject.tp_clear` functions in the system must combine to break all reference cycles. This is - subtle, and if in any doubt supply a :attr:`tp_clear` function. For example, - the tuple type does not implement a :attr:`tp_clear` function, because it's + subtle, and if in any doubt supply a :c:member:`~PyTypeObject.tp_clear` function. For example, + the tuple type does not implement a :c:member:`~PyTypeObject.tp_clear` function, because it's possible to prove that no reference cycle can be composed entirely of tuples. - Therefore the :attr:`tp_clear` functions of other types must be sufficient to + Therefore the :c:member:`~PyTypeObject.tp_clear` functions of other types must be sufficient to break any cycle containing a tuple. This isn't immediately obvious, and there's - rarely a good reason to avoid implementing :attr:`tp_clear`. + rarely a good reason to avoid implementing :c:member:`~PyTypeObject.tp_clear`. - Implementations of :attr:`tp_clear` should drop the instance's references to + Implementations of :c:member:`~PyTypeObject.tp_clear` should drop the instance's references to those of its members that may be Python objects, and set its pointers to those members to *NULL*, as in the following example:: @@ -554,18 +553,18 @@ type objects) *must* have the :attr:`ob_size` field. so that *self* knows the contained object can no longer be used. The :c:func:`Py_CLEAR` macro performs the operations in a safe order. - Because the goal of :attr:`tp_clear` functions is to break reference cycles, + Because the goal of :c:member:`~PyTypeObject.tp_clear` functions is to break reference cycles, it's not necessary to clear contained objects like Python strings or Python integers, which can't participate in reference cycles. On the other hand, it may be convenient to clear all contained Python objects, and write the type's - :attr:`tp_dealloc` function to invoke :attr:`tp_clear`. + :c:member:`~PyTypeObject.tp_dealloc` function to invoke :c:member:`~PyTypeObject.tp_clear`. More information about Python's garbage collection scheme can be found in section :ref:`supporting-cycle-detection`. - This field is inherited by subtypes together with :attr:`tp_traverse` and the - :const:`Py_TPFLAGS_HAVE_GC` flag bit: the flag bit, :attr:`tp_traverse`, and - :attr:`tp_clear` are all inherited from the base type if they are all zero in + This field is inherited by subtypes together with :c:member:`~PyTypeObject.tp_traverse` and the + :const:`Py_TPFLAGS_HAVE_GC` flag bit: the flag bit, :c:member:`~PyTypeObject.tp_traverse`, and + :c:member:`~PyTypeObject.tp_clear` are all inherited from the base type if they are all zero in the subtype. @@ -585,13 +584,13 @@ type objects) *must* have the :attr:`ob_size` field. comparisons makes sense (e.g. ``==`` and ``!=``, but not ``<`` and friends), directly raise :exc:`TypeError` in the rich comparison function. - This field is inherited by subtypes together with :attr:`tp_hash`: - a subtype inherits :attr:`tp_richcompare` and :attr:`tp_hash` when - the subtype's :attr:`tp_richcompare` and :attr:`tp_hash` are both + This field is inherited by subtypes together with :c:member:`~PyTypeObject.tp_hash`: + a subtype inherits :c:member:`~PyTypeObject.tp_richcompare` and :c:member:`~PyTypeObject.tp_hash` when + the subtype's :c:member:`~PyTypeObject.tp_richcompare` and :c:member:`~PyTypeObject.tp_hash` are both *NULL*. The following constants are defined to be used as the third argument for - :attr:`tp_richcompare` and for :c:func:`PyObject_RichCompare`: + :c:member:`~PyTypeObject.tp_richcompare` and for :c:func:`PyObject_RichCompare`: +----------------+------------+ | Constant | Comparison | @@ -619,26 +618,26 @@ type objects) *must* have the :attr:`ob_size` field. instance structure needs to include a field of type :c:type:`PyObject\*` which is initialized to *NULL*. - Do not confuse this field with :attr:`tp_weaklist`; that is the list head for + Do not confuse this field with :c:member:`~PyTypeObject.tp_weaklist`; that is the list head for weak references to the type object itself. This field is inherited by subtypes, but see the rules listed below. A subtype may override this offset; this means that the subtype uses a different weak reference list head than the base type. Since the list head is always found via - :attr:`tp_weaklistoffset`, this should not be a problem. + :c:member:`~PyTypeObject.tp_weaklistoffset`, this should not be a problem. - When a type defined by a class statement has no :attr:`__slots__` declaration, + When a type defined by a class statement has no :attr:`~object.__slots__` declaration, and none of its base types are weakly referenceable, the type is made weakly referenceable by adding a weak reference list head slot to the instance layout - and setting the :attr:`tp_weaklistoffset` of that slot's offset. + and setting the :c:member:`~PyTypeObject.tp_weaklistoffset` of that slot's offset. When a type's :attr:`__slots__` declaration contains a slot named :attr:`__weakref__`, that slot becomes the weak reference list head for instances of the type, and the slot's offset is stored in the type's - :attr:`tp_weaklistoffset`. + :c:member:`~PyTypeObject.tp_weaklistoffset`. When a type's :attr:`__slots__` declaration does not contain a slot named - :attr:`__weakref__`, the type inherits its :attr:`tp_weaklistoffset` from its + :attr:`__weakref__`, the type inherits its :c:member:`~PyTypeObject.tp_weaklistoffset` from its base type. .. c:member:: getiterfunc PyTypeObject.tp_iter @@ -660,7 +659,7 @@ type objects) *must* have the :attr:`ob_size` field. *NULL* too. Its presence signals that the instances of this type are iterators. - Iterator types should also define the :attr:`tp_iter` function, and that + Iterator types should also define the :c:member:`~PyTypeObject.tp_iter` function, and that function should return the iterator instance itself (not a new iterator instance). @@ -675,7 +674,7 @@ type objects) *must* have the :attr:`ob_size` field. structures, declaring regular methods of this type. For each entry in the array, an entry is added to the type's dictionary (see - :attr:`tp_dict` below) containing a method descriptor. + :c:member:`~PyTypeObject.tp_dict` below) containing a method descriptor. This field is not inherited by subtypes (methods are inherited through a different mechanism). @@ -688,7 +687,7 @@ type objects) *must* have the :attr:`ob_size` field. this type. For each entry in the array, an entry is added to the type's dictionary (see - :attr:`tp_dict` below) containing a member descriptor. + :c:member:`~PyTypeObject.tp_dict` below) containing a member descriptor. This field is not inherited by subtypes (members are inherited through a different mechanism). @@ -700,7 +699,7 @@ type objects) *must* have the :attr:`ob_size` field. structures, declaring computed attributes of instances of this type. For each entry in the array, an entry is added to the type's dictionary (see - :attr:`tp_dict` below) containing a getset descriptor. + :c:member:`~PyTypeObject.tp_dict` below) containing a getset descriptor. This field is not inherited by subtypes (computed attributes are inherited through a different mechanism). @@ -748,7 +747,7 @@ type objects) *must* have the :attr:`ob_size` field. .. warning:: It is not safe to use :c:func:`PyDict_SetItem` on or otherwise modify - :attr:`tp_dict` with the dictionary C-API. + :c:member:`~PyTypeObject.tp_dict` with the dictionary C-API. .. c:member:: descrgetfunc PyTypeObject.tp_descr_get @@ -784,7 +783,7 @@ type objects) *must* have the :attr:`ob_size` field. the instance variable dictionary; this offset is used by :c:func:`PyObject_GenericGetAttr`. - Do not confuse this field with :attr:`tp_dict`; that is the dictionary for + Do not confuse this field with :c:member:`~PyTypeObject.tp_dict`; that is the dictionary for attributes of the type object itself. If the value of this field is greater than zero, it specifies the offset from @@ -793,20 +792,20 @@ type objects) *must* have the :attr:`ob_size` field. offset is more expensive to use, and should only be used when the instance structure contains a variable-length part. This is used for example to add an instance variable dictionary to subtypes of :class:`str` or :class:`tuple`. Note - that the :attr:`tp_basicsize` field should account for the dictionary added to + that the :c:member:`~PyTypeObject.tp_basicsize` field should account for the dictionary added to the end in that case, even though the dictionary is not included in the basic object layout. On a system with a pointer size of 4 bytes, - :attr:`tp_dictoffset` should be set to ``-4`` to indicate that the dictionary is + :c:member:`~PyTypeObject.tp_dictoffset` should be set to ``-4`` to indicate that the dictionary is at the very end of the structure. The real dictionary offset in an instance can be computed from a negative - :attr:`tp_dictoffset` as follows:: + :c:member:`~PyTypeObject.tp_dictoffset` as follows:: dictoffset = tp_basicsize + abs(ob_size)*tp_itemsize + tp_dictoffset if dictoffset is not aligned on sizeof(void*): round up to sizeof(void*) - where :attr:`tp_basicsize`, :attr:`tp_itemsize` and :attr:`tp_dictoffset` are + where :c:member:`~PyTypeObject.tp_basicsize`, :c:member:`~PyTypeObject.tp_itemsize` and :c:member:`~PyTypeObject.tp_dictoffset` are taken from the type object, and :attr:`ob_size` is taken from the instance. The absolute value is taken because ints use the sign of :attr:`ob_size` to store the sign of the number. (There's never a need to do this calculation @@ -815,17 +814,17 @@ type objects) *must* have the :attr:`ob_size` field. This field is inherited by subtypes, but see the rules listed below. A subtype may override this offset; this means that the subtype instances store the dictionary at a difference offset than the base type. Since the dictionary is - always found via :attr:`tp_dictoffset`, this should not be a problem. + always found via :c:member:`~PyTypeObject.tp_dictoffset`, this should not be a problem. - When a type defined by a class statement has no :attr:`__slots__` declaration, + When a type defined by a class statement has no :attr:`~object.__slots__` declaration, and none of its base types has an instance variable dictionary, a dictionary - slot is added to the instance layout and the :attr:`tp_dictoffset` is set to + slot is added to the instance layout and the :c:member:`~PyTypeObject.tp_dictoffset` is set to that slot's offset. When a type defined by a class statement has a :attr:`__slots__` declaration, - the type inherits its :attr:`tp_dictoffset` from its base type. + the type inherits its :c:member:`~PyTypeObject.tp_dictoffset` from its base type. - (Adding a slot named :attr:`__dict__` to the :attr:`__slots__` declaration does + (Adding a slot named :attr:`~object.__dict__` to the :attr:`__slots__` declaration does not have the expected effect, it just causes confusion. Maybe this should be added as a feature just like :attr:`__weakref__` though.) @@ -847,12 +846,12 @@ type objects) *must* have the :attr:`ob_size` field. arguments represent positional and keyword arguments of the call to :meth:`__init__`. - The :attr:`tp_init` function, if not *NULL*, is called when an instance is - created normally by calling its type, after the type's :attr:`tp_new` function - has returned an instance of the type. If the :attr:`tp_new` function returns an + The :c:member:`~PyTypeObject.tp_init` function, if not *NULL*, is called when an instance is + created normally by calling its type, after the type's :c:member:`~PyTypeObject.tp_new` function + has returned an instance of the type. If the :c:member:`~PyTypeObject.tp_new` function returns an instance of some other type that is not a subtype of the original type, no - :attr:`tp_init` function is called; if :attr:`tp_new` returns an instance of a - subtype of the original type, the subtype's :attr:`tp_init` is called. + :c:member:`~PyTypeObject.tp_init` function is called; if :c:member:`~PyTypeObject.tp_new` returns an instance of a + subtype of the original type, the subtype's :c:member:`~PyTypeObject.tp_init` is called. This field is inherited by subtypes. @@ -869,14 +868,14 @@ type objects) *must* have the :attr:`ob_size` field. initialization. It should return a pointer to a block of memory of adequate length for the instance, suitably aligned, and initialized to zeros, but with :attr:`ob_refcnt` set to ``1`` and :attr:`ob_type` set to the type argument. If - the type's :attr:`tp_itemsize` is non-zero, the object's :attr:`ob_size` field + the type's :c:member:`~PyTypeObject.tp_itemsize` is non-zero, the object's :attr:`ob_size` field should be initialized to *nitems* and the length of the allocated memory block should be ``tp_basicsize + nitems*tp_itemsize``, rounded up to a multiple of ``sizeof(void*)``; otherwise, *nitems* is not used and the length of the block - should be :attr:`tp_basicsize`. + should be :c:member:`~PyTypeObject.tp_basicsize`. Do not use this function to do any other instance initialization, not even to - allocate additional memory; that should be done by :attr:`tp_new`. + allocate additional memory; that should be done by :c:member:`~PyTypeObject.tp_new`. This field is inherited by static subtypes, but not by dynamic subtypes (subtypes created by a class statement); in the latter, this field is always set @@ -898,20 +897,20 @@ type objects) *must* have the :attr:`ob_size` field. The subtype argument is the type of the object being created; the *args* and *kwds* arguments represent positional and keyword arguments of the call to the - type. Note that subtype doesn't have to equal the type whose :attr:`tp_new` + type. Note that subtype doesn't have to equal the type whose :c:member:`~PyTypeObject.tp_new` function is called; it may be a subtype of that type (but not an unrelated type). - The :attr:`tp_new` function should call ``subtype->tp_alloc(subtype, nitems)`` + The :c:member:`~PyTypeObject.tp_new` function should call ``subtype->tp_alloc(subtype, nitems)`` to allocate space for the object, and then do only as much further initialization as is absolutely necessary. Initialization that can safely be - ignored or repeated should be placed in the :attr:`tp_init` handler. A good + ignored or repeated should be placed in the :c:member:`~PyTypeObject.tp_init` handler. A good rule of thumb is that for immutable types, all initialization should take place - in :attr:`tp_new`, while for mutable types, most initialization should be - deferred to :attr:`tp_init`. + in :c:member:`~PyTypeObject.tp_new`, while for mutable types, most initialization should be + deferred to :c:member:`~PyTypeObject.tp_init`. This field is inherited by subtypes, except it is not inherited by static types - whose :attr:`tp_base` is *NULL* or ``&PyBaseObject_Type``. + whose :c:member:`~PyTypeObject.tp_base` is *NULL* or ``&PyBaseObject_Type``. .. c:member:: destructor PyTypeObject.tp_free @@ -935,7 +934,7 @@ type objects) *must* have the :attr:`ob_size` field. The garbage collector needs to know whether a particular object is collectible or not. Normally, it is sufficient to look at the object's type's - :attr:`tp_flags` field, and check the :const:`Py_TPFLAGS_HAVE_GC` flag bit. But + :c:member:`~PyTypeObject.tp_flags` field, and check the :const:`Py_TPFLAGS_HAVE_GC` flag bit. But some types have a mixture of statically and dynamically allocated instances, and the statically allocated instances are not collectible. Such types should define this function; it should return ``1`` for a collectible instance, and @@ -1006,7 +1005,7 @@ subtypes. .. c:member:: PyTypeObject* PyTypeObject.tp_next - Pointer to the next type object with a non-zero :attr:`tp_allocs` field. + Pointer to the next type object with a non-zero :c:member:`~PyTypeObject.tp_allocs` field. Also, note that, in a garbage collected Python, tp_dealloc may be called from any Python thread, not just the thread which created the object (if the object @@ -1145,13 +1144,13 @@ Sequence Object Structures This function is used by :c:func:`PySequence_Concat` and has the same signature. It is also used by the ``+`` operator, after trying the numeric - addition via the :attr:`tp_as_number.nb_add` slot. + addition via the :c:member:`~PyTypeObject.tp_as_number.nb_add` slot. .. c:member:: ssizeargfunc PySequenceMethods.sq_repeat This function is used by :c:func:`PySequence_Repeat` and has the same signature. It is also used by the ``*`` operator, after trying numeric - multiplication via the :attr:`tp_as_number.nb_mul` slot. + multiplication via the :c:member:`~PyTypeObject.tp_as_number.nb_mul` slot. .. c:member:: ssizeargfunc PySequenceMethods.sq_item @@ -1198,46 +1197,88 @@ Buffer Object Structures .. sectionauthor:: Greg J. Stein <greg@lyra.org> .. sectionauthor:: Benjamin Peterson +.. sectionauthor:: Stefan Krah +.. c:type:: PyBufferProcs -The :ref:`buffer interface <bufferobjects>` exports a model where an object can expose its internal -data. + This structure holds pointers to the functions required by the + :ref:`Buffer protocol <bufferobjects>`. The protocol defines how + an exporter object can expose its internal data to consumer objects. -If an object does not export the buffer interface, then its :attr:`tp_as_buffer` -member in the :c:type:`PyTypeObject` structure should be *NULL*. Otherwise, the -:attr:`tp_as_buffer` will point to a :c:type:`PyBufferProcs` structure. +.. c:member:: getbufferproc PyBufferProcs.bf_getbuffer + The signature of this function is:: -.. c:type:: PyBufferProcs + int (PyObject *exporter, Py_buffer *view, int flags); + + Handle a request to *exporter* to fill in *view* as specified by *flags*. + Except for point (3), an implementation of this function MUST take these + steps: + + (1) Check if the request can be met. If not, raise :c:data:`PyExc_BufferError`, + set :c:data:`view->obj` to *NULL* and return -1. + + (2) Fill in the requested fields. + + (3) Increment an internal counter for the number of exports. + + (4) Set :c:data:`view->obj` to *exporter* and increment :c:data:`view->obj`. + + (5) Return 0. + + If *exporter* is part of a chain or tree of buffer providers, two main + schemes can be used: + + * Re-export: Each member of the tree acts as the exporting object and + sets :c:data:`view->obj` to a new reference to itself. + + * Redirect: The buffer request is redirected to the root object of the + tree. Here, :c:data:`view->obj` will be a new reference to the root + object. + + The individual fields of *view* are described in section + :ref:`Buffer structure <buffer-structure>`, the rules how an exporter + must react to specific requests are in section + :ref:`Buffer request types <buffer-request-types>`. + + All memory pointed to in the :c:type:`Py_buffer` structure belongs to + the exporter and must remain valid until there are no consumers left. + :c:member:`~Py_buffer.format`, :c:member:`~Py_buffer.shape`, + :c:member:`~Py_buffer.strides`, :c:member:`~Py_buffer.suboffsets` + and :c:member:`~Py_buffer.internal` + are read-only for the consumer. + + :c:func:`PyBuffer_FillInfo` provides an easy way of exposing a simple + bytes buffer while dealing correctly with all request types. + + :c:func:`PyObject_GetBuffer` is the interface for the consumer that + wraps this function. + +.. c:member:: releasebufferproc PyBufferProcs.bf_releasebuffer + + The signature of this function is:: + + void (PyObject *exporter, Py_buffer *view); - Structure used to hold the function pointers which define an implementation of - the buffer protocol. + Handle a request to release the resources of the buffer. If no resources + need to be released, :c:member:`PyBufferProcs.bf_releasebuffer` may be + *NULL*. Otherwise, a standard implementation of this function will take + these optional steps: - .. c:member:: getbufferproc bf_getbuffer + (1) Decrement an internal counter for the number of exports. - This should fill a :c:type:`Py_buffer` with the necessary data for - exporting the type. The signature of :data:`getbufferproc` is ``int - (PyObject *obj, Py_buffer *view, int flags)``. *obj* is the object to - export, *view* is the :c:type:`Py_buffer` struct to fill, and *flags* gives - the conditions the caller wants the memory under. (See - :c:func:`PyObject_GetBuffer` for all flags.) :c:member:`bf_getbuffer` is - responsible for filling *view* with the appropriate information. - (:c:func:`PyBuffer_FillView` can be used in simple cases.) See - :c:type:`Py_buffer`\s docs for what needs to be filled in. + (2) If the counter is 0, free all memory associated with *view*. + The exporter MUST use the :c:member:`~Py_buffer.internal` field to keep + track of buffer-specific resources. This field is guaranteed to remain + constant, while a consumer MAY pass a copy of the original buffer as the + *view* argument. - .. c:member:: releasebufferproc bf_releasebuffer - This should release the resources of the buffer. The signature of - :c:data:`releasebufferproc` is ``void (PyObject *obj, Py_buffer *view)``. - If the :c:data:`bf_releasebuffer` function is not provided (i.e. it is - *NULL*), then it does not ever need to be called. + This function MUST NOT decrement :c:data:`view->obj`, since that is + done automatically in :c:func:`PyBuffer_Release` (this scheme is + useful for breaking reference cycles). - The exporter of the buffer interface must make sure that any memory - pointed to in the :c:type:`Py_buffer` structure remains valid until - releasebuffer is called. Exporters will need to define a - :c:data:`bf_releasebuffer` function if they can re-allocate their memory, - strides, shape, suboffsets, or format variables which they might share - through the struct bufferinfo. - See :c:func:`PyBuffer_Release`. + :c:func:`PyBuffer_Release` is the interface for the consumer that + wraps this function. diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst index d1b57d9..3649cfb 100644 --- a/Doc/c-api/unicode.rst +++ b/Doc/c-api/unicode.rst @@ -6,38 +6,72 @@ Unicode Objects and Codecs -------------------------- .. sectionauthor:: Marc-André Lemburg <mal@lemburg.com> +.. sectionauthor:: Georg Brandl <georg@python.org> Unicode Objects ^^^^^^^^^^^^^^^ +Since the implementation of :pep:`393` in Python 3.3, Unicode objects internally +use a variety of representations, in order to allow handling the complete range +of Unicode characters while staying memory efficient. There are special cases +for strings where all code points are below 128, 256, or 65536; otherwise, code +points must be below 1114112 (which is the full Unicode range). + +:c:type:`Py_UNICODE*` and UTF-8 representations are created on demand and cached +in the Unicode object. The :c:type:`Py_UNICODE*` representation is deprecated +and inefficient; it should be avoided in performance- or memory-sensitive +situations. + +Due to the transition between the old APIs and the new APIs, unicode objects +can internally be in two states depending on how they were created: + +* "canonical" unicode objects are all objects created by a non-deprecated + unicode API. They use the most efficient representation allowed by the + implementation. + +* "legacy" unicode objects have been created through one of the deprecated + APIs (typically :c:func:`PyUnicode_FromUnicode`) and only bear the + :c:type:`Py_UNICODE*` representation; you will have to call + :c:func:`PyUnicode_READY` on them before calling any other API. + + Unicode Type """""""""""" These are the basic Unicode object types used for the Unicode implementation in Python: +.. c:type:: Py_UCS4 + Py_UCS2 + Py_UCS1 + + These types are typedefs for unsigned integer types wide enough to contain + characters of 32 bits, 16 bits and 8 bits, respectively. When dealing with + single Unicode characters, use :c:type:`Py_UCS4`. + + .. versionadded:: 3.3 + .. c:type:: Py_UNICODE - This type represents the storage type which is used by Python internally as - basis for holding Unicode ordinals. Python's default builds use a 16-bit type - for :c:type:`Py_UNICODE` and store Unicode values internally as UCS2. It is also - possible to build a UCS4 version of Python (most recent Linux distributions come - with UCS4 builds of Python). These builds then use a 32-bit type for - :c:type:`Py_UNICODE` and store Unicode data internally as UCS4. On platforms - where :c:type:`wchar_t` is available and compatible with the chosen Python - Unicode build variant, :c:type:`Py_UNICODE` is a typedef alias for - :c:type:`wchar_t` to enhance native platform compatibility. On all other - platforms, :c:type:`Py_UNICODE` is a typedef alias for either :c:type:`unsigned - short` (UCS2) or :c:type:`unsigned long` (UCS4). + This is a typedef of :c:type:`wchar_t`, which is a 16-bit type or 32-bit type + depending on the platform. -Note that UCS2 and UCS4 Python builds are not binary compatible. Please keep -this in mind when writing extensions or interfaces. + .. versionchanged:: 3.3 + In previous versions, this was a 16-bit type or a 32-bit type depending on + whether you selected a "narrow" or "wide" Unicode version of Python at + build time. -.. c:type:: PyUnicodeObject +.. c:type:: PyASCIIObject + PyCompactUnicodeObject + PyUnicodeObject - This subtype of :c:type:`PyObject` represents a Python Unicode object. + These subtypes of :c:type:`PyObject` represent a Python Unicode object. In + almost all cases, they shouldn't be used directly, since all API functions + that deal with Unicode objects take and return :c:type:`PyObject` pointers. + + .. versionadded:: 3.3 .. c:var:: PyTypeObject PyUnicode_Type @@ -45,10 +79,10 @@ this in mind when writing extensions or interfaces. This instance of :c:type:`PyTypeObject` represents the Python Unicode type. It is exposed to Python code as ``str``. + The following APIs are really C macros and can be used to do fast checks and to access internal read-only data of Unicode objects: - .. c:function:: int PyUnicode_Check(PyObject *o) Return true if the object *o* is a Unicode object or an instance of a Unicode @@ -61,28 +95,106 @@ access internal read-only data of Unicode objects: subtype. -.. c:function:: Py_ssize_t PyUnicode_GET_SIZE(PyObject *o) +.. c:function:: int PyUnicode_READY(PyObject *o) - Return the size of the object. *o* has to be a :c:type:`PyUnicodeObject` (not - checked). + Ensure the string object *o* is in the "canonical" representation. This is + required before using any of the access macros described below. + .. XXX expand on when it is not required -.. c:function:: Py_ssize_t PyUnicode_GET_DATA_SIZE(PyObject *o) + Returns 0 on success and -1 with an exception set on failure, which in + particular happens if memory allocation fails. - Return the size of the object's internal buffer in bytes. *o* has to be a - :c:type:`PyUnicodeObject` (not checked). + .. versionadded:: 3.3 -.. c:function:: Py_UNICODE* PyUnicode_AS_UNICODE(PyObject *o) +.. c:function:: Py_ssize_t PyUnicode_GET_LENGTH(PyObject *o) + + Return the length of the Unicode string, in code points. *o* has to be a + Unicode object in the "canonical" representation (not checked). + + .. versionadded:: 3.3 + + +.. c:function:: Py_UCS1* PyUnicode_1BYTE_DATA(PyObject *o) + Py_UCS2* PyUnicode_2BYTE_DATA(PyObject *o) + Py_UCS4* PyUnicode_4BYTE_DATA(PyObject *o) + + Return a pointer to the canonical representation cast to UCS1, UCS2 or UCS4 + integer types for direct character access. No checks are performed if the + canonical representation has the correct character size; use + :c:func:`PyUnicode_KIND` to select the right macro. Make sure + :c:func:`PyUnicode_READY` has been called before accessing this. + + .. versionadded:: 3.3 + + +.. c:macro:: PyUnicode_WCHAR_KIND + PyUnicode_1BYTE_KIND + PyUnicode_2BYTE_KIND + PyUnicode_4BYTE_KIND + + Return values of the :c:func:`PyUnicode_KIND` macro. + + .. versionadded:: 3.3 + + +.. c:function:: int PyUnicode_KIND(PyObject *o) + + Return one of the PyUnicode kind constants (see above) that indicate how many + bytes per character this Unicode object uses to store its data. *o* has to + be a Unicode object in the "canonical" representation (not checked). + + .. XXX document "0" return value? + + .. versionadded:: 3.3 + + +.. c:function:: void* PyUnicode_DATA(PyObject *o) + + Return a void pointer to the raw unicode buffer. *o* has to be a Unicode + object in the "canonical" representation (not checked). + + .. versionadded:: 3.3 + + +.. c:function:: void PyUnicode_WRITE(int kind, void *data, Py_ssize_t index, \ + Py_UCS4 value) + + Write into a canonical representation *data* (as obtained with + :c:func:`PyUnicode_DATA`). This macro does not do any sanity checks and is + intended for usage in loops. The caller should cache the *kind* value and + *data* pointer as obtained from other macro calls. *index* is the index in + the string (starts at 0) and *value* is the new code point value which should + be written to that location. + + .. versionadded:: 3.3 + + +.. c:function:: Py_UCS4 PyUnicode_READ(int kind, void *data, Py_ssize_t index) + + Read a code point from a canonical representation *data* (as obtained with + :c:func:`PyUnicode_DATA`). No checks or ready calls are performed. + + .. versionadded:: 3.3 + + +.. c:function:: Py_UCS4 PyUnicode_READ_CHAR(PyObject *o, Py_ssize_t index) + + Read a character from a Unicode object *o*, which must be in the "canonical" + representation. This is less efficient than :c:func:`PyUnicode_READ` if you + do multiple consecutive reads. + + .. versionadded:: 3.3 - Return a pointer to the internal :c:type:`Py_UNICODE` buffer of the object. *o* - has to be a :c:type:`PyUnicodeObject` (not checked). +.. c:function:: PyUnicode_MAX_CHAR_VALUE(PyObject *o) -.. c:function:: const char* PyUnicode_AS_DATA(PyObject *o) + Return the maximum code point that is suitable for creating another string + based on *o*, which must be in the "canonical" representation. This is + always an approximation but more efficient than iterating over the string. - Return a pointer to the internal buffer of the object. *o* has to be a - :c:type:`PyUnicodeObject` (not checked). + .. versionadded:: 3.3 .. c:function:: int PyUnicode_ClearFreeList() @@ -90,6 +202,46 @@ access internal read-only data of Unicode objects: Clear the free list. Return the total number of freed items. +.. c:function:: Py_ssize_t PyUnicode_GET_SIZE(PyObject *o) + + Return the size of the deprecated :c:type:`Py_UNICODE` representation, in + code units (this includes surrogate pairs as 2 units). *o* has to be a + Unicode object (not checked). + + .. deprecated-removed:: 3.3 4.0 + Part of the old-style Unicode API, please migrate to using + :c:func:`PyUnicode_GET_LENGTH`. + + +.. c:function:: Py_ssize_t PyUnicode_GET_DATA_SIZE(PyObject *o) + + Return the size of the deprecated :c:type:`Py_UNICODE` representation in + bytes. *o* has to be a Unicode object (not checked). + + .. deprecated-removed:: 3.3 4.0 + Part of the old-style Unicode API, please migrate to using + :c:func:`PyUnicode_GET_LENGTH`. + + +.. c:function:: Py_UNICODE* PyUnicode_AS_UNICODE(PyObject *o) + const char* PyUnicode_AS_DATA(PyObject *o) + + Return a pointer to a :c:type:`Py_UNICODE` representation of the object. The + ``AS_DATA`` form casts the pointer to :c:type:`const char *`. *o* has to be + a Unicode object (not checked). + + .. versionchanged:: 3.3 + This macro is now inefficient -- because in many cases the + :c:type:`Py_UNICODE` representation does not exist and needs to be created + -- and can fail (return *NULL* with an exception set). Try to port the + code to use the new :c:func:`PyUnicode_nBYTE_DATA` macros or use + :c:func:`PyUnicode_WRITE` or :c:func:`PyUnicode_READ`. + + .. deprecated-removed:: 3.3 4.0 + Part of the old-style Unicode API, please migrate to using the + :c:func:`PyUnicode_nBYTE_DATA` family of macros. + + Unicode Character Properties """""""""""""""""""""""""""" @@ -166,16 +318,25 @@ These APIs can be used for fast direct character conversions: Return the character *ch* converted to lower case. + .. deprecated:: 3.3 + This function uses simple case mappings. + .. c:function:: Py_UNICODE Py_UNICODE_TOUPPER(Py_UNICODE ch) Return the character *ch* converted to upper case. + .. deprecated:: 3.3 + This function uses simple case mappings. + .. c:function:: Py_UNICODE Py_UNICODE_TOTITLE(Py_UNICODE ch) Return the character *ch* converted to title case. + .. deprecated:: 3.3 + This function uses simple case mappings. + .. c:function:: int Py_UNICODE_TODECIMAL(Py_UNICODE ch) @@ -195,31 +356,66 @@ These APIs can be used for fast direct character conversions: possible. This macro does not raise exceptions. -Plain Py_UNICODE -"""""""""""""""" +These APIs can be used to work with surrogates: + +.. c:macro:: Py_UNICODE_IS_SURROGATE(ch) + + Check if *ch* is a surrogate (``0xD800 <= ch <= 0xDFFF``). + +.. c:macro:: Py_UNICODE_IS_HIGH_SURROGATE(ch) + + Check if *ch* is an high surrogate (``0xD800 <= ch <= 0xDBFF``). + +.. c:macro:: Py_UNICODE_IS_LOW_SURROGATE(ch) + + Check if *ch* is a low surrogate (``0xDC00 <= ch <= 0xDFFF``). + +.. c:macro:: Py_UNICODE_JOIN_SURROGATES(high, low) + + Join two surrogate characters and return a single Py_UCS4 value. + *high* and *low* are respectively the leading and trailing surrogates in a + surrogate pair. + + +Creating and accessing Unicode strings +"""""""""""""""""""""""""""""""""""""" To create Unicode objects and access their basic sequence properties, use these APIs: +.. c:function:: PyObject* PyUnicode_New(Py_ssize_t size, Py_UCS4 maxchar) -.. c:function:: PyObject* PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size) + Create a new Unicode object. *maxchar* should be the true maximum code point + to be placed in the string. As an approximation, it can be rounded up to the + nearest value in the sequence 127, 255, 65535, 1114111. - Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u* - may be *NULL* which causes the contents to be undefined. It is the user's - responsibility to fill in the needed data. The buffer is copied into the new - object. If the buffer is not *NULL*, the return value might be a shared object. - Therefore, modification of the resulting Unicode object is only allowed when *u* - is *NULL*. + This is the recommended way to allocate a new Unicode object. Objects + created using this function are not resizable. + + .. versionadded:: 3.3 + + +.. c:function:: PyObject* PyUnicode_FromKindAndData(int kind, const void *buffer, \ + Py_ssize_t size) + + Create a new Unicode object with the given *kind* (possible values are + :c:macro:`PyUnicode_1BYTE_KIND` etc., as returned by + :c:func:`PyUnicode_KIND`). The *buffer* must point to an array of *size* + units of 1, 2 or 4 bytes per character, as given by the kind. + + .. versionadded:: 3.3 .. c:function:: PyObject* PyUnicode_FromStringAndSize(const char *u, Py_ssize_t size) - Create a Unicode object from the char buffer *u*. The bytes will be interpreted - as being UTF-8 encoded. *u* may also be *NULL* which - causes the contents to be undefined. It is the user's responsibility to fill in - the needed data. The buffer is copied into the new object. If the buffer is not - *NULL*, the return value might be a shared object. Therefore, modification of - the resulting Unicode object is only allowed when *u* is *NULL*. + Create a Unicode object from the char buffer *u*. The bytes will be + interpreted as being UTF-8 encoded. The buffer is copied into the new + object. If the buffer is not *NULL*, the return value might be a shared + object, i.e. modification of the data is not allowed. + + If *u* is *NULL*, this function behaves like :c:func:`PyUnicode_FromUnicode` + with the buffer set to *NULL*. This usage is deprecated in favor of + :c:func:`PyUnicode_New`. .. c:function:: PyObject *PyUnicode_FromString(const char *u) @@ -243,6 +439,8 @@ APIs: .. % Similar comments apply to the %ll width modifier and .. % PY_FORMAT_LONG_LONG. + .. tabularcolumns:: |l|l|L| + +-------------------+---------------------+--------------------------------+ | Format Characters | Type | Comment | +===================+=====================+================================+ @@ -260,18 +458,27 @@ APIs: | :attr:`%ld` | long | Exactly equivalent to | | | | ``printf("%ld")``. | +-------------------+---------------------+--------------------------------+ + | :attr:`%li` | long | Exactly equivalent to | + | | | ``printf("%li")``. | + +-------------------+---------------------+--------------------------------+ | :attr:`%lu` | unsigned long | Exactly equivalent to | | | | ``printf("%lu")``. | +-------------------+---------------------+--------------------------------+ | :attr:`%lld` | long long | Exactly equivalent to | | | | ``printf("%lld")``. | +-------------------+---------------------+--------------------------------+ + | :attr:`%lli` | long long | Exactly equivalent to | + | | | ``printf("%lli")``. | + +-------------------+---------------------+--------------------------------+ | :attr:`%llu` | unsigned long long | Exactly equivalent to | | | | ``printf("%llu")``. | +-------------------+---------------------+--------------------------------+ | :attr:`%zd` | Py_ssize_t | Exactly equivalent to | | | | ``printf("%zd")``. | +-------------------+---------------------+--------------------------------+ + | :attr:`%zi` | Py_ssize_t | Exactly equivalent to | + | | | ``printf("%zi")``. | + +-------------------+---------------------+--------------------------------+ | :attr:`%zu` | size_t | Exactly equivalent to | | | | ``printf("%zu")``. | +-------------------+---------------------+--------------------------------+ @@ -322,27 +529,178 @@ APIs: .. versionchanged:: 3.2 Support for ``"%lld"`` and ``"%llu"`` added. + .. versionchanged:: 3.3 + Support for ``"%li"``, ``"%lli"`` and ``"%zi"`` added. + .. c:function:: PyObject* PyUnicode_FromFormatV(const char *format, va_list vargs) Identical to :c:func:`PyUnicode_FromFormat` except that it takes exactly two arguments. + +.. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, \ + const char *encoding, const char *errors) + + Coerce an encoded object *obj* to an Unicode object and return a reference with + incremented refcount. + + :class:`bytes`, :class:`bytearray` and other char buffer compatible objects + are decoded according to the given *encoding* and using the error handling + defined by *errors*. Both can be *NULL* to have the interface use the default + values (see the next section for details). + + All other objects, including Unicode objects, cause a :exc:`TypeError` to be + set. + + The API returns *NULL* if there was an error. The caller is responsible for + decref'ing the returned objects. + + +.. c:function:: Py_ssize_t PyUnicode_GetLength(PyObject *unicode) + + Return the length of the Unicode object, in code points. + + .. versionadded:: 3.3 + + +.. c:function:: int PyUnicode_CopyCharacters(PyObject *to, Py_ssize_t to_start, \ + PyObject *from, Py_ssize_t from_start, Py_ssize_t how_many) + + Copy characters from one Unicode object into another. This function performs + character conversion when necessary and falls back to :c:func:`memcpy` if + possible. Returns ``-1`` and sets an exception on error, otherwise returns + ``0``. + + .. versionadded:: 3.3 + + +.. c:function:: Py_ssize_t PyUnicode_Fill(PyObject *unicode, Py_ssize_t start, \ + Py_ssize_t length, Py_UCS4 fill_char) + + Fill a string with a character: write *fill_char* into + ``unicode[start:start+length]``. + + Fail if *fill_char* is bigger than the string maximum character, or if the + string has more than 1 reference. + + Return the number of written character, or return ``-1`` and raise an + exception on error. + + .. versionadded:: 3.3 + + +.. c:function:: int PyUnicode_WriteChar(PyObject *unicode, Py_ssize_t index, \ + Py_UCS4 character) + + Write a character to a string. The string must have been created through + :c:func:`PyUnicode_New`. Since Unicode strings are supposed to be immutable, + the string must not be shared, or have been hashed yet. + + This function checks that *unicode* is a Unicode object, that the index is + not out of bounds, and that the object can be modified safely (i.e. that it + its reference count is one), in contrast to the macro version + :c:func:`PyUnicode_WRITE_CHAR`. + + .. versionadded:: 3.3 + + +.. c:function:: Py_UCS4 PyUnicode_ReadChar(PyObject *unicode, Py_ssize_t index) + + Read a character from a string. This function checks that *unicode* is a + Unicode object and the index is not out of bounds, in contrast to the macro + version :c:func:`PyUnicode_READ_CHAR`. + + .. versionadded:: 3.3 + + +.. c:function:: PyObject* PyUnicode_Substring(PyObject *str, Py_ssize_t start, \ + Py_ssize_t end) + + Return a substring of *str*, from character index *start* (included) to + character index *end* (excluded). Negative indices are not supported. + + .. versionadded:: 3.3 + + +.. c:function:: Py_UCS4* PyUnicode_AsUCS4(PyObject *u, Py_UCS4 *buffer, \ + Py_ssize_t buflen, int copy_null) + + Copy the string *u* into a UCS4 buffer, including a null character, if + *copy_null* is set. Returns *NULL* and sets an exception on error (in + particular, a :exc:`ValueError` if *buflen* is smaller than the length of + *u*). *buffer* is returned on success. + + .. versionadded:: 3.3 + + +.. c:function:: Py_UCS4* PyUnicode_AsUCS4Copy(PyObject *u) + + Copy the string *u* into a new UCS4 buffer that is allocated using + :c:func:`PyMem_Malloc`. If this fails, *NULL* is returned with a + :exc:`MemoryError` set. + + .. versionadded:: 3.3 + + +Deprecated Py_UNICODE APIs +"""""""""""""""""""""""""" + +.. deprecated-removed:: 3.3 4.0 + +These API functions are deprecated with the implementation of :pep:`393`. +Extension modules can continue using them, as they will not be removed in Python +3.x, but need to be aware that their use can now cause performance and memory hits. + + +.. c:function:: PyObject* PyUnicode_FromUnicode(const Py_UNICODE *u, Py_ssize_t size) + + Create a Unicode object from the Py_UNICODE buffer *u* of the given size. *u* + may be *NULL* which causes the contents to be undefined. It is the user's + responsibility to fill in the needed data. The buffer is copied into the new + object. + + If the buffer is not *NULL*, the return value might be a shared object. + Therefore, modification of the resulting Unicode object is only allowed when + *u* is *NULL*. + + If the buffer is *NULL*, :c:func:`PyUnicode_READY` must be called once the + string content has been filled before using any of the access macros such as + :c:func:`PyUnicode_KIND`. + + Please migrate to using :c:func:`PyUnicode_FromKindAndData` or + :c:func:`PyUnicode_New`. + + +.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode) + + Return a read-only pointer to the Unicode object's internal + :c:type:`Py_UNICODE` buffer, or *NULL* on error. This will create the + :c:type:`Py_UNICODE*` representation of the object if it is not yet + available. Note that the resulting :c:type:`Py_UNICODE` string may contain + embedded null characters, which would cause the string to be truncated when + used in most C functions. + + Please migrate to using :c:func:`PyUnicode_AsUCS4`, + :c:func:`PyUnicode_Substring`, :c:func:`PyUnicode_ReadChar` or similar new + APIs. + + .. c:function:: PyObject* PyUnicode_TransformDecimalToASCII(Py_UNICODE *s, Py_ssize_t size) Create a Unicode object by replacing all decimal digits in :c:type:`Py_UNICODE` buffer of the given *size* by ASCII digits 0--9 - according to their decimal value. Return *NULL* if an exception - occurs. + according to their decimal value. Return *NULL* if an exception occurs. -.. c:function:: Py_UNICODE* PyUnicode_AsUnicode(PyObject *unicode) +.. c:function:: Py_UNICODE* PyUnicode_AsUnicodeAndSize(PyObject *unicode, Py_ssize_t *size) - Return a read-only pointer to the Unicode object's internal - :c:type:`Py_UNICODE` buffer, *NULL* if *unicode* is not a Unicode object. - Note that the resulting :c:type:`Py_UNICODE*` string may contain embedded - null characters, which would cause the string to be truncated when used in - most C functions. + Like :c:func:`PyUnicode_AsUnicode`, but also saves the :c:func:`Py_UNICODE` + array length in *size*. Note that the resulting :c:type:`Py_UNICODE*` string + may contain embedded null characters, which would cause the string to be + truncated when used in most C functions. + + .. versionadded:: 3.3 .. c:function:: Py_UNICODE* PyUnicode_AsUnicodeCopy(PyObject *unicode) @@ -350,44 +708,77 @@ APIs: Create a copy of a Unicode string ending with a nul character. Return *NULL* and raise a :exc:`MemoryError` exception on memory allocation failure, otherwise return a new allocated buffer (use :c:func:`PyMem_Free` to free - the buffer). Note that the resulting :c:type:`Py_UNICODE*` string may contain - embedded null characters, which would cause the string to be truncated when - used in most C functions. + the buffer). Note that the resulting :c:type:`Py_UNICODE*` string may + contain embedded null characters, which would cause the string to be + truncated when used in most C functions. .. versionadded:: 3.2 + Please migrate to using :c:func:`PyUnicode_AsUCS4Copy` or similar new APIs. + .. c:function:: Py_ssize_t PyUnicode_GetSize(PyObject *unicode) - Return the length of the Unicode object. + Return the size of the deprecated :c:type:`Py_UNICODE` representation, in + code units (this includes surrogate pairs as 2 units). + Please migrate to using :c:func:`PyUnicode_GetLength`. -.. c:function:: PyObject* PyUnicode_FromEncodedObject(PyObject *obj, const char *encoding, const char *errors) - Coerce an encoded object *obj* to an Unicode object and return a reference with - incremented refcount. +.. c:function:: PyObject* PyUnicode_FromObject(PyObject *obj) - :class:`bytes`, :class:`bytearray` and other char buffer compatible objects - are decoded according to the given *encoding* and using the error handling - defined by *errors*. Both can be *NULL* to have the interface use the default - values (see the next section for details). + Shortcut for ``PyUnicode_FromEncodedObject(obj, NULL, "strict")`` which is used + throughout the interpreter whenever coercion to Unicode is needed. - All other objects, including Unicode objects, cause a :exc:`TypeError` to be - set. - The API returns *NULL* if there was an error. The caller is responsible for - decref'ing the returned objects. +Locale Encoding +""""""""""""""" +The current locale encoding can be used to decode text from the operating +system. -.. c:function:: PyObject* PyUnicode_FromObject(PyObject *obj) +.. c:function:: PyObject* PyUnicode_DecodeLocaleAndSize(const char *str, \ + Py_ssize_t len, \ + const char *errors) - Shortcut for ``PyUnicode_FromEncodedObject(obj, NULL, "strict")`` which is used - throughout the interpreter whenever coercion to Unicode is needed. + Decode a string from the current locale encoding. The supported + error handlers are ``"strict"`` and ``"surrogateescape"`` + (:pep:`383`). The decoder uses ``"strict"`` error handler if + *errors* is ``NULL``. *str* must end with a null character but + cannot contain embedded null characters. + + .. seealso:: + + Use :c:func:`PyUnicode_DecodeFSDefaultAndSize` to decode a string from + :c:data:`Py_FileSystemDefaultEncoding` (the locale encoding read at + Python startup). + + .. versionadded:: 3.3 + + +.. c:function:: PyObject* PyUnicode_DecodeLocale(const char *str, const char *errors) + + Similar to :c:func:`PyUnicode_DecodeLocaleAndSize`, but compute the string + length using :c:func:`strlen`. + + .. versionadded:: 3.3 -If the platform supports :c:type:`wchar_t` and provides a header file wchar.h, -Python can interface directly to this type using the following functions. -Support is optimized if Python's own :c:type:`Py_UNICODE` type is identical to -the system's :c:type:`wchar_t`. + +.. c:function:: PyObject* PyUnicode_EncodeLocale(PyObject *unicode, const char *errors) + + Encode a Unicode object to the current locale encoding. The + supported error handlers are ``"strict"`` and ``"surrogateescape"`` + (:pep:`383`). The encoder uses ``"strict"`` error handler if + *errors* is ``NULL``. Return a :class:`bytes` object. *str* cannot + contain embedded null characters. + + .. seealso:: + + Use :c:func:`PyUnicode_EncodeFSDefault` to encode a string to + :c:data:`Py_FileSystemDefaultEncoding` (the locale encoding read at + Python startup). + + .. versionadded:: 3.3 File System Encoding @@ -425,19 +816,26 @@ used, passing :c:func:`PyUnicode_FSDecoder` as the conversion function: .. c:function:: PyObject* PyUnicode_DecodeFSDefaultAndSize(const char *s, Py_ssize_t size) Decode a string using :c:data:`Py_FileSystemDefaultEncoding` and the - ``'surrogateescape'`` error handler, or ``'strict'`` on Windows. + ``"surrogateescape"`` error handler, or ``"strict"`` on Windows. If :c:data:`Py_FileSystemDefaultEncoding` is not set, fall back to the locale encoding. + .. seealso:: + + :c:data:`Py_FileSystemDefaultEncoding` is initialized at startup from the + locale encoding and cannot be modified later. If you need to decode a + string from the current locale encoding, use + :c:func:`PyUnicode_DecodeLocaleAndSize`. + .. versionchanged:: 3.2 - Use ``'strict'`` error handler on Windows. + Use ``"strict"`` error handler on Windows. .. c:function:: PyObject* PyUnicode_DecodeFSDefault(const char *s) Decode a null-terminated string using :c:data:`Py_FileSystemDefaultEncoding` - and the ``'surrogateescape'`` error handler, or ``'strict'`` on Windows. + and the ``"surrogateescape"`` error handler, or ``"strict"`` on Windows. If :c:data:`Py_FileSystemDefaultEncoding` is not set, fall back to the locale encoding. @@ -445,19 +843,26 @@ used, passing :c:func:`PyUnicode_FSDecoder` as the conversion function: Use :c:func:`PyUnicode_DecodeFSDefaultAndSize` if you know the string length. .. versionchanged:: 3.2 - Use ``'strict'`` error handler on Windows. + Use ``"strict"`` error handler on Windows. .. c:function:: PyObject* PyUnicode_EncodeFSDefault(PyObject *unicode) Encode a Unicode object to :c:data:`Py_FileSystemDefaultEncoding` with the - ``'surrogateescape'`` error handler, or ``'strict'`` on Windows, and return + ``"surrogateescape"`` error handler, or ``"strict"`` on Windows, and return :class:`bytes`. Note that the resulting :class:`bytes` object may contain null bytes. If :c:data:`Py_FileSystemDefaultEncoding` is not set, fall back to the locale encoding. + .. seealso:: + + :c:data:`Py_FileSystemDefaultEncoding` is initialized at startup from the + locale encoding and cannot be modified later. If you need to encode a + string to the current locale encoding, use + :c:func:`PyUnicode_EncodeLocale`. + .. versionadded:: 3.2 @@ -479,9 +884,9 @@ wchar_t Support Copy the Unicode object contents into the :c:type:`wchar_t` buffer *w*. At most *size* :c:type:`wchar_t` characters are copied (excluding a possibly trailing 0-termination character). Return the number of :c:type:`wchar_t` characters - copied or -1 in case of an error. Note that the resulting :c:type:`wchar_t` + copied or -1 in case of an error. Note that the resulting :c:type:`wchar_t*` string may or may not be 0-terminated. It is the responsibility of the caller - to make sure that the :c:type:`wchar_t` string is 0-terminated in case this is + to make sure that the :c:type:`wchar_t*` string is 0-terminated in case this is required by the application. Also, note that the :c:type:`wchar_t*` string might contain null characters, which would cause the string to be truncated when used with most C functions. @@ -497,12 +902,32 @@ wchar_t Support Returns a buffer allocated by :c:func:`PyMem_Alloc` (use :c:func:`PyMem_Free` to free it) on success. On error, returns *NULL*, *\*size* is undefined and raises a :exc:`MemoryError`. Note that the - resulting :c:type:`wchar_t*` string might contain null characters, which + resulting :c:type:`wchar_t` string might contain null characters, which would cause the string to be truncated when used with most C functions. .. versionadded:: 3.2 +UCS4 Support +"""""""""""" + +.. versionadded:: 3.3 + +.. XXX are these meant to be public? + +.. c:function:: size_t Py_UCS4_strlen(const Py_UCS4 *u) + Py_UCS4* Py_UCS4_strcpy(Py_UCS4 *s1, const Py_UCS4 *s2) + Py_UCS4* Py_UCS4_strncpy(Py_UCS4 *s1, const Py_UCS4 *s2, size_t n) + Py_UCS4* Py_UCS4_strcat(Py_UCS4 *s1, const Py_UCS4 *s2) + int Py_UCS4_strcmp(const Py_UCS4 *s1, const Py_UCS4 *s2) + int Py_UCS4_strncmp(const Py_UCS4 *s1, const Py_UCS4 *s2, size_t n) + Py_UCS4* Py_UCS4_strchr(const Py_UCS4 *s, Py_UCS4 c) + Py_UCS4* Py_UCS4_strrchr(const Py_UCS4 *s, Py_UCS4 c) + + These utility functions work on strings of :c:type:`Py_UCS4` characters and + otherwise behave like the C standard library functions with the same name. + + .. _builtincodecs: Built-in Codecs @@ -537,31 +962,38 @@ Generic Codecs These are the generic codec APIs: -.. c:function:: PyObject* PyUnicode_Decode(const char *s, Py_ssize_t size, const char *encoding, const char *errors) +.. c:function:: PyObject* PyUnicode_Decode(const char *s, Py_ssize_t size, \ + const char *encoding, const char *errors) Create a Unicode object by decoding *size* bytes of the encoded string *s*. *encoding* and *errors* have the same meaning as the parameters of the same name - in the :func:`unicode` built-in function. The codec to be used is looked up + in the :func:`str` built-in function. The codec to be used is looked up using the Python codec registry. Return *NULL* if an exception was raised by the codec. -.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, const char *encoding, const char *errors) +.. c:function:: PyObject* PyUnicode_AsEncodedString(PyObject *unicode, \ + const char *encoding, const char *errors) + + Encode a Unicode object and return the result as Python bytes object. + *encoding* and *errors* have the same meaning as the parameters of the same + name in the Unicode :meth:`~str.encode` method. The codec to be used is looked up + using the Python codec registry. Return *NULL* if an exception was raised by + the codec. + + +.. c:function:: PyObject* PyUnicode_Encode(const Py_UNICODE *s, Py_ssize_t size, \ + const char *encoding, const char *errors) Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* and return a Python bytes object. *encoding* and *errors* have the same meaning as the - parameters of the same name in the Unicode :meth:`encode` method. The codec + parameters of the same name in the Unicode :meth:`~str.encode` method. The codec to be used is looked up using the Python codec registry. Return *NULL* if an exception was raised by the codec. - -.. c:function:: PyObject* PyUnicode_AsEncodedString(PyObject *unicode, const char *encoding, const char *errors) - - Encode a Unicode object and return the result as Python bytes object. - *encoding* and *errors* have the same meaning as the parameters of the same - name in the Unicode :meth:`encode` method. The codec to be used is looked up - using the Python codec registry. Return *NULL* if an exception was raised by - the codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsEncodedString`. UTF-8 Codecs @@ -576,7 +1008,8 @@ These are the UTF-8 codec APIs: *s*. Return *NULL* if an exception was raised by the codec. -.. c:function:: PyObject* PyUnicode_DecodeUTF8Stateful(const char *s, Py_ssize_t size, const char *errors, Py_ssize_t *consumed) +.. c:function:: PyObject* PyUnicode_DecodeUTF8Stateful(const char *s, Py_ssize_t size, \ + const char *errors, Py_ssize_t *consumed) If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeUTF8`. If *consumed* is not *NULL*, trailing incomplete UTF-8 byte sequences will not be @@ -584,18 +1017,45 @@ These are the UTF-8 codec APIs: that have been decoded will be stored in *consumed*. +.. c:function:: PyObject* PyUnicode_AsUTF8String(PyObject *unicode) + + Encode a Unicode object using UTF-8 and return the result as Python bytes + object. Error handling is "strict". Return *NULL* if an exception was + raised by the codec. + + +.. c:function:: char* PyUnicode_AsUTF8AndSize(PyObject *unicode, Py_ssize_t *size) + + Return a pointer to the default encoding (UTF-8) of the Unicode object, and + store the size of the encoded representation (in bytes) in *size*. *size* + can be *NULL*, in this case no size will be stored. + + In the case of an error, *NULL* is returned with an exception set and no + *size* is stored. + + This caches the UTF-8 representation of the string in the Unicode object, and + subsequent calls will return a pointer to the same buffer. The caller is not + responsible for deallocating the buffer. + + .. versionadded:: 3.3 + + +.. c:function:: char* PyUnicode_AsUTF8(PyObject *unicode) + + As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size. + + .. versionadded:: 3.3 + + .. c:function:: PyObject* PyUnicode_EncodeUTF8(const Py_UNICODE *s, Py_ssize_t size, const char *errors) Encode the :c:type:`Py_UNICODE` buffer *s* of the given *size* using UTF-8 and return a Python bytes object. Return *NULL* if an exception was raised by the codec. - -.. c:function:: PyObject* PyUnicode_AsUTF8String(PyObject *unicode) - - Encode a Unicode object using UTF-8 and return the result as Python bytes - object. Error handling is "strict". Return *NULL* if an exception was - raised by the codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsUTF8String` or :c:func:`PyUnicode_AsUTF8AndSize`. UTF-32 Codecs @@ -604,7 +1064,8 @@ UTF-32 Codecs These are the UTF-32 codec APIs: -.. c:function:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, const char *errors, int *byteorder) +.. c:function:: PyObject* PyUnicode_DecodeUTF32(const char *s, Py_ssize_t size, \ + const char *errors, int *byteorder) Decode *size* bytes from a UTF-32 encoded buffer string and return the corresponding Unicode object. *errors* (if non-*NULL*) defines the error @@ -625,14 +1086,13 @@ These are the UTF-32 codec APIs: After completion, *\*byteorder* is set to the current byte order at the end of input data. - In a narrow build codepoints outside the BMP will be decoded as surrogate pairs. - If *byteorder* is *NULL*, the codec starts in native order mode. Return *NULL* if an exception was raised by the codec. -.. c:function:: PyObject* PyUnicode_DecodeUTF32Stateful(const char *s, Py_ssize_t size, const char *errors, int *byteorder, Py_ssize_t *consumed) +.. c:function:: PyObject* PyUnicode_DecodeUTF32Stateful(const char *s, Py_ssize_t size, \ + const char *errors, int *byteorder, Py_ssize_t *consumed) If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeUTF32`. If *consumed* is not *NULL*, :c:func:`PyUnicode_DecodeUTF32Stateful` will not treat @@ -641,7 +1101,15 @@ These are the UTF-32 codec APIs: that have been decoded will be stored in *consumed*. -.. c:function:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder) +.. c:function:: PyObject* PyUnicode_AsUTF32String(PyObject *unicode) + + Return a Python byte string using the UTF-32 encoding in native byte + order. The string always starts with a BOM mark. Error handling is "strict". + Return *NULL* if an exception was raised by the codec. + + +.. c:function:: PyObject* PyUnicode_EncodeUTF32(const Py_UNICODE *s, Py_ssize_t size, \ + const char *errors, int byteorder) Return a Python bytes object holding the UTF-32 encoded value of the Unicode data in *s*. Output is written according to the following byte order:: @@ -658,12 +1126,9 @@ These are the UTF-32 codec APIs: Return *NULL* if an exception was raised by the codec. - -.. c:function:: PyObject* PyUnicode_AsUTF32String(PyObject *unicode) - - Return a Python byte string using the UTF-32 encoding in native byte - order. The string always starts with a BOM mark. Error handling is "strict". - Return *NULL* if an exception was raised by the codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsUTF32String`. UTF-16 Codecs @@ -672,7 +1137,8 @@ UTF-16 Codecs These are the UTF-16 codec APIs: -.. c:function:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, const char *errors, int *byteorder) +.. c:function:: PyObject* PyUnicode_DecodeUTF16(const char *s, Py_ssize_t size, \ + const char *errors, int *byteorder) Decode *size* bytes from a UTF-16 encoded buffer string and return the corresponding Unicode object. *errors* (if non-*NULL*) defines the error @@ -699,7 +1165,8 @@ These are the UTF-16 codec APIs: Return *NULL* if an exception was raised by the codec. -.. c:function:: PyObject* PyUnicode_DecodeUTF16Stateful(const char *s, Py_ssize_t size, const char *errors, int *byteorder, Py_ssize_t *consumed) +.. c:function:: PyObject* PyUnicode_DecodeUTF16Stateful(const char *s, Py_ssize_t size, \ + const char *errors, int *byteorder, Py_ssize_t *consumed) If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeUTF16`. If *consumed* is not *NULL*, :c:func:`PyUnicode_DecodeUTF16Stateful` will not treat @@ -708,7 +1175,15 @@ These are the UTF-16 codec APIs: number of bytes that have been decoded will be stored in *consumed*. -.. c:function:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, const char *errors, int byteorder) +.. c:function:: PyObject* PyUnicode_AsUTF16String(PyObject *unicode) + + Return a Python byte string using the UTF-16 encoding in native byte + order. The string always starts with a BOM mark. Error handling is "strict". + Return *NULL* if an exception was raised by the codec. + + +.. c:function:: PyObject* PyUnicode_EncodeUTF16(const Py_UNICODE *s, Py_ssize_t size, \ + const char *errors, int byteorder) Return a Python bytes object holding the UTF-16 encoded value of the Unicode data in *s*. Output is written according to the following byte order:: @@ -726,12 +1201,9 @@ These are the UTF-16 codec APIs: Return *NULL* if an exception was raised by the codec. - -.. c:function:: PyObject* PyUnicode_AsUTF16String(PyObject *unicode) - - Return a Python byte string using the UTF-16 encoding in native byte - order. The string always starts with a BOM mark. Error handling is "strict". - Return *NULL* if an exception was raised by the codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsUTF16String`. UTF-7 Codecs @@ -746,7 +1218,8 @@ These are the UTF-7 codec APIs: *s*. Return *NULL* if an exception was raised by the codec. -.. c:function:: PyObject* PyUnicode_DecodeUTF7Stateful(const char *s, Py_ssize_t size, const char *errors, Py_ssize_t *consumed) +.. c:function:: PyObject* PyUnicode_DecodeUTF7Stateful(const char *s, Py_ssize_t size, \ + const char *errors, Py_ssize_t *consumed) If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeUTF7`. If *consumed* is not *NULL*, trailing incomplete UTF-7 base-64 sections will not @@ -754,7 +1227,8 @@ These are the UTF-7 codec APIs: bytes that have been decoded will be stored in *consumed*. -.. c:function:: PyObject* PyUnicode_EncodeUTF7(const Py_UNICODE *s, Py_ssize_t size, int base64SetO, int base64WhiteSpace, const char *errors) +.. c:function:: PyObject* PyUnicode_EncodeUTF7(const Py_UNICODE *s, Py_ssize_t size, \ + int base64SetO, int base64WhiteSpace, const char *errors) Encode the :c:type:`Py_UNICODE` buffer of the given size using UTF-7 and return a Python bytes object. Return *NULL* if an exception was raised by @@ -765,6 +1239,11 @@ These are the UTF-7 codec APIs: nonzero, whitespace will be encoded in base-64. Both are set to zero for the Python "utf-7" codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API. + + .. XXX replace with what? + Unicode-Escape Codecs """"""""""""""""""""" @@ -772,24 +1251,29 @@ Unicode-Escape Codecs These are the "Unicode Escape" codec APIs: -.. c:function:: PyObject* PyUnicode_DecodeUnicodeEscape(const char *s, Py_ssize_t size, const char *errors) +.. c:function:: PyObject* PyUnicode_DecodeUnicodeEscape(const char *s, \ + Py_ssize_t size, const char *errors) Create a Unicode object by decoding *size* bytes of the Unicode-Escape encoded string *s*. Return *NULL* if an exception was raised by the codec. +.. c:function:: PyObject* PyUnicode_AsUnicodeEscapeString(PyObject *unicode) + + Encode a Unicode object using Unicode-Escape and return the result as Python + string object. Error handling is "strict". Return *NULL* if an exception was + raised by the codec. + + .. c:function:: PyObject* PyUnicode_EncodeUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size) Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Unicode-Escape and return a Python string object. Return *NULL* if an exception was raised by the codec. - -.. c:function:: PyObject* PyUnicode_AsUnicodeEscapeString(PyObject *unicode) - - Encode a Unicode object using Unicode-Escape and return the result as Python - string object. Error handling is "strict". Return *NULL* if an exception was - raised by the codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsUnicodeEscapeString`. Raw-Unicode-Escape Codecs @@ -798,19 +1282,13 @@ Raw-Unicode-Escape Codecs These are the "Raw Unicode Escape" codec APIs: -.. c:function:: PyObject* PyUnicode_DecodeRawUnicodeEscape(const char *s, Py_ssize_t size, const char *errors) +.. c:function:: PyObject* PyUnicode_DecodeRawUnicodeEscape(const char *s, \ + Py_ssize_t size, const char *errors) Create a Unicode object by decoding *size* bytes of the Raw-Unicode-Escape encoded string *s*. Return *NULL* if an exception was raised by the codec. -.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, Py_ssize_t size, const char *errors) - - Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape - and return a Python string object. Return *NULL* if an exception was raised by - the codec. - - .. c:function:: PyObject* PyUnicode_AsRawUnicodeEscapeString(PyObject *unicode) Encode a Unicode object using Raw-Unicode-Escape and return the result as @@ -818,6 +1296,18 @@ These are the "Raw Unicode Escape" codec APIs: was raised by the codec. +.. c:function:: PyObject* PyUnicode_EncodeRawUnicodeEscape(const Py_UNICODE *s, \ + Py_ssize_t size, const char *errors) + + Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Raw-Unicode-Escape + and return a Python string object. Return *NULL* if an exception was raised by + the codec. + + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsRawUnicodeEscapeString`. + + Latin-1 Codecs """""""""""""" @@ -831,18 +1321,22 @@ ordinals and only these are accepted by the codecs during encoding. *s*. Return *NULL* if an exception was raised by the codec. +.. c:function:: PyObject* PyUnicode_AsLatin1String(PyObject *unicode) + + Encode a Unicode object using Latin-1 and return the result as Python bytes + object. Error handling is "strict". Return *NULL* if an exception was + raised by the codec. + + .. c:function:: PyObject* PyUnicode_EncodeLatin1(const Py_UNICODE *s, Py_ssize_t size, const char *errors) Encode the :c:type:`Py_UNICODE` buffer of the given *size* using Latin-1 and return a Python bytes object. Return *NULL* if an exception was raised by the codec. - -.. c:function:: PyObject* PyUnicode_AsLatin1String(PyObject *unicode) - - Encode a Unicode object using Latin-1 and return the result as Python bytes - object. Error handling is "strict". Return *NULL* if an exception was - raised by the codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsLatin1String`. ASCII Codecs @@ -858,18 +1352,22 @@ codes generate errors. *s*. Return *NULL* if an exception was raised by the codec. +.. c:function:: PyObject* PyUnicode_AsASCIIString(PyObject *unicode) + + Encode a Unicode object using ASCII and return the result as Python bytes + object. Error handling is "strict". Return *NULL* if an exception was + raised by the codec. + + .. c:function:: PyObject* PyUnicode_EncodeASCII(const Py_UNICODE *s, Py_ssize_t size, const char *errors) Encode the :c:type:`Py_UNICODE` buffer of the given *size* using ASCII and return a Python bytes object. Return *NULL* if an exception was raised by the codec. - -.. c:function:: PyObject* PyUnicode_AsASCIIString(PyObject *unicode) - - Encode a Unicode object using ASCII and return the result as Python bytes - object. Error handling is "strict". Return *NULL* if an exception was - raised by the codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsASCIIString`. Character Map Codecs @@ -898,7 +1396,8 @@ characters to different code points. These are the mapping codec APIs: -.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, PyObject *mapping, const char *errors) +.. c:function:: PyObject* PyUnicode_DecodeCharmap(const char *s, Py_ssize_t size, \ + PyObject *mapping, const char *errors) Create a Unicode object by decoding *size* bytes of the encoded string *s* using the given *mapping* object. Return *NULL* if an exception was raised by the @@ -908,13 +1407,6 @@ These are the mapping codec APIs: treated as "undefined mapping". -.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *mapping, const char *errors) - - Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given - *mapping* object and return a Python string object. Return *NULL* if an - exception was raised by the codec. - - .. c:function:: PyObject* PyUnicode_AsCharmapString(PyObject *unicode, PyObject *mapping) Encode a Unicode object using the given *mapping* object and return the result @@ -924,7 +1416,8 @@ These are the mapping codec APIs: The following codec API is special in that maps Unicode to Unicode. -.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, PyObject *table, const char *errors) +.. c:function:: PyObject* PyUnicode_TranslateCharmap(const Py_UNICODE *s, Py_ssize_t size, \ + PyObject *table, const char *errors) Translate a :c:type:`Py_UNICODE` buffer of the given *size* by applying a character mapping *table* to it and return the resulting Unicode object. Return @@ -937,6 +1430,22 @@ The following codec API is special in that maps Unicode to Unicode. and sequences work well. Unmapped character ordinals (ones which cause a :exc:`LookupError`) are left untouched and are copied as-is. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API. + + .. XXX replace with what? + + +.. c:function:: PyObject* PyUnicode_EncodeCharmap(const Py_UNICODE *s, Py_ssize_t size, \ + PyObject *mapping, const char *errors) + + Encode the :c:type:`Py_UNICODE` buffer of the given *size* using the given + *mapping* object and return a Python string object. Return *NULL* if an + exception was raised by the codec. + + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsCharmapString`. MBCS codecs for Windows @@ -953,7 +1462,8 @@ the user settings on the machine running the codec. Return *NULL* if an exception was raised by the codec. -.. c:function:: PyObject* PyUnicode_DecodeMBCSStateful(const char *s, int size, const char *errors, int *consumed) +.. c:function:: PyObject* PyUnicode_DecodeMBCSStateful(const char *s, int size, \ + const char *errors, int *consumed) If *consumed* is *NULL*, behave like :c:func:`PyUnicode_DecodeMBCS`. If *consumed* is not *NULL*, :c:func:`PyUnicode_DecodeMBCSStateful` will not decode @@ -961,18 +1471,31 @@ the user settings on the machine running the codec. in *consumed*. +.. c:function:: PyObject* PyUnicode_AsMBCSString(PyObject *unicode) + + Encode a Unicode object using MBCS and return the result as Python bytes + object. Error handling is "strict". Return *NULL* if an exception was + raised by the codec. + + +.. c:function:: PyObject* PyUnicode_EncodeCodePage(int code_page, PyObject *unicode, const char *errors) + + Encode the Unicode object using the specified code page and return a Python + bytes object. Return *NULL* if an exception was raised by the codec. Use + :c:data:`CP_ACP` code page to get the MBCS encoder. + + .. versionadded:: 3.3 + + .. c:function:: PyObject* PyUnicode_EncodeMBCS(const Py_UNICODE *s, Py_ssize_t size, const char *errors) Encode the :c:type:`Py_UNICODE` buffer of the given *size* using MBCS and return a Python bytes object. Return *NULL* if an exception was raised by the codec. - -.. c:function:: PyObject* PyUnicode_AsMBCSString(PyObject *unicode) - - Encode a Unicode object using MBCS and return the result as Python bytes - object. Error handling is "strict". Return *NULL* if an exception was - raised by the codec. + .. deprecated-removed:: 3.3 4.0 + Part of the old-style :c:type:`Py_UNICODE` API; please migrate to using + :c:func:`PyUnicode_AsMBCSString` or :c:func:`PyUnicode_EncodeCodePage`. Methods & Slots @@ -1011,7 +1534,8 @@ They all return *NULL* or ``-1`` if an exception occurs. characters are not included in the resulting strings. -.. c:function:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, const char *errors) +.. c:function:: PyObject* PyUnicode_Translate(PyObject *str, PyObject *table, \ + const char *errors) Translate a string by applying a character mapping table to it and return the resulting Unicode object. @@ -1033,14 +1557,16 @@ They all return *NULL* or ``-1`` if an exception occurs. Unicode string. -.. c:function:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction) +.. c:function:: int PyUnicode_Tailmatch(PyObject *str, PyObject *substr, \ + Py_ssize_t start, Py_ssize_t end, int direction) Return 1 if *substr* matches ``str[start:end]`` at the given tail end (*direction* == -1 means to do a prefix match, *direction* == 1 a suffix match), 0 otherwise. Return ``-1`` if an error occurred. -.. c:function:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end, int direction) +.. c:function:: Py_ssize_t PyUnicode_Find(PyObject *str, PyObject *substr, \ + Py_ssize_t start, Py_ssize_t end, int direction) Return the first position of *substr* in ``str[start:end]`` using the given *direction* (*direction* == 1 means to do a forward search, *direction* == -1 a @@ -1049,13 +1575,27 @@ They all return *NULL* or ``-1`` if an exception occurs. occurred and an exception has been set. -.. c:function:: Py_ssize_t PyUnicode_Count(PyObject *str, PyObject *substr, Py_ssize_t start, Py_ssize_t end) +.. c:function:: Py_ssize_t PyUnicode_FindChar(PyObject *str, Py_UCS4 ch, \ + Py_ssize_t start, Py_ssize_t end, int direction) + + Return the first position of the character *ch* in ``str[start:end]`` using + the given *direction* (*direction* == 1 means to do a forward search, + *direction* == -1 a backward search). The return value is the index of the + first match; a value of ``-1`` indicates that no match was found, and ``-2`` + indicates that an error occurred and an exception has been set. + + .. versionadded:: 3.3 + + +.. c:function:: Py_ssize_t PyUnicode_Count(PyObject *str, PyObject *substr, \ + Py_ssize_t start, Py_ssize_t end) Return the number of non-overlapping occurrences of *substr* in ``str[start:end]``. Return ``-1`` if an error occurred. -.. c:function:: PyObject* PyUnicode_Replace(PyObject *str, PyObject *substr, PyObject *replstr, Py_ssize_t maxcount) +.. c:function:: PyObject* PyUnicode_Replace(PyObject *str, PyObject *substr, \ + PyObject *replstr, Py_ssize_t maxcount) Replace at most *maxcount* occurrences of *substr* in *str* with *replstr* and return the resulting Unicode object. *maxcount* == -1 means replace all @@ -1076,7 +1616,7 @@ They all return *NULL* or ``-1`` if an exception occurs. ISO-8859-1 if it contains non-ASCII characters". -.. c:function:: int PyUnicode_RichCompare(PyObject *left, PyObject *right, int op) +.. c:function:: PyObject* PyUnicode_RichCompare(PyObject *left, PyObject *right, int op) Rich compare two unicode strings and return one of the following: @@ -1103,8 +1643,8 @@ They all return *NULL* or ``-1`` if an exception occurs. Check whether *element* is contained in *container* and return true or false accordingly. - *element* has to coerce to a one element Unicode string. ``-1`` is returned if - there was an error. + *element* has to coerce to a one element Unicode string. ``-1`` is returned + if there was an error. .. c:function:: void PyUnicode_InternInPlace(PyObject **string) @@ -1123,7 +1663,6 @@ They all return *NULL* or ``-1`` if an exception occurs. .. c:function:: PyObject* PyUnicode_InternFromString(const char *v) A combination of :c:func:`PyUnicode_FromString` and - :c:func:`PyUnicode_InternInPlace`, returning either a new unicode string object - that has been interned, or a new ("owned") reference to an earlier interned - string object with the same value. - + :c:func:`PyUnicode_InternInPlace`, returning either a new unicode string + object that has been interned, or a new ("owned") reference to an earlier + interned string object with the same value. diff --git a/Doc/c-api/veryhigh.rst b/Doc/c-api/veryhigh.rst index 41cdd6b..14ef8df 100644 --- a/Doc/c-api/veryhigh.rst +++ b/Doc/c-api/veryhigh.rst @@ -95,12 +95,6 @@ the same library that the Python runtime is using. leaving *closeit* set to ``0`` and *flags* set to *NULL*. -.. c:function:: int PyRun_SimpleFileFlags(FILE *fp, const char *filename, PyCompilerFlags *flags) - - This is a simplified interface to :c:func:`PyRun_SimpleFileExFlags` below, - leaving *closeit* set to ``0``. - - .. c:function:: int PyRun_SimpleFileEx(FILE *fp, const char *filename, int closeit) This is a simplified interface to :c:func:`PyRun_SimpleFileExFlags` below, @@ -289,7 +283,7 @@ the same library that the Python runtime is using. frame *f* is executed, interpreting bytecode and executing calls as needed. The additional *throwflag* parameter can mostly be ignored - if true, then it causes an exception to immediately be thrown; this is used for the - :meth:`throw` methods of generator objects. + :meth:`~generator.throw` methods of generator objects. .. c:function:: int PyEval_MergeCompilerFlags(PyCompilerFlags *cf) diff --git a/Doc/conf.py b/Doc/conf.py index 555f281..5b63cad 100644 --- a/Doc/conf.py +++ b/Doc/conf.py @@ -12,8 +12,8 @@ sys.path.append(os.path.abspath('tools/sphinxext')) # General configuration # --------------------- -extensions = ['sphinx.ext.refcounting', 'sphinx.ext.coverage', - 'sphinx.ext.doctest', 'pyspecific'] +extensions = ['sphinx.ext.coverage', 'sphinx.ext.doctest', + 'pyspecific', 'c_annotations'] templates_path = ['tools/sphinxext'] # General substitutions. @@ -91,7 +91,7 @@ html_additional_pages = { } # Output an OpenSearch description file. -html_use_opensearch = 'http://docs.python.org/3.2' +html_use_opensearch = 'http://docs.python.org/' + version # Additional static files. html_static_path = ['tools/sphinxext/static'] diff --git a/Doc/copyright.rst b/Doc/copyright.rst index 94f777c..be47e8a 100644 --- a/Doc/copyright.rst +++ b/Doc/copyright.rst @@ -4,7 +4,7 @@ Copyright Python and this documentation is: -Copyright © 2001-2013 Python Software Foundation. All rights reserved. +Copyright © 2001-2014 Python Software Foundation. All rights reserved. Copyright © 2000 BeOpen.com. All rights reserved. diff --git a/Doc/data/refcounts.dat b/Doc/data/refcounts.dat index b5dde33..a42584c 100644 --- a/Doc/data/refcounts.dat +++ b/Doc/data/refcounts.dat @@ -210,7 +210,7 @@ PyDict_DelItem:PyObject*:key:0: PyDict_DelItemString:int::: PyDict_DelItemString:PyObject*:p:0: -PyDict_DelItemString:char*:key:: +PyDict_DelItemString:const char*:key:: PyDict_GetItem:PyObject*::0:0 PyDict_GetItem:PyObject*:p:0: @@ -218,7 +218,7 @@ PyDict_GetItem:PyObject*:key:0: PyDict_GetItemString:PyObject*::0: PyDict_GetItemString:PyObject*:p:0: -PyDict_GetItemString:char*:key:: +PyDict_GetItemString:const char*:key:: PyDict_Items:PyObject*::+1: PyDict_Items:PyObject*:p:0: @@ -244,7 +244,7 @@ PyDict_SetItem:PyObject*:val:+1: PyDict_SetItemString:int::: PyDict_SetItemString:PyObject*:p:0: -PyDict_SetItemString:char*:key:: +PyDict_SetItemString:const char*:key:: PyDict_SetItemString:PyObject*:val:+1: PyDict_Size:int::: @@ -277,13 +277,13 @@ PyErr_GivenExceptionMatches:PyObject*:given:0: PyErr_GivenExceptionMatches:PyObject*:exc:0: PyErr_NewException:PyObject*::+1: -PyErr_NewException:char*:name:: +PyErr_NewException:const char*:name:: PyErr_NewException:PyObject*:base:0: PyErr_NewException:PyObject*:dict:0: PyErr_NewExceptionWithDoc:PyObject*::+1: -PyErr_NewExceptionWithDoc:char*:name:: -PyErr_NewExceptionWithDoc:char*:doc:: +PyErr_NewExceptionWithDoc:const char*:name:: +PyErr_NewExceptionWithDoc:const char*:doc:: PyErr_NewExceptionWithDoc:PyObject*:base:0: PyErr_NewExceptionWithDoc:PyObject*:dict:0: @@ -310,21 +310,21 @@ PyErr_SetExcFromWindowsErr:int:ierr:: PyErr_SetExcFromWindowsErrWithFilename:PyObject*::null: PyErr_SetExcFromWindowsErrWithFilename:PyObject*:type:0: PyErr_SetExcFromWindowsErrWithFilename:int:ierr:: -PyErr_SetExcFromWindowsErrWithFilename:char*:filename:: +PyErr_SetExcFromWindowsErrWithFilename:const char*:filename:: PyErr_SetFromErrno:PyObject*::null: PyErr_SetFromErrno:PyObject*:type:0: PyErr_SetFromErrnoWithFilename:PyObject*::null: PyErr_SetFromErrnoWithFilename:PyObject*:type:0: -PyErr_SetFromErrnoWithFilename:char*:filename:: +PyErr_SetFromErrnoWithFilename:const char*:filename:: PyErr_SetFromWindowsErr:PyObject*::null: PyErr_SetFromWindowsErr:int:ierr:: PyErr_SetFromWindowsErrWithFilename:PyObject*::null: PyErr_SetFromWindowsErrWithFilename:int:ierr:: -PyErr_SetFromWindowsErrWithFilename:char*:filename:: +PyErr_SetFromWindowsErrWithFilename:const char*:filename:: PyErr_SetInterrupt:void::: @@ -337,11 +337,11 @@ PyErr_SetObject:PyObject*:value:+1: PyErr_SetString:void::: PyErr_SetString:PyObject*:type:+1: -PyErr_SetString:char*:message:: +PyErr_SetString:const char*:message:: PyErr_Format:PyObject*::null: PyErr_Format:PyObject*:exception:+1: -PyErr_Format:char*:format:: +PyErr_Format:const char*:format:: PyErr_Format::...:: PyErr_WarnEx:int::: @@ -386,22 +386,22 @@ PyFile_Check:PyObject*:p:0: PyFile_FromFile:PyObject*::+1: PyFile_FromFile:FILE*:fp:: -PyFile_FromFile:char*:name:: -PyFile_FromFile:char*:mode:: +PyFile_FromFile:const char*:name:: +PyFile_FromFile:const char*:mode:: PyFile_FromFile:int(*:close):: PyFile_FromFileEx:PyObject*::+1: PyFile_FromFileEx:FILE*:fp:: -PyFile_FromFileEx:char*:name:: -PyFile_FromFileEx:char*:mode:: +PyFile_FromFileEx:const char*:name:: +PyFile_FromFileEx:const char*:mode:: PyFile_FromFileEx:int(*:close):: PyFile_FromFileEx:int:buffering:: -PyFile_FromFileEx:char*:encoding:: -PyFile_FromFileEx:char*:newline:: +PyFile_FromFileEx:const char*:encoding:: +PyFile_FromFileEx:const char*:newline:: PyFile_FromString:PyObject*::+1: -PyFile_FromString:char*:name:: -PyFile_FromString:char*:mode:: +PyFile_FromString:const char*:name:: +PyFile_FromString:const char*:mode:: PyFile_GetLine:PyObject*::+1: PyFile_GetLine:PyObject*:p:: @@ -465,6 +465,11 @@ PyFunction_New:PyObject*::+1: PyFunction_New:PyObject*:code:+1: PyFunction_New:PyObject*:globals:+1: +PyFunction_NewWithQualName:PyObject*::+1: +PyFunction_NewWithQualName:PyObject*:code:+1: +PyFunction_NewWithQualName:PyObject*:globals:+1: +PyFunction_NewWithQualName:PyObject*:qualname:+1: + PyFunction_SetClosure:int::: PyFunction_SetClosure:PyObject*:op:0: PyFunction_SetClosure:PyObject*:closure:+1: @@ -477,23 +482,23 @@ PyGen_New:PyObject*::+1: PyGen_New:PyFrameObject*:frame:0: Py_InitModule:PyObject*::0: -Py_InitModule:char*:name:: +Py_InitModule:const char*:name:: Py_InitModule:PyMethodDef[]:methods:: Py_InitModule3:PyObject*::0: -Py_InitModule3:char*:name:: +Py_InitModule3:const char*:name:: Py_InitModule3:PyMethodDef[]:methods:: -Py_InitModule3:char*:doc:: +Py_InitModule3:const char*:doc:: Py_InitModule4:PyObject*::0: -Py_InitModule4:char*:name:: +Py_InitModule4:const char*:name:: Py_InitModule4:PyMethodDef[]:methods:: -Py_InitModule4:char*:doc:: +Py_InitModule4:const char*:doc:: Py_InitModule4:PyObject*:self:: Py_InitModule4:int:apiver::usually provided by Py_InitModule or Py_InitModule3 PyImport_AddModule:PyObject*::0:reference borrowed from sys.modules -PyImport_AddModule:char*:name:: +PyImport_AddModule:const char*:name:: PyImport_Cleanup:void::: @@ -517,16 +522,16 @@ PyImport_ImportFrozenModule:int::: PyImport_ImportFrozenModule:char*::: PyImport_ImportModule:PyObject*::+1: -PyImport_ImportModule:char*:name:: +PyImport_ImportModule:const char*:name:: PyImport_ImportModuleEx:PyObject*::+1: -PyImport_ImportModuleEx:char*:name:: +PyImport_ImportModuleEx:const char*:name:: PyImport_ImportModuleEx:PyObject*:globals:0:??? PyImport_ImportModuleEx:PyObject*:locals:0:??? PyImport_ImportModuleEx:PyObject*:fromlist:0:??? PyImport_ImportModuleLevel:PyObject*::+1: -PyImport_ImportModuleLevel:char*:name:: +PyImport_ImportModuleLevel:const char*:name:: PyImport_ImportModuleLevel:PyObject*:globals:0:??? PyImport_ImportModuleLevel:PyObject*:locals:0:??? PyImport_ImportModuleLevel:PyObject*:fromlist:0:??? @@ -687,7 +692,7 @@ PyMapping_DelItem:PyObject*:key:0: PyMapping_DelItemString:int::: PyMapping_DelItemString:PyObject*:o:0: -PyMapping_DelItemString:char*:key:: +PyMapping_DelItemString:const char*:key:: PyMapping_GetItemString:PyObject*::+1: PyMapping_GetItemString:PyObject*:o:0: @@ -757,10 +762,10 @@ PyMethod_Self:PyObject*:im:0: PyModule_GetDict:PyObject*::0: PyModule_GetDict::PyObject* module:0: -PyModule_GetFilename:char*::: +PyModule_GetFilename:const char*::: PyModule_GetFilename:PyObject*:module:0: -PyModule_GetName:char*::: +PyModule_GetName:const char*::: PyModule_GetName:PyObject*:module:0: PyModule_New:PyObject*::+1: @@ -922,7 +927,7 @@ PyObject_CallMethod::...:: PyObject_CallMethodObjArgs:PyObject*::+1: PyObject_CallMethodObjArgs:PyObject*:o:0: -PyObject_CallMethodObjArgs:char*:name:: +PyObject_CallMethodObjArgs:PyObject*:name:0: PyObject_CallMethodObjArgs::...:: PyObject_CallObject:PyObject*::+1: @@ -944,7 +949,7 @@ PyObject_DelAttr:PyObject*:attr_name:0: PyObject_DelAttrString:int::: PyObject_DelAttrString:PyObject*:o:0: -PyObject_DelAttrString:char*:attr_name:: +PyObject_DelAttrString:const char*:attr_name:: PyObject_DelItem:int::: PyObject_DelItem:PyObject*:o:0: @@ -959,7 +964,7 @@ PyObject_GetAttr:PyObject*:attr_name:0: PyObject_GetAttrString:PyObject*::+1: PyObject_GetAttrString:PyObject*:o:0: -PyObject_GetAttrString:char*:attr_name:: +PyObject_GetAttrString:const char*:attr_name:: PyObject_GetItem:PyObject*::+1: PyObject_GetItem:PyObject*:o:0: @@ -974,7 +979,7 @@ PyObject_HasAttr:PyObject*:attr_name:0: PyObject_HasAttrString:int::: PyObject_HasAttrString:PyObject*:o:0: -PyObject_HasAttrString:char*:attr_name:0: +PyObject_HasAttrString:const char*:attr_name:0: PyObject_Hash:int::: PyObject_Hash:PyObject*:o:0: @@ -1024,7 +1029,7 @@ PyObject_SetAttr:PyObject*:v:+1: PyObject_SetAttrString:int::: PyObject_SetAttrString:PyObject*:o:0: -PyObject_SetAttrString:char*:attr_name:: +PyObject_SetAttrString:const char*:attr_name:: PyObject_SetAttrString:PyObject*:v:+1: PyObject_SetItem:int::: @@ -1043,27 +1048,27 @@ PyObject_Unicode:PyObject*:o:0: PyParser_SimpleParseFile:struct _node*::: PyParser_SimpleParseFile:FILE*:fp:: -PyParser_SimpleParseFile:char*:filename:: +PyParser_SimpleParseFile:const char*:filename:: PyParser_SimpleParseFile:int:start:: PyParser_SimpleParseString:struct _node*::: -PyParser_SimpleParseString:char*:str:: +PyParser_SimpleParseString:const char*:str:: PyParser_SimpleParseString:int:start:: PyRun_AnyFile:int::: PyRun_AnyFile:FILE*:fp:: -PyRun_AnyFile:char*:filename:: +PyRun_AnyFile:const char*:filename:: PyRun_File:PyObject*::+1:??? -- same as eval_code2() PyRun_File:FILE*:fp:: -PyRun_File:char*:filename:: +PyRun_File:const char*:filename:: PyRun_File:int:start:: PyRun_File:PyObject*:globals:0: PyRun_File:PyObject*:locals:0: PyRun_FileEx:PyObject*::+1:??? -- same as eval_code2() PyRun_FileEx:FILE*:fp:: -PyRun_FileEx:char*:filename:: +PyRun_FileEx:const char*:filename:: PyRun_FileEx:int:start:: PyRun_FileEx:PyObject*:globals:0: PyRun_FileEx:PyObject*:locals:0: @@ -1071,7 +1076,7 @@ PyRun_FileEx:int:closeit:: PyRun_FileFlags:PyObject*::+1:??? -- same as eval_code2() PyRun_FileFlags:FILE*:fp:: -PyRun_FileFlags:char*:filename:: +PyRun_FileFlags:const char*:filename:: PyRun_FileFlags:int:start:: PyRun_FileFlags:PyObject*:globals:0: PyRun_FileFlags:PyObject*:locals:0: @@ -1079,7 +1084,7 @@ PyRun_FileFlags:PyCompilerFlags*:flags:: PyRun_FileExFlags:PyObject*::+1:??? -- same as eval_code2() PyRun_FileExFlags:FILE*:fp:: -PyRun_FileExFlags:char*:filename:: +PyRun_FileExFlags:const char*:filename:: PyRun_FileExFlags:int:start:: PyRun_FileExFlags:PyObject*:globals:0: PyRun_FileExFlags:PyObject*:locals:0: @@ -1088,27 +1093,27 @@ PyRun_FileExFlags:PyCompilerFlags*:flags:: PyRun_InteractiveLoop:int::: PyRun_InteractiveLoop:FILE*:fp:: -PyRun_InteractiveLoop:char*:filename:: +PyRun_InteractiveLoop:const char*:filename:: PyRun_InteractiveOne:int::: PyRun_InteractiveOne:FILE*:fp:: -PyRun_InteractiveOne:char*:filename:: +PyRun_InteractiveOne:const char*:filename:: PyRun_SimpleFile:int::: PyRun_SimpleFile:FILE*:fp:: -PyRun_SimpleFile:char*:filename:: +PyRun_SimpleFile:const char*:filename:: PyRun_SimpleString:int::: -PyRun_SimpleString:char*:command:: +PyRun_SimpleString:const char*:command:: PyRun_String:PyObject*::+1:??? -- same as eval_code2() -PyRun_String:char*:str:: +PyRun_String:const char*:str:: PyRun_String:int:start:: PyRun_String:PyObject*:globals:0: PyRun_String:PyObject*:locals:0: PyRun_StringFlags:PyObject*::+1:??? -- same as eval_code2() -PyRun_StringFlags:char*:str:: +PyRun_StringFlags:const char*:str:: PyRun_StringFlags:int:start:: PyRun_StringFlags:PyObject*:globals:0: PyRun_StringFlags:PyObject*:locals:0: @@ -1224,7 +1229,7 @@ PySlice_New:PyObject*:start:0: PySlice_New:PyObject*:stop:0: PySlice_New:PyObject*:step:0: -PyString_AS_STRING:char*::: +PyString_AS_STRING:const char*::: PyString_AS_STRING:PyObject*:string:0: PyString_AsDecodedObject:PyObject*::+1: @@ -1237,7 +1242,7 @@ PyString_AsEncodedObject:PyObject*:str:0: PyString_AsEncodedObject:const char*:encoding:: PyString_AsEncodedObject:const char*:errors:: -PyString_AsString:char*::: +PyString_AsString:const char*::: PyString_AsString:PyObject*:string:0: PyString_AsStringAndSize:int::: @@ -1305,13 +1310,13 @@ PyString_AsEncodedString:const char*:encoding:: PyString_AsEncodedString:const char*:errors:: PySys_AddWarnOption:void::: -PySys_AddWarnOption:char*:s:: +PySys_AddWarnOption:const char*:s:: PySys_AddXOption:void::: PySys_AddXOption:const wchar_t*:s:: PySys_GetObject:PyObject*::0: -PySys_GetObject:char*:name:: +PySys_GetObject:const char*:name:: PySys_GetXOptions:PyObject*::0: @@ -1320,16 +1325,16 @@ PySys_SetArgv:int:argc:: PySys_SetArgv:char**:argv:: PySys_SetObject:int::: -PySys_SetObject:char*:name:: +PySys_SetObject:const char*:name:: PySys_SetObject:PyObject*:v:+1: PySys_ResetWarnOptions:void::: PySys_WriteStdout:void::: -PySys_WriteStdout:char*:format:: +PySys_WriteStdout:const char*:format:: PySys_WriteStderr:void::: -PySys_WriteStderr:char*:format:: +PySys_WriteStderr:const char*:format:: PyThreadState_Clear:void::: PyThreadState_Clear:PyThreadState*:tstate:: @@ -1709,16 +1714,16 @@ Py_AtExit:int::: Py_AtExit:void (*)():func:: Py_BuildValue:PyObject*::+1: -Py_BuildValue:char*:format:: +Py_BuildValue:const char*:format:: Py_CompileString:PyObject*::+1: -Py_CompileString:char*:str:: -Py_CompileString:char*:filename:: +Py_CompileString:const char*:str:: +Py_CompileString:const char*:filename:: Py_CompileString:int:start:: Py_CompileStringFlags:PyObject*::+1: -Py_CompileStringFlags:char*:str:: -Py_CompileStringFlags:char*:filename:: +Py_CompileStringFlags:const char*:str:: +Py_CompileStringFlags:const char*:filename:: Py_CompileStringFlags:int:start:: Py_CompileStringFlags:PyCompilerFlags*:flags:: @@ -1732,33 +1737,33 @@ Py_Exit:void::: Py_Exit:int:status:: Py_FatalError:void::: -Py_FatalError:char*:message:: +Py_FatalError:const char*:message:: Py_FdIsInteractive:int::: Py_FdIsInteractive:FILE*:fp:: -Py_FdIsInteractive:char*:filename:: +Py_FdIsInteractive:const char*:filename:: Py_Finalize:void::: -Py_GetBuildInfoconst:char*::: +Py_GetBuildInfoconst:const char*::: -Py_GetCompilerconst:char*::: +Py_GetCompilerconst:const char*::: -Py_GetCopyrightconst:char*::: +Py_GetCopyrightconst:const char*::: -Py_GetExecPrefix:char*::: +Py_GetExecPrefix:const char*::: -Py_GetPath:char*::: +Py_GetPath:const char*::: -Py_GetPlatformconst:char*::: +Py_GetPlatformconst:const char*::: -Py_GetPrefix:char*::: +Py_GetPrefix:const char*::: -Py_GetProgramFullPath:char*::: +Py_GetProgramFullPath:const char*::: -Py_GetProgramName:char*::: +Py_GetProgramName:const char*::: -Py_GetVersionconst:char*::: +Py_GetVersionconst:const char*::: Py_INCREF:void::: Py_INCREF:PyObject*:o:+1: @@ -1770,7 +1775,7 @@ Py_IsInitialized:int::: Py_NewInterpreter:PyThreadState*::: Py_SetProgramName:void::: -Py_SetProgramName:char*:name:: +Py_SetProgramName:const char*:name:: Py_XDECREF:void::: Py_XDECREF:PyObject*:o:-1:if o is not NULL @@ -1779,14 +1784,14 @@ Py_XINCREF:void::: Py_XINCREF:PyObject*:o:+1:if o is not NULL _PyImport_FindExtension:PyObject*::0:??? see PyImport_AddModule -_PyImport_FindExtension:char*::: -_PyImport_FindExtension:char*::: +_PyImport_FindExtension:const char*::: +_PyImport_FindExtension:const char*::: _PyImport_Fini:void::: _PyImport_FixupExtension:PyObject*:::??? -_PyImport_FixupExtension:char*::: -_PyImport_FixupExtension:char*::: +_PyImport_FixupExtension:const char*::: +_PyImport_FixupExtension:const char*::: _PyImport_Init:void::: diff --git a/Doc/distutils/apiref.rst b/Doc/distutils/apiref.rst index c4127bd..9853f02 100644 --- a/Doc/distutils/apiref.rst +++ b/Doc/distutils/apiref.rst @@ -26,6 +26,8 @@ setup script). Indirectly provides the :class:`distutils.dist.Distribution` and The setup function takes a large number of arguments. These are laid out in the following table. + .. tabularcolumns:: |l|L|L| + +--------------------+--------------------------------+-------------------------------------------------------------+ | argument name | value | type | +====================+================================+=============================================================+ @@ -125,6 +127,8 @@ setup script). Indirectly provides the :class:`distutils.dist.Distribution` and *stop_after* tells :func:`setup` when to stop processing; possible values: + .. tabularcolumns:: |l|L| + +---------------+---------------------------------------------+ | value | description | +===============+=============================================+ @@ -163,7 +167,9 @@ the full reference. .. class:: Extension The Extension class describes a single C or C++extension module in a setup - script. It accepts the following keyword arguments in its constructor + script. It accepts the following keyword arguments in its constructor: + + .. tabularcolumns:: |l|L|l| +------------------------+--------------------------------+---------------------------+ | argument name | value | type | @@ -728,7 +734,7 @@ This module provides the following functions. .. method:: CCompiler.execute(func, args[, msg=None, level=1]) - Invokes :func:`distutils.util.execute` This method invokes a Python function + Invokes :func:`distutils.util.execute`. This method invokes a Python function *func* with the given arguments *args*, after logging and taking into account the *dry_run* flag. @@ -988,8 +994,9 @@ directories. simply the list of all files under *src*, with the names changed to be under *dst*. - *preserve_mode* and *preserve_times* are the same as for :func:`copy_file` in - :mod:`distutils.file_util`; note that they only apply to regular files, not to + *preserve_mode* and *preserve_times* are the same as for + :func:`distutils.file_util.copy_file`; note that they only apply to + regular files, not to directories. If *preserve_symlinks* is true, symlinks will be copied as symlinks (on platforms that support them!); otherwise (the default), the destination of the symlink will be copied. *update* and *verbose* are the same @@ -999,7 +1006,7 @@ directories. these files is available in answer D2 of the `NFS FAQ page <http://nfs.sourceforge.net/#section_d>`_. - .. versionchanged:: 3.2.4 + .. versionchanged:: 3.3.1 NFS files are ignored. .. function:: remove_tree(directory[, verbose=0, dry_run=0]) @@ -1164,16 +1171,6 @@ other utility module. underscore. No { } or ( ) style quoting is available. -.. function:: grok_environment_error(exc[, prefix='error: ']) - - Generate a useful error message from an :exc:`EnvironmentError` (:exc:`IOError` - or :exc:`OSError`) exception object. Handles Python 1.5.1 and later styles, - and does what it can to deal with exception objects that don't have a filename - (which happens when the error is due to a two-file operation, such as - :func:`rename` or :func:`link`). Returns the error message as a string - prefixed with *prefix*. - - .. function:: split_quoted(s) Split a string up according to Unix shell-like rules for quotes and backslashes. @@ -1260,8 +1257,8 @@ other utility module. built/installed/distributed -This module provides the :class:`Distribution` class, which represents the -module distribution being built/installed/distributed. +This module provides the :class:`~distutils.core.Distribution` class, which +represents the module distribution being built/installed/distributed. :mod:`distutils.extension` --- The Extension class @@ -1563,6 +1560,8 @@ lines, and joining lines with backslashes. The options are all boolean, and affect the values returned by :meth:`readline` + .. tabularcolumns:: |l|L|l| + +------------------+--------------------------------+---------+ | option name | description | default | +==================+================================+=========+ @@ -1705,8 +1704,8 @@ This module supplies the abstract base class :class:`Command`. options, is the :meth:`run` method, which must also be implemented by every command class. - The class constructor takes a single argument *dist*, a :class:`Distribution` - instance. + The class constructor takes a single argument *dist*, a + :class:`~distutils.core.Distribution` instance. Creating a new Distutils command @@ -1935,8 +1934,12 @@ Subclasses of :class:`Command` must define the following methods. .. module:: distutils.command.clean :synopsis: Clean a package build area +This command removes the temporary files created by :command:`build` +and its subcommands, like intermediary compiled object files. With +the ``--all`` option, the complete build directory will be removed. -.. % todo +Extension modules built :ref:`in place <distutils-build-ext-inplace>` +will not be cleaned, as they are not in the build directory. :mod:`distutils.command.config` --- Perform package configuration diff --git a/Doc/distutils/configfile.rst b/Doc/distutils/configfile.rst index 890047c..ac79671 100644 --- a/Doc/distutils/configfile.rst +++ b/Doc/distutils/configfile.rst @@ -69,6 +69,8 @@ universal :option:`--help` option, e.g. :: Note that an option spelled :option:`--foo-bar` on the command-line is spelled :option:`foo_bar` in configuration files. +.. _distutils-build-ext-inplace: + For example, say you want your extensions to be built "in-place"---that is, you have an extension :mod:`pkg.ext`, and you want the compiled extension file (:file:`ext.so` on Unix, say) to be put in the same source directory as your diff --git a/Doc/distutils/examples.rst b/Doc/distutils/examples.rst index b268486..5eb654a 100644 --- a/Doc/distutils/examples.rst +++ b/Doc/distutils/examples.rst @@ -284,6 +284,48 @@ by using the :mod:`docutils` parser:: warning: check: Title underline too short. (line 2) warning: check: Could not finish the parsing. +Reading the metadata +===================== + +The :func:`distutils.core.setup` function provides a command-line interface +that allows you to query the metadata fields of a project through the +`setup.py` script of a given project:: + + $ python setup.py --name + distribute + +This call reads the `name` metadata by running the +:func:`distutils.core.setup` function. Although, when a source or binary +distribution is created with Distutils, the metadata fields are written +in a static file called :file:`PKG-INFO`. When a Distutils-based project is +installed in Python, the :file:`PKG-INFO` file is copied alongside the modules +and packages of the distribution under :file:`NAME-VERSION-pyX.X.egg-info`, +where `NAME` is the name of the project, `VERSION` its version as defined +in the Metadata, and `pyX.X` the major and minor version of Python like +`2.7` or `3.2`. + +You can read back this static file, by using the +:class:`distutils.dist.DistributionMetadata` class and its +:func:`read_pkg_file` method:: + + >>> from distutils.dist import DistributionMetadata + >>> metadata = DistributionMetadata() + >>> metadata.read_pkg_file(open('distribute-0.6.8-py2.7.egg-info')) + >>> metadata.name + 'distribute' + >>> metadata.version + '0.6.8' + >>> metadata.description + 'Easily download, build, install, upgrade, and uninstall Python packages' + +Notice that the class can also be instanciated with a metadata file path to +loads its values:: + + >>> pkg_info_path = 'distribute-0.6.8-py2.7.egg-info' + >>> DistributionMetadata(pkg_info_path).name + 'distribute' + + .. % \section{Multiple extension modules} .. % \label{multiple-ext} diff --git a/Doc/distutils/index.rst b/Doc/distutils/index.rst index 1cd5f3c..1a6f04c 100644 --- a/Doc/distutils/index.rst +++ b/Doc/distutils/index.rst @@ -12,6 +12,16 @@ the module developer's point of view, describing how to use the Distutils to make Python modules and extensions easily available to a wider audience with very little overhead for build/release/install mechanics. +.. note:: + + This guide only covers the basic tools for building and distributing + extensions that are provided as part of this version of Python. Third + party tools offer easier to use and more secure alternatives. Refer to the + `quick recommendations section + <https://python-packaging-user-guide.readthedocs.org/en/latest/current.html>`__ + in the Python Packaging User Guide for more information. + + .. toctree:: :maxdepth: 2 :numbered: diff --git a/Doc/distutils/setupscript.rst b/Doc/distutils/setupscript.rst index 6ed6fbf..fbe4290 100644 --- a/Doc/distutils/setupscript.rst +++ b/Doc/distutils/setupscript.rst @@ -139,7 +139,8 @@ directories, libraries to link with, etc.). All of this is done through another keyword argument to :func:`setup`, the :option:`ext_modules` option. :option:`ext_modules` is just a list of -:class:`Extension` instances, each of which describes a single extension module. +:class:`~distutils.core.Extension` instances, each of which describes a +single extension module. Suppose your distribution includes a single extension, called :mod:`foo` and implemented by :file:`foo.c`. If no additional instructions to the compiler/linker are needed, describing this extension is quite simple:: @@ -165,8 +166,8 @@ following sections. Extension names and packages ---------------------------- -The first argument to the :class:`Extension` constructor is always the name of -the extension, including any package names. For example, :: +The first argument to the :class:`~distutils.core.Extension` constructor is +always the name of the extension, including any package names. For example, :: Extension('foo', ['src/foo1.c', 'src/foo2.c']) @@ -196,7 +197,8 @@ will compile :file:`foo.c` to the extension :mod:`pkg.foo`, and :file:`bar.c` to Extension source files ---------------------- -The second argument to the :class:`Extension` constructor is a list of source +The second argument to the :class:`~distutils.core.Extension` constructor is +a list of source files. Since the Distutils currently only support C, C++, and Objective-C extensions, these are normally C/C++/Objective-C source files. (Be sure to use appropriate extensions to distinguish C++\ source files: :file:`.cc` and @@ -232,9 +234,9 @@ linked into the executable. Preprocessor options -------------------- -Three optional arguments to :class:`Extension` will help if you need to specify -include directories to search or preprocessor macros to define/undefine: -``include_dirs``, ``define_macros``, and ``undef_macros``. +Three optional arguments to :class:`~distutils.core.Extension` will help if +you need to specify include directories to search or preprocessor macros to +define/undefine: ``include_dirs``, ``define_macros``, and ``undef_macros``. For example, if your extension requires header files in the :file:`include` directory under your distribution root, use the ``include_dirs`` option:: @@ -683,6 +685,8 @@ include the following code fragment in your :file:`setup.py` before the DistributionMetadata.download_url = None +.. _debug-setup-script: + Debugging the setup script ========================== @@ -698,7 +702,8 @@ installation is broken because they don't read all the way down to the bottom and see that it's a permission problem. On the other hand, this doesn't help the developer to find the cause of the -failure. For this purpose, the DISTUTILS_DEBUG environment variable can be set +failure. For this purpose, the :envvar:`DISTUTILS_DEBUG` environment variable can be set to anything except an empty string, and distutils will now print detailed -information what it is doing, and prints the full traceback in case an exception -occurs. +information about what it is doing, dump the full traceback when an exception +occurs, and print the whole command line when an external program (like a C +compiler) fails. diff --git a/Doc/extending/building.rst b/Doc/extending/building.rst index f4d95b2..08b0cc2 100644 --- a/Doc/extending/building.rst +++ b/Doc/extending/building.rst @@ -58,8 +58,9 @@ distutils; this section explains building extension modules only. It is common to pre-compute arguments to :func:`setup`, to better structure the driver script. In the example above, the\ ``ext_modules`` argument to :func:`setup` is a list of extension modules, each of which is an instance of -the :class:`Extension`. In the example, the instance defines an extension named -``demo`` which is build by compiling a single source file, :file:`demo.c`. +the :class:`~distutils.extension.Extension`. In the example, the instance +defines an extension named ``demo`` which is build by compiling a single source +file, :file:`demo.c`. In many cases, building an extension is more complex, since additional preprocessor defines and libraries may be needed. This is demonstrated in the diff --git a/Doc/extending/embedding.rst b/Doc/extending/embedding.rst index 9761e25..bd66086 100644 --- a/Doc/extending/embedding.rst +++ b/Doc/extending/embedding.rst @@ -285,14 +285,14 @@ be directly useful to you: * ``pythonX.Y-config --cflags`` will give you the recommended flags when compiling:: - $ /opt/bin/python3.2-config --cflags - -I/opt/include/python3.2m -I/opt/include/python3.2m -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes + $ /opt/bin/python3.3-config --cflags + -I/opt/include/python3.3m -I/opt/include/python3.3m -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes * ``pythonX.Y-config --ldflags`` will give you the recommended flags when linking:: - $ /opt/bin/python3.2-config --ldflags - -I/opt/lib/python3.2/config-3.2m -lpthread -ldl -lutil -lm -lpython3.2m -Xlinker -export-dynamic + $ /opt/bin/python3.3-config --ldflags + -L/opt/lib/python3.3/config-3.3m -lpthread -ldl -lutil -lm -lpython3.3m -Xlinker -export-dynamic .. note:: To avoid confusion between several Python installations (and especially @@ -307,11 +307,13 @@ examine Python's :file:`Makefile` (use :func:`sysconfig.get_makefile_filename` to find its location) and compilation options. In this case, the :mod:`sysconfig` module is a useful tool to programmatically extract the configuration values that you will want to -combine together: +combine together. For example: .. code-block:: python >>> import sysconfig + >>> sysconfig.get_config_var('LIBS') + '-lpthread -ldl -lutil' >>> sysconfig.get_config_var('LINKFORSHARED') '-Xlinker -export-dynamic' diff --git a/Doc/extending/extending.rst b/Doc/extending/extending.rst index 7fd9f72..a3bf265 100644 --- a/Doc/extending/extending.rst +++ b/Doc/extending/extending.rst @@ -321,7 +321,7 @@ parameters to be passed in as a tuple acceptable for parsing via The :const:`METH_KEYWORDS` bit may be set in the third field if keyword arguments should be passed to the function. In this case, the C function should -accept a third ``PyObject \*`` parameter which will be a dictionary of keywords. +accept a third ``PyObject *`` parameter which will be a dictionary of keywords. Use :c:func:`PyArg_ParseTupleAndKeywords` to parse the arguments to such a function. @@ -384,8 +384,7 @@ optionally followed by an import of the module:: imports it. */ PyImport_ImportModule("spam"); -An example may be found in the file :file:`Demo/embed/demo.c` in the Python -source distribution. + ... .. note:: @@ -518,7 +517,7 @@ or more format codes between parentheses. For example:: value of the Python function. :c:func:`PyObject_CallObject` is "reference-count-neutral" with respect to its arguments. In the example a new tuple was created to serve as the argument list, which is :c:func:`Py_DECREF`\ --ed immediately after the call. +-ed immediately after the :c:func:`PyObject_CallObject` call. The return value of :c:func:`PyObject_CallObject` is "new": either it is a brand new object, or it is an existing object whose reference count has been @@ -720,9 +719,7 @@ Philbrick (philbrick@hks.com):: action, voltage); printf("-- Lovely plumage, the %s -- It's %s!\n", type, state); - Py_INCREF(Py_None); - - return Py_None; + Py_RETURN_NONE; } static PyMethodDef keywdarg_methods[] = { @@ -863,9 +860,9 @@ the cycle itself. The cycle detector is able to detect garbage cycles and can reclaim them so long as there are no finalizers implemented in Python (:meth:`__del__` methods). When there are such finalizers, the detector exposes the cycles through the -:mod:`gc` module (specifically, the -``garbage`` variable in that module). The :mod:`gc` module also exposes a way -to run the detector (the :func:`collect` function), as well as configuration +:mod:`gc` module (specifically, the :attr:`~gc.garbage` variable in that module). +The :mod:`gc` module also exposes a way to run the detector (the +:func:`~gc.collect` function), as well as configuration interfaces and the ability to disable the detector at runtime. The cycle detector is considered an optional component; though it is included by default, it can be disabled at build time using the :option:`--without-cycle-gc` option diff --git a/Doc/extending/index.rst b/Doc/extending/index.rst index 75cf4c5..44a7f92 100644 --- a/Doc/extending/index.rst +++ b/Doc/extending/index.rst @@ -5,12 +5,12 @@ ################################################## This document describes how to write modules in C or C++ to extend the Python -interpreter with new modules. Those modules can define new functions but also -new object types and their methods. The document also describes how to embed -the Python interpreter in another application, for use as an extension language. -Finally, it shows how to compile and link extension modules so that they can be -loaded dynamically (at run time) into the interpreter, if the underlying -operating system supports this feature. +interpreter with new modules. Those modules can not only define new functions +but also new object types and their methods. The document also describes how +to embed the Python interpreter in another application, for use as an extension +language. Finally, it shows how to compile and link extension modules so that +they can be loaded dynamically (at run time) into the interpreter, if the +underlying operating system supports this feature. This document assumes basic knowledge about Python. For an informal introduction to the language, see :ref:`tutorial-index`. :ref:`reference-index` @@ -21,6 +21,15 @@ Python) that give the language its wide application range. For a detailed description of the whole Python/C API, see the separate :ref:`c-api-index`. +.. note:: + + This guide only covers the basic tools for creating extensions provided + as part of this version of CPython. Third party tools may offer simpler + alternatives. Refer to the `binary extensions section + <https://python-packaging-user-guide.readthedocs.org/en/latest/extensions.html>`__ + in the Python Packaging User Guide for more information. + + .. toctree:: :maxdepth: 2 :numbered: diff --git a/Doc/extending/newtypes.rst b/Doc/extending/newtypes.rst index 3d68251..f484ba4 100644 --- a/Doc/extending/newtypes.rst +++ b/Doc/extending/newtypes.rst @@ -26,11 +26,12 @@ The Basics ========== The Python runtime sees all Python objects as variables of type -:c:type:`PyObject\*`. A :c:type:`PyObject` is not a very magnificent object - it -just contains the refcount and a pointer to the object's "type object". This is -where the action is; the type object determines which (C) functions get called -when, for instance, an attribute gets looked up on an object or it is multiplied -by another object. These C functions are called "type methods". +:c:type:`PyObject\*`, which serves as a "base type" for all Python objects. +:c:type:`PyObject` itself only contains the refcount and a pointer to the +object's "type object". This is where the action is; the type object determines +which (C) functions get called when, for instance, an attribute gets looked +up on an object or it is multiplied by another object. These C functions +are called "type methods". So, if you want to define a new object type, you need to create a new type object. @@ -50,15 +51,15 @@ The first bit that will be new is:: PyObject_HEAD } noddy_NoddyObject; -This is what a Noddy object will contain---in this case, nothing more than every -Python object contains, namely a refcount and a pointer to a type object. These -are the fields the ``PyObject_HEAD`` macro brings in. The reason for the macro -is to standardize the layout and to enable special debugging fields in debug -builds. Note that there is no semicolon after the ``PyObject_HEAD`` macro; one -is included in the macro definition. Be wary of adding one by accident; it's -easy to do from habit, and your compiler might not complain, but someone else's -probably will! (On Windows, MSVC is known to call this an error and refuse to -compile the code.) +This is what a Noddy object will contain---in this case, nothing more than what +every Python object contains---a refcount and a pointer to a type object. +These are the fields the ``PyObject_HEAD`` macro brings in. The reason for the +macro is to standardize the layout and to enable special debugging fields in +debug builds. Note that there is no semicolon after the ``PyObject_HEAD`` +macro; one is included in the macro definition. Be wary of adding one by +accident; it's easy to do from habit, and your compiler might not complain, +but someone else's probably will! (On Windows, MSVC is known to call this an +error and refuse to compile the code.) For contrast, let's take a look at the corresponding definition for standard Python floats:: @@ -134,11 +135,11 @@ This is so that Python knows how much memory to allocate when you call .. note:: If you want your type to be subclassable from Python, and your type has the same - :attr:`tp_basicsize` as its base type, you may have problems with multiple + :c:member:`~PyTypeObject.tp_basicsize` as its base type, you may have problems with multiple inheritance. A Python subclass of your type will have to list your type first - in its :attr:`__bases__`, or else it will not be able to call your type's + in its :attr:`~class.__bases__`, or else it will not be able to call your type's :meth:`__new__` method without getting an error. You can avoid this problem by - ensuring that your type has a larger value for :attr:`tp_basicsize` than its + ensuring that your type has a larger value for :c:member:`~PyTypeObject.tp_basicsize` than its base type does. Most of the time, this will be true anyway, because either your base type will be :class:`object`, or else you will be adding data members to your base type, and therefore increasing its size. @@ -158,7 +159,7 @@ to :const:`Py_TPFLAGS_DEFAULT`. :: All types should include this constant in their flags. It enables all of the members defined by the current version of Python. -We provide a doc string for the type in :attr:`tp_doc`. :: +We provide a doc string for the type in :c:member:`~PyTypeObject.tp_doc`. :: "Noddy objects", /* tp_doc */ @@ -167,12 +168,12 @@ from the others. We aren't going to implement any of these in this version of the module. We'll expand this example later to have more interesting behavior. For now, all we want to be able to do is to create new :class:`Noddy` objects. -To enable object creation, we have to provide a :attr:`tp_new` implementation. +To enable object creation, we have to provide a :c:member:`~PyTypeObject.tp_new` implementation. In this case, we can just use the default implementation provided by the API function :c:func:`PyType_GenericNew`. We'd like to just assign this to the -:attr:`tp_new` slot, but we can't, for portability sake, On some platforms or +:c:member:`~PyTypeObject.tp_new` slot, but we can't, for portability sake, On some platforms or compilers, we can't statically initialize a structure member with a function -defined in another C module, so, instead, we'll assign the :attr:`tp_new` slot +defined in another C module, so, instead, we'll assign the :c:member:`~PyTypeObject.tp_new` slot in the module initialization function just before calling :c:func:`PyType_Ready`:: @@ -224,7 +225,7 @@ doesn't do anything. It can't even be subclassed. Adding data and methods to the Basic example -------------------------------------------- -Let's expend the basic example to add some data and methods. Let's also make +Let's extend the basic example to add some data and methods. Let's also make the type usable as a base class. We'll create a new module, :mod:`noddy2` that adds these capabilities: @@ -267,13 +268,13 @@ allocation and deallocation. At a minimum, we need a deallocation method:: Py_TYPE(self)->tp_free((PyObject*)self); } -which is assigned to the :attr:`tp_dealloc` member:: +which is assigned to the :c:member:`~PyTypeObject.tp_dealloc` member:: (destructor)Noddy_dealloc, /*tp_dealloc*/ This method decrements the reference counts of the two Python attributes. We use :c:func:`Py_XDECREF` here because the :attr:`first` and :attr:`last` members -could be *NULL*. It then calls the :attr:`tp_free` member of the object's type +could be *NULL*. It then calls the :c:member:`~PyTypeObject.tp_free` member of the object's type to free the object's memory. Note that the object's type might not be :class:`NoddyType`, because the object may be an instance of a subclass. @@ -288,18 +289,16 @@ strings, so we provide a new method:: self = (Noddy *)type->tp_alloc(type, 0); if (self != NULL) { self->first = PyUnicode_FromString(""); - if (self->first == NULL) - { + if (self->first == NULL) { Py_DECREF(self); return NULL; - } + } self->last = PyUnicode_FromString(""); - if (self->last == NULL) - { + if (self->last == NULL) { Py_DECREF(self); return NULL; - } + } self->number = 0; } @@ -307,7 +306,7 @@ strings, so we provide a new method:: return (PyObject *)self; } -and install it in the :attr:`tp_new` member:: +and install it in the :c:member:`~PyTypeObject.tp_new` member:: Noddy_new, /* tp_new */ @@ -327,17 +326,17 @@ any arguments passed when the type was called, and that returns the new object created. New methods always accept positional and keyword arguments, but they often ignore the arguments, leaving the argument handling to initializer methods. Note that if the type supports subclassing, the type passed may not be -the type being defined. The new method calls the tp_alloc slot to allocate -memory. We don't fill the :attr:`tp_alloc` slot ourselves. Rather +the type being defined. The new method calls the :c:member:`~PyTypeObject.tp_alloc` slot to +allocate memory. We don't fill the :c:member:`~PyTypeObject.tp_alloc` slot ourselves. Rather :c:func:`PyType_Ready` fills it for us by inheriting it from our base class, which is :class:`object` by default. Most types use the default allocation. .. note:: - If you are creating a co-operative :attr:`tp_new` (one that calls a base type's - :attr:`tp_new` or :meth:`__new__`), you must *not* try to determine what method + If you are creating a co-operative :c:member:`~PyTypeObject.tp_new` (one that calls a base type's + :c:member:`~PyTypeObject.tp_new` or :meth:`__new__`), you must *not* try to determine what method to call using method resolution order at runtime. Always statically determine - what type you are going to call, and call its :attr:`tp_new` directly, or via + what type you are going to call, and call its :c:member:`~PyTypeObject.tp_new` directly, or via ``type->tp_base->tp_new``. If you do not do this, Python subclasses of your type that also inherit from other Python-defined classes may not work correctly. (Specifically, you may not be able to create instances of such subclasses @@ -374,11 +373,11 @@ We provide an initialization function:: return 0; } -by filling the :attr:`tp_init` slot. :: +by filling the :c:member:`~PyTypeObject.tp_init` slot. :: (initproc)Noddy_init, /* tp_init */ -The :attr:`tp_init` slot is exposed in Python as the :meth:`__init__` method. It +The :c:member:`~PyTypeObject.tp_init` slot is exposed in Python as the :meth:`__init__` method. It is used to initialize an object after it's created. Unlike the new method, we can't guarantee that the initializer is called. The initializer isn't called when unpickling objects and it can be overridden. Our initializer accepts @@ -408,7 +407,7 @@ reference counts. When don't we have to do this? * when we know that deallocation of the object [#]_ will not cause any calls back into our type's code -* when decrementing a reference count in a :attr:`tp_dealloc` handler when +* when decrementing a reference count in a :c:member:`~PyTypeObject.tp_dealloc` handler when garbage-collections is not supported [#]_ We want to expose our instance variables as attributes. There are a @@ -424,7 +423,7 @@ number of ways to do that. The simplest way is to define member definitions:: {NULL} /* Sentinel */ }; -and put the definitions in the :attr:`tp_members` slot:: +and put the definitions in the :c:member:`~PyTypeObject.tp_members` slot:: Noddy_members, /* tp_members */ @@ -445,15 +444,6 @@ concatenation of the first and last names. :: static PyObject * Noddy_name(Noddy* self) { - static PyObject *format = NULL; - PyObject *args, *result; - - if (format == NULL) { - format = PyUnicode_FromString("%s %s"); - if (format == NULL) - return NULL; - } - if (self->first == NULL) { PyErr_SetString(PyExc_AttributeError, "first"); return NULL; @@ -464,20 +454,13 @@ concatenation of the first and last names. :: return NULL; } - args = Py_BuildValue("OO", self->first, self->last); - if (args == NULL) - return NULL; - - result = PyUnicode_Format(format, args); - Py_DECREF(args); - - return result; + return PyUnicode_FromFormat("%S %S", self->first, self->last); } The method is implemented as a C function that takes a :class:`Noddy` (or :class:`Noddy` subclass) instance as the first argument. Methods always take an instance as the first argument. Methods often take positional and keyword -arguments as well, but in this cased we don't take any and don't need to accept +arguments as well, but in this case we don't take any and don't need to accept a positional argument tuple or keyword argument dictionary. This method is equivalent to the Python method:: @@ -500,7 +483,7 @@ definitions:: {NULL} /* Sentinel */ }; -and assign them to the :attr:`tp_methods` slot:: +and assign them to the :c:member:`~PyTypeObject.tp_methods` slot:: Noddy_methods, /* tp_methods */ @@ -595,7 +578,7 @@ We create an array of :c:type:`PyGetSetDef` structures:: {NULL} /* Sentinel */ }; -and register it in the :attr:`tp_getset` slot:: +and register it in the :c:member:`~PyTypeObject.tp_getset` slot:: Noddy_getseters, /* tp_getset */ @@ -612,7 +595,7 @@ We also remove the member definitions for these attributes:: {NULL} /* Sentinel */ }; -We also need to update the :attr:`tp_init` handler to only allow strings [#]_ to +We also need to update the :c:member:`~PyTypeObject.tp_init` handler to only allow strings [#]_ to be passed:: static int @@ -730,7 +713,7 @@ functions. With :c:func:`Py_VISIT`, :c:func:`Noddy_traverse` can be simplified: .. note:: - Note that the :attr:`tp_traverse` implementation must name its arguments exactly + Note that the :c:member:`~PyTypeObject.tp_traverse` implementation must name its arguments exactly *visit* and *arg* in order to use :c:func:`Py_VISIT`. This is to encourage uniformity across these boring implementations. @@ -767,7 +750,7 @@ its reference count. We do this because, as was discussed earlier, if the reference count drops to zero, we might cause code to run that calls back into the object. In addition, because we now support garbage collection, we also have to worry about code being run that triggers garbage collection. If garbage -collection is run, our :attr:`tp_traverse` handler could get called. We can't +collection is run, our :c:member:`~PyTypeObject.tp_traverse` handler could get called. We can't take a chance of having :c:func:`Noddy_traverse` called when a member's reference count has dropped to zero and its value hasn't been set to *NULL*. @@ -787,8 +770,8 @@ Finally, we add the :const:`Py_TPFLAGS_HAVE_GC` flag to the class flags:: Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE | Py_TPFLAGS_HAVE_GC, /* tp_flags */ -That's pretty much it. If we had written custom :attr:`tp_alloc` or -:attr:`tp_free` slots, we'd need to modify them for cyclic-garbage collection. +That's pretty much it. If we had written custom :c:member:`~PyTypeObject.tp_alloc` or +:c:member:`~PyTypeObject.tp_free` slots, we'd need to modify them for cyclic-garbage collection. Most extensions will use the versions automatically provided. @@ -847,8 +830,8 @@ the :attr:`__init__` method of the base type. This pattern is important when writing a type with custom :attr:`new` and :attr:`dealloc` methods. The :attr:`new` method should not actually create the -memory for the object with :attr:`tp_alloc`, that will be handled by the base -class when calling its :attr:`tp_new`. +memory for the object with :c:member:`~PyTypeObject.tp_alloc`, that will be handled by the base +class when calling its :c:member:`~PyTypeObject.tp_new`. When filling out the :c:func:`PyTypeObject` for the :class:`Shoddy` type, you see a slot for :c:func:`tp_base`. Due to cross platform compiler issues, you can't @@ -874,8 +857,8 @@ the module's :c:func:`init` function. :: } Before calling :c:func:`PyType_Ready`, the type structure must have the -:attr:`tp_base` slot filled in. When we are deriving a new type, it is not -necessary to fill out the :attr:`tp_alloc` slot with :c:func:`PyType_GenericNew` +:c:member:`~PyTypeObject.tp_base` slot filled in. When we are deriving a new type, it is not +necessary to fill out the :c:member:`~PyTypeObject.tp_alloc` slot with :c:func:`PyType_GenericNew` -- the allocate function from the base type will be inherited. After that, calling :c:func:`PyType_Ready` and adding the type object to the @@ -918,7 +901,7 @@ that will be helpful in such a situation! :: These fields tell the runtime how much memory to allocate when new objects of this type are created. Python has some built-in support for variable length -structures (think: strings, lists) which is where the :attr:`tp_itemsize` field +structures (think: strings, lists) which is where the :c:member:`~PyTypeObject.tp_itemsize` field comes in. This will be dealt with later. :: char *tp_doc; @@ -979,10 +962,9 @@ done. This can be done using the :c:func:`PyErr_Fetch` and if (self->my_callback != NULL) { PyObject *err_type, *err_value, *err_traceback; - int have_error = PyErr_Occurred() ? 1 : 0; - if (have_error) - PyErr_Fetch(&err_type, &err_value, &err_traceback); + /* This saves the current exception state */ + PyErr_Fetch(&err_type, &err_value, &err_traceback); cbresult = PyObject_CallObject(self->my_callback, NULL); if (cbresult == NULL) @@ -990,8 +972,8 @@ done. This can be done using the :c:func:`PyErr_Fetch` and else Py_DECREF(cbresult); - if (have_error) - PyErr_Restore(err_type, err_value, err_traceback); + /* This restores the saved exception state */ + PyErr_Restore(err_type, err_value, err_traceback); Py_DECREF(self->my_callback); } @@ -999,12 +981,12 @@ done. This can be done using the :c:func:`PyErr_Fetch` and } -Object Presentation -------------------- - .. index:: + single: string; object representation builtin: repr - builtin: str + +Object Presentation +------------------- In Python, there are two ways to generate a textual representation of an object: the :func:`repr` function, and the :func:`str` function. (The :func:`print` @@ -1015,7 +997,7 @@ function just calls :func:`str`.) These handlers are both optional. reprfunc tp_repr; reprfunc tp_str; -The :attr:`tp_repr` handler should return a string object containing a +The :c:member:`~PyTypeObject.tp_repr` handler should return a string object containing a representation of the instance for which it is called. Here is a simple example:: @@ -1026,15 +1008,15 @@ example:: obj->obj_UnderlyingDatatypePtr->size); } -If no :attr:`tp_repr` handler is specified, the interpreter will supply a -representation that uses the type's :attr:`tp_name` and a uniquely-identifying +If no :c:member:`~PyTypeObject.tp_repr` handler is specified, the interpreter will supply a +representation that uses the type's :c:member:`~PyTypeObject.tp_name` and a uniquely-identifying value for the object. -The :attr:`tp_str` handler is to :func:`str` what the :attr:`tp_repr` handler +The :c:member:`~PyTypeObject.tp_str` handler is to :func:`str` what the :c:member:`~PyTypeObject.tp_repr` handler described above is to :func:`repr`; that is, it is called when Python code calls :func:`str` on an instance of your object. Its implementation is very similar -to the :attr:`tp_repr` function, but the resulting string is intended for human -consumption. If :attr:`tp_str` is not specified, the :attr:`tp_repr` handler is +to the :c:member:`~PyTypeObject.tp_repr` function, but the resulting string is intended for human +consumption. If :c:member:`~PyTypeObject.tp_str` is not specified, the :c:member:`~PyTypeObject.tp_repr` handler is used instead. Here is a simple example:: @@ -1099,7 +1081,7 @@ type object to create :term:`descriptor`\s which are placed in the dictionary of type object. Each descriptor controls access to one attribute of the instance object. Each of the tables is optional; if all three are *NULL*, instances of the type will only have attributes that are inherited from their base type, and -should leave the :attr:`tp_getattro` and :attr:`tp_setattro` fields *NULL* as +should leave the :c:member:`~PyTypeObject.tp_getattro` and :c:member:`~PyTypeObject.tp_setattro` fields *NULL* as well, allowing the base type to handle attributes. The tables are declared as three fields of the type object:: @@ -1108,7 +1090,7 @@ The tables are declared as three fields of the type object:: struct PyMemberDef *tp_members; struct PyGetSetDef *tp_getset; -If :attr:`tp_methods` is not *NULL*, it must refer to an array of +If :c:member:`~PyTypeObject.tp_methods` is not *NULL*, it must refer to an array of :c:type:`PyMethodDef` structures. Each entry in the table is an instance of this structure:: @@ -1124,9 +1106,6 @@ needed for methods inherited from a base type. One additional entry is needed at the end; it is a sentinel that marks the end of the array. The :attr:`ml_name` field of the sentinel must be *NULL*. -XXX Need to refer to some unified discussion of the structure fields, shared -with the next section. - The second table is used to define attributes which map directly to data stored in the instance. A variety of primitive C types are supported, and access may be read-only or read-write. The structures in the table are defined as:: @@ -1146,8 +1125,6 @@ type which will be able to extract a value from the instance structure. The convert Python values to and from C values. The :attr:`flags` field is used to store flags which control how the attribute can be accessed. -XXX Need to move some of this to a shared section! - The following flag constants are defined in :file:`structmember.h`; they may be combined using bitwise-OR. @@ -1169,13 +1146,13 @@ combined using bitwise-OR. single: WRITE_RESTRICTED single: RESTRICTED -An interesting advantage of using the :attr:`tp_members` table to build +An interesting advantage of using the :c:member:`~PyTypeObject.tp_members` table to build descriptors that are used at runtime is that any attribute defined this way can have an associated doc string simply by providing the text in the table. An application can use the introspection API to retrieve the descriptor from the class object, and get the doc string using its :attr:`__doc__` attribute. -As with the :attr:`tp_methods` table, a sentinel entry with a :attr:`name` value +As with the :c:member:`~PyTypeObject.tp_methods` table, a sentinel entry with a :attr:`name` value of *NULL* is required. .. XXX Descriptors need to be explained in more detail somewhere, but not here. @@ -1199,7 +1176,7 @@ support added in Python 2.2. It explains how the handler functions are called, so that if you do need to extend their functionality, you'll understand what needs to be done. -The :attr:`tp_getattr` handler is called when the object requires an attribute +The :c:member:`~PyTypeObject.tp_getattr` handler is called when the object requires an attribute look-up. It is called in the same situations where the :meth:`__getattr__` method of a class would be called. @@ -1219,11 +1196,11 @@ Here is an example:: return NULL; } -The :attr:`tp_setattr` handler is called when the :meth:`__setattr__` or +The :c:member:`~PyTypeObject.tp_setattr` handler is called when the :meth:`__setattr__` or :meth:`__delattr__` method of a class instance would be called. When an attribute should be deleted, the third parameter will be *NULL*. Here is an example that simply raises an exception; if this were really all you wanted, the -:attr:`tp_setattr` handler should be set to *NULL*. :: +:c:member:`~PyTypeObject.tp_setattr` handler should be set to *NULL*. :: static int newdatatype_setattr(newdatatypeobject *obj, char *name, PyObject *v) @@ -1239,7 +1216,7 @@ Object Comparison richcmpfunc tp_richcompare; -The :attr:`tp_richcompare` handler is called when comparisons are needed. It is +The :c:member:`~PyTypeObject.tp_richcompare` handler is called when comparisons are needed. It is analogous to the :ref:`rich comparison methods <richcmpfuncs>`, like :meth:`__lt__`, and also called by :c:func:`PyObject_RichCompare` and :c:func:`PyObject_RichCompareBool`. @@ -1255,7 +1232,7 @@ if an exception was set. Here is a sample implementation, for a datatype that is considered equal if the size of an internal pointer is equal:: - static int + static PyObject * newdatatype_richcmp(PyObject *obj1, PyObject *obj2, int op) { PyObject *result; @@ -1330,7 +1307,7 @@ instance of your data type. Here is a moderately pointless example:: This function is called when an instance of your data type is "called", for example, if ``obj1`` is an instance of your data type and the Python script -contains ``obj1('hello')``, the :attr:`tp_call` handler is invoked. +contains ``obj1('hello')``, the :c:member:`~PyTypeObject.tp_call` handler is invoked. This function takes three arguments: @@ -1371,7 +1348,7 @@ Here is a desultory example of the implementation of the call function. :: return result; } -XXX some fields need to be added here... :: +:: /* Iterators */ getiterfunc tp_iter; @@ -1417,7 +1394,7 @@ those objects which do not benefit by weak referencing (such as numbers). For an object to be weakly referencable, the extension must include a :c:type:`PyObject\*` field in the instance structure for the use of the weak reference mechanism; it must be initialized to *NULL* by the object's -constructor. It must also set the :attr:`tp_weaklistoffset` field of the +constructor. It must also set the :c:member:`~PyTypeObject.tp_weaklistoffset` field of the corresponding type object to the offset of the field. For example, the instance type is defined with the following structure:: @@ -1503,7 +1480,7 @@ might be something like the following:: .. [#] This is true when we know that the object is a basic type, like a string or a float. -.. [#] We relied on this in the :attr:`tp_dealloc` handler in this example, because our +.. [#] We relied on this in the :c:member:`~PyTypeObject.tp_dealloc` handler in this example, because our type doesn't support garbage collection. Even if a type supports garbage collection, there are calls that can be made to "untrack" the object from garbage collection, however, these calls are advanced and not covered here. diff --git a/Doc/faq/design.rst b/Doc/faq/design.rst index c3061ae..49e0c6d 100644 --- a/Doc/faq/design.rst +++ b/Doc/faq/design.rst @@ -347,20 +347,20 @@ Answer 2: Fortunately, there is `Stackless Python <http://www.stackless.com>`_, which has a completely redesigned interpreter loop that avoids the C stack. -Why can't lambda forms contain statements? ------------------------------------------- +Why can't lambda expressions contain statements? +------------------------------------------------ -Python lambda forms cannot contain statements because Python's syntactic +Python lambda expressions cannot contain statements because Python's syntactic framework can't handle statements nested inside expressions. However, in Python, this is not a serious problem. Unlike lambda forms in other languages, where they add functionality, Python lambdas are only a shorthand notation if you're too lazy to define a function. Functions are already first class objects in Python, and can be declared in a -local scope. Therefore the only advantage of using a lambda form instead of a +local scope. Therefore the only advantage of using a lambda instead of a locally-defined function is that you don't need to invent a name for the function -- but that's just a local variable to which the function object (which -is exactly the same type of object that a lambda form yields) is assigned! +is exactly the same type of object that a lambda expression yields) is assigned! Can Python be compiled to machine code, C or some other language? @@ -512,14 +512,16 @@ far) under most circumstances, and the implementation is simpler. Dictionaries work by computing a hash code for each key stored in the dictionary using the :func:`hash` built-in function. The hash code varies widely depending -on the key; for example, "Python" hashes to -539294296 while "python", a string -that differs by a single bit, hashes to 1142331976. The hash code is then used -to calculate a location in an internal array where the value will be stored. -Assuming that you're storing keys that all have different hash values, this -means that dictionaries take constant time -- O(1), in computer science notation --- to retrieve a key. It also means that no sorted order of the keys is -maintained, and traversing the array as the ``.keys()`` and ``.items()`` do will -output the dictionary's content in some arbitrary jumbled order. +on the key and a per-process seed; for example, "Python" could hash to +-539294296 while "python", a string that differs by a single bit, could hash +to 1142331976. The hash code is then used to calculate a location in an +internal array where the value will be stored. Assuming that you're storing +keys that all have different hash values, this means that dictionaries take +constant time -- O(1), in computer science notation -- to retrieve a key. It +also means that no sorted order of the keys is maintained, and traversing the +array as the ``.keys()`` and ``.items()`` do will output the dictionary's +content in some arbitrary jumbled order that can change with every invocation of +a program. Why must dictionary keys be immutable? @@ -631,8 +633,9 @@ construction of large programs. Python 2.6 adds an :mod:`abc` module that lets you define Abstract Base Classes (ABCs). You can then use :func:`isinstance` and :func:`issubclass` to check whether an instance or a class implements a particular ABC. The -:mod:`collections` module defines a set of useful ABCs such as -:class:`Iterable`, :class:`Container`, and :class:`MutableMapping`. +:mod:`collections.abc` module defines a set of useful ABCs such as +:class:`~collections.abc.Iterable`, :class:`~collections.abc.Container`, and +:class:`~collections.abc.MutableMapping`. For Python, many of the advantages of interface specifications can be obtained by an appropriate test discipline for components. There is also a tool, @@ -860,8 +863,8 @@ There are several reasons to allow this. When you have a literal value for a list, tuple, or dictionary spread across multiple lines, it's easier to add more elements because you don't have to -remember to add a comma to the previous line. The lines can also be sorted in -your editor without creating a syntax error. +remember to add a comma to the previous line. The lines can also be reordered +without creating a syntax error. Accidentally omitting the comma can lead to errors that are hard to diagnose. For example:: diff --git a/Doc/faq/extending.rst b/Doc/faq/extending.rst index f862151..a9a234b 100644 --- a/Doc/faq/extending.rst +++ b/Doc/faq/extending.rst @@ -2,7 +2,9 @@ Extending/Embedding FAQ ======================= -.. contents:: +.. only:: html + + .. contents:: .. highlight:: c @@ -447,34 +449,3 @@ In Python 2.2, you can inherit from built-in classes such as :class:`int`, The Boost Python Library (BPL, http://www.boost.org/libs/python/doc/index.html) provides a way of doing this from C++ (i.e. you can inherit from an extension class written in C++ using the BPL). - - -When importing module X, why do I get "undefined symbol: PyUnicodeUCS2*"? -------------------------------------------------------------------------- - -You are using a version of Python that uses a 4-byte representation for Unicode -characters, but some C extension module you are importing was compiled using a -Python that uses a 2-byte representation for Unicode characters (the default). - -If instead the name of the undefined symbol starts with ``PyUnicodeUCS4``, the -problem is the reverse: Python was built using 2-byte Unicode characters, and -the extension module was compiled using a Python with 4-byte Unicode characters. - -This can easily occur when using pre-built extension packages. RedHat Linux -7.x, in particular, provided a "python2" binary that is compiled with 4-byte -Unicode. This only causes the link failure if the extension uses any of the -``PyUnicode_*()`` functions. It is also a problem if an extension uses any of -the Unicode-related format specifiers for :c:func:`Py_BuildValue` (or similar) or -parameter specifications for :c:func:`PyArg_ParseTuple`. - -You can check the size of the Unicode character a Python interpreter is using by -checking the value of sys.maxunicode: - - >>> import sys - >>> if sys.maxunicode > 65535: - ... print('UCS4 build') - ... else: - ... print('UCS2 build') - -The only way to solve this problem is to use extension modules compiled with a -Python binary built using the same size for Unicode characters. diff --git a/Doc/faq/general.rst b/Doc/faq/general.rst index 9f26dc9..6f4733f 100644 --- a/Doc/faq/general.rst +++ b/Doc/faq/general.rst @@ -4,7 +4,10 @@ General Python FAQ ================== -.. contents:: +.. only:: html + + .. contents:: + General Information =================== @@ -178,8 +181,8 @@ at http://docs.python.org/. PDF, plain text, and downloadable HTML versions are also available at http://docs.python.org/download.html. The documentation is written in reStructuredText and processed by `the Sphinx -documentation tool <http://sphinx.pocoo.org/>`__. The reStructuredText source -for the documentation is part of the Python source distribution. +documentation tool <http://sphinx-doc.org/>`__. The reStructuredText source for +the documentation is part of the Python source distribution. I've never programmed before. Is there a Python tutorial? @@ -265,9 +268,13 @@ Python references; or perhaps search for "Python" and "language". Where in the world is www.python.org located? --------------------------------------------- -It's currently in Amsterdam, graciously hosted by `XS4ALL -<http://www.xs4all.nl>`_. Thanks to Thomas Wouters for his work in arranging -python.org's hosting. +The Python project's infrastructure is located all over the world. +`www.python.org <http://www.python.org>`_ is currently in Amsterdam, graciously +hosted by `XS4ALL <http://www.xs4all.nl>`_. `Upfront Systems +<http://www.upfrontsystems.co.za>`_ hosts `bugs.python.org +<http://bugs.python.org>`_. Most other Python services like `PyPI +<https://pypi.python.org>`_ and hg.python.org are hosted by `Oregon State +University Open Source Lab <https://osuosl.org>`_. Why is it called Python? @@ -464,7 +471,8 @@ that is written in Python using Tkinter. PythonWin is a Windows-specific IDE. Emacs users will be happy to know that there is a very good Python mode for Emacs. All of these programming environments provide syntax highlighting, auto-indenting, and access to the interactive interpreter while coding. Consult -http://www.python.org/editors/ for a full list of Python editing environments. +`the Python wiki <https://wiki.python.org/moin/PythonEditors>`_ for a full list +of Python editing environments. If you want to discuss Python's use in education, you may be interested in joining `the edu-sig mailing list diff --git a/Doc/faq/gui.rst b/Doc/faq/gui.rst index f697cd3..5827f28 100644 --- a/Doc/faq/gui.rst +++ b/Doc/faq/gui.rst @@ -4,7 +4,9 @@ Graphic User Interface FAQ ========================== -.. contents:: +.. only:: html + + .. contents:: .. XXX need review for Python 3. diff --git a/Doc/faq/library.rst b/Doc/faq/library.rst index cab2d7b..2566932 100644 --- a/Doc/faq/library.rst +++ b/Doc/faq/library.rst @@ -4,7 +4,9 @@ Library and Extension FAQ ========================= -.. contents:: +.. only:: html + + .. contents:: General Library Questions ========================= @@ -507,6 +509,7 @@ For data that is more regular (e.g. a homogeneous list of ints or floats), you can also use the :mod:`array` module. .. note:: + To read and write binary data, it is mandatory to open the file in binary mode (here, passing ``"rb"`` to :func:`open`). If you use ``"r"`` instead (the default), the file will be open in text mode @@ -656,7 +659,7 @@ and client-side web systems. .. XXX check if wiki page is still up to date A summary of available frameworks is maintained by Paul Boddie at -http://wiki.python.org/moin/WebProgramming . +http://wiki.python.org/moin/WebProgramming\ . Cameron Laird maintains a useful set of pages about Python web technologies at http://phaseit.net/claird/comp.lang.python/web_python. diff --git a/Doc/faq/programming.rst b/Doc/faq/programming.rst index 32123de..280d5e1 100644 --- a/Doc/faq/programming.rst +++ b/Doc/faq/programming.rst @@ -4,7 +4,9 @@ Programming FAQ =============== -.. contents:: +.. only:: html + + .. contents:: General Questions ================= @@ -212,9 +214,9 @@ Why do lambdas defined in a loop with different values all return the same resul Assume you use a for loop to define a few different lambdas (or even plain functions), e.g.:: - squares = [] - for x in range(5): - squares.append(lambda: x**2) + >>> squares = [] + >>> for x in range(5): + ... squares.append(lambda: x**2) This gives you a list that contains 5 lambdas that calculate ``x**2``. You might expect that, when called, they would return, respectively, ``0``, ``1``, @@ -239,9 +241,9 @@ changing the value of ``x`` and see how the results of the lambdas change:: In order to avoid this, you need to save the values in variables local to the lambdas, so that they don't rely on the value of the global ``x``:: - squares = [] - for x in range(5): - squares.append(lambda n=x: n**2) + >>> squares = [] + >>> for x in range(5): + ... squares.append(lambda n=x: n**2) Here, ``n=x`` creates a new variable ``n`` local to the lambda and computed when the lambda is defined so that it has the same value that ``x`` had at @@ -590,11 +592,11 @@ Comma is not an operator in Python. Consider this session:: Since the comma is not an operator, but a separator between expressions the above is evaluated as if you had entered:: - >>> ("a" in "b"), "a" + ("a" in "b"), "a" not:: - >>> "a" in ("b", "a") + "a" in ("b", "a") The same is true of the various assignment operators (``=``, ``+=`` etc). They are not truly operators but syntactic delimiters in assignment statements. @@ -742,6 +744,7 @@ it from. However, if you need an object with the ability to modify in-place unicode data, try using a :class:`io.StringIO` object or the :mod:`array` module:: + >>> import io >>> s = "Hello, world" >>> sio = io.StringIO(s) >>> sio.getvalue() @@ -759,7 +762,7 @@ module:: array('u', 'Hello, world') >>> a[0] = 'y' >>> print(a) - array('u', 'yello world') + array('u', 'yello, world') >>> a.tounicode() 'yello, world' @@ -1058,7 +1061,7 @@ How do I create a multidimensional list? You probably tried to make a multidimensional array like this:: - A = [[None] * 2] * 3 + >>> A = [[None] * 2] * 3 This looks correct if you print it:: @@ -1090,7 +1093,7 @@ use a list comprehension:: A = [[None] * w for i in range(h)] Or, you can use an extension that provides a matrix datatype; `Numeric Python -<http://numpy.scipy.org/>`_ is the best known. +<http://www.numpy.org/>`_ is the best known. How do I apply a method to a sequence of objects? @@ -1100,38 +1103,101 @@ Use a list comprehension:: result = [obj.method() for obj in mylist] +.. _faq-augmented-assignment-tuple-error: -Dictionaries -============ +Why does a_tuple[i] += ['item'] raise an exception when the addition works? +--------------------------------------------------------------------------- -How can I get a dictionary to display its keys in a consistent order? ---------------------------------------------------------------------- +This is because of a combination of the fact that augmented assignment +operators are *assignment* operators, and the difference between mutable and +immutable objects in Python. -You can't. Dictionaries store their keys in an unpredictable order, so the -display order of a dictionary's elements will be similarly unpredictable. +This discussion applies in general when augmented assignment operators are +applied to elements of a tuple that point to mutable objects, but we'll use +a ``list`` and ``+=`` as our exemplar. -This can be frustrating if you want to save a printable version to a file, make -some changes and then compare it with some other printed dictionary. In this -case, use the ``pprint`` module to pretty-print the dictionary; the items will -be presented in order sorted by the key. +If you wrote:: -A more complicated solution is to subclass ``dict`` to create a -``SortedDict`` class that prints itself in a predictable order. Here's one -simpleminded implementation of such a class:: + >>> a_tuple = (1, 2) + >>> a_tuple[0] += 1 + Traceback (most recent call last): + ... + TypeError: 'tuple' object does not support item assignment - class SortedDict(dict): - def __repr__(self): - keys = sorted(self.keys()) - result = ("{!r}: {!r}".format(k, self[k]) for k in keys) - return "{{{}}}".format(", ".join(result)) +The reason for the exception should be immediately clear: ``1`` is added to the +object ``a_tuple[0]`` points to (``1``), producing the result object, ``2``, +but when we attempt to assign the result of the computation, ``2``, to element +``0`` of the tuple, we get an error because we can't change what an element of +a tuple points to. - __str__ = __repr__ +Under the covers, what this augmented assignment statement is doing is +approximately this:: -This will work for many common situations you might encounter, though it's far -from a perfect solution. The largest flaw is that if some values in the -dictionary are also dictionaries, their values won't be presented in any -particular order. + >>> result = a_tuple[0] + 1 + >>> a_tuple[0] = result + Traceback (most recent call last): + ... + TypeError: 'tuple' object does not support item assignment + +It is the assignment part of the operation that produces the error, since a +tuple is immutable. +When you write something like:: + + >>> a_tuple = (['foo'], 'bar') + >>> a_tuple[0] += ['item'] + Traceback (most recent call last): + ... + TypeError: 'tuple' object does not support item assignment + +The exception is a bit more surprising, and even more surprising is the fact +that even though there was an error, the append worked:: + + >>> a_tuple[0] + ['foo', 'item'] + +To see why this happens, you need to know that (a) if an object implements an +``__iadd__`` magic method, it gets called when the ``+=`` augmented assignment +is executed, and its return value is what gets used in the assignment statement; +and (b) for lists, ``__iadd__`` is equivalent to calling ``extend`` on the list +and returning the list. That's why we say that for lists, ``+=`` is a +"shorthand" for ``list.extend``:: + + >>> a_list = [] + >>> a_list += [1] + >>> a_list + [1] + +This is equivalent to:: + + >>> result = a_list.__iadd__([1]) + >>> a_list = result + +The object pointed to by a_list has been mutated, and the pointer to the +mutated object is assigned back to ``a_list``. The end result of the +assignment is a no-op, since it is a pointer to the same object that ``a_list`` +was previously pointing to, but the assignment still happens. + +Thus, in our tuple example what is happening is equivalent to:: + + >>> result = a_tuple[0].__iadd__(['item']) + >>> a_tuple[0] = result + Traceback (most recent call last): + ... + TypeError: 'tuple' object does not support item assignment + +The ``__iadd__`` succeeds, and thus the list is extended, but even though +``result`` points to the same object that ``a_tuple[0]`` already points to, +that final assignment still results in an error, because tuples are immutable. + + +Dictionaries +============ + +How can I get a dictionary to store and display its keys in a consistent order? +------------------------------------------------------------------------------- + +Use :class:`collections.OrderedDict`. I want to do a complicated sort: can you do a Schwartzian Transform in Python? ------------------------------------------------------------------------------ @@ -1510,41 +1576,76 @@ You can program the class's constructor to keep track of all instances by keeping a list of weak references to each instance. +Why does the result of ``id()`` appear to be not unique? +-------------------------------------------------------- + +The :func:`id` builtin returns an integer that is guaranteed to be unique during +the lifetime of the object. Since in CPython, this is the object's memory +address, it happens frequently that after an object is deleted from memory, the +next freshly created object is allocated at the same position in memory. This +is illustrated by this example: + +>>> id(1000) +13901272 +>>> id(2000) +13901272 + +The two ids belong to different integer objects that are created before, and +deleted immediately after execution of the ``id()`` call. To be sure that +objects whose id you want to examine are still alive, create another reference +to the object: + +>>> a = 1000; b = 2000 +>>> id(a) +13901272 +>>> id(b) +13891296 + + Modules ======= How do I create a .pyc file? ---------------------------- -When a module is imported for the first time (or when the source is more recent -than the current compiled file) a ``.pyc`` file containing the compiled code -should be created in the same directory as the ``.py`` file. - -One reason that a ``.pyc`` file may not be created is permissions problems with -the directory. This can happen, for example, if you develop as one user but run -as another, such as if you are testing with a web server. Creation of a .pyc -file is automatic if you're importing a module and Python has the ability -(permissions, free space, etc...) to write the compiled module back to the -directory. - -Running Python on a top level script is not considered an import and no ``.pyc`` -will be created. For example, if you have a top-level module ``abc.py`` that -imports another module ``xyz.py``, when you run abc, ``xyz.pyc`` will be created -since xyz is imported, but no ``abc.pyc`` file will be created since ``abc.py`` -isn't being imported. - -If you need to create abc.pyc -- that is, to create a .pyc file for a module -that is not imported -- you can, using the :mod:`py_compile` and -:mod:`compileall` modules. +When a module is imported for the first time (or when the source file has +changed since the current compiled file was created) a ``.pyc`` file containing +the compiled code should be created in a ``__pycache__`` subdirectory of the +directory containing the ``.py`` file. The ``.pyc`` file will have a +filename that starts with the same name as the ``.py`` file, and ends with +``.pyc``, with a middle component that depends on the particular ``python`` +binary that created it. (See :pep:`3147` for details.) + +One reason that a ``.pyc`` file may not be created is a permissions problem +with the directory containing the source file, meaning that the ``__pycache__`` +subdirectory cannot be created. This can happen, for example, if you develop as +one user but run as another, such as if you are testing with a web server. + +Unless the :envvar:`PYTHONDONTWRITEBYTECODE` environment variable is set, +creation of a .pyc file is automatic if you're importing a module and Python +has the ability (permissions, free space, etc...) to create a ``__pycache__`` +subdirectory and write the compiled module to that subdirectory. + +Running Python on a top level script is not considered an import and no +``.pyc`` will be created. For example, if you have a top-level module +``foo.py`` that imports another module ``xyz.py``, when you run ``foo`` (by +typing ``python foo.py`` as a shell command), a ``.pyc`` will be created for +``xyz`` because ``xyz`` is imported, but no ``.pyc`` file will be created for +``foo`` since ``foo.py`` isn't being imported. + +If you need to create a ``.pyc`` file for ``foo`` -- that is, to create a +``.pyc`` file for a module that is not imported -- you can, using the +:mod:`py_compile` and :mod:`compileall` modules. The :mod:`py_compile` module can manually compile any module. One way is to use the ``compile()`` function in that module interactively:: >>> import py_compile - >>> py_compile.compile('abc.py') + >>> py_compile.compile('foo.py') # doctest: +SKIP -This will write the ``.pyc`` to the same location as ``abc.py`` (or you can -override that with the optional parameter ``cfile``). +This will write the ``.pyc`` to a ``__pycache__`` subdirectory in the same +location as ``foo.py`` (or you can override that with the optional parameter +``cfile``). You can also automatically compile all files in a directory or directories using the :mod:`compileall` module. You can do it from the shell prompt by running diff --git a/Doc/faq/windows.rst b/Doc/faq/windows.rst index 0a85a40..59f9480 100644 --- a/Doc/faq/windows.rst +++ b/Doc/faq/windows.rst @@ -6,7 +6,9 @@ Python on Windows FAQ ===================== -.. contents:: +.. only:: html + + .. contents:: .. XXX need review for Python 3. XXX need review for Windows Vista/Seven? @@ -168,18 +170,20 @@ offender. How do I make an executable from a Python script? ------------------------------------------------- -See http://www.py2exe.org/ for a distutils extension that allows you +See http://cx-freeze.sourceforge.net/ for a distutils extension that allows you to create console and GUI executables from Python code. +`py2exe <http://www.py2exe.org/>`_, the most popular extension for building +Python 2.x-based executables, does not yet support Python 3 but a version that +does is in development. + Is a ``*.pyd`` file the same as a DLL? -------------------------------------- -.. XXX update for py3k (PyInit_foo) - Yes, .pyd files are dll's, but there are a few differences. If you have a DLL -named ``foo.pyd``, then it must have a function ``initfoo()``. You can then +named ``foo.pyd``, then it must have a function ``PyInit_foo()``. You can then write Python "import foo", and Python will search for foo.pyd (as well as -foo.py, foo.pyc) and if it finds it, will attempt to call ``initfoo()`` to +foo.py, foo.pyc) and if it finds it, will attempt to call ``PyInit_foo()`` to initialize it. You do not link your .exe with foo.lib, as that would cause Windows to require the DLL to be present. @@ -245,7 +249,7 @@ Embedding the Python interpreter in a Windows app can be summarized as follows: ... Py_Initialize(); // Initialize Python. initmyAppc(); // Initialize (import) the helper class. - PyRun_SimpleString("import myApp") ; // Import the shadow class. + PyRun_SimpleString("import myApp"); // Import the shadow class. 5. There are two problems with Python's C API which will become apparent if you use a compiler other than MSVC, the compiler used to build pythonNN.dll. diff --git a/Doc/glossary.rst b/Doc/glossary.rst index 1100b1f..b48eb1e 100644 --- a/Doc/glossary.rst +++ b/Doc/glossary.rst @@ -34,14 +34,14 @@ Glossary subclasses, which are classes that don't inherit from a class but are still recognized by :func:`isinstance` and :func:`issubclass`; see the :mod:`abc` module documentation. Python comes with many built-in ABCs for - data structures (in the :mod:`collections` module), numbers (in the + data structures (in the :mod:`collections.abc` module), numbers (in the :mod:`numbers` module), streams (in the :mod:`io` module), import finders and loaders (in the :mod:`importlib.abc` module). You can create your own ABCs with the :mod:`abc` module. argument A value passed to a :term:`function` (or :term:`method`) when calling the - function. There are two types of arguments: + function. There are two kinds of argument: * :dfn:`keyword argument`: an argument preceded by an identifier (e.g. ``name=``) in a function call or passed as a value in a dictionary @@ -78,6 +78,21 @@ Glossary Benevolent Dictator For Life, a.k.a. `Guido van Rossum <http://www.python.org/~guido/>`_, Python's creator. + binary file + A :term:`file object` able to read and write + :term:`bytes-like objects <bytes-like object>`. + + .. seealso:: + A :term:`text file` reads and writes :class:`str` objects. + + bytes-like object + An object that supports the :ref:`bufferobjects`, like :class:`bytes`, + :class:`bytearray` or :class:`memoryview`. Bytes-like objects can + be used for various operations that expect binary data, such as + compression, saving to a binary file or sending over a socket. + Some operations need the binary data to be mutable, in which case + not all bytes-like objects can apply. + bytecode Python source code is compiled into bytecode, the internal representation of a Python program in the CPython interpreter. The bytecode is also @@ -217,19 +232,20 @@ Glossary etc.). File objects are also called :dfn:`file-like objects` or :dfn:`streams`. - There are actually three categories of file objects: raw binary files, - buffered binary files and text files. Their interfaces are defined in the - :mod:`io` module. The canonical way to create a file object is by using - the :func:`open` function. + There are actually three categories of file objects: raw + :term:`binary files <binary file>`, buffered + :term:`binary files <binary file>` and :term:`text files <text file>`. + Their interfaces are defined in the :mod:`io` module. The canonical + way to create a file object is by using the :func:`open` function. file-like object A synonym for :term:`file object`. finder An object that tries to find the :term:`loader` for a module. It must - implement a method named :meth:`find_module`. See :pep:`302` for - details and :class:`importlib.abc.Finder` for an - :term:`abstract base class`. + implement either a method named :meth:`find_loader` or a method named + :meth:`find_module`. See :pep:`302` and :pep:`420` for details and + :class:`importlib.abc.Finder` for an :term:`abstract base class`. floor division Mathematical division that rounds down to nearest integer. The floor @@ -244,6 +260,16 @@ Glossary the execution of the body. See also :term:`parameter`, :term:`method`, and the :ref:`function` section. + function annotation + An arbitrary metadata value associated with a function parameter or return + value. Its syntax is explained in section :ref:`function`. Annotations + may be accessed via the :attr:`__annotations__` special attribute of a + function object. + + Python itself does not assign any particular meaning to function + annotations. They are intended to be interpreted by third-party libraries + or tools. See :pep:`3107`, which describes some of their potential uses. + __future__ A pseudo-module which programmers can use to enable new language features which are not compatible with the current interpreter. @@ -335,6 +361,17 @@ Glossary role in places where a constant hash value is needed, for example as a key in a dictionary. + import path + A list of locations (or :term:`path entries <path entry>`) that are + searched by the :term:`path based finder` for modules to import. During + import, this list of locations usually comes from :data:`sys.path`, but + for subpackages it may also come from the parent package's ``__path__`` + attribute. + + importing + The process by which Python code in one module is made available to + Python code in another module. + importer An object that both finds and loads a module; both a :term:`finder` and :term:`loader` object. @@ -451,12 +488,17 @@ Glossary mapping A container object that supports arbitrary key lookups and implements the - methods specified in the :class:`~collections.Mapping` or - :class:`~collections.MutableMapping` + methods specified in the :class:`~collections.abc.Mapping` or + :class:`~collections.abc.MutableMapping` :ref:`abstract base classes <collections-abstract-base-classes>`. Examples include :class:`dict`, :class:`collections.defaultdict`, :class:`collections.OrderedDict` and :class:`collections.Counter`. + meta path finder + A finder returned by a search of :data:`sys.meta_path`. Meta path + finders are related to, but different from :term:`path entry finders + <path entry finder>`. + metaclass The class of a class. Class definitions create a class name, a class dictionary, and a list of base classes. The metaclass is responsible for @@ -481,6 +523,13 @@ Glossary for a member during lookup. See `The Python 2.3 Method Resolution Order <http://www.python.org/download/releases/2.3/mro/>`_. + module + An object that serves as an organizational unit of Python code. Modules + have a namespace containing arbitrary Python objects. Modules are loaded + into Python by the process of :term:`importing`. + + See also :term:`package`. + MRO See :term:`method resolution order`. @@ -506,13 +555,21 @@ Glossary dictionaries. There are the local, global and built-in namespaces as well as nested namespaces in objects (in methods). Namespaces support modularity by preventing naming conflicts. For instance, the functions - :func:`builtins.open` and :func:`os.open` are distinguished by their - namespaces. Namespaces also aid readability and maintainability by making - it clear which module implements a function. For instance, writing + :func:`builtins.open <.open>` and :func:`os.open` are distinguished by + their namespaces. Namespaces also aid readability and maintainability by + making it clear which module implements a function. For instance, writing :func:`random.seed` or :func:`itertools.islice` makes it clear that those functions are implemented by the :mod:`random` and :mod:`itertools` modules, respectively. + namespace package + A :pep:`420` :term:`package` which serves only as a container for + subpackages. Namespace packages may have no physical representation, + and specifically are not like a :term:`regular package` because they + have no ``__init__.py`` file. + + See also :term:`module`. + nested scope The ability to refer to a variable in an enclosing definition. For instance, a function defined inside another function can refer to @@ -525,18 +582,25 @@ Glossary new-style class Old name for the flavor of classes now used for all class objects. In earlier Python versions, only new-style classes could use Python's newer, - versatile features like :attr:`__slots__`, descriptors, properties, - :meth:`__getattribute__`, class methods, and static methods. + versatile features like :attr:`~object.__slots__`, descriptors, + properties, :meth:`__getattribute__`, class methods, and static methods. object Any data with state (attributes or value) and defined behavior (methods). Also the ultimate base class of any :term:`new-style class`. + package + A Python :term:`module` which can contain submodules or recursively, + subpackages. Technically, a package is a Python module with an + ``__path__`` attribute. + + See also :term:`regular package` and :term:`namespace package`. + parameter A named entity in a :term:`function` (or method) definition that specifies an :term:`argument` (or in some cases, arguments) that the - function can accept. There are five types of parameters: + function can accept. There are five kinds of parameter: * :dfn:`positional-or-keyword`: specifies an argument that can be passed either :term:`positionally <argument>` or as a :term:`keyword argument @@ -550,6 +614,8 @@ Glossary parameters. However, some built-in functions have positional-only parameters (e.g. :func:`abs`). + .. _keyword-only_parameter: + * :dfn:`keyword-only`: specifies an argument that can be supplied only by keyword. Keyword-only parameters can be defined by including a single var-positional parameter or bare ``*`` in the parameter list @@ -580,12 +646,48 @@ Glossary <faq-argument-vs-parameter>`, the :class:`inspect.Parameter` class, the :ref:`function` section, and :pep:`362`. + path entry + A single location on the :term:`import path` which the :term:`path + based finder` consults to find modules for importing. + + path entry finder + A :term:`finder` returned by a callable on :data:`sys.path_hooks` + (i.e. a :term:`path entry hook`) which knows how to locate modules given + a :term:`path entry`. + + path entry hook + A callable on the :data:`sys.path_hook` list which returns a :term:`path + entry finder` if it knows how to find modules on a specific :term:`path + entry`. + + path based finder + One of the default :term:`meta path finders <meta path finder>` which + searches an :term:`import path` for modules. + + portion + A set of files in a single directory (possibly stored in a zip file) + that contribute to a namespace package, as defined in :pep:`420`. + positional argument See :term:`argument`. + provisional package + A provisional package is one which has been deliberately excluded from + the standard library's backwards compatibility guarantees. While major + changes to such packages are not expected, as long as they are marked + provisional, backwards incompatible changes (up to and including removal + of the package) may occur if deemed necessary by core developers. Such + changes will not be made gratuitously -- they will occur only if serious + flaws are uncovered that were missed prior to the inclusion of the + package. + + This process allows the standard library to continue to evolve over + time, without locking in problematic design errors for extended periods + of time. See :pep:`411` for more details. + Python 3000 - Nickname for the Python 3.x release line (coined long ago when the release - of version 3 was something in the distant future.) This is also + Nickname for the Python 3.x release line (coined long ago when the + release of version 3 was something in the distant future.) This is also abbreviated "Py3k". Pythonic @@ -604,6 +706,32 @@ Glossary for piece in food: print(piece) + qualified name + A dotted name showing the "path" from a module's global scope to a + class, function or method defined in that module, as defined in + :pep:`3155`. For top-level functions and classes, the qualified name + is the same as the object's name:: + + >>> class C: + ... class D: + ... def meth(self): + ... pass + ... + >>> C.__qualname__ + 'C' + >>> C.D.__qualname__ + 'C.D' + >>> C.D.meth.__qualname__ + 'C.D.meth' + + When used to refer to modules, the *fully qualified name* means the + entire dotted path to the module, including any parent packages, + e.g. ``email.mime.text``:: + + >>> import email.mime.text + >>> email.mime.text.__name__ + 'email.mime.text' + reference count The number of references to an object. When the reference count of an object drops to zero, it is deallocated. Reference counting is @@ -612,6 +740,12 @@ Glossary :func:`~sys.getrefcount` function that programmers can call to return the reference count for a particular object. + regular package + A traditional :term:`package`, such as a directory containing an + ``__init__.py`` file. + + See also :term:`namespace package`. + __slots__ A declaration inside a class that saves memory by pre-declaring space for instance attributes and eliminating instance dictionaries. Though @@ -629,6 +763,14 @@ Glossary mapping rather than a sequence because the lookups use arbitrary :term:`immutable` keys rather than integers. + The :class:`collections.abc.Sequence` abstract base class + defines a much richer interface that goes beyond just + :meth:`__getitem__` and :meth:`__len__`, adding :meth:`count`, + :meth:`index`, :meth:`__contains__`, and + :meth:`__reversed__`. Types that implement this expanded + interface can be registered explicitly using + :func:`~abc.register`. + slice An object usually containing a portion of a :term:`sequence`. A slice is created using the subscript notation, ``[]`` with colons between numbers @@ -643,9 +785,25 @@ Glossary statement A statement is part of a suite (a "block" of code). A statement is either - an :term:`expression` or a one of several constructs with a keyword, such + an :term:`expression` or one of several constructs with a keyword, such as :keyword:`if`, :keyword:`while` or :keyword:`for`. + struct sequence + A tuple with named elements. Struct sequences expose an interface similar + to :term:`named tuple` in that elements can either be accessed either by + index or as an attribute. However, they do not have any of the named tuple + methods like :meth:`~collections.somenamedtuple._make` or + :meth:`~collections.somenamedtuple._asdict`. Examples of struct sequences + include :data:`sys.float_info` and the return value of :func:`os.stat`. + + text file + A :term:`file object` able to read and write :class:`str` objects. + Often, a text file actually accesses a byte-oriented datastream + and handles the text encoding automatically. + + .. seealso:: + A :term:`binary file` reads and write :class:`bytes` objects. + triple-quoted string A string which is bound by three instances of either a quotation mark (") or an apostrophe ('). While they don't provide any functionality @@ -658,7 +816,8 @@ Glossary type The type of a Python object determines what kind of object it is; every object has a type. An object's type is accessible as its - :attr:`__class__` attribute or can be retrieved with ``type(obj)``. + :attr:`~instance.__class__` attribute or can be retrieved with + ``type(obj)``. universal newlines A manner of interpreting text streams in which all of the following are diff --git a/Doc/howto/advocacy.rst b/Doc/howto/advocacy.rst deleted file mode 100644 index 2969d26..0000000 --- a/Doc/howto/advocacy.rst +++ /dev/null @@ -1,355 +0,0 @@ -************************* - Python Advocacy HOWTO -************************* - -:Author: A.M. Kuchling -:Release: 0.03 - - -.. topic:: Abstract - - It's usually difficult to get your management to accept open source software, - and Python is no exception to this rule. This document discusses reasons to use - Python, strategies for winning acceptance, facts and arguments you can use, and - cases where you *shouldn't* try to use Python. - - -Reasons to Use Python -===================== - -There are several reasons to incorporate a scripting language into your -development process, and this section will discuss them, and why Python has some -properties that make it a particularly good choice. - - -Programmability ---------------- - -Programs are often organized in a modular fashion. Lower-level operations are -grouped together, and called by higher-level functions, which may in turn be -used as basic operations by still further upper levels. - -For example, the lowest level might define a very low-level set of functions for -accessing a hash table. The next level might use hash tables to store the -headers of a mail message, mapping a header name like ``Date`` to a value such -as ``Tue, 13 May 1997 20:00:54 -0400``. A yet higher level may operate on -message objects, without knowing or caring that message headers are stored in a -hash table, and so forth. - -Often, the lowest levels do very simple things; they implement a data structure -such as a binary tree or hash table, or they perform some simple computation, -such as converting a date string to a number. The higher levels then contain -logic connecting these primitive operations. Using the approach, the primitives -can be seen as basic building blocks which are then glued together to produce -the complete product. - -Why is this design approach relevant to Python? Because Python is well suited -to functioning as such a glue language. A common approach is to write a Python -module that implements the lower level operations; for the sake of speed, the -implementation might be in C, Java, or even Fortran. Once the primitives are -available to Python programs, the logic underlying higher level operations is -written in the form of Python code. The high-level logic is then more -understandable, and easier to modify. - -John Ousterhout wrote a paper that explains this idea at greater length, -entitled "Scripting: Higher Level Programming for the 21st Century". I -recommend that you read this paper; see the references for the URL. Ousterhout -is the inventor of the Tcl language, and therefore argues that Tcl should be -used for this purpose; he only briefly refers to other languages such as Python, -Perl, and Lisp/Scheme, but in reality, Ousterhout's argument applies to -scripting languages in general, since you could equally write extensions for any -of the languages mentioned above. - - -Prototyping ------------ - -In *The Mythical Man-Month*, Fredrick Brooks suggests the following rule when -planning software projects: "Plan to throw one away; you will anyway." Brooks -is saying that the first attempt at a software design often turns out to be -wrong; unless the problem is very simple or you're an extremely good designer, -you'll find that new requirements and features become apparent once development -has actually started. If these new requirements can't be cleanly incorporated -into the program's structure, you're presented with two unpleasant choices: -hammer the new features into the program somehow, or scrap everything and write -a new version of the program, taking the new features into account from the -beginning. - -Python provides you with a good environment for quickly developing an initial -prototype. That lets you get the overall program structure and logic right, and -you can fine-tune small details in the fast development cycle that Python -provides. Once you're satisfied with the GUI interface or program output, you -can translate the Python code into C++, Fortran, Java, or some other compiled -language. - -Prototyping means you have to be careful not to use too many Python features -that are hard to implement in your other language. Using ``eval()``, or regular -expressions, or the :mod:`pickle` module, means that you're going to need C or -Java libraries for formula evaluation, regular expressions, and serialization, -for example. But it's not hard to avoid such tricky code, and in the end the -translation usually isn't very difficult. The resulting code can be rapidly -debugged, because any serious logical errors will have been removed from the -prototype, leaving only more minor slip-ups in the translation to track down. - -This strategy builds on the earlier discussion of programmability. Using Python -as glue to connect lower-level components has obvious relevance for constructing -prototype systems. In this way Python can help you with development, even if -end users never come in contact with Python code at all. If the performance of -the Python version is adequate and corporate politics allow it, you may not need -to do a translation into C or Java, but it can still be faster to develop a -prototype and then translate it, instead of attempting to produce the final -version immediately. - -One example of this development strategy is Microsoft Merchant Server. Version -1.0 was written in pure Python, by a company that subsequently was purchased by -Microsoft. Version 2.0 began to translate the code into C++, shipping with some -C++code and some Python code. Version 3.0 didn't contain any Python at all; all -the code had been translated into C++. Even though the product doesn't contain -a Python interpreter, the Python language has still served a useful purpose by -speeding up development. - -This is a very common use for Python. Past conference papers have also -described this approach for developing high-level numerical algorithms; see -David M. Beazley and Peter S. Lomdahl's paper "Feeding a Large-scale Physics -Application to Python" in the references for a good example. If an algorithm's -basic operations are things like "Take the inverse of this 4000x4000 matrix", -and are implemented in some lower-level language, then Python has almost no -additional performance cost; the extra time required for Python to evaluate an -expression like ``m.invert()`` is dwarfed by the cost of the actual computation. -It's particularly good for applications where seemingly endless tweaking is -required to get things right. GUI interfaces and Web sites are prime examples. - -The Python code is also shorter and faster to write (once you're familiar with -Python), so it's easier to throw it away if you decide your approach was wrong; -if you'd spent two weeks working on it instead of just two hours, you might -waste time trying to patch up what you've got out of a natural reluctance to -admit that those two weeks were wasted. Truthfully, those two weeks haven't -been wasted, since you've learnt something about the problem and the technology -you're using to solve it, but it's human nature to view this as a failure of -some sort. - - -Simplicity and Ease of Understanding ------------------------------------- - -Python is definitely *not* a toy language that's only usable for small tasks. -The language features are general and powerful enough to enable it to be used -for many different purposes. It's useful at the small end, for 10- or 20-line -scripts, but it also scales up to larger systems that contain thousands of lines -of code. - -However, this expressiveness doesn't come at the cost of an obscure or tricky -syntax. While Python has some dark corners that can lead to obscure code, there -are relatively few such corners, and proper design can isolate their use to only -a few classes or modules. It's certainly possible to write confusing code by -using too many features with too little concern for clarity, but most Python -code can look a lot like a slightly-formalized version of human-understandable -pseudocode. - -In *The New Hacker's Dictionary*, Eric S. Raymond gives the following definition -for "compact": - -.. epigraph:: - - Compact *adj.* Of a design, describes the valuable property that it can all be - apprehended at once in one's head. This generally means the thing created from - the design can be used with greater facility and fewer errors than an equivalent - tool that is not compact. Compactness does not imply triviality or lack of - power; for example, C is compact and FORTRAN is not, but C is more powerful than - FORTRAN. Designs become non-compact through accreting features and cruft that - don't merge cleanly into the overall design scheme (thus, some fans of Classic C - maintain that ANSI C is no longer compact). - - (From http://www.catb.org/~esr/jargon/html/C/compact.html) - -In this sense of the word, Python is quite compact, because the language has -just a few ideas, which are used in lots of places. Take namespaces, for -example. Import a module with ``import math``, and you create a new namespace -called ``math``. Classes are also namespaces that share many of the properties -of modules, and have a few of their own; for example, you can create instances -of a class. Instances? They're yet another namespace. Namespaces are currently -implemented as Python dictionaries, so they have the same methods as the -standard dictionary data type: .keys() returns all the keys, and so forth. - -This simplicity arises from Python's development history. The language syntax -derives from different sources; ABC, a relatively obscure teaching language, is -one primary influence, and Modula-3 is another. (For more information about ABC -and Modula-3, consult their respective Web sites at http://www.cwi.nl/~steven/abc/ -and http://www.m3.org.) Other features have come from C, Icon, -Algol-68, and even Perl. Python hasn't really innovated very much, but instead -has tried to keep the language small and easy to learn, building on ideas that -have been tried in other languages and found useful. - -Simplicity is a virtue that should not be underestimated. It lets you learn the -language more quickly, and then rapidly write code -- code that often works the -first time you run it. - - -Java Integration ----------------- - -If you're working with Java, Jython (http://www.jython.org/) is definitely worth -your attention. Jython is a re-implementation of Python in Java that compiles -Python code into Java bytecodes. The resulting environment has very tight, -almost seamless, integration with Java. It's trivial to access Java classes -from Python, and you can write Python classes that subclass Java classes. -Jython can be used for prototyping Java applications in much the same way -CPython is used, and it can also be used for test suites for Java code, or -embedded in a Java application to add scripting capabilities. - - -Arguments and Rebuttals -======================= - -Let's say that you've decided upon Python as the best choice for your -application. How can you convince your management, or your fellow developers, -to use Python? This section lists some common arguments against using Python, -and provides some possible rebuttals. - -**Python is freely available software that doesn't cost anything. How good can -it be?** - -Very good, indeed. These days Linux and Apache, two other pieces of open source -software, are becoming more respected as alternatives to commercial software, -but Python hasn't had all the publicity. - -Python has been around for several years, with many users and developers. -Accordingly, the interpreter has been used by many people, and has gotten most -of the bugs shaken out of it. While bugs are still discovered at intervals, -they're usually either quite obscure (they'd have to be, for no one to have run -into them before) or they involve interfaces to external libraries. The -internals of the language itself are quite stable. - -Having the source code should be viewed as making the software available for -peer review; people can examine the code, suggest (and implement) improvements, -and track down bugs. To find out more about the idea of open source code, along -with arguments and case studies supporting it, go to http://www.opensource.org. - -**Who's going to support it?** - -Python has a sizable community of developers, and the number is still growing. -The Internet community surrounding the language is an active one, and is worth -being considered another one of Python's advantages. Most questions posted to -the comp.lang.python newsgroup are quickly answered by someone. - -Should you need to dig into the source code, you'll find it's clear and -well-organized, so it's not very difficult to write extensions and track down -bugs yourself. If you'd prefer to pay for support, there are companies and -individuals who offer commercial support for Python. - -**Who uses Python for serious work?** - -Lots of people; one interesting thing about Python is the surprising diversity -of applications that it's been used for. People are using Python to: - -* Run Web sites - -* Write GUI interfaces - -* Control number-crunching code on supercomputers - -* Make a commercial application scriptable by embedding the Python interpreter - inside it - -* Process large XML data sets - -* Build test suites for C or Java code - -Whatever your application domain is, there's probably someone who's used Python -for something similar. Yet, despite being useable for such high-end -applications, Python's still simple enough to use for little jobs. - -See http://wiki.python.org/moin/OrganizationsUsingPython for a list of some of -the organizations that use Python. - -**What are the restrictions on Python's use?** - -They're practically nonexistent. Consult :ref:`history-and-license` for the full -language, but it boils down to three conditions: - -* You have to leave the copyright notice on the software; if you don't include - the source code in a product, you have to put the copyright notice in the - supporting documentation. - -* Don't claim that the institutions that have developed Python endorse your - product in any way. - -* If something goes wrong, you can't sue for damages. Practically all software - licenses contain this condition. - -Notice that you don't have to provide source code for anything that contains -Python or is built with it. Also, the Python interpreter and accompanying -documentation can be modified and redistributed in any way you like, and you -don't have to pay anyone any licensing fees at all. - -**Why should we use an obscure language like Python instead of well-known -language X?** - -I hope this HOWTO, and the documents listed in the final section, will help -convince you that Python isn't obscure, and has a healthily growing user base. -One word of advice: always present Python's positive advantages, instead of -concentrating on language X's failings. People want to know why a solution is -good, rather than why all the other solutions are bad. So instead of attacking -a competing solution on various grounds, simply show how Python's virtues can -help. - - -Useful Resources -================ - -http://www.pythonology.com/success - The Python Success Stories are a collection of stories from successful users of - Python, with the emphasis on business and corporate users. - -.. http://www.fsbassociates.com/books/pythonchpt1.htm - The first chapter of \emph{Internet Programming with Python} also - examines some of the reasons for using Python. The book is well worth - buying, but the publishers have made the first chapter available on - the Web. - -http://www.tcl.tk/doc/scripting.html - John Ousterhout's white paper on scripting is a good argument for the utility of - scripting languages, though naturally enough, he emphasizes Tcl, the language he - developed. Most of the arguments would apply to any scripting language. - -http://www.python.org/workshops/1997-10/proceedings/beazley.html - The authors, David M. Beazley and Peter S. Lomdahl, describe their use of - Python at Los Alamos National Laboratory. It's another good example of how - Python can help get real work done. This quotation from the paper has been - echoed by many people: - - .. epigraph:: - - Originally developed as a large monolithic application for massively parallel - processing systems, we have used Python to transform our application into a - flexible, highly modular, and extremely powerful system for performing - simulation, data analysis, and visualization. In addition, we describe how - Python has solved a number of important problems related to the development, - debugging, deployment, and maintenance of scientific software. - -http://pythonjournal.cognizor.com/pyj1/Everitt-Feit_interview98-V1.html - This interview with Andy Feit, discussing Infoseek's use of Python, can be used - to show that choosing Python didn't introduce any difficulties into a company's - development process, and provided some substantial benefits. - -.. http://www.python.org/psa/Commercial.html - Robin Friedrich wrote this document on how to support Python's use in - commercial projects. - -http://www.python.org/workshops/1997-10/proceedings/stein.ps - For the 6th Python conference, Greg Stein presented a paper that traced Python's - adoption and usage at a startup called eShop, and later at Microsoft. - -http://www.opensource.org - Management may be doubtful of the reliability and usefulness of software that - wasn't written commercially. This site presents arguments that show how open - source software can have considerable advantages over closed-source software. - -http://www.faqs.org/docs/Linux-mini/Advocacy.html - The Linux Advocacy mini-HOWTO was the inspiration for this document, and is also - well worth reading for general suggestions on winning acceptance for a new - technology, such as Linux or Python. In general, you won't make much progress - by simply attacking existing systems and complaining about their inadequacies; - this often ends up looking like unfocused whining. It's much better to point - out some of the many areas where Python is an improvement over other systems. - diff --git a/Doc/howto/argparse.rst b/Doc/howto/argparse.rst index a134036..510d1d4 100644 --- a/Doc/howto/argparse.rst +++ b/Doc/howto/argparse.rst @@ -11,7 +11,7 @@ recommended command-line parsing module in the Python standard library. .. note:: - There's two other modules that fulfill the same task, namely + There are two other modules that fulfill the same task, namely :mod:`getopt` (an equivalent for :c:func:`getopt` from the C language) and the deprecated :mod:`optparse`. Note also that :mod:`argparse` is based on :mod:`optparse`, @@ -63,7 +63,7 @@ A few concepts we can learn from the four commands: * That's a snippet of the help text. It's very useful in that you can come across a program you have never used before, and can figure out - how it works simply by reading it's help text. + how it works simply by reading its help text. The basics @@ -468,7 +468,7 @@ verbosity argument (check the output of ``python --help``):: print(answer) We have introduced another action, "count", -to count the number of occurences of a specific optional arguments: +to count the number of occurrences of a specific optional arguments: .. code-block:: sh @@ -668,8 +668,8 @@ Conflicting options So far, we have been working with two methods of an :class:`argparse.ArgumentParser` instance. Let's introduce a third one, :meth:`add_mutually_exclusive_group`. It allows for us to specify options that -conflict with each other. Let's also change the rest of the program make the -new functionality makes more sense: +conflict with each other. Let's also change the rest of the program so that +the new functionality makes more sense: we'll introduce the ``--quiet`` option, which will be the opposite of the ``--verbose`` one:: diff --git a/Doc/howto/cporting.rst b/Doc/howto/cporting.rst index 9d8a1b0..1ad77d6 100644 --- a/Doc/howto/cporting.rst +++ b/Doc/howto/cporting.rst @@ -100,25 +100,6 @@ corresponds to Python 2's :func:`long` type--the :func:`int` type used in Python 2 was removed. In the C-API, ``PyInt_*`` functions are replaced by their ``PyLong_*`` equivalents. -The best course of action here is using the ``PyInt_*`` functions aliased to -``PyLong_*`` found in :file:`intobject.h`. The abstract ``PyNumber_*`` APIs -can also be used in some cases. :: - - #include "Python.h" - #include "intobject.h" - - static PyObject * - add_ints(PyObject *self, PyObject *args) { - int one, two; - PyObject *result; - - if (!PyArg_ParseTuple(args, "ii:add_ints", &one, &two)) - return NULL; - - return PyInt_FromLong(one + two); - } - - Module initialization and state =============================== diff --git a/Doc/howto/curses.rst b/Doc/howto/curses.rst index 1b14ceb..ea62b1c 100644 --- a/Doc/howto/curses.rst +++ b/Doc/howto/curses.rst @@ -5,39 +5,43 @@ ********************************** :Author: A.M. Kuchling, Eric S. Raymond -:Release: 2.03 +:Release: 2.04 .. topic:: Abstract - This document describes how to write text-mode programs with Python 2.x, using - the :mod:`curses` extension module to control the display. + This document describes how to use the :mod:`curses` extension + module to control text-mode displays. What is curses? =============== The curses library supplies a terminal-independent screen-painting and -keyboard-handling facility for text-based terminals; such terminals include -VT100s, the Linux console, and the simulated terminal provided by X11 programs -such as xterm and rxvt. Display terminals support various control codes to -perform common operations such as moving the cursor, scrolling the screen, and -erasing areas. Different terminals use widely differing codes, and often have -their own minor quirks. - -In a world of X displays, one might ask "why bother"? It's true that -character-cell display terminals are an obsolete technology, but there are -niches in which being able to do fancy things with them are still valuable. One -is on small-footprint or embedded Unixes that don't carry an X server. Another -is for tools like OS installers and kernel configurators that may have to run -before X is available. - -The curses library hides all the details of different terminals, and provides -the programmer with an abstraction of a display, containing multiple -non-overlapping windows. The contents of a window can be changed in various -ways-- adding text, erasing it, changing its appearance--and the curses library -will automagically figure out what control codes need to be sent to the terminal -to produce the right output. +keyboard-handling facility for text-based terminals; such terminals +include VT100s, the Linux console, and the simulated terminal provided +by various programs. Display terminals support various control codes +to perform common operations such as moving the cursor, scrolling the +screen, and erasing areas. Different terminals use widely differing +codes, and often have their own minor quirks. + +In a world of graphical displays, one might ask "why bother"? It's +true that character-cell display terminals are an obsolete technology, +but there are niches in which being able to do fancy things with them +are still valuable. One niche is on small-footprint or embedded +Unixes that don't run an X server. Another is tools such as OS +installers and kernel configurators that may have to run before any +graphical support is available. + +The curses library provides fairly basic functionality, providing the +programmer with an abstraction of a display containing multiple +non-overlapping windows of text. The contents of a window can be +changed in various ways---adding text, erasing it, changing its +appearance---and the curses library will figure out what control codes +need to be sent to the terminal to produce the right output. curses +doesn't provide many user-interface concepts such as buttons, checkboxes, +or dialogs; if you need such features, consider a user interface library such as +`Urwid <https://pypi.python.org/pypi/urwid/>`_. The curses library was originally written for BSD Unix; the later System V versions of Unix from AT&T added many enhancements and new functions. BSD curses @@ -49,10 +53,13 @@ code, all the functions described here will probably be available. The older versions of curses carried by some proprietary Unixes may not support everything, though. -No one has made a Windows port of the curses module. On a Windows platform, try -the Console module written by Fredrik Lundh. The Console module provides -cursor-addressable text output, plus full support for mouse and keyboard input, -and is available from http://effbot.org/zone/console-index.htm. +The Windows version of Python doesn't include the :mod:`curses` +module. A ported version called `UniCurses +<https://pypi.python.org/pypi/UniCurses>`_ is available. You could +also try `the Console module <http://effbot.org/zone/console-index.htm>`_ +written by Fredrik Lundh, which doesn't +use the same API as curses but provides cursor-addressable text output +and full support for mouse and keyboard input. The Python curses module @@ -61,11 +68,12 @@ The Python curses module Thy Python module is a fairly simple wrapper over the C functions provided by curses; if you're already familiar with curses programming in C, it's really easy to transfer that knowledge to Python. The biggest difference is that the -Python interface makes things simpler, by merging different C functions such as -:func:`addstr`, :func:`mvaddstr`, :func:`mvwaddstr`, into a single -:meth:`addstr` method. You'll see this covered in more detail later. +Python interface makes things simpler by merging different C functions such as +:c:func:`addstr`, :c:func:`mvaddstr`, and :c:func:`mvwaddstr` into a single +:meth:`~curses.window.addstr` method. You'll see this covered in more +detail later. -This HOWTO is simply an introduction to writing text-mode programs with curses +This HOWTO is an introduction to writing text-mode programs with curses and Python. It doesn't attempt to be a complete guide to the curses API; for that, see the Python library guide's section on ncurses, and the C manual pages for ncurses. It will, however, give you the basic ideas. @@ -74,25 +82,27 @@ for ncurses. It will, however, give you the basic ideas. Starting and ending a curses application ======================================== -Before doing anything, curses must be initialized. This is done by calling the -:func:`initscr` function, which will determine the terminal type, send any -required setup codes to the terminal, and create various internal data -structures. If successful, :func:`initscr` returns a window object representing -the entire screen; this is usually called ``stdscr``, after the name of the +Before doing anything, curses must be initialized. This is done by +calling the :func:`~curses.initscr` function, which will determine the +terminal type, send any required setup codes to the terminal, and +create various internal data structures. If successful, +:func:`initscr` returns a window object representing the entire +screen; this is usually called ``stdscr`` after the name of the corresponding C variable. :: import curses stdscr = curses.initscr() -Usually curses applications turn off automatic echoing of keys to the screen, in -order to be able to read keys and only display them under certain circumstances. -This requires calling the :func:`noecho` function. :: +Usually curses applications turn off automatic echoing of keys to the +screen, in order to be able to read keys and only display them under +certain circumstances. This requires calling the +:func:`~curses.noecho` function. :: curses.noecho() -Applications will also commonly need to react to keys instantly, without -requiring the Enter key to be pressed; this is called cbreak mode, as opposed to -the usual buffered input mode. :: +Applications will also commonly need to react to keys instantly, +without requiring the Enter key to be pressed; this is called cbreak +mode, as opposed to the usual buffered input mode. :: curses.cbreak() @@ -103,15 +113,18 @@ curses can do it for you, returning a special value such as :const:`curses.KEY_LEFT`. To get curses to do the job, you'll have to enable keypad mode. :: - stdscr.keypad(1) + stdscr.keypad(True) Terminating a curses application is much easier than starting one. You'll need -to call :: +to call:: - curses.nocbreak(); stdscr.keypad(0); curses.echo() + curses.nocbreak() + stdscr.keypad(False) + curses.echo() -to reverse the curses-friendly terminal settings. Then call the :func:`endwin` -function to restore the terminal to its original operating mode. :: +to reverse the curses-friendly terminal settings. Then call the +:func:`~curses.endwin` function to restore the terminal to its original +operating mode. :: curses.endwin() @@ -122,102 +135,147 @@ raises an uncaught exception. Keys are no longer echoed to the screen when you type them, for example, which makes using the shell difficult. In Python you can avoid these complications and make debugging much easier by -importing the module :mod:`curses.wrapper`. It supplies a :func:`wrapper` -function that takes a callable. It does the initializations described above, -and also initializes colors if color support is present. It then runs your -provided callable and finally deinitializes appropriately. The callable is -called inside a try-catch clause which catches exceptions, performs curses -deinitialization, and then passes the exception upwards. Thus, your terminal -won't be left in a funny state on exception. +importing the :func:`curses.wrapper` function and using it like this:: + + from curses import wrapper + + def main(stdscr): + # Clear screen + stdscr.clear() + + # This raises ZeroDivisionError when i == 10. + for i in range(0, 11): + v = i-10 + stdscr.addstr(i, 0, '10 divided by {} is {}'.format(v, 10/v)) + + stdscr.refresh() + stdscr.getkey() + + wrapper(main) + +The :func:`~curses.wrapper` function takes a callable object and does the +initializations described above, also initializing colors if color +support is present. :func:`wrapper` then runs your provided callable. +Once the callable returns, :func:`wrapper` will restore the original +state of the terminal. The callable is called inside a +:keyword:`try`...\ :keyword:`except` that catches exceptions, restores +the state of the terminal, and then re-raises the exception. Therefore +your terminal won't be left in a funny state on exception and you'll be +able to read the exception's message and traceback. Windows and Pads ================ Windows are the basic abstraction in curses. A window object represents a -rectangular area of the screen, and supports various methods to display text, +rectangular area of the screen, and supports methods to display text, erase it, allow the user to input strings, and so forth. -The ``stdscr`` object returned by the :func:`initscr` function is a window -object that covers the entire screen. Many programs may need only this single -window, but you might wish to divide the screen into smaller windows, in order -to redraw or clear them separately. The :func:`newwin` function creates a new -window of a given size, returning the new window object. :: +The ``stdscr`` object returned by the :func:`~curses.initscr` function is a +window object that covers the entire screen. Many programs may need +only this single window, but you might wish to divide the screen into +smaller windows, in order to redraw or clear them separately. The +:func:`~curses.newwin` function creates a new window of a given size, +returning the new window object. :: - begin_x = 20 ; begin_y = 7 - height = 5 ; width = 40 + begin_x = 20; begin_y = 7 + height = 5; width = 40 win = curses.newwin(height, width, begin_y, begin_x) -A word about the coordinate system used in curses: coordinates are always passed -in the order *y,x*, and the top-left corner of a window is coordinate (0,0). -This breaks a common convention for handling coordinates, where the *x* -coordinate usually comes first. This is an unfortunate difference from most -other computer applications, but it's been part of curses since it was first -written, and it's too late to change things now. - -When you call a method to display or erase text, the effect doesn't immediately -show up on the display. This is because curses was originally written with slow -300-baud terminal connections in mind; with these terminals, minimizing the time -required to redraw the screen is very important. This lets curses accumulate -changes to the screen, and display them in the most efficient manner. For -example, if your program displays some characters in a window, and then clears -the window, there's no need to send the original characters because they'd never -be visible. - -Accordingly, curses requires that you explicitly tell it to redraw windows, -using the :func:`refresh` method of window objects. In practice, this doesn't +Note that the coordinate system used in curses is unusual. +Coordinates are always passed in the order *y,x*, and the top-left +corner of a window is coordinate (0,0). This breaks the normal +convention for handling coordinates where the *x* coordinate comes +first. This is an unfortunate difference from most other computer +applications, but it's been part of curses since it was first written, +and it's too late to change things now. + +Your application can determine the size of the screen by using the +:data:`curses.LINES` and :data:`curses.COLS` variables to obtain the *y* and +*x* sizes. Legal coordinates will then extend from ``(0,0)`` to +``(curses.LINES - 1, curses.COLS - 1)``. + +When you call a method to display or erase text, the effect doesn't +immediately show up on the display. Instead you must call the +:meth:`~curses.window.refresh` method of window objects to update the +screen. + +This is because curses was originally written with slow 300-baud +terminal connections in mind; with these terminals, minimizing the +time required to redraw the screen was very important. Instead curses +accumulates changes to the screen and displays them in the most +efficient manner when you call :meth:`refresh`. For example, if your +program displays some text in a window and then clears the window, +there's no need to send the original text because they're never +visible. + +In practice, explicitly telling curses to redraw a window doesn't really complicate programming with curses much. Most programs go into a flurry of activity, and then pause waiting for a keypress or some other action on the part of the user. All you have to do is to be sure that the screen has been -redrawn before pausing to wait for user input, by simply calling -``stdscr.refresh()`` or the :func:`refresh` method of some other relevant +redrawn before pausing to wait for user input, by first calling +``stdscr.refresh()`` or the :meth:`refresh` method of some other relevant window. A pad is a special case of a window; it can be larger than the actual display -screen, and only a portion of it displayed at a time. Creating a pad simply +screen, and only a portion of the pad displayed at a time. Creating a pad requires the pad's height and width, while refreshing a pad requires giving the coordinates of the on-screen area where a subsection of the pad will be -displayed. :: +displayed. :: pad = curses.newpad(100, 100) - # These loops fill the pad with letters; this is + # These loops fill the pad with letters; addch() is # explained in the next section - for y in range(0, 100): - for x in range(0, 100): - try: pad.addch(y,x, ord('a') + (x*x+y*y) % 26 ) - except curses.error: pass - - # Displays a section of the pad in the middle of the screen + for y in range(0, 99): + for x in range(0, 99): + pad.addch(y,x, ord('a') + (x*x+y*y) % 26) + + # Displays a section of the pad in the middle of the screen. + # (0,0) : coordinate of upper-left corner of pad area to display. + # (5,5) : coordinate of upper-left corner of window area to be filled + # with pad content. + # (20, 75) : coordinate of lower-right corner of window area to be + # : filled with pad content. pad.refresh( 0,0, 5,5, 20,75) -The :func:`refresh` call displays a section of the pad in the rectangle +The :meth:`refresh` call displays a section of the pad in the rectangle extending from coordinate (5,5) to coordinate (20,75) on the screen; the upper left corner of the displayed section is coordinate (0,0) on the pad. Beyond that difference, pads are exactly like ordinary windows and support the same methods. -If you have multiple windows and pads on screen there is a more efficient way to -go, which will prevent annoying screen flicker at refresh time. Use the -:meth:`noutrefresh` method of each window to update the data structure -representing the desired state of the screen; then change the physical screen to -match the desired state in one go with the function :func:`doupdate`. The -normal :meth:`refresh` method calls :func:`doupdate` as its last act. +If you have multiple windows and pads on screen there is a more +efficient way to update the screen and prevent annoying screen flicker +as each part of the screen gets updated. :meth:`refresh` actually +does two things: + +1) Calls the :meth:`~curses.window.noutrefresh` method of each window + to update an underlying data structure representing the desired + state of the screen. +2) Calls the function :func:`~curses.doupdate` function to change the + physical screen to match the desired state recorded in the data structure. + +Instead you can call :meth:`noutrefresh` on a number of windows to +update the data structure, and then call :func:`doupdate` to update +the screen. Displaying Text =============== -From a C programmer's point of view, curses may sometimes look like a twisty -maze of functions, all subtly different. For example, :func:`addstr` displays a -string at the current cursor location in the ``stdscr`` window, while -:func:`mvaddstr` moves to a given y,x coordinate first before displaying the -string. :func:`waddstr` is just like :func:`addstr`, but allows specifying a -window to use, instead of using ``stdscr`` by default. :func:`mvwaddstr` follows -similarly. +From a C programmer's point of view, curses may sometimes look like a +twisty maze of functions, all subtly different. For example, +:c:func:`addstr` displays a string at the current cursor location in +the ``stdscr`` window, while :c:func:`mvaddstr` moves to a given y,x +coordinate first before displaying the string. :c:func:`waddstr` is just +like :c:func:`addstr`, but allows specifying a window to use instead of +using ``stdscr`` by default. :c:func:`mvwaddstr` allows specifying both +a window and a coordinate. -Fortunately the Python interface hides all these details; ``stdscr`` is a window -object like any other, and methods like :func:`addstr` accept multiple argument -forms. Usually there are four different forms. +Fortunately the Python interface hides all these details. ``stdscr`` +is a window object like any other, and methods such as +:meth:`~curses.window.addstr` accept multiple argument forms. Usually there +are four different forms. +---------------------------------+-----------------------------------------------+ | Form | Description | @@ -236,17 +294,26 @@ forms. Usually there are four different forms. | | display *str* or *ch*, using attribute *attr* | +---------------------------------+-----------------------------------------------+ -Attributes allow displaying text in highlighted forms, such as in boldface, +Attributes allow displaying text in highlighted forms such as boldface, underline, reverse code, or in color. They'll be explained in more detail in the next subsection. -The :func:`addstr` function takes a Python string as the value to be displayed, -while the :func:`addch` functions take a character, which can be either a Python -string of length 1 or an integer. If it's a string, you're limited to -displaying characters between 0 and 255. SVr4 curses provides constants for -extension characters; these constants are integers greater than 255. For -example, :const:`ACS_PLMINUS` is a +/- symbol, and :const:`ACS_ULCORNER` is the -upper left corner of a box (handy for drawing borders). + +The :meth:`~curses.window.addstr` method takes a Python string or +bytestring as the value to be displayed. The contents of bytestrings +are sent to the terminal as-is. Strings are encoded to bytes using +the value of the window's :attr:`encoding` attribute; this defaults to +the default system encoding as returned by +:func:`locale.getpreferredencoding`. + +The :meth:`~curses.window.addch` methods take a character, which can be +either a string of length 1, a bytestring of length 1, or an integer. + +Constants are provided for extension characters; these constants are +integers greater than 255. For example, :const:`ACS_PLMINUS` is a +/- +symbol, and :const:`ACS_ULCORNER` is the upper left corner of a box +(handy for drawing borders). You can also use the appropriate Unicode +character. Windows remember where the cursor was left after the last operation, so if you leave out the *y,x* coordinates, the string or character will be displayed @@ -256,10 +323,11 @@ you may want to ensure that the cursor is positioned in some location where it won't be distracting; it can be confusing to have the cursor blinking at some apparently random location. -If your application doesn't need a blinking cursor at all, you can call -``curs_set(0)`` to make it invisible. Equivalently, and for compatibility with -older curses versions, there's a ``leaveok(bool)`` function. When *bool* is -true, the curses library will attempt to suppress the flashing cursor, and you +If your application doesn't need a blinking cursor at all, you can +call ``curs_set(False)`` to make it invisible. For compatibility +with older curses versions, there's a ``leaveok(bool)`` function +that's a synonym for :func:`~curses.curs_set`. When *bool* is true, the +curses library will attempt to suppress the flashing cursor, and you won't need to worry about leaving it in odd locations. @@ -267,15 +335,16 @@ Attributes and Color -------------------- Characters can be displayed in different ways. Status lines in a text-based -application are commonly shown in reverse video; a text viewer may need to +application are commonly shown in reverse video, or a text viewer may need to highlight certain words. curses supports this by allowing you to specify an attribute for each cell on the screen. -An attribute is an integer, each bit representing a different attribute. You can -try to display text with multiple attribute bits set, but curses doesn't -guarantee that all the possible combinations are available, or that they're all -visually distinct. That depends on the ability of the terminal being used, so -it's safest to stick to the most commonly available attributes, listed here. +An attribute is an integer, each bit representing a different +attribute. You can try to display text with multiple attribute bits +set, but curses doesn't guarantee that all the possible combinations +are available, or that they're all visually distinct. That depends on +the ability of the terminal being used, so it's safest to stick to the +most commonly available attributes, listed here. +----------------------+--------------------------------------+ | Attribute | Description | @@ -304,10 +373,11 @@ The curses library also supports color on those terminals that provide it. The most common such terminal is probably the Linux console, followed by color xterms. -To use color, you must call the :func:`start_color` function soon after calling -:func:`initscr`, to initialize the default color set (the -:func:`curses.wrapper.wrapper` function does this automatically). Once that's -done, the :func:`has_colors` function returns TRUE if the terminal in use can +To use color, you must call the :func:`~curses.start_color` function soon +after calling :func:`~curses.initscr`, to initialize the default color set +(the :func:`curses.wrapper` function does this automatically). Once that's +done, the :func:`~curses.has_colors` function returns TRUE if the terminal +in use can actually display color. (Note: curses uses the American spelling 'color', instead of the Canadian/British spelling 'colour'. If you're used to the British spelling, you'll have to resign yourself to misspelling it for the sake @@ -315,25 +385,27 @@ of these functions.) The curses library maintains a finite number of color pairs, containing a foreground (or text) color and a background color. You can get the attribute -value corresponding to a color pair with the :func:`color_pair` function; this -can be bitwise-OR'ed with other attributes such as :const:`A_REVERSE`, but -again, such combinations are not guaranteed to work on all terminals. +value corresponding to a color pair with the :func:`~curses.color_pair` +function; this can be bitwise-OR'ed with other attributes such as +:const:`A_REVERSE`, but again, such combinations are not guaranteed to work +on all terminals. An example, which displays a line of text using color pair 1:: - stdscr.addstr( "Pretty text", curses.color_pair(1) ) + stdscr.addstr("Pretty text", curses.color_pair(1)) stdscr.refresh() As I said before, a color pair consists of a foreground and background color. -:func:`start_color` initializes 8 basic colors when it activates color mode. -They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and -7:white. The curses module defines named constants for each of these colors: -:const:`curses.COLOR_BLACK`, :const:`curses.COLOR_RED`, and so forth. - The ``init_pair(n, f, b)`` function changes the definition of color pair *n*, to foreground color f and background color b. Color pair 0 is hard-wired to white on black, and cannot be changed. +Colors are numbered, and :func:`start_color` initializes 8 basic +colors when it activates color mode. They are: 0:black, 1:red, +2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white. The :mod:`curses` +module defines named constants for each of these colors: +:const:`curses.COLOR_BLACK`, :const:`curses.COLOR_RED`, and so forth. + Let's put all this together. To change color 1 to red text on a white background, you would call:: @@ -343,94 +415,138 @@ When you change a color pair, any text already displayed using that color pair will change to the new colors. You can also display new text in this color with:: - stdscr.addstr(0,0, "RED ALERT!", curses.color_pair(1) ) + stdscr.addstr(0,0, "RED ALERT!", curses.color_pair(1)) Very fancy terminals can change the definitions of the actual colors to a given RGB value. This lets you change color 1, which is usually red, to purple or blue or any other color you like. Unfortunately, the Linux console doesn't support this, so I'm unable to try it out, and can't provide any examples. You -can check if your terminal can do this by calling :func:`can_change_color`, -which returns TRUE if the capability is there. If you're lucky enough to have -such a talented terminal, consult your system's man pages for more information. +can check if your terminal can do this by calling +:func:`~curses.can_change_color`, which returns ``True`` if the capability is +there. If you're lucky enough to have such a talented terminal, consult your +system's man pages for more information. User Input ========== -The curses library itself offers only very simple input mechanisms. Python's -support adds a text-input widget that makes up some of the lack. - -The most common way to get input to a window is to use its :meth:`getch` method. -:meth:`getch` pauses and waits for the user to hit a key, displaying it if -:func:`echo` has been called earlier. You can optionally specify a coordinate -to which the cursor should be moved before pausing. - -It's possible to change this behavior with the method :meth:`nodelay`. After -``nodelay(1)``, :meth:`getch` for the window becomes non-blocking and returns -``curses.ERR`` (a value of -1) when no input is ready. There's also a -:func:`halfdelay` function, which can be used to (in effect) set a timer on each -:meth:`getch`; if no input becomes available within a specified -delay (measured in tenths of a second), curses raises an exception. +The C curses library offers only very simple input mechanisms. Python's +:mod:`curses` module adds a basic text-input widget. (Other libraries +such as `Urwid <https://pypi.python.org/pypi/urwid/>`_ have more extensive +collections of widgets.) + +There are two methods for getting input from a window: + +* :meth:`~curses.window.getch` refreshes the screen and then waits for + the user to hit a key, displaying the key if :func:`~curses.echo` has been + called earlier. You can optionally specify a coordinate to which + the cursor should be moved before pausing. + +* :meth:`~curses.window.getkey` does the same thing but converts the + integer to a string. Individual characters are returned as + 1-character strings, and special keys such as function keys return + longer strings containing a key name such as ``KEY_UP`` or ``^G``. + +It's possible to not wait for the user using the +:meth:`~curses.window.nodelay` window method. After ``nodelay(True)``, +:meth:`getch` and :meth:`getkey` for the window become +non-blocking. To signal that no input is ready, :meth:`getch` returns +``curses.ERR`` (a value of -1) and :meth:`getkey` raises an exception. +There's also a :func:`~curses.halfdelay` function, which can be used to (in +effect) set a timer on each :meth:`getch`; if no input becomes +available within a specified delay (measured in tenths of a second), +curses raises an exception. The :meth:`getch` method returns an integer; if it's between 0 and 255, it represents the ASCII code of the key pressed. Values greater than 255 are special keys such as Page Up, Home, or the cursor keys. You can compare the value returned to constants such as :const:`curses.KEY_PPAGE`, -:const:`curses.KEY_HOME`, or :const:`curses.KEY_LEFT`. Usually the main loop of -your program will look something like this:: +:const:`curses.KEY_HOME`, or :const:`curses.KEY_LEFT`. The main loop of +your program may look something like this:: while True: c = stdscr.getch() - if c == ord('p'): PrintDocument() - elif c == ord('q'): break # Exit the while() - elif c == curses.KEY_HOME: x = y = 0 + if c == ord('p'): + PrintDocument() + elif c == ord('q'): + break # Exit the while loop + elif c == curses.KEY_HOME: + x = y = 0 The :mod:`curses.ascii` module supplies ASCII class membership functions that -take either integer or 1-character-string arguments; these may be useful in -writing more readable tests for your command interpreters. It also supplies +take either integer or 1-character string arguments; these may be useful in +writing more readable tests for such loops. It also supplies conversion functions that take either integer or 1-character-string arguments and return the same type. For example, :func:`curses.ascii.ctrl` returns the control character corresponding to its argument. -There's also a method to retrieve an entire string, :const:`getstr()`. It isn't -used very often, because its functionality is quite limited; the only editing -keys available are the backspace key and the Enter key, which terminates the -string. It can optionally be limited to a fixed number of characters. :: +There's also a method to retrieve an entire string, +:meth:`~curses.window.getstr`. It isn't used very often, because its +functionality is quite limited; the only editing keys available are +the backspace key and the Enter key, which terminates the string. It +can optionally be limited to a fixed number of characters. :: curses.echo() # Enable echoing of characters # Get a 15-character string, with the cursor on the top line s = stdscr.getstr(0,0, 15) -The Python :mod:`curses.textpad` module supplies something better. With it, you -can turn a window into a text box that supports an Emacs-like set of -keybindings. Various methods of :class:`Textbox` class support editing with -input validation and gathering the edit results either with or without trailing -spaces. See the library documentation on :mod:`curses.textpad` for the -details. +The :mod:`curses.textpad` module supplies a text box that supports an +Emacs-like set of keybindings. Various methods of the +:class:`~curses.textpad.Textbox` class support editing with input +validation and gathering the edit results either with or without +trailing spaces. Here's an example:: + + import curses + from curses.textpad import Textbox, rectangle + def main(stdscr): + stdscr.addstr(0, 0, "Enter IM message: (hit Ctrl-G to send)") -For More Information -==================== + editwin = curses.newwin(5,30, 2,1) + rectangle(stdscr, 1,0, 1+5+1, 1+30+1) + stdscr.refresh() -This HOWTO didn't cover some advanced topics, such as screen-scraping or -capturing mouse events from an xterm instance. But the Python library page for -the curses modules is now pretty complete. You should browse it next. + box = Textbox(editwin) -If you're in doubt about the detailed behavior of any of the ncurses entry -points, consult the manual pages for your curses implementation, whether it's -ncurses or a proprietary Unix vendor's. The manual pages will document any -quirks, and provide complete lists of all the functions, attributes, and -:const:`ACS_\*` characters available to you. + # Let the user edit until Ctrl-G is struck. + box.edit() -Because the curses API is so large, some functions aren't supported in the -Python interface, not because they're difficult to implement, but because no one -has needed them yet. Feel free to add them and then submit a patch. Also, we -don't yet have support for the menu library associated with -ncurses; feel free to add that. + # Get resulting contents + message = box.gather() -If you write an interesting little program, feel free to contribute it as -another demo. We can always use more of them! +See the library documentation on :mod:`curses.textpad` for more details. -The ncurses FAQ: http://invisible-island.net/ncurses/ncurses.faq.html +For More Information +==================== + +This HOWTO doesn't cover some advanced topics, such as reading the +contents of the screen or capturing mouse events from an xterm +instance, but the Python library page for the :mod:`curses` module is now +reasonably complete. You should browse it next. + +If you're in doubt about the detailed behavior of the curses +functions, consult the manual pages for your curses implementation, +whether it's ncurses or a proprietary Unix vendor's. The manual pages +will document any quirks, and provide complete lists of all the +functions, attributes, and :const:`ACS_\*` characters available to +you. + +Because the curses API is so large, some functions aren't supported in +the Python interface. Often this isn't because they're difficult to +implement, but because no one has needed them yet. Also, Python +doesn't yet support the menu library associated with ncurses. +Patches adding support for these would be welcome; see +`the Python Developer's Guide <http://docs.python.org/devguide/>`_ to +learn more about submitting patches to Python. + +* `Writing Programs with NCURSES <http://invisible-island.net/ncurses/ncurses-intro.html>`_: + a lengthy tutorial for C programmers. +* `The ncurses man page <http://www.linuxmanpages.com/man3/ncurses.3x.php>`_ +* `The ncurses FAQ <http://invisible-island.net/ncurses/ncurses.faq.html>`_ +* `"Use curses... don't swear" <http://www.youtube.com/watch?v=eN1eZtjLEnU>`_: + video of a PyCon 2013 talk on controlling terminals using curses or Urwid. +* `"Console Applications with Urwid" <http://www.pyvideo.org/video/1568/console-applications-with-urwid>`_: + video of a PyCon CA 2012 talk demonstrating some applications written using + Urwid. diff --git a/Doc/howto/descriptor.rst b/Doc/howto/descriptor.rst index f8763d8..a0c6988 100644 --- a/Doc/howto/descriptor.rst +++ b/Doc/howto/descriptor.rst @@ -36,9 +36,7 @@ continuing through the base classes of ``type(a)`` excluding metaclasses. If the looked-up value is an object defining one of the descriptor methods, then Python may override the default behavior and invoke the descriptor method instead. Where this occurs in the precedence chain depends on which descriptor methods -were defined. Note that descriptors are only invoked for new style objects or -classes (a class is new style if it inherits from :class:`object` or -:class:`type`). +were defined. Descriptors are a powerful, general purpose protocol. They are the mechanism behind properties, methods, static methods, class methods, and :func:`super()`. @@ -89,8 +87,6 @@ of ``obj``. If ``d`` defines the method :meth:`__get__`, then ``d.__get__(obj)` is invoked according to the precedence rules listed below. The details of invocation depend on whether ``obj`` is an object or a class. -Either way, descriptors only work for new style objects and classes. A class is -new style if it is a subclass of :class:`object`. For objects, the machinery is in :meth:`object.__getattribute__` which transforms ``b.x`` into ``type(b).__dict__['x'].__get__(b, type(b))``. The @@ -115,7 +111,6 @@ The important points to remember are: * descriptors are invoked by the :meth:`__getattribute__` method * overriding :meth:`__getattribute__` prevents automatic descriptor calls -* :meth:`__getattribute__` is only available with new style classes and objects * :meth:`object.__getattribute__` and :meth:`type.__getattribute__` make different calls to :meth:`__get__`. * data descriptors always override instance dictionaries. @@ -124,14 +119,11 @@ The important points to remember are: The object returned by ``super()`` also has a custom :meth:`__getattribute__` method for invoking descriptors. The call ``super(B, obj).m()`` searches ``obj.__class__.__mro__`` for the base class ``A`` immediately following ``B`` -and then returns ``A.__dict__['m'].__get__(obj, A)``. If not a descriptor, +and then returns ``A.__dict__['m'].__get__(obj, B)``. If not a descriptor, ``m`` is returned unchanged. If not in the dictionary, ``m`` reverts to a search using :meth:`object.__getattribute__`. -Note, in Python 2.2, ``super(B, obj).m()`` would only invoke :meth:`__get__` if -``m`` was a data descriptor. In Python 2.3, non-data descriptors also get -invoked unless an old-style class is involved. The implementation details are -in :c:func:`super_getattro()` in +The implementation details are in :c:func:`super_getattro()` in `Objects/typeobject.c <http://svn.python.org/view/python/trunk/Objects/typeobject.c?view=markup>`_ and a pure Python equivalent can be found in `Guido's Tutorial`_. @@ -218,6 +210,8 @@ here is a pure Python equivalent:: self.fget = fget self.fset = fset self.fdel = fdel + if doc is None and fget is not None: + doc = fget.__doc__ self.__doc__ = doc def __get__(self, obj, objtype=None): @@ -237,6 +231,15 @@ here is a pure Python equivalent:: raise AttributeError("can't delete attribute") self.fdel(obj) + def getter(self, fget): + return type(self)(fget, self.fset, self.fdel, self.__doc__) + + def setter(self, fset): + return type(self)(self.fget, fset, self.fdel, self.__doc__) + + def deleter(self, fdel): + return type(self)(self.fget, self.fset, fdel, self.__doc__) + The :func:`property` builtin helps whenever a user interface has granted attribute access and then subsequent changes require the intervention of a method. @@ -398,7 +401,7 @@ is to create alternate class constructors. In Python 2.3, the classmethod :func:`dict.fromkeys` creates a new dictionary from a list of keys. The pure Python equivalent is:: - class Dict: + class Dict(object): . . . def fromkeys(klass, iterable, value=None): "Emulate dict_fromkeys() in Objects/dictobject.c" diff --git a/Doc/howto/functional.rst b/Doc/howto/functional.rst index ebbb229..7189c65 100644 --- a/Doc/howto/functional.rst +++ b/Doc/howto/functional.rst @@ -292,13 +292,14 @@ ordering of the objects in the dictionary. Applying :func:`iter` to a dictionary always loops over the keys, but dictionaries have methods that return other iterators. If you want to iterate over values or key/value pairs, you can explicitly call the -:meth:`~dict.values` or :meth:`~dict.items` methods to get an appropriate iterator. +:meth:`~dict.values` or :meth:`~dict.items` methods to get an appropriate +iterator. The :func:`dict` constructor can accept an iterator that returns a finite stream of ``(key, value)`` tuples: >>> L = [('Italy', 'Rome'), ('France', 'Paris'), ('US', 'Washington DC')] - >>> dict(iter(L)) + >>> dict(iter(L)) #doctest: +SKIP {'Italy': 'Rome', 'US': 'Washington DC', 'France': 'Paris'} Files also support iteration by calling the :meth:`~io.TextIOBase.readline` @@ -478,13 +479,10 @@ Here's a sample usage of the ``generate_ints()`` generator: You could equally write ``for i in generate_ints(5)``, or ``a,b,c = generate_ints(3)``. -Inside a generator function, the ``return`` statement can only be used without a -value, and signals the end of the procession of values; after executing a -``return`` the generator cannot return any further values. ``return`` with a -value, such as ``return 5``, is a syntax error inside a generator function. The -end of the generator's results can also be indicated by raising -:exc:`StopIteration` manually, or by just letting the flow of execution fall off -the bottom of the function. +Inside a generator function, ``return value`` is semantically equivalent to +``raise StopIteration(value)``. If no value is returned or the bottom of the +function is reached, the procession of values ends and the generator cannot +return any further values. You could achieve the effect of generators manually by writing your own class and storing all the local variables of the generator as instance variables. For @@ -689,8 +687,8 @@ constructed list's :meth:`~list.sort` method. :: The :func:`any(iter) <any>` and :func:`all(iter) <all>` built-ins look at the -truth values of an iterable's contents. :func:`any` returns True if any element -in the iterable is a true value, and :func:`all` returns True if all of the +truth values of an iterable's contents. :func:`any` returns ``True`` if any element +in the iterable is a true value, and :func:`all` returns ``True`` if all of the elements are true values: >>> any([0,1,0]) diff --git a/Doc/howto/index.rst b/Doc/howto/index.rst index a11d3da..81a4f8b 100644 --- a/Doc/howto/index.rst +++ b/Doc/howto/index.rst @@ -13,7 +13,6 @@ Currently, the HOWTOs are: .. toctree:: :maxdepth: 1 - advocacy.rst pyporting.rst cporting.rst curses.rst @@ -28,4 +27,5 @@ Currently, the HOWTOs are: urllib2.rst webservers.rst argparse.rst + ipaddress.rst diff --git a/Doc/howto/ipaddress.rst b/Doc/howto/ipaddress.rst new file mode 100644 index 0000000..5e0ff3e --- /dev/null +++ b/Doc/howto/ipaddress.rst @@ -0,0 +1,341 @@ +.. _ipaddress-howto: + +*************************************** +An introduction to the ipaddress module +*************************************** + +:author: Peter Moody +:author: Nick Coghlan + +.. topic:: Overview + + This document aims to provide a gentle introduction to the + :mod:`ipaddress` module. It is aimed primarily at users that aren't + already familiar with IP networking terminology, but may also be useful + to network engineers wanting an overview of how :mod:`ipaddress` + represents IP network addressing concepts. + + +Creating Address/Network/Interface objects +========================================== + +Since :mod:`ipaddress` is a module for inspecting and manipulating IP addresses, +the first thing you'll want to do is create some objects. You can use +:mod:`ipaddress` to create objects from strings and integers. + + +A Note on IP Versions +--------------------- + +For readers that aren't particularly familiar with IP addressing, it's +important to know that the Internet Protocol is currently in the process +of moving from version 4 of the protocol to version 6. This transition is +occurring largely because version 4 of the protocol doesn't provide enough +addresses to handle the needs of the whole world, especially given the +increasing number of devices with direct connections to the internet. + +Explaining the details of the differences between the two versions of the +protocol is beyond the scope of this introduction, but readers need to at +least be aware that these two versions exist, and it will sometimes be +necessary to force the use of one version or the other. + + +IP Host Addresses +----------------- + +Addresses, often referred to as "host addresses" are the most basic unit +when working with IP addressing. The simplest way to create addresses is +to use the :func:`ipaddress.ip_address` factory function, which automatically +determines whether to create an IPv4 or IPv6 address based on the passed in +value: + +.. testsetup:: + >>> import ipaddress + +:: + + >>> ipaddress.ip_address('192.0.2.1') + IPv4Address('192.0.2.1') + >>> ipaddress.ip_address('2001:DB8::1') + IPv6Address('2001:db8::1') + +Addresses can also be created directly from integers. Values that will +fit within 32 bits are assumed to be IPv4 addresses:: + + >>> ipaddress.ip_address(3221225985) + IPv4Address('192.0.2.1') + >>> ipaddress.ip_address(42540766411282592856903984951653826561) + IPv6Address('2001:db8::1') + +To force the use of IPv4 or IPv6 addresses, the relevant classes can be +invoked directly. This is particularly useful to force creation of IPv6 +addresses for small integers:: + + >>> ipaddress.ip_address(1) + IPv4Address('0.0.0.1') + >>> ipaddress.IPv4Address(1) + IPv4Address('0.0.0.1') + >>> ipaddress.IPv6Address(1) + IPv6Address('::1') + + +Defining Networks +----------------- + +Host addresses are usually grouped together into IP networks, so +:mod:`ipaddress` provides a way to create, inspect and manipulate network +definitions. IP network objects are constructed from strings that define the +range of host addresses that are part of that network. The simplest form +for that information is a "network address/network prefix" pair, where the +prefix defines the number of leading bits that are compared to determine +whether or not an address is part of the network and the network address +defines the expected value of those bits. + +As for addresses, a factory function is provided that determines the correct +IP version automatically:: + + >>> ipaddress.ip_network('192.0.2.0/24') + IPv4Network('192.0.2.0/24') + >>> ipaddress.ip_network('2001:db8::0/96') + IPv6Network('2001:db8::/96') + +Network objects cannot have any host bits set. The practical effect of this +is that ``192.0.2.1/24`` does not describe a network. Such definitions are +referred to as interface objects since the ip-on-a-network notation is +commonly used to describe network interfaces of a computer on a given network +and are described further in the next section. + +By default, attempting to create a network object with host bits set will +result in :exc:`ValueError` being raised. To request that the +additional bits instead be coerced to zero, the flag ``strict=False`` can +be passed to the constructor:: + + >>> ipaddress.ip_network('192.0.2.1/24') + Traceback (most recent call last): + ... + ValueError: 192.0.2.1/24 has host bits set + >>> ipaddress.ip_network('192.0.2.1/24', strict=False) + IPv4Network('192.0.2.0/24') + +While the string form offers significantly more flexibility, networks can +also be defined with integers, just like host addresses. In this case, the +network is considered to contain only the single address identified by the +integer, so the network prefix includes the entire network address:: + + >>> ipaddress.ip_network(3221225984) + IPv4Network('192.0.2.0/32') + >>> ipaddress.ip_network(42540766411282592856903984951653826560) + IPv6Network('2001:db8::/128') + +As with addresses, creation of a particular kind of network can be forced +by calling the class constructor directly instead of using the factory +function. + + +Host Interfaces +--------------- + +As mentioned just above, if you need to describe an address on a particular +network, neither the address nor the network classes are sufficient. +Notation like ``192.0.2.1/24`` is commonly used by network engineers and the +people who write tools for firewalls and routers as shorthand for "the host +``192.0.2.1`` on the network ``192.0.2.0/24``", Accordingly, :mod:`ipaddress` +provides a set of hybrid classes that associate an address with a particular +network. The interface for creation is identical to that for defining network +objects, except that the address portion isn't constrained to being a network +address. + + >>> ipaddress.ip_interface('192.0.2.1/24') + IPv4Interface('192.0.2.1/24') + >>> ipaddress.ip_interface('2001:db8::1/96') + IPv6Interface('2001:db8::1/96') + +Integer inputs are accepted (as with networks), and use of a particular IP +version can be forced by calling the relevant constructor directly. + + +Inspecting Address/Network/Interface Objects +============================================ + +You've gone to the trouble of creating an IPv(4|6)(Address|Network|Interface) +object, so you probably want to get information about it. :mod:`ipaddress` +tries to make doing this easy and intuitive. + +Extracting the IP version:: + + >>> addr4 = ipaddress.ip_address('192.0.2.1') + >>> addr6 = ipaddress.ip_address('2001:db8::1') + >>> addr6.version + 6 + >>> addr4.version + 4 + +Obtaining the network from an interface:: + + >>> host4 = ipaddress.ip_interface('192.0.2.1/24') + >>> host4.network + IPv4Network('192.0.2.0/24') + >>> host6 = ipaddress.ip_interface('2001:db8::1/96') + >>> host6.network + IPv6Network('2001:db8::/96') + +Finding out how many individual addresses are in a network:: + + >>> net4 = ipaddress.ip_network('192.0.2.0/24') + >>> net4.num_addresses + 256 + >>> net6 = ipaddress.ip_network('2001:db8::0/96') + >>> net6.num_addresses + 4294967296 + +Iterating through the "usable" addresses on a network:: + + >>> net4 = ipaddress.ip_network('192.0.2.0/24') + >>> for x in net4.hosts(): + ... print(x) # doctest: +ELLIPSIS + 192.0.2.1 + 192.0.2.2 + 192.0.2.3 + 192.0.2.4 + ... + 192.0.2.252 + 192.0.2.253 + 192.0.2.254 + + +Obtaining the netmask (i.e. set bits corresponding to the network prefix) or +the hostmask (any bits that are not part of the netmask): + + >>> net4 = ipaddress.ip_network('192.0.2.0/24') + >>> net4.netmask + IPv4Address('255.255.255.0') + >>> net4.hostmask + IPv4Address('0.0.0.255') + >>> net6 = ipaddress.ip_network('2001:db8::0/96') + >>> net6.netmask + IPv6Address('ffff:ffff:ffff:ffff:ffff:ffff::') + >>> net6.hostmask + IPv6Address('::ffff:ffff') + + +Exploding or compressing the address:: + + >>> addr6.exploded + '2001:0db8:0000:0000:0000:0000:0000:0001' + >>> addr6.compressed + '2001:db8::1' + >>> net6.exploded + '2001:0db8:0000:0000:0000:0000:0000:0000/96' + >>> net6.compressed + '2001:db8::/96' + +While IPv4 doesn't support explosion or compression, the associated objects +still provide the relevant properties so that version neutral code can +easily ensure the most concise or most verbose form is used for IPv6 +addresses while still correctly handling IPv4 addresses. + + +Networks as lists of Addresses +============================== + +It's sometimes useful to treat networks as lists. This means it is possible +to index them like this:: + + >>> net4[1] + IPv4Address('192.0.2.1') + >>> net4[-1] + IPv4Address('192.0.2.255') + >>> net6[1] + IPv6Address('2001:db8::1') + >>> net6[-1] + IPv6Address('2001:db8::ffff:ffff') + + +It also means that network objects lend themselves to using the list +membership test syntax like this:: + + if address in network: + # do something + +Containment testing is done efficiently based on the network prefix:: + + >>> addr4 = ipaddress.ip_address('192.0.2.1') + >>> addr4 in ipaddress.ip_network('192.0.2.0/24') + True + >>> addr4 in ipaddress.ip_network('192.0.3.0/24') + False + + +Comparisons +=========== + +:mod:`ipaddress` provides some simple, hopefully intuitive ways to compare +objects, where it makes sense:: + + >>> ipaddress.ip_address('192.0.2.1') < ipaddress.ip_address('192.0.2.2') + True + +A :exc:`TypeError` exception is raised if you try to compare objects of +different versions or different types. + + +Using IP Addresses with other modules +===================================== + +Other modules that use IP addresses (such as :mod:`socket`) usually won't +accept objects from this module directly. Instead, they must be coerced to +an integer or string that the other module will accept:: + + >>> addr4 = ipaddress.ip_address('192.0.2.1') + >>> str(addr4) + '192.0.2.1' + >>> int(addr4) + 3221225985 + + +Getting more detail when instance creation fails +================================================ + +When creating address/network/interface objects using the version-agnostic +factory functions, any errors will be reported as :exc:`ValueError` with +a generic error message that simply says the passed in value was not +recognized as an object of that type. The lack of a specific error is +because it's necessary to know whether the value is *supposed* to be IPv4 +or IPv6 in order to provide more detail on why it has been rejected. + +To support use cases where it is useful to have access to this additional +detail, the individual class constructors actually raise the +:exc:`ValueError` subclasses :exc:`ipaddress.AddressValueError` and +:exc:`ipaddress.NetmaskValueError` to indicate exactly which part of +the definition failed to parse correctly. + +The error messages are significantly more detailed when using the +class constructors directly. For example:: + + >>> ipaddress.ip_address("192.168.0.256") + Traceback (most recent call last): + ... + ValueError: '192.168.0.256' does not appear to be an IPv4 or IPv6 address + >>> ipaddress.IPv4Address("192.168.0.256") + Traceback (most recent call last): + ... + ipaddress.AddressValueError: Octet 256 (> 255) not permitted in '192.168.0.256' + + >>> ipaddress.ip_network("192.168.0.1/64") + Traceback (most recent call last): + ... + ValueError: '192.168.0.1/64' does not appear to be an IPv4 or IPv6 network + >>> ipaddress.IPv4Network("192.168.0.1/64") + Traceback (most recent call last): + ... + ipaddress.NetmaskValueError: '64' is not a valid netmask + +However, both of the module specific exceptions have :exc:`ValueError` as their +parent class, so if you're not concerned with the particular type of error, +you can still write code like the following:: + + try: + network = ipaddress.IPv4Network(address) + except ValueError: + print('address/netmask is invalid for IPv4:', address) + diff --git a/Doc/howto/logging-cookbook.rst b/Doc/howto/logging-cookbook.rst index 33963f9..563da9d 100644 --- a/Doc/howto/logging-cookbook.rst +++ b/Doc/howto/logging-cookbook.rst @@ -97,11 +97,11 @@ The output looks like this:: Multiple handlers and formatters -------------------------------- -Loggers are plain Python objects. The :func:`addHandler` method has no minimum -or maximum quota for the number of handlers you may add. Sometimes it will be -beneficial for an application to log all messages of all severities to a text -file while simultaneously logging errors or above to the console. To set this -up, simply configure the appropriate handlers. The logging calls in the +Loggers are plain Python objects. The :meth:`~Logger.addHandler` method has no +minimum or maximum quota for the number of handlers you may add. Sometimes it +will be beneficial for an application to log all messages of all severities to a +text file while simultaneously logging errors or above to the console. To set +this up, simply configure the appropriate handlers. The logging calls in the application code will remain unchanged. Here is a slight modification to the previous simple module-based configuration example:: @@ -459,8 +459,9 @@ printed on the console; on the server side, you should see something like:: Note that there are some security issues with pickle in some scenarios. If these affect you, you can use an alternative serialization scheme by overriding -the :meth:`makePickle` method and implementing your alternative there, as -well as adapting the above script to use your alternative serialization. +the :meth:`~handlers.SocketHandler.makePickle` method and implementing your +alternative there, as well as adapting the above script to use your alternative +serialization. .. _context-info: @@ -509,9 +510,9 @@ information in the delegated call. Here's a snippet from the code of msg, kwargs = self.process(msg, kwargs) self.logger.debug(msg, *args, **kwargs) -The :meth:`process` method of :class:`LoggerAdapter` is where the contextual -information is added to the logging output. It's passed the message and -keyword arguments of the logging call, and it passes back (potentially) +The :meth:`~LoggerAdapter.process` method of :class:`LoggerAdapter` is where the +contextual information is added to the logging output. It's passed the message +and keyword arguments of the logging call, and it passes back (potentially) modified versions of these to use in the call to the underlying logger. The default implementation of this method leaves the message alone, but inserts an 'extra' key in the keyword argument whose value is the dict-like object @@ -523,70 +524,32 @@ merged into the :class:`LogRecord` instance's __dict__, allowing you to use customized strings with your :class:`Formatter` instances which know about the keys of the dict-like object. If you need a different method, e.g. if you want to prepend or append the contextual information to the message string, -you just need to subclass :class:`LoggerAdapter` and override :meth:`process` -to do what you need. Here's an example script which uses this class, which -also illustrates what dict-like behaviour is needed from an arbitrary -'dict-like' object for use in the constructor:: +you just need to subclass :class:`LoggerAdapter` and override +:meth:`~LoggerAdapter.process` to do what you need. Here is a simple example:: - import logging + class CustomAdapter(logging.LoggerAdapter): + """ + This example adapter expects the passed in dict-like object to have a + 'connid' key, whose value in brackets is prepended to the log message. + """ + def process(self, msg, kwargs): + return '[%s] %s' % (self.extra['connid'], msg), kwargs - class ConnInfo: - """ - An example class which shows how an arbitrary class can be used as - the 'extra' context information repository passed to a LoggerAdapter. - """ +which you can use like this:: - def __getitem__(self, name): - """ - To allow this instance to look like a dict. - """ - from random import choice - if name == 'ip': - result = choice(['127.0.0.1', '192.168.0.1']) - elif name == 'user': - result = choice(['jim', 'fred', 'sheila']) - else: - result = self.__dict__.get(name, '?') - return result - - def __iter__(self): - """ - To allow iteration over keys, which will be merged into - the LogRecord dict before formatting and output. - """ - keys = ['ip', 'user'] - keys.extend(self.__dict__.keys()) - return keys.__iter__() + logger = logging.getLogger(__name__) + adapter = CustomAdapter(logger, {'connid': some_conn_id}) - if __name__ == '__main__': - from random import choice - levels = (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL) - a1 = logging.LoggerAdapter(logging.getLogger('a.b.c'), - { 'ip' : '123.231.231.123', 'user' : 'sheila' }) - logging.basicConfig(level=logging.DEBUG, - format='%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s') - a1.debug('A debug message') - a1.info('An info message with %s', 'some parameters') - a2 = logging.LoggerAdapter(logging.getLogger('d.e.f'), ConnInfo()) - for x in range(10): - lvl = choice(levels) - lvlname = logging.getLevelName(lvl) - a2.log(lvl, 'A message at %s level with %d %s', lvlname, 2, 'parameters') +Then any events that you log to the adapter will have the value of +``some_conn_id`` prepended to the log messages. -When this script is run, the output should look something like this:: +Using objects other than dicts to pass contextual information +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - 2008-01-18 14:49:54,023 a.b.c DEBUG IP: 123.231.231.123 User: sheila A debug message - 2008-01-18 14:49:54,023 a.b.c INFO IP: 123.231.231.123 User: sheila An info message with some parameters - 2008-01-18 14:49:54,023 d.e.f CRITICAL IP: 192.168.0.1 User: jim A message at CRITICAL level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: jim A message at INFO level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: fred A message at ERROR level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: sheila A message at ERROR level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: jim A message at WARNING level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: fred A message at INFO level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters - 2008-01-18 14:49:54,033 d.e.f WARNING IP: 127.0.0.1 User: jim A message at WARNING level with 2 parameters +You don't need to pass an actual dict to a :class:`LoggerAdapter` - you could +pass an instance of a class which implements ``__getitem__`` and ``__iter__`` so +that it looks like a dict to logging. This would be useful if you want to +generate values dynamically (whereas the values in a dict would be constant). .. _filters-contextual: @@ -671,20 +634,20 @@ threads in a single process *is* supported, logging to a single file from *multiple processes* is *not* supported, because there is no standard way to serialize access to a single file across multiple processes in Python. If you need to log to a single file from multiple processes, one way of doing this is -to have all the processes log to a :class:`SocketHandler`, and have a separate -process which implements a socket server which reads from the socket and logs -to file. (If you prefer, you can dedicate one thread in one of the existing -processes to perform this function.) :ref:`This section <network-logging>` -documents this approach in more detail and includes a working socket receiver -which can be used as a starting point for you to adapt in your own -applications. +to have all the processes log to a :class:`~handlers.SocketHandler`, and have a +separate process which implements a socket server which reads from the socket +and logs to file. (If you prefer, you can dedicate one thread in one of the +existing processes to perform this function.) +:ref:`This section <network-logging>` documents this approach in more detail and +includes a working socket receiver which can be used as a starting point for you +to adapt in your own applications. If you are using a recent version of Python which includes the :mod:`multiprocessing` module, you could write your own handler which uses the -:class:`Lock` class from this module to serialize access to the file from -your processes. The existing :class:`FileHandler` and subclasses do not make -use of :mod:`multiprocessing` at present, though they may do so in the future. -Note that at present, the :mod:`multiprocessing` module does not provide +:class:`~multiprocessing.Lock` class from this module to serialize access to the +file from your processes. The existing :class:`FileHandler` and subclasses do +not make use of :mod:`multiprocessing` at present, though they may do so in the +future. Note that at present, the :mod:`multiprocessing` module does not provide working lock functionality on all platforms (see http://bugs.python.org/issue3770). @@ -878,7 +841,7 @@ separate thread:: }, 'loggers': { 'foo': { - 'handlers' : ['foofile'] + 'handlers': ['foofile'] } }, 'root': { @@ -918,7 +881,7 @@ Sometimes you want to let a log file grow to a certain size, then open a new file and log to that. You may want to keep a certain number of these files, and when that many files have been created, rotate the files so that the number of files and the size of the files both remain bounded. For this usage pattern, the -logging package provides a :class:`RotatingFileHandler`:: +logging package provides a :class:`~handlers.RotatingFileHandler`:: import glob import logging @@ -1096,12 +1059,46 @@ parentheses go around the format string and the arguments, not just the format string. That's because the __ notation is just syntax sugar for a constructor call to one of the XXXMessage classes. +If you prefer, you can use a :class:`LoggerAdapter` to achieve a similar effect +to the above, as in the following example:: + + import logging + + class Message(object): + def __init__(self, fmt, args): + self.fmt = fmt + self.args = args + + def __str__(self): + return self.fmt.format(*self.args) + + class StyleAdapter(logging.LoggerAdapter): + def __init__(self, logger, extra=None): + super(StyleAdapter, self).__init__(logger, extra or {}) + + def log(self, level, msg, *args, **kwargs): + if self.isEnabledFor(level): + msg, kwargs = self.process(msg, kwargs) + self.logger._log(level, Message(msg, args), (), **kwargs) + + logger = StyleAdapter(logging.getLogger(__name__)) + + def main(): + logger.debug('Hello, {}', 'world!') + + if __name__ == '__main__': + logging.basicConfig(level=logging.DEBUG) + main() + +The above script should log the message ``Hello, world!`` when run with +Python 3.2 or later. + .. currentmodule:: logging .. _custom-logrecord: -Customising ``LogRecord`` +Customizing ``LogRecord`` ------------------------- Every logging event is represented by a :class:`LogRecord` instance. @@ -1258,7 +1255,7 @@ An example dictionary-based configuration Below is an example of a logging configuration dictionary - it's taken from the `documentation on the Django project <https://docs.djangoproject.com/en/1.3/topics/logging/#configuring-logging>`_. -This dictionary is passed to :func:`~logging.config.dictConfig` to put the configuration into effect:: +This dictionary is passed to :func:`~config.dictConfig` to put the configuration into effect:: LOGGING = { 'version': 1, @@ -1316,6 +1313,33 @@ For more information about this configuration, you can see the `relevant section <https://docs.djangoproject.com/en/1.3/topics/logging/#configuring-logging>`_ of the Django documentation. +.. _cookbook-rotator-namer: + +Using a rotator and namer to customize log rotation processing +-------------------------------------------------------------- + +An example of how you can define a namer and rotator is given in the following +snippet, which shows zlib-based compression of the log file:: + + def namer(name): + return name + ".gz" + + def rotator(source, dest): + with open(source, "rb") as sf: + data = sf.read() + compressed = zlib.compress(data, 9) + with open(dest, "wb") as df: + df.write(compressed) + os.remove(source) + + rh = logging.handlers.RotatingFileHandler(...) + rh.rotator = rotator + rh.namer = namer + +These are not "true" .gz files, as they are bare compressed data, with no +"container" such as you’d find in an actual gzip file. This snippet is just +for illustration purposes. + A more elaborate multiprocessing example ---------------------------------------- @@ -1505,7 +1529,7 @@ works:: }, 'loggers': { 'foo': { - 'handlers' : ['foofile'] + 'handlers': ['foofile'] } }, 'root': { @@ -1670,3 +1694,344 @@ When the above script is run, it prints:: Note that the order of items might be different according to the version of Python used. + +.. _custom-handlers: + +.. currentmodule:: logging.config + +Customizing handlers with :func:`dictConfig` +-------------------------------------------- + +There are times when you want to customize logging handlers in particular ways, +and if you use :func:`dictConfig` you may be able to do this without +subclassing. As an example, consider that you may want to set the ownership of a +log file. On POSIX, this is easily done using :func:`shutil.chown`, but the file +handlers in the stdlib don't offer built-in support. You can customize handler +creation using a plain function such as:: + + def owned_file_handler(filename, mode='a', encoding=None, owner=None): + if owner: + if not os.path.exists(filename): + open(filename, 'a').close() + shutil.chown(filename, *owner) + return logging.FileHandler(filename, mode, encoding) + +You can then specify, in a logging configuration passed to :func:`dictConfig`, +that a logging handler be created by calling this function:: + + LOGGING = { + 'version': 1, + 'disable_existing_loggers': False, + 'formatters': { + 'default': { + 'format': '%(asctime)s %(levelname)s %(name)s %(message)s' + }, + }, + 'handlers': { + 'file':{ + # The values below are popped from this dictionary and + # used to create the handler, set the handler's level and + # its formatter. + '()': owned_file_handler, + 'level':'DEBUG', + 'formatter': 'default', + # The values below are passed to the handler creator callable + # as keyword arguments. + 'owner': ['pulse', 'pulse'], + 'filename': 'chowntest.log', + 'mode': 'w', + 'encoding': 'utf-8', + }, + }, + 'root': { + 'handlers': ['file'], + 'level': 'DEBUG', + }, + } + +In this example I am setting the ownership using the ``pulse`` user and group, +just for the purposes of illustration. Putting it together into a working +script, ``chowntest.py``:: + + import logging, logging.config, os, shutil + + def owned_file_handler(filename, mode='a', encoding=None, owner=None): + if owner: + if not os.path.exists(filename): + open(filename, 'a').close() + shutil.chown(filename, *owner) + return logging.FileHandler(filename, mode, encoding) + + LOGGING = { + 'version': 1, + 'disable_existing_loggers': False, + 'formatters': { + 'default': { + 'format': '%(asctime)s %(levelname)s %(name)s %(message)s' + }, + }, + 'handlers': { + 'file':{ + # The values below are popped from this dictionary and + # used to create the handler, set the handler's level and + # its formatter. + '()': owned_file_handler, + 'level':'DEBUG', + 'formatter': 'default', + # The values below are passed to the handler creator callable + # as keyword arguments. + 'owner': ['pulse', 'pulse'], + 'filename': 'chowntest.log', + 'mode': 'w', + 'encoding': 'utf-8', + }, + }, + 'root': { + 'handlers': ['file'], + 'level': 'DEBUG', + }, + } + + logging.config.dictConfig(LOGGING) + logger = logging.getLogger('mylogger') + logger.debug('A debug message') + +To run this, you will probably need to run as ``root``:: + + $ sudo python3.3 chowntest.py + $ cat chowntest.log + 2013-11-05 09:34:51,128 DEBUG mylogger A debug message + $ ls -l chowntest.log + -rw-r--r-- 1 pulse pulse 55 2013-11-05 09:34 chowntest.log + +Note that this example uses Python 3.3 because that's where :func:`shutil.chown` +makes an appearance. This approach should work with any Python version that +supports :func:`dictConfig` - namely, Python 2.7, 3.2 or later. With pre-3.3 +versions, you would need to implement the actual ownership change using e.g. +:func:`os.chown`. + +In practice, the handler-creating function may be in a utility module somewhere +in your project. Instead of the line in the configuration:: + + '()': owned_file_handler, + +you could use e.g.:: + + '()': 'ext://project.util.owned_file_handler', + +where ``project.util`` can be replaced with the actual name of the package +where the function resides. In the above working script, using +``'ext://__main__.owned_file_handler'`` should work. Here, the actual callable +is resolved by :func:`dictConfig` from the ``ext://`` specification. + +This example hopefully also points the way to how you could implement other +types of file change - e.g. setting specific POSIX permission bits - in the +same way, using :func:`os.chmod`. + +Of course, the approach could also be extended to types of handler other than a +:class:`~logging.FileHandler` - for example, one of the rotating file handlers, +or a different type of handler altogether. + + +.. currentmodule:: logging + +.. _formatting-styles: + +Using particular formatting styles throughout your application +-------------------------------------------------------------- + +In Python 3.2, the :class:`~logging.Formatter` gained a ``style`` keyword +parameter which, while defaulting to ``%`` for backward compatibility, allowed +the specification of ``{`` or ``$`` to support the formatting approaches +supported by :meth:`str.format` and :class:`string.Template`. Note that this +governs the formatting of logging messages for final output to logs, and is +completely orthogonal to how an individual logging message is constructed. + +Logging calls (:meth:`~Logger.debug`, :meth:`~Logger.info` etc.) only take +positional parameters for the actual logging message itself, with keyword +parameters used only for determining options for how to handle the logging call +(e.g. the ``exc_info`` keyword parameter to indicate that traceback information +should be logged, or the ``extra`` keyword parameter to indicate additional +contextual information to be added to the log). So you cannot directly make +logging calls using :meth:`str.format` or :class:`string.Template` syntax, +because internally the logging package uses %-formatting to merge the format +string and the variable arguments. There would no changing this while preserving +backward compatibility, since all logging calls which are out there in existing +code will be using %-format strings. + +There have been suggestions to associate format styles with specific loggers, +but that approach also runs into backward compatibility problems because any +existing code could be using a given logger name and using %-formatting. + +For logging to work interoperably between any third-party libraries and your +code, decisions about formatting need to be made at the level of the +individual logging call. This opens up a couple of ways in which alternative +formatting styles can be accommodated. + + +Using LogRecord factories +^^^^^^^^^^^^^^^^^^^^^^^^^ + +In Python 3.2, along with the :class:`~logging.Formatter` changes mentioned +above, the logging package gained the ability to allow users to set their own +:class:`LogRecord` subclasses, using the :func:`setLogRecordFactory` function. +You can use this to set your own subclass of :class:`LogRecord`, which does the +Right Thing by overriding the :meth:`~LogRecord.getMessage` method. The base +class implementation of this method is where the ``msg % args`` formatting +happens, and where you can substitute your alternate formatting; however, you +should be careful to support all formatting styles and allow %-formatting as +the default, to ensure interoperability with other code. Care should also be +taken to call ``str(self.msg)``, just as the base implementation does. + +Refer to the reference documentation on :func:`setLogRecordFactory` and +:class:`LogRecord` for more information. + + +Using custom message objects +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There is another, perhaps simpler way that you can use {}- and $- formatting to +construct your individual log messages. You may recall (from +:ref:`arbitrary-object-messages`) that when logging you can use an arbitrary +object as a message format string, and that the logging package will call +:func:`str` on that object to get the actual format string. Consider the +following two classes:: + + class BraceMessage(object): + def __init__(self, fmt, *args, **kwargs): + self.fmt = fmt + self.args = args + self.kwargs = kwargs + + def __str__(self): + return self.fmt.format(*self.args, **self.kwargs) + + class DollarMessage(object): + def __init__(self, fmt, **kwargs): + self.fmt = fmt + self.kwargs = kwargs + + def __str__(self): + from string import Template + return Template(self.fmt).substitute(**self.kwargs) + +Either of these can be used in place of a format string, to allow {}- or +$-formatting to be used to build the actual "message" part which appears in the +formatted log output in place of “%(message)s” or “{message}” or “$message”. +If you find it a little unwieldy to use the class names whenever you want to log +something, you can make it more palatable if you use an alias such as ``M`` or +``_`` for the message (or perhaps ``__``, if you are using ``_`` for +localization). + +Examples of this approach are given below. Firstly, formatting with +:meth:`str.format`:: + + >>> __ = BraceMessage + >>> print(__('Message with {0} {1}', 2, 'placeholders')) + Message with 2 placeholders + >>> class Point: pass + ... + >>> p = Point() + >>> p.x = 0.5 + >>> p.y = 0.5 + >>> print(__('Message with coordinates: ({point.x:.2f}, {point.y:.2f})', point=p)) + Message with coordinates: (0.50, 0.50) + +Secondly, formatting with :class:`string.Template`:: + + >>> __ = DollarMessage + >>> print(__('Message with $num $what', num=2, what='placeholders')) + Message with 2 placeholders + >>> + +One thing to note is that you pay no significant performance penalty with this +approach: the actual formatting happens not when you make the logging call, but +when (and if) the logged message is actually about to be output to a log by a +handler. So the only slightly unusual thing which might trip you up is that the +parentheses go around the format string and the arguments, not just the format +string. That’s because the __ notation is just syntax sugar for a constructor +call to one of the ``XXXMessage`` classes shown above. + + +.. _filters-dictconfig: + +.. currentmodule:: logging.config + +Configuring filters with :func:`dictConfig` +------------------------------------------- + +You *can* configure filters using :func:`~logging.config.dictConfig`, though it +might not be obvious at first glance how to do it (hence this recipe). Since +:class:`~logging.Filter` is the only filter class included in the standard +library, and it is unlikely to cater to many requirements (it's only there as a +base class), you will typically need to define your own :class:`~logging.Filter` +subclass with an overridden :meth:`~logging.Filter.filter` method. To do this, +specify the ``()`` key in the configuration dictionary for the filter, +specifying a callable which will be used to create the filter (a class is the +most obvious, but you can provide any callable which returns a +:class:`~logging.Filter` instance). Here is a complete example:: + + import logging + import logging.config + import sys + + class MyFilter(logging.Filter): + def __init__(self, param=None): + self.param = param + + def filter(self, record): + if self.param is None: + allow = True + else: + allow = self.param not in record.msg + if allow: + record.msg = 'changed: ' + record.msg + return allow + + LOGGING = { + 'version': 1, + 'filters': { + 'myfilter': { + '()': MyFilter, + 'param': 'noshow', + } + }, + 'handlers': { + 'console': { + 'class': 'logging.StreamHandler', + 'filters': ['myfilter'] + } + }, + 'root': { + 'level': 'DEBUG', + 'handlers': ['console'] + }, + } + + if __name__ == '__main__': + logging.config.dictConfig(LOGGING) + logging.debug('hello') + logging.debug('hello - noshow') + +This example shows how you can pass configuration data to the callable which +constructs the instance, in the form of keyword parameters. When run, the above +script will print:: + + changed: hello + +which shows that the filter is working as configured. + +A couple of extra points to note: + +* If you can't refer to the callable directly in the configuration (e.g. if it + lives in a different module, and you can't import it directly where the + configuration dictionary is), you can use the form ``ext://...`` as described + in :ref:`logging-config-dict-externalobj`. For example, you could have used + the text ``'ext://__main__.MyFilter'`` instead of ``MyFilter`` in the above + example. + +* As well as for filters, this technique can also be used to configure custom + handlers and formatters. See :ref:`logging-config-dict-userdef` for more + information on how logging supports using user-defined objects in its + configuration, and see the other cookbook recipe :ref:`custom-handlers` above. + diff --git a/Doc/howto/logging.rst b/Doc/howto/logging.rst index 79f1336..55b1c86 100644 --- a/Doc/howto/logging.rst +++ b/Doc/howto/logging.rst @@ -63,6 +63,8 @@ The logging functions are named after the level or severity of the events they are used to track. The standard levels and their applicability are described below (in increasing order of severity): +.. tabularcolumns:: |l|L| + +--------------+---------------------------------------------+ | Level | When it's used | +==============+=============================================+ @@ -120,7 +122,8 @@ Logging to a file ^^^^^^^^^^^^^^^^^ A very common situation is that of recording logging events in a file, so let's -look at that next:: +look at that next. Be sure to try the following in a newly-started Python +interpreter, and don't just continue from the session described above:: import logging logging.basicConfig(filename='example.log',level=logging.DEBUG) @@ -236,7 +239,7 @@ uses the old, %-style of string formatting. This is for backwards compatibility: the logging package pre-dates newer formatting options such as :meth:`str.format` and :class:`string.Template`. These newer formatting options *are* supported, but exploring them is outside the scope of this -tutorial. +tutorial: see :ref:`formatting-styles` for more information. Changing the format of displayed messages @@ -467,12 +470,13 @@ Handlers :class:`~logging.Handler` objects are responsible for dispatching the appropriate log messages (based on the log messages' severity) to the handler's -specified destination. Logger objects can add zero or more handler objects to -themselves with an :func:`addHandler` method. As an example scenario, an -application may want to send all log messages to a log file, all log messages -of error or higher to stdout, and all messages of critical to an email address. -This scenario requires three individual handlers where each handler is -responsible for sending messages of a specific severity to a specific location. +specified destination. :class:`Logger` objects can add zero or more handler +objects to themselves with an :meth:`~Logger.addHandler` method. As an example +scenario, an application may want to send all log messages to a log file, all +log messages of error or higher to stdout, and all messages of critical to an +email address. This scenario requires three individual handlers where each +handler is responsible for sending messages of a specific severity to a specific +location. The standard library includes quite a few handler types (see :ref:`useful-handlers`); the tutorials use mainly :class:`StreamHandler` and @@ -483,16 +487,17 @@ themselves with. The only handler methods that seem relevant for application developers who are using the built-in handler objects (that is, not creating custom handlers) are the following configuration methods: -* The :meth:`Handler.setLevel` method, just as in logger objects, specifies the +* The :meth:`~Handler.setLevel` method, just as in logger objects, specifies the lowest severity that will be dispatched to the appropriate destination. Why are there two :func:`setLevel` methods? The level set in the logger determines which severity of messages it will pass to its handlers. The level set in each handler determines which messages that handler will send on. -* :func:`setFormatter` selects a Formatter object for this handler to use. +* :meth:`~Handler.setFormatter` selects a Formatter object for this handler to + use. -* :func:`addFilter` and :func:`removeFilter` respectively configure and - deconfigure filter objects on handlers. +* :meth:`~Handler.addFilter` and :meth:`~Handler.removeFilter` respectively + configure and deconfigure filter objects on handlers. Application code should not directly instantiate and use instances of :class:`Handler`. Instead, the :class:`Handler` class is a base class that @@ -946,16 +951,16 @@ Logged messages are formatted for presentation through instances of the use with the % operator and a dictionary. For formatting multiple messages in a batch, instances of -:class:`BufferingFormatter` can be used. In addition to the format string (which -is applied to each message in the batch), there is provision for header and -trailer format strings. +:class:`~handlers.BufferingFormatter` can be used. In addition to the format +string (which is applied to each message in the batch), there is provision for +header and trailer format strings. When filtering based on logger level and/or handler level is not enough, instances of :class:`Filter` can be added to both :class:`Logger` and -:class:`Handler` instances (through their :meth:`addFilter` method). Before -deciding to process a message further, both loggers and handlers consult all -their filters for permission. If any filter returns a false value, the message -is not processed further. +:class:`Handler` instances (through their :meth:`~Handler.addFilter` method). +Before deciding to process a message further, both loggers and handlers consult +all their filters for permission. If any filter returns a false value, the +message is not processed further. The basic :class:`Filter` functionality allows filtering by specific logger name. If this feature is used, messages sent to the named logger and its @@ -973,12 +978,14 @@ in production. This is so that errors which occur while handling logging events cause the application using logging to terminate prematurely. :class:`SystemExit` and :class:`KeyboardInterrupt` exceptions are never -swallowed. Other exceptions which occur during the :meth:`emit` method of a -:class:`Handler` subclass are passed to its :meth:`handleError` method. +swallowed. Other exceptions which occur during the :meth:`~Handler.emit` method +of a :class:`Handler` subclass are passed to its :meth:`~Handler.handleError` +method. -The default implementation of :meth:`handleError` in :class:`Handler` checks -to see if a module-level variable, :data:`raiseExceptions`, is set. If set, a -traceback is printed to :data:`sys.stderr`. If not set, the exception is swallowed. +The default implementation of :meth:`~Handler.handleError` in :class:`Handler` +checks to see if a module-level variable, :data:`raiseExceptions`, is set. If +set, a traceback is printed to :data:`sys.stderr`. If not set, the exception is +swallowed. .. note:: The default value of :data:`raiseExceptions` is ``True``. This is because during development, you typically want to be notified of any @@ -995,11 +1002,11 @@ Using arbitrary objects as messages In the preceding sections and examples, it has been assumed that the message passed when logging the event is a string. However, this is not the only possibility. You can pass an arbitrary object as a message, and its -:meth:`__str__` method will be called when the logging system needs to convert -it to a string representation. In fact, if you want to, you can avoid +:meth:`~object.__str__` method will be called when the logging system needs to +convert it to a string representation. In fact, if you want to, you can avoid computing a string representation altogether - for example, the -:class:`SocketHandler` emits an event by pickling it and sending it over the -wire. +:class:`~handlers.SocketHandler` emits an event by pickling it and sending it +over the wire. Optimization @@ -1008,9 +1015,10 @@ Optimization Formatting of message arguments is deferred until it cannot be avoided. However, computing the arguments passed to the logging method can also be expensive, and you may want to avoid doing it if the logger will just throw -away your event. To decide what to do, you can call the :meth:`isEnabledFor` -method which takes a level argument and returns true if the event would be -created by the Logger for that level of call. You can write code like this:: +away your event. To decide what to do, you can call the +:meth:`~Logger.isEnabledFor` method which takes a level argument and returns +true if the event would be created by the Logger for that level of call. +You can write code like this:: if logger.isEnabledFor(logging.DEBUG): logger.debug('Message with %s, %s', expensive_func1(), diff --git a/Doc/howto/pyporting.rst b/Doc/howto/pyporting.rst index a2e4173..9d7e859 100644 --- a/Doc/howto/pyporting.rst +++ b/Doc/howto/pyporting.rst @@ -10,238 +10,211 @@ Porting Python 2 Code to Python 3 With Python 3 being the future of Python while Python 2 is still in active use, it is good to have your project available for both major releases of - Python. This guide is meant to help you choose which strategy works best - for your project to support both Python 2 & 3 along with how to execute - that strategy. + Python. This guide is meant to help you figure out how best to support both + Python 2 & 3 simultaneously. If you are looking to port an extension module instead of pure Python code, please see :ref:`cporting-howto`. + If you would like to read one core Python developer's take on why Python 3 + came into existence, you can read Nick Coghlan's `Python 3 Q & A`_. -Choosing a Strategy -=================== - -When a project makes the decision that it's time to support both Python 2 & 3, -a decision needs to be made as to how to go about accomplishing that goal. -The chosen strategy will depend on how large the project's existing -codebase is and how much divergence you want from your Python 2 codebase from -your Python 3 one (e.g., starting a new version with Python 3). - -If your project is brand-new or does not have a large codebase, then you may -want to consider writing/porting :ref:`all of your code for Python 3 -and use 3to2 <use_3to2>` to port your code for Python 2. - -If you would prefer to maintain a codebase which is semantically **and** -syntactically compatible with Python 2 & 3 simultaneously, you can write -:ref:`use_same_source`. While this tends to lead to somewhat non-idiomatic -code, it does mean you keep a rapid development process for you, the developer. - -Finally, you do have the option of :ref:`using 2to3 <use_2to3>` to translate -Python 2 code into Python 3 code (with some manual help). This can take the -form of branching your code and using 2to3 to start a Python 3 branch. You can -also have users perform the translation at installation time automatically so -that you only have to maintain a Python 2 codebase. - -Regardless of which approach you choose, porting is not as hard or -time-consuming as you might initially think. You can also tackle the problem -piece-meal as a good portion of porting is simply updating your code to follow -current best practices in a Python 2/3 compatible way. - - -Universal Bits of Advice ------------------------- - -Regardless of what strategy you pick, there are a few things you should -consider. - -One is make sure you have a robust test suite. You need to make sure everything -continues to work, just like when you support a new minor version of Python. -This means making sure your test suite is thorough and is ported properly -between Python 2 & 3. You will also most likely want to use something like tox_ -to automate testing between both a Python 2 and Python 3 VM. - -Two, once your project has Python 3 support, make sure to add the proper -classifier on the Cheeseshop_ (PyPI_). To have your project listed as Python 3 -compatible it must have the -`Python 3 classifier <http://pypi.python.org/pypi?:action=browse&c=533>`_ -(from -http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/):: - - setup( - name='Your Library', - version='1.0', - classifiers=[ - # make sure to use :: Python *and* :: Python :: 3 so - # that pypi can list the package on the python 3 page - 'Programming Language :: Python', - 'Programming Language :: Python :: 3' - ], - packages=['yourlibrary'], - # make sure to add custom_fixers to the MANIFEST.in - include_package_data=True, - # ... - ) - - -Doing so will cause your project to show up in the -`Python 3 packages list -<http://pypi.python.org/pypi?:action=browse&c=533&show=all>`_. You will know -you set the classifier properly as visiting your project page on the Cheeseshop -will show a Python 3 logo in the upper-left corner of the page. - -Three, the six_ project provides a library which helps iron out differences -between Python 2 & 3. If you find there is a sticky point that is a continual -point of contention in your translation or maintenance of code, consider using -a source-compatible solution relying on six. If you have to create your own -Python 2/3 compatible solution, you can use ``sys.version_info[0] >= 3`` as a -guard. - -Four, read all the approaches. Just because some bit of advice applies to one -approach more than another doesn't mean that some advice doesn't apply to other -strategies. - -Five, drop support for older Python versions if possible. `Python 2.5`_ + If you prefer to read a (free) book on porting a project to Python 3, + consider reading `Porting to Python 3`_ by Lennart Regebro which should cover + much of what is discussed in this HOWTO. + + For help with porting, you can email the python-porting_ mailing list with + questions. + +The Short Version +================= + +* Decide what's the oldest version of Python 2 you want to support (if at all) +* Make sure you have a thorough test suite and use continuous integration + testing to make sure you stay compatible with the versions of Python you care + about +* If you have dependencies, check their Python 3 status using caniusepython3 + (`command-line tool <https://pypi.python.org/pypi/caniusepython3>`__, + `web app <https://caniusepython3.com/>`__) + +With that done, your options are: + +* If you are dropping Python 2 support, use 2to3_ to port to Python 3 +* If you are keeping Python 2 support, then start writing Python 2/3-compatible + code starting **TODAY** + + + If you have dependencies that have not been ported, reach out to them to port + their project while working to make your code compatible with Python 3 so + you're ready when your dependencies are all ported + + If all your dependencies have been ported (or you have none), go ahead and + port to Python 3 + +* If you are creating a new project that wants to have 2/3 compatibility, + code in Python 3 and then backport to Python 2 + + +Before You Begin +================ + +If your project is on the Cheeseshop_/PyPI_, make sure it has the proper +`trove classifiers`_ to signify what versions of Python it **currently** +supports. At minimum you should specify the major version(s), e.g. +``Programming Language :: Python :: 2`` if your project currently only supports +Python 2. It is preferrable that you be as specific as possible by listing every +major/minor version of Python that you support, e.g. if your project supports +Python 2.6 and 2.7, then you want the classifiers of:: + + Programming Language :: Python :: 2 + Programming Language :: Python :: 2.6 + Programming Language :: Python :: 2.7 + +Once your project supports Python 3 you will want to go back and add the +appropriate classifiers for Python 3 as well. This is important as setting the +``Programming Language :: Python :: 3`` classifier will lead to your project +being listed under the `Python 3 Packages`_ section of PyPI. + +Make sure you have a robust test suite. You need to +make sure everything continues to work, just like when you support a new +minor/feature release of Python. This means making sure your test suite is +thorough and is ported properly between Python 2 & 3 (consider using coverage_ +to measure that you have effective test coverage). You will also most likely +want to use something like tox_ to automate testing between all of your +supported versions of Python. You will also want to **port your tests first** so +that you can make sure that you detect breakage during the transition. Tests also +tend to be simpler than the code they are testing so it gives you an idea of how +easy it can be to port code. + +Drop support for older Python versions if possible. `Python 2.5`_ introduced a lot of useful syntax and libraries which have become idiomatic in Python 3. `Python 2.6`_ introduced future statements which makes compatibility much easier if you are going from Python 2 to 3. -`Python 2.7`_ continues the trend in the stdlib. So choose the newest version +`Python 2.7`_ continues the trend in the stdlib. Choose the newest version of Python which you believe can be your minimum support version and work from there. +Target the newest version of Python 3 that you can. Beyond just the usual +bugfixes, compatibility has continued to improve between Python 2 and 3 as time +has passed. E.g. Python 3.3 added back the ``u`` prefix for +strings, making source-compatible Python code easier to write. -.. _tox: http://codespeak.net/tox/ -.. _Cheeseshop: -.. _PyPI: http://pypi.python.org/ -.. _six: http://packages.python.org/six -.. _Python 2.7: http://www.python.org/2.7.x -.. _Python 2.6: http://www.python.org/2.6.x -.. _Python 2.5: http://www.python.org/2.5.x -.. _Python 2.4: http://www.python.org/2.4.x -.. _Python 2.3: http://www.python.org/2.3.x -.. _Python 2.2: http://www.python.org/2.2.x +Writing Source-Compatible Python 2/3 Code +========================================= -.. _use_3to2: +Over the years the Python community has discovered that the easiest way to +support both Python 2 and 3 in parallel is to write Python code that works in +either version. While this might sound counter-intuitive at first, it actually +is not difficult and typically only requires following some select +(non-idiomatic) practices and using some key projects to help make bridging +between Python 2 and 3 easier. -Python 3 and 3to2 -================= +Projects to Consider +-------------------- -If you are starting a new project or your codebase is small enough, you may -want to consider writing your code for Python 3 and backporting to Python 2 -using 3to2_. Thanks to Python 3 being more strict about things than Python 2 -(e.g., bytes vs. strings), the source translation can be easier and more -straightforward than from Python 2 to 3. Plus it gives you more direct -experience developing in Python 3 which, since it is the future of Python, is a -good thing long-term. +The lowest level library for supporting Python 2 & 3 simultaneously is six_. +Reading through its documentation will give you an idea of where exactly the +Python language changed between versions 2 & 3 and thus what you will want the +library to help you continue to support. -A drawback of this approach is that 3to2 is a third-party project. This means -that the Python core developers (and thus this guide) can make no promises -about how well 3to2 works at any time. There is nothing to suggest, though, -that 3to2 is not a high-quality project. +To help automate porting your code over to using six, you can use +modernize_. This project will attempt to rewrite your code to be as modern as +possible while using six to smooth out any differences between Python 2 & 3. +If you want to write your compatible code to feel more like Python 3 there is +the future_ project. It tries to provide backports of objects from Python 3 so +that you can use them from Python 2-compatible code, e.g. replacing the +``bytes`` type from Python 2 with the one from Python 3. +It also provides a translation script like modernize (its translation code is +actually partially based on it) to help start working with a pre-existing code +base. It is also unique in that its translation script will also port Python 3 +code backwards as well as Python 2 code forwards. -.. _3to2: https://bitbucket.org/amentajo/lib3to2/overview +Tips & Tricks +------------- -.. _use_2to3: - -Python 2 and 2to3 -================= - -Included with Python since 2.6, the 2to3_ tool (and :mod:`lib2to3` module) -helps with porting Python 2 to Python 3 by performing various source -translations. This is a perfect solution for projects which wish to branch -their Python 3 code from their Python 2 codebase and maintain them as -independent codebases. You can even begin preparing to use this approach -today by writing future-compatible Python code which works cleanly in -Python 2 in conjunction with 2to3; all steps outlined below will work -with Python 2 code up to the point when the actual use of 2to3 occurs. - -Use of 2to3 as an on-demand translation step at install time is also possible, -preventing the need to maintain a separate Python 3 codebase, but this approach -does come with some drawbacks. While users will only have to pay the -translation cost once at installation, you as a developer will need to pay the -cost regularly during development. If your codebase is sufficiently large -enough then the translation step ends up acting like a compilation step, -robbing you of the rapid development process you are used to with Python. -Obviously the time required to translate a project will vary, so do an -experimental translation just to see how long it takes to evaluate whether you -prefer this approach compared to using :ref:`use_same_source` or simply keeping -a separate Python 3 codebase. - -Below are the typical steps taken by a project which uses a 2to3-based approach -to supporting Python 2 & 3. - +To help with writing source-compatible code using one of the projects mentioned +in `Projects to Consider`_, consider following the below suggestions. Some of +them are handled by the suggested projects, so if you do use one of them then +read their documentation first to see which suggestions below will taken care of +for you. Support Python 2.7 ------------------- +////////////////// As a first step, make sure that your project is compatible with `Python 2.7`_. This is just good to do as Python 2.7 is the last release of Python 2 and thus will be used for a rather long time. It also allows for use of the ``-3`` flag -to Python to help discover places in your code which 2to3 cannot handle but are -known to cause issues. +to Python to help discover places in your code where compatibility might be an +issue (the ``-3`` flag is in Python 2.6 but Python 2.7 adds more warnings). Try to Support `Python 2.6`_ and Newer Only -------------------------------------------- +/////////////////////////////////////////// While not possible for all projects, if you can support `Python 2.6`_ and newer **only**, your life will be much easier. Various future statements, stdlib additions, etc. exist only in Python 2.6 and later which greatly assist in -porting to Python 3. But if you project must keep support for `Python 2.5`_ (or -even `Python 2.4`_) then it is still possible to port to Python 3. +supporting Python 3. But if you project must keep support for `Python 2.5`_ then +it is still possible to simultaneously support Python 3. Below are the benefits you gain if you only have to support Python 2.6 and newer. Some of these options are personal choice while others are **strongly** recommended (the ones that are more for personal choice are labeled as such). If you continue to support older versions of Python then you -at least need to watch out for situations that these solutions fix. +at least need to watch out for situations that these solutions fix and handle +them appropriately (which is where library help from e.g. six_ comes in handy). ``from __future__ import print_function`` ''''''''''''''''''''''''''''''''''''''''' -This is a personal choice. 2to3 handles the translation from the print -statement to the print function rather well so this is an optional step. This -future statement does help, though, with getting used to typing -``print('Hello, World')`` instead of ``print 'Hello, World'``. +It will not only get you used to typing ``print()`` as a function instead of a +statement, but it will also give you the various benefits the function has over +the Python 2 statement (six_ provides a function if you support Python 2.5 or +older). ``from __future__ import unicode_literals`` ''''''''''''''''''''''''''''''''''''''''''' -Another personal choice. You can always mark what you want to be a (unicode) -string with a ``u`` prefix to get the same effect. But regardless of whether -you use this future statement or not, you **must** make sure you know exactly -which Python 2 strings you want to be bytes, and which are to be strings. This -means you should, **at minimum** mark all strings that are meant to be text -strings with a ``u`` prefix if you do not use this future statement. +If you choose to use this future statement then all string literals in +Python 2 will be assumed to be Unicode (as is already the case in Python 3). +If you choose not to use this future statement then you should mark all of your +text strings with a ``u`` prefix and only support Python 3.3 or newer. But you +are **strongly** advised to do one or the other (six_ provides a function in +case you don't want to use the future statement **and** you want to support +Python 3.2 or older). -Bytes literals -'''''''''''''' +Bytes/string literals +''''''''''''''''''''' -This is a **very** important one. The ability to prefix Python 2 strings that -are meant to contain bytes with a ``b`` prefix help to very clearly delineate -what is and is not a Python 3 string. When you run 2to3 on code, all Python 2 -strings become Python 3 strings **unless** they are prefixed with ``b``. +This is a **very** important one. Prefix Python 2 strings that +are meant to contain bytes with a ``b`` prefix to very clearly delineate +what is and is not a Python 3 text string (six_ provides a function to use for +Python 2.5 compatibility). + +This point cannot be stressed enough: make sure you know what all of your string +literals in Python 2 are meant to be in Python 3. Any string literal that +should be treated as bytes should have the ``b`` prefix. Any string literal +that should be Unicode/text in Python 2 should either have the ``u`` literal +(supported, but ignored, in Python 3.3 and later) or you should have +``from __future__ import unicode_literals`` at the top of the file. But the key +point is you should know how Python 3 will treat every one one of your string +literals and you should mark them as appropriate. There are some differences between byte literals in Python 2 and those in Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2. -Probably the biggest "gotcha" is that indexing results in different values. In -Python 2, the value of ``b'py'[1]`` is ``'y'``, while in Python 3 it's ``121``. -You can avoid this disparity by always slicing at the size of a single element: -``b'py'[1:2]`` is ``'y'`` in Python 2 and ``b'y'`` in Python 3 (i.e., close -enough). +See the `Handle Common "Gotchas"`_ section for what to watch out for. -You cannot concatenate bytes and strings in Python 3. But since Python -2 has bytes aliased to ``str``, it will succeed: ``b'a' + u'b'`` works in -Python 2, but ``b'a' + 'b'`` in Python 3 is a :exc:`TypeError`. A similar issue -also comes about when doing comparisons between bytes and strings. +``from __future__ import absolute_import`` +'''''''''''''''''''''''''''''''''''''''''' +Discussed in more detail below, but you should use this future statement to +prevent yourself from accidentally using implicit relative imports. Supporting `Python 2.5`_ and Newer Only ---------------------------------------- +/////////////////////////////////////// If you are supporting `Python 2.5`_ and newer there are still some features of Python that you can utilize. @@ -251,7 +224,7 @@ Python that you can utilize. '''''''''''''''''''''''''''''''''''''''''' Implicit relative imports (e.g., importing ``spam.bacon`` from within -``spam.eggs`` with the statement ``import bacon``) does not work in Python 3. +``spam.eggs`` with the statement ``import bacon``) do not work in Python 3. This future statement moves away from that and allows the use of explicit relative imports (e.g., ``from . import bacon``). @@ -261,16 +234,74 @@ implicit ones. In `Python 2.6`_ explicit relative imports are available without the statement, but you still want the __future__ statement to prevent implicit relative imports. In `Python 2.7`_ the __future__ statement is not needed. In other words, unless you are only supporting Python 2.7 or a version earlier -than Python 2.5, use the __future__ statement. +than Python 2.5, use this __future__ statement. + + +Mark all Unicode strings with a ``u`` prefix +''''''''''''''''''''''''''''''''''''''''''''' + +While Python 2.6 has a ``__future__`` statement to automatically cause Python 2 +to treat all string literals as Unicode, Python 2.5 does not have that shortcut. +This means you should go through and mark all string literals with a ``u`` +prefix to turn them explicitly into text strings where appropriate and only +support Python 3.3 or newer. Otherwise use a project like six_ which provides a +function to pass all text string literals through. + + +Capturing the Currently Raised Exception +'''''''''''''''''''''''''''''''''''''''' + +In Python 2.5 and earlier the syntax to access the current exception is:: + + try: + raise Exception() + except Exception, exc: + # Current exception is 'exc'. + pass + +This syntax changed in Python 3 (and backported to `Python 2.6`_ and later) +to:: + + try: + raise Exception() + except Exception as exc: + # Current exception is 'exc'. + # In Python 3, 'exc' is restricted to the block; in Python 2.6/2.7 it will "leak". + pass + +Because of this syntax change you must change how you capture the current +exception in Python 2.5 and earlier to:: + try: + raise Exception() + except Exception: + import sys + exc = sys.exc_info()[1] + # Current exception is 'exc'. + pass + +You can get more information about the raised exception from +:func:`sys.exc_info` than simply the current exception instance, but you most +likely don't need it. + +.. note:: + In Python 3, the traceback is attached to the exception instance + through the ``__traceback__`` attribute. If the instance is saved in + a local variable that persists outside of the ``except`` block, the + traceback will create a reference cycle with the current frame and its + dictionary of local variables. This will delay reclaiming dead + resources until the next cyclic :term:`garbage collection` pass. + + In Python 2, this problem only occurs if you save the traceback itself + (e.g. the third element of the tuple returned by :func:`sys.exc_info`) + in a variable. Handle Common "Gotchas" ------------------------ +/////////////////////// -There are a few things that just consistently come up as sticking points for -people which 2to3 cannot handle automatically or can easily be done in Python 2 -to help modernize your code. +These are things to watch out for no matter what version of Python 2 you are +supporting which are not syntactic considerations. ``from __future__ import division`` @@ -327,9 +358,9 @@ One of the biggest issues people have when porting code to Python 3 is handling the bytes/string dichotomy. Because Python 2 allowed the ``str`` type to hold textual data, people have over the years been rather loose in their delineation of what ``str`` instances held text compared to bytes. In Python 3 you cannot -be so care-free anymore and need to properly handle the difference. The key +be so care-free anymore and need to properly handle the difference. The key to handling this issue is to make sure that **every** string literal in your -Python 2 code is either syntactically of functionally marked as either bytes or +Python 2 code is either syntactically or functionally marked as either bytes or text data. After this is done you then need to make sure your APIs are designed to either handle a specific type or made to be properly polymorphic. @@ -436,14 +467,7 @@ methods which have unpredictable results (e.g., infinite recursion if you happen to use the ``unicode(self).encode('utf8')`` idiom as the body of your ``__str__()`` method). -There are two ways to solve this issue. One is to use a custom 2to3 fixer. The -blog post at http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/ -specifies how to do this. That will allow 2to3 to change all instances of ``def -__unicode(self): ...`` to ``def __str__(self): ...``. This does require that you -define your ``__str__()`` method in Python 2 before your ``__unicode__()`` -method. - -The other option is to use a mixin class. This allows you to only define a +You can use a mixin class to work around this. This allows you to only define a ``__unicode__()`` method for your class and let the mixin derive ``__str__()`` for you (code from http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/):: @@ -486,6 +510,7 @@ sequence containing all arguments passed to the :meth:`__init__` method. Even better is to use the documented attributes the exception provides. + Don't use ``__getslice__`` & Friends '''''''''''''''''''''''''''''''''''' @@ -497,23 +522,23 @@ friends. Updating doctests ''''''''''''''''' -2to3_ will attempt to generate fixes for doctests that it comes across. It's -not perfect, though. If you wrote a monolithic set of doctests (e.g., a single -docstring containing all of your doctests), you should at least consider -breaking the doctests up into smaller pieces to make it more manageable to fix. -Otherwise it might very well be worth your time and effort to port your tests -to :mod:`unittest`. +Don't forget to make them Python 2/3 compatible as well. If you wrote a +monolithic set of doctests (e.g., a single docstring containing all of your +doctests), you should at least consider breaking the doctests up into smaller +pieces to make it more manageable to fix. Otherwise it might very well be worth +your time and effort to port your tests to :mod:`unittest`. -Update `map` for imbalanced input sequences -''''''''''''''''''''''''''''''''''''''''''' +Update ``map`` for imbalanced input sequences +''''''''''''''''''''''''''''''''''''''''''''' -With Python 2, `map` would pad input sequences of unequal length with -`None` values, returning a sequence as long as the longest input sequence. +With Python 2, when ``map`` was given more than one input sequence it would pad +the shorter sequences with `None` values, returning a sequence as long as the +longest input sequence. -With Python 3, if the input sequences to `map` are of unequal length, `map` +With Python 3, if the input sequences to ``map`` are of unequal length, ``map`` will stop at the termination of the shortest of the sequences. For full -compatibility with `map` from Python 2.x, also wrap the sequences in +compatibility with ``map`` from Python 2.x, wrap the sequence arguments in :func:`itertools.zip_longest`, e.g. ``map(func, *sequences)`` becomes ``list(map(func, itertools.zip_longest(*sequences)))``. @@ -522,176 +547,37 @@ Eliminate ``-3`` Warnings When you run your application's test suite, run it using the ``-3`` flag passed to Python. This will cause various warnings to be raised during execution about -things that 2to3 cannot handle automatically (e.g., modules that have been -removed). Try to eliminate those warnings to make your code even more portable -to Python 3. - - -Run 2to3 --------- - -Once you have made your Python 2 code future-compatible with Python 3, it's -time to use 2to3_ to actually port your code. - - -Manually -'''''''' - -To manually convert source code using 2to3_, you use the ``2to3`` script that -is installed with Python 2.6 and later.:: - - 2to3 <directory or file to convert> - -This will cause 2to3 to write out a diff with all of the fixers applied for the -converted source code. If you would like 2to3 to go ahead and apply the changes -you can pass it the ``-w`` flag:: - - 2to3 -w <stuff to convert> - -There are other flags available to control exactly which fixers are applied, -etc. +things that are semantic changes between Python 2 and 3. Try to eliminate those +warnings to make your code even more portable to Python 3. -During Installation -''''''''''''''''''' - -When a user installs your project for Python 3, you can have either -:mod:`distutils` or Distribute_ run 2to3_ on your behalf. -For distutils, use the following idiom:: - - try: # Python 3 - from distutils.command.build_py import build_py_2to3 as build_py - except ImportError: # Python 2 - from distutils.command.build_py import build_py - - setup(cmdclass = {'build_py': build_py}, - # ... - ) - -For Distribute:: - - setup(use_2to3=True, - # ... - ) - -This will allow you to not have to distribute a separate Python 3 version of -your project. It does require, though, that when you perform development that -you at least build your project and use the built Python 3 source for testing. - - -Verify & Test -------------- - -At this point you should (hopefully) have your project converted in such a way -that it works in Python 3. Verify it by running your unit tests and making sure -nothing has gone awry. If you miss something then figure out how to fix it in -Python 3, backport to your Python 2 code, and run your code through 2to3 again -to verify the fix transforms properly. - - -.. _2to3: http://docs.python.org/py3k/library/2to3.html -.. _Distribute: http://packages.python.org/distribute/ - - -.. _use_same_source: - -Python 2/3 Compatible Source -============================ - -While it may seem counter-intuitive, you can write Python code which is -source-compatible between Python 2 & 3. It does lead to code that is not -entirely idiomatic Python (e.g., having to extract the currently raised -exception from ``sys.exc_info()[1]``), but it can be run under Python 2 -**and** Python 3 without using 2to3_ as a translation step (although the tool -should be used to help find potential portability problems). This allows you to -continue to have a rapid development process regardless of whether you are -developing under Python 2 or Python 3. Whether this approach or using -:ref:`use_2to3` works best for you will be a per-project decision. - -To get a complete idea of what issues you will need to deal with, see the -`What's New in Python 3.0`_. Others have reorganized the data in other formats -such as http://docs.pythonsprints.com/python3_porting/py-porting.html . - -The following are some steps to take to try to support both Python 2 & 3 from -the same source code. - - -.. _What's New in Python 3.0: http://docs.python.org/release/3.0/whatsnew/3.0.html - - -Follow The Steps for Using 2to3_ --------------------------------- - -All of the steps outlined in how to -:ref:`port Python 2 code with 2to3 <use_2to3>` apply -to creating a Python 2/3 codebase. This includes trying only support Python 2.6 -or newer (the :mod:`__future__` statements work in Python 3 without issue), -eliminating warnings that are triggered by ``-3``, etc. - -You should even consider running 2to3_ over your code (without committing the -changes). This will let you know where potential pain points are within your -code so that you can fix them properly before they become an issue. - - -Use six_ --------- - -The six_ project contains many things to help you write portable Python code. -You should make sure to read its documentation from beginning to end and use -any and all features it provides. That way you will minimize any mistakes you -might make in writing cross-version code. - - -Capturing the Currently Raised Exception ----------------------------------------- - -One change between Python 2 and 3 that will require changing how you code (if -you support `Python 2.5`_ and earlier) is -accessing the currently raised exception. In Python 2.5 and earlier the syntax -to access the current exception is:: - - try: - raise Exception() - except Exception, exc: - # Current exception is 'exc' - pass +Alternative Approaches +====================== -This syntax changed in Python 3 (and backported to `Python 2.6`_ and later) -to:: +While supporting Python 2 & 3 simultaneously is typically the preferred choice +by people so that they can continue to improve code and have it work for the +most number of users, your life may be easier if you only have to support one +major version of Python going forward. - try: - raise Exception() - except Exception as exc: - # Current exception is 'exc' - # In Python 3, 'exc' is restricted to the block; Python 2.6 will "leak" - pass +Supporting Only Python 3 Going Forward From Python 2 Code +--------------------------------------------------------- -Because of this syntax change you must change to capturing the current -exception to:: +If you have Python 2 code but going forward only want to improve it as Python 3 +code, then you can use 2to3_ to translate your Python 2 code to Python 3 code. +This is only recommended, though, if your current version of your project is +going into maintenance mode and you want all new features to be exclusive to +Python 3. - try: - raise Exception() - except Exception: - import sys - exc = sys.exc_info()[1] - # Current exception is 'exc' - pass -You can get more information about the raised exception from -:func:`sys.exc_info` than simply the current exception instance, but you most -likely don't need it. +Backporting Python 3 code to Python 2 +------------------------------------- -.. note:: - In Python 3, the traceback is attached to the exception instance - through the ``__traceback__`` attribute. If the instance is saved in - a local variable that persists outside of the ``except`` block, the - traceback will create a reference cycle with the current frame and its - dictionary of local variables. This will delay reclaiming dead - resources until the next cyclic :term:`garbage collection` pass. - - In Python 2, this problem only occurs if you save the traceback itself - (e.g. the third element of the tuple returned by :func:`sys.exc_info`) - in a variable. +If you have Python 3 code and have little interest in supporting Python 2 you +can use 3to2_ to translate from Python 3 code to Python 2 code. This is only +recommended if you don't plan to heavily support Python 2 users. Otherwise +write your code for Python 3 and then backport as far back as you want. This +is typically easier than going from Python 2 to 3 as you will have worked out +any difficulties with e.g. bytes/strings, etc. Other Resources @@ -699,17 +585,41 @@ Other Resources The authors of the following blog posts, wiki pages, and books deserve special thanks for making public their tips for porting Python 2 code to Python 3 (and -thus helping provide information for this document): +thus helping provide information for this document and its various revisions +over the years): +* http://wiki.python.org/moin/PortingPythonToPy3k * http://python3porting.com/ * http://docs.pythonsprints.com/python3_porting/py-porting.html * http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/ * http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html * http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/ * http://lucumr.pocoo.org/2010/2/11/porting-to-python-3-a-guide/ -* http://wiki.python.org/moin/PortingPythonToPy3k +* https://wiki.ubuntu.com/Python/3 If you feel there is something missing from this document that should be added, please email the python-porting_ mailing list. + + +.. _2to3: http://docs.python.org/2/library/2to3.html +.. _3to2: https://pypi.python.org/pypi/3to2 +.. _Cheeseshop: PyPI_ +.. _coverage: https://pypi.python.org/pypi/coverage +.. _future: http://python-future.org/ +.. _modernize: https://github.com/mitsuhiko/python-modernize +.. _Porting to Python 3: http://python3porting.com/ +.. _PyPI: http://pypi.python.org/ +.. _Python 2.2: http://www.python.org/2.2.x +.. _Python 2.5: http://www.python.org/2.5.x +.. _Python 2.6: http://www.python.org/2.6.x +.. _Python 2.7: http://www.python.org/2.7.x +.. _Python 2.5: http://www.python.org/2.5.x +.. _Python 3.3: http://www.python.org/3.3.x +.. _Python 3 Packages: https://pypi.python.org/pypi?:action=browse&c=533&show=all +.. _Python 3 Q & A: http://ncoghlan-devs-python-notes.readthedocs.org/en/latest/python3/questions_and_answers.html .. _python-porting: http://mail.python.org/mailman/listinfo/python-porting +.. _six: https://pypi.python.org/pypi/six +.. _tox: https://pypi.python.org/pypi/tox +.. _trove classifiers: https://pypi.python.org/pypi?%3Aaction=list_classifiers + diff --git a/Doc/howto/regex.rst b/Doc/howto/regex.rst index 9adfa85..5203e53 100644 --- a/Doc/howto/regex.rst +++ b/Doc/howto/regex.rst @@ -104,13 +104,25 @@ you can still match them in patterns; for example, if you need to match a ``[`` or ``\``, you can precede them with a backslash to remove their special meaning: ``\[`` or ``\\``. -Some of the special sequences beginning with ``'\'`` represent predefined sets -of characters that are often useful, such as the set of digits, the set of -letters, or the set of anything that isn't whitespace. The following predefined -special sequences are a subset of those available. The equivalent classes are -for bytes patterns. For a complete list of sequences and expanded class -definitions for Unicode string patterns, see the last part of -:ref:`Regular Expression Syntax <re-syntax>`. +Some of the special sequences beginning with ``'\'`` represent +predefined sets of characters that are often useful, such as the set +of digits, the set of letters, or the set of anything that isn't +whitespace. + +Let's take an example: ``\w`` matches any alphanumeric character. If +the regex pattern is expressed in bytes, this is equivalent to the +class ``[a-zA-Z0-9_]``. If the regex pattern is a string, ``\w`` will +match all the characters marked as letters in the Unicode database +provided by the :mod:`unicodedata` module. You can use the more +restricted definition of ``\w`` in a string pattern by supplying the +:const:`re.ASCII` flag when compiling the regular expression. + +The following list of special sequences isn't complete. For a complete +list of sequences and expanded class definitions for Unicode string +patterns, see the last part of :ref:`Regular Expression Syntax +<re-syntax>` in the Standard Library reference. In general, the +Unicode versions match any character that's in the appropriate +category in the Unicode database. ``\d`` Matches any decimal digit; this is equivalent to the class ``[0-9]``. @@ -160,9 +172,8 @@ previous character can be matched zero or more times, instead of exactly once. For example, ``ca*t`` will match ``ct`` (0 ``a`` characters), ``cat`` (1 ``a``), ``caaat`` (3 ``a`` characters), and so forth. The RE engine has various internal limitations stemming from the size of C's ``int`` type that will -prevent it from matching over 2 billion ``a`` characters; you probably don't -have enough memory to construct a string that large, so you shouldn't run into -that limit. +prevent it from matching over 2 billion ``a`` characters; patterns +are usually not written to match that much data. Repetitions such as ``*`` are :dfn:`greedy`; when repeating a RE, the matching engine will try to repeat it as many times as possible. If later portions of the @@ -353,7 +364,7 @@ for a complete listing. | | returns them as an :term:`iterator`. | +------------------+-----------------------------------------------+ -:meth:`match` and :meth:`search` return ``None`` if no match can be found. If +:meth:`~re.regex.match` and :meth:`~re.regex.search` return ``None`` if no match can be found. If they're successful, a :ref:`match object <match-objects>` instance is returned, containing information about the match: where it starts and ends, the substring it matched, and more. @@ -419,8 +430,8 @@ Trying these methods will soon clarify their meaning:: >>> m.span() (0, 5) -:meth:`group` returns the substring that was matched by the RE. :meth:`start` -and :meth:`end` return the starting and ending index of the match. :meth:`span` +:meth:`~re.match.group` returns the substring that was matched by the RE. :meth:`~re.match.start` +and :meth:`~re.match.end` return the starting and ending index of the match. :meth:`~re.match.span` returns both start and end indexes in a single tuple. Since the :meth:`match` method only checks if the RE matches at the start of a string, :meth:`start` will always be zero. However, the :meth:`search` method of patterns @@ -448,14 +459,14 @@ In actual programs, the most common style is to store the print('No match') Two pattern methods return all of the matches for a pattern. -:meth:`findall` returns a list of matching strings:: +:meth:`~re.regex.findall` returns a list of matching strings:: >>> p = re.compile('\d+') >>> p.findall('12 drummers drumming, 11 pipers piping, 10 lords a-leaping') ['12', '11', '10'] :meth:`findall` has to create the entire list before it can be returned as the -result. The :meth:`finditer` method returns a sequence of +result. The :meth:`~re.regex.finditer` method returns a sequence of :ref:`match object <match-objects>` instances as an :term:`iterator`:: >>> iterator = p.finditer('12 drummers drumming, 11 ... 10 ...') @@ -473,9 +484,9 @@ Module-Level Functions ---------------------- You don't have to create a pattern object and call its methods; the -:mod:`re` module also provides top-level functions called :func:`match`, -:func:`search`, :func:`findall`, :func:`sub`, and so forth. These functions -take the same arguments as the corresponding pattern method, with +:mod:`re` module also provides top-level functions called :func:`~re.match`, +:func:`~re.search`, :func:`~re.findall`, :func:`~re.sub`, and so forth. These functions +take the same arguments as the corresponding pattern method with the RE string added as the first argument, and still return either ``None`` or a :ref:`match object <match-objects>` instance. :: @@ -485,26 +496,15 @@ the RE string added as the first argument, and still return either ``None`` or a <_sre.SRE_Match object at 0x...> Under the hood, these functions simply create a pattern object for you -and call the appropriate method on it. They also store the compiled object in a -cache, so future calls using the same RE are faster. +and call the appropriate method on it. They also store the compiled +object in a cache, so future calls using the same RE won't need to +parse the pattern again and again. Should you use these module-level functions, or should you get the -pattern and call its methods yourself? That choice depends on how -frequently the RE will be used, and on your personal coding style. If the RE is -being used at only one point in the code, then the module functions are probably -more convenient. If a program contains a lot of regular expressions, or re-uses -the same ones in several locations, then it might be worthwhile to collect all -the definitions in one place, in a section of code that compiles all the REs -ahead of time. To take an example from the standard library, here's an extract -from the now-defunct Python 2 standard :mod:`xmllib` module:: - - ref = re.compile( ... ) - entityref = re.compile( ... ) - charref = re.compile( ... ) - starttagopen = re.compile( ... ) - -I generally prefer to work with the compiled object, even for one-time uses, but -few people will be as much of a purist about this as I am. +pattern and call its methods yourself? If you're accessing a regex +within a loop, pre-compiling it will save a few function calls. +Outside of loops, there's not much difference thanks to the internal +cache. Compilation Flags @@ -524,6 +524,10 @@ of each one. +---------------------------------+--------------------------------------------+ | Flag | Meaning | +=================================+============================================+ +| :const:`ASCII`, :const:`A` | Makes several escapes like ``\w``, ``\b``, | +| | ``\s`` and ``\d`` match only on ASCII | +| | characters with the respective property. | ++---------------------------------+--------------------------------------------+ | :const:`DOTALL`, :const:`S` | Make ``.`` match any character, including | | | newlines | +---------------------------------+--------------------------------------------+ @@ -535,11 +539,7 @@ of each one. | | ``$`` | +---------------------------------+--------------------------------------------+ | :const:`VERBOSE`, :const:`X` | Enable verbose REs, which can be organized | -| | more cleanly and understandably. | -+---------------------------------+--------------------------------------------+ -| :const:`ASCII`, :const:`A` | Makes several escapes like ``\w``, ``\b``, | -| | ``\s`` and ``\d`` match only on ASCII | -| | characters with the respective property. | +| (for 'extended') | more cleanly and understandably. | +---------------------------------+--------------------------------------------+ @@ -558,7 +558,8 @@ of each one. LOCALE :noindex: - Make ``\w``, ``\W``, ``\b``, and ``\B``, dependent on the current locale. + Make ``\w``, ``\W``, ``\b``, and ``\B``, dependent on the current locale + instead of the Unicode database. Locales are a feature of the C library intended to help in writing programs that take account of language differences. For example, if you're processing French @@ -851,11 +852,10 @@ keep track of the group numbers. There are two features which help with this problem. Both of them use a common syntax for regular expression extensions, so we'll look at that first. -Perl 5 added several additional features to standard regular expressions, and -the Python :mod:`re` module supports most of them. It would have been -difficult to choose new single-keystroke metacharacters or new special sequences -beginning with ``\`` to represent the new features without making Perl's regular -expressions confusingly different from standard REs. If you chose ``&`` as a +Perl 5 is well-known for its powerful additions to standard regular expressions. +For these new features the Perl developers couldn't choose new single-keystroke metacharacters +or new special sequences beginning with ``\`` without making Perl's regular +expressions confusingly different from standard REs. If they chose ``&`` as a new metacharacter, for example, old expressions would be assuming that ``&`` was a regular character and wouldn't have escaped it by writing ``\&`` or ``[&]``. @@ -867,22 +867,15 @@ what extension is being used, so ``(?=foo)`` is one thing (a positive lookahead assertion) and ``(?:foo)`` is something else (a non-capturing group containing the subexpression ``foo``). -Python adds an extension syntax to Perl's extension syntax. If the first -character after the question mark is a ``P``, you know that it's an extension -that's specific to Python. Currently there are two such extensions: -``(?P<name>...)`` defines a named group, and ``(?P=name)`` is a backreference to -a named group. If future versions of Perl 5 add similar features using a -different syntax, the :mod:`re` module will be changed to support the new -syntax, while preserving the Python-specific syntax for compatibility's sake. - -Now that we've looked at the general extension syntax, we can return to the -features that simplify working with groups in complex REs. Since groups are -numbered from left to right and a complex expression may use many groups, it can -become difficult to keep track of the correct numbering. Modifying such a -complex RE is annoying, too: insert a new group near the beginning and you -change the numbers of everything that follows it. - -Sometimes you'll want to use a group to collect a part of a regular expression, +Python supports several of Perl's extensions and adds an extension +syntax to Perl's extension syntax. If the first character after the +question mark is a ``P``, you know that it's an extension that's +specific to Python. + +Now that we've looked at the general extension syntax, we can return +to the features that simplify working with groups in complex REs. + +Sometimes you'll want to use a group to denote a part of a regular expression, but aren't interested in retrieving the group's contents. You can make this fact explicit by using a non-capturing group: ``(?:...)``, where you can replace the ``...`` with any other regular expression. :: @@ -908,7 +901,7 @@ numbers, groups can be referenced by a name. The syntax for a named group is one of the Python-specific extensions: ``(?P<name>...)``. *name* is, obviously, the name of the group. Named groups -also behave exactly like capturing groups, and additionally associate a name +behave exactly like capturing groups, and additionally associate a name with a group. The :ref:`match object <match-objects>` methods that deal with capturing groups all accept either integers that refer to the group by number or strings that contain the desired group's name. Named groups are still @@ -975,9 +968,10 @@ The pattern to match this is quite simple: ``.*[.].*$`` Notice that the ``.`` needs to be treated specially because it's a -metacharacter; I've put it inside a character class. Also notice the trailing -``$``; this is added to ensure that all the rest of the string must be included -in the extension. This regular expression matches ``foo.bar`` and +metacharacter, so it's inside a character class to only match that +specific character. Also notice the trailing ``$``; this is added to +ensure that all the rest of the string must be included in the +extension. This regular expression matches ``foo.bar`` and ``autoexec.bat`` and ``sendmail.cf`` and ``printers.conf``. Now, consider complicating the problem a bit; what if you want to match @@ -1051,7 +1045,7 @@ Splitting Strings The :meth:`split` method of a pattern splits a string apart wherever the RE matches, returning a list of the pieces. It's similar to the :meth:`split` method of strings but provides much more generality in the -delimiters that you can split by; :meth:`split` only supports splitting by +delimiters that you can split by; string :meth:`split` only supports splitting by whitespace or by a fixed string. As you'd expect, there's a module-level :func:`re.split` function, too. @@ -1106,7 +1100,6 @@ Another common task is to find all the matches for a pattern, and replace them with a different string. The :meth:`sub` method takes a replacement value, which can be either a string or a function, and the string to be processed. - .. method:: .sub(replacement, string[, count=0]) :noindex: @@ -1362,4 +1355,3 @@ and doesn't contain any Python material at all, so it won't be useful as a reference for programming in Python. (The first edition covered Python's now-removed :mod:`regex` module, which won't help you much.) Consider checking it out from your library. - diff --git a/Doc/howto/sockets.rst b/Doc/howto/sockets.rst index 279bb3e..7a9b0ed 100644 --- a/Doc/howto/sockets.rst +++ b/Doc/howto/sockets.rst @@ -19,14 +19,8 @@ Sockets ======= -Sockets are used nearly everywhere, but are one of the most severely -misunderstood technologies around. This is a 10,000 foot overview of sockets. -It's not really a tutorial - you'll still have work to do in getting things -working. It doesn't cover the fine points (and there are a lot of them), but I -hope it will give you enough background to begin using them decently. - -I'm only going to talk about INET sockets, but they account for at least 99% of -the sockets in use. And I'll only talk about STREAM sockets - unless you really +I'm only going to talk about INET (i.e. IPv4) sockets, but they account for at least 99% of +the sockets in use. And I'll only talk about STREAM (i.e. TCP) sockets - unless you really know what you're doing (in which case this HOWTO isn't for you!), you'll get better behavior and performance from a STREAM socket than anything else. I will try to clear up the mystery of what a socket is, as well as some hints on how to @@ -84,9 +78,11 @@ creates a "server socket":: serversocket.listen(5) A couple things to notice: we used ``socket.gethostname()`` so that the socket -would be visible to the outside world. If we had used ``s.bind(('', 80))`` or -``s.bind(('localhost', 80))`` or ``s.bind(('127.0.0.1', 80))`` we would still -have a "server" socket, but one that was only visible within the same machine. +would be visible to the outside world. If we had used ``s.bind(('localhost', +80))`` or ``s.bind(('127.0.0.1', 80))`` we would still have a "server" socket, +but one that was only visible within the same machine. ``s.bind(('', 80))`` +specifies that the socket is reachable by any address the machine happens to +have. A second thing to note: low number ports are usually reserved for "well known" services (HTTP, SNMP etc). If you're playing around, use a nice high number (4 @@ -208,10 +204,10 @@ length message:: totalsent = totalsent + sent def myreceive(self): - msg = '' + msg = b'' while len(msg) < MSGLEN: chunk = self.sock.recv(MSGLEN-len(msg)) - if chunk == '': + if chunk == b'': raise RuntimeError("socket connection broken") msg = msg + chunk return msg @@ -371,12 +367,6 @@ have created a new socket to ``connect`` to someone else, put it in the potential_writers list. If it shows up in the writable list, you have a decent chance that it has connected. -One very nasty problem with ``select``: if somewhere in those input lists of -sockets is one which has died a nasty death, the ``select`` will fail. You then -need to loop through every single damn socket in all those lists and do a -``select([sock],[],[],0)`` until you find the bad one. That timeout of 0 means -it won't take long, but it's ugly. - Actually, ``select`` can be handy even with blocking sockets. It's one way of determining whether you will block - the socket returns as readable when there's something in the buffers. However, this still doesn't help with the problem of @@ -386,26 +376,6 @@ determining whether the other end is done, or just busy with something else. files. Don't try this on Windows. On Windows, ``select`` works with sockets only. Also note that in C, many of the more advanced socket options are done differently on Windows. In fact, on Windows I usually use threads (which work -very, very well) with my sockets. Face it, if you want any kind of performance, -your code will look very different on Windows than on Unix. - - -Performance ------------ +very, very well) with my sockets. -There's no question that the fastest sockets code uses non-blocking sockets and -select to multiplex them. You can put together something that will saturate a -LAN connection without putting any strain on the CPU. - -The trouble is that an app written this way can't do much of anything else - -it needs to be ready to shuffle bytes around at all times. Assuming that your -app is actually supposed to do something more than that, threading is the -optimal solution, (and using non-blocking sockets will be faster than using -blocking sockets). - -Finally, remember that even though blocking sockets are somewhat slower than -non-blocking, in many cases they are the "right" solution. After all, if your -app is driven by the data it receives over a socket, there's not much sense in -complicating the logic just so your app can wait on ``select`` instead of -``recv``. diff --git a/Doc/howto/unicode.rst b/Doc/howto/unicode.rst index b309f60..9d48a78 100644 --- a/Doc/howto/unicode.rst +++ b/Doc/howto/unicode.rst @@ -28,15 +28,15 @@ which required accented characters couldn't be faithfully represented in ASCII. as 'naïve' and 'café', and some publications have house styles which require spellings such as 'coöperate'.) -For a while people just wrote programs that didn't display accents. I remember -looking at Apple ][ BASIC programs, published in French-language publications in -the mid-1980s, that had lines like these:: +For a while people just wrote programs that didn't display accents. +In the mid-1980s an Apple II BASIC program written by a French speaker +might have lines like these:: PRINT "FICHIER EST COMPLETE." PRINT "CARACTERE NON ACCEPTE." -Those messages should contain accents, and they just look wrong to someone who -can read French. +Those messages should contain accents (completé, caractère, accepté), +and they just look wrong to someone who can read French. In the 1980s, almost all personal computers were 8-bit, meaning that bytes could hold values ranging from 0 to 255. ASCII codes only went up to 127, so some @@ -69,9 +69,12 @@ There's a related ISO standard, ISO 10646. Unicode and ISO 10646 were originally separate efforts, but the specifications were merged with the 1.1 revision of Unicode. -(This discussion of Unicode's history is highly simplified. I don't think the -average Python programmer needs to worry about the historical details; consult -the Unicode consortium site listed in the References for more information.) +(This discussion of Unicode's history is highly simplified. The +precise historical details aren't necessary for understanding how to +use Unicode effectively, but if you're curious, consult the Unicode +consortium site listed in the References or +the `Wikipedia entry for Unicode <http://en.wikipedia.org/wiki/Unicode#History>`_ +for more information.) Definitions @@ -216,10 +219,8 @@ Unicode character tables. Another `good introductory article <http://www.joelonsoftware.com/articles/Unicode.html>`_ was written by Joel Spolsky. -If this introduction didn't make things clear to you, you should try reading this -alternate article before continuing. - -.. Jason Orendorff XXX http://www.jorendorff.com/articles/unicode/ is broken +If this introduction didn't make things clear to you, you should try +reading this alternate article before continuing. Wikipedia entries are often helpful; see the entries for "`character encoding <http://en.wikipedia.org/wiki/Character_encoding>`_" and `UTF-8 @@ -239,8 +240,31 @@ Since Python 3.0, the language features a :class:`str` type that contain Unicode characters, meaning any string created using ``"unicode rocks!"``, ``'unicode rocks!'``, or the triple-quoted string syntax is stored as Unicode. -To insert a non-ASCII Unicode character, e.g., any letters with -accents, one can use escape sequences in their string literals as such:: +The default encoding for Python source code is UTF-8, so you can simply +include a Unicode character in a string literal:: + + try: + with open('/tmp/input.txt', 'r') as f: + ... + except IOError: + # 'File not found' error message. + print("Fichier non trouvé") + +You can use a different encoding from UTF-8 by putting a specially-formatted +comment as the first or second line of the source code:: + + # -*- coding: <encoding name> -*- + +Side note: Python 3 also supports using Unicode characters in identifiers:: + + répertoire = "/tmp/records.log" + with open(répertoire, "w") as f: + f.write("test\n") + +If you can't enter a particular character in your editor or want to +keep the source code ASCII-only for some reason, you can also use +escape sequences in string literals. (Depending on your system, +you may see the actual capital-delta glyph instead of a \u escape.) :: >>> "\N{GREEK CAPITAL LETTER DELTA}" # Using the character name '\u0394' @@ -251,7 +275,7 @@ accents, one can use escape sequences in their string literals as such:: In addition, one can create a string using the :func:`~bytes.decode` method of :class:`bytes`. This method takes an *encoding* argument, such as ``UTF-8``, -and optionally, an *errors* argument. +and optionally an *errors* argument. The *errors* argument specifies the response when the input string can't be converted according to the encoding's rules. Legal values for this argument are @@ -295,11 +319,15 @@ Converting to Bytes The opposite method of :meth:`bytes.decode` is :meth:`str.encode`, which returns a :class:`bytes` representation of the Unicode string, encoded in the -requested *encoding*. The *errors* parameter is the same as the parameter of -the :meth:`~bytes.decode` method, with one additional possibility; as well as -``'strict'``, ``'ignore'``, and ``'replace'`` (which in this case inserts a -question mark instead of the unencodable character), you can also pass -``'xmlcharrefreplace'`` which uses XML's character references. +requested *encoding*. + +The *errors* parameter is the same as the parameter of the +:meth:`~bytes.decode` method but supports a few more possible handlers. As well as +``'strict'``, ``'ignore'``, and ``'replace'`` (which in this case +inserts a question mark instead of the unencodable character), there is +also ``'xmlcharrefreplace'`` (inserts an XML character reference) and +``backslashreplace`` (inserts a ``\uNNNN`` escape sequence). + The following example shows the different results:: >>> u = chr(40960) + 'abcd' + chr(1972) @@ -316,16 +344,15 @@ The following example shows the different results:: b'?abcd?' >>> u.encode('ascii', 'xmlcharrefreplace') b'ꀀabcd޴' + >>> u.encode('ascii', 'backslashreplace') + b'\\ua000abcd\\u07b4' -.. XXX mention the surrogate* error handlers - -The low-level routines for registering and accessing the available encodings are -found in the :mod:`codecs` module. However, the encoding and decoding functions -returned by this module are usually more low-level than is comfortable, so I'm -not going to describe the :mod:`codecs` module here. If you need to implement a -completely new encoding, you'll need to learn about the :mod:`codecs` module -interfaces, but implementing encodings is a specialized task that also won't be -covered here. Consult the Python documentation to learn more about this module. +The low-level routines for registering and accessing the available +encodings are found in the :mod:`codecs` module. Implementing new +encodings also requires understanding the :mod:`codecs` module. +However, the encoding and decoding functions returned by this module +are usually more low-level than is comfortable, and writing new encodings +is a specialized task, so the module won't be covered in this HOWTO. Unicode Literals in Python Source Code @@ -415,25 +442,61 @@ These are grouped into categories such as "Letter", "Number", "Punctuation", or from the above output, ``'Ll'`` means 'Letter, lowercase', ``'No'`` means "Number, other", ``'Mn'`` is "Mark, nonspacing", and ``'So'`` is "Symbol, other". See -<http://www.unicode.org/reports/tr44/#General_Category_Values> for a +`the General Category Values section of the Unicode Character Database documentation <http://www.unicode.org/reports/tr44/#General_Category_Values>`_ for a list of category codes. + +Unicode Regular Expressions +--------------------------- + +The regular expressions supported by the :mod:`re` module can be provided +either as bytes or strings. Some of the special character sequences such as +``\d`` and ``\w`` have different meanings depending on whether +the pattern is supplied as bytes or a string. For example, +``\d`` will match the characters ``[0-9]`` in bytes but +in strings will match any character that's in the ``'Nd'`` category. + +The string in this example has the number 57 written in both Thai and +Arabic numerals:: + + import re + p = re.compile('\d+') + + s = "Over \u0e55\u0e57 57 flavours" + m = p.search(s) + print(repr(m.group())) + +When executed, ``\d+`` will match the Thai numerals and print them +out. If you supply the :const:`re.ASCII` flag to +:func:`~re.compile`, ``\d+`` will match the substring "57" instead. + +Similarly, ``\w`` matches a wide variety of Unicode characters but +only ``[a-zA-Z0-9_]`` in bytes or if :const:`re.ASCII` is supplied, +and ``\s`` will match either Unicode whitespace characters or +``[ \t\n\r\f\v]``. + + References ---------- +.. comment should these be mentioned earlier, e.g. at the start of the "introduction to Unicode" first section? + +Some good alternative discussions of Python's Unicode support are: + +* `Processing Text Files in Python 3 <http://python-notes.curiousefficiency.org/en/latest/python3/text_file_processing.html>`_, by Nick Coghlan. +* `Pragmatic Unicode <http://nedbatchelder.com/text/unipain.html>`_, a PyCon 2012 presentation by Ned Batchelder. + The :class:`str` type is described in the Python library reference at -:ref:`typesseq`. +:ref:`textseq`. The documentation for the :mod:`unicodedata` module. The documentation for the :mod:`codecs` module. -Marc-André Lemburg gave a presentation at EuroPython 2002 titled "Python and -Unicode". A PDF version of his slides is available at -<http://downloads.egenix.com/python/Unicode-EPC2002-Talk.pdf>, and is an -excellent overview of the design of Python's Unicode features (based on Python -2, where the Unicode string type is called ``unicode`` and literals start with -``u``). +Marc-André Lemburg gave `a presentation titled "Python and Unicode" (PDF slides) <http://downloads.egenix.com/python/Unicode-EPC2002-Talk.pdf>`_ at +EuroPython 2002. The slides are an excellent overview of the design +of Python 2's Unicode features (where the Unicode string type is +called ``unicode`` and literals start with ``u``). Reading and Writing Unicode Data @@ -451,16 +514,16 @@ columns and can return Unicode values from an SQL query. Unicode data is usually converted to a particular encoding before it gets written to disk or sent over a socket. It's possible to do all the work -yourself: open a file, read an 8-bit bytes object from it, and convert the string +yourself: open a file, read an 8-bit bytes object from it, and convert the bytes with ``bytes.decode(encoding)``. However, the manual approach is not recommended. One problem is the multi-byte nature of encodings; one Unicode character can be represented by several bytes. If you want to read the file in arbitrary-sized -chunks (say, 1k or 4k), you need to write error-handling code to catch the case +chunks (say, 1024 or 4096 bytes), you need to write error-handling code to catch the case where only part of the bytes encoding a single Unicode character are read at the end of a chunk. One solution would be to read the entire file into memory and then perform the decoding, but that prevents you from working with files that -are extremely large; if you need to read a 2GB file, you need 2GB of RAM. +are extremely large; if you need to read a 2 GiB file, you need 2 GiB of RAM. (More, really, since for at least a moment you'd need to have both the encoded string and its Unicode version in memory.) @@ -468,13 +531,14 @@ The solution would be to use the low-level decoding interface to catch the case of partial coding sequences. The work of implementing this has already been done for you: the built-in :func:`open` function can return a file-like object that assumes the file's contents are in a specified encoding and accepts Unicode -parameters for methods such as :meth:`read` and :meth:`write`. This works through -:func:`open`\'s *encoding* and *errors* parameters which are interpreted just -like those in :meth:`str.encode` and :meth:`bytes.decode`. +parameters for methods such as :meth:`~io.TextIOBase.read` and +:meth:`~io.TextIOBase.write`. This works through :func:`open`\'s *encoding* and +*errors* parameters which are interpreted just like those in :meth:`str.encode` +and :meth:`bytes.decode`. Reading Unicode from a file is therefore simple:: - with open('unicode.rst', encoding='utf-8') as f: + with open('unicode.txt', encoding='utf-8') as f: for line in f: print(repr(line)) @@ -512,7 +576,7 @@ example, Mac OS X uses UTF-8 while Windows uses a configurable encoding; on Windows, Python uses the name "mbcs" to refer to whatever the currently configured encoding is. On Unix systems, there will only be a filesystem encoding if you've set the ``LANG`` or ``LC_CTYPE`` environment variables; if -you haven't, the default encoding is ASCII. +you haven't, the default encoding is UTF-8. The :func:`sys.getfilesystemencoding` function returns the encoding to use on your current system, in case you want to do the encoding manually, but there's @@ -527,13 +591,13 @@ automatically converted to the right encoding for you:: Functions in the :mod:`os` module such as :func:`os.stat` will also accept Unicode filenames. -Function :func:`os.listdir`, which returns filenames, raises an issue: should it return +The :func:`os.listdir` function returns filenames and raises an issue: should it return the Unicode version of filenames, or should it return bytes containing the encoded versions? :func:`os.listdir` will do both, depending on whether you provided the directory path as bytes or a Unicode string. If you pass a Unicode string as the path, filenames will be decoded using the filesystem's encoding and a list of Unicode strings will be returned, while passing a byte -path will return the bytes versions of the filenames. For example, +path will return the filenames as bytes. For example, assuming the default filesystem encoding is UTF-8, running the following program:: @@ -548,13 +612,13 @@ program:: will produce the following output:: amk:~$ python t.py - [b'.svn', b'filename\xe4\x94\x80abc', ...] - ['.svn', 'filename\u4500abc', ...] + [b'filename\xe4\x94\x80abc', ...] + ['filename\u4500abc', ...] The first list contains UTF-8-encoded filenames, and the second list contains the Unicode versions. -Note that in most occasions, the Unicode APIs should be used. The bytes APIs +Note that on most occasions, the Unicode APIs should be used. The bytes APIs should only be used on systems where undecodable file names can be present, i.e. Unix systems. @@ -585,68 +649,70 @@ data also specifies the encoding, since the attacker can then choose a clever way to hide malicious text in the encoded bytestream. +Converting Between File Encodings +''''''''''''''''''''''''''''''''' + +The :class:`~codecs.StreamRecoder` class can transparently convert between +encodings, taking a stream that returns data in encoding #1 +and behaving like a stream returning data in encoding #2. + +For example, if you have an input file *f* that's in Latin-1, you +can wrap it with a :class:`~codecs.StreamRecoder` to return bytes encoded in +UTF-8:: + + new_f = codecs.StreamRecoder(f, + # en/decoder: used by read() to encode its results and + # by write() to decode its input. + codecs.getencoder('utf-8'), codecs.getdecoder('utf-8'), + + # reader/writer: used to read and write to the stream. + codecs.getreader('latin-1'), codecs.getwriter('latin-1') ) + + +Files in an Unknown Encoding +'''''''''''''''''''''''''''' + +What can you do if you need to make a change to a file, but don't know +the file's encoding? If you know the encoding is ASCII-compatible and +only want to examine or modify the ASCII parts, you can open the file +with the ``surrogateescape`` error handler:: + + with open(fname, 'r', encoding="ascii", errors="surrogateescape") as f: + data = f.read() + + # make changes to the string 'data' + + with open(fname + '.new', 'w', + encoding="ascii", errors="surrogateescape") as f: + f.write(data) + +The ``surrogateescape`` error handler will decode any non-ASCII bytes +as code points in the Unicode Private Use Area ranging from U+DC80 to +U+DCFF. These private code points will then be turned back into the +same bytes when the ``surrogateescape`` error handler is used when +encoding the data and writing it back out. + + References ---------- -The PDF slides for Marc-André Lemburg's presentation "Writing Unicode-aware -Applications in Python" are available at -<http://downloads.egenix.com/python/LSM2005-Developing-Unicode-aware-applications-in-Python.pdf> -and discuss questions of character encodings as well as how to internationalize +One section of `Mastering Python 3 Input/Output <http://pyvideo.org/video/289/pycon-2010--mastering-python-3-i-o>`_, a PyCon 2010 talk by David Beazley, discusses text processing and binary data handling. + +The `PDF slides for Marc-André Lemburg's presentation "Writing Unicode-aware Applications in Python" <http://downloads.egenix.com/python/LSM2005-Developing-Unicode-aware-applications-in-Python.pdf>`_ +discuss questions of character encodings as well as how to internationalize and localize an application. These slides cover Python 2.x only. +`The Guts of Unicode in Python <http://pyvideo.org/video/1768/the-guts-of-unicode-in-python>`_ is a PyCon 2013 talk by Benjamin Peterson that discusses the internal Unicode representation in Python 3.3. + Acknowledgements ================ -Thanks to the following people who have noted errors or offered suggestions on -this article: Nicholas Bastin, Marius Gedminas, Kent Johnson, Ken Krugler, -Marc-André Lemburg, Martin von Löwis, Chad Whitacre. - -.. comment - Revision History - - Version 1.0: posted August 5 2005. - - Version 1.01: posted August 7 2005. Corrects factual and markup errors; adds - several links. - - Version 1.02: posted August 16 2005. Corrects factual errors. - - Version 1.1: Feb-Nov 2008. Updates the document with respect to Python 3 changes. - - Version 1.11: posted June 20 2010. Notes that Python 3.x is not covered, - and that the HOWTO only covers 2.x. - -.. comment Describe Python 3.x support (new section? new document?) -.. comment Additional topic: building Python w/ UCS2 or UCS4 support -.. comment Describe use of codecs.StreamRecoder and StreamReaderWriter - -.. comment - Original outline: - - - [ ] Unicode introduction - - [ ] ASCII - - [ ] Terms - - [ ] Character - - [ ] Code point - - [ ] Encodings - - [ ] Common encodings: ASCII, Latin-1, UTF-8 - - [ ] Unicode Python type - - [ ] Writing unicode literals - - [ ] Obscurity: -U switch - - [ ] Built-ins - - [ ] unichr() - - [ ] ord() - - [ ] unicode() constructor - - [ ] Unicode type - - [ ] encode(), decode() methods - - [ ] Unicodedata module for character properties - - [ ] I/O - - [ ] Reading/writing Unicode data into files - - [ ] Byte-order marks - - [ ] Unicode filenames - - [ ] Writing Unicode programs - - [ ] Do everything in Unicode - - [ ] Declaring source code encodings (PEP 263) - - [ ] Other issues - - [ ] Building Python (UCS2, UCS4) +The initial draft of this document was written by Andrew Kuchling. +It has since been revised further by Alexander Belopolsky, Georg Brandl, +Andrew Kuchling, and Ezio Melotti. + +Thanks to the following people who have noted errors or offered +suggestions on this article: Éric Araujo, Nicholas Bastin, Nick +Coghlan, Marius Gedminas, Kent Johnson, Ken Krugler, Marc-André +Lemburg, Martin von Löwis, Terry J. Reedy, Chad Whitacre. diff --git a/Doc/howto/urllib2.rst b/Doc/howto/urllib2.rst index 87f42ba..5a22660 100644 --- a/Doc/howto/urllib2.rst +++ b/Doc/howto/urllib2.rst @@ -56,6 +56,13 @@ The simplest way to use urllib.request is as follows:: response = urllib.request.urlopen('http://python.org/') html = response.read() +If you wish to retrieve a resource via URL and store it in a temporary location, +you can do so via the :func:`~urllib.request.urlretrieve` function:: + + import urllib.request + local_filename, headers = urllib.request.urlretrieve('http://python.org/') + html = open(local_filename) + Many uses of urllib will be that simple (note that instead of an 'http:' URL we could have used an URL starting with 'ftp:', 'file:', etc.). However, it's the purpose of this tutorial to explain the more complicated cases, concentrating on @@ -153,7 +160,7 @@ We'll discuss here one particular HTTP header, to illustrate how to add headers to your HTTP request. Some websites [#]_ dislike being browsed by programs, or send different versions -to different browsers [#]_ . By default urllib identifies itself as +to different browsers [#]_. By default urllib identifies itself as ``Python-urllib/x.y`` (where ``x`` and ``y`` are the major and minor version numbers of the Python release, e.g. ``Python-urllib/2.5``), which may confuse the site, or just plain @@ -447,7 +454,7 @@ Authentication Tutorial When authentication is required, the server sends a header (as well as the 401 error code) requesting authentication. This specifies the authentication scheme -and a 'realm'. The header looks like : ``WWW-Authenticate: SCHEME +and a 'realm'. The header looks like: ``WWW-Authenticate: SCHEME realm="REALM"``. e.g. :: @@ -497,7 +504,8 @@ than the URL you pass to .add_password() will also match. :: In the above example we only supplied our ``HTTPBasicAuthHandler`` to ``build_opener``. By default openers have the handlers for normal situations - -- ``ProxyHandler``, ``UnknownHandler``, ``HTTPHandler``, + -- ``ProxyHandler`` (if a proxy setting such as an :envvar:`http_proxy` + environment variable is set), ``UnknownHandler``, ``HTTPHandler``, ``HTTPDefaultErrorHandler``, ``HTTPRedirectHandler``, ``FTPHandler``, ``FileHandler``, ``HTTPErrorProcessor``. @@ -514,10 +522,11 @@ Proxies ======= **urllib** will auto-detect your proxy settings and use those. This is through -the ``ProxyHandler`` which is part of the normal handler chain. Normally that's -a good thing, but there are occasions when it may not be helpful [#]_. One way -to do this is to setup our own ``ProxyHandler``, with no proxies defined. This -is done using similar steps to setting up a `Basic Authentication`_ handler : :: +the ``ProxyHandler``, which is part of the normal handler chain when a proxy +setting is detected. Normally that's a good thing, but there are occasions +when it may not be helpful [#]_. One way to do this is to setup our own +``ProxyHandler``, with no proxies defined. This is done using similar steps to +setting up a `Basic Authentication`_ handler: :: >>> proxy_support = urllib.request.ProxyHandler({}) >>> opener = urllib.request.build_opener(proxy_support) diff --git a/Doc/includes/mp_benchmarks.py b/Doc/includes/mp_benchmarks.py index acdf642..3763ea9 100644 --- a/Doc/includes/mp_benchmarks.py +++ b/Doc/includes/mp_benchmarks.py @@ -6,16 +6,12 @@ # import time -import sys import multiprocessing import threading import queue import gc -if sys.platform == 'win32': - _timer = time.clock -else: - _timer = time.time +_timer = time.perf_counter delta = 1 diff --git a/Doc/includes/noddy2.c b/Doc/includes/noddy2.c index 9b8eafb..9641558 100644 --- a/Doc/includes/noddy2.c +++ b/Doc/includes/noddy2.c @@ -24,18 +24,16 @@ Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds) self = (Noddy *)type->tp_alloc(type, 0); if (self != NULL) { self->first = PyUnicode_FromString(""); - if (self->first == NULL) - { + if (self->first == NULL) { Py_DECREF(self); return NULL; - } - + } + self->last = PyUnicode_FromString(""); - if (self->last == NULL) - { + if (self->last == NULL) { Py_DECREF(self); return NULL; - } + } self->number = 0; } @@ -50,10 +48,10 @@ Noddy_init(Noddy *self, PyObject *args, PyObject *kwds) static char *kwlist[] = {"first", "last", "number", NULL}; - if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist, - &first, &last, + if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist, + &first, &last, &self->number)) - return -1; + return -1; if (first) { tmp = self->first; @@ -86,15 +84,6 @@ static PyMemberDef Noddy_members[] = { static PyObject * Noddy_name(Noddy* self) { - static PyObject *format = NULL; - PyObject *args, *result; - - if (format == NULL) { - format = PyUnicode_FromString("%s %s"); - if (format == NULL) - return NULL; - } - if (self->first == NULL) { PyErr_SetString(PyExc_AttributeError, "first"); return NULL; @@ -105,14 +94,7 @@ Noddy_name(Noddy* self) return NULL; } - args = Py_BuildValue("OO", self->first, self->last); - if (args == NULL) - return NULL; - - result = PyUnicode_Format(format, args); - Py_DECREF(args); - - return result; + return PyUnicode_FromFormat("%S %S", self->first, self->last); } static PyMethodDef Noddy_methods[] = { @@ -145,12 +127,12 @@ static PyTypeObject NoddyType = { Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ "Noddy objects", /* tp_doc */ - 0, /* tp_traverse */ - 0, /* tp_clear */ - 0, /* tp_richcompare */ - 0, /* tp_weaklistoffset */ - 0, /* tp_iter */ - 0, /* tp_iternext */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ Noddy_methods, /* tp_methods */ Noddy_members, /* tp_members */ 0, /* tp_getset */ @@ -173,7 +155,7 @@ static PyModuleDef noddy2module = { }; PyMODINIT_FUNC -PyInit_noddy2(void) +PyInit_noddy2(void) { PyObject* m; diff --git a/Doc/includes/noddy3.c b/Doc/includes/noddy3.c index 89f3a77..8a5a753 100644 --- a/Doc/includes/noddy3.c +++ b/Doc/includes/noddy3.c @@ -24,18 +24,16 @@ Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds) self = (Noddy *)type->tp_alloc(type, 0); if (self != NULL) { self->first = PyUnicode_FromString(""); - if (self->first == NULL) - { + if (self->first == NULL) { Py_DECREF(self); return NULL; - } - + } + self->last = PyUnicode_FromString(""); - if (self->last == NULL) - { + if (self->last == NULL) { Py_DECREF(self); return NULL; - } + } self->number = 0; } @@ -50,10 +48,10 @@ Noddy_init(Noddy *self, PyObject *args, PyObject *kwds) static char *kwlist[] = {"first", "last", "number", NULL}; - if (! PyArg_ParseTupleAndKeywords(args, kwds, "|SSi", kwlist, - &first, &last, + if (! PyArg_ParseTupleAndKeywords(args, kwds, "|SSi", kwlist, + &first, &last, &self->number)) - return -1; + return -1; if (first) { tmp = self->first; @@ -88,22 +86,22 @@ Noddy_getfirst(Noddy *self, void *closure) static int Noddy_setfirst(Noddy *self, PyObject *value, void *closure) { - if (value == NULL) { - PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute"); - return -1; - } - - if (! PyUnicode_Check(value)) { - PyErr_SetString(PyExc_TypeError, - "The first attribute value must be a string"); - return -1; - } - - Py_DECREF(self->first); - Py_INCREF(value); - self->first = value; - - return 0; + if (value == NULL) { + PyErr_SetString(PyExc_TypeError, "Cannot delete the first attribute"); + return -1; + } + + if (! PyUnicode_Check(value)) { + PyErr_SetString(PyExc_TypeError, + "The first attribute value must be a string"); + return -1; + } + + Py_DECREF(self->first); + Py_INCREF(value); + self->first = value; + + return 0; } static PyObject * @@ -116,30 +114,30 @@ Noddy_getlast(Noddy *self, void *closure) static int Noddy_setlast(Noddy *self, PyObject *value, void *closure) { - if (value == NULL) { - PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute"); - return -1; - } - - if (! PyUnicode_Check(value)) { - PyErr_SetString(PyExc_TypeError, - "The last attribute value must be a string"); - return -1; - } - - Py_DECREF(self->last); - Py_INCREF(value); - self->last = value; - - return 0; + if (value == NULL) { + PyErr_SetString(PyExc_TypeError, "Cannot delete the last attribute"); + return -1; + } + + if (! PyUnicode_Check(value)) { + PyErr_SetString(PyExc_TypeError, + "The last attribute value must be a string"); + return -1; + } + + Py_DECREF(self->last); + Py_INCREF(value); + self->last = value; + + return 0; } static PyGetSetDef Noddy_getseters[] = { - {"first", + {"first", (getter)Noddy_getfirst, (setter)Noddy_setfirst, "first name", NULL}, - {"last", + {"last", (getter)Noddy_getlast, (setter)Noddy_setlast, "last name", NULL}, @@ -149,23 +147,7 @@ static PyGetSetDef Noddy_getseters[] = { static PyObject * Noddy_name(Noddy* self) { - static PyObject *format = NULL; - PyObject *args, *result; - - if (format == NULL) { - format = PyUnicode_FromString("%s %s"); - if (format == NULL) - return NULL; - } - - args = Py_BuildValue("OO", self->first, self->last); - if (args == NULL) - return NULL; - - result = PyUnicode_Format(format, args); - Py_DECREF(args); - - return result; + return PyUnicode_FromFormat("%S %S", self->first, self->last); } static PyMethodDef Noddy_methods[] = { @@ -198,12 +180,12 @@ static PyTypeObject NoddyType = { Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /* tp_flags */ "Noddy objects", /* tp_doc */ - 0, /* tp_traverse */ - 0, /* tp_clear */ - 0, /* tp_richcompare */ - 0, /* tp_weaklistoffset */ - 0, /* tp_iter */ - 0, /* tp_iternext */ + 0, /* tp_traverse */ + 0, /* tp_clear */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ Noddy_methods, /* tp_methods */ Noddy_members, /* tp_members */ Noddy_getseters, /* tp_getset */ @@ -226,7 +208,7 @@ static PyModuleDef noddy3module = { }; PyMODINIT_FUNC -PyInit_noddy3(void) +PyInit_noddy3(void) { PyObject* m; diff --git a/Doc/includes/noddy4.c b/Doc/includes/noddy4.c index 6a96fac..eb9622a 100644 --- a/Doc/includes/noddy4.c +++ b/Doc/includes/noddy4.c @@ -27,7 +27,7 @@ Noddy_traverse(Noddy *self, visitproc visit, void *arg) return 0; } -static int +static int Noddy_clear(Noddy *self) { PyObject *tmp; @@ -58,18 +58,16 @@ Noddy_new(PyTypeObject *type, PyObject *args, PyObject *kwds) self = (Noddy *)type->tp_alloc(type, 0); if (self != NULL) { self->first = PyUnicode_FromString(""); - if (self->first == NULL) - { + if (self->first == NULL) { Py_DECREF(self); return NULL; - } - + } + self->last = PyUnicode_FromString(""); - if (self->last == NULL) - { + if (self->last == NULL) { Py_DECREF(self); return NULL; - } + } self->number = 0; } @@ -84,10 +82,10 @@ Noddy_init(Noddy *self, PyObject *args, PyObject *kwds) static char *kwlist[] = {"first", "last", "number", NULL}; - if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist, - &first, &last, + if (! PyArg_ParseTupleAndKeywords(args, kwds, "|OOi", kwlist, + &first, &last, &self->number)) - return -1; + return -1; if (first) { tmp = self->first; @@ -120,15 +118,6 @@ static PyMemberDef Noddy_members[] = { static PyObject * Noddy_name(Noddy* self) { - static PyObject *format = NULL; - PyObject *args, *result; - - if (format == NULL) { - format = PyUnicode_FromString("%s %s"); - if (format == NULL) - return NULL; - } - if (self->first == NULL) { PyErr_SetString(PyExc_AttributeError, "first"); return NULL; @@ -139,14 +128,7 @@ Noddy_name(Noddy* self) return NULL; } - args = Py_BuildValue("OO", self->first, self->last); - if (args == NULL) - return NULL; - - result = PyUnicode_Format(format, args); - Py_DECREF(args); - - return result; + return PyUnicode_FromFormat("%S %S", self->first, self->last); } static PyMethodDef Noddy_methods[] = { @@ -182,10 +164,10 @@ static PyTypeObject NoddyType = { "Noddy objects", /* tp_doc */ (traverseproc)Noddy_traverse, /* tp_traverse */ (inquiry)Noddy_clear, /* tp_clear */ - 0, /* tp_richcompare */ - 0, /* tp_weaklistoffset */ - 0, /* tp_iter */ - 0, /* tp_iternext */ + 0, /* tp_richcompare */ + 0, /* tp_weaklistoffset */ + 0, /* tp_iter */ + 0, /* tp_iternext */ Noddy_methods, /* tp_methods */ Noddy_members, /* tp_members */ 0, /* tp_getset */ @@ -208,7 +190,7 @@ static PyModuleDef noddy4module = { }; PyMODINIT_FUNC -PyInit_noddy4(void) +PyInit_noddy4(void) { PyObject* m; diff --git a/Doc/install/index.rst b/Doc/install/index.rst index c31b198..0c9a9f2 100644 --- a/Doc/install/index.rst +++ b/Doc/install/index.rst @@ -20,12 +20,20 @@ Finally, it might be useful to include all the material from my "Care and Feeding of a Python Installation" talk in here somewhere. Yow! -.. topic:: Abstract +This document describes the Python Distribution Utilities ("Distutils") from the +end-user's point-of-view, describing how to extend the capabilities of a +standard Python installation by building and installing third-party Python +modules and extensions. - This document describes the Python Distribution Utilities ("Distutils") from the - end-user's point-of-view, describing how to extend the capabilities of a - standard Python installation by building and installing third-party Python - modules and extensions. + +.. note:: + + This guide only covers the basic tools for installing extensions that are + provided as part of this version of Python. Third party tools offer easier + to use and more secure alternatives. Refer to the + `quick recommendations section + <https://python-packaging-user-guide.readthedocs.org/en/latest/current.html>`__ + in the Python Packaging User Guide for more information. .. _inst-intro: @@ -50,7 +58,8 @@ new goodies to their toolbox. You don't need to know Python to read this document; there will be some brief forays into using Python's interactive mode to explore your installation, but that's it. If you're looking for information on how to distribute your own Python modules so that others may use them, see -the :ref:`distutils-index` manual. +the :ref:`distutils-index` manual. :ref:`debug-setup-script` may also be of +interest. .. _inst-trivial-install: @@ -235,6 +244,8 @@ by how you built/installed Python itself. On Unix (and Mac OS X, which is also Unix-based), it also depends on whether the module distribution being installed is pure Python or contains extensions ("non-pure"): +.. tabularcolumns:: |l|l|l|l| + +-----------------+-----------------------------------------------------+--------------------------------------------------+-------+ | Platform | Standard installation location | Default value | Notes | +=================+=====================================================+==================================================+=======+ @@ -643,6 +654,11 @@ environment variables, such as Mac OS 9, the configuration variables supplied by the Distutils are the only ones you can use.) See section :ref:`inst-config-files` for details. +.. note:: When a :ref:`virtual environment <venv-def>` is activated, any options + that change the installation path will be ignored from all distutils configuration + files to prevent inadvertently installing projects outside of the virtual + environment. + .. XXX need some Windows examples---when would custom installation schemes be needed on those platforms? @@ -1039,7 +1055,7 @@ These compilers require some special libraries. This task is more complex than for Borland's C++, because there is no program to convert the library. First you have to create a list of symbols which the Python DLL exports. (You can find a good program for this task at -http://www.emmestech.com/software/pexports-0.43/download_pexports.html). +http://sourceforge.net/projects/mingw/files/MinGW/Extension/pexports/). .. I don't understand what the next line means. --amk .. (inclusive the references on data structures.) diff --git a/Doc/library/2to3.rst b/Doc/library/2to3.rst index d07aaa1..1e5f42d 100644 --- a/Doc/library/2to3.rst +++ b/Doc/library/2to3.rst @@ -264,8 +264,7 @@ and off individually. They are described here in more detail. .. 2to3fixer:: long - Strips the ``L`` prefix on long literals and renames :class:`long` to - :class:`int`. + Renames :class:`long` to :class:`int`. .. 2to3fixer:: map @@ -291,11 +290,11 @@ and off individually. They are described here in more detail. Converts the use of iterator's :meth:`~iterator.next` methods to the :func:`next` function. It also renames :meth:`next` methods to - :meth:`~object.__next__`. + :meth:`~iterator.__next__`. .. 2to3fixer:: nonzero - Renames :meth:`~object.__nonzero__` to :meth:`~object.__bool__`. + Renames :meth:`__nonzero__` to :meth:`~object.__bool__`. .. 2to3fixer:: numliterals diff --git a/Doc/library/_thread.rst b/Doc/library/_thread.rst index 369e9cd..2e130c1 100644 --- a/Doc/library/_thread.rst +++ b/Doc/library/_thread.rst @@ -35,6 +35,9 @@ It defines the following constants and functions: Raised on thread-specific errors. + .. versionchanged:: 3.3 + This is now a synonym of the built-in :exc:`RuntimeError`. + .. data:: LockType @@ -90,15 +93,15 @@ It defines the following constants and functions: Return the thread stack size used when creating new threads. The optional *size* argument specifies the stack size to be used for subsequently created threads, and must be 0 (use platform or configured default) or a positive - integer value of at least 32,768 (32kB). If changing the thread stack size is - unsupported, a :exc:`ThreadError` is raised. If the specified stack size is - invalid, a :exc:`ValueError` is raised and the stack size is unmodified. 32kB + integer value of at least 32,768 (32 KiB). If changing the thread stack size is + unsupported, a :exc:`RuntimeError` is raised. If the specified stack size is + invalid, a :exc:`ValueError` is raised and the stack size is unmodified. 32 KiB is currently the minimum supported stack size value to guarantee sufficient stack space for the interpreter itself. Note that some platforms may have particular restrictions on values for the stack size, such as requiring a - minimum stack size > 32kB or requiring allocation in multiples of the system + minimum stack size > 32 KiB or requiring allocation in multiples of the system memory page size - platform documentation should be referred to for more - information (4kB pages are common; using multiples of 4096 for the stack size is + information (4 KiB pages are common; using multiples of 4096 for the stack size is the suggested approach in the absence of more specific information). Availability: Windows, systems with POSIX threads. @@ -174,7 +177,7 @@ In addition to these methods, lock objects can also be used via the equivalent to calling :func:`_thread.exit`. * Not all built-in functions that may block waiting for I/O allow other threads - to run. (The most popular ones (:func:`time.sleep`, :meth:`file.read`, + to run. (The most popular ones (:func:`time.sleep`, :meth:`io.FileIO.read`, :func:`select.select`) work as expected.) * It is not possible to interrupt the :meth:`acquire` method on a lock --- the diff --git a/Doc/library/abc.rst b/Doc/library/abc.rst index dcec19a..9299124 100644 --- a/Doc/library/abc.rst +++ b/Doc/library/abc.rst @@ -18,7 +18,7 @@ regarding a type hierarchy for numbers based on ABCs.) The :mod:`collections` module has some concrete classes that derive from ABCs; these can, of course, be further derived. In addition the -:mod:`collections` module has some ABCs that can be used to test whether +:mod:`collections.abc` submodule has some ABCs that can be used to test whether a class or instance provides a particular interface, for example, is it hashable or a mapping. @@ -55,6 +55,9 @@ This module provides the following class: assert issubclass(tuple, MyABC) assert isinstance((), MyABC) + .. versionchanged:: 3.3 + Returns the registered subclass, to allow usage as a class decorator. + You can also override this method in an abstract base class: .. method:: __subclasshook__(subclass) @@ -107,36 +110,35 @@ This module provides the following class: MyIterable.register(Foo) The ABC ``MyIterable`` defines the standard iterable method, - :meth:`__iter__`, as an abstract method. The implementation given here can - still be called from subclasses. The :meth:`get_iterator` method is also - part of the ``MyIterable`` abstract base class, but it does not have to be - overridden in non-abstract derived classes. + :meth:`~iterator.__iter__`, as an abstract method. The implementation given + here can still be called from subclasses. The :meth:`get_iterator` method + is also part of the ``MyIterable`` abstract base class, but it does not have + to be overridden in non-abstract derived classes. The :meth:`__subclasshook__` class method defined here says that any class - that has an :meth:`__iter__` method in its :attr:`__dict__` (or in that of - one of its base classes, accessed via the :attr:`__mro__` list) is - considered a ``MyIterable`` too. + that has an :meth:`~iterator.__iter__` method in its + :attr:`~object.__dict__` (or in that of one of its base classes, accessed + via the :attr:`~class.__mro__` list) is considered a ``MyIterable`` too. Finally, the last line makes ``Foo`` a virtual subclass of ``MyIterable``, - even though it does not define an :meth:`__iter__` method (it uses the - old-style iterable protocol, defined in terms of :meth:`__len__` and + even though it does not define an :meth:`~iterator.__iter__` method (it uses + the old-style iterable protocol, defined in terms of :meth:`__len__` and :meth:`__getitem__`). Note that this will not make ``get_iterator`` available as a method of ``Foo``, so it is provided separately. -It also provides the following decorators: +The :mod:`abc` module also provides the following decorators: .. decorator:: abstractmethod A decorator indicating abstract methods. - Using this decorator requires that the class's metaclass is :class:`ABCMeta` or - is derived from it. - A class that has a metaclass derived from :class:`ABCMeta` - cannot be instantiated unless all of its abstract methods and - properties are overridden. - The abstract methods can be called using any of the normal 'super' call - mechanisms. + Using this decorator requires that the class's metaclass is :class:`ABCMeta` + or is derived from it. A class that has a metaclass derived from + :class:`ABCMeta` cannot be instantiated unless all of its abstract methods + and properties are overridden. The abstract methods can be called using any + of the normal 'super' call mechanisms. :func:`abstractmethod` may be used + to declare abstract methods for properties and descriptors. Dynamically adding abstract methods to a class, or attempting to modify the abstraction status of a method or class once it is created, are not @@ -144,12 +146,52 @@ It also provides the following decorators: regular inheritance; "virtual subclasses" registered with the ABC's :meth:`register` method are not affected. - Usage:: + When :func:`abstractmethod` is applied in combination with other method + descriptors, it should be applied as the innermost decorator, as shown in + the following usage examples:: class C(metaclass=ABCMeta): @abstractmethod def my_abstract_method(self, ...): ... + @classmethod + @abstractmethod + def my_abstract_classmethod(cls, ...): + ... + @staticmethod + @abstractmethod + def my_abstract_staticmethod(...): + ... + + @property + @abstractmethod + def my_abstract_property(self): + ... + @my_abstract_property.setter + @abstractmethod + def my_abstract_property(self, val): + ... + + @abstractmethod + def _get_x(self): + ... + @abstractmethod + def _set_x(self, val): + ... + x = property(_get_x, _set_x) + + In order to correctly interoperate with the abstract base class machinery, + the descriptor must identify itself as abstract using + :attr:`__isabstractmethod__`. In general, this attribute should be ``True`` + if any of the methods used to compose the descriptor are abstract. For + example, Python's built-in property does the equivalent of:: + + class Descriptor: + ... + @property + def __isabstractmethod__(self): + return any(getattr(f, '__isabstractmethod__', False) for + f in (self._fget, self._fset, self._fdel)) .. note:: @@ -166,14 +208,20 @@ It also provides the following decorators: A subclass of the built-in :func:`classmethod`, indicating an abstract classmethod. Otherwise it is similar to :func:`abstractmethod`. - Usage:: + This special case is deprecated, as the :func:`classmethod` decorator + is now correctly identified as abstract when applied to an abstract + method:: class C(metaclass=ABCMeta): - @abstractclassmethod + @classmethod + @abstractmethod def my_abstract_classmethod(cls, ...): ... .. versionadded:: 3.2 + .. deprecated:: 3.3 + It is now possible to use :class:`classmethod` with + :func:`abstractmethod`, making this decorator redundant. .. decorator:: abstractstaticmethod @@ -181,41 +229,70 @@ It also provides the following decorators: A subclass of the built-in :func:`staticmethod`, indicating an abstract staticmethod. Otherwise it is similar to :func:`abstractmethod`. - Usage:: + This special case is deprecated, as the :func:`staticmethod` decorator + is now correctly identified as abstract when applied to an abstract + method:: class C(metaclass=ABCMeta): - @abstractstaticmethod + @staticmethod + @abstractmethod def my_abstract_staticmethod(...): ... .. versionadded:: 3.2 + .. deprecated:: 3.3 + It is now possible to use :class:`staticmethod` with + :func:`abstractmethod`, making this decorator redundant. -.. function:: abstractproperty(fget=None, fset=None, fdel=None, doc=None) +.. decorator:: abstractproperty(fget=None, fset=None, fdel=None, doc=None) - A subclass of the built-in :func:`property`, indicating an abstract property. + A subclass of the built-in :func:`property`, indicating an abstract + property. - Using this function requires that the class's metaclass is :class:`ABCMeta` or - is derived from it. - A class that has a metaclass derived from :class:`ABCMeta` cannot be - instantiated unless all of its abstract methods and properties are overridden. - The abstract properties can be called using any of the normal - 'super' call mechanisms. + Using this function requires that the class's metaclass is :class:`ABCMeta` + or is derived from it. A class that has a metaclass derived from + :class:`ABCMeta` cannot be instantiated unless all of its abstract methods + and properties are overridden. The abstract properties can be called using + any of the normal 'super' call mechanisms. - Usage:: + This special case is deprecated, as the :func:`property` decorator + is now correctly identified as abstract when applied to an abstract + method:: class C(metaclass=ABCMeta): - @abstractproperty + @property + @abstractmethod def my_abstract_property(self): ... - This defines a read-only property; you can also define a read-write abstract - property using the 'long' form of property declaration:: + The above example defines a read-only property; you can also define a + read-write abstract property by appropriately marking one or more of the + underlying methods as abstract:: class C(metaclass=ABCMeta): - def getx(self): ... - def setx(self, value): ... - x = abstractproperty(getx, setx) + @property + def x(self): + ... + + @x.setter + @abstractmethod + def x(self, val): + ... + + If only some components are abstract, only those components need to be + updated to create a concrete property in a subclass:: + + class D(C): + @C.x.setter + def x(self, val): + ... + + + .. deprecated:: 3.3 + It is now possible to use :class:`property`, :meth:`property.getter`, + :meth:`property.setter` and :meth:`property.deleter` with + :func:`abstractmethod`, making this decorator redundant. .. rubric:: Footnotes diff --git a/Doc/library/aifc.rst b/Doc/library/aifc.rst index 999bad8..8a3541b 100644 --- a/Doc/library/aifc.rst +++ b/Doc/library/aifc.rst @@ -30,8 +30,8 @@ sampling rate or frame rate is the number of times per second the sound is sampled. The number of channels indicate if the audio is mono, stereo, or quadro. Each frame consists of one sample per channel. The sample size is the size in bytes of each sample. Thus a frame consists of -*nchannels*\**samplesize* bytes, and a second's worth of audio consists of -*nchannels*\**samplesize*\**framerate* bytes. +*nchannels*\*\ *samplesize* bytes, and a second's worth of audio consists of +*nchannels*\*\ *samplesize*\*\ *framerate* bytes. For example, CD quality audio has a sample size of two bytes (16 bits), uses two channels (stereo) and has a frame rate of 44,100 frames/second. This gives a diff --git a/Doc/library/archiving.rst b/Doc/library/archiving.rst index 75d137c..c928494 100644 --- a/Doc/library/archiving.rst +++ b/Doc/library/archiving.rst @@ -5,8 +5,9 @@ Data Compression and Archiving ****************************** The modules described in this chapter support data compression with the zlib, -gzip, and bzip2 algorithms, and the creation of ZIP- and tar-format archives. -See also :ref:`archiving-operations` provided by the :mod:`shutil` module. +gzip, bzip2 and lzma algorithms, and the creation of ZIP- and tar-format +archives. See also :ref:`archiving-operations` provided by the :mod:`shutil` +module. .. toctree:: @@ -14,5 +15,6 @@ See also :ref:`archiving-operations` provided by the :mod:`shutil` module. zlib.rst gzip.rst bz2.rst + lzma.rst zipfile.rst tarfile.rst diff --git a/Doc/library/argparse.rst b/Doc/library/argparse.rst index 0b403b7..dc9c27f 100644 --- a/Doc/library/argparse.rst +++ b/Doc/library/argparse.rst @@ -46,7 +46,7 @@ produces either the sum or the max:: Assuming the Python code above is saved into a file called ``prog.py``, it can be run at the command line and provides useful help messages:: - $ prog.py -h + $ python prog.py -h usage: prog.py [-h] [--sum] N [N ...] Process some integers. @@ -61,15 +61,15 @@ be run at the command line and provides useful help messages:: When run with the appropriate arguments, it prints either the sum or the max of the command-line integers:: - $ prog.py 1 2 3 4 + $ python prog.py 1 2 3 4 4 - $ prog.py 1 2 3 4 --sum + $ python prog.py 1 2 3 4 --sum 10 If invalid arguments are passed in, it will issue an error:: - $ prog.py a b c + $ python prog.py a b c usage: prog.py [-h] [--sum] N [N ...] prog.py: error: argument N: invalid int value: 'a' @@ -137,40 +137,136 @@ ArgumentParser objects argument_default=None, conflict_handler='error', \ add_help=True) - Create a new :class:`ArgumentParser` object. Each parameter has its own more - detailed description below, but in short they are: + Create a new :class:`ArgumentParser` object. All parameters should be passed + as keyword arguments. Each parameter has its own more detailed description + below, but in short they are: - * description_ - Text to display before the argument help. + * prog_ - The name of the program (default: ``sys.argv[0]``) - * epilog_ - Text to display after the argument help. + * usage_ - The string describing the program usage (default: generated from + arguments added to parser) - * add_help_ - Add a -h/--help option to the parser. (default: ``True``) + * description_ - Text to display before the argument help (default: none) - * argument_default_ - Set the global default value for arguments. - (default: ``None``) + * epilog_ - Text to display after the argument help (default: none) * parents_ - A list of :class:`ArgumentParser` objects whose arguments should - also be included. + also be included + + * formatter_class_ - A class for customizing the help output - * prefix_chars_ - The set of characters that prefix optional arguments. + * prefix_chars_ - The set of characters that prefix optional arguments (default: '-') * fromfile_prefix_chars_ - The set of characters that prefix files from - which additional arguments should be read. (default: ``None``) + which additional arguments should be read (default: ``None``) - * formatter_class_ - A class for customizing the help output. - - * conflict_handler_ - Usually unnecessary, defines strategy for resolving - conflicting optionals. + * argument_default_ - The global default value for arguments + (default: ``None``) - * prog_ - The name of the program (default: - ``sys.argv[0]``) + * conflict_handler_ - The strategy for resolving conflicting optionals + (usually unnecessary) - * usage_ - The string describing the program usage (default: generated) + * add_help_ - Add a -h/--help option to the parser (default: ``True``) The following sections describe how each of these are used. +prog +^^^^ + +By default, :class:`ArgumentParser` objects uses ``sys.argv[0]`` to determine +how to display the name of the program in help messages. This default is almost +always desirable because it will make the help messages match how the program was +invoked on the command line. For example, consider a file named +``myprogram.py`` with the following code:: + + import argparse + parser = argparse.ArgumentParser() + parser.add_argument('--foo', help='foo help') + args = parser.parse_args() + +The help for this program will display ``myprogram.py`` as the program name +(regardless of where the program was invoked from):: + + $ python myprogram.py --help + usage: myprogram.py [-h] [--foo FOO] + + optional arguments: + -h, --help show this help message and exit + --foo FOO foo help + $ cd .. + $ python subdir\myprogram.py --help + usage: myprogram.py [-h] [--foo FOO] + + optional arguments: + -h, --help show this help message and exit + --foo FOO foo help + +To change this default behavior, another value can be supplied using the +``prog=`` argument to :class:`ArgumentParser`:: + + >>> parser = argparse.ArgumentParser(prog='myprogram') + >>> parser.print_help() + usage: myprogram [-h] + + optional arguments: + -h, --help show this help message and exit + +Note that the program name, whether determined from ``sys.argv[0]`` or from the +``prog=`` argument, is available to help messages using the ``%(prog)s`` format +specifier. + +:: + + >>> parser = argparse.ArgumentParser(prog='myprogram') + >>> parser.add_argument('--foo', help='foo of the %(prog)s program') + >>> parser.print_help() + usage: myprogram [-h] [--foo FOO] + + optional arguments: + -h, --help show this help message and exit + --foo FOO foo of the myprogram program + + +usage +^^^^^ + +By default, :class:`ArgumentParser` calculates the usage message from the +arguments it contains:: + + >>> parser = argparse.ArgumentParser(prog='PROG') + >>> parser.add_argument('--foo', nargs='?', help='foo help') + >>> parser.add_argument('bar', nargs='+', help='bar help') + >>> parser.print_help() + usage: PROG [-h] [--foo [FOO]] bar [bar ...] + + positional arguments: + bar bar help + + optional arguments: + -h, --help show this help message and exit + --foo [FOO] foo help + +The default message can be overridden with the ``usage=`` keyword argument:: + + >>> parser = argparse.ArgumentParser(prog='PROG', usage='%(prog)s [options]') + >>> parser.add_argument('--foo', nargs='?', help='foo help') + >>> parser.add_argument('bar', nargs='+', help='bar help') + >>> parser.print_help() + usage: PROG [options] + + positional arguments: + bar bar help + + optional arguments: + -h, --help show this help message and exit + --foo [FOO] foo help + +The ``%(prog)s`` format specifier is available to fill in the program name in +your usage messages. + + description ^^^^^^^^^^^ @@ -218,122 +314,6 @@ line-wrapped, but this behavior can be adjusted with the formatter_class_ argument to :class:`ArgumentParser`. -add_help -^^^^^^^^ - -By default, ArgumentParser objects add an option which simply displays -the parser's help message. For example, consider a file named -``myprogram.py`` containing the following code:: - - import argparse - parser = argparse.ArgumentParser() - parser.add_argument('--foo', help='foo help') - args = parser.parse_args() - -If ``-h`` or ``--help`` is supplied at the command line, the ArgumentParser -help will be printed:: - - $ python myprogram.py --help - usage: myprogram.py [-h] [--foo FOO] - - optional arguments: - -h, --help show this help message and exit - --foo FOO foo help - -Occasionally, it may be useful to disable the addition of this help option. -This can be achieved by passing ``False`` as the ``add_help=`` argument to -:class:`ArgumentParser`:: - - >>> parser = argparse.ArgumentParser(prog='PROG', add_help=False) - >>> parser.add_argument('--foo', help='foo help') - >>> parser.print_help() - usage: PROG [--foo FOO] - - optional arguments: - --foo FOO foo help - -The help option is typically ``-h/--help``. The exception to this is -if the ``prefix_chars=`` is specified and does not include ``-``, in -which case ``-h`` and ``--help`` are not valid options. In -this case, the first character in ``prefix_chars`` is used to prefix -the help options:: - - >>> parser = argparse.ArgumentParser(prog='PROG', prefix_chars='+/') - >>> parser.print_help() - usage: PROG [+h] - - optional arguments: - +h, ++help show this help message and exit - - -prefix_chars -^^^^^^^^^^^^ - -Most command-line options will use ``-`` as the prefix, e.g. ``-f/--foo``. -Parsers that need to support different or additional prefix -characters, e.g. for options -like ``+f`` or ``/foo``, may specify them using the ``prefix_chars=`` argument -to the ArgumentParser constructor:: - - >>> parser = argparse.ArgumentParser(prog='PROG', prefix_chars='-+') - >>> parser.add_argument('+f') - >>> parser.add_argument('++bar') - >>> parser.parse_args('+f X ++bar Y'.split()) - Namespace(bar='Y', f='X') - -The ``prefix_chars=`` argument defaults to ``'-'``. Supplying a set of -characters that does not include ``-`` will cause ``-f/--foo`` options to be -disallowed. - - -fromfile_prefix_chars -^^^^^^^^^^^^^^^^^^^^^ - -Sometimes, for example when dealing with a particularly long argument lists, it -may make sense to keep the list of arguments in a file rather than typing it out -at the command line. If the ``fromfile_prefix_chars=`` argument is given to the -:class:`ArgumentParser` constructor, then arguments that start with any of the -specified characters will be treated as files, and will be replaced by the -arguments they contain. For example:: - - >>> with open('args.txt', 'w') as fp: - ... fp.write('-f\nbar') - >>> parser = argparse.ArgumentParser(fromfile_prefix_chars='@') - >>> parser.add_argument('-f') - >>> parser.parse_args(['-f', 'foo', '@args.txt']) - Namespace(f='bar') - -Arguments read from a file must by default be one per line (but see also -:meth:`~ArgumentParser.convert_arg_line_to_args`) and are treated as if they -were in the same place as the original file referencing argument on the command -line. So in the example above, the expression ``['-f', 'foo', '@args.txt']`` -is considered equivalent to the expression ``['-f', 'foo', '-f', 'bar']``. - -The ``fromfile_prefix_chars=`` argument defaults to ``None``, meaning that -arguments will never be treated as file references. - - -argument_default -^^^^^^^^^^^^^^^^ - -Generally, argument defaults are specified either by passing a default to -:meth:`~ArgumentParser.add_argument` or by calling the -:meth:`~ArgumentParser.set_defaults` methods with a specific set of name-value -pairs. Sometimes however, it may be useful to specify a single parser-wide -default for arguments. This can be accomplished by passing the -``argument_default=`` keyword argument to :class:`ArgumentParser`. For example, -to globally suppress attribute creation on :meth:`~ArgumentParser.parse_args` -calls, we supply ``argument_default=SUPPRESS``:: - - >>> parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS) - >>> parser.add_argument('--foo') - >>> parser.add_argument('bar', nargs='?') - >>> parser.parse_args(['--foo', '1', 'BAR']) - Namespace(bar='BAR', foo='1') - >>> parser.parse_args([]) - Namespace() - - parents ^^^^^^^ @@ -371,16 +351,16 @@ formatter_class ^^^^^^^^^^^^^^^ :class:`ArgumentParser` objects allow the help formatting to be customized by -specifying an alternate formatting class. Currently, there are three such +specifying an alternate formatting class. Currently, there are four such classes: .. class:: RawDescriptionHelpFormatter RawTextHelpFormatter ArgumentDefaultsHelpFormatter + MetavarTypeHelpFormatter -The first two allow more control over how textual descriptions are displayed, -while the last automatically adds information about argument default values. - +:class:`RawDescriptionHelpFormatter` and :class:`RawTextHelpFormatter` give +more control over how textual descriptions are displayed. By default, :class:`ArgumentParser` objects line-wrap the description_ and epilog_ texts in command-line help messages:: @@ -433,8 +413,8 @@ should not be line-wrapped:: :class:`RawTextHelpFormatter` maintains whitespace for all sorts of help text, including argument descriptions. -The other formatter class available, :class:`ArgumentDefaultsHelpFormatter`, -will add information about the default value of each of the arguments:: +:class:`ArgumentDefaultsHelpFormatter` automatically adds information about +default values to each of the argument help messages:: >>> parser = argparse.ArgumentParser( ... prog='PROG', @@ -451,6 +431,93 @@ will add information about the default value of each of the arguments:: -h, --help show this help message and exit --foo FOO FOO! (default: 42) +:class:`MetavarTypeHelpFormatter` uses the name of the type_ argument for each +argument as the display name for its values (rather than using the dest_ +as the regular formatter does):: + + >>> parser = argparse.ArgumentParser( + ... prog='PROG', + ... formatter_class=argparse.MetavarTypeHelpFormatter) + >>> parser.add_argument('--foo', type=int) + >>> parser.add_argument('bar', type=float) + >>> parser.print_help() + usage: PROG [-h] [--foo int] float + + positional arguments: + float + + optional arguments: + -h, --help show this help message and exit + --foo int + + +prefix_chars +^^^^^^^^^^^^ + +Most command-line options will use ``-`` as the prefix, e.g. ``-f/--foo``. +Parsers that need to support different or additional prefix +characters, e.g. for options +like ``+f`` or ``/foo``, may specify them using the ``prefix_chars=`` argument +to the ArgumentParser constructor:: + + >>> parser = argparse.ArgumentParser(prog='PROG', prefix_chars='-+') + >>> parser.add_argument('+f') + >>> parser.add_argument('++bar') + >>> parser.parse_args('+f X ++bar Y'.split()) + Namespace(bar='Y', f='X') + +The ``prefix_chars=`` argument defaults to ``'-'``. Supplying a set of +characters that does not include ``-`` will cause ``-f/--foo`` options to be +disallowed. + + +fromfile_prefix_chars +^^^^^^^^^^^^^^^^^^^^^ + +Sometimes, for example when dealing with a particularly long argument lists, it +may make sense to keep the list of arguments in a file rather than typing it out +at the command line. If the ``fromfile_prefix_chars=`` argument is given to the +:class:`ArgumentParser` constructor, then arguments that start with any of the +specified characters will be treated as files, and will be replaced by the +arguments they contain. For example:: + + >>> with open('args.txt', 'w') as fp: + ... fp.write('-f\nbar') + >>> parser = argparse.ArgumentParser(fromfile_prefix_chars='@') + >>> parser.add_argument('-f') + >>> parser.parse_args(['-f', 'foo', '@args.txt']) + Namespace(f='bar') + +Arguments read from a file must by default be one per line (but see also +:meth:`~ArgumentParser.convert_arg_line_to_args`) and are treated as if they +were in the same place as the original file referencing argument on the command +line. So in the example above, the expression ``['-f', 'foo', '@args.txt']`` +is considered equivalent to the expression ``['-f', 'foo', '-f', 'bar']``. + +The ``fromfile_prefix_chars=`` argument defaults to ``None``, meaning that +arguments will never be treated as file references. + + +argument_default +^^^^^^^^^^^^^^^^ + +Generally, argument defaults are specified either by passing a default to +:meth:`~ArgumentParser.add_argument` or by calling the +:meth:`~ArgumentParser.set_defaults` methods with a specific set of name-value +pairs. Sometimes however, it may be useful to specify a single parser-wide +default for arguments. This can be accomplished by passing the +``argument_default=`` keyword argument to :class:`ArgumentParser`. For example, +to globally suppress attribute creation on :meth:`~ArgumentParser.parse_args` +calls, we supply ``argument_default=SUPPRESS``:: + + >>> parser = argparse.ArgumentParser(argument_default=argparse.SUPPRESS) + >>> parser.add_argument('--foo') + >>> parser.add_argument('bar', nargs='?') + >>> parser.parse_args(['--foo', '1', 'BAR']) + Namespace(bar='BAR', foo='1') + >>> parser.parse_args([]) + Namespace() + conflict_handler ^^^^^^^^^^^^^^^^ @@ -489,22 +556,20 @@ action is retained as the ``-f`` action, because only the ``--foo`` option string was overridden. -prog -^^^^ +add_help +^^^^^^^^ -By default, :class:`ArgumentParser` objects uses ``sys.argv[0]`` to determine -how to display the name of the program in help messages. This default is almost -always desirable because it will make the help messages match how the program was -invoked on the command line. For example, consider a file named -``myprogram.py`` with the following code:: +By default, ArgumentParser objects add an option which simply displays +the parser's help message. For example, consider a file named +``myprogram.py`` containing the following code:: import argparse parser = argparse.ArgumentParser() parser.add_argument('--foo', help='foo help') args = parser.parse_args() -The help for this program will display ``myprogram.py`` as the program name -(regardless of where the program was invoked from):: +If ``-h`` or ``--help`` is supplied at the command line, the ArgumentParser +help will be printed:: $ python myprogram.py --help usage: myprogram.py [-h] [--foo FOO] @@ -512,76 +577,31 @@ The help for this program will display ``myprogram.py`` as the program name optional arguments: -h, --help show this help message and exit --foo FOO foo help - $ cd .. - $ python subdir\myprogram.py --help - usage: myprogram.py [-h] [--foo FOO] - - optional arguments: - -h, --help show this help message and exit - --foo FOO foo help - -To change this default behavior, another value can be supplied using the -``prog=`` argument to :class:`ArgumentParser`:: - - >>> parser = argparse.ArgumentParser(prog='myprogram') - >>> parser.print_help() - usage: myprogram [-h] - - optional arguments: - -h, --help show this help message and exit - -Note that the program name, whether determined from ``sys.argv[0]`` or from the -``prog=`` argument, is available to help messages using the ``%(prog)s`` format -specifier. - -:: - - >>> parser = argparse.ArgumentParser(prog='myprogram') - >>> parser.add_argument('--foo', help='foo of the %(prog)s program') - >>> parser.print_help() - usage: myprogram [-h] [--foo FOO] - - optional arguments: - -h, --help show this help message and exit - --foo FOO foo of the myprogram program - - -usage -^^^^^ -By default, :class:`ArgumentParser` calculates the usage message from the -arguments it contains:: +Occasionally, it may be useful to disable the addition of this help option. +This can be achieved by passing ``False`` as the ``add_help=`` argument to +:class:`ArgumentParser`:: - >>> parser = argparse.ArgumentParser(prog='PROG') - >>> parser.add_argument('--foo', nargs='?', help='foo help') - >>> parser.add_argument('bar', nargs='+', help='bar help') + >>> parser = argparse.ArgumentParser(prog='PROG', add_help=False) + >>> parser.add_argument('--foo', help='foo help') >>> parser.print_help() - usage: PROG [-h] [--foo [FOO]] bar [bar ...] - - positional arguments: - bar bar help + usage: PROG [--foo FOO] optional arguments: - -h, --help show this help message and exit - --foo [FOO] foo help + --foo FOO foo help -The default message can be overridden with the ``usage=`` keyword argument:: +The help option is typically ``-h/--help``. The exception to this is +if the ``prefix_chars=`` is specified and does not include ``-``, in +which case ``-h`` and ``--help`` are not valid options. In +this case, the first character in ``prefix_chars`` is used to prefix +the help options:: - >>> parser = argparse.ArgumentParser(prog='PROG', usage='%(prog)s [options]') - >>> parser.add_argument('--foo', nargs='?', help='foo help') - >>> parser.add_argument('bar', nargs='+', help='bar help') + >>> parser = argparse.ArgumentParser(prog='PROG', prefix_chars='+/') >>> parser.print_help() - usage: PROG [options] - - positional arguments: - bar bar help + usage: PROG [+h] optional arguments: - -h, --help show this help message and exit - --foo [FOO] foo help - -The ``%(prog)s`` format specifier is available to fill in the program name in -your usage messages. + +h, ++help show this help message and exit The add_argument() method @@ -1385,12 +1405,14 @@ argument:: >>> parser.parse_args(['--', '-f']) Namespace(foo='-f', one=None) +.. _prefix-matching: -Argument abbreviations -^^^^^^^^^^^^^^^^^^^^^^ +Argument abbreviations (prefix matching) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The :meth:`~ArgumentParser.parse_args` method allows long options to be -abbreviated if the abbreviation is unambiguous:: +abbreviated to a prefix, if the abbreviation is unambiguous (the prefix matches +a unique option):: >>> parser = argparse.ArgumentParser(prog='PROG') >>> parser.add_argument('-bacon') @@ -1466,7 +1488,10 @@ Other utilities Sub-commands ^^^^^^^^^^^^ -.. method:: ArgumentParser.add_subparsers() +.. method:: ArgumentParser.add_subparsers([title], [description], [prog], \ + [parser_class], [action], \ + [option_string], [dest], [help], \ + [metavar]) Many programs split up their functionality into a number of sub-commands, for example, the ``svn`` program can invoke sub-commands like ``svn @@ -1480,6 +1505,30 @@ Sub-commands command name and any :class:`ArgumentParser` constructor arguments, and returns an :class:`ArgumentParser` object that can be modified as usual. + Description of parameters: + + * title - title for the sub-parser group in help output; by default + "subcommands" if description is provided, otherwise uses title for + positional arguments + + * description - description for the sub-parser group in help output, by + default None + + * prog - usage information that will be displayed with sub-command help, + by default the name of the program and any positional arguments before the + subparser argument + + * parser_class - class which will be used to create sub-parser instances, by + default the class of the current parser (e.g. ArgumentParser) + + * dest - name of the attribute under which sub-command name will be + stored; by default None and no value is stored + + * help - help for sub-parser group in help output, by default None + + * metavar - string presenting available sub-commands in help; by default it + is None and presents sub-commands in form {cmd1, cmd2, ..} + Some example usage:: >>> # create the top-level parser @@ -1710,7 +1759,7 @@ Argument groups Mutual exclusion ^^^^^^^^^^^^^^^^ -.. method:: add_mutually_exclusive_group(required=False) +.. method:: ArgumentParser.add_mutually_exclusive_group(required=False) Create a mutually exclusive group. :mod:`argparse` will make sure that only one of the arguments in the mutually exclusive group was present on the @@ -1839,6 +1888,12 @@ the populated namespace and the list of remaining argument strings. >>> parser.parse_known_args(['--foo', '--badger', 'BAR', 'spam']) (Namespace(bar='BAR', foo=True), ['--badger', 'spam']) +.. warning:: + :ref:`Prefix matching <prefix-matching>` rules apply to + :meth:`parse_known_args`. The parser may consume an option even if it's just + a prefix of one of its known options, instead of leaving it in the remaining + arguments list. + Customizing file parsing ^^^^^^^^^^^^^^^^^^^^^^^^ diff --git a/Doc/library/array.rst b/Doc/library/array.rst index d563cce..f1ab959 100644 --- a/Doc/library/array.rst +++ b/Doc/library/array.rst @@ -14,36 +14,54 @@ them is constrained. The type is specified at object creation time by using a :dfn:`type code`, which is a single character. The following type codes are defined: -+-----------+----------------+-------------------+-----------------------+ -| Type code | C Type | Python Type | Minimum size in bytes | -+===========+================+===================+=======================+ -| ``'b'`` | signed char | int | 1 | -+-----------+----------------+-------------------+-----------------------+ -| ``'B'`` | unsigned char | int | 1 | -+-----------+----------------+-------------------+-----------------------+ -| ``'u'`` | Py_UNICODE | Unicode character | 2 (see note) | -+-----------+----------------+-------------------+-----------------------+ -| ``'h'`` | signed short | int | 2 | -+-----------+----------------+-------------------+-----------------------+ -| ``'H'`` | unsigned short | int | 2 | -+-----------+----------------+-------------------+-----------------------+ -| ``'i'`` | signed int | int | 2 | -+-----------+----------------+-------------------+-----------------------+ -| ``'I'`` | unsigned int | int | 2 | -+-----------+----------------+-------------------+-----------------------+ -| ``'l'`` | signed long | int | 4 | -+-----------+----------------+-------------------+-----------------------+ -| ``'L'`` | unsigned long | int | 4 | -+-----------+----------------+-------------------+-----------------------+ -| ``'f'`` | float | float | 4 | -+-----------+----------------+-------------------+-----------------------+ -| ``'d'`` | double | float | 8 | -+-----------+----------------+-------------------+-----------------------+ - -.. note:: - - The ``'u'`` typecode corresponds to Python's unicode character. On narrow - Unicode builds this is 2-bytes, on wide builds this is 4-bytes. ++-----------+--------------------+-------------------+-----------------------+-------+ +| Type code | C Type | Python Type | Minimum size in bytes | Notes | ++===========+====================+===================+=======================+=======+ +| ``'b'`` | signed char | int | 1 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'B'`` | unsigned char | int | 1 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'u'`` | Py_UNICODE | Unicode character | 2 | \(1) | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'h'`` | signed short | int | 2 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'H'`` | unsigned short | int | 2 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'i'`` | signed int | int | 2 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'I'`` | unsigned int | int | 2 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'l'`` | signed long | int | 4 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'L'`` | unsigned long | int | 4 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'q'`` | signed long long | int | 8 | \(2) | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'Q'`` | unsigned long long | int | 8 | \(2) | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'f'`` | float | float | 4 | | ++-----------+--------------------+-------------------+-----------------------+-------+ +| ``'d'`` | double | float | 8 | | ++-----------+--------------------+-------------------+-----------------------+-------+ + +Notes: + +(1) + The ``'u'`` type code corresponds to Python's obsolete unicode character + (:c:type:`Py_UNICODE` which is :c:type:`wchar_t`). Depending on the + platform, it can be 16 bits or 32 bits. + + ``'u'`` will be removed together with the rest of the :c:type:`Py_UNICODE` + API. + + .. deprecated-removed:: 3.3 4.0 + +(2) + The ``'q'`` and ``'Q'`` type codes are available only if + the platform C compiler used to build Python supports C :c:type:`long long`, + or, on Windows, :c:type:`__int64`. + + .. versionadded:: 3.3 The actual representation of values is determined by the machine architecture (strictly speaking, by the C implementation). The actual size can be accessed @@ -55,8 +73,8 @@ The module defines the following type: .. class:: array(typecode[, initializer]) A new array whose items are restricted by *typecode*, and initialized - from the optional *initializer* value, which must be a list, object - supporting the buffer interface, or iterable over elements of the + from the optional *initializer* value, which must be a list, a + :term:`bytes-like object`, or iterable over elements of the appropriate type. If given a list or string, the initializer is passed to the new array's @@ -73,7 +91,7 @@ Array objects support the ordinary sequence operations of indexing, slicing, concatenation, and multiplication. When using slice assignment, the assigned value must be an array object with the same type code; in all other cases, :exc:`TypeError` is raised. Array objects also implement the buffer interface, -and may be used wherever buffer objects are supported. +and may be used wherever :term:`bytes-like object`\ s are supported. The following data items and methods are also supported: @@ -253,9 +271,7 @@ Examples:: Packing and unpacking of External Data Representation (XDR) data as used in some remote procedure call systems. - `The Numerical Python Manual <http://numpy.sourceforge.net/numdoc/HTML/numdoc.htm>`_ + `The Numerical Python Documentation <http://docs.scipy.org/doc/>`_ The Numeric Python extension (NumPy) defines another array type; see - http://numpy.sourceforge.net/ for further information about Numerical Python. - (A PDF version of the NumPy manual is available at - http://numpy.sourceforge.net/numdoc/numdoc.pdf). + http://www.numpy.org/ for further information about Numerical Python. diff --git a/Doc/library/ast.rst b/Doc/library/ast.rst index e2c0b6d..daf28de 100644 --- a/Doc/library/ast.rst +++ b/Doc/library/ast.rst @@ -96,9 +96,6 @@ Node classes Abstract Grammar ---------------- -The module defines a string constant ``__version__`` which is the decimal -Subversion revision number of the file shown below. - The abstract grammar is currently defined as follows: .. literalinclude:: ../../Parser/Python.asdl @@ -247,6 +244,6 @@ and classes for traversing abstract syntax trees: Return a formatted dump of the tree in *node*. This is mainly useful for debugging purposes. The returned string will show the names and the values for fields. This makes the code impossible to evaluate, so if evaluation is - wanted *annotate_fields* must be set to False. Attributes such as line + wanted *annotate_fields* must be set to ``False``. Attributes such as line numbers and column offsets are not dumped by default. If this is wanted, *include_attributes* can be set to ``True``. diff --git a/Doc/library/asynchat.rst b/Doc/library/asynchat.rst index 75b3cda..adbd3be 100644 --- a/Doc/library/asynchat.rst +++ b/Doc/library/asynchat.rst @@ -56,8 +56,8 @@ connection requests. have only one method, :meth:`more`, which should return data to be transmitted on the channel. The producer indicates exhaustion (*i.e.* that it contains no more data) by - having its :meth:`more` method return the empty string. At this point the - :class:`async_chat` object removes the producer from the fifo and starts + having its :meth:`more` method return the empty bytes object. At this point + the :class:`async_chat` object removes the producer from the fifo and starts using the next producer, if any. When the producer fifo is empty the :meth:`handle_write` method does nothing. You use the channel object's :meth:`set_terminator` method to describe how to recognize the end of, or @@ -197,6 +197,9 @@ The :meth:`handle_request` method is called once all relevant input has been marshalled, after setting the channel terminator to ``None`` to ensure that any extraneous data sent by the web client are ignored. :: + + import asynchat + class http_request_handler(asynchat.async_chat): def __init__(self, sock, addr, sessions, log): @@ -218,7 +221,7 @@ any extraneous data sent by the web client are ignored. :: def found_terminator(self): if self.reading_headers: self.reading_headers = False - self.parse_headers("".join(self.ibuffer)) + self.parse_headers(b"".join(self.ibuffer)) self.ibuffer = [] if self.op.upper() == b"POST": clen = self.headers.getheader("content-length") diff --git a/Doc/library/asyncore.rst b/Doc/library/asyncore.rst index 619b7bb..1521e72 100644 --- a/Doc/library/asyncore.rst +++ b/Doc/library/asyncore.rst @@ -53,10 +53,10 @@ any that have been added to the map during asynchronous service) is closed. channels have been closed. All arguments are optional. The *count* parameter defaults to None, resulting in the loop terminating only when all channels have been closed. The *timeout* argument sets the timeout - parameter for the appropriate :func:`select` or :func:`poll` call, measured - in seconds; the default is 30 seconds. The *use_poll* parameter, if true, - indicates that :func:`poll` should be used in preference to :func:`select` - (the default is ``False``). + parameter for the appropriate :func:`~select.select` or :func:`~select.poll` + call, measured in seconds; the default is 30 seconds. The *use_poll* + parameter, if true, indicates that :func:`~select.poll` should be used in + preference to :func:`~select.select` (the default is ``False``). The *map* parameter is a dictionary whose items are the channels to watch. As channels are closed they are deleted from their map. If *map* is @@ -184,12 +184,15 @@ any that have been added to the map during asynchronous service) is closed. Most of these are nearly identical to their socket partners. - .. method:: create_socket(family, type) + .. method:: create_socket(family=socket.AF_INET, type=socket.SOCK_STREAM) This is identical to the creation of a normal socket, and will use the same options for creation. Refer to the :mod:`socket` documentation for information on creating sockets. + .. versionchanged:: 3.3 + *family* and *type* arguments can be omitted. + .. method:: connect(address) @@ -205,7 +208,8 @@ any that have been added to the map during asynchronous service) is closed. .. method:: recv(buffer_size) Read at most *buffer_size* bytes from the socket's remote end-point. An - empty string implies that the channel has been closed from the other end. + empty bytes object implies that the channel has been closed from the + other end. .. method:: listen(backlog) @@ -274,13 +278,13 @@ asyncore Example basic HTTP client Here is a very basic HTTP client that uses the :class:`dispatcher` class to implement its socket handling:: - import asyncore, socket + import asyncore class HTTPClient(asyncore.dispatcher): def __init__(self, host, path): asyncore.dispatcher.__init__(self) - self.create_socket(socket.AF_INET, socket.SOCK_STREAM) + self.create_socket() self.connect( (host, 80) ) self.buffer = bytes('GET %s HTTP/1.0\r\nHost: %s\r\n\r\n' % (path, host), 'ascii') @@ -314,7 +318,6 @@ Here is a basic echo server that uses the :class:`dispatcher` class to accept connections and dispatches the incoming connections to a handler:: import asyncore - import socket class EchoHandler(asyncore.dispatcher_with_send): @@ -327,7 +330,7 @@ connections and dispatches the incoming connections to a handler:: def __init__(self, host, port): asyncore.dispatcher.__init__(self) - self.create_socket(socket.AF_INET, socket.SOCK_STREAM) + self.create_socket() self.set_reuse_addr() self.bind((host, port)) self.listen(5) @@ -338,4 +341,3 @@ connections and dispatches the incoming connections to a handler:: server = EchoServer('localhost', 8080) asyncore.loop() - diff --git a/Doc/library/atexit.rst b/Doc/library/atexit.rst index 01cf379..dbdd81e 100644 --- a/Doc/library/atexit.rst +++ b/Doc/library/atexit.rst @@ -9,13 +9,14 @@ The :mod:`atexit` module defines functions to register and unregister cleanup functions. Functions thus registered are automatically executed upon normal -interpreter termination. The order in which the functions are called is not -defined; if you have cleanup operations that depend on each other, you should -wrap them in a function and register that one. This keeps :mod:`atexit` simple. +interpreter termination. :mod:`atexit` runs these functions in the *reverse* +order in which they were registered; if you register ``A``, ``B``, and ``C``, +at interpreter termination time they will be run in the order ``C``, ``B``, +``A``. -Note: the functions registered via this module are not called when the program -is killed by a signal not handled by Python, when a Python fatal internal error -is detected, or when :func:`os._exit` is called. +**Note:** The functions registered via this module are not called when the +program is killed by a signal not handled by Python, when a Python fatal +internal error is detected, or when :func:`os._exit` is called. .. function:: register(func, *args, **kargs) @@ -67,8 +68,9 @@ automatically when the program terminates without relying on the application making an explicit call into this module at termination. :: try: - _count = int(open("counter").read()) - except IOError: + with open("counterfile") as infile: + _count = int(infile.read()) + except FileNotFoundError: _count = 0 def incrcounter(n): @@ -76,7 +78,8 @@ making an explicit call into this module at termination. :: _count = _count + n def savecounter(): - open("counter", "w").write("%d" % _count) + with open("counterfile", "w") as outfile: + outfile.write("%d" % _count) import atexit atexit.register(savecounter) diff --git a/Doc/library/audioop.rst b/Doc/library/audioop.rst index d4c0c95..edb3870 100644 --- a/Doc/library/audioop.rst +++ b/Doc/library/audioop.rst @@ -241,7 +241,7 @@ to be stateless (i.e. to be able to tolerate packet loss) you should not only transmit the data but also the state. Note that you should send the *initial* state (the one you passed to :func:`lin2adpcm`) along to the decoder, not the final state (as returned by the coder). If you want to use -:func:`struct.struct` to store the state in binary you can code the first +:class:`struct.Struct` to store the state in binary you can code the first element (the predicted value) in 16 bits and the second (the delta index) in 8. The ADPCM coders have never been tried against other ADPCM coders, only against diff --git a/Doc/library/base64.rst b/Doc/library/base64.rst index c08df15..f0d11b0 100644 --- a/Doc/library/base64.rst +++ b/Doc/library/base64.rst @@ -18,9 +18,14 @@ POST request. The encoding algorithm is not the same as the There are two interfaces provided by this module. The modern interface supports encoding and decoding ASCII byte string objects using all three -alphabets. The legacy interface provides for encoding and decoding to and from -file-like objects as well as byte strings, but only using the Base64 standard -alphabet. +alphabets. Additionally, the decoding functions of the modern interface also +accept Unicode strings containing only ASCII characters. The legacy interface +provides for encoding and decoding to and from file-like objects as well as +byte strings, but only using the Base64 standard alphabet. + +.. versionchanged:: 3.3 + ASCII-only Unicode strings are now accepted by the decoding functions of + the modern interface. The modern interface provides: @@ -98,7 +103,7 @@ The modern interface provides: digit 0 is always mapped to the letter O). For security purposes the default is ``None``, so that 0 and 1 are not allowed in the input. - The decoded byte string is returned. A :exc:`TypeError` is raised if *s* were + The decoded byte string is returned. A :exc:`binascii.Error` is raised if *s* is incorrectly padded or if there are non-alphabet characters present in the string. diff --git a/Doc/library/bdb.rst b/Doc/library/bdb.rst index 0737602..7229087 100644 --- a/Doc/library/bdb.rst +++ b/Doc/library/bdb.rst @@ -194,17 +194,17 @@ The :mod:`bdb` module also defines two classes: .. method:: user_line(frame) This method is called from :meth:`dispatch_line` when either - :meth:`stop_here` or :meth:`break_here` yields True. + :meth:`stop_here` or :meth:`break_here` yields ``True``. .. method:: user_return(frame, return_value) This method is called from :meth:`dispatch_return` when :meth:`stop_here` - yields True. + yields ``True``. .. method:: user_exception(frame, exc_info) This method is called from :meth:`dispatch_exception` when - :meth:`stop_here` yields True. + :meth:`stop_here` yields ``True``. .. method:: do_clear(arg) @@ -245,7 +245,7 @@ The :mod:`bdb` module also defines two classes: .. method:: set_quit() - Set the :attr:`quitting` attribute to True. This raises :exc:`BdbQuit` in + Set the :attr:`quitting` attribute to ``True``. This raises :exc:`BdbQuit` in the next call to one of the :meth:`dispatch_\*` methods. diff --git a/Doc/library/binary.rst b/Doc/library/binary.rst new file mode 100644 index 0000000..51fbdc1 --- /dev/null +++ b/Doc/library/binary.rst @@ -0,0 +1,23 @@ +.. _binaryservices: + +******************** +Binary Data Services +******************** + +The modules described in this chapter provide some basic services operations +for manipulation of binary data. Other operations on binary data, specifically +in relation to file formats and network protocols, are described in the +relevant sections. + +Some libraries described under :ref:`textservices` also work with either +ASCII-compatible binary formats (for example, :mod:`re`) or all binary data +(for example, :mod:`difflib`). + +In addition, see the documentation for Python's built-in binary data types in +:ref:`binaryseq`. + +.. toctree:: + + struct.rst + codecs.rst + diff --git a/Doc/library/binascii.rst b/Doc/library/binascii.rst index 2aa3702..c92a8e1 100644 --- a/Doc/library/binascii.rst +++ b/Doc/library/binascii.rst @@ -20,8 +20,14 @@ higher-level modules. .. note:: - Encoding and decoding functions do not accept Unicode strings. Only bytestring - and bytearray objects can be processed. + ``a2b_*`` functions accept Unicode strings containing only ASCII characters. + Other functions only accept :term:`bytes-like object`\ s (such as + :class:`bytes`, :class:`bytearray` and other objects that support the buffer + protocol). + + .. versionchanged:: 3.3 + ASCII-only unicode strings are now accepted by the ``a2b_*`` functions. + The :mod:`binascii` module defines the following functions: @@ -139,7 +145,7 @@ The :mod:`binascii` module defines the following functions: Return the hexadecimal representation of the binary *data*. Every byte of *data* is converted into the corresponding 2-digit hex representation. The - resulting string is therefore twice as long as the length of *data*. + returned bytes object is therefore twice as long as the length of *data*. .. function:: a2b_hex(hexstr) diff --git a/Doc/library/bz2.rst b/Doc/library/bz2.rst index 93144b6..b79bccd 100644 --- a/Doc/library/bz2.rst +++ b/Doc/library/bz2.rst @@ -1,201 +1,207 @@ -:mod:`bz2` --- Compression compatible with :program:`bzip2` -=========================================================== +:mod:`bz2` --- Support for :program:`bzip2` compression +======================================================= .. module:: bz2 - :synopsis: Interface to compression and decompression routines - compatible with bzip2. + :synopsis: Interfaces for bzip2 compression and decompression. .. moduleauthor:: Gustavo Niemeyer <niemeyer@conectiva.com> +.. moduleauthor:: Nadeem Vawda <nadeem.vawda@gmail.com> .. sectionauthor:: Gustavo Niemeyer <niemeyer@conectiva.com> +.. sectionauthor:: Nadeem Vawda <nadeem.vawda@gmail.com> -This module provides a comprehensive interface for the bz2 compression library. -It implements a complete file interface, one-shot (de)compression functions, and -types for sequential (de)compression. +This module provides a comprehensive interface for compressing and +decompressing data using the bzip2 compression algorithm. -Here is a summary of the features offered by the bz2 module: +The :mod:`bz2` module contains: -* :class:`BZ2File` class implements a complete file interface, including - :meth:`~BZ2File.readline`, :meth:`~BZ2File.readlines`, - :meth:`~BZ2File.writelines`, :meth:`~BZ2File.seek`, etc; +* The :func:`.open` function and :class:`BZ2File` class for reading and + writing compressed files. +* The :class:`BZ2Compressor` and :class:`BZ2Decompressor` classes for + incremental (de)compression. +* The :func:`compress` and :func:`decompress` functions for one-shot + (de)compression. -* :class:`BZ2File` class implements emulated :meth:`~BZ2File.seek` support; +All of the classes in this module may safely be accessed from multiple threads. -* :class:`BZ2File` class implements universal newline support; -* :class:`BZ2File` class offers an optimized line iteration using a readahead - algorithm; +(De)compression of files +------------------------ -* Sequential (de)compression supported by :class:`BZ2Compressor` and - :class:`BZ2Decompressor` classes; +.. function:: open(filename, mode='r', compresslevel=9, encoding=None, errors=None, newline=None) -* One-shot (de)compression supported by :func:`compress` and :func:`decompress` - functions; + Open a bzip2-compressed file in binary or text mode, returning a :term:`file + object`. -* Thread safety uses individual locking mechanism. + As with the constructor for :class:`BZ2File`, the *filename* argument can be + an actual filename (a :class:`str` or :class:`bytes` object), or an existing + file object to read from or write to. + The *mode* argument can be any of ``'r'``, ``'rb'``, ``'w'``, ``'wb'``, + ``'a'``, or ``'ab'`` for binary mode, or ``'rt'``, ``'wt'``, or ``'at'`` for + text mode. The default is ``'rb'``. -(De)compression of files ------------------------- + The *compresslevel* argument is an integer from 1 to 9, as for the + :class:`BZ2File` constructor. -Handling of compressed files is offered by the :class:`BZ2File` class. + For binary mode, this function is equivalent to the :class:`BZ2File` + constructor: ``BZ2File(filename, mode, compresslevel=compresslevel)``. In + this case, the *encoding*, *errors* and *newline* arguments must not be + provided. + For text mode, a :class:`BZ2File` object is created, and wrapped in an + :class:`io.TextIOWrapper` instance with the specified encoding, error + handling behavior, and line ending(s). -.. index:: - single: universal newlines; bz2.BZ2File class + .. versionadded:: 3.3 -.. class:: BZ2File(filename, mode='r', buffering=0, compresslevel=9) - Open a bz2 file. Mode can be either ``'r'`` or ``'w'``, for reading (default) - or writing. When opened for writing, the file will be created if it doesn't - exist, and truncated otherwise. If *buffering* is given, ``0`` means - unbuffered, and larger numbers specify the buffer size; the default is - ``0``. If *compresslevel* is given, it must be a number between ``1`` and - ``9``; the default is ``9``. Add a ``'U'`` to mode to open the file for input - in :term:`universal newlines` mode. Any line ending in the input file will be - seen as a ``'\n'`` in Python. Also, a file so opened gains the attribute - :attr:`newlines`; the value for this attribute is one of ``None`` (no newline - read yet), ``'\r'``, ``'\n'``, ``'\r\n'`` or a tuple containing all the - newline types seen. Universal newlines are available only when - reading. Instances support iteration in the same way as normal :class:`file` - instances. +.. class:: BZ2File(filename, mode='r', buffering=None, compresslevel=9) - :class:`BZ2File` supports the :keyword:`with` statement. - - .. versionchanged:: 3.1 - Support for the :keyword:`with` statement was added. + Open a bzip2-compressed file in binary mode. + If *filename* is a :class:`str` or :class:`bytes` object, open the named file + directly. Otherwise, *filename* should be a :term:`file object`, which will + be used to read or write the compressed data. - .. note:: + The *mode* argument can be either ``'r'`` for reading (default), ``'w'`` for + overwriting, or ``'a'`` for appending. These can equivalently be given as + ``'rb'``, ``'wb'``, and ``'ab'`` respectively. - This class does not support input files containing multiple streams (such - as those produced by the :program:`pbzip2` tool). When reading such an - input file, only the first stream will be accessible. If you require - support for multi-stream files, consider using the third-party - :mod:`bz2file` module (available from - `PyPI <http://pypi.python.org/pypi/bz2file>`_). This module provides a - backport of Python 3.3's :class:`BZ2File` class, which does support - multi-stream files. + If *filename* is a file object (rather than an actual file name), a mode of + ``'w'`` does not truncate the file, and is instead equivalent to ``'a'``. + The *buffering* argument is ignored. Its use is deprecated. - .. method:: close() + If *mode* is ``'w'`` or ``'a'``, *compresslevel* can be a number between + ``1`` and ``9`` specifying the level of compression: ``1`` produces the + least compression, and ``9`` (default) produces the most compression. - Close the file. Sets data attribute :attr:`closed` to true. A closed file - cannot be used for further I/O operations. :meth:`close` may be called - more than once without error. + If *mode* is ``'r'``, the input file may be the concatenation of multiple + compressed streams. + :class:`BZ2File` provides all of the members specified by the + :class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`. + Iteration and the :keyword:`with` statement are supported. - .. method:: read([size]) + :class:`BZ2File` also provides the following method: - Read at most *size* uncompressed bytes, returned as a byte string. If the - *size* argument is negative or omitted, read until EOF is reached. + .. method:: peek([n]) + Return buffered data without advancing the file position. At least one + byte of data will be returned (unless at EOF). The exact number of bytes + returned is unspecified. - .. method:: readline([size]) + .. note:: While calling :meth:`peek` does not change the file position of + the :class:`BZ2File`, it may change the position of the underlying file + object (e.g. if the :class:`BZ2File` was constructed by passing a file + object for *filename*). - Return the next line from the file, as a byte string, retaining newline. - A non-negative *size* argument limits the maximum number of bytes to - return (an incomplete line may be returned then). Return an empty byte - string at EOF. + .. versionadded:: 3.3 + .. versionchanged:: 3.1 + Support for the :keyword:`with` statement was added. - .. method:: readlines([size]) - - Return a list of lines read. The optional *size* argument, if given, is an - approximate bound on the total number of bytes in the lines returned. + .. versionchanged:: 3.3 + The :meth:`fileno`, :meth:`readable`, :meth:`seekable`, :meth:`writable`, + :meth:`read1` and :meth:`readinto` methods were added. + .. versionchanged:: 3.3 + Support was added for *filename* being a :term:`file object` instead of an + actual filename. - .. method:: seek(offset[, whence]) + .. versionchanged:: 3.3 + The ``'a'`` (append) mode was added, along with support for reading + multi-stream files. - Move to new file position. Argument *offset* is a byte count. Optional - argument *whence* defaults to ``os.SEEK_SET`` or ``0`` (offset from start - of file; offset should be ``>= 0``); other values are ``os.SEEK_CUR`` or - ``1`` (move relative to current position; offset can be positive or - negative), and ``os.SEEK_END`` or ``2`` (move relative to end of file; - offset is usually negative, although many platforms allow seeking beyond - the end of a file). - Note that seeking of bz2 files is emulated, and depending on the - parameters the operation may be extremely slow. +Incremental (de)compression +--------------------------- +.. class:: BZ2Compressor(compresslevel=9) - .. method:: tell() + Create a new compressor object. This object may be used to compress data + incrementally. For one-shot compression, use the :func:`compress` function + instead. - Return the current file position, an integer. + *compresslevel*, if given, must be a number between ``1`` and ``9``. The + default is ``9``. + .. method:: compress(data) - .. method:: write(data) + Provide data to the compressor object. Returns a chunk of compressed data + if possible, or an empty byte string otherwise. - Write the byte string *data* to file. Note that due to buffering, - :meth:`close` may be needed before the file on disk reflects the data - written. + When you have finished providing data to the compressor, call the + :meth:`flush` method to finish the compression process. - .. method:: writelines(sequence_of_byte_strings) + .. method:: flush() - Write the sequence of byte strings to the file. Note that newlines are not - added. The sequence can be any iterable object producing byte strings. - This is equivalent to calling write() for each byte string. + Finish the compression process. Returns the compressed data left in + internal buffers. + The compressor object may not be used after this method has been called. -Sequential (de)compression --------------------------- -Sequential compression and decompression is done using the classes -:class:`BZ2Compressor` and :class:`BZ2Decompressor`. +.. class:: BZ2Decompressor() + Create a new decompressor object. This object may be used to decompress data + incrementally. For one-shot compression, use the :func:`decompress` function + instead. -.. class:: BZ2Compressor(compresslevel=9) + .. note:: + This class does not transparently handle inputs containing multiple + compressed streams, unlike :func:`decompress` and :class:`BZ2File`. If + you need to decompress a multi-stream input with :class:`BZ2Decompressor`, + you must use a new decompressor for each stream. - Create a new compressor object. This object may be used to compress data - sequentially. If you want to compress data in one shot, use the - :func:`compress` function instead. The *compresslevel* parameter, if given, - must be a number between ``1`` and ``9``; the default is ``9``. + .. method:: decompress(data) - .. method:: compress(data) + Provide data to the decompressor object. Returns a chunk of decompressed + data if possible, or an empty byte string otherwise. - Provide more data to the compressor object. It will return chunks of - compressed data whenever possible. When you've finished providing data to - compress, call the :meth:`flush` method to finish the compression process, - and return what is left in internal buffers. + Attempting to decompress data after the end of the current stream is + reached raises an :exc:`EOFError`. If any data is found after the end of + the stream, it is ignored and saved in the :attr:`unused_data` attribute. - .. method:: flush() + .. attribute:: eof - Finish the compression process and return what is left in internal - buffers. You must not use the compressor object after calling this method. + ``True`` if the end-of-stream marker has been reached. + .. versionadded:: 3.3 -.. class:: BZ2Decompressor() - Create a new decompressor object. This object may be used to decompress data - sequentially. If you want to decompress data in one shot, use the - :func:`decompress` function instead. + .. attribute:: unused_data - .. method:: decompress(data) + Data found after the end of the compressed stream. - Provide more data to the decompressor object. It will return chunks of - decompressed data whenever possible. If you try to decompress data after - the end of stream is found, :exc:`EOFError` will be raised. If any data - was found after the end of stream, it'll be ignored and saved in - :attr:`unused_data` attribute. + If this attribute is accessed before the end of the stream has been + reached, its value will be ``b''``. One-shot (de)compression ------------------------ -One-shot compression and decompression is provided through the :func:`compress` -and :func:`decompress` functions. +.. function:: compress(data, compresslevel=9) + Compress *data*. -.. function:: compress(data, compresslevel=9) + *compresslevel*, if given, must be a number between ``1`` and ``9``. The + default is ``9``. - Compress *data* in one shot. If you want to compress data sequentially, use - an instance of :class:`BZ2Compressor` instead. The *compresslevel* parameter, - if given, must be a number between ``1`` and ``9``; the default is ``9``. + For incremental compression, use a :class:`BZ2Compressor` instead. .. function:: decompress(data) - Decompress *data* in one shot. If you want to decompress data sequentially, - use an instance of :class:`BZ2Decompressor` instead. + Decompress *data*. + + If *data* is the concatenation of multiple compressed streams, decompress + all of the streams. + + For incremental decompression, use a :class:`BZ2Decompressor` instead. + + .. versionchanged:: 3.3 + Support for multi-stream inputs was added. diff --git a/Doc/library/calendar.rst b/Doc/library/calendar.rst index f495271..3187c43 100644 --- a/Doc/library/calendar.rst +++ b/Doc/library/calendar.rst @@ -272,10 +272,11 @@ For simple text calendars this module provides the following functions. .. function:: timegm(tuple) - An unrelated but handy function that takes a time tuple such as returned by the - :func:`gmtime` function in the :mod:`time` module, and returns the corresponding - Unix timestamp value, assuming an epoch of 1970, and the POSIX encoding. In - fact, :func:`time.gmtime` and :func:`timegm` are each others' inverse. + An unrelated but handy function that takes a time tuple such as returned by + the :func:`~time.gmtime` function in the :mod:`time` module, and returns the + corresponding Unix timestamp value, assuming an epoch of 1970, and the POSIX + encoding. In fact, :func:`time.gmtime` and :func:`timegm` are each others' + inverse. The :mod:`calendar` module exports the following data attributes: diff --git a/Doc/library/cgi.rst b/Doc/library/cgi.rst index 478c95a..c4e7c60 100644 --- a/Doc/library/cgi.rst +++ b/Doc/library/cgi.rst @@ -97,7 +97,7 @@ standard input, it should be instantiated only once. The :class:`FieldStorage` instance can be indexed like a Python dictionary. It allows membership testing with the :keyword:`in` operator, and also supports -the standard dictionary method :meth:`keys` and the built-in function +the standard dictionary method :meth:`~dict.keys` and the built-in function :func:`len`. Form fields containing empty strings are ignored and do not appear in the dictionary; to keep such values, provide a true value for the optional *keep_blank_values* keyword parameter when creating the :class:`FieldStorage` @@ -119,30 +119,32 @@ string:: Here the fields, accessed through ``form[key]``, are themselves instances of :class:`FieldStorage` (or :class:`MiniFieldStorage`, depending on the form -encoding). The :attr:`value` attribute of the instance yields the string value -of the field. The :meth:`getvalue` method returns this string value directly; -it also accepts an optional second argument as a default to return if the -requested key is not present. +encoding). The :attr:`~FieldStorage.value` attribute of the instance yields +the string value of the field. The :meth:`~FieldStorage.getvalue` method +returns this string value directly; it also accepts an optional second argument +as a default to return if the requested key is not present. If the submitted form data contains more than one field with the same name, the object retrieved by ``form[key]`` is not a :class:`FieldStorage` or :class:`MiniFieldStorage` instance but a list of such instances. Similarly, in this situation, ``form.getvalue(key)`` would return a list of strings. If you expect this possibility (when your HTML form contains multiple fields with the -same name), use the :func:`getlist` function, which always returns a list of -values (so that you do not need to special-case the single item case). For -example, this code concatenates any number of username fields, separated by -commas:: +same name), use the :meth:`~FieldStorage.getlist` method, which always returns +a list of values (so that you do not need to special-case the single item +case). For example, this code concatenates any number of username fields, +separated by commas:: value = form.getlist("username") usernames = ",".join(value) If a field represents an uploaded file, accessing the value via the -:attr:`value` attribute or the :func:`getvalue` method reads the entire file in -memory as bytes. This may not be what you want. You can test for an uploaded -file by testing either the :attr:`filename` attribute or the :attr:`!file` +:attr:`~FieldStorage.value` attribute or the :meth:`~FieldStorage.getvalue` +method reads the entire file in memory as bytes. This may not be what you +want. You can test for an uploaded file by testing either the +:attr:`~FieldStorage.filename` attribute or the :attr:`~FieldStorage.file` attribute. You can then read the data at leisure from the :attr:`!file` -attribute (the :func:`read` and :func:`readline` methods will return bytes):: +attribute (the :func:`~io.RawIOBase.read` and :func:`~io.IOBase.readline` +methods will return bytes):: fileitem = form["userfile"] if fileitem.file: @@ -155,8 +157,8 @@ attribute (the :func:`read` and :func:`readline` methods will return bytes):: If an error is encountered when obtaining the contents of an uploaded file (for example, when the user interrupts the form submission by clicking on -a Back or Cancel button) the :attr:`done` attribute of the object for the -field will be set to the value -1. +a Back or Cancel button) the :attr:`~FieldStorage.done` attribute of the +object for the field will be set to the value -1. The file upload draft standard entertains the possibility of uploading multiple files from one field (using a recursive :mimetype:`multipart/\*` encoding). @@ -223,8 +225,8 @@ Therefore, the appropriate way to read form data values was to always use the code which checks whether the obtained value is a single value or a list of values. That's annoying and leads to less readable scripts. -A more convenient approach is to use the methods :meth:`getfirst` and -:meth:`getlist` provided by this higher level interface. +A more convenient approach is to use the methods :meth:`~FieldStorage.getfirst` +and :meth:`~FieldStorage.getlist` provided by this higher level interface. .. method:: FieldStorage.getfirst(name, default=None) diff --git a/Doc/library/chunk.rst b/Doc/library/chunk.rst index d3558a4..50b6979 100644 --- a/Doc/library/chunk.rst +++ b/Doc/library/chunk.rst @@ -54,8 +54,9 @@ instance will fail with a :exc:`EOFError` exception. Class which represents a chunk. The *file* argument is expected to be a file-like object. An instance of this class is specifically allowed. The - only method that is needed is :meth:`read`. If the methods :meth:`seek` and - :meth:`tell` are present and don't raise an exception, they are also used. + only method that is needed is :meth:`~io.IOBase.read`. If the methods + :meth:`~io.IOBase.seek` and :meth:`~io.IOBase.tell` are present and don't + raise an exception, they are also used. If these methods are present and raise an exception, they are expected to not have altered the object. If the optional argument *align* is true, chunks are assumed to be aligned on 2-byte boundaries. If *align* is false, no @@ -84,8 +85,9 @@ instance will fail with a :exc:`EOFError` exception. Close and skip to the end of the chunk. This does not close the underlying file. - The remaining methods will raise :exc:`IOError` if called after the - :meth:`close` method has been called. + The remaining methods will raise :exc:`OSError` if called after the + :meth:`close` method has been called. Before Python 3.3, they used to + raise :exc:`IOError`, now an alias of :exc:`OSError`. .. method:: isatty() @@ -111,15 +113,15 @@ instance will fail with a :exc:`EOFError` exception. Read at most *size* bytes from the chunk (less if the read hits the end of the chunk before obtaining *size* bytes). If the *size* argument is - negative or omitted, read all data until the end of the chunk. The bytes - are returned as a string object. An empty string is returned when the end - of the chunk is encountered immediately. + negative or omitted, read all data until the end of the chunk. An empty + bytes object is returned when the end of the chunk is encountered + immediately. .. method:: skip() Skip to the end of the chunk. All further calls to :meth:`read` for the - chunk will return ``''``. If you are not interested in the contents of + chunk will return ``b''``. If you are not interested in the contents of the chunk, this method should be called so that the file points to the start of the next chunk. diff --git a/Doc/library/cmd.rst b/Doc/library/cmd.rst index 943c04a..9722928 100644 --- a/Doc/library/cmd.rst +++ b/Doc/library/cmd.rst @@ -285,8 +285,8 @@ immediate playback:: def do_playback(self, arg): 'Playback commands from a file: PLAYBACK rose.cmd' self.close() - cmds = open(arg).read().splitlines() - self.cmdqueue.extend(cmds) + with open(arg) as f: + self.cmdqueue.extend(f.read().splitlines()) def precmd(self, line): line = line.lower() if self.file and 'playback' not in line: diff --git a/Doc/library/code.rst b/Doc/library/code.rst index 56214b9..e869004 100644 --- a/Doc/library/code.rst +++ b/Doc/library/code.rst @@ -29,13 +29,13 @@ build applications which provide an interactive interpreter prompt. .. function:: interact(banner=None, readfunc=None, local=None) - Convenience function to run a read-eval-print loop. This creates a new instance - of :class:`InteractiveConsole` and sets *readfunc* to be used as the - :meth:`raw_input` method, if provided. If *local* is provided, it is passed to - the :class:`InteractiveConsole` constructor for use as the default namespace for - the interpreter loop. The :meth:`interact` method of the instance is then run - with *banner* passed as the banner to use, if provided. The console object is - discarded after use. + Convenience function to run a read-eval-print loop. This creates a new + instance of :class:`InteractiveConsole` and sets *readfunc* to be used as + the :meth:`InteractiveConsole.raw_input` method, if provided. If *local* is + provided, it is passed to the :class:`InteractiveConsole` constructor for + use as the default namespace for the interpreter loop. The :meth:`interact` + method of the instance is then run with *banner* passed as the banner to + use, if provided. The console object is discarded after use. .. function:: compile_command(source, filename="<input>", symbol="single") diff --git a/Doc/library/codecs.rst b/Doc/library/codecs.rst index 762bb98..009ae26 100644 --- a/Doc/library/codecs.rst +++ b/Doc/library/codecs.rst @@ -22,6 +22,25 @@ manages the codec and error handling lookup process. It defines the following functions: +.. function:: encode(obj, encoding='utf-8', errors='strict') + + Encodes *obj* using the codec registered for *encoding*. + + *Errors* may be given to set the desired error handling scheme. The + default error handler is ``strict`` meaning that encoding errors raise + :exc:`ValueError` (or a more codec specific subclass, such as + :exc:`UnicodeEncodeError`). Refer to :ref:`codec-base-classes` for more + information on codec error handling. + +.. function:: decode(obj, encoding='utf-8', errors='strict') + + Decodes *obj* using the codec registered for *encoding*. + + *Errors* may be given to set the desired error handling scheme. The + default error handler is ``strict`` meaning that decoding errors raise + :exc:`ValueError` (or a more codec specific subclass, such as + :exc:`UnicodeDecodeError`). Refer to :ref:`codec-base-classes` for more + information on codec error handling. .. function:: register(search_function) @@ -46,9 +65,9 @@ It defines the following functions: The various functions or classes take the following arguments: *encode* and *decode*: These must be functions or methods which have the same - interface as the :meth:`encode`/:meth:`decode` methods of Codec instances (see - Codec Interface). The functions/methods are expected to work in a stateless - mode. + interface as the :meth:`~Codec.encode`/:meth:`~Codec.decode` methods of Codec + instances (see :ref:`Codec Interface <codec-objects>`). The functions/methods + are expected to work in a stateless mode. *incrementalencoder* and *incrementaldecoder*: These have to be factory functions providing the following interface: @@ -65,7 +84,7 @@ It defines the following functions: ``factory(stream, errors='strict')`` The factory functions must return objects providing the interfaces defined by - the base classes :class:`StreamWriter` and :class:`StreamReader`, respectively. + the base classes :class:`StreamReader` and :class:`StreamWriter`, respectively. Stream codecs can maintain state. Possible values for errors are @@ -78,7 +97,11 @@ It defines the following functions: reference (for encoding only) * ``'backslashreplace'``: replace with backslashed escape sequences (for encoding only) - * ``'surrogateescape'``: replace with surrogate U+DCxx, see :pep:`383` + * ``'surrogateescape'``: on decoding, replace with code points in the Unicode + Private Use Area ranging from U+DC80 to U+DCFF. These private code + points will then be turned back into the same bytes when the + ``surrogateescape`` error handler is used when encoding the data. + (See :pep:`383` for more.) as well as any other error handling name defined via :func:`register_error`. @@ -155,13 +178,16 @@ functions which use :func:`lookup` for the codec lookup: when *name* is specified as the errors parameter. For encoding *error_handler* will be called with a :exc:`UnicodeEncodeError` - instance, which contains information about the location of the error. The error - handler must either raise this or a different exception or return a tuple with a - replacement for the unencodable part of the input and a position where encoding - should continue. The encoder will encode the replacement and continue encoding - the original input at the specified position. Negative position values will be - treated as being relative to the end of the input string. If the resulting - position is out of bound an :exc:`IndexError` will be raised. + instance, which contains information about the location of the error. The + error handler must either raise this or a different exception or return a + tuple with a replacement for the unencodable part of the input and a position + where encoding should continue. The replacement may be either :class:`str` or + :class:`bytes`. If the replacement is bytes, the encoder will simply copy + them into the output buffer. If the replacement is a string, the encoder will + encode the replacement. Encoding continues on original input at the + specified position. Negative position values will be treated as being + relative to the end of the input string. If the resulting position is out of + bound an :exc:`IndexError` will be raised. Decoding and translating works similar, except :exc:`UnicodeDecodeError` or :exc:`UnicodeTranslateError` will be passed to the handler and that the @@ -307,11 +333,13 @@ implement the file protocols. The :class:`Codec` class defines the interface for stateless encoders/decoders. -To simplify and standardize error handling, the :meth:`encode` and -:meth:`decode` methods may implement different error handling schemes by +To simplify and standardize error handling, the :meth:`~Codec.encode` and +:meth:`~Codec.decode` methods may implement different error handling schemes by providing the *errors* string argument. The following string values are defined and implemented by all standard Python codecs: +.. tabularcolumns:: |l|L| + +-------------------------+-----------------------------------------------+ | Value | Meaning | +=========================+===============================================+ @@ -400,12 +428,14 @@ interfaces of the stateless encoder and decoder: The :class:`IncrementalEncoder` and :class:`IncrementalDecoder` classes provide the basic interface for incremental encoding and decoding. Encoding/decoding the input isn't done with one call to the stateless encoder/decoder function, but -with multiple calls to the :meth:`encode`/:meth:`decode` method of the -incremental encoder/decoder. The incremental encoder/decoder keeps track of the -encoding/decoding process during method calls. - -The joined output of calls to the :meth:`encode`/:meth:`decode` method is the -same as if all the single inputs were joined into one, and this input was +with multiple calls to the +:meth:`~IncrementalEncoder.encode`/:meth:`~IncrementalDecoder.decode` method of +the incremental encoder/decoder. The incremental encoder/decoder keeps track of +the encoding/decoding process during method calls. + +The joined output of calls to the +:meth:`~IncrementalEncoder.encode`/:meth:`~IncrementalDecoder.decode` method is +the same as if all the single inputs were joined into one, and this input was encoded/decoded with the stateless encoder/decoder. @@ -458,7 +488,8 @@ define in order to be compatible with the Python codec registry. .. method:: reset() - Reset the encoder to the initial state. + Reset the encoder to the initial state. The output is discarded: call + ``.encode('', final=True)`` to reset the encoder and to get the output. .. method:: IncrementalEncoder.getstate() @@ -684,7 +715,7 @@ compatible with the Python codec registry. Read one line from the input stream and return the decoded data. *size*, if given, is passed as size argument to the stream's - :meth:`readline` method. + :meth:`read` method. If *keepends* is false line-endings will be stripped from the lines returned. @@ -786,11 +817,9 @@ methods and attributes from the underlying stream. Encodings and Unicode --------------------- -Strings are stored internally as sequences of codepoints (to be precise -as :c:type:`Py_UNICODE` arrays). Depending on the way Python is compiled (either -via ``--without-wide-unicode`` or ``--with-wide-unicode``, with the -former being the default) :c:type:`Py_UNICODE` is either a 16-bit or 32-bit data -type. Once a string object is used outside of CPU and memory, CPU endianness +Strings are stored internally as sequences of codepoints in range ``0 - 10FFFF`` +(see :pep:`393` for more details about the implementation). +Once a string object is used outside of CPU and memory, CPU endianness and how these arrays are stored as bytes become an issue. Transforming a string object into a sequence of bytes is called encoding and recreating the string object from the sequence of bytes is known as decoding. There are many @@ -901,6 +930,15 @@ is meant to be exhaustive. Notice that spelling alternatives that only differ in case or use a hyphen instead of an underscore are also valid aliases; therefore, e.g. ``'utf-8'`` is a valid alias for the ``'utf_8'`` codec. +.. impl-detail:: + + Some common encodings can bypass the codecs lookup machinery to + improve performance. These optimization opportunities are only + recognized by CPython for a limited set of aliases: utf-8, utf8, + latin-1, latin1, iso-8859-1, mbcs (Windows only), ascii, utf-16, + and utf-32. Using alternative spellings for these encodings may + result in slower execution. + Many of the character sets support the same languages. They vary in individual characters (e.g. whether the EURO SIGN is supported or not), and in the assignment of characters to code positions. For the European languages in @@ -915,6 +953,8 @@ particular, the following variants typically exist: * an IBM PC code page, which is ASCII compatible +.. tabularcolumns:: |l|p{0.3\linewidth}|p{0.3\linewidth}| + +-----------------+--------------------------------+--------------------------------+ | Codec | Aliases | Languages | +=================+================================+================================+ @@ -1003,6 +1043,11 @@ particular, the following variants typically exist: +-----------------+--------------------------------+--------------------------------+ | cp1258 | windows-1258 | Vietnamese | +-----------------+--------------------------------+--------------------------------+ +| cp65001 | | Windows only: Windows UTF-8 | +| | | (``CP_UTF8``) | +| | | | +| | | .. versionadded:: 3.3 | ++-----------------+--------------------------------+--------------------------------+ | euc_jp | eucjp, ujis, u-jis | Japanese | +-----------------+--------------------------------+--------------------------------+ | euc_jis_2004 | jisx0213, eucjis2004 | Japanese | @@ -1122,7 +1167,21 @@ particular, the following variants typically exist: | utf_8_sig | | all languages | +-----------------+--------------------------------+--------------------------------+ -.. XXX fix here, should be in above table +Python Specific Encodings +------------------------- + +A number of predefined codecs are specific to Python, so their codec names have +no meaning outside Python. These are listed in the tables below based on the +expected input and output types (note that while text encodings are the most +common use case for codecs, the underlying codec infrastructure supports +arbitrary data transforms rather than just text encodings). For asymmetric +codecs, the stated purpose describes the encoding direction. + +The following codecs provide :class:`str` to :class:`bytes` encoding and +:term:`bytes-like object` to :class:`str` decoding, similar to the Unicode text +encodings. + +.. tabularcolumns:: |l|p{0.3\linewidth}|p{0.3\linewidth}| +--------------------+---------+---------------------------+ | Codec | Aliases | Purpose | @@ -1160,45 +1219,61 @@ particular, the following variants typically exist: | unicode_internal | | Return the internal | | | | representation of the | | | | operand | +| | | | +| | | .. deprecated:: 3.3 | +--------------------+---------+---------------------------+ -The following codecs provide bytes-to-bytes mappings. - -+--------------------+---------------------------+---------------------------+ -| Codec | Aliases | Purpose | -+====================+===========================+===========================+ -| base64_codec | base64, base-64 | Convert operand to MIME | -| | | base64 | -+--------------------+---------------------------+---------------------------+ -| bz2_codec | bz2 | Compress the operand | -| | | using bz2 | -+--------------------+---------------------------+---------------------------+ -| hex_codec | hex | Convert operand to | -| | | hexadecimal | -| | | representation, with two | -| | | digits per byte | -+--------------------+---------------------------+---------------------------+ -| quopri_codec | quopri, quoted-printable, | Convert operand to MIME | -| | quotedprintable | quoted printable | -+--------------------+---------------------------+---------------------------+ -| uu_codec | uu | Convert the operand using | -| | | uuencode | -+--------------------+---------------------------+---------------------------+ -| zlib_codec | zip, zlib | Compress the operand | -| | | using gzip | -+--------------------+---------------------------+---------------------------+ - -The following codecs provide string-to-string mappings. - -+--------------------+---------------------------+---------------------------+ -| Codec | Aliases | Purpose | -+====================+===========================+===========================+ -| rot_13 | rot13 | Returns the Caesar-cypher | -| | | encryption of the operand | -+--------------------+---------------------------+---------------------------+ +The following codecs provide :term:`bytes-like object` to :class:`bytes` +mappings. + + +.. tabularcolumns:: |l|L|L| + ++----------------------+---------------------------+------------------------------+ +| Codec | Purpose | Encoder/decoder | ++======================+===========================+==============================+ +| base64_codec [#b64]_ | Convert operand to MIME | :meth:`base64.b64encode`, | +| | base64 (the result always | :meth:`base64.b64decode` | +| | includes a trailing | | +| | ``'\n'``) | | ++----------------------+---------------------------+------------------------------+ +| bz2_codec | Compress the operand | :meth:`bz2.compress`, | +| | using bz2 | :meth:`bz2.decompress` | ++----------------------+---------------------------+------------------------------+ +| hex_codec | Convert operand to | :meth:`base64.b16encode`, | +| | hexadecimal | :meth:`base64.b16decode` | +| | representation, with two | | +| | digits per byte | | ++----------------------+---------------------------+------------------------------+ +| quopri_codec | Convert operand to MIME | :meth:`quopri.encodestring`, | +| | quoted printable | :meth:`quopri.decodestring` | ++----------------------+---------------------------+------------------------------+ +| uu_codec | Convert the operand using | :meth:`uu.encode`, | +| | uuencode | :meth:`uu.decode` | ++----------------------+---------------------------+------------------------------+ +| zlib_codec | Compress the operand | :meth:`zlib.compress`, | +| | using gzip | :meth:`zlib.decompress` | ++----------------------+---------------------------+------------------------------+ + +.. [#b64] Rather than accepting any :term:`bytes-like object`, + ``'base64_codec'`` accepts only :class:`bytes` and :class:`bytearray` for + encoding and only :class:`bytes`, :class:`bytearray`, and ASCII-only + instances of :class:`str` for decoding + + +The following codecs provide :class:`str` to :class:`str` mappings. + +.. tabularcolumns:: |l|L| + ++--------------------+---------------------------+ +| Codec | Purpose | ++====================+===========================+ +| rot_13 | Returns the Caesar-cypher | +| | encryption of the operand | ++--------------------+---------------------------+ .. versionadded:: 3.2 - bytes-to-bytes and string-to-string codecs. + bytes-to-bytes and str-to-str codecs. :mod:`encodings.idna` --- Internationalized Domain Names in Applications @@ -1272,12 +1347,13 @@ functions can be used directly if desired. .. module:: encodings.mbcs :synopsis: Windows ANSI codepage -Encode operand according to the ANSI codepage (CP_ACP). This codec only -supports ``'strict'`` and ``'replace'`` error handlers to encode, and -``'strict'`` and ``'ignore'`` error handlers to decode. +Encode operand according to the ANSI codepage (CP_ACP). Availability: Windows only. +.. versionchanged:: 3.3 + Support any error handler. + .. versionchanged:: 3.2 Before 3.2, the *errors* argument was ignored; ``'replace'`` was always used to encode, and ``'ignore'`` to decode. diff --git a/Doc/library/collections.abc.rst b/Doc/library/collections.abc.rst new file mode 100644 index 0000000..4b21932 --- /dev/null +++ b/Doc/library/collections.abc.rst @@ -0,0 +1,191 @@ +:mod:`collections.abc` --- Abstract Base Classes for Containers +=============================================================== + +.. module:: collections.abc + :synopsis: Abstract base classes for containers +.. moduleauthor:: Raymond Hettinger <python at rcn.com> +.. sectionauthor:: Raymond Hettinger <python at rcn.com> + +.. versionadded:: 3.3 + Formerly, this module was part of the :mod:`collections` module. + +.. testsetup:: * + + from collections import * + import itertools + __name__ = '<doctest>' + +**Source code:** :source:`Lib/collections/abc.py` + +-------------- + +This module provides :term:`abstract base classes <abstract base class>` that +can be used to test whether a class provides a particular interface; for +example, whether it is hashable or whether it is a mapping. + + +.. _collections-abstract-base-classes: + +Collections Abstract Base Classes +--------------------------------- + +The collections module offers the following :term:`ABCs <abstract base class>`: + +.. tabularcolumns:: |l|L|L|L| + +========================= ===================== ====================== ==================================================== +ABC Inherits from Abstract Methods Mixin Methods +========================= ===================== ====================== ==================================================== +:class:`Container` ``__contains__`` +:class:`Hashable` ``__hash__`` +:class:`Iterable` ``__iter__`` +:class:`Iterator` :class:`Iterable` ``__next__`` ``__iter__`` +:class:`Sized` ``__len__`` +:class:`Callable` ``__call__`` + +:class:`Sequence` :class:`Sized`, ``__getitem__``, ``__contains__``, ``__iter__``, ``__reversed__``, + :class:`Iterable`, ``__len__`` ``index``, and ``count`` + :class:`Container` + +:class:`MutableSequence` :class:`Sequence` ``__getitem__``, Inherited :class:`Sequence` methods and + ``__setitem__``, ``append``, ``reverse``, ``extend``, ``pop``, + ``__delitem__``, ``remove``, and ``__iadd__`` + ``__len__``, + ``insert`` + +:class:`Set` :class:`Sized`, ``__contains__``, ``__le__``, ``__lt__``, ``__eq__``, ``__ne__``, + :class:`Iterable`, ``__iter__``, ``__gt__``, ``__ge__``, ``__and__``, ``__or__``, + :class:`Container` ``__len__`` ``__sub__``, ``__xor__``, and ``isdisjoint`` + +:class:`MutableSet` :class:`Set` ``__contains__``, Inherited :class:`Set` methods and + ``__iter__``, ``clear``, ``pop``, ``remove``, ``__ior__``, + ``__len__``, ``__iand__``, ``__ixor__``, and ``__isub__`` + ``add``, + ``discard`` + +:class:`Mapping` :class:`Sized`, ``__getitem__``, ``__contains__``, ``keys``, ``items``, ``values``, + :class:`Iterable`, ``__iter__``, ``get``, ``__eq__``, and ``__ne__`` + :class:`Container` ``__len__`` + +:class:`MutableMapping` :class:`Mapping` ``__getitem__``, Inherited :class:`Mapping` methods and + ``__setitem__``, ``pop``, ``popitem``, ``clear``, ``update``, + ``__delitem__``, and ``setdefault`` + ``__iter__``, + ``__len__`` + + +:class:`MappingView` :class:`Sized` ``__len__`` +:class:`ItemsView` :class:`MappingView`, ``__contains__``, + :class:`Set` ``__iter__`` +:class:`KeysView` :class:`MappingView`, ``__contains__``, + :class:`Set` ``__iter__`` +:class:`ValuesView` :class:`MappingView` ``__contains__``, ``__iter__`` +========================= ===================== ====================== ==================================================== + + +.. class:: Container + Hashable + Sized + Callable + + ABCs for classes that provide respectively the methods :meth:`__contains__`, + :meth:`__hash__`, :meth:`__len__`, and :meth:`__call__`. + +.. class:: Iterable + + ABC for classes that provide the :meth:`__iter__` method. + See also the definition of :term:`iterable`. + +.. class:: Iterator + + ABC for classes that provide the :meth:`~iterator.__iter__` and + :meth:`~iterator.__next__` methods. See also the definition of + :term:`iterator`. + +.. class:: Sequence + MutableSequence + + ABCs for read-only and mutable :term:`sequences <sequence>`. + +.. class:: Set + MutableSet + + ABCs for read-only and mutable sets. + +.. class:: Mapping + MutableMapping + + ABCs for read-only and mutable :term:`mappings <mapping>`. + +.. class:: MappingView + ItemsView + KeysView + ValuesView + + ABCs for mapping, items, keys, and values :term:`views <view>`. + + +These ABCs allow us to ask classes or instances if they provide +particular functionality, for example:: + + size = None + if isinstance(myvar, collections.abc.Sized): + size = len(myvar) + +Several of the ABCs are also useful as mixins that make it easier to develop +classes supporting container APIs. For example, to write a class supporting +the full :class:`Set` API, it only necessary to supply the three underlying +abstract methods: :meth:`__contains__`, :meth:`__iter__`, and :meth:`__len__`. +The ABC supplies the remaining methods such as :meth:`__and__` and +:meth:`isdisjoint`:: + + class ListBasedSet(collections.abc.Set): + ''' Alternate set implementation favoring space over speed + and not requiring the set elements to be hashable. ''' + def __init__(self, iterable): + self.elements = lst = [] + for value in iterable: + if value not in lst: + lst.append(value) + def __iter__(self): + return iter(self.elements) + def __contains__(self, value): + return value in self.elements + def __len__(self): + return len(self.elements) + + s1 = ListBasedSet('abcdef') + s2 = ListBasedSet('defghi') + overlap = s1 & s2 # The __and__() method is supported automatically + +Notes on using :class:`Set` and :class:`MutableSet` as a mixin: + +(1) + Since some set operations create new sets, the default mixin methods need + a way to create new instances from an iterable. The class constructor is + assumed to have a signature in the form ``ClassName(iterable)``. + That assumption is factored-out to an internal classmethod called + :meth:`_from_iterable` which calls ``cls(iterable)`` to produce a new set. + If the :class:`Set` mixin is being used in a class with a different + constructor signature, you will need to override :meth:`_from_iterable` + with a classmethod that can construct new instances from + an iterable argument. + +(2) + To override the comparisons (presumably for speed, as the + semantics are fixed), redefine :meth:`__le__` and + then the other operations will automatically follow suit. + +(3) + The :class:`Set` mixin provides a :meth:`_hash` method to compute a hash value + for the set; however, :meth:`__hash__` is not defined because not all sets + are hashable or immutable. To add set hashabilty using mixins, + inherit from both :meth:`Set` and :meth:`Hashable`, then define + ``__hash__ = Set._hash``. + +.. seealso:: + + * `OrderedSet recipe <http://code.activestate.com/recipes/576694/>`_ for an + example built on :class:`MutableSet`. + + * For more about ABCs, see the :mod:`abc` module and :pep:`3119`. diff --git a/Doc/library/collections.rst b/Doc/library/collections.rst index 7a2802d..4def1dc 100644 --- a/Doc/library/collections.rst +++ b/Doc/library/collections.rst @@ -2,17 +2,17 @@ ========================================== .. module:: collections - :synopsis: Container datatypes + :synopsis: Container datatypes .. moduleauthor:: Raymond Hettinger <python@rcn.com> .. sectionauthor:: Raymond Hettinger <python@rcn.com> .. testsetup:: * - from collections import * - import itertools - __name__ = '<doctest>' + from collections import * + import itertools + __name__ = '<doctest>' -**Source code:** :source:`Lib/collections.py` and :source:`Lib/_abcoll.py` +**Source code:** :source:`Lib/collections/__init__.py` -------------- @@ -23,6 +23,7 @@ Python's general purpose built-in containers, :class:`dict`, :class:`list`, ===================== ==================================================================== :func:`namedtuple` factory function for creating tuple subclasses with named fields :class:`deque` list-like container with fast appends and pops on either end +:class:`ChainMap` dict-like class for creating a single view of multiple mappings :class:`Counter` dict subclass for counting hashable objects :class:`OrderedDict` dict subclass that remembers the order entries were added :class:`defaultdict` dict subclass that calls a factory function to supply missing values @@ -31,10 +32,167 @@ Python's general purpose built-in containers, :class:`dict`, :class:`list`, :class:`UserString` wrapper around string objects for easier string subclassing ===================== ==================================================================== -In addition to the concrete container classes, the collections module provides -:ref:`abstract base classes <collections-abstract-base-classes>` that can be -used to test whether a class provides a particular interface, for example, -whether it is hashable or a mapping. +.. versionchanged:: 3.3 + Moved :ref:`collections-abstract-base-classes` to the :mod:`collections.abc` module. + For backwards compatibility, they continue to be visible in this module + as well. + + +:class:`ChainMap` objects +------------------------- + +.. versionadded:: 3.3 + +A :class:`ChainMap` class is provided for quickly linking a number of mappings +so they can be treated as a single unit. It is often much faster than creating +a new dictionary and running multiple :meth:`~dict.update` calls. + +The class can be used to simulate nested scopes and is useful in templating. + +.. class:: ChainMap(*maps) + + A :class:`ChainMap` groups multiple dicts or other mappings together to + create a single, updateable view. If no *maps* are specified, a single empty + dictionary is provided so that a new chain always has at least one mapping. + + The underlying mappings are stored in a list. That list is public and can + accessed or updated using the *maps* attribute. There is no other state. + + Lookups search the underlying mappings successively until a key is found. In + contrast, writes, updates, and deletions only operate on the first mapping. + + A :class:`ChainMap` incorporates the underlying mappings by reference. So, if + one of the underlying mappings gets updated, those changes will be reflected + in :class:`ChainMap`. + + All of the usual dictionary methods are supported. In addition, there is a + *maps* attribute, a method for creating new subcontexts, and a property for + accessing all but the first mapping: + + .. attribute:: maps + + A user updateable list of mappings. The list is ordered from + first-searched to last-searched. It is the only stored state and can + be modified to change which mappings are searched. The list should + always contain at least one mapping. + + .. method:: new_child() + + Returns a new :class:`ChainMap` containing a new :class:`dict` followed by + all of the maps in the current instance. A call to ``d.new_child()`` is + equivalent to: ``ChainMap({}, *d.maps)``. This method is used for + creating subcontexts that can be updated without altering values in any + of the parent mappings. + + .. attribute:: parents + + Property returning a new :class:`ChainMap` containing all of the maps in + the current instance except the first one. This is useful for skipping + the first map in the search. Use cases are similar to those for the + :keyword:`nonlocal` keyword used in :term:`nested scopes <nested + scope>`. The use cases also parallel those for the built-in + :func:`super` function. A reference to ``d.parents`` is equivalent to: + ``ChainMap(*d.maps[1:])``. + + +.. seealso:: + + * The `MultiContext class + <https://github.com/enthought/codetools/blob/4.0.0/codetools/contexts/multi_context.py>`_ + in the Enthought `CodeTools package + <https://github.com/enthought/codetools>`_ has options to support + writing to any mapping in the chain. + + * Django's `Context class + <http://code.djangoproject.com/browser/django/trunk/django/template/context.py>`_ + for templating is a read-only chain of mappings. It also features + pushing and popping of contexts similar to the + :meth:`~collections.ChainMap.new_child` method and the + :meth:`~collections.ChainMap.parents` property. + + * The `Nested Contexts recipe + <http://code.activestate.com/recipes/577434/>`_ has options to control + whether writes and other mutations apply only to the first mapping or to + any mapping in the chain. + + * A `greatly simplified read-only version of Chainmap + <http://code.activestate.com/recipes/305268/>`_. + + +:class:`ChainMap` Examples and Recipes +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +This section shows various approaches to working with chained maps. + + +Example of simulating Python's internal lookup chain:: + + import builtins + pylookup = ChainMap(locals(), globals(), vars(builtins)) + +Example of letting user specified command-line arguments take precedence over +environment variables which in turn take precedence over default values:: + + import os, argparse + + defaults = {'color': 'red', 'user': 'guest'} + + parser = argparse.ArgumentParser() + parser.add_argument('-u', '--user') + parser.add_argument('-c', '--color') + namespace = parser.parse_args() + command_line_args = {k:v for k, v in vars(namespace).items() if v} + + combined = ChainMap(command_line_args, os.environ, defaults) + print(combined['color']) + print(combined['user']) + +Example patterns for using the :class:`ChainMap` class to simulate nested +contexts:: + + c = ChainMap() # Create root context + d = c.new_child() # Create nested child context + e = c.new_child() # Child of c, independent from d + e.maps[0] # Current context dictionary -- like Python's locals() + e.maps[-1] # Root context -- like Python's globals() + e.parents # Enclosing context chain -- like Python's nonlocals + + d['x'] # Get first key in the chain of contexts + d['x'] = 1 # Set value in current context + del d['x'] # Delete from current context + list(d) # All nested values + k in d # Check all nested values + len(d) # Number of nested values + d.items() # All nested items + dict(d) # Flatten into a regular dictionary + +The :class:`ChainMap` class only makes updates (writes and deletions) to the +first mapping in the chain while lookups will search the full chain. However, +if deep writes and deletions are desired, it is easy to make a subclass that +updates keys found deeper in the chain:: + + class DeepChainMap(ChainMap): + 'Variant of ChainMap that allows direct updates to inner scopes' + + def __setitem__(self, key, value): + for mapping in self.maps: + if key in mapping: + mapping[key] = value + return + self.maps[0][key] = value + + def __delitem__(self, key): + for mapping in self.maps: + if key in mapping: + del mapping[key] + return + raise KeyError(key) + + >>> d = DeepChainMap({'zebra': 'black'}, {'elephant': 'blue'}, {'lion': 'yellow'}) + >>> d['lion'] = 'orange' # update an existing key two levels down + >>> d['snake'] = 'red' # new keys get added to the topmost dict + >>> del d['elephant'] # remove an existing key one level down + DeepChainMap({'zebra': 'black', 'snake': 'red'}, {}, {'lion': 'orange'}) :class:`Counter` objects @@ -52,71 +210,71 @@ For example:: >>> # Find the ten most common words in Hamlet >>> import re - >>> words = re.findall('\w+', open('hamlet.txt').read().lower()) + >>> words = re.findall(r'\w+', open('hamlet.txt').read().lower()) >>> Counter(words).most_common(10) [('the', 1143), ('and', 966), ('to', 762), ('of', 669), ('i', 631), ('you', 554), ('a', 546), ('my', 514), ('hamlet', 471), ('in', 451)] .. class:: Counter([iterable-or-mapping]) - A :class:`Counter` is a :class:`dict` subclass for counting hashable objects. - It is an unordered collection where elements are stored as dictionary keys - and their counts are stored as dictionary values. Counts are allowed to be - any integer value including zero or negative counts. The :class:`Counter` - class is similar to bags or multisets in other languages. + A :class:`Counter` is a :class:`dict` subclass for counting hashable objects. + It is an unordered collection where elements are stored as dictionary keys + and their counts are stored as dictionary values. Counts are allowed to be + any integer value including zero or negative counts. The :class:`Counter` + class is similar to bags or multisets in other languages. - Elements are counted from an *iterable* or initialized from another - *mapping* (or counter): + Elements are counted from an *iterable* or initialized from another + *mapping* (or counter): >>> c = Counter() # a new, empty counter >>> c = Counter('gallahad') # a new counter from an iterable >>> c = Counter({'red': 4, 'blue': 2}) # a new counter from a mapping >>> c = Counter(cats=4, dogs=8) # a new counter from keyword args - Counter objects have a dictionary interface except that they return a zero - count for missing items instead of raising a :exc:`KeyError`: + Counter objects have a dictionary interface except that they return a zero + count for missing items instead of raising a :exc:`KeyError`: >>> c = Counter(['eggs', 'ham']) >>> c['bacon'] # count of a missing element is zero 0 - Setting a count to zero does not remove an element from a counter. - Use ``del`` to remove it entirely: + Setting a count to zero does not remove an element from a counter. + Use ``del`` to remove it entirely: >>> c['sausage'] = 0 # counter entry with a zero count >>> del c['sausage'] # del actually removes the entry - .. versionadded:: 3.1 + .. versionadded:: 3.1 - Counter objects support three methods beyond those available for all - dictionaries: + Counter objects support three methods beyond those available for all + dictionaries: - .. method:: elements() + .. method:: elements() - Return an iterator over elements repeating each as many times as its - count. Elements are returned in arbitrary order. If an element's count - is less than one, :meth:`elements` will ignore it. + Return an iterator over elements repeating each as many times as its + count. Elements are returned in arbitrary order. If an element's count + is less than one, :meth:`elements` will ignore it. >>> c = Counter(a=4, b=2, c=0, d=-2) >>> list(c.elements()) ['a', 'a', 'a', 'a', 'b', 'b'] - .. method:: most_common([n]) + .. method:: most_common([n]) - Return a list of the *n* most common elements and their counts from the - most common to the least. If *n* is not specified, :func:`most_common` - returns *all* elements in the counter. Elements with equal counts are - ordered arbitrarily: + Return a list of the *n* most common elements and their counts from the + most common to the least. If *n* is not specified, :func:`most_common` + returns *all* elements in the counter. Elements with equal counts are + ordered arbitrarily: >>> Counter('abracadabra').most_common(3) [('a', 5), ('r', 2), ('b', 2)] - .. method:: subtract([iterable-or-mapping]) + .. method:: subtract([iterable-or-mapping]) - Elements are subtracted from an *iterable* or from another *mapping* - (or counter). Like :meth:`dict.update` but subtracts counts instead - of replacing them. Both inputs and outputs may be zero or negative. + Elements are subtracted from an *iterable* or from another *mapping* + (or counter). Like :meth:`dict.update` but subtracts counts instead + of replacing them. Both inputs and outputs may be zero or negative. >>> c = Counter(a=4, b=2, c=0, d=-2) >>> d = Counter(a=1, b=2, c=3, d=4) @@ -124,21 +282,21 @@ For example:: >>> c Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6}) - .. versionadded:: 3.2 + .. versionadded:: 3.2 - The usual dictionary methods are available for :class:`Counter` objects - except for two which work differently for counters. + The usual dictionary methods are available for :class:`Counter` objects + except for two which work differently for counters. - .. method:: fromkeys(iterable) + .. method:: fromkeys(iterable) - This class method is not implemented for :class:`Counter` objects. + This class method is not implemented for :class:`Counter` objects. - .. method:: update([iterable-or-mapping]) + .. method:: update([iterable-or-mapping]) - Elements are counted from an *iterable* or added-in from another - *mapping* (or counter). Like :meth:`dict.update` but adds counts - instead of replacing them. Also, the *iterable* is expected to be a - sequence of elements, not a sequence of ``(key, value)`` pairs. + Elements are counted from an *iterable* or added-in from another + *mapping* (or counter). Like :meth:`dict.update` but adds counts + instead of replacing them. Also, the *iterable* is expected to be a + sequence of elements, not a sequence of ``(key, value)`` pairs. Common patterns for working with :class:`Counter` objects:: @@ -149,8 +307,8 @@ Common patterns for working with :class:`Counter` objects:: dict(c) # convert to a regular dictionary c.items() # convert to a list of (elem, cnt) pairs Counter(dict(list_of_pairs)) # convert from a list of (elem, cnt) pairs - c.most_common()[:-n:-1] # n least common elements - c += Counter() # remove zero and negative counts + c.most_common()[:-n-1:-1] # n least common elements + +c # remove zero and negative counts Several mathematical operations are provided for combining :class:`Counter` objects to produce multisets (counters that have counts greater than zero). @@ -170,32 +328,44 @@ counts, but the output will exclude results with counts of zero or less. >>> c | d # union: max(c[x], d[x]) Counter({'a': 3, 'b': 2}) +Unary addition and substraction are shortcuts for adding an empty counter +or subtracting from an empty counter. + + >>> c = Counter(a=2, b=-4) + >>> +c + Counter({'a': 2}) + >>> -c + Counter({'b': 4}) + +.. versionadded:: 3.3 + Added support for unary plus, unary minus, and in-place multiset operations. + .. note:: - Counters were primarily designed to work with positive integers to represent - running counts; however, care was taken to not unnecessarily preclude use - cases needing other types or negative values. To help with those use cases, - this section documents the minimum range and type restrictions. + Counters were primarily designed to work with positive integers to represent + running counts; however, care was taken to not unnecessarily preclude use + cases needing other types or negative values. To help with those use cases, + this section documents the minimum range and type restrictions. - * The :class:`Counter` class itself is a dictionary subclass with no - restrictions on its keys and values. The values are intended to be numbers - representing counts, but you *could* store anything in the value field. + * The :class:`Counter` class itself is a dictionary subclass with no + restrictions on its keys and values. The values are intended to be numbers + representing counts, but you *could* store anything in the value field. - * The :meth:`most_common` method requires only that the values be orderable. + * The :meth:`most_common` method requires only that the values be orderable. - * For in-place operations such as ``c[key] += 1``, the value type need only - support addition and subtraction. So fractions, floats, and decimals would - work and negative values are supported. The same is also true for - :meth:`update` and :meth:`subtract` which allow negative and zero values - for both inputs and outputs. + * For in-place operations such as ``c[key] += 1``, the value type need only + support addition and subtraction. So fractions, floats, and decimals would + work and negative values are supported. The same is also true for + :meth:`update` and :meth:`subtract` which allow negative and zero values + for both inputs and outputs. - * The multiset methods are designed only for use cases with positive values. - The inputs may be negative or zero, but only outputs with positive values - are created. There are no type restrictions, but the value type needs to - support addition, subtraction, and comparison. + * The multiset methods are designed only for use cases with positive values. + The inputs may be negative or zero, but only outputs with positive values + are created. There are no type restrictions, but the value type needs to + support addition, subtraction, and comparison. - * The :meth:`elements` method requires integer counts. It ignores zero and - negative counts. + * The :meth:`elements` method requires integer counts. It ignores zero and + negative counts. .. seealso:: @@ -218,7 +388,7 @@ counts, but the output will exclude results with counts of zero or less. * To enumerate all distinct multisets of a given size over a given set of elements, see :func:`itertools.combinations_with_replacement`. - map(Counter, combinations_with_replacement('ABC', 2)) --> AA AB AC BB BC CC + map(Counter, combinations_with_replacement('ABC', 2)) --> AA AB AC BB BC CC :class:`deque` objects @@ -226,105 +396,105 @@ counts, but the output will exclude results with counts of zero or less. .. class:: deque([iterable, [maxlen]]) - Returns a new deque object initialized left-to-right (using :meth:`append`) with - data from *iterable*. If *iterable* is not specified, the new deque is empty. + Returns a new deque object initialized left-to-right (using :meth:`append`) with + data from *iterable*. If *iterable* is not specified, the new deque is empty. - Deques are a generalization of stacks and queues (the name is pronounced "deck" - and is short for "double-ended queue"). Deques support thread-safe, memory - efficient appends and pops from either side of the deque with approximately the - same O(1) performance in either direction. + Deques are a generalization of stacks and queues (the name is pronounced "deck" + and is short for "double-ended queue"). Deques support thread-safe, memory + efficient appends and pops from either side of the deque with approximately the + same O(1) performance in either direction. - Though :class:`list` objects support similar operations, they are optimized for - fast fixed-length operations and incur O(n) memory movement costs for - ``pop(0)`` and ``insert(0, v)`` operations which change both the size and - position of the underlying data representation. + Though :class:`list` objects support similar operations, they are optimized for + fast fixed-length operations and incur O(n) memory movement costs for + ``pop(0)`` and ``insert(0, v)`` operations which change both the size and + position of the underlying data representation. - If *maxlen* is not specified or is *None*, deques may grow to an - arbitrary length. Otherwise, the deque is bounded to the specified maximum - length. Once a bounded length deque is full, when new items are added, a - corresponding number of items are discarded from the opposite end. Bounded - length deques provide functionality similar to the ``tail`` filter in - Unix. They are also useful for tracking transactions and other pools of data - where only the most recent activity is of interest. + If *maxlen* is not specified or is *None*, deques may grow to an + arbitrary length. Otherwise, the deque is bounded to the specified maximum + length. Once a bounded length deque is full, when new items are added, a + corresponding number of items are discarded from the opposite end. Bounded + length deques provide functionality similar to the ``tail`` filter in + Unix. They are also useful for tracking transactions and other pools of data + where only the most recent activity is of interest. - Deque objects support the following methods: + Deque objects support the following methods: - .. method:: append(x) + .. method:: append(x) - Add *x* to the right side of the deque. + Add *x* to the right side of the deque. - .. method:: appendleft(x) + .. method:: appendleft(x) - Add *x* to the left side of the deque. + Add *x* to the left side of the deque. - .. method:: clear() + .. method:: clear() - Remove all elements from the deque leaving it with length 0. + Remove all elements from the deque leaving it with length 0. - .. method:: count(x) + .. method:: count(x) - Count the number of deque elements equal to *x*. + Count the number of deque elements equal to *x*. - .. versionadded:: 3.2 + .. versionadded:: 3.2 - .. method:: extend(iterable) + .. method:: extend(iterable) - Extend the right side of the deque by appending elements from the iterable - argument. + Extend the right side of the deque by appending elements from the iterable + argument. - .. method:: extendleft(iterable) + .. method:: extendleft(iterable) - Extend the left side of the deque by appending elements from *iterable*. - Note, the series of left appends results in reversing the order of - elements in the iterable argument. + Extend the left side of the deque by appending elements from *iterable*. + Note, the series of left appends results in reversing the order of + elements in the iterable argument. - .. method:: pop() + .. method:: pop() - Remove and return an element from the right side of the deque. If no - elements are present, raises an :exc:`IndexError`. + Remove and return an element from the right side of the deque. If no + elements are present, raises an :exc:`IndexError`. - .. method:: popleft() + .. method:: popleft() - Remove and return an element from the left side of the deque. If no - elements are present, raises an :exc:`IndexError`. + Remove and return an element from the left side of the deque. If no + elements are present, raises an :exc:`IndexError`. - .. method:: remove(value) + .. method:: remove(value) - Removed the first occurrence of *value*. If not found, raises a - :exc:`ValueError`. + Removed the first occurrence of *value*. If not found, raises a + :exc:`ValueError`. - .. method:: reverse() + .. method:: reverse() - Reverse the elements of the deque in-place and then return ``None``. + Reverse the elements of the deque in-place and then return ``None``. - .. versionadded:: 3.2 + .. versionadded:: 3.2 - .. method:: rotate(n) + .. method:: rotate(n) - Rotate the deque *n* steps to the right. If *n* is negative, rotate to - the left. Rotating one step to the right is equivalent to: - ``d.appendleft(d.pop())``. + Rotate the deque *n* steps to the right. If *n* is negative, rotate to + the left. Rotating one step to the right is equivalent to: + ``d.appendleft(d.pop())``. - Deque objects also provide one read-only attribute: + Deque objects also provide one read-only attribute: - .. attribute:: maxlen + .. attribute:: maxlen - Maximum size of a deque or *None* if unbounded. + Maximum size of a deque or *None* if unbounded. - .. versionadded:: 3.1 + .. versionadded:: 3.1 In addition to the above, deques support iteration, pickling, ``len(d)``, @@ -337,56 +507,56 @@ Example: .. doctest:: - >>> from collections import deque - >>> d = deque('ghi') # make a new deque with three items - >>> for elem in d: # iterate over the deque's elements - ... print(elem.upper()) - G - H - I - - >>> d.append('j') # add a new entry to the right side - >>> d.appendleft('f') # add a new entry to the left side - >>> d # show the representation of the deque - deque(['f', 'g', 'h', 'i', 'j']) - - >>> d.pop() # return and remove the rightmost item - 'j' - >>> d.popleft() # return and remove the leftmost item - 'f' - >>> list(d) # list the contents of the deque - ['g', 'h', 'i'] - >>> d[0] # peek at leftmost item - 'g' - >>> d[-1] # peek at rightmost item - 'i' - - >>> list(reversed(d)) # list the contents of a deque in reverse - ['i', 'h', 'g'] - >>> 'h' in d # search the deque - True - >>> d.extend('jkl') # add multiple elements at once - >>> d - deque(['g', 'h', 'i', 'j', 'k', 'l']) - >>> d.rotate(1) # right rotation - >>> d - deque(['l', 'g', 'h', 'i', 'j', 'k']) - >>> d.rotate(-1) # left rotation - >>> d - deque(['g', 'h', 'i', 'j', 'k', 'l']) - - >>> deque(reversed(d)) # make a new deque in reverse order - deque(['l', 'k', 'j', 'i', 'h', 'g']) - >>> d.clear() # empty the deque - >>> d.pop() # cannot pop from an empty deque - Traceback (most recent call last): - File "<pyshell#6>", line 1, in -toplevel- - d.pop() - IndexError: pop from an empty deque - - >>> d.extendleft('abc') # extendleft() reverses the input order - >>> d - deque(['c', 'b', 'a']) + >>> from collections import deque + >>> d = deque('ghi') # make a new deque with three items + >>> for elem in d: # iterate over the deque's elements + ... print(elem.upper()) + G + H + I + + >>> d.append('j') # add a new entry to the right side + >>> d.appendleft('f') # add a new entry to the left side + >>> d # show the representation of the deque + deque(['f', 'g', 'h', 'i', 'j']) + + >>> d.pop() # return and remove the rightmost item + 'j' + >>> d.popleft() # return and remove the leftmost item + 'f' + >>> list(d) # list the contents of the deque + ['g', 'h', 'i'] + >>> d[0] # peek at leftmost item + 'g' + >>> d[-1] # peek at rightmost item + 'i' + + >>> list(reversed(d)) # list the contents of a deque in reverse + ['i', 'h', 'g'] + >>> 'h' in d # search the deque + True + >>> d.extend('jkl') # add multiple elements at once + >>> d + deque(['g', 'h', 'i', 'j', 'k', 'l']) + >>> d.rotate(1) # right rotation + >>> d + deque(['l', 'g', 'h', 'i', 'j', 'k']) + >>> d.rotate(-1) # left rotation + >>> d + deque(['g', 'h', 'i', 'j', 'k', 'l']) + + >>> deque(reversed(d)) # make a new deque in reverse order + deque(['l', 'k', 'j', 'i', 'h', 'g']) + >>> d.clear() # empty the deque + >>> d.pop() # cannot pop from an empty deque + Traceback (most recent call last): + File "<pyshell#6>", line 1, in -toplevel- + d.pop() + IndexError: pop from an empty deque + + >>> d.extendleft('abc') # extendleft() reverses the input order + >>> d + deque(['c', 'b', 'a']) :class:`deque` Recipes @@ -397,9 +567,10 @@ This section shows various approaches to working with deques. Bounded length deques provide functionality similar to the ``tail`` filter in Unix:: - def tail(filename, n=10): - 'Return the last n lines of a file' - return deque(open(filename), n) + def tail(filename, n=10): + 'Return the last n lines of a file' + with open(filename) as f: + return deque(f, n) Another approach to using deques is to maintain a sequence of recently added elements by appending to the right and popping to the left:: @@ -420,10 +591,10 @@ The :meth:`rotate` method provides a way to implement :class:`deque` slicing and deletion. For example, a pure Python implementation of ``del d[n]`` relies on the :meth:`rotate` method to position elements to be popped:: - def delete_nth(d, n): - d.rotate(-n) - d.popleft() - d.rotate(n) + def delete_nth(d, n): + d.rotate(-n) + d.popleft() + d.rotate(n) To implement :class:`deque` slicing, use a similar approach applying :meth:`rotate` to bring a target element to the left side of the deque. Remove @@ -439,50 +610,50 @@ stack manipulations such as ``dup``, ``drop``, ``swap``, ``over``, ``pick``, .. class:: defaultdict([default_factory[, ...]]) - Returns a new dictionary-like object. :class:`defaultdict` is a subclass of the - built-in :class:`dict` class. It overrides one method and adds one writable - instance variable. The remaining functionality is the same as for the - :class:`dict` class and is not documented here. + Returns a new dictionary-like object. :class:`defaultdict` is a subclass of the + built-in :class:`dict` class. It overrides one method and adds one writable + instance variable. The remaining functionality is the same as for the + :class:`dict` class and is not documented here. - The first argument provides the initial value for the :attr:`default_factory` - attribute; it defaults to ``None``. All remaining arguments are treated the same - as if they were passed to the :class:`dict` constructor, including keyword - arguments. + The first argument provides the initial value for the :attr:`default_factory` + attribute; it defaults to ``None``. All remaining arguments are treated the same + as if they were passed to the :class:`dict` constructor, including keyword + arguments. - :class:`defaultdict` objects support the following method in addition to the - standard :class:`dict` operations: + :class:`defaultdict` objects support the following method in addition to the + standard :class:`dict` operations: - .. method:: __missing__(key) + .. method:: __missing__(key) - If the :attr:`default_factory` attribute is ``None``, this raises a - :exc:`KeyError` exception with the *key* as argument. + If the :attr:`default_factory` attribute is ``None``, this raises a + :exc:`KeyError` exception with the *key* as argument. - If :attr:`default_factory` is not ``None``, it is called without arguments - to provide a default value for the given *key*, this value is inserted in - the dictionary for the *key*, and returned. + If :attr:`default_factory` is not ``None``, it is called without arguments + to provide a default value for the given *key*, this value is inserted in + the dictionary for the *key*, and returned. - If calling :attr:`default_factory` raises an exception this exception is - propagated unchanged. + If calling :attr:`default_factory` raises an exception this exception is + propagated unchanged. - This method is called by the :meth:`__getitem__` method of the - :class:`dict` class when the requested key is not found; whatever it - returns or raises is then returned or raised by :meth:`__getitem__`. + This method is called by the :meth:`__getitem__` method of the + :class:`dict` class when the requested key is not found; whatever it + returns or raises is then returned or raised by :meth:`__getitem__`. - Note that :meth:`__missing__` is *not* called for any operations besides - :meth:`__getitem__`. This means that :meth:`get` will, like normal - dictionaries, return ``None`` as a default rather than using - :attr:`default_factory`. + Note that :meth:`__missing__` is *not* called for any operations besides + :meth:`__getitem__`. This means that :meth:`get` will, like normal + dictionaries, return ``None`` as a default rather than using + :attr:`default_factory`. - :class:`defaultdict` objects support the following instance variable: + :class:`defaultdict` objects support the following instance variable: - .. attribute:: default_factory + .. attribute:: default_factory - This attribute is used by the :meth:`__missing__` method; it is - initialized from the first argument to the constructor, if present, or to - ``None``, if absent. + This attribute is used by the :meth:`__missing__` method; it is + initialized from the first argument to the constructor, if present, or to + ``None``, if absent. :class:`defaultdict` Examples @@ -491,13 +662,13 @@ stack manipulations such as ``dup``, ``drop``, ``swap``, ``over``, ``pick``, Using :class:`list` as the :attr:`default_factory`, it is easy to group a sequence of key-value pairs into a dictionary of lists: - >>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)] - >>> d = defaultdict(list) - >>> for k, v in s: - ... d[k].append(v) - ... - >>> list(d.items()) - [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])] + >>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)] + >>> d = defaultdict(list) + >>> for k, v in s: + ... d[k].append(v) + ... + >>> list(d.items()) + [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])] When each key is encountered for the first time, it is not already in the mapping; so an entry is automatically created using the :attr:`default_factory` @@ -507,24 +678,24 @@ again, the look-up proceeds normally (returning the list for that key) and the :meth:`list.append` operation adds another value to the list. This technique is simpler and faster than an equivalent technique using :meth:`dict.setdefault`: - >>> d = {} - >>> for k, v in s: - ... d.setdefault(k, []).append(v) - ... - >>> list(d.items()) - [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])] + >>> d = {} + >>> for k, v in s: + ... d.setdefault(k, []).append(v) + ... + >>> list(d.items()) + [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])] Setting the :attr:`default_factory` to :class:`int` makes the :class:`defaultdict` useful for counting (like a bag or multiset in other languages): - >>> s = 'mississippi' - >>> d = defaultdict(int) - >>> for k in s: - ... d[k] += 1 - ... - >>> list(d.items()) - [('i', 4), ('p', 2), ('s', 4), ('m', 1)] + >>> s = 'mississippi' + >>> d = defaultdict(int) + >>> for k in s: + ... d[k] += 1 + ... + >>> list(d.items()) + [('i', 4), ('p', 2), ('s', 4), ('m', 1)] When a letter is first encountered, it is missing from the mapping, so the :attr:`default_factory` function calls :func:`int` to supply a default count of @@ -535,23 +706,23 @@ constant functions. A faster and more flexible way to create constant functions is to use a lambda function which can supply any constant value (not just zero): - >>> def constant_factory(value): - ... return lambda: value - >>> d = defaultdict(constant_factory('<missing>')) - >>> d.update(name='John', action='ran') - >>> '%(name)s %(action)s to %(object)s' % d - 'John ran to <missing>' + >>> def constant_factory(value): + ... return lambda: value + >>> d = defaultdict(constant_factory('<missing>')) + >>> d.update(name='John', action='ran') + >>> '%(name)s %(action)s to %(object)s' % d + 'John ran to <missing>' Setting the :attr:`default_factory` to :class:`set` makes the :class:`defaultdict` useful for building a dictionary of sets: - >>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)] - >>> d = defaultdict(set) - >>> for k, v in s: - ... d[k].add(v) - ... - >>> list(d.items()) - [('blue', set([2, 4])), ('red', set([1, 3]))] + >>> s = [('red', 1), ('blue', 2), ('red', 3), ('blue', 4), ('red', 1), ('blue', 4)] + >>> d = defaultdict(set) + >>> for k, v in s: + ... d[k].add(v) + ... + >>> list(d.items()) + [('blue', {2, 4}), ('red', {1, 3})] :func:`namedtuple` Factory Function for Tuples with Named Fields @@ -563,168 +734,131 @@ they add the ability to access fields by name instead of position index. .. function:: namedtuple(typename, field_names, verbose=False, rename=False) - Returns a new tuple subclass named *typename*. The new subclass is used to - create tuple-like objects that have fields accessible by attribute lookup as - well as being indexable and iterable. Instances of the subclass also have a - helpful docstring (with typename and field_names) and a helpful :meth:`__repr__` - method which lists the tuple contents in a ``name=value`` format. + Returns a new tuple subclass named *typename*. The new subclass is used to + create tuple-like objects that have fields accessible by attribute lookup as + well as being indexable and iterable. Instances of the subclass also have a + helpful docstring (with typename and field_names) and a helpful :meth:`__repr__` + method which lists the tuple contents in a ``name=value`` format. - The *field_names* are a single string with each fieldname separated by whitespace - and/or commas, for example ``'x y'`` or ``'x, y'``. Alternatively, *field_names* - can be a sequence of strings such as ``['x', 'y']``. + The *field_names* are a single string with each fieldname separated by whitespace + and/or commas, for example ``'x y'`` or ``'x, y'``. Alternatively, *field_names* + can be a sequence of strings such as ``['x', 'y']``. - Any valid Python identifier may be used for a fieldname except for names - starting with an underscore. Valid identifiers consist of letters, digits, - and underscores but do not start with a digit or underscore and cannot be - a :mod:`keyword` such as *class*, *for*, *return*, *global*, *pass*, - or *raise*. + Any valid Python identifier may be used for a fieldname except for names + starting with an underscore. Valid identifiers consist of letters, digits, + and underscores but do not start with a digit or underscore and cannot be + a :mod:`keyword` such as *class*, *for*, *return*, *global*, *pass*, + or *raise*. - If *rename* is true, invalid fieldnames are automatically replaced - with positional names. For example, ``['abc', 'def', 'ghi', 'abc']`` is - converted to ``['abc', '_1', 'ghi', '_3']``, eliminating the keyword - ``def`` and the duplicate fieldname ``abc``. + If *rename* is true, invalid fieldnames are automatically replaced + with positional names. For example, ``['abc', 'def', 'ghi', 'abc']`` is + converted to ``['abc', '_1', 'ghi', '_3']``, eliminating the keyword + ``def`` and the duplicate fieldname ``abc``. - If *verbose* is true, the class definition is printed just before being built. + If *verbose* is true, the class definition is printed after it is + built. This option is outdated; instead, it is simpler to print the + :attr:`_source` attribute. - Named tuple instances do not have per-instance dictionaries, so they are - lightweight and require no more memory than regular tuples. + Named tuple instances do not have per-instance dictionaries, so they are + lightweight and require no more memory than regular tuples. - .. versionchanged:: 3.1 - Added support for *rename*. + .. versionchanged:: 3.1 + Added support for *rename*. .. doctest:: - :options: +NORMALIZE_WHITESPACE - - >>> # Basic example - >>> Point = namedtuple('Point', ['x', 'y']) - >>> p = Point(x=10, y=11) - - >>> # Example using the verbose option to print the class definition - >>> Point = namedtuple('Point', 'x y', verbose=True) - class Point(tuple): - 'Point(x, y)' - <BLANKLINE> - __slots__ = () - <BLANKLINE> - _fields = ('x', 'y') - <BLANKLINE> - def __new__(_cls, x, y): - 'Create a new instance of Point(x, y)' - return _tuple.__new__(_cls, (x, y)) - <BLANKLINE> - @classmethod - def _make(cls, iterable, new=tuple.__new__, len=len): - 'Make a new Point object from a sequence or iterable' - result = new(cls, iterable) - if len(result) != 2: - raise TypeError('Expected 2 arguments, got %d' % len(result)) - return result - <BLANKLINE> - def __repr__(self): - 'Return a nicely formatted representation string' - return self.__class__.__name__ + '(x=%r, y=%r)' % self - <BLANKLINE> - def _asdict(self): - 'Return a new OrderedDict which maps field names to their values' - return OrderedDict(zip(self._fields, self)) - <BLANKLINE> - __dict__ = property(_asdict) - <BLANKLINE> - def _replace(_self, **kwds): - 'Return a new Point object replacing specified fields with new values' - result = _self._make(map(kwds.pop, ('x', 'y'), _self)) - if kwds: - raise ValueError('Got unexpected field names: %r' % list(kwds.keys())) - return result - <BLANKLINE> - def __getnewargs__(self): - 'Return self as a plain tuple. Used by copy and pickle.' - return tuple(self) - <BLANKLINE> - x = _property(_itemgetter(0), doc='Alias for field number 0') - y = _property(_itemgetter(1), doc='Alias for field number 1') - - >>> p = Point(11, y=22) # instantiate with positional or keyword arguments - >>> p[0] + p[1] # indexable like the plain tuple (11, 22) - 33 - >>> x, y = p # unpack like a regular tuple - >>> x, y - (11, 22) - >>> p.x + p.y # fields also accessible by name - 33 - >>> p # readable __repr__ with a name=value style - Point(x=11, y=22) + :options: +NORMALIZE_WHITESPACE + + >>> # Basic example + >>> Point = namedtuple('Point', ['x', 'y']) + >>> p = Point(11, y=22) # instantiate with positional or keyword arguments + >>> p[0] + p[1] # indexable like the plain tuple (11, 22) + 33 + >>> x, y = p # unpack like a regular tuple + >>> x, y + (11, 22) + >>> p.x + p.y # fields also accessible by name + 33 + >>> p # readable __repr__ with a name=value style + Point(x=11, y=22) Named tuples are especially useful for assigning field names to result tuples returned by the :mod:`csv` or :mod:`sqlite3` modules:: - EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade') + EmployeeRecord = namedtuple('EmployeeRecord', 'name, age, title, department, paygrade') - import csv - for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))): - print(emp.name, emp.title) + import csv + for emp in map(EmployeeRecord._make, csv.reader(open("employees.csv", "rb"))): + print(emp.name, emp.title) - import sqlite3 - conn = sqlite3.connect('/companydata') - cursor = conn.cursor() - cursor.execute('SELECT name, age, title, department, paygrade FROM employees') - for emp in map(EmployeeRecord._make, cursor.fetchall()): - print(emp.name, emp.title) + import sqlite3 + conn = sqlite3.connect('/companydata') + cursor = conn.cursor() + cursor.execute('SELECT name, age, title, department, paygrade FROM employees') + for emp in map(EmployeeRecord._make, cursor.fetchall()): + print(emp.name, emp.title) In addition to the methods inherited from tuples, named tuples support -three additional methods and one attribute. To prevent conflicts with +three additional methods and two attributes. To prevent conflicts with field names, the method and attribute names start with an underscore. .. classmethod:: somenamedtuple._make(iterable) - Class method that makes a new instance from an existing sequence or iterable. + Class method that makes a new instance from an existing sequence or iterable. -.. doctest:: + .. doctest:: - >>> t = [11, 22] - >>> Point._make(t) - Point(x=11, y=22) + >>> t = [11, 22] + >>> Point._make(t) + Point(x=11, y=22) .. method:: somenamedtuple._asdict() - Return a new :class:`OrderedDict` which maps field names to their corresponding - values:: + Return a new :class:`OrderedDict` which maps field names to their corresponding + values. Note, this method is no longer needed now that the same effect can + be achieved by using the built-in :func:`vars` function:: - >>> p._asdict() - OrderedDict([('x', 11), ('y', 22)]) + >>> vars(p) + OrderedDict([('x', 11), ('y', 22)]) - .. versionchanged:: 3.1 - Returns an :class:`OrderedDict` instead of a regular :class:`dict`. + .. versionchanged:: 3.1 + Returns an :class:`OrderedDict` instead of a regular :class:`dict`. .. method:: somenamedtuple._replace(kwargs) - Return a new instance of the named tuple replacing specified fields with new - values: + Return a new instance of the named tuple replacing specified fields with new + values:: + + >>> p = Point(x=11, y=22) + >>> p._replace(x=33) + Point(x=33, y=22) + + >>> for partnum, record in inventory.items(): + ... inventory[partnum] = record._replace(price=newprices[partnum], timestamp=time.now()) -:: +.. attribute:: somenamedtuple._source - >>> p = Point(x=11, y=22) - >>> p._replace(x=33) - Point(x=33, y=22) + A string with the pure Python source code used to create the named + tuple class. The source makes the named tuple self-documenting. + It can be printed, executed using :func:`exec`, or saved to a file + and imported. - >>> for partnum, record in inventory.items(): - ... inventory[partnum] = record._replace(price=newprices[partnum], timestamp=time.now()) + .. versionadded:: 3.3 .. attribute:: somenamedtuple._fields - Tuple of strings listing the field names. Useful for introspection - and for creating new named tuple types from existing named tuples. + Tuple of strings listing the field names. Useful for introspection + and for creating new named tuple types from existing named tuples. -.. doctest:: + .. doctest:: - >>> p._fields # view the field names - ('x', 'y') + >>> p._fields # view the field names + ('x', 'y') - >>> Color = namedtuple('Color', 'red green blue') - >>> Pixel = namedtuple('Pixel', Point._fields + Color._fields) - >>> Pixel(11, 22, 128, 255, 0) - Pixel(x=11, y=22, red=128, green=255, blue=0) + >>> Color = namedtuple('Color', 'red green blue') + >>> Pixel = namedtuple('Pixel', Point._fields + Color._fields) + >>> Pixel(11, 22, 128, 255, 0) + Pixel(x=11, y=22, red=128, green=255, blue=0) To retrieve a field whose name is stored in a string, use the :func:`getattr` function: @@ -735,31 +869,30 @@ function: To convert a dictionary to a named tuple, use the double-star-operator (as described in :ref:`tut-unpacking-arguments`): - >>> d = {'x': 11, 'y': 22} - >>> Point(**d) - Point(x=11, y=22) + >>> d = {'x': 11, 'y': 22} + >>> Point(**d) + Point(x=11, y=22) Since a named tuple is a regular Python class, it is easy to add or change functionality with a subclass. Here is how to add a calculated field and a fixed-width print format: >>> class Point(namedtuple('Point', 'x y')): - __slots__ = () - @property - def hypot(self): - return (self.x ** 2 + self.y ** 2) ** 0.5 - def __str__(self): - return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot) + __slots__ = () + @property + def hypot(self): + return (self.x ** 2 + self.y ** 2) ** 0.5 + def __str__(self): + return 'Point: x=%6.3f y=%6.3f hypot=%6.3f' % (self.x, self.y, self.hypot) >>> for p in Point(3, 4), Point(14, 5/7): - print(p) + print(p) Point: x= 3.000 y= 4.000 hypot= 5.000 Point: x=14.000 y= 0.714 hypot=14.018 The subclass shown above sets ``__slots__`` to an empty tuple. This helps keep memory requirements low by preventing the creation of instance dictionaries. - Subclassing is not useful for adding new, stored fields. Instead, simply create a new named tuple type from the :attr:`_fields` attribute: @@ -771,6 +904,7 @@ customize a prototype instance: >>> Account = namedtuple('Account', 'owner balance transaction_count') >>> default_account = Account('<owner name>', 0.0, 0) >>> johns_account = default_account._replace(owner='John') + >>> janes_account = default_account._replace(owner='Jane') Enumerated constants can be implemented with named tuples, but it is simpler and more efficient to use a simple class declaration: @@ -779,19 +913,19 @@ and more efficient to use a simple class declaration: >>> Status.open, Status.pending, Status.closed (0, 1, 2) >>> class Status: - open, pending, closed = range(3) + open, pending, closed = range(3) .. seealso:: - * `Named tuple recipe <http://code.activestate.com/recipes/500261/>`_ - adapted for Python 2.4. + * `Named tuple recipe <http://code.activestate.com/recipes/500261/>`_ + adapted for Python 2.4. - * `Recipe for named tuple abstract base class with a metaclass mix-in - <http://code.activestate.com/recipes/577629-namedtupleabc-abstract-base-class-mix-in-for-named/>`_ - by Jan Kaliszewski. Besides providing an :term:`abstract base class` for - named tuples, it also supports an alternate :term:`metaclass`-based - constructor that is convenient for use cases where named tuples are being - subclassed. + * `Recipe for named tuple abstract base class with a metaclass mix-in + <http://code.activestate.com/recipes/577629-namedtupleabc-abstract-base-class-mix-in-for-named/>`_ + by Jan Kaliszewski. Besides providing an :term:`abstract base class` for + named tuples, it also supports an alternate :term:`metaclass`-based + constructor that is convenient for use cases where named tuples are being + subclassed. :class:`OrderedDict` objects @@ -803,36 +937,36 @@ the items are returned in the order their keys were first added. .. class:: OrderedDict([items]) - Return an instance of a dict subclass, supporting the usual :class:`dict` - methods. An *OrderedDict* is a dict that remembers the order that keys - were first inserted. If a new entry overwrites an existing entry, the - original insertion position is left unchanged. Deleting an entry and - reinserting it will move it to the end. + Return an instance of a dict subclass, supporting the usual :class:`dict` + methods. An *OrderedDict* is a dict that remembers the order that keys + were first inserted. If a new entry overwrites an existing entry, the + original insertion position is left unchanged. Deleting an entry and + reinserting it will move it to the end. - .. versionadded:: 3.1 + .. versionadded:: 3.1 - .. method:: popitem(last=True) + .. method:: popitem(last=True) - The :meth:`popitem` method for ordered dictionaries returns and removes a - (key, value) pair. The pairs are returned in LIFO order if *last* is true - or FIFO order if false. + The :meth:`popitem` method for ordered dictionaries returns and removes a + (key, value) pair. The pairs are returned in LIFO order if *last* is true + or FIFO order if false. - .. method:: move_to_end(key, last=True) + .. method:: move_to_end(key, last=True) - Move an existing *key* to either end of an ordered dictionary. The item - is moved to the right end if *last* is true (the default) or to the - beginning if *last* is false. Raises :exc:`KeyError` if the *key* does - not exist:: + Move an existing *key* to either end of an ordered dictionary. The item + is moved to the right end if *last* is true (the default) or to the + beginning if *last* is false. Raises :exc:`KeyError` if the *key* does + not exist:: - >>> d = OrderedDict.fromkeys('abcde') - >>> d.move_to_end('b') - >>> ''.join(d.keys()) - 'acdeb' - >>> d.move_to_end('b', last=False) - >>> ''.join(d.keys()) - 'bacde' + >>> d = OrderedDict.fromkeys('abcde') + >>> d.move_to_end('b') + >>> ''.join(d.keys()) + 'acdeb' + >>> d.move_to_end('b', last=False) + >>> ''.join(d.keys()) + 'bacde' - .. versionadded:: 3.2 + .. versionadded:: 3.2 In addition to the usual mapping methods, ordered dictionaries also support reverse iteration using :func:`reversed`. @@ -840,9 +974,9 @@ reverse iteration using :func:`reversed`. Equality tests between :class:`OrderedDict` objects are order-sensitive and are implemented as ``list(od1.items())==list(od2.items())``. Equality tests between :class:`OrderedDict` objects and other -:class:`Mapping` objects are order-insensitive like regular dictionaries. -This allows :class:`OrderedDict` objects to be substituted anywhere a -regular dictionary is used. +:class:`~collections.abc.Mapping` objects are order-insensitive like regular +dictionaries. This allows :class:`OrderedDict` objects to be substituted +anywhere a regular dictionary is used. The :class:`OrderedDict` constructor and :meth:`update` method both accept keyword arguments, but their order is lost because Python's function call @@ -850,8 +984,8 @@ semantics pass-in keyword arguments using a regular unordered dictionary. .. seealso:: - `Equivalent OrderedDict recipe <http://code.activestate.com/recipes/576693/>`_ - that runs on Python 2.4 or later. + `Equivalent OrderedDict recipe <http://code.activestate.com/recipes/576693/>`_ + that runs on Python 2.4 or later. :class:`OrderedDict` Examples and Recipes ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -894,7 +1028,7 @@ original insertion position is changed and moved to the end:: An ordered dictionary can be combined with the :class:`Counter` class so that the counter remembers the order elements are first encountered:: - class OrderedCounter(Counter, OrderedDict): + class OrderedCounter(Counter, OrderedDict): 'Counter that remembers the order elements are first encountered' def __repr__(self): @@ -915,19 +1049,19 @@ attribute. .. class:: UserDict([initialdata]) - Class that simulates a dictionary. The instance's contents are kept in a - regular dictionary, which is accessible via the :attr:`data` attribute of - :class:`UserDict` instances. If *initialdata* is provided, :attr:`data` is - initialized with its contents; note that a reference to *initialdata* will not - be kept, allowing it be used for other purposes. + Class that simulates a dictionary. The instance's contents are kept in a + regular dictionary, which is accessible via the :attr:`data` attribute of + :class:`UserDict` instances. If *initialdata* is provided, :attr:`data` is + initialized with its contents; note that a reference to *initialdata* will not + be kept, allowing it be used for other purposes. - In addition to supporting the methods and operations of mappings, - :class:`UserDict` instances provide the following attribute: + In addition to supporting the methods and operations of mappings, + :class:`UserDict` instances provide the following attribute: - .. attribute:: data + .. attribute:: data - A real dictionary used to store the contents of the :class:`UserDict` - class. + A real dictionary used to store the contents of the :class:`UserDict` + class. @@ -945,21 +1079,21 @@ to work with because the underlying list is accessible as an attribute. .. class:: UserList([list]) - Class that simulates a list. The instance's contents are kept in a regular - list, which is accessible via the :attr:`data` attribute of :class:`UserList` - instances. The instance's contents are initially set to a copy of *list*, - defaulting to the empty list ``[]``. *list* can be any iterable, for - example a real Python list or a :class:`UserList` object. + Class that simulates a list. The instance's contents are kept in a regular + list, which is accessible via the :attr:`data` attribute of :class:`UserList` + instances. The instance's contents are initially set to a copy of *list*, + defaulting to the empty list ``[]``. *list* can be any iterable, for + example a real Python list or a :class:`UserList` object. - In addition to supporting the methods and operations of mutable sequences, - :class:`UserList` instances provide the following attribute: + In addition to supporting the methods and operations of mutable sequences, + :class:`UserList` instances provide the following attribute: - .. attribute:: data + .. attribute:: data - A real :class:`list` object used to store the contents of the - :class:`UserList` class. + A real :class:`list` object used to store the contents of the + :class:`UserList` class. -**Subclassing requirements:** Subclasses of :class:`UserList` are expect to +**Subclassing requirements:** Subclasses of :class:`UserList` are expected to offer a constructor which can be called with either no arguments or one argument. List operations which return a new sequence attempt to create an instance of the actual implementation class. To do so, it assumes that the @@ -982,168 +1116,10 @@ attribute. .. class:: UserString([sequence]) - Class that simulates a string or a Unicode string object. The instance's - content is kept in a regular string object, which is accessible via the - :attr:`data` attribute of :class:`UserString` instances. The instance's - contents are initially set to a copy of *sequence*. The *sequence* can - be an instance of :class:`bytes`, :class:`str`, :class:`UserString` (or a - subclass) or an arbitrary sequence which can be converted into a string using - the built-in :func:`str` function. - - -.. _collections-abstract-base-classes: - -ABCs - abstract base classes ----------------------------- - -The collections module offers the following :term:`ABCs <abstract base class>`: - -========================= ===================== ====================== ==================================================== -ABC Inherits from Abstract Methods Mixin Methods -========================= ===================== ====================== ==================================================== -:class:`Container` ``__contains__`` -:class:`Hashable` ``__hash__`` -:class:`Iterable` ``__iter__`` -:class:`Iterator` :class:`Iterable` ``__next__`` ``__iter__`` -:class:`Sized` ``__len__`` -:class:`Callable` ``__call__`` - -:class:`Sequence` :class:`Sized`, ``__getitem__`` ``__contains__``, ``__iter__``, ``__reversed__``, - :class:`Iterable`, ``index``, and ``count`` - :class:`Container` - -:class:`MutableSequence` :class:`Sequence` ``__setitem__``, Inherited :class:`Sequence` methods and - ``__delitem__``, ``append``, ``reverse``, ``extend``, ``pop``, - ``insert`` ``remove``, and ``__iadd__`` - -:class:`Set` :class:`Sized`, ``__le__``, ``__lt__``, ``__eq__``, ``__ne__``, - :class:`Iterable`, ``__gt__``, ``__ge__``, ``__and__``, ``__or__``, - :class:`Container` ``__sub__``, ``__xor__``, and ``isdisjoint`` - -:class:`MutableSet` :class:`Set` ``add``, Inherited :class:`Set` methods and - ``discard`` ``clear``, ``pop``, ``remove``, ``__ior__``, - ``__iand__``, ``__ixor__``, and ``__isub__`` - -:class:`Mapping` :class:`Sized`, ``__getitem__`` ``__contains__``, ``keys``, ``items``, ``values``, - :class:`Iterable`, ``get``, ``__eq__``, and ``__ne__`` - :class:`Container` - -:class:`MutableMapping` :class:`Mapping` ``__setitem__``, Inherited :class:`Mapping` methods and - ``__delitem__`` ``pop``, ``popitem``, ``clear``, ``update``, - and ``setdefault`` - - -:class:`MappingView` :class:`Sized` ``__len__`` -:class:`ItemsView` :class:`MappingView`, ``__contains__``, - :class:`Set` ``__iter__`` -:class:`KeysView` :class:`MappingView`, ``__contains__``, - :class:`Set` ``__iter__`` -:class:`ValuesView` :class:`MappingView` ``__contains__``, ``__iter__`` -========================= ===================== ====================== ==================================================== - - -.. class:: Container - Hashable - Sized - Callable - - ABCs for classes that provide respectively the methods :meth:`__contains__`, - :meth:`__hash__`, :meth:`__len__`, and :meth:`__call__`. - -.. class:: Iterable - - ABC for classes that provide the :meth:`__iter__` method. - See also the definition of :term:`iterable`. - -.. class:: Iterator - - ABC for classes that provide the :meth:`__iter__` and :meth:`__next__` methods. - See also the definition of :term:`iterator`. - -.. class:: Sequence - MutableSequence - - ABCs for read-only and mutable :term:`sequences <sequence>`. - -.. class:: Set - MutableSet - - ABCs for read-only and mutable sets. - -.. class:: Mapping - MutableMapping - - ABCs for read-only and mutable :term:`mappings <mapping>`. - -.. class:: MappingView - ItemsView - KeysView - ValuesView - - ABCs for mapping, items, keys, and values :term:`views <view>`. - - -These ABCs allow us to ask classes or instances if they provide -particular functionality, for example:: - - size = None - if isinstance(myvar, collections.Sized): - size = len(myvar) - -Several of the ABCs are also useful as mixins that make it easier to develop -classes supporting container APIs. For example, to write a class supporting -the full :class:`Set` API, it only necessary to supply the three underlying -abstract methods: :meth:`__contains__`, :meth:`__iter__`, and :meth:`__len__`. -The ABC supplies the remaining methods such as :meth:`__and__` and -:meth:`isdisjoint` :: - - class ListBasedSet(collections.Set): - ''' Alternate set implementation favoring space over speed - and not requiring the set elements to be hashable. ''' - def __init__(self, iterable): - self.elements = lst = [] - for value in iterable: - if value not in lst: - lst.append(value) - def __iter__(self): - return iter(self.elements) - def __contains__(self, value): - return value in self.elements - def __len__(self): - return len(self.elements) - - s1 = ListBasedSet('abcdef') - s2 = ListBasedSet('defghi') - overlap = s1 & s2 # The __and__() method is supported automatically - -Notes on using :class:`Set` and :class:`MutableSet` as a mixin: - -(1) - Since some set operations create new sets, the default mixin methods need - a way to create new instances from an iterable. The class constructor is - assumed to have a signature in the form ``ClassName(iterable)``. - That assumption is factored-out to an internal classmethod called - :meth:`_from_iterable` which calls ``cls(iterable)`` to produce a new set. - If the :class:`Set` mixin is being used in a class with a different - constructor signature, you will need to override :meth:`_from_iterable` - with a classmethod that can construct new instances from - an iterable argument. - -(2) - To override the comparisons (presumably for speed, as the - semantics are fixed), redefine :meth:`__le__` and - then the other operations will automatically follow suit. - -(3) - The :class:`Set` mixin provides a :meth:`_hash` method to compute a hash value - for the set; however, :meth:`__hash__` is not defined because not all sets - are hashable or immutable. To add set hashabilty using mixins, - inherit from both :meth:`Set` and :meth:`Hashable`, then define - ``__hash__ = Set._hash``. - -.. seealso:: - - * `OrderedSet recipe <http://code.activestate.com/recipes/576694/>`_ for an - example built on :class:`MutableSet`. - - * For more about ABCs, see the :mod:`abc` module and :pep:`3119`. + Class that simulates a string or a Unicode string object. The instance's + content is kept in a regular string object, which is accessible via the + :attr:`data` attribute of :class:`UserString` instances. The instance's + contents are initially set to a copy of *sequence*. The *sequence* can + be an instance of :class:`bytes`, :class:`str`, :class:`UserString` (or a + subclass) or an arbitrary sequence which can be converted into a string using + the built-in :func:`str` function. diff --git a/Doc/library/compileall.rst b/Doc/library/compileall.rst index cb7a09c..b12c217 100644 --- a/Doc/library/compileall.rst +++ b/Doc/library/compileall.rst @@ -162,7 +162,7 @@ subdirectory and all its subdirectories:: # Perform same compilation, excluding files in .svn directories. import re - compileall.compile_dir('Lib/', rx=re.compile('/[.]svn'), force=True) + compileall.compile_dir('Lib/', rx=re.compile(r'[/\\][.]svn'), force=True) .. seealso:: diff --git a/Doc/library/concurrency.rst b/Doc/library/concurrency.rst new file mode 100644 index 0000000..f2a512f --- /dev/null +++ b/Doc/library/concurrency.rst @@ -0,0 +1,32 @@ +.. _concurrency: + +******************** +Concurrent Execution +******************** + +The modules described in this chapter provide support for concurrent +execution of code. The appropriate choice of tool will depend on the +task to be executed (CPU bound vs IO bound) and preferred style of +development (event driven cooperative multitasking vs preemptive +multitasking). Here's an overview: + + +.. toctree:: + + threading.rst + multiprocessing.rst + concurrent.rst + concurrent.futures.rst + subprocess.rst + sched.rst + queue.rst + select.rst + + +The following are support modules for some of the above services: + +.. toctree:: + + dummy_threading.rst + _thread.rst + _dummy_thread.rst diff --git a/Doc/library/concurrent.futures.rst b/Doc/library/concurrent.futures.rst index eee2285..575b1ea 100644 --- a/Doc/library/concurrent.futures.rst +++ b/Doc/library/concurrent.futures.rst @@ -40,7 +40,7 @@ Executor Objects .. method:: map(func, *iterables, timeout=None) - Equivalent to ``map(func, *iterables)`` except *func* is executed + Equivalent to :func:`map(func, *iterables) <map>` except *func* is executed asynchronously and several calls to *func* may be made concurrently. The returned iterator raises a :exc:`TimeoutError` if :meth:`~iterator.__next__` is called and the result isn't available @@ -136,20 +136,23 @@ ThreadPoolExecutor Example 'http://www.bbc.co.uk/', 'http://some-made-up-domain.com/'] + # Retrieve a single page and report the url and contents def load_url(url, timeout): - return urllib.request.urlopen(url, timeout=timeout).read() + conn = urllib.request.urlopen(url, timeout=timeout) + return conn.readall() + # We can use a with statement to ensure threads are cleaned up promptly with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor: - future_to_url = dict((executor.submit(load_url, url, 60), url) - for url in URLS) - + # Start the load operations and mark each future with its URL + future_to_url = {executor.submit(load_url, url, 60): url for url in URLS} for future in concurrent.futures.as_completed(future_to_url): url = future_to_url[future] - if future.exception() is not None: - print('%r generated an exception: %s' % (url, - future.exception())) + try: + data = future.result() + except Exception as exc: + print('%r generated an exception: %s' % (url, exc)) else: - print('%r page is %d bytes' % (url, len(future.result()))) + print('%r page is %d bytes' % (url, len(data))) ProcessPoolExecutor @@ -170,6 +173,12 @@ to a :class:`ProcessPoolExecutor` will result in deadlock. of at most *max_workers* processes. If *max_workers* is ``None`` or not given, it will default to the number of processors on the machine. + .. versionchanged:: 3.3 + When one of the worker processes terminates abruptly, a + :exc:`BrokenProcessPool` error is now raised. Previously, behaviour + was undefined but operations on the executor or its futures would often + freeze or deadlock. + .. _processpoolexecutor-example: @@ -337,6 +346,8 @@ Module Functions *return_when* indicates when this function should return. It must be one of the following constants: + .. tabularcolumns:: |l|L| + +-----------------------------+----------------------------------------+ | Constant | Description | +=============================+========================================+ @@ -357,7 +368,8 @@ Module Functions Returns an iterator over the :class:`Future` instances (possibly created by different :class:`Executor` instances) given by *fs* that yields futures as - they complete (finished or were cancelled). Any futures that completed + they complete (finished or were cancelled). Any futures given by *fs* that + are duplicated will be returned once. Any futures that completed before :func:`as_completed` is called will be yielded first. The returned iterator raises a :exc:`TimeoutError` if :meth:`~iterator.__next__` is called and the result isn't available after *timeout* seconds from the @@ -371,3 +383,16 @@ Module Functions :pep:`3148` -- futures - execute computations asynchronously The proposal which described this feature for inclusion in the Python standard library. + + +Exception classes +----------------- + +.. exception:: BrokenProcessPool + + Derived from :exc:`RuntimeError`, this exception class is raised when + one of the workers of a :class:`ProcessPoolExecutor` has terminated + in a non-clean fashion (for example, if it was killed from the outside). + + .. versionadded:: 3.3 + diff --git a/Doc/library/concurrent.rst b/Doc/library/concurrent.rst new file mode 100644 index 0000000..2eba536 --- /dev/null +++ b/Doc/library/concurrent.rst @@ -0,0 +1,6 @@ +The :mod:`concurrent` package +============================= + +Currently, there is only one module in this package: + +* :mod:`concurrent.futures` -- Launching parallel tasks diff --git a/Doc/library/configparser.rst b/Doc/library/configparser.rst index 0b8212c..024d27c 100644 --- a/Doc/library/configparser.rst +++ b/Doc/library/configparser.rst @@ -371,7 +371,8 @@ are changed on a section proxy, they are actually mutated in the original parser. :mod:`configparser` objects behave as close to actual dictionaries as possible. -The mapping interface is complete and adheres to the ``MutableMapping`` ABC. +The mapping interface is complete and adheres to the +:class:`~collections.abc.MutableMapping` ABC. However, there are a few differences that should be taken into account: * By default, all keys in sections are accessible in a case-insensitive manner @@ -539,7 +540,7 @@ the :meth:`__init__` options: * *delimiters*, default value: ``('=', ':')`` Delimiters are substrings that delimit keys from values within a section. The - first occurence of a delimiting substring on a line is considered a delimiter. + first occurrence of a delimiting substring on a line is considered a delimiter. This means values (but not keys) can contain the delimiters. See also the *space_around_delimiters* argument to diff --git a/Doc/library/contextlib.rst b/Doc/library/contextlib.rst index e8dc17f..fba48f4 100644 --- a/Doc/library/contextlib.rst +++ b/Doc/library/contextlib.rst @@ -12,8 +12,11 @@ This module provides utilities for common tasks involving the :keyword:`with` statement. For more information see also :ref:`typecontextmanager` and :ref:`context-managers`. -Functions provided: +Utilities +--------- + +Functions and classes provided: .. decorator:: contextmanager @@ -168,6 +171,349 @@ Functions provided: .. versionadded:: 3.2 +.. class:: ExitStack() + + A context manager that is designed to make it easy to programmatically + combine other context managers and cleanup functions, especially those + that are optional or otherwise driven by input data. + + For example, a set of files may easily be handled in a single with + statement as follows:: + + with ExitStack() as stack: + files = [stack.enter_context(open(fname)) for fname in filenames] + # All opened files will automatically be closed at the end of + # the with statement, even if attempts to open files later + # in the list raise an exception + + Each instance maintains a stack of registered callbacks that are called in + reverse order when the instance is closed (either explicitly or implicitly + at the end of a :keyword:`with` statement). Note that callbacks are *not* + invoked implicitly when the context stack instance is garbage collected. + + This stack model is used so that context managers that acquire their + resources in their ``__init__`` method (such as file objects) can be + handled correctly. + + Since registered callbacks are invoked in the reverse order of + registration, this ends up behaving as if multiple nested :keyword:`with` + statements had been used with the registered set of callbacks. This even + extends to exception handling - if an inner callback suppresses or replaces + an exception, then outer callbacks will be passed arguments based on that + updated state. + + This is a relatively low level API that takes care of the details of + correctly unwinding the stack of exit callbacks. It provides a suitable + foundation for higher level context managers that manipulate the exit + stack in application specific ways. + + .. versionadded:: 3.3 + + .. method:: enter_context(cm) + + Enters a new context manager and adds its :meth:`__exit__` method to + the callback stack. The return value is the result of the context + manager's own :meth:`__enter__` method. + + These context managers may suppress exceptions just as they normally + would if used directly as part of a :keyword:`with` statement. + + .. method:: push(exit) + + Adds a context manager's :meth:`__exit__` method to the callback stack. + + As ``__enter__`` is *not* invoked, this method can be used to cover + part of an :meth:`__enter__` implementation with a context manager's own + :meth:`__exit__` method. + + If passed an object that is not a context manager, this method assumes + it is a callback with the same signature as a context manager's + :meth:`__exit__` method and adds it directly to the callback stack. + + By returning true values, these callbacks can suppress exceptions the + same way context manager :meth:`__exit__` methods can. + + The passed in object is returned from the function, allowing this + method to be used as a function decorator. + + .. method:: callback(callback, *args, **kwds) + + Accepts an arbitrary callback function and arguments and adds it to + the callback stack. + + Unlike the other methods, callbacks added this way cannot suppress + exceptions (as they are never passed the exception details). + + The passed in callback is returned from the function, allowing this + method to be used as a function decorator. + + .. method:: pop_all() + + Transfers the callback stack to a fresh :class:`ExitStack` instance + and returns it. No callbacks are invoked by this operation - instead, + they will now be invoked when the new stack is closed (either + explicitly or implicitly at the end of a :keyword:`with` statement). + + For example, a group of files can be opened as an "all or nothing" + operation as follows:: + + with ExitStack() as stack: + files = [stack.enter_context(open(fname)) for fname in filenames] + # Hold onto the close method, but don't call it yet. + close_files = stack.pop_all().close + # If opening any file fails, all previously opened files will be + # closed automatically. If all files are opened successfully, + # they will remain open even after the with statement ends. + # close_files() can then be invoked explicitly to close them all. + + .. method:: close() + + Immediately unwinds the callback stack, invoking callbacks in the + reverse order of registration. For any context managers and exit + callbacks registered, the arguments passed in will indicate that no + exception occurred. + + +Examples and Recipes +-------------------- + +This section describes some examples and recipes for making effective use of +the tools provided by :mod:`contextlib`. + + +Supporting a variable number of context managers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The primary use case for :class:`ExitStack` is the one given in the class +documentation: supporting a variable number of context managers and other +cleanup operations in a single :keyword:`with` statement. The variability +may come from the number of context managers needed being driven by user +input (such as opening a user specified collection of files), or from +some of the context managers being optional:: + + with ExitStack() as stack: + for resource in resources: + stack.enter_context(resource) + if need_special resource: + special = acquire_special_resource() + stack.callback(release_special_resource, special) + # Perform operations that use the acquired resources + +As shown, :class:`ExitStack` also makes it quite easy to use :keyword:`with` +statements to manage arbitrary resources that don't natively support the +context management protocol. + + +Simplifying support for single optional context managers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In the specific case of a single optional context manager, :class:`ExitStack` +instances can be used as a "do nothing" context manager, allowing a context +manager to easily be omitted without affecting the overall structure of +the source code:: + + def debug_trace(details): + if __debug__: + return TraceContext(details) + # Don't do anything special with the context in release mode + return ExitStack() + + with debug_trace(): + # Suite is traced in debug mode, but runs normally otherwise + + +Catching exceptions from ``__enter__`` methods +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +It is occasionally desirable to catch exceptions from an ``__enter__`` +method implementation, *without* inadvertently catching exceptions from +the :keyword:`with` statement body or the context manager's ``__exit__`` +method. By using :class:`ExitStack` the steps in the context management +protocol can be separated slightly in order to allow this:: + + stack = ExitStack() + try: + x = stack.enter_context(cm) + except Exception: + # handle __enter__ exception + else: + with stack: + # Handle normal case + +Actually needing to do this is likely to indicate that the underlying API +should be providing a direct resource management interface for use with +:keyword:`try`/:keyword:`except`/:keyword:`finally` statements, but not +all APIs are well designed in that regard. When a context manager is the +only resource management API provided, then :class:`ExitStack` can make it +easier to handle various situations that can't be handled directly in a +:keyword:`with` statement. + + +Cleaning up in an ``__enter__`` implementation +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +As noted in the documentation of :meth:`ExitStack.push`, this +method can be useful in cleaning up an already allocated resource if later +steps in the :meth:`__enter__` implementation fail. + +Here's an example of doing this for a context manager that accepts resource +acquisition and release functions, along with an optional validation function, +and maps them to the context management protocol:: + + from contextlib import contextmanager, ExitStack + + class ResourceManager: + + def __init__(self, acquire_resource, release_resource, check_resource_ok=None): + self.acquire_resource = acquire_resource + self.release_resource = release_resource + if check_resource_ok is None: + def check_resource_ok(resource): + return True + self.check_resource_ok = check_resource_ok + + @contextmanager + def _cleanup_on_error(self): + with ExitStack() as stack: + stack.push(self) + yield + # The validation check passed and didn't raise an exception + # Accordingly, we want to keep the resource, and pass it + # back to our caller + stack.pop_all() + + def __enter__(self): + resource = self.acquire_resource() + with self._cleanup_on_error(): + if not self.check_resource_ok(resource): + msg = "Failed validation for {!r}" + raise RuntimeError(msg.format(resource)) + return resource + + def __exit__(self, *exc_details): + # We don't need to duplicate any of our resource release logic + self.release_resource() + + +Replacing any use of ``try-finally`` and flag variables +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +A pattern you will sometimes see is a ``try-finally`` statement with a flag +variable to indicate whether or not the body of the ``finally`` clause should +be executed. In its simplest form (that can't already be handled just by +using an ``except`` clause instead), it looks something like this:: + + cleanup_needed = True + try: + result = perform_operation() + if result: + cleanup_needed = False + finally: + if cleanup_needed: + cleanup_resources() + +As with any ``try`` statement based code, this can cause problems for +development and review, because the setup code and the cleanup code can end +up being separated by arbitrarily long sections of code. + +:class:`ExitStack` makes it possible to instead register a callback for +execution at the end of a ``with`` statement, and then later decide to skip +executing that callback:: + + from contextlib import ExitStack + + with ExitStack() as stack: + stack.callback(cleanup_resources) + result = perform_operation() + if result: + stack.pop_all() + +This allows the intended cleanup up behaviour to be made explicit up front, +rather than requiring a separate flag variable. + +If a particular application uses this pattern a lot, it can be simplified +even further by means of a small helper class:: + + from contextlib import ExitStack + + class Callback(ExitStack): + def __init__(self, callback, *args, **kwds): + super(Callback, self).__init__() + self.callback(callback, *args, **kwds) + + def cancel(self): + self.pop_all() + + with Callback(cleanup_resources) as cb: + result = perform_operation() + if result: + cb.cancel() + +If the resource cleanup isn't already neatly bundled into a standalone +function, then it is still possible to use the decorator form of +:meth:`ExitStack.callback` to declare the resource cleanup in +advance:: + + from contextlib import ExitStack + + with ExitStack() as stack: + @stack.callback + def cleanup_resources(): + ... + result = perform_operation() + if result: + stack.pop_all() + +Due to the way the decorator protocol works, a callback function +declared this way cannot take any parameters. Instead, any resources to +be released must be accessed as closure variables + + +Using a context manager as a function decorator +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:class:`ContextDecorator` makes it possible to use a context manager in +both an ordinary ``with`` statement and also as a function decorator. + +For example, it is sometimes useful to wrap functions or groups of statements +with a logger that can track the time of entry and time of exit. Rather than +writing both a function decorator and a context manager for the task, +inheriting from :class:`ContextDecorator` provides both capabilities in a +single definition:: + + from contextlib import ContextDecorator + import logging + + logging.basicConfig(level=logging.INFO) + + class track_entry_and_exit(ContextDecorator): + def __init__(self, name): + self.name = name + + def __enter__(self): + logging.info('Entering: {}'.format(name)) + + def __exit__(self, exc_type, exc, exc_tb): + logging.info('Exiting: {}'.format(name)) + +Instances of this class can be used as both a context manager:: + + with track_entry_and_exit('widget loader'): + print('Some time consuming activity goes here') + load_widget() + +And also as a function decorator:: + + @track_entry_and_exit('widget loader') + def activity(): + print('Some time consuming activity goes here') + load_widget() + +Note that there is one additional limitation when using context managers +as function decorators: there's no way to access the return value of +:meth:`__enter__`. If that value is needed, then it is still necessary to use +an explicit ``with`` statement. + .. seealso:: :pep:`0343` - The "with" statement diff --git a/Doc/library/copyreg.rst b/Doc/library/copyreg.rst index f3721d1..50d5879 100644 --- a/Doc/library/copyreg.rst +++ b/Doc/library/copyreg.rst @@ -33,9 +33,11 @@ Such constructors may be factory functions or class instances. returned by *function* at pickling time. :exc:`TypeError` will be raised if *object* is a class or *constructor* is not callable. - See the :mod:`pickle` module for more details on the interface expected of - *function* and *constructor*. - + See the :mod:`pickle` module for more details on the interface + expected of *function* and *constructor*. Note that the + :attr:`~pickle.Pickler.dispatch_table` attribute of a pickler + object or subclass of :class:`pickle.Pickler` can also be used for + declaring reduction functions. Example ------- diff --git a/Doc/library/crypt.rst b/Doc/library/crypt.rst index 0be571e..b4c90cd 100644 --- a/Doc/library/crypt.rst +++ b/Doc/library/crypt.rst @@ -15,9 +15,9 @@ This module implements an interface to the :manpage:`crypt(3)` routine, which is a one-way hash function based upon a modified DES algorithm; see the Unix man -page for further details. Possible uses include allowing Python scripts to -accept typed passwords from the user, or attempting to crack Unix passwords with -a dictionary. +page for further details. Possible uses include storing hashed passwords +so you can check passwords without storing the actual password, or attempting +to crack Unix passwords with a dictionary. .. index:: single: crypt(3) @@ -26,15 +26,74 @@ the :manpage:`crypt(3)` routine in the running system. Therefore, any extensions available on the current implementation will also be available on this module. +Hashing Methods +--------------- -.. function:: crypt(word, salt) +.. versionadded:: 3.3 + +The :mod:`crypt` module defines the list of hashing methods (not all methods +are available on all platforms): + +.. data:: METHOD_SHA512 + + A Modular Crypt Format method with 16 character salt and 86 character + hash. This is the strongest method. + +.. data:: METHOD_SHA256 + + Another Modular Crypt Format method with 16 character salt and 43 + character hash. + +.. data:: METHOD_MD5 + + Another Modular Crypt Format method with 8 character salt and 22 + character hash. + +.. data:: METHOD_CRYPT + + The traditional method with a 2 character salt and 13 characters of + hash. This is the weakest method. + + +Module Attributes +----------------- + +.. versionadded:: 3.3 + +.. attribute:: methods + + A list of available password hashing algorithms, as + ``crypt.METHOD_*`` objects. This list is sorted from strongest to + weakest, and is guaranteed to have at least ``crypt.METHOD_CRYPT``. + + +Module Functions +---------------- + +The :mod:`crypt` module defines the following functions: + +.. function:: crypt(word, salt=None) *word* will usually be a user's password as typed at a prompt or in a graphical - interface. *salt* is usually a random two-character string which will be used - to perturb the DES algorithm in one of 4096 ways. The characters in *salt* must - be in the set ``[./a-zA-Z0-9]``. Returns the hashed password as a string, which - will be composed of characters from the same alphabet as the salt (the first two - characters represent the salt itself). + interface. The optional *salt* is either a string as returned from + :func:`mksalt`, one of the ``crypt.METHOD_*`` values (though not all + may be available on all platforms), or a full encrypted password + including salt, as returned by this function. If *salt* is not + provided, the strongest method will be used (as returned by + :func:`methods`. + + Checking a password is usually done by passing the plain-text password + as *word* and the full results of a previous :func:`crypt` call, + which should be the same as the results of this call. + + *salt* (either a random 2 or 16 character string, possibly prefixed with + ``$digit$`` to indicate the method) which will be used to perturb the + encryption algorithm. The characters in *salt* must be in the set + ``[./a-zA-Z0-9]``, with the exception of Modular Crypt Format which + prefixes a ``$digit$``. + + Returns the hashed password as a string, which will be composed of + characters from the same alphabet as the salt. .. index:: single: crypt(3) @@ -42,18 +101,52 @@ this module. different sizes in the *salt*, it is recommended to use the full crypted password as salt when checking for a password. -A simple example illustrating typical use:: + .. versionchanged:: 3.3 + Accept ``crypt.METHOD_*`` values in addition to strings for *salt*. + + +.. function:: mksalt(method=None) - import crypt, getpass, pwd + Return a randomly generated salt of the specified method. If no + *method* is given, the strongest method available as returned by + :func:`methods` is used. + + The return value is a string either of 2 characters in length for + ``crypt.METHOD_CRYPT``, or 19 characters starting with ``$digit$`` and + 16 random characters from the set ``[./a-zA-Z0-9]``, suitable for + passing as the *salt* argument to :func:`crypt`. + + .. versionadded:: 3.3 + +Examples +-------- + +A simple example illustrating typical use (a constant-time comparison +operation is needed to limit exposure to timing attacks. +:func:`hmac.compare_digest` is suitable for this purpose):: + + import pwd + import crypt + import getpass + from hmac import compare_digest as compare_hash def login(): - username = input('Python login:') + username = input('Python login: ') cryptedpasswd = pwd.getpwnam(username)[1] if cryptedpasswd: if cryptedpasswd == 'x' or cryptedpasswd == '*': - raise "Sorry, currently no support for shadow passwords" + raise ValueError('no support for shadow passwords') cleartext = getpass.getpass() - return crypt.crypt(cleartext, cryptedpasswd) == cryptedpasswd + return compare_hash(crypt.crypt(cleartext, cryptedpasswd), cryptedpasswd) else: - return 1 + return True + +To generate a hash of a password using the strongest available method and +check it against the original:: + + import crypt + from hmac import compare_digest as compare_hash + hashed = crypt.crypt(plaintext) + if not compare_hash(hashed, crypt.crypt(plaintext, hashed)): + raise ValueError("hashed version doesn't validate against original") diff --git a/Doc/library/crypto.rst b/Doc/library/crypto.rst index a233561..469ede49 100644 --- a/Doc/library/crypto.rst +++ b/Doc/library/crypto.rst @@ -8,6 +8,7 @@ Cryptographic Services The modules described in this chapter implement various algorithms of a cryptographic nature. They are available at the discretion of the installation. +On Unix systems, the :mod:`crypt` module may also be available. Here's an overview: diff --git a/Doc/library/csv.rst b/Doc/library/csv.rst index 90261e2..ccc9dc6 100644 --- a/Doc/library/csv.rst +++ b/Doc/library/csv.rst @@ -11,15 +11,15 @@ pair: data; tabular The so-called CSV (Comma Separated Values) format is the most common import and -export format for spreadsheets and databases. There is no "CSV standard", so -the format is operationally defined by the many applications which read and -write it. The lack of a standard means that subtle differences often exist in -the data produced and consumed by different applications. These differences can -make it annoying to process CSV files from multiple sources. Still, while the -delimiters and quoting characters vary, the overall format is similar enough -that it is possible to write a single module which can efficiently manipulate -such data, hiding the details of reading and writing the data from the -programmer. +export format for spreadsheets and databases. CSV format was used for many +years prior to attempts to describe the format in a standardized way in +:rfc:`4180`. The lack of a well-defined standard means that subtle differences +often exist in the data produced and consumed by different applications. These +differences can make it annoying to process CSV files from multiple sources. +Still, while the delimiters and quoting characters vary, the overall format is +similar enough that it is possible to write a single module which can +efficiently manipulate such data, hiding the details of reading and writing the +data from the programmer. The :mod:`csv` module implements classes to read and write tabular data in CSV format. It allows programmers to say, "write this data in the format preferred @@ -142,36 +142,43 @@ The :mod:`csv` module defines the following functions: The :mod:`csv` module defines the following classes: -.. class:: DictReader(csvfile, fieldnames=None, restkey=None, restval=None, dialect='excel', *args, **kwds) - - Create an object which operates like a regular reader but maps the information - read into a dict whose keys are given by the optional *fieldnames* parameter. - If the *fieldnames* parameter is omitted, the values in the first row of the - *csvfile* will be used as the fieldnames. If the row read has more fields - than the fieldnames sequence, the remaining data is added as a sequence - keyed by the value of *restkey*. If the row read has fewer fields than the - fieldnames sequence, the remaining keys take the value of the optional - *restval* parameter. Any other optional or keyword arguments are passed to - the underlying :class:`reader` instance. - - -.. class:: DictWriter(csvfile, fieldnames, restval='', extrasaction='raise', dialect='excel', *args, **kwds) - - Create an object which operates like a regular writer but maps dictionaries onto - output rows. The *fieldnames* parameter identifies the order in which values in - the dictionary passed to the :meth:`writerow` method are written to the - *csvfile*. The optional *restval* parameter specifies the value to be written - if the dictionary is missing a key in *fieldnames*. If the dictionary passed to - the :meth:`writerow` method contains a key not found in *fieldnames*, the - optional *extrasaction* parameter indicates what action to take. If it is set - to ``'raise'`` a :exc:`ValueError` is raised. If it is set to ``'ignore'``, - extra values in the dictionary are ignored. Any other optional or keyword - arguments are passed to the underlying :class:`writer` instance. - - Note that unlike the :class:`DictReader` class, the *fieldnames* parameter of - the :class:`DictWriter` is not optional. Since Python's :class:`dict` objects - are not ordered, there is not enough information available to deduce the order - in which the row should be written to the *csvfile*. +.. class:: DictReader(csvfile, fieldnames=None, restkey=None, restval=None, \ + dialect='excel', *args, **kwds) + + Create an object which operates like a regular reader but maps the + information read into a dict whose keys are given by the optional + *fieldnames* parameter. The *fieldnames* parameter is a :mod:`sequence + <collections.abc>` whose elements are associated with the fields of the + input data in order. These elements become the keys of the resulting + dictionary. If the *fieldnames* parameter is omitted, the values in the + first row of the *csvfile* will be used as the fieldnames. If the row read + has more fields than the fieldnames sequence, the remaining data is added as + a sequence keyed by the value of *restkey*. If the row read has fewer + fields than the fieldnames sequence, the remaining keys take the value of + the optional *restval* parameter. Any other optional or keyword arguments + are passed to the underlying :class:`reader` instance. + + +.. class:: DictWriter(csvfile, fieldnames, restval='', extrasaction='raise', \ + dialect='excel', *args, **kwds) + + Create an object which operates like a regular writer but maps dictionaries + onto output rows. The *fieldnames* parameter is a :mod:`sequence + <collections.abc>` of keys that identify the order in which values in the + dictionary passed to the :meth:`writerow` method are written to the + *csvfile*. The optional *restval* parameter specifies the value to be + written if the dictionary is missing a key in *fieldnames*. If the + dictionary passed to the :meth:`writerow` method contains a key not found in + *fieldnames*, the optional *extrasaction* parameter indicates what action to + take. If it is set to ``'raise'`` a :exc:`ValueError` is raised. If it is + set to ``'ignore'``, extra values in the dictionary are ignored. Any other + optional or keyword arguments are passed to the underlying :class:`writer` + instance. + + Note that unlike the :class:`DictReader` class, the *fieldnames* parameter + of the :class:`DictWriter` is not optional. Since Python's :class:`dict` + objects are not ordered, there is not enough information available to deduce + the order in which the row should be written to the *csvfile*. .. class:: Dialect diff --git a/Doc/library/ctypes.rst b/Doc/library/ctypes.rst index 6f093fe..f46da85 100644 --- a/Doc/library/ctypes.rst +++ b/Doc/library/ctypes.rst @@ -39,9 +39,14 @@ loads libraries which export functions using the standard ``cdecl`` calling convention, while *windll* libraries call functions using the ``stdcall`` calling convention. *oledll* also uses the ``stdcall`` calling convention, and assumes the functions return a Windows :c:type:`HRESULT` error code. The error -code is used to automatically raise a :class:`WindowsError` exception when the +code is used to automatically raise a :class:`OSError` exception when the function call fails. +.. versionchanged:: 3.3 + Windows errors used to raise :exc:`WindowsError`, which is now an alias + of :exc:`OSError`. + + Here are some examples for Windows. Note that ``msvcrt`` is the MS standard C library containing most standard C functions, and uses the cdecl calling convention:: @@ -189,11 +194,13 @@ argument values:: >>> windll.kernel32.GetModuleHandleA(32) # doctest: +WINDOWS Traceback (most recent call last): File "<stdin>", line 1, in ? - WindowsError: exception: access violation reading 0x00000020 + OSError: exception: access violation reading 0x00000020 >>> There are, however, enough ways to crash Python with :mod:`ctypes`, so you -should be careful anyway. +should be careful anyway. The :mod:`faulthandler` module can be helpful in +debugging crashes (e.g. from segmentation faults produced by erroneous C library +calls). ``None``, integers, bytes objects and (unicode) strings are the only native Python objects that can directly be used as parameters in these function calls. @@ -211,7 +218,7 @@ more about :mod:`ctypes` data types. Fundamental data types ^^^^^^^^^^^^^^^^^^^^^^ -:mod:`ctypes` defines a number of primitive C compatible data types : +:mod:`ctypes` defines a number of primitive C compatible data types: +----------------------+------------------------------------------+----------------------------+ | ctypes type | C type | Python type | @@ -496,7 +503,7 @@ useful to check for error return values and automatically raise an exception:: Traceback (most recent call last): File "<stdin>", line 1, in ? File "<stdin>", line 3, in ValidHandle - WindowsError: [Errno 126] The specified module could not be found. + OSError: [Errno 126] The specified module could not be found. >>> ``WinError`` is a function which will call Windows ``FormatMessage()`` api to @@ -938,21 +945,21 @@ Callback functions :mod:`ctypes` allows to create C callable function pointers from Python callables. These are sometimes called *callback functions*. -First, you must create a class for the callback function, the class knows the +First, you must create a class for the callback function. The class knows the calling convention, the return type, and the number and types of arguments this function will receive. -The CFUNCTYPE factory function creates types for callback functions using the -normal cdecl calling convention, and, on Windows, the WINFUNCTYPE factory -function creates types for callback functions using the stdcall calling -convention. +The :func:`CFUNCTYPE` factory function creates types for callback functions +using the ``cdecl`` calling convention. On Windows, the :func:`WINFUNCTYPE` +factory function creates types for callback functions using the ``stdcall`` +calling convention. Both of these factory functions are called with the result type as first argument, and the callback functions expected argument types as the remaining arguments. I will present an example here which uses the standard C library's -:c:func:`qsort` function, this is used to sort items with the help of a callback +:c:func:`qsort` function, that is used to sort items with the help of a callback function. :c:func:`qsort` will be used to sort an array of integers:: >>> IntArray5 = c_int * 5 @@ -965,7 +972,7 @@ function. :c:func:`qsort` will be used to sort an array of integers:: items in the data array, the size of one item, and a pointer to the comparison function, the callback. The callback will then be called with two pointers to items, and it must return a negative integer if the first item is smaller than -the second, a zero if they are equal, and a positive integer else. +the second, a zero if they are equal, and a positive integer otherwise. So our callback function receives pointers to integers, and must return an integer. First we create the ``type`` for the callback function:: @@ -973,36 +980,8 @@ integer. First we create the ``type`` for the callback function:: >>> CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int)) >>> -For the first implementation of the callback function, we simply print the -arguments we get, and return 0 (incremental development ;-):: - - >>> def py_cmp_func(a, b): - ... print("py_cmp_func", a, b) - ... return 0 - ... - >>> - -Create the C callable callback:: - - >>> cmp_func = CMPFUNC(py_cmp_func) - >>> - -And we're ready to go:: - - >>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - py_cmp_func <ctypes.LP_c_long object at 0x00...> <ctypes.LP_c_long object at 0x00...> - >>> - -We know how to access the contents of a pointer, so lets redefine our callback:: +To get started, here is a simple callback that shows the values it gets +passed:: >>> def py_cmp_func(a, b): ... print("py_cmp_func", a[0], b[0]) @@ -1011,23 +990,7 @@ We know how to access the contents of a pointer, so lets redefine our callback:: >>> cmp_func = CMPFUNC(py_cmp_func) >>> -Here is what we get on Windows:: - - >>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +WINDOWS - py_cmp_func 7 1 - py_cmp_func 33 1 - py_cmp_func 99 1 - py_cmp_func 5 1 - py_cmp_func 7 5 - py_cmp_func 33 5 - py_cmp_func 99 5 - py_cmp_func 7 99 - py_cmp_func 33 99 - py_cmp_func 7 33 - >>> - -It is funny to see that on linux the sort function seems to work much more -efficiently, it is doing less comparisons:: +The result:: >>> qsort(ia, len(ia), sizeof(c_int), cmp_func) # doctest: +LINUX py_cmp_func 5 1 @@ -1037,32 +1000,13 @@ efficiently, it is doing less comparisons:: py_cmp_func 1 7 >>> -Ah, we're nearly done! The last step is to actually compare the two items and -return a useful result:: +Now we can actually compare the two items and return a useful result:: >>> def py_cmp_func(a, b): ... print("py_cmp_func", a[0], b[0]) ... return a[0] - b[0] ... >>> - -Final run on Windows:: - - >>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +WINDOWS - py_cmp_func 33 7 - py_cmp_func 99 33 - py_cmp_func 5 99 - py_cmp_func 1 99 - py_cmp_func 33 7 - py_cmp_func 1 33 - py_cmp_func 5 33 - py_cmp_func 5 7 - py_cmp_func 1 7 - py_cmp_func 5 1 - >>> - -and on Linux:: - >>> qsort(ia, len(ia), sizeof(c_int), CMPFUNC(py_cmp_func)) # doctest: +LINUX py_cmp_func 5 1 py_cmp_func 33 99 @@ -1071,9 +1015,6 @@ and on Linux:: py_cmp_func 5 7 >>> -It is quite interesting to see that the Windows :func:`qsort` function needs -more comparisons than the linux version! - As we can easily check, our array is sorted now:: >>> for i in ia: print(i, end=" ") @@ -1081,12 +1022,18 @@ As we can easily check, our array is sorted now:: 1 5 7 33 99 >>> -**Important note for callback functions:** +.. note:: -Make sure you keep references to CFUNCTYPE objects as long as they are used from -C code. :mod:`ctypes` doesn't, and if you don't, they may be garbage collected, -crashing your program when a callback is made. + Make sure you keep references to :func:`CFUNCTYPE` objects as long as they + are used from C code. :mod:`ctypes` doesn't, and if you don't, they may be + garbage collected, crashing your program when a callback is made. + Also, note that if the callback function is called in a thread created + outside of Python's control (e.g. by the foreign code that calls the + callback), ctypes creates a new dummy Python thread on every invocation. This + behavior is correct for most purposes, but it means that values stored with + `threading.local` will *not* survive across different callbacks, even when + those calls are made from the same C thread. .. _ctypes-accessing-values-exported-from-dlls: @@ -1335,7 +1282,7 @@ returns the full pathname, but since there is no predefined naming scheme a call like ``find_library("c")`` will fail and return ``None``. If wrapping a shared library with :mod:`ctypes`, it *may* be better to determine -the shared library name at development type, and hardcode that into the wrapper +the shared library name at development time, and hardcode that into the wrapper module instead of using :func:`find_library` to locate the library at runtime. @@ -1362,7 +1309,10 @@ way is to instantiate one of the following classes: assumed to return the windows specific :class:`HRESULT` code. :class:`HRESULT` values contain information specifying whether the function call failed or succeeded, together with additional error code. If the return value signals a - failure, an :class:`WindowsError` is automatically raised. + failure, an :class:`OSError` is automatically raised. + + .. versionchanged:: 3.3 + :exc:`WindowsError` used to be raised. .. class:: WinDLL(name, mode=DEFAULT_MODE, handle=None, use_errno=False, use_last_error=False) @@ -1707,7 +1657,7 @@ the windows header file is this:: WINUSERAPI int WINAPI MessageBoxA( - HWND hWnd , + HWND hWnd, LPCSTR lpText, LPCSTR lpCaption, UINT uType); @@ -1965,8 +1915,8 @@ Utility functions .. function:: sizeof(obj_or_type) - Returns the size in bytes of a ctypes type or instance memory buffer. Does the - same as the C ``sizeof()`` function. + Returns the size in bytes of a ctypes type or instance memory buffer. + Does the same as the C ``sizeof`` operator. .. function:: string_at(address, size=-1) @@ -1979,11 +1929,14 @@ Utility functions .. function:: WinError(code=None, descr=None) Windows only: this function is probably the worst-named thing in ctypes. It - creates an instance of WindowsError. If *code* is not specified, + creates an instance of OSError. If *code* is not specified, ``GetLastError`` is called to determine the error code. If *descr* is not specified, :func:`FormatError` is called to get a textual description of the error. + .. versionchanged:: 3.3 + An instance of :exc:`WindowsError` used to be created. + .. function:: wstring_at(address, size=-1) @@ -2295,7 +2248,7 @@ These are the fundamental ctypes data types: .. class:: c_bool Represent the C :c:type:`bool` datatype (more accurately, :c:type:`_Bool` from - C99). Its value can be True or False, and the constructor accepts any object + C99). Its value can be ``True`` or ``False``, and the constructor accepts any object that has a truth value. diff --git a/Doc/library/curses.rst b/Doc/library/curses.rst index d2c2316..314636e 100644 --- a/Doc/library/curses.rst +++ b/Doc/library/curses.rst @@ -377,7 +377,7 @@ The module :mod:`curses` defines the following functions: is to be displayed. -.. function:: newwin(begin_y, begin_x) +.. function:: newwin(nlines, ncols) newwin(nlines, ncols, begin_y, begin_x) Return a new window, whose left-upper corner is at ``(begin_y, begin_x)``, and @@ -599,6 +599,17 @@ The module :mod:`curses` defines the following functions: Only one *ch* can be pushed before :meth:`getch` is called. +.. function:: unget_wch(ch) + + Push *ch* so the next :meth:`get_wch` will return it. + + .. note:: + + Only one *ch* can be pushed before :meth:`get_wch` is called. + + .. versionadded:: 3.3 + + .. function:: ungetmouse(id, x, y, z, bstate) Push a :const:`KEY_MOUSE` event onto the input queue, associating the given @@ -643,7 +654,7 @@ Window Objects -------------- Window objects, as returned by :func:`initscr` and :func:`newwin` above, have -the following methods: +the following methods and attributes: .. method:: window.addch(ch[, attr]) @@ -831,6 +842,16 @@ the following methods: event. +.. attribute:: window.encoding + + Encoding used to encode method arguments (Unicode strings and characters). + The encoding attribute is inherited from the parent window when a subwindow + is created, for example with :meth:`window.subwin`. By default, the locale + encoding is used (see :func:`locale.getpreferredencoding`). + + .. versionadded:: 3.3 + + .. method:: window.erase() Clear the window. @@ -854,11 +875,20 @@ the following methods: until a key is pressed. +.. method:: window.get_wch([y, x]) + + Get a wide character. Return a character for most keys, or an integer for + function keys, keypad keys, and other special keys. + + .. versionadded:: 3.3 + + .. method:: window.getkey([y, x]) Get a character, returning a string instead of an integer, as :meth:`getch` - does. Function keys, keypad keys and so on return a multibyte string containing - the key name. In no-delay mode, an exception is raised if there is no input. + does. Function keys, keypad keys and other special keys return a multibyte + string containing the key name. In no-delay mode, an exception is raised if + there is no input. .. method:: window.getmaxyx() diff --git a/Doc/library/datatypes.rst b/Doc/library/datatypes.rst index 6b4a71a..d0382e0 100644 --- a/Doc/library/datatypes.rst +++ b/Doc/library/datatypes.rst @@ -21,11 +21,10 @@ The following modules are documented in this chapter: datetime.rst calendar.rst collections.rst + collections.abc.rst heapq.rst bisect.rst array.rst - sched.rst - queue.rst weakref.rst types.rst copy.rst diff --git a/Doc/library/datetime.rst b/Doc/library/datetime.rst index 8bdd961..065e850 100644 --- a/Doc/library/datetime.rst +++ b/Doc/library/datetime.rst @@ -404,12 +404,19 @@ Other constructors, all class methods: .. classmethod:: date.fromtimestamp(timestamp) Return the local date corresponding to the POSIX timestamp, such as is returned - by :func:`time.time`. This may raise :exc:`ValueError`, if the timestamp is out - of the range of values supported by the platform C :c:func:`localtime` function. + by :func:`time.time`. This may raise :exc:`OverflowError`, if the timestamp is out + of the range of values supported by the platform C :c:func:`localtime` function, + and :exc:`OSError` on :c:func:`localtime` failure. It's common for this to be restricted to years from 1970 through 2038. Note that on non-POSIX systems that include leap seconds in their notion of a timestamp, leap seconds are ignored by :meth:`fromtimestamp`. + .. versionchanged:: 3.3 + Raise :exc:`OverflowError` instead of :exc:`ValueError` if the timestamp + is out of the range of values supported by the platform C + :c:func:`localtime` function. Raise :exc:`OSError` instead of + :exc:`ValueError` on :c:func:`localtime` failure. + .. classmethod:: date.fromordinal(ordinal) @@ -586,8 +593,17 @@ Instance methods: .. method:: date.strftime(format) Return a string representing the date, controlled by an explicit format string. - Format codes referring to hours, minutes or seconds will see 0 values. See - section :ref:`strftime-strptime-behavior`. + Format codes referring to hours, minutes or seconds will see 0 values. For a + complete list of formatting directives, see + :ref:`strftime-strptime-behavior`. + + +.. method:: date.__format__(format) + + Same as :meth:`.date.strftime`. This makes it possible to specify format + string for a :class:`.date` object when using :meth:`str.format`. For a + complete list of formatting directives, see + :ref:`strftime-strptime-behavior`. Example of counting days to an event:: @@ -640,6 +656,8 @@ Example of working with :class:`date`: '11/03/02' >>> d.strftime("%A %d. %B %Y") 'Monday 11. March 2002' + >>> 'The {1} is {0:%d}, the {2} is {0:%B}.'.format(d, "day", "month") + 'The day is 11, the month is March.' .. _datetime-datetime: @@ -713,23 +731,44 @@ Other constructors, all class methods: equivalent to ``tz.fromutc(datetime.utcfromtimestamp(timestamp).replace(tzinfo=tz))``. - :meth:`fromtimestamp` may raise :exc:`ValueError`, if the timestamp is out of + :meth:`fromtimestamp` may raise :exc:`OverflowError`, if the timestamp is out of the range of values supported by the platform C :c:func:`localtime` or - :c:func:`gmtime` functions. It's common for this to be restricted to years in + :c:func:`gmtime` functions, and :exc:`OSError` on :c:func:`localtime` or + :c:func:`gmtime` failure. + It's common for this to be restricted to years in 1970 through 2038. Note that on non-POSIX systems that include leap seconds in their notion of a timestamp, leap seconds are ignored by :meth:`fromtimestamp`, and then it's possible to have two timestamps differing by a second that yield identical :class:`.datetime` objects. See also :meth:`utcfromtimestamp`. + .. versionchanged:: 3.3 + Raise :exc:`OverflowError` instead of :exc:`ValueError` if the timestamp + is out of the range of values supported by the platform C + :c:func:`localtime` or :c:func:`gmtime` functions. Raise :exc:`OSError` + instead of :exc:`ValueError` on :c:func:`localtime` or :c:func:`gmtime` + failure. + .. classmethod:: datetime.utcfromtimestamp(timestamp) Return the UTC :class:`.datetime` corresponding to the POSIX timestamp, with - :attr:`tzinfo` ``None``. This may raise :exc:`ValueError`, if the timestamp is - out of the range of values supported by the platform C :c:func:`gmtime` function. + :attr:`tzinfo` ``None``. This may raise :exc:`OverflowError`, if the timestamp is + out of the range of values supported by the platform C :c:func:`gmtime` function, + and :exc:`OSError` on :c:func:`gmtime` failure. It's common for this to be restricted to years in 1970 through 2038. See also :meth:`fromtimestamp`. + On the POSIX compliant platforms, ``utcfromtimestamp(timestamp)`` + is equivalent to the following expression:: + + datetime(1970, 1, 1) + timedelta(seconds=timestamp) + + .. versionchanged:: 3.3 + Raise :exc:`OverflowError` instead of :exc:`ValueError` if the timestamp + is out of the range of values supported by the platform C + :c:func:`gmtime` function. Raise :exc:`OSError` instead of + :exc:`ValueError` on :c:func:`gmtime` failure. + .. classmethod:: datetime.fromordinal(ordinal) @@ -756,7 +795,8 @@ Other constructors, all class methods: *format*. This is equivalent to ``datetime(*(time.strptime(date_string, format)[0:6]))``. :exc:`ValueError` is raised if the date_string and format can't be parsed by :func:`time.strptime` or if it returns a value which isn't a - time tuple. See section :ref:`strftime-strptime-behavior`. + time tuple. For a complete list of formatting directives, see + :ref:`strftime-strptime-behavior`. @@ -873,13 +913,20 @@ Supported operations: *datetime1* is considered less than *datetime2* when *datetime1* precedes *datetime2* in time. - If one comparand is naive and the other is aware, :exc:`TypeError` is raised. + If one comparand is naive and the other is aware, :exc:`TypeError` + is raised if an order comparison is attempted. For equality + comparisons, naive instances are never equal to aware instances. + If both comparands are aware, and have the same :attr:`tzinfo` attribute, the common :attr:`tzinfo` attribute is ignored and the base datetimes are compared. If both comparands are aware and have different :attr:`tzinfo` attributes, the comparands are first adjusted by subtracting their UTC offsets (obtained from ``self.utcoffset()``). + .. versionchanged:: 3.3 + Equality comparisons between naive and aware :class:`datetime` + instances don't raise :exc:`TypeError`. + .. note:: In order to stop comparison from falling back to the default scheme of comparing @@ -922,17 +969,22 @@ Instance methods: datetime with no conversion of date and time data. -.. method:: datetime.astimezone(tz) +.. method:: datetime.astimezone(tz=None) - Return a :class:`.datetime` object with new :attr:`tzinfo` attribute *tz*, + Return a :class:`datetime` object with new :attr:`tzinfo` attribute *tz*, adjusting the date and time data so the result is the same UTC time as *self*, but in *tz*'s local time. - *tz* must be an instance of a :class:`tzinfo` subclass, and its + If provided, *tz* must be an instance of a :class:`tzinfo` subclass, and its :meth:`utcoffset` and :meth:`dst` methods must not return ``None``. *self* must be aware (``self.tzinfo`` must not be ``None``, and ``self.utcoffset()`` must not return ``None``). + If called without arguments (or with ``tz=None``) the system local + timezone is assumed. The ``tzinfo`` attribute of the converted + datetime instance will be set to an instance of :class:`timezone` + with the zone name and offset obtained from the OS. + If ``self.tzinfo`` is *tz*, ``self.astimezone(tz)`` is equal to *self*: no adjustment of date or time data is performed. Else the result is local time in time zone *tz*, representing the same UTC time as *self*: after @@ -959,6 +1011,9 @@ Instance methods: # Convert from UTC to tz's local time. return tz.fromutc(utc) + .. versionchanged:: 3.3 + *tz* now can be omitted. + .. method:: datetime.utcoffset() @@ -1015,6 +1070,39 @@ Instance methods: Return the proleptic Gregorian ordinal of the date. The same as ``self.date().toordinal()``. +.. method:: datetime.timestamp() + + Return POSIX timestamp corresponding to the :class:`datetime` + instance. The return value is a :class:`float` similar to that + returned by :func:`time.time`. + + Naive :class:`datetime` instances are assumed to represent local + time and this method relies on the platform C :c:func:`mktime` + function to perform the conversion. Since :class:`datetime` + supports wider range of values than :c:func:`mktime` on many + platforms, this method may raise :exc:`OverflowError` for times far + in the past or far in the future. + + For aware :class:`datetime` instances, the return value is computed + as:: + + (dt - datetime(1970, 1, 1, tzinfo=timezone.utc)).total_seconds() + + .. versionadded:: 3.3 + + .. note:: + + There is no method to obtain the POSIX timestamp directly from a + naive :class:`datetime` instance representing UTC time. If your + application uses this convention and your system timezone is not + set to UTC, you can obtain the POSIX timestamp by supplying + ``tzinfo=timezone.utc``:: + + timestamp = dt.replace(tzinfo=timezone.utc).timestamp() + + or by calculating the timestamp directly:: + + timestamp = (dt - datetime(1970, 1, 1)) / timedelta(seconds=1) .. method:: datetime.weekday() @@ -1075,7 +1163,16 @@ Instance methods: .. method:: datetime.strftime(format) Return a string representing the date and time, controlled by an explicit format - string. See section :ref:`strftime-strptime-behavior`. + string. For a complete list of formatting directives, see + :ref:`strftime-strptime-behavior`. + + +.. method:: datetime.__format__(format) + + Same as :meth:`.datetime.strftime`. This makes it possible to specify format + string for a :class:`.datetime` object when using :meth:`str.format`. For a + complete list of formatting directives, see + :ref:`strftime-strptime-behavior`. Examples of working with datetime objects: @@ -1122,6 +1219,8 @@ Examples of working with datetime objects: >>> # Formatting datetime >>> dt.strftime("%A, %d. %B %Y %I:%M%p") 'Tuesday, 21. November 2006 04:30PM' + >>> 'The {1} is {0:%d}, the {2} is {0:%B}, the {3} is {0:%I:%M%p}.'.format(dt, "day", "month", "time") + 'The day is 21, the month is November, the time is 04:30PM.' Using datetime with tzinfo: @@ -1254,7 +1353,10 @@ Supported operations: * comparison of :class:`.time` to :class:`.time`, where *a* is considered less than *b* when *a* precedes *b* in time. If one comparand is naive and the other - is aware, :exc:`TypeError` is raised. If both comparands are aware, and have + is aware, :exc:`TypeError` is raised if an order comparison is attempted. For equality + comparisons, naive instances are never equal to aware instances. + + If both comparands are aware, and have the same :attr:`tzinfo` attribute, the common :attr:`tzinfo` attribute is ignored and the base times are compared. If both comparands are aware and have different :attr:`tzinfo` attributes, the comparands are first adjusted by @@ -1264,6 +1366,10 @@ Supported operations: different type, :exc:`TypeError` is raised unless the comparison is ``==`` or ``!=``. The latter cases return :const:`False` or :const:`True`, respectively. + .. versionchanged:: 3.3 + Equality comparisons between naive and aware :class:`time` instances + don't raise :exc:`TypeError`. + * hash, use as dict key * efficient pickling @@ -1298,8 +1404,17 @@ Instance methods: .. method:: time.strftime(format) - Return a string representing the time, controlled by an explicit format string. - See section :ref:`strftime-strptime-behavior`. + Return a string representing the time, controlled by an explicit format + string. For a complete list of formatting directives, see + :ref:`strftime-strptime-behavior`. + + +.. method:: time.__format__(format) + + Same as :meth:`.time.strftime`. This makes it possible to specify format string + for a :class:`.time` object when using :meth:`str.format`. For a + complete list of formatting directives, see + :ref:`strftime-strptime-behavior`. .. method:: time.utcoffset() @@ -1348,6 +1463,8 @@ Example: 'Europe/Prague' >>> t.strftime("%H:%M:%S %Z") '12:10:30 Europe/Prague' + >>> 'The {} is {:%H:%M}.'.format("time", t) + 'The time is 12:10.' .. _datetime-tzinfo: @@ -1587,11 +1704,12 @@ only EST (fixed offset -5 hours), or only EDT (fixed offset -4 hours)). :class:`timezone` Objects -------------------------- -A :class:`timezone` object represents a timezone that is defined by a -fixed offset from UTC. Note that objects of this class cannot be used -to represent timezone information in the locations where different -offsets are used in different days of the year or where historical -changes have been made to civil time. +The :class:`timezone` class is a subclass of :class:`tzinfo`, each +instance of which represents a timezone defined by a fixed offset from +UTC. Note that objects of this class cannot be used to represent +timezone information in the locations where different offsets are used +in different days of the year or where historical changes have been +made to civil time. .. class:: timezone(offset[, name]) @@ -1662,158 +1780,180 @@ For :class:`date` objects, the format codes for hours, minutes, seconds, and microseconds should not be used, as :class:`date` objects have no such values. If they're used anyway, ``0`` is substituted for them. -For a naive object, the ``%z`` and ``%Z`` format codes are replaced by empty -strings. - -For an aware object: - -``%z`` - :meth:`utcoffset` is transformed into a 5-character string of the form +HHMM or - -HHMM, where HH is a 2-digit string giving the number of UTC offset hours, and - MM is a 2-digit string giving the number of UTC offset minutes. For example, if - :meth:`utcoffset` returns ``timedelta(hours=-3, minutes=-30)``, ``%z`` is - replaced with the string ``'-0330'``. - -``%Z`` - If :meth:`tzname` returns ``None``, ``%Z`` is replaced by an empty string. - Otherwise ``%Z`` is replaced by the returned value, which must be a string. - The full set of format codes supported varies across platforms, because Python calls the platform C library's :func:`strftime` function, and platform -variations are common. +variations are common. To see the full set of format codes supported on your +platform, consult the :manpage:`strftime(3)` documentation. The following is a list of all the format codes that the C standard (1989 version) requires, and these work on all platforms with a standard C implementation. Note that the 1999 version of the C standard added additional format codes. -+-----------+--------------------------------+-------+ -| Directive | Meaning | Notes | -+===========+================================+=======+ -| ``%a`` | Locale's abbreviated weekday | | -| | name. | | -+-----------+--------------------------------+-------+ -| ``%A`` | Locale's full weekday name. | | -+-----------+--------------------------------+-------+ -| ``%b`` | Locale's abbreviated month | | -| | name. | | -+-----------+--------------------------------+-------+ -| ``%B`` | Locale's full month name. | | -+-----------+--------------------------------+-------+ -| ``%c`` | Locale's appropriate date and | | -| | time representation. | | -+-----------+--------------------------------+-------+ -| ``%d`` | Day of the month as a decimal | | -| | number [01,31]. | | -+-----------+--------------------------------+-------+ -| ``%f`` | Microsecond as a decimal | \(1) | -| | number [0,999999], zero-padded | | -| | on the left | | -+-----------+--------------------------------+-------+ -| ``%H`` | Hour (24-hour clock) as a | | -| | decimal number [00,23]. | | -+-----------+--------------------------------+-------+ -| ``%I`` | Hour (12-hour clock) as a | | -| | decimal number [01,12]. | | -+-----------+--------------------------------+-------+ -| ``%j`` | Day of the year as a decimal | | -| | number [001,366]. | | -+-----------+--------------------------------+-------+ -| ``%m`` | Month as a decimal number | | -| | [01,12]. | | -+-----------+--------------------------------+-------+ -| ``%M`` | Minute as a decimal number | | -| | [00,59]. | | -+-----------+--------------------------------+-------+ -| ``%p`` | Locale's equivalent of either | \(2) | -| | AM or PM. | | -+-----------+--------------------------------+-------+ -| ``%S`` | Second as a decimal number | \(3) | -| | [00,59]. | | -+-----------+--------------------------------+-------+ -| ``%U`` | Week number of the year | \(4) | -| | (Sunday as the first day of | | -| | the week) as a decimal number | | -| | [00,53]. All days in a new | | -| | year preceding the first | | -| | Sunday are considered to be in | | -| | week 0. | | -+-----------+--------------------------------+-------+ -| ``%w`` | Weekday as a decimal number | | -| | [0(Sunday),6]. | | -+-----------+--------------------------------+-------+ -| ``%W`` | Week number of the year | \(4) | -| | (Monday as the first day of | | -| | the week) as a decimal number | | -| | [00,53]. All days in a new | | -| | year preceding the first | | -| | Monday are considered to be in | | -| | week 0. | | -+-----------+--------------------------------+-------+ -| ``%x`` | Locale's appropriate date | | -| | representation. | | -+-----------+--------------------------------+-------+ -| ``%X`` | Locale's appropriate time | | -| | representation. | | -+-----------+--------------------------------+-------+ -| ``%y`` | Year without century as a | | -| | decimal number [00,99]. | | -+-----------+--------------------------------+-------+ -| ``%Y`` | Year with century as a decimal | \(5) | -| | number [0001,9999] (strptime), | | -| | [1000,9999] (strftime). | | -+-----------+--------------------------------+-------+ -| ``%z`` | UTC offset in the form +HHMM | \(6) | -| | or -HHMM (empty string if the | | -| | the object is naive). | | -+-----------+--------------------------------+-------+ -| ``%Z`` | Time zone name (empty string | | -| | if the object is naive). | | -+-----------+--------------------------------+-------+ -| ``%%`` | A literal ``'%'`` character. | | -+-----------+--------------------------------+-------+ ++-----------+--------------------------------+------------------------+-------+ +| Directive | Meaning | Example | Notes | ++===========+================================+========================+=======+ +| ``%a`` | Weekday as locale's || Sun, Mon, ..., Sat | \(1) | +| | abbreviated name. | (en_US); | | +| | || So, Mo, ..., Sa | | +| | | (de_DE) | | ++-----------+--------------------------------+------------------------+-------+ +| ``%A`` | Weekday as locale's full name. || Sunday, Monday, ..., | \(1) | +| | | Saturday (en_US); | | +| | || Sonntag, Montag, ..., | | +| | | Samstag (de_DE) | | ++-----------+--------------------------------+------------------------+-------+ +| ``%w`` | Weekday as a decimal number, | 0, 1, ..., 6 | | +| | where 0 is Sunday and 6 is | | | +| | Saturday. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%d`` | Day of the month as a | 01, 02, ..., 31 | | +| | zero-padded decimal number. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%b`` | Month as locale's abbreviated || Jan, Feb, ..., Dec | \(1) | +| | name. | (en_US); | | +| | || Jan, Feb, ..., Dez | | +| | | (de_DE) | | ++-----------+--------------------------------+------------------------+-------+ +| ``%B`` | Month as locale's full name. || January, February, | \(1) | +| | | ..., December (en_US);| | +| | || Januar, Februar, ..., | | +| | | Dezember (de_DE) | | ++-----------+--------------------------------+------------------------+-------+ +| ``%m`` | Month as a zero-padded | 01, 02, ..., 12 | | +| | decimal number. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%y`` | Year without century as a | 00, 01, ..., 99 | | +| | zero-padded decimal number. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%Y`` | Year with century as a decimal | 0001, 0002, ..., 2013, | \(2) | +| | number. | 2014, ..., 9998, 9999 | | ++-----------+--------------------------------+------------------------+-------+ +| ``%H`` | Hour (24-hour clock) as a | 00, 01, ..., 23 | | +| | zero-padded decimal number. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%I`` | Hour (12-hour clock) as a | 01, 02, ..., 12 | | +| | zero-padded decimal number. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%p`` | Locale's equivalent of either || AM, PM (en_US); | \(1), | +| | AM or PM. || am, pm (de_DE) | \(3) | ++-----------+--------------------------------+------------------------+-------+ +| ``%M`` | Minute as a zero-padded | 00, 01, ..., 59 | | +| | decimal number. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%S`` | Second as a zero-padded | 00, 01, ..., 59 | \(4) | +| | decimal number. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%f`` | Microsecond as a decimal | 000000, 000001, ..., | \(5) | +| | number, zero-padded on the | 999999 | | +| | left. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%z`` | UTC offset in the form +HHMM | (empty), +0000, -0400, | \(6) | +| | or -HHMM (empty string if the | +1030 | | +| | the object is naive). | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%Z`` | Time zone name (empty string | (empty), UTC, EST, CST | | +| | if the object is naive). | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%j`` | Day of the year as a | 001, 002, ..., 366 | | +| | zero-padded decimal number. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%U`` | Week number of the year | 00, 01, ..., 53 | \(7) | +| | (Sunday as the first day of | | | +| | the week) as a zero padded | | | +| | decimal number. All days in a | | | +| | new year preceding the first | | | +| | Sunday are considered to be in | | | +| | week 0. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%W`` | Week number of the year | 00, 01, ..., 53 | \(7) | +| | (Monday as the first day of | | | +| | the week) as a decimal number. | | | +| | All days in a new year | | | +| | preceding the first Monday | | | +| | are considered to be in | | | +| | week 0. | | | ++-----------+--------------------------------+------------------------+-------+ +| ``%c`` | Locale's appropriate date and || Tue Aug 16 21:30:00 | \(1) | +| | time representation. | 1988 (en_US); | | +| | || Di 16 Aug 21:30:00 | | +| | | 1988 (de_DE) | | ++-----------+--------------------------------+------------------------+-------+ +| ``%x`` | Locale's appropriate date || 08/16/88 (None); | \(1) | +| | representation. || 08/16/1988 (en_US); | | +| | || 16.08.1988 (de_DE) | | ++-----------+--------------------------------+------------------------+-------+ +| ``%X`` | Locale's appropriate time || 21:30:00 (en_US); | \(1) | +| | representation. || 21:30:00 (de_DE) | | ++-----------+--------------------------------+------------------------+-------+ +| ``%%`` | A literal ``'%'`` character. | % | | ++-----------+--------------------------------+------------------------+-------+ Notes: (1) + Because the format depends on the current locale, care should be taken when + making assumptions about the output value. Field orderings will vary (for + example, "month/day/year" versus "day/month/year"), and the output may + contain Unicode characters encoded using the locale's default encoding (for + example, if the current locale is ``ja_JP``, the default encoding could be + any one of ``eucJP``, ``SJIS``, or ``utf-8``; use :meth:`locale.getlocale` + to determine the current locale's encoding). + +(2) + The :meth:`strptime` method can parse years in the full [1, 9999] range, but + years < 1000 must be zero-filled to 4-digit width. + + .. versionchanged:: 3.2 + In previous versions, :meth:`strftime` method was restricted to + years >= 1900. + + .. versionchanged:: 3.3 + In version 3.2, :meth:`strftime` method was restricted to + years >= 1000. + +(3) + When used with the :meth:`strptime` method, the ``%p`` directive only affects + the output hour field if the ``%I`` directive is used to parse the hour. + +(4) + Unlike the :mod:`time` module, the :mod:`datetime` module does not support + leap seconds. + +(5) When used with the :meth:`strptime` method, the ``%f`` directive accepts from one to six digits and zero pads on the right. ``%f`` is an extension to the set of format characters in the C standard (but implemented separately in datetime objects, and therefore always available). -(2) - When used with the :meth:`strptime` method, the ``%p`` directive only affects - the output hour field if the ``%I`` directive is used to parse the hour. +(6) + For a naive object, the ``%z`` and ``%Z`` format codes are replaced by empty + strings. -(3) - Unlike :mod:`time` module, :mod:`datetime` module does not support - leap seconds. + For an aware object: -(4) - When used with the :meth:`strptime` method, ``%U`` and ``%W`` are only used in - calculations when the day of the week and the year are specified. + ``%z`` + :meth:`utcoffset` is transformed into a 5-character string of the form + +HHMM or -HHMM, where HH is a 2-digit string giving the number of UTC + offset hours, and MM is a 2-digit string giving the number of UTC offset + minutes. For example, if :meth:`utcoffset` returns + ``timedelta(hours=-3, minutes=-30)``, ``%z`` is replaced with the string + ``'-0330'``. -(5) - For technical reasons, :meth:`strftime` method does not support - dates before year 1000: ``t.strftime(format)`` will raise a - :exc:`ValueError` when ``t.year < 1000`` even if ``format`` does - not contain ``%Y`` directive. The :meth:`strptime` method can - parse years in the full [1, 9999] range, but years < 1000 must be - zero-filled to 4-digit width. + ``%Z`` + If :meth:`tzname` returns ``None``, ``%Z`` is replaced by an empty + string. Otherwise ``%Z`` is replaced by the returned value, which must + be a string. .. versionchanged:: 3.2 - In previous versions, :meth:`strftime` method was restricted to - years >= 1900. - -(6) - For example, if :meth:`utcoffset` returns ``timedelta(hours=-3, minutes=-30)``, - ``%z`` is replaced with the string ``'-0330'``. + When the ``%z`` directive is provided to the :meth:`strptime` method, an + aware :class:`.datetime` object will be produced. The ``tzinfo`` of the + result will be set to a :class:`timezone` instance. -.. versionchanged:: 3.2 - When the ``%z`` directive is provided to the :meth:`strptime` method, an - aware :class:`.datetime` object will be produced. The ``tzinfo`` of the - result will be set to a :class:`timezone` instance. +(7) + When used with the :meth:`strptime` method, ``%U`` and ``%W`` are only used + in calculations when the day of the week and the year are specified. .. rubric:: Footnotes diff --git a/Doc/library/dbm.rst b/Doc/library/dbm.rst index e3d50b9..bd88f41 100644 --- a/Doc/library/dbm.rst +++ b/Doc/library/dbm.rst @@ -216,6 +216,9 @@ supported. When the database has been opened in fast mode, this method forces any unwritten data to be written to the disk. + .. method:: gdbm.close() + + Close the ``gdbm`` database. :mod:`dbm.ndbm` --- Interface based on ndbm ------------------------------------------- @@ -247,7 +250,7 @@ to locate the appropriate header file to simplify building this module. .. function:: open(filename[, flag[, mode]]) - Open a dbm database and return a ``dbm`` object. The *filename* argument is the + Open a dbm database and return a ``ndbm`` object. The *filename* argument is the name of the database file (without the :file:`.dir` or :file:`.pag` extensions). The optional *flag* argument must be one of these values: @@ -272,6 +275,12 @@ to locate the appropriate header file to simplify building this module. database has to be created. It defaults to octal ``0o666`` (and will be modified by the prevailing umask). + In addition to the dictionary-like methods, ``ndbm`` objects + provide the following method: + + .. method:: ndbm.close() + + Close the ``ndbm`` database. :mod:`dbm.dumb` --- Portable DBM implementation @@ -317,10 +326,16 @@ The module defines the following: database has to be created. It defaults to octal ``0o666`` (and will be modified by the prevailing umask). - In addition to the methods provided by the :class:`collections.MutableMapping` class, - :class:`dumbdbm` objects provide the following method: + In addition to the methods provided by the + :class:`collections.abc.MutableMapping` class, :class:`dumbdbm` objects + provide the following methods: .. method:: dumbdbm.sync() Synchronize the on-disk directory and data files. This method is called by the :meth:`Shelve.sync` method. + + .. method:: dumbdbm.close() + + Close the ``dumbdbm`` database. + diff --git a/Doc/library/debug.rst b/Doc/library/debug.rst index b2ee4fa..c69fb1c 100644 --- a/Doc/library/debug.rst +++ b/Doc/library/debug.rst @@ -10,7 +10,8 @@ allowing you to identify bottlenecks in your programs. .. toctree:: bdb.rst + faulthandler.rst pdb.rst profile.rst timeit.rst - trace.rst
\ No newline at end of file + trace.rst diff --git a/Doc/library/decimal.rst b/Doc/library/decimal.rst index d989e3f..059ae7c 100644 --- a/Doc/library/decimal.rst +++ b/Doc/library/decimal.rst @@ -9,6 +9,7 @@ .. moduleauthor:: Raymond Hettinger <python at rcn.com> .. moduleauthor:: Aahz <aahz at pobox.com> .. moduleauthor:: Tim Peters <tim.one at comcast.net> +.. moduleauthor:: Stefan Krah <skrah at bytereef.org> .. sectionauthor:: Raymond D. Hettinger <python at rcn.com> .. import modules for testing inline doctests with the Sphinx doctest builder @@ -20,8 +21,9 @@ # make sure each group gets a fresh context setcontext(Context()) -The :mod:`decimal` module provides support for decimal floating point -arithmetic. It offers several advantages over the :class:`float` datatype: +The :mod:`decimal` module provides support for fast correctly-rounded +decimal floating point arithmetic. It offers several advantages over the +:class:`float` datatype: * Decimal "is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle -- computers must @@ -92,7 +94,7 @@ computation. Depending on the needs of the application, signals may be ignored, considered as informational, or treated as exceptions. The signals in the decimal module are: :const:`Clamped`, :const:`InvalidOperation`, :const:`DivisionByZero`, :const:`Inexact`, :const:`Rounded`, :const:`Subnormal`, -:const:`Overflow`, and :const:`Underflow`. +:const:`Overflow`, :const:`Underflow` and :const:`FloatOperation`. For each signal there is a flag and a trap enabler. When a signal is encountered, its flag is set to one, then, if the trap enabler is @@ -122,7 +124,7 @@ precision, rounding, or enabled traps:: >>> from decimal import * >>> getcontext() - Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, + Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[Overflow, DivisionByZero, InvalidOperation]) @@ -132,7 +134,7 @@ Decimal instances can be constructed from integers, strings, floats, or tuples. Construction from an integer or a float performs an exact conversion of the value of that integer or float. Decimal numbers include special values such as :const:`NaN` which stands for "Not a number", positive and negative -:const:`Infinity`, and :const:`-0`. +:const:`Infinity`, and :const:`-0`:: >>> getcontext().prec = 28 >>> Decimal(10) @@ -152,6 +154,25 @@ value of that integer or float. Decimal numbers include special values such as >>> Decimal('-Infinity') Decimal('-Infinity') +If the :exc:`FloatOperation` signal is trapped, accidental mixing of +decimals and floats in constructors or ordering comparisons raises +an exception:: + + >>> c = getcontext() + >>> c.traps[FloatOperation] = True + >>> Decimal(3.14) + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + decimal.FloatOperation: [<class 'decimal.FloatOperation'>] + >>> Decimal('3.5') < 3.7 + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + decimal.FloatOperation: [<class 'decimal.FloatOperation'>] + >>> Decimal('3.5') == 3.5 + True + +.. versionadded:: 3.3 + The significance of a new Decimal is determined solely by the number of digits input. Context precision and rounding only come into play during arithmetic operations. @@ -169,6 +190,16 @@ operations. >>> Decimal('3.1415926535') + Decimal('2.7182818285') Decimal('5.85988') +If the internal limits of the C version are exceeded, constructing +a decimal raises :class:`InvalidOperation`:: + + >>> Decimal("1e9999999999999999999") + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + decimal.InvalidOperation: [<class 'decimal.InvalidOperation'>] + +.. versionchanged:: 3.3 + Decimals interact well with much of the rest of Python. Here is a small decimal floating point flying circus: @@ -244,7 +275,7 @@ enabled: Decimal('0.142857142857142857142857142857142857142857142857142857142857') >>> ExtendedContext - Context(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, + Context(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[]) >>> setcontext(ExtendedContext) >>> Decimal(1) / Decimal(7) @@ -269,7 +300,7 @@ using the :meth:`clear_flags` method. :: >>> Decimal(355) / Decimal(113) Decimal('3.14159292') >>> getcontext() - Context(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, + Context(prec=9, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, Rounded], traps=[]) The *flags* entry shows that the rational approximation to :const:`Pi` was @@ -358,6 +389,10 @@ Decimal objects The argument to the constructor is now permitted to be a :class:`float` instance. + .. versionchanged:: 3.3 + :class:`float` arguments raise an exception if the :exc:`FloatOperation` + trap is set. By default the trap is off. + Decimal floating point objects share many properties with the other built-in numeric types such as :class:`float` and :class:`int`. All of the usual math operations and special methods apply. Likewise, decimal objects can be @@ -424,7 +459,7 @@ Decimal objects a :class:`Decimal` instance is always canonical, so this operation returns its argument unchanged. - .. method:: compare(other[, context]) + .. method:: compare(other, context=None) Compare the values of two Decimal instances. :meth:`compare` returns a Decimal instance, and if either operand is a NaN then the result is a @@ -435,13 +470,13 @@ Decimal objects a == b ==> Decimal('0') a > b ==> Decimal('1') - .. method:: compare_signal(other[, context]) + .. method:: compare_signal(other, context=None) This operation is identical to the :meth:`compare` method, except that all NaNs signal. That is, if neither operand is a signaling NaN then any quiet NaN operand is treated as though it were a signaling NaN. - .. method:: compare_total(other) + .. method:: compare_total(other, context=None) Compare two operands using their abstract representation rather than their numerical value. Similar to the :meth:`compare` method, but the result @@ -459,13 +494,21 @@ Decimal objects higher in the total order than the second operand. See the specification for details of the total order. - .. method:: compare_total_mag(other) + This operation is unaffected by context and is quiet: no flags are changed + and no rounding is performed. As an exception, the C version may raise + InvalidOperation if the second operand cannot be converted exactly. + + .. method:: compare_total_mag(other, context=None) Compare two operands using their abstract representation rather than their value as in :meth:`compare_total`, but ignoring the sign of each operand. ``x.compare_total_mag(y)`` is equivalent to ``x.copy_abs().compare_total(y.copy_abs())``. + This operation is unaffected by context and is quiet: no flags are changed + and no rounding is performed. As an exception, the C version may raise + InvalidOperation if the second operand cannot be converted exactly. + .. method:: conjugate() Just returns self, this method is only to comply with the Decimal @@ -482,7 +525,7 @@ Decimal objects Return the negation of the argument. This operation is unaffected by the context and is quiet: no flags are changed and no rounding is performed. - .. method:: copy_sign(other) + .. method:: copy_sign(other, context=None) Return a copy of the first operand with the sign set to be the same as the sign of the second operand. For example: @@ -490,10 +533,11 @@ Decimal objects >>> Decimal('2.3').copy_sign(Decimal('-1.5')) Decimal('-2.3') - This operation is unaffected by the context and is quiet: no flags are - changed and no rounding is performed. + This operation is unaffected by context and is quiet: no flags are changed + and no rounding is performed. As an exception, the C version may raise + InvalidOperation if the second operand cannot be converted exactly. - .. method:: exp([context]) + .. method:: exp(context=None) Return the value of the (natural) exponential function ``e**x`` at the given number. The result is correctly rounded using the @@ -530,7 +574,7 @@ Decimal objects .. versionadded:: 3.1 - .. method:: fma(other, third[, context]) + .. method:: fma(other, third, context=None) Fused multiply-add. Return self*other+third with no rounding of the intermediate product self*other. @@ -559,7 +603,7 @@ Decimal objects Return :const:`True` if the argument is a (quiet or signaling) NaN and :const:`False` otherwise. - .. method:: is_normal() + .. method:: is_normal(context=None) Return :const:`True` if the argument is a *normal* finite number. Return :const:`False` if the argument is zero, subnormal, infinite or a NaN. @@ -579,7 +623,7 @@ Decimal objects Return :const:`True` if the argument is a signaling NaN and :const:`False` otherwise. - .. method:: is_subnormal() + .. method:: is_subnormal(context=None) Return :const:`True` if the argument is subnormal, and :const:`False` otherwise. @@ -589,17 +633,17 @@ Decimal objects Return :const:`True` if the argument is a (positive or negative) zero and :const:`False` otherwise. - .. method:: ln([context]) + .. method:: ln(context=None) Return the natural (base e) logarithm of the operand. The result is correctly rounded using the :const:`ROUND_HALF_EVEN` rounding mode. - .. method:: log10([context]) + .. method:: log10(context=None) Return the base ten logarithm of the operand. The result is correctly rounded using the :const:`ROUND_HALF_EVEN` rounding mode. - .. method:: logb([context]) + .. method:: logb(context=None) For a nonzero number, return the adjusted exponent of its operand as a :class:`Decimal` instance. If the operand is a zero then @@ -607,73 +651,73 @@ Decimal objects is raised. If the operand is an infinity then ``Decimal('Infinity')`` is returned. - .. method:: logical_and(other[, context]) + .. method:: logical_and(other, context=None) :meth:`logical_and` is a logical operation which takes two *logical operands* (see :ref:`logical_operands_label`). The result is the digit-wise ``and`` of the two operands. - .. method:: logical_invert([context]) + .. method:: logical_invert(context=None) :meth:`logical_invert` is a logical operation. The result is the digit-wise inversion of the operand. - .. method:: logical_or(other[, context]) + .. method:: logical_or(other, context=None) :meth:`logical_or` is a logical operation which takes two *logical operands* (see :ref:`logical_operands_label`). The result is the digit-wise ``or`` of the two operands. - .. method:: logical_xor(other[, context]) + .. method:: logical_xor(other, context=None) :meth:`logical_xor` is a logical operation which takes two *logical operands* (see :ref:`logical_operands_label`). The result is the digit-wise exclusive or of the two operands. - .. method:: max(other[, context]) + .. method:: max(other, context=None) Like ``max(self, other)`` except that the context rounding rule is applied before returning and that :const:`NaN` values are either signaled or ignored (depending on the context and whether they are signaling or quiet). - .. method:: max_mag(other[, context]) + .. method:: max_mag(other, context=None) Similar to the :meth:`.max` method, but the comparison is done using the absolute values of the operands. - .. method:: min(other[, context]) + .. method:: min(other, context=None) Like ``min(self, other)`` except that the context rounding rule is applied before returning and that :const:`NaN` values are either signaled or ignored (depending on the context and whether they are signaling or quiet). - .. method:: min_mag(other[, context]) + .. method:: min_mag(other, context=None) Similar to the :meth:`.min` method, but the comparison is done using the absolute values of the operands. - .. method:: next_minus([context]) + .. method:: next_minus(context=None) Return the largest number representable in the given context (or in the current thread's context if no context is given) that is smaller than the given operand. - .. method:: next_plus([context]) + .. method:: next_plus(context=None) Return the smallest number representable in the given context (or in the current thread's context if no context is given) that is larger than the given operand. - .. method:: next_toward(other[, context]) + .. method:: next_toward(other, context=None) If the two operands are unequal, return the number closest to the first operand in the direction of the second operand. If both operands are numerically equal, return a copy of the first operand with the sign set to be the same as the sign of the second operand. - .. method:: normalize([context]) + .. method:: normalize(context=None) Normalize the number by stripping the rightmost trailing zeros and converting any result equal to :const:`Decimal('0')` to @@ -682,7 +726,7 @@ Decimal objects ``Decimal('0.321000e+2')`` both normalize to the equivalent value ``Decimal('32.1')``. - .. method:: number_class([context]) + .. method:: number_class(context=None) Return a string describing the *class* of the operand. The returned value is one of the following ten strings. @@ -698,7 +742,7 @@ Decimal objects * ``"NaN"``, indicating that the operand is a quiet NaN (Not a Number). * ``"sNaN"``, indicating that the operand is a signaling NaN. - .. method:: quantize(exp[, rounding[, context[, watchexp]]]) + .. method:: quantize(exp, rounding=None, context=None, watchexp=True) Return a value equal to the first operand after rounding and having the exponent of the second operand. @@ -725,13 +769,18 @@ Decimal objects resulting exponent is greater than :attr:`Emax` or less than :attr:`Etiny`. + .. deprecated:: 3.3 + *watchexp* is an implementation detail from the pure Python version + and is not present in the C version. It will be removed in version + 3.4, where it defaults to ``True``. + .. method:: radix() Return ``Decimal(10)``, the radix (base) in which the :class:`Decimal` class does all its arithmetic. Included for compatibility with the specification. - .. method:: remainder_near(other[, context]) + .. method:: remainder_near(other, context=None) Return the remainder from dividing *self* by *other*. This differs from ``self % other`` in that the sign of the remainder is chosen so as to @@ -749,7 +798,7 @@ Decimal objects >>> Decimal(35).remainder_near(Decimal(10)) Decimal('-5') - .. method:: rotate(other[, context]) + .. method:: rotate(other, context=None) Return the result of rotating the digits of the first operand by an amount specified by the second operand. The second operand must be an integer in @@ -760,18 +809,22 @@ Decimal objects length precision if necessary. The sign and exponent of the first operand are unchanged. - .. method:: same_quantum(other[, context]) + .. method:: same_quantum(other, context=None) Test whether self and other have the same exponent or whether both are :const:`NaN`. - .. method:: scaleb(other[, context]) + This operation is unaffected by context and is quiet: no flags are changed + and no rounding is performed. As an exception, the C version may raise + InvalidOperation if the second operand cannot be converted exactly. + + .. method:: scaleb(other, context=None) Return the first operand with exponent adjusted by the second. Equivalently, return the first operand multiplied by ``10**other``. The second operand must be an integer. - .. method:: shift(other[, context]) + .. method:: shift(other, context=None) Return the result of shifting the digits of the first operand by an amount specified by the second operand. The second operand must be an integer in @@ -781,12 +834,12 @@ Decimal objects right. Digits shifted into the coefficient are zeros. The sign and exponent of the first operand are unchanged. - .. method:: sqrt([context]) + .. method:: sqrt(context=None) Return the square root of the argument to full precision. - .. method:: to_eng_string([context]) + .. method:: to_eng_string(context=None) Convert to an engineering-type string. @@ -794,12 +847,12 @@ Decimal objects are up to 3 digits left of the decimal place. For example, converts ``Decimal('123E+1')`` to ``Decimal('1.23E+3')`` - .. method:: to_integral([rounding[, context]]) + .. method:: to_integral(rounding=None, context=None) Identical to the :meth:`to_integral_value` method. The ``to_integral`` name has been kept for compatibility with older versions. - .. method:: to_integral_exact([rounding[, context]]) + .. method:: to_integral_exact(rounding=None, context=None) Round to the nearest integer, signaling :const:`Inexact` or :const:`Rounded` as appropriate if rounding occurs. The rounding mode is @@ -807,7 +860,7 @@ Decimal objects ``context``. If neither parameter is given then the rounding mode of the current context is used. - .. method:: to_integral_value([rounding[, context]]) + .. method:: to_integral_value(rounding=None, context=None) Round to the nearest integer without signaling :const:`Inexact` or :const:`Rounded`. If given, applies *rounding*; otherwise, uses the @@ -853,10 +906,10 @@ Each thread has its own current context which is accessed or changed using the You can also use the :keyword:`with` statement and the :func:`localcontext` function to temporarily change the active context. -.. function:: localcontext([c]) +.. function:: localcontext(ctx=None) Return a context manager that will set the current context for the active thread - to a copy of *c* on entry to the with-statement and restore the previous context + to a copy of *ctx* on entry to the with-statement and restore the previous context when exiting the with-statement. If no context is specified, a copy of the current context is used. @@ -912,39 +965,33 @@ described below. In addition, the module provides three pre-made contexts: In single threaded environments, it is preferable to not use this context at all. Instead, simply create contexts explicitly as described below. - The default values are precision=28, rounding=ROUND_HALF_EVEN, and enabled traps - for Overflow, InvalidOperation, and DivisionByZero. + The default values are :attr:`prec`\ =\ :const:`28`, + :attr:`rounding`\ =\ :const:`ROUND_HALF_EVEN`, + and enabled traps for :class:`Overflow`, :class:`InvalidOperation`, and + :class:`DivisionByZero`. In addition to the three supplied contexts, new contexts can be created with the :class:`Context` constructor. -.. class:: Context(prec=None, rounding=None, traps=None, flags=None, Emin=None, Emax=None, capitals=None, clamp=None) +.. class:: Context(prec=None, rounding=None, Emin=None, Emax=None, capitals=None, clamp=None, flags=None, traps=None) Creates a new context. If a field is not specified or is :const:`None`, the default values are copied from the :const:`DefaultContext`. If the *flags* field is not specified or is :const:`None`, all flags are cleared. - The *prec* field is a positive integer that sets the precision for arithmetic - operations in the context. + *prec* is an integer in the range [:const:`1`, :const:`MAX_PREC`] that sets + the precision for arithmetic operations in the context. - The *rounding* option is one of: - - * :const:`ROUND_CEILING` (towards :const:`Infinity`), - * :const:`ROUND_DOWN` (towards zero), - * :const:`ROUND_FLOOR` (towards :const:`-Infinity`), - * :const:`ROUND_HALF_DOWN` (to nearest with ties going towards zero), - * :const:`ROUND_HALF_EVEN` (to nearest with ties going to nearest even integer), - * :const:`ROUND_HALF_UP` (to nearest with ties going away from zero), or - * :const:`ROUND_UP` (away from zero). - * :const:`ROUND_05UP` (away from zero if last digit after rounding towards zero - would have been 0 or 5; otherwise towards zero) + The *rounding* option is one of the constants listed in the section + `Rounding Modes`_. The *traps* and *flags* fields list any signals to be set. Generally, new contexts should only set traps and leave the flags clear. The *Emin* and *Emax* fields are integers specifying the outer limits allowable - for exponents. + for exponents. *Emin* must be in the range [:const:`MIN_EMIN`, :const:`0`], + *Emax* in the range [:const:`0`, :const:`MAX_EMAX`]. The *capitals* field is either :const:`0` or :const:`1` (the default). If set to :const:`1`, exponents are printed with a capital :const:`E`; otherwise, a @@ -983,6 +1030,12 @@ In addition to the three supplied contexts, new contexts can be created with the Resets all of the flags to :const:`0`. + .. method:: clear_traps() + + Resets all of the traps to :const:`0`. + + .. versionadded:: 3.3 + .. method:: copy() Return a duplicate of the context. @@ -1130,52 +1183,52 @@ In addition to the three supplied contexts, new contexts can be created with the .. method:: is_canonical(x) - Returns True if *x* is canonical; otherwise returns False. + Returns ``True`` if *x* is canonical; otherwise returns ``False``. .. method:: is_finite(x) - Returns True if *x* is finite; otherwise returns False. + Returns ``True`` if *x* is finite; otherwise returns ``False``. .. method:: is_infinite(x) - Returns True if *x* is infinite; otherwise returns False. + Returns ``True`` if *x* is infinite; otherwise returns ``False``. .. method:: is_nan(x) - Returns True if *x* is a qNaN or sNaN; otherwise returns False. + Returns ``True`` if *x* is a qNaN or sNaN; otherwise returns ``False``. .. method:: is_normal(x) - Returns True if *x* is a normal number; otherwise returns False. + Returns ``True`` if *x* is a normal number; otherwise returns ``False``. .. method:: is_qnan(x) - Returns True if *x* is a quiet NaN; otherwise returns False. + Returns ``True`` if *x* is a quiet NaN; otherwise returns ``False``. .. method:: is_signed(x) - Returns True if *x* is negative; otherwise returns False. + Returns ``True`` if *x* is negative; otherwise returns ``False``. .. method:: is_snan(x) - Returns True if *x* is a signaling NaN; otherwise returns False. + Returns ``True`` if *x* is a signaling NaN; otherwise returns ``False``. .. method:: is_subnormal(x) - Returns True if *x* is subnormal; otherwise returns False. + Returns ``True`` if *x* is subnormal; otherwise returns ``False``. .. method:: is_zero(x) - Returns True if *x* is a zero; otherwise returns False. + Returns ``True`` if *x* is a zero; otherwise returns ``False``. .. method:: ln(x) @@ -1275,15 +1328,20 @@ In addition to the three supplied contexts, new contexts can be created with the identity operation. - .. method:: power(x, y[, modulo]) + .. method:: power(x, y, modulo=None) Return ``x`` to the power of ``y``, reduced modulo ``modulo`` if given. With two arguments, compute ``x**y``. If ``x`` is negative then ``y`` must be integral. The result will be inexact unless ``y`` is integral and the result is finite and can be expressed exactly in 'precision' digits. - The result should always be correctly rounded, using the rounding mode of - the current thread's context. + The rounding mode of the context is used. Results are always correctly-rounded + in the Python version. + + .. versionchanged:: 3.3 + The C module computes :meth:`power` in terms of the correctly-rounded + :meth:`exp` and :meth:`ln` functions. The result is well-defined but + only "almost always correctly-rounded". With three arguments, compute ``(x**y) % modulo``. For the three argument form, the following restrictions on the arguments hold: @@ -1332,7 +1390,7 @@ In addition to the three supplied contexts, new contexts can be created with the .. method:: same_quantum(x, y) - Returns True if the two operands have the same exponent. + Returns ``True`` if the two operands have the same exponent. .. method:: scaleb (x, y) @@ -1371,6 +1429,69 @@ In addition to the three supplied contexts, new contexts can be created with the .. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% +.. _decimal-rounding-modes: + +Constants +--------- + +The constants in this section are only relevant for the C module. They +are also included in the pure Python version for compatibility. + ++---------------------+---------------------+-------------------------------+ +| | 32-bit | 64-bit | ++=====================+=====================+===============================+ +| .. data:: MAX_PREC | :const:`425000000` | :const:`999999999999999999` | ++---------------------+---------------------+-------------------------------+ +| .. data:: MAX_EMAX | :const:`425000000` | :const:`999999999999999999` | ++---------------------+---------------------+-------------------------------+ +| .. data:: MIN_EMIN | :const:`-425000000` | :const:`-999999999999999999` | ++---------------------+---------------------+-------------------------------+ +| .. data:: MIN_ETINY | :const:`-849999999` | :const:`-1999999999999999997` | ++---------------------+---------------------+-------------------------------+ + + +.. data:: HAVE_THREADS + + The default value is ``True``. If Python is compiled without threads, the + C version automatically disables the expensive thread local context + machinery. In this case, the value is ``False``. + +Rounding modes +-------------- + +.. data:: ROUND_CEILING + + Round towards :const:`Infinity`. + +.. data:: ROUND_DOWN + + Round towards zero. + +.. data:: ROUND_FLOOR + + Round towards :const:`-Infinity`. + +.. data:: ROUND_HALF_DOWN + + Round to nearest with ties going towards zero. + +.. data:: ROUND_HALF_EVEN + + Round to nearest with ties going to nearest even integer. + +.. data:: ROUND_HALF_UP + + Round to nearest with ties going away from zero. + +.. data:: ROUND_UP + + Round away from zero. + +.. data:: ROUND_05UP + + Round away from zero if last digit after rounding towards zero would have + been 0 or 5; otherwise round towards zero. + .. _decimal-signals: @@ -1435,7 +1556,6 @@ condition. Infinity / Infinity x % 0 Infinity % x - x._rescale( non-integer ) sqrt(-x) and x > 0 0 ** 0 x ** (non-integer) @@ -1478,6 +1598,23 @@ condition. Occurs when a subnormal result is pushed to zero by rounding. :class:`Inexact` and :class:`Subnormal` are also signaled. + +.. class:: FloatOperation + + Enable stricter semantics for mixing floats and Decimals. + + If the signal is not trapped (default), mixing floats and Decimals is + permitted in the :class:`~decimal.Decimal` constructor, + :meth:`~decimal.Context.create_decimal` and all comparison operators. + Both conversion and comparisons are exact. Any occurrence of a mixed + operation is silently recorded by setting :exc:`FloatOperation` in the + context flags. Explicit conversions with :meth:`~decimal.Decimal.from_float` + or :meth:`~decimal.Context.create_decimal_from_float` do not set the flag. + + Otherwise (the signal is trapped), only equality comparisons and explicit + conversions are silent. All other mixed operations raise :exc:`FloatOperation`. + + The following table summarizes the hierarchy of signals:: exceptions.ArithmeticError(exceptions.Exception) @@ -1490,10 +1627,12 @@ The following table summarizes the hierarchy of signals:: InvalidOperation Rounded Subnormal + FloatOperation(DecimalException, exceptions.TypeError) .. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% + .. _decimal-notes: Floating Point Notes @@ -1603,7 +1742,7 @@ normalized floating point representations, it is not immediately obvious that the following calculation returns a value equal to zero: >>> 1 / Decimal('Infinity') - Decimal('0E-1000000026') + Decimal('0E-1000026') .. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -1615,7 +1754,7 @@ Working with threads The :func:`getcontext` function accesses a different :class:`Context` object for each thread. Having separate thread contexts means that threads may make -changes (such as ``getcontext.prec=10``) without interfering with other threads. +changes (such as ``getcontext().prec=10``) without interfering with other threads. Likewise, the :func:`setcontext` function automatically assigns its target to the current thread. diff --git a/Doc/library/depgraph-output.png b/Doc/library/depgraph-output.png Binary files differnew file mode 100644 index 0000000..960bb1b --- /dev/null +++ b/Doc/library/depgraph-output.png diff --git a/Doc/library/development.rst b/Doc/library/development.rst index c822e08..2368769 100644 --- a/Doc/library/development.rst +++ b/Doc/library/development.rst @@ -19,5 +19,8 @@ The list of modules described in this chapter is: pydoc.rst doctest.rst unittest.rst + unittest.mock.rst + unittest.mock-examples.rst 2to3.rst test.rst + venv.rst diff --git a/Doc/library/difflib.rst b/Doc/library/difflib.rst index bdc37b3..81dc0f1 100644 --- a/Doc/library/difflib.rst +++ b/Doc/library/difflib.rst @@ -143,8 +143,8 @@ diffs. For comparing directories and files, see also, the :mod:`filecmp` module. By default, the diff control lines (those with ``***`` or ``---``) are created with a trailing newline. This is helpful so that inputs created from - :func:`file.readlines` result in diffs that are suitable for use with - :func:`file.writelines` since both the inputs and outputs have trailing + :func:`io.IOBase.readlines` result in diffs that are suitable for use with + :func:`io.IOBase.writelines` since both the inputs and outputs have trailing newlines. For inputs that do not have trailing newlines, set the *lineterm* argument to @@ -275,8 +275,8 @@ diffs. For comparing directories and files, see also, the :mod:`filecmp` module. By default, the diff control lines (those with ``---``, ``+++``, or ``@@``) are created with a trailing newline. This is helpful so that inputs created from - :func:`file.readlines` result in diffs that are suitable for use with - :func:`file.writelines` since both the inputs and outputs have trailing + :func:`io.IOBase.readlines` result in diffs that are suitable for use with + :func:`io.IOBase.writelines` since both the inputs and outputs have trailing newlines. For inputs that do not have trailing newlines, set the *lineterm* argument to @@ -359,7 +359,7 @@ The :class:`SequenceMatcher` class has this constructor: The *autojunk* parameter. SequenceMatcher objects get three data attributes: *bjunk* is the - set of elements of *b* for which *isjunk* is True; *bpopular* is the set of + set of elements of *b* for which *isjunk* is ``True``; *bpopular* is the set of non-junk elements considered popular by the heuristic (if it is not disabled); *b2j* is a dict mapping the remaining elements of *b* to a list of positions where they occur. All three are reset whenever *b* is reset @@ -629,10 +629,12 @@ The :class:`Differ` class has this constructor: Compare two sequences of lines, and generate the delta (a sequence of lines). - Each sequence must contain individual single-line strings ending with newlines. - Such sequences can be obtained from the :meth:`readlines` method of file-like - objects. The delta generated also consists of newline-terminated strings, ready - to be printed as-is via the :meth:`writelines` method of a file-like object. + Each sequence must contain individual single-line strings ending with + newlines. Such sequences can be obtained from the + :meth:`~io.IOBase.readlines` method of file-like objects. The delta + generated also consists of newline-terminated strings, ready to be + printed as-is via the :meth:`~io.IOBase.writelines` method of a + file-like object. .. _differ-examples: @@ -642,7 +644,7 @@ Differ Example This example compares two texts. First we set up the texts, sequences of individual single-line strings ending with newlines (such sequences can also be -obtained from the :meth:`readlines` method of file-like objects): +obtained from the :meth:`~io.BaseIO.readlines` method of file-like objects): >>> text1 = ''' 1. Beautiful is better than ugly. ... 2. Explicit is better than implicit. @@ -752,8 +754,8 @@ It is also contained in the Python source distribution, as # we're passing these as arguments to the diff function fromdate = time.ctime(os.stat(fromfile).st_mtime) todate = time.ctime(os.stat(tofile).st_mtime) - fromlines = open(fromfile, 'U').readlines() - tolines = open(tofile, 'U').readlines() + with open(fromfile) as fromf, open(tofile) as tof: + fromlines, tolines = list(fromf), list(tof) if options.u: diff = difflib.unified_diff(fromlines, tolines, fromfile, tofile, diff --git a/Doc/library/dis.rst b/Doc/library/dis.rst index 108cda7..ec7112d 100644 --- a/Doc/library/dis.rst +++ b/Doc/library/dis.rst @@ -171,11 +171,6 @@ The Python compiler currently generates the following bytecode instructions. **General instructions** -.. opcode:: STOP_CODE - - Indicates end-of-code to the compiler, not used by the interpreter. - - .. opcode:: NOP Do nothing code. Used as a placeholder by the bytecode optimizer. @@ -436,6 +431,13 @@ the stack so that it is available for further iterations of the loop. Pops ``TOS`` and yields it from a :term:`generator`. +.. opcode:: YIELD_FROM + + Pops ``TOS`` and delegates to it as a subiterator from a :term:`generator`. + + .. versionadded:: 3.3 + + .. opcode:: IMPORT_STAR Loads all symbols not starting with ``'_'`` directly from the module TOS to the @@ -752,17 +754,26 @@ the more significant byte last. .. opcode:: MAKE_FUNCTION (argc) - Pushes a new function object on the stack. TOS is the code associated with the - function. The function object is defined to have *argc* default parameters, - which are found below TOS. + Pushes a new function object on the stack. From bottom to top, the consumed + stack must consist of + + * ``argc & 0xFF`` default argument objects in positional order + * ``(argc >> 8) & 0xFF`` pairs of name and default argument, with the name + just below the object on the stack, for keyword-only parameters + * ``(argc >> 16) & 0x7FFF`` parameter annotation objects + * a tuple listing the parameter names for the annotations (only if there are + ony annotation objects) + * the code associated with the function (at TOS1) + * the :term:`qualified name` of the function (at TOS) .. opcode:: MAKE_CLOSURE (argc) Creates a new function object, sets its *__closure__* slot, and pushes it on - the stack. TOS is the code associated with the function, TOS1 the tuple - containing cells for the closure's free variables. The function also has - *argc* default parameters, which are found below the cells. + the stack. TOS is the :term:`qualified name` of the function, TOS1 is the + code associated with the function, and TOS2 is the tuple containing cells for + the closure's free variables. The function also has *argc* default parameters, + which are found below the cells. .. opcode:: BUILD_SLICE (argc) diff --git a/Doc/library/distutils.rst b/Doc/library/distutils.rst index 238b79d..6666a9b 100644 --- a/Doc/library/distutils.rst +++ b/Doc/library/distutils.rst @@ -12,18 +12,14 @@ additional modules into a Python installation. The new modules may be either 100%-pure Python, or may be extension modules written in C, or may be collections of Python packages which include modules coded in both Python and C. -This package is discussed in two separate chapters: +User documentation and API reference are provided in another document: .. seealso:: :ref:`distutils-index` The manual for developers and packagers of Python modules. This describes how to prepare :mod:`distutils`\ -based packages so that they may be - easily installed into an existing Python installation. - - :ref:`install-index` - An "administrators" manual which includes information on installing - modules into an existing Python installation. You do not need to be a - Python programmer to read this manual. - + easily installed into an existing Python installation. It also contains + instructions for end-users wanting to install a distutils-based package, + :ref:`install-index`. diff --git a/Doc/library/doctest.rst b/Doc/library/doctest.rst index ec8edbe..222c719 100644 --- a/Doc/library/doctest.rst +++ b/Doc/library/doctest.rst @@ -320,7 +320,8 @@ The fine print: Tabs in output generated by the tested code are not modified. Because any hard tabs in the sample output *are* expanded, this means that if the code output includes hard tabs, the only way the doctest can pass is if the - :const:`NORMALIZE_WHITESPACE` option or directive is in effect. + :const:`NORMALIZE_WHITESPACE` option or :ref:`directive <doctest-directives>` + is in effect. Alternatively, the test can be rewritten to capture the output and compare it to an expected value as part of the test. This handling of tabs in the source was arrived at through trial and error, and has proven to be the least @@ -485,15 +486,16 @@ Some details you should read once, but won't need to remember: SyntaxError: invalid syntax +.. _option-flags-and-directives: .. _doctest-options: -Option Flags and Directives -^^^^^^^^^^^^^^^^^^^^^^^^^^^ +Option Flags +^^^^^^^^^^^^ A number of option flags control various aspects of doctest's behavior. Symbolic names for the flags are supplied as module constants, which can be or'ed together and passed to various functions. The names can also be used in -doctest directives (see below). +:ref:`doctest directives <doctest-directives>`. The first group of options define test semantics, controlling aspects of how doctest decides whether actual output matches an example's expected output: @@ -547,14 +549,14 @@ doctest decides whether actual output matches an example's expected output: :exc:`TypeError` is raised. It will also ignore the module name used in Python 3 doctest reports. Hence - both these variations will work regardless of whether the test is run under - Python 2.7 or Python 3.2 (or later versions): + both of these variations will work with the flag specified, regardless of + whether the test is run under Python 2.7 or Python 3.2 (or later versions):: - >>> raise CustomError('message') #doctest: +IGNORE_EXCEPTION_DETAIL + >>> raise CustomError('message') Traceback (most recent call last): CustomError: message - >>> raise CustomError('message') #doctest: +IGNORE_EXCEPTION_DETAIL + >>> raise CustomError('message') Traceback (most recent call last): my_module.CustomError: message @@ -564,15 +566,16 @@ doctest decides whether actual output matches an example's expected output: exception name. Using :const:`IGNORE_EXCEPTION_DETAIL` and the details from Python 2.3 is also the only clear way to write a doctest that doesn't care about the exception detail yet continues to pass under Python 2.3 or - earlier (those releases do not support doctest directives and ignore them - as irrelevant comments). For example, :: + earlier (those releases do not support :ref:`doctest directives + <doctest-directives>` and ignore them as irrelevant comments). For example:: - >>> (1, 2)[3] = 'moo' #doctest: +IGNORE_EXCEPTION_DETAIL + >>> (1, 2)[3] = 'moo' Traceback (most recent call last): File "<stdin>", line 1, in ? TypeError: object doesn't support item assignment - passes under Python 2.3 and later Python versions, even though the detail + passes under Python 2.3 and later Python versions with the flag specified, + even though the detail changed in Python 2.4 to say "does not" instead of "doesn't". .. versionchanged:: 3.2 @@ -634,9 +637,30 @@ The second group of options controls how test failures are reported: A bitmask or'ing together all the reporting flags above. -"Doctest directives" may be used to modify the option flags for individual -examples. Doctest directives are expressed as a special Python comment -following an example's source code: + +There is also a way to register new option flag names, though this isn't +useful unless you intend to extend :mod:`doctest` internals via subclassing: + + +.. function:: register_optionflag(name) + + Create a new option flag with a given name, and return the new flag's integer + value. :func:`register_optionflag` can be used when subclassing + :class:`OutputChecker` or :class:`DocTestRunner` to create new options that are + supported by your subclasses. :func:`register_optionflag` should always be + called using the following idiom:: + + MY_FLAG = register_optionflag('MY_FLAG') + + +.. _doctest-directives: + +Directives +^^^^^^^^^^ + +Doctest directives may be used to modify the :ref:`option flags +<doctest-options>` for an individual example. Doctest directives are +special Python comments following an example's source code: .. productionlist:: doctest directive: "#" "doctest:" `directive_options` @@ -693,20 +717,6 @@ usually the only meaningful choice. However, option flags can also be passed to functions that run doctests, establishing different defaults. In such cases, disabling an option via ``-`` in a directive can be useful. -There's also a way to register new option flag names, although this isn't useful -unless you intend to extend :mod:`doctest` internals via subclassing: - - -.. function:: register_optionflag(name) - - Create a new option flag with a given name, and return the new flag's integer - value. :func:`register_optionflag` can be used when subclassing - :class:`OutputChecker` or :class:`DocTestRunner` to create new options that are - supported by your subclasses. :func:`register_optionflag` should always be - called using the following idiom:: - - MY_FLAG = register_optionflag('MY_FLAG') - .. _doctest-warnings: diff --git a/Doc/library/email.charset.rst b/Doc/library/email.charset.rst index a828f3c..19a6953 100644 --- a/Doc/library/email.charset.rst +++ b/Doc/library/email.charset.rst @@ -234,5 +234,5 @@ new entries to the global character set, alias, and codec registries: *charset* is the canonical name of a character set. *codecname* is the name of a Python codec, as appropriate for the second argument to the :class:`str`'s - :func:`decode` method + :meth:`~str.encode` method diff --git a/Doc/library/email.errors.rst b/Doc/library/email.errors.rst index d8f330f..294e3ed 100644 --- a/Doc/library/email.errors.rst +++ b/Doc/library/email.errors.rst @@ -25,7 +25,8 @@ The following exception classes are defined in the :mod:`email.errors` module: Raised under some error conditions when parsing the :rfc:`2822` headers of a message, this class is derived from :exc:`MessageParseError`. It can be raised - from the :meth:`Parser.parse` or :meth:`Parser.parsestr` methods. + from the :meth:`Parser.parse <email.parser.Parser.parse>` or + :meth:`Parser.parsestr <email.parser.Parser.parsestr>` methods. Situations where it can be raised include finding an envelope header after the first :rfc:`2822` header of the message, finding a continuation line before the @@ -37,7 +38,8 @@ The following exception classes are defined in the :mod:`email.errors` module: Raised under some error conditions when parsing the :rfc:`2822` headers of a message, this class is derived from :exc:`MessageParseError`. It can be raised - from the :meth:`Parser.parse` or :meth:`Parser.parsestr` methods. + from the :meth:`Parser.parse <email.parser.Parser.parse>` or + :meth:`Parser.parsestr <email.parser.Parser.parsestr>` methods. Situations where it can be raised include not being able to find the starting or terminating boundary in a :mimetype:`multipart/\*` message when strict parsing @@ -46,19 +48,20 @@ The following exception classes are defined in the :mod:`email.errors` module: .. exception:: MultipartConversionError() - Raised when a payload is added to a :class:`Message` object using - :meth:`add_payload`, but the payload is already a scalar and the message's - :mailheader:`Content-Type` main type is not either :mimetype:`multipart` or - missing. :exc:`MultipartConversionError` multiply inherits from - :exc:`MessageError` and the built-in :exc:`TypeError`. + Raised when a payload is added to a :class:`~email.message.Message` object + using :meth:`add_payload`, but the payload is already a scalar and the + message's :mailheader:`Content-Type` main type is not either + :mimetype:`multipart` or missing. :exc:`MultipartConversionError` multiply + inherits from :exc:`MessageError` and the built-in :exc:`TypeError`. - Since :meth:`Message.add_payload` is deprecated, this exception is rarely raised - in practice. However the exception may also be raised if the :meth:`attach` + Since :meth:`Message.add_payload` is deprecated, this exception is rarely + raised in practice. However the exception may also be raised if the + :meth:`~email.message.Message.attach` method is called on an instance of a class derived from :class:`~email.mime.nonmultipart.MIMENonMultipart` (e.g. :class:`~email.mime.image.MIMEImage`). -Here's the list of the defects that the :class:`~email.mime.parser.FeedParser` +Here's the list of the defects that the :class:`~email.parser.FeedParser` can find while parsing messages. Note that the defects are added to the message where the problem was found, so for example, if a message nested inside a :mimetype:`multipart/alternative` had a malformed header, that nested message @@ -73,17 +76,38 @@ this class is *not* an exception! * :class:`StartBoundaryNotFoundDefect` -- The start boundary claimed in the :mailheader:`Content-Type` header was never found. +* :class:`CloseBoundaryNotFoundDefect` -- A start boundary was found, but + no corresponding close boundary was ever found. + + .. versionadded:: 3.3 + * :class:`FirstHeaderLineIsContinuationDefect` -- The message had a continuation line as its first header line. * :class:`MisplacedEnvelopeHeaderDefect` - A "Unix From" header was found in the middle of a header block. +* :class:`MissingHeaderBodySeparatorDefect` - A line was found while parsing + headers that had no leading white space but contained no ':'. Parsing + continues assuming that the line represents the first line of the body. + + .. versionadded:: 3.3 + * :class:`MalformedHeaderDefect` -- A header was found that was missing a colon, or was otherwise malformed. + .. deprecated:: 3.3 + This defect has not been used for several Python versions. + * :class:`MultipartInvariantViolationDefect` -- A message claimed to be a - :mimetype:`multipart`, but no subparts were found. Note that when a message has - this defect, its :meth:`is_multipart` method may return false even though its - content type claims to be :mimetype:`multipart`. + :mimetype:`multipart`, but no subparts were found. Note that when a message + has this defect, its :meth:`~email.message.Message.is_multipart` method may + return false even though its content type claims to be :mimetype:`multipart`. + +* :class:`InvalidBase64PaddingDefect` -- When decoding a block of base64 + enocded bytes, the padding was not correct. Enough padding is added to + perform the decode, but the resulting decoded bytes may be invalid. +* :class:`InvalidBase64CharactersDefect` -- When decoding a block of base64 + enocded bytes, characters outside the base64 alphebet were encountered. + The characters are ignored, but the resulting decoded bytes may be invalid. diff --git a/Doc/library/email.generator.rst b/Doc/library/email.generator.rst index 88c62a2..c172acb 100644 --- a/Doc/library/email.generator.rst +++ b/Doc/library/email.generator.rst @@ -32,7 +32,7 @@ Here are the public methods of the :class:`Generator` class, imported from the :mod:`email.generator` module: -.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78) +.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78, *, policy=None) The constructor for the :class:`Generator` class takes a :term:`file-like object` called *outfp* for an argument. *outfp* must support the :meth:`write` method @@ -53,10 +53,17 @@ Here are the public methods of the :class:`Generator` class, imported from the :class:`~email.header.Header` class. Set to zero to disable header wrapping. The default is 78, as recommended (but not required) by :rfc:`2822`. + The *policy* keyword specifies a :mod:`~email.policy` object that controls a + number of aspects of the generator's operation. If no *policy* is specified, + then the *policy* attached to the message object passed to :attr:`flatten` + is used. + + .. versionchanged:: 3.3 Added the *policy* keyword. + The other public :class:`Generator` methods are: - .. method:: flatten(msg, unixfrom=False, linesep='\\n') + .. method:: flatten(msg, unixfrom=False, linesep=None) Print the textual representation of the message object structure rooted at *msg* to the output file specified when the :class:`Generator` instance @@ -72,19 +79,20 @@ Here are the public methods of the :class:`Generator` class, imported from the Note that for subparts, no envelope header is ever printed. Optional *linesep* specifies the line separator character used to - terminate lines in the output. It defaults to ``\n`` because that is - the most useful value for Python application code (other library packages - expect ``\n`` separated lines). ``linesep=\r\n`` can be used to - generate output with RFC-compliant line separators. + terminate lines in the output. If specified it overrides the value + specified by the *msg*\'s or ``Generator``\'s ``policy``. - Messages parsed with a Bytes parser that have a - :mailheader:`Content-Transfer-Encoding` of 8bit will be converted to a - use a 7bit Content-Transfer-Encoding. Non-ASCII bytes in the headers - will be :rfc:`2047` encoded with a charset of `unknown-8bit`. + Because strings cannot represent non-ASCII bytes, if the policy that + applies when ``flatten`` is run has :attr:`~email.policy.Policy.cte_type` + set to ``8bit``, ``Generator`` will operate as if it were set to + ``7bit``. This means that messages parsed with a Bytes parser that have + a :mailheader:`Content-Transfer-Encoding` of ``8bit`` will be converted + to a use a ``7bit`` Content-Transfer-Encoding. Non-ASCII bytes in the + headers will be :rfc:`2047` encoded with a charset of ``unknown-8bit``. .. versionchanged:: 3.2 - Added support for re-encoding 8bit message bodies, and the *linesep* - argument. + Added support for re-encoding ``8bit`` message bodies, and the + *linesep* argument. .. method:: clone(fp) @@ -103,7 +111,8 @@ As a convenience, see the :class:`~email.message.Message` methods formatted string representation of a message object. For more detail, see :mod:`email.message`. -.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78) +.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78, *, \ + policy=policy.default) The constructor for the :class:`BytesGenerator` class takes a binary :term:`file-like object` called *outfp* for an argument. *outfp* must @@ -125,19 +134,31 @@ formatted string representation of a message object. For more detail, see wrapping. The default is 78, as recommended (but not required) by :rfc:`2822`. + The *policy* keyword specifies a :mod:`~email.policy` object that controls a + number of aspects of the generator's operation. The default policy + maintains backward compatibility. + + .. versionchanged:: 3.3 Added the *policy* keyword. + The other public :class:`BytesGenerator` methods are: - .. method:: flatten(msg, unixfrom=False, linesep='\n') + .. method:: flatten(msg, unixfrom=False, linesep=None) Print the textual representation of the message object structure rooted at *msg* to the output file specified when the :class:`BytesGenerator` instance was created. Subparts are visited depth-first and the resulting - text will be properly MIME encoded. If the input that created the *msg* - contained bytes with the high bit set and those bytes have not been - modified, they will be copied faithfully to the output, even if doing so - is not strictly RFC compliant. (To produce strictly RFC compliant - output, use the :class:`Generator` class.) + text will be properly MIME encoded. If the :mod:`~email.policy` option + :attr:`~email.policy.Policy.cte_type` is ``8bit`` (the default), + then any bytes with the high bit set in the original parsed message that + have not been modified will be copied faithfully to the output. If + ``cte_type`` is ``7bit``, the bytes will be converted as needed + using an ASCII-compatible Content-Transfer-Encoding. In particular, + RFC-invalid non-ASCII bytes in headers will be encoded using the MIME + ``unknown-8bit`` character set, thus rendering them RFC-compliant. + + .. XXX: There should be a complementary option that just does the RFC + compliance transformation but leaves CTE 8bit parts alone. Messages parsed with a Bytes parser that have a :mailheader:`Content-Transfer-Encoding` of 8bit will be reconstructed @@ -152,10 +173,8 @@ formatted string representation of a message object. For more detail, see Note that for subparts, no envelope header is ever printed. Optional *linesep* specifies the line separator character used to - terminate lines in the output. It defaults to ``\n`` because that is - the most useful value for Python application code (other library packages - expect ``\n`` separated lines). ``linesep=\r\n`` can be used to - generate output with RFC-compliant line separators. + terminate lines in the output. If specified it overrides the value + specified by the ``Generator``\ 's ``policy``. .. method:: clone(fp) diff --git a/Doc/library/email.header.rst b/Doc/library/email.header.rst index 7d2dc2e..346d23f 100644 --- a/Doc/library/email.header.rst +++ b/Doc/library/email.header.rst @@ -31,8 +31,8 @@ For example:: >>> msg = Message() >>> h = Header('p\xf6stal', 'iso-8859-1') >>> msg['Subject'] = h - >>> print(msg.as_string()) - Subject: =?iso-8859-1?q?p=F6stal?= + >>> msg.as_string() + 'Subject: =?iso-8859-1?q?p=F6stal?=\n\n' @@ -176,7 +176,7 @@ The :mod:`email.header` module also provides the following convenient functions. >>> from email.header import decode_header >>> decode_header('=?iso-8859-1?q?p=F6stal?=') - [('p\xf6stal', 'iso-8859-1')] + [(b'p\xf6stal', 'iso-8859-1')] .. function:: make_header(decoded_seq, maxlinelen=None, header_name=None, continuation_ws=' ') diff --git a/Doc/library/email.headerregistry.rst b/Doc/library/email.headerregistry.rst new file mode 100644 index 0000000..db3aade --- /dev/null +++ b/Doc/library/email.headerregistry.rst @@ -0,0 +1,453 @@ +:mod:`email.headerregistry`: Custom Header Objects +-------------------------------------------------- + +.. module:: email.headerregistry + :synopsis: Automatic Parsing of headers based on the field name + +.. moduleauthor:: R. David Murray <rdmurray@bitdance.com> +.. sectionauthor:: R. David Murray <rdmurray@bitdance.com> + + +.. note:: + + The headerregistry module has been included in the standard library on a + :term:`provisional basis <provisional package>`. Backwards incompatible + changes (up to and including removal of the module) may occur if deemed + necessary by the core developers. + +.. versionadded:: 3.3 + as a :term:`provisional module <provisional package>`. + +Headers are represented by customized subclasses of :class:`str`. The +particular class used to represent a given header is determined by the +:attr:`~email.policy.EmailPolicy.header_factory` of the :mod:`~email.policy` in +effect when the headers are created. This section documents the particular +``header_factory`` implemented by the email package for handling :RFC:`5322` +compliant email messages, which not only provides customized header objects for +various header types, but also provides an extension mechanism for applications +to add their own custom header types. + +When using any of the policy objects derived from +:data:`~email.policy.EmailPolicy`, all headers are produced by +:class:`.HeaderRegistry` and have :class:`.BaseHeader` as their last base +class. Each header class has an additional base class that is determined by +the type of the header. For example, many headers have the class +:class:`.UnstructuredHeader` as their other base class. The specialized second +class for a header is determined by the name of the header, using a lookup +table stored in the :class:`.HeaderRegistry`. All of this is managed +transparently for the typical application program, but interfaces are provided +for modifying the default behavior for use by more complex applications. + +The sections below first document the header base classes and their attributes, +followed by the API for modifying the behavior of :class:`.HeaderRegistry`, and +finally the support classes used to represent the data parsed from structured +headers. + + +.. class:: BaseHeader(name, value) + + *name* and *value* are passed to ``BaseHeader`` from the + :attr:`~email.policy.EmailPolicy.header_factory` call. The string value of + any header object is the *value* fully decoded to unicode. + + This base class defines the following read-only properties: + + + .. attribute:: name + + The name of the header (the portion of the field before the ':'). This + is exactly the value passed in the + :attr:`~email.policy.EmailPolicy.header_factory` call for *name*; that + is, case is preserved. + + + .. attribute:: defects + + A tuple of :exc:`~email.errors.HeaderDefect` instances reporting any + RFC compliance problems found during parsing. The email package tries to + be complete about detecting compliance issues. See the :mod:`~email.errors` + module for a discussion of the types of defects that may be reported. + + + .. attribute:: max_count + + The maximum number of headers of this type that can have the same + ``name``. A value of ``None`` means unlimited. The ``BaseHeader`` value + for this attribute is ``None``; it is expected that specialized header + classes will override this value as needed. + + ``BaseHeader`` also provides the following method, which is called by the + email library code and should not in general be called by application + programs: + + .. method:: fold(*, policy) + + Return a string containing :attr:`~email.policy.Policy.linesep` + characters as required to correctly fold the header according + to *policy*. A :attr:`~email.policy.Policy.cte_type` of + ``8bit`` will be treated as if it were ``7bit``, since strings + may not contain binary data. + + + ``BaseHeader`` by itself cannot be used to create a header object. It + defines a protocol that each specialized header cooperates with in order to + produce the header object. Specifically, ``BaseHeader`` requires that + the specialized class provide a :func:`classmethod` named ``parse``. This + method is called as follows:: + + parse(string, kwds) + + ``kwds`` is a dictionary containing one pre-initialized key, ``defects``. + ``defects`` is an empty list. The parse method should append any detected + defects to this list. On return, the ``kwds`` dictionary *must* contain + values for at least the keys ``decoded`` and ``defects``. ``decoded`` + should be the string value for the header (that is, the header value fully + decoded to unicode). The parse method should assume that *string* may + contain transport encoded parts, but should correctly handle all valid + unicode characters as well so that it can parse un-encoded header values. + + ``BaseHeader``'s ``__new__`` then creates the header instance, and calls its + ``init`` method. The specialized class only needs to provide an ``init`` + method if it wishes to set additional attributes beyond those provided by + ``BaseHeader`` itself. Such an ``init`` method should look like this:: + + def init(self, *args, **kw): + self._myattr = kw.pop('myattr') + super().init(*args, **kw) + + That is, anything extra that the specialized class puts in to the ``kwds`` + dictionary should be removed and handled, and the remaining contents of + ``kw`` (and ``args``) passed to the ``BaseHeader`` ``init`` method. + + +.. class:: UnstructuredHeader + + An "unstructured" header is the default type of header in :rfc:`5322`. + Any header that does not have a specified syntax is treated as + unstructured. The classic example of an unstructured header is the + :mailheader:`Subject` header. + + In :rfc:`5322`, an unstructured header is a run of arbitrary text in the + ASCII character set. :rfc:`2047`, however, has an :rfc:`5322` compatible + mechanism for encoding non-ASCII text as ASCII characters within a header + value. When a *value* containing encoded words is passed to the + constructor, the ``UnstructuredHeader`` parser converts such encoded words + back in to the original unicode, following the :rfc:`2047` rules for + unstructured text. The parser uses heuristics to attempt to decode certain + non-compliant encoded words. Defects are registered in such cases, as well + as defects for issues such as invalid characters within the encoded words or + the non-encoded text. + + This header type provides no additional attributes. + + +.. class:: DateHeader + + :rfc:`5322` specifies a very specific format for dates within email headers. + The ``DateHeader`` parser recognizes that date format, as well as + recognizing a number of variant forms that are sometimes found "in the + wild". + + This header type provides the following additional attributes: + + .. attribute:: datetime + + If the header value can be recognized as a valid date of one form or + another, this attribute will contain a :class:`~datetime.datetime` + instance representing that date. If the timezone of the input date is + specified as ``-0000`` (indicating it is in UTC but contains no + information about the source timezone), then :attr:`.datetime` will be a + naive :class:`~datetime.datetime`. If a specific timezone offset is + found (including `+0000`), then :attr:`.datetime` will contain an aware + ``datetime`` that uses :class:`datetime.timezone` to record the timezone + offset. + + The ``decoded`` value of the header is determined by formatting the + ``datetime`` according to the :rfc:`5322` rules; that is, it is set to:: + + email.utils.format_datetime(self.datetime) + + When creating a ``DateHeader``, *value* may be + :class:`~datetime.datetime` instance. This means, for example, that + the following code is valid and does what one would expect:: + + msg['Date'] = datetime(2011, 7, 15, 21) + + Because this is a naive ``datetime`` it will be interpreted as a UTC + timestamp, and the resulting value will have a timezone of ``-0000``. Much + more useful is to use the :func:`~email.utils.localtime` function from the + :mod:`~email.utils` module:: + + msg['Date'] = utils.localtime() + + This example sets the date header to the current time and date using + the current timezone offset. + + +.. class:: AddressHeader + + Address headers are one of the most complex structured header types. + The ``AddressHeader`` class provides a generic interface to any address + header. + + This header type provides the following additional attributes: + + + .. attribute:: groups + + A tuple of :class:`.Group` objects encoding the + addresses and groups found in the header value. Addresses that are + not part of a group are represented in this list as single-address + ``Groups`` whose :attr:`~.Group.display_name` is ``None``. + + + .. attribute:: addresses + + A tuple of :class:`.Address` objects encoding all + of the individual addresses from the header value. If the header value + contains any groups, the individual addresses from the group are included + in the list at the point where the group occurs in the value (that is, + the list of addresses is "flattened" into a one dimensional list). + + The ``decoded`` value of the header will have all encoded words decoded to + unicode. :class:`~encodings.idna` encoded domain names are also decoded to unicode. The + ``decoded`` value is set by :attr:`~str.join`\ ing the :class:`str` value of + the elements of the ``groups`` attribute with ``', '``. + + A list of :class:`.Address` and :class:`.Group` objects in any combination + may be used to set the value of an address header. ``Group`` objects whose + ``display_name`` is ``None`` will be interpreted as single addresses, which + allows an address list to be copied with groups intact by using the list + obtained ``groups`` attribute of the source header. + + +.. class:: SingleAddressHeader + + A subclass of :class:`.AddressHeader` that adds one + additional attribute: + + + .. attribute:: address + + The single address encoded by the header value. If the header value + actually contains more than one address (which would be a violation of + the RFC under the default :mod:`~email.policy`), accessing this attribute + will result in a :exc:`ValueError`. + + +Many of the above classes also have a ``Unique`` variant (for example, +``UniqueUnstructuredHeader``). The only difference is that in the ``Unique`` +variant, :attr:`~.BaseHeader.max_count` is set to 1. + + +.. class:: MIMEVersionHeader + + There is really only one valid value for the :mailheader:`MIME-Version` + header, and that is ``1.0``. For future proofing, this header class + supports other valid version numbers. If a version number has a valid value + per :rfc:`2045`, then the header object will have non-``None`` values for + the following attributes: + + .. attribute:: version + + The version number as a string, with any whitespace and/or comments + removed. + + .. attribute:: major + + The major version number as an integer + + .. attribute:: minor + + The minor version number as an integer + + +.. class:: ParameterizedMIMEHeader + + MOME headers all start with the prefix 'Content-'. Each specific header has + a certain value, described under the class for that header. Some can + also take a list of supplemental parameters, which have a common format. + This class serves as a base for all the MIME headers that take parameters. + + .. attribute:: params + + A dictionary mapping parameter names to parameter values. + + +.. class:: ContentTypeHeader + + A :class:`ParameterizedMIMEHeader` class that handles the + :mailheader:`Content-Type` header. + + .. attribute:: content_type + + The content type string, in the form ``maintype/subtype``. + + .. attribute:: maintype + + .. attribute:: subtype + + +.. class:: ContentDispositionHeader + + A :class:`ParameterizedMIMEHeader` class that handles the + :mailheader:`Content-Disposition` header. + + .. attribute:: content-disposition + + ``inline`` and ``attachment`` are the only valid values in common use. + + +.. class:: ContentTransferEncoding + + Handles the :mailheader:`Content-Transfer-Encoding` header. + + .. attribute:: cte + + Valid values are ``7bit``, ``8bit``, ``base64``, and + ``quoted-printable``. See :rfc:`2045` for more information. + + + +.. class:: HeaderRegistry(base_class=BaseHeader, \ + default_class=UnstructuredHeader, \ + use_default_map=True) + + This is the factory used by :class:`~email.policy.EmailPolicy` by default. + ``HeaderRegistry`` builds the class used to create a header instance + dynamically, using *base_class* and a specialized class retrieved from a + registry that it holds. When a given header name does not appear in the + registry, the class specified by *default_class* is used as the specialized + class. When *use_default_map* is ``True`` (the default), the standard + mapping of header names to classes is copied in to the registry during + initialization. *base_class* is always the last class in the generated + class's ``__bases__`` list. + + The default mappings are: + + :subject: UniqueUnstructuredHeader + :date: UniqueDateHeader + :resent-date: DateHeader + :orig-date: UniqueDateHeader + :sender: UniqueSingleAddressHeader + :resent-sender: SingleAddressHeader + :to: UniqueAddressHeader + :resent-to: AddressHeader + :cc: UniqueAddressHeader + :resent-cc: AddressHeader + :from: UniqueAddressHeader + :resent-from: AddressHeader + :reply-to: UniqueAddressHeader + + ``HeaderRegistry`` has the following methods: + + + .. method:: map_to_type(self, name, cls) + + *name* is the name of the header to be mapped. It will be converted to + lower case in the registry. *cls* is the specialized class to be used, + along with *base_class*, to create the class used to instantiate headers + that match *name*. + + + .. method:: __getitem__(name) + + Construct and return a class to handle creating a *name* header. + + + .. method:: __call__(name, value) + + Retrieves the specialized header associated with *name* from the + registry (using *default_class* if *name* does not appear in the + registry) and composes it with *base_class* to produce a class, + calls the constructed class's constructor, passing it the same + argument list, and finally returns the class instance created thereby. + + +The following classes are the classes used to represent data parsed from +structured headers and can, in general, be used by an application program to +construct structured values to assign to specific headers. + + +.. class:: Address(display_name='', username='', domain='', addr_spec=None) + + The class used to represent an email address. The general form of an + address is:: + + [display_name] <username@domain> + + or:: + + username@domain + + where each part must conform to specific syntax rules spelled out in + :rfc:`5322`. + + As a convenience *addr_spec* can be specified instead of *username* and + *domain*, in which case *username* and *domain* will be parsed from the + *addr_spec*. An *addr_spec* must be a properly RFC quoted string; if it is + not ``Address`` will raise an error. Unicode characters are allowed and + will be property encoded when serialized. However, per the RFCs, unicode is + *not* allowed in the username portion of the address. + + .. attribute:: display_name + + The display name portion of the address, if any, with all quoting + removed. If the address does not have a display name, this attribute + will be an empty string. + + .. attribute:: username + + The ``username`` portion of the address, with all quoting removed. + + .. attribute:: domain + + The ``domain`` portion of the address. + + .. attribute:: addr_spec + + The ``username@domain`` portion of the address, correctly quoted + for use as a bare address (the second form shown above). This + attribute is not mutable. + + .. method:: __str__() + + The ``str`` value of the object is the address quoted according to + :rfc:`5322` rules, but with no Content Transfer Encoding of any non-ASCII + characters. + + To support SMTP (:rfc:`5321`), ``Address`` handles one special case: if + ``username`` and ``domain`` are both the empty string (or ``None``), then + the string value of the ``Address`` is ``<>``. + + +.. class:: Group(display_name=None, addresses=None) + + The class used to represent an address group. The general form of an + address group is:: + + display_name: [address-list]; + + As a convenience for processing lists of addresses that consist of a mixture + of groups and single addresses, a ``Group`` may also be used to represent + single addresses that are not part of a group by setting *display_name* to + ``None`` and providing a list of the single address as *addresses*. + + .. attribute:: display_name + + The ``display_name`` of the group. If it is ``None`` and there is + exactly one ``Address`` in ``addresses``, then the ``Group`` represents a + single address that is not in a group. + + .. attribute:: addresses + + A possibly empty tuple of :class:`.Address` objects representing the + addresses in the group. + + .. method:: __str__() + + The ``str`` value of a ``Group`` is formatted according to :rfc:`5322`, + but with no Content Transfer Encoding of any non-ASCII characters. If + ``display_name`` is none and there is a single ``Address`` in the + ``addresses`` list, the ``str`` value will be the same as the ``str`` of + that single ``Address``. diff --git a/Doc/library/email.iterators.rst b/Doc/library/email.iterators.rst index d1f1797..f92f460 100644 --- a/Doc/library/email.iterators.rst +++ b/Doc/library/email.iterators.rst @@ -6,8 +6,9 @@ Iterating over a message object tree is fairly easy with the -:meth:`Message.walk` method. The :mod:`email.iterators` module provides some -useful higher level iterations over message object trees. +:meth:`Message.walk <email.message.Message.walk>` method. The +:mod:`email.iterators` module provides some useful higher level iterations over +message object trees. .. function:: body_line_iterator(msg, decode=False) @@ -16,9 +17,11 @@ useful higher level iterations over message object trees. string payloads line-by-line. It skips over all the subpart headers, and it skips over any subpart with a payload that isn't a Python string. This is somewhat equivalent to reading the flat text representation of the message from - a file using :meth:`readline`, skipping over all the intervening headers. + a file using :meth:`~io.TextIOBase.readline`, skipping over all the + intervening headers. - Optional *decode* is passed through to :meth:`Message.get_payload`. + Optional *decode* is passed through to :meth:`Message.get_payload + <email.message.Message.get_payload>`. .. function:: typed_subpart_iterator(msg, maintype='text', subtype=None) @@ -33,14 +36,22 @@ useful higher level iterations over message object trees. Thus, by default :func:`typed_subpart_iterator` returns each subpart that has a MIME type of :mimetype:`text/\*`. + The following function has been added as a useful debugging tool. It should *not* be considered part of the supported public interface for the package. - .. function:: _structure(msg, fp=None, level=0, include_default=False) Prints an indented representation of the content types of the message object - structure. For example:: + structure. For example: + + .. testsetup:: + + >>> import email + >>> from email.iterators import _structure + >>> somefile = open('Lib/test/test_email/data/msg_02.txt') + + .. doctest:: >>> msg = email.message_from_file(somefile) >>> _structure(msg) @@ -60,6 +71,10 @@ The following function has been added as a useful debugging tool. It should text/plain text/plain + .. testsetup:: + + >>> somefile.close() + Optional *fp* is a file-like object to print the output to. It must be suitable for Python's :func:`print` function. *level* is used internally. *include_default*, if true, prints the default type as well. diff --git a/Doc/library/email.message.rst b/Doc/library/email.message.rst index f685e54..3075e86 100644 --- a/Doc/library/email.message.rst +++ b/Doc/library/email.message.rst @@ -31,9 +31,15 @@ parameters, and for recursively walking over the object tree. Here are the methods of the :class:`Message` class: -.. class:: Message() +.. class:: Message(policy=compat32) - The constructor takes no arguments. + The *policy* argument determiens the :mod:`~email.policy` that will be used + to update the message model. The default value, :class:`compat32 + <email.policy.Compat32>` maintains backward compatibility with the + Python 3.2 version of the email package. For more information see the + :mod:`~email.policy` documentation. + + .. versionchanged:: 3.3 The *policy* keyword argument was added. .. method:: as_string(unixfrom=False, maxheaderlen=0) @@ -49,8 +55,8 @@ Here are the methods of the :class:`Message` class: format the message the way you want. For example, by default it does not do the mangling of lines that begin with ``From`` that is required by the unix mbox format. For more flexibility, instantiate a - :class:`~email.generator.Generator` instance and use its :meth:`flatten` - method directly. For example:: + :class:`~email.generator.Generator` instance and use its + :meth:`~email.generator.Generator.flatten` method directly. For example:: from io import StringIO from email.generator import Generator @@ -69,7 +75,7 @@ Here are the methods of the :class:`Message` class: Return ``True`` if the message's payload is a list of sub-\ :class:`Message` objects, otherwise return ``False``. When - :meth:`is_multipart` returns False, the payload should be a string object. + :meth:`is_multipart` returns ``False``, the payload should be a string object. .. method:: set_unixfrom(unixfrom) @@ -111,10 +117,14 @@ Here are the methods of the :class:`Message` class: header. When ``True`` and the message is not a multipart, the payload will be decoded if this header's value is ``quoted-printable`` or ``base64``. If some other encoding is used, or :mailheader:`Content-Transfer-Encoding` - header is missing, or if the payload has bogus base64 data, the payload is + header is missing, the payload is returned as-is (undecoded). In all cases the returned value is binary data. If the message is a multipart and the *decode* flag is ``True``, - then ``None`` is returned. + then ``None`` is returned. If the payload is base64 and it was not + perfectly formed (missing padding, characters outside the base64 + alphabet), then an appropriate defect will be added to the message's + defect property (:class:`~email.errors.InvalidBase64PaddingDefect` or + :class:`~email.errors.InvalidBase64CharactersDefect`, respectively). When *decode* is ``False`` (the default) the body is returned as a string without decoding the :mailheader:`Content-Transfer-Encoding`. However, @@ -466,8 +476,8 @@ Here are the methods of the :class:`Message` class: Set the ``boundary`` parameter of the :mailheader:`Content-Type` header to *boundary*. :meth:`set_boundary` will always quote *boundary* if - necessary. A :exc:`HeaderParseError` is raised if the message object has - no :mailheader:`Content-Type` header. + necessary. A :exc:`~email.errors.HeaderParseError` is raised if the + message object has no :mailheader:`Content-Type` header. Note that using this method is subtly different than deleting the old :mailheader:`Content-Type` header and adding a new one with the new @@ -509,16 +519,25 @@ Here are the methods of the :class:`Message` class: iterator in a ``for`` loop; each iteration returns the next subpart. Here's an example that prints the MIME type of every part of a multipart - message structure:: + message structure: + + .. testsetup:: + + >>> from email import message_from_binary_file + >>> with open('Lib/test/test_email/data/msg_16.txt', 'rb') as f: + ... msg = message_from_binary_file(f) + + .. doctest:: - >>> for part in msg.walk(): - ... print(part.get_content_type()) - multipart/report - text/plain - message/delivery-status - text/plain - text/plain - message/rfc822 + >>> for part in msg.walk(): + ... print(part.get_content_type()) + multipart/report + text/plain + message/delivery-status + text/plain + text/plain + message/rfc822 + text/plain :class:`Message` objects can also optionally contain two instance attributes, which can be used when generating the plain text of a MIME message. @@ -554,7 +573,8 @@ Here are the methods of the :class:`Message` class: the end of the message. You do not need to set the epilogue to the empty string in order for the - :class:`Generator` to print a newline at the end of the file. + :class:`~email.generator.Generator` to print a newline at the end of the + file. .. attribute:: defects diff --git a/Doc/library/email.mime.rst b/Doc/library/email.mime.rst index d6addc8..4cdb322 100644 --- a/Doc/library/email.mime.rst +++ b/Doc/library/email.mime.rst @@ -35,7 +35,8 @@ Here are the classes: *_maintype* is the :mailheader:`Content-Type` major type (e.g. :mimetype:`text` or :mimetype:`image`), and *_subtype* is the :mailheader:`Content-Type` minor type (e.g. :mimetype:`plain` or :mimetype:`gif`). *_params* is a parameter - key/value dictionary and is passed directly to :meth:`Message.add_header`. + key/value dictionary and is passed directly to :meth:`Message.add_header + <email.message.Message.add_header>`. The :class:`MIMEBase` class always adds a :mailheader:`Content-Type` header (based on *_maintype*, *_subtype*, and *_params*), and a @@ -50,8 +51,9 @@ Here are the classes: A subclass of :class:`~email.mime.base.MIMEBase`, this is an intermediate base class for MIME messages that are not :mimetype:`multipart`. The primary - purpose of this class is to prevent the use of the :meth:`attach` method, - which only makes sense for :mimetype:`multipart` messages. If :meth:`attach` + purpose of this class is to prevent the use of the + :meth:`~email.message.Message.attach` method, which only makes sense for + :mimetype:`multipart` messages. If :meth:`~email.message.Message.attach` is called, a :exc:`~email.errors.MultipartConversionError` exception is raised. @@ -74,7 +76,8 @@ Here are the classes: *_subparts* is a sequence of initial subparts for the payload. It must be possible to convert this sequence to a list. You can always attach new subparts - to the message by using the :meth:`Message.attach` method. + to the message by using the :meth:`Message.attach + <email.message.Message.attach>` method. Additional parameters for the :mailheader:`Content-Type` header are taken from the keyword arguments, or passed into the *_params* argument, which is a keyword @@ -95,8 +98,10 @@ Here are the classes: Optional *_encoder* is a callable (i.e. function) which will perform the actual encoding of the data for transport. This callable takes one argument, which is - the :class:`MIMEApplication` instance. It should use :meth:`get_payload` and - :meth:`set_payload` to change the payload to encoded form. It should also add + the :class:`MIMEApplication` instance. It should use + :meth:`~email.message.Message.get_payload` and + :meth:`~email.message.Message.set_payload` to change the payload to encoded + form. It should also add any :mailheader:`Content-Transfer-Encoding` or other headers to the message object as necessary. The default encoding is base64. See the :mod:`email.encoders` module for a list of the built-in encoders. @@ -121,8 +126,10 @@ Here are the classes: Optional *_encoder* is a callable (i.e. function) which will perform the actual encoding of the audio data for transport. This callable takes one argument, - which is the :class:`MIMEAudio` instance. It should use :meth:`get_payload` and - :meth:`set_payload` to change the payload to encoded form. It should also add + which is the :class:`MIMEAudio` instance. It should use + :meth:`~email.message.Message.get_payload` and + :meth:`~email.message.Message.set_payload` to change the payload to encoded + form. It should also add any :mailheader:`Content-Transfer-Encoding` or other headers to the message object as necessary. The default encoding is base64. See the :mod:`email.encoders` module for a list of the built-in encoders. @@ -147,8 +154,10 @@ Here are the classes: Optional *_encoder* is a callable (i.e. function) which will perform the actual encoding of the image data for transport. This callable takes one argument, - which is the :class:`MIMEImage` instance. It should use :meth:`get_payload` and - :meth:`set_payload` to change the payload to encoded form. It should also add + which is the :class:`MIMEImage` instance. It should use + :meth:`~email.message.Message.get_payload` and + :meth:`~email.message.Message.set_payload` to change the payload to encoded + form. It should also add any :mailheader:`Content-Transfer-Encoding` or other headers to the message object as necessary. The default encoding is base64. See the :mod:`email.encoders` module for a list of the built-in encoders. @@ -175,7 +184,7 @@ Here are the classes: .. currentmodule:: email.mime.text -.. class:: MIMEText(_text, _subtype='plain', _charset='us-ascii') +.. class:: MIMEText(_text, _subtype='plain', _charset=None) Module: :mod:`email.mime.text` @@ -185,7 +194,8 @@ Here are the classes: minor type and defaults to :mimetype:`plain`. *_charset* is the character set of the text and is passed as an argument to the :class:`~email.mime.nonmultipart.MIMENonMultipart` constructor; it defaults - to ``us-ascii``. + to ``us-ascii`` if the string contains only ``ascii`` codepoints, and + ``utf-8`` otherwise. Unless the *_charset* argument is explicitly set to ``None``, the MIMEText object created will have both a :mailheader:`Content-Type` header @@ -196,4 +206,3 @@ Here are the classes: ``Content-Transfer-Encoding`` header, after which a ``set_payload`` call will automatically encode the new payload (and add a new :mailheader:`Content-Transfer-Encoding` header). - diff --git a/Doc/library/email.parser.rst b/Doc/library/email.parser.rst index 49a59c0..ee6af3f 100644 --- a/Doc/library/email.parser.rst +++ b/Doc/library/email.parser.rst @@ -7,7 +7,8 @@ Message object structures can be created in one of two ways: they can be created from whole cloth by instantiating :class:`~email.message.Message` objects and -stringing them together via :meth:`attach` and :meth:`set_payload` calls, or they +stringing them together via :meth:`~email.message.Message.attach` and +:meth:`~email.message.Message.set_payload` calls, or they can be created by parsing a flat text representation of the email message. The :mod:`email` package provides a standard parser that understands most email @@ -16,8 +17,9 @@ or a file object, and the parser will return to you the root :class:`~email.message.Message` instance of the object structure. For simple, non-MIME messages the payload of this root object will likely be a string containing the text of the message. For MIME messages, the root object will -return ``True`` from its :meth:`is_multipart` method, and the subparts can be -accessed via the :meth:`get_payload` and :meth:`walk` methods. +return ``True`` from its :meth:`~email.message.Message.is_multipart` method, and +the subparts can be accessed via the :meth:`~email.message.Message.get_payload` +and :meth:`~email.message.Message.walk` methods. There are actually two parser interfaces available for use, the classic :class:`Parser` API and the incremental :class:`FeedParser` API. The classic @@ -58,12 +60,18 @@ list of defects that it can find. Here is the API for the :class:`FeedParser`: -.. class:: FeedParser(_factory=email.message.Message) +.. class:: FeedParser(_factory=email.message.Message, *, policy=policy.default) Create a :class:`FeedParser` instance. Optional *_factory* is a no-argument callable that will be called whenever a new message object is needed. It defaults to the :class:`email.message.Message` class. + The *policy* keyword specifies a :mod:`~email.policy` object that controls a + number of aspects of the parser's operation. The default policy maintains + backward compatibility. + + .. versionchanged:: 3.3 Added the *policy* keyword. + .. method:: feed(data) Feed the :class:`FeedParser` some more data. *data* should be a string @@ -94,15 +102,18 @@ Parser class API The :class:`Parser` class, imported from the :mod:`email.parser` module, provides an API that can be used to parse a message when the complete contents of the message are available in a string or file. The :mod:`email.parser` -module also provides a second class, called :class:`HeaderParser` which can be -used if you're only interested in the headers of the message. -:class:`HeaderParser` can be much faster in these situations, since it does not -attempt to parse the message body, instead setting the payload to the raw body -as a string. :class:`HeaderParser` has the same API as the :class:`Parser` -class. +module also provides header-only parsers, called :class:`HeaderParser` and +:class:`BytesHeaderParser`, which can be used if you're only interested in the +headers of the message. :class:`HeaderParser` and :class:`BytesHeaderParser` +can be much faster in these situations, since they do not attempt to parse the +message body, instead setting the payload to the raw body as a string. They +have the same API as the :class:`Parser` and :class:`BytesParser` classes. +.. versionadded:: 3.3 + The BytesHeaderParser class. -.. class:: Parser(_class=email.message.Message, strict=None) + +.. class:: Parser(_class=email.message.Message, *, policy=policy.default) The constructor for the :class:`Parser` class takes an optional argument *_class*. This must be a callable factory (such as a function or a class), and @@ -110,13 +121,13 @@ class. :class:`~email.message.Message` (see :mod:`email.message`). The factory will be called without arguments. - The optional *strict* flag is ignored. + The *policy* keyword specifies a :mod:`~email.policy` object that controls a + number of aspects of the parser's operation. The default policy maintains + backward compatibility. - .. deprecated:: 2.4 - Because the :class:`Parser` class is a backward compatible API wrapper - around the new-in-Python 2.4 :class:`FeedParser`, *all* parsing is - effectively non-strict. You should simply stop passing a *strict* flag to - the :class:`Parser` constructor. + .. versionchanged:: 3.3 + Removed the *strict* argument that was deprecated in 2.4. Added the + *policy* keyword. The other public :class:`Parser` methods are: @@ -125,7 +136,8 @@ class. Read all the data from the file-like object *fp*, parse the resulting text, and return the root message object. *fp* must support both the - :meth:`readline` and the :meth:`read` methods on file-like objects. + :meth:`~io.TextIOBase.readline` and the :meth:`~io.TextIOBase.read` + methods on file-like objects. The text contained in *fp* must be formatted as a block of :rfc:`2822` style headers and header continuation lines, optionally preceded by a @@ -147,19 +159,25 @@ class. Optional *headersonly* is as with the :meth:`parse` method. -.. class:: BytesParser(_class=email.message.Message, strict=None) +.. class:: BytesParser(_class=email.message.Message, *, policy=policy.default) This class is exactly parallel to :class:`Parser`, but handles bytes input. The *_class* and *strict* arguments are interpreted in the same way as for - the :class:`Parser` constructor. *strict* is supported only to make porting - code easier; it is deprecated. + the :class:`Parser` constructor. + + The *policy* keyword specifies a :mod:`~email.policy` object that + controls a number of aspects of the parser's operation. The default + policy maintains backward compatibility. + + .. versionchanged:: 3.3 + Removed the *strict* argument. Added the *policy* keyword. .. method:: parse(fp, headeronly=False) Read all the data from the binary file-like object *fp*, parse the resulting bytes, and return the message object. *fp* must support - both the :meth:`readline` and the :meth:`read` methods on file-like - objects. + both the :meth:`~io.IOBase.readline` and the :meth:`~io.IOBase.read` + methods on file-like objects. The bytes contained in *fp* must be formatted as a block of :rfc:`2822` style headers and header continuation lines, optionally preceded by a @@ -190,39 +208,55 @@ in the top-level :mod:`email` package namespace. .. currentmodule:: email -.. function:: message_from_string(s, _class=email.message.Message, strict=None) +.. function:: message_from_string(s, _class=email.message.Message, *, \ + policy=policy.default) Return a message object structure from a string. This is exactly equivalent to - ``Parser().parsestr(s)``. Optional *_class* and *strict* are interpreted as - with the :class:`Parser` class constructor. + ``Parser().parsestr(s)``. *_class* and *policy* are interpreted as + with the :class:`~email.parser.Parser` class constructor. + + .. versionchanged:: 3.3 + Removed the *strict* argument. Added the *policy* keyword. -.. function:: message_from_bytes(s, _class=email.message.Message, strict=None) +.. function:: message_from_bytes(s, _class=email.message.Message, *, \ + policy=policy.default) Return a message object structure from a byte string. This is exactly equivalent to ``BytesParser().parsebytes(s)``. Optional *_class* and - *strict* are interpreted as with the :class:`Parser` class constructor. + *strict* are interpreted as with the :class:`~email.parser.Parser` class + constructor. .. versionadded:: 3.2 + .. versionchanged:: 3.3 + Removed the *strict* argument. Added the *policy* keyword. -.. function:: message_from_file(fp, _class=email.message.Message, strict=None) +.. function:: message_from_file(fp, _class=email.message.Message, *, \ + policy=policy.default) Return a message object structure tree from an open :term:`file object`. - This is exactly equivalent to ``Parser().parse(fp)``. Optional *_class* - and *strict* are interpreted as with the :class:`Parser` class constructor. + This is exactly equivalent to ``Parser().parse(fp)``. *_class* + and *policy* are interpreted as with the :class:`~email.parser.Parser` class + constructor. + + .. versionchanged:: + Removed the *strict* argument. Added the *policy* keyword. -.. function:: message_from_binary_file(fp, _class=email.message.Message, strict=None) +.. function:: message_from_binary_file(fp, _class=email.message.Message, *, \ + policy=policy.default) Return a message object structure tree from an open binary :term:`file object`. This is exactly equivalent to ``BytesParser().parse(fp)``. - Optional *_class* and *strict* are interpreted as with the :class:`Parser` - class constructor. + *_class* and *policy* are interpreted as with the + :class:`~email.parser.Parser` class constructor. .. versionadded:: 3.2 + .. versionchanged:: 3.3 + Removed the *strict* argument. Added the *policy* keyword. Here's an example of how you might use this at an interactive Python prompt:: >>> import email - >>> msg = email.message_from_string(myString) + >>> msg = email.message_from_string(myString) # doctest: +SKIP Additional notes @@ -232,32 +266,35 @@ Here are some notes on the parsing semantics: * Most non-\ :mimetype:`multipart` type messages are parsed as a single message object with a string payload. These objects will return ``False`` for - :meth:`is_multipart`. Their :meth:`get_payload` method will return a string - object. + :meth:`~email.message.Message.is_multipart`. Their + :meth:`~email.message.Message.get_payload` method will return a string object. * All :mimetype:`multipart` type messages will be parsed as a container message object with a list of sub-message objects for their payload. The outer - container message will return ``True`` for :meth:`is_multipart` and their - :meth:`get_payload` method will return the list of :class:`~email.message.Message` - subparts. + container message will return ``True`` for + :meth:`~email.message.Message.is_multipart` and their + :meth:`~email.message.Message.get_payload` method will return the list of + :class:`~email.message.Message` subparts. * Most messages with a content type of :mimetype:`message/\*` (e.g. :mimetype:`message/delivery-status` and :mimetype:`message/rfc822`) will also be parsed as container object containing a list payload of length 1. Their - :meth:`is_multipart` method will return ``True``. The single element in the - list payload will be a sub-message object. + :meth:`~email.message.Message.is_multipart` method will return ``True``. + The single element in the list payload will be a sub-message object. * Some non-standards compliant messages may not be internally consistent about their :mimetype:`multipart`\ -edness. Such messages may have a :mailheader:`Content-Type` header of type :mimetype:`multipart`, but their - :meth:`is_multipart` method may return ``False``. If such messages were parsed - with the :class:`FeedParser`, they will have an instance of the - :class:`MultipartInvariantViolationDefect` class in their *defects* attribute - list. See :mod:`email.errors` for details. + :meth:`~email.message.Message.is_multipart` method may return ``False``. + If such messages were parsed with the :class:`~email.parser.FeedParser`, + they will have an instance of the + :class:`~email.errors.MultipartInvariantViolationDefect` class in their + *defects* attribute list. See :mod:`email.errors` for details. .. rubric:: Footnotes .. [#] As of email package version 3.0, introduced in Python 2.4, the classic - :class:`Parser` was re-implemented in terms of the :class:`FeedParser`, so the - semantics and results are identical between the two parsers. + :class:`~email.parser.Parser` was re-implemented in terms of the + :class:`~email.parser.FeedParser`, so the semantics and results are + identical between the two parsers. diff --git a/Doc/library/email.policy.rst b/Doc/library/email.policy.rst new file mode 100644 index 0000000..cb2023c --- /dev/null +++ b/Doc/library/email.policy.rst @@ -0,0 +1,513 @@ +:mod:`email.policy`: Policy Objects +----------------------------------- + +.. module:: email.policy + :synopsis: Controlling the parsing and generating of messages + +.. moduleauthor:: R. David Murray <rdmurray@bitdance.com> +.. sectionauthor:: R. David Murray <rdmurray@bitdance.com> + +.. versionadded:: 3.3 + + +The :mod:`email` package's prime focus is the handling of email messages as +described by the various email and MIME RFCs. However, the general format of +email messages (a block of header fields each consisting of a name followed by +a colon followed by a value, the whole block followed by a blank line and an +arbitrary 'body'), is a format that has found utility outside of the realm of +email. Some of these uses conform fairly closely to the main RFCs, some do +not. And even when working with email, there are times when it is desirable to +break strict compliance with the RFCs. + +Policy objects give the email package the flexibility to handle all these +disparate use cases. + +A :class:`Policy` object encapsulates a set of attributes and methods that +control the behavior of various components of the email package during use. +:class:`Policy` instances can be passed to various classes and methods in the +email package to alter the default behavior. The settable values and their +defaults are described below. + +There is a default policy used by all classes in the email package. This +policy is named :class:`Compat32`, with a corresponding pre-defined instance +named :const:`compat32`. It provides for complete backward compatibility (in +some cases, including bug compatibility) with the pre-Python3.3 version of the +email package. + +The first part of this documentation covers the features of :class:`Policy`, an +:term:`abstract base class` that defines the features that are common to all +policy objects, including :const:`compat32`. This includes certain hook +methods that are called internally by the email package, which a custom policy +could override to obtain different behavior. + +When a :class:`~email.message.Message` object is created, it acquires a policy. +By default this will be :const:`compat32`, but a different policy can be +specified. If the ``Message`` is created by a :mod:`~email.parser`, a policy +passed to the parser will be the policy used by the ``Message`` it creates. If +the ``Message`` is created by the program, then the policy can be specified +when it is created. When a ``Message`` is passed to a :mod:`~email.generator`, +the generator uses the policy from the ``Message`` by default, but you can also +pass a specific policy to the generator that will override the one stored on +the ``Message`` object. + +:class:`Policy` instances are immutable, but they can be cloned, accepting the +same keyword arguments as the class constructor and returning a new +:class:`Policy` instance that is a copy of the original but with the specified +attributes values changed. + +As an example, the following code could be used to read an email message from a +file on disk and pass it to the system ``sendmail`` program on a Unix system: + +.. testsetup:: + + >>> from unittest import mock + >>> mocker = mock.patch('subprocess.Popen') + >>> m = mocker.start() + >>> proc = mock.MagicMock() + >>> m.return_value = proc + >>> proc.stdin.close.return_value = None + >>> mymsg = open('mymsg.txt', 'w') + >>> mymsg.write('To: abc@xyz.com\n\n') + 17 + >>> mymsg.flush() + +.. doctest:: + + >>> from email import message_from_binary_file + >>> from email.generator import BytesGenerator + >>> from email import policy + >>> from subprocess import Popen, PIPE + >>> with open('mymsg.txt', 'rb') as f: + ... msg = message_from_binary_file(f, policy=policy.default) + >>> p = Popen(['sendmail', msg['To'].addresses[0]], stdin=PIPE) + >>> g = BytesGenerator(p.stdin, policy=msg.policy.clone(linesep='\r\n')) + >>> g.flatten(msg) + >>> p.stdin.close() + >>> rc = p.wait() + +.. testsetup:: + + >>> mymsg.close() + >>> mocker.stop() + >>> import os + >>> os.remove('mymsg.txt') + +Here we are telling :class:`~email.generator.BytesGenerator` to use the RFC +correct line separator characters when creating the binary string to feed into +``sendmail's`` ``stdin``, where the default policy would use ``\n`` line +separators. + +Policy objects can also be combined using the addition operator, producing a +policy object whose settings are a combination of the non-default values of the +summed objects:: + + >>> compat_SMTP = policy.compat32.clone(linesep='\r\n') + >>> compat_strict = policy.compat32.clone(raise_on_defect=True) + >>> compat_strict_SMTP = compat_SMTP + compat_strict + +This operation is not commutative; that is, the order in which the objects are +added matters. To illustrate:: + + >>> policy100 = policy.compat32.clone(max_line_length=100) + >>> policy80 = policy.compat32.clone(max_line_length=80) + >>> apolicy = policy100 + policy80 + >>> apolicy.max_line_length + 80 + >>> apolicy = policy80 + policy100 + >>> apolicy.max_line_length + 100 + + +.. class:: Policy(**kw) + + This is the :term:`abstract base class` for all policy classes. It provides + default implementations for a couple of trivial methods, as well as the + implementation of the immutability property, the :meth:`clone` method, and + the constructor semantics. + + The constructor of a policy class can be passed various keyword arguments. + The arguments that may be specified are any non-method properties on this + class, plus any additional non-method properties on the concrete class. A + value specified in the constructor will override the default value for the + corresponding attribute. + + This class defines the following properties, and thus values for the + following may be passed in the constructor of any policy class: + + .. attribute:: max_line_length + + The maximum length of any line in the serialized output, not counting the + end of line character(s). Default is 78, per :rfc:`5322`. A value of + ``0`` or :const:`None` indicates that no line wrapping should be + done at all. + + .. attribute:: linesep + + The string to be used to terminate lines in serialized output. The + default is ``\n`` because that's the internal end-of-line discipline used + by Python, though ``\r\n`` is required by the RFCs. + + .. attribute:: cte_type + + Controls the type of Content Transfer Encodings that may be or are + required to be used. The possible values are: + + .. tabularcolumns:: |l|L| + + ======== =============================================================== + ``7bit`` all data must be "7 bit clean" (ASCII-only). This means that + where necessary data will be encoded using either + quoted-printable or base64 encoding. + + ``8bit`` data is not constrained to be 7 bit clean. Data in headers is + still required to be ASCII-only and so will be encoded (see + 'binary_fold' below for an exception), but body parts may use + the ``8bit`` CTE. + ======== =============================================================== + + A ``cte_type`` value of ``8bit`` only works with ``BytesGenerator``, not + ``Generator``, because strings cannot contain binary data. If a + ``Generator`` is operating under a policy that specifies + ``cte_type=8bit``, it will act as if ``cte_type`` is ``7bit``. + + .. attribute:: raise_on_defect + + If :const:`True`, any defects encountered will be raised as errors. If + :const:`False` (the default), defects will be passed to the + :meth:`register_defect` method. + + The following :class:`Policy` method is intended to be called by code using + the email library to create policy instances with custom settings: + + .. method:: clone(**kw) + + Return a new :class:`Policy` instance whose attributes have the same + values as the current instance, except where those attributes are + given new values by the keyword arguments. + + The remaining :class:`Policy` methods are called by the email package code, + and are not intended to be called by an application using the email package. + A custom policy must implement all of these methods. + + .. method:: handle_defect(obj, defect) + + Handle a *defect* found on *obj*. When the email package calls this + method, *defect* will always be a subclass of + :class:`~email.errors.Defect`. + + The default implementation checks the :attr:`raise_on_defect` flag. If + it is ``True``, *defect* is raised as an exception. If it is ``False`` + (the default), *obj* and *defect* are passed to :meth:`register_defect`. + + .. method:: register_defect(obj, defect) + + Register a *defect* on *obj*. In the email package, *defect* will always + be a subclass of :class:`~email.errors.Defect`. + + The default implementation calls the ``append`` method of the ``defects`` + attribute of *obj*. When the email package calls :attr:`handle_defect`, + *obj* will normally have a ``defects`` attribute that has an ``append`` + method. Custom object types used with the email package (for example, + custom ``Message`` objects) should also provide such an attribute, + otherwise defects in parsed messages will raise unexpected errors. + + .. method:: header_max_count(name) + + Return the maximum allowed number of headers named *name*. + + Called when a header is added to a :class:`~email.message.Message` + object. If the returned value is not ``0`` or ``None``, and there are + already a number of headers with the name *name* equal to the value + returned, a :exc:`ValueError` is raised. + + Because the default behavior of ``Message.__setitem__`` is to append the + value to the list of headers, it is easy to create duplicate headers + without realizing it. This method allows certain headers to be limited + in the number of instances of that header that may be added to a + ``Message`` programmatically. (The limit is not observed by the parser, + which will faithfully produce as many headers as exist in the message + being parsed.) + + The default implementation returns ``None`` for all header names. + + .. method:: header_source_parse(sourcelines) + + The email package calls this method with a list of strings, each string + ending with the line separation characters found in the source being + parsed. The first line includes the field header name and separator. + All whitespace in the source is preserved. The method should return the + ``(name, value)`` tuple that is to be stored in the ``Message`` to + represent the parsed header. + + If an implementation wishes to retain compatibility with the existing + email package policies, *name* should be the case preserved name (all + characters up to the '``:``' separator), while *value* should be the + unfolded value (all line separator characters removed, but whitespace + kept intact), stripped of leading whitespace. + + *sourcelines* may contain surrogateescaped binary data. + + There is no default implementation + + .. method:: header_store_parse(name, value) + + The email package calls this method with the name and value provided by + the application program when the application program is modifying a + ``Message`` programmatically (as opposed to a ``Message`` created by a + parser). The method should return the ``(name, value)`` tuple that is to + be stored in the ``Message`` to represent the header. + + If an implementation wishes to retain compatibility with the existing + email package policies, the *name* and *value* should be strings or + string subclasses that do not change the content of the passed in + arguments. + + There is no default implementation + + .. method:: header_fetch_parse(name, value) + + The email package calls this method with the *name* and *value* currently + stored in the ``Message`` when that header is requested by the + application program, and whatever the method returns is what is passed + back to the application as the value of the header being retrieved. + Note that there may be more than one header with the same name stored in + the ``Message``; the method is passed the specific name and value of the + header destined to be returned to the application. + + *value* may contain surrogateescaped binary data. There should be no + surrogateescaped binary data in the value returned by the method. + + There is no default implementation + + .. method:: fold(name, value) + + The email package calls this method with the *name* and *value* currently + stored in the ``Message`` for a given header. The method should return a + string that represents that header "folded" correctly (according to the + policy settings) by composing the *name* with the *value* and inserting + :attr:`linesep` characters at the appropriate places. See :rfc:`5322` + for a discussion of the rules for folding email headers. + + *value* may contain surrogateescaped binary data. There should be no + surrogateescaped binary data in the string returned by the method. + + .. method:: fold_binary(name, value) + + The same as :meth:`fold`, except that the returned value should be a + bytes object rather than a string. + + *value* may contain surrogateescaped binary data. These could be + converted back into binary data in the returned bytes object. + + +.. class:: Compat32(**kw) + + This concrete :class:`Policy` is the backward compatibility policy. It + replicates the behavior of the email package in Python 3.2. The + :mod:`~email.policy` module also defines an instance of this class, + :const:`compat32`, that is used as the default policy. Thus the default + behavior of the email package is to maintain compatibility with Python 3.2. + + The class provides the following concrete implementations of the + abstract methods of :class:`Policy`: + + .. method:: header_source_parse(sourcelines) + + The name is parsed as everything up to the '``:``' and returned + unmodified. The value is determined by stripping leading whitespace off + the remainder of the first line, joining all subsequent lines together, + and stripping any trailing carriage return or linefeed characters. + + .. method:: header_store_parse(name, value) + + The name and value are returned unmodified. + + .. method:: header_fetch_parse(name, value) + + If the value contains binary data, it is converted into a + :class:`~email.header.Header` object using the ``unknown-8bit`` charset. + Otherwise it is returned unmodified. + + .. method:: fold(name, value) + + Headers are folded using the :class:`~email.header.Header` folding + algorithm, which preserves existing line breaks in the value, and wraps + each resulting line to the ``max_line_length``. Non-ASCII binary data are + CTE encoded using the ``unknown-8bit`` charset. + + .. method:: fold_binary(name, value) + + Headers are folded using the :class:`~email.header.Header` folding + algorithm, which preserves existing line breaks in the value, and wraps + each resulting line to the ``max_line_length``. If ``cte_type`` is + ``7bit``, non-ascii binary data is CTE encoded using the ``unknown-8bit`` + charset. Otherwise the original source header is used, with its existing + line breaks and any (RFC invalid) binary data it may contain. + + +.. note:: + + The documentation below describes new policies that are included in the + standard library on a :term:`provisional basis <provisional package>`. + Backwards incompatible changes (up to and including removal of the feature) + may occur if deemed necessary by the core developers. + + +.. class:: EmailPolicy(**kw) + + This concrete :class:`Policy` provides behavior that is intended to be fully + compliant with the current email RFCs. These include (but are not limited + to) :rfc:`5322`, :rfc:`2047`, and the current MIME RFCs. + + This policy adds new header parsing and folding algorithms. Instead of + simple strings, headers are custom objects with custom attributes depending + on the type of the field. The parsing and folding algorithm fully implement + :rfc:`2047` and :rfc:`5322`. + + In addition to the settable attributes listed above that apply to all + policies, this policy adds the following additional attributes: + + .. attribute:: refold_source + + If the value for a header in the ``Message`` object originated from a + :mod:`~email.parser` (as opposed to being set by a program), this + attribute indicates whether or not a generator should refold that value + when transforming the message back into stream form. The possible values + are: + + ======== =============================================================== + ``none`` all source values use original folding + + ``long`` source values that have any line that is longer than + ``max_line_length`` will be refolded + + ``all`` all values are refolded. + ======== =============================================================== + + The default is ``long``. + + .. attribute:: header_factory + + A callable that takes two arguments, ``name`` and ``value``, where + ``name`` is a header field name and ``value`` is an unfolded header field + value, and returns a string subclass that represents that header. A + default ``header_factory`` (see :mod:`~email.headerregistry`) is provided + that understands some of the :RFC:`5322` header field types. (Currently + address fields and date fields have special treatment, while all other + fields are treated as unstructured. This list will be completed before + the extension is marked stable.) + + The class provides the following concrete implementations of the abstract + methods of :class:`Policy`: + + .. method:: header_max_count(name) + + Returns the value of the + :attr:`~email.headerregistry.BaseHeader.max_count` attribute of the + specialized class used to represent the header with the given name. + + .. method:: header_source_parse(sourcelines) + + The implementation of this method is the same as that for the + :class:`Compat32` policy. + + .. method:: header_store_parse(name, value) + + The name is returned unchanged. If the input value has a ``name`` + attribute and it matches *name* ignoring case, the value is returned + unchanged. Otherwise the *name* and *value* are passed to + ``header_factory``, and the resulting custom header object is returned as + the value. In this case a ``ValueError`` is raised if the input value + contains CR or LF characters. + + .. method:: header_fetch_parse(name, value) + + If the value has a ``name`` attribute, it is returned to unmodified. + Otherwise the *name*, and the *value* with any CR or LF characters + removed, are passed to the ``header_factory``, and the resulting custom + header object is returned. Any surrogateescaped bytes get turned into + the unicode unknown-character glyph. + + .. method:: fold(name, value) + + Header folding is controlled by the :attr:`refold_source` policy setting. + A value is considered to be a 'source value' if and only if it does not + have a ``name`` attribute (having a ``name`` attribute means it is a + header object of some sort). If a source value needs to be refolded + according to the policy, it is converted into a custom header object by + passing the *name* and the *value* with any CR and LF characters removed + to the ``header_factory``. Folding of a custom header object is done by + calling its ``fold`` method with the current policy. + + Source values are split into lines using :meth:`~str.splitlines`. If + the value is not to be refolded, the lines are rejoined using the + ``linesep`` from the policy and returned. The exception is lines + containing non-ascii binary data. In that case the value is refolded + regardless of the ``refold_source`` setting, which causes the binary data + to be CTE encoded using the ``unknown-8bit`` charset. + + .. method:: fold_binary(name, value) + + The same as :meth:`fold` if :attr:`~Policy.cte_type` is ``7bit``, except + that the returned value is bytes. + + If :attr:`~Policy.cte_type` is ``8bit``, non-ASCII binary data is + converted back + into bytes. Headers with binary data are not refolded, regardless of the + ``refold_header`` setting, since there is no way to know whether the + binary data consists of single byte characters or multibyte characters. + +The following instances of :class:`EmailPolicy` provide defaults suitable for +specific application domains. Note that in the future the behavior of these +instances (in particular the ``HTTP`` instance) may be adjusted to conform even +more closely to the RFCs relevant to their domains. + +.. data:: default + + An instance of ``EmailPolicy`` with all defaults unchanged. This policy + uses the standard Python ``\n`` line endings rather than the RFC-correct + ``\r\n``. + +.. data:: SMTP + + Suitable for serializing messages in conformance with the email RFCs. + Like ``default``, but with ``linesep`` set to ``\r\n``, which is RFC + compliant. + +.. data:: HTTP + + Suitable for serializing headers with for use in HTTP traffic. Like + ``SMTP`` except that ``max_line_length`` is set to ``None`` (unlimited). + +.. data:: strict + + Convenience instance. The same as ``default`` except that + ``raise_on_defect`` is set to ``True``. This allows any policy to be made + strict by writing:: + + somepolicy + policy.strict + +With all of these :class:`EmailPolicies <.EmailPolicy>`, the effective API of +the email package is changed from the Python 3.2 API in the following ways: + + * Setting a header on a :class:`~email.message.Message` results in that + header being parsed and a custom header object created. + + * Fetching a header value from a :class:`~email.message.Message` results + in that header being parsed and a custom header object created and + returned. + + * Any custom header object, or any header that is refolded due to the + policy settings, is folded using an algorithm that fully implements the + RFC folding algorithms, including knowing where encoded words are required + and allowed. + +From the application view, this means that any header obtained through the +:class:`~email.message.Message` is a custom header object with custom +attributes, whose string value is the fully decoded unicode value of the +header. Likewise, a header may be assigned a new value, or a new header +created, using a unicode string, and the policy will take care of converting +the unicode string into the correct RFC encoded form. + +The custom header objects and their attributes are described in +:mod:`~email.headerregistry`. diff --git a/Doc/library/email.rst b/Doc/library/email.rst index 4530b95..a6cbbce 100644 --- a/Doc/library/email.rst +++ b/Doc/library/email.rst @@ -51,6 +51,8 @@ Contents of the :mod:`email` package documentation: email.message.rst email.parser.rst email.generator.rst + email.policy.rst + email.headerregistry.rst email.mime.rst email.header.rst email.charset.rst @@ -145,14 +147,15 @@ Here are the major differences between :mod:`email` version 4 and version 3: *Note that the version 3 names will continue to work until Python 2.6*. * The :mod:`email.mime.application` module was added, which contains the - :class:`MIMEApplication` class. + :class:`~email.mime.application.MIMEApplication` class. * Methods that were deprecated in version 3 have been removed. These include :meth:`Generator.__call__`, :meth:`Message.get_type`, :meth:`Message.get_main_type`, :meth:`Message.get_subtype`. * Fixes have been added for :rfc:`2231` support which can change some of the - return types for :func:`Message.get_param` and friends. Under some + return types for :func:`Message.get_param <email.message.Message.get_param>` + and friends. Under some circumstances, values which used to return a 3-tuple now return simple strings (specifically, if all extended parameter segments were unencoded, there is no language and charset designation expected, so the return type is now a simple @@ -161,23 +164,24 @@ Here are the major differences between :mod:`email` version 4 and version 3: Here are the major differences between :mod:`email` version 3 and version 2: -* The :class:`FeedParser` class was introduced, and the :class:`Parser` class - was implemented in terms of the :class:`FeedParser`. All parsing therefore is +* The :class:`~email.parser.FeedParser` class was introduced, and the + :class:`~email.parser.Parser` class was implemented in terms of the + :class:`~email.parser.FeedParser`. All parsing therefore is non-strict, and parsing will make a best effort never to raise an exception. Problems found while parsing messages are stored in the message's *defect* attribute. * All aspects of the API which raised :exc:`DeprecationWarning`\ s in version 2 have been removed. These include the *_encoder* argument to the - :class:`MIMEText` constructor, the :meth:`Message.add_payload` method, the - :func:`Utils.dump_address_pair` function, and the functions :func:`Utils.decode` - and :func:`Utils.encode`. + :class:`~email.mime.text.MIMEText` constructor, the + :meth:`Message.add_payload` method, the :func:`Utils.dump_address_pair` + function, and the functions :func:`Utils.decode` and :func:`Utils.encode`. * New :exc:`DeprecationWarning`\ s have been added to: :meth:`Generator.__call__`, :meth:`Message.get_type`, :meth:`Message.get_main_type`, :meth:`Message.get_subtype`, and the *strict* - argument to the :class:`Parser` class. These are expected to be removed in - future versions. + argument to the :class:`~email.parser.Parser` class. These are expected to + be removed in future versions. * Support for Pythons earlier than 2.3 has been removed. @@ -185,53 +189,61 @@ Here are the differences between :mod:`email` version 2 and version 1: * The :mod:`email.Header` and :mod:`email.Charset` modules have been added. -* The pickle format for :class:`Message` instances has changed. Since this was - never (and still isn't) formally defined, this isn't considered a backward - incompatibility. However if your application pickles and unpickles - :class:`Message` instances, be aware that in :mod:`email` version 2, - :class:`Message` instances now have private variables *_charset* and - *_default_type*. +* The pickle format for :class:`~email.message.Message` instances has changed. + Since this was never (and still isn't) formally defined, this isn't + considered a backward incompatibility. However if your application pickles + and unpickles :class:`~email.message.Message` instances, be aware that in + :mod:`email` version 2, :class:`~email.message.Message` instances now have + private variables *_charset* and *_default_type*. -* Several methods in the :class:`Message` class have been deprecated, or their - signatures changed. Also, many new methods have been added. See the - documentation for the :class:`Message` class for details. The changes should be - completely backward compatible. +* Several methods in the :class:`~email.message.Message` class have been + deprecated, or their signatures changed. Also, many new methods have been + added. See the documentation for the :class:`~email.message.Message` class + for details. The changes should be completely backward compatible. * The object structure has changed in the face of :mimetype:`message/rfc822` - content types. In :mod:`email` version 1, such a type would be represented by a - scalar payload, i.e. the container message's :meth:`is_multipart` returned - false, :meth:`get_payload` was not a list object, but a single :class:`Message` - instance. + content types. In :mod:`email` version 1, such a type would be represented + by a scalar payload, i.e. the container message's + :meth:`~email.message.Message.is_multipart` returned false, + :meth:`~email.message.Message.get_payload` was not a list object, but a + single :class:`~email.message.Message` instance. This structure was inconsistent with the rest of the package, so the object representation for :mimetype:`message/rfc822` content types was changed. In :mod:`email` version 2, the container *does* return ``True`` from - :meth:`is_multipart`, and :meth:`get_payload` returns a list containing a single - :class:`Message` item. - - Note that this is one place that backward compatibility could not be completely - maintained. However, if you're already testing the return type of - :meth:`get_payload`, you should be fine. You just need to make sure your code - doesn't do a :meth:`set_payload` with a :class:`Message` instance on a container - with a content type of :mimetype:`message/rfc822`. - -* The :class:`Parser` constructor's *strict* argument was added, and its - :meth:`parse` and :meth:`parsestr` methods grew a *headersonly* argument. The - *strict* flag was also added to functions :func:`email.message_from_file` and - :func:`email.message_from_string`. - -* :meth:`Generator.__call__` is deprecated; use :meth:`Generator.flatten` - instead. The :class:`Generator` class has also grown the :meth:`clone` method. - -* The :class:`DecodedGenerator` class in the :mod:`email.Generator` module was - added. - -* The intermediate base classes :class:`MIMENonMultipart` and - :class:`MIMEMultipart` have been added, and interposed in the class hierarchy - for most of the other MIME-related derived classes. - -* The *_encoder* argument to the :class:`MIMEText` constructor has been - deprecated. Encoding now happens implicitly based on the *_charset* argument. + :meth:`~email.message.Message.is_multipart`, and + :meth:`~email.message.Message.get_payload` returns a list containing a single + :class:`~email.message.Message` item. + + Note that this is one place that backward compatibility could not be + completely maintained. However, if you're already testing the return type of + :meth:`~email.message.Message.get_payload`, you should be fine. You just need + to make sure your code doesn't do a :meth:`~email.message.Message.set_payload` + with a :class:`~email.message.Message` instance on a container with a content + type of :mimetype:`message/rfc822`. + +* The :class:`~email.parser.Parser` constructor's *strict* argument was added, + and its :meth:`~email.parser.Parser.parse` and + :meth:`~email.parser.Parser.parsestr` methods grew a *headersonly* argument. + The *strict* flag was also added to functions :func:`email.message_from_file` + and :func:`email.message_from_string`. + +* :meth:`Generator.__call__` is deprecated; use :meth:`Generator.flatten + <email.generator.Generator.flatten>` instead. The + :class:`~email.generator.Generator` class has also grown the + :meth:`~email.generator.Generator.clone` method. + +* The :class:`~email.generator.DecodedGenerator` class in the + :mod:`email.generator` module was added. + +* The intermediate base classes + :class:`~email.mime.nonmultipart.MIMENonMultipart` and + :class:`~email.mime.multipart.MIMEMultipart` have been added, and interposed + in the class hierarchy for most of the other MIME-related derived classes. + +* The *_encoder* argument to the :class:`~email.mime.text.MIMEText` constructor + has been deprecated. Encoding now happens implicitly based on the + *_charset* argument. * The following functions in the :mod:`email.Utils` module have been deprecated: :func:`dump_address_pairs`, :func:`decode`, and :func:`encode`. The following @@ -264,17 +276,22 @@ package has the following differences: * :func:`messageFromFile` has been renamed to :func:`message_from_file`. -The :class:`Message` class has the following differences: +The :class:`~email.message.Message` class has the following differences: -* The method :meth:`asString` was renamed to :meth:`as_string`. +* The method :meth:`asString` was renamed to + :meth:`~email.message.Message.as_string`. -* The method :meth:`ismultipart` was renamed to :meth:`is_multipart`. +* The method :meth:`ismultipart` was renamed to + :meth:`~email.message.Message.is_multipart`. -* The :meth:`get_payload` method has grown a *decode* optional argument. +* The :meth:`~email.message.Message.get_payload` method has grown a *decode* + optional argument. -* The method :meth:`getall` was renamed to :meth:`get_all`. +* The method :meth:`getall` was renamed to + :meth:`~email.message.Message.get_all`. -* The method :meth:`addheader` was renamed to :meth:`add_header`. +* The method :meth:`addheader` was renamed to + :meth:`~email.message.Message.add_header`. * The method :meth:`gettype` was renamed to :meth:`get_type`. @@ -282,48 +299,57 @@ The :class:`Message` class has the following differences: * The method :meth:`getsubtype` was renamed to :meth:`get_subtype`. -* The method :meth:`getparams` was renamed to :meth:`get_params`. Also, whereas - :meth:`getparams` returned a list of strings, :meth:`get_params` returns a list - of 2-tuples, effectively the key/value pairs of the parameters, split on the - ``'='`` sign. +* The method :meth:`getparams` was renamed to + :meth:`~email.message.Message.get_params`. Also, whereas :meth:`getparams` + returned a list of strings, :meth:`~email.message.Message.get_params` returns + a list of 2-tuples, effectively the key/value pairs of the parameters, split + on the ``'='`` sign. -* The method :meth:`getparam` was renamed to :meth:`get_param`. +* The method :meth:`getparam` was renamed to + :meth:`~email.message.Message.get_param`. -* The method :meth:`getcharsets` was renamed to :meth:`get_charsets`. +* The method :meth:`getcharsets` was renamed to + :meth:`~email.message.Message.get_charsets`. -* The method :meth:`getfilename` was renamed to :meth:`get_filename`. +* The method :meth:`getfilename` was renamed to + :meth:`~email.message.Message.get_filename`. -* The method :meth:`getboundary` was renamed to :meth:`get_boundary`. +* The method :meth:`getboundary` was renamed to + :meth:`~email.message.Message.get_boundary`. -* The method :meth:`setboundary` was renamed to :meth:`set_boundary`. +* The method :meth:`setboundary` was renamed to + :meth:`~email.message.Message.set_boundary`. * The method :meth:`getdecodedpayload` was removed. To get similar - functionality, pass the value 1 to the *decode* flag of the get_payload() - method. + functionality, pass the value 1 to the *decode* flag of the + :meth:`~email.message.Message.get_payload` method. * The method :meth:`getpayloadastext` was removed. Similar functionality is - supported by the :class:`DecodedGenerator` class in the :mod:`email.generator` - module. + supported by the :class:`~email.generator.DecodedGenerator` class in the + :mod:`email.generator` module. * The method :meth:`getbodyastext` was removed. You can get similar - functionality by creating an iterator with :func:`typed_subpart_iterator` in the - :mod:`email.iterators` module. + functionality by creating an iterator with + :func:`~email.iterators.typed_subpart_iterator` in the :mod:`email.iterators` + module. -The :class:`Parser` class has no differences in its public interface. It does -have some additional smarts to recognize :mimetype:`message/delivery-status` -type messages, which it represents as a :class:`Message` instance containing -separate :class:`Message` subparts for each header block in the delivery status -notification [#]_. +The :class:`~email.parser.Parser` class has no differences in its public +interface. It does have some additional smarts to recognize +:mimetype:`message/delivery-status` type messages, which it represents as a +:class:`~email.message.Message` instance containing separate +:class:`~email.message.Message` subparts for each header block in the delivery +status notification [#]_. -The :class:`Generator` class has no differences in its public interface. There -is a new class in the :mod:`email.generator` module though, called -:class:`DecodedGenerator` which provides most of the functionality previously -available in the :meth:`Message.getpayloadastext` method. +The :class:`~email.generator.Generator` class has no differences in its public +interface. There is a new class in the :mod:`email.generator` module though, +called :class:`~email.generator.DecodedGenerator` which provides most of the +functionality previously available in the :meth:`Message.getpayloadastext` +method. The following modules and classes have been changed: -* The :class:`MIMEBase` class constructor arguments *_major* and *_minor* have - changed to *_maintype* and *_subtype* respectively. +* The :class:`~email.mime.base.MIMEBase` class constructor arguments *_major* + and *_minor* have changed to *_maintype* and *_subtype* respectively. * The ``Image`` class/module has been renamed to ``MIMEImage``. The *_minor* argument has been renamed to *_subtype*. @@ -336,7 +362,8 @@ The following modules and classes have been changed: but that clashed with the Python standard library module :mod:`rfc822` on some case-insensitive file systems. - Also, the :class:`MIMEMessage` class now represents any kind of MIME message + Also, the :class:`~email.mime.message.MIMEMessage` class now represents any + kind of MIME message with main type :mimetype:`message`. It takes an optional argument *_subtype* which is used to set the MIME subtype. *_subtype* defaults to :mimetype:`rfc822`. @@ -346,8 +373,8 @@ The following modules and classes have been changed: :mod:`email.utils` module. The ``MsgReader`` class/module has been removed. Its functionality is most -closely supported in the :func:`body_line_iterator` function in the -:mod:`email.iterators` module. +closely supported in the :func:`~email.iterators.body_line_iterator` function +in the :mod:`email.iterators` module. .. rubric:: Footnotes diff --git a/Doc/library/email.util.rst b/Doc/library/email.util.rst index 11bf3b2..f75975e 100644 --- a/Doc/library/email.util.rst +++ b/Doc/library/email.util.rst @@ -29,20 +29,28 @@ There are several useful utilities provided in the :mod:`email.utils` module: fails, in which case a 2-tuple of ``('', '')`` is returned. -.. function:: formataddr(pair) +.. function:: formataddr(pair, charset='utf-8') The inverse of :meth:`parseaddr`, this takes a 2-tuple of the form ``(realname, email_address)`` and returns the string value suitable for a :mailheader:`To` or :mailheader:`Cc` header. If the first element of *pair* is false, then the second element is returned unmodified. + Optional *charset* is the character set that will be used in the :rfc:`2047` + encoding of the ``realname`` if the ``realname`` contains non-ASCII + characters. Can be an instance of :class:`str` or a + :class:`~email.charset.Charset`. Defaults to ``utf-8``. + + .. versionchanged:: 3.3 + Added the *charset* option. + .. function:: getaddresses(fieldvalues) This method returns a list of 2-tuples of the form returned by ``parseaddr()``. *fieldvalues* is a sequence of header field values as might be returned by - :meth:`Message.get_all`. Here's a simple example that gets all the recipients - of a message:: + :meth:`Message.get_all <email.message.Message.get_all>`. Here's a simple + example that gets all the recipients of a message:: from email.utils import getaddresses @@ -74,6 +82,20 @@ There are several useful utilities provided in the :mod:`email.utils` module: indexes 6, 7, and 8 of the result tuple are not usable. +.. function:: parsedate_to_datetime(date) + + The inverse of :func:`format_datetime`. Performs the same function as + :func:`parsedate`, but on success returns a :mod:`~datetime.datetime`. If + the input date has a timezone of ``-0000``, the ``datetime`` will be a naive + ``datetime``, and if the date is conforming to the RFCs it will represent a + time in UTC but with no indication of the actual source timezone of the + message the date comes from. If the input date has any other valid timezone + offset, the ``datetime`` will be an aware ``datetime`` with the + corresponding a :class:`~datetime.timezone` :class:`~datetime.tzinfo`. + + .. versionadded:: 3.3 + + .. function:: mktime_tz(tuple) Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. It @@ -105,6 +127,36 @@ There are several useful utilities provided in the :mod:`email.utils` module: ``False``. The default is ``False``. +.. function:: format_datetime(dt, usegmt=False) + + Like ``formatdate``, but the input is a :mod:`datetime` instance. If it is + a naive datetime, it is assumed to be "UTC with no information about the + source timezone", and the conventional ``-0000`` is used for the timezone. + If it is an aware ``datetime``, then the numeric timezone offset is used. + If it is an aware timezone with offset zero, then *usegmt* may be set to + ``True``, in which case the string ``GMT`` is used instead of the numeric + timezone offset. This provides a way to generate standards conformant HTTP + date headers. + + .. versionadded:: 3.3 + + +.. function:: localtime(dt=None) + + Return local time as an aware datetime object. If called without + arguments, return current time. Otherwise *dt* argument should be a + :class:`~datetime.datetime` instance, and it is converted to the local time + zone according to the system time zone database. If *dt* is naive (that + is, ``dt.tzinfo`` is ``None``), it is assumed to be in local time. In this + case, a positive or zero value for *isdst* causes ``localtime`` to presume + initially that summer time (for example, Daylight Saving Time) is or is not + (respectively) in effect for the specified time. A negative value for + *isdst* causes the ``localtime`` to attempt to divine whether summer time + is in effect for the specified time. + + .. versionadded:: 3.3 + + .. function:: make_msgid(idstring=None, domain=None) Returns a string suitable for an :rfc:`2822`\ -compliant @@ -115,7 +167,8 @@ There are several useful utilities provided in the :mod:`email.utils` module: may be useful certain cases, such as a constructing distributed system that uses a consistent domain name across multiple hosts. - .. versionchanged:: 3.2 domain keyword added + .. versionchanged:: 3.2 + Added the *domain* keyword. .. function:: decode_rfc2231(s) @@ -134,10 +187,11 @@ There are several useful utilities provided in the :mod:`email.utils` module: .. function:: collapse_rfc2231_value(value, errors='replace', fallback_charset='us-ascii') When a header parameter is encoded in :rfc:`2231` format, - :meth:`Message.get_param` may return a 3-tuple containing the character set, + :meth:`Message.get_param <email.message.Message.get_param>` may return a + 3-tuple containing the character set, language, and value. :func:`collapse_rfc2231_value` turns this into a unicode string. Optional *errors* is passed to the *errors* argument of :class:`str`'s - :func:`encode` method; it defaults to ``'replace'``. Optional + :func:`~str.encode` method; it defaults to ``'replace'``. Optional *fallback_charset* specifies the character set to use if the one in the :rfc:`2231` header is not known by Python; it defaults to ``'us-ascii'``. diff --git a/Doc/library/exceptions.rst b/Doc/library/exceptions.rst index 7d622c2..6e4559c 100644 --- a/Doc/library/exceptions.rst +++ b/Doc/library/exceptions.rst @@ -34,6 +34,41 @@ programmers are encouraged to at least derive new exceptions from the defining exceptions is available in the Python Tutorial under :ref:`tut-userexceptions`. +When raising (or re-raising) an exception in an :keyword:`except` clause +:attr:`__context__` is automatically set to the last exception caught; if the +new exception is not handled the traceback that is eventually displayed will +include the originating exception(s) and the final exception. + +When raising a new exception (rather than using a bare ``raise`` to re-raise +the exception currently being handled), the implicit exception context can be +supplemented with an explicit cause by using :keyword:`from` with +:keyword:`raise`:: + + raise new_exc from original_exc + +The expression following :keyword:`from` must be an exception or ``None``. It +will be set as :attr:`__cause__` on the raised exception. Setting +:attr:`__cause__` also implicitly sets the :attr:`__suppress_context__` +attribute to ``True``, so that using ``raise new_exc from None`` +effectively replaces the old exception with the new one for display +purposes (e.g. converting :exc:`KeyError` to :exc:`AttributeError`, while +leaving the old exception available in :attr:`__context__` for introspection +when debugging. + +The default traceback display code shows these chained exceptions in +addition to the traceback for the exception itself. An explicitly chained +exception in :attr:`__cause__` is always shown when present. An implicitly +chained exception in :attr:`__context__` is shown only if :attr:`__cause__` +is :const:`None` and :attr:`__suppress_context__` is false. + +In either case, the exception itself is always shown after any chained +exceptions so that the final line of the traceback always shows the last +exception that was raised. + + +Base classes +------------ + The following exceptions are used mostly as base classes for other exceptions. .. exception:: BaseException @@ -90,27 +125,8 @@ The following exceptions are used mostly as base classes for other exceptions. can be raised directly by :func:`codecs.lookup`. -.. exception:: EnvironmentError - - The base class for exceptions that can occur outside the Python system: - :exc:`IOError`, :exc:`OSError`. When exceptions of this type are created with a - 2-tuple, the first item is available on the instance's :attr:`errno` attribute - (it is assumed to be an error number), and the second item is available on the - :attr:`strerror` attribute (it is usually the associated error message). The - tuple itself is also available on the :attr:`args` attribute. - - When an :exc:`EnvironmentError` exception is instantiated with a 3-tuple, the - first two items are available as above, while the third item is available on the - :attr:`filename` attribute. However, for backwards compatibility, the - :attr:`args` attribute contains only a 2-tuple of the first two constructor - arguments. - - The :attr:`filename` attribute is ``None`` when this exception is created with - other than 3 arguments. The :attr:`errno` and :attr:`strerror` attributes are - also ``None`` when the instance was created with other than 2 or 3 arguments. - In this last case, :attr:`args` contains the verbatim constructor arguments as a - tuple. - +Concrete exceptions +------------------- The following exceptions are the exceptions that are usually raised. @@ -130,10 +146,9 @@ The following exceptions are the exceptions that are usually raised. .. exception:: EOFError - Raised when one of the built-in functions (:func:`input` or :func:`raw_input`) - hits an end-of-file condition (EOF) without reading any data. (N.B.: the - :meth:`file.read` and :meth:`file.readline` methods return an empty string - when they hit EOF.) + Raised when the :func:`input` function hits an end-of-file condition (EOF) + without reading any data. (N.B.: the :meth:`io.IOBase.read` and + :meth:`io.IOBase.readline` methods return an empty string when they hit EOF.) .. exception:: FloatingPointError @@ -151,21 +166,19 @@ The following exceptions are the exceptions that are usually raised. it is technically not an error. -.. exception:: IOError - - Raised when an I/O operation (such as the built-in :func:`print` or - :func:`open` functions or a method of a :term:`file object`) fails for an - I/O-related reason, e.g., "file not found" or "disk full". - - This class is derived from :exc:`EnvironmentError`. See the discussion above - for more information on exception instance attributes. - - .. exception:: ImportError Raised when an :keyword:`import` statement fails to find the module definition or when a ``from ... import`` fails to find a name that is to be imported. + The :attr:`name` and :attr:`path` attributes can be set using keyword-only + arguments to the constructor. When set they represent the name of the module + that was attempted to be imported and the path to any file which triggered + the exception, respectively. + + .. versionchanged:: 3.3 + Added the :attr:`name` and :attr:`path` attributes. + .. exception:: IndexError @@ -221,17 +234,30 @@ The following exceptions are the exceptions that are usually raised. .. index:: module: errno - This exception is derived from :exc:`EnvironmentError`. It is raised when a - function returns a system-related error (not for illegal argument types or - other incidental errors). The :attr:`errno` attribute is a numeric error - code from :c:data:`errno`, and the :attr:`strerror` attribute is the - corresponding string, as would be printed by the C function :c:func:`perror`. - See the module :mod:`errno`, which contains names for the error codes defined - by the underlying operating system. + This exception is raised when a system function returns a system-related + error, including I/O failures such as "file not found" or "disk full" + (not for illegal argument types or other incidental errors). Often a + subclass of :exc:`OSError` will actually be raised as described in + `OS exceptions`_ below. The :attr:`errno` attribute is a numeric error + code from the C variable :c:data:`errno`. + + Under Windows, the :attr:`winerror` attribute gives you the native + Windows error code. The :attr:`errno` attribute is then an approximate + translation, in POSIX terms, of that native error code. + + Under all platforms, the :attr:`strerror` attribute is the corresponding + error message as provided by the operating system (as formatted by the C + functions :c:func:`perror` under POSIX, and :c:func:`FormatMessage` + Windows). + + For exceptions that involve a file system path (such as :func:`open` or + :func:`os.unlink`), the exception instance will contain an additional + attribute, :attr:`filename`, which is the file name passed to the function. - For exceptions that involve a file system path (such as :func:`chdir` or - :func:`unlink`), the exception instance will contain a third attribute, - :attr:`filename`, which is the file name passed to the function. + .. versionchanged:: 3.3 + :exc:`EnvironmentError`, :exc:`IOError`, :exc:`WindowsError`, + :exc:`VMSError`, :exc:`socket.error`, :exc:`select.error` and + :exc:`mmap.error` have been merged into :exc:`OSError`. .. exception:: OverflowError @@ -255,15 +281,26 @@ The following exceptions are the exceptions that are usually raised. Raised when an error is detected that doesn't fall in any of the other categories. The associated value is a string indicating what precisely went - wrong. (This exception is mostly a relic from a previous version of the - interpreter; it is not used very much any more.) + wrong. .. exception:: StopIteration Raised by built-in function :func:`next` and an :term:`iterator`\'s - :meth:`~iterator.__next__` method to signal that there are no further values. + :meth:`~iterator.__next__` method to signal that there are no further + items produced by the iterator. + The exception object has a single attribute :attr:`value`, which is + given as an argument when constructing the exception, and defaults + to :const:`None`. + + When a generator function returns, a new :exc:`StopIteration` instance is + raised, and the value returned by the function is used as the + :attr:`value` parameter to the constructor of the exception. + + .. versionchanged:: 3.3 + Added ``value`` attribute and the ability for generator functions to + use it to return a value. .. exception:: SyntaxError @@ -311,7 +348,7 @@ The following exceptions are the exceptions that are usually raised. if it has another type (such as a string), the object's value is printed and the exit status is one. - Instances have an attribute :attr:`code` which is set to the proposed exit + Instances have an attribute :attr:`!code` which is set to the proposed exit status or error message (defaulting to ``None``). Also, this exception derives directly from :exc:`BaseException` and not :exc:`Exception`, since it is not technically an error. @@ -321,7 +358,7 @@ The following exceptions are the exceptions that are usually raised. executed, and so that a debugger can execute a script without running the risk of losing control. The :func:`os._exit` function can be used if it is absolutely positively necessary to exit immediately (for example, in the child - process after a call to :func:`fork`). + process after a call to :func:`os.fork`). The exception inherits from :exc:`BaseException` instead of :exc:`Exception` so that it is not accidentally caught by code that catches :exc:`Exception`. This @@ -346,6 +383,30 @@ The following exceptions are the exceptions that are usually raised. Raised when a Unicode-related encoding or decoding error occurs. It is a subclass of :exc:`ValueError`. + :exc:`UnicodeError` has attributes that describe the encoding or decoding + error. For example, ``err.object[err.start:err.end]`` gives the particular + invalid input that the codec failed on. + + .. attribute:: encoding + + The name of the encoding that raised the error. + + .. attribute:: reason + + A string describing the specific codec error. + + .. attribute:: object + + The object the codec was attempting to encode or decode. + + .. attribute:: start + + The first index of invalid data in :attr:`object`. + + .. attribute:: end + + The index after the last invalid data in :attr:`object`. + .. exception:: UnicodeEncodeError @@ -372,27 +433,142 @@ The following exceptions are the exceptions that are usually raised. more precise exception such as :exc:`IndexError`. -.. exception:: VMSError +.. exception:: ZeroDivisionError - Only available on VMS. Raised when a VMS-specific error occurs. + Raised when the second argument of a division or modulo operation is zero. The + associated value is a string indicating the type of the operands and the + operation. +The following exceptions are kept for compatibility with previous versions; +starting from Python 3.3, they are aliases of :exc:`OSError`. + +.. exception:: EnvironmentError + +.. exception:: IOError + +.. exception:: VMSError + + Only available on VMS. + .. exception:: WindowsError - Raised when a Windows-specific error occurs or when the error number does not - correspond to an :c:data:`errno` value. The :attr:`winerror` and - :attr:`strerror` values are created from the return values of the - :c:func:`GetLastError` and :c:func:`FormatMessage` functions from the Windows - Platform API. The :attr:`errno` value maps the :attr:`winerror` value to - corresponding ``errno.h`` values. This is a subclass of :exc:`OSError`. + Only available on Windows. -.. exception:: ZeroDivisionError +OS exceptions +^^^^^^^^^^^^^ + +The following exceptions are subclasses of :exc:`OSError`, they get raised +depending on the system error code. + +.. exception:: BlockingIOError + + Raised when an operation would block on an object (e.g. socket) set + for non-blocking operation. + Corresponds to :c:data:`errno` ``EAGAIN``, ``EALREADY``, + ``EWOULDBLOCK`` and ``EINPROGRESS``. + + In addition to those of :exc:`OSError`, :exc:`BlockingIOError` can have + one more attribute: + + .. attribute:: characters_written + + An integer containing the number of characters written to the stream + before it blocked. This attribute is available when using the + buffered I/O classes from the :mod:`io` module. + +.. exception:: ChildProcessError + + Raised when an operation on a child process failed. + Corresponds to :c:data:`errno` ``ECHILD``. + +.. exception:: ConnectionError + + A base class for connection-related issues. + + Subclasses are :exc:`BrokenPipeError`, :exc:`ConnectionAbortedError`, + :exc:`ConnectionRefusedError` and :exc:`ConnectionResetError`. + +.. exception:: BrokenPipeError + + A subclass of :exc:`ConnectionError`, raised when trying to write on a + pipe while the other end has been closed, or trying to write on a socket + which has been shutdown for writing. + Corresponds to :c:data:`errno` ``EPIPE`` and ``ESHUTDOWN``. + +.. exception:: ConnectionAbortedError + + A subclass of :exc:`ConnectionError`, raised when a connection attempt + is aborted by the peer. + Corresponds to :c:data:`errno` ``ECONNABORTED``. + +.. exception:: ConnectionRefusedError + + A subclass of :exc:`ConnectionError`, raised when a connection attempt + is refused by the peer. + Corresponds to :c:data:`errno` ``ECONNREFUSED``. + +.. exception:: ConnectionResetError + + A subclass of :exc:`ConnectionError`, raised when a connection is + reset by the peer. + Corresponds to :c:data:`errno` ``ECONNRESET``. + +.. exception:: FileExistsError + + Raised when trying to create a file or directory which already exists. + Corresponds to :c:data:`errno` ``EEXIST``. + +.. exception:: FileNotFoundError + + Raised when a file or directory is requested but doesn't exist. + Corresponds to :c:data:`errno` ``ENOENT``. + +.. exception:: InterruptedError + + Raised when a system call is interrupted by an incoming signal. + Corresponds to :c:data:`errno` ``EINTR``. + +.. exception:: IsADirectoryError + + Raised when a file operation (such as :func:`os.remove`) is requested + on a directory. + Corresponds to :c:data:`errno` ``EISDIR``. + +.. exception:: NotADirectoryError + + Raised when a directory operation (such as :func:`os.listdir`) is requested + on something which is not a directory. + Corresponds to :c:data:`errno` ``ENOTDIR``. + +.. exception:: PermissionError + + Raised when trying to run an operation without the adequate access + rights - for example filesystem permissions. + Corresponds to :c:data:`errno` ``EACCES`` and ``EPERM``. + +.. exception:: ProcessLookupError + + Raised when a given process doesn't exist. + Corresponds to :c:data:`errno` ``ESRCH``. + +.. exception:: TimeoutError + + Raised when a system function timed out at the system level. + Corresponds to :c:data:`errno` ``ETIMEDOUT``. + +.. versionadded:: 3.3 + All the above :exc:`OSError` subclasses were added. + + +.. seealso:: + + :pep:`3151` - Reworking the OS and IO exception hierarchy - Raised when the second argument of a division or modulo operation is zero. The - associated value is a string indicating the type of the operands and the - operation. +Warnings +-------- The following exceptions are used as warning categories; see the :mod:`warnings` module for more information. @@ -445,7 +621,7 @@ module for more information. .. exception:: BytesWarning - Base class for warnings related to :class:`bytes` and :class:`buffer`. + Base class for warnings related to :class:`bytes` and :class:`bytearray`. .. exception:: ResourceWarning diff --git a/Doc/library/faulthandler.rst b/Doc/library/faulthandler.rst new file mode 100644 index 0000000..3c33621 --- /dev/null +++ b/Doc/library/faulthandler.rst @@ -0,0 +1,136 @@ +:mod:`faulthandler` --- Dump the Python traceback +================================================= + +.. module:: faulthandler + :synopsis: Dump the Python traceback. + +This module contains functions to dump Python tracebacks explicitly, on a fault, +after a timeout, or on a user signal. Call :func:`faulthandler.enable` to +install fault handlers for the :const:`SIGSEGV`, :const:`SIGFPE`, +:const:`SIGABRT`, :const:`SIGBUS`, and :const:`SIGILL` signals. You can also +enable them at startup by setting the :envvar:`PYTHONFAULTHANDLER` environment +variable or by using :option:`-X` ``faulthandler`` command line option. + +The fault handler is compatible with system fault handlers like Apport or the +Windows fault handler. The module uses an alternative stack for signal handlers +if the :c:func:`sigaltstack` function is available. This allows it to dump the +traceback even on a stack overflow. + +The fault handler is called on catastrophic cases and therefore can only use +signal-safe functions (e.g. it cannot allocate memory on the heap). Because of +this limitation traceback dumping is minimal compared to normal Python +tracebacks: + +* Only ASCII is supported. The ``backslashreplace`` error handler is used on + encoding. +* Each string is limited to 500 characters. +* Only the filename, the function name and the line number are + displayed. (no source code) +* It is limited to 100 frames and 100 threads. + +By default, the Python traceback is written to :data:`sys.stderr`. To see +tracebacks, applications must be run in the terminal. A log file can +alternatively be passed to :func:`faulthandler.enable`. + +The module is implemented in C, so tracebacks can be dumped on a crash or when +Python is deadlocked. + +.. versionadded:: 3.3 + + +Dump the traceback +------------------ + +.. function:: dump_traceback(file=sys.stderr, all_threads=True) + + Dump the tracebacks of all threads into *file*. If *all_threads* is + ``False``, dump only the current thread. + + +Fault handler state +------------------- + +.. function:: enable(file=sys.stderr, all_threads=True) + + Enable the fault handler: install handlers for the :const:`SIGSEGV`, + :const:`SIGFPE`, :const:`SIGABRT`, :const:`SIGBUS` and :const:`SIGILL` + signals to dump the Python traceback. If *all_threads* is ``True``, + produce tracebacks for every running thread. Otherwise, dump only the current + thread. + +.. function:: disable() + + Disable the fault handler: uninstall the signal handlers installed by + :func:`enable`. + +.. function:: is_enabled() + + Check if the fault handler is enabled. + + +Dump the tracebacks after a timeout +----------------------------------- + +.. function:: dump_traceback_later(timeout, repeat=False, file=sys.stderr, exit=False) + + Dump the tracebacks of all threads, after a timeout of *timeout* seconds, or + every *timeout* seconds if *repeat* is ``True``. If *exit* is ``True``, call + :c:func:`_exit` with status=1 after dumping the tracebacks. (Note + :c:func:`_exit` exits the process immediately, which means it doesn't do any + cleanup like flushing file buffers.) If the function is called twice, the new + call replaces previous parameters and resets the timeout. The timer has a + sub-second resolution. + + This function is implemented using a watchdog thread and therefore is not + available if Python is compiled with threads disabled. + +.. function:: cancel_dump_traceback_later() + + Cancel the last call to :func:`dump_traceback_later`. + + +Dump the traceback on a user signal +----------------------------------- + +.. function:: register(signum, file=sys.stderr, all_threads=True, chain=False) + + Register a user signal: install a handler for the *signum* signal to dump + the traceback of all threads, or of the current thread if *all_threads* is + ``False``, into *file*. Call the previous handler if chain is ``True``. + + Not available on Windows. + +.. function:: unregister(signum) + + Unregister a user signal: uninstall the handler of the *signum* signal + installed by :func:`register`. Return ``True`` if the signal was registered, + ``False`` otherwise. + + Not available on Windows. + + +File descriptor issue +--------------------- + +:func:`enable`, :func:`dump_traceback_later` and :func:`register` keep the +file descriptor of their *file* argument. If the file is closed and its file +descriptor is reused by a new file, or if :func:`os.dup2` is used to replace +the file descriptor, the traceback will be written into a different file. Call +these functions again each time that the file is replaced. + + +Example +------- + +Example of a segmentation fault on Linux: :: + + $ python -q -X faulthandler + >>> import ctypes + >>> ctypes.string_at(0) + Fatal Python error: Segmentation fault + + Current thread 0x00007fb899f39700: + File "/home/python/cpython/Lib/ctypes/__init__.py", line 486 in string_at + File "<stdin>", line 1 in <module> + Segmentation fault + diff --git a/Doc/library/fcntl.rst b/Doc/library/fcntl.rst index 6192400..8e932fb 100644 --- a/Doc/library/fcntl.rst +++ b/Doc/library/fcntl.rst @@ -1,5 +1,5 @@ -:mod:`fcntl` --- The :func:`fcntl` and :func:`ioctl` system calls -================================================================= +:mod:`fcntl` --- The ``fcntl`` and ``ioctl`` system calls +========================================================= .. module:: fcntl :platform: Unix @@ -17,17 +17,24 @@ interface to the :c:func:`fcntl` and :c:func:`ioctl` Unix routines. All functions in this module take a file descriptor *fd* as their first argument. This can be an integer file descriptor, such as returned by ``sys.stdin.fileno()``, or a :class:`io.IOBase` object, such as ``sys.stdin`` -itself, which provides a :meth:`fileno` that returns a genuine file descriptor. +itself, which provides a :meth:`~io.IOBase.fileno` that returns a genuine file +descriptor. + +.. versionchanged:: 3.3 + Operations in this module used to raise a :exc:`IOError` where they now + raise a :exc:`OSError`. + The module defines the following functions: .. function:: fcntl(fd, op[, arg]) - Perform the requested operation on file descriptor *fd* (file objects providing - a :meth:`fileno` method are accepted as well). The operation is defined by *op* - and is operating system dependent. These codes are also found in the - :mod:`fcntl` module. The argument *arg* is optional, and defaults to the integer + Perform the operation *op* on file descriptor *fd* (file objects providing + a :meth:`~io.IOBase.fileno` method are accepted as well). The values used + for *op* are operating system dependent, and are available as constants + in the :mod:`fcntl` module, using the same names as used in the relevant C + header files. The argument *arg* is optional, and defaults to the integer value ``0``. When present, it can either be an integer value, or a string. With the argument missing or an integer value, the return value of this function is the integer return value of the C :c:func:`fcntl` call. When the argument is @@ -40,21 +47,25 @@ The module defines the following functions: larger than 1024 bytes, this is most likely to result in a segmentation violation or a more subtle data corruption. - If the :c:func:`fcntl` fails, an :exc:`IOError` is raised. + If the :c:func:`fcntl` fails, an :exc:`OSError` is raised. .. function:: ioctl(fd, op[, arg[, mutate_flag]]) - This function is identical to the :func:`fcntl` function, except that the - argument handling is even more complicated. + This function is identical to the :func:`~fcntl.fcntl` function, except + that the argument handling is even more complicated. The op parameter is limited to values that can fit in 32-bits. + Additional constants of interest for use as the *op* argument can be + found in the :mod:`termios` module, under the same names as used in + the relevant C header files. The parameter *arg* can be one of an integer, absent (treated identically to the integer ``0``), an object supporting the read-only buffer interface (most likely a plain Python string) or an object supporting the read-write buffer interface. - In all but the last case, behaviour is as for the :func:`fcntl` function. + In all but the last case, behaviour is as for the :func:`~fcntl.fcntl` + function. If a mutable buffer is passed, then the behaviour is determined by the value of the *mutate_flag* parameter. @@ -89,16 +100,16 @@ The module defines the following functions: .. function:: flock(fd, op) Perform the lock operation *op* on file descriptor *fd* (file objects providing - a :meth:`fileno` method are accepted as well). See the Unix manual + a :meth:`~io.IOBase.fileno` method are accepted as well). See the Unix manual :manpage:`flock(2)` for details. (On some systems, this function is emulated using :c:func:`fcntl`.) .. function:: lockf(fd, operation, [length, [start, [whence]]]) - This is essentially a wrapper around the :func:`fcntl` locking calls. *fd* is - the file descriptor of the file to lock or unlock, and *operation* is one of the - following values: + This is essentially a wrapper around the :func:`~fcntl.fcntl` locking calls. + *fd* is the file descriptor of the file to lock or unlock, and *operation* + is one of the following values: * :const:`LOCK_UN` -- unlock * :const:`LOCK_SH` -- acquire a shared lock @@ -107,19 +118,19 @@ The module defines the following functions: When *operation* is :const:`LOCK_SH` or :const:`LOCK_EX`, it can also be bitwise ORed with :const:`LOCK_NB` to avoid blocking on lock acquisition. If :const:`LOCK_NB` is used and the lock cannot be acquired, an - :exc:`IOError` will be raised and the exception will have an *errno* + :exc:`OSError` will be raised and the exception will have an *errno* attribute set to :const:`EACCES` or :const:`EAGAIN` (depending on the operating system; for portability, check for both values). On at least some systems, :const:`LOCK_EX` can only be used if the file descriptor refers to a file opened for writing. - *length* is the number of bytes to lock, *start* is the byte offset at which the - lock starts, relative to *whence*, and *whence* is as with :func:`fileobj.seek`, - specifically: + *length* is the number of bytes to lock, *start* is the byte offset at + which the lock starts, relative to *whence*, and *whence* is as with + :func:`io.IOBase.seek`, specifically: - * :const:`0` -- relative to the start of the file (:const:`SEEK_SET`) - * :const:`1` -- relative to the current buffer position (:const:`SEEK_CUR`) - * :const:`2` -- relative to the end of the file (:const:`SEEK_END`) + * :const:`0` -- relative to the start of the file (:data:`os.SEEK_SET`) + * :const:`1` -- relative to the current buffer position (:data:`os.SEEK_CUR`) + * :const:`2` -- relative to the end of the file (:data:`os.SEEK_END`) The default for *start* is 0, which means to start at the beginning of the file. The default for *length* is 0 which means to lock to the end of the file. The @@ -144,7 +155,8 @@ lay-out for the *lockdata* variable is system dependent --- therefore using the .. seealso:: Module :mod:`os` - If the locking flags :const:`O_SHLOCK` and :const:`O_EXLOCK` are present - in the :mod:`os` module (on BSD only), the :func:`os.open` function - provides an alternative to the :func:`lockf` and :func:`flock` functions. + If the locking flags :data:`~os.O_SHLOCK` and :data:`~os.O_EXLOCK` are + present in the :mod:`os` module (on BSD only), the :func:`os.open` + function provides an alternative to the :func:`lockf` and :func:`flock` + functions. diff --git a/Doc/library/filecmp.rst b/Doc/library/filecmp.rst index 8fd23dd..8a88f8c 100644 --- a/Doc/library/filecmp.rst +++ b/Doc/library/filecmp.rst @@ -21,11 +21,8 @@ The :mod:`filecmp` module defines the following functions: Compare the files named *f1* and *f2*, returning ``True`` if they seem equal, ``False`` otherwise. - Unless *shallow* is given and is false, files with identical :func:`os.stat` - signatures are taken to be equal. - - Files that were compared using this function will not be compared again unless - their :func:`os.stat` signature changes. + If *shallow* is true, files with identical :func:`os.stat` signatures are + taken to be equal. Otherwise, the contents of the files are compared. Note that no external programs are called from this function, giving it portability and efficiency. @@ -51,23 +48,11 @@ The :mod:`filecmp` module defines the following functions: one of the three returned lists. -Example:: - - >>> import filecmp - >>> filecmp.cmp('undoc.rst', 'undoc.rst') # doctest: +SKIP - True - >>> filecmp.cmp('undoc.rst', 'index.rst') # doctest: +SKIP - False - - .. _dircmp-objects: The :class:`dircmp` class ------------------------- -:class:`dircmp` instances are built using this constructor: - - .. class:: dircmp(a, b, ignore=None, hide=None) Construct a new directory comparison object, to compare the directories *a* and @@ -83,7 +68,7 @@ The :class:`dircmp` class .. method:: report() - Print (to ``sys.stdout``) a comparison between *a* and *b*. + Print (to :data:`sys.stdout`) a comparison between *a* and *b*. .. method:: report_partial_closure() diff --git a/Doc/library/fileinput.rst b/Doc/library/fileinput.rst index ac44311..d5a4875 100644 --- a/Doc/library/fileinput.rst +++ b/Doc/library/fileinput.rst @@ -28,7 +28,10 @@ as the first argument to :func:`.input`. A single file name is also allowed. All files are opened in text mode by default, but you can override this by specifying the *mode* parameter in the call to :func:`.input` or :class:`FileInput`. If an I/O error occurs during opening or reading a file, -:exc:`IOError` is raised. +:exc:`OSError` is raised. + +.. versionchanged:: 3.3 + :exc:`IOError` used to be raised; it is now an alias of :exc:`OSError`. If ``sys.stdin`` is used more than once, the second and further use will return no lines, except perhaps for interactive use, or if it has been explicitly reset @@ -133,11 +136,12 @@ available for subclassing as well: Class :class:`FileInput` is the implementation; its methods :meth:`filename`, :meth:`fileno`, :meth:`lineno`, :meth:`filelineno`, :meth:`isfirstline`, - :meth:`isstdin`, :meth:`nextfile` and :meth:`close` correspond to the functions - of the same name in the module. In addition it has a :meth:`readline` method - which returns the next input line, and a :meth:`__getitem__` method which - implements the sequence behavior. The sequence must be accessed in strictly - sequential order; random access and :meth:`readline` cannot be mixed. + :meth:`isstdin`, :meth:`nextfile` and :meth:`close` correspond to the + functions of the same name in the module. In addition it has a + :meth:`~io.TextIOBase.readline` method which returns the next input line, + and a :meth:`__getitem__` method which implements the sequence behavior. + The sequence must be accessed in strictly sequential order; random access + and :meth:`~io.TextIOBase.readline` cannot be mixed. With *mode* you can specify which file mode will be passed to :func:`open`. It must be one of ``'r'``, ``'rU'``, ``'U'`` and ``'rb'``. @@ -168,10 +172,6 @@ and the backup file remains around; by default, the extension is ``'.bak'`` and it is deleted when the output file is closed. In-place filtering is disabled when standard input is read. -.. note:: - - The current implementation does not work for MS-DOS 8+3 filesystems. - The two following opening hooks are provided by this module: diff --git a/Doc/library/fractions.rst b/Doc/library/fractions.rst index 59e6b1b..c2c7401 100644 --- a/Doc/library/fractions.rst +++ b/Doc/library/fractions.rst @@ -77,20 +77,31 @@ another rational number, or from a string. :class:`numbers.Rational`, and implements all of the methods and operations from that class. :class:`Fraction` instances are hashable, and should be treated as immutable. In addition, - :class:`Fraction` has the following methods: + :class:`Fraction` has the following properties and methods: .. versionchanged:: 3.2 The :class:`Fraction` constructor now accepts :class:`float` and :class:`decimal.Decimal` instances. + .. attribute:: numerator + + Numerator of the Fraction in lowest term. + + .. attribute:: denominator + + Denominator of the Fraction in lowest term. + + .. method:: from_float(flt) This class method constructs a :class:`Fraction` representing the exact value of *flt*, which must be a :class:`float`. Beware that ``Fraction.from_float(0.3)`` is not the same value as ``Fraction(3, 10)`` - .. note:: From Python 3.2 onwards, you can also construct a + .. note:: + + From Python 3.2 onwards, you can also construct a :class:`Fraction` instance directly from a :class:`float`. @@ -99,7 +110,9 @@ another rational number, or from a string. This class method constructs a :class:`Fraction` representing the exact value of *dec*, which must be a :class:`decimal.Decimal` instance. - .. note:: From Python 3.2 onwards, you can also construct a + .. note:: + + From Python 3.2 onwards, you can also construct a :class:`Fraction` instance directly from a :class:`decimal.Decimal` instance. diff --git a/Doc/library/ftplib.rst b/Doc/library/ftplib.rst index 3274f19..dcb2ac4 100644 --- a/Doc/library/ftplib.rst +++ b/Doc/library/ftplib.rst @@ -23,16 +23,17 @@ as mirroring other ftp servers. It is also used by the module Here's a sample session using the :mod:`ftplib` module:: >>> from ftplib import FTP - >>> ftp = FTP('ftp.cwi.nl') # connect to host, default port - >>> ftp.login() # user anonymous, passwd anonymous@ - >>> ftp.retrlines('LIST') # list directory contents - total 24418 - drwxrwsr-x 5 ftp-usr pdmaint 1536 Mar 20 09:48 . - dr-xr-srwt 105 ftp-usr pdmaint 1536 Mar 21 14:32 .. - -rw-r--r-- 1 ftp-usr pdmaint 5305 Mar 20 09:48 INDEX - . - . - . + >>> ftp = FTP('ftp.debian.org') # connect to host, default port + >>> ftp.login() # user anonymous, passwd anonymous@ + '230 Login successful.' + >>> ftp.cwd('debian') # change into "debian" directory + >>> ftp.retrlines('LIST') # list directory contents + -rw-rw-r-- 1 1176 1176 1063 Jun 15 10:18 README + ... + drwxr-sr-x 5 1176 1176 4096 Dec 19 2000 pool + drwxr-sr-x 4 1176 1176 4096 Nov 17 2008 project + drwxr-xr-x 3 1176 1176 4096 Oct 10 2012 tools + '226 Directory send OK.' >>> ftp.retrbinary('RETR README', open('README', 'wb').write) '226 Transfer complete.' >>> ftp.quit() @@ -40,7 +41,7 @@ Here's a sample session using the :mod:`ftplib` module:: The module defines the following items: -.. class:: FTP(host='', user='', passwd='', acct=''[, timeout]) +.. class:: FTP(host='', user='', passwd='', acct='', timeout=None, source_address=None) Return a new instance of the :class:`FTP` class. When *host* is given, the method call ``connect(host)`` is made. When *user* is given, additionally @@ -48,7 +49,8 @@ The module defines the following items: *acct* default to the empty string when not given). The optional *timeout* parameter specifies a timeout in seconds for blocking operations like the connection attempt (if is not specified, the global default timeout setting - will be used). + will be used). *source_address* is a 2-tuple ``(host, port)`` for the socket + to bind to as its source address before connecting. :class:`FTP` class supports the :keyword:`with` statement. Here is a sample on how using it: @@ -68,8 +70,11 @@ The module defines the following items: .. versionchanged:: 3.2 Support for the :keyword:`with` statement was added. + .. versionchanged:: 3.3 + *source_address* parameter was added. -.. class:: FTP_TLS(host='', user='', passwd='', acct='', [keyfile[, certfile[, context[, timeout]]]]) + +.. class:: FTP_TLS(host='', user='', passwd='', acct='', keyfile=None, certfile=None, context=None, timeout=None, source_address=None) A :class:`FTP` subclass which adds TLS support to FTP as described in :rfc:`4217`. @@ -80,10 +85,15 @@ The module defines the following items: private key and certificate chain file name for the SSL connection. *context* parameter is a :class:`ssl.SSLContext` object which allows bundling SSL configuration options, certificates and private keys into a - single (potentially long-lived) structure. + single (potentially long-lived) structure. *source_address* is a 2-tuple + ``(host, port)`` for the socket to bind to as its source address before + connecting. .. versionadded:: 3.2 + .. versionchanged:: 3.3 + *source_address* parameter was added. + Here's a sample session using the :class:`FTP_TLS` class: >>> from ftplib import FTP_TLS @@ -135,8 +145,7 @@ The module defines the following items: The set of all exceptions (as a tuple) that methods of :class:`FTP` instances may raise as a result of problems with the FTP connection (as opposed to programming errors made by the caller). This set includes the - four exceptions listed above as well as :exc:`socket.error` and - :exc:`IOError`. + four exceptions listed above as well as :exc:`OSError`. .. seealso:: @@ -174,7 +183,7 @@ followed by ``lines`` for the text version or ``binary`` for the binary version. debugging output, logging each line sent and received on the control connection. -.. method:: FTP.connect(host='', port=0[, timeout]) +.. method:: FTP.connect(host='', port=0, timeout=None, source_address=None) Connect to the given host and port. The default port number is ``21``, as specified by the FTP protocol specification. It is rarely needed to specify a @@ -182,10 +191,14 @@ followed by ``lines`` for the text version or ``binary`` for the binary version. instance; it should not be called at all if a host was given when the instance was created. All other methods can only be used after a connection has been made. - The optional *timeout* parameter specifies a timeout in seconds for the connection attempt. If no *timeout* is passed, the global default timeout setting will be used. + *source_address* is a 2-tuple ``(host, port)`` for the socket to bind to as + its source address before connecting. + + .. versionchanged:: 3.3 + *source_address* parameter was added. .. method:: FTP.getwelcome() @@ -241,13 +254,12 @@ followed by ``lines`` for the text version or ``binary`` for the binary version. Retrieve a file or directory listing in ASCII transfer mode. *cmd* should be an appropriate ``RETR`` command (see :meth:`retrbinary`) or a command such as - ``LIST``, ``NLST`` or ``MLSD`` (usually just the string ``'LIST'``). + ``LIST`` or ``NLST`` (usually just the string ``'LIST'``). ``LIST`` retrieves a list of files and information about those files. - ``NLST`` retrieves a list of file names. On some servers, ``MLSD`` retrieves - a machine readable list of files and information about those files. The - *callback* function is called for each line with a string argument containing - the line with the trailing CRLF stripped. The default *callback* prints the - line to ``sys.stdout``. + ``NLST`` retrieves a list of file names. + The *callback* function is called for each line with a string argument + containing the line with the trailing CRLF stripped. The default *callback* + prints the line to ``sys.stdout``. .. method:: FTP.set_pasv(boolean) @@ -260,7 +272,7 @@ followed by ``lines`` for the text version or ``binary`` for the binary version. Store a file in binary transfer mode. *cmd* should be an appropriate ``STOR`` command: ``"STOR filename"``. *file* is a :term:`file object` - (opened in binary mode) which is read until EOF using its :meth:`read` + (opened in binary mode) which is read until EOF using its :meth:`~io.IOBase.read` method in blocks of size *blocksize* to provide the data to be stored. The *blocksize* argument defaults to 8192. *callback* is an optional single parameter callable that is called on each block of data after it is sent. @@ -274,7 +286,7 @@ followed by ``lines`` for the text version or ``binary`` for the binary version. Store a file in ASCII transfer mode. *cmd* should be an appropriate ``STOR`` command (see :meth:`storbinary`). Lines are read until EOF from the - :term:`file object` *file* (opened in binary mode) using its :meth:`readline` + :term:`file object` *file* (opened in binary mode) using its :meth:`~io.IOBase.readline` method to provide the data to be stored. *callback* is an optional single parameter callable that is called on each line after it is sent. @@ -307,6 +319,20 @@ followed by ``lines`` for the text version or ``binary`` for the binary version. in :meth:`transfercmd`. +.. method:: FTP.mlsd(path="", facts=[]) + + List a directory in a standardized format by using MLSD command + (:rfc:`3659`). If *path* is omitted the current directory is assumed. + *facts* is a list of strings representing the type of information desired + (e.g. ``["type", "size", "perm"]``). Return a generator object yielding a + tuple of two elements for every file found in path. First element is the + file name, the second one is a dictionary containing facts about the file + name. Content of this dictionary might be limited by the *facts* argument + but server is not guaranteed to return all requested facts. + + .. versionadded:: 3.3 + + .. method:: FTP.nlst(argument[, ...]) Return a list of file names as returned by the ``NLST`` command. The @@ -314,6 +340,8 @@ followed by ``lines`` for the text version or ``binary`` for the binary version. directory). Multiple arguments can be used to pass non-standard options to the ``NLST`` command. + .. deprecated:: 3.3 use :meth:`mlsd` instead. + .. method:: FTP.dir(argument[, ...]) @@ -324,6 +352,8 @@ followed by ``lines`` for the text version or ``binary`` for the binary version. as a *callback* function as for :meth:`retrlines`; the default prints to ``sys.stdout``. This method returns ``None``. + .. deprecated:: 3.3 use :meth:`mlsd` instead. + .. method:: FTP.rename(fromname, toname) @@ -394,7 +424,16 @@ FTP_TLS Objects .. method:: FTP_TLS.auth() - Set up secure control connection by using TLS or SSL, depending on what specified in :meth:`ssl_version` attribute. + Set up secure control connection by using TLS or SSL, depending on what + specified in :meth:`ssl_version` attribute. + +.. method:: FTP_TLS.ccc() + + Revert control channel back to plaintext. This can be useful to take + advantage of firewalls that know how to handle NAT with non-secure FTP + without opening fixed ports. + + .. versionadded:: 3.3 .. method:: FTP_TLS.prot_p() @@ -403,5 +442,3 @@ FTP_TLS Objects .. method:: FTP_TLS.prot_c() Set up clear text data connection. - - diff --git a/Doc/library/functions.rst b/Doc/library/functions.rst index 93d35f8..7ed25c1 100644 --- a/Doc/library/functions.rst +++ b/Doc/library/functions.rst @@ -14,12 +14,12 @@ are always available. They are listed here in alphabetical order. :func:`all` :func:`dir` :func:`hex` :func:`next` :func:`slice` :func:`any` :func:`divmod` :func:`id` :func:`object` :func:`sorted` :func:`ascii` :func:`enumerate` :func:`input` :func:`oct` :func:`staticmethod` -:func:`bin` :func:`eval` :func:`int` :func:`open` :func:`str` +:func:`bin` :func:`eval` :func:`int` :func:`open` |func-str|_ :func:`bool` :func:`exec` :func:`isinstance` :func:`ord` :func:`sum` :func:`bytearray` :func:`filter` :func:`issubclass` :func:`pow` :func:`super` -:func:`bytes` :func:`float` :func:`iter` :func:`print` :func:`tuple` +:func:`bytes` :func:`float` :func:`iter` :func:`print` |func-tuple|_ :func:`callable` :func:`format` :func:`len` :func:`property` :func:`type` -:func:`chr` |func-frozenset|_ :func:`list` :func:`range` :func:`vars` +:func:`chr` |func-frozenset|_ |func-list|_ |func-range|_ :func:`vars` :func:`classmethod` :func:`getattr` :func:`locals` :func:`repr` :func:`zip` :func:`compile` :func:`globals` :func:`map` :func:`reversed` :func:`__import__` :func:`complex` :func:`hasattr` :func:`max` :func:`round` @@ -33,6 +33,10 @@ are always available. They are listed here in alphabetical order. .. |func-frozenset| replace:: ``frozenset()`` .. |func-memoryview| replace:: ``memoryview()`` .. |func-set| replace:: ``set()`` +.. |func-list| replace:: ``list()`` +.. |func-str| replace:: ``str()`` +.. |func-tuple| replace:: ``tuple()`` +.. |func-range| replace:: ``range()`` .. function:: abs(x) @@ -44,7 +48,7 @@ are always available. They are listed here in alphabetical order. .. function:: all(iterable) - Return True if all elements of the *iterable* are true (or if the iterable + Return ``True`` if all elements of the *iterable* are true (or if the iterable is empty). Equivalent to:: def all(iterable): @@ -56,8 +60,8 @@ are always available. They are listed here in alphabetical order. .. function:: any(iterable) - Return True if any element of the *iterable* is true. If the iterable - is empty, return False. Equivalent to:: + Return ``True`` if any element of the *iterable* is true. If the iterable + is empty, return ``False``. Equivalent to:: def any(iterable): for element in iterable: @@ -93,6 +97,7 @@ are always available. They are listed here in alphabetical order. .. index:: pair: Boolean; type +.. _func-bytearray: .. function:: bytearray([source[, encoding[, errors]]]) Return a new array of bytes. The :class:`bytearray` type is a mutable @@ -118,7 +123,10 @@ are always available. They are listed here in alphabetical order. Without an argument, an array of size 0 is created. + See also :ref:`binaryseq` and :ref:`typebytearray`. + +.. _func-bytes: .. function:: bytes([source[, encoding[, errors]]]) Return a new "bytes" object, which is an immutable sequence of integers in @@ -130,6 +138,8 @@ are always available. They are listed here in alphabetical order. Bytes objects can also be created with literals, see :ref:`strings`. + See also :ref:`binaryseq`, :ref:`typebytes`, and :ref:`bytes-methods`. + .. function:: callable(object) @@ -152,10 +162,6 @@ are always available. They are listed here in alphabetical order. 1,114,111 (0x10FFFF in base 16). :exc:`ValueError` will be raised if *i* is outside that range. - Note that on narrow Unicode builds, the result is a string of - length two for *i* greater than 65,535 (0xFFFF in hexadecimal). - - .. function:: classmethod(function) @@ -187,9 +193,9 @@ are always available. They are listed here in alphabetical order. .. function:: compile(source, filename, mode, flags=0, dont_inherit=False, optimize=-1) Compile the *source* into a code or AST object. Code objects can be executed - by :func:`exec` or :func:`eval`. *source* can either be a string or an AST - object. Refer to the :mod:`ast` module documentation for information on how - to work with AST objects. + by :func:`exec` or :func:`eval`. *source* can either be a normal string, a + byte string, or an AST object. Refer to the :mod:`ast` module documentation + for information on how to work with AST objects. The *filename* argument should give the file from which the code was read; pass some recognizable value if it wasn't read from a file (``'<string>'`` is @@ -213,8 +219,8 @@ are always available. They are listed here in alphabetical order. Future statements are specified by bits which can be bitwise ORed together to specify multiple statements. The bitfield required to specify a given feature - can be found as the :attr:`compiler_flag` attribute on the :class:`_Feature` - instance in the :mod:`__future__` module. + can be found as the :attr:`~__future__._Feature.compiler_flag` attribute on + the :class:`~__future__._Feature` instance in the :mod:`__future__` module. The argument *optimize* specifies the optimization level of the compiler; the default value of ``-1`` selects the optimization level of the interpreter as @@ -312,17 +318,18 @@ are always available. They are listed here in alphabetical order. >>> import struct >>> dir() # show the names in the module namespace - ['__builtins__', '__doc__', '__name__', 'struct'] - >>> dir(struct) # show the names in the struct module - ['Struct', '__builtins__', '__doc__', '__file__', '__name__', - '__package__', '_clearcache', 'calcsize', 'error', 'pack', 'pack_into', + ['__builtins__', '__name__', 'struct'] + >>> dir(struct) # show the names in the struct module # doctest: +SKIP + ['Struct', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', + '__initializing__', '__loader__', '__name__', '__package__', + '_clearcache', 'calcsize', 'error', 'pack', 'pack_into', 'unpack', 'unpack_from'] >>> class Shape: - def __dir__(self): - return ['area', 'perimeter', 'location'] + ... def __dir__(self): + ... return ['area', 'perimeter', 'location'] >>> s = Shape() >>> dir(s) - ['area', 'perimeter', 'location'] + ['area', 'location', 'perimeter'] .. note:: @@ -515,12 +522,12 @@ are always available. They are listed here in alphabetical order. The float type is described in :ref:`typesnumeric`. - -.. function:: format(value[, format_spec]) - .. index:: - pair: str; format single: __format__ + single: string; format() (built-in function) + + +.. function:: format(value[, format_spec]) Convert a *value* to a "formatted" representation, as controlled by *format_spec*. The interpretation of *format_spec* will depend on the type @@ -576,11 +583,16 @@ are always available. They are listed here in alphabetical order. .. function:: hash(object) - Return the hash value of the object (if it has one). Hash values are integers. - They are used to quickly compare dictionary keys during a dictionary lookup. - Numeric values that compare equal have the same hash value (even if they are of - different types, as is the case for 1 and 1.0). + Return the hash value of the object (if it has one). Hash values are + integers. They are used to quickly compare dictionary keys during a + dictionary lookup. Numeric values that compare equal have the same hash + value (even if they are of different types, as is the case for 1 and 1.0). + + .. note:: + For object's with custom :meth:`__hash__` methods, note that :func:`hash` + truncates the return value based on the bit width of the host machine. + See :meth:`__hash__` for details. .. function:: help([object]) @@ -596,9 +608,19 @@ are always available. They are listed here in alphabetical order. .. function:: hex(x) - Convert an integer number to a hexadecimal string. The result is a valid Python - expression. If *x* is not a Python :class:`int` object, it has to define an - :meth:`__index__` method that returns an integer. + Convert an integer number to a lowercase hexadecimal string + prefixed with "0x", for example: + + >>> hex(255) + '0xff' + >>> hex(-42) + '-0x2a' + + If x is not a Python :class:`int` object, it has to define an __index__() + method that returns an integer. + + See also :func:`int` for converting a hexadecimal string to an + integer using a base of 16. .. note:: @@ -623,9 +645,9 @@ are always available. They are listed here in alphabetical order. to a string (stripping a trailing newline), and returns that. When EOF is read, :exc:`EOFError` is raised. Example:: - >>> s = input('--> ') + >>> s = input('--> ') # doctest: +SKIP --> Monty Python's Flying Circus - >>> s + >>> s # doctest: +SKIP "Monty Python's Flying Circus" If the :mod:`readline` module was loaded, then :func:`input` will use it @@ -691,9 +713,11 @@ are always available. They are listed here in alphabetical order. *sentinel*, :exc:`StopIteration` will be raised, otherwise the value will be returned. + See also :ref:`typeiter`. + One useful application of the second form of :func:`iter` is to read lines of a file until a certain line is reached. The following example reads a file - until the :meth:`readline` method returns an empty string:: + until the :meth:`~io.TextIOBase.readline` method returns an empty string:: with open('mydata.txt') as fp: for line in iter(fp.readline, ''): @@ -706,16 +730,12 @@ are always available. They are listed here in alphabetical order. sequence (string, tuple or list) or a mapping (dictionary). +.. _func-list: .. function:: list([iterable]) + :noindex: - Return a list whose items are the same and in the same order as *iterable*'s - items. *iterable* may be either a sequence, a container that supports - iteration, or an iterator object. If *iterable* is already a list, a copy is - made and returned, similar to ``iterable[:]``. For instance, ``list('abc')`` - returns ``['a', 'b', 'c']`` and ``list( (1, 2, 3) )`` returns ``[1, 2, 3]``. - If no argument is given, returns a new empty list, ``[]``. - - :class:`list` is a mutable sequence type, as documented in :ref:`typesseq`. + Rather than being a function, :class:`list` is actually a mutable + sequence type, as documented in :ref:`typesseq-list` and :ref:`typesseq`. .. function:: locals() @@ -800,8 +820,8 @@ are always available. They are listed here in alphabetical order. .. note:: - :class:`object` does *not* have a :attr:`__dict__`, so you can't assign - arbitrary attributes to an instance of the :class:`object` class. + :class:`object` does *not* have a :attr:`~object.__dict__`, so you can't + assign arbitrary attributes to an instance of the :class:`object` class. .. function:: oct(x) @@ -814,10 +834,10 @@ are always available. They are listed here in alphabetical order. .. index:: single: file object; open() built-in function -.. function:: open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True) +.. function:: open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None) Open *file* and return a corresponding :term:`file object`. If the file - cannot be opened, an :exc:`IOError` is raised. + cannot be opened, an :exc:`OSError` is raised. *file* is either a string or bytes object giving the pathname (absolute or relative to the current working directory) of the file to be opened or @@ -828,17 +848,20 @@ are always available. They are listed here in alphabetical order. *mode* is an optional string that specifies the mode in which the file is opened. It defaults to ``'r'`` which means open for reading in text mode. Other common values are ``'w'`` for writing (truncating the file if it - already exists), and ``'a'`` for appending (which on *some* Unix systems, - means that *all* writes append to the end of the file regardless of the - current seek position). In text mode, if *encoding* is not specified the - encoding used is platform dependent. (For reading and writing raw bytes use - binary mode and leave *encoding* unspecified.) The available modes are: + already exists), ``'x'`` for exclusive creation and ``'a'`` for appending + (which on *some* Unix systems, means that *all* writes append to the end of + the file regardless of the current seek position). In text mode, if + *encoding* is not specified the encoding used is platform dependent: + ``locale.getpreferredencoding(False)`` is called to get the current locale + encoding. (For reading and writing raw bytes use binary mode and leave + *encoding* unspecified.) The available modes are: ========= =============================================================== Character Meaning - --------- --------------------------------------------------------------- + ========= =============================================================== ``'r'`` open for reading (default) ``'w'`` open for writing, truncating the file first + ``'x'`` open for exclusive creation, failing if the file already exists ``'a'`` open for writing, appending to the end of the file if it exists ``'b'`` binary mode ``'t'`` text mode (default) @@ -876,9 +899,9 @@ are always available. They are listed here in alphabetical order. size" and falling back on :attr:`io.DEFAULT_BUFFER_SIZE`. On many systems, the buffer will typically be 4096 or 8192 bytes long. - * "Interactive" text files (files for which :meth:`isatty` returns True) use - line buffering. Other text files use the policy described above for binary - files. + * "Interactive" text files (files for which :meth:`~io.IOBase.isatty` + returns ``True``) use line buffering. Other text files use the policy + described above for binary files. *encoding* is the name of the encoding used to decode or encode the file. This should only be used in text mode. The default encoding is platform @@ -887,16 +910,36 @@ are always available. They are listed here in alphabetical order. the list of supported encodings. *errors* is an optional string that specifies how encoding and decoding - errors are to be handled--this cannot be used in binary mode. Pass - ``'strict'`` to raise a :exc:`ValueError` exception if there is an encoding - error (the default of ``None`` has the same effect), or pass ``'ignore'`` to - ignore errors. (Note that ignoring encoding errors can lead to data loss.) - ``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted - where there is malformed data. When writing, ``'xmlcharrefreplace'`` - (replace with the appropriate XML character reference) or - ``'backslashreplace'`` (replace with backslashed escape sequences) can be - used. Any other error handling name that has been registered with - :func:`codecs.register_error` is also valid. + errors are to be handled--this cannot be used in binary mode. + A variety of standard error handlers are available, though any + error handling name that has been registered with + :func:`codecs.register_error` is also valid. The standard names + are: + + * ``'strict'`` to raise a :exc:`ValueError` exception if there is + an encoding error. The default value of ``None`` has the same + effect. + + * ``'ignore'`` ignores errors. Note that ignoring encoding errors + can lead to data loss. + + * ``'replace'`` causes a replacement marker (such as ``'?'``) to be inserted + where there is malformed data. + + * ``'surrogateescape'`` will represent any incorrect bytes as code + points in the Unicode Private Use Area ranging from U+DC80 to + U+DCFF. These private code points will then be turned back into + the same bytes when the ``surrogateescape`` error handler is used + when writing data. This is useful for processing files in an + unknown encoding. + + * ``'xmlcharrefreplace'`` is only supported when writing to a file. + Characters not supported by the encoding are replaced with the + appropriate XML character reference ``&#nnn;``. + + * ``'backslashreplace'`` (also only supported when writing) + replaces unsupported characters with Python's backslashed escape + sequences. .. index:: single: universal newlines; open() built-in function @@ -924,6 +967,29 @@ are always available. They are listed here in alphabetical order. closed. If a filename is given *closefd* has no effect and must be ``True`` (the default). + A custom opener can be used by passing a callable as *opener*. The underlying + file descriptor for the file object is then obtained by calling *opener* with + (*file*, *flags*). *opener* must return an open file descriptor (passing + :mod:`os.open` as *opener* results in functionality similar to passing + ``None``). + + The following example uses the :ref:`dir_fd <dir_fd>` parameter of the + :func:`os.open` function to open a file relative to a given directory:: + + >>> import os + >>> dir_fd = os.open('somedir', os.O_RDONLY) + >>> def opener(path, flags): + ... return os.open(path, flags, dir_fd=dir_fd) + ... + >>> with open('spamspam.txt', 'w', opener=opener) as f: + ... print('This will be written to somedir/spamspam.txt', file=f) + ... + >>> os.close(dir_fd) # don't leak a file descriptor + + .. versionchanged:: 3.3 + The *opener* parameter was added. + The ``'x'`` mode was added. + The type of :term:`file object` returned by the :func:`open` function depends on the mode. When :func:`open` is used to open a file in a text mode (``'w'``, ``'r'``, ``'wt'``, ``'rt'``, etc.), it returns a subclass of @@ -949,6 +1015,11 @@ are always available. They are listed here in alphabetical order. (where :func:`open` is declared), :mod:`os`, :mod:`os.path`, :mod:`tempfile`, and :mod:`shutil`. + .. versionchanged:: 3.3 + :exc:`IOError` used to be raised, it is now an alias of :exc:`OSError`. + :exc:`FileExistsError` is now raised if the file opened in exclusive + creation mode (``'x'``) already exists. + .. XXX works for bytes too, but should it? .. function:: ord(c) @@ -958,9 +1029,6 @@ are always available. They are listed here in alphabetical order. point of that character. For example, ``ord('a')`` returns the integer ``97`` and ``ord('\u2020')`` returns ``8224``. This is the inverse of :func:`chr`. - On wide Unicode builds, if the argument length is not one, a - :exc:`TypeError` will be raised. On narrow Unicode builds, strings - of length two are accepted when they form a UTF-16 surrogate pair. .. function:: pow(x, y[, z]) @@ -978,7 +1046,7 @@ are always available. They are listed here in alphabetical order. must be of integer types, and *y* must be non-negative. -.. function:: print(*objects, sep=' ', end='\\n', file=sys.stdout) +.. function:: print(*objects, sep=' ', end='\\n', file=sys.stdout, flush=False) Print *objects* to the stream *file*, separated by *sep* and followed by *end*. *sep*, *end* and *file*, if present, must be given as keyword @@ -991,9 +1059,12 @@ are always available. They are listed here in alphabetical order. *end*. The *file* argument must be an object with a ``write(string)`` method; if it - is not present or ``None``, :data:`sys.stdout` will be used. Output buffering - is determined by *file*. Use ``file.flush()`` to ensure, for instance, - immediate appearance on a screen. + is not present or ``None``, :data:`sys.stdout` will be used. Whether output + is buffered is usually determined by *file*, but if the *flush* keyword + argument is true, the stream is forcibly flushed. + + .. versionchanged:: 3.3 + Added the *flush* keyword argument. .. function:: property(fget=None, fset=None, fdel=None, doc=None) @@ -1035,10 +1106,10 @@ are always available. They are listed here in alphabetical order. turns the :meth:`voltage` method into a "getter" for a read-only attribute with the same name. - A property object has :attr:`getter`, :attr:`setter`, and :attr:`deleter` - methods usable as decorators that create a copy of the property with the - corresponding accessor function set to the decorated function. This is - best explained with an example:: + A property object has :attr:`~property.getter`, :attr:`~property.setter`, + and :attr:`~property.deleter` methods usable as decorators that create a + copy of the property with the corresponding accessor function set to the + decorated function. This is best explained with an example:: class C: def __init__(self): @@ -1065,63 +1136,13 @@ are always available. They are listed here in alphabetical order. ``fdel`` corresponding to the constructor arguments. -.. XXX does accept objects with __index__ too +.. _func-range: .. function:: range(stop) range(start, stop[, step]) + :noindex: - This is a versatile function to create iterables yielding arithmetic - progressions. It is most often used in :keyword:`for` loops. The arguments - must be integers. If the *step* argument is omitted, it defaults to ``1``. - If the *start* argument is omitted, it defaults to ``0``. The full form - returns an iterable of integers ``[start, start + step, start + 2 * step, - ...]``. If *step* is positive, the last element is the largest ``start + i * - step`` less than *stop*; if *step* is negative, the last element is the - smallest ``start + i * step`` greater than *stop*. *step* must not be zero - (or else :exc:`ValueError` is raised). Example: - - >>> list(range(10)) - [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] - >>> list(range(1, 11)) - [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] - >>> list(range(0, 30, 5)) - [0, 5, 10, 15, 20, 25] - >>> list(range(0, 10, 3)) - [0, 3, 6, 9] - >>> list(range(0, -10, -1)) - [0, -1, -2, -3, -4, -5, -6, -7, -8, -9] - >>> list(range(0)) - [] - >>> list(range(1, 0)) - [] - - Range objects implement the :class:`collections.Sequence` ABC, and provide - features such as containment tests, element index lookup, slicing and - support for negative indices (see :ref:`typesseq`): - - >>> r = range(0, 20, 2) - >>> r - range(0, 20, 2) - >>> 11 in r - False - >>> 10 in r - True - >>> r.index(10) - 5 - >>> r[5] - 10 - >>> r[:5] - range(0, 10, 2) - >>> r[-1] - 18 - - Ranges containing absolute values larger than :data:`sys.maxsize` are permitted - but some features (such as :func:`len`) will raise :exc:`OverflowError`. - - .. versionchanged:: 3.2 - Implement the Sequence ABC. - Support slicing and negative indices. - Test integers for membership in constant time instead of iterating - through all items. + Rather than being a function, :class:`range` is actually an immutable + sequence type, as documented in :ref:`typesseq-range` and :ref:`typesseq`. .. function:: repr(object) @@ -1194,13 +1215,13 @@ are always available. They are listed here in alphabetical order. Return a :term:`slice` object representing the set of indices specified by ``range(start, stop, step)``. The *start* and *step* arguments default to - ``None``. Slice objects have read-only data attributes :attr:`start`, - :attr:`stop` and :attr:`step` which merely return the argument values (or their - default). They have no other explicit functionality; however they are used by - Numerical Python and other third party extensions. Slice objects are also - generated when extended indexing syntax is used. For example: - ``a[start:stop:step]`` or ``a[start:stop, i]``. See :func:`itertools.islice` - for an alternate version that returns an iterator. + ``None``. Slice objects have read-only data attributes :attr:`~slice.start`, + :attr:`~slice.stop` and :attr:`~slice.step` which merely return the argument + values (or their default). They have no other explicit functionality; + however they are used by Numerical Python and other third party extensions. + Slice objects are also generated when extended indexing syntax is used. For + example: ``a[start:stop:step]`` or ``a[start:stop, i]``. See + :func:`itertools.islice` for an alternate version that returns an iterator. .. function:: sorted(iterable[, key][, reverse]) @@ -1250,47 +1271,15 @@ are always available. They are listed here in alphabetical order. single: string; str() (built-in function) +.. _func-str: .. function:: str(object='') str(object=b'', encoding='utf-8', errors='strict') + :noindex: - Return a :ref:`string <typesseq>` version of *object*. If *object* is not - provided, returns the empty string. Otherwise, the behavior of ``str()`` - depends on whether *encoding* or *errors* is given, as follows. - - If neither *encoding* nor *errors* is given, ``str(object)`` returns - :meth:`object.__str__() <object.__str__>`, which is the "informal" or nicely - printable string representation of *object*. For string objects, this is - the string itself. If *object* does not have a :meth:`~object.__str__` - method, then :func:`str` falls back to returning - :meth:`repr(object) <repr>`. + Return a :class:`str` version of *object*. See :func:`str` for details. - .. index:: - single: buffer protocol; str() (built-in function) - single: bytes; str() (built-in function) - - If at least one of *encoding* or *errors* is given, *object* should be a - :class:`bytes` or :class:`bytearray` object, or more generally any object - that supports the :ref:`buffer protocol <bufferobjects>`. In this case, if - *object* is a :class:`bytes` (or :class:`bytearray`) object, then - ``str(bytes, encoding, errors)`` is equivalent to - :meth:`bytes.decode(encoding, errors) <bytes.decode>`. Otherwise, the bytes - object underlying the buffer object is obtained before calling - :meth:`bytes.decode`. See the :ref:`typesseq` section, the - :ref:`typememoryview` section, and :ref:`bufferobjects` for information on - buffer objects. - - Passing a :class:`bytes` object to :func:`str` without the *encoding* - or *errors* arguments falls under the first case of returning the informal - string representation (see also the :option:`-b` command-line option to - Python). For example:: - - >>> str(b'Zoot!') - "b'Zoot!'" - - ``str`` is a built-in :term:`type`. For more information on the string - type and its methods, see the :ref:`typesseq` and :ref:`string-methods` - sections. To output formatted strings, see the :ref:`string-formatting` - section. In addition, see the :ref:`stringservices` section. + ``str`` is the built-in string :term:`class`. For general information + about strings, see :ref:`textseq`. .. function:: sum(iterable[, start]) @@ -1312,9 +1301,10 @@ are always available. They are listed here in alphabetical order. been overridden in a class. The search order is same as that used by :func:`getattr` except that the *type* itself is skipped. - The :attr:`__mro__` attribute of the *type* lists the method resolution - search order used by both :func:`getattr` and :func:`super`. The attribute - is dynamic and can change whenever the inheritance hierarchy is updated. + The :attr:`~class.__mro__` attribute of the *type* lists the method + resolution search order used by both :func:`getattr` and :func:`super`. The + attribute is dynamic and can change whenever the inheritance hierarchy is + updated. If the second argument is omitted, the super object returned is unbound. If the second argument is an object, ``isinstance(obj, type)`` must be true. If @@ -1350,26 +1340,24 @@ are always available. They are listed here in alphabetical order. Accordingly, :func:`super` is undefined for implicit lookups using statements or operators such as ``super()[name]``. - Also note that :func:`super` is not limited to use inside methods. The two - argument form specifies the arguments exactly and makes the appropriate - references. The zero argument form automatically searches the stack frame - for the class (``__class__``) and the first argument. + Also note that, aside from the zero argument form, :func:`super` is not + limited to use inside methods. The two argument form specifies the + arguments exactly and makes the appropriate references. The zero + argument form only works inside a class definition, as the compiler fills + in the necessary details to correctly retrieve the class being defined, + as well as accessing the current instance for ordinary methods. For practical suggestions on how to design cooperative classes using :func:`super`, see `guide to using super() <http://rhettinger.wordpress.com/2011/05/26/super-considered-super/>`_. +.. _func-tuple: .. function:: tuple([iterable]) + :noindex: - Return a tuple whose items are the same and in the same order as *iterable*'s - items. *iterable* may be a sequence, a container that supports iteration, or an - iterator object. If *iterable* is already a tuple, it is returned unchanged. - For instance, ``tuple('abc')`` returns ``('a', 'b', 'c')`` and ``tuple([1, 2, - 3])`` returns ``(1, 2, 3)``. If no argument is given, returns a new empty - tuple, ``()``. - - :class:`tuple` is an immutable sequence type, as documented in :ref:`typesseq`. + Rather than being a function, :class:`tuple` is actually an immutable + sequence type, as documented in :ref:`typesseq-tuple` and :ref:`typesseq`. .. function:: type(object) @@ -1379,7 +1367,8 @@ are always available. They are listed here in alphabetical order. With one argument, return the type of an *object*. The return value is a - type object and generally the same object as returned by ``object.__class__``. + type object and generally the same object as returned by + :attr:`object.__class__ <instance.__class__>`. The :func:`isinstance` built-in function is recommended for testing the type of an object, because it takes subclasses into account. @@ -1387,28 +1376,34 @@ are always available. They are listed here in alphabetical order. With three arguments, return a new type object. This is essentially a dynamic form of the :keyword:`class` statement. The *name* string is the - class name and becomes the :attr:`__name__` attribute; the *bases* tuple - itemizes the base classes and becomes the :attr:`__bases__` attribute; - and the *dict* dictionary is the namespace containing definitions for class - body and becomes the :attr:`__dict__` attribute. For example, the - following two statements create identical :class:`type` objects: + class name and becomes the :attr:`~class.__name__` attribute; the *bases* + tuple itemizes the base classes and becomes the :attr:`~class.__bases__` + attribute; and the *dict* dictionary is the namespace containing definitions + for class body and becomes the :attr:`~object.__dict__` attribute. For + example, the following two statements create identical :class:`type` objects: >>> class X: ... a = 1 ... >>> X = type('X', (object,), dict(a=1)) + See also :ref:`bltin-type-objects`. + .. function:: vars([object]) - Without an argument, act like :func:`locals`. + Return the :attr:`~object.__dict__` attribute for a module, class, instance, + or any other object with a :attr:`__dict__` attribute. - With a module, class or class instance object as argument (or anything else that - has a :attr:`__dict__` attribute), return that attribute. + Objects such as modules and instances have an updateable :attr:`__dict__` + attribute; however, other objects may have write restrictions on their + :attr:`__dict__` attributes (for example, classes use a + dictproxy to prevent direct dictionary updates). + + Without an argument, :func:`vars` acts like :func:`locals`. Note, the + locals dictionary is only useful for reads since updates to the locals + dictionary are ignored. - .. note:: - The returned dictionary should not be modified: - the effects on the corresponding symbol table are undefined. [#]_ .. function:: zip(*iterables) @@ -1454,7 +1449,7 @@ are always available. They are listed here in alphabetical order. True -.. function:: __import__(name, globals={}, locals={}, fromlist=[], level=-1) +.. function:: __import__(name, globals=None, locals=None, fromlist=(), level=0) .. index:: statement: import @@ -1468,9 +1463,11 @@ are always available. They are listed here in alphabetical order. This function is invoked by the :keyword:`import` statement. It can be replaced (by importing the :mod:`builtins` module and assigning to ``builtins.__import__``) in order to change semantics of the - :keyword:`import` statement, but nowadays it is usually simpler to use import - hooks (see :pep:`302`). Direct use of :func:`__import__` is rare, except in - cases where you want to import a module whose name is only known at runtime. + :keyword:`import` statement, but doing so is **strongly** discouraged as it + is usually simpler to use import hooks (see :pep:`302`) to attain the same + goals and does not cause issues with code which assumes the default import + implementation is in use. Direct use of :func:`__import__` is also + discouraged in favor of :func:`importlib.import_module`. The function imports the module *name*, potentially using the given *globals* and *locals* to determine how to interpret the name in a package context. @@ -1479,13 +1476,11 @@ are always available. They are listed here in alphabetical order. not use its *locals* argument at all, and uses its *globals* only to determine the package context of the :keyword:`import` statement. - *level* specifies whether to use absolute or relative imports. ``0`` - means only perform absolute imports. Positive values for *level* indicate the - number of parent directories to search relative to the directory of the - module calling :func:`__import__`. Negative values attempt both an implicit - relative import and an absolute import (usage of negative values for *level* - are strongly discouraged as future versions of Python do not support such - values). Import statements only use values of 0 or greater. + *level* specifies whether to use absolute or relative imports. ``0`` (the + default) means only perform absolute imports. Positive values for + *level* indicate the number of parent directories to search relative to the + directory of the module calling :func:`__import__` (see :pep:`328` for the + details). When the *name* variable is of the form ``package.module``, normally, the top-level package (the name up till the first dot) is returned, *not* the @@ -1518,13 +1513,13 @@ are always available. They are listed here in alphabetical order. If you simply want to import a module (potentially within a package) by name, use :func:`importlib.import_module`. + .. versionchanged:: 3.3 + Negative values for *level* are no longer supported (which also changes + the default value to 0). + .. rubric:: Footnotes .. [#] Note that the parser only accepts the Unix-style end of line convention. If you are reading the code from a file, make sure to use newline conversion mode to convert Windows or Mac-style newlines. - -.. [#] In the current implementation, local variable bindings cannot normally be - affected this way, but variables retrieved from other scopes (such as modules) - can be. This may change. diff --git a/Doc/library/functools.rst b/Doc/library/functools.rst index 04743d3..3c946e3 100644 --- a/Doc/library/functools.rst +++ b/Doc/library/functools.rst @@ -40,7 +40,7 @@ The :mod:`functools` module defines the following functions: .. versionadded:: 3.2 -.. decorator:: lru_cache(maxsize=100) +.. decorator:: lru_cache(maxsize=128, typed=False) Decorator to wrap a function with a memoizing callable that saves up to the *maxsize* most recent calls. It can save time when an expensive or I/O bound @@ -49,8 +49,13 @@ The :mod:`functools` module defines the following functions: Since a dictionary is used to cache results, the positional and keyword arguments to the function must be hashable. - If *maxsize* is set to None, the LRU feature is disabled and the cache - can grow without bound. + If *maxsize* is set to None, the LRU feature is disabled and the cache can + grow without bound. The LRU feature performs best when *maxsize* is a + power-of-two. + + If *typed* is set to True, function arguments of different types will be + cached separately. For example, ``f(3)`` and ``f(3.0)`` will be treated + as distinct calls with distinct results. To help measure the effectiveness of the cache and tune the *maxsize* parameter, the wrapped function is instrumented with a :func:`cache_info` @@ -67,14 +72,14 @@ The :mod:`functools` module defines the following functions: An `LRU (least recently used) cache <http://en.wikipedia.org/wiki/Cache_algorithms#Least_Recently_Used>`_ works - best when more recent calls are the best predictors of upcoming calls (for - example, the most popular articles on a news server tend to change daily). + best when the most recent calls are the best predictors of upcoming calls (for + example, the most popular articles on a news server tend to change each day). The cache's size limit assures that the cache does not grow without bound on long-running processes such as web servers. Example of an LRU cache for static web content:: - @lru_cache(maxsize=20) + @lru_cache(maxsize=32) def get_pep(num): 'Retrieve text of a Python Enhancement Proposal' resource = 'http://www.python.org/dev/peps/pep-%04d/' % num @@ -88,8 +93,8 @@ The :mod:`functools` module defines the following functions: ... pep = get_pep(n) ... print(n, len(pep)) - >>> print(get_pep.cache_info()) - CacheInfo(hits=3, misses=8, maxsize=20, currsize=8) + >>> get_pep.cache_info() + CacheInfo(hits=3, misses=8, maxsize=32, currsize=8) Example of efficiently computing `Fibonacci numbers <http://en.wikipedia.org/wiki/Fibonacci_number>`_ @@ -103,14 +108,17 @@ The :mod:`functools` module defines the following functions: return n return fib(n-1) + fib(n-2) - >>> print([fib(n) for n in range(16)]) + >>> [fib(n) for n in range(16)] [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610] - >>> print(fib.cache_info()) + >>> fib.cache_info() CacheInfo(hits=28, misses=16, maxsize=None, currsize=16) .. versionadded:: 3.2 + .. versionchanged:: 3.3 + Added the *typed* option. + .. decorator:: total_ordering Given a class defining one or more rich comparison ordering methods, this @@ -177,6 +185,18 @@ The :mod:`functools` module defines the following functions: a default when the sequence is empty. If *initializer* is not given and *sequence* contains only one item, the first item is returned. + Equivalent to:: + + def reduce(function, iterable, initializer=None): + it = iter(iterable) + if initializer is None: + value = next(it) + else: + value = initializer + for element in it: + value = function(value, element) + return value + .. function:: update_wrapper(wrapper, wrapped, assigned=WRAPPER_ASSIGNMENTS, updated=WRAPPER_UPDATES) diff --git a/Doc/library/gc.rst b/Doc/library/gc.rst index 0281bb7..71449a3 100644 --- a/Doc/library/gc.rst +++ b/Doc/library/gc.rst @@ -121,8 +121,8 @@ The :mod:`gc` module provides the following functions: Return a list of objects directly referred to by any of the arguments. The referents returned are those objects visited by the arguments' C-level - :attr:`tp_traverse` methods (if any), and may not be all objects actually - directly reachable. :attr:`tp_traverse` methods are supported only by objects + :c:member:`~PyTypeObject.tp_traverse` methods (if any), and may not be all objects actually + directly reachable. :c:member:`~PyTypeObject.tp_traverse` methods are supported only by objects that support garbage collection, and are only required to visit objects that may be involved in a cycle. So, for example, if an integer is directly reachable from an argument, that integer object may or may not appear in the result list. @@ -130,8 +130,8 @@ The :mod:`gc` module provides the following functions: .. function:: is_tracked(obj) - Returns True if the object is currently tracked by the garbage collector, - False otherwise. As a general rule, instances of atomic types aren't + Returns ``True`` if the object is currently tracked by the garbage collector, + ``False`` otherwise. As a general rule, instances of atomic types aren't tracked and instances of non-atomic types (containers, user-defined objects...) are. However, some type-specific optimizations can be present in order to suppress the garbage collector footprint of simple instances @@ -153,8 +153,8 @@ The :mod:`gc` module provides the following functions: .. versionadded:: 3.1 -The following variable is provided for read-only access (you can mutate its -value but should not rebind it): +The following variables are provided for read-only access (you can mutate the +values but should not rebind them): .. data:: garbage @@ -183,6 +183,41 @@ value but should not rebind it): :const:`DEBUG_UNCOLLECTABLE` is set, in addition all uncollectable objects are printed. +.. data:: callbacks + + A list of callbacks that will be invoked by the garbage collector before and + after collection. The callbacks will be called with two arguments, + *phase* and *info*. + + *phase* can be one of two values: + + "start": The garbage collection is about to start. + + "stop": The garbage collection has finished. + + *info* is a dict providing more information for the callback. The following + keys are currently defined: + + "generation": The oldest generation being collected. + + "collected": When *phase* is "stop", the number of objects + successfully collected. + + "uncollectable": When *phase* is "stop", the number of objects + that could not be collected and were put in :data:`garbage`. + + Applications can add their own callbacks to this list. The primary + use cases are: + + Gathering statistics about garbage collection, such as how often + various generations are collected, and how long the collection + takes. + + Allowing applications to identify and clear their own uncollectable + types when they appear in :data:`garbage`. + + .. versionadded:: 3.3 + The following constants are provided for use with :func:`set_debug`: diff --git a/Doc/library/getopt.rst b/Doc/library/getopt.rst index b6ab3df..f9a1e53 100644 --- a/Doc/library/getopt.rst +++ b/Doc/library/getopt.rst @@ -10,6 +10,7 @@ -------------- .. note:: + The :mod:`getopt` module is a parser for command line options whose API is designed to be familiar to users of the C :c:func:`getopt` function. Users who are unfamiliar with the C :c:func:`getopt` function or who would like to write diff --git a/Doc/library/gettext.rst b/Doc/library/gettext.rst index 0fa022c..982780f 100644 --- a/Doc/library/gettext.rst +++ b/Doc/library/gettext.rst @@ -3,8 +3,8 @@ .. module:: gettext :synopsis: Multilingual internationalization services. -.. moduleauthor:: Barry A. Warsaw <barry@zope.com> -.. sectionauthor:: Barry A. Warsaw <barry@zope.com> +.. moduleauthor:: Barry A. Warsaw <barry@python.org> +.. sectionauthor:: Barry A. Warsaw <barry@python.org> **Source code:** :source:`Lib/gettext.py` @@ -94,9 +94,10 @@ class-based API instead. The Plural formula is taken from the catalog header. It is a C or Python expression that has a free variable *n*; the expression evaluates to the index - of the plural in the catalog. See the GNU gettext documentation for the precise - syntax to be used in :file:`.po` files and the formulas for a variety of - languages. + of the plural in the catalog. See + `the GNU gettext documentation <https://www.gnu.org/software/gettext/manual/gettext.html>`__ + for the precise syntax to be used in :file:`.po` files and the + formulas for a variety of languages. .. function:: lngettext(singular, plural, n) @@ -185,10 +186,13 @@ class can also install themselves in the built-in namespace as the function translation object from the cache; the actual instance data is still shared with the cache. - If no :file:`.mo` file is found, this function raises :exc:`IOError` if + If no :file:`.mo` file is found, this function raises :exc:`OSError` if *fallback* is false (which is the default), and returns a :class:`NullTranslations` instance if *fallback* is true. + .. versionchanged:: 3.3 + :exc:`IOError` used to be raised instead of :exc:`OSError`. + .. function:: install(domain, localedir=None, codeset=None, names=None) @@ -342,7 +346,7 @@ The entire set of key/value pairs are placed into a dictionary and set as the If the :file:`.mo` file's magic number is invalid, or if other problems occur while reading the file, instantiating a :class:`GNUTranslations` class can raise -:exc:`IOError`. +:exc:`OSError`. The following methods are overridden from the base class implementation: @@ -448,35 +452,42 @@ it in ``_('...')`` --- that is, a call to the function :func:`_`. For example:: In this example, the string ``'writing a log message'`` is marked as a candidate for translation, while the strings ``'mylog.txt'`` and ``'w'`` are not. -The Python distribution comes with two tools which help you generate the message -catalogs once you've prepared your source code. These may or may not be -available from a binary distribution, but they can be found in a source -distribution, in the :file:`Tools/i18n` directory. - -The :program:`pygettext` [#]_ program scans all your Python source code looking -for the strings you previously marked as translatable. It is similar to the GNU -:program:`gettext` program except that it understands all the intricacies of -Python source code, but knows nothing about C or C++ source code. You don't -need GNU ``gettext`` unless you're also going to be translating C code (such as -C extension modules). - -:program:`pygettext` generates textual Uniforum-style human readable message -catalog :file:`.pot` files, essentially structured human readable files which -contain every marked string in the source code, along with a placeholder for the -translation strings. :program:`pygettext` is a command line script that supports -a similar command line interface as :program:`xgettext`; for details on its use, -run:: - - pygettext.py --help - -Copies of these :file:`.pot` files are then handed over to the individual human -translators who write language-specific versions for every supported natural -language. They send you back the filled in language-specific versions as a -:file:`.po` file. Using the :program:`msgfmt.py` [#]_ program (in the -:file:`Tools/i18n` directory), you take the :file:`.po` files from your -translators and generate the machine-readable :file:`.mo` binary catalog files. -The :file:`.mo` files are what the :mod:`gettext` module uses for the actual -translation processing during run-time. +There are a few tools to extract the strings meant for translation. +The original GNU :program:`gettext` only supported C or C++ source +code but its extended version :program:`xgettext` scans code written +in a number of languages, including Python, to find strings marked as +translatable. `Babel <http://babel.pocoo.org/>`__ is a Python +internationalization library that includes a :file:`pybabel` script to +extract and compile message catalogs. François Pinard's program +called :program:`xpot` does a similar job and is available as part of +his `po-utils package <http://po-utils.progiciels-bpi.ca/>`__. + +(Python also includes pure-Python versions of these programs, called +:program:`pygettext.py` and :program:`msgfmt.py`; some Python distributions +will install them for you. :program:`pygettext.py` is similar to +:program:`xgettext`, but only understands Python source code and +cannot handle other programming languages such as C or C++. +:program:`pygettext.py` supports a command-line interface similar to +:program:`xgettext`; for details on its use, run ``pygettext.py +--help``. :program:`msgfmt.py` is binary compatible with GNU +:program:`msgfmt`. With these two programs, you may not need the GNU +:program:`gettext` package to internationalize your Python +applications.) + +:program:`xgettext`, :program:`pygettext`, and similar tools generate +:file:`.po` files that are message catalogs. They are structured +human-readable files that contain every marked string in the source +code, along with a placeholder for the translated versions of these +strings. + +Copies of these :file:`.po` files are then handed over to the +individual human translators who write translations for every +supported natural language. They send back the completed +language-specific versions as a :file:`<language-name>.po` file that's +compiled into a machine-readable :file:`.mo` binary catalog file using +the :program:`msgfmt` program. The :file:`.mo` files are used by the +:mod:`gettext` module for the actual translation processing at +run-time. How you use the :mod:`gettext` module in your code depends on whether you are internationalizing a single module or your entire application. The next two @@ -514,7 +525,7 @@ driver file of your application:: import gettext gettext.install('myapplication') -If you need to set the locale directory, you can pass these into the +If you need to set the locale directory, you can pass it into the :func:`install` function:: import gettext @@ -587,7 +598,8 @@ care, though if you have a previous definition of :func:`_` in the local namespace. Note that the second use of :func:`_` will not identify "a" as being -translatable to the :program:`pygettext` program, since it is not a string. +translatable to the :program:`gettext` program, because the parameter +is not a string literal. Another way to handle this is with the following example:: @@ -603,11 +615,14 @@ Another way to handle this is with the following example:: for a in animals: print(_(a)) -In this case, you are marking translatable strings with the function :func:`N_`, -[#]_ which won't conflict with any definition of :func:`_`. However, you will -need to teach your message extraction program to look for translatable strings -marked with :func:`N_`. :program:`pygettext` and :program:`xpot` both support -this through the use of command line switches. +In this case, you are marking translatable strings with the function +:func:`N_`, which won't conflict with any definition of :func:`_`. +However, you will need to teach your message extraction program to +look for translatable strings marked with :func:`N_`. :program:`xgettext`, +:program:`pygettext`, ``pybabel extract``, and :program:`xpot` all +support this through the use of the :option:`-k` command-line switch. +The choice of :func:`N_` here is totally arbitrary; it could have just +as easily been :func:`MarkThisStringForTranslation`. Acknowledgements @@ -642,16 +657,3 @@ implementations, and valuable experience to the creation of this module: absolute path at the start of your application. .. [#] See the footnote for :func:`bindtextdomain` above. - -.. [#] François Pinard has written a program called :program:`xpot` which does a - similar job. It is available as part of his `po-utils package - <http://po-utils.progiciels-bpi.ca/>`_. - -.. [#] :program:`msgfmt.py` is binary compatible with GNU :program:`msgfmt` except that - it provides a simpler, all-Python implementation. With this and - :program:`pygettext.py`, you generally won't need to install the GNU - :program:`gettext` package to internationalize your Python applications. - -.. [#] The choice of :func:`N_` here is totally arbitrary; it could have just as easily - been :func:`MarkThisStringForTranslation`. - diff --git a/Doc/library/gzip.rst b/Doc/library/gzip.rst index abbd018..354deed 100644 --- a/Doc/library/gzip.rst +++ b/Doc/library/gzip.rst @@ -13,9 +13,11 @@ like the GNU programs :program:`gzip` and :program:`gunzip` would. The data compression is provided by the :mod:`zlib` module. -The :mod:`gzip` module provides the :class:`GzipFile` class. The :class:`GzipFile` -class reads and writes :program:`gzip`\ -format files, automatically compressing -or decompressing the data so that it looks like an ordinary :term:`file object`. +The :mod:`gzip` module provides the :class:`GzipFile` class, as well as the +:func:`.open`, :func:`compress` and :func:`decompress` convenience functions. +The :class:`GzipFile` class reads and writes :program:`gzip`\ -format files, +automatically compressing or decompressing the data so that it looks like an +ordinary :term:`file object`. Note that additional file formats which can be decompressed by the :program:`gzip` and :program:`gunzip` programs, such as those produced by @@ -24,6 +26,34 @@ Note that additional file formats which can be decompressed by the The module defines the following items: +.. function:: open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None) + + Open a gzip-compressed file in binary or text mode, returning a :term:`file + object`. + + The *filename* argument can be an actual filename (a :class:`str` or + :class:`bytes` object), or an existing file object to read from or write to. + + The *mode* argument can be any of ``'r'``, ``'rb'``, ``'a'``, ``'ab'``, + ``'w'``, or ``'wb'`` for binary mode, or ``'rt'``, ``'at'``, or ``'wt'`` for + text mode. The default is ``'rb'``. + + The *compresslevel* argument is an integer from 0 to 9, as for the + :class:`GzipFile` constructor. + + For binary mode, this function is equivalent to the :class:`GzipFile` + constructor: ``GzipFile(filename, mode, compresslevel)``. In this case, the + *encoding*, *errors* and *newline* arguments must not be provided. + + For text mode, a :class:`GzipFile` object is created, and wrapped in an + :class:`io.TextIOWrapper` instance with the specified encoding, error + handling behavior, and line ending(s). + + .. versionchanged:: 3.3 + Added support for *filename* being a file object, support for text mode, + and the *encoding*, *errors* and *newline* arguments. + + .. class:: GzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None) Constructor for the :class:`GzipFile` class, which simulates most of the @@ -32,12 +62,12 @@ The module defines the following items: value. The new class instance is based on *fileobj*, which can be a regular file, a - :class:`StringIO` object, or any other object which simulates a file. It + :class:`io.BytesIO` object, or any other object which simulates a file. It defaults to ``None``, in which case *filename* is opened to provide a file object. When *fileobj* is not ``None``, the *filename* argument is only used to be - included in the :program:`gzip` file header, which may includes the original + included in the :program:`gzip` file header, which may include the original filename of the uncompressed file. It defaults to the filename of *fileobj*, if discernible; otherwise, it defaults to the empty string, and in this case the original filename is not included in the header. @@ -46,9 +76,9 @@ The module defines the following items: or ``'wb'``, depending on whether the file will be read or written. The default is the mode of *fileobj* if discernible; otherwise, the default is ``'rb'``. - Note that the file is always opened in binary mode; text mode is not - supported. If you need to read a compressed file in text mode, wrap your - :class:`GzipFile` with an :class:`io.TextIOWrapper`. + Note that the file is always opened in binary mode. To open a compressed file + in text mode, use :func:`.open` (or wrap your :class:`GzipFile` with an + :class:`io.TextIOWrapper`). The *compresslevel* argument is an integer from ``0`` to ``9`` controlling the level of compression; ``1`` is fastest and produces the least @@ -72,7 +102,7 @@ The module defines the following items: :class:`GzipFile` supports the :class:`io.BufferedIOBase` interface, including iteration and the :keyword:`with` statement. Only the - :meth:`read1` and :meth:`truncate` methods aren't implemented. + :meth:`truncate` method isn't implemented. :class:`GzipFile` also provides the following method: @@ -83,23 +113,23 @@ The module defines the following items: the call. The number of bytes returned may be more or less than requested. + .. note:: While calling :meth:`peek` does not change the file position of + the :class:`GzipFile`, it may change the position of the underlying + file object (e.g. if the :class:`GzipFile` was constructed with the + *fileobj* parameter). + .. versionadded:: 3.2 .. versionchanged:: 3.1 - Support for the :keyword:`with` statement was added. + Support for the :keyword:`with` statement was added, along with the + *mtime* argument. .. versionchanged:: 3.2 - Support for zero-padded files was added. - - .. versionchanged:: 3.2 - Support for unseekable files was added. - + Support for zero-padded and unseekable files was added. -.. function:: open(filename, mode='rb', compresslevel=9) + .. versionchanged:: 3.3 + The :meth:`io.BufferedIOBase.read1` method is now implemented. - This is a shorthand for ``GzipFile(filename,`` ``mode,`` ``compresslevel)``. - The *filename* argument is required; *mode* defaults to ``'rb'`` and - *compresslevel* defaults to ``9``. .. function:: compress(data, compresslevel=9) diff --git a/Doc/library/hashlib.rst b/Doc/library/hashlib.rst index bc8ab2c..a693f7e 100644 --- a/Doc/library/hashlib.rst +++ b/Doc/library/hashlib.rst @@ -23,29 +23,31 @@ algorithm (defined in Internet :rfc:`1321`). The terms "secure hash" and digests. The modern term is secure hash. .. note:: - If you want the adler32 or crc32 hash functions they are available in + + If you want the adler32 or crc32 hash functions, they are available in the :mod:`zlib` module. .. warning:: - Some algorithms have known hash collision weaknesses, see the FAQ at the end. + Some algorithms have known hash collision weaknesses, refer to the "See + also" section at the end. There is one constructor method named for each type of :dfn:`hash`. All return a hash object with the same simple interface. For example: use :func:`sha1` to -create a SHA1 hash object. You can now feed this object with objects conforming -to the buffer interface (normally :class:`bytes` objects) using the -:meth:`update` method. At any point you can ask it for the :dfn:`digest` of the +create a SHA1 hash object. You can now feed this object with :term:`bytes-like +object`\ s (normally :class:`bytes`) using the :meth:`update` method. +At any point you can ask it for the :dfn:`digest` of the concatenation of the data fed to it so far using the :meth:`digest` or :meth:`hexdigest` methods. .. note:: - For better multithreading performance, the Python GIL is released for - strings of more than 2047 bytes at object creation or on update. + For better multithreading performance, the Python :term:`GIL` is released for + data larger than 2047 bytes at object creation or on update. .. note:: - Feeding string objects is to :meth:`update` is not supported, as hashes work + Feeding string objects into :meth:`update` is not supported, as hashes work on bytes, not on characters. .. index:: single: OpenSSL; (use in module hashlib) @@ -93,18 +95,18 @@ Hashlib provides the following constant attributes: .. data:: algorithms_guaranteed - Contains the names of the hash algorithms guaranteed to be supported + A set containing the names of the hash algorithms guaranteed to be supported by this module on all platforms. .. versionadded:: 3.2 .. data:: algorithms_available - Contains the names of the hash algorithms that are available - in the running Python interpreter. These names will be recognized - when passed to :func:`new`. :attr:`algorithms_guaranteed` - will always be a subset. Duplicate algorithms with different - name formats may appear in this set (thanks to OpenSSL). + A set containing the names of the hash algorithms that are available in the + running Python interpreter. These names will be recognized when passed to + :func:`new`. :attr:`algorithms_guaranteed` will always be a subset. The + same algorithm may appear multiple times in this set under different names + (thanks to OpenSSL). .. versionadded:: 3.2 @@ -132,7 +134,7 @@ A hash object has the following methods: .. versionchanged:: 3.1 The Python GIL is released to allow other threads to run while hash - updates on data larger than 2048 bytes is taking place when using hash + updates on data larger than 2047 bytes is taking place when using hash algorithms supplied by OpenSSL. diff --git a/Doc/library/heapq.rst b/Doc/library/heapq.rst index 768dfdc..4f1a682 100644 --- a/Doc/library/heapq.rst +++ b/Doc/library/heapq.rst @@ -246,7 +246,7 @@ A nice feature of this sort is that you can efficiently insert new items while the sort is going on, provided that the inserted items are not "better" than the last 0'th element you extracted. This is especially useful in simulation contexts, where the tree holds all incoming events, and the "win" condition -means the smallest scheduled time. When an event schedule other events for +means the smallest scheduled time. When an event schedules other events for execution, they are scheduled into the future, so they can easily go into the heap. So, a heap is a good structure for implementing schedulers (this is what I used for my MIDI sequencer :-). diff --git a/Doc/library/hmac.rst b/Doc/library/hmac.rst index eff2724..9575693 100644 --- a/Doc/library/hmac.rst +++ b/Doc/library/hmac.rst @@ -19,7 +19,7 @@ This module implements the HMAC algorithm as described by :rfc:`2104`. Return a new hmac object. *key* is a bytes object giving the secret key. If *msg* is present, the method call ``update(msg)`` is made. *digestmod* is the digest constructor or module for the HMAC object to use. It defaults to - the :func:`hashlib.md5` constructor. + the :data:`hashlib.md5` constructor. An HMAC object has the following methods: @@ -38,6 +38,13 @@ An HMAC object has the following methods: given to the constructor. It may contain non-ASCII bytes, including NUL bytes. + .. warning:: + + When comparing the output of :meth:`digest` to an externally-supplied + digest during a verification routine, it is recommended to use the + :func:`compare_digest` function instead of the ``==`` operator + to reduce the vulnerability to timing attacks. + .. method:: HMAC.hexdigest() @@ -45,6 +52,13 @@ An HMAC object has the following methods: length containing only hexadecimal digits. This may be used to exchange the value safely in email or other non-binary environments. + .. warning:: + + When comparing the output of :meth:`hexdigest` to an externally-supplied + digest during a verification routine, it is recommended to use the + :func:`compare_digest` function instead of the ``==`` operator + to reduce the vulnerability to timing attacks. + .. method:: HMAC.copy() @@ -52,6 +66,26 @@ An HMAC object has the following methods: compute the digests of strings that share a common initial substring. +This module also provides the following helper function: + +.. function:: compare_digest(a, b) + + Return ``a == b``. This function uses an approach designed to prevent + timing analysis by avoiding content-based short circuiting behaviour, + making it appropriate for cryptography. *a* and *b* must both be of the + same type: either :class:`str` (ASCII only, as e.g. returned by + :meth:`HMAC.hexdigest`), or a :term:`bytes-like object`. + + .. note:: + + If *a* and *b* are of different lengths, or if an error occurs, + a timing attack could theoretically reveal information about the + types and lengths of *a* and *b*--but not their values. + + + .. versionadded:: 3.3 + + .. seealso:: Module :mod:`hashlib` diff --git a/Doc/library/html.entities.rst b/Doc/library/html.entities.rst index b8b4aa8..0aabbad 100644 --- a/Doc/library/html.entities.rst +++ b/Doc/library/html.entities.rst @@ -9,11 +9,19 @@ -------------- -This module defines three dictionaries, ``name2codepoint``, ``codepoint2name``, -and ``entitydefs``. ``entitydefs`` is used to provide the :attr:`entitydefs` -attribute of the :class:`html.parser.HTMLParser` class. The definition provided -here contains all the entities defined by XHTML 1.0 that can be handled using -simple textual substitution in the Latin-1 character set (ISO-8859-1). +This module defines four dictionaries, :data:`html5`, +:data:`name2codepoint`, :data:`codepoint2name`, and :data:`entitydefs`. + + +.. data:: html5 + + A dictionary that maps HTML5 named character references [#]_ to the + equivalent Unicode character(s), e.g. ``html5['gt;'] == '>'``. + Note that the trailing semicolon is included in the name (e.g. ``'gt;'``), + however some of the names are accepted by the standard even without the + semicolon: in this case the name is present with and without the ``';'``. + + .. versionadded:: 3.3 .. data:: entitydefs @@ -30,3 +38,8 @@ simple textual substitution in the Latin-1 character set (ISO-8859-1). .. data:: codepoint2name A dictionary that maps Unicode codepoints to HTML entity names. + + +.. rubric:: Footnotes + +.. [#] See http://www.w3.org/TR/html5/syntax.html#named-character-references diff --git a/Doc/library/html.parser.rst b/Doc/library/html.parser.rst index f3c36ec..e4154ef 100644 --- a/Doc/library/html.parser.rst +++ b/Doc/library/html.parser.rst @@ -16,13 +16,14 @@ This module defines a class :class:`HTMLParser` which serves as the basis for parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML. -.. class:: HTMLParser(strict=True) +.. class:: HTMLParser(strict=False) - Create a parser instance. If *strict* is ``True`` (the default), invalid - HTML results in :exc:`~html.parser.HTMLParseError` exceptions [#]_. If - *strict* is ``False``, the parser uses heuristics to make a best guess at - the intention of any invalid HTML it encounters, similar to the way most - browsers do. Using ``strict=False`` is advised. + Create a parser instance. If *strict* is ``False`` (the default), the parser + will accept and parse invalid markup. If *strict* is ``True`` the parser + will raise an :exc:`~html.parser.HTMLParseError` exception instead [#]_ when + it's not able to parse the markup. + The use of ``strict=True`` is discouraged and the *strict* argument is + deprecated. An :class:`.HTMLParser` instance is fed HTML data and calls handler methods when start tags, end tags, text, comments, and other markup elements are @@ -32,7 +33,12 @@ parsing text files formatted in HTML (HyperText Mark-up Language) and XHTML. This parser does not check that end tags match start tags or call the end-tag handler for elements which are closed implicitly by closing an outer element. - .. versionchanged:: 3.2 *strict* keyword added + .. versionchanged:: 3.2 + *strict* keyword added. + + .. deprecated-removed:: 3.3 3.5 + The *strict* argument and the strict mode have been deprecated. + The parser is now able to accept and parse invalid markup too. An exception is defined as well: @@ -46,6 +52,10 @@ An exception is defined as well: detected, and :attr:`offset` is the number of characters into the line at which the construct starts. + .. deprecated-removed:: 3.3 3.5 + This exception has been deprecated because it's never raised by the parser + (when the default non-strict mode is used). + Example HTML Parser Application ------------------------------- diff --git a/Doc/library/html.rst b/Doc/library/html.rst index 3ad1c0c..1107ca9 100644 --- a/Doc/library/html.rst +++ b/Doc/library/html.rst @@ -19,3 +19,10 @@ This module defines utilities to manipulate HTML. attribute value delimited by quotes, as in ``<a href="...">``. .. versionadded:: 3.2 + +-------------- + +Submodules in the ``html`` package are: + +* :mod:`html.parser` -- HTML/XHTML parser with lenient parsing mode +* :mod:`html.entities` -- HTML entity definitions diff --git a/Doc/library/http.client.rst b/Doc/library/http.client.rst index 7423a4c..bb30f24 100644 --- a/Doc/library/http.client.rst +++ b/Doc/library/http.client.rst @@ -51,7 +51,7 @@ The module provides the following classes: .. versionchanged:: 3.2 *source_address* was added. - .. deprecated:: 3.2 + .. deprecated-removed:: 3.2 3.4 The *strict* parameter is deprecated. HTTP 0.9-style "Simple Responses" are not supported anymore. @@ -89,7 +89,7 @@ The module provides the following classes: This class now supports HTTPS virtual hosts if possible (that is, if :data:`ssl.HAS_SNI` is true). - .. deprecated:: 3.2 + .. deprecated-removed:: 3.2 3.4 The *strict* parameter is deprecated. HTTP 0.9-style "Simple Responses" are not supported anymore. @@ -99,7 +99,7 @@ The module provides the following classes: Class whose instances are returned upon successful connection. Not instantiated directly by user. - .. deprecated:: 3.2 + .. deprecated-removed:: 3.2 3.4 The *strict* parameter is deprecated. HTTP 0.9-style "Simple Responses" are not supported anymore. @@ -169,8 +169,8 @@ The following exceptions are raised as appropriate: A subclass of :exc:`HTTPException`. Raised if a server responds with a HTTP status code that we don't understand. -The constants defined in this module are: +The constants defined in this module are: .. data:: HTTP_PORT @@ -343,6 +343,15 @@ and also the following constants for integer status codes: | :const:`UPGRADE_REQUIRED` | ``426`` | HTTP Upgrade to TLS, | | | | :rfc:`2817`, Section 6 | +------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`PRECONDITION_REQUIRED` | ``428`` | Additional HTTP Status Codes, | +| | | :rfc:`6585`, Section 3 | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`TOO_MANY_REQUESTS` | ``429`` | Additional HTTP Status Codes, | +| | | :rfc:`6585`, Section 4 | ++------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`REQUEST_HEADER_FIELDS_TOO_LARGE` | ``431`` | Additional HTTP Status Codes, | +| | | :rfc:`6585`, Section 5 | ++------------------------------------------+---------+-----------------------------------------------------------------------+ | :const:`INTERNAL_SERVER_ERROR` | ``500`` | HTTP/1.1, `RFC 2616, Section | | | | 10.5.1 | | | | <http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.1>`_ | @@ -373,6 +382,12 @@ and also the following constants for integer status codes: | :const:`NOT_EXTENDED` | ``510`` | An HTTP Extension Framework, | | | | :rfc:`2774`, Section 7 | +------------------------------------------+---------+-----------------------------------------------------------------------+ +| :const:`NETWORK_AUTHENTICATION_REQUIRED` | ``511`` | Additional HTTP Status Codes, | +| | | :rfc:`6585`, Section 6 | ++------------------------------------------+---------+-----------------------------------------------------------------------+ + +.. versionchanged:: 3.3 + Added codes ``428``, ``429``, ``431`` and ``511`` from :rfc:`6585`. .. data:: responses @@ -436,11 +451,25 @@ HTTPConnection Objects .. method:: HTTPConnection.set_tunnel(host, port=None, headers=None) - Set the host and the port for HTTP Connect Tunnelling. Normally used when it - is required to a HTTPS Connection through a proxy server. + Set the host and the port for HTTP Connect Tunnelling. This allows running + the connection through a proxy server. + + The host and port arguments specify the endpoint of the tunneled connection + (i.e. the address included in the CONNECT request, *not* the address of the + proxy server). + + The headers argument should be a mapping of extra HTTP headers to send with + the CONNECT request. - The headers argument should be a mapping of extra HTTP headers to send - with the CONNECT request. + For example, to tunnel through a HTTPS proxy server running locally on port + 8080, we would pass the address of the proxy to the :class:`HTTPSConnection` + constructor, and the address of the host that we eventually want to reach to + the :meth:`~HTTPConnection.set_tunnel` method:: + + >>> import http.client + >>> conn = http.client.HTTPSConnection("localhost", 8080) + >>> conn.set_tunnel("www.python.org") + >>> conn.request("HEAD","/index.html") .. versionadded:: 3.2 @@ -506,6 +535,12 @@ statement. Reads and returns the response body, or up to the next *amt* bytes. +.. method:: HTTPResponse.readinto(b) + + Reads up to the next len(b) bytes of the response body into the buffer *b*. + Returns the number of bytes read. + + .. versionadded:: 3.3 .. method:: HTTPResponse.getheader(name, default=None) @@ -552,7 +587,7 @@ statement. .. attribute:: HTTPResponse.closed - Is True if the stream is closed. + Is ``True`` if the stream is closed. Examples -------- @@ -614,8 +649,10 @@ Here is an example session that shows how to ``POST`` requests:: Client side ``HTTP PUT`` requests are very similar to ``POST`` requests. The difference lies only the server side where HTTP server will allow resources to -be created via ``PUT`` request. Here is an example session that shows how to do -``PUT`` request using http.client:: +be created via ``PUT`` request. It should be noted that custom HTTP methods ++are also handled in :class:`urllib.request.Request` by sending the appropriate ++method attribute.Here is an example session that shows how to do ``PUT`` +request using http.client:: >>> # This creates an HTTP message >>> # with the content of BODY as the enclosed representation @@ -626,7 +663,7 @@ be created via ``PUT`` request. Here is an example session that shows how to do >>> conn = http.client.HTTPConnection("localhost", 8080) >>> conn.request("PUT", "/file", BODY) >>> response = conn.getresponse() - >>> print(resp.status, response.reason) + >>> print(response.status, response.reason) 200, OK .. _httpmessage-objects: diff --git a/Doc/library/http.cookiejar.rst b/Doc/library/http.cookiejar.rst index cc8f251..2fae47c 100644 --- a/Doc/library/http.cookiejar.rst +++ b/Doc/library/http.cookiejar.rst @@ -40,7 +40,11 @@ The module defines the following exception: .. exception:: LoadError Instances of :class:`FileCookieJar` raise this exception on failure to load - cookies from a file. :exc:`LoadError` is a subclass of :exc:`IOError`. + cookies from a file. :exc:`LoadError` is a subclass of :exc:`OSError`. + + .. versionchanged:: 3.3 + LoadError was made a subclass of :exc:`OSError` instead of + :exc:`IOError`. The following classes are provided: @@ -86,7 +90,7 @@ The following classes are provided: Netscape and RFC 2965 cookies. By default, RFC 2109 cookies (ie. cookies received in a :mailheader:`Set-Cookie` header with a version cookie-attribute of 1) are treated according to the RFC 2965 rules. However, if RFC 2965 handling - is turned off or :attr:`rfc2109_as_netscape` is True, RFC 2109 cookies are + is turned off or :attr:`rfc2109_as_netscape` is ``True``, RFC 2109 cookies are 'downgraded' by the :class:`CookieJar` instance to Netscape cookies, by setting the :attr:`version` attribute of the :class:`Cookie` instance to 0. :class:`DefaultCookiePolicy` also provides some parameters to allow some @@ -150,9 +154,15 @@ contained :class:`Cookie` objects. The *request* object (usually a :class:`urllib.request..Request` instance) must support the methods :meth:`get_full_url`, :meth:`get_host`, - :meth:`get_type`, :meth:`unverifiable`, :meth:`get_origin_req_host`, - :meth:`has_header`, :meth:`get_header`, :meth:`header_items`, and - :meth:`add_unredirected_header`, as documented by :mod:`urllib.request`. + :meth:`get_type`, :meth:`unverifiable`, :meth:`has_header`, + :meth:`get_header`, :meth:`header_items`, :meth:`add_unredirected_header` + and :attr:`origin_req_host` attribute as documented by + :mod:`urllib.request`. + + .. versionchanged:: 3.3 + + *request* object needs :attr:`origin_req_host` attribute. Dependency on a + deprecated method :meth:`get_origin_req_host` has been removed. .. method:: CookieJar.extract_cookies(response, request) @@ -170,11 +180,15 @@ contained :class:`Cookie` objects. The *request* object (usually a :class:`urllib.request.Request` instance) must support the methods :meth:`get_full_url`, :meth:`get_host`, - :meth:`unverifiable`, and :meth:`get_origin_req_host`, as documented by - :mod:`urllib.request`. The request is used to set default values for + :meth:`unverifiable`, and :attr:`origin_req_host` attribute, as documented + by :mod:`urllib.request`. The request is used to set default values for cookie-attributes as well as for checking that the cookie is allowed to be set. + .. versionchanged:: 3.3 + + *request* object needs :attr:`origin_req_host` attribute. Dependency on a + deprecated method :meth:`get_origin_req_host` has been removed. .. method:: CookieJar.set_policy(policy) @@ -257,9 +271,12 @@ contained :class:`Cookie` objects. Arguments are as for :meth:`save`. The named file must be in the format understood by the class, or - :exc:`LoadError` will be raised. Also, :exc:`IOError` may be raised, for + :exc:`LoadError` will be raised. Also, :exc:`OSError` may be raised, for example if the file does not exist. + .. versionchanged:: 3.3 + :exc:`IOError` used to be raised, it is now an alias of :exc:`OSError`. + .. method:: FileCookieJar.revert(filename=None, ignore_discard=False, ignore_expires=False) @@ -292,7 +309,7 @@ FileCookieJar subclasses and co-operation with web browsers ----------------------------------------------------------- The following :class:`CookieJar` subclasses are provided for reading and -writing . +writing. .. class:: MozillaCookieJar(filename, delayload=None, policy=None) @@ -627,7 +644,7 @@ internal consistency, so you should know what you're doing if you do that. .. attribute:: Cookie.secure - True if cookie should only be returned over a secure connection. + ``True`` if cookie should only be returned over a secure connection. .. attribute:: Cookie.expires @@ -638,7 +655,7 @@ internal consistency, so you should know what you're doing if you do that. .. attribute:: Cookie.discard - True if this is a session cookie. + ``True`` if this is a session cookie. .. attribute:: Cookie.comment @@ -655,7 +672,7 @@ internal consistency, so you should know what you're doing if you do that. .. attribute:: Cookie.rfc2109 - True if this cookie was received as an RFC 2109 cookie (ie. the cookie + ``True`` if this cookie was received as an RFC 2109 cookie (ie. the cookie arrived in a :mailheader:`Set-Cookie` header, and the value of the Version cookie-attribute in that header was 1). This attribute is provided because :mod:`http.cookiejar` may 'downgrade' RFC 2109 cookies to Netscape cookies, in @@ -664,18 +681,18 @@ internal consistency, so you should know what you're doing if you do that. .. attribute:: Cookie.port_specified - True if a port or set of ports was explicitly specified by the server (in the + ``True`` if a port or set of ports was explicitly specified by the server (in the :mailheader:`Set-Cookie` / :mailheader:`Set-Cookie2` header). .. attribute:: Cookie.domain_specified - True if a domain was explicitly specified by the server. + ``True`` if a domain was explicitly specified by the server. .. attribute:: Cookie.domain_initial_dot - True if the domain explicitly specified by the server began with a dot + ``True`` if the domain explicitly specified by the server began with a dot (``'.'``). Cookies may have additional non-standard cookie-attributes. These may be @@ -702,7 +719,7 @@ The :class:`Cookie` class also defines the following method: .. method:: Cookie.is_expired(now=None) - True if cookie has passed the time at which the server requested it should + ``True`` if cookie has passed the time at which the server requested it should expire. If *now* is given (in seconds since the epoch), return whether the cookie has expired at the specified time. diff --git a/Doc/library/http.cookies.rst b/Doc/library/http.cookies.rst index 5ae3fd4..646f2e8 100644 --- a/Doc/library/http.cookies.rst +++ b/Doc/library/http.cookies.rst @@ -22,9 +22,12 @@ many current day browsers and servers have relaxed parsing rules when comes to Cookie handling. As a result, the parsing rules used are a bit less strict. The character set, :data:`string.ascii_letters`, :data:`string.digits` and -``!#$%&'*+-.^_`|~`` denote the set of valid characters allowed by this module +``!#$%&'*+-.^_`|~:`` denote the set of valid characters allowed by this module in Cookie name (as :attr:`~Morsel.key`). +.. versionchanged:: 3.3 + Allowed ':' as a valid Cookie name character. + .. note:: diff --git a/Doc/library/http.rst b/Doc/library/http.rst new file mode 100644 index 0000000..a387a37 --- /dev/null +++ b/Doc/library/http.rst @@ -0,0 +1,11 @@ +:mod:`http` --- HTTP modules +============================ + +``http`` is a package that collects several modules for working with the +HyperText Transfer Protocol: + +* :mod:`http.client` is a low-level HTTP protocol client; for high-level URL + opening use :mod:`urllib.request` +* :mod:`http.server` contains basic HTTP server classes based on :mod:`socketserver` +* :mod:`http.cookies` has utilities for implementing state management with cookies +* :mod:`http.cookiejar` provides persistence of cookies diff --git a/Doc/library/http.server.rst b/Doc/library/http.server.rst index 300e332..5aeb719 100644 --- a/Doc/library/http.server.rst +++ b/Doc/library/http.server.rst @@ -29,8 +29,8 @@ handler. Code to create and run the server looks like this:: .. class:: HTTPServer(server_address, RequestHandlerClass) - This class builds on the :class:`TCPServer` class by storing the server - address as instance variables named :attr:`server_name` and + This class builds on the :class:`~socketserver.TCPServer` class by storing + the server address as instance variables named :attr:`server_name` and :attr:`server_port`. The server is accessible by the handler, typically through the handler's :attr:`server` instance variable. @@ -179,19 +179,30 @@ of which this module provides three different variants: .. method:: send_response(code, message=None) - Sends a response header and logs the accepted request. The HTTP response - line is sent, followed by *Server* and *Date* headers. The values for - these two headers are picked up from the :meth:`version_string` and - :meth:`date_time_string` methods, respectively. + Adds a response header to the headers buffer and logs the accepted + request. The HTTP response line is written to the internal buffer, + followed by *Server* and *Date* headers. The values for these two headers + are picked up from the :meth:`version_string` and + :meth:`date_time_string` methods, respectively. If the server does not + intend to send any other headers using the :meth:`send_header` method, + then :meth:`send_response` should be followed by a :meth:`end_headers` + call. + + .. versionchanged:: 3.3 + Headers are stored to an internal buffer and :meth:`end_headers` + needs to be called explicitly. + .. method:: send_header(keyword, value) - Stores the HTTP header to an internal buffer which will be written to the - output stream when :meth:`end_headers` method is invoked. - *keyword* should specify the header keyword, with *value* - specifying its value. + Adds the HTTP header to an internal buffer which will be written to the + output stream when either :meth:`end_headers` or :meth:`flush_headers` is + invoked. *keyword* should specify the header keyword, with *value* + specifying its value. Note that, after the send_header calls are done, + :meth:`end_headers` MUST BE called in order to complete the operation. - .. versionchanged:: 3.2 Storing the headers in an internal buffer + .. versionchanged:: 3.2 + Headers are stored in an internal buffer. .. method:: send_response_only(code, message=None) @@ -205,10 +216,19 @@ of which this module provides three different variants: .. method:: end_headers() - Write the buffered HTTP headers to the output stream and send a blank - line, indicating the end of the HTTP headers in the response. + Adds a blank line + (indicating the end of the HTTP headers in the response) + to the headers buffer and calls :meth:`flush_headers()`. + + .. versionchanged:: 3.2 + The buffered headers are written to the output stream. + + .. method:: flush_headers() + + Finally send the headers to the output stream and flush the internal + headers buffer. - .. versionchanged:: 3.2 Writing the buffered headers to the output stream. + .. versionadded:: 3.3 .. method:: log_request(code='-', size='-') @@ -250,8 +270,11 @@ of which this module provides three different variants: .. method:: address_string() - Returns the client address, formatted for logging. A name lookup is - performed on the client's IP address. + Returns the client address. + + .. versionchanged:: 3.3 + Previously, a name lookup was performed. To avoid name resolution + delays, it now always returns the IP address. .. class:: SimpleHTTPRequestHandler(request, client_address, server) @@ -296,10 +319,10 @@ of which this module provides three different variants: file's contents are returned; otherwise a directory listing is generated by calling the :meth:`list_directory` method. This method uses :func:`os.listdir` to scan the directory, and returns a ``404`` error - response if the :func:`listdir` fails. + response if the :func:`~os.listdir` fails. If the request was mapped to a file, it is opened and the contents are - returned. Any :exc:`IOError` exception in opening the requested file is + returned. Any :exc:`OSError` exception in opening the requested file is mapped to a ``404``, ``'File not found'`` error. Otherwise, the content type is guessed by calling the :meth:`guess_type` method, which in turn uses the *extensions_map* variable. @@ -378,3 +401,9 @@ the previous example, this serves files relative to the current directory. :: Note that CGI scripts will be run with UID of user nobody, for security reasons. Problems with the CGI script will be translated to error 403. + +:class:`CGIHTTPRequestHandler` can be enabled in the command line by passing +the ``--cgi`` option.:: + + python -m http.server --cgi 8000 + diff --git a/Doc/library/idle.rst b/Doc/library/idle.rst index b40e470..36d78b0 100644 --- a/Doc/library/idle.rst +++ b/Doc/library/idle.rst @@ -33,8 +33,8 @@ Menus File menu ^^^^^^^^^ -New window - create a new editing window +New file + create a new file editing window Open... open an existing file diff --git a/Doc/library/imaplib.rst b/Doc/library/imaplib.rst index da6cc8c..01236fb 100644 --- a/Doc/library/imaplib.rst +++ b/Doc/library/imaplib.rst @@ -64,14 +64,21 @@ Three exceptions are defined as attributes of the :class:`IMAP4` class: There's also a subclass for secure connections: -.. class:: IMAP4_SSL(host='', port=IMAP4_SSL_PORT, keyfile=None, certfile=None) +.. class:: IMAP4_SSL(host='', port=IMAP4_SSL_PORT, keyfile=None, certfile=None, ssl_context=None) This is a subclass derived from :class:`IMAP4` that connects over an SSL encrypted socket (to use this class you need a socket module that was compiled with SSL support). If *host* is not specified, ``''`` (the local host) is used. If *port* is omitted, the standard IMAP4-over-SSL port (993) is used. *keyfile* and *certfile* are also optional - they can contain a PEM formatted private key - and certificate chain file for the SSL connection. + and certificate chain file for the SSL connection. *ssl_context* parameter is a + :class:`ssl.SSLContext` object which allows bundling SSL configuration + options, certificates and private keys into a single (potentially long-lived) + structure. Note that the *keyfile*/*certfile* parameters are mutually exclusive with *ssl_context*, + a :class:`ValueError` is raised if *keyfile*/*certfile* is provided along with *ssl_context*. + + .. versionchanged:: 3.3 + *ssl_context* parameter added. The second subclass allows for connections created by a child process: @@ -106,13 +113,15 @@ The following utility functions are defined: .. function:: Time2Internaldate(date_time) - Convert *date_time* to an IMAP4 ``INTERNALDATE`` representation. The - return value is a string in the form: ``"DD-Mmm-YYYY HH:MM:SS - +HHMM"`` (including double-quotes). The *date_time* argument can be a - number (int or float) representing seconds since epoch (as returned - by :func:`time.time`), a 9-tuple representing local time (as returned by - :func:`time.localtime`), or a double-quoted string. In the last case, it - is assumed to already be in the correct format. + Convert *date_time* to an IMAP4 ``INTERNALDATE`` representation. + The return value is a string in the form: ``"DD-Mmm-YYYY HH:MM:SS + +HHMM"`` (including double-quotes). The *date_time* argument can + be a number (int or float) representing seconds since epoch (as + returned by :func:`time.time`), a 9-tuple representing local time + an instance of :class:`time.struct_time` (as returned by + :func:`time.localtime`), an aware instance of + :class:`datetime.datetime`, or a double-quoted string. In the last + case, it is assumed to already be in the correct format. Note that IMAP4 message numbers change as the mailbox changes; in particular, after an ``EXPUNGE`` command performs deletions the remaining messages are @@ -301,8 +310,9 @@ An :class:`IMAP4` instance has the following methods: Opens socket to *port* at *host*. This method is implicitly called by the :class:`IMAP4` constructor. The connection objects established by this - method will be used in the ``read``, ``readline``, ``send``, and ``shutdown`` - methods. You may override this method. + method will be used in the :meth:`IMAP4.read`, :meth:`IMAP4.readline`, + :meth:`IMAP4.send`, and :meth:`IMAP4.shutdown` methods. You may override + this method. .. method:: IMAP4.partial(message_num, message_part, start, length) diff --git a/Doc/library/imp.rst b/Doc/library/imp.rst index 1345b25..364d81e 100644 --- a/Doc/library/imp.rst +++ b/Doc/library/imp.rst @@ -1,5 +1,5 @@ -:mod:`imp` --- Access the :keyword:`import` internals -===================================================== +:mod:`imp` --- Access the :ref:`import <importsystem>` internals +================================================================ .. module:: imp :synopsis: Access the implementation of the import statement. @@ -11,6 +11,10 @@ This module provides an interface to the mechanisms used to implement the :keyword:`import` statement. It defines the following constants and functions: +.. note:: + New programs should use :mod:`importlib` rather than this module. + + .. function:: get_magic() .. index:: pair: file; byte-code @@ -30,6 +34,9 @@ This module provides an interface to the mechanisms used to implement the :const:`PY_SOURCE`, :const:`PY_COMPILED`, or :const:`C_EXTENSION`, described below. + .. deprecated:: 3.3 + Use the constants defined on :mod:`importlib.machinery` instead. + .. function:: find_module(name[, path]) @@ -69,6 +76,9 @@ This module provides an interface to the mechanisms used to implement the then use :func:`find_module` with the *path* argument set to ``P.__path__``. When *P* itself has a dotted name, apply this recipe recursively. + .. deprecated:: 3.3 + Use :func:`importlib.find_loader` instead. + .. function:: load_module(name, file, pathname, description) @@ -90,6 +100,10 @@ This module provides an interface to the mechanisms used to implement the it was not ``None``, even when an exception is raised. This is best done using a :keyword:`try` ... :keyword:`finally` statement. + .. deprecated:: 3.3 + Unneeded as loaders should be used to load modules and + :func:`find_module` is deprecated. + .. function:: new_module(name) @@ -97,37 +111,6 @@ This module provides an interface to the mechanisms used to implement the in ``sys.modules``. -.. function:: lock_held() - - Return ``True`` if the import lock is currently held, else ``False``. On - platforms without threads, always return ``False``. - - On platforms with threads, a thread executing an import holds an internal lock - until the import is complete. This lock blocks other threads from doing an - import until the original import completes, which in turn prevents other threads - from seeing incomplete module objects constructed by the original thread while - in the process of completing its import (and the imports, if any, triggered by - that). - - -.. function:: acquire_lock() - - Acquire the interpreter's import lock for the current thread. This lock should - be used by import hooks to ensure thread-safety when importing modules. - - Once a thread has acquired the import lock, the same thread may acquire it - again without blocking; the thread must release it once for each time it has - acquired it. - - On platforms without threads, this function does nothing. - - -.. function:: release_lock() - - Release the interpreter's import lock. On platforms without threads, this - function does nothing. - - .. function:: reload(module) Reload a previously imported *module*. The argument must be a module object, so @@ -175,7 +158,7 @@ This module provides an interface to the mechanisms used to implement the cache = {} It is legal though generally not very useful to reload built-in or dynamically - loaded modules, except for :mod:`sys`, :mod:`__main__` and :mod:`__builtin__`. + loaded modules, except for :mod:`sys`, :mod:`__main__` and :mod:`builtins`. In many cases, however, extension modules are not designed to be initialized more than once, and may fail in arbitrary ways when reloaded. @@ -189,6 +172,10 @@ This module provides an interface to the mechanisms used to implement the the class does not affect the method definitions of the instances --- they continue to use the old class definition. The same is true for derived classes. + .. versionchanged:: 3.3 + Relies on both ``__name__`` and ``__loader__`` being defined on the module + being reloaded instead of just ``__name__``. + The following functions are conveniences for handling :pep:`3147` byte-compiled file paths. @@ -201,14 +188,19 @@ file paths. source *path*. For example, if *path* is ``/foo/bar/baz.py`` the return value would be ``/foo/bar/__pycache__/baz.cpython-32.pyc`` for Python 3.2. The ``cpython-32`` string comes from the current magic tag (see - :func:`get_tag`). The returned path will end in ``.pyc`` when - ``__debug__`` is True or ``.pyo`` for an optimized Python - (i.e. ``__debug__`` is False). By passing in True or False for + :func:`get_tag`; if :attr:`sys.implementation.cache_tag` is not defined then + :exc:`NotImplementedError` will be raised). The returned path will end in + ``.pyc`` when ``__debug__`` is ``True`` or ``.pyo`` for an optimized Python + (i.e. ``__debug__`` is ``False``). By passing in ``True`` or ``False`` for *debug_override* you can override the system's value for ``__debug__`` for extension selection. *path* need not exist. + .. versionchanged:: 3.3 + If :attr:`sys.implementation.cache_tag` is ``None``, then + :exc:`NotImplementedError` is raised. + .. function:: source_from_cache(path) @@ -216,7 +208,13 @@ file paths. file path. For example, if *path* is ``/foo/bar/__pycache__/baz.cpython-32.pyc`` the returned path would be ``/foo/bar/baz.py``. *path* need not exist, however if it does not conform - to :pep:`3147` format, a ``ValueError`` is raised. + to :pep:`3147` format, a ``ValueError`` is raised. If + :attr:`sys.implementation.cache_tag` is not defined, + :exc:`NotImplementedError` is raised. + + .. versionchanged:: 3.3 + Raise :exc:`NotImplementedError` when + :attr:`sys.implementation.cache_tag` is not defined. .. function:: get_tag() @@ -224,6 +222,64 @@ file paths. Return the :pep:`3147` magic tag string matching this version of Python's magic number, as returned by :func:`get_magic`. + .. note:: + You may use :attr:`sys.implementation.cache_tag` directly starting + in Python 3.3. + + +The following functions help interact with the import system's internal +locking mechanism. Locking semantics of imports are an implementation +detail which may vary from release to release. However, Python ensures +that circular imports work without any deadlocks. + + +.. function:: lock_held() + + Return ``True`` if the global import lock is currently held, else + ``False``. On platforms without threads, always return ``False``. + + On platforms with threads, a thread executing an import first holds a + global import lock, then sets up a per-module lock for the rest of the + import. This blocks other threads from importing the same module until + the original import completes, preventing other threads from seeing + incomplete module objects constructed by the original thread. An + exception is made for circular imports, which by construction have to + expose an incomplete module object at some point. + +.. versionchanged:: 3.3 + The locking scheme has changed to per-module locks for + the most part. A global import lock is kept for some critical tasks, + such as initializing the per-module locks. + + +.. function:: acquire_lock() + + Acquire the interpreter's global import lock for the current thread. + This lock should be used by import hooks to ensure thread-safety when + importing modules. + + Once a thread has acquired the import lock, the same thread may acquire it + again without blocking; the thread must release it once for each time it has + acquired it. + + On platforms without threads, this function does nothing. + +.. versionchanged:: 3.3 + The locking scheme has changed to per-module locks for + the most part. A global import lock is kept for some critical tasks, + such as initializing the per-module locks. + + +.. function:: release_lock() + + Release the interpreter's global import lock. On platforms without + threads, this function does nothing. + +.. versionchanged:: 3.3 + The locking scheme has changed to per-module locks for + the most part. A global import lock is kept for some critical tasks, + such as initializing the per-module locks. + The following constants with integer values, defined in this module, are used to indicate the search result of :func:`find_module`. @@ -233,31 +289,43 @@ to indicate the search result of :func:`find_module`. The module was found as a source file. + .. deprecated:: 3.3 + .. data:: PY_COMPILED The module was found as a compiled code object file. + .. deprecated:: 3.3 + .. data:: C_EXTENSION The module was found as dynamically loadable shared library. + .. deprecated:: 3.3 + .. data:: PKG_DIRECTORY The module was found as a package directory. + .. deprecated:: 3.3 + .. data:: C_BUILTIN The module was found as a built-in module. + .. deprecated:: 3.3 + .. data:: PY_FROZEN The module was found as a frozen module. + .. deprecated:: 3.3 + .. class:: NullImporter(path_string) @@ -266,16 +334,17 @@ to indicate the search result of :func:`find_module`. with an existing directory or empty string raises :exc:`ImportError`. Otherwise, a :class:`NullImporter` instance is returned. - Python adds instances of this type to ``sys.path_importer_cache`` for any path - entries that are not directories and are not handled by any other path hooks on - ``sys.path_hooks``. Instances have only one method: - + Instances have only one method: .. method:: NullImporter.find_module(fullname [, path]) This method always returns ``None``, indicating that the requested module could not be found. + .. versionchanged:: 3.3 + ``None`` is inserted into ``sys.path_importer_cache`` instead of an + instance of :class:`NullImporter`. + .. _examples-imp: diff --git a/Doc/library/importlib.rst b/Doc/library/importlib.rst index 1649063..5f740a2 100644 --- a/Doc/library/importlib.rst +++ b/Doc/library/importlib.rst @@ -1,8 +1,8 @@ -:mod:`importlib` -- An implementation of :keyword:`import` -========================================================== +:mod:`importlib` -- The implementation of :keyword:`import` +=========================================================== .. module:: importlib - :synopsis: An implementation of the import machinery. + :synopsis: The implementation of the import machinery. .. moduleauthor:: Brett Cannon <brett@python.org> .. sectionauthor:: Brett Cannon <brett@python.org> @@ -13,17 +13,16 @@ Introduction ------------ -The purpose of the :mod:`importlib` package is two-fold. One is to provide an +The purpose of the :mod:`importlib` package is two-fold. One is to provide the implementation of the :keyword:`import` statement (and thus, by extension, the :func:`__import__` function) in Python source code. This provides an implementation of :keyword:`import` which is portable to any Python -interpreter. This also provides a reference implementation which is easier to +interpreter. This also provides an implementation which is easier to comprehend than one implemented in a programming language other than Python. Two, the components to implement :keyword:`import` are exposed in this package, making it easier for users to create their own custom objects (known generically as an :term:`importer`) to participate in the import process. -Details on custom importers can be found in :pep:`302`. .. seealso:: @@ -53,6 +52,9 @@ Details on custom importers can be found in :pep:`302`. :pep:`366` Main module explicit relative imports + :pep:`451` + A ModuleSpec Type for the Import System + :pep:`3120` Using UTF-8 as the Default Source Encoding @@ -63,7 +65,7 @@ Details on custom importers can be found in :pep:`302`. Functions --------- -.. function:: __import__(name, globals={}, locals={}, fromlist=list(), level=0) +.. function:: __import__(name, globals=None, locals=None, fromlist=(), level=0) An implementation of the built-in :func:`__import__` function. @@ -82,10 +84,36 @@ Functions derived from :func:`importlib.__import__`, including requiring the package from which an import is occurring to have been previously imported (i.e., *package* must already be imported). The most important difference - is that :func:`import_module` returns the most nested package or module - that was imported (e.g. ``pkg.mod``), while :func:`__import__` returns the + is that :func:`import_module` returns the specified package or module + (e.g. ``pkg.mod``), while :func:`__import__` returns the top-level package or module (e.g. ``pkg``). + .. versionchanged:: 3.3 + Parent packages are automatically imported. + +.. function:: find_loader(name, path=None) + + Find the loader for a module, optionally within the specified *path*. If the + module is in :attr:`sys.modules`, then ``sys.modules[name].__loader__`` is + returned (unless the loader would be ``None``, in which case + :exc:`ValueError` is raised). Otherwise a search using :attr:`sys.meta_path` + is done. ``None`` is returned if no loader is found. + + A dotted name does not have its parent's implicitly imported as that requires + loading them and that may not be desired. To properly import a submodule you + will need to import all parent packages of the submodule and use the correct + argument to *path*. + +.. function:: invalidate_caches() + + Invalidate the internal caches of finders stored at + :data:`sys.meta_path`. If a finder implements ``invalidate_caches()`` then it + will be called to perform the invalidation. This function should be called + if any modules are created/installed while your program is running to + guarantee all finders will notice the new module's existence. + + .. versionadded:: 3.3 + :mod:`importlib.abc` -- Abstract base classes related to import --------------------------------------------------------------- @@ -97,19 +125,90 @@ The :mod:`importlib.abc` module contains all of the core abstract base classes used by :keyword:`import`. Some subclasses of the core abstract base classes are also provided to help in implementing the core ABCs. +ABC hierarchy:: + + object + +-- Finder (deprecated) + | +-- MetaPathFinder + | +-- PathEntryFinder + +-- Loader + +-- ResourceLoader --------+ + +-- InspectLoader | + +-- ExecutionLoader --+ + +-- FileLoader + +-- SourceLoader + +-- PyLoader (deprecated) + +-- PyPycLoader (deprecated) + .. class:: Finder - An abstract base class representing a :term:`finder`. - See :pep:`302` for the exact definition for a finder. + An abstract base class representing a :term:`finder`. + + .. deprecated:: 3.3 + Use :class:`MetaPathFinder` or :class:`PathEntryFinder` instead. + + .. method:: find_module(fullname, path=None) + + An abstact method for finding a :term:`loader` for the specified + module. Originally specified in :pep:`302`, this method was meant + for use in :data:`sys.meta_path` and in the path-based import subsystem. + + +.. class:: MetaPathFinder + + An abstract base class representing a :term:`meta path finder`. For + compatibility, this is a subclass of :class:`Finder`. + + .. versionadded:: 3.3 + + .. method:: find_module(fullname, path) + + An abstract method for finding a :term:`loader` for the specified + module. If this is a top-level import, *path* will be ``None``. + Otherwise, this is a search for a subpackage or module and *path* + will be the value of :attr:`__path__` from the parent + package. If a loader cannot be found, ``None`` is returned. + + .. method:: invalidate_caches() - .. method:: find_module(fullname, path=None) + An optional method which, when called, should invalidate any internal + cache used by the finder. Used by :func:`importlib.invalidate_caches` + when invalidating the caches of all finders on :data:`sys.meta_path`. - An abstract method for finding a :term:`loader` for the specified - module. If the :term:`finder` is found on :data:`sys.meta_path` and the - module to be searched for is a subpackage or module then *path* will - be the value of :attr:`__path__` from the parent package. If a loader - cannot be found, ``None`` is returned. + +.. class:: PathEntryFinder + + An abstract base class representing a :term:`path entry finder`. Though + it bears some similarities to :class:`MetaPathFinder`, ``PathEntryFinder`` + is meant for use only within the path-based import subsystem provided + by :class:`PathFinder`. This ABC is a subclass of :class:`Finder` for + compatibility. + + .. versionadded:: 3.3 + + .. method:: find_loader(fullname) + + An abstract method for finding a :term:`loader` for the specified + module. Returns a 2-tuple of ``(loader, portion)`` where ``portion`` + is a sequence of file system locations contributing to part of a namespace + package. The loader may be ``None`` while specifying ``portion`` to + signify the contribution of the file system locations to a namespace + package. An empty list can be used for ``portion`` to signify the loader + is not part of a package. If ``loader`` is ``None`` and ``portion`` is + the empty list then no loader or location for a namespace package were + found (i.e. failure to find anything for the module). + + .. method:: find_module(fullname) + + A concrete implementation of :meth:`Finder.find_module` which is + equivalent to ``self.find_loader(fullname)[0]``. + + .. method:: invalidate_caches() + + An optional method which, when called, should invalidate any internal + cache used by the finder. Used by :meth:`PathFinder.invalidate_caches` + when invalidating the caches of all cached finders. .. class:: Loader @@ -144,6 +243,10 @@ are also provided to help in implementing the core ABCs. The path to where the module data is stored (not set for built-in modules). + - :attr:`__cached__` + The path to where a compiled version of the module is/should be + stored (not set when the attribute would be inappropriate). + - :attr:`__path__` A list of strings specifying the search path within a package. This attribute is not set on modules. @@ -159,6 +262,13 @@ are also provided to help in implementing the core ABCs. (This is not set by the built-in import machinery, but it should be set whenever a :term:`loader` is used.) + .. method:: module_repr(module) + + An abstract method which when implemented calculates and returns the + given module's repr, as a string. + + .. versionadded: 3.3 + .. class:: ResourceLoader @@ -224,6 +334,38 @@ are also provided to help in implementing the core ABCs. module. +.. class:: FileLoader(fullname, path) + + An abstract base class which inherits from :class:`ResourceLoader` and + :class:`ExecutionLoader`, providing concrete implementations of + :meth:`ResourceLoader.get_data` and :meth:`ExecutionLoader.get_filename`. + + The *fullname* argument is a fully resolved name of the module the loader is + to handle. The *path* argument is the path to the file for the module. + + .. versionadded:: 3.3 + + .. attribute:: name + + The name of the module the loader can handle. + + .. attribute:: path + + Path to the file of the module. + + .. method:: load_module(fullname) + + Calls super's ``load_module()``. + + .. method:: get_filename(fullname) + + Returns :attr:`path`. + + .. method:: get_data(path) + + Returns the open, binary file for *path*. + + .. class:: SourceLoader An abstract base class for implementing source (and optionally bytecode) @@ -243,37 +385,59 @@ are also provided to help in implementing the core ABCs. optimization to speed up loading by removing the parsing step of Python's compiler, and so no bytecode-specific API is exposed. - .. method:: path_mtime(self, path) + .. method:: path_stats(path) + + Optional abstract method which returns a :class:`dict` containing + metadata about the specifed path. Supported dictionary keys are: + + - ``'mtime'`` (mandatory): an integer or floating-point number + representing the modification time of the source code; + - ``'size'`` (optional): the size in bytes of the source code. + + Any other keys in the dictionary are ignored, to allow for future + extensions. + + .. versionadded:: 3.3 + + .. method:: path_mtime(path) Optional abstract method which returns the modification time for the specified path. - .. method:: set_data(self, path, data) + .. deprecated:: 3.3 + This method is deprecated in favour of :meth:`path_stats`. You don't + have to implement it, but it is still available for compatibility + purposes. + + .. method:: set_data(path, data) Optional abstract method which writes the specified bytes to a file path. Any intermediate directories which do not exist are to be created automatically. When writing to the path fails because the path is read-only - (:attr:`errno.EACCES`), do not propagate the exception. + (:attr:`errno.EACCES`/:exc:`PermissionError`), do not propagate the + exception. - .. method:: get_code(self, fullname) + .. method:: get_code(fullname) Concrete implementation of :meth:`InspectLoader.get_code`. - .. method:: load_module(self, fullname) + .. method:: load_module(fullname) Concrete implementation of :meth:`Loader.load_module`. - .. method:: get_source(self, fullname) + .. method:: get_source(fullname) Concrete implementation of :meth:`InspectLoader.get_source`. - .. method:: is_package(self, fullname) + .. method:: is_package(fullname) Concrete implementation of :meth:`InspectLoader.is_package`. A module - is determined to be a package if its file path is a file named - ``__init__`` when the file extension is removed. + is determined to be a package if its file path (as provided by + :meth:`ExecutionLoader.get_filename`) is a file named + ``__init__`` when the file extension is removed **and** the module name + itself does not end in ``__init__``. .. class:: PyLoader @@ -374,6 +538,10 @@ are also provided to help in implementing the core ABCs. :class:`PyLoader`. Do note that this solution will not support sourceless/bytecode-only loading; only source *and* bytecode loading. + .. versionchanged:: 3.3 + Updated to parse (but not use) the new source size field in bytecode + files when reading and to write out the field properly when writing. + .. method:: source_mtime(fullname) An abstract method which returns the modification time for the source @@ -417,12 +585,59 @@ are also provided to help in implementing the core ABCs. This module contains the various objects that help :keyword:`import` find and load modules. +.. attribute:: SOURCE_SUFFIXES + + A list of strings representing the recognized file suffixes for source + modules. + + .. versionadded:: 3.3 + +.. attribute:: DEBUG_BYTECODE_SUFFIXES + + A list of strings representing the file suffixes for non-optimized bytecode + modules. + + .. versionadded:: 3.3 + +.. attribute:: OPTIMIZED_BYTECODE_SUFFIXES + + A list of strings representing the file suffixes for optimized bytecode + modules. + + .. versionadded:: 3.3 + +.. attribute:: BYTECODE_SUFFIXES + + A list of strings representing the recognized file suffixes for bytecode + modules. Set to either :attr:`DEBUG_BYTECODE_SUFFIXES` or + :attr:`OPTIMIZED_BYTECODE_SUFFIXES` based on whether ``__debug__`` is true. + + .. versionadded:: 3.3 + +.. attribute:: EXTENSION_SUFFIXES + + A list of strings representing the recognized file suffixes for + extension modules. + + .. versionadded:: 3.3 + +.. function:: all_suffixes() + + Returns a combined list of strings representing all file suffixes for + modules recognized by the standard import machinery. This is a + helper for code which simply needs to know if a filesystem path + potentially refers to a module without needing any details on the kind + of module (for example, :func:`inspect.getmodulename`) + + .. versionadded:: 3.3 + + .. class:: BuiltinImporter An :term:`importer` for built-in modules. All known built-in modules are listed in :data:`sys.builtin_module_names`. This class implements the - :class:`importlib.abc.Finder` and :class:`importlib.abc.InspectLoader` - ABCs. + :class:`importlib.abc.MetaPathFinder` and + :class:`importlib.abc.InspectLoader` ABCs. Only class methods are defined by this class to alleviate the need for instantiation. @@ -431,48 +646,225 @@ find and load modules. .. class:: FrozenImporter An :term:`importer` for frozen modules. This class implements the - :class:`importlib.abc.Finder` and :class:`importlib.abc.InspectLoader` - ABCs. + :class:`importlib.abc.MetaPathFinder` and + :class:`importlib.abc.InspectLoader` ABCs. Only class methods are defined by this class to alleviate the need for instantiation. +.. class:: WindowsRegistryFinder + + :term:`Finder` for modules declared in the Windows registry. This class + implements the :class:`importlib.abc.Finder` ABC. + + Only class methods are defined by this class to alleviate the need for + instantiation. + + .. versionadded:: 3.3 + + .. class:: PathFinder - :term:`Finder` for :data:`sys.path`. This class implements the - :class:`importlib.abc.Finder` ABC. + A :term:`Finder` for :data:`sys.path` and package ``__path__`` attributes. + This class implements the :class:`importlib.abc.MetaPathFinder` ABC. - This class does not perfectly mirror the semantics of :keyword:`import` in - terms of :data:`sys.path`. No implicit path hooks are assumed for - simplification of the class and its semantics. + Only class methods are defined by this class to alleviate the need for + instantiation. - Only class methods are defined by this class to alleviate the need for - instantiation. + .. classmethod:: find_module(fullname, path=None) + + Class method that attempts to find a :term:`loader` for the module + specified by *fullname* on :data:`sys.path` or, if defined, on + *path*. For each path entry that is searched, + :data:`sys.path_importer_cache` is checked. If a non-false object is + found then it is used as the :term:`path entry finder` to look for the + module being searched for. If no entry is found in + :data:`sys.path_importer_cache`, then :data:`sys.path_hooks` is + searched for a finder for the path entry and, if found, is stored in + :data:`sys.path_importer_cache` along with being queried about the + module. If no finder is ever found then ``None`` is both stored in + the cache and returned. + + .. classmethod:: invalidate_caches() + + Calls :meth:`importlib.abc.PathEntryFinder.invalidate_caches` on all + finders stored in :attr:`sys.path_importer_cache`. + + +.. class:: FileFinder(path, \*loader_details) + + A concrete implementation of :class:`importlib.abc.PathEntryFinder` which + caches results from the file system. + + The *path* argument is the directory for which the finder is in charge of + searching. + + The *loader_details* argument is a variable number of 2-item tuples each + containing a loader and a sequence of file suffixes the loader recognizes. + The loaders are expected to be callables which accept two arguments of + the module's name and the path to the file found. + + The finder will cache the directory contents as necessary, making stat calls + for each module search to verify the cache is not outdated. Because cache + staleness relies upon the granularity of the operating system's state + information of the file system, there is a potential race condition of + searching for a module, creating a new file, and then searching for the + module the new file represents. If the operations happen fast enough to fit + within the granularity of stat calls, then the module search will fail. To + prevent this from happening, when you create a module dynamically, make sure + to call :func:`importlib.invalidate_caches`. + + .. versionadded:: 3.3 + + .. attribute:: path + + The path the finder will search in. + + .. method:: find_loader(fullname) + + Attempt to find the loader to handle *fullname* within :attr:`path`. + + .. method:: invalidate_caches() + + Clear out the internal cache. + + .. classmethod:: path_hook(\*loader_details) + + A class method which returns a closure for use on :attr:`sys.path_hooks`. + An instance of :class:`FileFinder` is returned by the closure using the + path argument given to the closure directly and *loader_details* + indirectly. + + If the argument to the closure is not an existing directory, + :exc:`ImportError` is raised. + + +.. class:: SourceFileLoader(fullname, path) + + A concrete implementation of :class:`importlib.abc.SourceLoader` by + subclassing :class:`importlib.abc.FileLoader` and providing some concrete + implementations of other methods. + + .. versionadded:: 3.3 + + .. attribute:: name + + The name of the module that this loader will handle. + + .. attribute:: path + + The path to the source file. + + .. method:: is_package(fullname) + + Return true if :attr:`path` appears to be for a package. + + .. method:: path_stats(path) + + Concrete implementation of :meth:`importlib.abc.SourceLoader.path_stats`. + + .. method:: set_data(path, data) + + Concrete implementation of :meth:`importlib.abc.SourceLoader.set_data`. + + +.. class:: SourcelessFileLoader(fullname, path) - .. classmethod:: find_module(fullname, path=None) + A concrete implementation of :class:`importlib.abc.FileLoader` which can + import bytecode files (i.e. no source code files exist). - Class method that attempts to find a :term:`loader` for the module - specified by *fullname* on :data:`sys.path` or, if defined, on - *path*. For each path entry that is searched, - :data:`sys.path_importer_cache` is checked. If an non-false object is - found then it is used as the :term:`finder` to look for the module - being searched for. If no entry is found in - :data:`sys.path_importer_cache`, then :data:`sys.path_hooks` is - searched for a finder for the path entry and, if found, is stored in - :data:`sys.path_importer_cache` along with being queried about the - module. If no finder is ever found then ``None`` is returned. + Please note that direct use of bytecode files (and thus not source code + files) inhibits your modules from being usable by all Python + implementations or new versions of Python which change the bytecode + format. + + .. versionadded:: 3.3 + + .. attribute:: name + + The name of the module the loader will handle. + + .. attribute:: path + + The path to the bytecode file. + + .. method:: is_package(fullname) + + Determines if the module is a package based on :attr:`path`. + + .. method:: get_code(fullname) + + Returns the code object for :attr:`name` created from :attr:`path`. + + .. method:: get_source(fullname) + + Returns ``None`` as bytecode files have no source when this loader is + used. + + +.. class:: ExtensionFileLoader(fullname, path) + + A concrete implementation of :class:`importlib.abc.InspectLoader` for + extension modules. + + The *fullname* argument specifies the name of the module the loader is to + support. The *path* argument is the path to the extension module's file. + + .. versionadded:: 3.3 + + .. attribute:: name + + Name of the module the loader supports. + + .. attribute:: path + + Path to the extension module. + + .. method:: load_module(fullname) + + Loads the extension module if and only if *fullname* is the same as + :attr:`name` or is ``None``. + + .. method:: is_package(fullname) + + Returns ``True`` if the file path points to a package's ``__init__`` + module based on :attr:`EXTENSION_SUFFIXES`. + + .. method:: get_code(fullname) + + Returns ``None`` as extension modules lack a code object. + + .. method:: get_source(fullname) + + Returns ``None`` as extension modules do not have source code. :mod:`importlib.util` -- Utility code for importers --------------------------------------------------- .. module:: importlib.util - :synopsis: Importers and path hooks + :synopsis: Utility code for importers This module contains the various objects that help in the construction of an :term:`importer`. +.. function:: resolve_name(name, package) + + Resolve a relative module name to an absolute one. + + If **name** has no leading dots, then **name** is simply returned. This + allows for usage such as + ``importlib.util.resolve_name('sys', __package__)`` without doing a + check to see if the **package** argument is needed. + + :exc:`ValueError` is raised if **name** is a relative module name but + package is a false value (e.g. ``None`` or the empty string). + :exc:`ValueError` is also raised a relative name would escape its containing + package (e.g. requesting ``..bacon`` from within the ``spam`` package). + + .. versionadded:: 3.3 + .. decorator:: module_for_loader A :term:`decorator` for a :term:`loader` method, @@ -481,22 +873,30 @@ an :term:`importer`. signature taking two positional arguments (e.g. ``load_module(self, module)``) for which the second argument will be the module **object** to be used by the loader. - Note that the decorator - will not work on static methods because of the assumption of two - arguments. + Note that the decorator will not work on static methods because of the + assumption of two arguments. The decorated method will take in the **name** of the module to be loaded as expected for a :term:`loader`. If the module is not found in :data:`sys.modules` then a new one is constructed with its - :attr:`__name__` attribute set. Otherwise the module found in - :data:`sys.modules` will be passed into the method. If an - exception is raised by the decorated method and a module was added to + :attr:`__name__` attribute set to **name**, :attr:`__loader__` set to + **self**, and :attr:`__package__` set if + :meth:`importlib.abc.InspectLoader.is_package` is defined for **self** and + does not raise :exc:`ImportError` for **name**. If a new module is not + needed then the module found in :data:`sys.modules` will be passed into the + method. + + If an exception is raised by the decorated method and a module was added to :data:`sys.modules` it will be removed to prevent a partially initialized module from being in left in :data:`sys.modules`. If the module was already in :data:`sys.modules` then it is left alone. Use of this decorator handles all the details of which module object a - loader should initialize as specified by :pep:`302`. + loader should initialize as specified by :pep:`302` as best as possible. + + .. versionchanged:: 3.3 + :attr:`__loader__` and :attr:`__package__` are automatically set + (when possible). .. decorator:: set_loader @@ -504,7 +904,13 @@ an :term:`importer`. to set the :attr:`__loader__` attribute on loaded modules. If the attribute is already set the decorator does nothing. It is assumed that the first positional argument to the - wrapped method is what :attr:`__loader__` should be set to. + wrapped method (i.e. ``self``) is what :attr:`__loader__` should be set to. + + .. note:: + + It is recommended that :func:`module_for_loader` be used over this + decorator as it subsumes this functionality. + .. decorator:: set_package @@ -515,8 +921,12 @@ an :term:`importer`. set on and not the module found in :data:`sys.modules`. Reliance on this decorator is discouraged when it is possible to set - :attr:`__package__` before the execution of the code is possible. By - setting it before the code for the module is executed it allows the - attribute to be used at the global level of the module during + :attr:`__package__` before importing. By + setting it beforehand the code for the module is executed with the + attribute set and thus can be used by global level code during initialization. + .. note:: + + It is recommended that :func:`module_for_loader` be used over this + decorator as it subsumes this functionality. diff --git a/Doc/library/index.rst b/Doc/library/index.rst index b6a5559..1b25c6e 100644 --- a/Doc/library/index.rst +++ b/Doc/library/index.rst @@ -43,7 +43,8 @@ the `Python Package Index <http://pypi.python.org/pypi>`_. stdtypes.rst exceptions.rst - strings.rst + text.rst + binary.rst datatypes.rst numeric.rst functional.rst @@ -53,7 +54,7 @@ the `Python Package Index <http://pypi.python.org/pypi>`_. fileformats.rst crypto.rst allos.rst - someos.rst + concurrency.rst ipc.rst netdata.rst markup.rst diff --git a/Doc/library/inspect.rst b/Doc/library/inspect.rst index d127ce8..d4cf905 100644 --- a/Doc/library/inspect.rst +++ b/Doc/library/inspect.rst @@ -190,13 +190,26 @@ attributes: compared to the constants defined in the :mod:`imp` module; see the documentation for that module for more information on module types. + .. deprecated:: 3.3 + You may check the file path's suffix against the supported suffixes + listed in :mod:`importlib.machinery` to infer the same information. + .. function:: getmodulename(path) Return the name of the module named by the file *path*, without including the - names of enclosing packages. This uses the same algorithm as the interpreter - uses when searching for modules. If the name cannot be matched according to the - interpreter's rules, ``None`` is returned. + names of enclosing packages. The file extension is checked against all of + the entries in :func:`importlib.machinery.all_suffixes`. If it matches, + the final path component is returned with the extension removed. + Otherwise, ``None`` is returned. + + Note that this function *only* returns a meaningful name for actual + Python modules - paths that potentially refer to Python packages will + still return ``None``. + + .. versionchanged:: 3.3 + This function is now based directly on :mod:`importlib` rather than the + deprecated :func:`getmoduleinfo`. .. function:: ismodule(object) @@ -355,17 +368,25 @@ Retrieving source code argument may be a module, class, method, function, traceback, frame, or code object. The source code is returned as a list of the lines corresponding to the object and the line number indicates where in the original source file the first - line of code was found. An :exc:`IOError` is raised if the source code cannot + line of code was found. An :exc:`OSError` is raised if the source code cannot be retrieved. + .. versionchanged:: 3.3 + :exc:`OSError` is raised instead of :exc:`IOError`, now an alias of the + former. + .. function:: getsource(object) Return the text of the source code for an object. The argument may be a module, class, method, function, traceback, frame, or code object. The source code is - returned as a single string. An :exc:`IOError` is raised if the source code + returned as a single string. An :exc:`OSError` is raised if the source code cannot be retrieved. + .. versionchanged:: 3.3 + :exc:`OSError` is raised instead of :exc:`IOError`, now an alias of the + former. + .. function:: cleandoc(doc) @@ -374,6 +395,275 @@ Retrieving source code onwards is removed. Also, all tabs are expanded to spaces. +.. _inspect-signature-object: + +Introspecting callables with the Signature object +------------------------------------------------- + +.. versionadded:: 3.3 + +The Signature object represents the call signature of a callable object and its +return annotation. To retrieve a Signature object, use the :func:`signature` +function. + +.. function:: signature(callable) + + Return a :class:`Signature` object for the given ``callable``:: + + >>> from inspect import signature + >>> def foo(a, *, b:int, **kwargs): + ... pass + + >>> sig = signature(foo) + + >>> str(sig) + '(a, *, b:int, **kwargs)' + + >>> str(sig.parameters['b']) + 'b:int' + + >>> sig.parameters['b'].annotation + <class 'int'> + + Accepts a wide range of python callables, from plain functions and classes to + :func:`functools.partial` objects. + + .. note:: + + Some callables may not be introspectable in certain implementations of + Python. For example, in CPython, built-in functions defined in C provide + no metadata about their arguments. + + +.. class:: Signature(parameters=None, \*, return_annotation=Signature.empty) + + A Signature object represents the call signature of a function and its return + annotation. For each parameter accepted by the function it stores a + :class:`Parameter` object in its :attr:`parameters` collection. + + The optional *parameters* argument is a sequence of :class:`Parameter` + objects, which is validated to check that there are no parameters with + duplicate names, and that the parameters are in the right order, i.e. + positional-only first, then positional-or-keyword, and that parameters with + defaults follow parameters without defaults. + + The optional *return_annotation* argument, can be an arbitrary Python object, + is the "return" annotation of the callable. + + Signature objects are *immutable*. Use :meth:`Signature.replace` to make a + modified copy. + + .. attribute:: Signature.empty + + A special class-level marker to specify absence of a return annotation. + + .. attribute:: Signature.parameters + + An ordered mapping of parameters' names to the corresponding + :class:`Parameter` objects. + + .. attribute:: Signature.return_annotation + + The "return" annotation for the callable. If the callable has no "return" + annotation, this attribute is set to :attr:`Signature.empty`. + + .. method:: Signature.bind(*args, **kwargs) + + Create a mapping from positional and keyword arguments to parameters. + Returns :class:`BoundArguments` if ``*args`` and ``**kwargs`` match the + signature, or raises a :exc:`TypeError`. + + .. method:: Signature.bind_partial(*args, **kwargs) + + Works the same way as :meth:`Signature.bind`, but allows the omission of + some required arguments (mimics :func:`functools.partial` behavior.) + Returns :class:`BoundArguments`, or raises a :exc:`TypeError` if the + passed arguments do not match the signature. + + .. method:: Signature.replace(*[, parameters][, return_annotation]) + + Create a new Signature instance based on the instance replace was invoked + on. It is possible to pass different ``parameters`` and/or + ``return_annotation`` to override the corresponding properties of the base + signature. To remove return_annotation from the copied Signature, pass in + :attr:`Signature.empty`. + + :: + + >>> def test(a, b): + ... pass + >>> sig = signature(test) + >>> new_sig = sig.replace(return_annotation="new return anno") + >>> str(new_sig) + "(a, b) -> 'new return anno'" + + +.. class:: Parameter(name, kind, \*, default=Parameter.empty, annotation=Parameter.empty) + + Parameter objects are *immutable*. Instead of modifying a Parameter object, + you can use :meth:`Parameter.replace` to create a modified copy. + + .. attribute:: Parameter.empty + + A special class-level marker to specify absence of default values and + annotations. + + .. attribute:: Parameter.name + + The name of the parameter as a string. Must be a valid python identifier + name (with the exception of ``POSITIONAL_ONLY`` parameters, which can have + it set to ``None``). + + .. attribute:: Parameter.default + + The default value for the parameter. If the parameter has no default + value, this attribute is set to :attr:`Parameter.empty`. + + .. attribute:: Parameter.annotation + + The annotation for the parameter. If the parameter has no annotation, + this attribute is set to :attr:`Parameter.empty`. + + .. attribute:: Parameter.kind + + Describes how argument values are bound to the parameter. Possible values + (accessible via :class:`Parameter`, like ``Parameter.KEYWORD_ONLY``): + + .. tabularcolumns:: |l|L| + + +------------------------+----------------------------------------------+ + | Name | Meaning | + +========================+==============================================+ + | *POSITIONAL_ONLY* | Value must be supplied as a positional | + | | argument. | + | | | + | | Python has no explicit syntax for defining | + | | positional-only parameters, but many built-in| + | | and extension module functions (especially | + | | those that accept only one or two parameters)| + | | accept them. | + +------------------------+----------------------------------------------+ + | *POSITIONAL_OR_KEYWORD*| Value may be supplied as either a keyword or | + | | positional argument (this is the standard | + | | binding behaviour for functions implemented | + | | in Python.) | + +------------------------+----------------------------------------------+ + | *VAR_POSITIONAL* | A tuple of positional arguments that aren't | + | | bound to any other parameter. This | + | | corresponds to a ``*args`` parameter in a | + | | Python function definition. | + +------------------------+----------------------------------------------+ + | *KEYWORD_ONLY* | Value must be supplied as a keyword argument.| + | | Keyword only parameters are those which | + | | appear after a ``*`` or ``*args`` entry in a | + | | Python function definition. | + +------------------------+----------------------------------------------+ + | *VAR_KEYWORD* | A dict of keyword arguments that aren't bound| + | | to any other parameter. This corresponds to a| + | | ``**kwargs`` parameter in a Python function | + | | definition. | + +------------------------+----------------------------------------------+ + + Example: print all keyword-only arguments without default values:: + + >>> def foo(a, b, *, c, d=10): + ... pass + + >>> sig = signature(foo) + >>> for param in sig.parameters.values(): + ... if (param.kind == param.KEYWORD_ONLY and + ... param.default is param.empty): + ... print('Parameter:', param) + Parameter: c + + .. method:: Parameter.replace(*[, name][, kind][, default][, annotation]) + + Create a new Parameter instance based on the instance replaced was invoked + on. To override a :class:`Parameter` attribute, pass the corresponding + argument. To remove a default value or/and an annotation from a + Parameter, pass :attr:`Parameter.empty`. + + :: + + >>> from inspect import Parameter + >>> param = Parameter('foo', Parameter.KEYWORD_ONLY, default=42) + >>> str(param) + 'foo=42' + + >>> str(param.replace()) # Will create a shallow copy of 'param' + 'foo=42' + + >>> str(param.replace(default=Parameter.empty, annotation='spam')) + "foo:'spam'" + + +.. class:: BoundArguments + + Result of a :meth:`Signature.bind` or :meth:`Signature.bind_partial` call. + Holds the mapping of arguments to the function's parameters. + + .. attribute:: BoundArguments.arguments + + An ordered, mutable mapping (:class:`collections.OrderedDict`) of + parameters' names to arguments' values. Contains only explicitly bound + arguments. Changes in :attr:`arguments` will reflect in :attr:`args` and + :attr:`kwargs`. + + Should be used in conjunction with :attr:`Signature.parameters` for any + argument processing purposes. + + .. note:: + + Arguments for which :meth:`Signature.bind` or + :meth:`Signature.bind_partial` relied on a default value are skipped. + However, if needed, it is easy to include them. + + :: + + >>> def foo(a, b=10): + ... pass + + >>> sig = signature(foo) + >>> ba = sig.bind(5) + + >>> ba.args, ba.kwargs + ((5,), {}) + + >>> for param in sig.parameters.values(): + ... if param.name not in ba.arguments: + ... ba.arguments[param.name] = param.default + + >>> ba.args, ba.kwargs + ((5, 10), {}) + + + .. attribute:: BoundArguments.args + + A tuple of positional arguments values. Dynamically computed from the + :attr:`arguments` attribute. + + .. attribute:: BoundArguments.kwargs + + A dict of keyword arguments values. Dynamically computed from the + :attr:`arguments` attribute. + + The :attr:`args` and :attr:`kwargs` properties can be used to invoke + functions:: + + def test(a, *, b): + ... + + sig = signature(test) + ba = sig.bind(10, b=20) + test(*ba.args, **ba.kwargs) + + +.. seealso:: + + :pep:`362` - Function Signature Object. + The detailed specification, implementation details and examples. + + .. _inspect-classes-functions: Classes and functions @@ -396,9 +686,9 @@ Classes and functions :term:`named tuple` ``ArgSpec(args, varargs, keywords, defaults)`` is returned. *args* is a list of the argument names. *varargs* and *keywords* are the names of the ``*`` and ``**`` arguments or ``None``. *defaults* is a - tuple of default argument values or None if there are no default arguments; - if this tuple has *n* elements, they correspond to the last *n* elements - listed in *args*. + tuple of default argument values or ``None`` if there are no default + arguments; if this tuple has *n* elements, they correspond to the last + *n* elements listed in *args*. .. deprecated:: 3.0 Use :func:`getfullargspec` instead, which provides information about @@ -414,14 +704,19 @@ Classes and functions annotations)`` *args* is a list of the argument names. *varargs* and *varkw* are the names - of the ``*`` and ``**`` arguments or ``None``. *defaults* is an n-tuple of - the default values of the last n arguments. *kwonlyargs* is a list of + of the ``*`` and ``**`` arguments or ``None``. *defaults* is an *n*-tuple + of the default values of the last *n* arguments, or ``None`` if there are no + default arguments. *kwonlyargs* is a list of keyword-only argument names. *kwonlydefaults* is a dictionary mapping names from kwonlyargs to defaults. *annotations* is a dictionary mapping argument names to annotations. The first four items in the tuple correspond to :func:`getargspec`. + .. note:: + Consider using the new :ref:`Signature Object <inspect-signature-object>` + interface, which provides a better way of introspecting functions. + .. function:: getargvalues(frame) @@ -432,11 +727,23 @@ Classes and functions locals dictionary of the given frame. -.. function:: formatargspec(args[, varargs, varkw, defaults, formatarg, formatvarargs, formatvarkw, formatvalue]) +.. function:: formatargspec(args[, varargs, varkw, defaults, kwonlyargs, kwonlydefaults, annotations[, formatarg, formatvarargs, formatvarkw, formatvalue, formatreturns, formatannotations]]) - Format a pretty argument spec from the four values returned by - :func:`getargspec`. The format\* arguments are the corresponding optional - formatting functions that are called to turn names and values into strings. + Format a pretty argument spec from the values returned by + :func:`getargspec` or :func:`getfullargspec`. + + The first seven arguments are (``args``, ``varargs``, ``varkw``, + ``defaults``, ``kwonlyargs``, ``kwonlydefaults``, ``annotations``). The + other five arguments are the corresponding optional formatting functions + that are called to turn names and values into strings. The last argument + is an optional function to format the sequence of arguments. For example:: + + >>> from inspect import formatargspec, getfullargspec + >>> def f(a: int, b: float): + ... pass + ... + >>> formatargspec(*getfullargspec(f)) + '(a: int, b: float)' .. function:: formatargvalues(args[, varargs, varkw, locals, formatarg, formatvarargs, formatvarkw, formatvalue]) @@ -454,7 +761,7 @@ Classes and functions metatype is in use, cls will be the first element of the tuple. -.. function:: getcallargs(func[, *args][, **kwds]) +.. function:: getcallargs(func, *args, **kwds) Bind the *args* and *kwds* to the argument names of the Python function or method *func*, as if it was called with them. For bound methods, bind also the @@ -468,17 +775,36 @@ Classes and functions >>> from inspect import getcallargs >>> def f(a, b=1, *pos, **named): ... pass - >>> getcallargs(f, 1, 2, 3) - {'a': 1, 'named': {}, 'b': 2, 'pos': (3,)} - >>> getcallargs(f, a=2, x=4) - {'a': 2, 'named': {'x': 4}, 'b': 1, 'pos': ()} + >>> getcallargs(f, 1, 2, 3) == {'a': 1, 'named': {}, 'b': 2, 'pos': (3,)} + True + >>> getcallargs(f, a=2, x=4) == {'a': 2, 'named': {'x': 4}, 'b': 1, 'pos': ()} + True >>> getcallargs(f) Traceback (most recent call last): ... - TypeError: f() takes at least 1 argument (0 given) + TypeError: f() missing 1 required positional argument: 'a' .. versionadded:: 3.2 + .. note:: + Consider using the new :meth:`Signature.bind` instead. + + +.. function:: getclosurevars(func) + + Get the mapping of external name references in a Python function or + method *func* to their current values. A + :term:`named tuple` ``ClosureVars(nonlocals, globals, builtins, unbound)`` + is returned. *nonlocals* maps referenced names to lexical closure + variables, *globals* to the function's module globals and *builtins* to + the builtins visible from the function body. *unbound* is the set of names + referenced in the function that could not be resolved at all given the + current module globals and builtins. + + :exc:`TypeError` is raised if *func* is not a Python function or method. + + .. versionadded:: 3.3 + .. _inspect-stack: @@ -589,8 +915,9 @@ but avoids executing code when it fetches attributes. that raise AttributeError). It can also return descriptors objects instead of instance members. - If the instance :attr:`__dict__` is shadowed by another member (for example a - property) then this function will be unable to find instance members. + If the instance :attr:`~object.__dict__` is shadowed by another member (for + example a property) then this function will be unable to find instance + members. .. versionadded:: 3.2 @@ -643,3 +970,27 @@ generator to be determined easily. * GEN_CLOSED: Execution has completed. .. versionadded:: 3.2 + +The current internal state of the generator can also be queried. This is +mostly useful for testing purposes, to ensure that internal state is being +updated as expected: + +.. function:: getgeneratorlocals(generator) + + Get the mapping of live local variables in *generator* to their current + values. A dictionary is returned that maps from variable names to values. + This is the equivalent of calling :func:`locals` in the body of the + generator, and all the same caveats apply. + + If *generator* is a :term:`generator` with no currently associated frame, + then an empty dictionary is returned. :exc:`TypeError` is raised if + *generator* is not a Python generator object. + + .. impl-detail:: + + This function relies on the generator exposing a Python stack frame + for introspection, which isn't guaranteed to be the case in all + implementations of Python. In such cases, this function will always + return an empty dictionary. + + .. versionadded:: 3.3 diff --git a/Doc/library/internet.rst b/Doc/library/internet.rst index 6fa7873..b8950bb 100644 --- a/Doc/library/internet.rst +++ b/Doc/library/internet.rst @@ -23,10 +23,12 @@ is currently supported on most popular platforms. Here is an overview: cgi.rst cgitb.rst wsgiref.rst + urllib.rst urllib.request.rst urllib.parse.rst urllib.error.rst urllib.robotparser.rst + http.rst http.client.rst ftplib.rst poplib.rst @@ -40,5 +42,7 @@ is currently supported on most popular platforms. Here is an overview: http.server.rst http.cookies.rst http.cookiejar.rst + xmlrpc.rst xmlrpc.client.rst xmlrpc.server.rst + ipaddress.rst diff --git a/Doc/library/io.rst b/Doc/library/io.rst index 1939352..716767f 100644 --- a/Doc/library/io.rst +++ b/Doc/library/io.rst @@ -37,6 +37,10 @@ giving a :class:`str` object to the ``write()`` method of a binary stream will raise a ``TypeError``. So will giving a :class:`bytes` object to the ``write()`` method of a text stream. +.. versionchanged:: 3.3 + Operations that used to raise :exc:`IOError` now raise :exc:`OSError`, since + :exc:`IOError` is now an alias of :exc:`OSError`. + Text I/O ^^^^^^^^ @@ -55,7 +59,7 @@ In-memory text streams are also available as :class:`StringIO` objects:: f = io.StringIO("some initial text data") -The text stream API is described in detail in the documentation for the +The text stream API is described in detail in the documentation of :class:`TextIOBase`. @@ -106,28 +110,20 @@ High-level Module Interface :func:`os.stat`) if possible. -.. function:: open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True) +.. function:: open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None) This is an alias for the builtin :func:`open` function. .. exception:: BlockingIOError - Error raised when blocking would occur on a non-blocking stream. It inherits - :exc:`IOError`. - - In addition to those of :exc:`IOError`, :exc:`BlockingIOError` has one - attribute: - - .. attribute:: characters_written - - An integer containing the number of characters written to the stream - before it blocked. + This is a compatibility alias for the builtin :exc:`BlockingIOError` + exception. .. exception:: UnsupportedOperation - An exception inheriting :exc:`IOError` and :exc:`ValueError` that is raised + An exception inheriting :exc:`OSError` and :exc:`ValueError` that is raised when an unsupported operation is called on a stream. @@ -161,7 +157,7 @@ standard stream implementations. The abstract base classes also provide default implementations of some methods in order to help implementation of concrete stream classes. For example, :class:`BufferedIOBase` provides unoptimized implementations of - ``readinto()`` and ``readline()``. + :meth:`~IOBase.readinto` and :meth:`~IOBase.readline`. At the top of the I/O hierarchy is the abstract base class :class:`IOBase`. It defines the basic interface to a stream. Note, however, that there is no @@ -191,6 +187,8 @@ Argument names are not part of the specification, and only the arguments of The following table summarizes the ABCs provided by the :mod:`io` module: +.. tabularcolumns:: |l|l|L|L| + ========================= ================== ======================== ================================================== ABC Inherits Stub Methods Mixin Methods and Properties ========================= ================== ======================== ================================================== @@ -225,24 +223,24 @@ I/O Base Classes Even though :class:`IOBase` does not declare :meth:`read`, :meth:`readinto`, or :meth:`write` because their signatures will vary, implementations and clients should consider those methods part of the interface. Also, - implementations may raise a :exc:`IOError` when operations they do not - support are called. + implementations may raise a :exc:`ValueError` (or :exc:`UnsupportedOperation`) + when operations they do not support are called. The basic type used for binary data read from or written to a file is :class:`bytes`. :class:`bytearray`\s are accepted too, and in some cases - (such as :class:`readinto`) required. Text I/O classes work with + (such as :meth:`readinto`) required. Text I/O classes work with :class:`str` data. Note that calling any method (even inquiries) on a closed stream is - undefined. Implementations may raise :exc:`IOError` in this case. + undefined. Implementations may raise :exc:`ValueError` in this case. - IOBase (and its subclasses) supports the iterator protocol, meaning that an - :class:`IOBase` object can be iterated over yielding the lines in a stream. - Lines are defined slightly differently depending on whether the stream is - a binary stream (yielding bytes), or a text stream (yielding character - strings). See :meth:`~IOBase.readline` below. + :class:`IOBase` (and its subclasses) supports the iterator protocol, meaning + that an :class:`IOBase` object can be iterated over yielding the lines in a + stream. Lines are defined slightly differently depending on whether the + stream is a binary stream (yielding bytes), or a text stream (yielding + character strings). See :meth:`~IOBase.readline` below. - IOBase is also a context manager and therefore supports the + :class:`IOBase` is also a context manager and therefore supports the :keyword:`with` statement. In this example, *file* is closed after the :keyword:`with` statement's suite is finished---even if an exception occurs:: @@ -262,12 +260,12 @@ I/O Base Classes .. attribute:: closed - True if the stream is closed. + ``True`` if the stream is closed. .. method:: fileno() Return the underlying file descriptor (an integer) of the stream if it - exists. An :exc:`IOError` is raised if the IO object does not use a file + exists. An :exc:`OSError` is raised if the IO object does not use a file descriptor. .. method:: flush() @@ -282,8 +280,8 @@ I/O Base Classes .. method:: readable() - Return ``True`` if the stream can be read from. If False, :meth:`read` - will raise :exc:`IOError`. + Return ``True`` if the stream can be read from. If ``False``, :meth:`read` + will raise :exc:`OSError`. .. method:: readline(limit=-1) @@ -300,6 +298,9 @@ I/O Base Classes to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds *hint*. + Note that it's already possible to iterate on file objects using ``for + line in file: ...`` without calling ``file.readlines()``. + .. method:: seek(offset, whence=SEEK_SET) Change the stream position to the given byte *offset*. *offset* is @@ -318,10 +319,15 @@ I/O Base Classes .. versionadded:: 3.1 The ``SEEK_*`` constants. + .. versionadded:: 3.3 + Some operating systems could support additional values, like + :data:`os.SEEK_HOLE` or :data:`os.SEEK_DATA`. The valid values + for a file could depend on it being open in text or binary mode. + .. method:: seekable() Return ``True`` if the stream supports random access. If ``False``, - :meth:`seek`, :meth:`tell` and :meth:`truncate` will raise :exc:`IOError`. + :meth:`seek`, :meth:`tell` and :meth:`truncate` will raise :exc:`OSError`. .. method:: tell() @@ -339,7 +345,7 @@ I/O Base Classes .. method:: writable() Return ``True`` if the stream supports writing. If ``False``, - :meth:`write` and :meth:`truncate` will raise :exc:`IOError`. + :meth:`write` and :meth:`truncate` will raise :exc:`OSError`. .. method:: writelines(lines) @@ -358,7 +364,7 @@ I/O Base Classes (this is left to Buffered I/O and Text I/O, described later in this page). In addition to the attributes and methods from :class:`IOBase`, - RawIOBase provides the following methods: + :class:`RawIOBase` provides the following methods: .. method:: read(n=-1) @@ -378,18 +384,18 @@ I/O Base Classes .. method:: readinto(b) - Read up to len(b) bytes into bytearray *b* and return the number - of bytes read. If the object is in non-blocking mode and no + Read up to ``len(b)`` bytes into :class:`bytearray` *b* and return the + number of bytes read. If the object is in non-blocking mode and no bytes are available, ``None`` is returned. .. method:: write(b) - Write the given bytes or bytearray object, *b*, to the underlying raw - stream and return the number of bytes written. This can be less than - ``len(b)``, depending on specifics of the underlying raw stream, and - especially if it is in non-blocking mode. ``None`` is returned if the - raw stream is set not to block and no single byte could be readily - written to it. + Write the given :class:`bytes` or :class:`bytearray` object, *b*, to the + underlying raw stream and return the number of bytes written. This can + be less than ``len(b)``, depending on specifics of the underlying raw + stream, and especially if it is in non-blocking mode. ``None`` is + returned if the raw stream is set not to block and no single byte could + be readily written to it. .. class:: BufferedIOBase @@ -439,8 +445,8 @@ I/O Base Classes .. method:: read(n=-1) Read and return up to *n* bytes. If the argument is omitted, ``None``, or - negative, data is read and returned until EOF is reached. An empty bytes - object is returned if the stream is already at EOF. + negative, data is read and returned until EOF is reached. An empty + :class:`bytes` object is returned if the stream is already at EOF. If the argument is positive, and the underlying raw stream is not interactive, multiple raw reads may be issued to satisfy the byte count @@ -460,22 +466,23 @@ I/O Base Classes .. method:: readinto(b) - Read up to len(b) bytes into bytearray *b* and return the number of bytes - read. + Read up to ``len(b)`` bytes into bytearray *b* and return the number of + bytes read. Like :meth:`read`, multiple reads may be issued to the underlying raw - stream, unless the latter is 'interactive'. + stream, unless the latter is interactive. A :exc:`BlockingIOError` is raised if the underlying raw stream is in non blocking-mode, and has no data available at the moment. .. method:: write(b) - Write the given bytes or bytearray object, *b* and return the number - of bytes written (never less than ``len(b)``, since if the write fails - an :exc:`IOError` will be raised). Depending on the actual - implementation, these bytes may be readily written to the underlying - stream, or held in a buffer for performance and latency reasons. + Write the given :class:`bytes` or :class:`bytearray` object, *b* and + return the number of bytes written (never less than ``len(b)``, since if + the write fails an :exc:`OSError` will be raised). Depending on the + actual implementation, these bytes may be readily written to the + underlying stream, or held in a buffer for performance and latency + reasons. When in non-blocking mode, a :exc:`BlockingIOError` is raised if the data needed to be written to the raw stream but it couldn't accept @@ -485,7 +492,7 @@ I/O Base Classes Raw File I/O ^^^^^^^^^^^^ -.. class:: FileIO(name, mode='r', closefd=True) +.. class:: FileIO(name, mode='r', closefd=True, opener=None) :class:`FileIO` represents an OS-level file containing bytes data. It implements the :class:`RawIOBase` interface (and therefore the @@ -493,22 +500,38 @@ Raw File I/O The *name* can be one of two things: - * a character string or bytes object representing the path to the file - which will be opened; + * a character string or :class:`bytes` object representing the path to the + file which will be opened; * an integer representing the number of an existing OS-level file descriptor to which the resulting :class:`FileIO` object will give access. - The *mode* can be ``'r'``, ``'w'`` or ``'a'`` for reading (default), writing, - or appending. The file will be created if it doesn't exist when opened for - writing or appending; it will be truncated when opened for writing. Add a + The *mode* can be ``'r'``, ``'w'``, ``'x'`` or ``'a'`` for reading + (default), writing, exclusive creation or appending. The file will be + created if it doesn't exist when opened for writing or appending; it will be + truncated when opened for writing. :exc:`FileExistsError` will be raised if + it already exists when opened for creating. Opening a file for creating + implies writing, so this mode behaves in a similar way to ``'w'``. Add a ``'+'`` to the mode to allow simultaneous reading and writing. The :meth:`read` (when called with a positive argument), :meth:`readinto` and :meth:`write` methods on this class will only make one system call. + A custom opener can be used by passing a callable as *opener*. The underlying + file descriptor for the file object is then obtained by calling *opener* with + (*name*, *flags*). *opener* must return an open file descriptor (passing + :mod:`os.open` as *opener* results in functionality similar to passing + ``None``). + + See the :func:`open` built-in function for examples on using the *opener* + parameter. + + .. versionchanged:: 3.3 + The *opener* parameter was added. + The ``'x'`` mode was added. + In addition to the attributes and methods from :class:`IOBase` and :class:`RawIOBase`, :class:`FileIO` provides the following data - attributes and methods: + attributes: .. attribute:: mode @@ -556,7 +579,7 @@ than raw I/O does. .. method:: getvalue() - Return ``bytes`` containing the entire contents of the buffer. + Return :class:`bytes` containing the entire contents of the buffer. .. method:: read1() @@ -600,7 +623,7 @@ than raw I/O does. A buffer providing higher-level access to a writeable, sequential :class:`RawIOBase` object. It inherits :class:`BufferedIOBase`. - When writing to this object, data is normally held into an internal + When writing to this object, data is normally placed into an internal buffer. The buffer will be written out to the underlying :class:`RawIOBase` object under various conditions, including: @@ -613,8 +636,6 @@ than raw I/O does. *raw* stream. If the *buffer_size* is not given, it defaults to :data:`DEFAULT_BUFFER_SIZE`. - A third argument, *max_buffer_size*, is supported, but unused and deprecated. - :class:`BufferedWriter` provides or overrides these methods in addition to those from :class:`BufferedIOBase` and :class:`IOBase`: @@ -625,9 +646,10 @@ than raw I/O does. .. method:: write(b) - Write the bytes or bytearray object, *b* and return the number of bytes - written. When in non-blocking mode, a :exc:`BlockingIOError` is raised - if the buffer needs to be written out but the raw stream blocks. + Write the :class:`bytes` or :class:`bytearray` object, *b* and return the + number of bytes written. When in non-blocking mode, a + :exc:`BlockingIOError` is raised if the buffer needs to be written out but + the raw stream blocks. .. class:: BufferedRandom(raw, buffer_size=DEFAULT_BUFFER_SIZE) @@ -640,8 +662,6 @@ than raw I/O does. in the first argument. If the *buffer_size* is omitted it defaults to :data:`DEFAULT_BUFFER_SIZE`. - A third argument, *max_buffer_size*, is supported, but unused and deprecated. - :class:`BufferedRandom` is capable of anything :class:`BufferedReader` or :class:`BufferedWriter` can do. @@ -656,14 +676,12 @@ than raw I/O does. writeable respectively. If the *buffer_size* is omitted it defaults to :data:`DEFAULT_BUFFER_SIZE`. - A fourth argument, *max_buffer_size*, is supported, but unused and - deprecated. - :class:`BufferedRWPair` implements all of :class:`BufferedIOBase`\'s methods except for :meth:`~BufferedIOBase.detach`, which raises :exc:`UnsupportedOperation`. .. warning:: + :class:`BufferedRWPair` does not attempt to synchronize accesses to its underlying raw streams. You should not pass it the same object as reader and writer; use :class:`BufferedRandom` instead. @@ -701,7 +719,7 @@ Text I/O The underlying binary buffer (a :class:`BufferedIOBase` instance) that :class:`TextIOBase` deals with. This is not part of the - :class:`TextIOBase` API and may not exist on some implementations. + :class:`TextIOBase` API and may not exist in some implementations. .. method:: detach() @@ -761,13 +779,15 @@ Text I/O written. -.. class:: TextIOWrapper(buffer, encoding=None, errors=None, newline=None, line_buffering=False) +.. class:: TextIOWrapper(buffer, encoding=None, errors=None, newline=None, \ + line_buffering=False, write_through=False) A buffered text stream over a :class:`BufferedIOBase` binary stream. It inherits :class:`TextIOBase`. *encoding* gives the name of the encoding that the stream will be decoded or - encoded with. It defaults to :func:`locale.getpreferredencoding`. + encoded with. It defaults to + :func:`locale.getpreferredencoding(False) <locale.getpreferredencoding>`. *errors* is an optional string that specifies how encoding and decoding errors are to be handled. Pass ``'strict'`` to raise a :exc:`ValueError` @@ -804,6 +824,19 @@ Text I/O If *line_buffering* is ``True``, :meth:`flush` is implied when a call to write contains a newline character. + If *write_through* is ``True``, calls to :meth:`write` are guaranteed + not to be buffered: any data written on the :class:`TextIOWrapper` + object is immediately handled to its underlying binary *buffer*. + + .. versionchanged:: 3.3 + The *write_through* argument has been added. + + .. versionchanged:: 3.3 + The default *encoding* is now ``locale.getpreferredencoding(False)`` + instead of ``locale.getpreferredencoding()``. Don't change temporary the + locale encoding using :func:`locale.setlocale`, use the current locale + encoding instead of the user preferred encoding. + :class:`TextIOWrapper` provides one attribute in addition to those of :class:`TextIOBase` and its parents: @@ -812,13 +845,14 @@ Text I/O Whether line buffering is enabled. -.. class:: StringIO(initial_value='', newline=None) +.. class:: StringIO(initial_value='', newline='\\n') An in-memory stream for text I/O. The initial value of the buffer (an empty string by default) can be set by providing *initial_value*. The *newline* argument works like that of - :class:`TextIOWrapper`. The default is to do no newline translation. + :class:`TextIOWrapper`. The default is to consider only ``\n`` characters + as end of lines and to do no newline translation. :class:`StringIO` provides this method in addition to those from :class:`TextIOBase` and its parents: @@ -870,8 +904,8 @@ operating system's unbuffered I/O routines. The gain depends on the OS and the kind of I/O which is performed. For example, on some modern OSes such as Linux, unbuffered disk I/O can be as fast as buffered I/O. The bottom line, however, is that buffered I/O offers predictable performance regardless of the platform -and the backing device. Therefore, it is most always preferable to use buffered -I/O rather than unbuffered I/O for binary datal +and the backing device. Therefore, it is almost always preferable to use +buffered I/O rather than unbuffered I/O for binary data. Text I/O ^^^^^^^^ @@ -906,8 +940,8 @@ Binary buffered objects (instances of :class:`BufferedReader`, :class:`BufferedWriter`, :class:`BufferedRandom` and :class:`BufferedRWPair`) are not reentrant. While reentrant calls will not happen in normal situations, they can arise from doing I/O in a :mod:`signal` handler. If a thread tries to -renter a buffered object which it is already accessing, a :exc:`RuntimeError` is -raised. Note this doesn't prohibit a different thread from entering the +re-enter a buffered object which it is already accessing, a :exc:`RuntimeError` +is raised. Note this doesn't prohibit a different thread from entering the buffered object. The above implicitly extends to text files, since the :func:`open()` function diff --git a/Doc/library/ipaddress.rst b/Doc/library/ipaddress.rst new file mode 100644 index 0000000..8eac92f --- /dev/null +++ b/Doc/library/ipaddress.rst @@ -0,0 +1,804 @@ +:mod:`ipaddress` --- IPv4/IPv6 manipulation library +=================================================== + +.. module:: ipaddress + :synopsis: IPv4/IPv6 manipulation library. +.. moduleauthor:: Peter Moody + +**Source code:** :source:`Lib/ipaddress.py` + +-------------- + +.. note:: + + The ``ipaddress`` module has been included in the standard library on a + :term:`provisional basis <provisional package>`. Backwards incompatible + changes (up to and including removal of the package) may occur if deemed + necessary by the core developers. + +:mod:`ipaddress` provides the capabilities to create, manipulate and +operate on IPv4 and IPv6 addresses and networks. + +The functions and classes in this module make it straightforward to handle +various tasks related to IP addresses, including checking whether or not two +hosts are on the same subnet, iterating over all hosts in a particular +subnet, checking whether or not a string represents a valid IP address or +network definition, and so on. + +This is the full module API reference - for an overview and introduction, +see :ref:`ipaddress-howto`. + +.. versionadded:: 3.3 + + +Convenience factory functions +----------------------------- + +The :mod:`ipaddress` module provides factory functions to conveniently create +IP addresses, networks and interfaces: + +.. function:: ip_address(address) + + Return an :class:`IPv4Address` or :class:`IPv6Address` object depending on + the IP address passed as argument. Either IPv4 or IPv6 addresses may be + supplied; integers less than 2**32 will be considered to be IPv4 by default. + A :exc:`ValueError` is raised if *address* does not represent a valid IPv4 + or IPv6 address. + +.. testsetup:: + >>> import ipaddress + >>> from ipaddress import (ip_network, IPv4Address, IPv4Interface, + ... IPv4Network) + +:: + + >>> ipaddress.ip_address('192.168.0.1') + IPv4Address('192.168.0.1') + >>> ipaddress.ip_address('2001:db8::') + IPv6Address('2001:db8::') + + +.. function:: ip_network(address, strict=True) + + Return an :class:`IPv4Network` or :class:`IPv6Network` object depending on + the IP address passed as argument. *address* is a string or integer + representing the IP network. Either IPv4 or IPv6 networks may be supplied; + integers less than 2**32 will be considered to be IPv4 by default. *strict* + is passed to :class:`IPv4Network` or :class:`IPv6Network` constructor. A + :exc:`ValueError` is raised if *address* does not represent a valid IPv4 or + IPv6 address, or if the network has host bits set. + + >>> ipaddress.ip_network('192.168.0.0/28') + IPv4Network('192.168.0.0/28') + + +.. function:: ip_interface(address) + + Return an :class:`IPv4Interface` or :class:`IPv6Interface` object depending + on the IP address passed as argument. *address* is a string or integer + representing the IP address. Either IPv4 or IPv6 addresses may be supplied; + integers less than 2**32 will be considered to be IPv4 by default. A + :exc:`ValueError` is raised if *address* does not represent a valid IPv4 or + IPv6 address. + +One downside of these convenience functions is that the need to handle both +IPv4 and IPv6 formats means that error messages provide minimal +information on the precise error, as the functions don't know whether the +IPv4 or IPv6 format was intended. More detailed error reporting can be +obtained by calling the appropriate version specific class constructors +directly. + + +IP Addresses +------------ + +Address objects +^^^^^^^^^^^^^^^ + +The :class:`IPv4Address` and :class:`IPv6Address` objects share a lot of common +attributes. Some attributes that are only meaningful for IPv6 addresses are +also implemented by :class:`IPv4Address` objects, in order to make it easier to +write code that handles both IP versions correctly. + +.. class:: IPv4Address(address) + + Construct an IPv4 address. An :exc:`AddressValueError` is raised if + *address* is not a valid IPv4 address. + + The following constitutes a valid IPv4 address: + + 1. A string in decimal-dot notation, consisting of four decimal integers in + the inclusive range 0-255, separated by dots (e.g. ``192.168.0.1``). Each + integer represents an octet (byte) in the address. Leading zeroes are + tolerated only for values less then 8 (as there is no ambiguity + between the decimal and octal interpretations of such strings). + 2. An integer that fits into 32 bits. + 3. An integer packed into a :class:`bytes` object of length 4 (most + significant octet first). + + >>> ipaddress.IPv4Address('192.168.0.1') + IPv4Address('192.168.0.1') + >>> ipaddress.IPv4Address(3232235521) + IPv4Address('192.168.0.1') + >>> ipaddress.IPv4Address(b'\xC0\xA8\x00\x01') + IPv4Address('192.168.0.1') + + .. attribute:: version + + The appropriate version number: ``4`` for IPv4, ``6`` for IPv6. + + .. attribute:: max_prefixlen + + The total number of bits in the address representation for this + version: ``32`` for IPv4, ``128`` for IPv6. + + The prefix defines the number of leading bits in an address that + are compared to determine whether or not an address is part of a + network. + + .. attribute:: compressed + .. attribute:: exploded + + The string representation in dotted decimal notation. Leading zeroes + are never included in the representation. + + As IPv4 does not define a shorthand notation for addresses with octets + set to zero, these two attributes are always the same as ``str(addr)`` + for IPv4 addresses. Exposing these attributes makes it easier to + write display code that can handle both IPv4 and IPv6 addresses. + + .. attribute:: packed + + The binary representation of this address - a :class:`bytes` object of + the appropriate length (most significant octet first). This is 4 bytes + for IPv4 and 16 bytes for IPv6. + + .. attribute:: is_multicast + + ``True`` if the address is reserved for multicast use. See + :RFC:`3171` (for IPv4) or :RFC:`2373` (for IPv6). + + .. attribute:: is_private + + ``True`` if the address is allocated for private networks. See + :RFC:`1918` (for IPv4) or :RFC:`4193` (for IPv6). + + .. attribute:: is_unspecified + + ``True`` if the address is unspecified. See :RFC:`5735` (for IPv4) + or :RFC:`2373` (for IPv6). + + .. attribute:: is_reserved + + ``True`` if the address is otherwise IETF reserved. + + .. attribute:: is_loopback + + ``True`` if this is a loopback address. See :RFC:`3330` (for IPv4) + or :RFC:`2373` (for IPv6). + + .. attribute:: is_link_local + + ``True`` if the address is reserved for link-local usage. See + :RFC:`3927`. + + +.. class:: IPv6Address(address) + + Construct an IPv6 address. An :exc:`AddressValueError` is raised if + *address* is not a valid IPv6 address. + + The following constitutes a valid IPv6 address: + + 1. A string consisting of eight groups of four hexadecimal digits, each + group representing 16 bits. The groups are separated by colons. + This describes an *exploded* (longhand) notation. The string can + also be *compressed* (shorthand notation) by various means. See + :RFC:`4291` for details. For example, + ``"0000:0000:0000:0000:0000:0abc:0007:0def"`` can be compressed to + ``"::abc:7:def"``. + 2. An integer that fits into 128 bits. + 3. An integer packed into a :class:`bytes` object of length 16, big-endian. + + >>> ipaddress.IPv6Address('2001:db8::1000') + IPv6Address('2001:db8::1000') + + .. attribute:: compressed + + The short form of the address representation, with leading zeroes in + groups omitted and the longest sequence of groups consisting entirely of + zeroes collapsed to a single empty group. + + This is also the value returned by ``str(addr)`` for IPv6 addresses. + + .. attribute:: exploded + + The long form of the address representation, with all leading zeroes and + groups consisting entirely of zeroes included. + + .. attribute:: packed + .. attribute:: version + .. attribute:: max_prefixlen + .. attribute:: is_multicast + .. attribute:: is_private + .. attribute:: is_unspecified + .. attribute:: is_reserved + .. attribute:: is_loopback + .. attribute:: is_link_local + + Refer to the corresponding attribute documentation in + :class:`IPv4Address` + + .. attribute:: is_site_local + + ``True`` if the address is reserved for site-local usage. Note that + the site-local address space has been deprecated by :RFC:`3879`. Use + :attr:`~IPv4Address.is_private` to test if this address is in the + space of unique local addresses as defined by :RFC:`4193`. + + .. attribute:: ipv4_mapped + + For addresses that appear to be IPv4 mapped addresses (starting with + ``::FFFF/96``), this property will report the embedded IPv4 address. + For any other address, this property will be ``None``. + + .. attribute:: sixtofour + + For addresses that appear to be 6to4 addresses (starting with + ``2002::/16``) as defined by :RFC:`3056`, this property will report + the embedded IPv4 address. For any other address, this property will + be ``None``. + + .. attribute:: teredo + + For addresses that appear to be Teredo addresses (starting with + ``2001::/32``) as defined by :RFC:`4380`, this property will report + the embedded ``(server, client)`` IP address pair. For any other + address, this property will be ``None``. + + +Conversion to Strings and Integers +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To interoperate with networking interfaces such as the socket module, +addresses must be converted to strings or integers. This is handled using +the :func:`str` and :func:`int` builtin functions:: + + >>> str(ipaddress.IPv4Address('192.168.0.1')) + '192.168.0.1' + >>> int(ipaddress.IPv4Address('192.168.0.1')) + 3232235521 + >>> str(ipaddress.IPv6Address('::1')) + '::1' + >>> int(ipaddress.IPv6Address('::1')) + 1 + + +Operators +^^^^^^^^^ + +Address objects support some operators. Unless stated otherwise, operators can +only be applied between compatible objects (i.e. IPv4 with IPv4, IPv6 with +IPv6). + + +Comparison operators +"""""""""""""""""""" + +Address objects can be compared with the usual set of comparison operators. Some +examples:: + + >>> IPv4Address('127.0.0.2') > IPv4Address('127.0.0.1') + True + >>> IPv4Address('127.0.0.2') == IPv4Address('127.0.0.1') + False + >>> IPv4Address('127.0.0.2') != IPv4Address('127.0.0.1') + True + + +Arithmetic operators +"""""""""""""""""""" + +Integers can be added to or subtracted from address objects. Some examples:: + + >>> IPv4Address('127.0.0.2') + 3 + IPv4Address('127.0.0.5') + >>> IPv4Address('127.0.0.2') - 3 + IPv4Address('126.255.255.255') + >>> IPv4Address('255.255.255.255') + 1 + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + ipaddress.AddressValueError: 4294967296 (>= 2**32) is not permitted as an IPv4 address + + +IP Network definitions +---------------------- + +The :class:`IPv4Network` and :class:`IPv6Network` objects provide a mechanism +for defining and inspecting IP network definitions. A network definition +consists of a *mask* and a *network address*, and as such defines a range of +IP addresses that equal the network address when masked (binary AND) with the +mask. For example, a network definition with the mask ``255.255.255.0`` and +the network address ``192.168.1.0`` consists of IP addresses in the inclusive +range ``192.168.1.0`` to ``192.168.1.255``. + + +Prefix, net mask and host mask +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +There are several equivalent ways to specify IP network masks. A *prefix* +``/<nbits>`` is a notation that denotes how many high-order bits are set in +the network mask. A *net mask* is an IP address with some number of +high-order bits set. Thus the prefix ``/24`` is equivalent to the net mask +``255.255.255.0`` in IPv4, or ``ffff:ff00::`` in IPv6. In addition, a +*host mask* is the logical inverse of a *net mask*, and is sometimes used +(for example in Cisco access control lists) to denote a network mask. The +host mask equivalent to ``/24`` in IPv4 is ``0.0.0.255``. + + +Network objects +^^^^^^^^^^^^^^^ + +All attributes implemented by address objects are implemented by network +objects as well. In addition, network objects implement additional attributes. +All of these are common between :class:`IPv4Network` and :class:`IPv6Network`, +so to avoid duplication they are only documented for :class:`IPv4Network`. + +.. class:: IPv4Network(address, strict=True) + + Construct an IPv4 network definition. *address* can be one of the following: + + 1. A string consisting of an IP address and an optional mask, separated by + a slash (``/``). The IP address is the network address, and the mask + can be either a single number, which means it's a *prefix*, or a string + representation of an IPv4 address. If it's the latter, the mask is + interpreted as a *net mask* if it starts with a non-zero field, or as + a *host mask* if it starts with a zero field. If no mask is provided, + it's considered to be ``/32``. + + For example, the following *address* specifications are equivalent: + ``192.168.1.0/24``, ``192.168.1.0/255.255.255.0`` and + ``192.168.1.0/0.0.0.255``. + + 2. An integer that fits into 32 bits. This is equivalent to a + single-address network, with the network address being *address* and + the mask being ``/32``. + + 3. An integer packed into a :class:`bytes` object of length 4, big-endian. + The interpretation is similar to an integer *address*. + + An :exc:`AddressValueError` is raised if *address* is not a valid IPv4 + address. A :exc:`NetmaskValueError` is raised if the mask is not valid for + an IPv4 address. + + If *strict* is ``True`` and host bits are set in the supplied address, + then :exc:`ValueError` is raised. Otherwise, the host bits are masked out + to determine the appropriate network address. + + Unless stated otherwise, all network methods accepting other network/address + objects will raise :exc:`TypeError` if the argument's IP version is + incompatible to ``self`` + + .. attribute:: version + .. attribute:: max_prefixlen + + Refer to the corresponding attribute documentation in + :class:`IPv4Address` + + .. attribute:: is_multicast + .. attribute:: is_private + .. attribute:: is_unspecified + .. attribute:: is_reserved + .. attribute:: is_loopback + .. attribute:: is_link_local + + These attributes are true for the network as a whole if they are true + for both the network address and the broadcast address + + .. attribute:: network_address + + The network address for the network. The network address and the + prefix length together uniquely define a network. + + .. attribute:: broadcast_address + + The broadcast address for the network. Packets sent to the broadcast + address should be received by every host on the network. + + .. attribute:: hostmask + + The host mask, as a string. + + .. attribute:: with_prefixlen + .. attribute:: compressed + .. attribute:: exploded + + A string representation of the network, with the mask in prefix + notation. + + ``with_prefixlen`` and ``compressed`` are always the same as + ``str(network)``. + ``exploded`` uses the exploded form the network address. + + .. attribute:: with_netmask + + A string representation of the network, with the mask in net mask + notation. + + .. attribute:: with_hostmask + + A string representation of the network, with the mask in host mask + notation. + + .. attribute:: num_addresses + + The total number of addresses in the network. + + .. attribute:: prefixlen + + Length of the network prefix, in bits. + + .. method:: hosts() + + Returns an iterator over the usable hosts in the network. The usable + hosts are all the IP addresses that belong to the network, except the + network address itself and the network broadcast address. + + >>> list(ip_network('192.0.2.0/29').hosts()) #doctest: +NORMALIZE_WHITESPACE + [IPv4Address('192.0.2.1'), IPv4Address('192.0.2.2'), + IPv4Address('192.0.2.3'), IPv4Address('192.0.2.4'), + IPv4Address('192.0.2.5'), IPv4Address('192.0.2.6')] + + .. method:: overlaps(other) + + ``True`` if this network is partly or wholly contained in *other* or + *other* is wholly contained in this network. + + .. method:: address_exclude(network) + + Computes the network definitions resulting from removing the given + *network* from this one. Returns an iterator of network objects. + Raises :exc:`ValueError` if *network* is not completely contained in + this network. + + >>> n1 = ip_network('192.0.2.0/28') + >>> n2 = ip_network('192.0.2.1/32') + >>> list(n1.address_exclude(n2)) #doctest: +NORMALIZE_WHITESPACE + [IPv4Network('192.0.2.8/29'), IPv4Network('192.0.2.4/30'), + IPv4Network('192.0.2.2/31'), IPv4Network('192.0.2.0/32')] + + .. method:: subnets(prefixlen_diff=1, new_prefix=None) + + The subnets that join to make the current network definition, depending + on the argument values. *prefixlen_diff* is the amount our prefix + length should be increased by. *new_prefix* is the desired new + prefix of the subnets; it must be larger than our prefix. One and + only one of *prefixlen_diff* and *new_prefix* must be set. Returns an + iterator of network objects. + + >>> list(ip_network('192.0.2.0/24').subnets()) + [IPv4Network('192.0.2.0/25'), IPv4Network('192.0.2.128/25')] + >>> list(ip_network('192.0.2.0/24').subnets(prefixlen_diff=2)) #doctest: +NORMALIZE_WHITESPACE + [IPv4Network('192.0.2.0/26'), IPv4Network('192.0.2.64/26'), + IPv4Network('192.0.2.128/26'), IPv4Network('192.0.2.192/26')] + >>> list(ip_network('192.0.2.0/24').subnets(new_prefix=26)) #doctest: +NORMALIZE_WHITESPACE + [IPv4Network('192.0.2.0/26'), IPv4Network('192.0.2.64/26'), + IPv4Network('192.0.2.128/26'), IPv4Network('192.0.2.192/26')] + >>> list(ip_network('192.0.2.0/24').subnets(new_prefix=23)) + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + raise ValueError('new prefix must be longer') + ValueError: new prefix must be longer + >>> list(ip_network('192.0.2.0/24').subnets(new_prefix=25)) + [IPv4Network('192.0.2.0/25'), IPv4Network('192.0.2.128/25')] + + .. method:: supernet(prefixlen_diff=1, new_prefix=None) + + The supernet containing this network definition, depending on the + argument values. *prefixlen_diff* is the amount our prefix length + should be decreased by. *new_prefix* is the desired new prefix of + the supernet; it must be smaller than our prefix. One and only one + of *prefixlen_diff* and *new_prefix* must be set. Returns a single + network object. + + >>> ip_network('192.0.2.0/24').supernet() + IPv4Network('192.0.2.0/23') + >>> ip_network('192.0.2.0/24').supernet(prefixlen_diff=2) + IPv4Network('192.0.0.0/22') + >>> ip_network('192.0.2.0/24').supernet(new_prefix=20) + IPv4Network('192.0.0.0/20') + + .. method:: compare_networks(other) + + Compare this network to *other*. In this comparison only the network + addresses are considered; host bits aren't. Returns either ``-1``, + ``0`` or ``1``. + + >>> ip_network('192.0.2.1/32').compare_networks(ip_network('192.0.2.2/32')) + -1 + >>> ip_network('192.0.2.1/32').compare_networks(ip_network('192.0.2.0/32')) + 1 + >>> ip_network('192.0.2.1/32').compare_networks(ip_network('192.0.2.1/32')) + 0 + + +.. class:: IPv6Network(address, strict=True) + + Construct an IPv6 network definition. *address* can be one of the following: + + 1. A string consisting of an IP address and an optional mask, separated by + a slash (``/``). The IP address is the network address, and the mask + can be either a single number, which means it's a *prefix*, or a string + representation of an IPv6 address. If it's the latter, the mask is + interpreted as a *net mask*. If no mask is provided, it's considered to + be ``/128``. + + For example, the following *address* specifications are equivalent: + ``2001:db00::0/24`` and ``2001:db00::0/ffff:ff00::``. + + 2. An integer that fits into 128 bits. This is equivalent to a + single-address network, with the network address being *address* and + the mask being ``/128``. + + 3. An integer packed into a :class:`bytes` object of length 16, bit-endian. + The interpretation is similar to an integer *address*. + + An :exc:`AddressValueError` is raised if *address* is not a valid IPv6 + address. A :exc:`NetmaskValueError` is raised if the mask is not valid for + an IPv6 address. + + If *strict* is ``True`` and host bits are set in the supplied address, + then :exc:`ValueError` is raised. Otherwise, the host bits are masked out + to determine the appropriate network address. + + .. attribute:: version + .. attribute:: max_prefixlen + .. attribute:: is_multicast + .. attribute:: is_private + .. attribute:: is_unspecified + .. attribute:: is_reserved + .. attribute:: is_loopback + .. attribute:: is_link_local + .. attribute:: network_address + .. attribute:: broadcast_address + .. attribute:: hostmask + .. attribute:: with_prefixlen + .. attribute:: compressed + .. attribute:: exploded + .. attribute:: with_netmask + .. attribute:: with_hostmask + .. attribute:: num_addresses + .. attribute:: prefixlen + .. method:: hosts() + .. method:: overlaps(other) + .. method:: address_exclude(network) + .. method:: subnets(prefixlen_diff=1, new_prefix=None) + .. method:: supernet(prefixlen_diff=1, new_prefix=None) + .. method:: compare_networks(other) + + Refer to the corresponding attribute documentation in + :class:`IPv4Network` + + .. attribute:: is_site_local + + These attribute is true for the network as a whole if it is true + for both the network address and the broadcast address + + +Operators +^^^^^^^^^ + +Network objects support some operators. Unless stated otherwise, operators can +only be applied between compatible objects (i.e. IPv4 with IPv4, IPv6 with +IPv6). + + +Logical operators +""""""""""""""""" + +Network objects can be compared with the usual set of logical operators, +similarly to address objects. + + +Iteration +""""""""" + +Network objects can be iterated to list all the addresses belonging to the +network. For iteration, *all* hosts are returned, including unusable hosts +(for usable hosts, use the :meth:`~IPv4Network.hosts` method). An +example:: + + >>> for addr in IPv4Network('192.0.2.0/28'): + ... addr + ... + IPv4Address('192.0.2.0') + IPv4Address('192.0.2.1') + IPv4Address('192.0.2.2') + IPv4Address('192.0.2.3') + IPv4Address('192.0.2.4') + IPv4Address('192.0.2.5') + IPv4Address('192.0.2.6') + IPv4Address('192.0.2.7') + IPv4Address('192.0.2.8') + IPv4Address('192.0.2.9') + IPv4Address('192.0.2.10') + IPv4Address('192.0.2.11') + IPv4Address('192.0.2.12') + IPv4Address('192.0.2.13') + IPv4Address('192.0.2.14') + IPv4Address('192.0.2.15') + + +Networks as containers of addresses +""""""""""""""""""""""""""""""""""" + +Network objects can act as containers of addresses. Some examples:: + + >>> IPv4Network('192.0.2.0/28')[0] + IPv4Address('192.0.2.0') + >>> IPv4Network('192.0.2.0/28')[15] + IPv4Address('192.0.2.15') + >>> IPv4Address('192.0.2.6') in IPv4Network('192.0.2.0/28') + True + >>> IPv4Address('192.0.3.6') in IPv4Network('192.0.2.0/28') + False + + +Interface objects +----------------- + +.. class:: IPv4Interface(address) + + Construct an IPv4 interface. The meaning of *address* is as in the + constructor of :class:`IPv4Network`, except that arbitrary host addresses + are always accepted. + + :class:`IPv4Interface` is a subclass of :class:`IPv4Address`, so it inherits + all the attributes from that class. In addition, the following attributes + are available: + + .. attribute:: ip + + The address (:class:`IPv4Address`) without network information. + + >>> interface = IPv4Interface('192.0.2.5/24') + >>> interface.ip + IPv4Address('192.0.2.5') + + .. attribute:: network + + The network (:class:`IPv4Network`) this interface belongs to. + + >>> interface = IPv4Interface('192.0.2.5/24') + >>> interface.network + IPv4Network('192.0.2.0/24') + + .. attribute:: with_prefixlen + + A string representation of the interface with the mask in prefix notation. + + >>> interface = IPv4Interface('192.0.2.5/24') + >>> interface.with_prefixlen + '192.0.2.5/24' + + .. attribute:: with_netmask + + A string representation of the interface with the network as a net mask. + + >>> interface = IPv4Interface('192.0.2.5/24') + >>> interface.with_netmask + '192.0.2.5/255.255.255.0' + + .. attribute:: with_hostmask + + A string representation of the interface with the network as a host mask. + + >>> interface = IPv4Interface('192.0.2.5/24') + >>> interface.with_hostmask + '192.0.2.5/0.0.0.255' + + +.. class:: IPv6Interface(address) + + Construct an IPv6 interface. The meaning of *address* is as in the + constructor of :class:`IPv6Network`, except that arbitrary host addresses + are always accepted. + + :class:`IPv6Interface` is a subclass of :class:`IPv6Address`, so it inherits + all the attributes from that class. In addition, the following attributes + are available: + + .. attribute:: ip + .. attribute:: network + .. attribute:: with_prefixlen + .. attribute:: with_netmask + .. attribute:: with_hostmask + + Refer to the corresponding attribute documentation in + :class:`IPv4Interface`. + + +Other Module Level Functions +---------------------------- + +The module also provides the following module level functions: + +.. function:: v4_int_to_packed(address) + + Represent an address as 4 packed bytes in network (big-endian) order. + *address* is an integer representation of an IPv4 IP address. A + :exc:`ValueError` is raised if the integer is negative or too large to be an + IPv4 IP address. + + >>> ipaddress.ip_address(3221225985) + IPv4Address('192.0.2.1') + >>> ipaddress.v4_int_to_packed(3221225985) + b'\xc0\x00\x02\x01' + + +.. function:: v6_int_to_packed(address) + + Represent an address as 16 packed bytes in network (big-endian) order. + *address* is an integer representation of an IPv6 IP address. A + :exc:`ValueError` is raised if the integer is negative or too large to be an + IPv6 IP address. + + +.. function:: summarize_address_range(first, last) + + Return an iterator of the summarized network range given the first and last + IP addresses. *first* is the first :class:`IPv4Address` or + :class:`IPv6Address` in the range and *last* is the last :class:`IPv4Address` + or :class:`IPv6Address` in the range. A :exc:`TypeError` is raised if + *first* or *last* are not IP addresses or are not of the same version. A + :exc:`ValueError` is raised if *last* is not greater than *first* or if + *first* address version is not 4 or 6. + + >>> [ipaddr for ipaddr in ipaddress.summarize_address_range( + ... ipaddress.IPv4Address('192.0.2.0'), + ... ipaddress.IPv4Address('192.0.2.130'))] + [IPv4Network('192.0.2.0/25'), IPv4Network('192.0.2.128/31'), IPv4Network('192.0.2.130/32')] + + +.. function:: collapse_addresses(addresses) + + Return an iterator of the collapsed :class:`IPv4Network` or + :class:`IPv6Network` objects. *addresses* is an iterator of + :class:`IPv4Network` or :class:`IPv6Network` objects. A :exc:`TypeError` is + raised if *addresses* contains mixed version objects. + + >>> [ipaddr for ipaddr in + ... ipaddress.collapse_addresses([ipaddress.IPv4Network('192.0.2.0/25'), + ... ipaddress.IPv4Network('192.0.2.128/25')])] + [IPv4Network('192.0.2.0/24')] + + +.. function:: get_mixed_type_key(obj) + + Return a key suitable for sorting between networks and addresses. Address + and Network objects are not sortable by default; they're fundamentally + different, so the expression:: + + IPv4Address('192.0.2.0') <= IPv4Network('192.0.2.0/24') + + doesn't make sense. There are some times however, where you may wish to + have :mod:`ipaddress` sort these anyway. If you need to do this, you can use + this function as the ``key`` argument to :func:`sorted()`. + + *obj* is either a network or address object. + + +Custom Exceptions +----------------- + +To support more specific error reporting from class constructors, the +module defines the following exceptions: + +.. exception:: AddressValueError(ValueError) + + Any value error related to the address. + + +.. exception:: NetmaskValueError(ValueError) + + Any value error related to the netmask. diff --git a/Doc/library/ipc.rst b/Doc/library/ipc.rst index c873065..91ec693 100644 --- a/Doc/library/ipc.rst +++ b/Doc/library/ipc.rst @@ -8,7 +8,7 @@ The modules described in this chapter provide mechanisms for different processes to communicate. Some modules only work for two processes that are on the same machine, e.g. -:mod:`signal` and :mod:`subprocess`. Other modules support networking protocols +:mod:`signal` and :mod:`mmap`. Other modules support networking protocols that two or more processes can used to communicate across machines. The list of modules described in this chapter is: @@ -16,9 +16,9 @@ The list of modules described in this chapter is: .. toctree:: - subprocess.rst socket.rst ssl.rst - signal.rst asyncore.rst asynchat.rst + signal.rst + mmap.rst diff --git a/Doc/library/itertools.rst b/Doc/library/itertools.rst index 308f925..5d3e50a 100644 --- a/Doc/library/itertools.rst +++ b/Doc/library/itertools.rst @@ -43,21 +43,22 @@ Iterator Arguments Results **Iterators terminating on the shortest input sequence:** -==================== ============================ ================================================= ============================================================= -Iterator Arguments Results Example -==================== ============================ ================================================= ============================================================= -:func:`accumulate` p p0, p0+p1, p0+p1+p2, ... ``accumulate([1,2,3,4,5]) --> 1 3 6 10 15`` -:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F`` -:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F`` -:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1`` -:func:`filterfalse` pred, seq elements of seq where pred(elem) is False ``filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8`` -:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v) -:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G`` -:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000`` -:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4`` -:func:`tee` it, n it1, it2 , ... itn splits one iterator into n -:func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-`` -==================== ============================ ================================================= ============================================================= +============================ ============================ ================================================= ============================================================= +Iterator Arguments Results Example +============================ ============================ ================================================= ============================================================= +:func:`accumulate` p [,func] p0, p0+p1, p0+p1+p2, ... ``accumulate([1,2,3,4,5]) --> 1 3 6 10 15`` +:func:`chain` p, q, ... p0, p1, ... plast, q0, q1, ... ``chain('ABC', 'DEF') --> A B C D E F`` +:func:`chain.from_iterable` iterable p0, p1, ... plast, q0, q1, ... ``chain.from_iterable(['ABC', 'DEF']) --> A B C D E F`` +:func:`compress` data, selectors (d[0] if s[0]), (d[1] if s[1]), ... ``compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F`` +:func:`dropwhile` pred, seq seq[n], seq[n+1], starting when pred fails ``dropwhile(lambda x: x<5, [1,4,6,4,1]) --> 6 4 1`` +:func:`filterfalse` pred, seq elements of seq where pred(elem) is false ``filterfalse(lambda x: x%2, range(10)) --> 0 2 4 6 8`` +:func:`groupby` iterable[, keyfunc] sub-iterators grouped by value of keyfunc(v) +:func:`islice` seq, [start,] stop [, step] elements from seq[start:stop:step] ``islice('ABCDEFG', 2, None) --> C D E F G`` +:func:`starmap` func, seq func(\*seq[0]), func(\*seq[1]), ... ``starmap(pow, [(2,5), (3,2), (10,3)]) --> 32 9 1000`` +:func:`takewhile` pred, seq seq[0], seq[1], until pred fails ``takewhile(lambda x: x<5, [1,4,6,4,1]) --> 1 4`` +:func:`tee` it, n it1, it2, ... itn splits one iterator into n +:func:`zip_longest` p, q, ... (p[0], q[0]), (p[1], q[1]), ... ``zip_longest('ABCD', 'xy', fillvalue='-') --> Ax By C- D-`` +============================ ============================ ================================================= ============================================================= **Combinatoric generators:** @@ -84,23 +85,64 @@ The following module functions all construct and return iterators. Some provide streams of infinite length, so they should only be accessed by functions or loops that truncate the stream. -.. function:: accumulate(iterable) +.. function:: accumulate(iterable[, func]) Make an iterator that returns accumulated sums. Elements may be any addable - type including :class:`Decimal` or :class:`Fraction`. Equivalent to:: + type including :class:`~decimal.Decimal` or :class:`~fractions.Fraction`. + If the optional *func* argument is supplied, it should be a function of two + arguments and it will be used instead of addition. - def accumulate(iterable): + Equivalent to:: + + def accumulate(iterable, func=operator.add): 'Return running totals' # accumulate([1,2,3,4,5]) --> 1 3 6 10 15 + # accumulate([1,2,3,4,5], operator.mul) --> 1 2 6 24 120 it = iter(iterable) total = next(it) yield total for element in it: - total = total + element + total = func(total, element) yield total + There are a number of uses for the *func* argument. It can be set to + :func:`min` for a running minimum, :func:`max` for a running maximum, or + :func:`operator.mul` for a running product. Amortization tables can be + built by accumulating interest and applying payments. First-order + `recurrence relations <http://en.wikipedia.org/wiki/Recurrence_relation>`_ + can be modeled by supplying the initial value in the iterable and using only + the accumulated total in *func* argument:: + + >>> data = [3, 4, 6, 2, 1, 9, 0, 7, 5, 8] + >>> list(accumulate(data, operator.mul)) # running product + [3, 12, 72, 144, 144, 1296, 0, 0, 0, 0] + >>> list(accumulate(data, max)) # running maximum + [3, 4, 6, 6, 6, 9, 9, 9, 9, 9] + + # Amortize a 5% loan of 1000 with 4 annual payments of 90 + >>> cashflows = [1000, -90, -90, -90, -90] + >>> list(accumulate(cashflows, lambda bal, pmt: bal*1.05 + pmt)) + [1000, 960.0, 918.0, 873.9000000000001, 827.5950000000001] + + # Chaotic recurrence relation http://en.wikipedia.org/wiki/Logistic_map + >>> logistic_map = lambda x, _: r * x * (1 - x) + >>> r = 3.8 + >>> x0 = 0.4 + >>> inputs = repeat(x0, 36) # only the initial value is used + >>> [format(x, '.2f') for x in accumulate(inputs, logistic_map)] + ['0.40', '0.91', '0.30', '0.81', '0.60', '0.92', '0.29', '0.79', '0.63', + '0.88', '0.39', '0.90', '0.33', '0.84', '0.52', '0.95', '0.18', '0.57', + '0.93', '0.25', '0.71', '0.79', '0.63', '0.88', '0.39', '0.91', '0.32', + '0.83', '0.54', '0.95', '0.20', '0.60', '0.91', '0.30', '0.80', '0.60'] + + See :func:`functools.reduce` for a similar function that returns only the + final accumulated value. + .. versionadded:: 3.2 + .. versionchanged:: 3.3 + Added the optional *func* parameter. + .. function:: chain(*iterables) Make an iterator that returns elements from the first iterable until it is @@ -118,9 +160,8 @@ loops that truncate the stream. .. classmethod:: chain.from_iterable(iterable) Alternate constructor for :func:`chain`. Gets chained inputs from a - single iterable argument that is evaluated lazily. Equivalent to:: + single iterable argument that is evaluated lazily. Roughly equivalent to:: - @classmethod def from_iterable(iterables): # chain.from_iterable(['ABC', 'DEF']) --> A B C D E F for it in iterables: @@ -240,7 +281,7 @@ loops that truncate the stream. .. function:: count(start=0, step=1) - Make an iterator that returns evenly spaced values starting with *n*. Often + Make an iterator that returns evenly spaced values starting with number *start*. Often used as an argument to :func:`map` to generate consecutive data points. Also, used with :func:`zip` to add sequence numbers. Equivalent to:: @@ -667,8 +708,9 @@ which incur interpreter overhead. next(b, None) return zip(a, b) - def grouper(n, iterable, fillvalue=None): - "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx" + def grouper(iterable, n, fillvalue=None): + "Collect data into fixed-length chunks or blocks" + # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx" args = [iter(iterable)] * n return zip_longest(*args, fillvalue=fillvalue) @@ -723,7 +765,7 @@ which incur interpreter overhead. """ Call a function repeatedly until an exception is raised. Converts a call-until-exception interface to an iterator interface. - Like __builtin__.iter(func, sentinel) but uses an exception instead + Like builtins.iter(func, sentinel) but uses an exception instead of a sentinel to end the loop. Examples: diff --git a/Doc/library/json.rst b/Doc/library/json.rst index 58eee18..f652039 100644 --- a/Doc/library/json.rst +++ b/Doc/library/json.rst @@ -124,7 +124,8 @@ Basic Usage sort_keys=False, **kw) Serialize *obj* as a JSON formatted stream to *fp* (a ``.write()``-supporting - :term:`file-like object`). + :term:`file-like object`) using this :ref:`conversion table + <py-to-json-table>`. If *skipkeys* is ``True`` (default: ``False``), then dict keys that are not of a basic type (:class:`str`, :class:`int`, :class:`float`, :class:`bool`, @@ -183,8 +184,9 @@ Basic Usage indent=None, separators=None, default=None, \ sort_keys=False, **kw) - Serialize *obj* to a JSON formatted :class:`str`. The arguments have the - same meaning as in :func:`dump`. + Serialize *obj* to a JSON formatted :class:`str` using this :ref:`conversion + table <py-to-json-table>`. The arguments have the same meaning as in + :func:`dump`. .. note:: @@ -204,7 +206,8 @@ Basic Usage .. function:: load(fp, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw) Deserialize *fp* (a ``.read()``-supporting :term:`file-like object` - containing a JSON document) to a Python object. + containing a JSON document) to a Python object using this :ref:`conversion + table <json-to-py-table>`. *object_hook* is an optional function that will be called with the result of any object literal decoded (a :class:`dict`). The return value of @@ -249,7 +252,7 @@ Basic Usage .. function:: loads(s, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw) Deserialize *s* (a :class:`str` instance containing a JSON document) to a - Python object. + Python object using this :ref:`conversion table <json-to-py-table>`. The other arguments have the same meaning as in :func:`load`, except *encoding* which is ignored and deprecated. @@ -264,6 +267,8 @@ Encoders and Decoders Performs the following translations in decoding by default: + .. _json-to-py-table: + +---------------+-------------------+ | JSON | Python | +===============+===================+ @@ -323,6 +328,8 @@ Encoders and Decoders those with character codes in the 0-31 range, including ``'\t'`` (tab), ``'\n'``, ``'\r'`` and ``'\0'``. + If the data being deserialized is not a valid JSON document, a + :exc:`ValueError` will be raised. .. method:: decode(s) @@ -345,6 +352,8 @@ Encoders and Decoders Supports the following objects and types by default: + .. _py-to-json-table: + +-------------------+---------------+ | Python | JSON | +===================+===============+ diff --git a/Doc/library/locale.rst b/Doc/library/locale.rst index 45aba0a..9600193 100644 --- a/Doc/library/locale.rst +++ b/Doc/library/locale.rst @@ -55,6 +55,8 @@ The :mod:`locale` module defines the following exception and functions: Returns the database of the local conventions as a dictionary. This dictionary has the following strings as keys: + .. tabularcolumns:: |l|l|L| + +----------------------+-------------------------------------+--------------------------------+ | Category | Key | Meaning | +======================+=====================================+================================+ @@ -475,8 +477,11 @@ in such a way that frequent locale changes may cause core dumps. This makes the locale somewhat painful to use correctly. Initially, when a program is started, the locale is the ``C`` locale, no matter -what the user's preferred locale is. The program must explicitly say that it -wants the user's preferred locale settings by calling ``setlocale(LC_ALL, '')``. +what the user's preferred locale is. There is one exception: the +:data:`LC_CTYPE` category is changed at startup to set the current locale +encoding to the user's preferred locale encoding. The program must explicitly +say that it wants the user's preferred locale settings for other categories by +calling ``setlocale(LC_ALL, '')``. It is generally a bad idea to call :func:`setlocale` in some library routine, since as a side effect it affects the entire program. Saving and restoring it diff --git a/Doc/library/logging.config.rst b/Doc/library/logging.config.rst index 1391ed2..303b4d8 100644 --- a/Doc/library/logging.config.rst +++ b/Doc/library/logging.config.rst @@ -17,6 +17,10 @@ * :ref:`Advanced Tutorial <logging-advanced-tutorial>` * :ref:`Logging Cookbook <logging-cookbook>` +**Source code:** :source:`Lib/logging/config.py` + +-------------- + This section describes the API for configuring the logging module. .. _logging-config-api: @@ -77,8 +81,9 @@ in :mod:`logging` itself) and defining handlers which are declared either in .. function:: fileConfig(fname, defaults=None, disable_existing_loggers=True) Reads the logging configuration from a :mod:`configparser`\-format file - named *fname*. This function can be called several times from an - application, allowing an end user to select from various pre-canned + named *fname*. The format of the file should be as described in + :ref:`logging-config-fileformat`. This function can be called several times + from an application, allowing an end user to select from various pre-canned configurations (if the developer provides a mechanism to present the choices and load the chosen configuration). @@ -101,15 +106,18 @@ in :mod:`logging` itself) and defining handlers which are declared either in configurations. If no port is specified, the module's default :const:`DEFAULT_LOGGING_CONFIG_PORT` is used. Logging configurations will be sent as a file suitable for processing by :func:`fileConfig`. Returns a - :class:`Thread` instance on which you can call :meth:`start` to start the - server, and which you can :meth:`join` when appropriate. To stop the server, + :class:`~threading.Thread` instance on which you can call + :meth:`~threading.Thread.start` to start the server, and which you can + :meth:`~threading.Thread.join` when appropriate. To stop the server, call :func:`stopListening`. To send a configuration to the socket, read in the configuration file and send it to the socket as a string of bytes preceded by a four-byte length string packed in binary using ``struct.pack('>L', n)``. - .. note:: Because portions of the configuration are passed through + .. note:: + + Because portions of the configuration are passed through :func:`eval`, use of this function may open its users to a security risk. While the function only binds to a socket on ``localhost``, and so does not accept connections from remote machines, there are scenarios where @@ -166,11 +174,11 @@ otherwise, the context is used to determine what to instantiate. * *formatters* - the corresponding value will be a dict in which each key is a formatter id and each value is a dict describing how to - configure the corresponding Formatter instance. + configure the corresponding :class:`~logging.Formatter` instance. The configuring dict is searched for keys ``format`` and ``datefmt`` (with defaults of ``None``) and these are used to construct a - :class:`logging.Formatter` instance. + :class:`~logging.Formatter` instance. * *filters* - the corresponding value will be a dict in which each key is a filter id and each value is a dict describing how to configure @@ -704,10 +712,13 @@ format string, with a comma separator. An example time in ISO8601 format is The ``class`` entry is optional. It indicates the name of the formatter's class (as a dotted module and class name.) This option is useful for instantiating a -:class:`Formatter` subclass. Subclasses of :class:`Formatter` can present -exception tracebacks in an expanded or condensed format. +:class:`~logging.Formatter` subclass. Subclasses of +:class:`~logging.Formatter` can present exception tracebacks in an expanded or +condensed format. + +.. note:: -.. note:: Due to the use of :func:`eval` as described above, there are + Due to the use of :func:`eval` as described above, there are potential security risks which result from using the :func:`listen` to send and receive configurations via sockets. The risks are limited to where multiple users with no mutual trust run code on the same machine; see the diff --git a/Doc/library/logging.handlers.rst b/Doc/library/logging.handlers.rst index ba7cd00..415e397 100644 --- a/Doc/library/logging.handlers.rst +++ b/Doc/library/logging.handlers.rst @@ -17,6 +17,10 @@ * :ref:`Advanced Tutorial <logging-advanced-tutorial>` * :ref:`Logging Cookbook <logging-cookbook>` +**Source code:** :source:`Lib/logging/handlers.py` + +-------------- + .. currentmodule:: logging The following useful handlers are provided in the package. Note that three of @@ -53,8 +57,8 @@ and :meth:`flush` methods). .. method:: flush() Flushes the stream by calling its :meth:`flush` method. Note that the - :meth:`close` method is inherited from :class:`Handler` and so does - no output, so an explicit :meth:`flush` call may be needed at times. + :meth:`close` method is inherited from :class:`~logging.Handler` and so + does no output, so an explicit :meth:`flush` call may be needed at times. .. versionchanged:: 3.2 The ``StreamHandler`` class now has a ``terminator`` attribute, default @@ -145,8 +149,8 @@ new stream. This handler is not appropriate for use under Windows, because under Windows open log files cannot be moved or renamed - logging opens the files with exclusive locks - and so there is no need for such a handler. Furthermore, -*ST_INO* is not supported under Windows; :func:`stat` always returns zero for -this value. +*ST_INO* is not supported under Windows; :func:`~os.stat` always returns zero +for this value. .. class:: WatchedFileHandler(filename[,mode[, encoding[, delay]]]) @@ -164,6 +168,87 @@ this value. changed. If it has, the existing stream is flushed and closed and the file opened again, before outputting the record to the file. +.. _base-rotating-handler: + +BaseRotatingHandler +^^^^^^^^^^^^^^^^^^^ + +The :class:`BaseRotatingHandler` class, located in the :mod:`logging.handlers` +module, is the base class for the rotating file handlers, +:class:`RotatingFileHandler` and :class:`TimedRotatingFileHandler`. You should +not need to instantiate this class, but it has attributes and methods you may +need to override. + +.. class:: BaseRotatingHandler(filename, mode, encoding=None, delay=False) + + The parameters are as for :class:`FileHandler`. The attributes are: + + .. attribute:: namer + + If this attribute is set to a callable, the :meth:`rotation_filename` + method delegates to this callable. The parameters passed to the callable + are those passed to :meth:`rotation_filename`. + + .. note:: The namer function is called quite a few times during rollover, + so it should be as simple and as fast as possible. It should also + return the same output every time for a given input, otherwise the + rollover behaviour may not work as expected. + + .. versionadded:: 3.3 + + + .. attribute:: BaseRotatingHandler.rotator + + If this attribute is set to a callable, the :meth:`rotate` method + delegates to this callable. The parameters passed to the callable are + those passed to :meth:`rotate`. + + .. versionadded:: 3.3 + + .. method:: BaseRotatingHandler.rotation_filename(default_name) + + Modify the filename of a log file when rotating. + + This is provided so that a custom filename can be provided. + + The default implementation calls the 'namer' attribute of the handler, + if it's callable, passing the default name to it. If the attribute isn't + callable (the default is ``None``), the name is returned unchanged. + + :param default_name: The default name for the log file. + + .. versionadded:: 3.3 + + + .. method:: BaseRotatingHandler.rotate(source, dest) + + When rotating, rotate the current log. + + The default implementation calls the 'rotator' attribute of the handler, + if it's callable, passing the source and dest arguments to it. If the + attribute isn't callable (the default is ``None``), the source is simply + renamed to the destination. + + :param source: The source filename. This is normally the base + filename, e.g. 'test.log' + :param dest: The destination filename. This is normally + what the source is rotated to, e.g. 'test.log.1'. + + .. versionadded:: 3.3 + +The reason the attributes exist is to save you having to subclass - you can use +the same callables for instances of :class:`RotatingFileHandler` and +:class:`TimedRotatingFileHandler`. If either the namer or rotator callable +raises an exception, this will be handled in the same way as any other +exception during an :meth:`emit` call, i.e. via the :meth:`handleError` method +of the handler. + +If you need to make more significant changes to rotation processing, you can +override the methods. + +For an example, see :ref:`cookbook-rotator-namer`. + + .. _rotating-file-handler: RotatingFileHandler @@ -302,7 +387,8 @@ sends logging output to a network socket. The base class uses a TCP socket. binary format. If there is an error with the socket, silently drops the packet. If the connection was previously lost, re-establishes the connection. To unpickle the record at the receiving end into a - :class:`LogRecord`, use the :func:`makeLogRecord` function. + :class:`~logging.LogRecord`, use the :func:`~logging.makeLogRecord` + function. .. method:: handleError() @@ -380,7 +466,8 @@ over UDP sockets. Pickles the record's attribute dictionary and writes it to the socket in binary format. If there is an error with the socket, silently drops the packet. To unpickle the record at the receiving end into a - :class:`LogRecord`, use the :func:`makeLogRecord` function. + :class:`~logging.LogRecord`, use the :func:`~logging.makeLogRecord` + function. .. method:: makeSocket() @@ -456,6 +543,15 @@ supports sending logging messages to a remote or local Unix syslog. behaviour) but can be set to ``False`` on a ``SysLogHandler`` instance in order for that instance to *not* append the NUL terminator. + .. versionchanged:: 3.3 + (See: :issue:`12419`.) In earlier versions, there was no facility for + an "ident" or "tag" prefix to identify the source of the message. This + can now be specified using a class-level attribute, defaulting to + ``""`` to preserve existing behaviour, but which can be overridden on + a ``SysLogHandler`` instance in order for that instance to prepend + the ident to every message handled. Note that the provided ident must + be text, not bytes, and is prepended to the message exactly as is. + .. method:: encodePriority(facility, priority) Encodes the facility and priority into an integer. You can pass in strings @@ -618,7 +714,7 @@ The :class:`SMTPHandler` class, located in the :mod:`logging.handlers` module, supports sending logging messages to an email address via SMTP. -.. class:: SMTPHandler(mailhost, fromaddr, toaddrs, subject, credentials=None, secure=None) +.. class:: SMTPHandler(mailhost, fromaddr, toaddrs, subject, credentials=None, secure=None, timeout=1.0) Returns a new instance of the :class:`SMTPHandler` class. The instance is initialized with the from and to addresses and subject line of the email. The @@ -634,6 +730,12 @@ supports sending logging messages to an email address via SMTP. and certificate file. (This tuple is passed to the :meth:`smtplib.SMTP.starttls` method.) + A timeout can be specified for communication with the SMTP server using the + *timeout* argument. + + .. versionadded:: 3.3 + The *timeout* argument was added. + .. method:: emit(record) Formats the record and sends it to the specified addressees. @@ -694,7 +796,7 @@ should, then :meth:`flush` is expected to do the flushing. .. method:: close() - Calls :meth:`flush`, sets the target to :const:`None` and clears the + Calls :meth:`flush`, sets the target to ``None`` and clears the buffer. @@ -729,7 +831,7 @@ supports sending logging messages to a Web server, using either ``GET`` or Returns a new instance of the :class:`HTTPHandler` class. The *host* can be of the form ``host:port``, should you need to use a specific port number. - If no *method* is specified, ``GET`` is used. If *secure* is True, an HTTPS + If no *method* is specified, ``GET`` is used. If *secure* is true, an HTTPS connection will be used. If *credentials* is specified, it should be a 2-tuple consisting of userid and password, which will be placed in an HTTP 'Authorization' header using Basic authentication. If you specify @@ -790,7 +892,7 @@ possible, while any potentially slow operations (such as sending an email via Enqueues the record on the queue using ``put_nowait()``; you may want to override this if you want to use blocking behaviour, or a - timeout, or a customised queue implementation. + timeout, or a customized queue implementation. @@ -863,6 +965,15 @@ possible, while any potentially slow operations (such as sending an email via Note that if you don't call this before your application exits, there may be some records still left on the queue, which won't be processed. + .. method:: enqueue_sentinel() + + Writes a sentinel to the queue to tell the listener to quit. This + implementation uses ``put_nowait()``. You may want to override this + method if you want to use timeouts or work with custom queue + implementations. + + .. versionadded:: 3.3 + .. seealso:: diff --git a/Doc/library/logging.rst b/Doc/library/logging.rst index c2af8e8..8d093e1 100644 --- a/Doc/library/logging.rst +++ b/Doc/library/logging.rst @@ -21,6 +21,10 @@ * :ref:`Logging Cookbook <logging-cookbook>` +**Source code:** :source:`Lib/logging/__init__.py` + +-------------- + This module defines functions and classes which implement a flexible event logging system for applications and libraries. @@ -109,6 +113,8 @@ is the module's name in the Python package namespace. If the root is reached, and it has a level of NOTSET, then all messages will be processed. Otherwise, the root's level will be used as the effective level. + See :ref:`levels` for a list of levels. + .. versionchanged:: 3.2 The *lvl* parameter now accepts a string representation of the level such as 'INFO' as an alternative to the integer constants @@ -156,7 +162,7 @@ is the module's name in the Python package namespace. is called to get the exception information. The second optional keyword argument is *stack_info*, which defaults to - False. If specified as True, stack information is added to the logging + ``False``. If true, stack information is added to the logging message, including the actual logging call. Note that this is not the same stack information as that displayed through specifying *exc_info*: The former is stack frames from the bottom of the stack up to the logging call @@ -222,6 +228,9 @@ is the module's name in the Python package namespace. Logs a message with level :const:`WARNING` on this logger. The arguments are interpreted as for :meth:`debug`. + .. note:: There is an obsolete method ``warn`` which is functionally + identical to ``warning``. As ``warn`` is deprecated, please do not use + it - use ``warning`` instead. .. method:: Logger.error(msg, *args, **kwargs) @@ -301,7 +310,7 @@ is the module's name in the Python package namespace. Checks to see if this logger has any handlers configured. This is done by looking for handlers in this logger and its parents in the logger hierarchy. - Returns True if a handler was found, else False. The method stops searching + Returns ``True`` if a handler was found, else ``False``. The method stops searching up the hierarchy whenever a logger with the 'propagate' attribute set to False is found - that will be the last logger which is checked for the existence of handlers. @@ -309,6 +318,34 @@ is the module's name in the Python package namespace. .. versionadded:: 3.2 +.. _levels: + +Logging Levels +-------------- + +The numeric values of logging levels are given in the following table. These are +primarily of interest if you want to define your own levels, and need them to +have specific values relative to the predefined levels. If you define a level +with the same numeric value, it overwrites the predefined value; the predefined +name is lost. + ++--------------+---------------+ +| Level | Numeric value | ++==============+===============+ +| ``CRITICAL`` | 50 | ++--------------+---------------+ +| ``ERROR`` | 40 | ++--------------+---------------+ +| ``WARNING`` | 30 | ++--------------+---------------+ +| ``INFO`` | 20 | ++--------------+---------------+ +| ``DEBUG`` | 10 | ++--------------+---------------+ +| ``NOTSET`` | 0 | ++--------------+---------------+ + + .. _handler: Handler Objects @@ -349,6 +386,8 @@ subclasses. However, the :meth:`__init__` method in subclasses needs to call severe than *lvl* will be ignored. When a handler is created, the level is set to :const:`NOTSET` (which causes all messages to be processed). + See :ref:`levels` for a list of levels. + .. versionchanged:: 3.2 The *lvl* parameter now accepts a string representation of the level such as 'INFO' as an alternative to the integer constants @@ -461,7 +500,8 @@ The useful mapping keys in a :class:`LogRecord` are given in the section on The *style* parameter can be one of '%', '{' or '$' and determines how the format string will be merged with its data: using one of %-formatting, - :meth:`str.format` or :class:`string.Template`. + :meth:`str.format` or :class:`string.Template`. See :ref:`formatting-styles` + for more information on using {- and $-formatting for log messages. .. versionchanged:: 3.2 The *style* parameter was added. @@ -507,6 +547,19 @@ The useful mapping keys in a :class:`LogRecord` are given in the section on want all logging times to be shown in GMT, set the ``converter`` attribute in the ``Formatter`` class. + .. versionchanged:: 3.3 + Previously, the default ISO 8601 format was hard-coded as in this + example: ``2010-09-06 22:38:15,292`` where the part before the comma is + handled by a strptime format string (``'%Y-%m-%d %H:%M:%S'``), and the + part after the comma is a millisecond value. Because strptime does not + have a format placeholder for milliseconds, the millisecond value is + appended using another format string, ``'%s,%03d'`` – and both of these + format strings have been hardcoded into this method. With the change, + these strings are defined as class-level attributes which can be + overridden at the instance level when desired. The names of the + attributes are ``default_time_format`` (for the strptime format string) + and ``default_msec_format`` (for appending the millisecond value). + .. method:: formatException(exc_info) Formats the specified exception information (a standard exception tuple as @@ -757,7 +810,7 @@ LoggerAdapter Objects --------------------- :class:`LoggerAdapter` instances are used to conveniently pass contextual -information into logging calls. For a usage example , see the section on +information into logging calls. For a usage example, see the section on :ref:`adding contextual information to your logging output <context-info>`. .. class:: LoggerAdapter(logger, extra) @@ -774,17 +827,18 @@ information into logging calls. For a usage example , see the section on (possibly modified) versions of the arguments passed in. In addition to the above, :class:`LoggerAdapter` supports the following -methods of :class:`Logger`, i.e. :meth:`debug`, :meth:`info`, :meth:`warning`, -:meth:`error`, :meth:`exception`, :meth:`critical`, :meth:`log`, -:meth:`isEnabledFor`, :meth:`getEffectiveLevel`, :meth:`setLevel`, -:meth:`hasHandlers`. These methods have the same signatures as their +methods of :class:`Logger`: :meth:`~Logger.debug`, :meth:`~Logger.info`, +:meth:`~Logger.warning`, :meth:`~Logger.error`, :meth:`~Logger.exception`, +:meth:`~Logger.critical`, :meth:`~Logger.log`, :meth:`~Logger.isEnabledFor`, +:meth:`~Logger.getEffectiveLevel`, :meth:`~Logger.setLevel` and +:meth:`~Logger.hasHandlers`. These methods have the same signatures as their counterparts in :class:`Logger`, so you can use the two types of instances interchangeably. .. versionchanged:: 3.2 - The :meth:`isEnabledFor`, :meth:`getEffectiveLevel`, :meth:`setLevel` and - :meth:`hasHandlers` methods were added to :class:`LoggerAdapter`. These - methods delegate to the underlying logger. + The :meth:`~Logger.isEnabledFor`, :meth:`~Logger.getEffectiveLevel`, + :meth:`~Logger.setLevel` and :meth:`~Logger.hasHandlers` methods were added + to :class:`LoggerAdapter`. These methods delegate to the underlying logger. Thread Safety @@ -824,8 +878,8 @@ functions. Return either the standard :class:`Logger` class, or the last class passed to :func:`setLoggerClass`. This function may be called from within a new class - definition, to ensure that installing a customised :class:`Logger` class will - not undo customisations already applied by other code. For example:: + definition, to ensure that installing a customized :class:`Logger` class will + not undo customizations already applied by other code. For example:: class MyLogger(logging.getLoggerClass()): # ... override behaviour here @@ -857,7 +911,7 @@ functions. is called to get the exception information. The second optional keyword argument is *stack_info*, which defaults to - False. If specified as True, stack information is added to the logging + ``False``. If true, stack information is added to the logging message, including the actual logging call. Note that this is not the same stack information as that displayed through specifying *exc_info*: The former is stack frames from the bottom of the stack up to the logging call @@ -918,8 +972,12 @@ functions. .. function:: warning(msg, *args, **kwargs) - Logs a message with level :const:`WARNING` on the root logger. The arguments are - interpreted as for :func:`debug`. + Logs a message with level :const:`WARNING` on the root logger. The arguments + are interpreted as for :func:`debug`. + + .. note:: There is an obsolete function ``warn`` which is functionally + identical to ``warning``. As ``warn`` is deprecated, please do not use + it - use ``warning`` instead. .. function:: error(msg, *args, **kwargs) @@ -945,14 +1003,15 @@ functions. Logs a message with level *level* on the root logger. The other arguments are interpreted as for :func:`debug`. - .. note:: The above module-level functions which delegate to the root - logger should *not* be used in threads, in versions of Python earlier - than 2.7.1 and 3.2, unless at least one handler has been added to the - root logger *before* the threads are started. These convenience functions - call :func:`basicConfig` to ensure that at least one handler is - available; in earlier versions of Python, this can (under rare - circumstances) lead to handlers being added multiple times to the root - logger, which can in turn lead to multiple messages for the same event. + .. note:: The above module-level convenience functions, which delegate to the + root logger, call :func:`basicConfig` to ensure that at least one handler + is available. Because of this, they should *not* be used in threads, + in versions of Python earlier than 2.7.1 and 3.2, unless at least one + handler has been added to the root logger *before* the threads are + started. In earlier versions of Python, due to a thread safety shortcoming + in :func:`basicConfig`, this can (under rare circumstances) lead to + handlers being added multiple times to the root logger, which can in turn + lead to multiple messages for the same event. .. function:: disable(lvl) @@ -962,8 +1021,10 @@ functions. effect is to disable all logging calls of severity *lvl* and below, so that if you call it with a value of INFO, then all INFO and DEBUG events would be discarded, whereas those of severity WARNING and above would be processed - according to the logger's effective level. To undo the effect of a call to - ``logging.disable(lvl)``, call ``logging.disable(logging.NOTSET)``. + according to the logger's effective level. If + ``logging.disable(logging.NOTSET)`` is called, it effectively removes this + overriding level, so that logging output again depends on the effective + levels of individual loggers. .. function:: addLevelName(lvl, levelName) @@ -1017,6 +1078,8 @@ functions. The following keyword arguments are supported. + .. tabularcolumns:: |l|L| + +--------------+---------------------------------------------+ | Format | Description | +==============+=============================================+ @@ -1045,12 +1108,27 @@ functions. | ``stream`` | Use the specified stream to initialize the | | | StreamHandler. Note that this argument is | | | incompatible with 'filename' - if both are | - | | present, 'stream' is ignored. | + | | present, a ``ValueError`` is raised. | + +--------------+---------------------------------------------+ + | ``handlers`` | If specified, this should be an iterable of | + | | already created handlers to add to the root | + | | logger. Any handlers which don't already | + | | have a formatter set will be assigned the | + | | default formatter created in this function. | + | | Note that this argument is incompatible | + | | with 'filename' or 'stream' - if both are | + | | present, a ``ValueError`` is raised. | +--------------+---------------------------------------------+ .. versionchanged:: 3.2 The ``style`` argument was added. + .. versionchanged:: 3.3 + The ``handlers`` argument was added. Additional checks were added to + catch situations where incompatible arguments are specified (e.g. + ``handlers`` together with ``stream`` or ``filename``, or ``stream`` + together with ``filename``). + .. function:: shutdown() diff --git a/Doc/library/lzma.rst b/Doc/library/lzma.rst new file mode 100644 index 0000000..07b69eb --- /dev/null +++ b/Doc/library/lzma.rst @@ -0,0 +1,387 @@ +:mod:`lzma` --- Compression using the LZMA algorithm +==================================================== + +.. module:: lzma + :synopsis: A Python wrapper for the liblzma compression library. +.. moduleauthor:: Nadeem Vawda <nadeem.vawda@gmail.com> +.. sectionauthor:: Nadeem Vawda <nadeem.vawda@gmail.com> + +.. versionadded:: 3.3 + + +This module provides classes and convenience functions for compressing and +decompressing data using the LZMA compression algorithm. Also included is a file +interface supporting the ``.xz`` and legacy ``.lzma`` file formats used by the +:program:`xz` utility, as well as raw compressed streams. + +The interface provided by this module is very similar to that of the :mod:`bz2` +module. However, note that :class:`LZMAFile` is *not* thread-safe, unlike +:class:`bz2.BZ2File`, so if you need to use a single :class:`LZMAFile` instance +from multiple threads, it is necessary to protect it with a lock. + + +.. exception:: LZMAError + + This exception is raised when an error occurs during compression or + decompression, or while initializing the compressor/decompressor state. + + +Reading and writing compressed files +------------------------------------ + +.. function:: open(filename, mode="rb", \*, format=None, check=-1, preset=None, filters=None, encoding=None, errors=None, newline=None) + + Open an LZMA-compressed file in binary or text mode, returning a :term:`file + object`. + + The *filename* argument can be either an actual file name (given as a + :class:`str` or :class:`bytes` object), in which case the named file is + opened, or it can be an existing file object to read from or write to. + + The *mode* argument can be any of ``"r"``, ``"rb"``, ``"w"``, ``"wb"``, + ``"a"`` or ``"ab"`` for binary mode, or ``"rt"``, ``"wt"``, or ``"at"`` for + text mode. The default is ``"rb"``. + + When opening a file for reading, the *format* and *filters* arguments have + the same meanings as for :class:`LZMADecompressor`. In this case, the *check* + and *preset* arguments should not be used. + + When opening a file for writing, the *format*, *check*, *preset* and + *filters* arguments have the same meanings as for :class:`LZMACompressor`. + + For binary mode, this function is equivalent to the :class:`LZMAFile` + constructor: ``LZMAFile(filename, mode, ...)``. In this case, the *encoding*, + *errors* and *newline* arguments must not be provided. + + For text mode, a :class:`LZMAFile` object is created, and wrapped in an + :class:`io.TextIOWrapper` instance with the specified encoding, error + handling behavior, and line ending(s). + + +.. class:: LZMAFile(filename=None, mode="r", \*, format=None, check=-1, preset=None, filters=None) + + Open an LZMA-compressed file in binary mode. + + An :class:`LZMAFile` can wrap an already-open :term:`file object`, or operate + directly on a named file. The *filename* argument specifies either the file + object to wrap, or the name of the file to open (as a :class:`str` or + :class:`bytes` object). When wrapping an existing file object, the wrapped + file will not be closed when the :class:`LZMAFile` is closed. + + The *mode* argument can be either ``"r"`` for reading (default), ``"w"`` for + overwriting, or ``"a"`` for appending. These can equivalently be given as + ``"rb"``, ``"wb"``, and ``"ab"`` respectively. + + If *filename* is a file object (rather than an actual file name), a mode of + ``"w"`` does not truncate the file, and is instead equivalent to ``"a"``. + + When opening a file for reading, the input file may be the concatenation of + multiple separate compressed streams. These are transparently decoded as a + single logical stream. + + When opening a file for reading, the *format* and *filters* arguments have + the same meanings as for :class:`LZMADecompressor`. In this case, the *check* + and *preset* arguments should not be used. + + When opening a file for writing, the *format*, *check*, *preset* and + *filters* arguments have the same meanings as for :class:`LZMACompressor`. + + :class:`LZMAFile` supports all the members specified by + :class:`io.BufferedIOBase`, except for :meth:`detach` and :meth:`truncate`. + Iteration and the :keyword:`with` statement are supported. + + The following method is also provided: + + .. method:: peek(size=-1) + + Return buffered data without advancing the file position. At least one + byte of data will be returned, unless EOF has been reached. The exact + number of bytes returned is unspecified (the *size* argument is ignored). + + .. note:: While calling :meth:`peek` does not change the file position of + the :class:`LZMAFile`, it may change the position of the underlying + file object (e.g. if the :class:`LZMAFile` was constructed by passing a + file object for *filename*). + + +Compressing and decompressing data in memory +-------------------------------------------- + +.. class:: LZMACompressor(format=FORMAT_XZ, check=-1, preset=None, filters=None) + + Create a compressor object, which can be used to compress data incrementally. + + For a more convenient way of compressing a single chunk of data, see + :func:`compress`. + + The *format* argument specifies what container format should be used. + Possible values are: + + * :const:`FORMAT_XZ`: The ``.xz`` container format. + This is the default format. + + * :const:`FORMAT_ALONE`: The legacy ``.lzma`` container format. + This format is more limited than ``.xz`` -- it does not support integrity + checks or multiple filters. + + * :const:`FORMAT_RAW`: A raw data stream, not using any container format. + This format specifier does not support integrity checks, and requires that + you always specify a custom filter chain (for both compression and + decompression). Additionally, data compressed in this manner cannot be + decompressed using :const:`FORMAT_AUTO` (see :class:`LZMADecompressor`). + + The *check* argument specifies the type of integrity check to include in the + compressed data. This check is used when decompressing, to ensure that the + data has not been corrupted. Possible values are: + + * :const:`CHECK_NONE`: No integrity check. + This is the default (and the only acceptable value) for + :const:`FORMAT_ALONE` and :const:`FORMAT_RAW`. + + * :const:`CHECK_CRC32`: 32-bit Cyclic Redundancy Check. + + * :const:`CHECK_CRC64`: 64-bit Cyclic Redundancy Check. + This is the default for :const:`FORMAT_XZ`. + + * :const:`CHECK_SHA256`: 256-bit Secure Hash Algorithm. + + If the specified check is not supported, an :class:`LZMAError` is raised. + + The compression settings can be specified either as a preset compression + level (with the *preset* argument), or in detail as a custom filter chain + (with the *filters* argument). + + The *preset* argument (if provided) should be an integer between ``0`` and + ``9`` (inclusive), optionally OR-ed with the constant + :const:`PRESET_EXTREME`. If neither *preset* nor *filters* are given, the + default behavior is to use :const:`PRESET_DEFAULT` (preset level ``6``). + Higher presets produce smaller output, but make the compression process + slower. + + .. note:: + + In addition to being more CPU-intensive, compression with higher presets + also requires much more memory (and produces output that needs more memory + to decompress). With preset ``9`` for example, the overhead for an + :class:`LZMACompressor` object can be as high as 800 MiB. For this reason, + it is generally best to stick with the default preset. + + The *filters* argument (if provided) should be a filter chain specifier. + See :ref:`filter-chain-specs` for details. + + .. method:: compress(data) + + Compress *data* (a :class:`bytes` object), returning a :class:`bytes` + object containing compressed data for at least part of the input. Some of + *data* may be buffered internally, for use in later calls to + :meth:`compress` and :meth:`flush`. The returned data should be + concatenated with the output of any previous calls to :meth:`compress`. + + .. method:: flush() + + Finish the compression process, returning a :class:`bytes` object + containing any data stored in the compressor's internal buffers. + + The compressor cannot be used after this method has been called. + + +.. class:: LZMADecompressor(format=FORMAT_AUTO, memlimit=None, filters=None) + + Create a decompressor object, which can be used to decompress data + incrementally. + + For a more convenient way of decompressing an entire compressed stream at + once, see :func:`decompress`. + + The *format* argument specifies the container format that should be used. The + default is :const:`FORMAT_AUTO`, which can decompress both ``.xz`` and + ``.lzma`` files. Other possible values are :const:`FORMAT_XZ`, + :const:`FORMAT_ALONE`, and :const:`FORMAT_RAW`. + + The *memlimit* argument specifies a limit (in bytes) on the amount of memory + that the decompressor can use. When this argument is used, decompression will + fail with an :class:`LZMAError` if it is not possible to decompress the input + within the given memory limit. + + The *filters* argument specifies the filter chain that was used to create + the stream being decompressed. This argument is required if *format* is + :const:`FORMAT_RAW`, but should not be used for other formats. + See :ref:`filter-chain-specs` for more information about filter chains. + + .. note:: + This class does not transparently handle inputs containing multiple + compressed streams, unlike :func:`decompress` and :class:`LZMAFile`. To + decompress a multi-stream input with :class:`LZMADecompressor`, you must + create a new decompressor for each stream. + + .. method:: decompress(data) + + Decompress *data* (a :class:`bytes` object), returning a :class:`bytes` + object containing the decompressed data for at least part of the input. + Some of *data* may be buffered internally, for use in later calls to + :meth:`decompress`. The returned data should be concatenated with the + output of any previous calls to :meth:`decompress`. + + .. attribute:: check + + The ID of the integrity check used by the input stream. This may be + :const:`CHECK_UNKNOWN` until enough of the input has been decoded to + determine what integrity check it uses. + + .. attribute:: eof + + ``True`` if the end-of-stream marker has been reached. + + .. attribute:: unused_data + + Data found after the end of the compressed stream. + + Before the end of the stream is reached, this will be ``b""``. + + +.. function:: compress(data, format=FORMAT_XZ, check=-1, preset=None, filters=None) + + Compress *data* (a :class:`bytes` object), returning the compressed data as a + :class:`bytes` object. + + See :class:`LZMACompressor` above for a description of the *format*, *check*, + *preset* and *filters* arguments. + + +.. function:: decompress(data, format=FORMAT_AUTO, memlimit=None, filters=None) + + Decompress *data* (a :class:`bytes` object), returning the uncompressed data + as a :class:`bytes` object. + + If *data* is the concatenation of multiple distinct compressed streams, + decompress all of these streams, and return the concatenation of the results. + + See :class:`LZMADecompressor` above for a description of the *format*, + *memlimit* and *filters* arguments. + + +Miscellaneous +------------- + +.. function:: is_check_supported(check) + + Returns true if the given integrity check is supported on this system. + + :const:`CHECK_NONE` and :const:`CHECK_CRC32` are always supported. + :const:`CHECK_CRC64` and :const:`CHECK_SHA256` may be unavailable if you are + using a version of :program:`liblzma` that was compiled with a limited + feature set. + + +.. _filter-chain-specs: + +Specifying custom filter chains +------------------------------- + +A filter chain specifier is a sequence of dictionaries, where each dictionary +contains the ID and options for a single filter. Each dictionary must contain +the key ``"id"``, and may contain additional keys to specify filter-dependent +options. Valid filter IDs are as follows: + +* Compression filters: + * :const:`FILTER_LZMA1` (for use with :const:`FORMAT_ALONE`) + * :const:`FILTER_LZMA2` (for use with :const:`FORMAT_XZ` and :const:`FORMAT_RAW`) + +* Delta filter: + * :const:`FILTER_DELTA` + +* Branch-Call-Jump (BCJ) filters: + * :const:`FILTER_X86` + * :const:`FILTER_IA64` + * :const:`FILTER_ARM` + * :const:`FILTER_ARMTHUMB` + * :const:`FILTER_POWERPC` + * :const:`FILTER_SPARC` + +A filter chain can consist of up to 4 filters, and cannot be empty. The last +filter in the chain must be a compression filter, and any other filters must be +delta or BCJ filters. + +Compression filters support the following options (specified as additional +entries in the dictionary representing the filter): + + * ``preset``: A compression preset to use as a source of default values for + options that are not specified explicitly. + * ``dict_size``: Dictionary size in bytes. This should be between 4 KiB and + 1.5 GiB (inclusive). + * ``lc``: Number of literal context bits. + * ``lp``: Number of literal position bits. The sum ``lc + lp`` must be at + most 4. + * ``pb``: Number of position bits; must be at most 4. + * ``mode``: :const:`MODE_FAST` or :const:`MODE_NORMAL`. + * ``nice_len``: What should be considered a "nice length" for a match. + This should be 273 or less. + * ``mf``: What match finder to use -- :const:`MF_HC3`, :const:`MF_HC4`, + :const:`MF_BT2`, :const:`MF_BT3`, or :const:`MF_BT4`. + * ``depth``: Maximum search depth used by match finder. 0 (default) means to + select automatically based on other filter options. + +The delta filter stores the differences between bytes, producing more repetitive +input for the compressor in certain circumstances. It only supports a single +The delta filter supports only one option, ``dist``. This indicates the distance +between bytes to be subtracted. The default is 1, i.e. take the differences +between adjacent bytes. + +The BCJ filters are intended to be applied to machine code. They convert +relative branches, calls and jumps in the code to use absolute addressing, with +the aim of increasing the redundancy that can be exploited by the compressor. +These filters support one option, ``start_offset``. This specifies the address +that should be mapped to the beginning of the input data. The default is 0. + + +Examples +-------- + +Reading in a compressed file:: + + import lzma + with lzma.open("file.xz") as f: + file_content = f.read() + +Creating a compressed file:: + + import lzma + data = b"Insert Data Here" + with lzma.open("file.xz", "w") as f: + f.write(data) + +Compressing data in memory:: + + import lzma + data_in = b"Insert Data Here" + data_out = lzma.compress(data_in) + +Incremental compression:: + + import lzma + lzc = lzma.LZMACompressor() + out1 = lzc.compress(b"Some data\n") + out2 = lzc.compress(b"Another piece of data\n") + out3 = lzc.compress(b"Even more data\n") + out4 = lzc.flush() + # Concatenate all the partial results: + result = b"".join([out1, out2, out3, out4]) + +Writing compressed data to an already-open file:: + + import lzma + with open("file.xz", "wb") as f: + f.write(b"This data will not be compressed\n") + with lzma.open(f, "w") as lzf: + lzf.write(b"This *will* be compressed\n") + f.write(b"Not compressed\n") + +Creating a compressed file using a custom filter chain:: + + import lzma + my_filters = [ + {"id": lzma.FILTER_DELTA, "dist": 5}, + {"id": lzma.FILTER_LZMA2, "preset": 7 | lzma.PRESET_EXTREME}, + ] + with lzma.open("file.xz", "w", filters=my_filters) as f: + f.write(b"blah blah blah") diff --git a/Doc/library/mailbox.rst b/Doc/library/mailbox.rst index 0f6aba1..0478ed1 100644 --- a/Doc/library/mailbox.rst +++ b/Doc/library/mailbox.rst @@ -674,8 +674,8 @@ Supported mailbox formats are Maildir, mbox, MH, Babyl, and MMDF. In Babyl mailboxes, the headers of a message are not stored contiguously with the body of the message. To generate a file-like representation, the - headers and body are copied together into a :class:`StringIO` instance - (from the :mod:`StringIO` module), which has an API identical to that of a + headers and body are copied together into a :class:`io.BytesIO` instance, + which has an API identical to that of a file. As a result, the file-like object is truly independent of the underlying mailbox but does not save memory compared to a string representation. @@ -1009,7 +1009,7 @@ When a :class:`MaildirMessage` instance is created based upon a Set the "From " line to *from_*, which should be specified without a leading "From " or trailing newline. For convenience, *time_* may be specified and will be formatted appropriately and appended to *from_*. If - *time_* is specified, it should be a :class:`struct_time` instance, a + *time_* is specified, it should be a :class:`time.struct_time` instance, a tuple suitable for passing to :meth:`time.strftime`, or ``True`` (to use :meth:`time.gmtime`). @@ -1380,7 +1380,7 @@ When a :class:`BabylMessage` instance is created based upon an Set the "From " line to *from_*, which should be specified without a leading "From " or trailing newline. For convenience, *time_* may be specified and will be formatted appropriately and appended to *from_*. If - *time_* is specified, it should be a :class:`struct_time` instance, a + *time_* is specified, it should be a :class:`time.struct_time` instance, a tuple suitable for passing to :meth:`time.strftime`, or ``True`` (to use :meth:`time.gmtime`). @@ -1550,7 +1550,7 @@ programs, mail loss due to interruption of the program, or premature termination due to malformed messages in the mailbox:: import mailbox - import email.Errors + import email.errors list_names = ('python-list', 'python-dev', 'python-bugs') @@ -1560,7 +1560,7 @@ due to malformed messages in the mailbox:: for key in inbox.iterkeys(): try: message = inbox[key] - except email.Errors.MessageParseError: + except email.errors.MessageParseError: continue # The message is malformed. Just leave it. for name in list_names: diff --git a/Doc/library/markup.rst b/Doc/library/markup.rst index ed24ba2..1588aa8 100644 --- a/Doc/library/markup.rst +++ b/Doc/library/markup.rst @@ -9,14 +9,6 @@ data markup. This includes modules to work with the Standard Generalized Markup Language (SGML) and the Hypertext Markup Language (HTML), and several interfaces for working with the Extensible Markup Language (XML). -It is important to note that modules in the :mod:`xml` package require that -there be at least one SAX-compliant XML parser available. The Expat parser is -included with Python, so the :mod:`xml.parsers.expat` module will always be -available. - -The documentation for the :mod:`xml.dom` and :mod:`xml.sax` packages are the -definition of the Python bindings for the DOM and SAX interfaces. - .. toctree:: diff --git a/Doc/library/math.rst b/Doc/library/math.rst index b014fc4..3c41672 100644 --- a/Doc/library/math.rst +++ b/Doc/library/math.rst @@ -31,14 +31,14 @@ Number-theoretic and representation functions Return the ceiling of *x*, the smallest integer greater than or equal to *x*. If *x* is not a float, delegates to ``x.__ceil__()``, which should return an - :class:`Integral` value. + :class:`~numbers.Integral` value. .. function:: copysign(x, y) - Return *x* with the sign of *y*. On a platform that supports - signed zeros, ``copysign(1.0, -0.0)`` returns *-1.0*. - + Return a float with the magnitude (absolute value) of *x* but the sign of + *y*. On platforms that support signed zeros, ``copysign(1.0, -0.0)`` + returns *-1.0*. .. function:: fabs(x) @@ -53,7 +53,7 @@ Number-theoretic and representation functions Return the floor of *x*, the largest integer less than or equal to *x*. If *x* is not a float, delegates to ``x.__floor__()``, which should return an - :class:`Integral` value. + :class:`~numbers.Integral` value. .. function:: fmod(x, y) @@ -133,8 +133,9 @@ Number-theoretic and representation functions .. function:: trunc(x) - Return the :class:`Real` value *x* truncated to an :class:`Integral` (usually - an integer). Delegates to ``x.__trunc__()``. + Return the :class:`~numbers.Real` value *x* truncated to an + :class:`~numbers.Integral` (usually an integer). Delegates to + ``x.__trunc__()``. Note that :func:`frexp` and :func:`modf` have a different call/return pattern @@ -187,6 +188,19 @@ Power and logarithmic functions result is calculated in a way which is accurate for *x* near zero. +.. function:: log2(x) + + Return the base-2 logarithm of *x*. This is usually more accurate than + ``log(x, 2)``. + + .. versionadded:: 3.3 + + .. seealso:: + + :meth:`int.bit_length` returns the number of bits necessary to represent + an integer in binary, excluding the sign and leading zeros. + + .. function:: log10(x) Return the base-10 logarithm of *x*. This is usually more accurate diff --git a/Doc/library/mimetypes.rst b/Doc/library/mimetypes.rst index be11c0d..12c9eca 100644 --- a/Doc/library/mimetypes.rst +++ b/Doc/library/mimetypes.rst @@ -85,6 +85,9 @@ behavior of the module. :const:`knownfiles` takes precedence over those named before it. Calling :func:`init` repeatedly is allowed. + Specifying an empty list for *files* will prevent the system defaults from + being applied: only the well-known values will be present from a built-in list. + .. versionchanged:: 3.2 Previously, Windows registry settings were ignored. diff --git a/Doc/library/mmap.rst b/Doc/library/mmap.rst index 178e388..18e05e3 100644 --- a/Doc/library/mmap.rst +++ b/Doc/library/mmap.rst @@ -155,13 +155,14 @@ To map anonymous memory, -1 should be passed as the fileno along with the length .. method:: close() - Close the file. Subsequent calls to other methods of the object will - result in an exception being raised. + Closes the mmap. Subsequent calls to other methods of the object will + result in a ValueError exception being raised. This will not close + the open file. .. attribute:: closed - True if the file is closed. + ``True`` if the file is closed. .. versionadded:: 3.2 @@ -196,12 +197,16 @@ To map anonymous memory, -1 should be passed as the fileno along with the length move will raise a :exc:`TypeError` exception. - .. method:: read(num) + .. method:: read([n]) - Return a :class:`bytes` containing up to *num* bytes starting from the - current file position; the file position is updated to point after the - bytes that were returned. + Return a :class:`bytes` containing up to *n* bytes starting from the + current file position. If the argument is omitted, *None* or negative, + return all bytes from the current file position to the end of the + mapping. The file position is updated to point after the bytes that were + returned. + .. versionchanged:: 3.3 + Argument can be omitted or *None*. .. method:: read_byte() diff --git a/Doc/library/msilib.rst b/Doc/library/msilib.rst index 270f4ff..d3451c8 100644 --- a/Doc/library/msilib.rst +++ b/Doc/library/msilib.rst @@ -88,8 +88,7 @@ structures. record according to the schema of the table. For optional fields, ``None`` can be passed. - Field values can be int or long numbers, strings, or instances of the Binary - class. + Field values can be ints, strings, or instances of the Binary class. .. class:: Binary(filename) @@ -430,8 +429,9 @@ GUI classes ----------- :mod:`msilib` provides several classes that wrap the GUI tables in an MSI -database. However, no standard user interface is provided; use :mod:`bdist_msi` -to create MSI files with a user-interface for installing Python packages. +database. However, no standard user interface is provided; use +:mod:`~distutils.command.bdist_msi` to create MSI files with a user-interface +for installing Python packages. .. class:: Control(dlg, name) diff --git a/Doc/library/msvcrt.rst b/Doc/library/msvcrt.rst index 889a0c5..9d23720 100644 --- a/Doc/library/msvcrt.rst +++ b/Doc/library/msvcrt.rst @@ -20,6 +20,11 @@ api. The normal API deals only with ASCII characters and is of limited use for internationalized applications. The wide char API should be used where ever possible +.. versionchanged:: 3.3 + Operations in this module now raise :exc:`OSError` where :exc:`IOError` + was raised. + + .. _msvcrt-files: File Operations @@ -29,7 +34,7 @@ File Operations .. function:: locking(fd, mode, nbytes) Lock part of a file based on file descriptor *fd* from the C runtime. Raises - :exc:`IOError` on failure. The locked region of the file extends from the + :exc:`OSError` on failure. The locked region of the file extends from the current file position for *nbytes* bytes, and may continue beyond the end of the file. *mode* must be one of the :const:`LK_\*` constants listed below. Multiple regions in a file may be locked at the same time, but may not overlap. Adjacent @@ -41,13 +46,13 @@ File Operations Locks the specified bytes. If the bytes cannot be locked, the program immediately tries again after 1 second. If, after 10 attempts, the bytes cannot - be locked, :exc:`IOError` is raised. + be locked, :exc:`OSError` is raised. .. data:: LK_NBLCK LK_NBRLCK - Locks the specified bytes. If the bytes cannot be locked, :exc:`IOError` is + Locks the specified bytes. If the bytes cannot be locked, :exc:`OSError` is raised. @@ -73,7 +78,7 @@ File Operations .. function:: get_osfhandle(fd) - Return the file handle for the file descriptor *fd*. Raises :exc:`IOError` if + Return the file handle for the file descriptor *fd*. Raises :exc:`OSError` if *fd* is not recognized. @@ -144,4 +149,4 @@ Other Functions .. function:: heapmin() Force the :c:func:`malloc` heap to clean itself up and return unused blocks to - the operating system. On failure, this raises :exc:`IOError`. + the operating system. On failure, this raises :exc:`OSError`. diff --git a/Doc/library/multiprocessing.rst b/Doc/library/multiprocessing.rst index 3e40faf..5f50cf1 100644 --- a/Doc/library/multiprocessing.rst +++ b/Doc/library/multiprocessing.rst @@ -29,7 +29,7 @@ Windows. Functionality within this package requires that the ``__main__`` module be importable by the children. This is covered in :ref:`multiprocessing-programming` however it is worth pointing out here. This means that some examples, such - as the :class:`multiprocessing.Pool` examples will not work in the + as the :class:`multiprocessing.pool.Pool` examples will not work in the interactive interpreter. For example:: >>> from multiprocessing import Pool @@ -121,9 +121,7 @@ processes: print(q.get()) # prints "[42, None, 'hello']" p.join() - Queues are thread and process safe, but note that they must never - be instantiated as a side effect of importing a module: this can lead - to a deadlock! (see :ref:`threaded-imports`) + Queues are thread and process safe. **Pipes** @@ -229,11 +227,11 @@ However, if you really do need to use some shared data then holds Python objects and allows other processes to manipulate them using proxies. - A manager returned by :func:`Manager` will support types :class:`list`, - :class:`dict`, :class:`Namespace`, :class:`Lock`, :class:`RLock`, - :class:`Semaphore`, :class:`BoundedSemaphore`, :class:`Condition`, - :class:`Event`, :class:`Queue`, :class:`Value` and :class:`Array`. For - example, :: + A manager returned by :func:`Manager` will support types + :class:`list`, :class:`dict`, :class:`Namespace`, :class:`Lock`, + :class:`RLock`, :class:`Semaphore`, :class:`BoundedSemaphore`, + :class:`Condition`, :class:`Event`, :class:`Barrier`, + :class:`Queue`, :class:`Value` and :class:`Array`. For example, :: from multiprocessing import Process, Manager @@ -244,17 +242,16 @@ However, if you really do need to use some shared data then l.reverse() if __name__ == '__main__': - manager = Manager() + with Manager() as manager: + d = manager.dict() + l = manager.list(range(10)) - d = manager.dict() - l = manager.list(range(10)) + p = Process(target=f, args=(d, l)) + p.start() + p.join() - p = Process(target=f, args=(d, l)) - p.start() - p.join() - - print(d) - print(l) + print(d) + print(l) will print :: @@ -282,10 +279,13 @@ For example:: return x*x if __name__ == '__main__': - pool = Pool(processes=4) # start 4 worker processes - result = pool.apply_async(f, [10]) # evaluate "f(10)" asynchronously - print(result.get(timeout=1)) # prints "100" unless your computer is *very* slow - print(pool.map(f, range(10))) # prints "[0, 1, 4,..., 81]" + with Pool(processes=4) as pool: # start 4 worker processes + result = pool.apply_async(f, [10]) # evaluate "f(10)" asynchronously + print(result.get(timeout=1)) # prints "100" unless your computer is *very* slow + print(pool.map(f, range(10))) # prints "[0, 1, 4,..., 81]" + +Note that the methods of a pool should only ever be used by the +process which created it. Reference @@ -298,7 +298,8 @@ The :mod:`multiprocessing` package mostly replicates the API of the :class:`Process` and exceptions ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -.. class:: Process(group=None, target=None, name=None, args=(), kwargs={}) +.. class:: Process(group=None, target=None, name=None, args=(), kwargs={}, \ + *, daemon=None) Process objects represent activity that is run in a separate process. The :class:`Process` class has equivalents of all the methods of @@ -308,18 +309,22 @@ The :mod:`multiprocessing` package mostly replicates the API of the should always be ``None``; it exists solely for compatibility with :class:`threading.Thread`. *target* is the callable object to be invoked by the :meth:`run()` method. It defaults to ``None``, meaning nothing is - called. *name* is the process name. By default, a unique name is constructed - of the form 'Process-N\ :sub:`1`:N\ :sub:`2`:...:N\ :sub:`k`' where N\ - :sub:`1`,N\ :sub:`2`,...,N\ :sub:`k` is a sequence of integers whose length - is determined by the *generation* of the process. *args* is the argument - tuple for the target invocation. *kwargs* is a dictionary of keyword - arguments for the target invocation. By default, no arguments are passed to - *target*. + called. *name* is the process name (see :attr:`name` for more details). + *args* is the argument tuple for the target invocation. *kwargs* is a + dictionary of keyword arguments for the target invocation. If provided, + the keyword-only *daemon* argument sets the process :attr:`daemon` flag + to ``True`` or ``False``. If ``None`` (the default), this flag will be + inherited from the creating process. + + By default, no arguments are passed to *target*. If a subclass overrides the constructor, it must make sure it invokes the base class constructor (:meth:`Process.__init__`) before doing anything else to the process. + .. versionchanged:: 3.3 + Added the *daemon* argument. + .. method:: run() Method representing the process's activity. @@ -338,10 +343,9 @@ The :mod:`multiprocessing` package mostly replicates the API of the .. method:: join([timeout]) - Block the calling thread until the process whose :meth:`join` method is - called terminates or until the optional timeout occurs. - - If *timeout* is ``None`` then there is no timeout. + If the optional argument *timeout* is ``None`` (the default), the method + blocks until the process whose :meth:`join` method is called terminates. + If *timeout* is a positive number, it blocks at most *timeout* seconds. A process can be joined many times. @@ -350,11 +354,14 @@ The :mod:`multiprocessing` package mostly replicates the API of the .. attribute:: name - The process's name. + The process's name. The name is a string used for identification purposes + only. It has no semantics. Multiple processes may be given the same + name. - The name is a string used for identification purposes only. It has no - semantics. Multiple processes may be given the same name. The initial - name is set by the constructor. + The initial name is set by the constructor. If no explicit name is + provided to the constructor, a name of the form + 'Process-N\ :sub:`1`:N\ :sub:`2`:...:N\ :sub:`k`' is constructed, where + each N\ :sub:`k` is the N-th child of its parent. .. method:: is_alive @@ -379,7 +386,7 @@ The :mod:`multiprocessing` package mostly replicates the API of the Unix daemons or services, they are normal processes that will be terminated (and not joined) if non-daemonic processes have exited. - In addition to the :class:`Threading.Thread` API, :class:`Process` objects + In addition to the :class:`threading.Thread` API, :class:`Process` objects also support the following attributes and methods: .. attribute:: pid @@ -398,7 +405,7 @@ The :mod:`multiprocessing` package mostly replicates the API of the The process's authentication key (a byte string). When :mod:`multiprocessing` is initialized the main process is assigned a - random string using :func:`os.random`. + random string using :func:`os.urandom`. When a :class:`Process` object is created, it will inherit the authentication key of its parent process, although this may be changed by @@ -406,6 +413,21 @@ The :mod:`multiprocessing` package mostly replicates the API of the See :ref:`multiprocessing-auth-keys`. + .. attribute:: sentinel + + A numeric handle of a system object which will become "ready" when + the process ends. + + You can use this value if you want to wait on several events at + once using :func:`multiprocessing.connection.wait`. Otherwise + calling :meth:`join()` is simpler. + + On Windows, this is an OS handle usable with the ``WaitForSingleObject`` + and ``WaitForMultipleObjects`` family of API calls. On Unix, this is + a file descriptor usable with primitives from the :mod:`select` module. + + .. versionadded:: 3.3 + .. method:: terminate() Terminate the process. On Unix this is done using the ``SIGTERM`` signal; @@ -424,7 +446,7 @@ The :mod:`multiprocessing` package mostly replicates the API of the cause other processes to deadlock. Note that the :meth:`start`, :meth:`join`, :meth:`is_alive`, - :meth:`terminate` and :attr:`exit_code` methods should only be called by + :meth:`terminate` and :attr:`exitcode` methods should only be called by the process that created the process object. Example usage of some of the methods of :class:`Process`: @@ -445,6 +467,9 @@ The :mod:`multiprocessing` package mostly replicates the API of the >>> p.exitcode == -signal.SIGTERM True +.. exception:: ProcessError + + The base class of all :mod:`multiprocessing` exceptions. .. exception:: BufferTooShort @@ -454,6 +479,13 @@ The :mod:`multiprocessing` package mostly replicates the API of the If ``e`` is an instance of :exc:`BufferTooShort` then ``e.args[0]`` will give the message as a byte string. +.. exception:: AuthenticationError + + Raised when there is an authentication error. + +.. exception:: TimeoutError + + Raised by methods with a timeout when the timeout expires. Pipes and Queues ~~~~~~~~~~~~~~~~ @@ -465,7 +497,7 @@ primitives like locks. For passing messages one can use :func:`Pipe` (for a connection between two processes) or a queue (which allows multiple producers and consumers). -The :class:`Queue`, :class:`multiprocessing.queues.SimpleQueue` and :class:`JoinableQueue` types are multi-producer, +The :class:`Queue`, :class:`SimpleQueue` and :class:`JoinableQueue` types are multi-producer, multi-consumer FIFO queues modelled on the :class:`queue.Queue` class in the standard library. They differ in that :class:`Queue` lacks the :meth:`~queue.Queue.task_done` and :meth:`~queue.Queue.join` methods introduced @@ -486,6 +518,24 @@ Note that one can also create a shared queue by using a manager object -- see the :mod:`multiprocessing` namespace so you need to import them from :mod:`queue`. +.. note:: + + When an object is put on a queue, the object is pickled and a + background thread later flushes the pickled data to an underlying + pipe. This has some consequences which are a little surprising, + but should not cause any practical difficulties -- if they really + bother you then you can instead use a queue created with a + :ref:`manager <multiprocessing-managers>`. + + (1) After putting an object on an empty queue there may be an + infinitesimal delay before the queue's :meth:`~Queue.empty` + method returns :const:`False` and :meth:`~Queue.get_nowait` can + return without raising :exc:`queue.Empty`. + + (2) If multiple processes are enqueuing objects, it is possible for + the objects to be received at the other end out-of-order. + However, objects enqueued by the same process will always be in + the expected order with respect to each other. .. warning:: @@ -497,7 +547,8 @@ Note that one can also create a shared queue by using a manager object -- see .. warning:: As mentioned above, if a child process has put items on a queue (and it has - not used :meth:`JoinableQueue.cancel_join_thread`), then that process will + not used :meth:`JoinableQueue.cancel_join_thread + <multiprocessing.Queue.cancel_join_thread>`), then that process will not terminate until all buffered items have been flushed to the pipe. This means that if you try joining that process you may get a deadlock unless @@ -530,7 +581,7 @@ For an example of the usage of queues for interprocess communication see thread is started which transfers objects from a buffer into the pipe. The usual :exc:`queue.Empty` and :exc:`queue.Full` exceptions from the - standard library's :mod:`Queue` module are raised to signal timeouts. + standard library's :mod:`queue` module are raised to signal timeouts. :class:`Queue` implements all the methods of :class:`queue.Queue` except for :meth:`~queue.Queue.task_done` and :meth:`~queue.Queue.join`. @@ -609,8 +660,15 @@ For an example of the usage of queues for interprocess communication see the background thread from being joined automatically when the process exits -- see :meth:`join_thread`. + A better name for this method might be + ``allow_exit_without_flush()``. It is likely to cause enqueued + data to lost, and you almost certainly will not need to use it. + It is really only there if you need the current process to exit + immediately without waiting to flush enqueued data to the + underlying pipe, and you don't care about lost data. + -.. class:: multiprocessing.queues.SimpleQueue() +.. class:: SimpleQueue() It is a simplified :class:`Queue` type, very close to a locked :class:`Pipe`. @@ -634,12 +692,12 @@ For an example of the usage of queues for interprocess communication see .. method:: task_done() - Indicate that a formerly enqueued task is complete. Used by queue consumer - threads. For each :meth:`~Queue.get` used to fetch a task, a subsequent + Indicate that a formerly enqueued task is complete. Used by queue + consumers. For each :meth:`~Queue.get` used to fetch a task, a subsequent call to :meth:`task_done` tells the queue that the processing on the task is complete. - If a :meth:`~Queue.join` is currently blocking, it will resume when all + If a :meth:`~queue.Queue.join` is currently blocking, it will resume when all items have been processed (meaning that a :meth:`task_done` call was received for every item that had been :meth:`~Queue.put` into the queue). @@ -652,10 +710,10 @@ For an example of the usage of queues for interprocess communication see Block until all items in the queue have been gotten and processed. The count of unfinished tasks goes up whenever an item is added to the - queue. The count goes down whenever a consumer thread calls + queue. The count goes down whenever a consumer calls :meth:`task_done` to indicate that the item was retrieved and all work on it is complete. When the count of unfinished tasks drops to zero, - :meth:`~Queue.join` unblocks. + :meth:`~queue.Queue.join` unblocks. Miscellaneous @@ -766,10 +824,12 @@ Connection objects are usually created using :func:`Pipe` -- see also *timeout* is a number then this specifies the maximum time in seconds to block. If *timeout* is ``None`` then an infinite timeout is used. + Note that multiple connection objects may be polled at once by + using :func:`multiprocessing.connection.wait`. + .. method:: send_bytes(buffer[, offset[, size]]) - Send byte data from an object supporting the buffer interface as a - complete message. + Send byte data from a :term:`bytes-like object` as a complete message. If *offset* is given then data is read from that position in *buffer*. If *size* is given then that many bytes will be read from buffer. Very large @@ -784,9 +844,14 @@ Connection objects are usually created using :func:`Pipe` -- see also to receive and the other end has closed. If *maxlength* is specified and the message is longer than *maxlength* - then :exc:`IOError` is raised and the connection will no longer be + then :exc:`OSError` is raised and the connection will no longer be readable. + .. versionchanged:: 3.3 + This function used to raise a :exc:`IOError`, which is now an + alias of :exc:`OSError`. + + .. method:: recv_bytes_into(buffer[, offset]) Read into *buffer* a complete message of byte data sent from the other end @@ -795,7 +860,7 @@ Connection objects are usually created using :func:`Pipe` -- see also :exc:`EOFError` if there is nothing left to receive and the other end was closed. - *buffer* must be an object satisfying the writable buffer interface. If + *buffer* must be a writable :term:`bytes-like object`. If *offset* is given then the message will be written into the buffer from that position. Offset must be a non-negative integer less than the length of *buffer* (in bytes). @@ -804,6 +869,14 @@ Connection objects are usually created using :func:`Pipe` -- see also raised and the complete message is available as ``e.args[0]`` where ``e`` is the exception instance. + .. versionchanged:: 3.3 + Connection objects themselves can now be transferred between processes + using :meth:`Connection.send` and :meth:`Connection.recv`. + + .. versionadded:: 3.3 + Connection objects now support the context manager protocol -- see + :ref:`typecontextmanager`. :meth:`~contextmanager.__enter__` returns the + connection object, and :meth:`~contextmanager.__exit__` calls :meth:`close`. For example: @@ -855,6 +928,12 @@ program as they are in a multithreaded program. See the documentation for Note that one can also create synchronization primitives by using a manager object -- see :ref:`multiprocessing-managers`. +.. class:: Barrier(parties[, action[, timeout]]) + + A barrier object: a clone of :class:`threading.Barrier`. + + .. versionadded:: 3.3 + .. class:: BoundedSemaphore([value]) A bounded semaphore object: a clone of :class:`threading.BoundedSemaphore`. @@ -864,20 +943,17 @@ object -- see :ref:`multiprocessing-managers`. .. class:: Condition([lock]) - A condition variable: a clone of :class:`threading.Condition`. + A condition variable: an alias for :class:`threading.Condition`. If *lock* is specified then it should be a :class:`Lock` or :class:`RLock` object from :mod:`multiprocessing`. + .. versionchanged:: 3.3 + The :meth:`~threading.Condition.wait_for` method was added. + .. class:: Event() A clone of :class:`threading.Event`. - This method returns the state of the internal semaphore on exit, so it - will always return ``True`` except if a timeout is given and the operation - times out. - - .. versionchanged:: 3.1 - Previously, the method always returned ``None``. .. class:: Lock() @@ -893,6 +969,12 @@ object -- see :ref:`multiprocessing-managers`. .. note:: + The :meth:`acquire` and :meth:`wait` methods of each of these types + treat negative timeouts as zero timeouts. This differs from + :mod:`threading` where, since version 3.2, the equivalent + :meth:`acquire` methods treat negative timeouts as infinite + timeouts. + On Mac OS X, ``sem_timedwait`` is unsupported, so calling ``acquire()`` with a timeout will emulate that function's behavior using a sleeping loop. @@ -914,21 +996,34 @@ Shared :mod:`ctypes` Objects It is possible to create shared objects using shared memory which can be inherited by child processes. -.. function:: Value(typecode_or_type, *args[, lock]) +.. function:: Value(typecode_or_type, *args, lock=True) Return a :mod:`ctypes` object allocated from shared memory. By default the - return value is actually a synchronized wrapper for the object. + return value is actually a synchronized wrapper for the object. The object + itself can be accessed via the *value* attribute of a :class:`Value`. *typecode_or_type* determines the type of the returned object: it is either a ctypes type or a one character typecode of the kind used by the :mod:`array` module. *\*args* is passed on to the constructor for the type. - If *lock* is ``True`` (the default) then a new lock object is created to - synchronize access to the value. If *lock* is a :class:`Lock` or - :class:`RLock` object then that will be used to synchronize access to the - value. If *lock* is ``False`` then access to the returned object will not be - automatically protected by a lock, so it will not necessarily be - "process-safe". + If *lock* is ``True`` (the default) then a new recursive lock + object is created to synchronize access to the value. If *lock* is + a :class:`Lock` or :class:`RLock` object then that will be used to + synchronize access to the value. If *lock* is ``False`` then + access to the returned object will not be automatically protected + by a lock, so it will not necessarily be "process-safe". + + Operations like ``+=`` which involve a read and write are not + atomic. So if, for instance, you want to atomically increment a + shared value it is insufficient to just do :: + + counter.value += 1 + + Assuming the associated lock is recursive (which it is by default) + you can instead do :: + + with counter.get_lock(): + counter.value += 1 Note that *lock* is a keyword-only argument. @@ -1006,30 +1101,31 @@ processes. attributes which allow one to use it to store and retrieve strings -- see documentation for :mod:`ctypes`. -.. function:: Array(typecode_or_type, size_or_initializer, *args[, lock]) +.. function:: Array(typecode_or_type, size_or_initializer, *, lock=True) The same as :func:`RawArray` except that depending on the value of *lock* a process-safe synchronization wrapper may be returned instead of a raw ctypes array. If *lock* is ``True`` (the default) then a new lock object is created to - synchronize access to the value. If *lock* is a :class:`Lock` or - :class:`RLock` object then that will be used to synchronize access to the + synchronize access to the value. If *lock* is a + :class:`~multiprocessing.Lock` or :class:`~multiprocessing.RLock` object + then that will be used to synchronize access to the value. If *lock* is ``False`` then access to the returned object will not be automatically protected by a lock, so it will not necessarily be "process-safe". Note that *lock* is a keyword-only argument. -.. function:: Value(typecode_or_type, *args[, lock]) +.. function:: Value(typecode_or_type, *args, lock=True) The same as :func:`RawValue` except that depending on the value of *lock* a process-safe synchronization wrapper may be returned instead of a raw ctypes object. If *lock* is ``True`` (the default) then a new lock object is created to - synchronize access to the value. If *lock* is a :class:`Lock` or - :class:`RLock` object then that will be used to synchronize access to the + synchronize access to the value. If *lock* is a :class:`~multiprocessing.Lock` or + :class:`~multiprocessing.RLock` object then that will be used to synchronize access to the value. If *lock* is ``False`` then access to the returned object will not be automatically protected by a lock, so it will not necessarily be "process-safe". @@ -1123,8 +1219,10 @@ Managers ~~~~~~~~ Managers provide a way to create data which can be shared between different -processes. A manager object controls a server process which manages *shared -objects*. Other processes can access the shared objects by using proxies. +processes, including sharing over a network between processes running on +different machines. A manager object controls a server process which manages +*shared objects*. Other processes can access the shared objects by using +proxies. .. function:: multiprocessing.Manager() @@ -1197,9 +1295,10 @@ their parent process exits. The manager classes are defined in the type of shared object. This must be a string. *callable* is a callable used for creating objects for this type - identifier. If a manager instance will be created using the - :meth:`from_address` classmethod or if the *create_method* argument is - ``False`` then this can be left as ``None``. + identifier. If a manager instance will be connected to the + server using the :meth:`connect` method, or if the + *create_method* argument is ``False`` then this can be left as + ``None``. *proxytype* is a subclass of :class:`BaseProxy` which is used to create proxies for shared objects with this *typeid*. If ``None`` then a proxy @@ -1207,12 +1306,12 @@ their parent process exits. The manager classes are defined in the *exposed* is used to specify a sequence of method names which proxies for this typeid should be allowed to access using - :meth:`BaseProxy._callMethod`. (If *exposed* is ``None`` then + :meth:`BaseProxy._callmethod`. (If *exposed* is ``None`` then :attr:`proxytype._exposed_` is used instead if it exists.) In the case where no exposed list is specified, all "public methods" of the shared object will be accessible. (Here a "public method" means any attribute - which has a :meth:`__call__` method and whose name does not begin with - ``'_'``.) + which has a :meth:`~object.__call__` method and whose name does not begin + with ``'_'``.) *method_to_typeid* is a mapping used to specify the return type of those exposed methods which should return a proxy. It maps method names to @@ -1231,6 +1330,14 @@ their parent process exits. The manager classes are defined in the The address used by the manager. + .. versionchanged:: 3.3 + Manager objects support the context manager protocol -- see + :ref:`typecontextmanager`. :meth:`~contextmanager.__enter__` starts the + server process (if it has not already started) and then returns the + manager object. :meth:`~contextmanager.__exit__` calls :meth:`shutdown`. + + In previous versions :meth:`~contextmanager.__enter__` did not start the + manager's server process if it was not already started. .. class:: SyncManager @@ -1240,6 +1347,13 @@ their parent process exits. The manager classes are defined in the It also supports creation of shared lists and dictionaries. + .. method:: Barrier(parties[, action[, timeout]]) + + Create a shared :class:`threading.Barrier` object and return a + proxy for it. + + .. versionadded:: 3.3 + .. method:: BoundedSemaphore([value]) Create a shared :class:`threading.BoundedSemaphore` object and return a @@ -1253,6 +1367,9 @@ their parent process exits. The manager classes are defined in the If *lock* is supplied then it should be a proxy for a :class:`threading.Lock` or :class:`threading.RLock` object. + .. versionchanged:: 3.3 + The :meth:`~threading.Condition.wait_for` method was added. + .. method:: Event() Create a shared :class:`threading.Event` object and return a proxy for it. @@ -1358,11 +1475,10 @@ callables with the manager class. For example:: MyManager.register('Maths', MathsClass) if __name__ == '__main__': - manager = MyManager() - manager.start() - maths = manager.Maths() - print(maths.add(4, 3)) # prints 7 - print(maths.mul(7, 8)) # prints 56 + with MyManager() as manager: + maths = manager.Maths() + print(maths.add(4, 3)) # prints 7 + print(maths.mul(7, 8)) # prints 56 Using a remote manager @@ -1562,7 +1678,7 @@ Process Pools One can create a pool of processes which will carry out tasks submitted to it with the :class:`Pool` class. -.. class:: multiprocessing.Pool([processes[, initializer[, initargs[, maxtasksperchild]]]]) +.. class:: Pool([processes[, initializer[, initargs[, maxtasksperchild]]]]) A process pool object which controls a pool of worker processes to which jobs can be submitted. It supports asynchronous results with timeouts and @@ -1573,6 +1689,9 @@ with the :class:`Pool` class. *initializer* is not ``None`` then each worker process will call ``initializer(*initargs)`` when it starts. + Note that the methods of the pool object should only be called by + the process which created the pool. + .. versionadded:: 3.2 *maxtasksperchild* is the number of tasks a worker process can complete before it will exit and be replaced with a fresh worker process, to enable @@ -1657,6 +1776,24 @@ with the :class:`Pool` class. returned iterator should be considered arbitrary. (Only when there is only one worker process is the order guaranteed to be "correct".) + .. method:: starmap(func, iterable[, chunksize]) + + Like :meth:`map` except that the elements of the `iterable` are expected + to be iterables that are unpacked as arguments. + + Hence an `iterable` of `[(1,2), (3, 4)]` results in `[func(1,2), + func(3,4)]`. + + .. versionadded:: 3.3 + + .. method:: starmap_async(func, iterable[, chunksize[, callback[, error_back]]]) + + A combination of :meth:`starmap` and :meth:`map_async` that iterates over + `iterable` of iterables and calls `func` with the iterables unpacked. + Returns a result object. + + .. versionadded:: 3.3 + .. method:: close() Prevents any more tasks from being submitted to the pool. Once all the @@ -1673,6 +1810,11 @@ with the :class:`Pool` class. Wait for the worker processes to exit. One must call :meth:`close` or :meth:`terminate` before using :meth:`join`. + .. versionadded:: 3.3 + Pool objects now support the context manager protocol -- see + :ref:`typecontextmanager`. :meth:`~contextmanager.__enter__` returns the + pool object, and :meth:`~contextmanager.__exit__` calls :meth:`terminate`. + .. class:: AsyncResult @@ -1707,21 +1849,20 @@ The following example demonstrates the use of a pool:: return x*x if __name__ == '__main__': - pool = Pool(processes=4) # start 4 worker processes + with Pool(processes=4) as pool: # start 4 worker processes + result = pool.apply_async(f, (10,)) # evaluate "f(10)" asynchronously + print(result.get(timeout=1)) # prints "100" unless your computer is *very* slow - result = pool.apply_async(f, (10,)) # evaluate "f(10)" asynchronously - print(result.get(timeout=1)) # prints "100" unless your computer is *very* slow + print(pool.map(f, range(10))) # prints "[0, 1, 4,..., 81]" - print(pool.map(f, range(10))) # prints "[0, 1, 4,..., 81]" + it = pool.imap(f, range(10)) + print(next(it)) # prints "0" + print(next(it)) # prints "1" + print(it.next(timeout=1)) # prints "4" unless your computer is *very* slow - it = pool.imap(f, range(10)) - print(next(it)) # prints "0" - print(next(it)) # prints "1" - print(it.next(timeout=1)) # prints "4" unless your computer is *very* slow - - import time - result = pool.apply_async(time.sleep, (10,)) - print(result.get(timeout=1)) # raises TimeoutError + import time + result = pool.apply_async(time.sleep, (10,)) + print(result.get(timeout=1)) # raises TimeoutError .. _multiprocessing-listeners-clients: @@ -1733,12 +1874,14 @@ Listeners and Clients :synopsis: API for dealing with sockets. Usually message passing between processes is done using queues or by using -:class:`Connection` objects returned by :func:`Pipe`. +:class:`~multiprocessing.Connection` objects returned by +:func:`~multiprocessing.Pipe`. However, the :mod:`multiprocessing.connection` module allows some extra flexibility. It basically gives a high level message oriented API for dealing -with sockets or Windows named pipes, and also has support for *digest -authentication* using the :mod:`hmac` module. +with sockets or Windows named pipes. It also has support for *digest +authentication* using the :mod:`hmac` module, and for polling +multiple connections at the same time. .. function:: deliver_challenge(connection, authkey) @@ -1748,15 +1891,15 @@ authentication* using the :mod:`hmac` module. If the reply matches the digest of the message using *authkey* as the key then a welcome message is sent to the other end of the connection. Otherwise - :exc:`AuthenticationError` is raised. + :exc:`~multiprocessing.AuthenticationError` is raised. -.. function:: answerChallenge(connection, authkey) +.. function:: answer_challenge(connection, authkey) Receive a message, calculate the digest of the message using *authkey* as the key, and then send the digest back. - If a welcome message is not received, then :exc:`AuthenticationError` is - raised. + If a welcome message is not received, then + :exc:`~multiprocessing.AuthenticationError` is raised. .. function:: Client(address[, family[, authenticate[, authkey]]]) @@ -1770,7 +1913,8 @@ authentication* using the :mod:`hmac` module. If *authenticate* is ``True`` or *authkey* is a byte string then digest authentication is used. The key used for authentication will be either *authkey* or ``current_process().authkey`` if *authkey* is ``None``. - If authentication fails then :exc:`AuthenticationError` is raised. See + If authentication fails then + :exc:`~multiprocessing.AuthenticationError` is raised. See :ref:`multiprocessing-auth-keys`. .. class:: Listener([address[, family[, backlog[, authenticate[, authkey]]]]]) @@ -1799,7 +1943,8 @@ authentication* using the :mod:`hmac` module. private temporary directory created using :func:`tempfile.mkstemp`. If the listener object uses a socket then *backlog* (1 by default) is passed - to the :meth:`listen` method of the socket once it has been bound. + to the :meth:`~socket.socket.listen` method of the socket once it has been + bound. If *authenticate* is ``True`` (``False`` by default) or *authkey* is not ``None`` then digest authentication is used. @@ -1811,13 +1956,15 @@ authentication* using the :mod:`hmac` module. ``current_process().authkey`` is used as the authentication key. If *authkey* is ``None`` and *authenticate* is ``False`` then no authentication is done. If authentication fails then - :exc:`AuthenticationError` is raised. See :ref:`multiprocessing-auth-keys`. + :exc:`~multiprocessing.AuthenticationError` is raised. + See :ref:`multiprocessing-auth-keys`. .. method:: accept() Accept a connection on the bound socket or named pipe of the listener - object and return a :class:`Connection` object. If authentication is - attempted and fails, then :exc:`AuthenticationError` is raised. + object and return a :class:`~multiprocessing.Connection` object. If + authentication is attempted and fails, then + :exc:`~multiprocessing.AuthenticationError` is raised. .. method:: close() @@ -1836,12 +1983,44 @@ authentication* using the :mod:`hmac` module. The address from which the last accepted connection came. If this is unavailable then it is ``None``. + .. versionadded:: 3.3 + Listener objects now support the context manager protocol -- see + :ref:`typecontextmanager`. :meth:`~contextmanager.__enter__` returns the + listener object, and :meth:`~contextmanager.__exit__` calls :meth:`close`. -The module defines two exceptions: +.. function:: wait(object_list, timeout=None) -.. exception:: AuthenticationError + Wait till an object in *object_list* is ready. Returns the list of + those objects in *object_list* which are ready. If *timeout* is a + float then the call blocks for at most that many seconds. If + *timeout* is ``None`` then it will block for an unlimited period. + A negative timeout is equivalent to a zero timeout. + + For both Unix and Windows, an object can appear in *object_list* if + it is + + * a readable :class:`~multiprocessing.Connection` object; + * a connected and readable :class:`socket.socket` object; or + * the :attr:`~multiprocessing.Process.sentinel` attribute of a + :class:`~multiprocessing.Process` object. + + A connection or socket object is ready when there is data available + to be read from it, or the other end has been closed. - Exception raised when there is an authentication error. + **Unix**: ``wait(object_list, timeout)`` almost equivalent + ``select.select(object_list, [], [], timeout)``. The difference is + that, if :func:`select.select` is interrupted by a signal, it can + raise :exc:`OSError` with an error number of ``EINTR``, whereas + :func:`wait` will not. + + **Windows**: An item in *object_list* must either be an integer + handle which is waitable (according to the definition used by the + documentation of the Win32 function ``WaitForMultipleObjects()``) + or it can be an object with a :meth:`fileno` method which returns a + socket handle or pipe handle. (Note that pipe handles and socket + handles are **not** waitable handles.) + + .. versionadded:: 3.3 **Examples** @@ -1854,19 +2033,16 @@ the client:: from array import array address = ('localhost', 6000) # family is deduced to be 'AF_INET' - listener = Listener(address, authkey=b'secret password') - - conn = listener.accept() - print('connection accepted from', listener.last_accepted) - conn.send([2.25, None, 'junk', float]) + with Listener(address, authkey=b'secret password') as listener: + with listener.accept() as conn: + print('connection accepted from', listener.last_accepted) - conn.send_bytes(b'hello') + conn.send([2.25, None, 'junk', float]) - conn.send_bytes(array('i', [42, 1729])) + conn.send_bytes(b'hello') - conn.close() - listener.close() + conn.send_bytes(array('i', [42, 1729])) The following code connects to the server and receives some data from the server:: @@ -1875,17 +2051,50 @@ server:: from array import array address = ('localhost', 6000) - conn = Client(address, authkey=b'secret password') - print(conn.recv()) # => [2.25, None, 'junk', float] + with Client(address, authkey=b'secret password') as conn: + print(conn.recv()) # => [2.25, None, 'junk', float] + + print(conn.recv_bytes()) # => 'hello' + + arr = array('i', [0, 0, 0, 0, 0]) + print(conn.recv_bytes_into(arr)) # => 8 + print(arr) # => array('i', [42, 1729, 0, 0, 0]) - print(conn.recv_bytes()) # => 'hello' +The following code uses :func:`~multiprocessing.connection.wait` to +wait for messages from multiple processes at once:: - arr = array('i', [0, 0, 0, 0, 0]) - print(conn.recv_bytes_into(arr)) # => 8 - print(arr) # => array('i', [42, 1729, 0, 0, 0]) + import time, random + from multiprocessing import Process, Pipe, current_process + from multiprocessing.connection import wait - conn.close() + def foo(w): + for i in range(10): + w.send((i, current_process().name)) + w.close() + + if __name__ == '__main__': + readers = [] + + for i in range(4): + r, w = Pipe(duplex=False) + readers.append(r) + p = Process(target=foo, args=(w,)) + p.start() + # We close the writable end of the pipe now to be sure that + # p is the only process which owns a handle for it. This + # ensures that when p closes its handle for the writable end, + # wait() will promptly report the readable end as being ready. + w.close() + + while readers: + for r in wait(readers): + try: + msg = r.recv() + except EOFError: + readers.remove(r) + else: + print(msg) .. _multiprocessing-address-formats: @@ -1913,7 +2122,8 @@ an ``'AF_PIPE'`` address rather than an ``'AF_UNIX'`` address. Authentication keys ~~~~~~~~~~~~~~~~~~~ -When one uses :meth:`Connection.recv`, the data received is automatically +When one uses :meth:`Connection.recv <multiprocessing.Connection.recv>`, the +data received is automatically unpickled. Unfortunately unpickling data from an untrusted source is a security risk. Therefore :class:`Listener` and :func:`Client` use the :mod:`hmac` module to provide digest authentication. @@ -2046,7 +2256,7 @@ Avoid shared state It is probably best to stick to using queues or pipes for communication between processes rather than using the lower level synchronization - primitives from the :mod:`threading` module. + primitives. Picklability @@ -2063,9 +2273,10 @@ Joining zombie processes On Unix when a process finishes but has not been joined it becomes a zombie. There should never be very many because each time a new process starts (or - :func:`active_children` is called) all completed processes which have not - yet been joined will be joined. Also calling a finished process's - :meth:`Process.is_alive` will join the process. Even so it is probably good + :func:`~multiprocessing.active_children` is called) all completed processes + which have not yet been joined will be joined. Also calling a finished + process's :meth:`Process.is_alive <multiprocessing.Process.is_alive>` will + join the process. Even so it is probably good practice to explicitly join all the processes that you start. Better to inherit than pickle/unpickle @@ -2078,20 +2289,23 @@ Better to inherit than pickle/unpickle Avoid terminating processes - Using the :meth:`Process.terminate` method to stop a process is liable to + Using the :meth:`Process.terminate <multiprocessing.Process.terminate>` + method to stop a process is liable to cause any shared resources (such as locks, semaphores, pipes and queues) currently being used by the process to become broken or unavailable to other processes. Therefore it is probably best to only consider using - :meth:`Process.terminate` on processes which never use any shared resources. + :meth:`Process.terminate <multiprocessing.Process.terminate>` on processes + which never use any shared resources. Joining processes that use queues Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the "feeder" thread to the underlying pipe. (The child process can call the - :meth:`Queue.cancel_join_thread` method of the queue to avoid this behaviour.) + :meth:`Queue.cancel_join_thread <multiprocessing.Queue.cancel_join_thread>` + method of the queue to avoid this behaviour.) This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the @@ -2168,7 +2382,7 @@ Beware of replacing :data:`sys.stdin` with a "file like object" resulting in a bad file descriptor error, but introduces a potential danger to applications which replace :func:`sys.stdin` with a "file-like object" with output buffering. This danger is that if multiple processes call - :func:`close()` on this file-like object, it could result in the same + :meth:`~io.IOBase.close()` on this file-like object, it could result in the same data being flushed to the object multiple times, resulting in corruption. If you write a file-like object and implement your own caching, you can @@ -2197,14 +2411,16 @@ More picklability as the ``target`` argument on Windows --- just define a function and use that instead. - Also, if you subclass :class:`Process` then make sure that instances will be - picklable when the :meth:`Process.start` method is called. + Also, if you subclass :class:`~multiprocessing.Process` then make sure that + instances will be picklable when the :meth:`Process.start + <multiprocessing.Process.start>` method is called. Global variables Bear in mind that if code run in a child process tries to access a global variable, then the value it sees (if any) may not be the same as the value - in the parent process at the time that :meth:`Process.start` was called. + in the parent process at the time that :meth:`Process.start + <multiprocessing.Process.start>` was called. However, global variables which are just module level constants cause no problems. @@ -2260,7 +2476,7 @@ Demonstration of how to create and use customized managers and proxies: :language: python3 -Using :class:`Pool`: +Using :class:`~multiprocessing.pool.Pool`: .. literalinclude:: ../includes/mp_pool.py :language: python3 diff --git a/Doc/library/netrc.rst b/Doc/library/netrc.rst index e085bce..564f101 100644 --- a/Doc/library/netrc.rst +++ b/Doc/library/netrc.rst @@ -29,7 +29,7 @@ the Unix :program:`ftp` program and other FTP clients. This implements security behavior equivalent to that of ftp and other programs that use :file:`.netrc`. - .. versionchanged:: 3.2.6 Added the POSIX permission check. + .. versionchanged:: 3.3.3 Added the POSIX permission check. .. exception:: NetrcParseError diff --git a/Doc/library/nntplib.rst b/Doc/library/nntplib.rst index ef8b9b5..a15d7f7 100644 --- a/Doc/library/nntplib.rst +++ b/Doc/library/nntplib.rst @@ -70,10 +70,23 @@ The module itself defines the following classes: connecting to an NNTP server on the local machine and intend to call reader-specific commands, such as ``group``. If you get unexpected :exc:`NNTPPermanentError`\ s, you might need to set *readermode*. + :class:`NNTP` class supports the :keyword:`with` statement to + unconditionally consume :exc:`socket.error` exceptions and to close the NNTP + connection when done. Here is a sample on how using it: + + >>> from nntplib import NNTP + >>> with NNTP('news.gmane.org') as n: + ... n.group('gmane.comp.python.committers') + ... + ('211 1755 1 1755 gmane.comp.python.committers', 1755, 1, 1755, 'gmane.comp.python.committers') + >>> + .. versionchanged:: 3.2 - *usenetrc* is now False by default. + *usenetrc* is now ``False`` by default. + .. versionchanged:: 3.3 + Support for the :keyword:`with` statement was added. .. class:: NNTP_SSL(host, port=563, user=None, password=None, ssl_context=None, readermode=None, usenetrc=False, [timeout]) @@ -204,7 +217,7 @@ tuples or objects that the method normally returns will be empty. .. method:: NNTP.login(user=None, password=None, usenetrc=True) Send ``AUTHINFO`` commands with the user name and password. If *user* - and *password* are None and *usenetrc* is True, credentials from + and *password* are None and *usenetrc* is true, credentials from ``~/.netrc`` will be used if possible. Unless intentionally delayed, login is normally performed during the @@ -382,18 +395,18 @@ tuples or objects that the method normally returns will be empty. .. method:: NNTP.next() - Send a ``NEXT`` command. Return as for :meth:`stat`. + Send a ``NEXT`` command. Return as for :meth:`.stat`. .. method:: NNTP.last() - Send a ``LAST`` command. Return as for :meth:`stat`. + Send a ``LAST`` command. Return as for :meth:`.stat`. .. method:: NNTP.article(message_spec=None, *, file=None) Send an ``ARTICLE`` command, where *message_spec* has the same meaning as - for :meth:`stat`. Return a tuple ``(response, info)`` where *info* + for :meth:`.stat`. Return a tuple ``(response, info)`` where *info* is a :class:`~collections.namedtuple` with three attributes *number*, *message_id* and *lines* (in that order). *number* is the article number in the group (or 0 if the information is not available), *message_id* the @@ -504,6 +517,9 @@ them have been superseded by newer commands in :rfc:`3977`. article with message ID *id*. Most of the time, this extension is not enabled by NNTP server administrators. + .. deprecated:: 3.3 + The XPATH extension is not actively used. + .. XXX deprecated: diff --git a/Doc/library/numbers.rst b/Doc/library/numbers.rst index ad33396..fec04ed 100644 --- a/Doc/library/numbers.rst +++ b/Doc/library/numbers.rst @@ -71,10 +71,10 @@ The numeric tower .. class:: Integral - Subtypes :class:`Rational` and adds a conversion to :class:`int`. - Provides defaults for :func:`float`, :attr:`~Rational.numerator`, and - :attr:`~Rational.denominator`, and bit-string operations: ``<<``, - ``>>``, ``&``, ``^``, ``|``, ``~``. + Subtypes :class:`Rational` and adds a conversion to :class:`int`. Provides + defaults for :func:`float`, :attr:`~Rational.numerator`, and + :attr:`~Rational.denominator`. Adds abstract methods for ``**`` and + bit-string operations: ``<<``, ``>>``, ``&``, ``^``, ``|``, ``~``. Notes for type implementors diff --git a/Doc/library/numeric.rst b/Doc/library/numeric.rst index ba22cb6..2732a84 100644 --- a/Doc/library/numeric.rst +++ b/Doc/library/numeric.rst @@ -8,9 +8,9 @@ Numeric and Mathematical Modules The modules described in this chapter provide numeric and math-related functions and data types. The :mod:`numbers` module defines an abstract hierarchy of numeric types. The :mod:`math` and :mod:`cmath` modules contain various -mathematical functions for floating-point and complex numbers. For users more -interested in decimal accuracy than in speed, the :mod:`decimal` module supports -exact representations of decimal numbers. +mathematical functions for floating-point and complex numbers. The :mod:`decimal` +module supports exact representations of decimal numbers, using arbitrary precision +arithmetic. The following modules are documented in this chapter: diff --git a/Doc/library/operator.rst b/Doc/library/operator.rst index 3860880..24becf9 100644 --- a/Doc/library/operator.rst +++ b/Doc/library/operator.rst @@ -241,13 +241,22 @@ lookups. These are useful for making fast field extractors as arguments for expect a function argument. -.. function:: attrgetter(attr[, args...]) +.. function:: attrgetter(attr) + attrgetter(*attrs) - Return a callable object that fetches *attr* from its operand. If more than one - attribute is requested, returns a tuple of attributes. After, - ``f = attrgetter('name')``, the call ``f(b)`` returns ``b.name``. After, - ``f = attrgetter('name', 'date')``, the call ``f(b)`` returns ``(b.name, - b.date)``. Equivalent to:: + Return a callable object that fetches *attr* from its operand. + If more than one attribute is requested, returns a tuple of attributes. + The attribute names can also contain dots. For example: + + * After ``f = attrgetter('name')``, the call ``f(b)`` returns ``b.name``. + + * After ``f = attrgetter('name', 'date')``, the call ``f(b)`` returns + ``(b.name, b.date)``. + + * After ``f = attrgetter('name.first', 'name.last')``, the call ``f(b)`` + returns ``(b.name.first, b.name.last)``. + + Equivalent to:: def attrgetter(*items): if any(not isinstance(item, str) for item in items): @@ -258,7 +267,7 @@ expect a function argument. return resolve_attr(obj, attr) else: def g(obj): - return tuple(resolve_att(obj, attr) for attr in items) + return tuple(resolve_attr(obj, attr) for attr in items) return g def resolve_attr(obj, attr): @@ -267,14 +276,19 @@ expect a function argument. return obj - The attribute names can also contain dots; after ``f = attrgetter('date.month')``, - the call ``f(b)`` returns ``b.date.month``. - -.. function:: itemgetter(item[, args...]) +.. function:: itemgetter(item) + itemgetter(*items) Return a callable object that fetches *item* from its operand using the operand's :meth:`__getitem__` method. If multiple items are specified, - returns a tuple of lookup values. Equivalent to:: + returns a tuple of lookup values. For example: + + * After ``f = itemgetter(2)``, the call ``f(r)`` returns ``r[2]``. + + * After ``g = itemgetter(2, 5, 3)``, the call ``g(r)`` returns + ``(r[2], r[5], r[3])``. + + Equivalent to:: def itemgetter(*items): if len(items) == 1: @@ -313,9 +327,14 @@ expect a function argument. Return a callable object that calls the method *name* on its operand. If additional arguments and/or keyword arguments are given, they will be given - to the method as well. After ``f = methodcaller('name')``, the call ``f(b)`` - returns ``b.name()``. After ``f = methodcaller('name', 'foo', bar=1)``, the - call ``f(b)`` returns ``b.name('foo', bar=1)``. Equivalent to:: + to the method as well. For example: + + * After ``f = methodcaller('name')``, the call ``f(b)`` returns ``b.name()``. + + * After ``f = methodcaller('name', 'foo', bar=1)``, the call ``f(b)`` + returns ``b.name('foo', bar=1)``. + + Equivalent to:: def methodcaller(name, *args, **kwargs): def caller(obj): diff --git a/Doc/library/os.path.rst b/Doc/library/os.path.rst index 32c48eb..57a6bca 100644 --- a/Doc/library/os.path.rst +++ b/Doc/library/os.path.rst @@ -77,11 +77,16 @@ the :mod:`glob` module.) .. function:: exists(path) - Return ``True`` if *path* refers to an existing path. Returns ``False`` for - broken symbolic links. On some platforms, this function may return ``False`` if - permission is not granted to execute :func:`os.stat` on the requested file, even + Return ``True`` if *path* refers to an existing path or an open + file descriptor. Returns ``False`` for broken symbolic links. On + some platforms, this function may return ``False`` if permission is + not granted to execute :func:`os.stat` on the requested file, even if the *path* physically exists. + .. versionchanged:: 3.3 + *path* can now be an integer: ``True`` is returned if it is an + open file descriptor, ``False`` otherwise. + .. function:: lexists(path) @@ -126,9 +131,9 @@ the :mod:`glob` module.) Return the time of last access of *path*. The return value is a number giving the number of seconds since the epoch (see the :mod:`time` module). Raise - :exc:`os.error` if the file does not exist or is inaccessible. + :exc:`OSError` if the file does not exist or is inaccessible. - If :func:`os.stat_float_times` returns True, the result is a floating point + If :func:`os.stat_float_times` returns ``True``, the result is a floating point number. @@ -136,24 +141,24 @@ the :mod:`glob` module.) Return the time of last modification of *path*. The return value is a number giving the number of seconds since the epoch (see the :mod:`time` module). - Raise :exc:`os.error` if the file does not exist or is inaccessible. + Raise :exc:`OSError` if the file does not exist or is inaccessible. - If :func:`os.stat_float_times` returns True, the result is a floating point + If :func:`os.stat_float_times` returns ``True``, the result is a floating point number. .. function:: getctime(path) Return the system's ctime which, on some systems (like Unix) is the time of the - last change, and, on others (like Windows), is the creation time for *path*. + last metadata change, and, on others (like Windows), is the creation time for *path*. The return value is a number giving the number of seconds since the epoch (see - the :mod:`time` module). Raise :exc:`os.error` if the file does not exist or + the :mod:`time` module). Raise :exc:`OSError` if the file does not exist or is inaccessible. .. function:: getsize(path) - Return the size, in bytes, of *path*. Raise :exc:`os.error` if the file does + Return the size, in bytes, of *path*. Raise :exc:`OSError` if the file does not exist or is inaccessible. @@ -229,8 +234,10 @@ the :mod:`glob` module.) .. function:: relpath(path, start=None) - Return a relative filepath to *path* either from the current directory or from - an optional *start* point. + Return a relative filepath to *path* either from the current directory or + from an optional *start* directory. This is a path computation: the + filesystem is not accessed to confirm the existence or nature of *path* or + *start*. *start* defaults to :attr:`os.curdir`. @@ -259,15 +266,16 @@ the :mod:`glob` module.) Availability: Unix, Windows. - .. versionchanged:: 3.2 Added Windows support. + .. versionchanged:: 3.2 + Added Windows support. .. function:: samestat(stat1, stat2) Return ``True`` if the stat tuples *stat1* and *stat2* refer to the same file. - These structures may have been returned by :func:`fstat`, :func:`lstat`, or - :func:`stat`. This function implements the underlying comparison used by - :func:`samefile` and :func:`sameopenfile`. + These structures may have been returned by :func:`os.fstat`, + :func:`os.lstat`, or :func:`os.stat`. This function implements the + underlying comparison used by :func:`samefile` and :func:`sameopenfile`. Availability: Unix. @@ -326,5 +334,5 @@ the :mod:`glob` module.) .. data:: supports_unicode_filenames - True if arbitrary Unicode strings can be used as file names (within limitations + ``True`` if arbitrary Unicode strings can be used as file names (within limitations imposed by the file system). diff --git a/Doc/library/os.rst b/Doc/library/os.rst index 715f654..d7b9829 100644 --- a/Doc/library/os.rst +++ b/Doc/library/os.rst @@ -96,6 +96,13 @@ These functions and data items provide information and operate on the current process and user. +.. function:: ctermid() + + Return the filename corresponding to the controlling terminal of the process. + + Availability: Unix. + + .. data:: environ A :term:`mapping` object representing the string environment. For example, @@ -177,6 +184,28 @@ process and user. .. versionadded:: 3.2 +.. function:: getenv(key, default=None) + + Return the value of the environment variable *key* if it exists, or + *default* if it doesn't. *key*, *default* and the result are str. + + On Unix, keys and values are decoded with :func:`sys.getfilesystemencoding` + and ``'surrogateescape'`` error handler. Use :func:`os.getenvb` if you + would like to use a different encoding. + + Availability: most flavors of Unix, Windows. + + +.. function:: getenvb(key, default=None) + + Return the value of the environment variable *key* if it exists, or + *default* if it doesn't. *key*, *default* and the result are bytes. + + Availability: most flavors of Unix. + + .. versionadded:: 3.2 + + .. function:: get_exec_path(env=None) Returns the list of directories that will be searched for a named @@ -188,13 +217,6 @@ process and user. .. versionadded:: 3.2 -.. function:: ctermid() - - Return the filename corresponding to the controlling terminal of the process. - - Availability: Unix. - - .. function:: getegid() Return the effective group id of the current process. This corresponds to the @@ -221,13 +243,26 @@ process and user. Availability: Unix. +.. function:: getgrouplist(user, group) + + Return list of group ids that *user* belongs to. If *group* is not in the + list, it is included; typically, *group* is specified as the group ID + field from the password record for *user*. + + Availability: Unix. + + .. versionadded:: 3.3 + + .. function:: getgroups() Return list of supplemental group ids associated with the current process. Availability: Unix. - .. note:: On Mac OS X, :func:`getgroups` behavior differs somewhat from + .. note:: + + On Mac OS X, :func:`getgroups` behavior differs somewhat from other Unix platforms. If the Python interpreter was built with a deployment target of :const:`10.5` or earlier, :func:`getgroups` returns the list of effective group ids associated with the current user process; @@ -242,17 +277,6 @@ process and user. obtained with :func:`sysconfig.get_config_var`. -.. function:: initgroups(username, gid) - - Call the system initgroups() to initialize the group access list with all of - the groups of which the specified username is a member, plus the specified - group id. - - Availability: Unix. - - .. versionadded:: 3.2 - - .. function:: getlogin() Return the name of the user logged in on the controlling terminal of the @@ -297,11 +321,40 @@ process and user. the id returned is the one of the init process (1), on Windows it is still the same id, which may be already reused by another process. - Availability: Unix, Windows + Availability: Unix, Windows. .. versionchanged:: 3.2 Added support for Windows. + +.. function:: getpriority(which, who) + + .. index:: single: process; scheduling priority + + Get program scheduling priority. The value *which* is one of + :const:`PRIO_PROCESS`, :const:`PRIO_PGRP`, or :const:`PRIO_USER`, and *who* + is interpreted relative to *which* (a process identifier for + :const:`PRIO_PROCESS`, process group identifier for :const:`PRIO_PGRP`, and a + user ID for :const:`PRIO_USER`). A zero value for *who* denotes + (respectively) the calling process, the process group of the calling process, + or the real user ID of the calling process. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: PRIO_PROCESS + PRIO_PGRP + PRIO_USER + + Parameters for the :func:`getpriority` and :func:`setpriority` functions. + + Availability: Unix. + + .. versionadded:: 3.3 + + .. function:: getresuid() Return a tuple (ruid, euid, suid) denoting the current process's @@ -331,24 +384,13 @@ process and user. Availability: Unix. -.. function:: getenv(key, default=None) - - Return the value of the environment variable *key* if it exists, or - *default* if it doesn't. *key*, *default* and the result are str. - - On Unix, keys and values are decoded with :func:`sys.getfilesystemencoding` - and ``'surrogateescape'`` error handler. Use :func:`os.getenvb` if you - would like to use a different encoding. - - Availability: most flavors of Unix, Windows. - - -.. function:: getenvb(key, default=None) +.. function:: initgroups(username, gid) - Return the value of the environment variable *key* if it exists, or - *default* if it doesn't. *key*, *default* and the result are bytes. + Call the system initgroups() to initialize the group access list with all of + the groups of which the specified username is a member, plus the specified + group id. - Availability: most flavors of Unix. + Availability: Unix. .. versionadded:: 3.2 @@ -410,7 +452,7 @@ process and user. .. function:: setpgrp() - Call the system call :c:func:`setpgrp` or :c:func:`setpgrp(0, 0)` depending on + Call the system call :c:func:`setpgrp` or ``setpgrp(0, 0)`` depending on which version is implemented (if any). See the Unix manual for the semantics. Availability: Unix. @@ -425,6 +467,25 @@ process and user. Availability: Unix. +.. function:: setpriority(which, who, priority) + + .. index:: single: process; scheduling priority + + Set program scheduling priority. The value *which* is one of + :const:`PRIO_PROCESS`, :const:`PRIO_PGRP`, or :const:`PRIO_USER`, and *who* + is interpreted relative to *which* (a process identifier for + :const:`PRIO_PROCESS`, process group identifier for :const:`PRIO_PGRP`, and a + user ID for :const:`PRIO_USER`). A zero value for *who* denotes + (respectively) the calling process, the process group of the calling process, + or the real user ID of the calling process. + *priority* is a value in the range -20 to 19. The default priority is 0; + lower priorities cause more favorable scheduling. + + Availability: Unix + + .. versionadded:: 3.3 + + .. function:: setregid(rgid, egid) Set the current process's real and effective group ids. @@ -492,7 +553,7 @@ process and user. .. data:: supports_bytes_environ - True if the native OS type of the environment is bytes (eg. False on + ``True`` if the native OS type of the environment is bytes (eg. ``False`` on Windows). .. versionadded:: 3.2 @@ -511,15 +572,31 @@ process and user. single: gethostname() (in module socket) single: gethostbyaddr() (in module socket) - Return a 5-tuple containing information identifying the current operating - system. The tuple contains 5 strings: ``(sysname, nodename, release, version, - machine)``. Some systems truncate the nodename to 8 characters or to the + Returns information identifying the current operating system. + The return value is an object with five attributes: + + * :attr:`sysname` - operating system name + * :attr:`nodename` - name of machine on network (implementation-defined) + * :attr:`release` - operating system release + * :attr:`version` - operating system version + * :attr:`machine` - hardware identifier + + For backwards compatibility, this object is also iterable, behaving + like a five-tuple containing :attr:`sysname`, :attr:`nodename`, + :attr:`release`, :attr:`version`, and :attr:`machine` + in that order. + + Some systems truncate :attr:`nodename` to 8 characters or to the leading component; a better way to get the hostname is :func:`socket.gethostname` or even ``socket.gethostbyaddr(socket.gethostname())``. Availability: recent flavors of Unix. + .. versionchanged:: 3.3 + Return type changed from a tuple to a tuple-like object + with named attributes. + .. function:: unsetenv(key) @@ -542,15 +619,16 @@ process and user. File Object Creation -------------------- -These functions create new :term:`file objects <file object>`. (See also :func:`open`.) +This function creates new :term:`file objects <file object>`. (See also +:func:`~os.open` for opening file descriptors.) .. function:: fdopen(fd, *args, **kwargs) - Return an open file object connected to the file descriptor *fd*. - This is an alias of :func:`open` and accepts the same arguments. - The only difference is that the first argument of :func:`fdopen` - must always be an integer. + Return an open file object connected to the file descriptor *fd*. This is an + alias of the :func:`open` built-in function and accepts the same arguments. + The only difference is that the first argument of :func:`fdopen` must always + be an integer. .. _os-fd-ops: @@ -567,11 +645,12 @@ process will then be assigned 3, 4, 5, and so forth. The name "file descriptor" is slightly deceptive; on Unix platforms, sockets and pipes are also referenced by file descriptors. -The :meth:`~file.fileno` method can be used to obtain the file descriptor +The :meth:`~io.IOBase.fileno` method can be used to obtain the file descriptor associated with a :term:`file object` when required. Note that using the file descriptor directly will bypass the file object methods, ignoring aspects such as internal buffering of data. + .. function:: close(fd) Close file descriptor *fd*. @@ -583,13 +662,13 @@ as internal buffering of data. This function is intended for low-level I/O and must be applied to a file descriptor as returned by :func:`os.open` or :func:`pipe`. To close a "file object" returned by the built-in function :func:`open` or by :func:`popen` or - :func:`fdopen`, use its :meth:`~file.close` method. + :func:`fdopen`, use its :meth:`~io.IOBase.close` method. .. function:: closerange(fd_low, fd_high) Close all file descriptors from *fd_low* (inclusive) to *fd_high* (exclusive), - ignoring errors. Equivalent to:: + ignoring errors. Equivalent to (but much faster than):: for fd in range(fd_low, fd_high): try: @@ -622,8 +701,9 @@ as internal buffering of data. .. function:: fchmod(fd, mode) - Change the mode of the file given by *fd* to the numeric *mode*. See the docs - for :func:`chmod` for possible values of *mode*. + Change the mode of the file given by *fd* to the numeric *mode*. See the + docs for :func:`chmod` for possible values of *mode*. As of Python 3.3, this + is equivalent to ``os.chmod(fd, mode)``. Availability: Unix. @@ -631,7 +711,9 @@ as internal buffering of data. .. function:: fchown(fd, uid, gid) Change the owner and group id of the file given by *fd* to the numeric *uid* - and *gid*. To leave one of the ids unchanged, set it to -1. + and *gid*. To leave one of the ids unchanged, set it to -1. See + :func:`chown`. As of Python 3.3, this is equivalent to ``os.chown(fd, uid, + gid)``. Availability: Unix. @@ -662,20 +744,24 @@ as internal buffering of data. included in ``pathconf_names``, an :exc:`OSError` is raised with :const:`errno.EINVAL` for the error number. + As of Python 3.3, this is equivalent to ``os.pathconf(fd, name)``. + Availability: Unix. .. function:: fstat(fd) - Return status for file descriptor *fd*, like :func:`~os.stat`. + Return status for file descriptor *fd*, like :func:`~os.stat`. As of Python + 3.3, this is equivalent to ``os.stat(fd)``. Availability: Unix, Windows. .. function:: fstatvfs(fd) - Return information about the filesystem containing the file associated with file - descriptor *fd*, like :func:`statvfs`. + Return information about the filesystem containing the file associated with + file descriptor *fd*, like :func:`statvfs`. As of Python 3.3, this is + equivalent to ``os.statvfs(fd)``. Availability: Unix. @@ -689,13 +775,14 @@ as internal buffering of data. ``f.flush()``, and then do ``os.fsync(f.fileno())``, to ensure that all internal buffers associated with *f* are written to disk. - Availability: Unix, and Windows. + Availability: Unix, Windows. .. function:: ftruncate(fd, length) - Truncate the file corresponding to file descriptor *fd*, so that it is at most - *length* bytes in size. + Truncate the file corresponding to file descriptor *fd*, so that it is at + most *length* bytes in size. As of Python 3.3, this is equivalent to + ``os.truncate(fd, length)``. Availability: Unix. @@ -705,15 +792,38 @@ as internal buffering of data. Return ``True`` if the file descriptor *fd* is open and connected to a tty(-like) device, else ``False``. + +.. function:: lockf(fd, cmd, len) + + Apply, test or remove a POSIX lock on an open file descriptor. + *fd* is an open file descriptor. + *cmd* specifies the command to use - one of :data:`F_LOCK`, :data:`F_TLOCK`, + :data:`F_ULOCK` or :data:`F_TEST`. + *len* specifies the section of the file to lock. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: F_LOCK + F_TLOCK + F_ULOCK + F_TEST + + Flags that specify what action :func:`lockf` will take. + Availability: Unix. + .. versionadded:: 3.3 + .. function:: lseek(fd, pos, how) Set the current position of file descriptor *fd* to position *pos*, modified by *how*: :const:`SEEK_SET` or ``0`` to set the position relative to the beginning of the file; :const:`SEEK_CUR` or ``1`` to set it relative to the - current position; :const:`os.SEEK_END` or ``2`` to set it relative to the end of + current position; :const:`SEEK_END` or ``2`` to set it relative to the end of the file. Return the new cursor position in bytes, starting from the beginning. Availability: Unix, Windows. @@ -724,21 +834,29 @@ as internal buffering of data. SEEK_END Parameters to the :func:`lseek` function. Their values are 0, 1, and 2, - respectively. Availability: Windows, Unix. + respectively. + + Availability: Unix, Windows. + + .. versionadded:: 3.3 + Some operating systems could support additional values, like + :data:`os.SEEK_HOLE` or :data:`os.SEEK_DATA`. -.. function:: open(file, flags[, mode]) +.. function:: open(file, flags, mode=0o777, *, dir_fd=None) Open the file *file* and set various flags according to *flags* and possibly - its mode according to *mode*. The default *mode* is ``0o777`` (octal), and - the current umask value is first masked out. Return the file descriptor for - the newly opened file. + its mode according to *mode*. When computing *mode*, the current umask value + is first masked out. Return the file descriptor for the newly opened file. For a description of the flag and mode values, see the C run-time documentation; flag constants (like :const:`O_RDONLY` and :const:`O_WRONLY`) are defined in this module too (see :ref:`open-constants`). In particular, on Windows adding :const:`O_BINARY` is needed to open files in binary mode. + This function can support :ref:`paths relative to directory descriptors + <dir_fd>`. + Availability: Unix, Windows. .. note:: @@ -748,6 +866,9 @@ as internal buffering of data. :meth:`~file.read` and :meth:`~file.write` methods (and many more). To wrap a file descriptor in a file object, use :func:`fdopen`. + .. versionadded:: 3.3 + The *dir_fd* argument. + .. function:: openpty() @@ -768,6 +889,79 @@ as internal buffering of data. Availability: Unix, Windows. +.. function:: pipe2(flags) + + Create a pipe with *flags* set atomically. + *flags* can be constructed by ORing together one or more of these values: + :data:`O_NONBLOCK`, :data:`O_CLOEXEC`. + Return a pair of file descriptors ``(r, w)`` usable for reading and writing, + respectively. + + Availability: some flavors of Unix. + + .. versionadded:: 3.3 + + +.. function:: posix_fallocate(fd, offset, len) + + Ensures that enough disk space is allocated for the file specified by *fd* + starting from *offset* and continuing for *len* bytes. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: posix_fadvise(fd, offset, len, advice) + + Announces an intention to access data in a specific pattern thus allowing + the kernel to make optimizations. + The advice applies to the region of the file specified by *fd* starting at + *offset* and continuing for *len* bytes. + *advice* is one of :data:`POSIX_FADV_NORMAL`, :data:`POSIX_FADV_SEQUENTIAL`, + :data:`POSIX_FADV_RANDOM`, :data:`POSIX_FADV_NOREUSE`, + :data:`POSIX_FADV_WILLNEED` or :data:`POSIX_FADV_DONTNEED`. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: POSIX_FADV_NORMAL + POSIX_FADV_SEQUENTIAL + POSIX_FADV_RANDOM + POSIX_FADV_NOREUSE + POSIX_FADV_WILLNEED + POSIX_FADV_DONTNEED + + Flags that can be used in *advice* in :func:`posix_fadvise` that specify + the access pattern that is likely to be used. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: pread(fd, buffersize, offset) + + Read from a file descriptor, *fd*, at a position of *offset*. It will read up + to *buffersize* number of bytes. The file offset remains unchanged. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: pwrite(fd, string, offset) + + Write *string* to a file descriptor, *fd*, from *offset*, leaving the file + offset unchanged. + + Availability: Unix. + + .. versionadded:: 3.3 + + .. function:: read(fd, n) Read at most *n* bytes from file descriptor *fd*. Return a bytestring containing the @@ -779,10 +973,64 @@ as internal buffering of data. .. note:: This function is intended for low-level I/O and must be applied to a file - descriptor as returned by :func:`os.open` or :func:`pipe`. To read a "file object" - returned by the built-in function :func:`open` or by :func:`popen` or - :func:`fdopen`, or :data:`sys.stdin`, use its :meth:`~file.read` or - :meth:`~file.readline` methods. + descriptor as returned by :func:`os.open` or :func:`pipe`. To read a + "file object" returned by the built-in function :func:`open` or by + :func:`popen` or :func:`fdopen`, or :data:`sys.stdin`, use its + :meth:`~file.read` or :meth:`~file.readline` methods. + + +.. function:: sendfile(out, in, offset, nbytes) + sendfile(out, in, offset, nbytes, headers=None, trailers=None, flags=0) + + Copy *nbytes* bytes from file descriptor *in* to file descriptor *out* + starting at *offset*. + Return the number of bytes sent. When EOF is reached return 0. + + The first function notation is supported by all platforms that define + :func:`sendfile`. + + On Linux, if *offset* is given as ``None``, the bytes are read from the + current position of *in* and the position of *in* is updated. + + The second case may be used on Mac OS X and FreeBSD where *headers* and + *trailers* are arbitrary sequences of buffers that are written before and + after the data from *in* is written. It returns the same as the first case. + + On Mac OS X and FreeBSD, a value of 0 for *nbytes* specifies to send until + the end of *in* is reached. + + All platforms support sockets as *out* file descriptor, and some platforms + allow other types (e.g. regular file, pipe) as well. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: SF_NODISKIO + SF_MNOWAIT + SF_SYNC + + Parameters to the :func:`sendfile` function, if the implementation supports + them. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: readv(fd, buffers) + + Read from a file descriptor *fd* into a number of mutable :term:`bytes-like + objects <bytes-like object>` *buffers*. :func:`~os.readv` will transfer data + into each buffer until it is full and then move on to the next buffer in the + sequence to hold the rest of the data. :func:`~os.readv` returns the total + number of bytes read (which may be less than the total capacity of all the + objects). + + Availability: Unix. + + .. versionadded:: 3.3 .. function:: tcgetpgrp(fd) @@ -826,6 +1074,18 @@ as internal buffering of data. :meth:`~file.write` method. +.. function:: writev(fd, buffers) + + Write the contents of *buffers* to file descriptor *fd*. *buffers* must be a + sequence of :term:`bytes-like objects <bytes-like object>`. + :func:`~os.writev` writes the contents of each object to the file descriptor + and returns the total number of bytes written. + + Availability: Unix. + + .. versionadded:: 3.3 + + .. _open-constants: ``open()`` flag constants @@ -857,9 +1117,12 @@ or `the MSDN <http://msdn.microsoft.com/en-us/library/z0kc8e3z.aspx>`_ on Window O_NOCTTY O_SHLOCK O_EXLOCK + O_CLOEXEC These constants are only available on Unix. + .. versionchanged:: 3.3 + Add :data:`O_CLOEXEC` constant. .. data:: O_BINARY O_NOINHERIT @@ -882,12 +1145,106 @@ or `the MSDN <http://msdn.microsoft.com/en-us/library/z0kc8e3z.aspx>`_ on Window the C library. +.. data:: RTLD_LAZY + RTLD_NOW + RTLD_GLOBAL + RTLD_LOCAL + RTLD_NODELETE + RTLD_NOLOAD + RTLD_DEEPBIND + + See the Unix manual page :manpage:`dlopen(3)`. + + .. versionadded:: 3.3 + + +.. _terminal-size: + +Querying the size of a terminal +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. versionadded:: 3.3 + +.. function:: get_terminal_size(fd=STDOUT_FILENO) + + Return the size of the terminal window as ``(columns, lines)``, + tuple of type :class:`terminal_size`. + + The optional argument ``fd`` (default ``STDOUT_FILENO``, or standard + output) specifies which file descriptor should be queried. + + If the file descriptor is not connected to a terminal, an :exc:`OSError` + is raised. + + :func:`shutil.get_terminal_size` is the high-level function which + should normally be used, ``os.get_terminal_size`` is the low-level + implementation. + + Availability: Unix, Windows. + +.. class:: terminal_size + + A subclass of tuple, holding ``(columns, lines)`` of the terminal window size. + + .. attribute:: columns + + Width of the terminal window in characters. + + .. attribute:: lines + + Height of the terminal window in characters. + + .. _os-file-dir: Files and Directories --------------------- -.. function:: access(path, mode) +On some Unix platforms, many of these functions support one or more of these +features: + +.. _path_fd: + +* **specifying a file descriptor:** + For some functions, the *path* argument can be not only a string giving a path + name, but also a file descriptor. The function will then operate on the file + referred to by the descriptor. (For POSIX systems, Python will call the + ``f...`` version of the function.) + + You can check whether or not *path* can be specified as a file descriptor on + your platform using :data:`os.supports_fd`. If it is unavailable, using it + will raise a :exc:`NotImplementedError`. + + If the function also supports *dir_fd* or *follow_symlinks* arguments, it is + an error to specify one of those when supplying *path* as a file descriptor. + +.. _dir_fd: + +* **paths relative to directory descriptors:** If *dir_fd* is not ``None``, it + should be a file descriptor referring to a directory, and the path to operate + on should be relative; path will then be relative to that directory. If the + path is absolute, *dir_fd* is ignored. (For POSIX systems, Python will call + the ``...at`` or ``f...at`` version of the function.) + + You can check whether or not *dir_fd* is supported on your platform using + :data:`os.supports_dir_fd`. If it is unavailable, using it will raise a + :exc:`NotImplementedError`. + +.. _follow_symlinks: + +* **not following symlinks:** If *follow_symlinks* is + ``False``, and the last element of the path to operate on is a symbolic link, + the function will operate on the symbolic link itself instead of the file the + link points to. (For POSIX systems, Python will call the ``l...`` version of + the function.) + + You can check whether or not *follow_symlinks* is supported on your platform + using :data:`os.supports_follow_symlinks`. If it is unavailable, using it + will raise a :exc:`NotImplementedError`. + + + +.. function:: access(path, mode, *, dir_fd=None, effective_ids=False, follow_symlinks=True) Use the real uid/gid to test for access to *path*. Note that most operations will use the effective uid/gid, therefore this routine can be used in a @@ -898,6 +1255,15 @@ Files and Directories :const:`False` if not. See the Unix man page :manpage:`access(2)` for more information. + This function can support specifying :ref:`paths relative to directory + descriptors <dir_fd>` and :ref:`not following symlinks <follow_symlinks>`. + + If *effective_ids* is ``True``, :func:`access` will perform its access + checks using the effective uid/gid instead of the real uid/gid. + *effective_ids* may not be supported on your platform; you can check whether + or not it is available using :data:`os.supports_effective_ids`. If it is + unavailable, using it will raise a :exc:`NotImplementedError`. + Availability: Unix, Windows. .. note:: @@ -917,11 +1283,8 @@ Files and Directories try: fp = open("myfile") - except IOError as e: - if e.errno == errno.EACCES: - return "some default data" - # Not a permission error. - raise + except PermissionError: + return "some default data" else: with fp: return fp.read() @@ -932,29 +1295,18 @@ Files and Directories succeed, particularly for operations on network filesystems which may have permissions semantics beyond the usual POSIX permission-bit model. + .. versionchanged:: 3.3 + Added the *dir_fd*, *effective_ids*, and *follow_symlinks* parameters. -.. data:: F_OK - - Value to pass as the *mode* parameter of :func:`access` to test the existence of - *path*. - - -.. data:: R_OK - - Value to include in the *mode* parameter of :func:`access` to test the - readability of *path*. - - -.. data:: W_OK - - Value to include in the *mode* parameter of :func:`access` to test the - writability of *path*. +.. data:: F_OK + R_OK + W_OK + X_OK -.. data:: X_OK - - Value to include in the *mode* parameter of :func:`access` to determine if - *path* can be executed. + Values to pass as the *mode* parameter of :func:`access` to test the + existence, readability, writability and executability of *path*, + respectively. .. function:: chdir(path) @@ -963,33 +1315,17 @@ Files and Directories Change the current working directory to *path*. - Availability: Unix, Windows. - - -.. function:: fchdir(fd) - - Change the current working directory to the directory represented by the file - descriptor *fd*. The descriptor must refer to an opened directory, not an open - file. - - Availability: Unix. - - -.. function:: getcwd() - - Return a string representing the current working directory. + This function can support :ref:`specifying a file descriptor <path_fd>`. The + descriptor must refer to an opened directory, not an open file. Availability: Unix, Windows. - -.. function:: getcwdb() - - Return a bytestring representing the current working directory. - - Availability: Unix, Windows. + .. versionadded:: 3.3 + Added support for specifying *path* as a file descriptor + on some platforms. -.. function:: chflags(path, flags) +.. function:: chflags(path, flags, *, follow_symlinks=True) Set the flags of *path* to the numeric *flags*. *flags* may take a combination (bitwise OR) of the following values (as defined in the :mod:`stat` module): @@ -1007,16 +1343,15 @@ Files and Directories * :data:`stat.SF_NOUNLINK` * :data:`stat.SF_SNAPSHOT` - Availability: Unix. - + This function can support :ref:`not following symlinks <follow_symlinks>`. -.. function:: chroot(path) + Availability: Unix. - Change the root directory of the current process to *path*. Availability: - Unix. + .. versionadded:: 3.3 + The *follow_symlinks* argument. -.. function:: chmod(path, mode) +.. function:: chmod(path, mode, *, dir_fd=None, follow_symlinks=True) Change the mode of *path* to the numeric *mode*. *mode* may take one of the following values (as defined in the :mod:`stat` module) or bitwise ORed @@ -1042,28 +1377,77 @@ Files and Directories * :data:`stat.S_IWOTH` * :data:`stat.S_IXOTH` + This function can support :ref:`specifying a file descriptor <path_fd>`, + :ref:`paths relative to directory descriptors <dir_fd>` and :ref:`not + following symlinks <follow_symlinks>`. + Availability: Unix, Windows. .. note:: - Although Windows supports :func:`chmod`, you can only set the file's read-only - flag with it (via the ``stat.S_IWRITE`` and ``stat.S_IREAD`` - constants or a corresponding integer value). All other bits are - ignored. + Although Windows supports :func:`chmod`, you can only set the file's + read-only flag with it (via the ``stat.S_IWRITE`` and ``stat.S_IREAD`` + constants or a corresponding integer value). All other bits are ignored. + + .. versionadded:: 3.3 + Added support for specifying *path* as an open file descriptor, + and the *dir_fd* and *follow_symlinks* arguments. + + +.. function:: chown(path, uid, gid, *, dir_fd=None, follow_symlinks=True) + + Change the owner and group id of *path* to the numeric *uid* and *gid*. To + leave one of the ids unchanged, set it to -1. + + This function can support :ref:`specifying a file descriptor <path_fd>`, + :ref:`paths relative to directory descriptors <dir_fd>` and :ref:`not + following symlinks <follow_symlinks>`. + + See :func:`shutil.chown` for a higher-level function that accepts names in + addition to numeric ids. + + Availability: Unix. + + .. versionadded:: 3.3 + Added support for specifying an open file descriptor for *path*, + and the *dir_fd* and *follow_symlinks* arguments. + + +.. function:: chroot(path) + + Change the root directory of the current process to *path*. + Availability: Unix. -.. function:: chown(path, uid, gid) - Change the owner and group id of *path* to the numeric *uid* and *gid*. To leave - one of the ids unchanged, set it to -1. +.. function:: fchdir(fd) + + Change the current working directory to the directory represented by the file + descriptor *fd*. The descriptor must refer to an opened directory, not an + open file. As of Python 3.3, this is equivalent to ``os.chdir(fd)``. Availability: Unix. +.. function:: getcwd() + + Return a string representing the current working directory. + + Availability: Unix, Windows. + + +.. function:: getcwdb() + + Return a bytestring representing the current working directory. + + Availability: Unix, Windows. + + .. function:: lchflags(path, flags) - Set the flags of *path* to the numeric *flags*, like :func:`chflags`, but do not - follow symbolic links. + Set the flags of *path* to the numeric *flags*, like :func:`chflags`, but do + not follow symbolic links. As of Python 3.3, this is equivalent to + ``os.chflags(path, flags, follow_symlinks=False)``. Availability: Unix. @@ -1071,110 +1455,101 @@ Files and Directories .. function:: lchmod(path, mode) Change the mode of *path* to the numeric *mode*. If path is a symlink, this - affects the symlink rather than the target. See the docs for :func:`chmod` - for possible values of *mode*. + affects the symlink rather than the target. See the docs for :func:`chmod` + for possible values of *mode*. As of Python 3.3, this is equivalent to + ``os.chmod(path, mode, follow_symlinks=False)``. Availability: Unix. .. function:: lchown(path, uid, gid) - Change the owner and group id of *path* to the numeric *uid* and *gid*. This - function will not follow symbolic links. + Change the owner and group id of *path* to the numeric *uid* and *gid*. This + function will not follow symbolic links. As of Python 3.3, this is equivalent + to ``os.chown(path, uid, gid, follow_symlinks=False)``. Availability: Unix. -.. function:: link(source, link_name) +.. function:: link(src, dst, *, src_dir_fd=None, dst_dir_fd=None, follow_symlinks=True) + + Create a hard link pointing to *src* named *dst*. - Create a hard link pointing to *source* named *link_name*. + This function can support specifying *src_dir_fd* and/or *dst_dir_fd* to + supply :ref:`paths relative to directory descriptors <dir_fd>`, and :ref:`not + following symlinks <follow_symlinks>`. Availability: Unix, Windows. .. versionchanged:: 3.2 Added Windows support. + .. versionadded:: 3.3 + Added the *src_dir_fd*, *dst_dir_fd*, and *follow_symlinks* arguments. + .. function:: listdir(path='.') Return a list containing the names of the entries in the directory given by - *path* (default: ``'.'``). The list is in arbitrary order. It does not include the special + *path*. The list is in arbitrary order, and does not include the special entries ``'.'`` and ``'..'`` even if they are present in the directory. - This function can be called with a bytes or string argument, and returns - filenames of the same datatype. + *path* may be either of type ``str`` or of type ``bytes``. If *path* + is of type ``bytes``, the filenames returned will also be of type ``bytes``; + in all other circumstances, they will be of type ``str``. + + This function can also support :ref:`specifying a file descriptor + <path_fd>`; the file descriptor must refer to a directory. + + .. note:: + To encode ``str`` filenames to ``bytes``, use :func:`~os.fsencode`. Availability: Unix, Windows. .. versionchanged:: 3.2 The *path* parameter became optional. -.. function:: lstat(path) + .. versionadded:: 3.3 + Added support for specifying an open file descriptor for *path*. + + +.. function:: lstat(path, *, dir_fd=None) Perform the equivalent of an :c:func:`lstat` system call on the given path. Similar to :func:`~os.stat`, but does not follow symbolic links. On platforms that do not support symbolic links, this is an alias for - :func:`~os.stat`. + :func:`~os.stat`. As of Python 3.3, this is equivalent to ``os.stat(path, + dir_fd=dir_fd, follow_symlinks=False)``. + + This function can also support :ref:`paths relative to directory descriptors + <dir_fd>`. .. versionchanged:: 3.2 Added support for Windows 6.0 (Vista) symbolic links. - -.. function:: mkfifo(path[, mode]) - - Create a FIFO (a named pipe) named *path* with numeric mode *mode*. The - default *mode* is ``0o666`` (octal). The current umask value is first masked - out from the mode. - - FIFOs are pipes that can be accessed like regular files. FIFOs exist until they - are deleted (for example with :func:`os.unlink`). Generally, FIFOs are used as - rendezvous between "client" and "server" type processes: the server opens the - FIFO for reading, and the client opens it for writing. Note that :func:`mkfifo` - doesn't open the FIFO --- it just creates the rendezvous point. - - Availability: Unix. - - -.. function:: mknod(filename[, mode=0o600[, device=0]]) - - Create a filesystem node (file, device special file or named pipe) named - *filename*. *mode* specifies both the permissions to use and the type of node - to be created, being combined (bitwise OR) with one of ``stat.S_IFREG``, - ``stat.S_IFCHR``, ``stat.S_IFBLK``, and ``stat.S_IFIFO`` (those constants are - available in :mod:`stat`). For ``stat.S_IFCHR`` and ``stat.S_IFBLK``, - *device* defines the newly created device special file (probably using - :func:`os.makedev`), otherwise it is ignored. - - -.. function:: major(device) - - Extract the device major number from a raw device number (usually the - :attr:`st_dev` or :attr:`st_rdev` field from :c:type:`stat`). + .. versionchanged:: 3.3 + Added the *dir_fd* parameter. -.. function:: minor(device) +.. function:: mkdir(path, mode=0o777, *, dir_fd=None) - Extract the device minor number from a raw device number (usually the - :attr:`st_dev` or :attr:`st_rdev` field from :c:type:`stat`). + Create a directory named *path* with numeric mode *mode*. + On some systems, *mode* is ignored. Where it is used, the current umask + value is first masked out. If the directory already exists, :exc:`OSError` + is raised. -.. function:: makedev(major, minor) - - Compose a raw device number from the major and minor device numbers. - - -.. function:: mkdir(path[, mode]) - - Create a directory named *path* with numeric mode *mode*. The default *mode* - is ``0o777`` (octal). On some systems, *mode* is ignored. Where it is used, - the current umask value is first masked out. If the directory already - exists, :exc:`OSError` is raised. + This function can also support :ref:`paths relative to directory descriptors + <dir_fd>`. It is also possible to create temporary directories; see the :mod:`tempfile` module's :func:`tempfile.mkdtemp` function. Availability: Unix, Windows. + .. versionadded:: 3.3 + The *dir_fd* argument. + .. function:: makedirs(path, mode=0o777, exist_ok=False) @@ -1201,12 +1576,66 @@ Files and Directories .. versionadded:: 3.2 The *exist_ok* parameter. - .. versionchanged:: 3.2.6 + .. versionchanged:: 3.3.6 - Before Python 3.2.6, if *exist_ok* was ``True`` and the directory existed, + Before Python 3.3.6, if *exist_ok* was ``True`` and the directory existed, :func:`makedirs` would still raise an error if *mode* did not match the mode of the existing directory. Since this behavior was impossible to - implement safely, it was removed in Python 3.2.6. See :issue:`21082`. + implement safely, it was removed in Python 3.3.6. See :issue:`21082`. + + +.. function:: mkfifo(path, mode=0o666, *, dir_fd=None) + + Create a FIFO (a named pipe) named *path* with numeric mode *mode*. + The current umask value is first masked out from the mode. + + This function can also support :ref:`paths relative to directory descriptors + <dir_fd>`. + + FIFOs are pipes that can be accessed like regular files. FIFOs exist until they + are deleted (for example with :func:`os.unlink`). Generally, FIFOs are used as + rendezvous between "client" and "server" type processes: the server opens the + FIFO for reading, and the client opens it for writing. Note that :func:`mkfifo` + doesn't open the FIFO --- it just creates the rendezvous point. + + Availability: Unix. + + .. versionadded:: 3.3 + The *dir_fd* argument. + + +.. function:: mknod(filename, mode=0o600, device=0, *, dir_fd=None) + + Create a filesystem node (file, device special file or named pipe) named + *filename*. *mode* specifies both the permissions to use and the type of node + to be created, being combined (bitwise OR) with one of ``stat.S_IFREG``, + ``stat.S_IFCHR``, ``stat.S_IFBLK``, and ``stat.S_IFIFO`` (those constants are + available in :mod:`stat`). For ``stat.S_IFCHR`` and ``stat.S_IFBLK``, + *device* defines the newly created device special file (probably using + :func:`os.makedev`), otherwise it is ignored. + + This function can also support :ref:`paths relative to directory descriptors + <dir_fd>`. + + .. versionadded:: 3.3 + The *dir_fd* argument. + + +.. function:: major(device) + + Extract the device major number from a raw device number (usually the + :attr:`st_dev` or :attr:`st_rdev` field from :c:type:`stat`). + + +.. function:: minor(device) + + Extract the device minor number from a raw device number (usually the + :attr:`st_dev` or :attr:`st_rdev` field from :c:type:`stat`). + + +.. function:: makedev(major, minor) + + Compose a raw device number from the major and minor device numbers. .. function:: pathconf(path, name) @@ -1224,6 +1653,9 @@ Files and Directories included in ``pathconf_names``, an :exc:`OSError` is raised with :const:`errno.EINVAL` for the error number. + This function can support :ref:`specifying a file descriptor + <path_fd>`. + Availability: Unix. @@ -1231,38 +1663,53 @@ Files and Directories Dictionary mapping names accepted by :func:`pathconf` and :func:`fpathconf` to the integer values defined for those names by the host operating system. This - can be used to determine the set of names known to the system. Availability: - Unix. + can be used to determine the set of names known to the system. + + Availability: Unix. -.. function:: readlink(path) +.. function:: readlink(path, *, dir_fd=None) Return a string representing the path to which the symbolic link points. The - result may be either an absolute or relative pathname; if it is relative, it may - be converted to an absolute pathname using ``os.path.join(os.path.dirname(path), - result)``. + result may be either an absolute or relative pathname; if it is relative, it + may be converted to an absolute pathname using + ``os.path.join(os.path.dirname(path), result)``. If the *path* is a string object, the result will also be a string object, and the call may raise an UnicodeDecodeError. If the *path* is a bytes object, the result will be a bytes object. + This function can also support :ref:`paths relative to directory descriptors + <dir_fd>`. + Availability: Unix, Windows .. versionchanged:: 3.2 Added support for Windows 6.0 (Vista) symbolic links. + .. versionadded:: 3.3 + The *dir_fd* argument. + -.. function:: remove(path) +.. function:: remove(path, *, dir_fd=None) Remove (delete) the file *path*. If *path* is a directory, :exc:`OSError` is - raised; see :func:`rmdir` below to remove a directory. This is identical to - the :func:`unlink` function documented below. On Windows, attempting to - remove a file that is in use causes an exception to be raised; on Unix, the - directory entry is removed but the storage allocated to the file is not made - available until the original file is no longer in use. + raised. Use :func:`rmdir` to remove directories. + + This function can support :ref:`paths relative to directory descriptors + <dir_fd>`. + + On Windows, attempting to remove a file that is in use causes an exception to + be raised; on Unix, the directory entry is removed but the storage allocated + to the file is not made available until the original file is no longer in use. + + This function is identical to :func:`unlink`. Availability: Unix, Windows. + .. versionadded:: 3.3 + The *dir_fd* argument. + .. function:: removedirs(path) @@ -1278,7 +1725,7 @@ Files and Directories successfully removed. -.. function:: rename(src, dst) +.. function:: rename(src, dst, *, src_dir_fd=None, dst_dir_fd=None) Rename the file or directory *src* to *dst*. If *dst* is a directory, :exc:`OSError` will be raised. On Unix, if *dst* exists and is a file, it will @@ -1286,11 +1733,18 @@ Files and Directories Unix flavors if *src* and *dst* are on different filesystems. If successful, the renaming will be an atomic operation (this is a POSIX requirement). On Windows, if *dst* already exists, :exc:`OSError` will be raised even if it is a - file; there may be no way to implement an atomic rename when *dst* names an - existing file. + file. + + This function can support specifying *src_dir_fd* and/or *dst_dir_fd* to + supply :ref:`paths relative to directory descriptors <dir_fd>`. + + If you want cross-platform overwriting of the destination, use :func:`replace`. Availability: Unix, Windows. + .. versionadded:: 3.3 + The *src_dir_fd* and *dst_dir_fd* arguments. + .. function:: renames(old, new) @@ -1305,22 +1759,46 @@ Files and Directories permissions needed to remove the leaf directory or file. -.. function:: rmdir(path) +.. function:: replace(src, dst, *, src_dir_fd=None, dst_dir_fd=None) + + Rename the file or directory *src* to *dst*. If *dst* is a directory, + :exc:`OSError` will be raised. If *dst* exists and is a file, it will + be replaced silently if the user has permission. The operation may fail + if *src* and *dst* are on different filesystems. If successful, + the renaming will be an atomic operation (this is a POSIX requirement). + + This function can support specifying *src_dir_fd* and/or *dst_dir_fd* to + supply :ref:`paths relative to directory descriptors <dir_fd>`. + + Availability: Unix, Windows. + + .. versionadded:: 3.3 + + +.. function:: rmdir(path, *, dir_fd=None) Remove (delete) the directory *path*. Only works when the directory is empty, otherwise, :exc:`OSError` is raised. In order to remove whole directory trees, :func:`shutil.rmtree` can be used. + This function can support :ref:`paths relative to directory descriptors + <dir_fd>`. + Availability: Unix, Windows. + .. versionadded:: 3.3 + The *dir_fd* parameter. -.. function:: stat(path) + +.. function:: stat(path, *, dir_fd=None, follow_symlinks=True) Perform the equivalent of a :c:func:`stat` system call on the given path. - (This function follows symlinks; to stat a symlink use :func:`lstat`.) + *path* may be specified as either a string or as an open file descriptor. + (This function normally follows symlinks; to stat a symlink add the argument + ``follow_symlinks=False``, or use :func:`lstat`.) - The return value is an object whose attributes correspond to the members - of the :c:type:`stat` structure, namely: + The return value is an object whose attributes correspond roughly + to the members of the :c:type:`stat` structure, namely: * :attr:`st_mode` - protection bits, * :attr:`st_ino` - inode number, @@ -1329,16 +1807,24 @@ Files and Directories * :attr:`st_uid` - user id of owner, * :attr:`st_gid` - group id of owner, * :attr:`st_size` - size of file, in bytes, - * :attr:`st_atime` - time of most recent access, - * :attr:`st_mtime` - time of most recent content modification, - * :attr:`st_ctime` - platform dependent; time of most recent metadata change on - Unix, or the time of creation on Windows) + * :attr:`st_atime` - time of most recent access expressed in seconds, + * :attr:`st_mtime` - time of most recent content modification + expressed in seconds, + * :attr:`st_ctime` - platform dependent; time of most recent metadata + change on Unix, or the time of creation on Windows, expressed in seconds + * :attr:`st_atime_ns` - time of most recent access + expressed in nanoseconds as an integer, + * :attr:`st_mtime_ns` - time of most recent content modification + expressed in nanoseconds as an integer, + * :attr:`st_ctime_ns` - platform dependent; time of most recent metadata + change on Unix, or the time of creation on Windows, + expressed in nanoseconds as an integer On some Unix systems (such as Linux), the following attributes may also be available: - * :attr:`st_blocks` - number of blocks allocated for file - * :attr:`st_blksize` - filesystem blocksize + * :attr:`st_blocks` - number of 512-byte blocks allocated for file + * :attr:`st_blksize` - filesystem blocksize for efficient file system I/O * :attr:`st_rdev` - type of device if an inode device * :attr:`st_flags` - user defined flags for file @@ -1362,13 +1848,25 @@ Files and Directories or FAT32 file systems, :attr:`st_mtime` has 2-second resolution, and :attr:`st_atime` has only 1-day resolution. See your operating system documentation for details. - - For backward compatibility, the return value of :func:`~os.stat` is also accessible - as a tuple of at least 10 integers giving the most important (and portable) - members of the :c:type:`stat` structure, in the order :attr:`st_mode`, - :attr:`st_ino`, :attr:`st_dev`, :attr:`st_nlink`, :attr:`st_uid`, - :attr:`st_gid`, :attr:`st_size`, :attr:`st_atime`, :attr:`st_mtime`, - :attr:`st_ctime`. More items may be added at the end by some implementations. + Similarly, although :attr:`st_atime_ns`, :attr:`st_mtime_ns`, + and :attr:`st_ctime_ns` are always expressed in nanoseconds, many + systems do not provide nanosecond precision. On systems that do + provide nanosecond precision, the floating-point object used to + store :attr:`st_atime`, :attr:`st_mtime`, and :attr:`st_ctime` + cannot preserve all of it, and as such will be slightly inexact. + If you need the exact timestamps you should always use + :attr:`st_atime_ns`, :attr:`st_mtime_ns`, and :attr:`st_ctime_ns`. + + For backward compatibility, the return value of :func:`~os.stat` is also + accessible as a tuple of at least 10 integers giving the most important (and + portable) members of the :c:type:`stat` structure, in the order + :attr:`st_mode`, :attr:`st_ino`, :attr:`st_dev`, :attr:`st_nlink`, + :attr:`st_uid`, :attr:`st_gid`, :attr:`st_size`, :attr:`st_atime`, + :attr:`st_mtime`, :attr:`st_ctime`. More items may be added at the end by + some implementations. + + This function can support :ref:`specifying a file descriptor <path_fd>` and + :ref:`not following symlinks <follow_symlinks>`. .. index:: module: stat @@ -1389,6 +1887,12 @@ Files and Directories Availability: Unix, Windows. + .. versionadded:: 3.3 + Added the *dir_fd* and *follow_symlinks* arguments, + specifying a file descriptor instead of a path, + and the :attr:`st_atime_ns`, :attr:`st_mtime_ns`, + and :attr:`st_ctime_ns` members. + .. function:: stat_float_times([newvalue]) @@ -1414,6 +1918,8 @@ Files and Directories are processed, this application should turn the feature off until the library has been corrected. + .. deprecated:: 3.3 + .. function:: statvfs(path) @@ -1429,36 +1935,122 @@ Files and Directories read-only, and if :const:`ST_NOSUID` is set, the semantics of setuid/setgid bits are disabled or not supported. + This function can support :ref:`specifying a file descriptor <path_fd>`. + .. versionchanged:: 3.2 The :const:`ST_RDONLY` and :const:`ST_NOSUID` constants were added. Availability: Unix. + .. versionadded:: 3.3 + Added support for specifying an open file descriptor for *path*. -.. function:: symlink(source, link_name) - symlink(source, link_name, target_is_directory=False) - Create a symbolic link pointing to *source* named *link_name*. +.. data:: supports_dir_fd + + A :class:`~collections.abc.Set` object indicating which functions in the + :mod:`os` module permit use of their *dir_fd* parameter. Different platforms + provide different functionality, and an option that might work on one might + be unsupported on another. For consistency's sakes, functions that support + *dir_fd* always allow specifying the parameter, but will raise an exception + if the functionality is not actually available. + + To check whether a particular function permits use of its *dir_fd* + parameter, use the ``in`` operator on ``supports_dir_fd``. As an example, + this expression determines whether the *dir_fd* parameter of :func:`os.stat` + is locally available:: + + os.stat in os.supports_dir_fd + + Currently *dir_fd* parameters only work on Unix platforms; none of them work + on Windows. + + .. versionadded:: 3.3 + + +.. data:: supports_effective_ids + + A :class:`~collections.abc.Set` object indicating which functions in the + :mod:`os` module permit use of the *effective_ids* parameter for + :func:`os.access`. If the local platform supports it, the collection will + contain :func:`os.access`, otherwise it will be empty. + + To check whether you can use the *effective_ids* parameter for + :func:`os.access`, use the ``in`` operator on ``supports_dir_fd``, like so:: + + os.access in os.supports_effective_ids + + Currently *effective_ids* only works on Unix platforms; it does not work on + Windows. + + .. versionadded:: 3.3 + + +.. data:: supports_fd + + A :class:`~collections.abc.Set` object indicating which functions in the + :mod:`os` module permit specifying their *path* parameter as an open file + descriptor. Different platforms provide different functionality, and an + option that might work on one might be unsupported on another. For + consistency's sakes, functions that support *fd* always allow specifying + the parameter, but will raise an exception if the functionality is not + actually available. + + To check whether a particular function permits specifying an open file + descriptor for its *path* parameter, use the ``in`` operator on + ``supports_fd``. As an example, this expression determines whether + :func:`os.chdir` accepts open file descriptors when called on your local + platform:: + + os.chdir in os.supports_fd + + .. versionadded:: 3.3 + - On Windows, symlink version takes an additional optional parameter, - *target_is_directory*, which defaults to ``False``. +.. data:: supports_follow_symlinks - On Windows, a symlink represents a file or a directory, and does not morph to - the target dynamically. If *target_is_directory* is set to ``True``, the - symlink will be created as a directory symlink, otherwise as a file symlink - (the default). + A :class:`~collections.abc.Set` object indicating which functions in the + :mod:`os` module permit use of their *follow_symlinks* parameter. Different + platforms provide different functionality, and an option that might work on + one might be unsupported on another. For consistency's sakes, functions that + support *follow_symlinks* always allow specifying the parameter, but will + raise an exception if the functionality is not actually available. + + To check whether a particular function permits use of its *follow_symlinks* + parameter, use the ``in`` operator on ``supports_follow_symlinks``. As an + example, this expression determines whether the *follow_symlinks* parameter + of :func:`os.stat` is locally available:: + + os.stat in os.supports_follow_symlinks + + .. versionadded:: 3.3 + + +.. function:: symlink(source, link_name, target_is_directory=False, *, dir_fd=None) + + Create a symbolic link pointing to *source* named *link_name*. + + On Windows, a symlink represents either a file or a directory, and does not + morph to the target dynamically. If the target is present, the type of the + symlink will be created to match. Otherwise, the symlink will be created + as a directory if *target_is_directory* is ``True`` or a file symlink (the + default) otherwise. On non-Window platforms, *target_is_directory* is ignored. Symbolic link support was introduced in Windows 6.0 (Vista). :func:`symlink` will raise a :exc:`NotImplementedError` on Windows versions earlier than 6.0. + This function can support :ref:`paths relative to directory descriptors + <dir_fd>`. + .. note:: - The *SeCreateSymbolicLinkPrivilege* is required in order to successfully - create symlinks. This privilege is not typically granted to regular - users but is available to accounts which can escalate privileges to the - administrator level. Either obtaining the privilege or running your + On Windows, the *SeCreateSymbolicLinkPrivilege* is required in order to + successfully create symlinks. This privilege is not typically granted to + regular users but is available to accounts which can escalate privileges + to the administrator level. Either obtaining the privilege or running your application as an administrator are ways to successfully create symlinks. + :exc:`OSError` is raised when the function is called by an unprivileged user. @@ -1467,31 +2059,83 @@ Files and Directories .. versionchanged:: 3.2 Added support for Windows 6.0 (Vista) symbolic links. + .. versionadded:: 3.3 + Added the *dir_fd* argument, and now allow *target_is_directory* + on non-Windows platforms. + + +.. function:: sync() + + Force write of everything to disk. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: truncate(path, length) + + Truncate the file corresponding to *path*, so that it is at most + *length* bytes in size. + + This function can support :ref:`specifying a file descriptor <path_fd>`. + + Availability: Unix. + + .. versionadded:: 3.3 -.. function:: unlink(path) - Remove (delete) the file *path*. This is the same function as - :func:`remove`; the :func:`unlink` name is its traditional Unix - name. +.. function:: unlink(path, *, dir_fd=None) + + Remove (delete) the file *path*. This function is identical to + :func:`remove`; the ``unlink`` name is its traditional Unix + name. Please see the documentation for :func:`remove` for + further information. Availability: Unix, Windows. + .. versionadded:: 3.3 + The *dir_fd* parameter. + + +.. function:: utime(path, times=None, *, ns=None, dir_fd=None, follow_symlinks=True) + + Set the access and modified times of the file specified by *path*. + + :func:`utime` takes two optional parameters, *times* and *ns*. + These specify the times set on *path* and are used as follows: -.. function:: utime(path, times) + - If *ns* is not ``None``, + it must be a 2-tuple of the form ``(atime_ns, mtime_ns)`` + where each member is an int expressing nanoseconds. + - If *times* is not ``None``, + it must be a 2-tuple of the form ``(atime, mtime)`` + where each member is an int or float expressing seconds. + - If *times* and *ns* are both ``None``, + this is equivalent to specifying ``ns=(atime_ns, mtime_ns)`` + where both times are the current time. - Set the access and modified times of the file specified by *path*. If *times* - is ``None``, then the file's access and modified times are set to the current - time. (The effect is similar to running the Unix program :program:`touch` on - the path.) Otherwise, *times* must be a 2-tuple of numbers, of the form - ``(atime, mtime)`` which is used to set the access and modified times, - respectively. Whether a directory can be given for *path* depends on whether - the operating system implements directories as files (for example, Windows - does not). Note that the exact times you set here may not be returned by a - subsequent :func:`~os.stat` call, depending on the resolution with which your - operating system records access and modification times; see :func:`~os.stat`. + It is an error to specify tuples for both *times* and *ns*. + + Whether a directory can be given for *path* + depends on whether the operating system implements directories as files + (for example, Windows does not). Note that the exact times you set here may + not be returned by a subsequent :func:`~os.stat` call, depending on the + resolution with which your operating system records access and modification + times; see :func:`~os.stat`. The best way to preserve exact times is to + use the *st_atime_ns* and *st_mtime_ns* fields from the :func:`os.stat` + result object with the *ns* parameter to `utime`. + + This function can support :ref:`specifying a file descriptor <path_fd>`, + :ref:`paths relative to directory descriptors <dir_fd>` and :ref:`not + following symlinks <follow_symlinks>`. Availability: Unix, Windows. + .. versionadded:: 3.3 + Added support for specifying an open file descriptor for *path*, + and the *dir_fd*, *follow_symlinks*, and *ns* parameters. + .. function:: walk(top, topdown=True, onerror=None, followlinks=False) @@ -1538,9 +2182,9 @@ Files and Directories .. note:: - Be aware that setting *followlinks* to ``True`` can lead to infinite recursion if a - link points to a parent directory of itself. :func:`walk` does not keep track of - the directories it visited already. + Be aware that setting *followlinks* to ``True`` can lead to infinite + recursion if a link points to a parent directory of itself. :func:`walk` + does not keep track of the directories it visited already. .. note:: @@ -1576,6 +2220,137 @@ Files and Directories os.rmdir(os.path.join(root, name)) +.. function:: fwalk(top='.', topdown=True, onerror=None, *, follow_symlinks=False, dir_fd=None) + + .. index:: + single: directory; walking + single: directory; traversal + + This behaves exactly like :func:`walk`, except that it yields a 4-tuple + ``(dirpath, dirnames, filenames, dirfd)``, and it supports ``dir_fd``. + + *dirpath*, *dirnames* and *filenames* are identical to :func:`walk` output, + and *dirfd* is a file descriptor referring to the directory *dirpath*. + + This function always supports :ref:`paths relative to directory descriptors + <dir_fd>` and :ref:`not following symlinks <follow_symlinks>`. Note however + that, unlike other functions, the :func:`fwalk` default value for + *follow_symlinks* is ``False``. + + .. note:: + + Since :func:`fwalk` yields file descriptors, those are only valid until + the next iteration step, so you should duplicate them (e.g. with + :func:`dup`) if you want to keep them longer. + + This example displays the number of bytes taken by non-directory files in each + directory under the starting directory, except that it doesn't look under any + CVS subdirectory:: + + import os + for root, dirs, files, rootfd in os.fwalk('python/Lib/email'): + print(root, "consumes", end="") + print(sum([os.stat(name, dir_fd=rootfd).st_size for name in files]), + end="") + print("bytes in", len(files), "non-directory files") + if 'CVS' in dirs: + dirs.remove('CVS') # don't visit CVS directories + + In the next example, walking the tree bottom-up is essential: + :func:`rmdir` doesn't allow deleting a directory before the directory is + empty:: + + # Delete everything reachable from the directory named in "top", + # assuming there are no symbolic links. + # CAUTION: This is dangerous! For example, if top == '/', it + # could delete all your disk files. + import os + for root, dirs, files, rootfd in os.fwalk(top, topdown=False): + for name in files: + os.unlink(name, dir_fd=rootfd) + for name in dirs: + os.rmdir(name, dir_fd=rootfd) + + Availability: Unix. + + .. versionadded:: 3.3 + + +Linux extended attributes +~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. versionadded:: 3.3 + +These functions are all available on Linux only. + +.. function:: getxattr(path, attribute, *, follow_symlinks=True) + + Return the value of the extended filesystem attribute *attribute* for + *path*. *attribute* can be bytes or str. If it is str, it is encoded + with the filesystem encoding. + + This function can support :ref:`specifying a file descriptor <path_fd>` and + :ref:`not following symlinks <follow_symlinks>`. + + +.. function:: listxattr(path=None, *, follow_symlinks=True) + + Return a list of the extended filesystem attributes on *path*. The + attributes in the list are represented as strings decoded with the filesystem + encoding. If *path* is ``None``, :func:`listxattr` will examine the current + directory. + + This function can support :ref:`specifying a file descriptor <path_fd>` and + :ref:`not following symlinks <follow_symlinks>`. + + +.. function:: removexattr(path, attribute, *, follow_symlinks=True) + + Removes the extended filesystem attribute *attribute* from *path*. + *attribute* should be bytes or str. If it is a string, it is encoded + with the filesystem encoding. + + This function can support :ref:`specifying a file descriptor <path_fd>` and + :ref:`not following symlinks <follow_symlinks>`. + + +.. function:: setxattr(path, attribute, value, flags=0, *, follow_symlinks=True) + + Set the extended filesystem attribute *attribute* on *path* to *value*. + *attribute* must be a bytes or str with no embedded NULs. If it is a str, + it is encoded with the filesystem encoding. *flags* may be + :data:`XATTR_REPLACE` or :data:`XATTR_CREATE`. If :data:`XATTR_REPLACE` is + given and the attribute does not exist, ``EEXISTS`` will be raised. + If :data:`XATTR_CREATE` is given and the attribute already exists, the + attribute will not be created and ``ENODATA`` will be raised. + + This function can support :ref:`specifying a file descriptor <path_fd>` and + :ref:`not following symlinks <follow_symlinks>`. + + .. note:: + + A bug in Linux kernel versions less than 2.6.39 caused the flags argument + to be ignored on some filesystems. + + +.. data:: XATTR_SIZE_MAX + + The maximum size the value of an extended attribute can be. Currently, this + is 64 KiB on Linux. + + +.. data:: XATTR_CREATE + + This is a possible value for the flags argument in :func:`setxattr`. It + indicates the operation must create an attribute. + + +.. data:: XATTR_REPLACE + + This is a possible value for the flags argument in :func:`setxattr`. It + indicates the operation must replace an existing attribute. + + .. _os-process: Process Management @@ -1583,7 +2358,7 @@ Process Management These functions may be used to create and manage processes. -The various :func:`exec\*` functions take a list of arguments for the new +The various :func:`exec\* <execl>` functions take a list of arguments for the new program loaded into the process. In each case, the first of these arguments is passed to the new program as its own name rather than as an argument a user may have typed on a command line. For the C programmer, this is the ``argv[0]`` @@ -1621,9 +2396,9 @@ to be ignored. descriptors are not flushed, so if there may be data buffered on these open files, you should flush them using :func:`sys.stdout.flush` or :func:`os.fsync` before calling an - :func:`exec\*` function. + :func:`exec\* <execl>` function. - The "l" and "v" variants of the :func:`exec\*` functions differ in how + The "l" and "v" variants of the :func:`exec\* <execl>` functions differ in how command-line arguments are passed. The "l" variants are perhaps the easiest to work with if the number of parameters is fixed when the code is written; the individual parameters simply become additional parameters to the :func:`execl\*` @@ -1635,7 +2410,7 @@ to be ignored. The variants which include a "p" near the end (:func:`execlp`, :func:`execlpe`, :func:`execvp`, and :func:`execvpe`) will use the :envvar:`PATH` environment variable to locate the program *file*. When the - environment is being replaced (using one of the :func:`exec\*e` variants, + environment is being replaced (using one of the :func:`exec\*e <execl>` variants, discussed in the next paragraph), the new environment is used as the source of the :envvar:`PATH` variable. The other variants, :func:`execl`, :func:`execle`, :func:`execv`, and :func:`execve`, will not use the :envvar:`PATH` variable to @@ -1649,8 +2424,16 @@ to be ignored. :func:`execlp`, :func:`execv`, and :func:`execvp` all cause the new process to inherit the environment of the current process. + For :func:`execve` on some platforms, *path* may also be specified as an open + file descriptor. This functionality may not be supported on your platform; + you can check whether or not it is available using :data:`os.supports_fd`. + If it is unavailable, using it will raise a :exc:`NotImplementedError`. + Availability: Unix, Windows. + .. versionadded:: 3.3 + Added support for specifying an open file descriptor for *path* + for :func:`execve`. .. function:: _exit(n) @@ -1809,6 +2592,10 @@ written in Python, such as a mail server's external command delivery program. Note that some platforms including FreeBSD <= 6.3, Cygwin and OS/2 EMX have known issues when using fork() from a thread. + .. warning:: + + See :mod:`ssl` for applications that use the SSL module with fork(). + Availability: Unix. @@ -1840,6 +2627,8 @@ written in Python, such as a mail server's external command delivery program. will be set to *sig*. The Windows version of :func:`kill` additionally takes process handles to be killed. + See also :func:`signal.pthread_kill`. + .. versionadded:: 3.2 Windows support. @@ -1871,7 +2660,6 @@ written in Python, such as a mail server's external command delivery program. .. function:: popen(...) - :noindex: Run child processes, returning opened pipes for communications. These functions are described in section :ref:`os-newstreams`. @@ -1899,7 +2687,7 @@ written in Python, such as a mail server's external command delivery program. process. On Windows, the process id will actually be the process handle, so can be used with the :func:`waitpid` function. - The "l" and "v" variants of the :func:`spawn\*` functions differ in how + The "l" and "v" variants of the :func:`spawn\* <spawnl>` functions differ in how command-line arguments are passed. The "l" variants are perhaps the easiest to work with if the number of parameters is fixed when the code is written; the individual parameters simply become additional parameters to the @@ -1911,7 +2699,7 @@ written in Python, such as a mail server's external command delivery program. The variants which include a second "p" near the end (:func:`spawnlp`, :func:`spawnlpe`, :func:`spawnvp`, and :func:`spawnvpe`) will use the :envvar:`PATH` environment variable to locate the program *file*. When the - environment is being replaced (using one of the :func:`spawn\*e` variants, + environment is being replaced (using one of the :func:`spawn\*e <spawnl>` variants, discussed in the next paragraph), the new environment is used as the source of the :envvar:`PATH` variable. The other variants, :func:`spawnl`, :func:`spawnle`, :func:`spawnv`, and :func:`spawnve`, will not use the @@ -1945,7 +2733,7 @@ written in Python, such as a mail server's external command delivery program. .. data:: P_NOWAIT P_NOWAITO - Possible values for the *mode* parameter to the :func:`spawn\*` family of + Possible values for the *mode* parameter to the :func:`spawn\* <spawnl>` family of functions. If either of these values is given, the :func:`spawn\*` functions will return as soon as the new process has been created, with the process id as the return value. @@ -1955,7 +2743,7 @@ written in Python, such as a mail server's external command delivery program. .. data:: P_WAIT - Possible value for the *mode* parameter to the :func:`spawn\*` family of + Possible value for the *mode* parameter to the :func:`spawn\* <spawnl>` family of functions. If this is given as *mode*, the :func:`spawn\*` functions will not return until the new process has run to completion and will return the exit code of the process the run is successful, or ``-signal`` if a signal kills the @@ -1967,11 +2755,11 @@ written in Python, such as a mail server's external command delivery program. .. data:: P_DETACH P_OVERLAY - Possible values for the *mode* parameter to the :func:`spawn\*` family of + Possible values for the *mode* parameter to the :func:`spawn\* <spawnl>` family of functions. These are less portable than those listed above. :const:`P_DETACH` is similar to :const:`P_NOWAIT`, but the new process is detached from the console of the calling process. If :const:`P_OVERLAY` is used, the current - process will be replaced; the :func:`spawn\*` function will not return. + process will be replaced; the :func:`spawn\* <spawnl>` function will not return. Availability: Windows. @@ -2030,14 +2818,30 @@ written in Python, such as a mail server's external command delivery program. .. function:: times() - Return a 5-tuple of floating point numbers indicating accumulated (processor - or other) times, in seconds. The items are: user time, system time, - children's user time, children's system time, and elapsed real time since a - fixed point in the past, in that order. See the Unix manual page + Returns the current global process times. + The return value is an object with five attributes: + + * :attr:`user` - user time + * :attr:`system` - system time + * :attr:`children_user` - user time of all child processes + * :attr:`children_system` - system time of all child processes + * :attr:`elapsed` - elapsed real time since a fixed point in the past + + For backwards compatibility, this object also behaves like a five-tuple + containing :attr:`user`, :attr:`system`, :attr:`children_user`, + :attr:`children_system`, and :attr:`elapsed` in that order. + + See the Unix manual page :manpage:`times(2)` or the corresponding Windows Platform API documentation. - On Windows, only the first two items are filled, the others are zero. + On Windows, only :attr:`user` and :attr:`system` are known; the other + attributes are zero. + On OS/2, only :attr:`elapsed` is known; the other attributes are zero. - Availability: Unix, Windows + Availability: Unix, Windows. + + .. versionchanged:: 3.3 + Return type changed from a tuple to a tuple-like object + with named attributes. .. function:: wait() @@ -2050,6 +2854,58 @@ written in Python, such as a mail server's external command delivery program. Availability: Unix. +.. function:: waitid(idtype, id, options) + + Wait for the completion of one or more child processes. + *idtype* can be :data:`P_PID`, :data:`P_PGID` or :data:`P_ALL`. + *id* specifies the pid to wait on. + *options* is constructed from the ORing of one or more of :data:`WEXITED`, + :data:`WSTOPPED` or :data:`WCONTINUED` and additionally may be ORed with + :data:`WNOHANG` or :data:`WNOWAIT`. The return value is an object + representing the data contained in the :c:type:`siginfo_t` structure, namely: + :attr:`si_pid`, :attr:`si_uid`, :attr:`si_signo`, :attr:`si_status`, + :attr:`si_code` or ``None`` if :data:`WNOHANG` is specified and there are no + children in a waitable state. + + Availability: Unix. + + .. versionadded:: 3.3 + +.. data:: P_PID + P_PGID + P_ALL + + These are the possible values for *idtype* in :func:`waitid`. They affect + how *id* is interpreted. + + Availability: Unix. + + .. versionadded:: 3.3 + +.. data:: WEXITED + WSTOPPED + WNOWAIT + + Flags that can be used in *options* in :func:`waitid` that specify what + child signal to wait for. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: CLD_EXITED + CLD_DUMPED + CLD_TRAPPED + CLD_CONTINUED + + These are the possible values for :attr:`si_code` in the result returned by + :func:`waitid`. + + Availability: Unix. + + .. versionadded:: 3.3 + .. function:: waitpid(pid, options) @@ -2075,8 +2931,8 @@ written in Python, such as a mail server's external command delivery program. (shifting makes cross-platform use of the function easier). A *pid* less than or equal to ``0`` has no special meaning on Windows, and raises an exception. The value of integer *options* has no effect. *pid* can refer to any process whose - id is known, not necessarily a child process. The :func:`spawn` functions called - with :const:`P_NOWAIT` return suitable process handles. + id is known, not necessarily a child process. The :func:`spawn\* <spawnl>` + functions called with :const:`P_NOWAIT` return suitable process handles. .. function:: wait3(options) @@ -2084,8 +2940,9 @@ written in Python, such as a mail server's external command delivery program. Similar to :func:`waitpid`, except no process id argument is given and a 3-element tuple containing the child's process id, exit status indication, and resource usage information is returned. Refer to :mod:`resource`.\ - :func:`getrusage` for details on resource usage information. The option - argument is the same as that provided to :func:`waitpid` and :func:`wait4`. + :func:`~resource.getrusage` for details on resource usage information. The + option argument is the same as that provided to :func:`waitpid` and + :func:`wait4`. Availability: Unix. @@ -2094,9 +2951,9 @@ written in Python, such as a mail server's external command delivery program. Similar to :func:`waitpid`, except a 3-element tuple, containing the child's process id, exit status indication, and resource usage information is returned. - Refer to :mod:`resource`.\ :func:`getrusage` for details on resource usage - information. The arguments to :func:`wait4` are the same as those provided to - :func:`waitpid`. + Refer to :mod:`resource`.\ :func:`~resource.getrusage` for details on + resource usage information. The arguments to :func:`wait4` are the same + as those provided to :func:`waitpid`. Availability: Unix. @@ -2114,7 +2971,7 @@ written in Python, such as a mail server's external command delivery program. This option causes child processes to be reported if they have been continued from a job control stop since their status was last reported. - Availability: Some Unix systems. + Availability: some Unix systems. .. data:: WUNTRACED @@ -2191,6 +3048,129 @@ used to determine the disposition of a process. Availability: Unix. +Interface to the scheduler +-------------------------- + +These functions control how a process is allocated CPU time by the operating +system. They are only available on some Unix platforms. For more detailed +information, consult your Unix manpages. + +.. versionadded:: 3.3 + +The following scheduling policies are exposed if they are a supported by the +operating system. + +.. data:: SCHED_OTHER + + The default scheduling policy. + +.. data:: SCHED_BATCH + + Scheduling policy for CPU-intensive processes that tries to preserve + interactivity on the rest of the computer. + +.. data:: SCHED_IDLE + + Scheduling policy for extremely low priority background tasks. + +.. data:: SCHED_SPORADIC + + Scheduling policy for sporadic server programs. + +.. data:: SCHED_FIFO + + A First In First Out scheduling policy. + +.. data:: SCHED_RR + + A round-robin scheduling policy. + +.. data:: SCHED_RESET_ON_FORK + + This flag can OR'ed with any other scheduling policy. When a process with + this flag set forks, its child's scheduling policy and priority are reset to + the default. + + +.. class:: sched_param(sched_priority) + + This class represents tunable scheduling parameters used in + :func:`sched_setparam`, :func:`sched_setscheduler`, and + :func:`sched_getparam`. It is immutable. + + At the moment, there is only one possible parameter: + + .. attribute:: sched_priority + + The scheduling priority for a scheduling policy. + + +.. function:: sched_get_priority_min(policy) + + Get the minimum priority value for *policy*. *policy* is one of the + scheduling policy constants above. + + +.. function:: sched_get_priority_max(policy) + + Get the maximum priority value for *policy*. *policy* is one of the + scheduling policy constants above. + + +.. function:: sched_setscheduler(pid, policy, param) + + Set the scheduling policy for the process with PID *pid*. A *pid* of 0 means + the calling process. *policy* is one of the scheduling policy constants + above. *param* is a :class:`sched_param` instance. + + +.. function:: sched_getscheduler(pid) + + Return the scheduling policy for the process with PID *pid*. A *pid* of 0 + means the calling process. The result is one of the scheduling policy + constants above. + + +.. function:: sched_setparam(pid, param) + + Set a scheduling parameters for the process with PID *pid*. A *pid* of 0 means + the calling process. *param* is a :class:`sched_param` instance. + + +.. function:: sched_getparam(pid) + + Return the scheduling parameters as a :class:`sched_param` instance for the + process with PID *pid*. A *pid* of 0 means the calling process. + + +.. function:: sched_rr_get_interval(pid) + + Return the round-robin quantum in seconds for the process with PID *pid*. A + *pid* of 0 means the calling process. + + +.. function:: sched_yield() + + Voluntarily relinquish the CPU. + + +.. function:: sched_setaffinity(pid, mask) + + Restrict the process with PID *pid* (or the current process if zero) to a + set of CPUs. *mask* is an iterable of integers representing the set of + CPUs to which the process should be restricted. + + +.. function:: sched_getaffinity(pid) + + Return the set of CPUs the process with PID *pid* (or the current process + if zero) is restricted to. + + .. seealso:: + :func:`multiprocessing.cpu_count` returns the number of CPUs in the + system. + + .. _os-path: Miscellaneous System Information @@ -2215,7 +3195,7 @@ Miscellaneous System Information included in ``confstr_names``, an :exc:`OSError` is raised with :const:`errno.EINVAL` for the error number. - Availability: Unix + Availability: Unix. .. data:: confstr_names @@ -2306,8 +3286,9 @@ Higher-level operations on pathnames are defined in the :mod:`os.path` module. .. data:: defpath - The default search path used by :func:`exec\*p\*` and :func:`spawn\*p\*` if the - environment doesn't have a ``'PATH'`` key. Also available via :mod:`os.path`. + The default search path used by :func:`exec\*p\* <execl>` and + :func:`spawn\*p\* <spawnl>` if the environment doesn't have a ``'PATH'`` + key. Also available via :mod:`os.path`. .. data:: linesep @@ -2337,6 +3318,10 @@ Miscellaneous Functions This function returns random bytes from an OS-specific randomness source. The returned data should be unpredictable enough for cryptographic applications, - though its exact quality depends on the OS implementation. On a UNIX-like - system this will query /dev/urandom, and on Windows it will use CryptGenRandom. - If a randomness source is not found, :exc:`NotImplementedError` will be raised. + though its exact quality depends on the OS implementation. On a Unix-like + system this will query ``/dev/urandom``, and on Windows it will use + ``CryptGenRandom()``. If a randomness source is not found, + :exc:`NotImplementedError` will be raised. + + For an easy-to-use interface to the random number generator + provided by your platform, please see :class:`random.SystemRandom`. diff --git a/Doc/library/ossaudiodev.rst b/Doc/library/ossaudiodev.rst index ed84413..80551aa 100644 --- a/Doc/library/ossaudiodev.rst +++ b/Doc/library/ossaudiodev.rst @@ -38,6 +38,10 @@ the standard audio interface for Linux and recent versions of FreeBSD. This probably all warrants a footnote or two, but I don't understand things well enough right now to write it! --GPW +.. versionchanged:: 3.3 + Operations in this module now raise :exc:`OSError` where :exc:`IOError` + was raised. + .. seealso:: @@ -45,7 +49,7 @@ the standard audio interface for Linux and recent versions of FreeBSD. the official documentation for the OSS C API The module defines a large number of constants supplied by the OSS device - driver; see ``<sys/soundcard.h>`` on either Linux or FreeBSD for a listing . + driver; see ``<sys/soundcard.h>`` on either Linux or FreeBSD for a listing. :mod:`ossaudiodev` defines the following variables and functions: @@ -56,7 +60,7 @@ the standard audio interface for Linux and recent versions of FreeBSD. what went wrong. (If :mod:`ossaudiodev` receives an error from a system call such as - :c:func:`open`, :c:func:`write`, or :c:func:`ioctl`, it raises :exc:`IOError`. + :c:func:`open`, :c:func:`write`, or :c:func:`ioctl`, it raises :exc:`OSError`. Errors detected directly by :mod:`ossaudiodev` result in :exc:`OSSAudioError`.) (For backwards compatibility, the exception class is also available as @@ -165,11 +169,11 @@ and (read-only) attributes: be used in a :keyword:`with` statement. -The following methods each map to exactly one :func:`ioctl` system call. The +The following methods each map to exactly one :c:func:`ioctl` system call. The correspondence is obvious: for example, :meth:`setfmt` corresponds to the ``SNDCTL_DSP_SETFMT`` ioctl, and :meth:`sync` to ``SNDCTL_DSP_SYNC`` (this can be useful when consulting the OSS documentation). If the underlying -:func:`ioctl` fails, they all raise :exc:`IOError`. +:c:func:`ioctl` fails, they all raise :exc:`OSError`. .. method:: oss_audio_device.nonblock() @@ -298,7 +302,7 @@ simple calculations. fmt = dsp.setfmt(fmt) channels = dsp.channels(channels) - rate = dsp.rate(channels) + rate = dsp.rate(rate) .. method:: oss_audio_device.bufsize() @@ -345,7 +349,7 @@ The mixer object provides two file-like methods: .. method:: oss_mixer_device.close() This method closes the open mixer device file. Any further attempts to use the - mixer after this file is closed will raise an :exc:`IOError`. + mixer after this file is closed will raise an :exc:`OSError`. .. method:: oss_mixer_device.fileno() @@ -404,7 +408,7 @@ The remaining methods are specific to audio mixing: returned, but both volumes are the same. Raises :exc:`OSSAudioError` if an invalid control was is specified, or - :exc:`IOError` if an unsupported control is specified. + :exc:`OSError` if an unsupported control is specified. .. method:: oss_mixer_device.set(control, (left, right)) @@ -428,7 +432,7 @@ The remaining methods are specific to audio mixing: .. method:: oss_mixer_device.set_recsrc(bitmask) Call this function to specify a recording source. Returns a bitmask indicating - the new recording source (or sources) if successful; raises :exc:`IOError` if an + the new recording source (or sources) if successful; raises :exc:`OSError` if an invalid source was specified. To set the current recording source to the microphone input:: diff --git a/Doc/library/othergui.rst b/Doc/library/othergui.rst index da66003..eb49b99 100644 --- a/Doc/library/othergui.rst +++ b/Doc/library/othergui.rst @@ -19,8 +19,7 @@ available for Python: `PyGTK <http://www.pygtk.org/>`_ provides bindings for an older version of the library, GTK+ 2. It provides an object oriented interface that is slightly higher level than the C one. There are also bindings to - `GNOME <http://www.gnome.org>`_. One well known PyGTK application is - `PythonCAD <http://www.pythoncad.org/>`_. An online `tutorial + `GNOME <http://www.gnome.org>`_. An online `tutorial <http://www.pygtk.org/pygtk2tutorial/index.html>`_ is available. `PyQt <http://www.riverbankcomputing.co.uk/software/pyqt/>`_ diff --git a/Doc/library/pdb.rst b/Doc/library/pdb.rst index 1e9de63..f4e37ac 100644 --- a/Doc/library/pdb.rst +++ b/Doc/library/pdb.rst @@ -38,6 +38,11 @@ of the debugger is:: > <string>(1)?() (Pdb) +.. versionchanged:: 3.3 + Tab-completion via the :mod:`readline` module is available for commands and + command arguments, e.g. the current global and local names are offered as + arguments of the ``print`` command. + :file:`pdb.py` can also be invoked as a script to debug other scripts. For example:: diff --git a/Doc/library/pickle.rst b/Doc/library/pickle.rst index e250dcf..273fb34 100644 --- a/Doc/library/pickle.rst +++ b/Doc/library/pickle.rst @@ -12,16 +12,17 @@ .. module:: pickle :synopsis: Convert Python objects to streams of bytes and back. .. sectionauthor:: Jim Kerr <jbkerr@sr.hp.com>. -.. sectionauthor:: Barry Warsaw <barry@zope.com> +.. sectionauthor:: Barry Warsaw <barry@python.org> -The :mod:`pickle` module implements a fundamental, but powerful algorithm for -serializing and de-serializing a Python object structure. "Pickling" is the -process whereby a Python object hierarchy is converted into a byte stream, and -"unpickling" is the inverse operation, whereby a byte stream is converted back -into an object hierarchy. Pickling (and unpickling) is alternatively known as -"serialization", "marshalling," [#]_ or "flattening", however, to avoid -confusion, the terms used here are "pickling" and "unpickling".. +The :mod:`pickle` module implements binary protocols for serializing and +de-serializing a Python object structure. *"Pickling"* is the process +whereby a Python object hierarchy is converted into a byte stream, and +*"unpickling"* is the inverse operation, whereby a byte stream +(from a :term:`binary file` or :term:`bytes-like object`) is converted +back into an object hierarchy. Pickling (and unpickling) is alternatively +known as "serialization", "marshalling," [#]_ or "flattening"; however, to +avoid confusion, the terms used here are "pickling" and "unpickling". .. warning:: @@ -33,9 +34,8 @@ confusion, the terms used here are "pickling" and "unpickling".. Relationship to other Python modules ------------------------------------ -The :mod:`pickle` module has an transparent optimizer (:mod:`_pickle`) written -in C. It is used whenever available. Otherwise the pure Python implementation is -used. +Comparison with ``marshal`` +^^^^^^^^^^^^^^^^^^^^^^^^^^^ Python has a more primitive serialization module called :mod:`marshal`, but in general :mod:`pickle` should always be the preferred way to serialize Python @@ -69,17 +69,30 @@ The :mod:`pickle` module differs from :mod:`marshal` in several significant ways The :mod:`pickle` serialization format is guaranteed to be backwards compatible across Python releases. -Note that serialization is a more primitive notion than persistence; although -:mod:`pickle` reads and writes file objects, it does not handle the issue of -naming persistent objects, nor the (even more complicated) issue of concurrent -access to persistent objects. The :mod:`pickle` module can transform a complex -object into a byte stream and it can transform the byte stream into an object -with the same internal structure. Perhaps the most obvious thing to do with -these byte streams is to write them onto a file, but it is also conceivable to -send them across a network or store them in a database. The module -:mod:`shelve` provides a simple interface to pickle and unpickle objects on -DBM-style database files. +Comparison with ``json`` +^^^^^^^^^^^^^^^^^^^^^^^^ +There are fundamental differences between the pickle protocols and +`JSON (JavaScript Object Notation) <http://json.org>`_: + +* JSON is a text serialization format (it outputs unicode text, although + most of the time it is then encoded to ``utf-8``), while pickle is + a binary serialization format; + +* JSON is human-readable, while pickle is not; + +* JSON is interoperable and widely used outside of the Python ecosystem, + while pickle is Python-specific; + +* JSON, by default, can only represent a subset of the Python built-in + types, and no custom classes; pickle can represent an extremely large + number of Python types (many of them automatically, by clever usage + of Python's introspection facilities; complex cases can be tackled by + implementing :ref:`specific object APIs <pickle-inst>`). + +.. seealso:: + The :mod:`json` module: a standard library module allowing JSON + serialization and deserialization. Data stream format ------------------ @@ -117,6 +130,18 @@ There are currently 4 different protocols which can be used for pickling. the default as well as the current recommended protocol; use it whenever possible. +.. note:: + Serialization is a more primitive notion than persistence; although + :mod:`pickle` reads and writes file objects, it does not handle the issue of + naming persistent objects, nor the (even more complicated) issue of concurrent + access to persistent objects. The :mod:`pickle` module can transform a complex + object into a byte stream and it can transform the byte stream into an object + with the same internal structure. Perhaps the most obvious thing to do with + these byte streams is to write them onto a file, but it is also conceivable to + send them across a network or store them in a database. The :mod:`shelve` + module provides a simple interface to pickle and unpickle objects on + DBM-style database files. + Module Interface ---------------- @@ -161,7 +186,7 @@ process more convenient: :class:`io.BytesIO` instance, or any other custom object that meets this interface. - If *fix_imports* is True and *protocol* is less than 3, pickle will try to + If *fix_imports* is true and *protocol* is less than 3, pickle will try to map the new Python 3.x names to the old module names used in Python 2.x, so that the pickle data stream is readable with Python 2.x. @@ -178,7 +203,7 @@ process more convenient: supported. The higher the protocol used, the more recent the version of Python needed to read the pickle produced. - If *fix_imports* is True and *protocol* is less than 3, pickle will try to + If *fix_imports* is true and *protocol* is less than 3, pickle will try to map the new Python 3.x names to the old module names used in Python 2.x, so that the pickle data stream is readable with Python 2.x. @@ -200,7 +225,7 @@ process more convenient: Optional keyword arguments are *fix_imports*, *encoding* and *errors*, which are used to control compatibility support for pickle stream generated - by Python 2.x. If *fix_imports* is True, pickle will try to map the old + by Python 2.x. If *fix_imports* is true, pickle will try to map the old Python 2.x names to the new names used in Python 3.x. The *encoding* and *errors* tell pickle how to decode 8-bit string instances pickled by Python 2.x; these default to 'ASCII' and 'strict', respectively. @@ -216,7 +241,7 @@ process more convenient: Optional keyword arguments are *fix_imports*, *encoding* and *errors*, which are used to control compatibility support for pickle stream generated - by Python 2.x. If *fix_imports* is True, pickle will try to map the old + by Python 2.x. If *fix_imports* is true, pickle will try to map the old Python 2.x names to the new names used in Python 3.x. The *encoding* and *errors* tell pickle how to decode 8-bit string instances pickled by Python 2.x; these default to 'ASCII' and 'strict', respectively. @@ -266,7 +291,7 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and argument. It can thus be an on-disk file opened for binary writing, a :class:`io.BytesIO` instance, or any other custom object that meets this interface. - If *fix_imports* is True and *protocol* is less than 3, pickle will try to + If *fix_imports* is true and *protocol* is less than 3, pickle will try to map the new Python 3.x names to the old module names used in Python 2.x, so that the pickle data stream is readable with Python 2.x. @@ -287,6 +312,29 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and See :ref:`pickle-persistent` for details and examples of uses. + .. attribute:: dispatch_table + + A pickler object's dispatch table is a registry of *reduction + functions* of the kind which can be declared using + :func:`copyreg.pickle`. It is a mapping whose keys are classes + and whose values are reduction functions. A reduction function + takes a single argument of the associated class and should + conform to the same interface as a :meth:`__reduce__` + method. + + By default, a pickler object will not have a + :attr:`dispatch_table` attribute, and it will instead use the + global dispatch table managed by the :mod:`copyreg` module. + However, to customize the pickling for a specific pickler object + one can set the :attr:`dispatch_table` attribute to a dict-like + object. Alternatively, if a subclass of :class:`Pickler` has a + :attr:`dispatch_table` attribute then this will be used as the + default dispatch table for instances of that class. + + See :ref:`pickle-dispatch` for usage examples. + + .. versionadded:: 3.3 + .. attribute:: fast Deprecated. Enable fast mode if set to a true value. The fast mode @@ -313,7 +361,7 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and Optional keyword arguments are *fix_imports*, *encoding* and *errors*, which are used to control compatibility support for pickle stream generated - by Python 2.x. If *fix_imports* is True, pickle will try to map the old + by Python 2.x. If *fix_imports* is true, pickle will try to map the old Python 2.x names to the new names used in Python 3.x. The *encoding* and *errors* tell pickle how to decode 8-bit string instances pickled by Python 2.x; these default to 'ASCII' and 'strict', respectively. @@ -367,8 +415,8 @@ The following types can be pickled: * classes that are defined at the top level of a module -* instances of such classes whose :attr:`__dict__` or the result of calling - :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for +* instances of such classes whose :attr:`~object.__dict__` or the result of + calling :meth:`__getstate__` is picklable (see section :ref:`pickle-inst` for details). Attempts to pickle unpicklable objects will raise the :exc:`PicklingError` @@ -412,6 +460,8 @@ conversions can be made by the class's :meth:`__setstate__` method. Pickling Class Instances ------------------------ +.. currentmodule:: None + In this section, we describe the general mechanisms available to you to define, customize, and control how class instances are pickled and unpickled. @@ -447,7 +497,7 @@ methods: defines the method :meth:`__getstate__`, it is called and the returned object is pickled as the contents for the instance, instead of the contents of the instance's dictionary. If the :meth:`__getstate__` method is absent, the - instance's :attr:`__dict__` is pickled as usual. + instance's :attr:`~object.__dict__` is pickled as usual. .. method:: object.__setstate__(state) @@ -516,7 +566,7 @@ or both. * Optionally, the object's state, which will be passed to the object's :meth:`__setstate__` method as previously described. If the object has no such method then, the value must be a dictionary and it will be added to - the object's :attr:`__dict__` attribute. + the object's :attr:`~object.__dict__` attribute. * Optionally, an iterator (and not a sequence) yielding successive items. These items will be appended to the object either using @@ -542,6 +592,8 @@ or both. the extended version. The main use for this method is to provide backwards-compatible reduce values for older Python releases. +.. currentmodule:: pickle + .. _pickle-persistent: Persistence of External Objects @@ -559,25 +611,63 @@ any newer protocol). The resolution of such persistent IDs is not defined by the :mod:`pickle` module; it will delegate this resolution to the user defined methods on the -pickler and unpickler, :meth:`persistent_id` and :meth:`persistent_load` -respectively. +pickler and unpickler, :meth:`~Pickler.persistent_id` and +:meth:`~Unpickler.persistent_load` respectively. To pickle objects that have an external persistent id, the pickler must have a -custom :meth:`persistent_id` method that takes an object as an argument and -returns either ``None`` or the persistent id for that object. When ``None`` is -returned, the pickler simply pickles the object as normal. When a persistent ID -string is returned, the pickler will pickle that object, along with a marker so -that the unpickler will recognize it as a persistent ID. +custom :meth:`~Pickler.persistent_id` method that takes an object as an +argument and returns either ``None`` or the persistent id for that object. +When ``None`` is returned, the pickler simply pickles the object as normal. +When a persistent ID string is returned, the pickler will pickle that object, +along with a marker so that the unpickler will recognize it as a persistent ID. To unpickle external objects, the unpickler must have a custom -:meth:`persistent_load` method that takes a persistent ID object and returns the -referenced object. +:meth:`~Unpickler.persistent_load` method that takes a persistent ID object and +returns the referenced object. Here is a comprehensive example presenting how persistent ID can be used to pickle external objects by reference. .. literalinclude:: ../includes/dbpickle.py +.. _pickle-dispatch: + +Dispatch Tables +^^^^^^^^^^^^^^^ + +If one wants to customize pickling of some classes without disturbing +any other code which depends on pickling, then one can create a +pickler with a private dispatch table. + +The global dispatch table managed by the :mod:`copyreg` module is +available as :data:`copyreg.dispatch_table`. Therefore, one may +choose to use a modified copy of :data:`copyreg.dispatch_table` as a +private dispatch table. + +For example :: + + f = io.BytesIO() + p = pickle.Pickler(f) + p.dispatch_table = copyreg.dispatch_table.copy() + p.dispatch_table[SomeClass] = reduce_SomeClass + +creates an instance of :class:`pickle.Pickler` with a private dispatch +table which handles the ``SomeClass`` class specially. Alternatively, +the code :: + + class MyPickler(pickle.Pickler): + dispatch_table = copyreg.dispatch_table.copy() + dispatch_table[SomeClass] = reduce_SomeClass + f = io.BytesIO() + p = MyPickler(f) + +does the same, but all instances of ``MyPickler`` will by default +share the same dispatch table. The equivalent code using the +:mod:`copyreg` module is :: + + copyreg.pickle(SomeClass, reduce_SomeClass) + f = io.BytesIO() + p = pickle.Pickler(f) .. _pickle-state: @@ -590,7 +680,7 @@ Handling Stateful Objects Here's an example that shows how to modify pickling behavior for a class. The :class:`TextReader` class opens a text file, and returns the line number and -line contents each time its :meth:`readline` method is called. If a +line contents each time its :meth:`!readline` method is called. If a :class:`TextReader` instance is pickled, all attributes *except* the file object member are saved. When the instance is unpickled, the file is reopened, and reading resumes from the last location. The :meth:`__setstate__` and @@ -669,9 +759,10 @@ apply the string argument "echo hello world". Although this example is inoffensive, it is not difficult to imagine one that could damage your system. For this reason, you may want to control what gets unpickled by customizing -:meth:`Unpickler.find_class`. Unlike its name suggests, :meth:`find_class` is -called whenever a global (i.e., a class or a function) is requested. Thus it is -possible to either completely forbid globals or restrict them to a safe subset. +:meth:`Unpickler.find_class`. Unlike its name suggests, +:meth:`Unpickler.find_class` is called whenever a global (i.e., a class or +a function) is requested. Thus it is possible to either completely forbid +globals or restrict them to a safe subset. Here is an example of an unpickler allowing only few safe classes from the :mod:`builtins` module to be loaded:: @@ -727,6 +818,14 @@ alternatives such as the marshalling API in :mod:`xmlrpc.client` or third-party solutions. +Performance +----------- + +Recent versions of the pickle protocol (from protocol 2 and upwards) feature +efficient binary encodings for several common features and built-in types. +Also, the :mod:`pickle` module has a transparent optimizer written in C. + + .. _pickle-example: Examples diff --git a/Doc/library/pkgutil.rst b/Doc/library/pkgutil.rst index 3118ff2..7c4f912 100644 --- a/Doc/library/pkgutil.rst +++ b/Doc/library/pkgutil.rst @@ -56,21 +56,32 @@ support. Note that :class:`ImpImporter` does not currently support being used by placement on :data:`sys.meta_path`. + .. deprecated:: 3.3 + This emulation is no longer needed, as the standard import mechanism + is now fully PEP 302 compliant and available in :mod:`importlib` + .. class:: ImpLoader(fullname, file, filename, etc) :pep:`302` Loader that wraps Python's "classic" import algorithm. + .. deprecated:: 3.3 + This emulation is no longer needed, as the standard import mechanism + is now fully PEP 302 compliant and available in :mod:`importlib` + .. function:: find_loader(fullname) - Find a :pep:`302` "loader" object for *fullname*. + Retrieve a :pep:`302` module loader for the given *fullname*. - If *fullname* contains dots, path must be the containing package's - ``__path__``. Returns ``None`` if the module cannot be found or imported. - This function uses :func:`iter_importers`, and is thus subject to the same - limitations regarding platform-specific special import locations such as the - Windows registry. + This is a convenience wrapper around :func:`importlib.find_loader` that + sets the *path* argument correctly when searching for submodules, and + also ensures parent packages (if any) are imported before searching for + submodules. + + .. versionchanged:: 3.3 + Updated to be based directly on :mod:`importlib` rather than relying + on the package internal PEP 302 import emulation. .. function:: get_importer(path_item) @@ -80,13 +91,13 @@ support. The returned importer is cached in :data:`sys.path_importer_cache` if it was newly created by a path hook. - If there is no importer, a wrapper around the basic import machinery is - returned. This wrapper is never inserted into the importer cache (``None`` - is inserted instead). - The cache (or part of it) can be cleared manually if a rescan of :data:`sys.path_hooks` is necessary. + .. versionchanged:: 3.3 + Updated to be based directly on :mod:`importlib` rather than relying + on the package internal PEP 302 import emulation. + .. function:: get_loader(module_or_name) @@ -102,46 +113,53 @@ support. limitations regarding platform-specific special import locations such as the Windows registry. + .. versionchanged:: 3.3 + Updated to be based directly on :mod:`importlib` rather than relying + on the package internal PEP 302 import emulation. + .. function:: iter_importers(fullname='') Yield :pep:`302` importers for the given module name. - If fullname contains a '.', the importers will be for the package containing - fullname, otherwise they will be importers for :data:`sys.meta_path`, - :data:`sys.path`, and Python's "classic" import machinery, in that order. If - the named module is in a package, that package is imported as a side effect - of invoking this function. + If fullname contains a '.', the importers will be for the package + containing fullname, otherwise they will be all registered top level + importers (i.e. those on both sys.meta_path and sys.path_hooks). - Non-:pep:`302` mechanisms (e.g. the Windows registry) used by the standard - import machinery to find files in alternative locations are partially - supported, but are searched *after* :data:`sys.path`. Normally, these - locations are searched *before* :data:`sys.path`, preventing :data:`sys.path` - entries from shadowing them. + If the named module is in a package, that package is imported as a side + effect of invoking this function. - For this to cause a visible difference in behaviour, there must be a module - or package name that is accessible via both :data:`sys.path` and one of the - non-:pep:`302` file system mechanisms. In this case, the emulation will find - the former version, while the builtin import mechanism will find the latter. + If no module name is specified, all top level importers are produced. - Items of the following types can be affected by this discrepancy: - ``imp.C_EXTENSION``, ``imp.PY_SOURCE``, ``imp.PY_COMPILED``, - ``imp.PKG_DIRECTORY``. + .. versionchanged:: 3.3 + Updated to be based directly on :mod:`importlib` rather than relying + on the package internal PEP 302 import emulation. .. function:: iter_modules(path=None, prefix='') - Yields ``(module_loader, name, ispkg)`` for all submodules on *path*, or, if + Yields ``(module_finder, name, ispkg)`` for all submodules on *path*, or, if path is ``None``, all top-level modules on ``sys.path``. *path* should be either ``None`` or a list of paths to look for modules in. *prefix* is a string to output on the front of every module name on output. + .. note:: + + Only works for a :term:`finder` which defines an ``iter_modules()`` + method. This interface is non-standard, so the module also provides + implementations for :class:`importlib.machinery.FileFinder` and + :class:`zipimport.zipimporter`. + + .. versionchanged:: 3.3 + Updated to be based directly on :mod:`importlib` rather than relying + on the package internal PEP 302 import emulation. + .. function:: walk_packages(path=None, prefix='', onerror=None) - Yields ``(module_loader, name, ispkg)`` for all modules recursively on + Yields ``(module_finder, name, ispkg)`` for all modules recursively on *path*, or, if path is ``None``, all accessible modules. *path* should be either ``None`` or a list of paths to look for modules in. @@ -166,6 +184,17 @@ support. # list all submodules of ctypes walk_packages(ctypes.__path__, ctypes.__name__ + '.') + .. note:: + + Only works for a :term:`finder` which defines an ``iter_modules()`` + method. This interface is non-standard, so the module also provides + implementations for :class:`importlib.machinery.FileFinder` and + :class:`zipimport.zipimporter`. + + .. versionchanged:: 3.3 + Updated to be based directly on :mod:`importlib` rather than relying + on the package internal PEP 302 import emulation. + .. function:: get_data(package, resource) diff --git a/Doc/library/platform.rst b/Doc/library/platform.rst index 157ac3a..e27f2ad 100644 --- a/Doc/library/platform.rst +++ b/Doc/library/platform.rst @@ -30,8 +30,8 @@ Cross Platform returned as strings. Values that cannot be determined are returned as given by the parameter presets. - If bits is given as ``''``, the :c:func:`sizeof(pointer)` (or - :c:func:`sizeof(long)` on Python version < 1.5.2) is used as indicator for the + If bits is given as ``''``, the ``sizeof(pointer)`` (or + ``sizeof(long)`` on Python version < 1.5.2) is used as indicator for the supported pointer size. The function relies on the system's :file:`file` command to do the actual work. @@ -158,14 +158,20 @@ Cross Platform .. function:: uname() - Fairly portable uname interface. Returns a tuple of strings ``(system, node, - release, version, machine, processor)`` identifying the underlying platform. + Fairly portable uname interface. Returns a :func:`~collections.namedtuple` + containing six attributes: :attr:`system`, :attr:`node`, :attr:`release`, + :attr:`version`, :attr:`machine`, and :attr:`processor`. - Note that unlike the :func:`os.uname` function this also returns possible - processor information as additional tuple entry. + Note that this adds a sixth attribute (:attr:`processor`) not present + in the :func:`os.uname` result. Also, the attribute names are different + for the first two attributes; :func:`os.uname` names them + :attr:`sysname` and :attr:`nodename`. Entries which cannot be determined are set to ``''``. + .. versionchanged:: 3.3 + Result changed from a tuple to a namedtuple. + Java Platform ------------- @@ -188,8 +194,8 @@ Windows Platform .. function:: win32_ver(release='', version='', csd='', ptype='') Get additional version information from the Windows Registry and return a tuple - ``(version, csd, ptype)`` referring to version number, CSD level - (service pack) and OS type (multi/single processor). + ``(release, version, csd, ptype)`` referring to OS release, version number, + CSD level (service pack) and OS type (multi/single processor). As a hint: *ptype* is ``'Uniprocessor Free'`` on single processor NT machines and ``'Multiprocessor Free'`` on multi processor machines. The *'Free'* refers @@ -214,6 +220,10 @@ Win95/98 specific preferring :func:`win32pipe.popen`. On Windows NT, :func:`win32pipe.popen` should work; on Windows 9x it hangs due to bugs in the MS C library. + .. deprecated:: 3.3 + This function is obsolete. Use the :mod:`subprocess` module. Check + especially the :ref:`subprocess-replacements` section. + Mac OS Platform --------------- diff --git a/Doc/library/plistlib.rst b/Doc/library/plistlib.rst index ae5e94d..a267368 100644 --- a/Doc/library/plistlib.rst +++ b/Doc/library/plistlib.rst @@ -47,8 +47,8 @@ This module defines the following functions: .. function:: readPlist(pathOrFile) - Read a plist file. *pathOrFile* may either be a file name or a (readable) - file object. Return the unpacked root object (which usually is a + Read a plist file. *pathOrFile* may either be a file name or a (readable and + binary) file object. Return the unpacked root object (which usually is a dictionary). The XML data is parsed using the Expat parser from :mod:`xml.parsers.expat` @@ -59,7 +59,7 @@ This module defines the following functions: .. function:: writePlist(rootObject, pathOrFile) Write *rootObject* to a plist file. *pathOrFile* may either be a file name - or a (writable) file object. + or a (writable and binary) file object. A :exc:`TypeError` will be raised if the object is of an unsupported type or a container that contains objects of unsupported types. diff --git a/Doc/library/posix.rst b/Doc/library/posix.rst index 07db2b2..06bab04 100644 --- a/Doc/library/posix.rst +++ b/Doc/library/posix.rst @@ -19,7 +19,7 @@ systems the :mod:`posix` module is not available, but a subset is always available through the :mod:`os` interface. Once :mod:`os` is imported, there is *no* performance penalty in using it instead of :mod:`posix`. In addition, :mod:`os` provides some additional functionality, such as automatically calling -:func:`putenv` when an entry in ``os.environ`` is changed. +:func:`~os.putenv` when an entry in ``os.environ`` is changed. Errors are reported as exceptions; the usual exceptions are given for type errors, while errors reported by the system calls raise :exc:`OSError`. @@ -37,7 +37,7 @@ Large File Support .. sectionauthor:: Steve Clift <clift@mail.anacapa.net> Several operating systems (including AIX, HP-UX, Irix and Solaris) provide -support for files that are larger than 2 GB from a C programming model where +support for files that are larger than 2 GiB from a C programming model where :c:type:`int` and :c:type:`long` are 32-bit values. This is typically accomplished by defining the relevant size and offset types as 64-bit values. Such files are sometimes referred to as :dfn:`large files`. @@ -74,9 +74,10 @@ In addition to many functions described in the :mod:`os` module documentation, pathname of your home directory, equivalent to ``getenv("HOME")`` in C. Modifying this dictionary does not affect the string environment passed on by - :func:`execv`, :func:`popen` or :func:`system`; if you need to change the - environment, pass ``environ`` to :func:`execve` or add variable assignments and - export statements to the command string for :func:`system` or :func:`popen`. + :func:`~os.execv`, :func:`~os.popen` or :func:`~os.system`; if you need to + change the environment, pass ``environ`` to :func:`~os.execve` or add + variable assignments and export statements to the command string for + :func:`~os.system` or :func:`~os.popen`. .. versionchanged:: 3.2 On Unix, keys and values are bytes. diff --git a/Doc/library/profile.rst b/Doc/library/profile.rst index 98dbabf..f2453f1 100644 --- a/Doc/library/profile.rst +++ b/Doc/library/profile.rst @@ -4,11 +4,6 @@ The Python Profilers ******************** -.. sectionauthor:: James Roskind - -.. module:: profile - :synopsis: Python source profiler. - **Source code:** :source:`Lib/profile.py` and :source:`Lib/pstats.py` -------------- @@ -22,14 +17,13 @@ Introduction to the profilers single: deterministic profiling single: profiling, deterministic -A :dfn:`profiler` is a program that describes the run time performance of a -program, providing a variety of statistics. This documentation describes the -profiler functionality provided in the modules :mod:`cProfile`, :mod:`profile` -and :mod:`pstats`. This profiler provides :dfn:`deterministic profiling` of -Python programs. It also provides a series of report generation tools to allow -users to rapidly examine the results of a profile operation. +:mod:`cProfile` and :mod:`profile` provide :dfn:`deterministic profiling` of +Python programs. A :dfn:`profile` is a set of statistics that describes how +often and for how long various parts of the program executed. These statistics +can be formatted into reports via the :mod:`pstats` module. -The Python standard library provides two different profilers: +The Python standard library provides two different implementations of the same +profiling interface: 1. :mod:`cProfile` is recommended for most users; it's a C extension with reasonable overhead that makes it suitable for profiling long-running @@ -37,14 +31,9 @@ The Python standard library provides two different profilers: Czotter. 2. :mod:`profile`, a pure Python module whose interface is imitated by - :mod:`cProfile`. Adds significant overhead to profiled programs. If you're - trying to extend the profiler in some way, the task might be easier with this - module. - -The :mod:`profile` and :mod:`cProfile` modules export the same interface, so -they are mostly interchangeable; :mod:`cProfile` has a much lower overhead but -is newer and might not be available on all systems. :mod:`cProfile` is really a -compatibility layer on top of the internal :mod:`_lsprof` module. + :mod:`cProfile`, but which adds significant overhead to profiled programs. + If you're trying to extend the profiler in some way, the task might be easier + with this module. .. note:: @@ -65,57 +54,94 @@ This section is provided for users that "don't want to read the manual." It provides a very brief overview, and allows a user to rapidly perform profiling on an existing application. -To profile an application with a main entry point of :func:`foo`, you would add -the following to your module:: +To profile a function that takes a single argument, you can do:: import cProfile - cProfile.run('foo()') + import re + cProfile.run('re.compile("foo|bar")') (Use :mod:`profile` instead of :mod:`cProfile` if the latter is not available on your system.) -The above action would cause :func:`foo` to be run, and a series of informative -lines (the profile) to be printed. The above approach is most useful when -working with the interpreter. If you would like to save the results of a -profile into a file for later examination, you can supply a file name as the -second argument to the :func:`run` function:: +The above action would run :func:`re.compile` and print profile results like +the following:: - import cProfile - cProfile.run('foo()', 'fooprof') + 197 function calls (192 primitive calls) in 0.002 seconds -The file :file:`cProfile.py` can also be invoked as a script to profile another -script. For example:: + Ordered by: standard name - python -m cProfile myscript.py + ncalls tottime percall cumtime percall filename:lineno(function) + 1 0.000 0.000 0.001 0.001 <string>:1(<module>) + 1 0.000 0.000 0.001 0.001 re.py:212(compile) + 1 0.000 0.000 0.001 0.001 re.py:268(_compile) + 1 0.000 0.000 0.000 0.000 sre_compile.py:172(_compile_charset) + 1 0.000 0.000 0.000 0.000 sre_compile.py:201(_optimize_charset) + 4 0.000 0.000 0.000 0.000 sre_compile.py:25(_identityfunction) + 3/1 0.000 0.000 0.000 0.000 sre_compile.py:33(_compile) -:file:`cProfile.py` accepts two optional arguments on the command line:: +The first line indicates that 197 calls were monitored. Of those calls, 192 +were :dfn:`primitive`, meaning that the call was not induced via recursion. The +next line: ``Ordered by: standard name``, indicates that the text string in the +far right column was used to sort the output. The column headings include: - cProfile.py [-o output_file] [-s sort_order] +ncalls + for the number of calls, -``-s`` only applies to standard output (``-o`` is not supplied). -Look in the :class:`Stats` documentation for valid sort values. +tottime + for the total time spent in the given function (and excluding time made in + calls to sub-functions) -When you wish to review the profile, you should use the methods in the -:mod:`pstats` module. Typically you would load the statistics data as follows:: +percall + is the quotient of ``tottime`` divided by ``ncalls`` - import pstats - p = pstats.Stats('fooprof') +cumtime + is the cumulative time spent in this and all subfunctions (from invocation + till exit). This figure is accurate *even* for recursive functions. -The class :class:`Stats` (the above code just created an instance of this class) -has a variety of methods for manipulating and printing the data that was just -read into ``p``. When you ran :func:`cProfile.run` above, what was printed was -the result of three method calls:: +percall + is the quotient of ``cumtime`` divided by primitive calls - p.strip_dirs().sort_stats(-1).print_stats() +filename:lineno(function) + provides the respective data of each function + +When there are two numbers in the first column (for example ``3/1``), it means +that the function recursed. The second value is the number of primitive calls +and the former is the total number of calls. Note that when the function does +not recurse, these two values are the same, and only the single figure is +printed. -The first method removed the extraneous path from all the module names. The -second method sorted all the entries according to the standard module/line/name -string that is printed. The third method printed out all the statistics. You -might try the following sort calls: +Instead of printing the output at the end of the profile run, you can save the +results to a file by specifying a filename to the :func:`run` function:: -.. (this is to comply with the semantics of the old profiler). + import cProfile + import re + cProfile.run('re.compile("foo|bar")', 'restats') + +The :class:`pstats.Stats` class reads profile results from a file and formats +them in various ways. -:: +The file :mod:`cProfile` can also be invoked as a script to profile another +script. For example:: + + python -m cProfile [-o output_file] [-s sort_order] myscript.py + +``-o`` writes the profile results to a file instead of to stdout + +``-s`` specifies one of the :func:`~pstats.Stats.sort_stats` sort values to sort +the output by. This only applies when ``-o`` is not supplied. + +The :mod:`pstats` module's :class:`~pstats.Stats` class has a variety of methods +for manipulating and printing the data saved into a profile results file:: + + import pstats + p = pstats.Stats('restats') + p.strip_dirs().sort_stats(-1).print_stats() + +The :meth:`~pstats.Stats.strip_dirs` method removed the extraneous path from all +the module names. The :meth:`~pstats.Stats.sort_stats` method sorted all the +entries according to the standard module/line/name string that is printed. The +:meth:`~pstats.Stats.print_stats` method printed out all the statistics. You +might try the following sort calls:: p.sort_stats('name') p.print_stats() @@ -164,336 +190,338 @@ If you want more functionality, you're going to have to read the manual, or guess what the following functions do:: p.print_callees() - p.add('fooprof') + p.add('restats') Invoked as a script, the :mod:`pstats` module is a statistics browser for reading and examining profile dumps. It has a simple line-oriented interface (implemented using :mod:`cmd`) and interactive help. +:mod:`profile` and :mod:`cProfile` Module Reference +======================================================= -.. _deterministic-profiling: +.. module:: cProfile +.. module:: profile + :synopsis: Python source profiler. -What Is Deterministic Profiling? -================================ +Both the :mod:`profile` and :mod:`cProfile` modules provide the following +functions: -:dfn:`Deterministic profiling` is meant to reflect the fact that all *function -call*, *function return*, and *exception* events are monitored, and precise -timings are made for the intervals between these events (during which time the -user's code is executing). In contrast, :dfn:`statistical profiling` (which is -not done by this module) randomly samples the effective instruction pointer, and -deduces where time is being spent. The latter technique traditionally involves -less overhead (as the code does not need to be instrumented), but provides only -relative indications of where time is being spent. +.. function:: run(command, filename=None, sort=-1) -In Python, since there is an interpreter active during execution, the presence -of instrumented code is not required to do deterministic profiling. Python -automatically provides a :dfn:`hook` (optional callback) for each event. In -addition, the interpreted nature of Python tends to add so much overhead to -execution, that deterministic profiling tends to only add small processing -overhead in typical applications. The result is that deterministic profiling is -not that expensive, yet provides extensive run time statistics about the -execution of a Python program. + This function takes a single argument that can be passed to the :func:`exec` + function, and an optional file name. In all cases this routine executes:: -Call count statistics can be used to identify bugs in code (surprising counts), -and to identify possible inline-expansion points (high call counts). Internal -time statistics can be used to identify "hot loops" that should be carefully -optimized. Cumulative time statistics should be used to identify high level -errors in the selection of algorithms. Note that the unusual handling of -cumulative times in this profiler allows statistics for recursive -implementations of algorithms to be directly compared to iterative -implementations. + exec(command, __main__.__dict__, __main__.__dict__) + and gathers profiling statistics from the execution. If no file name is + present, then this function automatically creates a :class:`~pstats.Stats` + instance and prints a simple profiling report. If the sort value is specified + it is passed to this :class:`~pstats.Stats` instance to control how the + results are sorted. -Reference Manual -- :mod:`profile` and :mod:`cProfile` -====================================================== +.. function:: runctx(command, globals, locals, filename=None) -.. module:: cProfile - :synopsis: Python profiler + This function is similar to :func:`run`, with added arguments to supply the + globals and locals dictionaries for the *command* string. This routine + executes:: + exec(command, globals, locals) -The primary entry point for the profiler is the global function -:func:`profile.run` (resp. :func:`cProfile.run`). It is typically used to create -any profile information. The reports are formatted and printed using methods of -the class :class:`pstats.Stats`. The following is a description of all of these -standard entry points and functions. For a more in-depth view of some of the -code, consider reading the later section on Profiler Extensions, which includes -discussion of how to derive "better" profilers from the classes presented, or -reading the source code for these modules. + and gathers profiling statistics as in the :func:`run` function above. +.. class:: Profile(timer=None, timeunit=0.0, subcalls=True, builtins=True) -.. function:: run(command, filename=None, sort=-1) + This class is normally only used if more precise control over profiling is + needed than what the :func:`cProfile.run` function provides. - This function takes a single argument that can be passed to the :func:`exec` - function, and an optional file name. In all cases this routine attempts to - :func:`exec` its first argument, and gather profiling statistics from the - execution. If no file name is present, then this function automatically - prints a simple profiling report, sorted by the standard name string - (file/line/function-name) that is presented in each line. The following is a - typical output from such a call:: + A custom timer can be supplied for measuring how long code takes to run via + the *timer* argument. This must be a function that returns a single number + representing the current time. If the number is an integer, the *timeunit* + specifies a multiplier that specifies the duration of each unit of time. For + example, if the timer returns times measured in thousands of seconds, the + time unit would be ``.001``. - 2706 function calls (2004 primitive calls) in 4.504 CPU seconds + Directly using the :class:`Profile` class allows formatting profile results + without writing the profile data to a file:: - Ordered by: standard name + import cProfile, pstats, io + pr = cProfile.Profile() + pr.enable() + # ... do something ... + pr.disable() + s = io.StringIO() + sortby = 'cumulative' + ps = pstats.Stats(pr, stream=s).sort_stats(sortby) + ps.print_stats() + print(s.getvalue()) - ncalls tottime percall cumtime percall filename:lineno(function) - 2 0.006 0.003 0.953 0.477 pobject.py:75(save_objects) - 43/3 0.533 0.012 0.749 0.250 pobject.py:99(evaluate) - ... + .. method:: enable() - The first line indicates that 2706 calls were monitored. Of those - calls, 2004 were :dfn:`primitive`. We define :dfn:`primitive` to - mean that the call was not induced via recursion. The next line: - ``Ordered by: standard name``, indicates that the text string in - the far right column was used to sort the output. The column - headings include: + Start collecting profiling data. - ncalls - for the number of calls, + .. method:: disable() - tottime - for the total time spent in the given function (and excluding time made in - calls to sub-functions), + Stop collecting profiling data. - percall - is the quotient of ``tottime`` divided by ``ncalls`` + .. method:: create_stats() - cumtime - is the total time spent in this and all subfunctions (from invocation till - exit). This figure is accurate *even* for recursive functions. + Stop collecting profiling data and record the results internally + as the current profile. - percall - is the quotient of ``cumtime`` divided by primitive calls + .. method:: print_stats(sort=-1) - filename:lineno(function) - provides the respective data of each function + Create a :class:`~pstats.Stats` object based on the current + profile and print the results to stdout. - When there are two numbers in the first column (for example, - ``43/3``), then the latter is the number of primitive calls, and - the former is the actual number of calls. Note that when the - function does not recurse, these two values are the same, and only - the single figure is printed. + .. method:: dump_stats(filename) - If *sort* is given, it can be one of values allowed for *key* - parameter from :meth:`pstats.Stats.sort_stats`. + Write the results of the current profile to *filename*. + .. method:: run(cmd) -.. function:: runctx(command, globals, locals, filename=None) + Profile the cmd via :func:`exec`. - This function is similar to :func:`run`, with added arguments to supply the - globals and locals dictionaries for the *command* string. + .. method:: runctx(cmd, globals, locals) + + Profile the cmd via :func:`exec` with the specified global and + local environment. + .. method:: runcall(func, *args, **kwargs) -Analysis of the profiler data is done using the :class:`pstats.Stats` class. + Profile ``func(*args, **kwargs)`` +.. _profile-stats: + +The :class:`Stats` Class +======================== + +Analysis of the profiler data is done using the :class:`~pstats.Stats` class. .. module:: pstats :synopsis: Statistics object for use with the profiler. +.. class:: Stats(*filenames or profile, stream=sys.stdout) + + This class constructor creates an instance of a "statistics object" from a + *filename* (or list of filenames) or from a :class:`Profile` instance. Output + will be printed to the stream specified by *stream*. + + The file selected by the above constructor must have been created by the + corresponding version of :mod:`profile` or :mod:`cProfile`. To be specific, + there is *no* file compatibility guaranteed with future versions of this + profiler, and there is no compatibility with files produced by other + profilers. If several files are provided, all the statistics for identical + functions will be coalesced, so that an overall view of several processes can + be considered in a single report. If additional files need to be combined + with data in an existing :class:`~pstats.Stats` object, the + :meth:`~pstats.Stats.add` method can be used. + + Instead of reading the profile data from a file, a :class:`cProfile.Profile` + or :class:`profile.Profile` object can be used as the profile data source. + + :class:`Stats` objects have the following methods: + + .. method:: strip_dirs() + + This method for the :class:`Stats` class removes all leading path + information from file names. It is very useful in reducing the size of + the printout to fit within (close to) 80 columns. This method modifies + the object, and the stripped information is lost. After performing a + strip operation, the object is considered to have its entries in a + "random" order, as it was just after object initialization and loading. + If :meth:`~pstats.Stats.strip_dirs` causes two function names to be + indistinguishable (they are on the same line of the same filename, and + have the same function name), then the statistics for these two entries + are accumulated into a single entry. + + + .. method:: add(*filenames) + + This method of the :class:`Stats` class accumulates additional profiling + information into the current profiling object. Its arguments should refer + to filenames created by the corresponding version of :func:`profile.run` + or :func:`cProfile.run`. Statistics for identically named (re: file, line, + name) functions are automatically accumulated into single function + statistics. + + + .. method:: dump_stats(filename) + + Save the data loaded into the :class:`Stats` object to a file named + *filename*. The file is created if it does not exist, and is overwritten + if it already exists. This is equivalent to the method of the same name + on the :class:`profile.Profile` and :class:`cProfile.Profile` classes. + + + .. method:: sort_stats(*keys) + + This method modifies the :class:`Stats` object by sorting it according to + the supplied criteria. The argument is typically a string identifying the + basis of a sort (example: ``'time'`` or ``'name'``). + + When more than one key is provided, then additional keys are used as + secondary criteria when there is equality in all keys selected before + them. For example, ``sort_stats('name', 'file')`` will sort all the + entries according to their function name, and resolve all ties (identical + function names) by sorting by file name. + + Abbreviations can be used for any key names, as long as the abbreviation + is unambiguous. The following are the keys currently defined: + + +------------------+----------------------+ + | Valid Arg | Meaning | + +==================+======================+ + | ``'calls'`` | call count | + +------------------+----------------------+ + | ``'cumulative'`` | cumulative time | + +------------------+----------------------+ + | ``'cumtime'`` | cumulative time | + +------------------+----------------------+ + | ``'file'`` | file name | + +------------------+----------------------+ + | ``'filename'`` | file name | + +------------------+----------------------+ + | ``'module'`` | file name | + +------------------+----------------------+ + | ``'ncalls'`` | call count | + +------------------+----------------------+ + | ``'pcalls'`` | primitive call count | + +------------------+----------------------+ + | ``'line'`` | line number | + +------------------+----------------------+ + | ``'name'`` | function name | + +------------------+----------------------+ + | ``'nfl'`` | name/file/line | + +------------------+----------------------+ + | ``'stdname'`` | standard name | + +------------------+----------------------+ + | ``'time'`` | internal time | + +------------------+----------------------+ + | ``'tottime'`` | internal time | + +------------------+----------------------+ + + Note that all sorts on statistics are in descending order (placing most + time consuming items first), where as name, file, and line number searches + are in ascending order (alphabetical). The subtle distinction between + ``'nfl'`` and ``'stdname'`` is that the standard name is a sort of the + name as printed, which means that the embedded line numbers get compared + in an odd way. For example, lines 3, 20, and 40 would (if the file names + were the same) appear in the string order 20, 3 and 40. In contrast, + ``'nfl'`` does a numeric compare of the line numbers. In fact, + ``sort_stats('nfl')`` is the same as ``sort_stats('name', 'file', + 'line')``. + + For backward-compatibility reasons, the numeric arguments ``-1``, ``0``, + ``1``, and ``2`` are permitted. They are interpreted as ``'stdname'``, + ``'calls'``, ``'time'``, and ``'cumulative'`` respectively. If this old + style format (numeric) is used, only one sort key (the numeric key) will + be used, and additional arguments will be silently ignored. + + .. For compatibility with the old profiler. + + + .. method:: reverse_order() + + This method for the :class:`Stats` class reverses the ordering of the + basic list within the object. Note that by default ascending vs + descending order is properly selected based on the sort key of choice. + + .. This method is provided primarily for compatibility with the old + profiler. + + + .. method:: print_stats(*restrictions) + + This method for the :class:`Stats` class prints out a report as described + in the :func:`profile.run` definition. + + The order of the printing is based on the last + :meth:`~pstats.Stats.sort_stats` operation done on the object (subject to + caveats in :meth:`~pstats.Stats.add` and + :meth:`~pstats.Stats.strip_dirs`). + + The arguments provided (if any) can be used to limit the list down to the + significant entries. Initially, the list is taken to be the complete set + of profiled functions. Each restriction is either an integer (to select a + count of lines), or a decimal fraction between 0.0 and 1.0 inclusive (to + select a percentage of lines), or a regular expression (to pattern match + the standard name that is printed. If several restrictions are provided, + then they are applied sequentially. For example:: + + print_stats(.1, 'foo:') + + would first limit the printing to first 10% of list, and then only print + functions that were part of filename :file:`.\*foo:`. In contrast, the + command:: + + print_stats('foo:', .1) + + would limit the list to all functions having file names :file:`.\*foo:`, + and then proceed to only print the first 10% of them. + + + .. method:: print_callers(*restrictions) + + This method for the :class:`Stats` class prints a list of all functions + that called each function in the profiled database. The ordering is + identical to that provided by :meth:`~pstats.Stats.print_stats`, and the + definition of the restricting argument is also identical. Each caller is + reported on its own line. The format differs slightly depending on the + profiler that produced the stats: + + * With :mod:`profile`, a number is shown in parentheses after each caller + to show how many times this specific call was made. For convenience, a + second non-parenthesized number repeats the cumulative time spent in the + function at the right. + + * With :mod:`cProfile`, each caller is preceded by three numbers: the + number of times this specific call was made, and the total and + cumulative times spent in the current function while it was invoked by + this specific caller. + + + .. method:: print_callees(*restrictions) -.. class:: Stats(*filenames, stream=sys.stdout) - - This class constructor creates an instance of a "statistics object" - from a *filename* (or set of filenames). :class:`Stats` objects - are manipulated by methods, in order to print useful reports. You - may specify an alternate output stream by giving the keyword - argument, ``stream``. + This method for the :class:`Stats` class prints a list of all function + that were called by the indicated function. Aside from this reversal of + direction of calls (re: called vs was called by), the arguments and + ordering are identical to the :meth:`~pstats.Stats.print_callers` method. - The file selected by the above constructor must have been created - by the corresponding version of :mod:`profile` or :mod:`cProfile`. - To be specific, there is *no* file compatibility guaranteed with - future versions of this profiler, and there is no compatibility - with files produced by other profilers. If several files are - provided, all the statistics for identical functions will be - coalesced, so that an overall view of several processes can be - considered in a single report. If additional files need to be - combined with data in an existing :class:`Stats` object, the - :meth:`add` method can be used. - .. (such as the old system profiler). +.. _deterministic-profiling: +What Is Deterministic Profiling? +================================ -.. _profile-stats: +:dfn:`Deterministic profiling` is meant to reflect the fact that all *function +call*, *function return*, and *exception* events are monitored, and precise +timings are made for the intervals between these events (during which time the +user's code is executing). In contrast, :dfn:`statistical profiling` (which is +not done by this module) randomly samples the effective instruction pointer, and +deduces where time is being spent. The latter technique traditionally involves +less overhead (as the code does not need to be instrumented), but provides only +relative indications of where time is being spent. -The :class:`Stats` Class ------------------------- - -:class:`Stats` objects have the following methods: - - -.. method:: Stats.strip_dirs() - - This method for the :class:`Stats` class removes all leading path - information from file names. It is very useful in reducing the - size of the printout to fit within (close to) 80 columns. This - method modifies the object, and the stripped information is lost. - After performing a strip operation, the object is considered to - have its entries in a "random" order, as it was just after object - initialization and loading. If :meth:`strip_dirs` causes two - function names to be indistinguishable (they are on the same line - of the same filename, and have the same function name), then the - statistics for these two entries are accumulated into a single - entry. - - -.. method:: Stats.add(*filenames) - - This method of the :class:`Stats` class accumulates additional profiling - information into the current profiling object. Its arguments should refer to - filenames created by the corresponding version of :func:`profile.run` or - :func:`cProfile.run`. Statistics for identically named (re: file, line, name) - functions are automatically accumulated into single function statistics. - - -.. method:: Stats.dump_stats(filename) - - Save the data loaded into the :class:`Stats` object to a file named - *filename*. The file is created if it does not exist, and is - overwritten if it already exists. This is equivalent to the method - of the same name on the :class:`profile.Profile` and - :class:`cProfile.Profile` classes. - - -.. method:: Stats.sort_stats(*keys) - - This method modifies the :class:`Stats` object by sorting it - according to the supplied criteria. The argument is typically a - string identifying the basis of a sort (example: ``'time'`` or - ``'name'``). - - When more than one key is provided, then additional keys are used - as secondary criteria when there is equality in all keys selected - before them. For example, ``sort_stats('name', 'file')`` will sort - all the entries according to their function name, and resolve all - ties (identical function names) by sorting by file name. - - Abbreviations can be used for any key names, as long as the abbreviation is - unambiguous. The following are the keys currently defined: - - +------------------+----------------------+ - | Valid Arg | Meaning | - +==================+======================+ - | ``'calls'`` | call count | - +------------------+----------------------+ - | ``'cumulative'`` | cumulative time | - +------------------+----------------------+ - | ``'cumtime'`` | cumulative time | - +------------------+----------------------+ - | ``'file'`` | file name | - +------------------+----------------------+ - | ``'filename'`` | file name | - +------------------+----------------------+ - | ``'module'`` | file name | - +------------------+----------------------+ - | ``'ncalls'`` | call count | - +------------------+----------------------+ - | ``'pcalls'`` | primitive call count | - +------------------+----------------------+ - | ``'line'`` | line number | - +------------------+----------------------+ - | ``'name'`` | function name | - +------------------+----------------------+ - | ``'nfl'`` | name/file/line | - +------------------+----------------------+ - | ``'stdname'`` | standard name | - +------------------+----------------------+ - | ``'time'`` | internal time | - +------------------+----------------------+ - | ``'tottime'`` | internal time | - +------------------+----------------------+ - - Note that all sorts on statistics are in descending order (placing - most time consuming items first), where as name, file, and line - number searches are in ascending order (alphabetical). The subtle - distinction between ``'nfl'`` and ``'stdname'`` is that the - standard name is a sort of the name as printed, which means that - the embedded line numbers get compared in an odd way. For example, - lines 3, 20, and 40 would (if the file names were the same) appear - in the string order 20, 3 and 40. In contrast, ``'nfl'`` does a - numeric compare of the line numbers. In fact, - ``sort_stats('nfl')`` is the same as ``sort_stats('name', 'file', - 'line')``. - - For backward-compatibility reasons, the numeric arguments ``-1``, - ``0``, ``1``, and ``2`` are permitted. They are interpreted as - ``'stdname'``, ``'calls'``, ``'time'``, and ``'cumulative'`` - respectively. If this old style format (numeric) is used, only one - sort key (the numeric key) will be used, and additional arguments - will be silently ignored. - - .. For compatibility with the old profiler, - - -.. method:: Stats.reverse_order() - - This method for the :class:`Stats` class reverses the ordering of - the basic list within the object. Note that by default ascending - vs descending order is properly selected based on the sort key of - choice. - - .. This method is provided primarily for compatibility with the old profiler. - - -.. method:: Stats.print_stats(*restrictions) - - This method for the :class:`Stats` class prints out a report as - described in the :func:`profile.run` definition. - - The order of the printing is based on the last :meth:`sort_stats` - operation done on the object (subject to caveats in :meth:`add` and - :meth:`strip_dirs`). - - The arguments provided (if any) can be used to limit the list down - to the significant entries. Initially, the list is taken to be the - complete set of profiled functions. Each restriction is either an - integer (to select a count of lines), or a decimal fraction between - 0.0 and 1.0 inclusive (to select a percentage of lines), or a - regular expression (to pattern match the standard name that is - printed; as of Python 1.5b1, this uses the Perl-style regular - expression syntax defined by the :mod:`re` module). If several - restrictions are provided, then they are applied sequentially. For - example:: - - print_stats(.1, 'foo:') - - would first limit the printing to first 10% of list, and then only print - functions that were part of filename :file:`.\*foo:`. In contrast, the - command:: - - print_stats('foo:', .1) - - would limit the list to all functions having file names :file:`.\*foo:`, and - then proceed to only print the first 10% of them. - - -.. method:: Stats.print_callers(*restrictions) - - This method for the :class:`Stats` class prints a list of all functions that - called each function in the profiled database. The ordering is identical to - that provided by :meth:`print_stats`, and the definition of the restricting - argument is also identical. Each caller is reported on its own line. The - format differs slightly depending on the profiler that produced the stats: - - * With :mod:`profile`, a number is shown in parentheses after each caller to - show how many times this specific call was made. For convenience, a second - non-parenthesized number repeats the cumulative time spent in the function - at the right. - - * With :mod:`cProfile`, each caller is preceded by three numbers: - the number of times this specific call was made, and the total - and cumulative times spent in the current function while it was - invoked by this specific caller. - - -.. method:: Stats.print_callees(*restrictions) +In Python, since there is an interpreter active during execution, the presence +of instrumented code is not required to do deterministic profiling. Python +automatically provides a :dfn:`hook` (optional callback) for each event. In +addition, the interpreted nature of Python tends to add so much overhead to +execution, that deterministic profiling tends to only add small processing +overhead in typical applications. The result is that deterministic profiling is +not that expensive, yet provides extensive run time statistics about the +execution of a Python program. - This method for the :class:`Stats` class prints a list of all - function that were called by the indicated function. Aside from - this reversal of direction of calls (re: called vs was called by), - the arguments and ordering are identical to the - :meth:`print_callers` method. +Call count statistics can be used to identify bugs in code (surprising counts), +and to identify possible inline-expansion points (high call counts). Internal +time statistics can be used to identify "hot loops" that should be carefully +optimized. Cumulative time statistics should be used to identify high level +errors in the selection of algorithms. Note that the unusual handling of +cumulative times in this profiler allows statistics for recursive +implementations of algorithms to be directly compared to iterative +implementations. -.. _profile-limits: +.. _profile-limitations: Limitations =========== @@ -536,7 +564,7 @@ The profiler of the :mod:`profile` module subtracts a constant from each event handling time to compensate for the overhead of calling the time function, and socking away the results. By default, the constant is 0. The following procedure can be used to obtain a better constant for a given platform (see -discussion in section Limitations above). :: +:ref:`profile-limitations`). :: import profile pr = profile.Profile() @@ -546,8 +574,8 @@ discussion in section Limitations above). :: The method executes the number of Python calls given by the argument, directly and again under the profiler, measuring the time for both. It then computes the hidden overhead per profiler event, and returns that as a float. For example, -on an 800 MHz Pentium running Windows 2000, and using Python's time.clock() as -the timer, the magical number is about 12.5e-6. +on a 1.8Ghz Intel Core i5 running Mac OS X, and using Python's time.clock() as +the timer, the magical number is about 4.04e-6. The object of this exercise is to get a fairly consistent result. If your computer is *very* fast, or your timer function has poor resolution, you might @@ -570,54 +598,51 @@ When you have a consistent answer, there are three ways you can use it:: If you have a choice, you are better off choosing a smaller constant, and then your results will "less often" show up as negative in profile statistics. +.. _profile-timers: -.. _profiler-extensions: - -Extensions --- Deriving Better Profilers -======================================== - -The :class:`Profile` class of both modules, :mod:`profile` and :mod:`cProfile`, -were written so that derived classes could be developed to extend the profiler. -The details are not described here, as doing this successfully requires an -expert understanding of how the :class:`Profile` class works internally. Study -the source code of the module carefully if you want to pursue this. +Using a custom timer +==================== -If all you want to do is change how current time is determined (for example, to -force use of wall-clock time or elapsed process time), pass the timing function -you want to the :class:`Profile` class constructor:: +If you want to change how current time is determined (for example, to force use +of wall-clock time or elapsed process time), pass the timing function you want +to the :class:`Profile` class constructor:: - pr = profile.Profile(your_time_func) + pr = profile.Profile(your_time_func) -The resulting profiler will then call :func:`your_time_func`. +The resulting profiler will then call ``your_time_func``. Depending on whether +you are using :class:`profile.Profile` or :class:`cProfile.Profile`, +``your_time_func``'s return value will be interpreted differently: :class:`profile.Profile` - :func:`your_time_func` should return a single number, or a list of - numbers whose sum is the current time (like what :func:`os.times` - returns). If the function returns a single time number, or the - list of returned numbers has length 2, then you will get an - especially fast version of the dispatch routine. - - Be warned that you should calibrate the profiler class for the - timer function that you choose. For most machines, a timer that - returns a lone integer value will provide the best results in terms - of low overhead during profiling. (:func:`os.times` is *pretty* - bad, as it returns a tuple of floating point values). If you want - to substitute a better timer in the cleanest fashion, derive a - class and hardwire a replacement dispatch method that best handles - your timer call, along with the appropriate calibration constant. + ``your_time_func`` should return a single number, or a list of numbers whose + sum is the current time (like what :func:`os.times` returns). If the + function returns a single time number, or the list of returned numbers has + length 2, then you will get an especially fast version of the dispatch + routine. + + Be warned that you should calibrate the profiler class for the timer function + that you choose (see :ref:`profile-calibration`). For most machines, a timer + that returns a lone integer value will provide the best results in terms of + low overhead during profiling. (:func:`os.times` is *pretty* bad, as it + returns a tuple of floating point values). If you want to substitute a + better timer in the cleanest fashion, derive a class and hardwire a + replacement dispatch method that best handles your timer call, along with the + appropriate calibration constant. :class:`cProfile.Profile` - :func:`your_time_func` should return a single number. If it - returns integers, you can also invoke the class constructor with a - second argument specifying the real duration of one unit of time. - For example, if :func:`your_integer_time_func` returns times - measured in thousands of seconds, you would construct the - :class:`Profile` instance as follows:: - - pr = profile.Profile(your_integer_time_func, 0.001) - - As the :mod:`cProfile.Profile` class cannot be calibrated, custom - timer functions should be used with care and should be as fast as - possible. For the best results with a custom timer, it might be - necessary to hard-code it in the C source of the internal - :mod:`_lsprof` module. + ``your_time_func`` should return a single number. If it returns integers, + you can also invoke the class constructor with a second argument specifying + the real duration of one unit of time. For example, if + ``your_integer_time_func`` returns times measured in thousands of seconds, + you would construct the :class:`Profile` instance as follows:: + + pr = cProfile.Profile(your_integer_time_func, 0.001) + + As the :mod:`cProfile.Profile` class cannot be calibrated, custom timer + functions should be used with care and should be as fast as possible. For + the best results with a custom timer, it might be necessary to hard-code it + in the C source of the internal :mod:`_lsprof` module. + +Python 3.3 adds several new functions in :mod:`time` that can be used to make +precise measurements of process or wall-clock time. For example, see +:func:`time.perf_counter`. diff --git a/Doc/library/pyexpat.rst b/Doc/library/pyexpat.rst index 420e407..3d88d85 100644 --- a/Doc/library/pyexpat.rst +++ b/Doc/library/pyexpat.rst @@ -339,8 +339,10 @@ otherwise stated. .. method:: xmlparser.StartElementHandler(name, attributes) Called for the start of every element. *name* is a string containing the - element name, and *attributes* is a dictionary mapping attribute names to their - values. + element name, and *attributes* is the element attributes. If + :attr:`ordered_attributes` is true, this is a list (see + :attr:`ordered_attributes` for a full description). Otherwise it's a + dictionary mapping names to values. .. method:: xmlparser.EndElementHandler(name) @@ -482,8 +484,8 @@ ExpatError Exceptions .. attribute:: ExpatError.code Expat's internal error number for the specific error. The - :data:`errors.messages` dictionary maps these error numbers to Expat's error - messages. For example:: + :data:`errors.messages <xml.parsers.expat.errors.messages>` dictionary maps + these error numbers to Expat's error messages. For example:: from xml.parsers.expat import ParserCreate, ExpatError, errors @@ -493,9 +495,9 @@ ExpatError Exceptions except ExpatError as err: print("Error:", errors.messages[err.code]) - The :mod:`errors` module also provides error message constants and a - dictionary :data:`~errors.codes` mapping these messages back to the error - codes, see below. + The :mod:`~xml.parsers.expat.errors` module also provides error message + constants and a dictionary :data:`~xml.parsers.expat.errors.codes` mapping + these messages back to the error codes, see below. .. attribute:: ExpatError.lineno @@ -859,5 +861,5 @@ The ``errors`` module has the following attributes: .. [#] The encoding string included in XML output should conform to the appropriate standards. For example, "UTF-8" is valid, but "UTF8" is not. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl - and http://www.iana.org/assignments/character-sets . + and http://www.iana.org/assignments/character-sets\ . diff --git a/Doc/library/queue.rst b/Doc/library/queue.rst index bc35dfe..680d690 100644 --- a/Doc/library/queue.rst +++ b/Doc/library/queue.rst @@ -54,13 +54,15 @@ The :mod:`queue` module defines the following classes and exceptions: .. exception:: Empty - Exception raised when non-blocking :meth:`get` (or :meth:`get_nowait`) is called + Exception raised when non-blocking :meth:`~Queue.get` (or + :meth:`~Queue.get_nowait`) is called on a :class:`Queue` object which is empty. .. exception:: Full - Exception raised when non-blocking :meth:`put` (or :meth:`put_nowait`) is called + Exception raised when non-blocking :meth:`~Queue.put` (or + :meth:`~Queue.put_nowait`) is called on a :class:`Queue` object which is full. @@ -181,6 +183,6 @@ Example of how to wait for enqueued tasks to be completed:: context. :class:`collections.deque` is an alternative implementation of unbounded - queues with fast atomic :func:`append` and :func:`popleft` operations that - do not require locking. + queues with fast atomic :meth:`~collections.deque.append` and + :meth:`~collections.deque.popleft` operations that do not require locking. diff --git a/Doc/library/random.rst b/Doc/library/random.rst index 1cd4d26..11dd367 100644 --- a/Doc/library/random.rst +++ b/Doc/library/random.rst @@ -43,6 +43,12 @@ The :mod:`random` module also provides the :class:`SystemRandom` class which uses the system function :func:`os.urandom` to generate random numbers from sources provided by the operating system. +.. warning:: + + The pseudo-random generators of this module should not be used for + security purposes. Use :func:`os.urandom` or :class:`SystemRandom` if + you require a cryptographically secure pseudo-random number generator. + Bookkeeping functions: @@ -145,6 +151,9 @@ Functions for sequences: argument. This is especially fast and space efficient for sampling from a large population: ``sample(range(10000000), 60)``. + If the sample size is larger than the population size, a :exc:`ValueError` + is raised. + The following functions generate specific real-valued distributions. Function parameters are named after the corresponding variables in the distribution's equation, as used in common mathematical practice; most of these equations can diff --git a/Doc/library/re.rst b/Doc/library/re.rst index 26f2a38..dd790e7 100644 --- a/Doc/library/re.rst +++ b/Doc/library/re.rst @@ -242,21 +242,32 @@ The special characters are: ``(?P<name>...)`` Similar to regular parentheses, but the substring matched by the group is - accessible within the rest of the regular expression via the symbolic group - name *name*. Group names must be valid Python identifiers, and each group - name must be defined only once within a regular expression. A symbolic group - is also a numbered group, just as if the group were not named. So the group - named ``id`` in the example below can also be referenced as the numbered group - ``1``. - - For example, if the pattern is ``(?P<id>[a-zA-Z_]\w*)``, the group can be - referenced by its name in arguments to methods of match objects, such as - ``m.group('id')`` or ``m.end('id')``, and also by name in the regular - expression itself (using ``(?P=id)``) and replacement text given to - ``.sub()`` (using ``\g<id>``). + accessible via the symbolic group name *name*. Group names must be valid + Python identifiers, and each group name must be defined only once within a + regular expression. A symbolic group is also a numbered group, just as if + the group were not named. + + Named groups can be referenced in three contexts. If the pattern is + ``(?P<quote>['"]).*?(?P=quote)`` (i.e. matching a string quoted with either + single or double quotes): + + +---------------------------------------+----------------------------------+ + | Context of reference to group "quote" | Ways to reference it | + +=======================================+==================================+ + | in the same pattern itself | * ``(?P=quote)`` (as shown) | + | | * ``\1`` | + +---------------------------------------+----------------------------------+ + | when processing match object ``m`` | * ``m.group('quote')`` | + | | * ``m.end('quote')`` (etc.) | + +---------------------------------------+----------------------------------+ + | in a string passed to the ``repl`` | * ``\g<quote>`` | + | argument of ``re.sub()`` | * ``\g<1>`` | + | | * ``\1`` | + +---------------------------------------+----------------------------------+ ``(?P=name)`` - Matches whatever text was matched by the earlier group named *name*. + A backreference to a named group; it matches whatever text was matched by the + earlier group named *name*. ``(?#...)`` A comment; the contents of the parentheses are simply ignored. @@ -306,7 +317,7 @@ The special characters are: optional and can be omitted. For example, ``(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)`` is a poor email matching pattern, which will match with ``'<user@host.com>'`` as well as ``'user@host.com'``, but - not with ``'<user@host.com'`` nor ``'user@host.com>'`` . + not with ``'<user@host.com'`` nor ``'user@host.com>'``. The special sequences consist of ``'\'`` and a character from the list below. @@ -316,7 +327,7 @@ the second character. For example, ``\$`` matches the character ``'$'``. ``\number`` Matches the contents of the group of the same number. Groups are numbered starting from 1. For example, ``(.+) \1`` matches ``'the the'`` or ``'55 55'``, - but not ``'the end'`` (note the space after the group). This special sequence + but not ``'thethe'`` (note the space after the group). This special sequence can only be used to match one of the first 99 groups. If the first digit of *number* is 0, or *number* is 3 octal digits long, it will not be interpreted as a group match, but as the character with octal value *number*. Inside the @@ -414,17 +425,24 @@ Most of the standard escapes supported by Python string literals are also accepted by the regular expression parser:: \a \b \f \n - \r \t \v \x - \\ + \r \t \u \U + \v \x \\ (Note that ``\b`` is used to represent word boundaries, and means "backspace" only inside character classes.) +``'\u'`` and ``'\U'`` escape sequences are only recognized in Unicode +patterns. In bytes patterns they are not treated specially. + Octal escapes are included in a limited form. If the first digit is a 0, or if there are three octal digits, it is considered an octal escape. Otherwise, it is a group reference. As for string literals, octal escapes are always at most three digits in length. +.. versionchanged:: 3.3 + The ``'\u'`` and ``'\U'`` escape sequences have been added. + + .. _contents-of-module-re: @@ -660,7 +678,8 @@ form. when not adjacent to a previous match, so ``sub('x*', '-', 'abc')`` returns ``'-a-b-c-'``. - In addition to character escapes and backreferences as described above, + In string-type *repl* arguments, in addition to the character escapes and + backreferences described above, ``\g<name>`` will use the substring matched by the group named ``name``, as defined by the ``(?P<name>...)`` syntax. ``\g<number>`` uses the corresponding group number; ``\g<2>`` is therefore equivalent to ``\2``, but isn't ambiguous @@ -684,9 +703,12 @@ form. .. function:: escape(string) - Return *string* with all non-alphanumerics backslashed; this is useful if you - want to match an arbitrary literal string that may have regular expression - metacharacters in it. + Escape all the characters in pattern except ASCII letters, numbers and ``'_'``. + This is useful if you want to match an arbitrary literal string that may + have regular expression metacharacters in it. + + .. versionchanged:: 3.3 + The ``'_'`` character is no longer escaped. .. function:: purge() diff --git a/Doc/library/readline.rst b/Doc/library/readline.rst index ab55197..1134619 100644 --- a/Doc/library/readline.rst +++ b/Doc/library/readline.rst @@ -199,7 +199,7 @@ normally be executed automatically during interactive sessions from the user's histfile = os.path.join(os.path.expanduser("~"), ".pyhist") try: readline.read_history_file(histfile) - except IOError: + except FileNotFoundError: pass import atexit atexit.register(readline.write_history_file, histfile) @@ -224,7 +224,7 @@ support history save/restore. :: if hasattr(readline, "read_history_file"): try: readline.read_history_file(histfile) - except IOError: + except FileNotFoundError: pass atexit.register(self.save_history, histfile) diff --git a/Doc/library/resource.rst b/Doc/library/resource.rst index c16b013..ed85850 100644 --- a/Doc/library/resource.rst +++ b/Doc/library/resource.rst @@ -14,13 +14,15 @@ resources utilized by a program. Symbolic constants are used to specify particular system resources and to request usage information about either the current process or its children. -A single exception is defined for errors: +An :exc:`OSError` is raised on syscall failure. .. exception:: error - The functions described below may raise this error if the underlying system call - failures unexpectedly. + A deprecated alias of :exc:`OSError`. + + .. versionchanged:: 3.3 + Following :pep:`3151`, this class was made an alias of :exc:`OSError`. Resource Limits @@ -41,6 +43,11 @@ which cannot be checked or controlled by the operating system are not defined in this module for those platforms. +.. data:: RLIM_INFINITY + + Constant used to represent the the limit for an unlimited resource. + + .. function:: getrlimit(resource) Returns a tuple ``(soft, hard)`` with the current soft and hard limits of @@ -52,12 +59,20 @@ this module for those platforms. Sets new limits of consumption of *resource*. The *limits* argument must be a tuple ``(soft, hard)`` of two integers describing the new limits. A value of - ``-1`` can be used to specify the maximum possible upper limit. + :data:`~resource.RLIM_INFINITY` can be used to request a limit that is + unlimited. Raises :exc:`ValueError` if an invalid resource is specified, if the new soft - limit exceeds the hard limit, or if a process tries to raise its hard limit - (unless the process has an effective UID of super-user). Can also raise - :exc:`error` if the underlying system call fails. + limit exceeds the hard limit, or if a process tries to raise its hard limit. + Specifying a limit of :data:`~resource.RLIM_INFINITY` when the hard or + system limit for that resource is not unlimited will result in a + :exc:`ValueError`. A process with the effective UID of super-user can + request any valid limit value, including unlimited, but :exc:`ValueError` + will still be raised if the requested limit exceeds the system imposed + limit. + + ``setrlimit`` may also raise :exc:`error` if the underlying system call + fails. These symbols define resources whose consumption can be controlled using the :func:`setrlimit` and :func:`getrlimit` functions described below. The values of diff --git a/Doc/library/sched.rst b/Doc/library/sched.rst index 000dba0..26f59c5 100644 --- a/Doc/library/sched.rst +++ b/Doc/library/sched.rst @@ -14,63 +14,47 @@ The :mod:`sched` module defines a class which implements a general purpose event scheduler: -.. class:: scheduler(timefunc, delayfunc) +.. class:: scheduler(timefunc=time.monotonic, delayfunc=time.sleep) The :class:`scheduler` class defines a generic interface to scheduling events. It needs two functions to actually deal with the "outside world" --- *timefunc* should be callable without arguments, and return a number (the "time", in any - units whatsoever). The *delayfunc* function should be callable with one + units whatsoever). If time.monotonic is not available, the *timefunc* default + is time.time instead. The *delayfunc* function should be callable with one argument, compatible with the output of *timefunc*, and should delay that many time units. *delayfunc* will also be called with the argument ``0`` after each event is run to allow other threads an opportunity to run in multi-threaded applications. + .. versionchanged:: 3.3 + *timefunc* and *delayfunc* parameters are optional. + + .. versionchanged:: 3.3 + :class:`scheduler` class can be safely used in multi-threaded + environments. + Example:: >>> import sched, time >>> s = sched.scheduler(time.time, time.sleep) - >>> def print_time(): print("From print_time", time.time()) + >>> def print_time(a='default'): + ... print("From print_time", time.time(), a) ... >>> def print_some_times(): ... print(time.time()) - ... s.enter(5, 1, print_time, ()) - ... s.enter(10, 1, print_time, ()) + ... s.enter(10, 1, print_time) + ... s.enter(5, 2, print_time, argument=('positional',)) + ... s.enter(5, 1, print_time, kwargs={'a': 'keyword'}) ... s.run() ... print(time.time()) ... >>> print_some_times() 930343690.257 - From print_time 930343695.274 - From print_time 930343700.273 + From print_time 930343695.274 positional + From print_time 930343695.275 keyword + From print_time 930343700.273 default 930343700.276 -In multi-threaded environments, the :class:`scheduler` class has limitations -with respect to thread-safety, inability to insert a new task before -the one currently pending in a running scheduler, and holding up the main -thread until the event queue is empty. Instead, the preferred approach -is to use the :class:`threading.Timer` class instead. - -Example:: - - >>> import time - >>> from threading import Timer - >>> def print_time(): - ... print("From print_time", time.time()) - ... - >>> def print_some_times(): - ... print(time.time()) - ... Timer(5, print_time, ()).start() - ... Timer(10, print_time, ()).start() - ... time.sleep(11) # sleep while time-delay events execute - ... print(time.time()) - ... - >>> print_some_times() - 930343690.257 - From print_time 930343695.274 - From print_time 930343700.273 - 930343701.301 - - .. _scheduler-objects: Scheduler Objects @@ -79,26 +63,38 @@ Scheduler Objects :class:`scheduler` instances have the following methods and attributes: -.. method:: scheduler.enterabs(time, priority, action, argument) +.. method:: scheduler.enterabs(time, priority, action, argument=(), kwargs={}) Schedule a new event. The *time* argument should be a numeric type compatible with the return value of the *timefunc* function passed to the constructor. Events scheduled for the same *time* will be executed in the order of their *priority*. - Executing the event means executing ``action(*argument)``. *argument* must be a - sequence holding the parameters for *action*. + Executing the event means executing ``action(*argument, **kwargs)``. + *argument* is a sequence holding the positional arguments for *action*. + *kwargs* is a dictionary holding the keyword arguments for *action*. Return value is an event which may be used for later cancellation of the event (see :meth:`cancel`). + .. versionchanged:: 3.3 + *argument* parameter is optional. + + .. versionadded:: 3.3 + *kwargs* parameter was added. + -.. method:: scheduler.enter(delay, priority, action, argument) +.. method:: scheduler.enter(delay, priority, action, argument=(), kwargs={}) Schedule an event for *delay* more time units. Other than the relative time, the other arguments, the effect and the return value are the same as those for :meth:`enterabs`. + .. versionchanged:: 3.3 + *argument* parameter is optional. + + .. versionadded:: 3.3 + *kwargs* parameter was added. .. method:: scheduler.cancel(event) @@ -111,12 +107,16 @@ Scheduler Objects Return true if the event queue is empty. -.. method:: scheduler.run() +.. method:: scheduler.run(blocking=True) - Run all scheduled events. This function will wait (using the :func:`delayfunc` + Run all scheduled events. This method will wait (using the :func:`delayfunc` function passed to the constructor) for the next event, then execute it and so on until there are no more scheduled events. + If *blocking* is false executes the scheduled events due to expire soonest + (if any) and then return the deadline of the next scheduled call in the + scheduler (if any). + Either *action* or *delayfunc* can raise an exception. In either case, the scheduler will maintain a consistent state and propagate the exception. If an exception is raised by *action*, the event will not be attempted in future calls @@ -127,8 +127,11 @@ Scheduler Objects the calling code is responsible for canceling events which are no longer pertinent. + .. versionadded:: 3.3 + *blocking* parameter was added. + .. attribute:: scheduler.queue Read-only attribute returning a list of upcoming events in the order they will be run. Each event is shown as a :term:`named tuple` with the - following fields: time, priority, action, argument. + following fields: time, priority, action, argument, kwargs. diff --git a/Doc/library/select.rst b/Doc/library/select.rst index a450ec2..b1fd3a8 100644 --- a/Doc/library/select.rst +++ b/Doc/library/select.rst @@ -6,7 +6,8 @@ This module provides access to the :c:func:`select` and :c:func:`poll` functions -available in most operating systems, :c:func:`epoll` available on Linux 2.5+ and +available in most operating systems, :c:func:`devpoll` available on +Solaris and derivatives, :c:func:`epoll` available on Linux 2.5+ and :c:func:`kqueue` available on most BSD. Note that on Windows, it only works for sockets; on other operating systems, it also works for other file types (in particular, on Unix, it works on pipes). @@ -18,17 +19,38 @@ The module defines the following: .. exception:: error - The exception raised when an error occurs. The accompanying value is a pair - containing the numeric error code from :c:data:`errno` and the corresponding - string, as would be printed by the C function :c:func:`perror`. + A deprecated alias of :exc:`OSError`. + .. versionchanged:: 3.3 + Following :pep:`3151`, this class was made an alias of :exc:`OSError`. -.. function:: epoll(sizehint=-1) - (Only supported on Linux 2.5.44 and newer.) Returns an edge polling object, - which can be used as Edge or Level Triggered interface for I/O events; see - section :ref:`epoll-objects` below for the methods supported by epolling - objects. +.. function:: devpoll() + + (Only supported on Solaris and derivatives.) Returns a ``/dev/poll`` + polling object; see section :ref:`devpoll-objects` below for the + methods supported by devpoll objects. + + :c:func:`devpoll` objects are linked to the number of file + descriptors allowed at the time of instantiation. If your program + reduces this value, :c:func:`devpoll` will fail. If your program + increases this value, :c:func:`devpoll` may return an + incomplete list of active file descriptors. + + .. versionadded:: 3.3 + +.. function:: epoll(sizehint=-1, flags=0) + + (Only supported on Linux 2.5.44 and newer.) Return an edge polling object, + which can be used as Edge or Level Triggered interface for I/O + events. *sizehint* is deprecated and completely ignored. *flags* can be set + to :const:`EPOLL_CLOEXEC`, which causes the epoll descriptor to be closed + automatically when :func:`os.execve` is called. See section + :ref:`epoll-objects` below for the methods supported by epolling objects. + + + .. versionchanged:: 3.3 + Added the *flags* parameter. .. function:: poll() @@ -56,7 +78,7 @@ The module defines the following: This is a straightforward interface to the Unix :c:func:`select` system call. The first three arguments are sequences of 'waitable objects': either integers representing file descriptors or objects with a parameterless method - named :meth:`fileno` returning such an integer: + named :meth:`~io.IOBase.fileno` returning such an integer: * *rlist*: wait until ready for reading * *wlist*: wait until ready for writing @@ -82,8 +104,8 @@ The module defines the following: objects <file object>` (e.g. ``sys.stdin``, or objects returned by :func:`open` or :func:`os.popen`), socket objects returned by :func:`socket.socket`. You may also define a :dfn:`wrapper` class yourself, - as long as it has an appropriate :meth:`fileno` method (that really returns - a file descriptor, not just a random integer). + as long as it has an appropriate :meth:`~io.IOBase.fileno` method (that + really returns a file descriptor, not just a random integer). .. note:: @@ -97,7 +119,7 @@ The module defines the following: .. attribute:: PIPE_BUF The minimum number of bytes which can be written without blocking to a pipe - when the pipe has been reported as ready for writing by :func:`select`, + when the pipe has been reported as ready for writing by :func:`~select.select`, :func:`poll` or another interface in this module. This doesn't apply to other kind of file-like objects such as sockets. @@ -106,6 +128,74 @@ The module defines the following: .. versionadded:: 3.2 +.. _devpoll-objects: + +``/dev/poll`` Polling Objects +---------------------------------------------- + + http://developers.sun.com/solaris/articles/using_devpoll.html + http://developers.sun.com/solaris/articles/polling_efficient.html + +Solaris and derivatives have ``/dev/poll``. While :c:func:`select` is +O(highest file descriptor) and :c:func:`poll` is O(number of file +descriptors), ``/dev/poll`` is O(active file descriptors). + +``/dev/poll`` behaviour is very close to the standard :c:func:`poll` +object. + + +.. method:: devpoll.register(fd[, eventmask]) + + Register a file descriptor with the polling object. Future calls to the + :meth:`poll` method will then check whether the file descriptor has any + pending I/O events. *fd* can be either an integer, or an object with a + :meth:`~io.IOBase.fileno` method that returns an integer. File objects + implement :meth:`!fileno`, so they can also be used as the argument. + + *eventmask* is an optional bitmask describing the type of events you want to + check for. The constants are the same that with :c:func:`poll` + object. The default value is a combination of the constants :const:`POLLIN`, + :const:`POLLPRI`, and :const:`POLLOUT`. + + .. warning:: + + Registering a file descriptor that's already registered is not an + error, but the result is undefined. The appropiate action is to + unregister or modify it first. This is an important difference + compared with :c:func:`poll`. + + +.. method:: devpoll.modify(fd[, eventmask]) + + This method does an :meth:`unregister` followed by a + :meth:`register`. It is (a bit) more efficient that doing the same + explicitly. + + +.. method:: devpoll.unregister(fd) + + Remove a file descriptor being tracked by a polling object. Just like the + :meth:`register` method, *fd* can be an integer or an object with a + :meth:`~io.IOBase.fileno` method that returns an integer. + + Attempting to remove a file descriptor that was never registered is + safely ignored. + + +.. method:: devpoll.poll([timeout]) + + Polls the set of registered file descriptors, and returns a possibly-empty list + containing ``(fd, event)`` 2-tuples for the descriptors that have events or + errors to report. *fd* is the file descriptor, and *event* is a bitmask with + bits set for the reported events for that descriptor --- :const:`POLLIN` for + waiting input, :const:`POLLOUT` to indicate that the descriptor can be written + to, and so forth. An empty list indicates that the call timed out and no file + descriptors had any events to report. If *timeout* is given, it specifies the + length of time in milliseconds which the system will wait for events before + returning. If *timeout* is omitted, -1, or :const:`None`, the call will + block until there is an event for this poll object. + + .. _epoll-objects: Edge and Level Trigger Polling (epoll) Objects @@ -165,11 +255,6 @@ Edge and Level Trigger Polling (epoll) Objects Register a fd descriptor with the epoll object. - .. note:: - - Registering a file descriptor that's already registered raises an - IOError -- contrary to :ref:`poll-objects`'s register. - .. method:: epoll.modify(fd, eventmask) @@ -203,10 +288,10 @@ linearly scanned again. :c:func:`select` is O(highest file descriptor), while .. method:: poll.register(fd[, eventmask]) Register a file descriptor with the polling object. Future calls to the - :meth:`poll` method will then check whether the file descriptor has any pending - I/O events. *fd* can be either an integer, or an object with a :meth:`fileno` - method that returns an integer. File objects implement :meth:`fileno`, so they - can also be used as the argument. + :meth:`poll` method will then check whether the file descriptor has any + pending I/O events. *fd* can be either an integer, or an object with a + :meth:`~io.IOBase.fileno` method that returns an integer. File objects + implement :meth:`!fileno`, so they can also be used as the argument. *eventmask* is an optional bitmask describing the type of events you want to check for, and can be a combination of the constants :const:`POLLIN`, @@ -245,7 +330,7 @@ linearly scanned again. :c:func:`select` is O(highest file descriptor), while Remove a file descriptor being tracked by a polling object. Just like the :meth:`register` method, *fd* can be an integer or an object with a - :meth:`fileno` method that returns an integer. + :meth:`~io.IOBase.fileno` method that returns an integer. Attempting to remove a file descriptor that was never registered causes a :exc:`KeyError` exception to be raised. @@ -305,8 +390,8 @@ http://www.freebsd.org/cgi/man.cgi?query=kqueue&sektion=2 Value used to identify the event. The interpretation depends on the filter but it's usually the file descriptor. In the constructor ident can either - be an int or an object with a fileno() function. kevent stores the integer - internally. + be an int or an object with a :meth:`~io.IOBase.fileno` method. kevent + stores the integer internally. .. attribute:: kevent.filter diff --git a/Doc/library/shelve.rst b/Doc/library/shelve.rst index 9d7d504..048f80a 100644 --- a/Doc/library/shelve.rst +++ b/Doc/library/shelve.rst @@ -103,8 +103,8 @@ Restrictions .. class:: Shelf(dict, protocol=None, writeback=False, keyencoding='utf-8') - A subclass of :class:`collections.MutableMapping` which stores pickled values - in the *dict* object. + A subclass of :class:`collections.abc.MutableMapping` which stores pickled + values in the *dict* object. By default, version 0 pickles are used to serialize values. The version of the pickle protocol can be specified with the *protocol* parameter. See the diff --git a/Doc/library/shlex.rst b/Doc/library/shlex.rst index 0113fb7..e40a10d 100644 --- a/Doc/library/shlex.rst +++ b/Doc/library/shlex.rst @@ -12,9 +12,9 @@ -------------- -The :class:`shlex` class makes it easy to write lexical analyzers for simple -syntaxes resembling that of the Unix shell. This will often be useful for -writing minilanguages, (for example, in run control files for Python +The :class:`~shlex.shlex` class makes it easy to write lexical analyzers for +simple syntaxes resembling that of the Unix shell. This will often be useful +for writing minilanguages, (for example, in run control files for Python applications) or for parsing quoted strings. The :mod:`shlex` module defines the following functions: @@ -24,32 +24,69 @@ The :mod:`shlex` module defines the following functions: Split the string *s* using shell-like syntax. If *comments* is :const:`False` (the default), the parsing of comments in the given string will be disabled - (setting the :attr:`commenters` attribute of the :class:`shlex` instance to - the empty string). This function operates in POSIX mode by default, but uses - non-POSIX mode if the *posix* argument is false. + (setting the :attr:`~shlex.commenters` attribute of the + :class:`~shlex.shlex` instance to the empty string). This function operates + in POSIX mode by default, but uses non-POSIX mode if the *posix* argument is + false. .. note:: - Since the :func:`split` function instantiates a :class:`shlex` instance, - passing ``None`` for *s* will read the string to split from standard - input. + Since the :func:`split` function instantiates a :class:`~shlex.shlex` + instance, passing ``None`` for *s* will read the string to split from + standard input. + + +.. function:: quote(s) + + Return a shell-escaped version of the string *s*. The returned value is a + string that can safely be used as one token in a shell command line, for + cases where you cannot use a list. + + This idiom would be unsafe:: + + >>> filename = 'somefile; rm -rf ~' + >>> command = 'ls -l {}'.format(filename) + >>> print(command) # executed by a shell: boom! + ls -l somefile; rm -rf ~ + + :func:`quote` lets you plug the security hole:: + + >>> command = 'ls -l {}'.format(quote(filename)) + >>> print(command) + ls -l 'somefile; rm -rf ~' + >>> remote_command = 'ssh home {}'.format(quote(command)) + >>> print(remote_command) + ssh home 'ls -l '"'"'somefile; rm -rf ~'"'"'' + + The quoting is compatible with UNIX shells and with :func:`split`: + + >>> remote_command = split(remote_command) + >>> remote_command + ['ssh', 'home', "ls -l 'somefile; rm -rf ~'"] + >>> command = split(remote_command[-1]) + >>> command + ['ls', '-l', 'somefile; rm -rf ~'] + + .. versionadded:: 3.3 The :mod:`shlex` module defines the following class: .. class:: shlex(instream=None, infile=None, posix=False) - A :class:`shlex` instance or subclass instance is a lexical analyzer object. - The initialization argument, if present, specifies where to read characters - from. It must be a file-/stream-like object with :meth:`read` and - :meth:`readline` methods, or a string. If no argument is given, input will - be taken from ``sys.stdin``. The second optional argument is a filename - string, which sets the initial value of the :attr:`infile` attribute. If the - *instream* argument is omitted or equal to ``sys.stdin``, this second - argument defaults to "stdin". The *posix* argument defines the operational - mode: when *posix* is not true (default), the :class:`shlex` instance will - operate in compatibility mode. When operating in POSIX mode, :class:`shlex` - will try to be as close as possible to the POSIX shell parsing rules. + A :class:`~shlex.shlex` instance or subclass instance is a lexical analyzer + object. The initialization argument, if present, specifies where to read + characters from. It must be a file-/stream-like object with + :meth:`~io.TextIOBase.read` and :meth:`~io.TextIOBase.readline` methods, or + a string. If no argument is given, input will be taken from ``sys.stdin``. + The second optional argument is a filename string, which sets the initial + value of the :attr:`~shlex.infile` attribute. If the *instream* + argument is omitted or equal to ``sys.stdin``, this second argument + defaults to "stdin". The *posix* argument defines the operational mode: + when *posix* is not true (default), the :class:`~shlex.shlex` instance will + operate in compatibility mode. When operating in POSIX mode, + :class:`~shlex.shlex` will try to be as close as possible to the POSIX shell + parsing rules. .. seealso:: @@ -63,14 +100,14 @@ The :mod:`shlex` module defines the following class: shlex Objects ------------- -A :class:`shlex` instance has the following methods: +A :class:`~shlex.shlex` instance has the following methods: .. method:: shlex.get_token() Return a token. If tokens have been stacked using :meth:`push_token`, pop a token off the stack. Otherwise, read one from the input stream. If reading - encounters an immediate end-of-file, :attr:`self.eof` is returned (the empty + encounters an immediate end-of-file, :attr:`eof` is returned (the empty string (``''``) in non-POSIX mode, and ``None`` in POSIX mode). @@ -88,9 +125,9 @@ A :class:`shlex` instance has the following methods: .. method:: shlex.sourcehook(filename) - When :class:`shlex` detects a source request (see :attr:`source` below) this - method is given the following token as argument, and expected to return a tuple - consisting of a filename and an open file-like object. + When :class:`~shlex.shlex` detects a source request (see :attr:`source` + below) this method is given the following token as argument, and expected + to return a tuple consisting of a filename and an open file-like object. Normally, this method first strips any quotes off the argument. If the result is an absolute pathname, or there was no previous source request in effect, or @@ -107,8 +144,9 @@ A :class:`shlex` instance has the following methods: This hook is exposed so that you can use it to implement directory search paths, addition of file extensions, and other namespace hacks. There is no - corresponding 'close' hook, but a shlex instance will call the :meth:`close` - method of the sourced input stream when it returns EOF. + corresponding 'close' hook, but a shlex instance will call the + :meth:`~io.IOBase.close` method of the sourced input stream when it returns + EOF. For more explicit control of source stacking, use the :meth:`push_source` and :meth:`pop_source` methods. @@ -138,8 +176,8 @@ A :class:`shlex` instance has the following methods: messages in the standard, parseable format understood by Emacs and other Unix tools. -Instances of :class:`shlex` subclasses have some public instance variables which -either control lexical analysis or can be used for debugging: +Instances of :class:`~shlex.shlex` subclasses have some public instance +variables which either control lexical analysis or can be used for debugging: .. attribute:: shlex.commenters @@ -184,8 +222,8 @@ either control lexical analysis or can be used for debugging: .. attribute:: shlex.whitespace_split If ``True``, tokens will only be split in whitespaces. This is useful, for - example, for parsing command lines with :class:`shlex`, getting tokens in a - similar way to shell arguments. + example, for parsing command lines with :class:`~shlex.shlex`, getting + tokens in a similar way to shell arguments. .. attribute:: shlex.infile @@ -197,7 +235,8 @@ either control lexical analysis or can be used for debugging: .. attribute:: shlex.instream - The input stream from which this :class:`shlex` instance is reading characters. + The input stream from which this :class:`~shlex.shlex` instance is reading + characters. .. attribute:: shlex.source @@ -206,16 +245,16 @@ either control lexical analysis or can be used for debugging: string will be recognized as a lexical-level inclusion request similar to the ``source`` keyword in various shells. That is, the immediately following token will opened as a filename and input taken from that stream until EOF, at which - point the :meth:`close` method of that stream will be called and the input - source will again become the original input stream. Source requests may be - stacked any number of levels deep. + point the :meth:`~io.IOBase.close` method of that stream will be called and + the input source will again become the original input stream. Source + requests may be stacked any number of levels deep. .. attribute:: shlex.debug - If this attribute is numeric and ``1`` or more, a :class:`shlex` instance will - print verbose progress output on its behavior. If you need to use this, you can - read the module source code to learn the details. + If this attribute is numeric and ``1`` or more, a :class:`~shlex.shlex` + instance will print verbose progress output on its behavior. If you need + to use this, you can read the module source code to learn the details. .. attribute:: shlex.lineno @@ -239,7 +278,7 @@ either control lexical analysis or can be used for debugging: Parsing Rules ------------- -When operating in non-POSIX mode, :class:`shlex` will try to obey to the +When operating in non-POSIX mode, :class:`~shlex.shlex` will try to obey to the following rules. * Quote characters are not recognized within words (``Do"Not"Separate`` is @@ -253,16 +292,17 @@ following rules. * Closing quotes separate words (``"Do"Separate`` is parsed as ``"Do"`` and ``Separate``); -* If :attr:`whitespace_split` is ``False``, any character not declared to be a - word character, whitespace, or a quote will be returned as a single-character - token. If it is ``True``, :class:`shlex` will only split words in whitespaces; +* If :attr:`~shlex.whitespace_split` is ``False``, any character not + declared to be a word character, whitespace, or a quote will be returned as + a single-character token. If it is ``True``, :class:`~shlex.shlex` will only + split words in whitespaces; * EOF is signaled with an empty string (``''``); * It's not possible to parse empty strings, even if quoted. -When operating in POSIX mode, :class:`shlex` will try to obey to the following -parsing rules. +When operating in POSIX mode, :class:`~shlex.shlex` will try to obey to the +following parsing rules. * Quotes are stripped out, and do not separate words (``"Do"Not"Separate"`` is parsed as the single word ``DoNotSeparate``); @@ -270,17 +310,18 @@ parsing rules. * Non-quoted escape characters (e.g. ``'\'``) preserve the literal value of the next character that follows; -* Enclosing characters in quotes which are not part of :attr:`escapedquotes` - (e.g. ``"'"``) preserve the literal value of all characters within the quotes; +* Enclosing characters in quotes which are not part of + :attr:`~shlex.escapedquotes` (e.g. ``"'"``) preserve the literal value + of all characters within the quotes; -* Enclosing characters in quotes which are part of :attr:`escapedquotes` (e.g. - ``'"'``) preserves the literal value of all characters within the quotes, with - the exception of the characters mentioned in :attr:`escape`. The escape - characters retain its special meaning only when followed by the quote in use, or - the escape character itself. Otherwise the escape character will be considered a +* Enclosing characters in quotes which are part of + :attr:`~shlex.escapedquotes` (e.g. ``'"'``) preserves the literal value + of all characters within the quotes, with the exception of the characters + mentioned in :attr:`~shlex.escape`. The escape characters retain its + special meaning only when followed by the quote in use, or the escape + character itself. Otherwise the escape character will be considered a normal character. * EOF is signaled with a :const:`None` value; -* Quoted empty strings (``''``) are allowed; - +* Quoted empty strings (``''``) are allowed. diff --git a/Doc/library/shutil.rst b/Doc/library/shutil.rst index 18f6485..a1457d0 100644 --- a/Doc/library/shutil.rst +++ b/Doc/library/shutil.rst @@ -47,45 +47,129 @@ Directory and files operations be copied. -.. function:: copyfile(src, dst) +.. function:: copyfile(src, dst, *, follow_symlinks=True) Copy the contents (no metadata) of the file named *src* to a file named - *dst*. *dst* must be the complete target file name; look at - :func:`shutil.copy` for a copy that accepts a target directory path. If - *src* and *dst* are the same files, :exc:`Error` is raised. - The destination location must be writable; otherwise, an :exc:`IOError` exception - will be raised. If *dst* already exists, it will be replaced. Special files - such as character or block devices and pipes cannot be copied with this - function. *src* and *dst* are path names given as strings. + *dst* and return *dst*. *src* and *dst* are path names given as strings. + *dst* must be the complete target file name; look at :func:`shutil.copy` + for a copy that accepts a target directory path. If *src* and *dst* + specify the same file, :exc:`Error` is raised. + The destination location must be writable; otherwise, an :exc:`OSError` + exception will be raised. If *dst* already exists, it will be replaced. + Special files such as character or block devices and pipes cannot be + copied with this function. -.. function:: copymode(src, dst) + If *follow_symlinks* is false and *src* is a symbolic link, + a new symbolic link will be created instead of copying the + file *src* points to. + + .. versionchanged:: 3.3 + :exc:`IOError` used to be raised instead of :exc:`OSError`. + Added *follow_symlinks* argument. + Now returns *dst*. + +.. function:: copymode(src, dst, *, follow_symlinks=True) Copy the permission bits from *src* to *dst*. The file contents, owner, and group are unaffected. *src* and *dst* are path names given as strings. + If *follow_symlinks* is false, and both *src* and *dst* are symbolic links, + :func:`copymode` will attempt to modify the mode of *dst* itself (rather + than the file it points to). This functionality is not available on every + platform; please see :func:`copystat` for more information. If + :func:`copymode` cannot modify symbolic links on the local platform, and it + is asked to do so, it will do nothing and return. + + .. versionchanged:: 3.3 + Added *follow_symlinks* argument. + +.. function:: copystat(src, dst, *, follow_symlinks=True) + + Copy the permission bits, last access time, last modification time, and + flags from *src* to *dst*. On Linux, :func:`copystat` also copies the + "extended attributes" where possible. The file contents, owner, and + group are unaffected. *src* and *dst* are path names given as strings. + + If *follow_symlinks* is false, and *src* and *dst* both + refer to symbolic links, :func:`copystat` will operate on + the symbolic links themselves rather than the files the + symbolic links refer to--reading the information from the + *src* symbolic link, and writing the information to the + *dst* symbolic link. + + .. note:: + + Not all platforms provide the ability to examine and + modify symbolic links. Python itself can tell you what + functionality is locally available. + * If ``os.chmod in os.supports_follow_symlinks`` is + ``True``, :func:`copystat` can modify the permission + bits of a symbolic link. -.. function:: copystat(src, dst) + * If ``os.utime in os.supports_follow_symlinks`` is + ``True``, :func:`copystat` can modify the last access + and modification times of a symbolic link. - Copy the permission bits, last access time, last modification time, and flags - from *src* to *dst*. The file contents, owner, and group are unaffected. *src* - and *dst* are path names given as strings. + * If ``os.chflags in os.supports_follow_symlinks`` is + ``True``, :func:`copystat` can modify the flags of + a symbolic link. (``os.chflags`` is not available on + all platforms.) + On platforms where some or all of this functionality + is unavailable, when asked to modify a symbolic link, + :func:`copystat` will copy everything it can. + :func:`copystat` never returns failure. -.. function:: copy(src, dst) + Please see :data:`os.supports_follow_symlinks` + for more information. - Copy the file *src* to the file or directory *dst*. If *dst* is a directory, a - file with the same basename as *src* is created (or overwritten) in the - directory specified. Permission bits are copied. *src* and *dst* are path - names given as strings. + .. versionchanged:: 3.3 + Added *follow_symlinks* argument and support for Linux extended attributes. +.. function:: copy(src, dst, *, follow_symlinks=True) -.. function:: copy2(src, dst) + Copies the file *src* to the file or directory *dst*. *src* and *dst* + should be strings. If *dst* specifies a directory, the file will be + copied into *dst* using the base filename from *src*. Returns the + path to the newly created file. - Similar to :func:`shutil.copy`, but metadata is copied as well -- in fact, - this is just :func:`shutil.copy` followed by :func:`copystat`. This is - similar to the Unix command :program:`cp -p`. + If *follow_symlinks* is false, and *src* is a symbolic link, + *dst* will be created as a symbolic link. If *follow_symlinks* + is true and *src* is a symbolic link, *dst* will be a copy of + the file *src* refers to. + :func:`copy` copies the file data and the file's permission + mode (see :func:`os.chmod`). Other metadata, like the + file's creation and modification times, is not preserved. + To preserve all file metadata from the original, use + :func:`~shutil.copy2` instead. + + .. versionchanged:: 3.3 + Added *follow_symlinks* argument. + Now returns path to the newly created file. + +.. function:: copy2(src, dst, *, follow_symlinks=True) + + Identical to :func:`~shutil.copy` except that :func:`copy2` + also attempts to preserve all file metadata. + + When *follow_symlinks* is false, and *src* is a symbolic + link, :func:`copy2` attempts to copy all metadata from the + *src* symbolic link to the newly-created *dst* symbolic link. + However, this functionality is not available on all platforms. + On platforms where some or all of this functionality is + unavailable, :func:`copy2` will preserve all the metadata + it can; :func:`copy2` never returns failure. + + :func:`copy2` uses :func:`copystat` to copy the file metadata. + Please see :func:`copystat` for more information + about platform support for modifying symbolic link metadata. + + .. versionchanged:: 3.3 + Added *follow_symlinks* argument, try to copy extended + file system attributes too (currently Linux only). + Now returns path to the newly created file. .. function:: ignore_patterns(\*patterns) @@ -96,16 +180,17 @@ Directory and files operations .. function:: copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2, ignore_dangling_symlinks=False) - Recursively copy an entire directory tree rooted at *src*. The destination + Recursively copy an entire directory tree rooted at *src*, returning the + destination directory. The destination directory, named by *dst*, must not already exist; it will be created as well as missing parent directories. Permissions and times of directories are copied with :func:`copystat`, individual files are copied using :func:`shutil.copy2`. If *symlinks* is true, symbolic links in the source tree are represented as - symbolic links in the new tree, but the metadata of the original links is NOT - copied; if false or omitted, the contents and metadata of the linked files - are copied to the new tree. + symbolic links in the new tree and the metadata of the original links will + be copied as far as the platform allows; if false or omitted, the contents + and metadata of the linked files are copied to the new tree. When *symlinks* is false, if the file pointed by the symlink doesn't exist, a exception will be added in the list of errors raised in @@ -129,13 +214,15 @@ Directory and files operations If *copy_function* is given, it must be a callable that will be used to copy each file. It will be called with the source path and the destination path as arguments. By default, :func:`shutil.copy2` is used, but any function - that supports the same signature (like :func:`copy`) can be used. + that supports the same signature (like :func:`shutil.copy`) can be used. + + .. versionchanged:: 3.3 + Copy metadata when *symlinks* is false. + Now returns *dst*. .. versionchanged:: 3.2 Added the *copy_function* argument to be able to provide a custom copy function. - - .. versionchanged:: 3.2 Added the *ignore_dangling_symlinks* argument to silent dangling symlinks errors when *symlinks* is false. @@ -150,19 +237,42 @@ Directory and files operations handled by calling a handler specified by *onerror* or, if that is omitted, they raise an exception. + .. note:: + + On platforms that support the necessary fd-based functions a symlink + attack resistant version of :func:`rmtree` is used by default. On other + platforms, the :func:`rmtree` implementation is susceptible to a symlink + attack: given proper timing and circumstances, attackers can manipulate + symlinks on the filesystem to delete files they wouldn't be able to access + otherwise. Applications can use the :data:`rmtree.avoids_symlink_attacks` + function attribute to determine which case applies. + If *onerror* is provided, it must be a callable that accepts three - parameters: *function*, *path*, and *excinfo*. The first parameter, - *function*, is the function which raised the exception; it will be - :func:`os.path.islink`, :func:`os.listdir`, :func:`os.remove` or - :func:`os.rmdir`. The second parameter, *path*, will be the path name passed - to *function*. The third parameter, *excinfo*, will be the exception - information return by :func:`sys.exc_info`. Exceptions raised by *onerror* - will not be caught. + parameters: *function*, *path*, and *excinfo*. + + The first parameter, *function*, is the function which raised the exception; + it depends on the platform and implementation. The second parameter, + *path*, will be the path name passed to *function*. The third parameter, + *excinfo*, will be the exception information returned by + :func:`sys.exc_info`. Exceptions raised by *onerror* will not be caught. + + .. versionchanged:: 3.3 + Added a symlink attack resistant version that is used automatically + if platform supports fd-based functions. + + .. attribute:: rmtree.avoids_symlink_attacks + + Indicates whether the current platform and implementation provides a + symlink attack resistant version of :func:`rmtree`. Currently this is + only true for platforms supporting fd-based directory access functions. + + .. versionadded:: 3.3 .. function:: move(src, dst) - Recursively move a file or directory (*src*) to another location (*dst*). + Recursively move a file or directory (*src*) to another location (*dst*) + and return the destination. If the destination is a directory or a symlink to a directory, then *src* is moved inside that directory. @@ -173,7 +283,61 @@ Directory and files operations If the destination is on the current filesystem, then :func:`os.rename` is used. Otherwise, *src* is copied (using :func:`shutil.copy2`) to *dst* and - then removed. + then removed. In case of symlinks, a new symlink pointing to the target of + *src* will be created in or as *dst* and *src* will be removed. + + .. versionchanged:: 3.3 + Added explicit symlink handling for foreign filesystems, thus adapting + it to the behavior of GNU's :program:`mv`. + Now returns *dst*. + +.. function:: disk_usage(path) + + Return disk usage statistics about the given path as a :term:`named tuple` + with the attributes *total*, *used* and *free*, which are the amount of + total, used and free space, in bytes. + + .. versionadded:: 3.3 + + Availability: Unix, Windows. + +.. function:: chown(path, user=None, group=None) + + Change owner *user* and/or *group* of the given *path*. + + *user* can be a system user name or a uid; the same applies to *group*. At + least one argument is required. + + See also :func:`os.chown`, the underlying function. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: which(cmd, mode=os.F_OK | os.X_OK, path=None) + + Return the path to an executable which would be run if the given *cmd* was + called. If no *cmd* would be called, return ``None``. + + *mode* is a permission mask passed a to :func:`os.access`, by default + determining if the file exists and executable. + + When no *path* is specified, the results of :func:`os.environ` are used, + returning either the "PATH" value or a fallback of :attr:`os.defpath`. + + On Windows, the current directory is always prepended to the *path* whether + or not you use the default or provide your own, which is the behavior the + command shell uses when finding executables. Additionaly, when finding the + *cmd* in the *path*, the ``PATHEXT`` environment variable is checked. For + example, if you call ``shutil.which("python")``, :func:`which` will search + ``PATHEXT`` to know that it should look for ``python.exe`` within the *path* + directories. For example, on Windows:: + + >>> shutil.which("python") + 'C:\\Python33\\python.EXE' + + .. versionadded:: 3.3 .. exception:: Error @@ -186,7 +350,7 @@ Directory and files operations .. _shutil-copytree-example: copytree example -:::::::::::::::: +~~~~~~~~~~~~~~~~ This example is the implementation of the :func:`copytree` function, described above, with the docstring omitted. It demonstrates many of the other functions @@ -208,7 +372,7 @@ provided by this module. :: else: copy2(srcname, dstname) # XXX What about devices, sockets etc.? - except (IOError, os.error) as why: + except OSError as why: errors.append((srcname, dstname, str(why))) # catch the Error from the recursive copytree so that we can # continue with other files @@ -250,6 +414,8 @@ Another example that uses the *ignore* argument to add a logging call:: Archiving operations -------------------- +.. versionadded:: 3.2 + High-level utilities to create and read compressed and archived files are also provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules. @@ -277,8 +443,6 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules. *logger* must be an object compatible with :pep:`282`, usually an instance of :class:`logging.Logger`. - .. versionadded:: 3.2 - .. function:: get_archive_formats() @@ -295,8 +459,6 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules. You can register new formats or provide your own archiver for any existing formats, by using :func:`register_archive_format`. - .. versionadded:: 3.2 - .. function:: register_archive_format(name, function, [extra_args, [description]]) @@ -309,15 +471,11 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules. *description* is used by :func:`get_archive_formats` which returns the list of archivers. Defaults to an empty list. - .. versionadded:: 3.2 - .. function:: unregister_archive_format(name) Remove the archive format *name* from the list of supported formats. - .. versionadded:: 3.2 - .. function:: unpack_archive(filename[, extract_dir[, format]]) @@ -332,8 +490,6 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules. and see if an unpacker was registered for that extension. In case none is found, a :exc:`ValueError` is raised. - .. versionadded:: 3.2 - .. function:: register_unpack_format(name, extensions, function[, extra_args[, description]]) @@ -351,15 +507,11 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules. *description* can be provided to describe the format, and will be returned by the :func:`get_unpack_formats` function. - .. versionadded:: 3.2 - .. function:: unregister_unpack_format(name) Unregister an unpack format. *name* is the name of the format. - .. versionadded:: 3.2 - .. function:: get_unpack_formats() @@ -377,13 +529,11 @@ provided. They rely on the :mod:`zipfile` and :mod:`tarfile` modules. You can register new formats or provide your own unpacker for any existing formats, by using :func:`register_unpack_format`. - .. versionadded:: 3.2 - .. _shutil-archiving-example: Archiving example -::::::::::::::::: +~~~~~~~~~~~~~~~~~ In this example, we create a gzip'ed tar-file archive containing all files found in the :file:`.ssh` directory of the user:: @@ -406,3 +556,36 @@ The resulting archive contains:: -rw------- tarek/staff 1675 2008-06-09 13:26:54 ./id_rsa -rw-r--r-- tarek/staff 397 2008-06-09 13:26:54 ./id_rsa.pub -rw-r--r-- tarek/staff 37192 2010-02-06 18:23:10 ./known_hosts + + +Querying the size of the output terminal +---------------------------------------- + +.. versionadded:: 3.3 + +.. function:: get_terminal_size(fallback=(columns, lines)) + + Get the size of the terminal window. + + For each of the two dimensions, the environment variable, ``COLUMNS`` + and ``LINES`` respectively, is checked. If the variable is defined and + the value is a positive integer, it is used. + + When ``COLUMNS`` or ``LINES`` is not defined, which is the common case, + the terminal connected to :data:`sys.__stdout__` is queried + by invoking :func:`os.get_terminal_size`. + + If the terminal size cannot be successfully queried, either because + the system doesn't support querying, or because we are not + connected to a terminal, the value given in ``fallback`` parameter + is used. ``fallback`` defaults to ``(80, 24)`` which is the default + size used by many terminal emulators. + + The value returned is a named tuple of type :class:`os.terminal_size`. + + See also: The Single UNIX Specification, Version 2, + `Other Environment Variables`_. + +.. _`Other Environment Variables`: + http://pubs.opengroup.org/onlinepubs/7908799/xbd/envvar.html#tag_002_003 + diff --git a/Doc/library/signal.rst b/Doc/library/signal.rst index d1cae13..84e2836 100644 --- a/Doc/library/signal.rst +++ b/Doc/library/signal.rst @@ -36,7 +36,11 @@ at a later point(for example at the next :term:`bytecode` instruction). This has consequences: * It makes little sense to catch synchronous errors like :const:`SIGFPE` or - :const:`SIGSEGV`. + :const:`SIGSEGV` that are caused by an invalid operation in C code. Python + will return from the signal handler to the C code, which is likely to raise + the same signal again, causing Python to apparently hang. From Python 3.3 + onwards, you can use the :mod:`faulthandler` module to report on synchronous + errors. * A long-running calculation implemented purely in C (such as regular expression matching on a large body of text) may run uninterrupted for an @@ -44,6 +48,9 @@ This has consequences: signal handlers will be called when the calculation finishes. +.. _signals-and-threads: + + Signals and threads ^^^^^^^^^^^^^^^^^^^ @@ -131,6 +138,28 @@ The variables defined in the :mod:`signal` module are: in user and kernel space. SIGPROF is delivered upon expiration. +.. data:: SIG_BLOCK + + A possible value for the *how* parameter to :func:`pthread_sigmask` + indicating that signals are to be blocked. + + .. versionadded:: 3.3 + +.. data:: SIG_UNBLOCK + + A possible value for the *how* parameter to :func:`pthread_sigmask` + indicating that signals are to be unblocked. + + .. versionadded:: 3.3 + +.. data:: SIG_SETMASK + + A possible value for the *how* parameter to :func:`pthread_sigmask` + indicating that the signal mask is to be replaced. + + .. versionadded:: 3.3 + + The :mod:`signal` module defines one exception: .. exception:: ItimerError @@ -138,7 +167,11 @@ The :mod:`signal` module defines one exception: Raised to signal an error from the underlying :func:`setitimer` or :func:`getitimer` implementation. Expect this error if an invalid interval timer or a negative time is passed to :func:`setitimer`. - This error is a subtype of :exc:`IOError`. + This error is a subtype of :exc:`OSError`. + + .. versionadded:: 3.3 + This error used to be a subtype of :exc:`IOError`, which is now an + alias of :exc:`OSError`. The :mod:`signal` module defines the following functions: @@ -172,6 +205,65 @@ The :mod:`signal` module defines the following functions: will then be called. Returns nothing. Not on Windows. (See the Unix man page :manpage:`signal(2)`.) + See also :func:`sigwait`, :func:`sigwaitinfo`, :func:`sigtimedwait` and + :func:`sigpending`. + + +.. function:: pthread_kill(thread_id, signum) + + Send the signal *signum* to the thread *thread_id*, another thread in the + same process as the caller. The target thread can be executing any code + (Python or not). However, if the target thread is executing the Python + interpreter, the Python signal handlers will be :ref:`executed by the main + thread <signals-and-threads>`. Therefore, the only point of sending a signal to a particular + Python thread would be to force a running system call to fail with + :exc:`InterruptedError`. + + Use :func:`threading.get_ident()` or the :attr:`~threading.Thread.ident` + attribute of :class:`threading.Thread` objects to get a suitable value + for *thread_id*. + + If *signum* is 0, then no signal is sent, but error checking is still + performed; this can be used to check if the target thread is still running. + + Availability: Unix (see the man page :manpage:`pthread_kill(3)` for further + information). + + See also :func:`os.kill`. + + .. versionadded:: 3.3 + + +.. function:: pthread_sigmask(how, mask) + + Fetch and/or change the signal mask of the calling thread. The signal mask + is the set of signals whose delivery is currently blocked for the caller. + Return the old signal mask as a set of signals. + + The behavior of the call is dependent on the value of *how*, as follows. + + * :data:`SIG_BLOCK`: The set of blocked signals is the union of the current + set and the *mask* argument. + * :data:`SIG_UNBLOCK`: The signals in *mask* are removed from the current + set of blocked signals. It is permissible to attempt to unblock a + signal which is not blocked. + * :data:`SIG_SETMASK`: The set of blocked signals is set to the *mask* + argument. + + *mask* is a set of signal numbers (e.g. {:const:`signal.SIGINT`, + :const:`signal.SIGTERM`}). Use ``range(1, signal.NSIG)`` for a full mask + including all signals. + + For example, ``signal.pthread_sigmask(signal.SIG_BLOCK, [])`` reads the + signal mask of the calling thread. + + Availability: Unix. See the man page :manpage:`sigprocmask(3)` and + :manpage:`pthread_sigmask(3)` for further information. + + See also :func:`pause`, :func:`sigpending` and :func:`sigwait`. + + .. versionadded:: 3.3 + .. function:: setitimer(which, seconds[, interval]) @@ -201,13 +293,17 @@ The :mod:`signal` module defines the following functions: .. function:: set_wakeup_fd(fd) - Set the wakeup fd to *fd*. When a signal is received, a ``'\0'`` byte is - written to the fd. This can be used by a library to wakeup a poll or select - call, allowing the signal to be fully processed. + Set the wakeup file descriptor to *fd*. When a signal is received, the + signal number is written as a single byte into the fd. This can be used by + a library to wakeup a poll or select call, allowing the signal to be fully + processed. The old wakeup fd is returned. *fd* must be non-blocking. It is up to the library to remove any bytes before calling poll or select again. + Use for example ``struct.unpack('%uB' % len(data), data)`` to decode the + signal numbers list. + When threads are enabled, this function can only be called from the main thread; attempting to call it from other threads will cause a :exc:`ValueError` exception to be raised. @@ -247,6 +343,73 @@ The :mod:`signal` module defines the following functions: :const:`SIGTERM`. A :exc:`ValueError` will be raised in any other case. +.. function:: sigpending() + + Examine the set of signals that are pending for delivery to the calling + thread (i.e., the signals which have been raised while blocked). Return the + set of the pending signals. + + Availability: Unix (see the man page :manpage:`sigpending(2)` for further + information). + + See also :func:`pause`, :func:`pthread_sigmask` and :func:`sigwait`. + + .. versionadded:: 3.3 + + +.. function:: sigwait(sigset) + + Suspend execution of the calling thread until the delivery of one of the + signals specified in the signal set *sigset*. The function accepts the signal + (removes it from the pending list of signals), and returns the signal number. + + Availability: Unix (see the man page :manpage:`sigwait(3)` for further + information). + + See also :func:`pause`, :func:`pthread_sigmask`, :func:`sigpending`, + :func:`sigwaitinfo` and :func:`sigtimedwait`. + + .. versionadded:: 3.3 + + +.. function:: sigwaitinfo(sigset) + + Suspend execution of the calling thread until the delivery of one of the + signals specified in the signal set *sigset*. The function accepts the + signal and removes it from the pending list of signals. If one of the + signals in *sigset* is already pending for the calling thread, the function + will return immediately with information about that signal. The signal + handler is not called for the delivered signal. The function raises an + :exc:`InterruptedError` if it is interrupted by a signal that is not in + *sigset*. + + The return value is an object representing the data contained in the + :c:type:`siginfo_t` structure, namely: :attr:`si_signo`, :attr:`si_code`, + :attr:`si_errno`, :attr:`si_pid`, :attr:`si_uid`, :attr:`si_status`, + :attr:`si_band`. + + Availability: Unix (see the man page :manpage:`sigwaitinfo(2)` for further + information). + + See also :func:`pause`, :func:`sigwait` and :func:`sigtimedwait`. + + .. versionadded:: 3.3 + + +.. function:: sigtimedwait(sigset, timeout) + + Like :func:`sigwaitinfo`, but takes an additional *timeout* argument + specifying a timeout. If *timeout* is specified as :const:`0`, a poll is + performed. Returns :const:`None` if a timeout occurs. + + Availability: Unix (see the man page :manpage:`sigtimedwait(2)` for further + information). + + See also :func:`pause`, :func:`sigwait` and :func:`sigwaitinfo`. + + .. versionadded:: 3.3 + + .. _signal-example: Example @@ -263,7 +426,7 @@ be sent, and the handler raises an exception. :: def handler(signum, frame): print('Signal handler called with signal', signum) - raise IOError("Couldn't open device!") + raise OSError("Couldn't open device!") # Set the signal handler and a 5-second alarm signal.signal(signal.SIGALRM, handler) diff --git a/Doc/library/site.rst b/Doc/library/site.rst index 579571a..36b80c3 100644 --- a/Doc/library/site.rst +++ b/Doc/library/site.rst @@ -16,7 +16,14 @@ import can be suppressed using the interpreter's :option:`-S` option. .. index:: triple: module; search; path Importing this module will append site-specific paths to the module search path -and add a few builtins. +and add a few builtins, unless :option:`-S` was used. In that case, this module +can be safely imported with no automatic modifications to the module search path +or additions to the builtins. To explicitly trigger the usual site-specific +additions, call the :func:`site.main` function. + +.. versionchanged:: 3.3 + Importing the module used to trigger paths manipulation even when using + :option:`-S`. .. index:: pair: site-python; directory @@ -31,6 +38,15 @@ Unix and Macintosh). For each of the distinct head-tail combinations, it sees if it refers to an existing directory, and if so, adds it to ``sys.path`` and also inspects the newly added path for configuration files. +If a file named "pyvenv.cfg" exists one directory above sys.executable, +sys.prefix and sys.exec_prefix are set to that directory and +it is also checked for site-packages and site-python (sys.base_prefix and +sys.base_exec_prefix will always be the "real" prefixes of the Python +installation). If "pyvenv.cfg" (a bootstrap configuration file) contains +the key "include-system-site-packages" set to anything other than "false" +(case-insensitive), the system-level prefixes will still also be +searched for site-packages; otherwise they won't. + A path configuration file is a file whose name has the form :file:`{name}.pth` and exists in one of the four directories mentioned above; its contents are additional items (one per line) to be added to ``sys.path``. Non-existing items @@ -129,8 +145,19 @@ empty, and the path manipulations are skipped; however the import of :file:`~/Library/Python/{X.Y}` for Mac framework builds, and :file:`{%APPDATA%}\\Python` for Windows. This value is used by Distutils to compute the installation directories for scripts, data files, Python modules, - etc. for the :ref:`user installation scheme <inst-alt-install-user>`. See - also :envvar:`PYTHONUSERBASE`. + etc. for the :ref:`user installation scheme <inst-alt-install-user>`. + See also :envvar:`PYTHONUSERBASE`. + + +.. function:: main() + + Adds all the standard site-specific directories to the module search + path. This function is called automatically when this module is imported, + unless the :program:`python` interpreter was started with the :option:`-S` + flag. + + .. versionchanged:: 3.3 + This function used to be called unconditionnally. .. function:: addsitedir(sitedir, known_paths=None) diff --git a/Doc/library/smtpd.rst b/Doc/library/smtpd.rst index bfdc727..f305818 100644 --- a/Doc/library/smtpd.rst +++ b/Doc/library/smtpd.rst @@ -4,7 +4,7 @@ .. module:: smtpd :synopsis: A SMTP server implementation in Python. -.. moduleauthor:: Barry Warsaw <barry@zope.com> +.. moduleauthor:: Barry Warsaw <barry@python.org> .. sectionauthor:: Moshe Zadka <moshez@moshez.org> **Source code:** :source:`Lib/smtpd.py` @@ -20,17 +20,24 @@ specific mail-sending strategies. Additionally the SMTPChannel may be extended to implement very specific interaction behaviour with SMTP clients. +The code supports :RFC:`5321`, plus the :rfc:`1870` SIZE extension. + + SMTPServer Objects ------------------ -.. class:: SMTPServer(localaddr, remoteaddr) +.. class:: SMTPServer(localaddr, remoteaddr, data_size_limit=33554432) Create a new :class:`SMTPServer` object, which binds to local address *localaddr*. It will treat *remoteaddr* as an upstream SMTP relayer. It inherits from :class:`asyncore.dispatcher`, and so will insert itself into :mod:`asyncore`'s event loop on instantiation. + *data_size_limit* specifies the maximum number of bytes that will be + accepted in a ``DATA`` command. A value of ``None`` or ``0`` means no + limit. + .. method:: process_message(peer, mailfrom, rcpttos, data) Raise :exc:`NotImplementedError` exception. Override this in subclasses to @@ -156,11 +163,15 @@ SMTPChannel Objects Command Action taken ======== =================================================================== HELO Accepts the greeting from the client and stores it in - :attr:`seen_greeting`. + :attr:`seen_greeting`. Sets server to base command mode. + EHLO Accepts the greeting from the client and stores it in + :attr:`seen_greeting`. Sets server to extended command mode. NOOP Takes no action. QUIT Closes the connection cleanly. MAIL Accepts the "MAIL FROM:" syntax and stores the supplied address as - :attr:`mailfrom`. + :attr:`mailfrom`. In extended command mode, accepts the + :rfc:`1870` SIZE attribute and responds appropriately based on the + value of *data_size_limit*. RCPT Accepts the "RCPT TO:" syntax and stores the supplied addresses in the :attr:`rcpttos` list. RSET Resets the :attr:`mailfrom`, :attr:`rcpttos`, and @@ -168,4 +179,7 @@ SMTPChannel Objects DATA Sets the internal state to :attr:`DATA` and stores remaining lines from the client in :attr:`received_data` until the terminator ``"\r\n.\r\n"`` is received. + HELP Returns minimal information on command syntax + VRFY Returns code 252 (the server doesn't know if the address is valid) + EXPN Reports that the command is not implemented. ======== =================================================================== diff --git a/Doc/library/smtplib.rst b/Doc/library/smtplib.rst index 4a539fc..addc6be 100644 --- a/Doc/library/smtplib.rst +++ b/Doc/library/smtplib.rst @@ -20,46 +20,89 @@ details of SMTP and ESMTP operation, consult :rfc:`821` (Simple Mail Transfer Protocol) and :rfc:`1869` (SMTP Service Extensions). -.. class:: SMTP(host='', port=0, local_hostname=None[, timeout]) +.. class:: SMTP(host='', port=0, local_hostname=None[, timeout], source_address=None) A :class:`SMTP` instance encapsulates an SMTP connection. It has methods that support a full repertoire of SMTP and ESMTP operations. If the optional - host and port parameters are given, the SMTP :meth:`connect` method is called - with those parameters during initialization. An :exc:`SMTPConnectError` is - raised if the specified host doesn't respond correctly. The optional + host and port parameters are given, the SMTP :meth:`connect` method is + called with those parameters during initialization. If specified, + *local_hostname* is used as the FQDN of the local host in the HELO/EHLO + command. Otherwise, the local hostname is found using + :func:`socket.getfqdn`. If the :meth:`connect` call returns anything other + than a success code, an :exc:`SMTPConnectError` is raised. The optional *timeout* parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout - setting will be used). + setting will be used). The optional source_address parameter allows to bind + to some specific source address in a machine with multiple network + interfaces, and/or to some specific source TCP port. It takes a 2-tuple + (host, port), for the socket to bind to as its source address before + connecting. If omitted (or if host or port are ``''`` and/or 0 respectively) + the OS default behavior will be used. For normal use, you should only require the initialization/connect, :meth:`sendmail`, and :meth:`~smtplib.quit` methods. An example is included below. + The :class:`SMTP` class supports the :keyword:`with` statement. When used + like this, the SMTP ``QUIT`` command is issued automatically when the + :keyword:`with` statement exits. E.g.:: -.. class:: SMTP_SSL(host='', port=0, local_hostname=None, keyfile=None, certfile=None[, timeout]) + >>> from smtplib import SMTP + >>> with SMTP("domain.org") as smtp: + ... smtp.noop() + ... + (250, b'Ok') + >>> + + .. versionchanged:: 3.3 + Support for the :keyword:`with` statement was added. + + .. versionchanged:: 3.3 + source_address argument was added. + +.. class:: SMTP_SSL(host='', port=0, local_hostname=None, keyfile=None, \ + certfile=None [, timeout], context=None, \ + source_address=None) A :class:`SMTP_SSL` instance behaves exactly the same as instances of :class:`SMTP`. :class:`SMTP_SSL` should be used for situations where SSL is required from the beginning of the connection and using :meth:`starttls` is not appropriate. If *host* is not specified, the local host is used. If - *port* is zero, the standard SMTP-over-SSL port (465) is used. *keyfile* - and *certfile* are also optional, and can contain a PEM formatted private key - and certificate chain file for the SSL connection. The optional *timeout* + *port* is zero, the standard SMTP-over-SSL port (465) is used. The optional + arguments *local_hostname* and *source_address* have the same meaning as + they do in the :class:`SMTP` class. *keyfile* and *certfile* are also + optional, and can contain a PEM formatted private key and certificate chain + file for the SSL connection. *context* also optional, can contain a + SSLContext, and is an alternative to keyfile and certfile; If it is + specified both keyfile and certfile must be None. The optional *timeout* parameter specifies a timeout in seconds for blocking operations like the connection attempt (if not specified, the global default timeout setting - will be used). + will be used). The optional source_address parameter allows to bind to some + specific source address in a machine with multiple network interfaces, + and/or to some specific source tcp port. It takes a 2-tuple (host, port), + for the socket to bind to as its source address before connecting. If + omitted (or if host or port are ``''`` and/or 0 respectively) the OS default + behavior will be used. + .. versionchanged:: 3.3 + *context* was added. -.. class:: LMTP(host='', port=LMTP_PORT, local_hostname=None) + .. versionchanged:: 3.3 + source_address argument was added. + + +.. class:: LMTP(host='', port=LMTP_PORT, local_hostname=None, source_address=None) The LMTP protocol, which is very similar to ESMTP, is heavily based on the - standard SMTP client. It's common to use Unix sockets for LMTP, so our :meth:`connect` - method must support that as well as a regular host:port server. To specify a - Unix socket, you must use an absolute path for *host*, starting with a '/'. + standard SMTP client. It's common to use Unix sockets for LMTP, so our + :meth:`connect` method must support that as well as a regular host:port + server. The optional arguments local_hostname and source_address have the + same meaning as they do in the :class:`SMTP` class. To specify a Unix + socket, you must use an absolute path for *host*, starting with a '/'. - Authentication is supported, using the regular SMTP mechanism. When using a Unix - socket, LMTP generally don't support or require any authentication, but your - mileage might vary. + Authentication is supported, using the regular SMTP mechanism. When using a + Unix socket, LMTP generally don't support or require any authentication, but + your mileage might vary. A nice selection of exceptions is defined as well: @@ -67,7 +110,8 @@ A nice selection of exceptions is defined as well: .. exception:: SMTPException - Base exception class for all exceptions raised by this module. + The base exception class for all the other exceptions provided by this + module. .. exception:: SMTPServerDisconnected @@ -146,15 +190,6 @@ An :class:`SMTP` instance has the following methods: for connection and for all messages sent to and received from the server. -.. method:: SMTP.connect(host='localhost', port=0) - - Connect to a host on a given port. The defaults are to connect to the local - host at the standard SMTP port (25). If the hostname ends with a colon (``':'``) - followed by a number, that suffix will be stripped off and the number - interpreted as the port number to use. This method is automatically invoked by - the constructor if a host is specified during instantiation. - - .. method:: SMTP.docmd(cmd, args='') Send a command *cmd* to the server. The optional argument *args* is simply @@ -171,6 +206,17 @@ An :class:`SMTP` instance has the following methods: :exc:`SMTPServerDisconnected` will be raised. +.. method:: SMTP.connect(host='localhost', port=0) + + Connect to a host on a given port. The defaults are to connect to the local + host at the standard SMTP port (25). If the hostname ends with a colon (``':'``) + followed by a number, that suffix will be stripped off and the number + interpreted as the port number to use. This method is automatically invoked by + the constructor if a host is specified during instantiation. Returns a + 2-tuple of the response code and message sent by the server in its + connection response. + + .. method:: SMTP.helo(name='') Identify yourself to the SMTP server using ``HELO``. The hostname argument @@ -243,7 +289,7 @@ An :class:`SMTP` instance has the following methods: No suitable authentication method was found. -.. method:: SMTP.starttls(keyfile=None, certfile=None) +.. method:: SMTP.starttls(keyfile=None, certfile=None, context=None) Put the SMTP connection in TLS (Transport Layer Security) mode. All SMTP commands that follow will be encrypted. You should then call :meth:`ehlo` @@ -252,6 +298,9 @@ An :class:`SMTP` instance has the following methods: If *keyfile* and *certfile* are provided, these are passed to the :mod:`socket` module's :func:`ssl` function. + Optional *context* parameter is a :class:`ssl.SSLContext` object; This is an alternative to + using a keyfile and a certfile and if specified both *keyfile* and *certfile* should be None. + If there has been no previous ``EHLO`` or ``HELO`` command this session, this method tries ESMTP ``EHLO`` first. @@ -264,6 +313,9 @@ An :class:`SMTP` instance has the following methods: :exc:`RuntimeError` SSL/TLS support is not available to your Python interpreter. + .. versionchanged:: 3.3 + *context* was added. + .. method:: SMTP.sendmail(from_addr, to_addrs, msg, mail_options=[], rcpt_options=[]) @@ -321,7 +373,8 @@ An :class:`SMTP` instance has the following methods: Unless otherwise noted, the connection will be open even after an exception is raised. - .. versionchanged:: 3.2 *msg* may be a byte string. + .. versionchanged:: 3.2 + *msg* may be a byte string. .. method:: SMTP.send_message(msg, from_addr=None, to_addrs=None, \ diff --git a/Doc/library/socket.rst b/Doc/library/socket.rst index 344a29f..3a49eed 100644 --- a/Doc/library/socket.rst +++ b/Doc/library/socket.rst @@ -18,7 +18,7 @@ platforms. The Python interface is a straightforward transliteration of the Unix system call and library interface for sockets to Python's object-oriented style: the -:func:`socket` function returns a :dfn:`socket object` whose methods implement +:func:`.socket` function returns a :dfn:`socket object` whose methods implement the various socket system calls. Parameter types are somewhat higher-level than in the C interface: as with :meth:`read` and :meth:`write` operations on Python files, buffer allocation on receive operations is automatic, and buffer length @@ -40,9 +40,23 @@ Socket families Depending on the system and the build options, various socket families are supported by this module. -Socket addresses are represented as follows: +The address format required by a particular socket object is automatically +selected based on the address family specified when the socket object was +created. Socket addresses are represented as follows: -- A single string is used for the :const:`AF_UNIX` address family. +- The address of an :const:`AF_UNIX` socket bound to a file system node + is represented as a string, using the file system encoding and the + ``'surrogateescape'`` error handler (see :pep:`383`). An address in + Linux's abstract namespace is returned as a :class:`bytes` object with + an initial null byte; note that sockets in this namespace can + communicate with normal file system sockets, so programs intended to + run on Linux may need to deal with both types of address. A string or + :class:`bytes` object can be used for either type of address when + passing it as an argument. + + .. versionchanged:: 3.3 + Previously, :const:`AF_UNIX` socket paths were assumed to use UTF-8 + encoding. - A pair ``(host, port)`` is used for the :const:`AF_INET` address family, where *host* is a string representing either a hostname in Internet domain @@ -80,6 +94,19 @@ Socket addresses are represented as follows: If *addr_type* is :const:`TIPC_ADDR_ID`, then *v1* is the node, *v2* is the reference, and *v3* should be set to 0. +- A tuple ``(interface, )`` is used for the :const:`AF_CAN` address family, + where *interface* is a string representing a network interface name like + ``'can0'``. The network interface name ``''`` can be used to receive packets + from all network interfaces of this family. + +- A string or a tuple ``(id, unit)`` is used for the :const:`SYSPROTO_CONTROL` + protocol of the :const:`PF_SYSTEM` family. The string is the name of a + kernel control using a dynamically-assigned ID. The tuple can be used if ID + and unit number of the kernel control are known or if a registered ID is + used. + + .. versionadded:: 3.3 + - Certain other address families (:const:`AF_BLUETOOTH`, :const:`AF_PACKET`) support specific representations. @@ -99,8 +126,9 @@ resolution and/or the host configuration. For deterministic behavior use a numeric address in *host* portion. All errors raise exceptions. The normal exceptions for invalid argument types -and out-of-memory conditions can be raised; errors related to socket or address -semantics raise :exc:`socket.error` or one of its subclasses. +and out-of-memory conditions can be raised; starting from Python 3.3, errors +related to socket or address semantics raise :exc:`OSError` or one of its +subclasses (they used to raise :exc:`socket.error`). Non-blocking mode is supported through :meth:`~socket.setblocking`. A generalization of this based on timeouts is supported through @@ -110,25 +138,23 @@ generalization of this based on timeouts is supported through Module contents --------------- -The module :mod:`socket` exports the following constants and functions: +The module :mod:`socket` exports the following elements. + +Exceptions +^^^^^^^^^^ .. exception:: error - .. index:: module: errno + A deprecated alias of :exc:`OSError`. - A subclass of :exc:`IOError`, this exception is raised for socket-related - errors. It is recommended that you inspect its ``errno`` attribute to - discriminate between different kinds of errors. - - .. seealso:: - The :mod:`errno` module contains symbolic names for the error codes - defined by the underlying operating system. + .. versionchanged:: 3.3 + Following :pep:`3151`, this class was made an alias of :exc:`OSError`. .. exception:: herror - A subclass of :exc:`socket.error`, this exception is raised for + A subclass of :exc:`OSError`, this exception is raised for address-related errors, i.e. for functions that use *h_errno* in the POSIX C API, including :func:`gethostbyname_ex` and :func:`gethostbyaddr`. The accompanying value is a pair ``(h_errno, string)`` representing an @@ -136,10 +162,12 @@ The module :mod:`socket` exports the following constants and functions: *string* represents the description of *h_errno*, as returned by the :c:func:`hstrerror` C function. + .. versionchanged:: 3.3 + This class was made a subclass of :exc:`OSError`. .. exception:: gaierror - A subclass of :exc:`socket.error`, this exception is raised for + A subclass of :exc:`OSError`, this exception is raised for address-related errors by :func:`getaddrinfo` and :func:`getnameinfo`. The accompanying value is a pair ``(error, string)`` representing an error returned by a library call. *string* represents the description of @@ -147,22 +175,30 @@ The module :mod:`socket` exports the following constants and functions: numeric *error* value will match one of the :const:`EAI_\*` constants defined in this module. + .. versionchanged:: 3.3 + This class was made a subclass of :exc:`OSError`. .. exception:: timeout - A subclass of :exc:`socket.error`, this exception is raised when a timeout + A subclass of :exc:`OSError`, this exception is raised when a timeout occurs on a socket which has had timeouts enabled via a prior call to :meth:`~socket.settimeout` (or implicitly through :func:`~socket.setdefaulttimeout`). The accompanying value is a string whose value is currently always "timed out". + .. versionchanged:: 3.3 + This class was made a subclass of :exc:`OSError`. + + +Constants +^^^^^^^^^ .. data:: AF_UNIX AF_INET AF_INET6 These constants represent the address (and protocol) families, used for the - first argument to :func:`socket`. If the :const:`AF_UNIX` constant is not + first argument to :func:`.socket`. If the :const:`AF_UNIX` constant is not defined then this protocol is unsupported. More constants may be available depending on the system. @@ -174,7 +210,7 @@ The module :mod:`socket` exports the following constants and functions: SOCK_SEQPACKET These constants represent the socket types, used for the second argument to - :func:`socket`. More constants may be available depending on the system. + :func:`.socket`. More constants may be available depending on the system. (Only :const:`SOCK_STREAM` and :const:`SOCK_DGRAM` appear to be generally useful.) @@ -198,6 +234,7 @@ The module :mod:`socket` exports the following constants and functions: SOMAXCONN MSG_* SOL_* + SCM_* IPPROTO_* IPPORT_* INADDR_* @@ -215,11 +252,37 @@ The module :mod:`socket` exports the following constants and functions: in the Unix header files are defined; for a few symbols, default values are provided. +.. data:: AF_CAN + PF_CAN + SOL_CAN_* + CAN_* + + Many constants of these forms, documented in the Linux documentation, are + also defined in the socket module. + + Availability: Linux >= 2.6.25. + + .. versionadded:: 3.3 + + +.. data:: AF_RDS + PF_RDS + SOL_RDS + RDS_* + + Many constants of these forms, documented in the Linux documentation, are + also defined in the socket module. + + Availability: Linux >= 2.6.30. + + .. versionadded:: 3.3 + + .. data:: SIO_* RCVALL_* Constants for Windows' WSAIoctl(). The constants are used as arguments to the - :meth:`ioctl` method of socket objects. + :meth:`~socket.socket.ioctl` method of socket objects. .. data:: TIPC_* @@ -234,6 +297,43 @@ The module :mod:`socket` exports the following constants and functions: this platform. +Functions +^^^^^^^^^ + +Creating sockets +'''''''''''''''' + +The following functions all create :ref:`socket objects <socket-objects>`. + + +.. function:: socket([family[, type[, proto]]]) + + Create a new socket using the given address family, socket type and protocol + number. The address family should be :const:`AF_INET` (the default), + :const:`AF_INET6`, :const:`AF_UNIX`, :const:`AF_CAN` or :const:`AF_RDS`. The + socket type should be :const:`SOCK_STREAM` (the default), + :const:`SOCK_DGRAM`, :const:`SOCK_RAW` or perhaps one of the other ``SOCK_`` + constants. The protocol number is usually zero and may be omitted in that + case or :const:`CAN_RAW` in case the address family is :const:`AF_CAN`. + + .. versionchanged:: 3.3 + The AF_CAN family was added. + The AF_RDS family was added. + + +.. function:: socketpair([family[, type[, proto]]]) + + Build a pair of connected socket objects using the given address family, socket + type, and protocol number. Address family, socket type, and protocol number are + as for the :func:`.socket` function above. The default family is :const:`AF_UNIX` + if defined on the platform; otherwise, the default is :const:`AF_INET`. + Availability: Unix. + + .. versionchanged:: 3.2 + The returned socket objects now support the whole socket API, rather + than a subset. + + .. function:: create_connection(address[, timeout[, source_address]]) Connect to a TCP service listening on the Internet *address* (a 2-tuple @@ -260,6 +360,40 @@ The module :mod:`socket` exports the following constants and functions: support for the :keyword:`with` statement was added. +.. function:: fromfd(fd, family, type[, proto]) + + Duplicate the file descriptor *fd* (an integer as returned by a file object's + :meth:`fileno` method) and build a socket object from the result. Address + family, socket type and protocol number are as for the :func:`.socket` function + above. The file descriptor should refer to a socket, but this is not checked --- + subsequent operations on the object may fail if the file descriptor is invalid. + This function is rarely needed, but can be used to get or set socket options on + a socket passed to a program as standard input or output (such as a server + started by the Unix inet daemon). The socket is assumed to be in blocking mode. + + +.. function:: fromshare(data) + + Instantiate a socket from data obtained from the :meth:`socket.share` + method. The socket is assumed to be in blocking mode. + + Availability: Windows. + + .. versionadded:: 3.3 + + +.. data:: SocketType + + This is a Python type object that represents the socket object type. It is the + same as ``type(socket(...))``. + + +Other functions +''''''''''''''' + +The :mod:`socket` module also offers various network-related services: + + .. function:: getaddrinfo(host, port, family=0, type=0, proto=0, flags=0) Translate the *host*/*port* argument into a sequence of 5-tuples that contain @@ -282,7 +416,7 @@ The module :mod:`socket` exports the following constants and functions: ``(family, type, proto, canonname, sockaddr)`` In these tuples, *family*, *type*, *proto* are all integers and are - meant to be passed to the :func:`socket` function. *canonname* will be + meant to be passed to the :func:`.socket` function. *canonname* will be a string representing the canonical name of the *host* if :const:`AI_CANONNAME` is part of the *flags* argument; else *canonname* will be empty. *sockaddr* is a tuple describing a socket address, whose @@ -300,7 +434,7 @@ The module :mod:`socket` exports the following constants and functions: (10, 1, 6, '', ('2001:888:2000:d::a2', 80, 0, 0))] .. versionchanged:: 3.2 - parameters can now be passed as single keyword arguments. + parameters can now be passed using keyword arguments. .. function:: getfqdn([name]) @@ -369,7 +503,7 @@ The module :mod:`socket` exports the following constants and functions: .. function:: getprotobyname(protocolname) Translate an Internet protocol name (for example, ``'icmp'``) to a constant - suitable for passing as the (optional) third argument to the :func:`socket` + suitable for passing as the (optional) third argument to the :func:`.socket` function. This is usually only needed for sockets opened in "raw" mode (:const:`SOCK_RAW`); for the normal socket modes, the correct protocol is chosen automatically if the protocol is omitted or zero. @@ -389,41 +523,6 @@ The module :mod:`socket` exports the following constants and functions: ``'udp'``, otherwise any protocol will match. -.. function:: socket([family[, type[, proto]]]) - - Create a new socket using the given address family, socket type and protocol - number. The address family should be :const:`AF_INET` (the default), - :const:`AF_INET6` or :const:`AF_UNIX`. The socket type should be - :const:`SOCK_STREAM` (the default), :const:`SOCK_DGRAM` or perhaps one of the - other ``SOCK_`` constants. The protocol number is usually zero and may be - omitted in that case. - - -.. function:: socketpair([family[, type[, proto]]]) - - Build a pair of connected socket objects using the given address family, socket - type, and protocol number. Address family, socket type, and protocol number are - as for the :func:`socket` function above. The default family is :const:`AF_UNIX` - if defined on the platform; otherwise, the default is :const:`AF_INET`. - Availability: Unix. - - .. versionchanged:: 3.2 - The returned socket objects now support the whole socket API, rather - than a subset. - - -.. function:: fromfd(fd, family, type[, proto]) - - Duplicate the file descriptor *fd* (an integer as returned by a file object's - :meth:`fileno` method) and build a socket object from the result. Address - family, socket type and protocol number are as for the :func:`socket` function - above. The file descriptor should refer to a socket, but this is not checked --- - subsequent operations on the object may fail if the file descriptor is invalid. - This function is rarely needed, but can be used to get or set socket options on - a socket passed to a program as standard input or output (such as a server - started by the Unix inet daemon). The socket is assumed to be in blocking mode. - - .. function:: ntohl(x) Convert 32-bit positive integers from network to host byte order. On machines @@ -464,7 +563,7 @@ The module :mod:`socket` exports the following constants and functions: Unix manual page :manpage:`inet(3)` for details. If the IPv4 address string passed to this function is invalid, - :exc:`socket.error` will be raised. Note that exactly what is valid depends on + :exc:`OSError` will be raised. Note that exactly what is valid depends on the underlying C implementation of :c:func:`inet_aton`. :func:`inet_aton` does not support IPv6, and :func:`inet_pton` should be used @@ -481,7 +580,7 @@ The module :mod:`socket` exports the following constants and functions: argument. If the byte sequence passed to this function is not exactly 4 bytes in - length, :exc:`socket.error` will be raised. :func:`inet_ntoa` does not + length, :exc:`OSError` will be raised. :func:`inet_ntoa` does not support IPv6, and :func:`inet_ntop` should be used instead for IPv4/v6 dual stack support. @@ -495,7 +594,7 @@ The module :mod:`socket` exports the following constants and functions: Supported values for *address_family* are currently :const:`AF_INET` and :const:`AF_INET6`. If the IP address string *ip_string* is invalid, - :exc:`socket.error` will be raised. Note that exactly what is valid depends on + :exc:`OSError` will be raised. Note that exactly what is valid depends on both the value of *address_family* and the underlying implementation of :c:func:`inet_pton`. @@ -513,11 +612,54 @@ The module :mod:`socket` exports the following constants and functions: Supported values for *address_family* are currently :const:`AF_INET` and :const:`AF_INET6`. If the string *packed_ip* is not the correct length for the specified address family, :exc:`ValueError` will be raised. A - :exc:`socket.error` is raised for errors from the call to :func:`inet_ntop`. + :exc:`OSError` is raised for errors from the call to :func:`inet_ntop`. Availability: Unix (maybe not all platforms). +.. + XXX: Are sendmsg(), recvmsg() and CMSG_*() available on any + non-Unix platforms? The old (obsolete?) 4.2BSD form of the + interface, in which struct msghdr has no msg_control or + msg_controllen members, is not currently supported. + +.. function:: CMSG_LEN(length) + + Return the total length, without trailing padding, of an ancillary + data item with associated data of the given *length*. This value + can often be used as the buffer size for :meth:`~socket.recvmsg` to + receive a single item of ancillary data, but :rfc:`3542` requires + portable applications to use :func:`CMSG_SPACE` and thus include + space for padding, even when the item will be the last in the + buffer. Raises :exc:`OverflowError` if *length* is outside the + permissible range of values. + + Availability: most Unix platforms, possibly others. + + .. versionadded:: 3.3 + + +.. function:: CMSG_SPACE(length) + + Return the buffer size needed for :meth:`~socket.recvmsg` to + receive an ancillary data item with associated data of the given + *length*, along with any trailing padding. The buffer space needed + to receive multiple items is the sum of the :func:`CMSG_SPACE` + values for their associated data lengths. Raises + :exc:`OverflowError` if *length* is outside the permissible range + of values. + + Note that some systems might support ancillary data without + providing this function. Also note that setting the buffer size + using the results of this function may not precisely limit the + amount of ancillary data that can be received, since additional + data may be able to fit into the padding area. + + Availability: most Unix platforms, possibly others. + + .. versionadded:: 3.3 + + .. function:: getdefaulttimeout() Return the default timeout in seconds (float) for new socket objects. A value @@ -533,10 +675,47 @@ The module :mod:`socket` exports the following constants and functions: meanings. -.. data:: SocketType +.. function:: sethostname(name) - This is a Python type object that represents the socket object type. It is the - same as ``type(socket(...))``. + Set the machine's hostname to *name*. This will raise a + :exc:`OSError` if you don't have enough rights. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: if_nameindex() + + Return a list of network interface information + (index int, name string) tuples. + :exc:`OSError` if the system call fails. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: if_nametoindex(if_name) + + Return a network interface index number corresponding to an + interface name. + :exc:`OSError` if no interface with the given name exists. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: if_indextoname(if_index) + + Return a network interface name corresponding to a + interface index number. + :exc:`OSError` if no interface with the given index exists. + + Availability: Unix. + + .. versionadded:: 3.3 .. _socket-objects: @@ -544,8 +723,9 @@ The module :mod:`socket` exports the following constants and functions: Socket Objects -------------- -Socket objects have the following methods. Except for :meth:`makefile` these -correspond to Unix system calls applicable to sockets. +Socket objects have the following methods. Except for +:meth:`~socket.makefile`, these correspond to Unix system calls applicable +to sockets. .. method:: socket.accept() @@ -564,11 +744,18 @@ correspond to Unix system calls applicable to sockets. .. method:: socket.close() - Close the socket. All future operations on the socket object will fail. The - remote end will receive no more data (after queued data is flushed). Sockets are - automatically closed when they are garbage-collected. + Mark the socket closed. The underlying system resource (e.g. a file + descriptor) is also closed when all file objects from :meth:`makefile()` + are closed. Once that happens, all future operations on the socket + object will fail. The remote end will receive no more data (after + queued data is flushed). + + Sockets are automatically closed when they are garbage-collected, but + it is recommended to :meth:`close` them explicitly, or to use a + :keyword:`with` statement around them. .. note:: + :meth:`close()` releases the resource associated with a connection but does not necessarily close the connection immediately. If you want to close the connection in a timely fashion, call :meth:`shutdown()` @@ -672,10 +859,13 @@ correspond to Unix system calls applicable to sockets. type depends on the arguments given to :meth:`makefile`. These arguments are interpreted the same way as by the built-in :func:`open` function. - Closing the file object won't close the socket unless there are no remaining - references to the socket. The socket must be in blocking mode; it can have - a timeout, but the file object's internal buffer may end up in a inconsistent - state if a timeout occurs. + The socket must be in blocking mode; it can have a timeout, but the file + object's internal buffer may end up in a inconsistent state if a timeout + occurs. + + Closing the file object returned by :meth:`makefile` won't close the + original socket unless all other file objects have been closed and + :meth:`socket.close` has been called on the socket object. .. note:: @@ -706,6 +896,109 @@ correspond to Unix system calls applicable to sockets. to zero. (The format of *address* depends on the address family --- see above.) +.. method:: socket.recvmsg(bufsize[, ancbufsize[, flags]]) + + Receive normal data (up to *bufsize* bytes) and ancillary data from + the socket. The *ancbufsize* argument sets the size in bytes of + the internal buffer used to receive the ancillary data; it defaults + to 0, meaning that no ancillary data will be received. Appropriate + buffer sizes for ancillary data can be calculated using + :func:`CMSG_SPACE` or :func:`CMSG_LEN`, and items which do not fit + into the buffer might be truncated or discarded. The *flags* + argument defaults to 0 and has the same meaning as for + :meth:`recv`. + + The return value is a 4-tuple: ``(data, ancdata, msg_flags, + address)``. The *data* item is a :class:`bytes` object holding the + non-ancillary data received. The *ancdata* item is a list of zero + or more tuples ``(cmsg_level, cmsg_type, cmsg_data)`` representing + the ancillary data (control messages) received: *cmsg_level* and + *cmsg_type* are integers specifying the protocol level and + protocol-specific type respectively, and *cmsg_data* is a + :class:`bytes` object holding the associated data. The *msg_flags* + item is the bitwise OR of various flags indicating conditions on + the received message; see your system documentation for details. + If the receiving socket is unconnected, *address* is the address of + the sending socket, if available; otherwise, its value is + unspecified. + + On some systems, :meth:`sendmsg` and :meth:`recvmsg` can be used to + pass file descriptors between processes over an :const:`AF_UNIX` + socket. When this facility is used (it is often restricted to + :const:`SOCK_STREAM` sockets), :meth:`recvmsg` will return, in its + ancillary data, items of the form ``(socket.SOL_SOCKET, + socket.SCM_RIGHTS, fds)``, where *fds* is a :class:`bytes` object + representing the new file descriptors as a binary array of the + native C :c:type:`int` type. If :meth:`recvmsg` raises an + exception after the system call returns, it will first attempt to + close any file descriptors received via this mechanism. + + Some systems do not indicate the truncated length of ancillary data + items which have been only partially received. If an item appears + to extend beyond the end of the buffer, :meth:`recvmsg` will issue + a :exc:`RuntimeWarning`, and will return the part of it which is + inside the buffer provided it has not been truncated before the + start of its associated data. + + On systems which support the :const:`SCM_RIGHTS` mechanism, the + following function will receive up to *maxfds* file descriptors, + returning the message data and a list containing the descriptors + (while ignoring unexpected conditions such as unrelated control + messages being received). See also :meth:`sendmsg`. :: + + import socket, array + + def recv_fds(sock, msglen, maxfds): + fds = array.array("i") # Array of ints + msg, ancdata, flags, addr = sock.recvmsg(msglen, socket.CMSG_LEN(maxfds * fds.itemsize)) + for cmsg_level, cmsg_type, cmsg_data in ancdata: + if (cmsg_level == socket.SOL_SOCKET and cmsg_type == socket.SCM_RIGHTS): + # Append data, ignoring any truncated integers at the end. + fds.fromstring(cmsg_data[:len(cmsg_data) - (len(cmsg_data) % fds.itemsize)]) + return msg, list(fds) + + Availability: most Unix platforms, possibly others. + + .. versionadded:: 3.3 + + +.. method:: socket.recvmsg_into(buffers[, ancbufsize[, flags]]) + + Receive normal data and ancillary data from the socket, behaving as + :meth:`recvmsg` would, but scatter the non-ancillary data into a + series of buffers instead of returning a new bytes object. The + *buffers* argument must be an iterable of objects that export + writable buffers (e.g. :class:`bytearray` objects); these will be + filled with successive chunks of the non-ancillary data until it + has all been written or there are no more buffers. The operating + system may set a limit (:func:`~os.sysconf` value ``SC_IOV_MAX``) + on the number of buffers that can be used. The *ancbufsize* and + *flags* arguments have the same meaning as for :meth:`recvmsg`. + + The return value is a 4-tuple: ``(nbytes, ancdata, msg_flags, + address)``, where *nbytes* is the total number of bytes of + non-ancillary data written into the buffers, and *ancdata*, + *msg_flags* and *address* are the same as for :meth:`recvmsg`. + + Example:: + + >>> import socket + >>> s1, s2 = socket.socketpair() + >>> b1 = bytearray(b'----') + >>> b2 = bytearray(b'0123456789') + >>> b3 = bytearray(b'--------------') + >>> s1.send(b'Mary had a little lamb') + 22 + >>> s2.recvmsg_into([b1, memoryview(b2)[2:9], b3]) + (22, [], 0, None) + >>> [b1, b2, b3] + [bytearray(b'Mary'), bytearray(b'01 had a 9'), bytearray(b'little lamb---')] + + Availability: most Unix platforms, possibly others. + + .. versionadded:: 3.3 + + .. method:: socket.recvfrom_into(buffer[, nbytes[, flags]]) Receive data from the socket, writing it into *buffer* instead of creating a @@ -755,6 +1048,41 @@ correspond to Unix system calls applicable to sockets. above.) +.. method:: socket.sendmsg(buffers[, ancdata[, flags[, address]]]) + + Send normal and ancillary data to the socket, gathering the + non-ancillary data from a series of buffers and concatenating it + into a single message. The *buffers* argument specifies the + non-ancillary data as an iterable of buffer-compatible objects + (e.g. :class:`bytes` objects); the operating system may set a limit + (:func:`~os.sysconf` value ``SC_IOV_MAX``) on the number of buffers + that can be used. The *ancdata* argument specifies the ancillary + data (control messages) as an iterable of zero or more tuples + ``(cmsg_level, cmsg_type, cmsg_data)``, where *cmsg_level* and + *cmsg_type* are integers specifying the protocol level and + protocol-specific type respectively, and *cmsg_data* is a + buffer-compatible object holding the associated data. Note that + some systems (in particular, systems without :func:`CMSG_SPACE`) + might support sending only one control message per call. The + *flags* argument defaults to 0 and has the same meaning as for + :meth:`send`. If *address* is supplied and not ``None``, it sets a + destination address for the message. The return value is the + number of bytes of non-ancillary data sent. + + The following function sends the list of file descriptors *fds* + over an :const:`AF_UNIX` socket, on systems which support the + :const:`SCM_RIGHTS` mechanism. See also :meth:`recvmsg`. :: + + import socket, array + + def send_fds(sock, msg, fds): + return sock.sendmsg([msg], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]) + + Availability: most Unix platforms, possibly others. + + .. versionadded:: 3.3 + + .. method:: socket.setblocking(flag) Set blocking or non-blocking mode of the socket: if *flag* is false, the @@ -796,9 +1124,22 @@ correspond to Unix system calls applicable to sockets. Shut down one or both halves of the connection. If *how* is :const:`SHUT_RD`, further receives are disallowed. If *how* is :const:`SHUT_WR`, further sends are disallowed. If *how* is :const:`SHUT_RDWR`, further sends and receives are - disallowed. Depending on the platform, shutting down one half of the connection - can also close the opposite half (e.g. on Mac OS X, ``shutdown(SHUT_WR)`` does - not allow further reads on the other end of the connection). + disallowed. + + +.. method:: socket.share(process_id) + + Duplicate a socket and prepare it for sharing with a target process. The + target process must be provided with *process_id*. The resulting bytes object + can then be passed to the target process using some form of interprocess + communication and the socket can be recreated there using :func:`fromshare`. + Once this method has been called, it is safe to close the socket since + the operating system has already duplicated it for the target process. + + Availability: Windows. + + .. versionadded:: 3.3 + Note that there are no methods :meth:`read` or :meth:`write`; use :meth:`~socket.recv` and :meth:`~socket.send` without *flags* argument instead. @@ -884,10 +1225,10 @@ Example Here are four minimal example programs using the TCP/IP protocol: a server that echoes all data that it receives back (servicing only one client), and a client -using it. Note that a server must perform the sequence :func:`socket`, +using it. Note that a server must perform the sequence :func:`.socket`, :meth:`~socket.bind`, :meth:`~socket.listen`, :meth:`~socket.accept` (possibly repeating the :meth:`~socket.accept` to service more than one client), while a -client only needs the sequence :func:`socket`, :meth:`~socket.connect`. Also +client only needs the sequence :func:`.socket`, :meth:`~socket.connect`. Also note that the server does not :meth:`~socket.sendall`/:meth:`~socket.recv` on the socket it is listening on but on the new socket returned by :meth:`~socket.accept`. @@ -943,13 +1284,13 @@ sends traffic to the first one connected successfully. :: af, socktype, proto, canonname, sa = res try: s = socket.socket(af, socktype, proto) - except socket.error as msg: + except OSError as msg: s = None continue try: s.bind(sa) s.listen(1) - except socket.error as msg: + except OSError as msg: s.close() s = None continue @@ -978,12 +1319,12 @@ sends traffic to the first one connected successfully. :: af, socktype, proto, canonname, sa = res try: s = socket.socket(af, socktype, proto) - except socket.error as msg: + except OSError as msg: s = None continue try: s.connect(sa) - except socket.error as msg: + except OSError as msg: s.close() s = None continue @@ -997,7 +1338,7 @@ sends traffic to the first one connected successfully. :: print('Received', repr(data)) -The last example shows how to write a very simple network sniffer with raw +The next example shows how to write a very simple network sniffer with raw sockets on Windows. The example requires administrator privileges to modify the interface:: @@ -1022,11 +1363,51 @@ the interface:: # disabled promiscuous mode s.ioctl(socket.SIO_RCVALL, socket.RCVALL_OFF) +The last example shows how to use the socket interface to communicate to a CAN +network. This example might require special priviledge:: + + import socket + import struct + + + # CAN frame packing/unpacking (see 'struct can_frame' in <linux/can.h>) + + can_frame_fmt = "=IB3x8s" + can_frame_size = struct.calcsize(can_frame_fmt) + + def build_can_frame(can_id, data): + can_dlc = len(data) + data = data.ljust(8, b'\x00') + return struct.pack(can_frame_fmt, can_id, can_dlc, data) + + def dissect_can_frame(frame): + can_id, can_dlc, data = struct.unpack(can_frame_fmt, frame) + return (can_id, can_dlc, data[:can_dlc]) + + + # create a raw socket and bind it to the 'vcan0' interface + s = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) + s.bind(('vcan0',)) + + while True: + cf, addr = s.recvfrom(can_frame_size) + + print('Received: can_id=%x, can_dlc=%x, data=%s' % dissect_can_frame(cf)) + + try: + s.send(cf) + except OSError: + print('Error sending CAN frame') + + try: + s.send(build_can_frame(0x01, b'\x01\x02\x03')) + except OSError: + print('Error sending CAN frame') Running an example several times with too small delay between executions, could lead to this error:: - socket.error: [Errno 98] Address already in use + OSError: [Errno 98] Address already in use This is because the previous execution has left the socket in a ``TIME_WAIT`` state, and can't be immediately reused. @@ -1057,4 +1438,3 @@ the :data:`SO_REUSEADDR` flag tells the kernel to reuse a local socket in details of socket semantics. For Unix, refer to the manual pages; for Windows, see the WinSock (or Winsock 2) specification. For IPv6-ready APIs, readers may want to refer to :rfc:`3493` titled Basic Socket Interface Extensions for IPv6. - diff --git a/Doc/library/socketserver.rst b/Doc/library/socketserver.rst index 4f22347..1ec4438 100644 --- a/Doc/library/socketserver.rst +++ b/Doc/library/socketserver.rst @@ -111,13 +111,13 @@ can be implemented by using a synchronous server and doing an explicit fork in the request handler class :meth:`handle` method. Another approach to handling multiple simultaneous requests in an environment -that supports neither threads nor :func:`fork` (or where these are too expensive -or inappropriate for the service) is to maintain an explicit table of partially -finished requests and to use :func:`select` to decide which request to work on -next (or whether to handle a new incoming request). This is particularly -important for stream services where each client can potentially be connected for -a long time (if threads or subprocesses cannot be used). See :mod:`asyncore` -for another way to manage this. +that supports neither threads nor :func:`~os.fork` (or where these are too +expensive or inappropriate for the service) is to maintain an explicit table of +partially finished requests and to use :func:`~select.select` to decide which +request to work on next (or whether to handle a new incoming request). This is +particularly important for stream services where each client can potentially be +connected for a long time (if threads or subprocesses cannot be used). See +:mod:`asyncore` for another way to manage this. .. XXX should data and methods be intermingled, or separate? how should the distinction between class and instance variables be drawn? @@ -153,10 +153,24 @@ Server Objects .. method:: BaseServer.serve_forever(poll_interval=0.5) - Handle requests until an explicit :meth:`shutdown` request. - Poll for shutdown every *poll_interval* seconds. Ignores :attr:`self.timeout`. - If you need to do periodic tasks, do them in another thread. + Handle requests until an explicit :meth:`shutdown` request. Poll for + shutdown every *poll_interval* seconds. Ignores :attr:`self.timeout`. It + also calls :meth:`service_actions`, which may be used by a subclass or mixin + to provide actions specific to a given service. For example, the + :class:`ForkingMixIn` class uses :meth:`service_actions` to clean up zombie + child processes. + .. versionchanged:: 3.3 + Added ``service_actions`` call to the ``serve_forever`` method. + + +.. method:: BaseServer.service_actions() + + This is called in the :meth:`serve_forever` loop. This method can be + overridden by subclasses or mixin classes to perform actions specific to + a given service, such as cleanup actions. + + .. versionadded:: 3.3 .. method:: BaseServer.shutdown() diff --git a/Doc/library/someos.rst b/Doc/library/someos.rst deleted file mode 100644 index d2009bb..0000000 --- a/Doc/library/someos.rst +++ /dev/null @@ -1,24 +0,0 @@ -.. _someos: - -********************************** -Optional Operating System Services -********************************** - -The modules described in this chapter provide interfaces to operating system -features that are available on selected operating systems only. The interfaces -are generally modeled after the Unix or C interfaces but they are available on -some other systems as well (e.g. Windows). Here's an overview: - - -.. toctree:: - - select.rst - threading.rst - multiprocessing.rst - concurrent.futures.rst - mmap.rst - readline.rst - rlcompleter.rst - dummy_threading.rst - _thread.rst - _dummy_thread.rst diff --git a/Doc/library/sqlite3.rst b/Doc/library/sqlite3.rst index eeb9666..8a50f87 100644 --- a/Doc/library/sqlite3.rst +++ b/Doc/library/sqlite3.rst @@ -91,7 +91,7 @@ This example uses the iterator form:: .. seealso:: - http://code.google.com/p/pysqlite/ + https://github.com/ghaering/pysqlite The pysqlite web page -- sqlite3 is developed externally under the name "pysqlite". @@ -174,7 +174,7 @@ Module functions and constants For the *isolation_level* parameter, please see the :attr:`Connection.isolation_level` property of :class:`Connection` objects. - SQLite natively supports only the types TEXT, INTEGER, FLOAT, BLOB and NULL. If + SQLite natively supports only the types TEXT, INTEGER, REAL, BLOB and NULL. If you want to use other types you must add support for them yourself. The *detect_types* parameter and the using custom **converters** registered with the module-level :func:`register_converter` function allow you to easily do that. @@ -229,10 +229,10 @@ Module functions and constants .. function:: enable_callback_tracebacks(flag) By default you will not get any tracebacks in user-defined functions, - aggregates, converters, authorizer callbacks etc. If you want to debug them, you - can call this function with *flag* as True. Afterwards, you will get tracebacks - from callbacks on ``sys.stderr``. Use :const:`False` to disable the feature - again. + aggregates, converters, authorizer callbacks etc. If you want to debug them, + you can call this function with *flag* set to ``True``. Afterwards, you will + get tracebacks from callbacks on ``sys.stderr``. Use :const:`False` to + disable the feature again. .. _sqlite3-connection-objects: @@ -391,6 +391,22 @@ Connection Objects method with :const:`None` for *handler*. + .. method:: set_trace_callback(trace_callback) + + Registers *trace_callback* to be called for each SQL statement that is + actually executed by the SQLite backend. + + The only argument passed to the callback is the statement (as string) that + is being executed. The return value of the callback is ignored. Note that + the backend does not only run statements passed to the :meth:`Cursor.execute` + methods. Other sources include the transaction management of the Python + module and the execution of triggers defined in the current database. + + Passing :const:`None` as *trace_callback* will disable the trace callback. + + .. versionadded:: 3.3 + + .. method:: enable_load_extension(enabled) This routine allows/disallows the SQLite engine to load SQLite extensions diff --git a/Doc/library/ssl.rst b/Doc/library/ssl.rst index 0f5cea2..aa5a52f 100644 --- a/Doc/library/ssl.rst +++ b/Doc/library/ssl.rst @@ -28,6 +28,12 @@ probably additional platforms, as long as OpenSSL is installed on that platform. operating system socket APIs. The installed version of OpenSSL may also cause variations in behavior. +.. warning:: + Don't use this module without reading the :ref:`ssl-security`. Doing so + may lead to a false sense of security, as the default settings of the + ssl module are not necessarily appropriate for your application. + + This section documents the objects and functions in the ``ssl`` module; for more general information about TLS, SSL, and certificates, the reader is referred to the documents in the "See Also" section at the bottom. @@ -53,9 +59,69 @@ Functions, Constants, and Exceptions (currently provided by the OpenSSL library). This signifies some problem in the higher-level encryption and authentication layer that's superimposed on the underlying network connection. This error - is a subtype of :exc:`socket.error`, which in turn is a subtype of - :exc:`IOError`. The error code and message of :exc:`SSLError` instances - are provided by the OpenSSL library. + is a subtype of :exc:`OSError`. The error code and message of + :exc:`SSLError` instances are provided by the OpenSSL library. + + .. versionchanged:: 3.3 + :exc:`SSLError` used to be a subtype of :exc:`socket.error`. + + .. attribute:: library + + A string mnemonic designating the OpenSSL submodule in which the error + occurred, such as ``SSL``, ``PEM`` or ``X509``. The range of possible + values depends on the OpenSSL version. + + .. versionadded:: 3.3 + + .. attribute:: reason + + A string mnemonic designating the reason this error occurred, for + example ``CERTIFICATE_VERIFY_FAILED``. The range of possible + values depends on the OpenSSL version. + + .. versionadded:: 3.3 + +.. exception:: SSLZeroReturnError + + A subclass of :exc:`SSLError` raised when trying to read or write and + the SSL connection has been closed cleanly. Note that this doesn't + mean that the underlying transport (read TCP) has been closed. + + .. versionadded:: 3.3 + +.. exception:: SSLWantReadError + + A subclass of :exc:`SSLError` raised by a :ref:`non-blocking SSL socket + <ssl-nonblocking>` when trying to read or write data, but more data needs + to be received on the underlying TCP transport before the request can be + fulfilled. + + .. versionadded:: 3.3 + +.. exception:: SSLWantWriteError + + A subclass of :exc:`SSLError` raised by a :ref:`non-blocking SSL socket + <ssl-nonblocking>` when trying to read or write data, but more data needs + to be sent on the underlying TCP transport before the request can be + fulfilled. + + .. versionadded:: 3.3 + +.. exception:: SSLSyscallError + + A subclass of :exc:`SSLError` raised when a system error was encountered + while trying to fulfill an operation on a SSL socket. Unfortunately, + there is no easy way to inspect the original errno number. + + .. versionadded:: 3.3 + +.. exception:: SSLEOFError + + A subclass of :exc:`SSLError` raised when the SSL connection has been + terminated abruptly. Generally, you shouldn't try to reuse the underlying + transport when this error is encountered. + + .. versionadded:: 3.3 .. exception:: CertificateError @@ -75,13 +141,16 @@ instead. Takes an instance ``sock`` of :class:`socket.socket`, and returns an instance of :class:`ssl.SSLSocket`, a subtype of :class:`socket.socket`, which wraps - the underlying socket in an SSL context. For client-side sockets, the - context construction is lazy; if the underlying socket isn't connected yet, - the context construction will be performed after :meth:`connect` is called on - the socket. For server-side sockets, if the socket has no remote peer, it is - assumed to be a listening socket, and the server-side SSL wrapping is - automatically performed on client connections accepted via the :meth:`accept` - method. :func:`wrap_socket` may raise :exc:`SSLError`. + the underlying socket in an SSL context. ``sock`` must be a + :data:`~socket.SOCK_STREAM` socket; other socket types are unsupported. + + For client-side sockets, the context construction is lazy; if the + underlying socket isn't connected yet, the context construction will be + performed after :meth:`connect` is called on the socket. For + server-side sockets, if the socket has no remote peer, it is assumed + to be a listening socket, and the server-side SSL wrapping is + automatically performed on client connections accepted via the + :meth:`accept` method. :func:`wrap_socket` may raise :exc:`SSLError`. The ``keyfile`` and ``certfile`` parameters specify optional files which contain a certificate to be used to identify the local side of the @@ -161,16 +230,45 @@ instead. Random generation ^^^^^^^^^^^^^^^^^ +.. function:: RAND_bytes(num) + + Returns *num* cryptographically strong pseudo-random bytes. Raises an + :class:`SSLError` if the PRNG has not been seeded with enough data or if the + operation is not supported by the current RAND method. :func:`RAND_status` + can be used to check the status of the PRNG and :func:`RAND_add` can be used + to seed the PRNG. + + Read the Wikipedia article, `Cryptographically secure pseudorandom number + generator (CSPRNG) + <http://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator>`_, + to get the requirements of a cryptographically generator. + + .. versionadded:: 3.3 + +.. function:: RAND_pseudo_bytes(num) + + Returns (bytes, is_cryptographic): bytes are *num* pseudo-random bytes, + is_cryptographic is ``True`` if the bytes generated are cryptographically + strong. Raises an :class:`SSLError` if the operation is not supported by the + current RAND method. + + Generated pseudo-random byte sequences will be unique if they are of + sufficient length, but are not necessarily unpredictable. They can be used + for non-cryptographic purposes and for certain purposes in cryptographic + protocols, but usually not for key generation etc. + + .. versionadded:: 3.3 + .. function:: RAND_status() - Returns True if the SSL pseudo-random number generator has been seeded with - 'enough' randomness, and False otherwise. You can use :func:`ssl.RAND_egd` + Returns ``True`` if the SSL pseudo-random number generator has been seeded with + 'enough' randomness, and ``False`` otherwise. You can use :func:`ssl.RAND_egd` and :func:`ssl.RAND_add` to increase the randomness of the pseudo-random number generator. .. function:: RAND_egd(path) - If you are running an entropy-gathering daemon (EGD) somewhere, and ``path`` + If you are running an entropy-gathering daemon (EGD) somewhere, and *path* is the pathname of a socket connection open to it, this will read 256 bytes of randomness from the socket, and add it to the SSL pseudo-random number generator to increase the security of generated secret keys. This is @@ -181,8 +279,8 @@ Random generation .. function:: RAND_add(bytes, entropy) - Mixes the given ``bytes`` into the SSL pseudo-random number generator. The - parameter ``entropy`` (a float) is a lower bound on the entropy contained in + Mixes the given *bytes* into the SSL pseudo-random number generator. The + parameter *entropy* (a float) is a lower bound on the entropy contained in string (so you can always use :const:`0.0`). See :rfc:`1750` for more information on sources of entropy. @@ -194,10 +292,10 @@ Certificate handling Verify that *cert* (in decoded format as returned by :meth:`SSLSocket.getpeercert`) matches the given *hostname*. The rules applied are those for checking the identity of HTTPS servers as outlined - in :rfc:`2818`, except that IP addresses are not currently supported. - In addition to HTTPS, this function should be suitable for checking the - identity of servers in various SSL-based protocols such as FTPS, IMAPS, - POPS and others. + in :rfc:`2818` and :rfc:`6125`, except that IP addresses are not currently + supported. In addition to HTTPS, this function should be suitable for + checking the identity of servers in various SSL-based protocols such as + FTPS, IMAPS, POPS and others. :exc:`CertificateError` is raised on failure. On success, the function returns nothing:: @@ -212,6 +310,13 @@ Certificate handling .. versionadded:: 3.2 + .. versionchanged:: 3.3.3 + The function now follows :rfc:`6125`, section 6.4.3 and does neither + match multiple wildcards (e.g. ``*.*.com`` or ``*a*.example.org``) nor + a wildcard inside an internationalized domain names (IDN) fragment. + IDN A-labels such as ``www*.xn--pthon-kva.org`` are still supported, + but ``x*.python.org`` no longer matches ``xn--tda.python.org``. + .. function:: cert_time_to_seconds(timestring) Returns a floating-point value containing a normal seconds-after-the-epoch @@ -238,6 +343,9 @@ Certificate handling will attempt to validate the server certificate against that set of root certificates, and will fail if the validation attempt fails. + .. versionchanged:: 3.3 + This function is now IPv6-compatible. + .. function:: DER_cert_to_PEM_cert(DER_cert_bytes) Given a certificate as a DER-encoded blob of bytes, returns a PEM-encoded @@ -345,6 +453,46 @@ Constants .. versionadded:: 3.2 +.. data:: OP_CIPHER_SERVER_PREFERENCE + + Use the server's cipher ordering preference, rather than the client's. + This option has no effect on client sockets and SSLv2 server sockets. + + .. versionadded:: 3.3 + +.. data:: OP_SINGLE_DH_USE + + Prevents re-use of the same DH key for distinct SSL sessions. This + improves forward secrecy but requires more computational resources. + This option only applies to server sockets. + + .. versionadded:: 3.3 + +.. data:: OP_SINGLE_ECDH_USE + + Prevents re-use of the same ECDH key for distinct SSL sessions. This + improves forward secrecy but requires more computational resources. + This option only applies to server sockets. + + .. versionadded:: 3.3 + +.. data:: OP_NO_COMPRESSION + + Disable compression on the SSL channel. This is useful if the application + protocol supports its own compression scheme. + + This option is only available with OpenSSL 1.0.0 and later. + + .. versionadded:: 3.3 + +.. data:: HAS_ECDH + + Whether the OpenSSL library has built-in support for Elliptic Curve-based + Diffie-Hellman key exchange. This should be true unless the feature was + explicitly disabled by the distributor. + + .. versionadded:: 3.3 + .. data:: HAS_SNI Whether the OpenSSL library has built-in support for the *Server Name @@ -354,6 +502,23 @@ Constants .. versionadded:: 3.2 +.. data:: HAS_NPN + + Whether the OpenSSL library has built-in support for *Next Protocol + Negotiation* as described in the `NPN draft specification + <http://tools.ietf.org/html/draft-agl-tls-nextprotoneg>`_. When true, + you can use the :meth:`SSLContext.set_npn_protocols` method to advertise + which protocols you want to support. + + .. versionadded:: 3.3 + +.. data:: CHANNEL_BINDING_TYPES + + List of supported TLS channel binding types. Strings in this list + can be used as arguments to :meth:`SSLSocket.get_channel_binding`. + + .. versionadded:: 3.3 + .. data:: OPENSSL_VERSION The version string of the OpenSSL library loaded by the interpreter:: @@ -424,7 +589,7 @@ SSL sockets also have the following additional methods and attributes: If there is no certificate for the peer on the other end of the connection, returns ``None``. - If the parameter ``binary_form`` is :const:`False`, and a certificate was + If the ``binary_form`` parameter is :const:`False`, and a certificate was received from the peer, this method returns a :class:`dict` instance. If the certificate was not validated, the dict is empty. If the certificate was validated, it returns a dict with several keys, amongst them ``subject`` @@ -458,16 +623,23 @@ SSL sockets also have the following additional methods and attributes: 'version': 3} .. note:: + To validate a certificate for a particular service, you can use the :func:`match_hostname` function. If the ``binary_form`` parameter is :const:`True`, and a certificate was provided, this method returns the DER-encoded form of the entire certificate as a sequence of bytes, or :const:`None` if the peer did not provide a - certificate. This return value is independent of validation; if validation - was required (:const:`CERT_OPTIONAL` or :const:`CERT_REQUIRED`), it will have - been validated, but if :const:`CERT_NONE` was used to establish the - connection, the certificate, if present, will not have been validated. + certificate. Whether the peer provides a certificate depends on the SSL + socket's role: + + * for a client SSL socket, the server will always provide a certificate, + regardless of whether validation was required; + + * for a server SSL socket, the client will only provide a certificate + when requested by the server; therefore :meth:`getpeercert` will return + :const:`None` if you used :const:`CERT_NONE` (rather than + :const:`CERT_OPTIONAL` or :const:`CERT_REQUIRED`). .. versionchanged:: 3.2 The returned dictionary includes additional items such as ``issuer`` @@ -479,6 +651,37 @@ SSL sockets also have the following additional methods and attributes: version of the SSL protocol that defines its use, and the number of secret bits being used. If no connection has been established, returns ``None``. +.. method:: SSLSocket.compression() + + Return the compression algorithm being used as a string, or ``None`` + if the connection isn't compressed. + + If the higher-level protocol supports its own compression mechanism, + you can use :data:`OP_NO_COMPRESSION` to disable SSL-level compression. + + .. versionadded:: 3.3 + +.. method:: SSLSocket.get_channel_binding(cb_type="tls-unique") + + Get channel binding data for current connection, as a bytes object. Returns + ``None`` if not connected or the handshake has not been completed. + + The *cb_type* parameter allow selection of the desired channel binding + type. Valid channel binding types are listed in the + :data:`CHANNEL_BINDING_TYPES` list. Currently only the 'tls-unique' channel + binding, defined by :rfc:`5929`, is supported. :exc:`ValueError` will be + raised if an unsupported channel binding type is requested. + + .. versionadded:: 3.3 + +.. method:: SSLSocket.selected_npn_protocol() + + Returns the protocol that was selected during the TLS/SSL handshake. If + :meth:`SSLContext.set_npn_protocols` was not called, or if the other party + does not support NPN, or if the handshake has not yet happened, this will + return ``None``. + + .. versionadded:: 3.3 .. method:: SSLSocket.unwrap() @@ -488,7 +691,6 @@ SSL sockets also have the following additional methods and attributes: returned socket should always be used for further communication with the other side of the connection, rather than the original socket. - .. attribute:: SSLSocket.context The :class:`SSLContext` object this SSL socket is tied to. If the SSL @@ -518,7 +720,7 @@ to speed up repeated connections from the same clients. :class:`SSLContext` objects have the following methods and attributes: -.. method:: SSLContext.load_cert_chain(certfile, keyfile=None) +.. method:: SSLContext.load_cert_chain(certfile, keyfile=None, password=None) Load a private key and the corresponding certificate. The *certfile* string must be the path to a single file in PEM format containing the @@ -529,9 +731,25 @@ to speed up repeated connections from the same clients. :ref:`ssl-certificates` for more information on how the certificate is stored in the *certfile*. + The *password* argument may be a function to call to get the password for + decrypting the private key. It will only be called if the private key is + encrypted and a password is necessary. It will be called with no arguments, + and it should return a string, bytes, or bytearray. If the return value is + a string it will be encoded as UTF-8 before using it to decrypt the key. + Alternatively a string, bytes, or bytearray value may be supplied directly + as the *password* argument. It will be ignored if the private key is not + encrypted and no password is needed. + + If the *password* argument is not specified and a password is required, + OpenSSL's built-in password prompting mechanism will be used to + interactively prompt the user for a password. + An :class:`SSLError` is raised if the private key doesn't match with the certificate. + .. versionchanged:: 3.3 + New optional argument *password*. + .. method:: SSLContext.load_verify_locations(cafile=None, capath=None) Load a set of "certification authority" (CA) certificates used to validate @@ -570,12 +788,62 @@ to speed up repeated connections from the same clients. when connected, the :meth:`SSLSocket.cipher` method of SSL sockets will give the currently selected cipher. +.. method:: SSLContext.set_npn_protocols(protocols) + + Specify which protocols the socket should advertise during the SSL/TLS + handshake. It should be a list of strings, like ``['http/1.1', 'spdy/2']``, + ordered by preference. The selection of a protocol will happen during the + handshake, and will play out according to the `NPN draft specification + <http://tools.ietf.org/html/draft-agl-tls-nextprotoneg>`_. After a + successful handshake, the :meth:`SSLSocket.selected_npn_protocol` method will + return the agreed-upon protocol. + + This method will raise :exc:`NotImplementedError` if :data:`HAS_NPN` is + False. + + .. versionadded:: 3.3 + +.. method:: SSLContext.load_dh_params(dhfile) + + Load the key generation parameters for Diffie-Helman (DH) key exchange. + Using DH key exchange improves forward secrecy at the expense of + computational resources (both on the server and on the client). + The *dhfile* parameter should be the path to a file containing DH + parameters in PEM format. + + This setting doesn't apply to client sockets. You can also use the + :data:`OP_SINGLE_DH_USE` option to further improve security. + + .. versionadded:: 3.3 + +.. method:: SSLContext.set_ecdh_curve(curve_name) + + Set the curve name for Elliptic Curve-based Diffie-Hellman (ECDH) key + exchange. ECDH is significantly faster than regular DH while arguably + as secure. The *curve_name* parameter should be a string describing + a well-known elliptic curve, for example ``prime256v1`` for a widely + supported curve. + + This setting doesn't apply to client sockets. You can also use the + :data:`OP_SINGLE_ECDH_USE` option to further improve security. + + This method is not available if :data:`HAS_ECDH` is False. + + .. versionadded:: 3.3 + + .. seealso:: + `SSL/TLS & Perfect Forward Secrecy <http://vincent.bernat.im/en/blog/2011-ssl-perfect-forward-secrecy.html>`_ + Vincent Bernat. + .. method:: SSLContext.wrap_socket(sock, server_side=False, \ do_handshake_on_connect=True, suppress_ragged_eofs=True, \ server_hostname=None) Wrap an existing Python socket *sock* and return an :class:`SSLSocket` - object. The SSL socket is tied to the context, its settings and + object. *sock* must be a :data:`~socket.SOCK_STREAM` socket; other socket + types are unsupported. + + The returned SSL socket is tied to the context, its settings and certificates. The parameters *server_side*, *do_handshake_on_connect* and *suppress_ragged_eofs* have the same meaning as in the top-level :func:`wrap_socket` function. @@ -984,13 +1252,10 @@ to be aware of: try: sock.do_handshake() break - except ssl.SSLError as err: - if err.args[0] == ssl.SSL_ERROR_WANT_READ: - select.select([sock], [], []) - elif err.args[0] == ssl.SSL_ERROR_WANT_WRITE: - select.select([], [sock], []) - else: - raise + except ssl.SSLWantReadError: + select.select([sock], [], []) + except ssl.SSLWantWriteError: + select.select([], [sock], []) .. _ssl-security: @@ -1054,14 +1319,25 @@ format <http://www.openssl.org/docs/apps/ciphers.html#CIPHER_LIST_FORMAT>`_. If you want to check which ciphers are enabled by a given cipher list, use the ``openssl ciphers`` command on your system. +Multi-processing +^^^^^^^^^^^^^^^^ + +If using this module as part of a multi-processed application (using, +for example the :mod:`multiprocessing` or :mod:`concurrent.futures` modules), +be aware that OpenSSL's internal random number generator does not properly +handle forked processes. Applications must change the PRNG state of the +parent process if they use any SSL feature with :func:`os.fork`. Any +successful call of :func:`~ssl.RAND_add`, :func:`~ssl.RAND_bytes` or +:func:`~ssl.RAND_pseudo_bytes` is sufficient. + .. seealso:: Class :class:`socket.socket` - Documentation of underlying :mod:`socket` class + Documentation of underlying :mod:`socket` class - `TLS (Transport Layer Security) and SSL (Secure Socket Layer) <http://www3.rad.com/networks/applications/secure/tls.htm>`_ - Debby Koren + `SSL/TLS Strong Encryption: An Introduction <http://httpd.apache.org/docs/trunk/en/ssl/ssl_intro.html>`_ + Intro from the Apache webserver documentation `RFC 1422: Privacy Enhancement for Internet Electronic Mail: Part II: Certificate-Based Key Management <http://www.ietf.org/rfc/rfc1422>`_ Steve Kent diff --git a/Doc/library/stat.rst b/Doc/library/stat.rst index 7de98b6..6c20aa2 100644 --- a/Doc/library/stat.rst +++ b/Doc/library/stat.rst @@ -1,5 +1,5 @@ -:mod:`stat` --- Interpreting :func:`stat` results -================================================= +:mod:`stat` --- Interpreting :func:`~os.stat` results +===================================================== .. module:: stat :synopsis: Utilities for interpreting the results of os.stat(), @@ -104,6 +104,16 @@ Example:: if __name__ == '__main__': walktree(sys.argv[1], visitfile) +An additional utility function is provided to covert a file's mode in a human +readable string: + +.. function:: filemode(mode) + + Convert a file's mode to a string of the form '-rwxrwxrwx'. + + .. versionadded:: 3.3 + + All the variables below are simply symbolic indexes into the 10-tuple returned by :func:`os.stat`, :func:`os.fstat` or :func:`os.lstat`. @@ -172,10 +182,6 @@ The variables below define the flags used in the :data:`ST_MODE` field. Use of the functions above is more portable than use of the first set of flags: -.. data:: S_IFMT - - Bit mask for the file type bit fields. - .. data:: S_IFSOCK Socket. @@ -344,4 +350,3 @@ The following flags can be used in the *flags* argument of :func:`os.chflags`: The file is a snapshot file. See the \*BSD or Mac OS systems man page :manpage:`chflags(2)` for more information. - diff --git a/Doc/library/stdtypes.rst b/Doc/library/stdtypes.rst index 4b224b3..ab78cdc 100644 --- a/Doc/library/stdtypes.rst +++ b/Doc/library/stdtypes.rst @@ -15,6 +15,10 @@ interpreter. The principal built-in types are numerics, sequences, mappings, classes, instances and exceptions. +Some collection classes are mutable. The methods that add, subtract, or +rearrange their members in place, and don't return a specific item, never return +the collection instance itself but ``None``. + Some operations are supported by several object types; in particular, practically all objects can be compared, tested for truth value, and converted to a string (with the :func:`repr` function or the slightly different @@ -335,8 +339,8 @@ Notes: pair: C; language Conversion from floating point to integer may round or truncate - as in C; see functions :func:`floor` and :func:`ceil` in the :mod:`math` module - for well-defined conversions. + as in C; see functions :func:`math.floor` and :func:`math.ceil` for + well-defined conversions. (4) float also accepts the strings "nan" and "inf" with an optional prefix "+" @@ -515,9 +519,8 @@ class`. In addition, it provides one more method: >>> int.from_bytes([255, 0, 0], byteorder='big') 16711680 - The argument *bytes* must either support the buffer protocol or be an - iterable producing bytes. :class:`bytes` and :class:`bytearray` are - examples of built-in objects that support the buffer protocol. + The argument *bytes* must either be a :term:`bytes-like object` or an + iterable producing bytes. The *byteorder* argument determines the byte order used to represent the integer. If *byteorder* is ``"big"``, the most significant byte is at the @@ -628,7 +631,7 @@ efficiency across a variety of numeric types (including :class:`int`, :class:`float`, :class:`decimal.Decimal` and :class:`fractions.Fraction`) Python's hash for numeric types is based on a single mathematical function that's defined for any rational number, and hence applies to all instances of -:class:`int` and :class:`fraction.Fraction`, and all finite instances of +:class:`int` and :class:`fractions.Fraction`, and all finite instances of :class:`float` and :class:`decimal.Decimal`. Essentially, this function is given by reduction modulo ``P`` for a fixed prime ``P``. The value of ``P`` is made available to Python as the :attr:`modulus` attribute of @@ -641,34 +644,34 @@ made available to Python as the :attr:`modulus` attribute of Here are the rules in detail: - - If ``x = m / n`` is a nonnegative rational number and ``n`` is not divisible - by ``P``, define ``hash(x)`` as ``m * invmod(n, P) % P``, where ``invmod(n, - P)`` gives the inverse of ``n`` modulo ``P``. +- If ``x = m / n`` is a nonnegative rational number and ``n`` is not divisible + by ``P``, define ``hash(x)`` as ``m * invmod(n, P) % P``, where ``invmod(n, + P)`` gives the inverse of ``n`` modulo ``P``. - - If ``x = m / n`` is a nonnegative rational number and ``n`` is - divisible by ``P`` (but ``m`` is not) then ``n`` has no inverse - modulo ``P`` and the rule above doesn't apply; in this case define - ``hash(x)`` to be the constant value ``sys.hash_info.inf``. +- If ``x = m / n`` is a nonnegative rational number and ``n`` is + divisible by ``P`` (but ``m`` is not) then ``n`` has no inverse + modulo ``P`` and the rule above doesn't apply; in this case define + ``hash(x)`` to be the constant value ``sys.hash_info.inf``. - - If ``x = m / n`` is a negative rational number define ``hash(x)`` - as ``-hash(-x)``. If the resulting hash is ``-1``, replace it with - ``-2``. +- If ``x = m / n`` is a negative rational number define ``hash(x)`` + as ``-hash(-x)``. If the resulting hash is ``-1``, replace it with + ``-2``. - - The particular values ``sys.hash_info.inf``, ``-sys.hash_info.inf`` - and ``sys.hash_info.nan`` are used as hash values for positive - infinity, negative infinity, or nans (respectively). (All hashable - nans have the same hash value.) +- The particular values ``sys.hash_info.inf``, ``-sys.hash_info.inf`` + and ``sys.hash_info.nan`` are used as hash values for positive + infinity, negative infinity, or nans (respectively). (All hashable + nans have the same hash value.) - - For a :class:`complex` number ``z``, the hash values of the real - and imaginary parts are combined by computing ``hash(z.real) + - sys.hash_info.imag * hash(z.imag)``, reduced modulo - ``2**sys.hash_info.width`` so that it lies in - ``range(-2**(sys.hash_info.width - 1), 2**(sys.hash_info.width - - 1))``. Again, if the result is ``-1``, it's replaced with ``-2``. +- For a :class:`complex` number ``z``, the hash values of the real + and imaginary parts are combined by computing ``hash(z.real) + + sys.hash_info.imag * hash(z.imag)``, reduced modulo + ``2**sys.hash_info.width`` so that it lies in + ``range(-2**(sys.hash_info.width - 1), 2**(sys.hash_info.width - + 1))``. Again, if the result is ``-1``, it's replaced with ``-2``. To clarify the above rules, here's some example Python code, -equivalent to the builtin hash, for computing the hash of a rational +equivalent to the built-in hash, for computing the hash of a rational number, :class:`float`, or :class:`complex`:: @@ -748,7 +751,7 @@ support: iterators for those iteration types. (An example of an object supporting multiple forms of iteration would be a tree structure which supports both breadth-first and depth-first traversal.) This method corresponds to the - :attr:`tp_iter` slot of the type structure for Python objects in the Python/C + :c:member:`~PyTypeObject.tp_iter` slot of the type structure for Python objects in the Python/C API. The iterator objects themselves are required to support the following two @@ -759,7 +762,7 @@ methods, which together form the :dfn:`iterator protocol`: Return the iterator object itself. This is required to allow both containers and iterators to be used with the :keyword:`for` and :keyword:`in` statements. - This method corresponds to the :attr:`tp_iter` slot of the type structure for + This method corresponds to the :c:member:`~PyTypeObject.tp_iter` slot of the type structure for Python objects in the Python/C API. @@ -767,7 +770,7 @@ methods, which together form the :dfn:`iterator protocol`: Return the next item from the container. If there are no further items, raise the :exc:`StopIteration` exception. This method corresponds to the - :attr:`tp_iternext` slot of the type structure for Python objects in the + :c:member:`~PyTypeObject.tp_iternext` slot of the type structure for Python objects in the Python/C API. Python defines several iterator objects to support iteration over general and @@ -794,118 +797,37 @@ More information about generators can be found in :ref:`the documentation for the yield expression <yieldexpr>`. -.. index:: - single: string; sequence types - .. _typesseq: -Sequence Types --- :class:`str`, :class:`bytes`, :class:`bytearray`, :class:`list`, :class:`tuple`, :class:`range` -================================================================================================================== +Sequence Types --- :class:`list`, :class:`tuple`, :class:`range` +================================================================ -There are six sequence types: strings, byte sequences (:class:`bytes` objects), -byte arrays (:class:`bytearray` objects), lists, tuples, and range objects. For -other containers see the built in :class:`dict` and :class:`set` classes, and -the :mod:`collections` module. +There are three basic sequence types: lists, tuples, and range objects. +Additional sequence types tailored for processing of +:ref:`binary data <binaryseq>` and :ref:`text strings <textseq>` are +described in dedicated sections. -.. index:: - object: sequence - object: bytes - object: bytearray - object: tuple - object: list - object: range - object: string - single: string - single: str() (built-in function); (see also string) -Textual data in Python is handled with :class:`str` objects, or :dfn:`strings`. -Strings are immutable :ref:`sequences <typesseq>` of Unicode code points. -String literals are written in single or -double quotes: ``'xyzzy'``, ``"frobozz"``. See :ref:`strings` for more about -string literals. In addition to the functionality described here, there are -also string-specific methods described in the :ref:`string-methods` section. - -Bytes and bytearray objects contain single bytes -- the former is immutable -while the latter is a mutable sequence. -Bytes objects can be constructed by using the -constructor, :func:`bytes`, and from literals; use a ``b`` prefix with normal -string syntax: ``b'xyzzy'``. To construct byte arrays, use the -:func:`bytearray` function. - -While string objects are sequences of characters (represented by strings of -length 1), bytes and bytearray objects are sequences of *integers* (between 0 -and 255), representing the ASCII value of single bytes. That means that for -a bytes or bytearray object *b*, ``b[0]`` will be an integer, while -``b[0:1]`` will be a bytes or bytearray object of length 1. The -representation of bytes objects uses the literal format (``b'...'``) since it -is generally more useful than e.g. ``bytes([50, 19, 100])``. You can always -convert a bytes object into a list of integers using ``list(b)``. - -Also, while in previous Python versions, byte strings and Unicode strings -could be exchanged for each other rather freely (barring encoding issues), -strings and bytes are now completely separate concepts. There's no implicit -en-/decoding if you pass an object of the wrong type. A string always -compares unequal to a bytes or bytearray object. - -Lists are constructed with square brackets, separating items with commas: ``[a, -b, c]``. Tuples are constructed by the comma operator (not within square -brackets), with or without enclosing parentheses, but an empty tuple must have -the enclosing parentheses, such as ``a, b, c`` or ``()``. A single item tuple -must have a trailing comma, such as ``(d,)``. - -Objects of type range are created using the :func:`range` function. They don't -support concatenation or repetition, and using :func:`min` or :func:`max` on -them is inefficient. - -Most sequence types support the following operations. The ``in`` and ``not in`` -operations have the same priorities as the comparison operations. The ``+`` and -``*`` operations have the same priority as the corresponding numeric operations. -[3]_ Additional methods are provided for :ref:`typesseq-mutable`. +.. _typesseq-common: + +Common Sequence Operations +-------------------------- + +.. index:: object: sequence + +The operations in the following table are supported by most sequence types, +both mutable and immutable. The :class:`collections.abc.Sequence` ABC is +provided to make it easier to correctly implement these operations on +custom sequence types. This table lists the sequence operations sorted in ascending priority (operations in the same box have the same priority). In the table, *s* and *t* -are sequences of the same type; *n*, *i*, *j* and *k* are integers. - -+------------------+--------------------------------+----------+ -| Operation | Result | Notes | -+==================+================================+==========+ -| ``x in s`` | ``True`` if an item of *s* is | \(1) | -| | equal to *x*, else ``False`` | | -+------------------+--------------------------------+----------+ -| ``x not in s`` | ``False`` if an item of *s* is | \(1) | -| | equal to *x*, else ``True`` | | -+------------------+--------------------------------+----------+ -| ``s + t`` | the concatenation of *s* and | \(6) | -| | *t* | | -+------------------+--------------------------------+----------+ -| ``s * n, n * s`` | *n* shallow copies of *s* | \(2) | -| | concatenated | | -+------------------+--------------------------------+----------+ -| ``s[i]`` | *i*\ th item of *s*, origin 0 | \(3) | -+------------------+--------------------------------+----------+ -| ``s[i:j]`` | slice of *s* from *i* to *j* | (3)(4) | -+------------------+--------------------------------+----------+ -| ``s[i:j:k]`` | slice of *s* from *i* to *j* | (3)(5) | -| | with step *k* | | -+------------------+--------------------------------+----------+ -| ``len(s)`` | length of *s* | | -+------------------+--------------------------------+----------+ -| ``min(s)`` | smallest item of *s* | | -+------------------+--------------------------------+----------+ -| ``max(s)`` | largest item of *s* | | -+------------------+--------------------------------+----------+ -| ``s.index(i)`` | index of the first occurence | | -| | of *i* in *s* | | -+------------------+--------------------------------+----------+ -| ``s.count(i)`` | total number of occurences of | | -| | *i* in *s* | | -+------------------+--------------------------------+----------+ - -Sequence types also support comparisons. In particular, tuples and lists are -compared lexicographically by comparing corresponding elements. This means that -to compare equal, every element must compare equal and the two sequences must be -of the same type and have the same length. (For full details see -:ref:`comparisons` in the language reference.) +are sequences of the same type, *n*, *i*, *j* and *k* are integers and *x* is +an arbitrary object that meets any type and value restrictions imposed by *s*. + +The ``in`` and ``not in`` operations have the same priorities as the +comparison operations. The ``+`` (concatenation) and ``*`` (repetition) +operations have the same priority as the corresponding numeric operations. .. index:: triple: operations on; sequence; types @@ -918,18 +840,67 @@ of the same type and have the same length. (For full details see pair: slice; operation operator: in operator: not in + single: count() (sequence method) + single: index() (sequence method) + ++--------------------------+--------------------------------+----------+ +| Operation | Result | Notes | ++==========================+================================+==========+ +| ``x in s`` | ``True`` if an item of *s* is | \(1) | +| | equal to *x*, else ``False`` | | ++--------------------------+--------------------------------+----------+ +| ``x not in s`` | ``False`` if an item of *s* is | \(1) | +| | equal to *x*, else ``True`` | | ++--------------------------+--------------------------------+----------+ +| ``s + t`` | the concatenation of *s* and | (6)(7) | +| | *t* | | ++--------------------------+--------------------------------+----------+ +| ``s * n`` or | *n* shallow copies of *s* | (2)(7) | +| ``n * s`` | concatenated | | ++--------------------------+--------------------------------+----------+ +| ``s[i]`` | *i*\ th item of *s*, origin 0 | \(3) | ++--------------------------+--------------------------------+----------+ +| ``s[i:j]`` | slice of *s* from *i* to *j* | (3)(4) | ++--------------------------+--------------------------------+----------+ +| ``s[i:j:k]`` | slice of *s* from *i* to *j* | (3)(5) | +| | with step *k* | | ++--------------------------+--------------------------------+----------+ +| ``len(s)`` | length of *s* | | ++--------------------------+--------------------------------+----------+ +| ``min(s)`` | smallest item of *s* | | ++--------------------------+--------------------------------+----------+ +| ``max(s)`` | largest item of *s* | | ++--------------------------+--------------------------------+----------+ +| ``s.index(x[, i[, j]])`` | index of the first occurrence | \(8) | +| | of *x* in *s* (at or after | | +| | index *i* and before index *j*)| | ++--------------------------+--------------------------------+----------+ +| ``s.count(x)`` | total number of occurrences of | | +| | *x* in *s* | | ++--------------------------+--------------------------------+----------+ + +Sequences of the same type also support comparisons. In particular, tuples +and lists are compared lexicographically by comparing corresponding elements. +This means that to compare equal, every element must compare equal and the +two sequences must be of the same type and have the same length. (For full +details see :ref:`comparisons` in the language reference.) Notes: (1) - When *s* is a string object, the ``in`` and ``not in`` operations act like a - substring test. + While the ``in`` and ``not in`` operations are used only for simple + containment testing in the general case, some specialised sequences + (such as :class:`str`, :class:`bytes` and :class:`bytearray`) also use + them for subsequence testing:: + + >>> "gg" in "eggs" + True (2) Values of *n* less than ``0`` are treated as ``0`` (which yields an empty sequence of the same type as *s*). Note also that the copies are shallow; nested structures are not copied. This often haunts new Python programmers; - consider: + consider:: >>> lists = [[]] * 3 >>> lists @@ -941,7 +912,7 @@ Notes: What has happened is that ``[[]]`` is a one-element list containing an empty list, so all three elements of ``[[]] * 3`` are (pointers to) this single empty list. Modifying any of the elements of ``lists`` modifies this single list. - You can create a list of different lists this way: + You can create a list of different lists this way:: >>> lists = [[] for i in range(3)] >>> lists[0].append(3) @@ -972,33 +943,529 @@ Notes: If *k* is ``None``, it is treated like ``1``. (6) - Concatenating immutable strings always results in a new object. This means - that building up a string by repeated concatenation will have a quadratic - runtime cost in the total string length. To get a linear runtime cost, - you must switch to one of the alternatives below: + Concatenating immutable sequences always results in a new object. This + means that building up a sequence by repeated concatenation will have a + quadratic runtime cost in the total sequence length. To get a linear + runtime cost, you must switch to one of the alternatives below: * if concatenating :class:`str` objects, you can build a list and use - :meth:`str.join` at the end; + :meth:`str.join` at the end or else write to a :class:`io.StringIO` + instance and retrieve its value when complete * if concatenating :class:`bytes` objects, you can similarly use - :meth:`bytes.join`, or you can do in-place concatenation with a - :class:`bytearray` object. :class:`bytearray` objects are mutable and - have an efficient overallocation mechanism. + :meth:`bytes.join` or :class:`io.BytesIO`, or you can do in-place + concatenation with a :class:`bytearray` object. :class:`bytearray` + objects are mutable and have an efficient overallocation mechanism + + * if concatenating :class:`tuple` objects, extend a :class:`list` instead + + * for other types, investigate the relevant class documentation + + +(7) + Some sequence types (such as :class:`range`) only support item sequences + that follow specific patterns, and hence don't support sequence + concatenation or repetition. + +(8) + ``index`` raises :exc:`ValueError` when *x* is not found in *s*. + When supported, the additional arguments to the index method allow + efficient searching of subsections of the sequence. Passing the extra + arguments is roughly equivalent to using ``s[i:j].index(x)``, only + without copying any data and with the returned index being relative to + the start of the sequence rather than the start of the slice. + + +.. _typesseq-immutable: + +Immutable Sequence Types +------------------------ + +.. index:: + triple: immutable; sequence; types + object: tuple + builtin: hash + +The only operation that immutable sequence types generally implement that is +not also implemented by mutable sequence types is support for the :func:`hash` +built-in. + +This support allows immutable sequences, such as :class:`tuple` instances, to +be used as :class:`dict` keys and stored in :class:`set` and :class:`frozenset` +instances. + +Attempting to hash an immutable sequence that contains unhashable values will +result in :exc:`TypeError`. + + +.. _typesseq-mutable: + +Mutable Sequence Types +---------------------- + +.. index:: + triple: mutable; sequence; types + object: list + object: bytearray + +The operations in the following table are defined on mutable sequence types. +The :class:`collections.abc.MutableSequence` ABC is provided to make it +easier to correctly implement these operations on custom sequence types. + +In the table *s* is an instance of a mutable sequence type, *t* is any +iterable object and *x* is an arbitrary object that meets any type +and value restrictions imposed by *s* (for example, :class:`bytearray` only +accepts integers that meet the value restriction ``0 <= x <= 255``). + + +.. index:: + triple: operations on; sequence; types + triple: operations on; list; type + pair: subscript; assignment + pair: slice; assignment + statement: del + single: append() (sequence method) + single: clear() (sequence method) + single: copy() (sequence method) + single: extend() (sequence method) + single: insert() (sequence method) + single: pop() (sequence method) + single: remove() (sequence method) + single: reverse() (sequence method) + ++------------------------------+--------------------------------+---------------------+ +| Operation | Result | Notes | ++==============================+================================+=====================+ +| ``s[i] = x`` | item *i* of *s* is replaced by | | +| | *x* | | ++------------------------------+--------------------------------+---------------------+ +| ``s[i:j] = t`` | slice of *s* from *i* to *j* | | +| | is replaced by the contents of | | +| | the iterable *t* | | ++------------------------------+--------------------------------+---------------------+ +| ``del s[i:j]`` | same as ``s[i:j] = []`` | | ++------------------------------+--------------------------------+---------------------+ +| ``s[i:j:k] = t`` | the elements of ``s[i:j:k]`` | \(1) | +| | are replaced by those of *t* | | ++------------------------------+--------------------------------+---------------------+ +| ``del s[i:j:k]`` | removes the elements of | | +| | ``s[i:j:k]`` from the list | | ++------------------------------+--------------------------------+---------------------+ +| ``s.append(x)`` | appends *x* to the end of the | | +| | sequence (same as | | +| | ``s[len(s):len(s)] = [x]``) | | ++------------------------------+--------------------------------+---------------------+ +| ``s.clear()`` | removes all items from ``s`` | \(5) | +| | (same as ``del s[:]``) | | ++------------------------------+--------------------------------+---------------------+ +| ``s.copy()`` | creates a shallow copy of ``s``| \(5) | +| | (same as ``s[:]``) | | ++------------------------------+--------------------------------+---------------------+ +| ``s.extend(t)`` | extends *s* with the | | +| | contents of *t* (same as | | +| | ``s[len(s):len(s)] = t``) | | ++------------------------------+--------------------------------+---------------------+ +| ``s.insert(i, x)`` | inserts *x* into *s* at the | | +| | index given by *i* | | +| | (same as ``s[i:i] = [x]``) | | ++------------------------------+--------------------------------+---------------------+ +| ``s.pop([i])`` | retrieves the item at *i* and | \(2) | +| | also removes it from *s* | | ++------------------------------+--------------------------------+---------------------+ +| ``s.remove(x)`` | remove the first item from *s* | \(3) | +| | where ``s[i] == x`` | | ++------------------------------+--------------------------------+---------------------+ +| ``s.reverse()`` | reverses the items of *s* in | \(4) | +| | place | | ++------------------------------+--------------------------------+---------------------+ + + +Notes: + +(1) + *t* must have the same length as the slice it is replacing. + +(2) + The optional argument *i* defaults to ``-1``, so that by default the last + item is removed and returned. + +(3) + ``remove`` raises :exc:`ValueError` when *x* is not found in *s*. + +(4) + The :meth:`reverse` method modifies the sequence in place for economy of + space when reversing a large sequence. To remind users that it operates by + side effect, it does not return the reversed sequence. + +(5) + :meth:`clear` and :meth:`!copy` are included for consistency with the + interfaces of mutable containers that don't support slicing operations + (such as :class:`dict` and :class:`set`) + + .. versionadded:: 3.3 + :meth:`clear` and :meth:`!copy` methods. + + +.. _typesseq-list: + +Lists +----- + +.. index:: object: list + +Lists are mutable sequences, typically used to store collections of +homogeneous items (where the precise degree of similarity will vary by +application). + +.. class:: list([iterable]) + + Lists may be constructed in several ways: + + * Using a pair of square brackets to denote the empty list: ``[]`` + * Using square brackets, separating items with commas: ``[a]``, ``[a, b, c]`` + * Using a list comprehension: ``[x for x in iterable]`` + * Using the type constructor: ``list()`` or ``list(iterable)`` + + The constructor builds a list whose items are the same and in the same + order as *iterable*'s items. *iterable* may be either a sequence, a + container that supports iteration, or an iterator object. If *iterable* + is already a list, a copy is made and returned, similar to ``iterable[:]``. + For example, ``list('abc')`` returns ``['a', 'b', 'c']`` and + ``list( (1, 2, 3) )`` returns ``[1, 2, 3]``. + If no argument is given, the constructor creates a new empty list, ``[]``. + + + Many other operations also produce lists, including the :func:`sorted` + built-in. + + Lists implement all of the :ref:`common <typesseq-common>` and + :ref:`mutable <typesseq-mutable>` sequence operations. Lists also provide the + following additional method: + + .. method:: list.sort(*, key=None, reverse=None) + + This method sorts the list in place, using only ``<`` comparisons + between items. Exceptions are not suppressed - if any comparison operations + fail, the entire sort operation will fail (and the list will likely be left + in a partially modified state). + + :meth:`sort` accepts two arguments that can only be passed by keyword + (:ref:`keyword-only arguments <keyword-only_parameter>`): + + *key* specifies a function of one argument that is used to extract a + comparison key from each list element (for example, ``key=str.lower``). + The key corresponding to each item in the list is calculated once and + then used for the entire sorting process. The default value of ``None`` + means that list items are sorted directly without calculating a separate + key value. + + The :func:`functools.cmp_to_key` utility is available to convert a 2.x + style *cmp* function to a *key* function. + + *reverse* is a boolean value. If set to ``True``, then the list elements + are sorted as if each comparison were reversed. + + This method modifies the sequence in place for economy of space when + sorting a large sequence. To remind users that it operates by side + effect, it does not return the sorted sequence (use :func:`sorted` to + explicitly request a new sorted list instance). + + The :meth:`sort` method is guaranteed to be stable. A sort is stable if it + guarantees not to change the relative order of elements that compare equal + --- this is helpful for sorting in multiple passes (for example, sort by + department, then by salary grade). + + .. impl-detail:: + + While a list is being sorted, the effect of attempting to mutate, or even + inspect, the list is undefined. The C implementation of Python makes the + list appear empty for the duration, and raises :exc:`ValueError` if it can + detect that the list has been mutated during a sort. + + +.. _typesseq-tuple: + +Tuples +------ + +.. index:: object: tuple + +Tuples are immutable sequences, typically used to store collections of +heterogeneous data (such as the 2-tuples produced by the :func:`enumerate` +built-in). Tuples are also used for cases where an immutable sequence of +homogeneous data is needed (such as allowing storage in a :class:`set` or +:class:`dict` instance). + +.. class:: tuple([iterable]) + + Tuples may be constructed in a number of ways: + + * Using a pair of parentheses to denote the empty tuple: ``()`` + * Using a trailing comma for a singleton tuple: ``a,`` or ``(a,)`` + * Separating items with commas: ``a, b, c`` or ``(a, b, c)`` + * Using the :func:`tuple` built-in: ``tuple()`` or ``tuple(iterable)`` + + The constructor builds a tuple whose items are the same and in the same + order as *iterable*'s items. *iterable* may be either a sequence, a + container that supports iteration, or an iterator object. If *iterable* + is already a tuple, it is returned unchanged. For example, + ``tuple('abc')`` returns ``('a', 'b', 'c')`` and + ``tuple( [1, 2, 3] )`` returns ``(1, 2, 3)``. + If no argument is given, the constructor creates a new empty tuple, ``()``. + + Note that it is actually the comma which makes a tuple, not the parentheses. + The parentheses are optional, except in the empty tuple case, or + when they are needed to avoid syntactic ambiguity. For example, + ``f(a, b, c)`` is a function call with three arguments, while + ``f((a, b, c))`` is a function call with a 3-tuple as the sole argument. + + Tuples implement all of the :ref:`common <typesseq-common>` sequence + operations. + +For heterogeneous collections of data where access by name is clearer than +access by index, :func:`collections.namedtuple` may be a more appropriate +choice than a simple tuple object. + +.. _typesseq-range: + +Ranges +------ + +.. index:: object: range + +The :class:`range` type represents an immutable sequence of numbers and is +commonly used for looping a specific number of times in :keyword:`for` +loops. + +.. class:: range(stop) + range(start, stop[, step]) + + The arguments to the range constructor must be integers (either built-in + :class:`int` or any object that implements the ``__index__`` special + method). If the *step* argument is omitted, it defaults to ``1``. + If the *start* argument is omitted, it defaults to ``0``. + If *step* is zero, :exc:`ValueError` is raised. + + For a positive *step*, the contents of a range ``r`` are determined by the + formula ``r[i] = start + step*i`` where ``i >= 0`` and + ``r[i] < stop``. + + For a negative *step*, the contents of the range are still determined by + the formula ``r[i] = start + step*i``, but the constraints are ``i >= 0`` + and ``r[i] > stop``. + + A range object will be empty if ``r[0]`` does not meet the value + constraint. Ranges do support negative indices, but these are interpreted + as indexing from the end of the sequence determined by the positive + indices. + + Ranges containing absolute values larger than :data:`sys.maxsize` are + permitted but some features (such as :func:`len`) may raise + :exc:`OverflowError`. + + Range examples:: + + >>> list(range(10)) + [0, 1, 2, 3, 4, 5, 6, 7, 8, 9] + >>> list(range(1, 11)) + [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] + >>> list(range(0, 30, 5)) + [0, 5, 10, 15, 20, 25] + >>> list(range(0, 10, 3)) + [0, 3, 6, 9] + >>> list(range(0, -10, -1)) + [0, -1, -2, -3, -4, -5, -6, -7, -8, -9] + >>> list(range(0)) + [] + >>> list(range(1, 0)) + [] + + Ranges implement all of the :ref:`common <typesseq-common>` sequence operations + except concatenation and repetition (due to the fact that range objects can + only represent sequences that follow a strict pattern and repetition and + concatenation will usually violate that pattern). + + .. data: start + + The value of the *start* parameter (or ``0`` if the parameter was + not supplied) + + .. data: stop + + The value of the *stop* parameter + + .. data: step + + The value of the *step* parameter (or ``1`` if the parameter was + not supplied) + +The advantage of the :class:`range` type over a regular :class:`list` or +:class:`tuple` is that a :class:`range` object will always take the same +(small) amount of memory, no matter the size of the range it represents (as it +only stores the ``start``, ``stop`` and ``step`` values, calculating individual +items and subranges as needed). + +Range objects implement the :class:`collections.abc.Sequence` ABC, and provide +features such as containment tests, element index lookup, slicing and +support for negative indices (see :ref:`typesseq`): + + >>> r = range(0, 20, 2) + >>> r + range(0, 20, 2) + >>> 11 in r + False + >>> 10 in r + True + >>> r.index(10) + 5 + >>> r[5] + 10 + >>> r[:5] + range(0, 10, 2) + >>> r[-1] + 18 + +Testing range objects for equality with ``==`` and ``!=`` compares +them as sequences. That is, two range objects are considered equal if +they represent the same sequence of values. (Note that two range +objects that compare equal might have different :attr:`~range.start`, +:attr:`~range.stop` and :attr:`~range.step` attributes, for example +``range(0) == range(2, 1, 3)`` or ``range(0, 3, 2) == range(0, 4, 2)``.) + +.. versionchanged:: 3.2 + Implement the Sequence ABC. + Support slicing and negative indices. + Test :class:`int` objects for membership in constant time instead of + iterating through all items. + +.. versionchanged:: 3.3 + Define '==' and '!=' to compare range objects based on the + sequence of values they define (instead of comparing based on + object identity). + +.. versionadded:: 3.3 + The :attr:`~range.start`, :attr:`~range.stop` and :attr:`~range.step` + attributes. + + +.. index:: + single: string; text sequence type + single: str (built-in class); (see also string) + object: string + +.. _textseq: + +Text Sequence Type --- :class:`str` +=================================== + +Textual data in Python is handled with :class:`str` objects, or :dfn:`strings`. +Strings are immutable +:ref:`sequences <typesseq>` of Unicode code points. String literals are +written in a variety of ways: + +* Single quotes: ``'allows embedded "double" quotes'`` +* Double quotes: ``"allows embedded 'single' quotes"``. +* Triple quoted: ``'''Three single quotes'''``, ``"""Three double quotes"""`` + +Triple quoted strings may span multiple lines - all associated whitespace will +be included in the string literal. + +String literals that are part of a single expression and have only whitespace +between them will be implicitly converted to a single string literal. That +is, ``("spam " "eggs") == "spam eggs"``. + +See :ref:`strings` for more about the various forms of string literal, +including supported escape sequences, and the ``r`` ("raw") prefix that +disables most escape sequence processing. + +Strings may also be created from other objects using the :class:`str` +constructor. + +Since there is no separate "character" type, indexing a string produces +strings of length 1. That is, for a non-empty string *s*, ``s[0] == s[0:1]``. + +.. index:: + object: io.StringIO + +There is also no mutable string type, but :meth:`str.join` or +:class:`io.StringIO` can be used to efficiently construct strings from +multiple fragments. + +.. versionchanged:: 3.3 + For backwards compatibility with the Python 2 series, the ``u`` prefix is + once again permitted on string literals. It has no effect on the meaning + of string literals and cannot be combined with the ``r`` prefix. + + +.. index:: + single: string; str (built-in class) + +.. class:: str(object='') + str(object=b'', encoding='utf-8', errors='strict') + + Return a :ref:`string <textseq>` version of *object*. If *object* is not + provided, returns the empty string. Otherwise, the behavior of ``str()`` + depends on whether *encoding* or *errors* is given, as follows. + + If neither *encoding* nor *errors* is given, ``str(object)`` returns + :meth:`object.__str__() <object.__str__>`, which is the "informal" or nicely + printable string representation of *object*. For string objects, this is + the string itself. If *object* does not have a :meth:`~object.__str__` + method, then :func:`str` falls back to returning + :meth:`repr(object) <repr>`. + + .. index:: + single: buffer protocol; str (built-in class) + single: bytes; str (built-in class) + + If at least one of *encoding* or *errors* is given, *object* should be a + :term:`bytes-like object` (e.g. :class:`bytes` or :class:`bytearray`). In + this case, if *object* is a :class:`bytes` (or :class:`bytearray`) object, + then ``str(bytes, encoding, errors)`` is equivalent to + :meth:`bytes.decode(encoding, errors) <bytes.decode>`. Otherwise, the bytes + object underlying the buffer object is obtained before calling + :meth:`bytes.decode`. See :ref:`binaryseq` and + :ref:`bufferobjects` for information on buffer objects. + + Passing a :class:`bytes` object to :func:`str` without the *encoding* + or *errors* arguments falls under the first case of returning the informal + string representation (see also the :option:`-b` command-line option to + Python). For example:: + + >>> str(b'Zoot!') + "b'Zoot!'" + + For more information on the ``str`` class and its methods, see + :ref:`textseq` and the :ref:`string-methods` section below. To output + formatted strings, see the :ref:`string-formatting` section. In addition, + see the :ref:`stringservices` section. + + +.. index:: + pair: string; methods .. _string-methods: String Methods -------------- -.. index:: pair: string; methods +.. index:: + module: re -String objects support the methods listed below. +Strings implement all of the :ref:`common <typesseq-common>` sequence +operations, along with the additional methods described below. -In addition, Python's strings support the sequence type methods described in the -:ref:`typesseq` section. To output formatted strings, see the -:ref:`string-formatting` section. Also, see the :mod:`re` module for string -functions based on regular expressions. +Strings also support two styles of string formatting, one providing a large +degree of flexibility and customization (see :meth:`str.format`, +:ref:`formatstrings` and :ref:`string-formatting`) and the other based on C +``printf`` style formatting that handles a narrower range of types and is +slightly harder to use correctly, but is often faster for the cases it can +handle (:ref:`old-string-formatting`). + +The :ref:`textservices` section of the standard library covers a number of +other modules that provide various text related utilities (including regular +expression support in the :mod:`re` module). .. method:: str.capitalize() @@ -1006,6 +1473,23 @@ functions based on regular expressions. rest lowercased. +.. method:: str.casefold() + + Return a casefolded copy of the string. Casefolded strings may be used for + caseless matching. + + Casefolding is similar to lowercasing but more aggressive because it is + intended to remove all case distinctions in a string. For example, the German + lowercase letter ``'ß'`` is equivalent to ``"ss"``. Since it is already + lowercase, :meth:`lower` would do nothing to ``'ß'``; :meth:`casefold` + converts it to ``"ss"``. + + The casefolding algorithm is described in section 3.13 of the Unicode + Standard. + + .. versionadded:: 3.3 + + .. method:: str.center(width[, fillchar]) Return centered in a string of length *width*. Padding is done using the @@ -1044,11 +1528,23 @@ functions based on regular expressions. .. method:: str.expandtabs([tabsize]) - Return a copy of the string where all tab characters are replaced by zero or - more spaces, depending on the current column and the given tab size. The - column number is reset to zero after each newline occurring in the string. - If *tabsize* is not given, a tab size of ``8`` characters is assumed. This - doesn't understand other non-printing characters or escape sequences. + Return a copy of the string where all tab characters are replaced by one or + more spaces, depending on the current column and the given tab size. Tab + positions occur every *tabsize* characters (default is 8, giving tab + positions at columns 0, 8, 16 and so on). To expand the string, the current + column is set to zero and the string is examined character by character. If + the character is a tab (``\t``), one or more space characters are inserted + in the result until the current column is equal to the next tab position. + (The tab character itself is not copied.) If the character is a newline + (``\n``) or return (``\r``), it is copied and the current column is reset to + zero. Any other character is copied unchanged and the current column is + incremented by one regardless of how the character is represented when + printed. + + >>> '01\t012\t0123\t01234'.expandtabs() + '01 012 0123 01234' + >>> '01\t012\t0123\t01234'.expandtabs(4) + '01 012 0123 01234' .. method:: str.find(sub[, start[, end]]) @@ -1087,7 +1583,7 @@ functions based on regular expressions. .. method:: str.format_map(mapping) Similar to ``str.format(**mapping)``, except that ``mapping`` is - used directly and not copied to a :class:`dict` . This is useful + used directly and not copied to a :class:`dict`. This is useful if for example ``mapping`` is a dict subclass: >>> class Default(dict): @@ -1145,6 +1641,8 @@ functions based on regular expressions. Return true if the string is a valid identifier according to the language definition, section :ref:`identifiers`. + Use :func:`keyword.iskeyword` to test for reserved identifiers such as + :keyword:`def` and :keyword:`class`. .. method:: str.islower() @@ -1213,6 +1711,9 @@ functions based on regular expressions. Return a copy of the string with all the cased characters [4]_ converted to lowercase. + The lowercasing algorithm used is described in section 3.13 of the Unicode + Standard. + .. method:: str.lstrip([chars]) @@ -1285,7 +1786,7 @@ functions based on regular expressions. two empty strings, followed by the string itself. -.. method:: str.rsplit([sep[, maxsplit]]) +.. method:: str.rsplit(sep=None, maxsplit=-1) Return a list of the words in the string, using *sep* as the delimiter string. If *maxsplit* is given, at most *maxsplit* splits are done, the *rightmost* @@ -1307,7 +1808,7 @@ functions based on regular expressions. 'mississ' -.. method:: str.split([sep[, maxsplit]]) +.. method:: str.split(sep=None, maxsplit=-1) Return a list of the words in the string, using *sep* as the delimiter string. If *maxsplit* is given, at most *maxsplit* splits are done (thus, @@ -1376,7 +1877,8 @@ functions based on regular expressions. .. method:: str.swapcase() Return a copy of the string with uppercase characters converted to lowercase and - vice versa. + vice versa. Note that it is not necessarily true that + ``s.swapcase().swapcase() == s``. .. method:: str.title() @@ -1427,7 +1929,11 @@ functions based on regular expressions. Return a copy of the string with all the cased characters [4]_ converted to uppercase. Note that ``str.upper().isupper()`` might be ``False`` if ``s`` contains uncased characters or if the Unicode category of the resulting - character(s) is not "Lu" (Letter, uppercase), but e.g. "Lt" (Letter, titlecase). + character(s) is not "Lu" (Letter, uppercase), but e.g. "Lt" (Letter, + titlecase). + + The uppercasing algorithm used is described in section 3.13 of the Unicode + Standard. .. method:: str.zfill(width) @@ -1440,8 +1946,8 @@ functions based on regular expressions. .. _old-string-formatting: -Old String Formatting Operations --------------------------------- +``printf``-style String Formatting +---------------------------------- .. index:: single: formatting, string (%) @@ -1453,23 +1959,19 @@ Old String Formatting Operations single: % formatting single: % interpolation -.. XXX is the note enough? - .. note:: - The formatting operations described here are modelled on C's printf() - syntax. They only support formatting of certain builtin types. The - use of a binary operator means that care may be needed in order to - format tuples and dictionaries correctly. As the new - :ref:`string-formatting` syntax is more flexible and handles tuples and - dictionaries naturally, it is recommended for new code. However, there - are no current plans to deprecate printf-style formatting. + The formatting operations described here exhibit a variety of quirks that + lead to a number of common errors (such as failing to display tuples and + dictionaries correctly). Using the newer :meth:`str.format` interface + helps avoid these errors, and also provides a generally more powerful, + flexible and extensible approach to formatting text. String objects have one unique built-in operation: the ``%`` operator (modulo). This is also known as the string *formatting* or *interpolation* operator. Given ``format % values`` (where *format* is a string), ``%`` conversion specifications in *format* are replaced with zero or more elements of *values*. -The effect is similar to the using :c:func:`sprintf` in the C language. +The effect is similar to using the :c:func:`sprintf` in the C language. If *format* requires a single argument, *values* may be a single non-tuple object. [5]_ Otherwise, *values* must be a tuple with exactly the number of @@ -1627,211 +2129,181 @@ that ``'\0'`` is the end of the string. ``%f`` conversions for numbers whose absolute value is over 1e50 are no longer replaced by ``%g`` conversions. -.. index:: - module: string - module: re - -Additional string operations are defined in standard modules :mod:`string` and -:mod:`re`. +.. index:: + single: buffer protocol; binary sequence types -.. _typesseq-range: - -Range Type ----------- - -.. index:: object: range - -The :class:`range` type is an immutable sequence which is commonly used for -looping. The advantage of the :class:`range` type is that an :class:`range` -object will always take the same amount of memory, no matter the size of the -range it represents. - -Range objects have relatively little behavior: they support indexing, contains, -iteration, the :func:`len` function, and the following methods: - -.. method:: range.count(x) +.. _binaryseq: - Return the number of *i*'s for which ``s[i] == x``. +Binary Sequence Types --- :class:`bytes`, :class:`bytearray`, :class:`memoryview` +================================================================================= - .. versionadded:: 3.2 +.. index:: + object: bytes + object: bytearray + object: memoryview + module: array -.. method:: range.index(x) +The core built-in types for manipulating binary data are :class:`bytes` and +:class:`bytearray`. They are supported by :class:`memoryview` which uses +the :ref:`buffer protocol <bufferobjects>` to access the memory of other +binary objects without needing to make a copy. - Return the smallest *i* such that ``s[i] == x``. Raises - :exc:`ValueError` when *x* is not in the range. +The :mod:`array` module supports efficient storage of basic data types like +32-bit integers and IEEE754 double-precision floating values. - .. versionadded:: 3.2 +.. _typebytes: -.. _typesseq-mutable: +Bytes +----- -Mutable Sequence Types ----------------------- +.. index:: object: bytes -.. index:: - triple: mutable; sequence; types - object: list - object: bytearray +Bytes objects are immutable sequences of single bytes. Since many major +binary protocols are based on the ASCII text encoding, bytes objects offer +several methods that are only valid when working with ASCII compatible +data and are closely related to string objects in a variety of other ways. -List and bytearray objects support additional operations that allow in-place -modification of the object. Other mutable sequence types (when added to the -language) should also support these operations. Strings and tuples are -immutable sequence types: such objects cannot be modified once created. The -following operations are defined on mutable sequence types (where *x* is an -arbitrary object). +Firstly, the syntax for bytes literals is largely the same as that for string +literals, except that a ``b`` prefix is added: -Note that while lists allow their items to be of any type, bytearray object -"items" are all integers in the range 0 <= x < 256. +* Single quotes: ``b'still allows embedded "double" quotes'`` +* Double quotes: ``b"still allows embedded 'single' quotes"``. +* Triple quoted: ``b'''3 single quotes'''``, ``b"""3 double quotes"""`` -.. index:: - triple: operations on; sequence; types - triple: operations on; list; type - pair: subscript; assignment - pair: slice; assignment - statement: del - single: append() (sequence method) - single: extend() (sequence method) - single: count() (sequence method) - single: index() (sequence method) - single: insert() (sequence method) - single: pop() (sequence method) - single: remove() (sequence method) - single: reverse() (sequence method) - single: sort() (sequence method) +Only ASCII characters are permitted in bytes literals (regardless of the +declared source code encoding). Any binary values over 127 must be entered +into bytes literals using the appropriate escape sequence. -+------------------------------+--------------------------------+---------------------+ -| Operation | Result | Notes | -+==============================+================================+=====================+ -| ``s[i] = x`` | item *i* of *s* is replaced by | | -| | *x* | | -+------------------------------+--------------------------------+---------------------+ -| ``s[i:j] = t`` | slice of *s* from *i* to *j* | | -| | is replaced by the contents of | | -| | the iterable *t* | | -+------------------------------+--------------------------------+---------------------+ -| ``del s[i:j]`` | same as ``s[i:j] = []`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s[i:j:k] = t`` | the elements of ``s[i:j:k]`` | \(1) | -| | are replaced by those of *t* | | -+------------------------------+--------------------------------+---------------------+ -| ``del s[i:j:k]`` | removes the elements of | | -| | ``s[i:j:k]`` from the list | | -+------------------------------+--------------------------------+---------------------+ -| ``s.append(x)`` | same as ``s[len(s):len(s)] = | | -| | [x]`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.extend(x)`` | same as ``s[len(s):len(s)] = | \(2) | -| | x`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.count(x)`` | return number of *i*'s for | | -| | which ``s[i] == x`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.index(x[, i[, j]])`` | return smallest *k* such that | \(3) | -| | ``s[k] == x`` and ``i <= k < | | -| | j`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.insert(i, x)`` | same as ``s[i:i] = [x]`` | \(4) | -+------------------------------+--------------------------------+---------------------+ -| ``s.pop([i])`` | same as ``x = s[i]; del s[i]; | \(5) | -| | return x`` | | -+------------------------------+--------------------------------+---------------------+ -| ``s.remove(x)`` | same as ``del s[s.index(x)]`` | \(3) | -+------------------------------+--------------------------------+---------------------+ -| ``s.reverse()`` | reverses the items of *s* in | \(6) | -| | place | | -+------------------------------+--------------------------------+---------------------+ -| ``s.sort([key[, reverse]])`` | sort the items of *s* in place | (6), (7), (8) | -+------------------------------+--------------------------------+---------------------+ +As with string literals, bytes literals may also use a ``r`` prefix to disable +processing of escape sequences. See :ref:`strings` for more about the various +forms of bytes literal, including supported escape sequences. +While bytes literals and representations are based on ASCII text, bytes +objects actually behave like immutable sequences of integers, with each +value in the sequence restricted such that ``0 <= x < 256`` (attempts to +violate this restriction will trigger :exc:`ValueError`. This is done +deliberately to emphasise that while many binary formats include ASCII based +elements and can be usefully manipulated with some text-oriented algorithms, +this is not generally the case for arbitrary binary data (blindly applying +text processing algorithms to binary data formats that are not ASCII +compatible will usually lead to data corruption). -Notes: +In addition to the literal forms, bytes objects can be created in a number of +other ways: -(1) - *t* must have the same length as the slice it is replacing. +* A zero-filled bytes object of a specified length: ``bytes(10)`` +* From an iterable of integers: ``bytes(range(20))`` +* Copying existing binary data via the buffer protocol: ``bytes(obj)`` -(2) - *x* can be any iterable object. +Also see the :ref:`bytes <func-bytes>` built-in. -(3) - Raises :exc:`ValueError` when *x* is not found in *s*. When a negative index is - passed as the second or third parameter to the :meth:`index` method, the sequence - length is added, as for slice indices. If it is still negative, it is truncated - to zero, as for slice indices. +Since bytes objects are sequences of integers, for a bytes object *b*, +``b[0]`` will be an integer, while ``b[0:1]`` will be a bytes object of +length 1. (This contrasts with text strings, where both indexing and +slicing will produce a string of length 1) -(4) - When a negative index is passed as the first parameter to the :meth:`insert` - method, the sequence length is added, as for slice indices. If it is still - negative, it is truncated to zero, as for slice indices. +The representation of bytes objects uses the literal format (``b'...'``) +since it is often more useful than e.g. ``bytes([46, 46, 46])``. You can +always convert a bytes object into a list of integers using ``list(b)``. -(5) - The optional argument *i* defaults to ``-1``, so that by default the last - item is removed and returned. -(6) - The :meth:`sort` and :meth:`reverse` methods modify the sequence in place for - economy of space when sorting or reversing a large sequence. To remind you - that they operate by side effect, they don't return the sorted or reversed - sequence. +.. note:: + For Python 2.x users: In the Python 2.x series, a variety of implicit + conversions between 8-bit strings (the closest thing 2.x offers to a + built-in binary data type) and Unicode strings were permitted. This was a + backwards compatibility workaround to account for the fact that Python + originally only supported 8-bit text, and Unicode text was a later + addition. In Python 3.x, those implicit conversions are gone - conversions + between 8-bit binary data and Unicode text must be explicit, and bytes and + string objects will always compare unequal. -(7) - The :meth:`sort` method takes optional arguments for controlling the - comparisons. Each must be specified as a keyword argument. - *key* specifies a function of one argument that is used to extract a comparison - key from each list element: ``key=str.lower``. The default value is ``None``. - Use :func:`functools.cmp_to_key` to convert an - old-style *cmp* function to a *key* function. +.. _typebytearray: +Bytearray Objects +----------------- - *reverse* is a boolean value. If set to ``True``, then the list elements are - sorted as if each comparison were reversed. +.. index:: object: bytearray - The :meth:`sort` method is guaranteed to be stable. A - sort is stable if it guarantees not to change the relative order of elements - that compare equal --- this is helpful for sorting in multiple passes (for - example, sort by department, then by salary grade). +:class:`bytearray` objects are a mutable counterpart to :class:`bytes` +objects. There is no dedicated literal syntax for bytearray objects, instead +they are always created by calling the constructor: - .. impl-detail:: +* Creating an empty instance: ``bytearray()`` +* Creating a zero-filled instance with a given length: ``bytearray(10)`` +* From an iterable of integers: ``bytearray(range(20))`` +* Copying existing binary data via the buffer protocol: ``bytearray(b'Hi!')`` - While a list is being sorted, the effect of attempting to mutate, or even - inspect, the list is undefined. The C implementation of Python makes the - list appear empty for the duration, and raises :exc:`ValueError` if it can - detect that the list has been mutated during a sort. +As bytearray objects are mutable, they support the +:ref:`mutable <typesseq-mutable>` sequence operations in addition to the +common bytes and bytearray operations described in :ref:`bytes-methods`. -(8) - :meth:`sort` is not supported by :class:`bytearray` objects. +Also see the :ref:`bytearray <func-bytearray>` built-in. .. _bytes-methods: -Bytes and Byte Array Methods ----------------------------- +Bytes and Bytearray Operations +------------------------------ .. index:: pair: bytes; methods pair: bytearray; methods -Bytes and bytearray objects, being "strings of bytes", have all methods found on -strings, with the exception of :func:`encode`, :func:`format` and -:func:`isidentifier`, which do not make sense with these types. For converting -the objects to strings, they have a :func:`decode` method. +Both bytes and bytearray objects support the :ref:`common <typesseq-common>` +sequence operations. They interoperate not just with operands of the same +type, but with any object that supports the +:ref:`buffer protocol <bufferobjects>`. Due to this flexibility, they can be +freely mixed in operations without causing errors. However, the return type +of the result may depend on the order of operands. + +Due to the common use of ASCII text as the basis for binary protocols, bytes +and bytearray objects provide almost all methods found on text strings, with +the exceptions of: + +* :meth:`str.encode` (which converts text strings to bytes objects) +* :meth:`str.format` and :meth:`str.format_map` (which are used to format + text for display to users) +* :meth:`str.isidentifier`, :meth:`str.isnumeric`, :meth:`str.isdecimal`, + :meth:`str.isprintable` (which are used to check various properties of + text strings which are not typically applicable to binary protocols). -Wherever one of these methods needs to interpret the bytes as characters -(e.g. the :func:`is...` methods), the ASCII character set is assumed. +All other string methods are supported, although sometimes with slight +differences in functionality and semantics (as described below). .. note:: The methods on bytes and bytearray objects don't accept strings as their arguments, just as the methods on strings don't accept bytes as their - arguments. For example, you have to write :: + arguments. For example, you have to write:: a = "abc" b = a.replace("a", "f") - and :: + and:: a = b"abc" b = a.replace(b"a", b"f") +Whenever a bytes or bytearray method needs to interpret the bytes as +characters (e.g. the :meth:`is...` methods, :meth:`split`, :meth:`strip`), +the ASCII character set is assumed (text strings use Unicode semantics). + +.. note:: + Using these ASCII based methods to manipulate binary data that is not + stored in an ASCII based format may lead to data corruption. + +The search operations (:keyword:`in`, :meth:`count`, :meth:`find`, +:meth:`index`, :meth:`rfind` and :meth:`rindex`) all accept both integers +in the range 0 to 255 (inclusive) as well as bytes and byte array sequences. + +.. versionchanged:: 3.3 + All of the search methods also accept an integer in the range 0 to 255 + (inclusive) as their first argument. + + +Each bytes and bytearray instance provides a :meth:`~bytes.decode` convenience +method that is the inverse of :meth:`str.encode`: .. method:: bytes.decode(encoding="utf-8", errors="strict") bytearray.decode(encoding="utf-8", errors="strict") @@ -1847,8 +2319,10 @@ Wherever one of these methods needs to interpret the bytes as characters .. versionchanged:: 3.1 Added support for keyword arguments. - -The bytes and bytearray types have an additional class method: +Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal +numbers are a commonly used format for describing binary data. Accordingly, +the bytes and bytearray types have an additional class method to read data in +that format: .. classmethod:: bytes.fromhex(string) bytearray.fromhex(string) @@ -1857,8 +2331,8 @@ The bytes and bytearray types have an additional class method: decoding the given string object. The string must contain two hexadecimal digits per byte, spaces are ignored. - >>> bytes.fromhex('f0 f1f2 ') - b'\xf0\xf1\xf2' + >>> bytes.fromhex('2Ef0 F1f2 ') + b'.\xf0\xf1\xf2' The maketrans and translate methods differ in semantics from the versions @@ -1892,6 +2366,428 @@ available on strings: .. versionadded:: 3.1 +.. _typememoryview: + +Memory Views +------------ + +:class:`memoryview` objects allow Python code to access the internal data +of an object that supports the :ref:`buffer protocol <bufferobjects>` without +copying. + +.. class:: memoryview(obj) + + Create a :class:`memoryview` that references *obj*. *obj* must support the + buffer protocol. Built-in objects that support the buffer protocol include + :class:`bytes` and :class:`bytearray`. + + A :class:`memoryview` has the notion of an *element*, which is the + atomic memory unit handled by the originating object *obj*. For many + simple types such as :class:`bytes` and :class:`bytearray`, an element + is a single byte, but other types such as :class:`array.array` may have + bigger elements. + + ``len(view)`` is equal to the length of :class:`~memoryview.tolist`. + If ``view.ndim = 0``, the length is 1. If ``view.ndim = 1``, the length + is equal to the number of elements in the view. For higher dimensions, + the length is equal to the length of the nested list representation of + the view. The :class:`~memoryview.itemsize` attribute will give you the + number of bytes in a single element. + + A :class:`memoryview` supports slicing to expose its data. If + :class:`~memoryview.format` is one of the native format specifiers + from the :mod:`struct` module, indexing will return a single element + with the correct type. Full slicing will result in a subview:: + + >>> v = memoryview(b'abcefg') + >>> v[1] + 98 + >>> v[-1] + 103 + >>> v[1:4] + <memory at 0x7f3ddc9f4350> + >>> bytes(v[1:4]) + b'bce' + + Other native formats:: + + >>> import array + >>> a = array.array('l', [-11111111, 22222222, -33333333, 44444444]) + >>> a[0] + -11111111 + >>> a[-1] + 44444444 + >>> a[2:3].tolist() + [-33333333] + >>> a[::2].tolist() + [-11111111, -33333333] + >>> a[::-1].tolist() + [44444444, -33333333, 22222222, -11111111] + + .. versionadded:: 3.3 + + If the underlying object is writable, the memoryview supports slice + assignment. Resizing is not allowed:: + + >>> data = bytearray(b'abcefg') + >>> v = memoryview(data) + >>> v.readonly + False + >>> v[0] = ord(b'z') + >>> data + bytearray(b'zbcefg') + >>> v[1:4] = b'123' + >>> data + bytearray(b'z123fg') + >>> v[2:3] = b'spam' + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + ValueError: memoryview assignment: lvalue and rvalue have different structures + >>> v[2:6] = b'spam' + >>> data + bytearray(b'z1spam') + + One-dimensional memoryviews of hashable (read-only) types with formats + 'B', 'b' or 'c' are also hashable. The hash is defined as + ``hash(m) == hash(m.tobytes())``:: + + >>> v = memoryview(b'abcefg') + >>> hash(v) == hash(b'abcefg') + True + >>> hash(v[2:4]) == hash(b'ce') + True + >>> hash(v[::-2]) == hash(b'abcefg'[::-2]) + True + + .. versionchanged:: 3.3 + One-dimensional memoryviews with formats 'B', 'b' or 'c' are now hashable. + + :class:`memoryview` has several methods: + + .. method:: __eq__(exporter) + + A memoryview and a :pep:`3118` exporter are equal if their shapes are + equivalent and if all corresponding values are equal when the operands' + respective format codes are interpreted using :mod:`struct` syntax. + + For the subset of :mod:`struct` format strings currently supported by + :meth:`tolist`, ``v`` and ``w`` are equal if ``v.tolist() == w.tolist()``:: + + >>> import array + >>> a = array.array('I', [1, 2, 3, 4, 5]) + >>> b = array.array('d', [1.0, 2.0, 3.0, 4.0, 5.0]) + >>> c = array.array('b', [5, 3, 1]) + >>> x = memoryview(a) + >>> y = memoryview(b) + >>> x == a == y == b + True + >>> x.tolist() == a.tolist() == y.tolist() == b.tolist() + True + >>> z = y[::-2] + >>> z == c + True + >>> z.tolist() == c.tolist() + True + + If either format string is not supported by the :mod:`struct` module, + then the objects will always compare as unequal (even if the format + strings and buffer contents are identical):: + + >>> from ctypes import BigEndianStructure, c_long + >>> class BEPoint(BigEndianStructure): + ... _fields_ = [("x", c_long), ("y", c_long)] + ... + >>> point = BEPoint(100, 200) + >>> a = memoryview(point) + >>> b = memoryview(point) + >>> a == point + False + >>> a == b + False + + Note that, as with floating point numbers, ``v is w`` does *not* imply + ``v == w`` for memoryview objects. + + .. versionchanged:: 3.3 + Previous versions compared the raw memory disregarding the item format + and the logical array structure. + + .. method:: tobytes() + + Return the data in the buffer as a bytestring. This is equivalent to + calling the :class:`bytes` constructor on the memoryview. :: + + >>> m = memoryview(b"abc") + >>> m.tobytes() + b'abc' + >>> bytes(m) + b'abc' + + For non-contiguous arrays the result is equal to the flattened list + representation with all elements converted to bytes. :meth:`tobytes` + supports all format strings, including those that are not in + :mod:`struct` module syntax. + + .. method:: tolist() + + Return the data in the buffer as a list of elements. :: + + >>> memoryview(b'abc').tolist() + [97, 98, 99] + >>> import array + >>> a = array.array('d', [1.1, 2.2, 3.3]) + >>> m = memoryview(a) + >>> m.tolist() + [1.1, 2.2, 3.3] + + .. versionchanged:: 3.3 + :meth:`tolist` now supports all single character native formats in + :mod:`struct` module syntax as well as multi-dimensional + representations. + + .. method:: release() + + Release the underlying buffer exposed by the memoryview object. Many + objects take special actions when a view is held on them (for example, + a :class:`bytearray` would temporarily forbid resizing); therefore, + calling release() is handy to remove these restrictions (and free any + dangling resources) as soon as possible. + + After this method has been called, any further operation on the view + raises a :class:`ValueError` (except :meth:`release()` itself which can + be called multiple times):: + + >>> m = memoryview(b'abc') + >>> m.release() + >>> m[0] + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + ValueError: operation forbidden on released memoryview object + + The context management protocol can be used for a similar effect, + using the ``with`` statement:: + + >>> with memoryview(b'abc') as m: + ... m[0] + ... + 97 + >>> m[0] + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + ValueError: operation forbidden on released memoryview object + + .. versionadded:: 3.2 + + .. method:: cast(format[, shape]) + + Cast a memoryview to a new format or shape. *shape* defaults to + ``[byte_length//new_itemsize]``, which means that the result view + will be one-dimensional. The return value is a new memoryview, but + the buffer itself is not copied. Supported casts are 1D -> C-contiguous + and C-contiguous -> 1D. + + Both formats are restricted to single element native formats in + :mod:`struct` syntax. One of the formats must be a byte format + ('B', 'b' or 'c'). The byte length of the result must be the same + as the original length. + + Cast 1D/long to 1D/unsigned bytes:: + + >>> import array + >>> a = array.array('l', [1,2,3]) + >>> x = memoryview(a) + >>> x.format + 'l' + >>> x.itemsize + 8 + >>> len(x) + 3 + >>> x.nbytes + 24 + >>> y = x.cast('B') + >>> y.format + 'B' + >>> y.itemsize + 1 + >>> len(y) + 24 + >>> y.nbytes + 24 + + Cast 1D/unsigned bytes to 1D/char:: + + >>> b = bytearray(b'zyz') + >>> x = memoryview(b) + >>> x[0] = b'a' + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + ValueError: memoryview: invalid value for format "B" + >>> y = x.cast('c') + >>> y[0] = b'a' + >>> b + bytearray(b'ayz') + + Cast 1D/bytes to 3D/ints to 1D/signed char:: + + >>> import struct + >>> buf = struct.pack("i"*12, *list(range(12))) + >>> x = memoryview(buf) + >>> y = x.cast('i', shape=[2,2,3]) + >>> y.tolist() + [[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]] + >>> y.format + 'i' + >>> y.itemsize + 4 + >>> len(y) + 2 + >>> y.nbytes + 48 + >>> z = y.cast('b') + >>> z.format + 'b' + >>> z.itemsize + 1 + >>> len(z) + 48 + >>> z.nbytes + 48 + + Cast 1D/unsigned char to 2D/unsigned long:: + + >>> buf = struct.pack("L"*6, *list(range(6))) + >>> x = memoryview(buf) + >>> y = x.cast('L', shape=[2,3]) + >>> len(y) + 2 + >>> y.nbytes + 48 + >>> y.tolist() + [[0, 1, 2], [3, 4, 5]] + + .. versionadded:: 3.3 + + There are also several readonly attributes available: + + .. attribute:: obj + + The underlying object of the memoryview:: + + >>> b = bytearray(b'xyz') + >>> m = memoryview(b) + >>> m.obj is b + True + + .. versionadded:: 3.3 + + .. attribute:: nbytes + + ``nbytes == product(shape) * itemsize == len(m.tobytes())``. This is + the amount of space in bytes that the array would use in a contiguous + representation. It is not necessarily equal to len(m):: + + >>> import array + >>> a = array.array('i', [1,2,3,4,5]) + >>> m = memoryview(a) + >>> len(m) + 5 + >>> m.nbytes + 20 + >>> y = m[::2] + >>> len(y) + 3 + >>> y.nbytes + 12 + >>> len(y.tobytes()) + 12 + + Multi-dimensional arrays:: + + >>> import struct + >>> buf = struct.pack("d"*12, *[1.5*x for x in range(12)]) + >>> x = memoryview(buf) + >>> y = x.cast('d', shape=[3,4]) + >>> y.tolist() + [[0.0, 1.5, 3.0, 4.5], [6.0, 7.5, 9.0, 10.5], [12.0, 13.5, 15.0, 16.5]] + >>> len(y) + 3 + >>> y.nbytes + 96 + + .. versionadded:: 3.3 + + .. attribute:: readonly + + A bool indicating whether the memory is read only. + + .. attribute:: format + + A string containing the format (in :mod:`struct` module style) for each + element in the view. A memoryview can be created from exporters with + arbitrary format strings, but some methods (e.g. :meth:`tolist`) are + restricted to native single element formats. + + .. versionchanged:: 3.3 + format ``'B'`` is now handled according to the struct module syntax. + This means that ``memoryview(b'abc')[0] == b'abc'[0] == 97``. + + .. attribute:: itemsize + + The size in bytes of each element of the memoryview:: + + >>> import array, struct + >>> m = memoryview(array.array('H', [32000, 32001, 32002])) + >>> m.itemsize + 2 + >>> m[0] + 32000 + >>> struct.calcsize('H') == m.itemsize + True + + .. attribute:: ndim + + An integer indicating how many dimensions of a multi-dimensional array the + memory represents. + + .. attribute:: shape + + A tuple of integers the length of :attr:`ndim` giving the shape of the + memory as an N-dimensional array. + + .. versionchanged:: 3.3 + An empty tuple instead of None when ndim = 0. + + .. attribute:: strides + + A tuple of integers the length of :attr:`ndim` giving the size in bytes to + access each element for each dimension of the array. + + .. versionchanged:: 3.3 + An empty tuple instead of None when ndim = 0. + + .. attribute:: suboffsets + + Used internally for PIL-style arrays. The value is informational only. + + .. attribute:: c_contiguous + + A bool indicating whether the memory is C-contiguous. + + .. versionadded:: 3.3 + + .. attribute:: f_contiguous + + A bool indicating whether the memory is Fortran contiguous. + + .. versionadded:: 3.3 + + .. attribute:: contiguous + + A bool indicating whether the memory is contiguous. + + .. versionadded:: 3.3 + + .. _types-set: Set Types --- :class:`set`, :class:`frozenset` @@ -1903,7 +2799,7 @@ A :dfn:`set` object is an unordered collection of distinct :term:`hashable` obje Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference. -(For other containers see the built in :class:`dict`, :class:`list`, +(For other containers see the built-in :class:`dict`, :class:`list`, and :class:`tuple` classes, and the :mod:`collections` module.) Like other collections, sets support ``x in set``, ``len(set)``, and ``for x in @@ -1913,11 +2809,11 @@ other sequence-like behavior. There are currently two built-in set types, :class:`set` and :class:`frozenset`. The :class:`set` type is mutable --- the contents can be changed using methods -like :meth:`add` and :meth:`remove`. Since it is mutable, it has no hash value -and cannot be used as either a dictionary key or as an element of another set. -The :class:`frozenset` type is immutable and :term:`hashable` --- its contents cannot be -altered after it is created; it can therefore be used as a dictionary key or as -an element of another set. +like :meth:`~set.add` and :meth:`~set.remove`. Since it is mutable, it has no +hash value and cannot be used as either a dictionary key or as an element of +another set. The :class:`frozenset` type is immutable and :term:`hashable` --- +its contents cannot be altered after it is created; it can therefore be used as +a dictionary key or as an element of another set. Non-empty sets (not frozensets) can be created by placing a comma-separated list of elements within braces, for example: ``{'jack', 'sjoerd'}``, in addition to the @@ -1929,9 +2825,10 @@ The constructors for both classes work the same: frozenset([iterable]) Return a new set or frozenset object whose elements are taken from - *iterable*. The elements of a set must be hashable. To represent sets of - sets, the inner sets must be :class:`frozenset` objects. If *iterable* is - not specified, a new empty set is returned. + *iterable*. The elements of a set must be :term:`hashable`. To + represent sets of sets, the inner sets must be :class:`frozenset` + objects. If *iterable* is not specified, a new empty set is + returned. Instances of :class:`set` and :class:`frozenset` provide the following operations: @@ -1950,7 +2847,7 @@ The constructors for both classes work the same: .. method:: isdisjoint(other) - Return True if the set has no elements in common with *other*. Sets are + Return ``True`` if the set has no elements in common with *other*. Sets are disjoint if and only if their intersection is the empty set. .. method:: issubset(other) @@ -2016,8 +2913,8 @@ The constructors for both classes work the same: based on their members. For example, ``set('abc') == frozenset('abc')`` returns ``True`` and so does ``set('abc') in set([frozenset('abc')])``. - The subset and equality comparisons do not generalize to a complete ordering - function. For example, any two disjoint sets are not equal and are not + The subset and equality comparisons do not generalize to a total ordering + function. For example, any two nonempty disjoint sets are not equal and are not subsets of each other, so *all* of the following return ``False``: ``a<b``, ``a==b``, or ``a>b``. @@ -2103,7 +3000,7 @@ Mapping Types --- :class:`dict` A :term:`mapping` object maps :term:`hashable` values to arbitrary objects. Mappings are mutable objects. There is currently only one standard mapping -type, the :dfn:`dictionary`. (For other containers see the built in +type, the :dfn:`dictionary`. (For other containers see the built-in :class:`list`, :class:`set`, and :class:`tuple` classes, and the :mod:`collections` module.) @@ -2235,13 +3132,13 @@ pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098: .. method:: items() - Return a new view of the dictionary's items (``(key, value)`` pairs). See - below for documentation of view objects. + Return a new view of the dictionary's items (``(key, value)`` pairs). + See the :ref:`documentation of view objects <dict-views>`. .. method:: keys() - Return a new view of the dictionary's keys. See below for documentation of - view objects. + Return a new view of the dictionary's keys. See the :ref:`documentation + of view objects <dict-views>`. .. method:: pop(key[, default]) @@ -2275,8 +3172,12 @@ pairs within braces, for example: ``{'jack': 4098, 'sjoerd': 4127}`` or ``{4098: .. method:: values() - Return a new view of the dictionary's values. See below for documentation of - view objects. + Return a new view of the dictionary's values. See the + :ref:`documentation of view objects <dict-views>`. + +.. seealso:: + :class:`types.MappingProxyType` can be used to create a read-only view + of a :class:`dict`. .. _dict-views: @@ -2322,7 +3223,7 @@ Keys views are set-like since their entries are unique and hashable. If all values are hashable, so that ``(key, value)`` pairs are unique and hashable, then the items view is also set-like. (Values views are not treated as set-like since the entries are generally not unique.) For set-like views, all of the -operations defined for the abstract base class :class:`collections.Set` are +operations defined for the abstract base class :class:`collections.abc.Set` are available (for example, ``==``, ``<``, or ``^``). An example of dictionary view usage:: @@ -2357,159 +3258,6 @@ An example of dictionary view usage:: {'juice', 'sausage', 'bacon', 'spam'} -.. _typememoryview: - -memoryview type -=============== - -:class:`memoryview` objects allow Python code to access the internal data -of an object that supports the :ref:`buffer protocol <bufferobjects>` without -copying. Memory is generally interpreted as simple bytes. - -.. class:: memoryview(obj) - - Create a :class:`memoryview` that references *obj*. *obj* must support the - buffer protocol. Built-in objects that support the buffer protocol include - :class:`bytes` and :class:`bytearray`. - - A :class:`memoryview` has the notion of an *element*, which is the - atomic memory unit handled by the originating object *obj*. For many - simple types such as :class:`bytes` and :class:`bytearray`, an element - is a single byte, but other types such as :class:`array.array` may have - bigger elements. - - ``len(view)`` returns the total number of elements in the memoryview, - *view*. The :class:`~memoryview.itemsize` attribute will give you the - number of bytes in a single element. - - A :class:`memoryview` supports slicing to expose its data. Taking a single - index will return a single element as a :class:`bytes` object. Full - slicing will result in a subview:: - - >>> v = memoryview(b'abcefg') - >>> v[1] - b'b' - >>> v[-1] - b'g' - >>> v[1:4] - <memory at 0x77ab28> - >>> bytes(v[1:4]) - b'bce' - - If the object the memoryview is over supports changing its data, the - memoryview supports slice assignment:: - - >>> data = bytearray(b'abcefg') - >>> v = memoryview(data) - >>> v.readonly - False - >>> v[0] = b'z' - >>> data - bytearray(b'zbcefg') - >>> v[1:4] = b'123' - >>> data - bytearray(b'z123fg') - >>> v[2] = b'spam' - Traceback (most recent call last): - File "<stdin>", line 1, in <module> - ValueError: cannot modify size of memoryview object - - Notice how the size of the memoryview object cannot be changed. - - :class:`memoryview` has several methods: - - .. method:: tobytes() - - Return the data in the buffer as a bytestring. This is equivalent to - calling the :class:`bytes` constructor on the memoryview. :: - - >>> m = memoryview(b"abc") - >>> m.tobytes() - b'abc' - >>> bytes(m) - b'abc' - - .. method:: tolist() - - Return the data in the buffer as a list of integers. :: - - >>> memoryview(b'abc').tolist() - [97, 98, 99] - - .. method:: release() - - Release the underlying buffer exposed by the memoryview object. Many - objects take special actions when a view is held on them (for example, - a :class:`bytearray` would temporarily forbid resizing); therefore, - calling release() is handy to remove these restrictions (and free any - dangling resources) as soon as possible. - - After this method has been called, any further operation on the view - raises a :class:`ValueError` (except :meth:`release()` itself which can - be called multiple times):: - - >>> m = memoryview(b'abc') - >>> m.release() - >>> m[0] - Traceback (most recent call last): - File "<stdin>", line 1, in <module> - ValueError: operation forbidden on released memoryview object - - The context management protocol can be used for a similar effect, - using the ``with`` statement:: - - >>> with memoryview(b'abc') as m: - ... m[0] - ... - b'a' - >>> m[0] - Traceback (most recent call last): - File "<stdin>", line 1, in <module> - ValueError: operation forbidden on released memoryview object - - .. versionadded:: 3.2 - - There are also several readonly attributes available: - - .. attribute:: format - - A string containing the format (in :mod:`struct` module style) for each - element in the view. This defaults to ``'B'``, a simple bytestring. - - .. attribute:: itemsize - - The size in bytes of each element of the memoryview:: - - >>> m = memoryview(array.array('H', [1,2,3])) - >>> m.itemsize - 2 - >>> m[0] - b'\x01\x00' - >>> len(m[0]) == m.itemsize - True - - .. attribute:: shape - - A tuple of integers the length of :attr:`ndim` giving the shape of the - memory as a N-dimensional array. - - .. attribute:: ndim - - An integer indicating how many dimensions of a multi-dimensional array the - memory represents. - - .. attribute:: strides - - A tuple of integers the length of :attr:`ndim` giving the size in bytes to - access each element for each dimension of the array. - - .. attribute:: readonly - - A bool indicating whether the memory is read only. - - .. memoryview.suboffsets isn't documented because it only seems useful for C - - .. _typecontextmanager: Context Manager Types @@ -2606,12 +3354,12 @@ statement is not, strictly speaking, an operation on a module object; ``import foo`` does not require a module object named *foo* to exist, rather it requires an (external) *definition* for a module named *foo* somewhere.) -A special attribute of every module is :attr:`__dict__`. This is the dictionary -containing the module's symbol table. Modifying this dictionary will actually -change the module's symbol table, but direct assignment to the :attr:`__dict__` -attribute is not possible (you can write ``m.__dict__['a'] = 1``, which defines -``m.a`` to be ``1``, but you can't write ``m.__dict__ = {}``). Modifying -:attr:`__dict__` directly is not recommended. +A special attribute of every module is :attr:`~object.__dict__`. This is the +dictionary containing the module's symbol table. Modifying this dictionary will +actually change the module's symbol table, but direct assignment to the +:attr:`__dict__` attribute is not possible (you can write +``m.__dict__['a'] = 1``, which defines ``m.a`` to be ``1``, but you can't write +``m.__dict__ = {}``). Modifying :attr:`__dict__` directly is not recommended. Modules built into the interpreter are written like this: ``<module 'sys' (built-in)>``. If loaded from a file, they are written as ``<module 'os' from @@ -2737,7 +3485,7 @@ The Null Object This object is returned by functions that don't explicitly return a value. It supports no special operations. There is exactly one null object, named -``None`` (a built-in name). +``None`` (a built-in name). ``type(None)()`` produces the same singleton. It is written as ``None``. @@ -2749,7 +3497,8 @@ The Ellipsis Object This object is commonly used by slicing (see :ref:`slicings`). It supports no special operations. There is exactly one ellipsis object, named -:const:`Ellipsis` (a built-in name). +:const:`Ellipsis` (a built-in name). ``type(Ellipsis)()`` produces the +:const:`Ellipsis` singleton. It is written as ``Ellipsis`` or ``...``. @@ -2761,7 +3510,8 @@ The NotImplemented Object This object is returned from comparisons and binary operations when they are asked to operate on types they don't support. See :ref:`comparisons` for more -information. +information. There is exactly one ``NotImplemented`` object. +``type(NotImplemented)()`` produces the singleton instance. It is written as ``NotImplemented``. @@ -2827,6 +3577,13 @@ types, where they are relevant. Some of these are not reported by the The name of the class or type. +.. attribute:: class.__qualname__ + + The :term:`qualified name` of the class or type. + + .. versionadded:: 3.3 + + .. attribute:: class.__mro__ This attribute is a tuple of classes that are considered when looking for @@ -2837,7 +3594,7 @@ types, where they are relevant. Some of these are not reported by the This method can be overridden by a metaclass to customize the method resolution order for its instances. It is called at class instantiation, and - its result is stored in :attr:`__mro__`. + its result is stored in :attr:`~class.__mro__`. .. method:: class.__subclasses__ diff --git a/Doc/library/string.rst b/Doc/library/string.rst index 81a01fd..e5bab68 100644 --- a/Doc/library/string.rst +++ b/Doc/library/string.rst @@ -10,7 +10,7 @@ .. seealso:: - :ref:`typesseq` + :ref:`textseq` :ref:`string-methods` @@ -293,18 +293,18 @@ The general form of a *standard format specifier* is: .. productionlist:: sf format_spec: [[`fill`]`align`][`sign`][#][0][`width`][,][.`precision`][`type`] - fill: <a character other than '{' or '}'> + fill: <any character> align: "<" | ">" | "=" | "^" sign: "+" | "-" | " " width: `integer` precision: `integer` type: "b" | "c" | "d" | "e" | "E" | "f" | "F" | "g" | "G" | "n" | "o" | "s" | "x" | "X" | "%" -The *fill* character can be any character other than '{' or '}'. The presence -of a fill character is signaled by the character following it, which must be -one of the alignment options. If the second character of *format_spec* is not -a valid alignment option, then it is assumed that both the fill character and -the alignment option are absent. +If a valid *align* value is specified, it can be preceded by a *fill* +character that can be any character and defaults to a space if omitted. +Note that it is not possible to use ``{`` and ``}`` as *fill* char while +using the :meth:`str.format` method; this limitation however doesn't +affect the :func:`format` function. The meaning of the various alignment options is as follows: @@ -432,12 +432,13 @@ The available presentation types for floating point and decimal values are: +=========+==========================================================+ | ``'e'`` | Exponent notation. Prints the number in scientific | | | notation using the letter 'e' to indicate the exponent. | + | | The default precision is ``6``. | +---------+----------------------------------------------------------+ | ``'E'`` | Exponent notation. Same as ``'e'`` except it uses an | | | upper case 'E' as the separator character. | +---------+----------------------------------------------------------+ | ``'f'`` | Fixed point. Displays the number as a fixed-point | - | | number. | + | | number. The default precision is ``6``. | +---------+----------------------------------------------------------+ | ``'F'`` | Fixed point. Same as ``'f'``, but converts ``nan`` to | | | ``NAN`` and ``inf`` to ``INF``. | @@ -464,7 +465,7 @@ The available presentation types for floating point and decimal values are: | | the precision. | | | | | | A precision of ``0`` is treated as equivalent to a | - | | precision of ``1``. | + | | precision of ``1``. The default precision is ``6``. | +---------+----------------------------------------------------------+ | ``'G'`` | General format. Same as ``'g'`` except switches to | | | ``'E'`` if the number gets too large. The | diff --git a/Doc/library/stringprep.rst b/Doc/library/stringprep.rst index 47144a6..fc890cb 100644 --- a/Doc/library/stringprep.rst +++ b/Doc/library/stringprep.rst @@ -3,7 +3,6 @@ .. module:: stringprep :synopsis: String preparation, as per RFC 3453 - :deprecated: .. moduleauthor:: Martin v. Löwis <martin@v.loewis.de> .. sectionauthor:: Martin v. Löwis <martin@v.loewis.de> diff --git a/Doc/library/strings.rst b/Doc/library/strings.rst deleted file mode 100644 index 08f1658..0000000 --- a/Doc/library/strings.rst +++ /dev/null @@ -1,27 +0,0 @@ -.. _stringservices: - -*************** -String Services -*************** - -The modules described in this chapter provide a wide range of string -manipulation operations. - -In addition, Python's built-in string classes support the sequence type methods -described in the :ref:`typesseq` section, and also the string-specific methods -described in the :ref:`string-methods` section. To output formatted strings, -see the :ref:`string-formatting` section. Also, see the :mod:`re` module for -string functions based on regular expressions. - - -.. toctree:: - - string.rst - re.rst - struct.rst - difflib.rst - textwrap.rst - codecs.rst - unicodedata.rst - stringprep.rst - diff --git a/Doc/library/struct.rst b/Doc/library/struct.rst index 12820e0..dde32b9 100644 --- a/Doc/library/struct.rst +++ b/Doc/library/struct.rst @@ -187,17 +187,24 @@ platform-dependent. | ``Q`` | :c:type:`unsigned long | integer | 8 | \(2), \(3) | | | long` | | | | +--------+--------------------------+--------------------+----------------+------------+ -| ``f`` | :c:type:`float` | float | 4 | \(4) | +| ``n`` | :c:type:`ssize_t` | integer | | \(4) | +--------+--------------------------+--------------------+----------------+------------+ -| ``d`` | :c:type:`double` | float | 8 | \(4) | +| ``N`` | :c:type:`size_t` | integer | | \(4) | ++--------+--------------------------+--------------------+----------------+------------+ +| ``f`` | :c:type:`float` | float | 4 | \(5) | ++--------+--------------------------+--------------------+----------------+------------+ +| ``d`` | :c:type:`double` | float | 8 | \(5) | +--------+--------------------------+--------------------+----------------+------------+ | ``s`` | :c:type:`char[]` | bytes | | | +--------+--------------------------+--------------------+----------------+------------+ | ``p`` | :c:type:`char[]` | bytes | | | +--------+--------------------------+--------------------+----------------+------------+ -| ``P`` | :c:type:`void \*` | integer | | \(5) | +| ``P`` | :c:type:`void \*` | integer | | \(6) | +--------+--------------------------+--------------------+----------------+------------+ +.. versionchanged:: 3.3 + Added support for the ``'n'`` and ``'N'`` formats. + Notes: (1) @@ -219,11 +226,17 @@ Notes: Use of the :meth:`__index__` method for non-integers is new in 3.2. (4) + The ``'n'`` and ``'N'`` conversion codes are only available for the native + size (selected as the default or with the ``'@'`` byte order character). + For the standard size, you can use whichever of the other integer formats + fits your application. + +(5) For the ``'f'`` and ``'d'`` conversion codes, the packed representation uses the IEEE 754 binary32 (for ``'f'``) or binary64 (for ``'d'``) format, regardless of the floating-point format used by the platform. -(5) +(6) The ``'P'`` format character is only available for the native byte ordering (selected as the default or with the ``'@'`` byte order character). The byte order character ``'='`` chooses to use little- or big-endian ordering based @@ -269,7 +282,7 @@ bytes. For the ``'?'`` format character, the return value is either :const:`True` or :const:`False`. When packing, the truth value of the argument object is used. Either 0 or 1 in the native or standard bool representation will be packed, and -any non-zero value will be True when unpacking. +any non-zero value will be ``True`` when unpacking. diff --git a/Doc/library/subprocess.rst b/Doc/library/subprocess.rst index 59ee13f..a9ee970 100644 --- a/Doc/library/subprocess.rst +++ b/Doc/library/subprocess.rst @@ -9,7 +9,7 @@ The :mod:`subprocess` module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to -replace several other, older modules and functions, such as:: +replace several older modules and functions:: os.system os.spawn* @@ -30,16 +30,21 @@ convenience functions for all use cases they can handle. For more advanced use cases, the underlying :class:`Popen` interface can be used directly. -.. function:: call(args, *, stdin=None, stdout=None, stderr=None, shell=False) +.. function:: call(args, *, stdin=None, stdout=None, stderr=None, shell=False, timeout=None) Run the command described by *args*. Wait for command to complete, then return the :attr:`returncode` attribute. The arguments shown above are merely the most common ones, described below - in :ref:`frequently-used-arguments` (hence the slightly odd notation in - the abbreviated signature). The full function signature is the same as - that of the :class:`Popen` constructor - this functions passes all - supplied arguments directly through to that interface. + in :ref:`frequently-used-arguments` (hence the use of keyword-only notation + in the abbreviated signature). The full function signature is largely the + same as that of the :class:`Popen` constructor - this function passes all + supplied arguments other than *timeout* directly through to that interface. + + The *timeout* argument is passed to :meth:`Popen.wait`. If the timeout + expires, the child process will be killed and then waited for again. The + :exc:`TimeoutExpired` exception will be re-raised after the child process + has terminated. Examples:: @@ -62,19 +67,27 @@ use cases, the underlying :class:`Popen` interface can be used directly. process may block if it generates enough output to a pipe to fill up the OS pipe buffer. + .. versionchanged:: 3.3 + *timeout* was added. + -.. function:: check_call(args, *, stdin=None, stdout=None, stderr=None, shell=False) +.. function:: check_call(args, *, stdin=None, stdout=None, stderr=None, shell=False, timeout=None) Run command with arguments. Wait for command to complete. If the return code was zero then return, otherwise raise :exc:`CalledProcessError`. The :exc:`CalledProcessError` object will have the return code in the - :attr:`returncode` attribute. + :attr:`~CalledProcessError.returncode` attribute. The arguments shown above are merely the most common ones, described below - in :ref:`frequently-used-arguments` (hence the slightly odd notation in - the abbreviated signature). The full function signature is the same as - that of the :class:`Popen` constructor - this functions passes all - supplied arguments directly through to that interface. + in :ref:`frequently-used-arguments` (hence the use of keyword-only notation + in the abbreviated signature). The full function signature is largely the + same as that of the :class:`Popen` constructor - this function passes all + supplied arguments other than *timeout* directly through to that interface. + + The *timeout* argument is passed to :meth:`Popen.wait`. If the timeout + expires, the child process will be killed and then waited for again. The + :exc:`TimeoutExpired` exception will be re-raised after the child process + has terminated. Examples:: @@ -86,8 +99,6 @@ use cases, the underlying :class:`Popen` interface can be used directly. ... subprocess.CalledProcessError: Command 'exit 1' returned non-zero exit status 1 - .. versionadded:: 2.5 - .. warning:: Invoking the system shell with ``shell=True`` can be a security hazard @@ -101,22 +112,31 @@ use cases, the underlying :class:`Popen` interface can be used directly. process may block if it generates enough output to a pipe to fill up the OS pipe buffer. + .. versionchanged:: 3.3 + *timeout* was added. -.. function:: check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False) - Run command with arguments and return its output as a byte string. +.. function:: check_output(args, *, stdin=None, stderr=None, shell=False, universal_newlines=False, timeout=None) + + Run command with arguments and return its output. If the return code was non-zero it raises a :exc:`CalledProcessError`. The :exc:`CalledProcessError` object will have the return code in the - :attr:`returncode` attribute and any output in the :attr:`output` - attribute. + :attr:`~CalledProcessError.returncode` attribute and any output in the + :attr:`~CalledProcessError.output` attribute. The arguments shown above are merely the most common ones, described below - in :ref:`frequently-used-arguments` (hence the slightly odd notation in - the abbreviated signature). The full function signature is largely the - same as that of the :class:`Popen` constructor, except that *stdout* is - not permitted as it is used internally. All other supplied arguments are - passed directly through to the :class:`Popen` constructor. + in :ref:`frequently-used-arguments` (hence the use of keyword-only notation + in the abbreviated signature). The full function signature is largely the + same as that of the :class:`Popen` constructor - this functions passes all + supplied arguments other than *timeout* directly through to that interface. + In addition, *stdout* is not permitted as an argument, as it is used + internally to collect the output from the subprocess. + + The *timeout* argument is passed to :meth:`Popen.wait`. If the timeout + expires, the child process will be killed and then waited for again. The + :exc:`TimeoutExpired` exception will be re-raised after the child process + has terminated. Examples:: @@ -147,7 +167,9 @@ use cases, the underlying :class:`Popen` interface can be used directly. ... shell=True) 'ls: non_existent_file: No such file or directory\n' - .. versionadded:: 2.7 + .. versionadded:: 3.1 + + .. .. warning:: @@ -161,6 +183,18 @@ use cases, the underlying :class:`Popen` interface can be used directly. read in the current process, the child process may block if it generates enough output to the pipe to fill up the OS pipe buffer. + .. versionchanged:: 3.3 + *timeout* was added. + + +.. data:: DEVNULL + + Special value that can be used as the *stdin*, *stdout* or *stderr* argument + to :class:`Popen` and indicates that the special file :data:`os.devnull` + will be used. + + .. versionadded:: 3.3 + .. data:: PIPE @@ -176,10 +210,38 @@ use cases, the underlying :class:`Popen` interface can be used directly. output. +.. exception:: SubprocessError + + Base class for all other exceptions from this module. + + .. versionadded:: 3.3 + + +.. exception:: TimeoutExpired + + Subclass of :exc:`SubprocessError`, raised when a timeout expires + while waiting for a child process. + + .. attribute:: cmd + + Command that was used to spawn the child process. + + .. attribute:: timeout + + Timeout in seconds. + + .. attribute:: output + + Output of the child process if this exception is raised by + :func:`check_output`. Otherwise, ``None``. + + .. versionadded:: 3.3 + + .. exception:: CalledProcessError - Exception raised when a process run by :func:`check_call` or - :func:`check_output` returns a non-zero exit status. + Subclass of :exc:`SubprocessError`, raised when a process run by + :func:`check_call` or :func:`check_output` returns a non-zero exit status. .. attribute:: returncode @@ -216,25 +278,31 @@ default values. The arguments that are most commonly needed are: *stdin*, *stdout* and *stderr* specify the executed program's standard input, standard output and standard error file handles, respectively. Valid values - are :data:`PIPE`, an existing file descriptor (a positive integer), an - existing file object, and ``None``. :data:`PIPE` indicates that a new pipe - to the child should be created. With the default settings of ``None``, no - redirection will occur; the child's file handles will be inherited from the - parent. Additionally, *stderr* can be :data:`STDOUT`, which indicates that - the stderr data from the child process should be captured into the same file - handle as for stdout. + are :data:`PIPE`, :data:`DEVNULL`, an existing file descriptor (a positive + integer), an existing file object, and ``None``. :data:`PIPE` indicates + that a new pipe to the child should be created. :data:`DEVNULL` indicates + that the special file :data:`os.devnull` will be used. With the default + settings of ``None``, no redirection will occur; the child's file handles + will be inherited from the parent. Additionally, *stderr* can be + :data:`STDOUT`, which indicates that the stderr data from the child + process should be captured into the same file handle as for *stdout*. .. index:: single: universal newlines; subprocess module - If *universal_newlines* is ``True``, the file objects *stdin*, *stdout* - and *stderr* will be opened as text streams in :term:`universal newlines` - mode using the encoding returned by :func:`locale.getpreferredencoding`. - For *stdin*, line ending characters ``'\n'`` in the input will be converted - to the default line separator :data:`os.linesep`. For *stdout* and - *stderr*, all line endings in the output will be converted to ``'\n'``. - For more information see the documentation of the :class:`io.TextIOWrapper` - class when the *newline* argument to its constructor is ``None``. + If *universal_newlines* is ``False`` the file objects *stdin*, *stdout* and + *stderr* will be opened as binary streams, and no line ending conversion is + done. + + If *universal_newlines* is ``True``, these file objects + will be opened as text streams in :term:`universal newlines` mode + using the encoding returned by :func:`locale.getpreferredencoding(False) + <locale.getpreferredencoding>`. For *stdin*, line ending characters + ``'\n'`` in the input will be converted to the default line separator + :data:`os.linesep`. For *stdout* and *stderr*, all line endings in the + output will be converted to ``'\n'``. For more information see the + documentation of the :class:`io.TextIOWrapper` class when the *newline* + argument to its constructor is ``None``. .. note:: @@ -252,6 +320,12 @@ default values. The arguments that are most commonly needed are: :mod:`fnmatch`, :func:`os.walk`, :func:`os.path.expandvars`, :func:`os.path.expanduser`, and :mod:`shutil`). + .. versionchanged:: 3.3 + When *universal_newlines* is ``True``, the class uses the encoding + :func:`locale.getpreferredencoding(False) <locale.getpreferredencoding>` + instead of ``locale.getpreferredencoding()``. See the + :class:`io.TextIOWrapper` class for more information on this change. + .. warning:: Executing shell commands that incorporate unsanitized input from an @@ -271,6 +345,10 @@ default values. The arguments that are most commonly needed are: from this vulnerability; see the Note in the :class:`Popen` constructor documentation for helpful hints in getting ``shell=False`` to work. + When using ``shell=True``, :func:`shlex.quote` can be used to properly + escape whitespace and shell metacharacters in strings that are going to + be used to construct shell commands. + These options, along with all of the other options, are described in more detail in the :class:`Popen` constructor documentation. @@ -356,20 +434,19 @@ functions. untrusted input. See the warning under :ref:`frequently-used-arguments` for details. - *bufsize* will be supplied as the corresponding argument to the :meth:`io.open` - function when creating the stdin/stdout/stderr pipe file objects: - :const:`0` means unbuffered (read and write are one system call and can return short), - :const:`1` means line buffered, any other positive value means use a buffer of - approximately that size. A negative bufsize (the default) means - the system default of io.DEFAULT_BUFFER_SIZE will be used. - - .. versionchanged:: 3.2.4 + *bufsize* will be supplied as the corresponding argument to the :func:`open` + function when creating the stdin/stdout/stderr pipe file objects: :const:`0` + means unbuffered (read and write are one system call and can return short), + :const:`1` means line buffered, any other positive value means use a buffer + of approximately that size. A negative bufsize (the default) means the + system default of io.DEFAULT_BUFFER_SIZE will be used. + .. versionchanged:: 3.3.1 *bufsize* now defaults to -1 to enable buffering by default to match the - behavior that most code expects. In 3.2.0 through 3.2.3 it incorrectly - defaulted to :const:`0` which was unbuffered and allowed short reads. - This was unintentional and did not match the behavior of Python 2 as - most code expected. + behavior that most code expects. In versions prior to Python 3.2.4 and + 3.3.1 it incorrectly defaulted to :const:`0` which was unbuffered + and allowed short reads. This was unintentional and did not match the + behavior of Python 2 as most code expected. The *executable* argument specifies a replacement program to execute. It is very seldom needed. When ``shell=False``, *executable* replaces the @@ -383,13 +460,14 @@ functions. *stdin*, *stdout* and *stderr* specify the executed program's standard input, standard output and standard error file handles, respectively. Valid values - are :data:`PIPE`, an existing file descriptor (a positive integer), an - existing :term:`file object`, and ``None``. :data:`PIPE` indicates that a - new pipe to the child should be created. With the default settings of - ``None``, no redirection will occur; the child's file handles will be - inherited from the parent. Additionally, *stderr* can be :data:`STDOUT`, - which indicates that the stderr data from the applications should be - captured into the same file handle as for stdout. + are :data:`PIPE`, :data:`DEVNULL`, an existing file descriptor (a positive + integer), an existing :term:`file object`, and ``None``. :data:`PIPE` + indicates that a new pipe to the child should be created. :data:`DEVNULL` + indicates that the special file :data:`os.devnull` will be used. With the + default settings of ``None``, no redirection will occur; the child's file + handles will be inherited from the parent. Additionally, *stderr* can be + :data:`STDOUT`, which indicates that the stderr data from the applications + should be captured into the same file handle as for stdout. If *preexec_fn* is set to a callable object, this object will be called in the child process just before the child is executed. @@ -434,7 +512,7 @@ functions. *executable* (or for the first item in *args*) relative to *cwd* if the executable path is a relative path. - If *restore_signals* is True (the default) all signals that Python has set to + If *restore_signals* is true (the default) all signals that Python has set to SIG_IGN are restored to SIG_DFL in the child process before the exec. Currently this includes the SIGPIPE, SIGXFZ and SIGXFSZ signals. (Unix only) @@ -442,7 +520,7 @@ functions. .. versionchanged:: 3.2 *restore_signals* was added. - If *start_new_session* is True the setsid() system call will be made in the + If *start_new_session* is true the setsid() system call will be made in the child process prior to the execution of the subprocess. (Unix only) .. versionchanged:: 3.2 @@ -462,7 +540,8 @@ functions. If *universal_newlines* is ``True``, the file objects *stdin*, *stdout* and *stderr* are opened as text streams in universal newlines mode, as - described above in :ref:`frequently-used-arguments`. + described above in :ref:`frequently-used-arguments`, otherwise they are + opened as binary streams. If given, *startupinfo* will be a :class:`STARTUPINFO` object, which is passed to the underlying ``CreateProcess`` function. @@ -499,6 +578,15 @@ arguments. :exc:`CalledProcessError` if the called process returns a non-zero return code. +All of the functions and methods that accept a *timeout* parameter, such as +:func:`call` and :meth:`Popen.communicate` will raise :exc:`TimeoutExpired` if +the timeout expires before the process exits. + +Exceptions defined in this module all inherit from :exc:`SubprocessError`. + + .. versionadded:: 3.3 + The :exc:`SubprocessError` base class was added. + Security ^^^^^^^^ @@ -518,14 +606,18 @@ Instances of the :class:`Popen` class have the following methods: .. method:: Popen.poll() - Check if child process has terminated. Set and return :attr:`returncode` - attribute. + Check if child process has terminated. Set and return + :attr:`~Popen.returncode` attribute. -.. method:: Popen.wait() +.. method:: Popen.wait(timeout=None) - Wait for child process to terminate. Set and return :attr:`returncode` - attribute. + Wait for child process to terminate. Set and return + :attr:`~Popen.returncode` attribute. + + If the process does not terminate after *timeout* seconds, raise a + :exc:`TimeoutExpired` exception. It is safe to catch this exception and + retry the wait. .. warning:: @@ -534,8 +626,11 @@ Instances of the :class:`Popen` class have the following methods: a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use :meth:`communicate` to avoid that. + .. versionchanged:: 3.3 + *timeout* was added. + -.. method:: Popen.communicate(input=None) +.. method:: Popen.communicate(input=None, timeout=None) Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate. The optional @@ -550,11 +645,29 @@ Instances of the :class:`Popen` class have the following methods: ``None`` in the result tuple, you need to give ``stdout=PIPE`` and/or ``stderr=PIPE`` too. + If the process does not terminate after *timeout* seconds, a + :exc:`TimeoutExpired` exception will be raised. Catching this exception and + retrying communication will not lose any output. + + The child process is not killed if the timeout expires, so in order to + cleanup properly a well-behaved application should kill the child process and + finish communication:: + + proc = subprocess.Popen(...) + try: + outs, errs = proc.communicate(timeout=15) + except TimeoutExpired: + proc.kill() + outs, errs = proc.communicate() + .. note:: The data read is buffered in memory, so do not use this method if the data size is large or unlimited. + .. versionchanged:: 3.3 + *timeout* was added. + .. method:: Popen.send_signal(signal) @@ -589,24 +702,38 @@ The following attributes are also available: deadlocks due to any of the other OS pipe buffers filling up and blocking the child process. +.. attribute:: Popen.args + + The *args* argument as it was passed to :class:`Popen` -- a + sequence of program arguments or else a single string. + + .. versionadded:: 3.3 .. attribute:: Popen.stdin - If the *stdin* argument was :data:`PIPE`, this attribute is a :term:`file - object` that provides input to the child process. Otherwise, it is ``None``. + If the *stdin* argument was :data:`PIPE`, this attribute is a writeable + stream object as returned by :func:`open`. If the *universal_newlines* + argument was ``True``, the stream is a text stream, otherwise it is a byte + stream. If the *stdin* argument was not :data:`PIPE`, this attribute is + ``None``. .. attribute:: Popen.stdout - If the *stdout* argument was :data:`PIPE`, this attribute is a :term:`file - object` that provides output from the child process. Otherwise, it is ``None``. + If the *stdout* argument was :data:`PIPE`, this attribute is a readable + stream object as returned by :func:`open`. Reading from the stream provides + output from the child process. If the *universal_newlines* argument was + ``True``, the stream is a text stream, otherwise it is a byte stream. If the + *stdout* argument was not :data:`PIPE`, this attribute is ``None``. .. attribute:: Popen.stderr - If the *stderr* argument was :data:`PIPE`, this attribute is a :term:`file - object` that provides error output from the child process. Otherwise, it is - ``None``. + If the *stderr* argument was :data:`PIPE`, this attribute is a readable + stream object as returned by :func:`open`. Reading from the stream provides + error output from the child process. If the *universal_newlines* argument was + ``True``, the stream is a text stream, otherwise it is a byte stream. If the + *stderr* argument was not :data:`PIPE`, this attribute is ``None``. .. attribute:: Popen.pid @@ -746,8 +873,8 @@ In this section, "a becomes b" means that b can be used as a replacement for a. In addition, the replacements using :func:`check_output` will fail with a :exc:`CalledProcessError` if the requested operation produces a non-zero - return code. The output is still available as the ``output`` attribute of - the raised exception. + return code. The output is still available as the + :attr:`~CalledProcessError.output` attribute of the raised exception. In the following examples, we assume that the relevant functions have already been imported from the :mod:`subprocess` module. @@ -936,10 +1063,12 @@ handling consistency are valid for these functions. Return ``(status, output)`` of executing *cmd* in a shell. - Execute the string *cmd* in a shell with :func:`os.popen` and return a 2-tuple - ``(status, output)``. *cmd* is actually run as ``{ cmd ; } 2>&1``, so that the - returned output will contain output or error messages. A trailing newline is - stripped from the output. The exit status for the command can be interpreted + Execute the string *cmd* in a shell with :class:`Popen` and return a 2-tuple + ``(status, output)`` via :func:`Popen.communicate`. Universal newlines mode + is used; see the notes on :ref:`frequently-used-arguments` for more details. + + A trailing newline is stripped from the output. + The exit status for the command can be interpreted according to the rules for the C function :c:func:`wait`. Example:: >>> subprocess.getstatusoutput('ls /bin/ls') @@ -949,7 +1078,10 @@ handling consistency are valid for these functions. >>> subprocess.getstatusoutput('/bin/junk') (256, 'sh: /bin/junk: not found') - Availability: UNIX. + Availability: Unix & Windows + + .. versionchanged:: 3.3.4 + Windows support added .. function:: getoutput(cmd) @@ -962,7 +1094,10 @@ handling consistency are valid for these functions. >>> subprocess.getoutput('ls /bin/ls') '/bin/ls' - Availability: UNIX. + Availability: Unix & Windows + + .. versionchanged:: 3.3.4 + Windows support added Notes @@ -996,3 +1131,9 @@ runtime): backslash. If the number of backslashes is odd, the last backslash escapes the next double quotation mark as described in rule 3. + + +.. seealso:: + + :mod:`shlex` + Module which provides function to parse and escape command lines. diff --git a/Doc/library/sys.rst b/Doc/library/sys.rst index 1166b64..36c608c 100644 --- a/Doc/library/sys.rst +++ b/Doc/library/sys.rst @@ -29,6 +29,33 @@ always available. command line, see the :mod:`fileinput` module. +.. data:: base_exec_prefix + + Set during Python startup, before ``site.py`` is run, to the same value as + :data:`exec_prefix`. If not running in a + :ref:`virtual environment <venv-def>`, the values will stay the same; if + ``site.py`` finds that a virtual environment is in use, the values of + :data:`prefix` and :data:`exec_prefix` will be changed to point to the + virtual environment, whereas :data:`base_prefix` and + :data:`base_exec_prefix` will remain pointing to the base Python + installation (the one which the virtual environment was created from). + + .. versionadded:: 3.3 + + +.. data:: base_prefix + + Set during Python startup, before ``site.py`` is run, to the same value as + :data:`prefix`. If not running in a :ref:`virtual environment <venv-def>`, the values + will stay the same; if ``site.py`` finds that a virtual environment is in + use, the values of :data:`prefix` and :data:`exec_prefix` will be changed to + point to the virtual environment, whereas :data:`base_prefix` and + :data:`base_exec_prefix` will remain pointing to the base Python + installation (the one which the virtual environment was created from). + + .. versionadded:: 3.3 + + .. data:: byteorder An indicator of the native byte order. This will have the value ``'big'`` on @@ -80,6 +107,22 @@ always available. This function should be used for internal and specialized purposes only. +.. function:: _debugmallocstats() + + Print low-level information to stderr about the state of CPython's memory + allocator. + + If Python is configured --with-pydebug, it also performs some expensive + internal consistency checks. + + .. versionadded:: 3.3 + + .. impl-detail:: + + This function is specific to CPython. The exact output format is not + defined here, and may change. + + .. data:: dllhandle Integer specifying the handle of the Python DLL. Availability: Windows. @@ -172,21 +215,6 @@ always available. a traceback object (see the Reference Manual) which encapsulates the call stack at the point where the exception originally occurred. - .. warning:: - - Assigning the *traceback* return value to a local variable in a function - that is handling an exception will cause a circular reference. Since most - functions don't need access to the traceback, the best solution is to use - something like ``exctype, value = sys.exc_info()[:2]`` to extract only the - exception type and value. If you do need the traceback, make sure to - delete it after use (best done with a :keyword:`try` - ... :keyword:`finally` statement) or to call :func:`exc_info` in a - function that does not itself handle an exception. - - Such cycles are normally automatically reclaimed when garbage collection - is enabled and they become unreachable, but it remains more efficient to - avoid creating cycles. - .. data:: exec_prefix @@ -199,6 +227,13 @@ always available. installed in :file:`{exec_prefix}/lib/python{X.Y}/lib-dynload`, where *X.Y* is the version number of Python, for example ``3.2``. + .. note:: + + If a :ref:`virtual environment <venv-def>` is in effect, this + value will be changed in ``site.py`` to point to the virtual environment. + The value for the Python installation will still be available, via + :data:`base_exec_prefix`. + .. data:: executable @@ -235,14 +270,13 @@ always available. .. data:: flags - The struct sequence *flags* exposes the status of command line flags. The - attributes are read only. + The :term:`struct sequence` *flags* exposes the status of command line + flags. The attributes are read only. ============================= ============================= attribute flag ============================= ============================= :const:`debug` :option:`-d` - :const:`division_warning` :option:`-Q` :const:`inspect` :option:`-i` :const:`interactive` :option:`-i` :const:`optimize` :option:`-O` or :option:`-OO` @@ -262,15 +296,20 @@ always available. .. versionadded:: 3.2.3 The ``hash_randomization`` attribute. + .. versionchanged:: 3.3 + Removed obsolete ``division_warning`` attribute. + .. data:: float_info - A structseq holding information about the float type. It contains low level - information about the precision and internal representation. The values - correspond to the various floating-point constants defined in the standard - header file :file:`float.h` for the 'C' programming language; see section - 5.2.4.2.2 of the 1999 ISO/IEC C standard [C99]_, 'Characteristics of - floating types', for details. + A :term:`struct sequence` holding information about the float type. It + contains low level information about the precision and internal + representation. The values correspond to the various floating-point + constants defined in the standard header file :file:`float.h` for the 'C' + programming language; see section 5.2.4.2.2 of the 1999 ISO/IEC C standard + [C99]_, 'Characteristics of floating types', for details. + + .. tabularcolumns:: |l|l|L| +---------------------+----------------+--------------------------------------------------+ | attribute | float.h macro | explanation | @@ -372,7 +411,7 @@ always available. * On Mac OS X, the encoding is ``'utf-8'``. * On Unix, the encoding is the user's preference according to the result of - nl_langinfo(CODESET), or ``'utf-8'`` if ``nl_langinfo(CODESET)`` failed. + nl_langinfo(CODESET). * On Windows NT+, file names are Unicode natively, so no conversion is performed. :func:`getfilesystemencoding` still returns ``'mbcs'``, as @@ -383,8 +422,7 @@ always available. * On Windows 9x, the encoding is ``'mbcs'``. .. versionchanged:: 3.2 - On Unix, use ``'utf-8'`` instead of ``None`` if ``nl_langinfo(CODESET)`` - failed. :func:`getfilesystemencoding` result cannot be ``None``. + :func:`getfilesystemencoding` result cannot be ``None`` anymore. .. function:: getrefcount(object) @@ -409,6 +447,9 @@ always available. does not have to hold true for third-party extensions as it is implementation specific. + Only the memory consumption directly attributed to the object is + accounted for, not the memory consumption of objects it refers to. + If given, *default* will be returned if the object does not provide means to retrieve the size. Otherwise a :exc:`TypeError` will be raised. @@ -520,8 +561,9 @@ always available. .. data:: hash_info - A structseq giving parameters of the numeric hash implementation. For - more details about hashing of numeric types, see :ref:`numeric-hash`. + A :term:`struct sequence` giving parameters of the numeric hash + implementation. For more details about hashing of numeric types, see + :ref:`numeric-hash`. +---------------------+--------------------------------------------------+ | attribute | explanation | @@ -556,37 +598,58 @@ always available. This is called ``hexversion`` since it only really looks meaningful when viewed as the result of passing it to the built-in :func:`hex` function. The - struct sequence :data:`sys.version_info` may be used for a more human-friendly - encoding of the same information. - - The ``hexversion`` is a 32-bit number with the following layout: - - +-------------------------+------------------------------------------------+ - | Bits (big endian order) | Meaning | - +=========================+================================================+ - | :const:`1-8` | ``PY_MAJOR_VERSION`` (the ``2`` in | - | | ``2.1.0a3``) | - +-------------------------+------------------------------------------------+ - | :const:`9-16` | ``PY_MINOR_VERSION`` (the ``1`` in | - | | ``2.1.0a3``) | - +-------------------------+------------------------------------------------+ - | :const:`17-24` | ``PY_MICRO_VERSION`` (the ``0`` in | - | | ``2.1.0a3``) | - +-------------------------+------------------------------------------------+ - | :const:`25-28` | ``PY_RELEASE_LEVEL`` (``0xA`` for alpha, | - | | ``0xB`` for beta, ``0xC`` for release | - | | candidate and ``0xF`` for final) | - +-------------------------+------------------------------------------------+ - | :const:`29-32` | ``PY_RELEASE_SERIAL`` (the ``3`` in | - | | ``2.1.0a3``, zero for final releases) | - +-------------------------+------------------------------------------------+ - - Thus ``2.1.0a3`` is hexversion ``0x020100a3``. + :term:`struct sequence` :data:`sys.version_info` may be used for a more + human-friendly encoding of the same information. + + More details of ``hexversion`` can be found at :ref:`apiabiversion` + + +.. data:: implementation + + An object containing information about the implementation of the + currently running Python interpreter. The following attributes are + required to exist in all Python implementations. + + *name* is the implementation's identifier, e.g. ``'cpython'``. The actual + string is defined by the Python implementation, but it is guaranteed to be + lower case. + + *version* is a named tuple, in the same format as + :data:`sys.version_info`. It represents the version of the Python + *implementation*. This has a distinct meaning from the specific + version of the Python *language* to which the currently running + interpreter conforms, which ``sys.version_info`` represents. For + example, for PyPy 1.8 ``sys.implementation.version`` might be + ``sys.version_info(1, 8, 0, 'final', 0)``, whereas ``sys.version_info`` + would be ``sys.version_info(2, 7, 2, 'final', 0)``. For CPython they + are the same value, since it is the reference implementation. + + *hexversion* is the implementation version in hexadecimal format, like + :data:`sys.hexversion`. + + *cache_tag* is the tag used by the import machinery in the filenames of + cached modules. By convention, it would be a composite of the + implementation's name and version, like ``'cpython-33'``. However, a + Python implementation may use some other value if appropriate. If + ``cache_tag`` is set to ``None``, it indicates that module caching should + be disabled. + + :data:`sys.implementation` may contain additional attributes specific to + the Python implementation. These non-standard attributes must start with + an underscore, and are not described here. Regardless of its contents, + :data:`sys.implementation` will not change during a run of the interpreter, + nor between implementation versions. (It may change between Python + language versions, however.) See `PEP 421` for more information. + + .. versionadded:: 3.3 + .. data:: int_info - A struct sequence that holds information about Python's - internal representation of integers. The attributes are read only. + A :term:`struct sequence` that holds information about Python's internal + representation of integers. The attributes are read only. + + .. tabularcolumns:: |l|L| +-------------------------+----------------------------------------------+ | Attribute | Explanation | @@ -641,9 +704,13 @@ always available. .. data:: maxunicode - An integer giving the largest supported code point for a Unicode character. The - value of this depends on the configuration option that specifies whether Unicode - characters are stored as UCS-2 or UCS-4. + An integer giving the value of the largest Unicode code point, + i.e. ``1114111`` (``0x10FFFF`` in hexadecimal). + + .. versionchanged:: 3.3 + Before :pep:`393`, ``sys.maxunicode`` used to be either ``0xFFFF`` + or ``0x10FFFF``, depending on the configuration option that specified + whether Unicode characters were stored as UCS-2 or UCS-4. .. data:: meta_path @@ -666,6 +733,8 @@ always available. This is a dictionary that maps module names to modules which have already been loaded. This can be manipulated to force reloading of modules and other tricks. + However, replacing the dictionary will not necessarily work as expected and + deleting essential items from the dictionary may cause Python to fail. .. data:: path @@ -684,7 +753,9 @@ always available. current directory first. Notice that the script directory is inserted *before* the entries inserted as a result of :envvar:`PYTHONPATH`. - A program is free to modify this list for its own purposes. + A program is free to modify this list for its own purposes. Only strings + and bytes should be added to :data:`sys.path`; all other data types are + ignored during import. .. seealso:: @@ -706,48 +777,50 @@ always available. A dictionary acting as a cache for :term:`finder` objects. The keys are paths that have been passed to :data:`sys.path_hooks` and the values are the finders that are found. If a path is a valid file system path but no - explicit finder is found on :data:`sys.path_hooks` then ``None`` is - stored to represent the implicit default finder should be used. If the path - is not an existing path then :class:`imp.NullImporter` is set. + finder is found on :data:`sys.path_hooks` then ``None`` is + stored. Originally specified in :pep:`302`. + .. versionchanged:: 3.3 + ``None`` is stored instead of :class:`imp.NullImporter` when no finder + is found. + .. data:: platform This string contains a platform identifier that can be used to append platform-specific components to :data:`sys.path`, for instance. - For most Unix systems, this is the lowercased OS name as returned by ``uname - -s`` with the first part of the version as returned by ``uname -r`` appended, - e.g. ``'sunos5'``, *at the time when Python was built*. Unless you want to - test for a specific system version, it is therefore recommended to use the - following idiom:: + For Unix systems, except on Linux, this is the lowercased OS name as + returned by ``uname -s`` with the first part of the version as returned by + ``uname -r`` appended, e.g. ``'sunos5'`` or ``'freebsd8'``, *at the time + when Python was built*. Unless you want to test for a specific system + version, it is therefore recommended to use the following idiom:: if sys.platform.startswith('freebsd'): # FreeBSD-specific code here... elif sys.platform.startswith('linux'): # Linux-specific code here... - .. versionchanged:: 3.2.2 - Since lots of code check for ``sys.platform == 'linux2'``, and there is - no essential change between Linux 2.x and 3.x, ``sys.platform`` is always - set to ``'linux2'``, even on Linux 3.x. In Python 3.3 and later, the - value will always be set to ``'linux'``, so it is recommended to always - use the ``startswith`` idiom presented above. - For other systems, the values are: - ====================== =========================== - System ``platform`` value - ====================== =========================== - Linux (2.x *and* 3.x) ``'linux2'`` - Windows ``'win32'`` - Windows/Cygwin ``'cygwin'`` - Mac OS X ``'darwin'`` - OS/2 ``'os2'`` - OS/2 EMX ``'os2emx'`` - ====================== =========================== + ================ =========================== + System ``platform`` value + ================ =========================== + Linux ``'linux'`` + Windows ``'win32'`` + Windows/Cygwin ``'cygwin'`` + Mac OS X ``'darwin'`` + OS/2 ``'os2'`` + OS/2 EMX ``'os2emx'`` + ================ =========================== + + .. versionchanged:: 3.3 + On Linux, :attr:`sys.platform` doesn't contain the major version anymore. + It is always ``'linux'``, instead of ``'linux2'`` or ``'linux3'``. Since + older Python versions include the version number, it is recommended to + always use the ``startswith`` idiom presented above. .. seealso:: @@ -769,6 +842,11 @@ always available. stored in :file:`{prefix}/include/python{X.Y}`, where *X.Y* is the version number of Python, for example ``3.2``. + .. note:: If a :ref:`virtual environment <venv-def>` is in effect, this + value will be changed in ``site.py`` to point to the virtual + environment. The value for the Python installation will still be + available, via :data:`base_prefix`. + .. data:: ps1 ps2 @@ -806,11 +884,11 @@ always available. the interpreter loads extension modules. Among other things, this will enable a lazy resolving of symbols when importing a module, if called as ``sys.setdlopenflags(0)``. To share symbols across extension modules, call as - ``sys.setdlopenflags(ctypes.RTLD_GLOBAL)``. Symbolic names for the - flag modules can be either found in the :mod:`ctypes` module, or in the :mod:`DLFCN` - module. If :mod:`DLFCN` is not available, it can be generated from - :file:`/usr/include/dlfcn.h` using the :program:`h2py` script. Availability: - Unix. + ``sys.setdlopenflags(os.RTLD_GLOBAL)``. Symbolic names for the flag modules + can be found in the :mod:`os` module (``RTLD_xxx`` constants, e.g. + :data:`os.RTLD_LAZY`). + + Availability: Unix. .. function:: setprofile(profilefunc) @@ -955,7 +1033,7 @@ always available. :func:`open` function. Their parameters are chosen as follows: * The character encoding is platform-dependent. Under Windows, if the stream - is interactive (that is, if its :meth:`isatty` method returns True), the + is interactive (that is, if its :meth:`isatty` method returns ``True``), the console codepage is used, otherwise the ANSI code page. Under other platforms, the locale encoding is used (see :meth:`locale.getpreferredencoding`). @@ -1003,22 +1081,35 @@ always available. to a console and Python apps started with :program:`pythonw`. -.. data:: subversion +.. data:: thread_info - A triple (repo, branch, version) representing the Subversion information of the - Python interpreter. *repo* is the name of the repository, ``'CPython'``. - *branch* is a string of one of the forms ``'trunk'``, ``'branches/name'`` or - ``'tags/name'``. *version* is the output of ``svnversion``, if the interpreter - was built from a Subversion checkout; it contains the revision number (range) - and possibly a trailing 'M' if there were local modifications. If the tree was - exported (or svnversion was not available), it is the revision of - ``Include/patchlevel.h`` if the branch is a tag. Otherwise, it is ``None``. + A :term:`struct sequence` holding information about the thread + implementation. - .. deprecated:: 3.2.1 - Python is now `developed <http://docs.python.org/devguide/>`_ using - Mercurial. In recent Python 3.2 bugfix releases, :data:`subversion` - therefore contains placeholder information. It is removed in Python - 3.3. + .. tabularcolumns:: |l|p{0.7\linewidth}| + + +------------------+---------------------------------------------------------+ + | Attribute | Explanation | + +==================+=========================================================+ + | :const:`name` | Name of the thread implementation: | + | | | + | | * ``'nt'``: Windows threads | + | | * ``'os2'``: OS/2 threads | + | | * ``'pthread'``: POSIX threads | + | | * ``'solaris'``: Solaris threads | + +------------------+---------------------------------------------------------+ + | :const:`lock` | Name of the lock implementation: | + | | | + | | * ``'semaphore'``: a lock uses a semaphore | + | | * ``'mutex+cond'``: a lock uses a mutex | + | | and a condition variable | + | | * ``None`` if this information is unknown | + +------------------+---------------------------------------------------------+ + | :const:`version` | Name and version of the thread library. It is a string, | + | | or ``None`` if these informations are unknown. | + +------------------+---------------------------------------------------------+ + + .. versionadded:: 3.3 .. data:: tracebacklimit @@ -1098,5 +1189,5 @@ always available. .. rubric:: Citations -.. [C99] ISO/IEC 9899:1999. "Programming languages -- C." A public draft of this standard is available at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf . +.. [C99] ISO/IEC 9899:1999. "Programming languages -- C." A public draft of this standard is available at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf\ . diff --git a/Doc/library/syslog.rst b/Doc/library/syslog.rst index 974ecf9..6e90dc0 100644 --- a/Doc/library/syslog.rst +++ b/Doc/library/syslog.rst @@ -79,12 +79,14 @@ Priority levels (high to low): Facilities: :const:`LOG_KERN`, :const:`LOG_USER`, :const:`LOG_MAIL`, :const:`LOG_DAEMON`, :const:`LOG_AUTH`, :const:`LOG_LPR`, :const:`LOG_NEWS`, :const:`LOG_UUCP`, - :const:`LOG_CRON`, :const:`LOG_SYSLOG` and :const:`LOG_LOCAL0` to - :const:`LOG_LOCAL7`. + :const:`LOG_CRON`, :const:`LOG_SYSLOG`, :const:`LOG_LOCAL0` to + :const:`LOG_LOCAL7`, and, if defined in ``<syslog.h>``, + :const:`LOG_AUTHPRIV`. Log options: - :const:`LOG_PID`, :const:`LOG_CONS`, :const:`LOG_NDELAY`, :const:`LOG_NOWAIT` - and :const:`LOG_PERROR` if defined in ``<syslog.h>``. + :const:`LOG_PID`, :const:`LOG_CONS`, :const:`LOG_NDELAY`, and, if defined + in ``<syslog.h>``, :const:`LOG_ODELAY`, :const:`LOG_NOWAIT`, and + :const:`LOG_PERROR`. Examples diff --git a/Doc/library/tarfile.rst b/Doc/library/tarfile.rst index 46e4900..1ead71c 100644 --- a/Doc/library/tarfile.rst +++ b/Doc/library/tarfile.rst @@ -13,13 +13,13 @@ -------------- The :mod:`tarfile` module makes it possible to read and write tar -archives, including those using gzip or bz2 compression. +archives, including those using gzip, bz2 and lzma compression. Use the :mod:`zipfile` module to read or write :file:`.zip` files, or the higher-level functions in :ref:`shutil <archiving-operations>`. Some facts and figures: -* reads and writes :mod:`gzip` and :mod:`bz2` compressed archives. +* reads and writes :mod:`gzip`, :mod:`bz2` and :mod:`lzma` compressed archives. * read/write support for the POSIX.1-1988 (ustar) format. @@ -33,6 +33,9 @@ Some facts and figures: character devices and block devices and is able to acquire and restore file information like timestamp, access permissions and owner. +.. versionchanged:: 3.3 + Added support for :mod:`lzma` compression. + .. function:: open(name=None, mode='r', fileobj=None, bufsize=10240, \*\*kwargs) @@ -56,6 +59,8 @@ Some facts and figures: +------------------+---------------------------------------------+ | ``'r:bz2'`` | Open for reading with bzip2 compression. | +------------------+---------------------------------------------+ + | ``'r:xz'`` | Open for reading with lzma compression. | + +------------------+---------------------------------------------+ | ``'a' or 'a:'`` | Open for appending with no compression. The | | | file is created if it does not exist. | +------------------+---------------------------------------------+ @@ -65,11 +70,13 @@ Some facts and figures: +------------------+---------------------------------------------+ | ``'w:bz2'`` | Open for bzip2 compressed writing. | +------------------+---------------------------------------------+ + | ``'w:xz'`` | Open for lzma compressed writing. | + +------------------+---------------------------------------------+ - Note that ``'a:gz'`` or ``'a:bz2'`` is not possible. If *mode* is not suitable - to open a certain (compressed) file for reading, :exc:`ReadError` is raised. Use - *mode* ``'r'`` to avoid this. If a compression method is not supported, - :exc:`CompressionError` is raised. + Note that ``'a:gz'``, ``'a:bz2'`` or ``'a:xz'`` is not possible. If *mode* + is not suitable to open a certain (compressed) file for reading, + :exc:`ReadError` is raised. Use *mode* ``'r'`` to avoid this. If a + compression method is not supported, :exc:`CompressionError` is raised. If *fileobj* is specified, it is used as an alternative to a :term:`file object` opened in binary mode for *name*. It is supposed to be at position 0. @@ -100,6 +107,9 @@ Some facts and figures: | ``'r|bz2'`` | Open a bzip2 compressed *stream* for | | | reading. | +-------------+--------------------------------------------+ + | ``'r|xz'`` | Open a lzma compressed *stream* for | + | | reading. | + +-------------+--------------------------------------------+ | ``'w|'`` | Open an uncompressed *stream* for writing. | +-------------+--------------------------------------------+ | ``'w|gz'`` | Open a gzip compressed *stream* for | @@ -108,6 +118,9 @@ Some facts and figures: | ``'w|bz2'`` | Open a bzip2 compressed *stream* for | | | writing. | +-------------+--------------------------------------------+ + | ``'w|xz'`` | Open an lzma compressed *stream* for | + | | writing. | + +-------------+--------------------------------------------+ .. class:: TarFile @@ -263,9 +276,9 @@ be finalized; only the internally used file object will be closed. See the If *errorlevel* is ``0``, all errors are ignored when using :meth:`TarFile.extract`. Nevertheless, they appear as error messages in the debug output, when debugging - is enabled. If ``1``, all *fatal* errors are raised as :exc:`OSError` or - :exc:`IOError` exceptions. If ``2``, all *non-fatal* errors are raised as - :exc:`TarError` exceptions as well. + is enabled. If ``1``, all *fatal* errors are raised as :exc:`OSError` + exceptions. If ``2``, all *non-fatal* errors are raised as :exc:`TarError` + exceptions as well. The *encoding* and *errors* arguments define the character encoding to be used for reading or writing the archive and how conversion errors are going @@ -346,7 +359,7 @@ be finalized; only the internally used file object will be closed. See the full name. Its file information is extracted as accurately as possible. *member* may be a filename or a :class:`TarInfo` object. You can specify a different directory using *path*. File attributes (owner, mtime, mode) are set unless - *set_attrs* is False. + *set_attrs* is false. .. note:: @@ -363,15 +376,12 @@ be finalized; only the internally used file object will be closed. See the .. method:: TarFile.extractfile(member) Extract a member from the archive as a file object. *member* may be a filename - or a :class:`TarInfo` object. If *member* is a regular file, a :term:`file-like - object` is returned. If *member* is a link, a file-like object is constructed from - the link's target. If *member* is none of the above, :const:`None` is returned. - - .. note:: + or a :class:`TarInfo` object. If *member* is a regular file or a link, an + :class:`io.BufferedReader` object is returned. Otherwise, :const:`None` is + returned. - The file-like object is read-only. It provides the methods - :meth:`read`, :meth:`readline`, :meth:`readlines`, :meth:`seek`, :meth:`tell`, - and :meth:`close`, and also supports iteration over its lines. + .. versionchanged:: 3.3 + Return an :class:`io.BufferedReader` object. .. method:: TarFile.add(name, arcname=None, recursive=True, exclude=None, *, filter=None) @@ -659,11 +669,11 @@ There are three tar formats that can be created with the :mod:`tarfile` module: * The POSIX.1-1988 ustar format (:const:`USTAR_FORMAT`). It supports filenames up to a length of at best 256 characters and linknames up to 100 characters. The - maximum file size is 8 gigabytes. This is an old and limited but widely + maximum file size is 8 GiB. This is an old and limited but widely supported format. * The GNU tar format (:const:`GNU_FORMAT`). It supports long filenames and - linknames, files bigger than 8 gigabytes and sparse files. It is the de facto + linknames, files bigger than 8 GiB and sparse files. It is the de facto standard on GNU/Linux systems. :mod:`tarfile` fully supports the GNU tar extensions for long names, sparse file support is read-only. diff --git a/Doc/library/telnetlib.rst b/Doc/library/telnetlib.rst index 646634d..31e5dbb 100644 --- a/Doc/library/telnetlib.rst +++ b/Doc/library/telnetlib.rst @@ -162,9 +162,13 @@ Telnet Objects .. method:: Telnet.write(buffer) Write a byte string to the socket, doubling any IAC characters. This can - block if the connection is blocked. May raise :exc:`socket.error` if the + block if the connection is blocked. May raise :exc:`OSError` if the connection is closed. + .. versionchanged:: 3.3 + This method used to raise :exc:`socket.error`, which is now an alias + of :exc:`OSError`. + .. method:: Telnet.interact() @@ -181,7 +185,7 @@ Telnet Objects Read until one from a list of a regular expressions matches. The first argument is a list of regular expressions, either compiled - (:class:`re.RegexObject` instances) or uncompiled (byte strings). The + (:ref:`regex objects <re-objects>`) or uncompiled (byte strings). The optional second argument is a timeout, in seconds; the default is to block indefinitely. @@ -201,7 +205,7 @@ Telnet Objects .. method:: Telnet.set_option_negotiation_callback(callback) Each time a telnet option is read on the input flow, this *callback* (if set) is - called with the following parameters : callback(telnet socket, command + called with the following parameters: callback(telnet socket, command (DO/DONT/WILL/WONT), option). No other action is done afterwards by telnetlib. diff --git a/Doc/library/tempfile.rst b/Doc/library/tempfile.rst index bc1f23a..d78159d 100644 --- a/Doc/library/tempfile.rst +++ b/Doc/library/tempfile.rst @@ -25,7 +25,7 @@ instead a string of six random characters is used. Also, all the user-callable functions now take additional arguments which allow direct control over the location and name of temporary files. It is -no longer necessary to use the global *tempdir* and *template* variables. +no longer necessary to use the global *tempdir* variable. To maintain backward compatibility, the argument order is somewhat odd; it is recommended to use keyword arguments for clarity. @@ -76,31 +76,35 @@ The module defines the following user-callable items: data is spooled in memory until the file size exceeds *max_size*, or until the file's :func:`fileno` method is called, at which point the contents are written to disk and operation proceeds as with - :func:`TemporaryFile`. Also, it's ``truncate`` method does not - accept a ``size`` argument. + :func:`TemporaryFile`. The resulting file has one additional method, :func:`rollover`, which causes the file to roll over to an on-disk file regardless of its size. The returned object is a file-like object whose :attr:`_file` attribute - is either a :class:`BytesIO` or :class:`StringIO` object (depending on + is either a :class:`io.BytesIO` or :class:`io.StringIO` object (depending on whether binary or text *mode* was specified) or a true file object, depending on whether :func:`rollover` has been called. This file-like object can be used in a :keyword:`with` statement, just like a normal file. + .. versionchanged:: 3.3 + the truncate method now accepts a ``size`` argument. + .. function:: TemporaryDirectory(suffix='', prefix='tmp', dir=None) This function creates a temporary directory using :func:`mkdtemp` (the supplied arguments are passed directly to the underlying function). The resulting object can be used as a context manager (see - :ref:`context-managers`). On completion of the context (or destruction - of the temporary directory object), the newly created temporary directory + :ref:`context-managers`). On completion of the context or destruction + of the temporary directory object the newly created temporary directory and all its contents are removed from the filesystem. - The directory name can be retrieved from the :attr:`name` attribute - of the returned object. + The directory name can be retrieved from the :attr:`name` attribute of the + returned object. When the returned object is used as a context manager, the + :attr:`name` will be assigned to the target of the :keyword:`as` clause in + the :keyword:`with` statement, if there is one. The directory can be explicitly cleaned up by calling the :func:`cleanup` method. diff --git a/Doc/library/test.rst b/Doc/library/test.rst index 24e06d5..c28536d 100644 --- a/Doc/library/test.rst +++ b/Doc/library/test.rst @@ -80,17 +80,12 @@ A basic boilerplate is often used:: ... more test classes ... - def test_main(): - support.run_unittest(MyTestCase1, - MyTestCase2, - ... list other tests ... - ) - if __name__ == '__main__': - test_main() + unittest.main() -This boilerplate code allows the testing suite to be run by :mod:`test.regrtest` -as well as on its own as a script. +This code pattern allows the testing suite to be run by :mod:`test.regrtest`, +on its own as a script that supports the :mod:`unittest` CLI, or via the +`python -m unittest` CLI. The goal for regression testing is to try to break code. This leads to a few guidelines to be followed: @@ -129,22 +124,27 @@ guidelines to be followed: as what type of input is used. Minimize code duplication by subclassing a basic test class with a class that specifies the input:: - class TestFuncAcceptsSequences(unittest.TestCase): + class TestFuncAcceptsSequencesMixin: func = mySuperWhammyFunction def test_func(self): self.func(self.arg) - class AcceptLists(TestFuncAcceptsSequences): + class AcceptLists(TestFuncAcceptsSequencesMixin, unittest.TestCase): arg = [1, 2, 3] - class AcceptStrings(TestFuncAcceptsSequences): + class AcceptStrings(TestFuncAcceptsSequencesMixin, unittest.TestCase): arg = 'abc' - class AcceptTuples(TestFuncAcceptsSequences): + class AcceptTuples(TestFuncAcceptsSequencesMixin, unittest.TestCase): arg = (1, 2, 3) + When using this pattern, remember that all classes that inherit from + `unittest.TestCase` are run as tests. The `Mixin` class in the example above + does not have any data and so can't be run by itself, thus it does not + inherit from `unittest.TestCase`. + .. seealso:: @@ -160,14 +160,15 @@ Running tests using the command-line interface The :mod:`test` package can be run as a script to drive Python's regression test suite, thanks to the :option:`-m` option: :program:`python -m test`. Under the hood, it uses :mod:`test.regrtest`; the call :program:`python -m -test.regrtest` used in previous Python versions still works). -Running the script by itself automatically starts running all regression -tests in the :mod:`test` package. It does this by finding all modules in the -package whose name starts with ``test_``, importing them, and executing the -function :func:`test_main` if present. The names of tests to execute may also -be passed to the script. Specifying a single regression test (:program:`python --m test test_spam`) will minimize output and only print -whether the test passed or failed and thus minimize output. +test.regrtest` used in previous Python versions still works). Running the +script by itself automatically starts running all regression tests in the +:mod:`test` package. It does this by finding all modules in the package whose +name starts with ``test_``, importing them, and executing the function +:func:`test_main` if present or loading the tests via +unittest.TestLoader.loadTestsFromModule if ``test_main`` does not exist. The +names of tests to execute may also be passed to the script. Specifying a single +regression test (:program:`python -m test test_spam`) will minimize output and +only print whether the test passed or failed. Running :mod:`test` directly allows what resources are available for tests to use to be set. You do this by using the ``-u`` command-line @@ -198,6 +199,7 @@ The :mod:`test.support` module provides support for Python's regression test suite. .. note:: + :mod:`test.support` is not a public module. It is documented here to help Python developers write tests. The API of this module is subject to change without backwards compatibility concerns between releases. @@ -223,14 +225,14 @@ The :mod:`test.support` module defines the following constants: .. data:: verbose - :const:`True` when verbose output is enabled. Should be checked when more + ``True`` when verbose output is enabled. Should be checked when more detailed information is desired about a running test. *verbose* is set by :mod:`test.regrtest`. .. data:: is_jython - :const:`True` if the running interpreter is Jython. + ``True`` if the running interpreter is Jython. .. data:: TESTFN @@ -249,7 +251,7 @@ The :mod:`test.support` module defines the following functions: .. function:: is_resource_enabled(resource) - Return :const:`True` if *resource* is enabled and available. The list of + Return ``True`` if *resource* is enabled and available. The list of available resources is only set when :mod:`test.regrtest` is executing the tests. @@ -258,16 +260,19 @@ The :mod:`test.support` module defines the following functions: Raise :exc:`ResourceDenied` if *resource* is not available. *msg* is the argument to :exc:`ResourceDenied` if it is raised. Always returns - :const:`True` if called by a function whose ``__name__`` is ``'__main__'``. + ``True`` if called by a function whose ``__name__`` is ``'__main__'``. Used when tests are executed by :mod:`test.regrtest`. -.. function:: findfile(filename) +.. function:: findfile(filename, subdir=None) Return the path to the file named *filename*. If no match is found *filename* is returned. This does not equal a failure since it could be the path to the file. + Setting *subdir* indicates a relative path to use to find the file + rather than looking directly in the path directories. + .. function:: run_unittest(\*classes) @@ -286,6 +291,15 @@ The :mod:`test.support` module defines the following functions: This will run all tests defined in the named module. +.. function:: run_doctest(module, verbosity=None) + + Run :func:`doctest.testmod` on the given *module*. Return + ``(failure_count, test_count)``. + + If *verbosity* is ``None``, :func:`doctest.testmod` is run with verbosity + set to :data:`verbose`. Otherwise, it is run with verbosity set to + ``None``. + .. function:: check_warnings(\*filters, quiet=True) A convenience wrapper for :func:`warnings.catch_warnings()` that makes it @@ -296,12 +310,12 @@ The :mod:`test.support` module defines the following functions: ``check_warnings`` accepts 2-tuples of the form ``("message regexp", WarningCategory)`` as positional arguments. If one or more *filters* are - provided, or if the optional keyword argument *quiet* is :const:`False`, + provided, or if the optional keyword argument *quiet* is ``False``, it checks to make sure the warnings are as expected: each specified filter must match at least one of the warnings raised by the enclosed code or the test fails, and if any warnings are raised that do not match any of the specified filters the test fails. To disable the first of these checks, - set *quiet* to :const:`True`. + set *quiet* to ``True``. If no arguments are specified, it defaults to:: @@ -316,7 +330,7 @@ The :mod:`test.support` module defines the following functions: representing the most recent warning can also be accessed directly through the recorder object (see example below). If no warning has been raised, then any of the attributes that would otherwise be expected on an object - representing a warning will return :const:`None`. + representing a warning will return ``None``. The recorder object also has a :meth:`reset` method, which clears the warnings list. @@ -352,17 +366,81 @@ The :mod:`test.support` module defines the following functions: New optional arguments *filters* and *quiet*. -.. function:: captured_stdout() +.. function:: captured_stdin() + captured_stdout() + captured_stderr() - This is a context manager that runs the :keyword:`with` statement body using - a :class:`StringIO.StringIO` object as sys.stdout. That object can be - retrieved using the ``as`` clause of the :keyword:`with` statement. + A context managers that temporarily replaces the named stream with + :class:`io.StringIO` object. - Example use:: + Example use with output streams:: - with captured_stdout() as s: + with captured_stdout() as stdout, captured_stderr() as stderr: print("hello") - assert s.getvalue() == "hello\n" + print("error", file=sys.stderr) + assert stdout.getvalue() == "hello\n" + assert stderr.getvalue() == "error\n" + + Example use with input stream:: + + with captured_stdin() as stdin: + stdin.write('hello\n') + stdin.seek(0) + # call test code that consumes from sys.stdin + captured = input() + self.assertEqual(captured, "hello") + + +.. function:: temp_dir(path=None, quiet=False) + + A context manager that creates a temporary directory at *path* and + yields the directory. + + If *path* is None, the temporary directory is created using + :func:`tempfile.mkdtemp`. If *quiet* is ``False``, the context manager + raises an exception on error. Otherwise, if *path* is specified and + cannot be created, only a warning is issued. + + +.. function:: change_cwd(path, quiet=False) + + A context manager that temporarily changes the current working + directory to *path* and yields the directory. + + If *quiet* is ``False``, the context manager raises an exception + on error. Otherwise, it issues only a warning and keeps the current + working directory the same. + + +.. function:: temp_cwd(name='tempcwd', quiet=False) + + A context manager that temporarily creates a new directory and + changes the current working directory (CWD). + + The context manager creates a temporary directory in the current + directory with name *name* before temporarily changing the current + working directory. If *name* is None, the temporary directory is + created using :func:`tempfile.mkdtemp`. + + If *quiet* is ``False`` and it is not possible to create or change + the CWD, an error is raised. Otherwise, only a warning is raised + and the original CWD is used. + + +.. function:: temp_umask(umask) + + A context manager that temporarily sets the process umask. + + +.. function:: can_symlink() + + Return ``True`` if the OS supports symbolic links, ``False`` + otherwise. + + +.. decorator:: skip_unless_symlink() + + A decorator for running tests that require support for symbolic links. .. function:: suppress_crash_popup() @@ -372,6 +450,27 @@ The :mod:`test.support` module defines the following functions: On other platforms it's a no-op. +.. decorator:: anticipate_failure(condition) + + A decorator to conditionally mark tests with + :func:`unittest.expectedFailure`. Any use of this decorator should + have an associated comment identifying the relevant tracker issue. + + +.. decorator:: run_with_locale(catstr, *locales) + + A decorator for running a function in a different locale, correctly + resetting it after it has finished. *catstr* is the locale category as + a string (for example ``"LC_ALL"``). The *locales* passed will be tried + sequentially, and the first valid locale will be used. + + +.. function:: make_bad_fd() + + Create an invalid file descriptor by opening and closing a temporary file, + and returning its descripor. + + .. function:: import_module(name, deprecated=False) This function imports and returns the named module. Unlike a normal @@ -379,7 +478,7 @@ The :mod:`test.support` module defines the following functions: cannot be imported. Module and package deprecation messages are suppressed during this import - if *deprecated* is :const:`True`. + if *deprecated* is ``True``. .. versionadded:: 3.1 @@ -394,7 +493,7 @@ The :mod:`test.support` module defines the following functions: *fresh* is an iterable of additional module names that are also removed from the ``sys.modules`` cache before doing the import. - *blocked* is an iterable of module names that are replaced with :const:`0` + *blocked* is an iterable of module names that are replaced with ``None`` in the module cache during the import to ensure that attempts to import them raise :exc:`ImportError`. @@ -403,23 +502,65 @@ The :mod:`test.support` module defines the following functions: ``sys.modules`` when the fresh import is complete. Module and package deprecation messages are suppressed during this import - if *deprecated* is :const:`True`. + if *deprecated* is ``True``. - This function will raise :exc:`unittest.SkipTest` is the named module - cannot be imported. + This function will raise :exc:`ImportError` if the named module cannot be + imported. Example use:: - # Get copies of the warnings module for testing without - # affecting the version being used by the rest of the test suite - # One copy uses the C implementation, the other is forced to use - # the pure Python fallback implementation + # Get copies of the warnings module for testing without affecting the + # version being used by the rest of the test suite. One copy uses the + # C implementation, the other is forced to use the pure Python fallback + # implementation py_warnings = import_fresh_module('warnings', blocked=['_warnings']) c_warnings = import_fresh_module('warnings', fresh=['_warnings']) .. versionadded:: 3.1 +.. function:: bind_port(sock, host=HOST) + + Bind the socket to a free port and return the port number. Relies on + ephemeral ports in order to ensure we are using an unbound port. This is + important as many tests may be running simultaneously, especially in a + buildbot environment. This method raises an exception if the + ``sock.family`` is :const:`~socket.AF_INET` and ``sock.type`` is + :const:`~socket.SOCK_STREAM`, and the socket has + :const:`~socket.SO_REUSEADDR` or :const:`~socket.SO_REUSEPORT` set on it. + Tests should never set these socket options for TCP/IP sockets. + The only case for setting these options is testing multicasting via + multiple UDP sockets. + + Additionally, if the :const:`~socket.SO_EXCLUSIVEADDRUSE` socket option is + available (i.e. on Windows), it will be set on the socket. This will + prevent anyone else from binding to our host/port for the duration of the + test. + + +.. function:: find_unused_port(family=socket.AF_INET, socktype=socket.SOCK_STREAM) + + Returns an unused port that should be suitable for binding. This is + achieved by creating a temporary socket with the same family and type as + the ``sock`` parameter (default is :const:`~socket.AF_INET`, + :const:`~socket.SOCK_STREAM`), + and binding it to the specified host address (defaults to ``0.0.0.0``) + with the port set to 0, eliciting an unused ephemeral port from the OS. + The temporary socket is then closed and deleted, and the ephemeral port is + returned. + + Either this method or :func:`bind_port` should be used for any tests + where a server socket needs to be bound to a particular port for the + duration of the test. + Which one to use depends on whether the calling code is creating a python + socket, or if an unused port needs to be provided in a constructor + or passed to an external program (i.e. the ``-accept`` argument to + openssl's s_server mode). Always prefer :func:`bind_port` over + :func:`find_unused_port` where possible. Using a hard coded port is + discouraged since it can makes multiple instances of the test impossible to + run simultaneously, which is a problem for buildbots. + + The :mod:`test.support` module defines the following classes: .. class:: TransientResource(exc, **kwargs) diff --git a/Doc/library/text.rst b/Doc/library/text.rst new file mode 100644 index 0000000..47b6784 --- /dev/null +++ b/Doc/library/text.rst @@ -0,0 +1,26 @@ +.. _stringservices: +.. _textservices: + +************************ +Text Processing Services +************************ + +The modules described in this chapter provide a wide range of string +manipulation operations and other text processing services. + +The :mod:`codecs` module described under :ref:`binaryservices` is also +highly relevant to text processing. In addition, see the documentation for +Python's built-in string type in :ref:`textseq`. + + +.. toctree:: + + string.rst + re.rst + difflib.rst + textwrap.rst + unicodedata.rst + stringprep.rst + readline.rst + rlcompleter.rst + diff --git a/Doc/library/textwrap.rst b/Doc/library/textwrap.rst index 3c1ecf6..c625254 100644 --- a/Doc/library/textwrap.rst +++ b/Doc/library/textwrap.rst @@ -12,7 +12,7 @@ The :mod:`textwrap` module provides two convenience functions, :func:`wrap` and :func:`fill`, as well as :class:`TextWrapper`, the class that does all the work, -and a utility function :func:`dedent`. If you're just wrapping or filling one +and two utility functions, :func:`dedent` and :func:`indent`. If you're just wrapping or filling one or two text strings, the convenience functions should be good enough; otherwise, you should use an instance of :class:`TextWrapper` for efficiency. @@ -48,9 +48,10 @@ Text is preferably wrapped on whitespaces and right after the hyphens in hyphenated words; only then will long words be broken if necessary, unless :attr:`TextWrapper.break_long_words` is set to false. -An additional utility function, :func:`dedent`, is provided to remove -indentation from strings that have unwanted whitespace to the left of the text. - +Two additional utility function, :func:`dedent` and :func:`indent`, are +provided to remove indentation from strings that have unwanted whitespace +to the left of the text and to add an arbitrary prefix to selected lines +in a block of text. .. function:: dedent(text) @@ -75,6 +76,32 @@ indentation from strings that have unwanted whitespace to the left of the text. print(repr(dedent(s))) # prints 'hello\n world\n' +.. function:: indent(text, prefix, predicate=None) + + Add *prefix* to the beginning of selected lines in *text*. + + Lines are separated by calling ``text.splitlines(True)``. + + By default, *prefix* is added to all lines that do not consist + solely of whitespace (including any line endings). + + For example:: + + >>> s = 'hello\n\n \nworld' + >>> indent(s, ' ') + ' hello\n\n \n world' + + The optional *predicate* argument can be used to control which lines + are indented. For example, it is easy to add *prefix* to even empty + and whitespace-only lines:: + + >>> print(indent(s, '+ ', lambda line: True)) + + hello + + + + + + world + + .. class:: TextWrapper(**kwargs) The :class:`TextWrapper` constructor accepts a number of optional keyword @@ -110,6 +137,15 @@ indentation from strings that have unwanted whitespace to the left of the text. expanded to spaces using the :meth:`expandtabs` method of *text*. + .. attribute:: tabsize + + (default: ``8``) If :attr:`expand_tabs` is true, then all tab characters + in *text* will be expanded to zero or more spaces, depending on the + current column and the given tab size. + + .. versionadded:: 3.3 + + .. attribute:: replace_whitespace (default: ``True``) If true, after tab expansion but before wrapping, diff --git a/Doc/library/threading.rst b/Doc/library/threading.rst index e30f0e3..7c326d1 100644 --- a/Doc/library/threading.rst +++ b/Doc/library/threading.rst @@ -20,19 +20,8 @@ The :mod:`dummy_threading` module is provided for situations where methods and functions in this module in the Python 2.x series are still supported by this module. -.. impl-detail:: - - In CPython, due to the :term:`Global Interpreter Lock`, only one thread - can execute Python code at once (even though certain performance-oriented - libraries might overcome this limitation). - If you want your application to make better use of the computational - resources of multi-core machines, you are advised to use - :mod:`multiprocessing` or :class:`concurrent.futures.ProcessPoolExecutor`. - However, threading is still an appropriate model if you want to run - multiple I/O-bound tasks simultaneously. - -This module defines the following functions and objects: +This module defines the following functions: .. function:: active_count() @@ -41,16 +30,6 @@ This module defines the following functions and objects: count is equal to the length of the list returned by :func:`.enumerate`. -.. function:: Condition() - :noindex: - - A factory function that returns a new condition variable object. A condition - variable allows one or more threads to wait until they are notified by another - thread. - - See :ref:`condition-objects`. - - .. function:: current_thread() Return the current :class:`Thread` object, corresponding to the caller's thread @@ -59,6 +38,17 @@ This module defines the following functions and objects: returned. +.. function:: get_ident() + + Return the 'thread identifier' of the current thread. This is a nonzero + integer. Its value has no direct meaning; it is intended as a magic cookie + to be used e.g. to index a dictionary of thread-specific data. Thread + identifiers may be recycled when a thread exits and another thread is + created. + + .. versionadded:: 3.3 + + .. function:: enumerate() Return a list of all :class:`Thread` objects currently alive. The list @@ -67,96 +57,13 @@ This module defines the following functions and objects: and threads that have not yet been started. -.. function:: Event() - :noindex: - - A factory function that returns a new event object. An event manages a flag - that can be set to true with the :meth:`~Event.set` method and reset to false - with the :meth:`clear` method. The :meth:`wait` method blocks until the flag - is true. - - See :ref:`event-objects`. - - -.. class:: local - - A class that represents thread-local data. Thread-local data are data whose - values are thread specific. To manage thread-local data, just create an - instance of :class:`local` (or a subclass) and store attributes on it:: - - mydata = threading.local() - mydata.x = 1 - - The instance's values will be different for separate threads. - - For more details and extensive examples, see the documentation string of the - :mod:`_threading_local` module. - - -.. function:: Lock() - - A factory function that returns a new primitive lock object. Once a thread has - acquired it, subsequent attempts to acquire it block, until it is released; any - thread may release it. - - See :ref:`lock-objects`. - - -.. function:: RLock() - - A factory function that returns a new reentrant lock object. A reentrant lock - must be released by the thread that acquired it. Once a thread has acquired a - reentrant lock, the same thread may acquire it again without blocking; the - thread must release it once for each time it has acquired it. - - See :ref:`rlock-objects`. - - -.. function:: Semaphore(value=1) - :noindex: - - A factory function that returns a new semaphore object. A semaphore manages a - counter representing the number of :meth:`release` calls minus the number of - :meth:`acquire` calls, plus an initial value. The :meth:`acquire` method blocks - if necessary until it can return without making the counter negative. If not - given, *value* defaults to 1. - - See :ref:`semaphore-objects`. - - -.. function:: BoundedSemaphore(value=1) - - A factory function that returns a new bounded semaphore object. A bounded - semaphore checks to make sure its current value doesn't exceed its initial - value. If it does, :exc:`ValueError` is raised. In most situations semaphores - are used to guard resources with limited capacity. If the semaphore is released - too many times it's a sign of a bug. If not given, *value* defaults to 1. - - -.. class:: Thread - :noindex: - - A class that represents a thread of control. This class can be safely - subclassed in a limited fashion. - - See :ref:`thread-objects`. - - -.. class:: Timer - :noindex: - - A thread that executes a function after a specified interval has passed. - - See :ref:`timer-objects`. - - .. function:: settrace(func) .. index:: single: trace function Set a trace function for all threads started from the :mod:`threading` module. The *func* will be passed to :func:`sys.settrace` for each thread, before its - :meth:`run` method is called. + :meth:`~Thread.run` method is called. .. function:: setprofile(func) @@ -165,7 +72,7 @@ This module defines the following functions and objects: Set a profile function for all threads started from the :mod:`threading` module. The *func* will be passed to :func:`sys.setprofile` for each thread, before its - :meth:`run` method is called. + :meth:`~Thread.run` method is called. .. function:: stack_size([size]) @@ -173,15 +80,15 @@ This module defines the following functions and objects: Return the thread stack size used when creating new threads. The optional *size* argument specifies the stack size to be used for subsequently created threads, and must be 0 (use platform or configured default) or a positive - integer value of at least 32,768 (32kB). If changing the thread stack size is - unsupported, a :exc:`ThreadError` is raised. If the specified stack size is - invalid, a :exc:`ValueError` is raised and the stack size is unmodified. 32kB + integer value of at least 32,768 (32 KiB). If changing the thread stack size is + unsupported, a :exc:`RuntimeError` is raised. If the specified stack size is + invalid, a :exc:`ValueError` is raised and the stack size is unmodified. 32 KiB is currently the minimum supported stack size value to guarantee sufficient stack space for the interpreter itself. Note that some platforms may have particular restrictions on values for the stack size, such as requiring a - minimum stack size > 32kB or requiring allocation in multiples of the system + minimum stack size > 32 KiB or requiring allocation in multiples of the system memory page size - platform documentation should be referred to for more - information (4kB pages are common; using multiples of 4096 for the stack size is + information (4 KiB pages are common; using multiples of 4096 for the stack size is the suggested approach in the absence of more specific information). Availability: Windows, systems with POSIX threads. @@ -198,7 +105,8 @@ This module also defines the following constant: .. versionadded:: 3.2 -Detailed interfaces for the objects are documented below. +This module defines a number of classes, which are detailed in the sections +below. The design of this module is loosely based on Java's threading model. However, where Java makes locks and condition variables basic behavior of every object, @@ -211,17 +119,38 @@ when implemented, are mapped to module-level functions. All of the methods described below are executed atomically. +Thread-Local Data +----------------- + +Thread-local data is data whose values are thread specific. To manage +thread-local data, just create an instance of :class:`local` (or a +subclass) and store attributes on it:: + + mydata = threading.local() + mydata.x = 1 + +The instance's values will be different for separate threads. + + +.. class:: local() + + A class that represents thread-local data. + + For more details and extensive examples, see the documentation string of the + :mod:`_threading_local` module. + + .. _thread-objects: Thread Objects -------------- -This class represents an activity that is run in a separate thread of control. -There are two ways to specify the activity: by passing a callable object to the -constructor, or by overriding the :meth:`~Thread.run` method in a subclass. -No other methods (except for the constructor) should be overridden in a -subclass. In other words, *only* override the :meth:`~Thread.__init__` -and :meth:`~Thread.run` methods of this class. +The :class:`Thread` class represents an activity that is run in a separate +thread of control. There are two ways to specify the activity: by passing a +callable object to the constructor, or by overriding the :meth:`~Thread.run` +method in a subclass. No other methods (except for the constructor) should be +overridden in a subclass. In other words, *only* override the +:meth:`~Thread.__init__` and :meth:`~Thread.run` methods of this class. Once a thread object is created, its activity must be started by calling the thread's :meth:`~Thread.start` method. This invokes the :meth:`~Thread.run` @@ -239,10 +168,11 @@ called is terminated. A thread has a name. The name can be passed to the constructor, and read or changed through the :attr:`~Thread.name` attribute. -A thread can be flagged as a "daemon thread". The significance of this flag -is that the entire Python program exits when only daemon threads are left. -The initial value is inherited from the creating thread. The flag can be -set through the :attr:`~Thread.daemon` property. +A thread can be flagged as a "daemon thread". The significance of this flag is +that the entire Python program exits when only daemon threads are left. The +initial value is inherited from the creating thread. The flag can be set +through the :attr:`~Thread.daemon` property or the *daemon* constructor +argument. .. note:: Daemon threads are abruptly stopped at shutdown. Their resources (such @@ -261,7 +191,8 @@ daemonic, and cannot be :meth:`~Thread.join`\ ed. They are never deleted, since it is impossible to detect the termination of alien threads. -.. class:: Thread(group=None, target=None, name=None, args=(), kwargs={}) +.. class:: Thread(group=None, target=None, name=None, args=(), kwargs={}, *, \ + daemon=None) This constructor should always be called with keyword arguments. Arguments are: @@ -280,10 +211,17 @@ since it is impossible to detect the termination of alien threads. *kwargs* is a dictionary of keyword arguments for the target invocation. Defaults to ``{}``. + If not ``None``, *daemon* explicitly sets whether the thread is daemonic. + If ``None`` (the default), the daemonic property is inherited from the + current thread. + If the subclass overrides the constructor, it must make sure to invoke the base class constructor (``Thread.__init__()``) before doing anything else to the thread. + .. versionchanged:: 3.3 + Added the *daemon* argument. + .. method:: start() Start the thread's activity. @@ -374,6 +312,18 @@ since it is impossible to detect the termination of alien threads. property instead. +.. impl-detail:: + + In CPython, due to the :term:`Global Interpreter Lock`, only one thread + can execute Python code at once (even though certain performance-oriented + libraries might overcome this limitation). + If you want your application to make better use of the computational + resources of multi-core machines, you are advised to use + :mod:`multiprocessing` or :class:`concurrent.futures.ProcessPoolExecutor`. + However, threading is still an appropriate model if you want to run + multiple I/O-bound tasks simultaneously. + + .. _lock-objects: Lock Objects @@ -405,45 +355,55 @@ is not defined, and may vary across implementations. All methods are executed atomically. -.. method:: Lock.acquire(blocking=True, timeout=-1) +.. class:: Lock() + + The class implementing primitive lock objects. Once a thread has acquired a + lock, subsequent attempts to acquire it block, until it is released; any + thread may release it. + + .. versionchanged:: 3.3 + Changed from a factory function to a class. + - Acquire a lock, blocking or non-blocking. + .. method:: acquire(blocking=True, timeout=-1) - When invoked with the *blocking* argument set to ``True`` (the default), - block until the lock is unlocked, then set it to locked and return ``True``. + Acquire a lock, blocking or non-blocking. - When invoked with the *blocking* argument set to ``False``, do not block. - If a call with *blocking* set to ``True`` would block, return ``False`` - immediately; otherwise, set the lock to locked and return ``True``. + When invoked with the *blocking* argument set to ``True`` (the default), + block until the lock is unlocked, then set it to locked and return ``True``. - When invoked with the floating-point *timeout* argument set to a positive - value, block for at most the number of seconds specified by *timeout* - and as long as the lock cannot be acquired. A negative *timeout* argument - specifies an unbounded wait. It is forbidden to specify a *timeout* - when *blocking* is false. + When invoked with the *blocking* argument set to ``False``, do not block. + If a call with *blocking* set to ``True`` would block, return ``False`` + immediately; otherwise, set the lock to locked and return ``True``. - The return value is ``True`` if the lock is acquired successfully, - ``False`` if not (for example if the *timeout* expired). + When invoked with the floating-point *timeout* argument set to a positive + value, block for at most the number of seconds specified by *timeout* + and as long as the lock cannot be acquired. A *timeout* argument of ``-1`` + specifies an unbounded wait. It is forbidden to specify a *timeout* + when *blocking* is false. - .. versionchanged:: 3.2 - The *timeout* parameter is new. + The return value is ``True`` if the lock is acquired successfully, + ``False`` if not (for example if the *timeout* expired). - .. versionchanged:: 3.2 - Lock acquires can now be interrupted by signals on POSIX. + .. versionchanged:: 3.2 + The *timeout* parameter is new. + + .. versionchanged:: 3.2 + Lock acquires can now be interrupted by signals on POSIX. -.. method:: Lock.release() + .. method:: release() - Release a lock. This can be called from any thread, not only the thread - which has acquired the lock. + Release a lock. This can be called from any thread, not only the thread + which has acquired the lock. - When the lock is locked, reset it to unlocked, and return. If any other threads - are blocked waiting for the lock to become unlocked, allow exactly one of them - to proceed. + When the lock is locked, reset it to unlocked, and return. If any other threads + are blocked waiting for the lock to become unlocked, allow exactly one of them + to proceed. - When invoked on an unlocked lock, a :exc:`ThreadError` is raised. + When invoked on an unlocked lock, a :exc:`RuntimeError` is raised. - There is no return value. + There is no return value. .. _rlock-objects: @@ -467,47 +427,59 @@ allows another thread blocked in :meth:`~Lock.acquire` to proceed. Reentrant locks also support the :ref:`context manager protocol <with-locks>`. -.. method:: RLock.acquire(blocking=True, timeout=-1) +.. class:: RLock() - Acquire a lock, blocking or non-blocking. + This class implements reentrant lock objects. A reentrant lock must be + released by the thread that acquired it. Once a thread has acquired a + reentrant lock, the same thread may acquire it again without blocking; the + thread must release it once for each time it has acquired it. - When invoked without arguments: if this thread already owns the lock, increment - the recursion level by one, and return immediately. Otherwise, if another - thread owns the lock, block until the lock is unlocked. Once the lock is - unlocked (not owned by any thread), then grab ownership, set the recursion level - to one, and return. If more than one thread is blocked waiting until the lock - is unlocked, only one at a time will be able to grab ownership of the lock. - There is no return value in this case. + Note that ``RLock`` is actually a factory function which returns an instance + of the most efficient version of the concrete RLock class that is supported + by the platform. - When invoked with the *blocking* argument set to true, do the same thing as when - called without arguments, and return true. - When invoked with the *blocking* argument set to false, do not block. If a call - without an argument would block, return false immediately; otherwise, do the - same thing as when called without arguments, and return true. + .. method:: acquire(blocking=True, timeout=-1) - When invoked with the floating-point *timeout* argument set to a positive - value, block for at most the number of seconds specified by *timeout* - and as long as the lock cannot be acquired. Return true if the lock has - been acquired, false if the timeout has elapsed. + Acquire a lock, blocking or non-blocking. - .. versionchanged:: 3.2 - The *timeout* parameter is new. + When invoked without arguments: if this thread already owns the lock, increment + the recursion level by one, and return immediately. Otherwise, if another + thread owns the lock, block until the lock is unlocked. Once the lock is + unlocked (not owned by any thread), then grab ownership, set the recursion level + to one, and return. If more than one thread is blocked waiting until the lock + is unlocked, only one at a time will be able to grab ownership of the lock. + There is no return value in this case. + When invoked with the *blocking* argument set to true, do the same thing as when + called without arguments, and return true. -.. method:: RLock.release() + When invoked with the *blocking* argument set to false, do not block. If a call + without an argument would block, return false immediately; otherwise, do the + same thing as when called without arguments, and return true. - Release a lock, decrementing the recursion level. If after the decrement it is - zero, reset the lock to unlocked (not owned by any thread), and if any other - threads are blocked waiting for the lock to become unlocked, allow exactly one - of them to proceed. If after the decrement the recursion level is still - nonzero, the lock remains locked and owned by the calling thread. + When invoked with the floating-point *timeout* argument set to a positive + value, block for at most the number of seconds specified by *timeout* + and as long as the lock cannot be acquired. Return true if the lock has + been acquired, false if the timeout has elapsed. - Only call this method when the calling thread owns the lock. A - :exc:`RuntimeError` is raised if this method is called when the lock is - unlocked. + .. versionchanged:: 3.2 + The *timeout* parameter is new. + + + .. method:: release() - There is no return value. + Release a lock, decrementing the recursion level. If after the decrement it is + zero, reset the lock to unlocked (not owned by any thread), and if any other + threads are blocked waiting for the lock to become unlocked, allow exactly one + of them to proceed. If after the decrement the recursion level is still + nonzero, the lock remains locked and owned by the calling thread. + + Only call this method when the calling thread owns the lock. A + :exc:`RuntimeError` is raised if this method is called when the lock is + unlocked. + + There is no return value. .. _condition-objects: @@ -542,10 +514,6 @@ not return from their :meth:`~Condition.wait` call immediately, but only when the thread that called :meth:`~Condition.notify` or :meth:`~Condition.notify_all` finally relinquishes ownership of the lock. - -Usage -^^^^^ - The typical programming style using condition variables uses the lock to synchronize access to some shared state; threads that are interested in a particular change of state call :meth:`~Condition.wait` repeatedly until they @@ -584,15 +552,18 @@ waiting threads. E.g. in a typical producer-consumer situation, adding one item to the buffer only needs to wake up one consumer thread. -Interface -^^^^^^^^^ - .. class:: Condition(lock=None) + This class implements condition variable objects. A condition variable + allows one or more threads to wait until they are notified by another thread. + If the *lock* argument is given and not ``None``, it must be a :class:`Lock` or :class:`RLock` object, and it is used as the underlying lock. Otherwise, a new :class:`RLock` object is created and used as the underlying lock. + .. versionchanged:: 3.3 + changed from a factory function to a class. + .. method:: acquire(*args) Acquire the underlying lock. This method calls the corresponding method on @@ -702,10 +673,19 @@ Semaphores also support the :ref:`context manager protocol <with-locks>`. .. class:: Semaphore(value=1) + This class implements semaphore objects. A semaphore manages a counter + representing the number of :meth:`release` calls minus the number of + :meth:`acquire` calls, plus an initial value. The :meth:`acquire` method + blocks if necessary until it can return without making the counter negative. + If not given, *value* defaults to 1. + The optional argument gives the initial *value* for the internal counter; it defaults to ``1``. If the *value* given is less than 0, :exc:`ValueError` is raised. + .. versionchanged:: 3.3 + changed from a factory function to a class. + .. method:: acquire(blocking=True, timeout=None) Acquire a semaphore. @@ -738,6 +718,18 @@ Semaphores also support the :ref:`context manager protocol <with-locks>`. than zero again, wake up that thread. +.. class:: BoundedSemaphore(value=1) + + Class implementing bounded semaphore objects. A bounded semaphore checks to + make sure its current value doesn't exceed its initial value. If it does, + :exc:`ValueError` is raised. In most situations semaphores are used to guard + resources with limited capacity. If the semaphore is released too many times + it's a sign of a bug. If not given, *value* defaults to 1. + + .. versionchanged:: 3.3 + changed from a factory function to a class. + + .. _semaphore-examples: :class:`Semaphore` Example @@ -749,7 +741,7 @@ you should use a bounded semaphore. Before spawning any worker threads, your main thread would initialize the semaphore:: maxconnections = 5 - ... + # ... pool_sema = BoundedSemaphore(value=maxconnections) Once spawned, worker threads call the semaphore's acquire and release methods @@ -758,7 +750,7 @@ when they need to connect to the server:: with pool_sema: conn = connectdb() try: - ... use connection ... + # ... use connection ... finally: conn.close() @@ -781,7 +773,13 @@ method. The :meth:`~Event.wait` method blocks until the flag is true. .. class:: Event() - The internal flag is initially false. + Class implementing event objects. An event manages a flag that can be set to + true with the :meth:`~Event.set` method and reset to false with the + :meth:`clear` method. The :meth:`wait` method blocks until the flag is true. + The flag is initially false. + + .. versionchanged:: 3.3 + changed from a factory function to a class. .. method:: is_set() @@ -827,10 +825,11 @@ This class represents an action that should be run only after a certain amount of time has passed --- a timer. :class:`Timer` is a subclass of :class:`Thread` and as such also functions as an example of creating custom threads. -Timers are started, as with threads, by calling their :meth:`start` method. The -timer can be stopped (before its action has begun) by calling the :meth:`cancel` -method. The interval the timer will wait before executing its action may not be -exactly the same as the interval specified by the user. +Timers are started, as with threads, by calling their :meth:`~Timer.start` +method. The timer can be stopped (before its action has begun) by calling the +:meth:`~Timer.cancel` method. The interval the timer will wait before +executing its action may not be exactly the same as the interval specified by +the user. For example:: @@ -841,10 +840,15 @@ For example:: t.start() # after 30 seconds, "hello, world" will be printed -.. class:: Timer(interval, function, args=[], kwargs={}) +.. class:: Timer(interval, function, args=None, kwargs=None) Create a timer that will run *function* with arguments *args* and keyword arguments *kwargs*, after *interval* seconds have passed. + If *args* is None (the default) then an empty list will be used. + If *kwargs* is None (the default) then an empty dict will be used. + + .. versionchanged:: 3.3 + changed from a factory function to a class. .. method:: cancel() @@ -979,27 +983,3 @@ is equivalent to:: Currently, :class:`Lock`, :class:`RLock`, :class:`Condition`, :class:`Semaphore`, and :class:`BoundedSemaphore` objects may be used as :keyword:`with` statement context managers. - - -.. _threaded-imports: - -Importing in threaded code --------------------------- - -While the import machinery is thread-safe, there are two key restrictions on -threaded imports due to inherent limitations in the way that thread-safety is -provided: - -* Firstly, other than in the main module, an import should not have the - side effect of spawning a new thread and then waiting for that thread in - any way. Failing to abide by this restriction can lead to a deadlock if - the spawned thread directly or indirectly attempts to import a module. -* Secondly, all import attempts must be completed before the interpreter - starts shutting itself down. This can be most easily achieved by only - performing imports from non-daemon threads created through the threading - module. Daemon threads and threads created directly with the thread - module will require some other form of synchronization to ensure they do - not attempt imports after system shutdown has commenced. Failure to - abide by this restriction will lead to intermittent exceptions and - crashes during interpreter shutdown (as the late imports attempt to - access machinery which is no longer in a valid state). diff --git a/Doc/library/time.rst b/Doc/library/time.rst index e0c7007..64b5e04 100644 --- a/Doc/library/time.rst +++ b/Doc/library/time.rst @@ -41,25 +41,6 @@ An explanation of some terminology and conventions is in order. parsed, they are converted according to the POSIX and ISO C standards: values 69--99 are mapped to 1969--1999, and values 0--68 are mapped to 2000--2068. - For backward compatibility, years with less than 4 digits are treated - specially by :func:`asctime`, :func:`mktime`, and :func:`strftime` functions - that operate on a 9-tuple or :class:`struct_time` values. If year (the first - value in the 9-tuple) is specified with less than 4 digits, its interpretation - depends on the value of ``accept2dyear`` variable. - - If ``accept2dyear`` is true (default), a backward compatibility behavior is - invoked as follows: - - - for 2-digit year, century is guessed according to POSIX rules for - ``%y`` strptime format. A deprecation warning is issued when century - information is guessed in this way. - - - for 3-digit or negative year, a :exc:`ValueError` exception is raised. - - If ``accept2dyear`` is false (set by the program or as a result of a - non-empty value assigned to ``PYTHONY2K`` environment variable) all year - values are interpreted as given. - .. index:: single: UTC single: Coordinated Universal Time @@ -96,6 +77,11 @@ An explanation of some terminology and conventions is in order. See :class:`struct_time` for a description of these objects. + .. versionchanged:: 3.3 + The :class:`struct_time` type was extended to provide the :attr:`tm_gmtoff` + and :attr:`tm_zone` attributes when platform supports corresponding + ``struct tm`` members. + * Use the following functions to convert between time representations: +-------------------------+-------------------------+-------------------------+ @@ -117,24 +103,6 @@ An explanation of some terminology and conventions is in order. The module defines the following functions and data items: - -.. data:: accept2dyear - - Boolean value indicating whether two-digit year values will be - mapped to 1969--2068 range by :func:`asctime`, :func:`mktime`, and - :func:`strftime` functions. This is true by default, but will be - set to false if the environment variable :envvar:`PYTHONY2K` has - been set to a non-empty string. It may also be modified at run - time. - - .. deprecated:: 3.2 - Mapping of 2-digit year values by :func:`asctime`, - :func:`mktime`, and :func:`strftime` functions to 1969--2068 - range is deprecated. Programs that need to process 2-digit - years should use ``%y`` code available in :func:`strptime` - function or convert 2-digit year values to 4-digit themselves. - - .. data:: altzone The offset of the local DST timezone, in seconds west of UTC, if one is defined. @@ -152,7 +120,8 @@ The module defines the following functions and data items: .. note:: - Unlike the C function of the same name, there is no trailing newline. + Unlike the C function of the same name, :func:`asctime` does not add a + trailing newline. .. function:: clock() @@ -172,6 +141,97 @@ The module defines the following functions and data items: :c:func:`QueryPerformanceCounter`. The resolution is typically better than one microsecond. + .. deprecated:: 3.3 + The behaviour of this function depends on the platform: use + :func:`perf_counter` or :func:`process_time` instead, depending on your + requirements, to have a well defined behaviour. + + +.. function:: clock_getres(clk_id) + + Return the resolution (precision) of the specified clock *clk_id*. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: clock_gettime(clk_id) + + Return the time of the specified clock *clk_id*. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. function:: clock_settime(clk_id, time) + + Set the time of the specified clock *clk_id*. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: CLOCK_HIGHRES + + The Solaris OS has a CLOCK_HIGHRES timer that attempts to use an optimal + hardware source, and may give close to nanosecond resolution. CLOCK_HIGHRES + is the nonadjustable, high-resolution clock. + + Availability: Solaris. + + .. versionadded:: 3.3 + + +.. data:: CLOCK_MONOTONIC + + Clock that cannot be set and represents monotonic time since some unspecified + starting point. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: CLOCK_MONOTONIC_RAW + + Similar to :data:`CLOCK_MONOTONIC`, but provides access to a raw + hardware-based time that is not subject to NTP adjustments. + + Availability: Linux 2.6.28 or later. + + .. versionadded:: 3.3 + + +.. data:: CLOCK_PROCESS_CPUTIME_ID + + High-resolution per-process timer from the CPU. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: CLOCK_REALTIME + + System-wide real-time clock. Setting this clock requires appropriate + privileges. + + Availability: Unix. + + .. versionadded:: 3.3 + + +.. data:: CLOCK_THREAD_CPUTIME_ID + + Thread-specific CPU-time clock. + + Availability: Unix. + + .. versionadded:: 3.3 + .. function:: ctime([secs]) @@ -186,6 +246,31 @@ The module defines the following functions and data items: Nonzero if a DST timezone is defined. +.. function:: get_clock_info(name) + + Get information on the specified clock as a namespace object. + Supported clock names and the corresponding functions to read their value + are: + + * ``'clock'``: :func:`time.clock` + * ``'monotonic'``: :func:`time.monotonic` + * ``'perf_counter'``: :func:`time.perf_counter` + * ``'process_time'``: :func:`time.process_time` + * ``'time'``: :func:`time.time` + + The result has the following attributes: + + - *adjustable*: ``True`` if the clock can be changed automatically (e.g. by + a NTP daemon) or manually by the system administrator, ``False`` otherwise + - *implementation*: The name of the underlying C function used to get + the clock value + - *monotonic*: ``True`` if the clock cannot go backward, + ``False`` otherwise + - *resolution*: The resolution of the clock in seconds (:class:`float`) + + .. versionadded:: 3.3 + + .. function:: gmtime([secs]) Convert a time expressed in seconds since the epoch to a :class:`struct_time` in @@ -215,6 +300,47 @@ The module defines the following functions and data items: The earliest date for which it can generate a time is platform-dependent. +.. function:: monotonic() + + Return the value (in fractional seconds) of a monotonic clock, i.e. a clock + that cannot go backwards. The clock is not affected by system clock updates. + The reference point of the returned value is undefined, so that only the + difference between the results of consecutive calls is valid. + + On Windows versions older than Vista, :func:`monotonic` detects + :c:func:`GetTickCount` integer overflow (32 bits, roll-over after 49.7 days). + It increases an internal epoch (reference time) by 2\ :sup:`32` each time + that an overflow is detected. The epoch is stored in the process-local state + and so the value of :func:`monotonic` may be different in two Python + processes running for more than 49 days. On more recent versions of Windows + and on other operating systems, :func:`monotonic` is system-wide. + + Availability: Windows, Mac OS X, Linux, FreeBSD, OpenBSD, Solaris. + + .. versionadded:: 3.3 + + +.. function:: perf_counter() + + Return the value (in fractional seconds) of a performance counter, i.e. a + clock with the highest available resolution to measure a short duration. It + does include time elapsed during sleep and is system-wide. The reference + point of the returned value is undefined, so that only the difference between + the results of consecutive calls is valid. + + .. versionadded:: 3.3 + + +.. function:: process_time() + + Return the value (in fractional seconds) of the sum of the system and user + CPU time of the current process. It does not include time elapsed during + sleep. It is process-wide by definition. The reference point of the + returned value is undefined, so that only the difference between the results + of consecutive calls is valid. + + .. versionadded:: 3.3 + .. function:: sleep(secs) Suspend execution for the given number of seconds. The argument may be a @@ -308,9 +434,15 @@ The module defines the following functions and data items: | ``%y`` | Year without century as a decimal number | | | | [00,99]. | | +-----------+------------------------------------------------+-------+ - | ``%Y`` | Year with century as a decimal number. | \(4) | + | ``%Y`` | Year with century as a decimal number. | | | | | | +-----------+------------------------------------------------+-------+ + | ``%z`` | Time zone offset indicating a positive or | | + | | negative time difference from UTC/GMT of the | | + | | form +HHMM or -HHMM, where H represents decimal| | + | | hour digits and M represents decimal minute | | + | | digits [-23:59, +23:59]. | | + +-----------+------------------------------------------------+-------+ | ``%Z`` | Time zone name (no characters if no time zone | | | | exists). | | +-----------+------------------------------------------------+-------+ @@ -332,12 +464,6 @@ The module defines the following functions and data items: When used with the :func:`strptime` function, ``%U`` and ``%W`` are only used in calculations when the day of the week and the year are specified. - (4) - Produces different results depending on the value of - ``time.accept2dyear`` variable. See :ref:`Year 2000 (Y2K) - issues <time-y2kissues>` for details. - - Here is an example, a format for dates compatible with that specified in the :rfc:`2822` Internet email standard. [#]_ :: @@ -345,8 +471,10 @@ The module defines the following functions and data items: >>> strftime("%a, %d %b %Y %H:%M:%S +0000", gmtime()) 'Thu, 28 Jun 2001 14:17:15 +0000' - Additional directives may be supported on certain platforms, but only the ones - listed here have a meaning standardized by ANSI C. + Additional directives may be supported on certain platforms, but only the + ones listed here have a meaning standardized by ANSI C. To see the full set + of format codes supported on your platform, consult the :manpage:`strftime(3)` + documentation. On some platforms, an optional field width and precision specification can immediately follow the initial ``'%'`` of a directive in the following order; @@ -416,10 +544,13 @@ The module defines the following functions and data items: +-------+-------------------+---------------------------------+ | 8 | :attr:`tm_isdst` | 0, 1 or -1; see below | +-------+-------------------+---------------------------------+ + | N/A | :attr:`tm_zone` | abbreviation of timezone name | + +-------+-------------------+---------------------------------+ + | N/A | :attr:`tm_gmtoff` | offset east of UTC in seconds | + +-------+-------------------+---------------------------------+ Note that unlike the C structure, the month value is a range of [1, 12], not - [0, 11]. A year value will be handled as described under :ref:`Year 2000 - (Y2K) issues <time-y2kissues>` above. A ``-1`` argument as the daylight + [0, 11]. A ``-1`` argument as the daylight savings flag, passed to :func:`mktime` will usually result in the correct daylight savings state to be filled in. @@ -427,6 +558,9 @@ The module defines the following functions and data items: :class:`struct_time`, or having elements of the wrong type, a :exc:`TypeError` is raised. + .. versionchanged:: 3.3 + :attr:`tm_gmtoff` and :attr:`tm_zone` attributes are available on platforms + with C library supporting the corresponding fields in ``struct tm``. .. function:: time() @@ -437,7 +571,6 @@ The module defines the following functions and data items: lower value than a previous call if the system clock has been set back between the two calls. - .. data:: timezone The offset of the local (non-DST) timezone, in seconds west of UTC (negative in @@ -540,12 +673,12 @@ The module defines the following functions and data items: More object-oriented interface to dates and times. Module :mod:`locale` - Internationalization services. The locale settings can affect the return values - for some of the functions in the :mod:`time` module. + Internationalization services. The locale setting affects the interpretation + of many format specifiers in :func:`strftime` and :func:`strptime`. Module :mod:`calendar` - General calendar-related functions. :func:`timegm` is the inverse of - :func:`gmtime` from this module. + General calendar-related functions. :func:`~calendar.timegm` is the + inverse of :func:`gmtime` from this module. .. rubric:: Footnotes diff --git a/Doc/library/timeit.rst b/Doc/library/timeit.rst index a3ec66f..a487917 100644 --- a/Doc/library/timeit.rst +++ b/Doc/library/timeit.rst @@ -73,13 +73,10 @@ The module defines three convenience functions and a public class: .. function:: default_timer() - Define a default timer, in a platform-specific manner. On Windows, - :func:`time.clock` has microsecond granularity, but :func:`time.time`'s - granularity is 1/60th of a second. On Unix, :func:`time.clock` has 1/100th of - a second granularity, and :func:`time.time` is much more precise. On either - platform, :func:`default_timer` measures wall clock time, not the CPU - time. This means that other processes running on the same computer may - interfere with the timing. + The default timer, which is always :func:`time.perf_counter`. + + .. versionchanged:: 3.3 + :func:`time.perf_counter` is now the default timer. .. class:: Timer(stmt='pass', setup='pass', timer=<timer function>) @@ -187,13 +184,20 @@ Where the following options are understood: statement to be executed once initially (default ``pass``) +.. cmdoption:: -p, --process + + measure process time, not wallclock time, using :func:`time.process_time` + instead of :func:`time.perf_counter`, which is the default + + .. versionadded:: 3.3 + .. cmdoption:: -t, --time - use :func:`time.time` (default on all platforms but Windows) + use :func:`time.time` (deprecated) .. cmdoption:: -c, --clock - use :func:`time.clock` (default on Windows) + use :func:`time.clock` (deprecated) .. cmdoption:: -v, --verbose @@ -211,12 +215,11 @@ similarly. If :option:`-n` is not given, a suitable number of loops is calculated by trying successive powers of 10 until the total time is at least 0.2 seconds. -:func:`default_timer` measurations can be affected by other programs running on -the same machine, so -the best thing to do when accurate timing is necessary is to repeat -the timing a few times and use the best time. The :option:`-r` option is good -for this; the default of 3 repetitions is probably enough in most cases. On -Unix, you can use :func:`time.clock` to measure CPU time. +:func:`default_timer` measurements can be affected by other programs running on +the same machine, so the best thing to do when accurate timing is necessary is +to repeat the timing a few times and use the best time. The :option:`-r` +option is good for this; the default of 3 repetitions is probably enough in +most cases. You can use :func:`time.process_time` to measure CPU time. .. note:: diff --git a/Doc/library/tkinter.rst b/Doc/library/tkinter.rst index f6e095a..eeb1f80 100644 --- a/Doc/library/tkinter.rst +++ b/Doc/library/tkinter.rst @@ -37,9 +37,6 @@ this should open a window demonstrating a simple Tk interface. `Modern Tkinter for Busy Python Developers <http://www.amazon.com/Modern-Tkinter-Python-Developers-ebook/dp/B0071QDNLO/>`_ Book by Mark Rozerman about building attractive and modern graphical user interfaces with Python and Tkinter. - `An Introduction to Tkinter <http://www.pythonware.com/library/an-introduction-to-tkinter.htm>`_ - Fredrik Lundh's on-line reference material. - `Python and Tkinter Programming <http://www.amazon.com/exec/obidos/ASIN/1884777813>`_ The book by John Grayson (ISBN 1-884777-81-3). @@ -183,7 +180,7 @@ documentation that exists. Here are some hints: The Tk/Tcl development is largely taking place at ActiveState. `Tcl and the Tk Toolkit <http://www.amazon.com/exec/obidos/ASIN/020163337X>`_ - The book by John Ousterhout, the inventor of Tcl . + The book by John Ousterhout, the inventor of Tcl. `Practical Programming in Tcl and Tk <http://www.amazon.com/exec/obidos/ASIN/0130220280>`_ Brent Welch's encyclopedic book. @@ -194,35 +191,30 @@ A Simple Hello World Program :: - from tkinter import * - - class Application(Frame): - def say_hi(self): - print("hi there, everyone!") - - def createWidgets(self): - self.QUIT = Button(self) - self.QUIT["text"] = "QUIT" - self.QUIT["fg"] = "red" - self.QUIT["command"] = self.quit + import tkinter as tk - self.QUIT.pack({"side": "left"}) + class Application(tk.Frame): + def __init__(self, master=None): + tk.Frame.__init__(self, master) + self.pack() + self.createWidgets() - self.hi_there = Button(self) - self.hi_there["text"] = "Hello", - self.hi_there["command"] = self.say_hi + def createWidgets(self): + self.hi_there = tk.Button(self) + self.hi_there["text"] = "Hello World\n(click me)" + self.hi_there["command"] = self.say_hi + self.hi_there.pack(side="top") - self.hi_there.pack({"side": "left"}) + self.QUIT = tk.Button(self, text="QUIT", fg="red", + command=root.destroy) + self.QUIT.pack(side="bottom") - def __init__(self, master=None): - Frame.__init__(self, master) - self.pack() - self.createWidgets() + def say_hi(self): + print("hi there, everyone!") - root = Tk() - app = Application(master=root) - app.mainloop() - root.destroy() + root = tk.Tk() + app = Application(master=root) + app.mainloop() A (Very) Quick Look at Tcl/Tk @@ -448,7 +440,7 @@ back will contain the name of the synonym and the "real" option (such as Example:: >>> print(fred.config()) - {'relief' : ('relief', 'relief', 'Relief', 'raised', 'groove')} + {'relief': ('relief', 'relief', 'Relief', 'raised', 'groove')} Of course, the dictionary printed will include all the options available and their values. This is meant only as an example. @@ -621,7 +613,7 @@ bitmap preceded with an ``@``, as in ``"@/usr/contrib/bitmap/gumby.bit"``. boolean - You can pass integers 0 or 1 or the strings ``"yes"`` or ``"no"`` . + You can pass integers 0 or 1 or the strings ``"yes"`` or ``"no"``. callback This is any Python function that takes no arguments. For example:: @@ -755,22 +747,32 @@ Entry widget indexes (index, view index, etc.) displayed. You can use these :mod:`tkinter` functions to access these special points in text widgets: - AtEnd() +.. function:: AtEnd() refers to the last position in the text - AtInsert() + .. deprecated:: 3.3 + +.. function:: AtInsert() refers to the point where the text cursor is - AtSelFirst() + .. deprecated:: 3.3 + +.. function:: AtSelFirst() indicates the beginning point of the selected text - AtSelLast() + .. deprecated:: 3.3 + +.. function:: AtSelLast() denotes the last point of the selected text and finally - At(x[, y]) + .. deprecated:: 3.3 + +.. function:: At(x[, y]) refers to the character at pixel location *x*, *y* (with *y* not used in the case of a text entry widget, which contains a single line of text). + .. deprecated:: 3.3 + Text widget indexes The index notation for Text widgets is very rich and is best described in the Tk man pages. @@ -818,4 +820,3 @@ some widget (e.g. labels, buttons, menus). In these cases, Tk will not keep a reference to the image. When the last Python reference to the image object is deleted, the image data is deleted as well, and Tk will display an empty box wherever the image was used. - diff --git a/Doc/library/tkinter.ttk.rst b/Doc/library/tkinter.ttk.rst index ed351f5..6f8bf1c 100644 --- a/Doc/library/tkinter.ttk.rst +++ b/Doc/library/tkinter.ttk.rst @@ -102,6 +102,8 @@ Standard Options All the :mod:`ttk` Widgets accepts the following options: + .. tabularcolumns:: |l|L| + +-----------+--------------------------------------------------------------+ | Option | Description | +===========+==============================================================+ @@ -134,8 +136,10 @@ Scrollable Widget Options The following options are supported by widgets that are controlled by a scrollbar. + .. tabularcolumns:: |l|L| + +----------------+---------------------------------------------------------+ - | option | description | + | Option | Description | +================+=========================================================+ | xscrollcommand | Used to communicate with horizontal scrollbars. | | | | @@ -158,11 +162,10 @@ Label Options The following options are supported by labels, buttons and other button-like widgets. -.. tabularcolumns:: |p{0.2\textwidth}|p{0.7\textwidth}| -.. + .. tabularcolumns:: |l|p{0.7\linewidth}| +--------------+-----------------------------------------------------------+ - | option | description | + | Option | Description | +==============+===========================================================+ | text | Specifies a text string to be displayed inside the widget.| +--------------+-----------------------------------------------------------+ @@ -202,8 +205,10 @@ widgets. Compatibility Options ^^^^^^^^^^^^^^^^^^^^^ + .. tabularcolumns:: |l|L| + +--------+----------------------------------------------------------------+ - | option | description | + | Option | Description | +========+================================================================+ | state | May be set to "normal" or "disabled" to control the "disabled" | | | state bit. This is a write-only option: setting it changes the | @@ -216,8 +221,10 @@ Widget States The widget state is a bitmap of independent state flags. + .. tabularcolumns:: |l|L| + +------------+-------------------------------------------------------------+ - | flag | description | + | Flag | Description | +============+=============================================================+ | active | The mouse cursor is over the widget and pressing a mouse | | | button will cause some action to occur | @@ -265,8 +272,8 @@ methods :meth:`tkinter.Widget.cget` and :meth:`tkinter.Widget.configure`. .. method:: instate(statespec, callback=None, *args, **kw) - Test the widget's state. If a callback is not specified, returns True - if the widget state matches *statespec* and False otherwise. If callback + Test the widget's state. If a callback is not specified, returns ``True`` + if the widget state matches *statespec* and ``False`` otherwise. If callback is specified then it is called with args if widget state matches *statespec*. @@ -301,8 +308,10 @@ Options This widget accepts the following specific options: + .. tabularcolumns:: |l|L| + +-----------------+--------------------------------------------------------+ - | option | description | + | Option | Description | +=================+========================================================+ | exportselection | Boolean value. If set, the widget selection is linked | | | to the Window Manager selection (which can be returned | @@ -380,8 +389,10 @@ Options This widget accepts the following specific options: + .. tabularcolumns:: |l|L| + +---------+----------------------------------------------------------------+ - | option | description | + | Option | Description | +=========+================================================================+ | height | If present and greater than zero, specifies the desired height | | | of the pane area (not including internal padding or tabs). | @@ -404,8 +415,10 @@ Tab Options There are also specific options for tabs: + .. tabularcolumns:: |l|L| + +-----------+--------------------------------------------------------------+ - | option | description | + | Option | Description | +===========+==============================================================+ | state | Either "normal", "disabled" or "hidden". If "disabled", then | | | the tab is not selectable. If "hidden", then the tab is not | @@ -566,8 +579,10 @@ Options This widget accepts the following specific options: + .. tabularcolumns:: |l|L| + +----------+---------------------------------------------------------------+ - | option | description | + | Option | Description | +==========+===============================================================+ | orient | One of "horizontal" or "vertical". Specifies the orientation | | | of the progress bar. | @@ -635,8 +650,10 @@ Options This widget accepts the following specific option: + .. tabularcolumns:: |l|L| + +--------+----------------------------------------------------------------+ - | option | description | + | Option | Description | +========+================================================================+ | orient | One of "horizontal" or "vertical". Specifies the orientation of| | | the separator. | @@ -701,11 +718,10 @@ Options This widget accepts the following specific options: -.. tabularcolumns:: |p{0.2\textwidth}|p{0.7\textwidth}| -.. + .. tabularcolumns:: |l|p{0.7\linewidth}| +----------------+--------------------------------------------------------+ - | option | description | + | Option | Description | +================+========================================================+ | columns | A list of column identifiers, specifying the number of | | | columns and their names. | @@ -753,8 +769,10 @@ Item Options The following item options may be specified for items in the insert and item widget commands. + .. tabularcolumns:: |l|L| + +--------+---------------------------------------------------------------+ - | option | description | + | Option | Description | +========+===============================================================+ | text | The textual label to display for the item. | +--------+---------------------------------------------------------------+ @@ -779,8 +797,10 @@ Tag Options The following options may be specified on tags: + .. tabularcolumns:: |l|L| + +------------+-----------------------------------------------------------+ - | option | description | + | Option | Description | +============+===========================================================+ | foreground | Specifies the text foreground color. | +------------+-----------------------------------------------------------+ @@ -822,8 +842,10 @@ Virtual Events The Treeview widget generates the following virtual events. + .. tabularcolumns:: |l|L| + +--------------------+--------------------------------------------------+ - | event | description | + | Event | Description | +====================+==================================================+ | <<TreeviewSelect>> | Generated whenever the selection changes. | +--------------------+--------------------------------------------------+ @@ -916,7 +938,7 @@ ttk.Treeview .. method:: exists(item) - Returns True if the specified *item* is present in the tree. + Returns ``True`` if the specified *item* is present in the tree. .. method:: focus(item=None) @@ -1062,7 +1084,7 @@ ttk.Treeview Ensure that *item* is visible. - Sets all of *item*'s ancestors open option to True, and scrolls the + Sets all of *item*'s ancestors open option to ``True``, and scrolls the widget if necessary so that *item* is within the visible portion of the tree. diff --git a/Doc/library/tokenize.rst b/Doc/library/tokenize.rst index 70919ca..37d9f41 100644 --- a/Doc/library/tokenize.rst +++ b/Doc/library/tokenize.rst @@ -17,9 +17,11 @@ colorizers for on-screen displays. To simplify token stream handling, all :ref:`operators` and :ref:`delimiters` tokens are returned using the generic :data:`token.OP` token type. The exact -type can be determined by checking the token ``string`` field on the -:term:`named tuple` returned from :func:`tokenize.tokenize` for the character -sequence that identifies a specific operator token. +type can be determined by checking the ``exact_type`` property on the +:term:`named tuple` returned from :func:`tokenize.tokenize`. + +Tokenizing Input +---------------- The primary entry point is a :term:`generator`: @@ -39,9 +41,17 @@ The primary entry point is a :term:`generator`: returned as a :term:`named tuple` with the field names: ``type string start end line``. + The returned :term:`named tuple` has a additional property named + ``exact_type`` that contains the exact operator type for + :data:`token.OP` tokens. For all other token types ``exact_type`` + equals the named tuple ``type`` field. + .. versionchanged:: 3.1 Added support for named tuples. + .. versionchanged:: 3.3 + Added support for ``exact_type``. + :func:`tokenize` determines the source encoding of the file by looking for a UTF-8 BOM or encoding cookie, according to :pep:`263`. @@ -122,6 +132,38 @@ function it uses to do this is available: .. versionadded:: 3.2 +.. _tokenize-cli: + +Command-Line Usage +------------------ + +.. versionadded:: 3.3 + +The :mod:`tokenize` module can be executed as a script from the command line. +It is as simple as: + +.. code-block:: sh + + python -m tokenize [-e] [filename.py] + +The following options are accepted: + +.. program:: tokenize + +.. cmdoption:: -h, --help + + show this help message and exit + +.. cmdoption:: -e, --exact + + display token names using the exact type + +If :file:`filename.py` is specified its contents are tokenized to stdout. +Otherwise, tokenization is performed on stdin. + +Examples +------------------ + Example of a script rewriter that transforms float literals into Decimal objects:: @@ -164,3 +206,63 @@ objects:: result.append((toknum, tokval)) return untokenize(result).decode('utf-8') +Example of tokenizing from the command line. The script:: + + def say_hello(): + print("Hello, World!") + + say_hello() + +will be tokenized to the following output where the first column is the range +of the line/column coordinates where the token is found, the second column is +the name of the token, and the final column is the value of the token (if any) + +.. code-block:: sh + + $ python -m tokenize hello.py + 0,0-0,0: ENCODING 'utf-8' + 1,0-1,3: NAME 'def' + 1,4-1,13: NAME 'say_hello' + 1,13-1,14: OP '(' + 1,14-1,15: OP ')' + 1,15-1,16: OP ':' + 1,16-1,17: NEWLINE '\n' + 2,0-2,4: INDENT ' ' + 2,4-2,9: NAME 'print' + 2,9-2,10: OP '(' + 2,10-2,25: STRING '"Hello, World!"' + 2,25-2,26: OP ')' + 2,26-2,27: NEWLINE '\n' + 3,0-3,1: NL '\n' + 4,0-4,0: DEDENT '' + 4,0-4,9: NAME 'say_hello' + 4,9-4,10: OP '(' + 4,10-4,11: OP ')' + 4,11-4,12: NEWLINE '\n' + 5,0-5,0: ENDMARKER '' + +The exact token type names can be displayed using the ``-e`` option: + +.. code-block:: sh + + $ python -m tokenize -e hello.py + 0,0-0,0: ENCODING 'utf-8' + 1,0-1,3: NAME 'def' + 1,4-1,13: NAME 'say_hello' + 1,13-1,14: LPAR '(' + 1,14-1,15: RPAR ')' + 1,15-1,16: COLON ':' + 1,16-1,17: NEWLINE '\n' + 2,0-2,4: INDENT ' ' + 2,4-2,9: NAME 'print' + 2,9-2,10: LPAR '(' + 2,10-2,25: STRING '"Hello, World!"' + 2,25-2,26: RPAR ')' + 2,26-2,27: NEWLINE '\n' + 3,0-3,1: NL '\n' + 4,0-4,0: DEDENT '' + 4,0-4,9: NAME 'say_hello' + 4,9-4,10: LPAR '(' + 4,10-4,11: RPAR ')' + 4,11-4,12: NEWLINE '\n' + 5,0-5,0: ENDMARKER '' diff --git a/Doc/library/trace.rst b/Doc/library/trace.rst index 9b52f7d..b0ac812 100644 --- a/Doc/library/trace.rst +++ b/Doc/library/trace.rst @@ -41,8 +41,8 @@ Main options At least one of the following options must be specified when invoking :mod:`trace`. The :option:`--listfuncs <-l>` option is mutually exclusive with -the :option:`--trace <-t>` and :option:`--counts <-c>` options . When -:option:`--listfuncs <-l>` is provided, neither :option:`--counts <-c>` nor +the :option:`--trace <-t>` and :option:`--count <-c>` options. When +:option:`--listfuncs <-l>` is provided, neither :option:`--count <-c>` nor :option:`--trace <-t>` are accepted, and vice versa. .. program:: trace diff --git a/Doc/library/turtle.rst b/Doc/library/turtle.rst index 4373f78..b015530 100644 --- a/Doc/library/turtle.rst +++ b/Doc/library/turtle.rst @@ -1055,8 +1055,8 @@ More drawing control Write text - the string representation of *arg* - at the current turtle position according to *align* ("left", "center" or right") and with the given - font. If *move* is True, the pen is moved to the bottom-right corner of the - text. By default, *move* is False. + font. If *move* is true, the pen is moved to the bottom-right corner of the + text. By default, *move* is ``False``. >>> turtle.write("Home = ", True, align="center") >>> turtle.write((0,0), True) @@ -1092,7 +1092,7 @@ Visibility .. function:: isvisible() - Return True if the Turtle is shown, False if it's hidden. + Return ``True`` if the Turtle is shown, ``False`` if it's hidden. >>> turtle.hideturtle() >>> turtle.isvisible() @@ -2301,9 +2301,11 @@ The :mod:`turtledemo` package directory contains: The demo scripts are: +.. tabularcolumns:: |l|L|L| + +----------------+------------------------------+-----------------------+ | Name | Description | Features | -+----------------+------------------------------+-----------------------+ ++================+==============================+=======================+ | bytedesign | complex classical | :func:`tracer`, delay,| | | turtle graphics pattern | :func:`update` | +----------------+------------------------------+-----------------------+ diff --git a/Doc/library/types.rst b/Doc/library/types.rst index d4a76b6..695480f 100644 --- a/Doc/library/types.rst +++ b/Doc/library/types.rst @@ -1,5 +1,5 @@ -:mod:`types` --- Names for built-in types -========================================= +:mod:`types` --- Dynamic type creation and names for built-in types +=================================================================== .. module:: types :synopsis: Names for built-in types. @@ -8,20 +8,77 @@ -------------- -This module defines names for some object types that are used by the standard +This module defines utility function to assist in dynamic creation of +new types. + +It also defines names for some object types that are used by the standard Python interpreter, but not exposed as builtins like :class:`int` or -:class:`str` are. Also, it does not include some of the types that arise -transparently during processing such as the ``listiterator`` type. +:class:`str` are. + + +Dynamic Type Creation +--------------------- + +.. function:: new_class(name, bases=(), kwds=None, exec_body=None) + + Creates a class object dynamically using the appropriate metaclass. + + The first three arguments are the components that make up a class + definition header: the class name, the base classes (in order), the + keyword arguments (such as ``metaclass``). + + The *exec_body* argument is a callback that is used to populate the + freshly created class namespace. It should accept the class namespace + as its sole argument and update the namespace directly with the class + contents. If no callback is provided, it has the same effect as passing + in ``lambda ns: ns``. + + .. versionadded:: 3.3 + +.. function:: prepare_class(name, bases=(), kwds=None) + + Calculates the appropriate metaclass and creates the class namespace. + + The arguments are the components that make up a class definition header: + the class name, the base classes (in order) and the keyword arguments + (such as ``metaclass``). + + The return value is a 3-tuple: ``metaclass, namespace, kwds`` + + *metaclass* is the appropriate metaclass, *namespace* is the + prepared class namespace and *kwds* is an updated copy of the passed + in *kwds* argument with any ``'metaclass'`` entry removed. If no *kwds* + argument is passed in, this will be an empty dict. -Typical use is for :func:`isinstance` or :func:`issubclass` checks. + .. versionadded:: 3.3 -The module defines the following names: +.. seealso:: + + :ref:`metaclasses` + Full details of the class creation process supported by these functions + + :pep:`3115` - Metaclasses in Python 3000 + Introduced the ``__prepare__`` namespace hook + + +Standard Interpreter Types +-------------------------- + +This module provides names for many of the types that are required to +implement a Python interpreter. It deliberately avoids including some of +the types that arise only incidentally during processing such as the +``listiterator`` type. + +Typical use of these names is for :func:`isinstance` or +:func:`issubclass` checks. + +Standard names are defined for the following types: .. data:: FunctionType LambdaType - The type of user-defined functions and functions created by :keyword:`lambda` - expressions. + The type of user-defined functions and functions created by + :keyword:`lambda` expressions. .. data:: GeneratorType @@ -85,3 +142,79 @@ The module defines the following names: In other implementations of Python, this type may be identical to ``GetSetDescriptorType``. + +.. class:: MappingProxyType(mapping) + + Read-only proxy of a mapping. It provides a dynamic view on the mapping's + entries, which means that when the mapping changes, the view reflects these + changes. + + .. versionadded:: 3.3 + + .. describe:: key in proxy + + Return ``True`` if the underlying mapping has a key *key*, else + ``False``. + + .. describe:: proxy[key] + + Return the item of the underlying mapping with key *key*. Raises a + :exc:`KeyError` if *key* is not in the underlying mapping. + + .. describe:: iter(proxy) + + Return an iterator over the keys of the underlying mapping. This is a + shortcut for ``iter(proxy.keys())``. + + .. describe:: len(proxy) + + Return the number of items in the underlying mapping. + + .. method:: copy() + + Return a shallow copy of the underlying mapping. + + .. method:: get(key[, default]) + + Return the value for *key* if *key* is in the underlying mapping, else + *default*. If *default* is not given, it defaults to ``None``, so that + this method never raises a :exc:`KeyError`. + + .. method:: items() + + Return a new view of the underlying mapping's items (``(key, value)`` + pairs). + + .. method:: keys() + + Return a new view of the underlying mapping's keys. + + .. method:: values() + + Return a new view of the underlying mapping's values. + + +.. class:: SimpleNamespace + + A simple :class:`object` subclass that provides attribute access to its + namespace, as well as a meaningful repr. + + Unlike :class:`object`, with ``SimpleNamespace`` you can add and remove + attributes. If a ``SimpleNamespace`` object is initialized with keyword + arguments, those are directly added to the underlying namespace. + + The type is roughly equivalent to the following code:: + + class SimpleNamespace: + def __init__(self, **kwargs): + self.__dict__.update(kwargs) + def __repr__(self): + keys = sorted(self.__dict__) + items = ("{}={!r}".format(k, self.__dict__[k]) for k in keys) + return "{}({})".format(type(self).__name__, ", ".join(items)) + + ``SimpleNamespace`` may be useful as a replacement for ``class NS: pass``. + However, for a structured record type use :func:`~collections.namedtuple` + instead. + + .. versionadded:: 3.3 diff --git a/Doc/library/unicodedata.rst b/Doc/library/unicodedata.rst index 33b5ce0..ad70dd9 100644 --- a/Doc/library/unicodedata.rst +++ b/Doc/library/unicodedata.rst @@ -15,8 +15,8 @@ This module provides access to the Unicode Character Database (UCD) which defines character properties for all Unicode characters. The data contained in -this database is compiled from the `UCD version 6.0.0 -<http://www.unicode.org/Public/6.0.0/ucd>`_. +this database is compiled from the `UCD version 6.1.0 +<http://www.unicode.org/Public/6.1.0/ucd>`_. The module uses the same names and symbols as defined by Unicode Standard Annex #44, `"Unicode Character Database" @@ -29,6 +29,9 @@ following functions: Look up character by name. If a character with the given name is found, return the corresponding character. If not found, :exc:`KeyError` is raised. + .. versionchanged:: 3.3 + Support for name aliases [#]_ and named sequences [#]_ has been added. + .. function:: name(chr[, default]) @@ -160,3 +163,9 @@ Examples: >>> unicodedata.bidirectional('\u0660') # 'A'rabic, 'N'umber 'AN' + +.. rubric:: Footnotes + +.. [#] http://www.unicode.org/Public/6.1.0/ucd/NameAliases.txt + +.. [#] http://www.unicode.org/Public/6.1.0/ucd/NamedSequences.txt diff --git a/Doc/library/unittest.mock-examples.rst b/Doc/library/unittest.mock-examples.rst new file mode 100644 index 0000000..554ef11 --- /dev/null +++ b/Doc/library/unittest.mock-examples.rst @@ -0,0 +1,1246 @@ +:mod:`unittest.mock` --- getting started +======================================== + +.. moduleauthor:: Michael Foord <michael@python.org> +.. currentmodule:: unittest.mock + +.. versionadded:: 3.3 + + +.. _getting-started: + +Using Mock +---------- + +Mock Patching Methods +~~~~~~~~~~~~~~~~~~~~~ + +Common uses for :class:`Mock` objects include: + +* Patching methods +* Recording method calls on objects + +You might want to replace a method on an object to check that +it is called with the correct arguments by another part of the system: + + >>> real = SomeClass() + >>> real.method = MagicMock(name='method') + >>> real.method(3, 4, 5, key='value') + <MagicMock name='method()' id='...'> + +Once our mock has been used (`real.method` in this example) it has methods +and attributes that allow you to make assertions about how it has been used. + +.. note:: + + In most of these examples the :class:`Mock` and :class:`MagicMock` classes + are interchangeable. As the `MagicMock` is the more capable class it makes + a sensible one to use by default. + +Once the mock has been called its :attr:`~Mock.called` attribute is set to +`True`. More importantly we can use the :meth:`~Mock.assert_called_with` or +:meth:`~Mock.assert_called_once_with` method to check that it was called with +the correct arguments. + +This example tests that calling `ProductionClass().method` results in a call to +the `something` method: + + >>> class ProductionClass: + ... def method(self): + ... self.something(1, 2, 3) + ... def something(self, a, b, c): + ... pass + ... + >>> real = ProductionClass() + >>> real.something = MagicMock() + >>> real.method() + >>> real.something.assert_called_once_with(1, 2, 3) + + + +Mock for Method Calls on an Object +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In the last example we patched a method directly on an object to check that it +was called correctly. Another common use case is to pass an object into a +method (or some part of the system under test) and then check that it is used +in the correct way. + +The simple `ProductionClass` below has a `closer` method. If it is called with +an object then it calls `close` on it. + + >>> class ProductionClass: + ... def closer(self, something): + ... something.close() + ... + +So to test it we need to pass in an object with a `close` method and check +that it was called correctly. + + >>> real = ProductionClass() + >>> mock = Mock() + >>> real.closer(mock) + >>> mock.close.assert_called_with() + +We don't have to do any work to provide the 'close' method on our mock. +Accessing close creates it. So, if 'close' hasn't already been called then +accessing it in the test will create it, but :meth:`~Mock.assert_called_with` +will raise a failure exception. + + +Mocking Classes +~~~~~~~~~~~~~~~ + +A common use case is to mock out classes instantiated by your code under test. +When you patch a class, then that class is replaced with a mock. Instances +are created by *calling the class*. This means you access the "mock instance" +by looking at the return value of the mocked class. + +In the example below we have a function `some_function` that instantiates `Foo` +and calls a method on it. The call to `patch` replaces the class `Foo` with a +mock. The `Foo` instance is the result of calling the mock, so it is configured +by modifying the mock :attr:`~Mock.return_value`. + + >>> def some_function(): + ... instance = module.Foo() + ... return instance.method() + ... + >>> with patch('module.Foo') as mock: + ... instance = mock.return_value + ... instance.method.return_value = 'the result' + ... result = some_function() + ... assert result == 'the result' + + +Naming your mocks +~~~~~~~~~~~~~~~~~ + +It can be useful to give your mocks a name. The name is shown in the repr of +the mock and can be helpful when the mock appears in test failure messages. The +name is also propagated to attributes or methods of the mock: + + >>> mock = MagicMock(name='foo') + >>> mock + <MagicMock name='foo' id='...'> + >>> mock.method + <MagicMock name='foo.method' id='...'> + + +Tracking all Calls +~~~~~~~~~~~~~~~~~~ + +Often you want to track more than a single call to a method. The +:attr:`~Mock.mock_calls` attribute records all calls +to child attributes of the mock - and also to their children. + + >>> mock = MagicMock() + >>> mock.method() + <MagicMock name='mock.method()' id='...'> + >>> mock.attribute.method(10, x=53) + <MagicMock name='mock.attribute.method()' id='...'> + >>> mock.mock_calls + [call.method(), call.attribute.method(10, x=53)] + +If you make an assertion about `mock_calls` and any unexpected methods +have been called, then the assertion will fail. This is useful because as well +as asserting that the calls you expected have been made, you are also checking +that they were made in the right order and with no additional calls: + +You use the :data:`call` object to construct lists for comparing with +`mock_calls`: + + >>> expected = [call.method(), call.attribute.method(10, x=53)] + >>> mock.mock_calls == expected + True + + +Setting Return Values and Attributes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Setting the return values on a mock object is trivially easy: + + >>> mock = Mock() + >>> mock.return_value = 3 + >>> mock() + 3 + +Of course you can do the same for methods on the mock: + + >>> mock = Mock() + >>> mock.method.return_value = 3 + >>> mock.method() + 3 + +The return value can also be set in the constructor: + + >>> mock = Mock(return_value=3) + >>> mock() + 3 + +If you need an attribute setting on your mock, just do it: + + >>> mock = Mock() + >>> mock.x = 3 + >>> mock.x + 3 + +Sometimes you want to mock up a more complex situation, like for example +`mock.connection.cursor().execute("SELECT 1")`. If we wanted this call to +return a list, then we have to configure the result of the nested call. + +We can use :data:`call` to construct the set of calls in a "chained call" like +this for easy assertion afterwards: + + >>> mock = Mock() + >>> cursor = mock.connection.cursor.return_value + >>> cursor.execute.return_value = ['foo'] + >>> mock.connection.cursor().execute("SELECT 1") + ['foo'] + >>> expected = call.connection.cursor().execute("SELECT 1").call_list() + >>> mock.mock_calls + [call.connection.cursor(), call.connection.cursor().execute('SELECT 1')] + >>> mock.mock_calls == expected + True + +It is the call to `.call_list()` that turns our call object into a list of +calls representing the chained calls. + + +Raising exceptions with mocks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A useful attribute is :attr:`~Mock.side_effect`. If you set this to an +exception class or instance then the exception will be raised when the mock +is called. + + >>> mock = Mock(side_effect=Exception('Boom!')) + >>> mock() + Traceback (most recent call last): + ... + Exception: Boom! + + +Side effect functions and iterables +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +`side_effect` can also be set to a function or an iterable. The use case for +`side_effect` as an iterable is where your mock is going to be called several +times, and you want each call to return a different value. When you set +`side_effect` to an iterable every call to the mock returns the next value +from the iterable: + + >>> mock = MagicMock(side_effect=[4, 5, 6]) + >>> mock() + 4 + >>> mock() + 5 + >>> mock() + 6 + + +For more advanced use cases, like dynamically varying the return values +depending on what the mock is called with, `side_effect` can be a function. +The function will be called with the same arguments as the mock. Whatever the +function returns is what the call returns: + + >>> vals = {(1, 2): 1, (2, 3): 2} + >>> def side_effect(*args): + ... return vals[args] + ... + >>> mock = MagicMock(side_effect=side_effect) + >>> mock(1, 2) + 1 + >>> mock(2, 3) + 2 + + +Creating a Mock from an Existing Object +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +One problem with over use of mocking is that it couples your tests to the +implementation of your mocks rather than your real code. Suppose you have a +class that implements `some_method`. In a test for another class, you +provide a mock of this object that *also* provides `some_method`. If later +you refactor the first class, so that it no longer has `some_method` - then +your tests will continue to pass even though your code is now broken! + +`Mock` allows you to provide an object as a specification for the mock, +using the `spec` keyword argument. Accessing methods / attributes on the +mock that don't exist on your specification object will immediately raise an +attribute error. If you change the implementation of your specification, then +tests that use that class will start failing immediately without you having to +instantiate the class in those tests. + + >>> mock = Mock(spec=SomeClass) + >>> mock.old_method() + Traceback (most recent call last): + ... + AttributeError: object has no attribute 'old_method' + +If you want a stronger form of specification that prevents the setting +of arbitrary attributes as well as the getting of them then you can use +`spec_set` instead of `spec`. + + + +Patch Decorators +---------------- + +.. note:: + + With `patch` it matters that you patch objects in the namespace where they + are looked up. This is normally straightforward, but for a quick guide + read :ref:`where to patch <where-to-patch>`. + + +A common need in tests is to patch a class attribute or a module attribute, +for example patching a builtin or patching a class in a module to test that it +is instantiated. Modules and classes are effectively global, so patching on +them has to be undone after the test or the patch will persist into other +tests and cause hard to diagnose problems. + +mock provides three convenient decorators for this: `patch`, `patch.object` and +`patch.dict`. `patch` takes a single string, of the form +`package.module.Class.attribute` to specify the attribute you are patching. It +also optionally takes a value that you want the attribute (or class or +whatever) to be replaced with. 'patch.object' takes an object and the name of +the attribute you would like patched, plus optionally the value to patch it +with. + +`patch.object`: + + >>> original = SomeClass.attribute + >>> @patch.object(SomeClass, 'attribute', sentinel.attribute) + ... def test(): + ... assert SomeClass.attribute == sentinel.attribute + ... + >>> test() + >>> assert SomeClass.attribute == original + + >>> @patch('package.module.attribute', sentinel.attribute) + ... def test(): + ... from package.module import attribute + ... assert attribute is sentinel.attribute + ... + >>> test() + +If you are patching a module (including :mod:`builtins`) then use `patch` +instead of `patch.object`: + + >>> mock = MagicMock(return_value=sentinel.file_handle) + >>> with patch('builtins.open', mock): + ... handle = open('filename', 'r') + ... + >>> mock.assert_called_with('filename', 'r') + >>> assert handle == sentinel.file_handle, "incorrect file handle returned" + +The module name can be 'dotted', in the form `package.module` if needed: + + >>> @patch('package.module.ClassName.attribute', sentinel.attribute) + ... def test(): + ... from package.module import ClassName + ... assert ClassName.attribute == sentinel.attribute + ... + >>> test() + +A nice pattern is to actually decorate test methods themselves: + + >>> class MyTest(unittest2.TestCase): + ... @patch.object(SomeClass, 'attribute', sentinel.attribute) + ... def test_something(self): + ... self.assertEqual(SomeClass.attribute, sentinel.attribute) + ... + >>> original = SomeClass.attribute + >>> MyTest('test_something').test_something() + >>> assert SomeClass.attribute == original + +If you want to patch with a Mock, you can use `patch` with only one argument +(or `patch.object` with two arguments). The mock will be created for you and +passed into the test function / method: + + >>> class MyTest(unittest2.TestCase): + ... @patch.object(SomeClass, 'static_method') + ... def test_something(self, mock_method): + ... SomeClass.static_method() + ... mock_method.assert_called_with() + ... + >>> MyTest('test_something').test_something() + +You can stack up multiple patch decorators using this pattern: + + >>> class MyTest(unittest2.TestCase): + ... @patch('package.module.ClassName1') + ... @patch('package.module.ClassName2') + ... def test_something(self, MockClass2, MockClass1): + ... self.assertIs(package.module.ClassName1, MockClass1) + ... self.assertIs(package.module.ClassName2, MockClass2) + ... + >>> MyTest('test_something').test_something() + +When you nest patch decorators the mocks are passed in to the decorated +function in the same order they applied (the normal *python* order that +decorators are applied). This means from the bottom up, so in the example +above the mock for `test_module.ClassName2` is passed in first. + +There is also :func:`patch.dict` for setting values in a dictionary just +during a scope and restoring the dictionary to its original state when the test +ends: + + >>> foo = {'key': 'value'} + >>> original = foo.copy() + >>> with patch.dict(foo, {'newkey': 'newvalue'}, clear=True): + ... assert foo == {'newkey': 'newvalue'} + ... + >>> assert foo == original + +`patch`, `patch.object` and `patch.dict` can all be used as context managers. + +Where you use `patch` to create a mock for you, you can get a reference to the +mock using the "as" form of the with statement: + + >>> class ProductionClass: + ... def method(self): + ... pass + ... + >>> with patch.object(ProductionClass, 'method') as mock_method: + ... mock_method.return_value = None + ... real = ProductionClass() + ... real.method(1, 2, 3) + ... + >>> mock_method.assert_called_with(1, 2, 3) + + +As an alternative `patch`, `patch.object` and `patch.dict` can be used as +class decorators. When used in this way it is the same as applying the +decorator individually to every method whose name starts with "test". + + +.. _further-examples: + +Further Examples +---------------- + + +Here are some more examples for some slightly more advanced scenarios. + + +Mocking chained calls +~~~~~~~~~~~~~~~~~~~~~ + +Mocking chained calls is actually straightforward with mock once you +understand the :attr:`~Mock.return_value` attribute. When a mock is called for +the first time, or you fetch its `return_value` before it has been called, a +new `Mock` is created. + +This means that you can see how the object returned from a call to a mocked +object has been used by interrogating the `return_value` mock: + + >>> mock = Mock() + >>> mock().foo(a=2, b=3) + <Mock name='mock().foo()' id='...'> + >>> mock.return_value.foo.assert_called_with(a=2, b=3) + +From here it is a simple step to configure and then make assertions about +chained calls. Of course another alternative is writing your code in a more +testable way in the first place... + +So, suppose we have some code that looks a little bit like this: + + >>> class Something: + ... def __init__(self): + ... self.backend = BackendProvider() + ... def method(self): + ... response = self.backend.get_endpoint('foobar').create_call('spam', 'eggs').start_call() + ... # more code + +Assuming that `BackendProvider` is already well tested, how do we test +`method()`? Specifically, we want to test that the code section `# more +code` uses the response object in the correct way. + +As this chain of calls is made from an instance attribute we can monkey patch +the `backend` attribute on a `Something` instance. In this particular case +we are only interested in the return value from the final call to +`start_call` so we don't have much configuration to do. Let's assume the +object it returns is 'file-like', so we'll ensure that our response object +uses the builtin `open` as its `spec`. + +To do this we create a mock instance as our mock backend and create a mock +response object for it. To set the response as the return value for that final +`start_call` we could do this: + + `mock_backend.get_endpoint.return_value.create_call.return_value.start_call.return_value = mock_response`. + +We can do that in a slightly nicer way using the :meth:`~Mock.configure_mock` +method to directly set the return value for us: + + >>> something = Something() + >>> mock_response = Mock(spec=open) + >>> mock_backend = Mock() + >>> config = {'get_endpoint.return_value.create_call.return_value.start_call.return_value': mock_response} + >>> mock_backend.configure_mock(**config) + +With these we monkey patch the "mock backend" in place and can make the real +call: + + >>> something.backend = mock_backend + >>> something.method() + +Using :attr:`~Mock.mock_calls` we can check the chained call with a single +assert. A chained call is several calls in one line of code, so there will be +several entries in `mock_calls`. We can use :meth:`call.call_list` to create +this list of calls for us: + + >>> chained = call.get_endpoint('foobar').create_call('spam', 'eggs').start_call() + >>> call_list = chained.call_list() + >>> assert mock_backend.mock_calls == call_list + + +Partial mocking +~~~~~~~~~~~~~~~ + +In some tests I wanted to mock out a call to `datetime.date.today() +<http://docs.python.org/library/datetime.html#datetime.date.today>`_ to return +a known date, but I didn't want to prevent the code under test from +creating new date objects. Unfortunately `datetime.date` is written in C, and +so I couldn't just monkey-patch out the static `date.today` method. + +I found a simple way of doing this that involved effectively wrapping the date +class with a mock, but passing through calls to the constructor to the real +class (and returning real instances). + +The :func:`patch decorator <patch>` is used here to +mock out the `date` class in the module under test. The :attr:`side_effect` +attribute on the mock date class is then set to a lambda function that returns +a real date. When the mock date class is called a real date will be +constructed and returned by `side_effect`. + + >>> from datetime import date + >>> with patch('mymodule.date') as mock_date: + ... mock_date.today.return_value = date(2010, 10, 8) + ... mock_date.side_effect = lambda *args, **kw: date(*args, **kw) + ... + ... assert mymodule.date.today() == date(2010, 10, 8) + ... assert mymodule.date(2009, 6, 8) == date(2009, 6, 8) + ... + +Note that we don't patch `datetime.date` globally, we patch `date` in the +module that *uses* it. See :ref:`where to patch <where-to-patch>`. + +When `date.today()` is called a known date is returned, but calls to the +`date(...)` constructor still return normal dates. Without this you can find +yourself having to calculate an expected result using exactly the same +algorithm as the code under test, which is a classic testing anti-pattern. + +Calls to the date constructor are recorded in the `mock_date` attributes +(`call_count` and friends) which may also be useful for your tests. + +An alternative way of dealing with mocking dates, or other builtin classes, +is discussed in `this blog entry +<http://williamjohnbert.com/2011/07/how-to-unit-testing-in-django-with-mocking-and-patching/>`_. + + +Mocking a Generator Method +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A Python generator is a function or method that uses the `yield statement +<http://docs.python.org/reference/simple_stmts.html#the-yield-statement>`_ to +return a series of values when iterated over [#]_. + +A generator method / function is called to return the generator object. It is +the generator object that is then iterated over. The protocol method for +iteration is `__iter__ +<http://docs.python.org/library/stdtypes.html#container.__iter__>`_, so we can +mock this using a `MagicMock`. + +Here's an example class with an "iter" method implemented as a generator: + + >>> class Foo: + ... def iter(self): + ... for i in [1, 2, 3]: + ... yield i + ... + >>> foo = Foo() + >>> list(foo.iter()) + [1, 2, 3] + + +How would we mock this class, and in particular its "iter" method? + +To configure the values returned from the iteration (implicit in the call to +`list`), we need to configure the object returned by the call to `foo.iter()`. + + >>> mock_foo = MagicMock() + >>> mock_foo.iter.return_value = iter([1, 2, 3]) + >>> list(mock_foo.iter()) + [1, 2, 3] + +.. [#] There are also generator expressions and more `advanced uses + <http://www.dabeaz.com/coroutines/index.html>`_ of generators, but we aren't + concerned about them here. A very good introduction to generators and how + powerful they are is: `Generator Tricks for Systems Programmers + <http://www.dabeaz.com/generators/>`_. + + +Applying the same patch to every test method +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you want several patches in place for multiple test methods the obvious way +is to apply the patch decorators to every method. This can feel like unnecessary +repetition. For Python 2.6 or more recent you can use `patch` (in all its +various forms) as a class decorator. This applies the patches to all test +methods on the class. A test method is identified by methods whose names start +with `test`: + + >>> @patch('mymodule.SomeClass') + ... class MyTest(TestCase): + ... + ... def test_one(self, MockSomeClass): + ... self.assertIs(mymodule.SomeClass, MockSomeClass) + ... + ... def test_two(self, MockSomeClass): + ... self.assertIs(mymodule.SomeClass, MockSomeClass) + ... + ... def not_a_test(self): + ... return 'something' + ... + >>> MyTest('test_one').test_one() + >>> MyTest('test_two').test_two() + >>> MyTest('test_two').not_a_test() + 'something' + +An alternative way of managing patches is to use the :ref:`start-and-stop`. +These allow you to move the patching into your `setUp` and `tearDown` methods. + + >>> class MyTest(TestCase): + ... def setUp(self): + ... self.patcher = patch('mymodule.foo') + ... self.mock_foo = self.patcher.start() + ... + ... def test_foo(self): + ... self.assertIs(mymodule.foo, self.mock_foo) + ... + ... def tearDown(self): + ... self.patcher.stop() + ... + >>> MyTest('test_foo').run() + +If you use this technique you must ensure that the patching is "undone" by +calling `stop`. This can be fiddlier than you might think, because if an +exception is raised in the setUp then tearDown is not called. +:meth:`unittest.TestCase.addCleanup` makes this easier: + + >>> class MyTest(TestCase): + ... def setUp(self): + ... patcher = patch('mymodule.foo') + ... self.addCleanup(patcher.stop) + ... self.mock_foo = patcher.start() + ... + ... def test_foo(self): + ... self.assertIs(mymodule.foo, self.mock_foo) + ... + >>> MyTest('test_foo').run() + + +Mocking Unbound Methods +~~~~~~~~~~~~~~~~~~~~~~~ + +Whilst writing tests today I needed to patch an *unbound method* (patching the +method on the class rather than on the instance). I needed self to be passed +in as the first argument because I want to make asserts about which objects +were calling this particular method. The issue is that you can't patch with a +mock for this, because if you replace an unbound method with a mock it doesn't +become a bound method when fetched from the instance, and so it doesn't get +self passed in. The workaround is to patch the unbound method with a real +function instead. The :func:`patch` decorator makes it so simple to +patch out methods with a mock that having to create a real function becomes a +nuisance. + +If you pass `autospec=True` to patch then it does the patching with a +*real* function object. This function object has the same signature as the one +it is replacing, but delegates to a mock under the hood. You still get your +mock auto-created in exactly the same way as before. What it means though, is +that if you use it to patch out an unbound method on a class the mocked +function will be turned into a bound method if it is fetched from an instance. +It will have `self` passed in as the first argument, which is exactly what I +wanted: + + >>> class Foo: + ... def foo(self): + ... pass + ... + >>> with patch.object(Foo, 'foo', autospec=True) as mock_foo: + ... mock_foo.return_value = 'foo' + ... foo = Foo() + ... foo.foo() + ... + 'foo' + >>> mock_foo.assert_called_once_with(foo) + +If we don't use `autospec=True` then the unbound method is patched out +with a Mock instance instead, and isn't called with `self`. + + +Checking multiple calls with mock +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +mock has a nice API for making assertions about how your mock objects are used. + + >>> mock = Mock() + >>> mock.foo_bar.return_value = None + >>> mock.foo_bar('baz', spam='eggs') + >>> mock.foo_bar.assert_called_with('baz', spam='eggs') + +If your mock is only being called once you can use the +:meth:`assert_called_once_with` method that also asserts that the +:attr:`call_count` is one. + + >>> mock.foo_bar.assert_called_once_with('baz', spam='eggs') + >>> mock.foo_bar() + >>> mock.foo_bar.assert_called_once_with('baz', spam='eggs') + Traceback (most recent call last): + ... + AssertionError: Expected to be called once. Called 2 times. + +Both `assert_called_with` and `assert_called_once_with` make assertions about +the *most recent* call. If your mock is going to be called several times, and +you want to make assertions about *all* those calls you can use +:attr:`~Mock.call_args_list`: + + >>> mock = Mock(return_value=None) + >>> mock(1, 2, 3) + >>> mock(4, 5, 6) + >>> mock() + >>> mock.call_args_list + [call(1, 2, 3), call(4, 5, 6), call()] + +The :data:`call` helper makes it easy to make assertions about these calls. You +can build up a list of expected calls and compare it to `call_args_list`. This +looks remarkably similar to the repr of the `call_args_list`: + + >>> expected = [call(1, 2, 3), call(4, 5, 6), call()] + >>> mock.call_args_list == expected + True + + +Coping with mutable arguments +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Another situation is rare, but can bite you, is when your mock is called with +mutable arguments. `call_args` and `call_args_list` store *references* to the +arguments. If the arguments are mutated by the code under test then you can no +longer make assertions about what the values were when the mock was called. + +Here's some example code that shows the problem. Imagine the following functions +defined in 'mymodule':: + + def frob(val): + pass + + def grob(val): + "First frob and then clear val" + frob(val) + val.clear() + +When we try to test that `grob` calls `frob` with the correct argument look +what happens: + + >>> with patch('mymodule.frob') as mock_frob: + ... val = set([6]) + ... mymodule.grob(val) + ... + >>> val + set([]) + >>> mock_frob.assert_called_with(set([6])) + Traceback (most recent call last): + ... + AssertionError: Expected: ((set([6]),), {}) + Called with: ((set([]),), {}) + +One possibility would be for mock to copy the arguments you pass in. This +could then cause problems if you do assertions that rely on object identity +for equality. + +Here's one solution that uses the :attr:`side_effect` +functionality. If you provide a `side_effect` function for a mock then +`side_effect` will be called with the same args as the mock. This gives us an +opportunity to copy the arguments and store them for later assertions. In this +example I'm using *another* mock to store the arguments so that I can use the +mock methods for doing the assertion. Again a helper function sets this up for +me. + + >>> from copy import deepcopy + >>> from unittest.mock import Mock, patch, DEFAULT + >>> def copy_call_args(mock): + ... new_mock = Mock() + ... def side_effect(*args, **kwargs): + ... args = deepcopy(args) + ... kwargs = deepcopy(kwargs) + ... new_mock(*args, **kwargs) + ... return DEFAULT + ... mock.side_effect = side_effect + ... return new_mock + ... + >>> with patch('mymodule.frob') as mock_frob: + ... new_mock = copy_call_args(mock_frob) + ... val = set([6]) + ... mymodule.grob(val) + ... + >>> new_mock.assert_called_with(set([6])) + >>> new_mock.call_args + call(set([6])) + +`copy_call_args` is called with the mock that will be called. It returns a new +mock that we do the assertion on. The `side_effect` function makes a copy of +the args and calls our `new_mock` with the copy. + +.. note:: + + If your mock is only going to be used once there is an easier way of + checking arguments at the point they are called. You can simply do the + checking inside a `side_effect` function. + + >>> def side_effect(arg): + ... assert arg == set([6]) + ... + >>> mock = Mock(side_effect=side_effect) + >>> mock(set([6])) + >>> mock(set()) + Traceback (most recent call last): + ... + AssertionError + +An alternative approach is to create a subclass of `Mock` or `MagicMock` that +copies (using :func:`copy.deepcopy`) the arguments. +Here's an example implementation: + + >>> from copy import deepcopy + >>> class CopyingMock(MagicMock): + ... def __call__(self, *args, **kwargs): + ... args = deepcopy(args) + ... kwargs = deepcopy(kwargs) + ... return super(CopyingMock, self).__call__(*args, **kwargs) + ... + >>> c = CopyingMock(return_value=None) + >>> arg = set() + >>> c(arg) + >>> arg.add(1) + >>> c.assert_called_with(set()) + >>> c.assert_called_with(arg) + Traceback (most recent call last): + ... + AssertionError: Expected call: mock(set([1])) + Actual call: mock(set([])) + >>> c.foo + <CopyingMock name='mock.foo' id='...'> + +When you subclass `Mock` or `MagicMock` all dynamically created attributes, +and the `return_value` will use your subclass automatically. That means all +children of a `CopyingMock` will also have the type `CopyingMock`. + + +Nesting Patches +~~~~~~~~~~~~~~~ + +Using patch as a context manager is nice, but if you do multiple patches you +can end up with nested with statements indenting further and further to the +right: + + >>> class MyTest(TestCase): + ... + ... def test_foo(self): + ... with patch('mymodule.Foo') as mock_foo: + ... with patch('mymodule.Bar') as mock_bar: + ... with patch('mymodule.Spam') as mock_spam: + ... assert mymodule.Foo is mock_foo + ... assert mymodule.Bar is mock_bar + ... assert mymodule.Spam is mock_spam + ... + >>> original = mymodule.Foo + >>> MyTest('test_foo').test_foo() + >>> assert mymodule.Foo is original + +With unittest `cleanup` functions and the :ref:`start-and-stop` we can +achieve the same effect without the nested indentation. A simple helper +method, `create_patch`, puts the patch in place and returns the created mock +for us: + + >>> class MyTest(TestCase): + ... + ... def create_patch(self, name): + ... patcher = patch(name) + ... thing = patcher.start() + ... self.addCleanup(patcher.stop) + ... return thing + ... + ... def test_foo(self): + ... mock_foo = self.create_patch('mymodule.Foo') + ... mock_bar = self.create_patch('mymodule.Bar') + ... mock_spam = self.create_patch('mymodule.Spam') + ... + ... assert mymodule.Foo is mock_foo + ... assert mymodule.Bar is mock_bar + ... assert mymodule.Spam is mock_spam + ... + >>> original = mymodule.Foo + >>> MyTest('test_foo').run() + >>> assert mymodule.Foo is original + + +Mocking a dictionary with MagicMock +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +You may want to mock a dictionary, or other container object, recording all +access to it whilst having it still behave like a dictionary. + +We can do this with :class:`MagicMock`, which will behave like a dictionary, +and using :data:`~Mock.side_effect` to delegate dictionary access to a real +underlying dictionary that is under our control. + +When the `__getitem__` and `__setitem__` methods of our `MagicMock` are called +(normal dictionary access) then `side_effect` is called with the key (and in +the case of `__setitem__` the value too). We can also control what is returned. + +After the `MagicMock` has been used we can use attributes like +:data:`~Mock.call_args_list` to assert about how the dictionary was used: + + >>> my_dict = {'a': 1, 'b': 2, 'c': 3} + >>> def getitem(name): + ... return my_dict[name] + ... + >>> def setitem(name, val): + ... my_dict[name] = val + ... + >>> mock = MagicMock() + >>> mock.__getitem__.side_effect = getitem + >>> mock.__setitem__.side_effect = setitem + +.. note:: + + An alternative to using `MagicMock` is to use `Mock` and *only* provide + the magic methods you specifically want: + + >>> mock = Mock() + >>> mock.__setitem__ = Mock(side_effect=getitem) + >>> mock.__getitem__ = Mock(side_effect=setitem) + + A *third* option is to use `MagicMock` but passing in `dict` as the `spec` + (or `spec_set`) argument so that the `MagicMock` created only has + dictionary magic methods available: + + >>> mock = MagicMock(spec_set=dict) + >>> mock.__getitem__.side_effect = getitem + >>> mock.__setitem__.side_effect = setitem + +With these side effect functions in place, the `mock` will behave like a normal +dictionary but recording the access. It even raises a `KeyError` if you try +to access a key that doesn't exist. + + >>> mock['a'] + 1 + >>> mock['c'] + 3 + >>> mock['d'] + Traceback (most recent call last): + ... + KeyError: 'd' + >>> mock['b'] = 'fish' + >>> mock['d'] = 'eggs' + >>> mock['b'] + 'fish' + >>> mock['d'] + 'eggs' + +After it has been used you can make assertions about the access using the normal +mock methods and attributes: + + >>> mock.__getitem__.call_args_list + [call('a'), call('c'), call('d'), call('b'), call('d')] + >>> mock.__setitem__.call_args_list + [call('b', 'fish'), call('d', 'eggs')] + >>> my_dict + {'a': 1, 'c': 3, 'b': 'fish', 'd': 'eggs'} + + +Mock subclasses and their attributes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +There are various reasons why you might want to subclass `Mock`. One reason +might be to add helper methods. Here's a silly example: + + >>> class MyMock(MagicMock): + ... def has_been_called(self): + ... return self.called + ... + >>> mymock = MyMock(return_value=None) + >>> mymock + <MyMock id='...'> + >>> mymock.has_been_called() + False + >>> mymock() + >>> mymock.has_been_called() + True + +The standard behaviour for `Mock` instances is that attributes and the return +value mocks are of the same type as the mock they are accessed on. This ensures +that `Mock` attributes are `Mocks` and `MagicMock` attributes are `MagicMocks` +[#]_. So if you're subclassing to add helper methods then they'll also be +available on the attributes and return value mock of instances of your +subclass. + + >>> mymock.foo + <MyMock name='mock.foo' id='...'> + >>> mymock.foo.has_been_called() + False + >>> mymock.foo() + <MyMock name='mock.foo()' id='...'> + >>> mymock.foo.has_been_called() + True + +Sometimes this is inconvenient. For example, `one user +<https://code.google.com/p/mock/issues/detail?id=105>`_ is subclassing mock to +created a `Twisted adaptor +<http://twistedmatrix.com/documents/11.0.0/api/twisted.python.components.html>`_. +Having this applied to attributes too actually causes errors. + +`Mock` (in all its flavours) uses a method called `_get_child_mock` to create +these "sub-mocks" for attributes and return values. You can prevent your +subclass being used for attributes by overriding this method. The signature is +that it takes arbitrary keyword arguments (`**kwargs`) which are then passed +onto the mock constructor: + + >>> class Subclass(MagicMock): + ... def _get_child_mock(self, **kwargs): + ... return MagicMock(**kwargs) + ... + >>> mymock = Subclass() + >>> mymock.foo + <MagicMock name='mock.foo' id='...'> + >>> assert isinstance(mymock, Subclass) + >>> assert not isinstance(mymock.foo, Subclass) + >>> assert not isinstance(mymock(), Subclass) + +.. [#] An exception to this rule are the non-callable mocks. Attributes use the + callable variant because otherwise non-callable mocks couldn't have callable + methods. + + +Mocking imports with patch.dict +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +One situation where mocking can be hard is where you have a local import inside +a function. These are harder to mock because they aren't using an object from +the module namespace that we can patch out. + +Generally local imports are to be avoided. They are sometimes done to prevent +circular dependencies, for which there is *usually* a much better way to solve +the problem (refactor the code) or to prevent "up front costs" by delaying the +import. This can also be solved in better ways than an unconditional local +import (store the module as a class or module attribute and only do the import +on first use). + +That aside there is a way to use `mock` to affect the results of an import. +Importing fetches an *object* from the `sys.modules` dictionary. Note that it +fetches an *object*, which need not be a module. Importing a module for the +first time results in a module object being put in `sys.modules`, so usually +when you import something you get a module back. This need not be the case +however. + +This means you can use :func:`patch.dict` to *temporarily* put a mock in place +in `sys.modules`. Any imports whilst this patch is active will fetch the mock. +When the patch is complete (the decorated function exits, the with statement +body is complete or `patcher.stop()` is called) then whatever was there +previously will be restored safely. + +Here's an example that mocks out the 'fooble' module. + + >>> mock = Mock() + >>> with patch.dict('sys.modules', {'fooble': mock}): + ... import fooble + ... fooble.blob() + ... + <Mock name='mock.blob()' id='...'> + >>> assert 'fooble' not in sys.modules + >>> mock.blob.assert_called_once_with() + +As you can see the `import fooble` succeeds, but on exit there is no 'fooble' +left in `sys.modules`. + +This also works for the `from module import name` form: + + >>> mock = Mock() + >>> with patch.dict('sys.modules', {'fooble': mock}): + ... from fooble import blob + ... blob.blip() + ... + <Mock name='mock.blob.blip()' id='...'> + >>> mock.blob.blip.assert_called_once_with() + +With slightly more work you can also mock package imports: + + >>> mock = Mock() + >>> modules = {'package': mock, 'package.module': mock.module} + >>> with patch.dict('sys.modules', modules): + ... from package.module import fooble + ... fooble() + ... + <Mock name='mock.module.fooble()' id='...'> + >>> mock.module.fooble.assert_called_once_with() + + +Tracking order of calls and less verbose call assertions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The :class:`Mock` class allows you to track the *order* of method calls on +your mock objects through the :attr:`~Mock.method_calls` attribute. This +doesn't allow you to track the order of calls between separate mock objects, +however we can use :attr:`~Mock.mock_calls` to achieve the same effect. + +Because mocks track calls to child mocks in `mock_calls`, and accessing an +arbitrary attribute of a mock creates a child mock, we can create our separate +mocks from a parent one. Calls to those child mock will then all be recorded, +in order, in the `mock_calls` of the parent: + + >>> manager = Mock() + >>> mock_foo = manager.foo + >>> mock_bar = manager.bar + + >>> mock_foo.something() + <Mock name='mock.foo.something()' id='...'> + >>> mock_bar.other.thing() + <Mock name='mock.bar.other.thing()' id='...'> + + >>> manager.mock_calls + [call.foo.something(), call.bar.other.thing()] + +We can then assert about the calls, including the order, by comparing with +the `mock_calls` attribute on the manager mock: + + >>> expected_calls = [call.foo.something(), call.bar.other.thing()] + >>> manager.mock_calls == expected_calls + True + +If `patch` is creating, and putting in place, your mocks then you can attach +them to a manager mock using the :meth:`~Mock.attach_mock` method. After +attaching calls will be recorded in `mock_calls` of the manager. + + >>> manager = MagicMock() + >>> with patch('mymodule.Class1') as MockClass1: + ... with patch('mymodule.Class2') as MockClass2: + ... manager.attach_mock(MockClass1, 'MockClass1') + ... manager.attach_mock(MockClass2, 'MockClass2') + ... MockClass1().foo() + ... MockClass2().bar() + ... + <MagicMock name='mock.MockClass1().foo()' id='...'> + <MagicMock name='mock.MockClass2().bar()' id='...'> + >>> manager.mock_calls + [call.MockClass1(), + call.MockClass1().foo(), + call.MockClass2(), + call.MockClass2().bar()] + +If many calls have been made, but you're only interested in a particular +sequence of them then an alternative is to use the +:meth:`~Mock.assert_has_calls` method. This takes a list of calls (constructed +with the :data:`call` object). If that sequence of calls are in +:attr:`~Mock.mock_calls` then the assert succeeds. + + >>> m = MagicMock() + >>> m().foo().bar().baz() + <MagicMock name='mock().foo().bar().baz()' id='...'> + >>> m.one().two().three() + <MagicMock name='mock.one().two().three()' id='...'> + >>> calls = call.one().two().three().call_list() + >>> m.assert_has_calls(calls) + +Even though the chained call `m.one().two().three()` aren't the only calls that +have been made to the mock, the assert still succeeds. + +Sometimes a mock may have several calls made to it, and you are only interested +in asserting about *some* of those calls. You may not even care about the +order. In this case you can pass `any_order=True` to `assert_has_calls`: + + >>> m = MagicMock() + >>> m(1), m.two(2, 3), m.seven(7), m.fifty('50') + (...) + >>> calls = [call.fifty('50'), call(1), call.seven(7)] + >>> m.assert_has_calls(calls, any_order=True) + + +More complex argument matching +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Using the same basic concept as :data:`ANY` we can implement matchers to do more +complex assertions on objects used as arguments to mocks. + +Suppose we expect some object to be passed to a mock that by default +compares equal based on object identity (which is the Python default for user +defined classes). To use :meth:`~Mock.assert_called_with` we would need to pass +in the exact same object. If we are only interested in some of the attributes +of this object then we can create a matcher that will check these attributes +for us. + +You can see in this example how a 'standard' call to `assert_called_with` isn't +sufficient: + + >>> class Foo: + ... def __init__(self, a, b): + ... self.a, self.b = a, b + ... + >>> mock = Mock(return_value=None) + >>> mock(Foo(1, 2)) + >>> mock.assert_called_with(Foo(1, 2)) + Traceback (most recent call last): + ... + AssertionError: Expected: call(<__main__.Foo object at 0x...>) + Actual call: call(<__main__.Foo object at 0x...>) + +A comparison function for our `Foo` class might look something like this: + + >>> def compare(self, other): + ... if not type(self) == type(other): + ... return False + ... if self.a != other.a: + ... return False + ... if self.b != other.b: + ... return False + ... return True + ... + +And a matcher object that can use comparison functions like this for its +equality operation would look something like this: + + >>> class Matcher: + ... def __init__(self, compare, some_obj): + ... self.compare = compare + ... self.some_obj = some_obj + ... def __eq__(self, other): + ... return self.compare(self.some_obj, other) + ... + +Putting all this together: + + >>> match_foo = Matcher(compare, Foo(1, 2)) + >>> mock.assert_called_with(match_foo) + +The `Matcher` is instantiated with our compare function and the `Foo` object +we want to compare against. In `assert_called_with` the `Matcher` equality +method will be called, which compares the object the mock was called with +against the one we created our matcher with. If they match then +`assert_called_with` passes, and if they don't an `AssertionError` is raised: + + >>> match_wrong = Matcher(compare, Foo(3, 4)) + >>> mock.assert_called_with(match_wrong) + Traceback (most recent call last): + ... + AssertionError: Expected: ((<Matcher object at 0x...>,), {}) + Called with: ((<Foo object at 0x...>,), {}) + +With a bit of tweaking you could have the comparison function raise the +`AssertionError` directly and provide a more useful failure message. + +As of version 1.5, the Python testing library `PyHamcrest +<http://pypi.python.org/pypi/PyHamcrest>`_ provides similar functionality, +that may be useful here, in the form of its equality matcher +(`hamcrest.library.integration.match_equality +<http://packages.python.org/PyHamcrest/integration.html#hamcrest.library.integration.match_equality>`_). diff --git a/Doc/library/unittest.mock.rst b/Doc/library/unittest.mock.rst new file mode 100644 index 0000000..f805df3 --- /dev/null +++ b/Doc/library/unittest.mock.rst @@ -0,0 +1,2244 @@ + +:mod:`unittest.mock` --- mock object library +============================================ + +.. module:: unittest.mock + :synopsis: Mock object library. +.. moduleauthor:: Michael Foord <michael@python.org> +.. currentmodule:: unittest.mock + +.. versionadded:: 3.3 + +:mod:`unittest.mock` is a library for testing in Python. It allows you to +replace parts of your system under test with mock objects and make assertions +about how they have been used. + +`unittest.mock` provides a core :class:`Mock` class removing the need to +create a host of stubs throughout your test suite. After performing an +action, you can make assertions about which methods / attributes were used +and arguments they were called with. You can also specify return values and +set needed attributes in the normal way. + +Additionally, mock provides a :func:`patch` decorator that handles patching +module and class level attributes within the scope of a test, along with +:const:`sentinel` for creating unique objects. See the `quick guide`_ for +some examples of how to use :class:`Mock`, :class:`MagicMock` and +:func:`patch`. + +Mock is very easy to use and is designed for use with :mod:`unittest`. Mock +is based on the 'action -> assertion' pattern instead of `'record -> replay'` +used by many mocking frameworks. + +There is a backport of `unittest.mock` for earlier versions of Python, +available as `mock on PyPI <http://pypi.python.org/pypi/mock>`_. + +**Source code:** :source:`Lib/unittest/mock.py` + + +Quick Guide +----------- + +:class:`Mock` and :class:`MagicMock` objects create all attributes and +methods as you access them and store details of how they have been used. You +can configure them, to specify return values or limit what attributes are +available, and then make assertions about how they have been used: + + >>> from unittest.mock import MagicMock + >>> thing = ProductionClass() + >>> thing.method = MagicMock(return_value=3) + >>> thing.method(3, 4, 5, key='value') + 3 + >>> thing.method.assert_called_with(3, 4, 5, key='value') + +:attr:`side_effect` allows you to perform side effects, including raising an +exception when a mock is called: + + >>> mock = Mock(side_effect=KeyError('foo')) + >>> mock() + Traceback (most recent call last): + ... + KeyError: 'foo' + + >>> values = {'a': 1, 'b': 2, 'c': 3} + >>> def side_effect(arg): + ... return values[arg] + ... + >>> mock.side_effect = side_effect + >>> mock('a'), mock('b'), mock('c') + (1, 2, 3) + >>> mock.side_effect = [5, 4, 3, 2, 1] + >>> mock(), mock(), mock() + (5, 4, 3) + +Mock has many other ways you can configure it and control its behaviour. For +example the `spec` argument configures the mock to take its specification +from another object. Attempting to access attributes or methods on the mock +that don't exist on the spec will fail with an `AttributeError`. + +The :func:`patch` decorator / context manager makes it easy to mock classes or +objects in a module under test. The object you specify will be replaced with a +mock (or other object) during the test and restored when the test ends: + + >>> from unittest.mock import patch + >>> @patch('module.ClassName2') + ... @patch('module.ClassName1') + ... def test(MockClass1, MockClass2): + ... module.ClassName1() + ... module.ClassName2() + ... assert MockClass1 is module.ClassName1 + ... assert MockClass2 is module.ClassName2 + ... assert MockClass1.called + ... assert MockClass2.called + ... + >>> test() + +.. note:: + + When you nest patch decorators the mocks are passed in to the decorated + function in the same order they applied (the normal *python* order that + decorators are applied). This means from the bottom up, so in the example + above the mock for `module.ClassName1` is passed in first. + + With `patch` it matters that you patch objects in the namespace where they + are looked up. This is normally straightforward, but for a quick guide + read :ref:`where to patch <where-to-patch>`. + +As well as a decorator `patch` can be used as a context manager in a with +statement: + + >>> with patch.object(ProductionClass, 'method', return_value=None) as mock_method: + ... thing = ProductionClass() + ... thing.method(1, 2, 3) + ... + >>> mock_method.assert_called_once_with(1, 2, 3) + + +There is also :func:`patch.dict` for setting values in a dictionary just +during a scope and restoring the dictionary to its original state when the test +ends: + + >>> foo = {'key': 'value'} + >>> original = foo.copy() + >>> with patch.dict(foo, {'newkey': 'newvalue'}, clear=True): + ... assert foo == {'newkey': 'newvalue'} + ... + >>> assert foo == original + +Mock supports the mocking of Python :ref:`magic methods <magic-methods>`. The +easiest way of using magic methods is with the :class:`MagicMock` class. It +allows you to do things like: + + >>> mock = MagicMock() + >>> mock.__str__.return_value = 'foobarbaz' + >>> str(mock) + 'foobarbaz' + >>> mock.__str__.assert_called_with() + +Mock allows you to assign functions (or other Mock instances) to magic methods +and they will be called appropriately. The `MagicMock` class is just a Mock +variant that has all of the magic methods pre-created for you (well, all the +useful ones anyway). + +The following is an example of using magic methods with the ordinary Mock +class: + + >>> mock = Mock() + >>> mock.__str__ = Mock(return_value='wheeeeee') + >>> str(mock) + 'wheeeeee' + +For ensuring that the mock objects in your tests have the same api as the +objects they are replacing, you can use :ref:`auto-speccing <auto-speccing>`. +Auto-speccing can be done through the `autospec` argument to patch, or the +:func:`create_autospec` function. Auto-speccing creates mock objects that +have the same attributes and methods as the objects they are replacing, and +any functions and methods (including constructors) have the same call +signature as the real object. + +This ensures that your mocks will fail in the same way as your production +code if they are used incorrectly: + + >>> from unittest.mock import create_autospec + >>> def function(a, b, c): + ... pass + ... + >>> mock_function = create_autospec(function, return_value='fishy') + >>> mock_function(1, 2, 3) + 'fishy' + >>> mock_function.assert_called_once_with(1, 2, 3) + >>> mock_function('wrong arguments') + Traceback (most recent call last): + ... + TypeError: <lambda>() takes exactly 3 arguments (1 given) + +`create_autospec` can also be used on classes, where it copies the signature of +the `__init__` method, and on callable objects where it copies the signature of +the `__call__` method. + + + +The Mock Class +-------------- + + +`Mock` is a flexible mock object intended to replace the use of stubs and +test doubles throughout your code. Mocks are callable and create attributes as +new mocks when you access them [#]_. Accessing the same attribute will always +return the same mock. Mocks record how you use them, allowing you to make +assertions about what your code has done to them. + +:class:`MagicMock` is a subclass of `Mock` with all the magic methods +pre-created and ready to use. There are also non-callable variants, useful +when you are mocking out objects that aren't callable: +:class:`NonCallableMock` and :class:`NonCallableMagicMock` + +The :func:`patch` decorators makes it easy to temporarily replace classes +in a particular module with a `Mock` object. By default `patch` will create +a `MagicMock` for you. You can specify an alternative class of `Mock` using +the `new_callable` argument to `patch`. + + +.. class:: Mock(spec=None, side_effect=None, return_value=DEFAULT, wraps=None, name=None, spec_set=None, **kwargs) + + Create a new `Mock` object. `Mock` takes several optional arguments + that specify the behaviour of the Mock object: + + * `spec`: This can be either a list of strings or an existing object (a + class or instance) that acts as the specification for the mock object. If + you pass in an object then a list of strings is formed by calling dir on + the object (excluding unsupported magic attributes and methods). + Accessing any attribute not in this list will raise an `AttributeError`. + + If `spec` is an object (rather than a list of strings) then + :attr:`~instance.__class__` returns the class of the spec object. This + allows mocks to pass `isinstance` tests. + + * `spec_set`: A stricter variant of `spec`. If used, attempting to *set* + or get an attribute on the mock that isn't on the object passed as + `spec_set` will raise an `AttributeError`. + + * `side_effect`: A function to be called whenever the Mock is called. See + the :attr:`~Mock.side_effect` attribute. Useful for raising exceptions or + dynamically changing return values. The function is called with the same + arguments as the mock, and unless it returns :data:`DEFAULT`, the return + value of this function is used as the return value. + + Alternatively `side_effect` can be an exception class or instance. In + this case the exception will be raised when the mock is called. + + If `side_effect` is an iterable then each call to the mock will return + the next value from the iterable. + + A `side_effect` can be cleared by setting it to `None`. + + * `return_value`: The value returned when the mock is called. By default + this is a new Mock (created on first access). See the + :attr:`return_value` attribute. + + * `wraps`: Item for the mock object to wrap. If `wraps` is not None then + calling the Mock will pass the call through to the wrapped object + (returning the real result). Attribute access on the mock will return a + Mock object that wraps the corresponding attribute of the wrapped + object (so attempting to access an attribute that doesn't exist will + raise an `AttributeError`). + + If the mock has an explicit `return_value` set then calls are not passed + to the wrapped object and the `return_value` is returned instead. + + * `name`: If the mock has a name then it will be used in the repr of the + mock. This can be useful for debugging. The name is propagated to child + mocks. + + Mocks can also be called with arbitrary keyword arguments. These will be + used to set attributes on the mock after it is created. See the + :meth:`configure_mock` method for details. + + + .. method:: assert_called_with(*args, **kwargs) + + This method is a convenient way of asserting that calls are made in a + particular way: + + >>> mock = Mock() + >>> mock.method(1, 2, 3, test='wow') + <Mock name='mock.method()' id='...'> + >>> mock.method.assert_called_with(1, 2, 3, test='wow') + + + .. method:: assert_called_once_with(*args, **kwargs) + + Assert that the mock was called exactly once and with the specified + arguments. + + >>> mock = Mock(return_value=None) + >>> mock('foo', bar='baz') + >>> mock.assert_called_once_with('foo', bar='baz') + >>> mock('foo', bar='baz') + >>> mock.assert_called_once_with('foo', bar='baz') + Traceback (most recent call last): + ... + AssertionError: Expected 'mock' to be called once. Called 2 times. + + + .. method:: assert_any_call(*args, **kwargs) + + assert the mock has been called with the specified arguments. + + The assert passes if the mock has *ever* been called, unlike + :meth:`assert_called_with` and :meth:`assert_called_once_with` that + only pass if the call is the most recent one. + + >>> mock = Mock(return_value=None) + >>> mock(1, 2, arg='thing') + >>> mock('some', 'thing', 'else') + >>> mock.assert_any_call(1, 2, arg='thing') + + + .. method:: assert_has_calls(calls, any_order=False) + + assert the mock has been called with the specified calls. + The `mock_calls` list is checked for the calls. + + If `any_order` is false (the default) then the calls must be + sequential. There can be extra calls before or after the + specified calls. + + If `any_order` is true then the calls can be in any order, but + they must all appear in :attr:`mock_calls`. + + >>> mock = Mock(return_value=None) + >>> mock(1) + >>> mock(2) + >>> mock(3) + >>> mock(4) + >>> calls = [call(2), call(3)] + >>> mock.assert_has_calls(calls) + >>> calls = [call(4), call(2), call(3)] + >>> mock.assert_has_calls(calls, any_order=True) + + + .. method:: reset_mock() + + The reset_mock method resets all the call attributes on a mock object: + + >>> mock = Mock(return_value=None) + >>> mock('hello') + >>> mock.called + True + >>> mock.reset_mock() + >>> mock.called + False + + This can be useful where you want to make a series of assertions that + reuse the same object. Note that `reset_mock` *doesn't* clear the + return value, :attr:`side_effect` or any child attributes you have + set using normal assignment. Child mocks and the return value mock + (if any) are reset as well. + + + .. method:: mock_add_spec(spec, spec_set=False) + + Add a spec to a mock. `spec` can either be an object or a + list of strings. Only attributes on the `spec` can be fetched as + attributes from the mock. + + If `spec_set` is `True` then only attributes on the spec can be set. + + + .. method:: attach_mock(mock, attribute) + + Attach a mock as an attribute of this one, replacing its name and + parent. Calls to the attached mock will be recorded in the + :attr:`method_calls` and :attr:`mock_calls` attributes of this one. + + + .. method:: configure_mock(**kwargs) + + Set attributes on the mock through keyword arguments. + + Attributes plus return values and side effects can be set on child + mocks using standard dot notation and unpacking a dictionary in the + method call: + + >>> mock = Mock() + >>> attrs = {'method.return_value': 3, 'other.side_effect': KeyError} + >>> mock.configure_mock(**attrs) + >>> mock.method() + 3 + >>> mock.other() + Traceback (most recent call last): + ... + KeyError + + The same thing can be achieved in the constructor call to mocks: + + >>> attrs = {'method.return_value': 3, 'other.side_effect': KeyError} + >>> mock = Mock(some_attribute='eggs', **attrs) + >>> mock.some_attribute + 'eggs' + >>> mock.method() + 3 + >>> mock.other() + Traceback (most recent call last): + ... + KeyError + + `configure_mock` exists to make it easier to do configuration + after the mock has been created. + + + .. method:: __dir__() + + `Mock` objects limit the results of `dir(some_mock)` to useful results. + For mocks with a `spec` this includes all the permitted attributes + for the mock. + + See :data:`FILTER_DIR` for what this filtering does, and how to + switch it off. + + + .. method:: _get_child_mock(**kw) + + Create the child mocks for attributes and return value. + By default child mocks will be the same type as the parent. + Subclasses of Mock may want to override this to customize the way + child mocks are made. + + For non-callable mocks the callable variant will be used (rather than + any custom subclass). + + + .. attribute:: called + + A boolean representing whether or not the mock object has been called: + + >>> mock = Mock(return_value=None) + >>> mock.called + False + >>> mock() + >>> mock.called + True + + .. attribute:: call_count + + An integer telling you how many times the mock object has been called: + + >>> mock = Mock(return_value=None) + >>> mock.call_count + 0 + >>> mock() + >>> mock() + >>> mock.call_count + 2 + + + .. attribute:: return_value + + Set this to configure the value returned by calling the mock: + + >>> mock = Mock() + >>> mock.return_value = 'fish' + >>> mock() + 'fish' + + The default return value is a mock object and you can configure it in + the normal way: + + >>> mock = Mock() + >>> mock.return_value.attribute = sentinel.Attribute + >>> mock.return_value() + <Mock name='mock()()' id='...'> + >>> mock.return_value.assert_called_with() + + `return_value` can also be set in the constructor: + + >>> mock = Mock(return_value=3) + >>> mock.return_value + 3 + >>> mock() + 3 + + + .. attribute:: side_effect + + This can either be a function to be called when the mock is called, + or an exception (class or instance) to be raised. + + If you pass in a function it will be called with same arguments as the + mock and unless the function returns the :data:`DEFAULT` singleton the + call to the mock will then return whatever the function returns. If the + function returns :data:`DEFAULT` then the mock will return its normal + value (from the :attr:`return_value`). + + An example of a mock that raises an exception (to test exception + handling of an API): + + >>> mock = Mock() + >>> mock.side_effect = Exception('Boom!') + >>> mock() + Traceback (most recent call last): + ... + Exception: Boom! + + Using `side_effect` to return a sequence of values: + + >>> mock = Mock() + >>> mock.side_effect = [3, 2, 1] + >>> mock(), mock(), mock() + (3, 2, 1) + + The `side_effect` function is called with the same arguments as the + mock (so it is wise for it to take arbitrary args and keyword + arguments) and whatever it returns is used as the return value for + the call. The exception is if `side_effect` returns :data:`DEFAULT`, + in which case the normal :attr:`return_value` is used. + + >>> mock = Mock(return_value=3) + >>> def side_effect(*args, **kwargs): + ... return DEFAULT + ... + >>> mock.side_effect = side_effect + >>> mock() + 3 + + `side_effect` can be set in the constructor. Here's an example that + adds one to the value the mock is called with and returns it: + + >>> side_effect = lambda value: value + 1 + >>> mock = Mock(side_effect=side_effect) + >>> mock(3) + 4 + >>> mock(-8) + -7 + + Setting `side_effect` to `None` clears it: + + >>> m = Mock(side_effect=KeyError, return_value=3) + >>> m() + Traceback (most recent call last): + ... + KeyError + >>> m.side_effect = None + >>> m() + 3 + + + .. attribute:: call_args + + This is either `None` (if the mock hasn't been called), or the + arguments that the mock was last called with. This will be in the + form of a tuple: the first member is any ordered arguments the mock + was called with (or an empty tuple) and the second member is any + keyword arguments (or an empty dictionary). + + >>> mock = Mock(return_value=None) + >>> print mock.call_args + None + >>> mock() + >>> mock.call_args + call() + >>> mock.call_args == () + True + >>> mock(3, 4) + >>> mock.call_args + call(3, 4) + >>> mock.call_args == ((3, 4),) + True + >>> mock(3, 4, 5, key='fish', next='w00t!') + >>> mock.call_args + call(3, 4, 5, key='fish', next='w00t!') + + `call_args`, along with members of the lists :attr:`call_args_list`, + :attr:`method_calls` and :attr:`mock_calls` are :data:`call` objects. + These are tuples, so they can be unpacked to get at the individual + arguments and make more complex assertions. See + :ref:`calls as tuples <calls-as-tuples>`. + + + .. attribute:: call_args_list + + This is a list of all the calls made to the mock object in sequence + (so the length of the list is the number of times it has been + called). Before any calls have been made it is an empty list. The + :data:`call` object can be used for conveniently constructing lists of + calls to compare with `call_args_list`. + + >>> mock = Mock(return_value=None) + >>> mock() + >>> mock(3, 4) + >>> mock(key='fish', next='w00t!') + >>> mock.call_args_list + [call(), call(3, 4), call(key='fish', next='w00t!')] + >>> expected = [(), ((3, 4),), ({'key': 'fish', 'next': 'w00t!'},)] + >>> mock.call_args_list == expected + True + + Members of `call_args_list` are :data:`call` objects. These can be + unpacked as tuples to get at the individual arguments. See + :ref:`calls as tuples <calls-as-tuples>`. + + + .. attribute:: method_calls + + As well as tracking calls to themselves, mocks also track calls to + methods and attributes, and *their* methods and attributes: + + >>> mock = Mock() + >>> mock.method() + <Mock name='mock.method()' id='...'> + >>> mock.property.method.attribute() + <Mock name='mock.property.method.attribute()' id='...'> + >>> mock.method_calls + [call.method(), call.property.method.attribute()] + + Members of `method_calls` are :data:`call` objects. These can be + unpacked as tuples to get at the individual arguments. See + :ref:`calls as tuples <calls-as-tuples>`. + + + .. attribute:: mock_calls + + `mock_calls` records *all* calls to the mock object, its methods, magic + methods *and* return value mocks. + + >>> mock = MagicMock() + >>> result = mock(1, 2, 3) + >>> mock.first(a=3) + <MagicMock name='mock.first()' id='...'> + >>> mock.second() + <MagicMock name='mock.second()' id='...'> + >>> int(mock) + 1 + >>> result(1) + <MagicMock name='mock()()' id='...'> + >>> expected = [call(1, 2, 3), call.first(a=3), call.second(), + ... call.__int__(), call()(1)] + >>> mock.mock_calls == expected + True + + Members of `mock_calls` are :data:`call` objects. These can be + unpacked as tuples to get at the individual arguments. See + :ref:`calls as tuples <calls-as-tuples>`. + + + .. attribute:: __class__ + + Normally the `__class__` attribute of an object will return its type. + For a mock object with a `spec` `__class__` returns the spec class + instead. This allows mock objects to pass `isinstance` tests for the + object they are replacing / masquerading as: + + >>> mock = Mock(spec=3) + >>> isinstance(mock, int) + True + + `__class__` is assignable to, this allows a mock to pass an + `isinstance` check without forcing you to use a spec: + + >>> mock = Mock() + >>> mock.__class__ = dict + >>> isinstance(mock, dict) + True + +.. class:: NonCallableMock(spec=None, wraps=None, name=None, spec_set=None, **kwargs) + + A non-callable version of `Mock`. The constructor parameters have the same + meaning of `Mock`, with the exception of `return_value` and `side_effect` + which have no meaning on a non-callable mock. + +Mock objects that use a class or an instance as a `spec` or `spec_set` are able +to pass `isinstance` tests: + + >>> mock = Mock(spec=SomeClass) + >>> isinstance(mock, SomeClass) + True + >>> mock = Mock(spec_set=SomeClass()) + >>> isinstance(mock, SomeClass) + True + +The `Mock` classes have support for mocking magic methods. See :ref:`magic +methods <magic-methods>` for the full details. + +The mock classes and the :func:`patch` decorators all take arbitrary keyword +arguments for configuration. For the `patch` decorators the keywords are +passed to the constructor of the mock being created. The keyword arguments +are for configuring attributes of the mock: + + >>> m = MagicMock(attribute=3, other='fish') + >>> m.attribute + 3 + >>> m.other + 'fish' + +The return value and side effect of child mocks can be set in the same way, +using dotted notation. As you can't use dotted names directly in a call you +have to create a dictionary and unpack it using `**`: + + >>> attrs = {'method.return_value': 3, 'other.side_effect': KeyError} + >>> mock = Mock(some_attribute='eggs', **attrs) + >>> mock.some_attribute + 'eggs' + >>> mock.method() + 3 + >>> mock.other() + Traceback (most recent call last): + ... + KeyError + + +.. class:: PropertyMock(*args, **kwargs) + + A mock intended to be used as a property, or other descriptor, on a class. + `PropertyMock` provides `__get__` and `__set__` methods so you can specify + a return value when it is fetched. + + Fetching a `PropertyMock` instance from an object calls the mock, with + no args. Setting it calls the mock with the value being set. + + >>> class Foo: + ... @property + ... def foo(self): + ... return 'something' + ... @foo.setter + ... def foo(self, value): + ... pass + ... + >>> with patch('__main__.Foo.foo', new_callable=PropertyMock) as mock_foo: + ... mock_foo.return_value = 'mockity-mock' + ... this_foo = Foo() + ... print this_foo.foo + ... this_foo.foo = 6 + ... + mockity-mock + >>> mock_foo.mock_calls + [call(), call(6)] + +Because of the way mock attributes are stored you can't directly attach a +`PropertyMock` to a mock object. Instead you can attach it to the mock type +object:: + + >>> m = MagicMock() + >>> p = PropertyMock(return_value=3) + >>> type(m).foo = p + >>> m.foo + 3 + >>> p.assert_called_once_with() + + +Calling +~~~~~~~ + +Mock objects are callable. The call will return the value set as the +:attr:`~Mock.return_value` attribute. The default return value is a new Mock +object; it is created the first time the return value is accessed (either +explicitly or by calling the Mock) - but it is stored and the same one +returned each time. + +Calls made to the object will be recorded in the attributes +like :attr:`~Mock.call_args` and :attr:`~Mock.call_args_list`. + +If :attr:`~Mock.side_effect` is set then it will be called after the call has +been recorded, so if `side_effect` raises an exception the call is still +recorded. + +The simplest way to make a mock raise an exception when called is to make +:attr:`~Mock.side_effect` an exception class or instance: + + >>> m = MagicMock(side_effect=IndexError) + >>> m(1, 2, 3) + Traceback (most recent call last): + ... + IndexError + >>> m.mock_calls + [call(1, 2, 3)] + >>> m.side_effect = KeyError('Bang!') + >>> m('two', 'three', 'four') + Traceback (most recent call last): + ... + KeyError: 'Bang!' + >>> m.mock_calls + [call(1, 2, 3), call('two', 'three', 'four')] + +If `side_effect` is a function then whatever that function returns is what +calls to the mock return. The `side_effect` function is called with the +same arguments as the mock. This allows you to vary the return value of the +call dynamically, based on the input: + + >>> def side_effect(value): + ... return value + 1 + ... + >>> m = MagicMock(side_effect=side_effect) + >>> m(1) + 2 + >>> m(2) + 3 + >>> m.mock_calls + [call(1), call(2)] + +If you want the mock to still return the default return value (a new mock), or +any set return value, then there are two ways of doing this. Either return +`mock.return_value` from inside `side_effect`, or return :data:`DEFAULT`: + + >>> m = MagicMock() + >>> def side_effect(*args, **kwargs): + ... return m.return_value + ... + >>> m.side_effect = side_effect + >>> m.return_value = 3 + >>> m() + 3 + >>> def side_effect(*args, **kwargs): + ... return DEFAULT + ... + >>> m.side_effect = side_effect + >>> m() + 3 + +To remove a `side_effect`, and return to the default behaviour, set the +`side_effect` to `None`: + + >>> m = MagicMock(return_value=6) + >>> def side_effect(*args, **kwargs): + ... return 3 + ... + >>> m.side_effect = side_effect + >>> m() + 3 + >>> m.side_effect = None + >>> m() + 6 + +The `side_effect` can also be any iterable object. Repeated calls to the mock +will return values from the iterable (until the iterable is exhausted and +a `StopIteration` is raised): + + >>> m = MagicMock(side_effect=[1, 2, 3]) + >>> m() + 1 + >>> m() + 2 + >>> m() + 3 + >>> m() + Traceback (most recent call last): + ... + StopIteration + +If any members of the iterable are exceptions they will be raised instead of +returned:: + + >>> iterable = (33, ValueError, 66) + >>> m = MagicMock(side_effect=iterable) + >>> m() + 33 + >>> m() + Traceback (most recent call last): + ... + ValueError + >>> m() + 66 + + +.. _deleting-attributes: + +Deleting Attributes +~~~~~~~~~~~~~~~~~~~ + +Mock objects create attributes on demand. This allows them to pretend to be +objects of any type. + +You may want a mock object to return `False` to a `hasattr` call, or raise an +`AttributeError` when an attribute is fetched. You can do this by providing +an object as a `spec` for a mock, but that isn't always convenient. + +You "block" attributes by deleting them. Once deleted, accessing an attribute +will raise an `AttributeError`. + + >>> mock = MagicMock() + >>> hasattr(mock, 'm') + True + >>> del mock.m + >>> hasattr(mock, 'm') + False + >>> del mock.f + >>> mock.f + Traceback (most recent call last): + ... + AttributeError: f + + +Mock names and the name attribute +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Since "name" is an argument to the :class:`Mock` constructor, if you want your +mock object to have a "name" attribute you can't just pass it in at creation +time. There are two alternatives. One option is to use +:meth:`~Mock.configure_mock`:: + + >>> mock = MagicMock() + >>> mock.configure_mock(name='my_name') + >>> mock.name + 'my_name' + +A simpler option is to simply set the "name" attribute after mock creation:: + + >>> mock = MagicMock() + >>> mock.name = "foo" + + +Attaching Mocks as Attributes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When you attach a mock as an attribute of another mock (or as the return +value) it becomes a "child" of that mock. Calls to the child are recorded in +the :attr:`~Mock.method_calls` and :attr:`~Mock.mock_calls` attributes of the +parent. This is useful for configuring child mocks and then attaching them to +the parent, or for attaching mocks to a parent that records all calls to the +children and allows you to make assertions about the order of calls between +mocks: + + >>> parent = MagicMock() + >>> child1 = MagicMock(return_value=None) + >>> child2 = MagicMock(return_value=None) + >>> parent.child1 = child1 + >>> parent.child2 = child2 + >>> child1(1) + >>> child2(2) + >>> parent.mock_calls + [call.child1(1), call.child2(2)] + +The exception to this is if the mock has a name. This allows you to prevent +the "parenting" if for some reason you don't want it to happen. + + >>> mock = MagicMock() + >>> not_a_child = MagicMock(name='not-a-child') + >>> mock.attribute = not_a_child + >>> mock.attribute() + <MagicMock name='not-a-child()' id='...'> + >>> mock.mock_calls + [] + +Mocks created for you by :func:`patch` are automatically given names. To +attach mocks that have names to a parent you use the :meth:`~Mock.attach_mock` +method: + + >>> thing1 = object() + >>> thing2 = object() + >>> parent = MagicMock() + >>> with patch('__main__.thing1', return_value=None) as child1: + ... with patch('__main__.thing2', return_value=None) as child2: + ... parent.attach_mock(child1, 'child1') + ... parent.attach_mock(child2, 'child2') + ... child1('one') + ... child2('two') + ... + >>> parent.mock_calls + [call.child1('one'), call.child2('two')] + + +.. [#] The only exceptions are magic methods and attributes (those that have + leading and trailing double underscores). Mock doesn't create these but + instead raises an ``AttributeError``. This is because the interpreter + will often implicitly request these methods, and gets *very* confused to + get a new Mock object when it expects a magic method. If you need magic + method support see :ref:`magic methods <magic-methods>`. + + +The patchers +------------ + +The patch decorators are used for patching objects only within the scope of +the function they decorate. They automatically handle the unpatching for you, +even if exceptions are raised. All of these functions can also be used in with +statements or as class decorators. + + +patch +~~~~~ + +.. note:: + + `patch` is straightforward to use. The key is to do the patching in the + right namespace. See the section `where to patch`_. + +.. function:: patch(target, new=DEFAULT, spec=None, create=False, spec_set=None, autospec=None, new_callable=None, **kwargs) + + `patch` acts as a function decorator, class decorator or a context + manager. Inside the body of the function or with statement, the `target` + is patched with a `new` object. When the function/with statement exits + the patch is undone. + + If `new` is omitted, then the target is replaced with a + :class:`MagicMock`. If `patch` is used as a decorator and `new` is + omitted, the created mock is passed in as an extra argument to the + decorated function. If `patch` is used as a context manager the created + mock is returned by the context manager. + + `target` should be a string in the form `'package.module.ClassName'`. The + `target` is imported and the specified object replaced with the `new` + object, so the `target` must be importable from the environment you are + calling `patch` from. The target is imported when the decorated function + is executed, not at decoration time. + + The `spec` and `spec_set` keyword arguments are passed to the `MagicMock` + if patch is creating one for you. + + In addition you can pass `spec=True` or `spec_set=True`, which causes + patch to pass in the object being mocked as the spec/spec_set object. + + `new_callable` allows you to specify a different class, or callable object, + that will be called to create the `new` object. By default `MagicMock` is + used. + + A more powerful form of `spec` is `autospec`. If you set `autospec=True` + then the mock with be created with a spec from the object being replaced. + All attributes of the mock will also have the spec of the corresponding + attribute of the object being replaced. Methods and functions being mocked + will have their arguments checked and will raise a `TypeError` if they are + called with the wrong signature. For mocks + replacing a class, their return value (the 'instance') will have the same + spec as the class. See the :func:`create_autospec` function and + :ref:`auto-speccing`. + + Instead of `autospec=True` you can pass `autospec=some_object` to use an + arbitrary object as the spec instead of the one being replaced. + + By default `patch` will fail to replace attributes that don't exist. If + you pass in `create=True`, and the attribute doesn't exist, patch will + create the attribute for you when the patched function is called, and + delete it again afterwards. This is useful for writing tests against + attributes that your production code creates at runtime. It is off by + default because it can be dangerous. With it switched on you can write + passing tests against APIs that don't actually exist! + + Patch can be used as a `TestCase` class decorator. It works by + decorating each test method in the class. This reduces the boilerplate + code when your test methods share a common patchings set. `patch` finds + tests by looking for method names that start with `patch.TEST_PREFIX`. + By default this is `test`, which matches the way `unittest` finds tests. + You can specify an alternative prefix by setting `patch.TEST_PREFIX`. + + Patch can be used as a context manager, with the with statement. Here the + patching applies to the indented block after the with statement. If you + use "as" then the patched object will be bound to the name after the + "as"; very useful if `patch` is creating a mock object for you. + + `patch` takes arbitrary keyword arguments. These will be passed to + the `Mock` (or `new_callable`) on construction. + + `patch.dict(...)`, `patch.multiple(...)` and `patch.object(...)` are + available for alternate use-cases. + +`patch` as function decorator, creating the mock for you and passing it into +the decorated function: + + >>> @patch('__main__.SomeClass') + ... def function(normal_argument, mock_class): + ... print(mock_class is SomeClass) + ... + >>> function(None) + True + +Patching a class replaces the class with a `MagicMock` *instance*. If the +class is instantiated in the code under test then it will be the +:attr:`~Mock.return_value` of the mock that will be used. + +If the class is instantiated multiple times you could use +:attr:`~Mock.side_effect` to return a new mock each time. Alternatively you +can set the `return_value` to be anything you want. + +To configure return values on methods of *instances* on the patched class +you must do this on the `return_value`. For example: + + >>> class Class: + ... def method(self): + ... pass + ... + >>> with patch('__main__.Class') as MockClass: + ... instance = MockClass.return_value + ... instance.method.return_value = 'foo' + ... assert Class() is instance + ... assert Class().method() == 'foo' + ... + +If you use `spec` or `spec_set` and `patch` is replacing a *class*, then the +return value of the created mock will have the same spec. + + >>> Original = Class + >>> patcher = patch('__main__.Class', spec=True) + >>> MockClass = patcher.start() + >>> instance = MockClass() + >>> assert isinstance(instance, Original) + >>> patcher.stop() + +The `new_callable` argument is useful where you want to use an alternative +class to the default :class:`MagicMock` for the created mock. For example, if +you wanted a :class:`NonCallableMock` to be used: + + >>> thing = object() + >>> with patch('__main__.thing', new_callable=NonCallableMock) as mock_thing: + ... assert thing is mock_thing + ... thing() + ... + Traceback (most recent call last): + ... + TypeError: 'NonCallableMock' object is not callable + +Another use case might be to replace an object with a `io.StringIO` instance: + + >>> from io import StringIO + >>> def foo(): + ... print 'Something' + ... + >>> @patch('sys.stdout', new_callable=StringIO) + ... def test(mock_stdout): + ... foo() + ... assert mock_stdout.getvalue() == 'Something\n' + ... + >>> test() + +When `patch` is creating a mock for you, it is common that the first thing +you need to do is to configure the mock. Some of that configuration can be done +in the call to patch. Any arbitrary keywords you pass into the call will be +used to set attributes on the created mock: + + >>> patcher = patch('__main__.thing', first='one', second='two') + >>> mock_thing = patcher.start() + >>> mock_thing.first + 'one' + >>> mock_thing.second + 'two' + +As well as attributes on the created mock attributes, like the +:attr:`~Mock.return_value` and :attr:`~Mock.side_effect`, of child mocks can +also be configured. These aren't syntactically valid to pass in directly as +keyword arguments, but a dictionary with these as keys can still be expanded +into a `patch` call using `**`: + + >>> config = {'method.return_value': 3, 'other.side_effect': KeyError} + >>> patcher = patch('__main__.thing', **config) + >>> mock_thing = patcher.start() + >>> mock_thing.method() + 3 + >>> mock_thing.other() + Traceback (most recent call last): + ... + KeyError + + +patch.object +~~~~~~~~~~~~ + +.. function:: patch.object(target, attribute, new=DEFAULT, spec=None, create=False, spec_set=None, autospec=None, new_callable=None, **kwargs) + + patch the named member (`attribute`) on an object (`target`) with a mock + object. + + `patch.object` can be used as a decorator, class decorator or a context + manager. Arguments `new`, `spec`, `create`, `spec_set`, `autospec` and + `new_callable` have the same meaning as for `patch`. Like `patch`, + `patch.object` takes arbitrary keyword arguments for configuring the mock + object it creates. + + When used as a class decorator `patch.object` honours `patch.TEST_PREFIX` + for choosing which methods to wrap. + +You can either call `patch.object` with three arguments or two arguments. The +three argument form takes the object to be patched, the attribute name and the +object to replace the attribute with. + +When calling with the two argument form you omit the replacement object, and a +mock is created for you and passed in as an extra argument to the decorated +function: + + >>> @patch.object(SomeClass, 'class_method') + ... def test(mock_method): + ... SomeClass.class_method(3) + ... mock_method.assert_called_with(3) + ... + >>> test() + +`spec`, `create` and the other arguments to `patch.object` have the same +meaning as they do for `patch`. + + +patch.dict +~~~~~~~~~~ + +.. function:: patch.dict(in_dict, values=(), clear=False, **kwargs) + + Patch a dictionary, or dictionary like object, and restore the dictionary + to its original state after the test. + + `in_dict` can be a dictionary or a mapping like container. If it is a + mapping then it must at least support getting, setting and deleting items + plus iterating over keys. + + `in_dict` can also be a string specifying the name of the dictionary, which + will then be fetched by importing it. + + `values` can be a dictionary of values to set in the dictionary. `values` + can also be an iterable of `(key, value)` pairs. + + If `clear` is true then the dictionary will be cleared before the new + values are set. + + `patch.dict` can also be called with arbitrary keyword arguments to set + values in the dictionary. + + `patch.dict` can be used as a context manager, decorator or class + decorator. When used as a class decorator `patch.dict` honours + `patch.TEST_PREFIX` for choosing which methods to wrap. + +`patch.dict` can be used to add members to a dictionary, or simply let a test +change a dictionary, and ensure the dictionary is restored when the test +ends. + + >>> foo = {} + >>> with patch.dict(foo, {'newkey': 'newvalue'}): + ... assert foo == {'newkey': 'newvalue'} + ... + >>> assert foo == {} + + >>> import os + >>> with patch.dict('os.environ', {'newkey': 'newvalue'}): + ... print os.environ['newkey'] + ... + newvalue + >>> assert 'newkey' not in os.environ + +Keywords can be used in the `patch.dict` call to set values in the dictionary: + + >>> mymodule = MagicMock() + >>> mymodule.function.return_value = 'fish' + >>> with patch.dict('sys.modules', mymodule=mymodule): + ... import mymodule + ... mymodule.function('some', 'args') + ... + 'fish' + +`patch.dict` can be used with dictionary like objects that aren't actually +dictionaries. At the very minimum they must support item getting, setting, +deleting and either iteration or membership test. This corresponds to the +magic methods `__getitem__`, `__setitem__`, `__delitem__` and either +`__iter__` or `__contains__`. + + >>> class Container: + ... def __init__(self): + ... self.values = {} + ... def __getitem__(self, name): + ... return self.values[name] + ... def __setitem__(self, name, value): + ... self.values[name] = value + ... def __delitem__(self, name): + ... del self.values[name] + ... def __iter__(self): + ... return iter(self.values) + ... + >>> thing = Container() + >>> thing['one'] = 1 + >>> with patch.dict(thing, one=2, two=3): + ... assert thing['one'] == 2 + ... assert thing['two'] == 3 + ... + >>> assert thing['one'] == 1 + >>> assert list(thing) == ['one'] + + +patch.multiple +~~~~~~~~~~~~~~ + +.. function:: patch.multiple(target, spec=None, create=False, spec_set=None, autospec=None, new_callable=None, **kwargs) + + Perform multiple patches in a single call. It takes the object to be + patched (either as an object or a string to fetch the object by importing) + and keyword arguments for the patches:: + + with patch.multiple(settings, FIRST_PATCH='one', SECOND_PATCH='two'): + ... + + Use :data:`DEFAULT` as the value if you want `patch.multiple` to create + mocks for you. In this case the created mocks are passed into a decorated + function by keyword, and a dictionary is returned when `patch.multiple` is + used as a context manager. + + `patch.multiple` can be used as a decorator, class decorator or a context + manager. The arguments `spec`, `spec_set`, `create`, `autospec` and + `new_callable` have the same meaning as for `patch`. These arguments will + be applied to *all* patches done by `patch.multiple`. + + When used as a class decorator `patch.multiple` honours `patch.TEST_PREFIX` + for choosing which methods to wrap. + +If you want `patch.multiple` to create mocks for you, then you can use +:data:`DEFAULT` as the value. If you use `patch.multiple` as a decorator +then the created mocks are passed into the decorated function by keyword. + + >>> thing = object() + >>> other = object() + + >>> @patch.multiple('__main__', thing=DEFAULT, other=DEFAULT) + ... def test_function(thing, other): + ... assert isinstance(thing, MagicMock) + ... assert isinstance(other, MagicMock) + ... + >>> test_function() + +`patch.multiple` can be nested with other `patch` decorators, but put arguments +passed by keyword *after* any of the standard arguments created by `patch`: + + >>> @patch('sys.exit') + ... @patch.multiple('__main__', thing=DEFAULT, other=DEFAULT) + ... def test_function(mock_exit, other, thing): + ... assert 'other' in repr(other) + ... assert 'thing' in repr(thing) + ... assert 'exit' in repr(mock_exit) + ... + >>> test_function() + +If `patch.multiple` is used as a context manager, the value returned by the +context manger is a dictionary where created mocks are keyed by name: + + >>> with patch.multiple('__main__', thing=DEFAULT, other=DEFAULT) as values: + ... assert 'other' in repr(values['other']) + ... assert 'thing' in repr(values['thing']) + ... assert values['thing'] is thing + ... assert values['other'] is other + ... + + +.. _start-and-stop: + +patch methods: start and stop +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +All the patchers have `start` and `stop` methods. These make it simpler to do +patching in `setUp` methods or where you want to do multiple patches without +nesting decorators or with statements. + +To use them call `patch`, `patch.object` or `patch.dict` as normal and keep a +reference to the returned `patcher` object. You can then call `start` to put +the patch in place and `stop` to undo it. + +If you are using `patch` to create a mock for you then it will be returned by +the call to `patcher.start`. + + >>> patcher = patch('package.module.ClassName') + >>> from package import module + >>> original = module.ClassName + >>> new_mock = patcher.start() + >>> assert module.ClassName is not original + >>> assert module.ClassName is new_mock + >>> patcher.stop() + >>> assert module.ClassName is original + >>> assert module.ClassName is not new_mock + + +A typical use case for this might be for doing multiple patches in the `setUp` +method of a `TestCase`: + + >>> class MyTest(TestCase): + ... def setUp(self): + ... self.patcher1 = patch('package.module.Class1') + ... self.patcher2 = patch('package.module.Class2') + ... self.MockClass1 = self.patcher1.start() + ... self.MockClass2 = self.patcher2.start() + ... + ... def tearDown(self): + ... self.patcher1.stop() + ... self.patcher2.stop() + ... + ... def test_something(self): + ... assert package.module.Class1 is self.MockClass1 + ... assert package.module.Class2 is self.MockClass2 + ... + >>> MyTest('test_something').run() + +.. caution:: + + If you use this technique you must ensure that the patching is "undone" by + calling `stop`. This can be fiddlier than you might think, because if an + exception is raised in the ``setUp`` then ``tearDown`` is not called. + :meth:`unittest.TestCase.addCleanup` makes this easier: + + >>> class MyTest(TestCase): + ... def setUp(self): + ... patcher = patch('package.module.Class') + ... self.MockClass = patcher.start() + ... self.addCleanup(patcher.stop) + ... + ... def test_something(self): + ... assert package.module.Class is self.MockClass + ... + + As an added bonus you no longer need to keep a reference to the `patcher` + object. + +It is also possible to stop all patches which have been started by using +`patch.stopall`. + +.. function:: patch.stopall + + Stop all active patches. Only stops patches started with `start`. + + +TEST_PREFIX +~~~~~~~~~~~ + +All of the patchers can be used as class decorators. When used in this way +they wrap every test method on the class. The patchers recognise methods that +start with `test` as being test methods. This is the same way that the +:class:`unittest.TestLoader` finds test methods by default. + +It is possible that you want to use a different prefix for your tests. You can +inform the patchers of the different prefix by setting `patch.TEST_PREFIX`: + + >>> patch.TEST_PREFIX = 'foo' + >>> value = 3 + >>> + >>> @patch('__main__.value', 'not three') + ... class Thing: + ... def foo_one(self): + ... print value + ... def foo_two(self): + ... print value + ... + >>> + >>> Thing().foo_one() + not three + >>> Thing().foo_two() + not three + >>> value + 3 + + +Nesting Patch Decorators +~~~~~~~~~~~~~~~~~~~~~~~~ + +If you want to perform multiple patches then you can simply stack up the +decorators. + +You can stack up multiple patch decorators using this pattern: + + >>> @patch.object(SomeClass, 'class_method') + ... @patch.object(SomeClass, 'static_method') + ... def test(mock1, mock2): + ... assert SomeClass.static_method is mock1 + ... assert SomeClass.class_method is mock2 + ... SomeClass.static_method('foo') + ... SomeClass.class_method('bar') + ... return mock1, mock2 + ... + >>> mock1, mock2 = test() + >>> mock1.assert_called_once_with('foo') + >>> mock2.assert_called_once_with('bar') + + +Note that the decorators are applied from the bottom upwards. This is the +standard way that Python applies decorators. The order of the created mocks +passed into your test function matches this order. + + +.. _where-to-patch: + +Where to patch +~~~~~~~~~~~~~~ + +`patch` works by (temporarily) changing the object that a *name* points to with +another one. There can be many names pointing to any individual object, so +for patching to work you must ensure that you patch the name used by the system +under test. + +The basic principle is that you patch where an object is *looked up*, which +is not necessarily the same place as where it is defined. A couple of +examples will help to clarify this. + +Imagine we have a project that we want to test with the following structure:: + + a.py + -> Defines SomeClass + + b.py + -> from a import SomeClass + -> some_function instantiates SomeClass + +Now we want to test `some_function` but we want to mock out `SomeClass` using +`patch`. The problem is that when we import module b, which we will have to +do then it imports `SomeClass` from module a. If we use `patch` to mock out +`a.SomeClass` then it will have no effect on our test; module b already has a +reference to the *real* `SomeClass` and it looks like our patching had no +effect. + +The key is to patch out `SomeClass` where it is used (or where it is looked up +). In this case `some_function` will actually look up `SomeClass` in module b, +where we have imported it. The patching should look like:: + + @patch('b.SomeClass') + +However, consider the alternative scenario where instead of `from a import +SomeClass` module b does `import a` and `some_function` uses `a.SomeClass`. Both +of these import forms are common. In this case the class we want to patch is +being looked up on the a module and so we have to patch `a.SomeClass` instead:: + + @patch('a.SomeClass') + + +Patching Descriptors and Proxy Objects +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Both patch_ and patch.object_ correctly patch and restore descriptors: class +methods, static methods and properties. You should patch these on the *class* +rather than an instance. They also work with *some* objects +that proxy attribute access, like the `django settings object +<http://www.voidspace.org.uk/python/weblog/arch_d7_2010_12_04.shtml#e1198>`_. + + +MagicMock and magic method support +---------------------------------- + +.. _magic-methods: + +Mocking Magic Methods +~~~~~~~~~~~~~~~~~~~~~ + +:class:`Mock` supports mocking the Python protocol methods, also known as +"magic methods". This allows mock objects to replace containers or other +objects that implement Python protocols. + +Because magic methods are looked up differently from normal methods [#]_, this +support has been specially implemented. This means that only specific magic +methods are supported. The supported list includes *almost* all of them. If +there are any missing that you need please let us know. + +You mock magic methods by setting the method you are interested in to a function +or a mock instance. If you are using a function then it *must* take ``self`` as +the first argument [#]_. + + >>> def __str__(self): + ... return 'fooble' + ... + >>> mock = Mock() + >>> mock.__str__ = __str__ + >>> str(mock) + 'fooble' + + >>> mock = Mock() + >>> mock.__str__ = Mock() + >>> mock.__str__.return_value = 'fooble' + >>> str(mock) + 'fooble' + + >>> mock = Mock() + >>> mock.__iter__ = Mock(return_value=iter([])) + >>> list(mock) + [] + +One use case for this is for mocking objects used as context managers in a +`with` statement: + + >>> mock = Mock() + >>> mock.__enter__ = Mock(return_value='foo') + >>> mock.__exit__ = Mock(return_value=False) + >>> with mock as m: + ... assert m == 'foo' + ... + >>> mock.__enter__.assert_called_with() + >>> mock.__exit__.assert_called_with(None, None, None) + +Calls to magic methods do not appear in :attr:`~Mock.method_calls`, but they +are recorded in :attr:`~Mock.mock_calls`. + +.. note:: + + If you use the `spec` keyword argument to create a mock then attempting to + set a magic method that isn't in the spec will raise an `AttributeError`. + +The full list of supported magic methods is: + +* ``__hash__``, ``__sizeof__``, ``__repr__`` and ``__str__`` +* ``__dir__``, ``__format__`` and ``__subclasses__`` +* ``__floor__``, ``__trunc__`` and ``__ceil__`` +* Comparisons: ``__cmp__``, ``__lt__``, ``__gt__``, ``__le__``, ``__ge__``, + ``__eq__`` and ``__ne__`` +* Container methods: ``__getitem__``, ``__setitem__``, ``__delitem__``, + ``__contains__``, ``__len__``, ``__iter__``, ``__getslice__``, + ``__setslice__``, ``__reversed__`` and ``__missing__`` +* Context manager: ``__enter__`` and ``__exit__`` +* Unary numeric methods: ``__neg__``, ``__pos__`` and ``__invert__`` +* The numeric methods (including right hand and in-place variants): + ``__add__``, ``__sub__``, ``__mul__``, ``__div__``, + ``__floordiv__``, ``__mod__``, ``__divmod__``, ``__lshift__``, + ``__rshift__``, ``__and__``, ``__xor__``, ``__or__``, and ``__pow__`` +* Numeric conversion methods: ``__complex__``, ``__int__``, ``__float__``, + ``__index__`` and ``__coerce__`` +* Descriptor methods: ``__get__``, ``__set__`` and ``__delete__`` +* Pickling: ``__reduce__``, ``__reduce_ex__``, ``__getinitargs__``, + ``__getnewargs__``, ``__getstate__`` and ``__setstate__`` + + +The following methods exist but are *not* supported as they are either in use +by mock, can't be set dynamically, or can cause problems: + +* ``__getattr__``, ``__setattr__``, ``__init__`` and ``__new__`` +* ``__prepare__``, ``__instancecheck__``, ``__subclasscheck__``, ``__del__`` + + + +Magic Mock +~~~~~~~~~~ + +There are two `MagicMock` variants: `MagicMock` and `NonCallableMagicMock`. + + +.. class:: MagicMock(*args, **kw) + + ``MagicMock`` is a subclass of :class:`Mock` with default implementations + of most of the magic methods. You can use ``MagicMock`` without having to + configure the magic methods yourself. + + The constructor parameters have the same meaning as for :class:`Mock`. + + If you use the `spec` or `spec_set` arguments then *only* magic methods + that exist in the spec will be created. + + +.. class:: NonCallableMagicMock(*args, **kw) + + A non-callable version of `MagicMock`. + + The constructor parameters have the same meaning as for + :class:`MagicMock`, with the exception of `return_value` and + `side_effect` which have no meaning on a non-callable mock. + +The magic methods are setup with `MagicMock` objects, so you can configure them +and use them in the usual way: + + >>> mock = MagicMock() + >>> mock[3] = 'fish' + >>> mock.__setitem__.assert_called_with(3, 'fish') + >>> mock.__getitem__.return_value = 'result' + >>> mock[2] + 'result' + +By default many of the protocol methods are required to return objects of a +specific type. These methods are preconfigured with a default return value, so +that they can be used without you having to do anything if you aren't interested +in the return value. You can still *set* the return value manually if you want +to change the default. + +Methods and their defaults: + +* ``__lt__``: NotImplemented +* ``__gt__``: NotImplemented +* ``__le__``: NotImplemented +* ``__ge__``: NotImplemented +* ``__int__``: 1 +* ``__contains__``: False +* ``__len__``: 1 +* ``__iter__``: iter([]) +* ``__exit__``: False +* ``__complex__``: 1j +* ``__float__``: 1.0 +* ``__bool__``: True +* ``__index__``: 1 +* ``__hash__``: default hash for the mock +* ``__str__``: default str for the mock +* ``__sizeof__``: default sizeof for the mock + +For example: + + >>> mock = MagicMock() + >>> int(mock) + 1 + >>> len(mock) + 0 + >>> list(mock) + [] + >>> object() in mock + False + +The two equality method, `__eq__` and `__ne__`, are special. +They do the default equality comparison on identity, using a side +effect, unless you change their return value to return something else: + + >>> MagicMock() == 3 + False + >>> MagicMock() != 3 + True + >>> mock = MagicMock() + >>> mock.__eq__.return_value = True + >>> mock == 3 + True + +The return value of `MagicMock.__iter__` can be any iterable object and isn't +required to be an iterator: + + >>> mock = MagicMock() + >>> mock.__iter__.return_value = ['a', 'b', 'c'] + >>> list(mock) + ['a', 'b', 'c'] + >>> list(mock) + ['a', 'b', 'c'] + +If the return value *is* an iterator, then iterating over it once will consume +it and subsequent iterations will result in an empty list: + + >>> mock.__iter__.return_value = iter(['a', 'b', 'c']) + >>> list(mock) + ['a', 'b', 'c'] + >>> list(mock) + [] + +``MagicMock`` has all of the supported magic methods configured except for some +of the obscure and obsolete ones. You can still set these up if you want. + +Magic methods that are supported but not setup by default in ``MagicMock`` are: + +* ``__subclasses__`` +* ``__dir__`` +* ``__format__`` +* ``__get__``, ``__set__`` and ``__delete__`` +* ``__reversed__`` and ``__missing__`` +* ``__reduce__``, ``__reduce_ex__``, ``__getinitargs__``, ``__getnewargs__``, + ``__getstate__`` and ``__setstate__`` +* ``__getformat__`` and ``__setformat__`` + + + +.. [#] Magic methods *should* be looked up on the class rather than the + instance. Different versions of Python are inconsistent about applying this + rule. The supported protocol methods should work with all supported versions + of Python. +.. [#] The function is basically hooked up to the class, but each ``Mock`` + instance is kept isolated from the others. + + +Helpers +------- + +sentinel +~~~~~~~~ + +.. data:: sentinel + + The ``sentinel`` object provides a convenient way of providing unique + objects for your tests. + + Attributes are created on demand when you access them by name. Accessing + the same attribute will always return the same object. The objects + returned have a sensible repr so that test failure messages are readable. + +Sometimes when testing you need to test that a specific object is passed as an +argument to another method, or returned. It can be common to create named +sentinel objects to test this. `sentinel` provides a convenient way of +creating and testing the identity of objects like this. + +In this example we monkey patch `method` to return `sentinel.some_object`: + + >>> real = ProductionClass() + >>> real.method = Mock(name="method") + >>> real.method.return_value = sentinel.some_object + >>> result = real.method() + >>> assert result is sentinel.some_object + >>> sentinel.some_object + sentinel.some_object + + +DEFAULT +~~~~~~~ + + +.. data:: DEFAULT + + The `DEFAULT` object is a pre-created sentinel (actually + `sentinel.DEFAULT`). It can be used by :attr:`~Mock.side_effect` + functions to indicate that the normal return value should be used. + + +call +~~~~ + +.. function:: call(*args, **kwargs) + + `call` is a helper object for making simpler assertions, for comparing with + :attr:`~Mock.call_args`, :attr:`~Mock.call_args_list`, + :attr:`~Mock.mock_calls` and :attr:`~Mock.method_calls`. `call` can also be + used with :meth:`~Mock.assert_has_calls`. + + >>> m = MagicMock(return_value=None) + >>> m(1, 2, a='foo', b='bar') + >>> m() + >>> m.call_args_list == [call(1, 2, a='foo', b='bar'), call()] + True + +.. method:: call.call_list() + + For a call object that represents multiple calls, `call_list` + returns a list of all the intermediate calls as well as the + final call. + +`call_list` is particularly useful for making assertions on "chained calls". A +chained call is multiple calls on a single line of code. This results in +multiple entries in :attr:`~Mock.mock_calls` on a mock. Manually constructing +the sequence of calls can be tedious. + +:meth:`~call.call_list` can construct the sequence of calls from the same +chained call: + + >>> m = MagicMock() + >>> m(1).method(arg='foo').other('bar')(2.0) + <MagicMock name='mock().method().other()()' id='...'> + >>> kall = call(1).method(arg='foo').other('bar')(2.0) + >>> kall.call_list() + [call(1), + call().method(arg='foo'), + call().method().other('bar'), + call().method().other()(2.0)] + >>> m.mock_calls == kall.call_list() + True + +.. _calls-as-tuples: + +A `call` object is either a tuple of (positional args, keyword args) or +(name, positional args, keyword args) depending on how it was constructed. When +you construct them yourself this isn't particularly interesting, but the `call` +objects that are in the :attr:`Mock.call_args`, :attr:`Mock.call_args_list` and +:attr:`Mock.mock_calls` attributes can be introspected to get at the individual +arguments they contain. + +The `call` objects in :attr:`Mock.call_args` and :attr:`Mock.call_args_list` +are two-tuples of (positional args, keyword args) whereas the `call` objects +in :attr:`Mock.mock_calls`, along with ones you construct yourself, are +three-tuples of (name, positional args, keyword args). + +You can use their "tupleness" to pull out the individual arguments for more +complex introspection and assertions. The positional arguments are a tuple +(an empty tuple if there are no positional arguments) and the keyword +arguments are a dictionary: + + >>> m = MagicMock(return_value=None) + >>> m(1, 2, 3, arg='one', arg2='two') + >>> kall = m.call_args + >>> args, kwargs = kall + >>> args + (1, 2, 3) + >>> kwargs + {'arg2': 'two', 'arg': 'one'} + >>> args is kall[0] + True + >>> kwargs is kall[1] + True + + >>> m = MagicMock() + >>> m.foo(4, 5, 6, arg='two', arg2='three') + <MagicMock name='mock.foo()' id='...'> + >>> kall = m.mock_calls[0] + >>> name, args, kwargs = kall + >>> name + 'foo' + >>> args + (4, 5, 6) + >>> kwargs + {'arg2': 'three', 'arg': 'two'} + >>> name is m.mock_calls[0][0] + True + + +create_autospec +~~~~~~~~~~~~~~~ + +.. function:: create_autospec(spec, spec_set=False, instance=False, **kwargs) + + Create a mock object using another object as a spec. Attributes on the + mock will use the corresponding attribute on the `spec` object as their + spec. + + Functions or methods being mocked will have their arguments checked to + ensure that they are called with the correct signature. + + If `spec_set` is `True` then attempting to set attributes that don't exist + on the spec object will raise an `AttributeError`. + + If a class is used as a spec then the return value of the mock (the + instance of the class) will have the same spec. You can use a class as the + spec for an instance object by passing `instance=True`. The returned mock + will only be callable if instances of the mock are callable. + + `create_autospec` also takes arbitrary keyword arguments that are passed to + the constructor of the created mock. + +See :ref:`auto-speccing` for examples of how to use auto-speccing with +`create_autospec` and the `autospec` argument to :func:`patch`. + + +ANY +~~~ + +.. data:: ANY + +Sometimes you may need to make assertions about *some* of the arguments in a +call to mock, but either not care about some of the arguments or want to pull +them individually out of :attr:`~Mock.call_args` and make more complex +assertions on them. + +To ignore certain arguments you can pass in objects that compare equal to +*everything*. Calls to :meth:`~Mock.assert_called_with` and +:meth:`~Mock.assert_called_once_with` will then succeed no matter what was +passed in. + + >>> mock = Mock(return_value=None) + >>> mock('foo', bar=object()) + >>> mock.assert_called_once_with('foo', bar=ANY) + +`ANY` can also be used in comparisons with call lists like +:attr:`~Mock.mock_calls`: + + >>> m = MagicMock(return_value=None) + >>> m(1) + >>> m(1, 2) + >>> m(object()) + >>> m.mock_calls == [call(1), call(1, 2), ANY] + True + + + +FILTER_DIR +~~~~~~~~~~ + +.. data:: FILTER_DIR + +`FILTER_DIR` is a module level variable that controls the way mock objects +respond to `dir` (only for Python 2.6 or more recent). The default is `True`, +which uses the filtering described below, to only show useful members. If you +dislike this filtering, or need to switch it off for diagnostic purposes, then +set `mock.FILTER_DIR = False`. + +With filtering on, `dir(some_mock)` shows only useful attributes and will +include any dynamically created attributes that wouldn't normally be shown. +If the mock was created with a `spec` (or `autospec` of course) then all the +attributes from the original are shown, even if they haven't been accessed +yet: + + >>> dir(Mock()) + ['assert_any_call', + 'assert_called_once_with', + 'assert_called_with', + 'assert_has_calls', + 'attach_mock', + ... + >>> from urllib import request + >>> dir(Mock(spec=request)) + ['AbstractBasicAuthHandler', + 'AbstractDigestAuthHandler', + 'AbstractHTTPHandler', + 'BaseHandler', + ... + +Many of the not-very-useful (private to `Mock` rather than the thing being +mocked) underscore and double underscore prefixed attributes have been +filtered from the result of calling `dir` on a `Mock`. If you dislike this +behaviour you can switch it off by setting the module level switch +`FILTER_DIR`: + + >>> from unittest import mock + >>> mock.FILTER_DIR = False + >>> dir(mock.Mock()) + ['_NonCallableMock__get_return_value', + '_NonCallableMock__get_side_effect', + '_NonCallableMock__return_value_doc', + '_NonCallableMock__set_return_value', + '_NonCallableMock__set_side_effect', + '__call__', + '__class__', + ... + +Alternatively you can just use `vars(my_mock)` (instance members) and +`dir(type(my_mock))` (type members) to bypass the filtering irrespective of +`mock.FILTER_DIR`. + + +mock_open +~~~~~~~~~ + +.. function:: mock_open(mock=None, read_data=None) + + A helper function to create a mock to replace the use of `open`. It works + for `open` called directly or used as a context manager. + + The `mock` argument is the mock object to configure. If `None` (the + default) then a `MagicMock` will be created for you, with the API limited + to methods or attributes available on standard file handles. + + `read_data` is a string for the `~io.IOBase.read` method of the file handle + to return. This is an empty string by default. + +Using `open` as a context manager is a great way to ensure your file handles +are closed properly and is becoming common:: + + with open('/some/path', 'w') as f: + f.write('something') + +The issue is that even if you mock out the call to `open` it is the +*returned object* that is used as a context manager (and has `__enter__` and +`__exit__` called). + +Mocking context managers with a :class:`MagicMock` is common enough and fiddly +enough that a helper function is useful. + + >>> m = mock_open() + >>> with patch('__main__.open', m, create=True): + ... with open('foo', 'w') as h: + ... h.write('some stuff') + ... + >>> m.mock_calls + [call('foo', 'w'), + call().__enter__(), + call().write('some stuff'), + call().__exit__(None, None, None)] + >>> m.assert_called_once_with('foo', 'w') + >>> handle = m() + >>> handle.write.assert_called_once_with('some stuff') + +And for reading files: + + >>> with patch('__main__.open', mock_open(read_data='bibble'), create=True) as m: + ... with open('foo') as h: + ... result = h.read() + ... + >>> m.assert_called_once_with('foo') + >>> assert result == 'bibble' + + +.. _auto-speccing: + +Autospeccing +~~~~~~~~~~~~ + +Autospeccing is based on the existing `spec` feature of mock. It limits the +api of mocks to the api of an original object (the spec), but it is recursive +(implemented lazily) so that attributes of mocks only have the same api as +the attributes of the spec. In addition mocked functions / methods have the +same call signature as the original so they raise a `TypeError` if they are +called incorrectly. + +Before I explain how auto-speccing works, here's why it is needed. + +`Mock` is a very powerful and flexible object, but it suffers from two flaws +when used to mock out objects from a system under test. One of these flaws is +specific to the `Mock` api and the other is a more general problem with using +mock objects. + +First the problem specific to `Mock`. `Mock` has two assert methods that are +extremely handy: :meth:`~Mock.assert_called_with` and +:meth:`~Mock.assert_called_once_with`. + + >>> mock = Mock(name='Thing', return_value=None) + >>> mock(1, 2, 3) + >>> mock.assert_called_once_with(1, 2, 3) + >>> mock(1, 2, 3) + >>> mock.assert_called_once_with(1, 2, 3) + Traceback (most recent call last): + ... + AssertionError: Expected 'mock' to be called once. Called 2 times. + +Because mocks auto-create attributes on demand, and allow you to call them +with arbitrary arguments, if you misspell one of these assert methods then +your assertion is gone: + +.. code-block:: pycon + + >>> mock = Mock(name='Thing', return_value=None) + >>> mock(1, 2, 3) + >>> mock.assret_called_once_with(4, 5, 6) + +Your tests can pass silently and incorrectly because of the typo. + +The second issue is more general to mocking. If you refactor some of your +code, rename members and so on, any tests for code that is still using the +*old api* but uses mocks instead of the real objects will still pass. This +means your tests can all pass even though your code is broken. + +Note that this is another reason why you need integration tests as well as +unit tests. Testing everything in isolation is all fine and dandy, but if you +don't test how your units are "wired together" there is still lots of room +for bugs that tests might have caught. + +`mock` already provides a feature to help with this, called speccing. If you +use a class or instance as the `spec` for a mock then you can only access +attributes on the mock that exist on the real class: + + >>> from urllib import request + >>> mock = Mock(spec=request.Request) + >>> mock.assret_called_with + Traceback (most recent call last): + ... + AttributeError: Mock object has no attribute 'assret_called_with' + +The spec only applies to the mock itself, so we still have the same issue +with any methods on the mock: + +.. code-block:: pycon + + >>> mock.has_data() + <mock.Mock object at 0x...> + >>> mock.has_data.assret_called_with() + +Auto-speccing solves this problem. You can either pass `autospec=True` to +`patch` / `patch.object` or use the `create_autospec` function to create a +mock with a spec. If you use the `autospec=True` argument to `patch` then the +object that is being replaced will be used as the spec object. Because the +speccing is done "lazily" (the spec is created as attributes on the mock are +accessed) you can use it with very complex or deeply nested objects (like +modules that import modules that import modules) without a big performance +hit. + +Here's an example of it in use: + + >>> from urllib import request + >>> patcher = patch('__main__.request', autospec=True) + >>> mock_request = patcher.start() + >>> request is mock_request + True + >>> mock_request.Request + <MagicMock name='request.Request' spec='Request' id='...'> + +You can see that `request.Request` has a spec. `request.Request` takes two +arguments in the constructor (one of which is `self`). Here's what happens if +we try to call it incorrectly: + + >>> req = request.Request() + Traceback (most recent call last): + ... + TypeError: <lambda>() takes at least 2 arguments (1 given) + +The spec also applies to instantiated classes (i.e. the return value of +specced mocks): + + >>> req = request.Request('foo') + >>> req + <NonCallableMagicMock name='request.Request()' spec='Request' id='...'> + +`Request` objects are not callable, so the return value of instantiating our +mocked out `request.Request` is a non-callable mock. With the spec in place +any typos in our asserts will raise the correct error: + + >>> req.add_header('spam', 'eggs') + <MagicMock name='request.Request().add_header()' id='...'> + >>> req.add_header.assret_called_with + Traceback (most recent call last): + ... + AttributeError: Mock object has no attribute 'assret_called_with' + >>> req.add_header.assert_called_with('spam', 'eggs') + +In many cases you will just be able to add `autospec=True` to your existing +`patch` calls and then be protected against bugs due to typos and api +changes. + +As well as using `autospec` through `patch` there is a +:func:`create_autospec` for creating autospecced mocks directly: + + >>> from urllib import request + >>> mock_request = create_autospec(request) + >>> mock_request.Request('foo', 'bar') + <NonCallableMagicMock name='mock.Request()' spec='Request' id='...'> + +This isn't without caveats and limitations however, which is why it is not +the default behaviour. In order to know what attributes are available on the +spec object, autospec has to introspect (access attributes) the spec. As you +traverse attributes on the mock a corresponding traversal of the original +object is happening under the hood. If any of your specced objects have +properties or descriptors that can trigger code execution then you may not be +able to use autospec. On the other hand it is much better to design your +objects so that introspection is safe [#]_. + +A more serious problem is that it is common for instance attributes to be +created in the `__init__` method and not to exist on the class at all. +`autospec` can't know about any dynamically created attributes and restricts +the api to visible attributes. + + >>> class Something: + ... def __init__(self): + ... self.a = 33 + ... + >>> with patch('__main__.Something', autospec=True): + ... thing = Something() + ... thing.a + ... + Traceback (most recent call last): + ... + AttributeError: Mock object has no attribute 'a' + +There are a few different ways of resolving this problem. The easiest, but +not necessarily the least annoying, way is to simply set the required +attributes on the mock after creation. Just because `autospec` doesn't allow +you to fetch attributes that don't exist on the spec it doesn't prevent you +setting them: + + >>> with patch('__main__.Something', autospec=True): + ... thing = Something() + ... thing.a = 33 + ... + +There is a more aggressive version of both `spec` and `autospec` that *does* +prevent you setting non-existent attributes. This is useful if you want to +ensure your code only *sets* valid attributes too, but obviously it prevents +this particular scenario: + + >>> with patch('__main__.Something', autospec=True, spec_set=True): + ... thing = Something() + ... thing.a = 33 + ... + Traceback (most recent call last): + ... + AttributeError: Mock object has no attribute 'a' + +Probably the best way of solving the problem is to add class attributes as +default values for instance members initialised in `__init__`. Note that if +you are only setting default attributes in `__init__` then providing them via +class attributes (shared between instances of course) is faster too. e.g. + +.. code-block:: python + + class Something: + a = 33 + +This brings up another issue. It is relatively common to provide a default +value of `None` for members that will later be an object of a different type. +`None` would be useless as a spec because it wouldn't let you access *any* +attributes or methods on it. As `None` is *never* going to be useful as a +spec, and probably indicates a member that will normally of some other type, +`autospec` doesn't use a spec for members that are set to `None`. These will +just be ordinary mocks (well - `MagicMocks`): + + >>> class Something: + ... member = None + ... + >>> mock = create_autospec(Something) + >>> mock.member.foo.bar.baz() + <MagicMock name='mock.member.foo.bar.baz()' id='...'> + +If modifying your production classes to add defaults isn't to your liking +then there are more options. One of these is simply to use an instance as the +spec rather than the class. The other is to create a subclass of the +production class and add the defaults to the subclass without affecting the +production class. Both of these require you to use an alternative object as +the spec. Thankfully `patch` supports this - you can simply pass the +alternative object as the `autospec` argument: + + >>> class Something: + ... def __init__(self): + ... self.a = 33 + ... + >>> class SomethingForTest(Something): + ... a = 33 + ... + >>> p = patch('__main__.Something', autospec=SomethingForTest) + >>> mock = p.start() + >>> mock.a + <NonCallableMagicMock name='Something.a' spec='int' id='...'> + + +.. [#] This only applies to classes or already instantiated objects. Calling + a mocked class to create a mock instance *does not* create a real instance. + It is only attribute lookups - along with calls to `dir` - that are done. + diff --git a/Doc/library/unittest.rst b/Doc/library/unittest.rst index b3372f9..fdc9409 100644 --- a/Doc/library/unittest.rst +++ b/Doc/library/unittest.rst @@ -11,17 +11,14 @@ (If you are already familiar with the basic concepts of testing, you might want to skip to :ref:`the list of assert methods <assert-methods>`.) -The Python unit testing framework, sometimes referred to as "PyUnit," is a -Python language version of JUnit, by Kent Beck and Erich Gamma. JUnit is, in -turn, a Java version of Kent's Smalltalk testing framework. Each is the de -facto standard unit testing framework for its respective language. +The :mod:`unittest` unit testing framework was originally inspired by JUnit +and has a similar flavor as major unit testing frameworks in other +languages. It supports test automation, sharing of setup and shutdown code +for tests, aggregation of tests into collections, and independence of the +tests from the reporting framework. -:mod:`unittest` supports test automation, sharing of setup and shutdown code for -tests, aggregation of tests into collections, and independence of the tests from -the reporting framework. The :mod:`unittest` module provides classes that make -it easy to support these qualities for a set of tests. - -To achieve this, :mod:`unittest` supports some important concepts: +To achieve this, :mod:`unittest` supports some important concepts in an +object-oriented way: test fixture A :dfn:`test fixture` represents the preparation needed to perform one or more @@ -30,7 +27,7 @@ test fixture process. test case - A :dfn:`test case` is the smallest unit of testing. It checks for a specific + A :dfn:`test case` is the individual unit of testing. It checks for a specific response to a particular set of inputs. :mod:`unittest` provides a base class, :class:`TestCase`, which may be used to create new test cases. @@ -44,43 +41,12 @@ test runner a textual interface, or return a special value to indicate the results of executing the tests. -The test case and test fixture concepts are supported through the -:class:`TestCase` and :class:`FunctionTestCase` classes; the former should be -used when creating new tests, and the latter can be used when integrating -existing test code with a :mod:`unittest`\ -driven framework. When building test -fixtures using :class:`TestCase`, the :meth:`~TestCase.setUp` and -:meth:`~TestCase.tearDown` methods can be overridden to provide initialization -and cleanup for the fixture. With :class:`FunctionTestCase`, existing functions -can be passed to the constructor for these purposes. When the test is run, the -fixture initialization is run first; if it succeeds, the cleanup method is run -after the test has been executed, regardless of the outcome of the test. Each -instance of the :class:`TestCase` will only be used to run a single test method, -so a new fixture is created for each test. - -Test suites are implemented by the :class:`TestSuite` class. This class allows -individual tests and test suites to be aggregated; when the suite is executed, -all tests added directly to the suite and in "child" test suites are run. - -A test runner is an object that provides a single method, -:meth:`~TestRunner.run`, which accepts a :class:`TestCase` or :class:`TestSuite` -object as a parameter, and returns a result object. The class -:class:`TestResult` is provided for use as the result object. :mod:`unittest` -provides the :class:`TextTestRunner` as an example test runner which reports -test results on the standard error stream by default. Alternate runners can be -implemented for other environments (such as graphical environments) without any -need to derive from a specific class. - .. seealso:: Module :mod:`doctest` Another test-support module with a very different flavor. - `unittest2: A backport of new unittest features for Python 2.4-2.6 <http://pypi.python.org/pypi/unittest2>`_ - Many new features were added to unittest in Python 2.7, including test - discovery. unittest2 allows you to use these features with earlier - versions of Python. - `Simple Smalltalk Testing: With Patterns <http://www.XProgramming.com/testfram.htm>`_ Kent Beck's original paper on testing frameworks using the pattern shared by :mod:`unittest`. @@ -89,7 +55,7 @@ need to derive from a specific class. Third-party unittest frameworks with a lighter-weight syntax for writing tests. For example, ``assert func(10) == 42``. - `The Python Testing Tools Taxonomy <http://pycheesecake.org/wiki/PythonTestingToolsTaxonomy>`_ + `The Python Testing Tools Taxonomy <http://wiki.python.org/moin/PythonTestingToolsTaxonomy>`_ An extensive list of Python testing tools including functional testing frameworks and mock object libraries. @@ -173,15 +139,8 @@ line, the above script produces an output that looks like this:: OK -Instead of :func:`unittest.main`, there are other ways to run the tests with a -finer level of control, less terse output, and no requirement to be run from the -command line. For example, the last two lines may be replaced with:: - - suite = unittest.TestLoader().loadTestsFromTestCase(TestSequenceFunctions) - unittest.TextTestRunner(verbosity=2).run(suite) - -Running the revised script from the interpreter or another script produces the -following output:: +Passing the ``-v`` option to your test script will instruct :func:`unittest.main` +to enable a higher level of verbosity, and produce the following output:: test_choice (__main__.TestSequenceFunctions) ... ok test_sample (__main__.TestSequenceFunctions) ... ok @@ -359,45 +318,30 @@ test cases are represented by :class:`unittest.TestCase` instances. To make your own test cases you must write subclasses of :class:`TestCase` or use :class:`FunctionTestCase`. -An instance of a :class:`TestCase`\ -derived class is an object that can -completely run a single test method, together with optional set-up and tidy-up -code. - The testing code of a :class:`TestCase` instance should be entirely self contained, such that it can be run either in isolation or in arbitrary combination with any number of other test cases. -The simplest :class:`TestCase` subclass will simply override the -:meth:`~TestCase.runTest` method in order to perform specific testing code:: +The simplest :class:`TestCase` subclass will simply implement a test method +(i.e. a method whose name starts with ``test``) in order to perform specific +testing code:: import unittest class DefaultWidgetSizeTestCase(unittest.TestCase): - def runTest(self): + def test_default_widget_size(self): widget = Widget('The widget') - self.assertEqual(widget.size(), (50, 50), 'incorrect default size') + self.assertEqual(widget.size(), (50, 50)) Note that in order to test something, we use one of the :meth:`assert\*` methods provided by the :class:`TestCase` base class. If the test fails, an exception will be raised, and :mod:`unittest` will identify the test case as a -:dfn:`failure`. Any other exceptions will be treated as :dfn:`errors`. This -helps you identify where the problem is: :dfn:`failures` are caused by incorrect -results - a 5 where you expected a 6. :dfn:`Errors` are caused by incorrect -code - e.g., a :exc:`TypeError` caused by an incorrect function call. - -The way to run a test case will be described later. For now, note that to -construct an instance of such a test case, we call its constructor without -arguments:: - - testCase = DefaultWidgetSizeTestCase() - -Now, such test cases can be numerous, and their set-up can be repetitive. In -the above case, constructing a :class:`Widget` in each of 100 Widget test case -subclasses would mean unsightly duplication. +:dfn:`failure`. Any other exceptions will be treated as :dfn:`errors`. -Luckily, we can factor out such set-up code by implementing a method called -:meth:`~TestCase.setUp`, which the testing framework will automatically call for -us when we run the test:: +Tests can be numerous, and their set-up can be repetitive. Luckily, we +can factor out set-up code by implementing a method called +:meth:`~TestCase.setUp`, which the testing framework will automatically +call for every single test we run:: import unittest @@ -405,23 +349,26 @@ us when we run the test:: def setUp(self): self.widget = Widget('The widget') - class DefaultWidgetSizeTestCase(SimpleWidgetTestCase): - def runTest(self): + def test_default_widget_size(self): self.assertEqual(self.widget.size(), (50,50), 'incorrect default size') - class WidgetResizeTestCase(SimpleWidgetTestCase): - def runTest(self): + def test_widget_resize(self): self.widget.resize(100,150) self.assertEqual(self.widget.size(), (100,150), 'wrong size after resize') +.. note:: + The order in which the various tests will be run is determined + by sorting the test method names with respect to the built-in + ordering for strings. + If the :meth:`~TestCase.setUp` method raises an exception while the test is -running, the framework will consider the test to have suffered an error, and the -:meth:`~TestCase.runTest` method will not be executed. +running, the framework will consider the test to have suffered an error, and +the test method will not be executed. Similarly, we can provide a :meth:`~TestCase.tearDown` method that tidies up -after the :meth:`~TestCase.runTest` method has been run:: +after the test method has been run:: import unittest @@ -431,59 +378,20 @@ after the :meth:`~TestCase.runTest` method has been run:: def tearDown(self): self.widget.dispose() - self.widget = None -If :meth:`~TestCase.setUp` succeeded, the :meth:`~TestCase.tearDown` method will -be run whether :meth:`~TestCase.runTest` succeeded or not. +If :meth:`~TestCase.setUp` succeeded, :meth:`~TestCase.tearDown` will be +run whether the test method succeeded or not. Such a working environment for the testing code is called a :dfn:`fixture`. -Often, many small test cases will use the same fixture. In this case, we would -end up subclassing :class:`SimpleWidgetTestCase` into many small one-method -classes such as :class:`DefaultWidgetSizeTestCase`. This is time-consuming and -discouraging, so in the same vein as JUnit, :mod:`unittest` provides a simpler -mechanism:: - - import unittest - - class WidgetTestCase(unittest.TestCase): - def setUp(self): - self.widget = Widget('The widget') - - def tearDown(self): - self.widget.dispose() - self.widget = None - - def test_default_size(self): - self.assertEqual(self.widget.size(), (50,50), - 'incorrect default size') - - def test_resize(self): - self.widget.resize(100,150) - self.assertEqual(self.widget.size(), (100,150), - 'wrong size after resize') - -Here we have not provided a :meth:`~TestCase.runTest` method, but have instead -provided two different test methods. Class instances will now each run one of -the :meth:`test_\*` methods, with ``self.widget`` created and destroyed -separately for each instance. When creating an instance we must specify the -test method it is to run. We do this by passing the method name in the -constructor:: - - defaultSizeTestCase = WidgetTestCase('test_default_size') - resizeTestCase = WidgetTestCase('test_resize') - Test case instances are grouped together according to the features they test. :mod:`unittest` provides a mechanism for this: the :dfn:`test suite`, -represented by :mod:`unittest`'s :class:`TestSuite` class:: +represented by :mod:`unittest`'s :class:`TestSuite` class. In most cases, +calling :func:`unittest.main` will do the right thing and collect all the +module's test cases for you, and then execute them. - widgetTestSuite = unittest.TestSuite() - widgetTestSuite.addTest(WidgetTestCase('test_default_size')) - widgetTestSuite.addTest(WidgetTestCase('test_resize')) - -For the ease of running tests, as we will see later, it is a good idea to -provide in each test module a callable object that returns a pre-built test -suite:: +However, should you want to customize the building of your test suite, +you can do it yourself:: def suite(): suite = unittest.TestSuite() @@ -491,37 +399,6 @@ suite:: suite.addTest(WidgetTestCase('test_resize')) return suite -or even:: - - def suite(): - tests = ['test_default_size', 'test_resize'] - - return unittest.TestSuite(map(WidgetTestCase, tests)) - -Since it is a common pattern to create a :class:`TestCase` subclass with many -similarly named test functions, :mod:`unittest` provides a :class:`TestLoader` -class that can be used to automate the process of creating a test suite and -populating it with individual tests. For example, :: - - suite = unittest.TestLoader().loadTestsFromTestCase(WidgetTestCase) - -will create a test suite that will run ``WidgetTestCase.test_default_size()`` and -``WidgetTestCase.test_resize``. :class:`TestLoader` uses the ``'test'`` method -name prefix to identify test methods automatically. - -Note that the order in which the various test cases will be run is -determined by sorting the test function names with respect to the -built-in ordering for strings. - -Often it is desirable to group suites of test cases together, so as to run tests -for the whole system at once. This is easy, since :class:`TestSuite` instances -can be added to a :class:`TestSuite` just as :class:`TestCase` instances can be -added to a :class:`TestSuite`:: - - suite1 = module1.TheTestSuite() - suite2 = module2.TheTestSuite() - alltests = unittest.TestSuite([suite1, suite2]) - You can place the definitions of test cases and test suites in the same modules as the code they are to test (such as :file:`widget.py`), but there are several advantages to placing the test code in a separate module, such as @@ -564,23 +441,13 @@ Given the following test function:: assert something.name is not None # ... -one can create an equivalent test case instance as follows:: - - testcase = unittest.FunctionTestCase(testSomething) - -If there are additional set-up and tear-down methods that should be called as -part of the test case's operation, they can also be provided like so:: +one can create an equivalent test case instance as follows, with optional +set-up and tear-down methods:: testcase = unittest.FunctionTestCase(testSomething, setUp=makeSomethingDB, tearDown=deleteSomethingDB) -To make migrating existing test suites easier, :mod:`unittest` supports tests -raising :exc:`AssertionError` to indicate test failure. However, it is -recommended that you use the explicit :meth:`TestCase.fail\*` and -:meth:`TestCase.assert\*` methods instead, as future versions of :mod:`unittest` -may treat :exc:`AssertionError` differently. - .. note:: Even though :class:`FunctionTestCase` can be used to quickly convert an @@ -711,32 +578,24 @@ Test cases .. class:: TestCase(methodName='runTest') - Instances of the :class:`TestCase` class represent the smallest testable units + Instances of the :class:`TestCase` class represent the logical test units in the :mod:`unittest` universe. This class is intended to be used as a base class, with specific tests being implemented by concrete subclasses. This class implements the interface needed by the test runner to allow it to drive the - test, and methods that the test code can use to check for and report various + tests, and methods that the test code can use to check for and report various kinds of failure. - Each instance of :class:`TestCase` will run a single test method: the method - named *methodName*. If you remember, we had an earlier example that went - something like this:: - - def suite(): - suite = unittest.TestSuite() - suite.addTest(WidgetTestCase('test_default_size')) - suite.addTest(WidgetTestCase('test_resize')) - return suite - - Here, we create two instances of :class:`WidgetTestCase`, each of which runs a - single test. + Each instance of :class:`TestCase` will run a single base method: the method + named *methodName*. However, the standard implementation of the default + *methodName*, ``runTest()``, will run every method starting with ``test`` + as an individual test, and count successes and failures accordingly. + Therefore, in most uses of :class:`TestCase`, you will neither change + the *methodName* nor reimplement the default ``runTest()`` method. .. versionchanged:: 3.2 - :class:`TestCase` can be instantiated successfully without providing a method - name. This makes it easier to experiment with :class:`TestCase` from the - interactive interpreter. - - *methodName* defaults to :meth:`runTest`. + :class:`TestCase` can be instantiated successfully without providing a + *methodName*. This makes it easier to experiment with :class:`TestCase` + from the interactive interpreter. :class:`TestCase` instances provide three groups of methods: one group used to run the test, another used by the test implementation to check conditions @@ -745,7 +604,6 @@ Test cases Methods in the first group (running the test) are: - .. method:: setUp() Method called to prepare the test fixture. This is called immediately @@ -797,14 +655,18 @@ Test cases .. method:: run(result=None) - Run the test, collecting the result into the test result object passed as - *result*. If *result* is omitted or ``None``, a temporary result - object is created (by calling the :meth:`defaultTestResult` method) and - used. The result object is not returned to :meth:`run`'s caller. + Run the test, collecting the result into the :class:`TestResult` object + passed as *result*. If *result* is omitted or ``None``, a temporary + result object is created (by calling the :meth:`defaultTestResult` + method) and used. The result object is returned to :meth:`run`'s + caller. The same effect may be had by simply calling the :class:`TestCase` instance. + .. versionchanged:: 3.3 + Previous versions of ``run`` did not return the result. Neither did + calling an instance. .. method:: skipTest(reason) @@ -865,10 +727,11 @@ Test cases | <TestCase.assertNotIsInstance>` | | | +-----------------------------------------+-----------------------------+---------------+ - All the assert methods (except :meth:`assertRaises`, - :meth:`assertRaisesRegex`, :meth:`assertWarns`, :meth:`assertWarnsRegex`) - accept a *msg* argument that, if specified, is used as the error message on - failure (see also :data:`longMessage`). + All the assert methods accept a *msg* argument that, if specified, is used + as the error message on failure (see also :data:`longMessage`). + Note that the *msg* keyword argument can be passed to :meth:`assertRaises`, + :meth:`assertRaisesRegex`, :meth:`assertWarns`, :meth:`assertWarnsRegex` + only when they are used as a context manager. .. method:: assertEqual(first, second, msg=None) @@ -952,18 +815,18 @@ Test cases | :meth:`assertRaises(exc, fun, *args, **kwds) | ``fun(*args, **kwds)`` raises *exc* | | | <TestCase.assertRaises>` | | | +---------------------------------------------------------+--------------------------------------+------------+ - | :meth:`assertRaisesRegex(exc, re, fun, *args, **kwds) | ``fun(*args, **kwds)`` raises *exc* | 3.1 | - | <TestCase.assertRaisesRegex>` | and the message matches *re* | | + | :meth:`assertRaisesRegex(exc, r, fun, *args, **kwds) | ``fun(*args, **kwds)`` raises *exc* | 3.1 | + | <TestCase.assertRaisesRegex>` | and the message matches regex *r* | | +---------------------------------------------------------+--------------------------------------+------------+ | :meth:`assertWarns(warn, fun, *args, **kwds) | ``fun(*args, **kwds)`` raises *warn* | 3.2 | | <TestCase.assertWarns>` | | | +---------------------------------------------------------+--------------------------------------+------------+ - | :meth:`assertWarnsRegex(warn, re, fun, *args, **kwds) | ``fun(*args, **kwds)`` raises *warn* | 3.2 | - | <TestCase.assertWarnsRegex>` | and the message matches *re* | | + | :meth:`assertWarnsRegex(warn, r, fun, *args, **kwds) | ``fun(*args, **kwds)`` raises *warn* | 3.2 | + | <TestCase.assertWarnsRegex>` | and the message matches regex *r* | | +---------------------------------------------------------+--------------------------------------+------------+ .. method:: assertRaises(exception, callable, *args, **kwds) - assertRaises(exception) + assertRaises(exception, msg=None) Test that an exception is raised when *callable* is called with any positional or keyword arguments that are also passed to @@ -972,12 +835,16 @@ Test cases To catch any of a group of exceptions, a tuple containing the exception classes may be passed as *exception*. - If only the *exception* argument is given, returns a context manager so - that the code under test can be written inline rather than as a function:: + If only the *exception* and possibly the *msg* arguments are given, + return a context manager so that the code under test can be written + inline rather than as a function:: with self.assertRaises(SomeException): do_something() + When used as a context manager, :meth:`assertRaises` accepts the + additional keyword argument *msg*. + The context manager will store the caught exception object in its :attr:`exception` attribute. This can be useful if the intention is to perform additional checks on the exception raised:: @@ -994,16 +861,19 @@ Test cases .. versionchanged:: 3.2 Added the :attr:`exception` attribute. + .. versionchanged:: 3.3 + Added the *msg* keyword argument when used as a context manager. + .. method:: assertRaisesRegex(exception, regex, callable, *args, **kwds) - assertRaisesRegex(exception, regex) + assertRaisesRegex(exception, regex, msg=None) Like :meth:`assertRaises` but also tests that *regex* matches on the string representation of the raised exception. *regex* may be a regular expression object or a string containing a regular expression suitable for use by :func:`re.search`. Examples:: - self.assertRaisesRegex(ValueError, 'invalid literal for.*XYZ$', + self.assertRaisesRegex(ValueError, "invalid literal for.*XYZ'$", int, 'XYZ') or:: @@ -1013,31 +883,39 @@ Test cases .. versionadded:: 3.1 under the name ``assertRaisesRegexp``. + .. versionchanged:: 3.2 Renamed to :meth:`assertRaisesRegex`. + .. versionchanged:: 3.3 + Added the *msg* keyword argument when used as a context manager. + .. method:: assertWarns(warning, callable, *args, **kwds) - assertWarns(warning) + assertWarns(warning, msg=None) Test that a warning is triggered when *callable* is called with any positional or keyword arguments that are also passed to :meth:`assertWarns`. The test passes if *warning* is triggered and - fails if it isn't. Also, any unexpected exception is an error. + fails if it isn't. Any exception is an error. To catch any of a group of warnings, a tuple containing the warning classes may be passed as *warnings*. - If only the *warning* argument is given, returns a context manager so - that the code under test can be written inline rather than as a function:: + If only the *warning* and possibly the *msg* arguments are given, + return a context manager so that the code under test can be written + inline rather than as a function:: with self.assertWarns(SomeWarning): do_something() + When used as a context manager, :meth:`assertWarns` accepts the + additional keyword argument *msg*. + The context manager will store the caught warning object in its :attr:`warning` attribute, and the source line which triggered the warnings in the :attr:`filename` and :attr:`lineno` attributes. This can be useful if the intention is to perform additional checks - on the exception raised:: + on the warning caught:: with self.assertWarns(SomeWarning) as cm: do_something() @@ -1050,9 +928,12 @@ Test cases .. versionadded:: 3.2 + .. versionchanged:: 3.3 + Added the *msg* keyword argument when used as a context manager. + .. method:: assertWarnsRegex(warning, regex, callable, *args, **kwds) - assertWarnsRegex(warning, regex) + assertWarnsRegex(warning, regex, msg=None) Like :meth:`assertWarns` but also tests that *regex* matches on the message of the triggered warning. *regex* may be a regular expression @@ -1070,6 +951,8 @@ Test cases .. versionadded:: 3.2 + .. versionchanged:: 3.3 + Added the *msg* keyword argument when used as a context manager. There are also other methods used to perform more specific checks, such as: @@ -1095,10 +978,10 @@ Test cases | :meth:`assertLessEqual(a, b) | ``a <= b`` | 3.1 | | <TestCase.assertLessEqual>` | | | +---------------------------------------+--------------------------------+--------------+ - | :meth:`assertRegex(s, re) | ``regex.search(s)`` | 3.1 | + | :meth:`assertRegex(s, r) | ``r.search(s)`` | 3.1 | | <TestCase.assertRegex>` | | | +---------------------------------------+--------------------------------+--------------+ - | :meth:`assertNotRegex(s, re) | ``not regex.search(s)`` | 3.2 | + | :meth:`assertNotRegex(s, r) | ``not r.search(s)`` | 3.2 | | <TestCase.assertNotRegex>` | | | +---------------------------------------+--------------------------------+--------------+ | :meth:`assertCountEqual(a, b) | *a* and *b* have the same | 3.2 | @@ -1117,7 +1000,7 @@ Test cases like the :func:`round` function) and not *significant digits*. If *delta* is supplied instead of *places* then the difference - between *first* and *second* must be less (or more) than *delta*. + between *first* and *second* must be less or equal to (or greater than) *delta*. Supplying both *delta* and *places* raises a ``TypeError``. @@ -1159,21 +1042,6 @@ Test cases :meth:`.assertNotRegex`. - .. method:: assertDictContainsSubset(subset, dictionary, msg=None) - - Tests whether the key/value pairs in *dictionary* are a superset of - those in *subset*. If not, an error message listing the missing keys - and mismatched values is generated. - - Note, the arguments are in the opposite order of what the method name - dictates. Instead, consider using the set-methods on :ref:`dictionary - views <dict-views>`, for example: ``d.keys() <= e.keys()`` or - ``d.items() <= d.items()``. - - .. versionadded:: 3.1 - .. deprecated:: 3.2 - - .. method:: assertCountEqual(first, second, msg=None) Test that sequence *first* contains the same elements as *second*, @@ -1188,21 +1056,6 @@ Test cases .. versionadded:: 3.2 - .. method:: assertSameElements(first, second, msg=None) - - Test that sequence *first* contains the same elements as *second*, - regardless of their order. When they don't, an error message listing - the differences between the sequences will be generated. - - Duplicate elements are ignored when comparing *first* and *second*. - It is the equivalent of ``assertEqual(set(first), set(second))`` - but it works with sequences of unhashable objects as well. Because - duplicates are ignored, this method has been deprecated in favour of - :meth:`assertCountEqual`. - - .. versionadded:: 3.1 - .. deprecated:: 3.2 - .. _type-specific-methods: @@ -1636,11 +1489,11 @@ Loading and running tests .. method:: discover(start_dir, pattern='test*.py', top_level_dir=None) - Find and return all test modules from the specified start directory, - recursing into subdirectories to find them. Only test files that match - *pattern* will be loaded. (Using shell style pattern matching.) Only - module names that are importable (i.e. are valid Python identifiers) will - be loaded. + Find all the test modules by recursing into subdirectories from the + specified start directory, and return a TestSuite object containing them. + Only test files that match *pattern* will be loaded. (Using shell style + pattern matching.) Only module names that are importable (i.e. are valid + Python identifiers) will be loaded. All test modules must be importable from the top level of the project. If the start directory is not the top level directory then the top level @@ -1724,8 +1577,7 @@ Loading and running tests A list containing 2-tuples of :class:`TestCase` instances and strings holding formatted tracebacks. Each tuple represents a test where a failure - was explicitly signalled using the :meth:`TestCase.fail\*` or - :meth:`TestCase.assert\*` methods. + was explicitly signalled using the :meth:`TestCase.assert\*` methods. .. attribute:: skipped @@ -1822,7 +1674,7 @@ Loading and running tests .. method:: addError(test, err) - Called when the test case *test* raises an unexpected exception *err* is a + Called when the test case *test* raises an unexpected exception. *err* is a tuple of the form returned by :func:`sys.exc_info`: ``(type, value, traceback)``. @@ -1893,7 +1745,8 @@ Loading and running tests instead of repeatedly creating new instances. -.. class:: TextTestRunner(stream=None, descriptions=True, verbosity=1, runnerclass=None, warnings=None) +.. class:: TextTestRunner(stream=None, descriptions=True, verbosity=1, failfast=False, \ + buffer=False, resultclass=None, warnings=None) A basic test runner implementation that outputs results to a stream. If *stream* is ``None``, the default, :data:`sys.stderr` is used as the output stream. This class @@ -1901,13 +1754,14 @@ Loading and running tests applications which run test suites should provide alternate implementations. By default this runner shows :exc:`DeprecationWarning`, - :exc:`PendingDeprecationWarning`, and :exc:`ImportWarning` even if they are - :ref:`ignored by default <warning-ignored>`. Deprecation warnings caused by - :ref:`deprecated unittest methods <deprecated-aliases>` are also - special-cased and, when the warning filters are ``'default'`` or ``'always'``, - they will appear only once per-module, in order to avoid too many warning - messages. This behavior can be overridden using the :option:`-Wd` or - :option:`-Wa` options and leaving *warnings* to ``None``. + :exc:`PendingDeprecationWarning`, :exc:`ResourceWarning` and + :exc:`ImportWarning` even if they are :ref:`ignored by default + <warning-ignored>`. Deprecation warnings caused by :ref:`deprecated unittest + methods <deprecated-aliases>` are also special-cased and, when the warning + filters are ``'default'`` or ``'always'``, they will appear only once + per-module, in order to avoid too many warning messages. This behavior can + be overridden using the :option:`-Wd` or :option:`-Wa` options and leaving + *warnings* to ``None``. .. versionchanged:: 3.2 Added the ``warnings`` argument. @@ -1948,6 +1802,10 @@ Loading and running tests if __name__ == '__main__': unittest.main(verbosity=2) + The *defaultTest* argument is the name of the test to run if no test names + are specified via *argv*. If not specified or ``None`` and no test names are + provided via *argv*, all tests found in *module* are run. + The *argv* argument can be a list of options passed to the program, with the first element being the program name. If not specified or ``None``, the values of :data:`sys.argv` are used. diff --git a/Doc/library/urllib.error.rst b/Doc/library/urllib.error.rst index 9c04859..25c13bd 100644 --- a/Doc/library/urllib.error.rst +++ b/Doc/library/urllib.error.rst @@ -8,29 +8,32 @@ The :mod:`urllib.error` module defines the exception classes for exceptions -raised by :mod:`urllib.request`. The base exception class is :exc:`URLError`, -which inherits from :exc:`IOError`. +raised by :mod:`urllib.request`. The base exception class is :exc:`URLError`. The following exceptions are raised by :mod:`urllib.error` as appropriate: .. exception:: URLError The handlers raise this exception (or derived exceptions) when they run into - a problem. It is a subclass of :exc:`IOError`. + a problem. It is a subclass of :exc:`OSError`. .. attribute:: reason The reason for this error. It can be a message string or another - exception instance (:exc:`socket.error` for remote URLs, :exc:`OSError` - for local URLs). + exception instance. + + .. versionchanged:: 3.3 + :exc:`URLError` has been made a subclass of :exc:`OSError` instead + of :exc:`IOError`. .. exception:: HTTPError Though being an exception (a subclass of :exc:`URLError`), an :exc:`HTTPError` can also function as a non-exceptional file-like return - value (the same thing that :func:`urlopen` returns). This is useful when - handling exotic HTTP errors, such as requests for authentication. + value (the same thing that :func:`~urllib.request.urlopen` returns). This + is useful when handling exotic HTTP errors, such as requests for + authentication. .. attribute:: code @@ -45,7 +48,8 @@ The following exceptions are raised by :mod:`urllib.error` as appropriate: .. exception:: ContentTooShortError(msg, content) - This exception is raised when the :func:`urlretrieve` function detects that + This exception is raised when the :func:`~urllib.request.urlretrieve` + function detects that the amount of the downloaded data is less than the expected amount (given by the *Content-Length* header). The :attr:`content` attribute stores the downloaded (and supposedly truncated) data. diff --git a/Doc/library/urllib.parse.rst b/Doc/library/urllib.parse.rst index 61fc471..b951420 100644 --- a/Doc/library/urllib.parse.rst +++ b/Doc/library/urllib.parse.rst @@ -70,7 +70,7 @@ or on combining URL components into a URL string. ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html', params='', query='', fragment='') >>> urlparse('www.cwi.nl/%7Eguido/Python.html') - ParseResult(scheme='', netloc='', path='www.cwi.nl:80/%7Eguido/Python.html', + ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html', params='', query='', fragment='') >>> urlparse('help/Python.html') ParseResult(scheme='', netloc='', path='help/Python.html', params='', @@ -81,8 +81,7 @@ or on combining URL components into a URL string. this argument is the empty string. If the *allow_fragments* argument is false, fragment identifiers are not - allowed, even if the URL's addressing scheme normally does support them. The - default value for this argument is :const:`True`. + allowed. The default value for this argument is :const:`True`. The return value is actually an instance of a subclass of :class:`tuple`. This class has the following additional read-only convenience attributes: @@ -119,6 +118,11 @@ or on combining URL components into a URL string. .. versionchanged:: 3.2 Added IPv6 URL parsing capabilities. + .. versionchanged:: 3.3 + The fragment is now parsed for all URL schemes (unless *allow_fragment* is + false), in accordance with :rfc:`3986`. Previously, a whitelist of + schemes that support fragments existed. + .. function:: parse_qs(qs, keep_blank_values=False, strict_parsing=False, encoding='utf-8', errors='replace') @@ -141,8 +145,9 @@ or on combining URL components into a URL string. percent-encoded sequences into Unicode characters, as accepted by the :meth:`bytes.decode` method. - Use the :func:`urllib.parse.urlencode` function to convert such - dictionaries into query strings. + Use the :func:`urllib.parse.urlencode` function (with the ``doseq`` + parameter set to ``True``) to convert such dictionaries into query + strings. .. versionchanged:: 3.2 @@ -513,8 +518,8 @@ task isn't already covered by the URL parsing functions above. Convert a mapping object or a sequence of two-element tuples, which may either be a :class:`str` or a :class:`bytes`, to a "percent-encoded" string. If the resultant string is to be used as a *data* for POST - operation with :func:`urlopen` function, then it should be properly encoded - to bytes, otherwise it would result in a :exc:`TypeError`. + operation with :func:`~urllib.request.urlopen` function, then it should be + properly encoded to bytes, otherwise it would result in a :exc:`TypeError`. The resulting string is a series of ``key=value`` pairs separated by ``'&'`` characters, where both *key* and *value* are quoted using :func:`quote_plus` diff --git a/Doc/library/urllib.request.rst b/Doc/library/urllib.request.rst index 33244da..6493a29 100644 --- a/Doc/library/urllib.request.rst +++ b/Doc/library/urllib.request.rst @@ -16,7 +16,7 @@ authentication, redirections, cookies and more. The :mod:`urllib.request` module defines the following functions: -.. function:: urlopen(url, data=None[, timeout], *, cafile=None, capath=None) +.. function:: urlopen(url, data=None[, timeout], *, cafile=None, capath=None, cadefault=False) Open the URL *url*, which can be either a string or a :class:`Request` object. @@ -53,9 +53,15 @@ The :mod:`urllib.request` module defines the following functions: point to a directory of hashed certificate files. More information can be found in :meth:`ssl.SSLContext.load_verify_locations`. + The *cadefault* parameter specifies whether to fall back to loading a + default certificate store defined by the underlying OpenSSL library if the + *cafile* and *capath* parameters are omitted. This will only work on + some non-Windows platforms. + .. warning:: - If neither *cafile* nor *capath* is specified, an HTTPS request - will not do any verification of the server's certificate. + If neither *cafile* nor *capath* is specified, and *cadefault* is ``False``, + an HTTPS request will not do any verification of the server's + certificate. For http and https urls, this function returns a :class:`http.client.HTTPResponse` object which has the following @@ -75,14 +81,16 @@ The :mod:`urllib.request` module defines the following functions: * :meth:`~urllib.response.addinfourl.getcode` -- return the HTTP status code of the response. - Raises :exc:`URLError` on errors. + Raises :exc:`~urllib.error.URLError` on errors. Note that ``None`` may be returned if no handler handles the request (though the default installed global :class:`OpenerDirector` uses :class:`UnknownHandler` to ensure this never happens). - In addition, default installed :class:`ProxyHandler` makes sure the requests - are handled through the proxy when they are set. + In addition, if proxy settings are detected (for example, when a ``*_proxy`` + environment variable like :envvar:`http_proxy` is set), + :class:`ProxyHandler` is default installed and makes sure the requests are + handled through the proxy. The legacy ``urllib.urlopen`` function from Python 2.6 and earlier has been discontinued; :func:`urllib.request.urlopen` corresponds to the old @@ -100,6 +108,9 @@ The :mod:`urllib.request` module defines the following functions: .. versionadded:: 3.2 *data* can be an iterable object. + .. versionchanged:: 3.3 + *cadefault* was added. + .. function:: install_opener(opener) Install an :class:`OpenerDirector` instance as the default global opener. @@ -117,10 +128,10 @@ The :mod:`urllib.request` module defines the following functions: subclasses of :class:`BaseHandler` (in which case it must be possible to call the constructor without any parameters). Instances of the following classes will be in front of the *handler*\s, unless the *handler*\s contain them, - instances of them or subclasses of them: :class:`ProxyHandler`, - :class:`UnknownHandler`, :class:`HTTPHandler`, :class:`HTTPDefaultErrorHandler`, - :class:`HTTPRedirectHandler`, :class:`FTPHandler`, :class:`FileHandler`, - :class:`HTTPErrorProcessor`. + instances of them or subclasses of them: :class:`ProxyHandler` (if proxy + settings are detected), :class:`UnknownHandler`, :class:`HTTPHandler`, + :class:`HTTPDefaultErrorHandler`, :class:`HTTPRedirectHandler`, + :class:`FTPHandler`, :class:`FileHandler`, :class:`HTTPErrorProcessor`. If the Python installation has SSL support (i.e., if the :mod:`ssl` module can be imported), :class:`HTTPSHandler` will also be added. @@ -133,14 +144,14 @@ The :mod:`urllib.request` module defines the following functions: Convert the pathname *path* from the local syntax for a path to the form used in the path component of a URL. This does not produce a complete URL. The return - value will already be quoted using the :func:`quote` function. + value will already be quoted using the :func:`~urllib.parse.quote` function. .. function:: url2pathname(path) Convert the path component *path* from a percent-encoded URL to the local syntax for a - path. This does not accept a complete URL. This function uses :func:`unquote` - to decode *path*. + path. This does not accept a complete URL. This function uses + :func:`~urllib.parse.unquote` to decode *path*. .. function:: getproxies() @@ -153,7 +164,7 @@ The :mod:`urllib.request` module defines the following functions: The following classes are provided: -.. class:: Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False) +.. class:: Request(url, data=None, headers={}, origin_req_host=None, unverifiable=False, method=None) This class is an abstraction of a URL request. @@ -200,12 +211,19 @@ The following classes are provided: containing the image. *unverifiable* should indicate whether the request is unverifiable, - as defined by RFC 2965. It defaults to False. An unverifiable + as defined by RFC 2965. It defaults to ``False``. An unverifiable request is one whose URL the user did not have the option to approve. For example, if the request is for an image in an HTML document, and the user had no option to approve the automatic fetching of the image, this should be true. + *method* should be a string that indicates the HTTP request method that + will be used (e.g. ``'HEAD'``). Its value is stored in the + :attr:`~Request.method` attribute and is used by :meth:`get_method()`. + + .. versionchanged:: 3.3 + :attr:`Request.method` argument is added to the Request class. + .. class:: OpenerDirector() @@ -222,7 +240,7 @@ The following classes are provided: .. class:: HTTPDefaultErrorHandler() A class which defines a default handler for HTTP error responses; all responses - are turned into :exc:`HTTPError` exceptions. + are turned into :exc:`~urllib.error.HTTPError` exceptions. .. class:: HTTPRedirectHandler() @@ -238,12 +256,12 @@ The following classes are provided: .. class:: ProxyHandler(proxies=None) Cause requests to go through a proxy. If *proxies* is given, it must be a - dictionary mapping protocol names to URLs of proxies. The default is to read the - list of proxies from the environment variables :envvar:`<protocol>_proxy`. - If no proxy environment variables are set, in a Windows environment, proxy - settings are obtained from the registry's Internet Settings section and in a - Mac OS X environment, proxy information is retrieved from the OS X System - Configuration Framework. + dictionary mapping protocol names to URLs of proxies. The default is to read + the list of proxies from the environment variables + :envvar:`<protocol>_proxy`. If no proxy environment variables are set, then + in a Windows environment proxy settings are obtained from the registry's + Internet Settings section, and in a Mac OS X environment proxy information + is retrieved from the OS X System Configuration Framework. To disable autodetected proxy pass an empty dictionary. @@ -271,10 +289,11 @@ The following classes are provided: .. class:: HTTPBasicAuthHandler(password_mgr=None) - Handle authentication with the remote host. *password_mgr*, if given, should be - something that is compatible with :class:`HTTPPasswordMgr`; refer to section - :ref:`http-password-mgr` for information on the interface that must be - supported. + Handle authentication with the remote host. *password_mgr*, if given, should + be something that is compatible with :class:`HTTPPasswordMgr`; refer to + section :ref:`http-password-mgr` for information on the interface that must + be supported. HTTPBasicAuthHandler will raise a :exc:`ValueError` when + presented with a wrong Authentication scheme. .. class:: ProxyBasicAuthHandler(password_mgr=None) @@ -296,10 +315,19 @@ The following classes are provided: .. class:: HTTPDigestAuthHandler(password_mgr=None) - Handle authentication with the remote host. *password_mgr*, if given, should be - something that is compatible with :class:`HTTPPasswordMgr`; refer to section - :ref:`http-password-mgr` for information on the interface that must be - supported. + Handle authentication with the remote host. *password_mgr*, if given, should + be something that is compatible with :class:`HTTPPasswordMgr`; refer to + section :ref:`http-password-mgr` for information on the interface that must + be supported. When both Digest Authentication Handler and Basic + Authentication Handler are both added, Digest Authentication is always tried + first. If the Digest Authentication returns a 40x response again, it is sent + to Basic Authentication handler to Handle. This Handler method will raise a + :exc:`ValueError` when presented with an authentication scheme other than + Digest or Basic. + + .. versionchanged:: 3.3 + Raise :exc:`ValueError` on unsupported Authentication Scheme. + .. class:: ProxyDigestAuthHandler(password_mgr=None) @@ -390,27 +418,25 @@ request. boolean, indicates whether the request is unverifiable as defined by RFC 2965. -.. method:: Request.add_data(data) - - Set the :class:`Request` data to *data*. This is ignored by all handlers except - HTTP handlers --- and there it should be a byte string, and will change the - request to be ``POST`` rather than ``GET``. - - -.. method:: Request.get_method() - - Return a string indicating the HTTP request method. This is only meaningful for - HTTP requests, and currently always returns ``'GET'`` or ``'POST'``. +.. attribute:: Request.method + The HTTP request method to use. This value is used by + :meth:`~Request.get_method` to override the computed HTTP request + method that would otherwise be returned. This attribute is initialized with + the value of the *method* argument passed to the constructor. -.. method:: Request.has_data() + .. versionadded:: 3.3 - Return whether the instance has a non-\ ``None`` data. +.. method:: Request.get_method() -.. method:: Request.get_data() + Return a string indicating the HTTP request method. If + :attr:`Request.method` is not ``None``, return its value, otherwise return + ``'GET'`` if :attr:`Request.data` is ``None``, or ``'POST'`` if it's not. + This is only meaningful for HTTP requests. - Return the instance's data. + .. versionchanged:: 3.3 + get_method now looks at the value of :attr:`Request.method`. .. method:: Request.add_header(key, val) @@ -440,20 +466,60 @@ request. Return the URL given in the constructor. +.. method:: Request.set_proxy(host, type) + + Prepare the request by connecting to a proxy server. The *host* and *type* will + replace those of the instance, and the instance's selector will be the original + URL given in the constructor. + + +.. method:: Request.add_data(data) + + Set the :class:`Request` data to *data*. This is ignored by all handlers except + HTTP handlers --- and there it should be a byte string, and will change the + request to be ``POST`` rather than ``GET``. Deprecated in 3.3, use + :attr:`Request.data`. + + .. deprecated-removed:: 3.3 3.4 + + +.. method:: Request.has_data() + + Return whether the instance has a non-\ ``None`` data. Deprecated in 3.3, + use :attr:`Request.data`. + + .. deprecated-removed:: 3.3 3.4 + + +.. method:: Request.get_data() + + Return the instance's data. Deprecated in 3.3, use :attr:`Request.data`. + + .. deprecated-removed:: 3.3 3.4 + + .. method:: Request.get_type() - Return the type of the URL --- also known as the scheme. + Return the type of the URL --- also known as the scheme. Deprecated in 3.3, + use :attr:`Request.type`. + + .. deprecated-removed:: 3.3 3.4 .. method:: Request.get_host() - Return the host to which a connection will be made. + Return the host to which a connection will be made. Deprecated in 3.3, use + :attr:`Request.host`. + + .. deprecated-removed:: 3.3 3.4 .. method:: Request.get_selector() Return the selector --- the part of the URL that is sent to the server. + Deprecated in 3.3, use :attr:`Request.selector`. + .. deprecated-removed:: 3.3 3.4 .. method:: Request.get_header(header_name, default=None) @@ -468,21 +534,22 @@ request. .. method:: Request.set_proxy(host, type) - Prepare the request by connecting to a proxy server. The *host* and *type* will - replace those of the instance, and the instance's selector will be the original - URL given in the constructor. - - .. method:: Request.get_origin_req_host() - Return the request-host of the origin transaction, as defined by :rfc:`2965`. - See the documentation for the :class:`Request` constructor. + Return the request-host of the origin transaction, as defined by + :rfc:`2965`. See the documentation for the :class:`Request` constructor. + Deprecated in 3.3, use :attr:`Request.origin_req_host`. + + .. deprecated-removed:: 3.3 3.4 .. method:: Request.is_unverifiable() Return whether the request is unverifiable, as defined by RFC 2965. See the - documentation for the :class:`Request` constructor. + documentation for the :class:`Request` constructor. Deprecated in 3.3, use + :attr:`Request.unverifiable`. + + .. deprecated-removed:: 3.3 3.4 .. _opener-director-objects: @@ -547,8 +614,8 @@ sorting the handler instances. #. Handlers with a method named like :meth:`protocol_open` are called to handle the request. This stage ends when a handler either returns a non-\ :const:`None` - value (ie. a response), or raises an exception (usually :exc:`URLError`). - Exceptions are allowed to propagate. + value (ie. a response), or raises an exception (usually + :exc:`~urllib.error.URLError`). Exceptions are allowed to propagate. In fact, the above algorithm is first tried for methods named :meth:`default_open`. If all such methods return :const:`None`, the algorithm @@ -607,8 +674,9 @@ The following attribute and methods should only be used by classes derived from This method, if implemented, will be called by the parent :class:`OpenerDirector`. It should return a file-like object as described in the return value of the :meth:`open` of :class:`OpenerDirector`, or ``None``. - It should raise :exc:`URLError`, unless a truly exceptional thing happens (for - example, :exc:`MemoryError` should not be mapped to :exc:`URLError`). + It should raise :exc:`~urllib.error.URLError`, unless a truly exceptional + thing happens (for example, :exc:`MemoryError` should not be mapped to + :exc:`URLError`). This method will be called before any protocol-specific open method. @@ -694,8 +762,8 @@ HTTPRedirectHandler Objects .. note:: Some HTTP redirections require action from this module's client code. If this - is the case, :exc:`HTTPError` is raised. See :rfc:`2616` for details of the - precise meanings of the various redirection codes. + is the case, :exc:`~urllib.error.HTTPError` is raised. See :rfc:`2616` for + details of the precise meanings of the various redirection codes. An :class:`HTTPError` exception raised as a security consideration if the HTTPRedirectHandler is presented with a redirected url which is not an HTTP, @@ -708,9 +776,9 @@ HTTPRedirectHandler Objects by the default implementations of the :meth:`http_error_30\*` methods when a redirection is received from the server. If a redirection should take place, return a new :class:`Request` to allow :meth:`http_error_30\*` to perform the - redirect to *newurl*. Otherwise, raise :exc:`HTTPError` if no other handler - should try to handle this URL, or return ``None`` if you can't but another - handler might. + redirect to *newurl*. Otherwise, raise :exc:`~urllib.error.HTTPError` if + no other handler should try to handle this URL, or return ``None`` if you + can't but another handler might. .. note:: @@ -912,7 +980,7 @@ FileHandler Objects .. versionchanged:: 3.2 This method is applicable only for local hostnames. When a remote - hostname is given, an :exc:`URLError` is raised. + hostname is given, an :exc:`~urllib.error.URLError` is raised. .. _ftp-handler-objects: @@ -954,7 +1022,7 @@ UnknownHandler Objects .. method:: UnknownHandler.unknown_open() - Raise a :exc:`URLError` exception. + Raise a :exc:`~urllib.error.URLError` exception. .. _http-error-processor-objects: @@ -971,7 +1039,7 @@ HTTPErrorProcessor Objects For non-200 error codes, this simply passes the job on to the :meth:`protocol_error_code` handler methods, via :meth:`OpenerDirector.error`. Eventually, :class:`HTTPDefaultErrorHandler` will raise an - :exc:`HTTPError` if no other handler handles the error. + :exc:`~urllib.error.HTTPError` if no other handler handles the error. .. method:: HTTPErrorProcessor.https_response() @@ -1004,7 +1072,7 @@ it receives from the http server. In general, a program will decode the returned bytes object to string once it determines or guesses the appropriate encoding. -The following W3C document, http://www.w3.org/International/O-charset , lists +The following W3C document, http://www.w3.org/International/O-charset\ , lists the various ways in which a (X)HTML or a XML document could have specified its encoding information. @@ -1044,6 +1112,15 @@ The code for the sample CGI used in the above example is:: data = sys.stdin.read() print('Content-type: text-plain\n\nGot Data: "%s"' % data) +Here is an example of doing a ``PUT`` request using :class:`Request`:: + + import urllib.request + DATA=b'some data' + req = urllib.request.Request(url='http://localhost:8080', data=DATA,method='PUT') + f = urllib.request.urlopen(req) + print(f.status) + print(f.reason) + Use of Basic HTTP Authentication:: import urllib.request @@ -1146,16 +1223,14 @@ The following functions and classes are ported from the Python 2 module ``urllib`` (as opposed to ``urllib2``). They might become deprecated at some point in the future. - .. function:: urlretrieve(url, filename=None, reporthook=None, data=None) - Copy a network object denoted by a URL to a local file, if necessary. If the URL - points to a local file, or a valid cached copy of the object exists, the object - is not copied. Return a tuple ``(filename, headers)`` where *filename* is the + Copy a network object denoted by a URL to a local file. If the URL + points to a local file, the object will not be copied unless filename is supplied. + Return a tuple ``(filename, headers)`` where *filename* is the local file name under which the object can be found, and *headers* is whatever the :meth:`info` method of the object returned by :func:`urlopen` returned (for - a remote object, possibly cached). Exceptions are the same as for - :func:`urlopen`. + a remote object). Exceptions are the same as for :func:`urlopen`. The second argument, if present, specifies the file location to copy to (if absent, the location will be a tempfile with a generated name). The third @@ -1166,11 +1241,18 @@ some point in the future. third argument may be ``-1`` on older FTP servers which do not return a file size in response to a retrieval request. + The following example illustrates the most common usage scenario:: + + >>> import urllib.request + >>> local_filename, headers = urllib.request.urlretrieve('http://python.org/') + >>> html = open(local_filename) + >>> html.close() + If the *url* uses the :file:`http:` scheme identifier, the optional *data* argument may be given to specify a ``POST`` request (normally the request type is ``GET``). The *data* argument must be a bytes object in standard :mimetype:`application/x-www-form-urlencoded` format; see the - :func:`urlencode` function below. + :func:`urllib.parse.urlencode` function. :func:`urlretrieve` will raise :exc:`ContentTooShortError` when it detects that the amount of data available was less than the expected amount (which is the @@ -1178,23 +1260,25 @@ some point in the future. the download is interrupted. The *Content-Length* is treated as a lower bound: if there's more data to read, - :func:`urlretrieve` reads more data, but if less data is available, it raises - the exception. + urlretrieve reads more data, but if less data is available, it raises the + exception. You can still retrieve the downloaded data in this case, it is stored in the :attr:`content` attribute of the exception instance. - If no *Content-Length* header was supplied, :func:`urlretrieve` can not check - the size of the data it has downloaded, and just returns it. In this case - you just have to assume that the download was successful. + If no *Content-Length* header was supplied, urlretrieve can not check the size + of the data it has downloaded, and just returns it. In this case you just have + to assume that the download was successful. .. function:: urlcleanup() - Clear the cache that may have been built up by previous calls to - :func:`urlretrieve`. + Cleans up temporary files that may have been left behind by previous + calls to :func:`urlretrieve`. .. class:: URLopener(proxies=None, **x509) + .. deprecated:: 3.3 + Base class for opening and reading URLs. Unless you need to support opening objects using schemes other than :file:`http:`, :file:`ftp:`, or :file:`file:`, you probably want to use :class:`FancyURLopener`. @@ -1215,7 +1299,7 @@ some point in the future. *key_file* and *cert_file* are supported to provide an SSL key and certificate; both are needed to support client authentication. - :class:`URLopener` objects will raise an :exc:`IOError` exception if the server + :class:`URLopener` objects will raise an :exc:`OSError` exception if the server returns an error code. .. method:: open(fullurl, data=None) @@ -1243,14 +1327,15 @@ some point in the future. *filename* is not given, the filename is the output of :func:`tempfile.mktemp` with a suffix that matches the suffix of the last path component of the input URL. If *reporthook* is given, it must be a function accepting three numeric - parameters. It will be called after each chunk of data is read from the + parameters: A chunk number, the maximum size chunks are read in and the total size of the download + (-1 if unknown). It will be called once at the start and after each chunk of data is read from the network. *reporthook* is ignored for local URLs. If the *url* uses the :file:`http:` scheme identifier, the optional *data* argument may be given to specify a ``POST`` request (normally the request type is ``GET``). The *data* argument must in standard - :mimetype:`application/x-www-form-urlencoded` format; see the :func:`urlencode` - function below. + :mimetype:`application/x-www-form-urlencoded` format; see the + :func:`urllib.parse.urlencode` function. .. attribute:: version @@ -1263,6 +1348,8 @@ some point in the future. .. class:: FancyURLopener(...) + .. deprecated:: 3.3 + :class:`FancyURLopener` subclasses :class:`URLopener` providing default handling for the following HTTP response codes: 301, 302, 303, 307 and 401. For the 30x response codes listed above, the :mailheader:`Location` header is used to fetch diff --git a/Doc/library/urllib.rst b/Doc/library/urllib.rst new file mode 100644 index 0000000..8e308bc --- /dev/null +++ b/Doc/library/urllib.rst @@ -0,0 +1,11 @@ +:mod:`urllib` --- URL handling modules +====================================== + +.. module:: urllib + +``urllib`` is a package that collects several modules for working with URLs: + +* :mod:`urllib.request` for opening and reading URLs +* :mod:`urllib.error` containing the exceptions raised by :mod:`urllib.request` +* :mod:`urllib.parse` for parsing URLs +* :mod:`urllib.robotparser` for parsing ``robots.txt`` files diff --git a/Doc/library/venv.rst b/Doc/library/venv.rst new file mode 100644 index 0000000..be0cad4 --- /dev/null +++ b/Doc/library/venv.rst @@ -0,0 +1,426 @@ +:mod:`venv` --- Creation of virtual environments +================================================ + +.. module:: venv + :synopsis: Creation of virtual environments. +.. moduleauthor:: Vinay Sajip <vinay_sajip@yahoo.co.uk> +.. sectionauthor:: Vinay Sajip <vinay_sajip@yahoo.co.uk> + + +.. index:: pair: Environments; virtual + +.. versionadded:: 3.3 + +**Source code:** :source:`Lib/venv` + +-------------- + +The :mod:`venv` module provides support for creating lightweight "virtual +environments" with their own site directories, optionally isolated from system +site directories. Each virtual environment has its own Python binary (allowing +creation of environments with various Python versions) and can have its own +independent set of installed Python packages in its site directories. + +See :pep:`405` for more information about Python virtual environments. + +Creating virtual environments +----------------------------- + +.. include:: /using/venv-create.inc + + +.. _venv-def: + +.. note:: A virtual environment (also called a ``venv``) is a Python + environment such that the Python interpreter, libraries and scripts + installed into it are isolated from those installed in other virtual + environments, and (by default) any libraries installed in a "system" Python, + i.e. one which is installed as part of your operating system. + + A venv is a directory tree which contains Python executable files and + other files which indicate that it is a venv. + + Common installation tools such as ``Setuptools`` and ``pip`` work as + expected with venvs - i.e. when a venv is active, they install Python + packages into the venv without needing to be told to do so explicitly. + Of course, you need to install them into the venv first: this could be + done by running ``ez_setup.py`` with the venv activated, + followed by running ``easy_install pip``. Alternatively, you could download + the source tarballs and run ``python setup.py install`` after unpacking, + with the venv activated. + + When a venv is active (i.e. the venv's Python interpreter is running), the + attributes :attr:`sys.prefix` and :attr:`sys.exec_prefix` point to the base + directory of the venv, whereas :attr:`sys.base_prefix` and + :attr:`sys.base_exec_prefix` point to the non-venv Python installation + which was used to create the venv. If a venv is not active, then + :attr:`sys.prefix` is the same as :attr:`sys.base_prefix` and + :attr:`sys.exec_prefix` is the same as :attr:`sys.base_exec_prefix` (they + all point to a non-venv Python installation). + + When a venv is active, any options that change the installation path will be + ignored from all distutils configuration files to prevent projects being + inadvertently installed outside of the virtual environment. + + When working in a command shell, users can make a venv active by running an + ``activate`` script in the venv's executables directory (the precise filename + is shell-dependent), which prepends the venv's directory for executables to + the ``PATH`` environment variable for the running shell. There should be no + need in other circumstances to activate a venv -- scripts installed into + venvs have a shebang line which points to the venv's Python interpreter. This + means that the script will run with that interpreter regardless of the value + of ``PATH``. On Windows, shebang line processing is supported if you have the + Python Launcher for Windows installed (this was added to Python in 3.3 - see + :pep:`397` for more details). Thus, double-clicking an installed script in + a Windows Explorer window should run the script with the correct interpreter + without there needing to be any reference to its venv in ``PATH``. + + +API +--- + +.. highlight:: python + +The high-level method described above makes use of a simple API which provides +mechanisms for third-party virtual environment creators to customize environment +creation according to their needs, the :class:`EnvBuilder` class. + +.. class:: EnvBuilder(system_site_packages=False, clear=False, symlinks=False, upgrade=False) + + The :class:`EnvBuilder` class accepts the following keyword arguments on + instantiation: + + * ``system_site_packages`` -- a Boolean value indicating that the system Python + site-packages should be available to the environment (defaults to ``False``). + + * ``clear`` -- a Boolean value which, if true, will delete any existing target + directory instead of raising an exception (defaults to ``False``). + + * ``symlinks`` -- a Boolean value indicating whether to attempt to symlink the + Python binary (and any necessary DLLs or other binaries, + e.g. ``pythonw.exe``), rather than copying. Defaults to ``True`` on Linux and + Unix systems, but ``False`` on Windows. + + * ``upgrade`` -- a Boolean value which, if true, will upgrade an existing + environment with the running Python - for use when that Python has been + upgraded in-place (defaults to ``False``). + + + + Creators of third-party virtual environment tools will be free to use the + provided ``EnvBuilder`` class as a base class. + + The returned env-builder is an object which has a method, ``create``: + + .. method:: create(env_dir) + + This method takes as required argument the path (absolute or relative to + the current directory) of the target directory which is to contain the + virtual environment. The ``create`` method will either create the + environment in the specified directory, or raise an appropriate + exception. + + The ``create`` method of the ``EnvBuilder`` class illustrates the hooks + available for subclass customization:: + + def create(self, env_dir): + """ + Create a virtualized Python environment in a directory. + env_dir is the target directory to create an environment in. + """ + env_dir = os.path.abspath(env_dir) + context = self.ensure_directories(env_dir) + self.create_configuration(context) + self.setup_python(context) + self.setup_scripts(context) + self.post_setup(context) + + Each of the methods :meth:`ensure_directories`, + :meth:`create_configuration`, :meth:`setup_python`, + :meth:`setup_scripts` and :meth:`post_setup` can be overridden. + + .. method:: ensure_directories(env_dir) + + Creates the environment directory and all necessary directories, and + returns a context object. This is just a holder for attributes (such as + paths), for use by the other methods. The directories are allowed to + exist already, as long as either ``clear`` or ``upgrade`` were + specified to allow operating on an existing environment directory. + + .. method:: create_configuration(context) + + Creates the ``pyvenv.cfg`` configuration file in the environment. + + .. method:: setup_python(context) + + Creates a copy of the Python executable (and, under Windows, DLLs) in + the environment. On a POSIX system, if a specific executable + ``python3.x`` was used, symlinks to ``python`` and ``python3`` will be + created pointing to that executable, unless files with those names + already exist. + + .. method:: setup_scripts(context) + + Installs activation scripts appropriate to the platform into the virtual + environment. + + .. method:: post_setup(context) + + A placeholder method which can be overridden in third party + implementations to pre-install packages in the virtual environment or + perform other post-creation steps. + + In addition, :class:`EnvBuilder` provides this utility method that can be + called from :meth:`setup_scripts` or :meth:`post_setup` in subclasses to + assist in installing custom scripts into the virtual environment. + + .. method:: install_scripts(context, path) + + *path* is the path to a directory that should contain subdirectories + "common", "posix", "nt", each containing scripts destined for the bin + directory in the environment. The contents of "common" and the + directory corresponding to :data:`os.name` are copied after some text + replacement of placeholders: + + * ``__VENV_DIR__`` is replaced with the absolute path of the environment + directory. + + * ``__VENV_NAME__`` is replaced with the environment name (final path + segment of environment directory). + + * ``__VENV_BIN_NAME__`` is replaced with the name of the bin directory + (either ``bin`` or ``Scripts``). + + * ``__VENV_PYTHON__`` is replaced with the absolute path of the + environment's executable. + + The directories are allowed to exist (for when an existing environment + is being upgraded). + +There is also a module-level convenience function: + +.. function:: create(env_dir, system_site_packages=False, clear=False, symlinks=False) + + Create an :class:`EnvBuilder` with the given keyword arguments, and call its + :meth:`~EnvBuilder.create` method with the *env_dir* argument. + +An example of extending ``EnvBuilder`` +-------------------------------------- + +The following script shows how to extend :class:`EnvBuilder` by implementing a +subclass which installs setuptools and pip into a created venv:: + + import os + import os.path + from subprocess import Popen, PIPE + import sys + from threading import Thread + from urllib.parse import urlparse + from urllib.request import urlretrieve + import venv + + class ExtendedEnvBuilder(venv.EnvBuilder): + """ + This builder installs setuptools and pip so that you can pip or + easy_install other packages into the created environment. + + :param nodist: If True, setuptools and pip are not installed into the + created environment. + :param nopip: If True, pip is not installed into the created + environment. + :param progress: If setuptools or pip are installed, the progress of the + installation can be monitored by passing a progress + callable. If specified, it is called with two + arguments: a string indicating some progress, and a + context indicating where the string is coming from. + The context argument can have one of three values: + 'main', indicating that it is called from virtualize() + itself, and 'stdout' and 'stderr', which are obtained + by reading lines from the output streams of a subprocess + which is used to install the app. + + If a callable is not specified, default progress + information is output to sys.stderr. + """ + + def __init__(self, *args, **kwargs): + self.nodist = kwargs.pop('nodist', False) + self.nopip = kwargs.pop('nopip', False) + self.progress = kwargs.pop('progress', None) + self.verbose = kwargs.pop('verbose', False) + super().__init__(*args, **kwargs) + + def post_setup(self, context): + """ + Set up any packages which need to be pre-installed into the + environment being created. + + :param context: The information for the environment creation request + being processed. + """ + os.environ['VIRTUAL_ENV'] = context.env_dir + if not self.nodist: + self.install_setuptools(context) + # Can't install pip without setuptools + if not self.nopip and not self.nodist: + self.install_pip(context) + + def reader(self, stream, context): + """ + Read lines from a subprocess' output stream and either pass to a progress + callable (if specified) or write progress information to sys.stderr. + """ + progress = self.progress + while True: + s = stream.readline() + if not s: + break + if progress is not None: + progress(s, context) + else: + if not self.verbose: + sys.stderr.write('.') + else: + sys.stderr.write(s.decode('utf-8')) + sys.stderr.flush() + stream.close() + + def install_script(self, context, name, url): + _, _, path, _, _, _ = urlparse(url) + fn = os.path.split(path)[-1] + binpath = context.bin_path + distpath = os.path.join(binpath, fn) + # Download script into the env's binaries folder + urlretrieve(url, distpath) + progress = self.progress + if self.verbose: + term = '\n' + else: + term = '' + if progress is not None: + progress('Installing %s ...%s' % (name, term), 'main') + else: + sys.stderr.write('Installing %s ...%s' % (name, term)) + sys.stderr.flush() + # Install in the env + args = [context.env_exe, fn] + p = Popen(args, stdout=PIPE, stderr=PIPE, cwd=binpath) + t1 = Thread(target=self.reader, args=(p.stdout, 'stdout')) + t1.start() + t2 = Thread(target=self.reader, args=(p.stderr, 'stderr')) + t2.start() + p.wait() + t1.join() + t2.join() + if progress is not None: + progress('done.', 'main') + else: + sys.stderr.write('done.\n') + # Clean up - no longer needed + os.unlink(distpath) + + def install_setuptools(self, context): + """ + Install setuptools in the environment. + + :param context: The information for the environment creation request + being processed. + """ + url = 'https://bitbucket.org/pypa/setuptools/downloads/ez_setup.py' + self.install_script(context, 'setuptools', url) + # clear up the setuptools archive which gets downloaded + pred = lambda o: o.startswith('setuptools-') and o.endswith('.tar.gz') + files = filter(pred, os.listdir(context.bin_path)) + for f in files: + f = os.path.join(context.bin_path, f) + os.unlink(f) + + def install_pip(self, context): + """ + Install pip in the environment. + + :param context: The information for the environment creation request + being processed. + """ + url = 'https://raw.github.com/pypa/pip/master/contrib/get-pip.py' + self.install_script(context, 'pip', url) + + def main(args=None): + compatible = True + if sys.version_info < (3, 3): + compatible = False + elif not hasattr(sys, 'base_prefix'): + compatible = False + if not compatible: + raise ValueError('This script is only for use with ' + 'Python 3.3 or later') + else: + import argparse + + parser = argparse.ArgumentParser(prog=__name__, + description='Creates virtual Python ' + 'environments in one or ' + 'more target ' + 'directories.') + parser.add_argument('dirs', metavar='ENV_DIR', nargs='+', + help='A directory to create the environment in.') + parser.add_argument('--no-setuptools', default=False, + action='store_true', dest='nodist', + help="Don't install setuptools or pip in the " + "virtual environment.") + parser.add_argument('--no-pip', default=False, + action='store_true', dest='nopip', + help="Don't install pip in the virtual " + "environment.") + parser.add_argument('--system-site-packages', default=False, + action='store_true', dest='system_site', + help='Give the virtual environment access to the ' + 'system site-packages dir.') + if os.name == 'nt': + use_symlinks = False + else: + use_symlinks = True + parser.add_argument('--symlinks', default=use_symlinks, + action='store_true', dest='symlinks', + help='Try to use symlinks rather than copies, ' + 'when symlinks are not the default for ' + 'the platform.') + parser.add_argument('--clear', default=False, action='store_true', + dest='clear', help='Delete the contents of the ' + 'environment directory if it ' + 'already exists, before ' + 'environment creation.') + parser.add_argument('--upgrade', default=False, action='store_true', + dest='upgrade', help='Upgrade the environment ' + 'directory to use this version ' + 'of Python, assuming Python ' + 'has been upgraded in-place.') + parser.add_argument('--verbose', default=False, action='store_true', + dest='verbose', help='Display the output ' + 'from the scripts which ' + 'install setuptools and pip.') + options = parser.parse_args(args) + if options.upgrade and options.clear: + raise ValueError('you cannot supply --upgrade and --clear together.') + builder = ExtendedEnvBuilder(system_site_packages=options.system_site, + clear=options.clear, + symlinks=options.symlinks, + upgrade=options.upgrade, + nodist=options.nodist, + nopip=options.nopip, + verbose=options.verbose) + for d in options.dirs: + builder.create(d) + + if __name__ == '__main__': + rc = 1 + try: + main() + rc = 0 + except Exception as e: + print('Error: %s' % e, file=sys.stderr) + sys.exit(rc) + + +This script is also available for download `online +<https://gist.github.com/4673395>`_. diff --git a/Doc/library/warnings.rst b/Doc/library/warnings.rst index 8af19a2..8a538ad 100644 --- a/Doc/library/warnings.rst +++ b/Doc/library/warnings.rst @@ -54,6 +54,8 @@ There are a number of built-in exceptions that represent warning categories. This categorization is useful to be able to filter out groups of warnings. The following warnings category classes are currently defined: +.. tabularcolumns:: |l|p{0.6\linewidth}| + +----------------------------------+-----------------------------------------------+ | Class | Description | +==================================+===============================================+ @@ -87,7 +89,7 @@ following warnings category classes are currently defined: | | Unicode. | +----------------------------------+-----------------------------------------------+ | :exc:`BytesWarning` | Base category for warnings related to | -| | :class:`bytes` and :class:`buffer`. | +| | :class:`bytes` and :class:`bytearray`. | +----------------------------------+-----------------------------------------------+ | :exc:`ResourceWarning` | Base category for warnings related to | | | resource usage. | @@ -339,8 +341,7 @@ Available Functions Write a warning to a file. The default implementation calls ``formatwarning(message, category, filename, lineno, line)`` and writes the resulting string to *file*, which defaults to ``sys.stderr``. You may replace - this function with an alternative implementation by assigning to - ``warnings.showwarning``. + this function with any callable by assigning to ``warnings.showwarning``. *line* is a line of source code to be included in the warning message; if *line* is not supplied, :func:`showwarning` will try to read the line specified by *filename* and *lineno*. diff --git a/Doc/library/webbrowser.rst b/Doc/library/webbrowser.rst index b2dcf8f..9c2b3ab 100644 --- a/Doc/library/webbrowser.rst +++ b/Doc/library/webbrowser.rst @@ -98,47 +98,55 @@ A number of browser types are predefined. This table gives the type names that may be passed to the :func:`get` function and the corresponding instantiations for the controller classes, all defined in this module. -+-----------------------+-----------------------------------------+-------+ -| Type Name | Class Name | Notes | -+=======================+=========================================+=======+ -| ``'mozilla'`` | :class:`Mozilla('mozilla')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'firefox'`` | :class:`Mozilla('mozilla')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'netscape'`` | :class:`Mozilla('netscape')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'galeon'`` | :class:`Galeon('galeon')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'epiphany'`` | :class:`Galeon('epiphany')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'skipstone'`` | :class:`BackgroundBrowser('skipstone')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'kfmclient'`` | :class:`Konqueror()` | \(1) | -+-----------------------+-----------------------------------------+-------+ -| ``'konqueror'`` | :class:`Konqueror()` | \(1) | -+-----------------------+-----------------------------------------+-------+ -| ``'kfm'`` | :class:`Konqueror()` | \(1) | -+-----------------------+-----------------------------------------+-------+ -| ``'mosaic'`` | :class:`BackgroundBrowser('mosaic')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'opera'`` | :class:`Opera()` | | -+-----------------------+-----------------------------------------+-------+ -| ``'grail'`` | :class:`Grail()` | | -+-----------------------+-----------------------------------------+-------+ -| ``'links'`` | :class:`GenericBrowser('links')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'elinks'`` | :class:`Elinks('elinks')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'lynx'`` | :class:`GenericBrowser('lynx')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'w3m'`` | :class:`GenericBrowser('w3m')` | | -+-----------------------+-----------------------------------------+-------+ -| ``'windows-default'`` | :class:`WindowsDefault` | \(2) | -+-----------------------+-----------------------------------------+-------+ -| ``'macosx'`` | :class:`MacOSX('default')` | \(3) | -+-----------------------+-----------------------------------------+-------+ -| ``'safari'`` | :class:`MacOSX('safari')` | \(3) | -+-----------------------+-----------------------------------------+-------+ ++------------------------+-----------------------------------------+-------+ +| Type Name | Class Name | Notes | ++========================+=========================================+=======+ +| ``'mozilla'`` | :class:`Mozilla('mozilla')` | | ++------------------------+-----------------------------------------+-------+ +| ``'firefox'`` | :class:`Mozilla('mozilla')` | | ++------------------------+-----------------------------------------+-------+ +| ``'netscape'`` | :class:`Mozilla('netscape')` | | ++------------------------+-----------------------------------------+-------+ +| ``'galeon'`` | :class:`Galeon('galeon')` | | ++------------------------+-----------------------------------------+-------+ +| ``'epiphany'`` | :class:`Galeon('epiphany')` | | ++------------------------+-----------------------------------------+-------+ +| ``'skipstone'`` | :class:`BackgroundBrowser('skipstone')` | | ++------------------------+-----------------------------------------+-------+ +| ``'kfmclient'`` | :class:`Konqueror()` | \(1) | ++------------------------+-----------------------------------------+-------+ +| ``'konqueror'`` | :class:`Konqueror()` | \(1) | ++------------------------+-----------------------------------------+-------+ +| ``'kfm'`` | :class:`Konqueror()` | \(1) | ++------------------------+-----------------------------------------+-------+ +| ``'mosaic'`` | :class:`BackgroundBrowser('mosaic')` | | ++------------------------+-----------------------------------------+-------+ +| ``'opera'`` | :class:`Opera()` | | ++------------------------+-----------------------------------------+-------+ +| ``'grail'`` | :class:`Grail()` | | ++------------------------+-----------------------------------------+-------+ +| ``'links'`` | :class:`GenericBrowser('links')` | | ++------------------------+-----------------------------------------+-------+ +| ``'elinks'`` | :class:`Elinks('elinks')` | | ++------------------------+-----------------------------------------+-------+ +| ``'lynx'`` | :class:`GenericBrowser('lynx')` | | ++------------------------+-----------------------------------------+-------+ +| ``'w3m'`` | :class:`GenericBrowser('w3m')` | | ++------------------------+-----------------------------------------+-------+ +| ``'windows-default'`` | :class:`WindowsDefault` | \(2) | ++------------------------+-----------------------------------------+-------+ +| ``'macosx'`` | :class:`MacOSX('default')` | \(3) | ++------------------------+-----------------------------------------+-------+ +| ``'safari'`` | :class:`MacOSX('safari')` | \(3) | ++------------------------+-----------------------------------------+-------+ +| ``'google-chrome'`` | :class:`Chrome('google-chrome')` | | ++------------------------+-----------------------------------------+-------+ +| ``'chrome'`` | :class:`Chrome('chrome')` | | ++------------------------+-----------------------------------------+-------+ +| ``'chromium'`` | :class:`Chromium('chromium')` | | ++------------------------+-----------------------------------------+-------+ +| ``'chromium-browser'`` | :class:`Chromium('chromium-browser')` | | ++------------------------+-----------------------------------------+-------+ Notes: @@ -155,12 +163,15 @@ Notes: (3) Only on Mac OS X platform. +.. versionadded:: 3.3 + Support for Chrome/Chromium has been added. + Here are some simple examples:: - url = 'http://www.python.org/' + url = 'http://docs.python.org/' # Open URL in a new tab, if a browser window is already open. - webbrowser.open_new_tab(url + 'doc/') + webbrowser.open_new_tab(url) # Open URL in new window, raising the window if possible. webbrowser.open_new(url) diff --git a/Doc/library/winreg.rst b/Doc/library/winreg.rst index cdc07b1..06bcb86 100644 --- a/Doc/library/winreg.rst +++ b/Doc/library/winreg.rst @@ -12,6 +12,17 @@ integer as the registry handle, a :ref:`handle object <handle-object>` is used to ensure that the handles are closed correctly, even if the programmer neglects to explicitly close them. +.. _exception-changed: + +.. versionchanged:: 3.3 + Several functions in this module used to raise a + :exc:`WindowsError`, which is now an alias of :exc:`OSError`. + +.. _functions: + +Functions +------------------ + This module offers the following functions: @@ -37,8 +48,11 @@ This module offers the following functions: *key* is the predefined handle to connect to. - The return value is the handle of the opened key. If the function fails, a - :exc:`WindowsError` exception is raised. + The return value is the handle of the opened key. If the function fails, an + :exc:`OSError` exception is raised. + + .. versionchanged:: 3.3 + See :ref:`above <exception-changed>`. .. function:: CreateKey(key, sub_key) @@ -56,8 +70,11 @@ This module offers the following functions: If the key already exists, this function opens the existing key. - The return value is the handle of the opened key. If the function fails, a - :exc:`WindowsError` exception is raised. + The return value is the handle of the opened key. If the function fails, an + :exc:`OSError` exception is raised. + + .. versionchanged:: 3.3 + See :ref:`above <exception-changed>`. .. function:: CreateKeyEx(key, sub_key, reserved=0, access=KEY_WRITE) @@ -81,11 +98,14 @@ This module offers the following functions: If the key already exists, this function opens the existing key. - The return value is the handle of the opened key. If the function fails, a - :exc:`WindowsError` exception is raised. + The return value is the handle of the opened key. If the function fails, an + :exc:`OSError` exception is raised. .. versionadded:: 3.2 + .. versionchanged:: 3.3 + See :ref:`above <exception-changed>`. + .. function:: DeleteKey(key, sub_key) @@ -100,7 +120,10 @@ This module offers the following functions: *This method can not delete keys with subkeys.* If the method succeeds, the entire key, including all of its values, is removed. - If the method fails, a :exc:`WindowsError` exception is raised. + If the method fails, an :exc:`OSError` exception is raised. + + .. versionchanged:: 3.3 + See :ref:`above <exception-changed>`. .. function:: DeleteKeyEx(key, sub_key, access=KEY_WOW64_64KEY, reserved=0) @@ -123,18 +146,21 @@ This module offers the following functions: *reserved* is a reserved integer, and must be zero. The default is zero. *access* is an integer that specifies an access mask that describes the desired - security access for the key. Default is :const:`KEY_ALL_ACCESS`. See + security access for the key. Default is :const:`KEY_WOW64_64KEY`. See :ref:`Access Rights <access-rights>` for other allowed values. *This method can not delete keys with subkeys.* If the method succeeds, the entire key, including all of its values, is - removed. If the method fails, a :exc:`WindowsError` exception is raised. + removed. If the method fails, an :exc:`OSError` exception is raised. On unsupported Windows versions, :exc:`NotImplementedError` is raised. .. versionadded:: 3.2 + .. versionchanged:: 3.3 + See :ref:`above <exception-changed>`. + .. function:: DeleteValue(key, value) @@ -156,9 +182,12 @@ This module offers the following functions: *index* is an integer that identifies the index of the key to retrieve. The function retrieves the name of one subkey each time it is called. It is - typically called repeatedly until a :exc:`WindowsError` exception is + typically called repeatedly until an :exc:`OSError` exception is raised, indicating, no more values are available. + .. versionchanged:: 3.3 + See :ref:`above <exception-changed>`. + .. function:: EnumValue(key, index) @@ -170,7 +199,7 @@ This module offers the following functions: *index* is an integer that identifies the index of the value to retrieve. The function retrieves the name of one subkey each time it is called. It is - typically called repeatedly, until a :exc:`WindowsError` exception is + typically called repeatedly, until an :exc:`OSError` exception is raised, indicating no more values. The result is a tuple of 3 items: @@ -189,6 +218,9 @@ This module offers the following functions: | | :meth:`SetValueEx`) | +-------+--------------------------------------------+ + .. versionchanged:: 3.3 + See :ref:`above <exception-changed>`. + .. function:: ExpandEnvironmentStrings(str) @@ -261,9 +293,13 @@ This module offers the following functions: The result is a new handle to the specified key. - If the function fails, :exc:`WindowsError` is raised. + If the function fails, :exc:`OSError` is raised. + + .. versionchanged:: 3.2 + Allow the use of named arguments. - .. versionchanged:: 3.2 Allow the use of named arguments. + .. versionchanged:: 3.3 + See :ref:`above <exception-changed>`. .. function:: QueryInfoKey(key) diff --git a/Doc/library/winsound.rst b/Doc/library/winsound.rst index 8356062..61f34cd 100644 --- a/Doc/library/winsound.rst +++ b/Doc/library/winsound.rst @@ -126,6 +126,10 @@ provided by Windows platforms. It includes functions and several constants. Return immediately if the sound driver is busy. + .. note:: + + This flag is not supported on modern Windows platforms. + .. data:: MB_ICONASTERISK diff --git a/Doc/library/wsgiref.rst b/Doc/library/wsgiref.rst index 1fd3451..1cef2e9 100644 --- a/Doc/library/wsgiref.rst +++ b/Doc/library/wsgiref.rst @@ -609,6 +609,9 @@ input, output, and error streams. as :class:`BaseCGIHandler` and :class:`CGIHandler`) that are not HTTP origin servers. + .. versionchanged:: 3.3 + The term "Python" is replaced with implementation specific term like + "CPython", "Jython" etc. .. method:: BaseHandler.get_scheme() diff --git a/Doc/library/xml.dom.minidom.rst b/Doc/library/xml.dom.minidom.rst index 30182e4..89c660a 100644 --- a/Doc/library/xml.dom.minidom.rst +++ b/Doc/library/xml.dom.minidom.rst @@ -55,7 +55,7 @@ instead: .. function:: parseString(string, parser=None) Return a :class:`Document` that represents the *string*. This method creates a - :class:`StringIO` object for the string and passes that on to :func:`parse`. + :class:`io.StringIO` object for the string and passes that on to :func:`parse`. Both functions return a :class:`Document` object representing the content of the document. @@ -149,12 +149,7 @@ module documentation. This section lists the differences between the API and the DOM node. With an explicit *encoding* [1]_ argument, the result is a byte - string in the specified encoding. It is recommended that you - always specify an encoding; you may use any encoding you like, but - an argument of "utf-8" is the most common choice, avoiding - :exc:`UnicodeError` exceptions in case of unrepresentable text - data. - + string in the specified encoding. With no *encoding* argument, the result is a Unicode string, and the XML declaration in the resulting string does not specify an encoding. Encoding this string in an encoding other than UTF-8 is @@ -257,4 +252,4 @@ utility to most DOM users. "UTF8" is not valid in an XML document's declaration, even though Python accepts it as an encoding name. See http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl - and http://www.iana.org/assignments/character-sets . + and http://www.iana.org/assignments/character-sets\ . diff --git a/Doc/library/xml.dom.pulldom.rst b/Doc/library/xml.dom.pulldom.rst index 8aa9cfb..a9c9f67 100644 --- a/Doc/library/xml.dom.pulldom.rst +++ b/Doc/library/xml.dom.pulldom.rst @@ -74,7 +74,8 @@ and switch to DOM-related processing. Return a :class:`DOMEventStream` from the given input. *stream_or_string* may be either a file name, or a file-like object. *parser*, if given, must be a - :class:`XmlReader` object. This function will change the document handler of the + :class:`~xml.sax.xmlreader.XMLReader` object. This function will change the + document handler of the parser and activate namespace support; other parser configuration (like setting an entity resolver) must have been done in advance. diff --git a/Doc/library/xml.dom.rst b/Doc/library/xml.dom.rst index 1a3a6a4..19512ed 100644 --- a/Doc/library/xml.dom.rst +++ b/Doc/library/xml.dom.rst @@ -422,14 +422,15 @@ objects: In addition, the Python DOM interface requires that some additional support is provided to allow :class:`NodeList` objects to be used as Python sequences. All -:class:`NodeList` implementations must include support for :meth:`__len__` and -:meth:`__getitem__`; this allows iteration over the :class:`NodeList` in +:class:`NodeList` implementations must include support for +:meth:`~object.__len__` and +:meth:`~object.__getitem__`; this allows iteration over the :class:`NodeList` in :keyword:`for` statements and proper support for the :func:`len` built-in function. If a DOM implementation supports modification of the document, the -:class:`NodeList` implementation must also support the :meth:`__setitem__` and -:meth:`__delitem__` methods. +:class:`NodeList` implementation must also support the +:meth:`~object.__setitem__` and :meth:`~object.__delitem__` methods. .. _dom-documenttype-objects: diff --git a/Doc/library/xml.etree.elementtree.rst b/Doc/library/xml.etree.elementtree.rst index dc9ebb9..3f2dcbb 100644 --- a/Doc/library/xml.etree.elementtree.rst +++ b/Doc/library/xml.etree.elementtree.rst @@ -5,13 +5,12 @@ :synopsis: Implementation of the ElementTree API. .. moduleauthor:: Fredrik Lundh <fredrik@pythonware.com> -**Source code:** :source:`Lib/xml/etree/ElementTree.py` +The :mod:`xml.etree.ElementTree` module implements a simple and efficient API +for parsing and creating XML data. --------------- - -The :class:`Element` type is a flexible container object, designed to store -hierarchical data structures in memory. The type can be described as a cross -between a list and a dictionary. +.. versionchanged:: 3.3 + This module will use a fast implementation whenever available. + The :mod:`xml.etree.cElementTree` module is deprecated. .. warning:: @@ -20,42 +19,318 @@ between a list and a dictionary. maliciously constructed data. If you need to parse untrusted or unauthenticated data see :ref:`xml-vulnerabilities`. +Tutorial +-------- + +This is a short tutorial for using :mod:`xml.etree.ElementTree` (``ET`` in +short). The goal is to demonstrate some of the building blocks and basic +concepts of the module. + +XML tree and elements +^^^^^^^^^^^^^^^^^^^^^ + +XML is an inherently hierarchical data format, and the most natural way to +represent it is with a tree. ``ET`` has two classes for this purpose - +:class:`ElementTree` represents the whole XML document as a tree, and +:class:`Element` represents a single node in this tree. Interactions with +the whole document (reading and writing to/from files) are usually done +on the :class:`ElementTree` level. Interactions with a single XML element +and its sub-elements are done on the :class:`Element` level. + +.. _elementtree-parsing-xml: + +Parsing XML +^^^^^^^^^^^ + +We'll be using the following XML document as the sample data for this section: + +.. code-block:: xml + + <?xml version="1.0"?> + <data> + <country name="Liechtenstein"> + <rank>1</rank> + <year>2008</year> + <gdppc>141100</gdppc> + <neighbor name="Austria" direction="E"/> + <neighbor name="Switzerland" direction="W"/> + </country> + <country name="Singapore"> + <rank>4</rank> + <year>2011</year> + <gdppc>59900</gdppc> + <neighbor name="Malaysia" direction="N"/> + </country> + <country name="Panama"> + <rank>68</rank> + <year>2011</year> + <gdppc>13600</gdppc> + <neighbor name="Costa Rica" direction="W"/> + <neighbor name="Colombia" direction="E"/> + </country> + </data> + +We can import this data by reading from a file:: + + import xml.etree.ElementTree as ET + tree = ET.parse('country_data.xml') + root = tree.getroot() + +Or directly from a string:: + + root = ET.fromstring(country_data_as_string) + +:func:`fromstring` parses XML from a string directly into an :class:`Element`, +which is the root element of the parsed tree. Other parsing functions may +create an :class:`ElementTree`. Check the documentation to be sure. + +As an :class:`Element`, ``root`` has a tag and a dictionary of attributes:: + + >>> root.tag + 'data' + >>> root.attrib + {} + +It also has children nodes over which we can iterate:: + + >>> for child in root: + ... print(child.tag, child.attrib) + ... + country {'name': 'Liechtenstein'} + country {'name': 'Singapore'} + country {'name': 'Panama'} + +Children are nested, and we can access specific child nodes by index:: + + >>> root[0][1].text + '2008' + +Finding interesting elements +^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:class:`Element` has some useful methods that help iterate recursively over all +the sub-tree below it (its children, their children, and so on). For example, +:meth:`Element.iter`:: + + >>> for neighbor in root.iter('neighbor'): + ... print(neighbor.attrib) + ... + {'name': 'Austria', 'direction': 'E'} + {'name': 'Switzerland', 'direction': 'W'} + {'name': 'Malaysia', 'direction': 'N'} + {'name': 'Costa Rica', 'direction': 'W'} + {'name': 'Colombia', 'direction': 'E'} + +:meth:`Element.findall` finds only elements with a tag which are direct +children of the current element. :meth:`Element.find` finds the *first* child +with a particular tag, and :attr:`Element.text` accesses the element's text +content. :meth:`Element.get` accesses the element's attributes:: + + >>> for country in root.findall('country'): + ... rank = country.find('rank').text + ... name = country.get('name') + ... print(name, rank) + ... + Liechtenstein 1 + Singapore 4 + Panama 68 + +More sophisticated specification of which elements to look for is possible by +using :ref:`XPath <elementtree-xpath>`. + +Modifying an XML File +^^^^^^^^^^^^^^^^^^^^^ + +:class:`ElementTree` provides a simple way to build XML documents and write them to files. +The :meth:`ElementTree.write` method serves this purpose. + +Once created, an :class:`Element` object may be manipulated by directly changing +its fields (such as :attr:`Element.text`), adding and modifying attributes +(:meth:`Element.set` method), as well as adding new children (for example +with :meth:`Element.append`). + +Let's say we want to add one to each country's rank, and add an ``updated`` +attribute to the rank element:: + + >>> for rank in root.iter('rank'): + ... new_rank = int(rank.text) + 1 + ... rank.text = str(new_rank) + ... rank.set('updated', 'yes') + ... + >>> tree.write('output.xml') + +Our XML now looks like this: + +.. code-block:: xml + + <?xml version="1.0"?> + <data> + <country name="Liechtenstein"> + <rank updated="yes">2</rank> + <year>2008</year> + <gdppc>141100</gdppc> + <neighbor name="Austria" direction="E"/> + <neighbor name="Switzerland" direction="W"/> + </country> + <country name="Singapore"> + <rank updated="yes">5</rank> + <year>2011</year> + <gdppc>59900</gdppc> + <neighbor name="Malaysia" direction="N"/> + </country> + <country name="Panama"> + <rank updated="yes">69</rank> + <year>2011</year> + <gdppc>13600</gdppc> + <neighbor name="Costa Rica" direction="W"/> + <neighbor name="Colombia" direction="E"/> + </country> + </data> + +We can remove elements using :meth:`Element.remove`. Let's say we want to +remove all countries with a rank higher than 50:: + + >>> for country in root.findall('country'): + ... rank = int(country.find('rank').text) + ... if rank > 50: + ... root.remove(country) + ... + >>> tree.write('output.xml') + +Our XML now looks like this: + +.. code-block:: xml + + <?xml version="1.0"?> + <data> + <country name="Liechtenstein"> + <rank updated="yes">2</rank> + <year>2008</year> + <gdppc>141100</gdppc> + <neighbor name="Austria" direction="E"/> + <neighbor name="Switzerland" direction="W"/> + </country> + <country name="Singapore"> + <rank updated="yes">5</rank> + <year>2011</year> + <gdppc>59900</gdppc> + <neighbor name="Malaysia" direction="N"/> + </country> + </data> + +Building XML documents +^^^^^^^^^^^^^^^^^^^^^^ + +The :func:`SubElement` function also provides a convenient way to create new +sub-elements for a given element:: + + >>> a = ET.Element('a') + >>> b = ET.SubElement(a, 'b') + >>> c = ET.SubElement(a, 'c') + >>> d = ET.SubElement(c, 'd') + >>> ET.dump(a) + <a><b /><c><d /></c></a> + +Additional resources +^^^^^^^^^^^^^^^^^^^^ -Each element has a number of properties associated with it: - -* a tag which is a string identifying what kind of data this element represents - (the element type, in other words). - -* a number of attributes, stored in a Python dictionary. - -* a text string. - -* an optional tail string. - -* a number of child elements, stored in a Python sequence - -To create an element instance, use the :class:`Element` constructor or the -:func:`SubElement` factory function. - -The :class:`ElementTree` class can be used to wrap an element structure, and -convert it from and to XML. +See http://effbot.org/zone/element-index.htm for tutorials and links to other +docs. -A C implementation of this API is available as :mod:`xml.etree.cElementTree`. -See http://effbot.org/zone/element-index.htm for tutorials and links to other -docs. Fredrik Lundh's page is also the location of the development version of -the xml.etree.ElementTree. +.. _elementtree-xpath: -.. versionchanged:: 3.2 - The ElementTree API is updated to 1.3. For more information, see - `Introducing ElementTree 1.3 - <http://effbot.org/zone/elementtree-13-intro.htm>`_. +XPath support +------------- +This module provides limited support for +`XPath expressions <http://www.w3.org/TR/xpath>`_ for locating elements in a +tree. The goal is to support a small subset of the abbreviated syntax; a full +XPath engine is outside the scope of the module. + +Example +^^^^^^^ + +Here's an example that demonstrates some of the XPath capabilities of the +module. We'll be using the ``countrydata`` XML document from the +:ref:`Parsing XML <elementtree-parsing-xml>` section:: + + import xml.etree.ElementTree as ET + + root = ET.fromstring(countrydata) + + # Top-level elements + root.findall(".") + + # All 'neighbor' grand-children of 'country' children of the top-level + # elements + root.findall("./country/neighbor") + + # Nodes with name='Singapore' that have a 'year' child + root.findall(".//year/..[@name='Singapore']") + + # 'year' nodes that are children of nodes with name='Singapore' + root.findall(".//*[@name='Singapore']/year") + + # All 'neighbor' nodes that are the second child of their parent + root.findall(".//neighbor[2]") + +Supported XPath syntax +^^^^^^^^^^^^^^^^^^^^^^ + +.. tabularcolumns:: |l|L| + ++-----------------------+------------------------------------------------------+ +| Syntax | Meaning | ++=======================+======================================================+ +| ``tag`` | Selects all child elements with the given tag. | +| | For example, ``spam`` selects all child elements | +| | named ``spam``, ``spam/egg`` selects all | +| | grandchildren named ``egg`` in all children named | +| | ``spam``. | ++-----------------------+------------------------------------------------------+ +| ``*`` | Selects all child elements. For example, ``*/egg`` | +| | selects all grandchildren named ``egg``. | ++-----------------------+------------------------------------------------------+ +| ``.`` | Selects the current node. This is mostly useful | +| | at the beginning of the path, to indicate that it's | +| | a relative path. | ++-----------------------+------------------------------------------------------+ +| ``//`` | Selects all subelements, on all levels beneath the | +| | current element. For example, ``.//egg`` selects | +| | all ``egg`` elements in the entire tree. | ++-----------------------+------------------------------------------------------+ +| ``..`` | Selects the parent element. Returns ``None`` if the | +| | path attempts to reach the ancestors of the start | +| | element (the element ``find`` was called on). | ++-----------------------+------------------------------------------------------+ +| ``[@attrib]`` | Selects all elements that have the given attribute. | ++-----------------------+------------------------------------------------------+ +| ``[@attrib='value']`` | Selects all elements for which the given attribute | +| | has the given value. The value cannot contain | +| | quotes. | ++-----------------------+------------------------------------------------------+ +| ``[tag]`` | Selects all elements that have a child named | +| | ``tag``. Only immediate children are supported. | ++-----------------------+------------------------------------------------------+ +| ``[position]`` | Selects all elements that are located at the given | +| | position. The position can be either an integer | +| | (1 is the first position), the expression ``last()`` | +| | (for the last position), or a position relative to | +| | the last position (e.g. ``last()-1``). | ++-----------------------+------------------------------------------------------+ + +Predicates (expressions within square brackets) must be preceded by a tag +name, an asterisk, or another predicate. ``position`` predicates must be +preceded by a tag name. + +Reference +--------- .. _elementtree-functions: Functions ---------- +^^^^^^^^^ .. function:: Comment(text=None) @@ -104,14 +379,14 @@ Functions Parses an XML section into an element tree incrementally, and reports what's going on to the user. *source* is a filename or :term:`file object` - containing XML data. *events* is a list of events to report back. The + containing XML data. *events* is a tuple of events to report back. The supported events are the strings ``"start"``, ``"end"``, ``"start-ns"`` and ``"end-ns"`` (the "ns" events are used to get detailed namespace information). If *events* is omitted, only ``"end"`` events are reported. *parser* is an optional parser instance. If not given, the standard - :class:`XMLParser` parser is used. *parser* is not supported by - ``cElementTree``. Returns an :term:`iterator` providing ``(event, elem)`` - pairs. + :class:`XMLParser` parser is used. *parser* can only use the default + :class:`TreeBuilder` as a target. Returns an :term:`iterator` providing + ``(event, elem)`` pairs. .. note:: @@ -168,9 +443,9 @@ Functions Generates a string representation of an XML element, including all subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to - generate a Unicode string. *method* is either ``"xml"``, - ``"html"`` or ``"text"`` (default is ``"xml"``). Returns an (optionally) - encoded string containing the XML data. + generate a Unicode string (otherwise, a bytestring is generated). *method* + is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``). + Returns an (optionally) encoded string containing the XML data. .. function:: tostringlist(element, encoding="us-ascii", method="xml") @@ -178,11 +453,11 @@ Functions Generates a string representation of an XML element, including all subelements. *element* is an :class:`Element` instance. *encoding* [1]_ is the output encoding (default is US-ASCII). Use ``encoding="unicode"`` to - generate a Unicode string. *method* is either ``"xml"``, - ``"html"`` or ``"text"`` (default is ``"xml"``). Returns a list of - (optionally) encoded strings containing the XML data. It does not guarantee - any specific sequence, except that ``"".join(tostringlist(element)) == - tostring(element)``. + generate a Unicode string (otherwise, a bytestring is generated). *method* + is either ``"xml"``, ``"html"`` or ``"text"`` (default is ``"xml"``). + Returns a list of (optionally) encoded strings containing the XML data. + It does not guarantee any specific sequence, except that + ``b"".join(tostringlist(element)) == tostring(element)``. .. versionadded:: 3.2 @@ -207,8 +482,7 @@ Functions .. _elementtree-element-objects: Element Objects ---------------- - +^^^^^^^^^^^^^^^ .. class:: Element(tag, attrib={}, **extra) @@ -259,7 +533,7 @@ Element Objects .. method:: clear() Resets an element. This function removes all subelements, clears all - attributes, and sets the text and tail attributes to None. + attributes, and sets the text and tail attributes to ``None``. .. method:: get(key, default=None) @@ -290,36 +564,43 @@ Element Objects .. method:: append(subelement) - Adds the element *subelement* to the end of this elements internal list - of subelements. + Adds the element *subelement* to the end of this element's internal list + of subelements. Raises :exc:`TypeError` if *subelement* is not an + :class:`Element`. .. method:: extend(subelements) Appends *subelements* from a sequence object with zero or more elements. - Raises :exc:`AssertionError` if a subelement is not a valid object. + Raises :exc:`TypeError` if a subelement is not an :class:`Element`. .. versionadded:: 3.2 - .. method:: find(match) + .. method:: find(match, namespaces=None) Finds the first subelement matching *match*. *match* may be a tag name - or path. Returns an element instance or ``None``. + or a :ref:`path <elementtree-xpath>`. Returns an element instance + or ``None``. *namespaces* is an optional mapping from namespace prefix + to full name. - .. method:: findall(match) + .. method:: findall(match, namespaces=None) - Finds all matching subelements, by tag name or path. Returns a list - containing all matching elements in document order. + Finds all matching subelements, by tag name or + :ref:`path <elementtree-xpath>`. Returns a list containing all matching + elements in document order. *namespaces* is an optional mapping from + namespace prefix to full name. - .. method:: findtext(match, default=None) + .. method:: findtext(match, default=None, namespaces=None) Finds text for the first subelement matching *match*. *match* may be - a tag name or path. Returns the text content of the first matching - element, or *default* if no element was found. Note that if the matching - element has no text content an empty string is returned. + a tag name or a :ref:`path <elementtree-xpath>`. Returns the text content + of the first matching element, or *default* if no element was found. + Note that if the matching element has no text content an empty string + is returned. *namespaces* is an optional mapping from namespace prefix + to full name. .. method:: getchildren() @@ -334,9 +615,10 @@ Element Objects Use method :meth:`Element.iter` instead. - .. method:: insert(index, element) + .. method:: insert(index, subelement) - Inserts a subelement at the given position in this element. + Inserts *subelement* at the given position in this element. Raises + :exc:`TypeError` if *subelement* is not an :class:`Element`. .. method:: iter(tag=None) @@ -350,10 +632,13 @@ Element Objects .. versionadded:: 3.2 - .. method:: iterfind(match) + .. method:: iterfind(match, namespaces=None) + + Finds all matching subelements, by tag name or + :ref:`path <elementtree-xpath>`. Returns an iterable yielding all + matching elements in document order. *namespaces* is an optional mapping + from namespace prefix to full name. - Finds all matching subelements, by tag name or path. Returns an iterable - yielding all matching elements in document order. .. versionadded:: 3.2 @@ -379,8 +664,9 @@ Element Objects or contents. :class:`Element` objects also support the following sequence type methods - for working with subelements: :meth:`__delitem__`, :meth:`__getitem__`, - :meth:`__setitem__`, :meth:`__len__`. + for working with subelements: :meth:`~object.__delitem__`, + :meth:`~object.__getitem__`, :meth:`~object.__setitem__`, + :meth:`~object.__len__`. Caution: Elements with no subelements will test as ``False``. This behavior will change in future versions. Use specific ``len(elem)`` or ``elem is @@ -398,7 +684,7 @@ Element Objects .. _elementtree-elementtree-objects: ElementTree Objects -------------------- +^^^^^^^^^^^^^^^^^^^ .. class:: ElementTree(element=None, file=None) @@ -418,17 +704,17 @@ ElementTree Objects care. *element* is an element instance. - .. method:: find(match) + .. method:: find(match, namespaces=None) Same as :meth:`Element.find`, starting at the root of the tree. - .. method:: findall(match) + .. method:: findall(match, namespaces=None) Same as :meth:`Element.findall`, starting at the root of the tree. - .. method:: findtext(match, default=None) + .. method:: findtext(match, default=None, namespaces=None) Same as :meth:`Element.findtext`, starting at the root of the tree. @@ -451,11 +737,9 @@ ElementTree Objects to look for (default is to return all elements) - .. method:: iterfind(match) + .. method:: iterfind(match, namespaces=None) - Finds all matching subelements, by tag name or path. Same as - getroot().iterfind(match). Returns an iterable yielding all matching - elements in document order. + Same as :meth:`Element.iterfind`, starting at the root of the tree. .. versionadded:: 3.2 @@ -464,8 +748,8 @@ ElementTree Objects Loads an external XML section into this element tree. *source* is a file name or :term:`file object`. *parser* is an optional parser instance. - If not given, the standard XMLParser parser is used. Returns the section - root element. + If not given, the standard :class:`XMLParser` parser is used. Returns the + section root element. .. method:: write(file, encoding="us-ascii", xml_declaration=None, \ @@ -473,13 +757,21 @@ ElementTree Objects Writes the element tree to a file, as XML. *file* is a file name, or a :term:`file object` opened for writing. *encoding* [1]_ is the output - encoding (default is US-ASCII). Use ``encoding="unicode"`` to write a - Unicode string. *xml_declaration* controls if an XML declaration - should be added to the file. Use False for never, True for always, None - for only if not US-ASCII or UTF-8 or Unicode (default is None). + encoding (default is US-ASCII). + *xml_declaration* controls if an XML declaration should be added to the + file. Use ``False`` for never, ``True`` for always, ``None`` + for only if not US-ASCII or UTF-8 or Unicode (default is ``None``). *default_namespace* sets the default XML namespace (for "xmlns"). *method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is - ``"xml"``). Returns an (optionally) encoded string. + ``"xml"``). + + The output is either a string (:class:`str`) or binary (:class:`bytes`). + This is controlled by the *encoding* argument. If *encoding* is + ``"unicode"``, the output is a string; otherwise, it's binary. Note that + this may conflict with the type of *file* if it's an open + :term:`file object`; make sure you do not try to write a string to a + binary stream and vice versa. + This is the XML file that is going to be manipulated:: @@ -512,7 +804,7 @@ Example of changing the attribute "target" of every link in first paragraph:: .. _elementtree-qname-objects: QName Objects -------------- +^^^^^^^^^^^^^ .. class:: QName(text_or_uri, tag=None) @@ -528,7 +820,7 @@ QName Objects .. _elementtree-treebuilder-objects: TreeBuilder Objects -------------------- +^^^^^^^^^^^^^^^^^^^ .. class:: TreeBuilder(element_factory=None) @@ -536,9 +828,9 @@ TreeBuilder Objects Generic element structure builder. This builder converts a sequence of start, data, and end method calls to a well-formed element structure. You can use this class to build an element structure using a custom XML parser, - or a parser for some other XML-like format. The *element_factory* is called - to create new :class:`Element` instances when given. - + or a parser for some other XML-like format. *element_factory*, when given, + must be a callable accepting two positional arguments: a tag and + a dict of attributes. It is expected to return a new element instance. .. method:: close() @@ -579,7 +871,7 @@ TreeBuilder Objects .. _elementtree-xmlparser-objects: XMLParser Objects ------------------ +^^^^^^^^^^^^^^^^^ .. class:: XMLParser(html=0, target=None, encoding=None) @@ -587,14 +879,16 @@ XMLParser Objects :class:`Element` structure builder for XML source data, based on the expat parser. *html* are predefined HTML entities. This flag is not supported by the current implementation. *target* is the target object. If omitted, the - builder uses an instance of the standard TreeBuilder class. *encoding* [1]_ - is optional. If given, the value overrides the encoding specified in the - XML file. + builder uses an instance of the standard :class:`TreeBuilder` class. + *encoding* [1]_ is optional. If given, the value overrides the encoding + specified in the XML file. .. method:: close() - Finishes feeding data to the parser. Returns an element structure. + Finishes feeding data to the parser. Returns the result of calling the + ``close()`` method of the *target* passed during construction; by default, + this is the toplevel document element. .. method:: doctype(name, pubid, system) @@ -608,12 +902,12 @@ XMLParser Objects Feeds data to the parser. *data* is encoded data. -:meth:`XMLParser.feed` calls *target*\'s :meth:`start` method -for each opening tag, its :meth:`end` method for each closing tag, -and data is processed by method :meth:`data`. :meth:`XMLParser.close` -calls *target*\'s method :meth:`close`. -:class:`XMLParser` can be used not only for building a tree structure. -This is an example of counting the maximum depth of an XML file:: + :meth:`XMLParser.feed` calls *target*\'s ``start()`` method + for each opening tag, its ``end()`` method for each closing tag, + and data is processed by method ``data()``. :meth:`XMLParser.close` + calls *target*\'s method ``close()``. + :class:`XMLParser` can be used not only for building a tree structure. + This is an example of counting the maximum depth of an XML file:: >>> from xml.etree.ElementTree import XMLParser >>> class MaxDepth: # The target object of the parser @@ -647,6 +941,24 @@ This is an example of counting the maximum depth of an XML file:: >>> parser.close() 4 +Exceptions +^^^^^^^^^^ + +.. class:: ParseError + + XML parse error, raised by the various parsing methods in this module when + parsing fails. The string representation of an instance of this exception + will contain a user-friendly error message. In addition, it will have + the following attributes available: + + .. attribute:: code + + A numeric error code from the expat parser. See the documentation of + :mod:`xml.parsers.expat` for the list of error codes and their meanings. + + .. attribute:: position + + A tuple of *line*, *column* numbers, specifying where the error occurred. .. rubric:: Footnotes diff --git a/Doc/library/xml.rst b/Doc/library/xml.rst index a800813..0188219 100644 --- a/Doc/library/xml.rst +++ b/Doc/library/xml.rst @@ -14,8 +14,9 @@ Python's interfaces for processing XML are grouped in the ``xml`` package. .. warning:: The XML modules are not secure against erroneous or maliciously - constructed data. If you need to parse untrusted or unauthenticated data see - :ref:`xml-vulnerabilities`. + constructed data. If you need to parse untrusted or + unauthenticated data see the :ref:`xml-vulnerabilities` and + :ref:`defused-packages` sections. It is important to note that modules in the :mod:`xml` package require that there be at least one SAX-compliant XML parser available. The Expat parser is @@ -28,11 +29,12 @@ definition of the Python bindings for the DOM and SAX interfaces. The XML handling submodules are: * :mod:`xml.etree.ElementTree`: the ElementTree API, a simple and lightweight + XML processor .. * :mod:`xml.dom`: the DOM API definition -* :mod:`xml.dom.minidom`: a lightweight DOM implementation +* :mod:`xml.dom.minidom`: a minimal DOM implementation * :mod:`xml.dom.pulldom`: support for building partial DOM trees .. @@ -44,27 +46,28 @@ The XML handling submodules are: .. _xml-vulnerabilities: XML vulnerabilities -=================== +------------------- The XML processing modules are not secure against maliciously constructed data. -An attacker can abuse vulnerabilities for e.g. denial of service attacks, to -access local files, to generate network connections to other machines, or -to or circumvent firewalls. The attacks on XML abuse unfamiliar features -like inline `DTD`_ (document type definition) with entities. +An attacker can abuse XML features to carry out denial of service attacks, +access local files, generate network connections to other machines, or +circumvent firewalls. +The following table gives an overview of the known attacks and whether +the various modules are vulnerable to them. ========================= ======== ========= ========= ======== ========= kind sax etree minidom pulldom xmlrpc ========================= ======== ========= ========= ======== ========= -billion laughs **True** **True** **True** **True** **True** -quadratic blowup **True** **True** **True** **True** **True** -external entity expansion **True** False (1) False (2) **True** False (3) -DTD retrieval **True** False False **True** False -decompression bomb False False False False **True** +billion laughs **Yes** **Yes** **Yes** **Yes** **Yes** +quadratic blowup **Yes** **Yes** **Yes** **Yes** **Yes** +external entity expansion **Yes** No (1) No (2) **Yes** No (3) +DTD retrieval **Yes** No No **Yes** No +decompression bomb No No No No **Yes** ========================= ======== ========= ========= ======== ========= 1. :mod:`xml.etree.ElementTree` doesn't expand external entities and raises a - ParserError when an entity occurs. + :exc:`ParserError` when an entity occurs. 2. :mod:`xml.dom.minidom` doesn't expand external entities and simply returns the unexpanded entity verbatim. 3. :mod:`xmlrpclib` doesn't expand external entities and omits them. @@ -73,55 +76,54 @@ decompression bomb False False False False **True** billion laughs / exponential entity expansion The `Billion Laughs`_ attack -- also known as exponential entity expansion -- uses multiple levels of nested entities. Each entity refers to another entity - several times, the final entity definition contains a small string. Eventually - the small string is expanded to several gigabytes. The exponential expansion - consumes lots of CPU time, too. + several times, and the final entity definition contains a small string. + The exponential expansion results in several gigabytes of text and + consumes lots of memory and CPU time. quadratic blowup entity expansion A quadratic blowup attack is similar to a `Billion Laughs`_ attack; it abuses entity expansion, too. Instead of nested entities it repeats one large entity with a couple of thousand chars over and over again. The attack isn't as - efficient as the exponential case but it avoids triggering countermeasures of - parsers against heavily nested entities. + efficient as the exponential case but it avoids triggering parser countermeasures + that forbid deeply-nested entities. external entity expansion Entity declarations can contain more than just text for replacement. They can - also point to external resources by public identifiers or system identifiers. - System identifiers are standard URIs or can refer to local files. The XML - parser retrieves the resource with e.g. HTTP or FTP requests and embeds the - content into the XML document. + also point to external resources or local files. The XML + parser accesses the resource and embeds the content into the XML document. DTD retrieval - Some XML libraries like Python's mod:'xml.dom.pulldom' retrieve document type + Some XML libraries like Python's :mod:`xml.dom.pulldom` retrieve document type definitions from remote or local locations. The feature has similar implications as the external entity expansion issue. decompression bomb - The issue of decompression bombs (aka `ZIP bomb`_) apply to all XML libraries - that can parse compressed XML stream like gzipped HTTP streams or LZMA-ed + Decompression bombs (aka `ZIP bomb`_) apply to all XML libraries + that can parse compressed XML streams such as gzipped HTTP streams or + LZMA-compressed files. For an attacker it can reduce the amount of transmitted data by three magnitudes or more. -The documentation of `defusedxml`_ on PyPI has further information about +The documentation for `defusedxml`_ on PyPI has further information about all known attack vectors with examples and references. -defused packages ----------------- +.. _defused-packages: + +The :mod:`defusedxml` and :mod:`defusedexpat` Packages +------------------------------------------------------ `defusedxml`_ is a pure Python package with modified subclasses of all stdlib -XML parsers that prevent any potentially malicious operation. The courses of -action are recommended for any server code that parses untrusted XML data. The -package also ships with example exploits and an extended documentation on more -XML exploits like xpath injection. - -`defusedexpat`_ provides a modified libexpat and patched replacment -:mod:`pyexpat` extension module with countermeasures against entity expansion -DoS attacks. Defusedexpat still allows a sane and configurable amount of entity -expansions. The modifications will be merged into future releases of Python. - -The workarounds and modifications are not included in patch releases as they -break backward compatibility. After all inline DTD and entity expansion are -well-definied XML features. +XML parsers that prevent any potentially malicious operation. Use of this +package is recommended for any server code that parses untrusted XML data. The +package also ships with example exploits and extended documentation on more +XML exploits such as XPath injection. + +`defusedexpat`_ provides a modified libexpat and a patched +:mod:`pyexpat` module that have countermeasures against entity expansion +DoS attacks. The :mod:`defusedexpat` module still allows a sane and configurable amount of entity +expansions. The modifications may be included in some future release of Python, +but will not be included in any bugfix releases of +Python because they break backward compatibility. .. _defusedxml: https://pypi.python.org/pypi/defusedxml/ diff --git a/Doc/library/xml.sax.handler.rst b/Doc/library/xml.sax.handler.rst index 1a6b391..0fb3341 100644 --- a/Doc/library/xml.sax.handler.rst +++ b/Doc/library/xml.sax.handler.rst @@ -237,7 +237,8 @@ events in the input document: Signals the start of an element in non-namespace mode. The *name* parameter contains the raw XML 1.0 name of the element type as a - string and the *attrs* parameter holds an object of the :class:`Attributes` + string and the *attrs* parameter holds an object of the + :class:`~xml.sax.xmlreader.Attributes` interface (see :ref:`attributes-objects`) containing the attributes of the element. The object passed as *attrs* may be re-used by the parser; holding on to a reference to it is not a reliable way to keep a copy of the attributes. @@ -260,7 +261,8 @@ events in the input document: The *name* parameter contains the name of the element type as a ``(uri, localname)`` tuple, the *qname* parameter contains the raw XML 1.0 name used in the source document, and the *attrs* parameter holds an instance of the - :class:`AttributesNS` interface (see :ref:`attributes-ns-objects`) + :class:`~xml.sax.xmlreader.AttributesNS` interface (see + :ref:`attributes-ns-objects`) containing the attributes of the element. If no namespace is associated with the element, the *uri* component of *name* will be ``None``. The object passed as *attrs* may be re-used by the parser; holding on to a reference to it is not @@ -376,8 +378,9 @@ ErrorHandler Objects -------------------- Objects with this interface are used to receive error and warning information -from the :class:`XMLReader`. If you create an object that implements this -interface, then register the object with your :class:`XMLReader`, the parser +from the :class:`~xml.sax.xmlreader.XMLReader`. If you create an object that +implements this interface, then register the object with your +:class:`~xml.sax.xmlreader.XMLReader`, the parser will call the methods in your object to report all warnings and errors. There are three levels of errors available: warnings, (possibly) recoverable errors, and unrecoverable errors. All methods take a :exc:`SAXParseException` as the diff --git a/Doc/library/xml.sax.reader.rst b/Doc/library/xml.sax.reader.rst index 6683da1..3ab6063 100644 --- a/Doc/library/xml.sax.reader.rst +++ b/Doc/library/xml.sax.reader.rst @@ -106,47 +106,50 @@ The :class:`XMLReader` interface supports the following methods: .. method:: XMLReader.getContentHandler() - Return the current :class:`ContentHandler`. + Return the current :class:`~xml.sax.handler.ContentHandler`. .. method:: XMLReader.setContentHandler(handler) - Set the current :class:`ContentHandler`. If no :class:`ContentHandler` is set, - content events will be discarded. + Set the current :class:`~xml.sax.handler.ContentHandler`. If no + :class:`~xml.sax.handler.ContentHandler` is set, content events will be + discarded. .. method:: XMLReader.getDTDHandler() - Return the current :class:`DTDHandler`. + Return the current :class:`~xml.sax.handler.DTDHandler`. .. method:: XMLReader.setDTDHandler(handler) - Set the current :class:`DTDHandler`. If no :class:`DTDHandler` is set, DTD + Set the current :class:`~xml.sax.handler.DTDHandler`. If no + :class:`~xml.sax.handler.DTDHandler` is set, DTD events will be discarded. .. method:: XMLReader.getEntityResolver() - Return the current :class:`EntityResolver`. + Return the current :class:`~xml.sax.handler.EntityResolver`. .. method:: XMLReader.setEntityResolver(handler) - Set the current :class:`EntityResolver`. If no :class:`EntityResolver` is set, + Set the current :class:`~xml.sax.handler.EntityResolver`. If no + :class:`~xml.sax.handler.EntityResolver` is set, attempts to resolve an external entity will result in opening the system identifier for the entity, and fail if it is not available. .. method:: XMLReader.getErrorHandler() - Return the current :class:`ErrorHandler`. + Return the current :class:`~xml.sax.handler.ErrorHandler`. .. method:: XMLReader.setErrorHandler(handler) - Set the current error handler. If no :class:`ErrorHandler` is set, errors will - be raised as exceptions, and warnings will be printed. + Set the current error handler. If no :class:`~xml.sax.handler.ErrorHandler` + is set, errors will be raised as exceptions, and warnings will be printed. .. method:: XMLReader.setLocale(locale) @@ -322,9 +325,11 @@ InputSource Objects The :class:`Attributes` Interface --------------------------------- -:class:`Attributes` objects implement a portion of the mapping protocol, -including the methods :meth:`copy`, :meth:`get`, :meth:`__contains__`, -:meth:`items`, :meth:`keys`, and :meth:`values`. The following methods +:class:`Attributes` objects implement a portion of the :term:`mapping protocol +<mapping>`, including the methods :meth:`~collections.abc.Mapping.copy`, +:meth:`~collections.abc.Mapping.get`, :meth:`~object.__contains__`, +:meth:`~collections.abc.Mapping.items`, :meth:`~collections.abc.Mapping.keys`, +and :meth:`~collections.abc.Mapping.values`. The following methods are also provided: diff --git a/Doc/library/xml.sax.rst b/Doc/library/xml.sax.rst index d5c56b6..e95d6b0 100644 --- a/Doc/library/xml.sax.rst +++ b/Doc/library/xml.sax.rst @@ -26,7 +26,8 @@ The convenience functions are: .. function:: make_parser(parser_list=[]) - Create and return a SAX :class:`XMLReader` object. The first parser found will + Create and return a SAX :class:`~xml.sax.xmlreader.XMLReader` object. The + first parser found will be used. If *parser_list* is provided, it must be a sequence of strings which name modules that have a function named :func:`create_parser`. Modules listed in *parser_list* will be used before modules in the default list of parsers. @@ -36,8 +37,9 @@ The convenience functions are: Create a SAX parser and use it to parse a document. The document, passed in as *filename_or_stream*, can be a filename or a file object. The *handler* - parameter needs to be a SAX :class:`ContentHandler` instance. If - *error_handler* is given, it must be a SAX :class:`ErrorHandler` instance; if + parameter needs to be a SAX :class:`~handler.ContentHandler` instance. If + *error_handler* is given, it must be a SAX :class:`~handler.ErrorHandler` + instance; if omitted, :exc:`SAXParseException` will be raised on all errors. There is no return value; all work must be done by the *handler* passed in. @@ -62,10 +64,12 @@ For these objects, only the interfaces are relevant; they are normally not instantiated by the application itself. Since Python does not have an explicit notion of interface, they are formally introduced as classes, but applications may use implementations which do not inherit from the provided classes. The -:class:`InputSource`, :class:`Locator`, :class:`Attributes`, -:class:`AttributesNS`, and :class:`XMLReader` interfaces are defined in the +:class:`~xml.sax.xmlreader.InputSource`, :class:`~xml.sax.xmlreader.Locator`, +:class:`~xml.sax.xmlreader.Attributes`, :class:`~xml.sax.xmlreader.AttributesNS`, +and :class:`~xml.sax.xmlreader.XMLReader` interfaces are defined in the module :mod:`xml.sax.xmlreader`. The handler interfaces are defined in -:mod:`xml.sax.handler`. For convenience, :class:`InputSource` (which is often +:mod:`xml.sax.handler`. For convenience, +:class:`~xml.sax.xmlreader.InputSource` (which is often instantiated directly) and the handler classes are also available from :mod:`xml.sax`. These interfaces are described below. @@ -78,7 +82,8 @@ classes. Encapsulate an XML error or warning. This class can contain basic error or warning information from either the XML parser or the application: it can be subclassed to provide additional functionality or to add localization. Note - that although the handlers defined in the :class:`ErrorHandler` interface + that although the handlers defined in the + :class:`~xml.sax.handler.ErrorHandler` interface receive instances of this exception, it is not required to actually raise the exception --- it is also useful as a container for information. @@ -91,22 +96,26 @@ classes. .. exception:: SAXParseException(msg, exception, locator) - Subclass of :exc:`SAXException` raised on parse errors. Instances of this class - are passed to the methods of the SAX :class:`ErrorHandler` interface to provide - information about the parse error. This class supports the SAX :class:`Locator` - interface as well as the :class:`SAXException` interface. + Subclass of :exc:`SAXException` raised on parse errors. Instances of this + class are passed to the methods of the SAX + :class:`~xml.sax.handler.ErrorHandler` interface to provide information + about the parse error. This class supports the SAX + :class:`~xml.sax.xmlreader.Locator` interface as well as the + :class:`SAXException` interface. .. exception:: SAXNotRecognizedException(msg, exception=None) - Subclass of :exc:`SAXException` raised when a SAX :class:`XMLReader` is + Subclass of :exc:`SAXException` raised when a SAX + :class:`~xml.sax.xmlreader.XMLReader` is confronted with an unrecognized feature or property. SAX applications and extensions may use this class for similar purposes. .. exception:: SAXNotSupportedException(msg, exception=None) - Subclass of :exc:`SAXException` raised when a SAX :class:`XMLReader` is asked to + Subclass of :exc:`SAXException` raised when a SAX + :class:`~xml.sax.xmlreader.XMLReader` is asked to enable a feature that is not supported, or to set a property to a value that the implementation does not support. SAX applications and extensions may use this class for similar purposes. diff --git a/Doc/library/xml.sax.utils.rst b/Doc/library/xml.sax.utils.rst index ff36fd8..14cf078 100644 --- a/Doc/library/xml.sax.utils.rst +++ b/Doc/library/xml.sax.utils.rst @@ -52,7 +52,8 @@ or as base classes. .. class:: XMLGenerator(out=None, encoding='iso-8859-1', short_empty_elements=False) - This class implements the :class:`ContentHandler` interface by writing SAX + This class implements the :class:`~xml.sax.handler.ContentHandler` interface + by writing SAX events back into an XML document. In other words, using an :class:`XMLGenerator` as the content handler will reproduce the original document being parsed. *out* should be a file-like object which will default to *sys.stdout*. *encoding* is @@ -62,12 +63,13 @@ or as base classes. tags, if set to *True* they are emitted as a single self-closed tag. .. versionadded:: 3.2 - short_empty_elements + The *short_empty_elements* parameter. .. class:: XMLFilterBase(base) - This class is designed to sit between an :class:`XMLReader` and the client + This class is designed to sit between an + :class:`~xml.sax.xmlreader.XMLReader` and the client application's event handlers. By default, it does nothing but pass requests up to the reader and events on to the handlers unmodified, but subclasses can override specific methods to modify the event stream or the configuration @@ -76,9 +78,10 @@ or as base classes. .. function:: prepare_input_source(source, base='') - This function takes an input source and an optional base URL and returns a fully - resolved :class:`InputSource` object ready for reading. The input source can be - given as a string, a file-like object, or an :class:`InputSource` object; - parsers will use this function to implement the polymorphic *source* argument to - their :meth:`parse` method. + This function takes an input source and an optional base URL and returns a + fully resolved :class:`~xml.sax.xmlreader.InputSource` object ready for + reading. The input source can be given as a string, a file-like object, or + an :class:`~xml.sax.xmlreader.InputSource` object; parsers will use this + function to implement the polymorphic *source* argument to their + :meth:`parse` method. diff --git a/Doc/library/xmlrpc.client.rst b/Doc/library/xmlrpc.client.rst index 4592fb4..3cb19d1 100644 --- a/Doc/library/xmlrpc.client.rst +++ b/Doc/library/xmlrpc.client.rst @@ -8,7 +8,7 @@ .. XXX Not everything is documented yet. It might be good to describe - Marshaller, Unmarshaller, getparser, dumps, loads, and Transport. + Marshaller, Unmarshaller, getparser and Transport. **Source code:** :source:`Lib/xmlrpc/client.py` @@ -28,7 +28,12 @@ between conformable Python objects and XML on the wire. :ref:`xml-vulnerabilities`. -.. class:: ServerProxy(uri, transport=None, encoding=None, verbose=False, allow_none=False, use_datetime=False) +.. class:: ServerProxy(uri, transport=None, encoding=None, verbose=False, \ + allow_none=False, use_datetime=False, \ + use_builtin_types=False) + + .. versionchanged:: 3.3 + The *use_builtin_types* flag was added. A :class:`ServerProxy` instance is an object that manages communication with a remote XML-RPC server. The required first argument is a URI (Uniform Resource @@ -41,9 +46,13 @@ between conformable Python objects and XML on the wire. XML; the default behaviour is for ``None`` to raise a :exc:`TypeError`. This is a commonly-used extension to the XML-RPC specification, but isn't supported by all clients and servers; see http://ontosys.com/xml-rpc/extensions.php for a - description. The *use_datetime* flag can be used to cause date/time values to - be presented as :class:`datetime.datetime` objects; this is false by default. - :class:`datetime.datetime` objects may be passed to calls. + description. The *use_builtin_types* flag can be used to cause date/time values + to be presented as :class:`datetime.datetime` objects and binary data to be + presented as :class:`bytes` objects; this flag is false by default. + :class:`datetime.datetime` and :class:`bytes` objects may be passed to calls. + + The obsolete *use_datetime* flag is similar to *use_builtin_types* but it + applies only to date/time values. Both the HTTP and HTTPS transports support the URL syntax extension for HTTP Basic Authentication: ``http://user:pass@host:port/path``. The ``user:pass`` @@ -63,6 +72,8 @@ between conformable Python objects and XML on the wire. (e.g. that can be marshalled through XML), include the following (and except where noted, they are unmarshalled as the same Python type): + .. tabularcolumns:: |l|L| + +---------------------------------+---------------------------------------------+ | Name | Meaning | +=================================+=============================================+ @@ -85,12 +96,12 @@ between conformable Python objects and XML on the wire. | | only their *__dict__* attribute is | | | transmitted. | +---------------------------------+---------------------------------------------+ - | :const:`dates` | in seconds since the epoch (pass in an | - | | instance of the :class:`DateTime` class) or | + | :const:`dates` | In seconds since the epoch. Pass in an | + | | instance of the :class:`DateTime` class or | | | a :class:`datetime.datetime` instance. | +---------------------------------+---------------------------------------------+ - | :const:`binary data` | pass in an instance of the :class:`Binary` | - | | wrapper class | + | :const:`binary data` | Pass in an instance of the :class:`Binary` | + | | wrapper class or a :class:`bytes` instance. | +---------------------------------+---------------------------------------------+ This is the full set of data types supported by XML-RPC. Method calls may also @@ -105,8 +116,9 @@ between conformable Python objects and XML on the wire. ensure that the string is free of characters that aren't allowed in XML, such as the control characters with ASCII values between 0 and 31 (except, of course, tab, newline and carriage return); failing to do this will result in an XML-RPC - request that isn't well-formed XML. If you have to pass arbitrary strings via - XML-RPC, use the :class:`Binary` wrapper class described below. + request that isn't well-formed XML. If you have to pass arbitrary bytes + via XML-RPC, use the :class:`bytes` class or the class:`Binary` wrapper class + described below. :class:`Server` is retained as an alias for :class:`ServerProxy` for backwards compatibility. New code should use :class:`ServerProxy`. @@ -256,7 +268,7 @@ The client code for the preceding server:: Binary Objects -------------- -This class may be initialized from string data (which may include NULs). The +This class may be initialized from bytes data (which may include NULs). The primary access to the content of a :class:`Binary` object is provided by an attribute: @@ -264,15 +276,15 @@ attribute: .. attribute:: Binary.data The binary data encapsulated by the :class:`Binary` instance. The data is - provided as an 8-bit string. + provided as a :class:`bytes` object. :class:`Binary` objects have the following methods, supported mainly for internal use by the marshalling/unmarshalling code: -.. method:: Binary.decode(string) +.. method:: Binary.decode(bytes) - Accept a base64 string and decode it as the instance's new data. + Accept a base64 :class:`bytes` object and decode it as the instance's new data. .. method:: Binary.encode(out) @@ -423,21 +435,21 @@ remote server into a single request [#]_. is a :term:`generator`; iterating over this generator yields the individual results. -A usage example of this class follows. The server code :: +A usage example of this class follows. The server code:: from xmlrpc.server import SimpleXMLRPCServer - def add(x,y): - return x+y + def add(x, y): + return x + y def subtract(x, y): - return x-y + return x - y def multiply(x, y): - return x*y + return x * y def divide(x, y): - return x/y + return x // y # A simple server with simple arithmetic functions server = SimpleXMLRPCServer(("localhost", 8000)) @@ -455,13 +467,13 @@ The client code for the preceding server:: proxy = xmlrpc.client.ServerProxy("http://localhost:8000/") multicall = xmlrpc.client.MultiCall(proxy) - multicall.add(7,3) - multicall.subtract(7,3) - multicall.multiply(7,3) - multicall.divide(7,3) + multicall.add(7, 3) + multicall.subtract(7, 3) + multicall.multiply(7, 3) + multicall.divide(7, 3) result = multicall() - print("7+3=%d, 7-3=%d, 7*3=%d, 7/3=%d" % tuple(result)) + print("7+3=%d, 7-3=%d, 7*3=%d, 7//3=%d" % tuple(result)) Convenience Functions @@ -478,14 +490,21 @@ Convenience Functions it via an extension, provide a true value for *allow_none*. -.. function:: loads(data, use_datetime=False) +.. function:: loads(data, use_datetime=False, use_builtin_types=False) Convert an XML-RPC request or response into Python objects, a ``(params, methodname)``. *params* is a tuple of argument; *methodname* is a string, or ``None`` if no method name is present in the packet. If the XML-RPC packet represents a fault condition, this function will raise a :exc:`Fault` exception. - The *use_datetime* flag can be used to cause date/time values to be presented as - :class:`datetime.datetime` objects; this is false by default. + The *use_builtin_types* flag can be used to cause date/time values to be + presented as :class:`datetime.datetime` objects and binary data to be + presented as :class:`bytes` objects; this flag is false by default. + + The obsolete *use_datetime* flag is similar to *use_builtin_types* but it + applies only to date/time values. + + .. versionchanged:: 3.3 + The *use_builtin_types* flag was added. .. _xmlrpc-client-example: diff --git a/Doc/library/xmlrpc.rst b/Doc/library/xmlrpc.rst new file mode 100644 index 0000000..ae68157 --- /dev/null +++ b/Doc/library/xmlrpc.rst @@ -0,0 +1,12 @@ +:mod:`xmlrpc` --- XMLRPC server and client modules +================================================== + +XML-RPC is a Remote Procedure Call method that uses XML passed via HTTP as a +transport. With it, a client can call methods with parameters on a remote +server (the server is named by a URI) and get back structured data. + +``xmlrpc`` is a package that collects server and client modules implementing +XML-RPC. The modules are: + +* :mod:`xmlrpc.client` +* :mod:`xmlrpc.server` diff --git a/Doc/library/xmlrpc.server.rst b/Doc/library/xmlrpc.server.rst index 6b4c202..aca4f36 100644 --- a/Doc/library/xmlrpc.server.rst +++ b/Doc/library/xmlrpc.server.rst @@ -23,7 +23,9 @@ servers written in Python. Servers can either be free standing, using :ref:`xml-vulnerabilities`. -.. class:: SimpleXMLRPCServer(addr, requestHandler=SimpleXMLRPCRequestHandler, logRequests=True, allow_none=False, encoding=None, bind_and_activate=True) +.. class:: SimpleXMLRPCServer(addr, requestHandler=SimpleXMLRPCRequestHandler,\ + logRequests=True, allow_none=False, encoding=None,\ + bind_and_activate=True, use_builtin_types=False) Create a new server instance. This class provides methods for registration of functions that can be called by the XML-RPC protocol. The *requestHandler* @@ -32,18 +34,31 @@ servers written in Python. Servers can either be free standing, using are passed to the :class:`socketserver.TCPServer` constructor. If *logRequests* is true (the default), requests will be logged; setting this parameter to false will turn off logging. The *allow_none* and *encoding* parameters are passed - on to :mod:`xmlrpc.client` and control the XML-RPC responses that will be returned + on to :mod:`xmlrpc.client` and control the XML-RPC responses that will be returned from the server. The *bind_and_activate* parameter controls whether :meth:`server_bind` and :meth:`server_activate` are called immediately by the constructor; it defaults to true. Setting it to false allows code to manipulate the *allow_reuse_address* class variable before the address is bound. + The *use_builtin_types* parameter is passed to the + :func:`~xmlrpc.client.loads` function and controls which types are processed + when date/times values or binary data are received; it defaults to false. + .. versionchanged:: 3.3 + The *use_builtin_types* flag was added. -.. class:: CGIXMLRPCRequestHandler(allow_none=False, encoding=None) + +.. class:: CGIXMLRPCRequestHandler(allow_none=False, encoding=None,\ + use_builtin_types=False) Create a new instance to handle XML-RPC requests in a CGI environment. The *allow_none* and *encoding* parameters are passed on to :mod:`xmlrpc.client` and control the XML-RPC responses that will be returned from the server. + The *use_builtin_types* parameter is passed to the + :func:`~xmlrpc.client.loads` function and controls which types are processed + when date/times values or binary data are received; it defaults to false. + + .. versionchanged:: 3.3 + The *use_builtin_types* flag was added. .. class:: SimpleXMLRPCRequestHandler() @@ -169,6 +184,70 @@ server:: # Print list of available methods print(s.system.listMethods()) +The following example included in `Lib/xmlrpc/server.py` module shows a server +allowing dotted names and registering a multicall function. + +.. warning:: + + Enabling the *allow_dotted_names* option allows intruders to access your + module's global variables and may allow intruders to execute arbitrary code on + your machine. Only use this example only within a secure, closed network. + +:: + + import datetime + + class ExampleService: + def getData(self): + return '42' + + class currentTime: + @staticmethod + def getCurrentTime(): + return datetime.datetime.now() + + server = SimpleXMLRPCServer(("localhost", 8000)) + server.register_function(pow) + server.register_function(lambda x,y: x+y, 'add') + server.register_instance(ExampleService(), allow_dotted_names=True) + server.register_multicall_functions() + print('Serving XML-RPC on localhost port 8000') + try: + server.serve_forever() + except KeyboardInterrupt: + print("\nKeyboard interrupt received, exiting.") + server.server_close() + sys.exit(0) + +This ExampleService demo can be invoked from the command line:: + + python -m xmlrpc.server + + +The client that interacts with the above server is included in +`Lib/xmlrpc/client.py`:: + + server = ServerProxy("http://localhost:8000") + + try: + print(server.currentTime.getCurrentTime()) + except Error as v: + print("ERROR", v) + + multi = MultiCall(server) + multi.getData() + multi.pow(2,9) + multi.add(1,2) + try: + for response in multi(): + print(response) + except Error as v: + print("ERROR", v) + +This client which interacts with the demo XMLRPC server can be invoked as:: + + python -m xmlrpc.client + CGIXMLRPCRequestHandler ----------------------- @@ -240,12 +319,17 @@ to HTTP GET requests. Servers can either be free standing, using :class:`DocCGIXMLRPCRequestHandler`. -.. class:: DocXMLRPCServer(addr, requestHandler=DocXMLRPCRequestHandler, logRequests=True, allow_none=False, encoding=None, bind_and_activate=True) +.. class:: DocXMLRPCServer(addr, requestHandler=DocXMLRPCRequestHandler,\ + logRequests=True, allow_none=False, encoding=None,\ + bind_and_activate=True, use_builtin_types=True) Create a new server instance. All parameters have the same meaning as for :class:`SimpleXMLRPCServer`; *requestHandler* defaults to :class:`DocXMLRPCRequestHandler`. + .. versionchanged:: 3.3 + The *use_builtin_types* flag was added. + .. class:: DocCGIXMLRPCRequestHandler() diff --git a/Doc/library/zipfile.rst b/Doc/library/zipfile.rst index d82ab06..7a6482b 100644 --- a/Doc/library/zipfile.rst +++ b/Doc/library/zipfile.rst @@ -18,7 +18,7 @@ defined in `PKZIP Application Note This module does not currently handle multi-disk ZIP files. It can handle ZIP files that use the ZIP64 extensions -(that is ZIP files that are more than 4 GByte in size). It supports +(that is ZIP files that are more than 4 GiB in size). It supports decryption of encrypted files in ZIP archives, but it currently cannot create an encrypted file. Decryption is extremely slow as it is implemented in native Python rather than C. @@ -87,7 +87,30 @@ The module defines the following items: .. data:: ZIP_DEFLATED The numeric constant for the usual ZIP compression method. This requires the - :mod:`zlib` module. No other compression methods are currently supported. + :mod:`zlib` module. + + +.. data:: ZIP_BZIP2 + + The numeric constant for the BZIP2 compression method. This requires the + :mod:`bz2` module. + + .. versionadded:: 3.3 + +.. data:: ZIP_LZMA + + The numeric constant for the LZMA compression method. This requires the + :mod:`lzma` module. + + .. versionadded:: 3.3 + + .. note:: + + The ZIP file format specification has included support for bzip2 compression + since 2001, and for LZMA compression since 2006. However, some tools + (including older Python releases) do not support these compression + methods, and may either refuse to process the ZIP file altogether, + or fail to extract individual files. .. seealso:: @@ -118,12 +141,14 @@ ZipFile Objects adding a ZIP archive to another file (such as :file:`python.exe`). If *mode* is ``a`` and the file does not exist at all, it is created. *compression* is the ZIP compression method to use when writing the archive, - and should be :const:`ZIP_STORED` or :const:`ZIP_DEFLATED`; unrecognized - values will cause :exc:`RuntimeError` to be raised. If :const:`ZIP_DEFLATED` - is specified but the :mod:`zlib` module is not available, :exc:`RuntimeError` + and should be :const:`ZIP_STORED`, :const:`ZIP_DEFLATED`, + :const:`ZIP_BZIP2` or :const:`ZIP_LZMA`; unrecognized + values will cause :exc:`RuntimeError` to be raised. If :const:`ZIP_DEFLATED`, + :const:`ZIP_BZIP2` or :const:`ZIP_LZMA` is specified but the corresponded module + (:mod:`zlib`, :mod:`bz2` or :mod:`lzma`) is not available, :exc:`RuntimeError` is also raised. The default is :const:`ZIP_STORED`. If *allowZip64* is ``True`` zipfile will create ZIP files that use the ZIP64 extensions when - the zipfile is larger than 2 GB. If it is false (the default) :mod:`zipfile` + the zipfile is larger than 2 GiB. If it is false (the default) :mod:`zipfile` will raise an exception when the ZIP file would require ZIP64 extensions. ZIP64 extensions are disabled by default because the default :program:`zip` and :program:`unzip` commands on Unix (the InfoZIP utilities) don't support @@ -143,6 +168,9 @@ ZipFile Objects .. versionadded:: 3.2 Added the ability to use :class:`ZipFile` as a context manager. + .. versionchanged:: 3.3 + Added support for :mod:`bzip2 <bz2>` and :mod:`lzma` compression. + .. method:: ZipFile.close() @@ -185,8 +213,9 @@ ZipFile Objects .. note:: The file-like object is read-only and provides the following methods: - :meth:`!read`, :meth:`!readline`, :meth:`!readlines`, :meth:`!__iter__`, - :meth:`!__next__`. + :meth:`~io.BufferedIOBase.read`, :meth:`~io.IOBase.readline`, + :meth:`~io.IOBase.readlines`, :meth:`__iter__`, + :meth:`~iterator.__next__`. .. note:: @@ -239,7 +268,7 @@ ZipFile Objects that have absolute filenames starting with ``"/"`` or filenames with two dots ``".."``. - .. versionchanged:: 3.2.4 + .. versionchanged:: 3.3.1 The zipfile module attempts to prevent that. See :meth:`extract` note. diff --git a/Doc/library/zipimport.rst b/Doc/library/zipimport.rst index 60b2bd1..2cf508b 100644 --- a/Doc/library/zipimport.rst +++ b/Doc/library/zipimport.rst @@ -85,9 +85,12 @@ zipimporter Objects .. method:: get_data(pathname) - Return the data associated with *pathname*. Raise :exc:`IOError` if the + Return the data associated with *pathname*. Raise :exc:`OSError` if the file wasn't found. + .. versionchanged:: 3.3 + :exc:`IOError` used to be raised instead of :exc:`OSError`. + .. method:: get_filename(fullname) @@ -108,7 +111,7 @@ zipimporter Objects .. method:: is_package(fullname) - Return True if the module specified by *fullname* is a package. Raise + Return ``True`` if the module specified by *fullname* is a package. Raise :exc:`ZipImportError` if the module couldn't be found. diff --git a/Doc/library/zlib.rst b/Doc/library/zlib.rst index 75640d4..b178fe1 100644 --- a/Doc/library/zlib.rst +++ b/Doc/library/zlib.rst @@ -51,21 +51,45 @@ The available exception and functions in this module are: .. function:: compress(data[, level]) - Compresses the bytes in *data*, returning a bytes object containing - compressed data. *level* is an integer from ``0`` to ``9`` controlling the - level of compression; ``1`` is fastest and produces the least compression, - ``9`` is slowest and produces the most. ``0`` is no compression. The default - value is ``6``. Raises the :exc:`error` exception if any error occurs. + Compresses the bytes in *data*, returning a bytes object containing compressed data. + *level* is an integer from ``0`` to ``9`` controlling the level of compression; + ``1`` is fastest and produces the least compression, ``9`` is slowest and + produces the most. ``0`` is no compression. The default value is ``6``. + Raises the :exc:`error` exception if any error occurs. -.. function:: compressobj([level]) +.. function:: compressobj(level=-1, method=DEFLATED, wbits=15, memlevel=8, strategy=Z_DEFAULT_STRATEGY[, zdict]) Returns a compression object, to be used for compressing data streams that won't - fit into memory at once. *level* is an integer from ``0`` to ``9`` controlling - the level of compression; ``1`` is fastest and produces the least compression, - ``9`` is slowest and produces the most. ``0`` is no compression. The default + fit into memory at once. + + *level* is the compression level -- an integer from ``0`` to ``9``. A value + of ``1`` is fastest and produces the least compression, while a value of + ``9`` is slowest and produces the most. ``0`` is no compression. The default value is ``6``. + *method* is the compression algorithm. Currently, the only supported value is + ``DEFLATED``. + + *wbits* is the base two logarithm of the size of the window buffer. This + should be an integer from ``8`` to ``15``. Higher values give better + compression, but use more memory. + + *memlevel* controls the amount of memory used for internal compression state. + Valid values range from ``1`` to ``9``. Higher values using more memory, + but are faster and produce smaller output. + + *strategy* is used to tune the compression algorithm. Possible values are + ``Z_DEFAULT_STRATEGY``, ``Z_FILTERED``, and ``Z_HUFFMAN_ONLY``. + + *zdict* is a predefined compression dictionary. This is a sequence of bytes + (such as a :class:`bytes` object) containing subsequences that are expected + to occur frequently in the data that is to be compressed. Those subsequences + that are expected to be most common should come at the end of the dictionary. + + .. versionchanged:: 3.3 + Added the *zdict* parameter and keyword argument support. + .. function:: crc32(data[, value]) @@ -83,12 +107,13 @@ The available exception and functions in this module are: Always returns an unsigned 32-bit integer. -.. note:: - To generate the same numeric value across all Python versions and - platforms use crc32(data) & 0xffffffff. If you are only using - the checksum in packed binary format this is not necessary as the - return value is the correct 32bit binary representation - regardless of sign. + .. note:: + + To generate the same numeric value across all Python versions and + platforms, use ``crc32(data) & 0xffffffff``. If you are only using + the checksum in packed binary format this is not necessary as the + return value is the correct 32-bit binary representation + regardless of sign. .. function:: decompress(data[, wbits[, bufsize]]) @@ -115,11 +140,26 @@ The available exception and functions in this module are: to :c:func:`malloc`. The default size is 16384. -.. function:: decompressobj([wbits]) +.. function:: decompressobj(wbits=15[, zdict]) Returns a decompression object, to be used for decompressing data streams that - won't fit into memory at once. The *wbits* parameter controls the size of the - window buffer. + won't fit into memory at once. + + The *wbits* parameter controls the size of the window buffer. + + The *zdict* parameter specifies a predefined compression dictionary. If + provided, this must be the same dictionary as was used by the compressor that + produced the data that is to be decompressed. + + .. note:: + + If *zdict* is a mutable object (such as a :class:`bytearray`), you must not + modify its contents between the call to :func:`decompressobj` and the first + call to the decompressor's ``decompress()`` method. + + .. versionchanged:: 3.3 + Added the *zdict* parameter. + Compression objects support the following methods: @@ -151,23 +191,16 @@ Compression objects support the following methods: compress a set of data that share a common initial prefix. -Decompression objects support the following methods, and two attributes: +Decompression objects support the following methods and attributes: .. attribute:: Decompress.unused_data A bytes object which contains any bytes past the end of the compressed data. That is, - this remains ``""`` until the last byte that contains compression data is + this remains ``b""`` until the last byte that contains compression data is available. If the whole bytestring turned out to contain compressed data, this is ``b""``, an empty bytes object. - The only way to determine where a bytestring of compressed data ends is by actually - decompressing it. This means that when compressed data is contained part of a - larger file, you can only find the end of it by reading data and feeding it - followed by some non-empty bytestring into a decompression object's - :meth:`decompress` method until the :attr:`unused_data` attribute is no longer - empty. - .. attribute:: Decompress.unconsumed_tail @@ -178,6 +211,17 @@ Decompression objects support the following methods, and two attributes: :meth:`decompress` method call in order to get correct output. +.. attribute:: Decompress.eof + + A boolean indicating whether the end of the compressed data stream has been + reached. + + This makes it possible to distinguish between a properly-formed compressed + stream, and an incomplete or truncated one. + + .. versionadded:: 3.3 + + .. method:: Decompress.decompress(data[, max_length]) Decompress *data*, returning a bytes object containing the uncompressed data @@ -212,6 +256,24 @@ Decompression objects support the following methods, and two attributes: seeks into the stream at a future point. +Information about the version of the zlib library in use is available through +the following constants: + + +.. data:: ZLIB_VERSION + + The version string of the zlib library that was used for building the module. + This may be different from the zlib library actually used at runtime, which + is available as :const:`ZLIB_RUNTIME_VERSION`. + + +.. data:: ZLIB_RUNTIME_VERSION + + The version string of the zlib library actually loaded by the interpreter. + + .. versionadded:: 3.3 + + .. seealso:: Module :mod:`gzip` diff --git a/Doc/license.rst b/Doc/license.rst index a3257f8..4440c7b 100644 --- a/Doc/license.rst +++ b/Doc/license.rst @@ -50,77 +50,11 @@ been GPL-compatible; the table below summarizes the various releases. +----------------+--------------+------------+------------+-----------------+ | 2.1.1 | 2.1+2.0.1 | 2001 | PSF | yes | +----------------+--------------+------------+------------+-----------------+ -| 2.2 | 2.1.1 | 2001 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ | 2.1.2 | 2.1.1 | 2002 | PSF | yes | +----------------+--------------+------------+------------+-----------------+ | 2.1.3 | 2.1.2 | 2002 | PSF | yes | +----------------+--------------+------------+------------+-----------------+ -| 2.2.1 | 2.2 | 2002 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.2.2 | 2.2.1 | 2002 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.2.3 | 2.2.2 | 2002-2003 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.3 | 2.2.2 | 2002-2003 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.3.1 | 2.3 | 2002-2003 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.3.2 | 2.3.1 | 2003 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.3.3 | 2.3.2 | 2003 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.3.4 | 2.3.3 | 2004 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.3.5 | 2.3.4 | 2005 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.4 | 2.3 | 2004 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.4.1 | 2.4 | 2005 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.4.2 | 2.4.1 | 2005 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.4.3 | 2.4.2 | 2006 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.4.4 | 2.4.3 | 2006 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.5 | 2.4 | 2006 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.5.1 | 2.5 | 2007 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.6 | 2.5 | 2008 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.6.1 | 2.6 | 2008 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.6.2 | 2.6.1 | 2009 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.6.3 | 2.6.2 | 2009 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 2.6.4 | 2.6.3 | 2009 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.0 | 2.6 | 2008 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.0.1 | 3.0 | 2009 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.1 | 3.0.1 | 2009 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.1.1 | 3.1 | 2009 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.1.2 | 3.1.1 | 2010 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.1.3 | 3.1.2 | 2010 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.1.4 | 3.1.3 | 2011 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.2 | 3.1 | 2011 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.2.1 | 3.2 | 2011 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.2.2 | 3.2.1 | 2011 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.2.3 | 3.2.2 | 2012 | PSF | yes | -+----------------+--------------+------------+------------+-----------------+ -| 3.2.4 | 3.2.3 | 2013 | PSF | yes | +| 2.2 and above | 2.1.1 | 2001-now | PSF | yes | +----------------+--------------+------------+------------+-----------------+ .. note:: @@ -150,7 +84,7 @@ Terms and conditions for accessing or otherwise using Python analyze, test, perform and/or display publicly, prepare derivative works, distribute, and otherwise use Python |release| alone or in any derivative version, provided, however, that PSF's License Agreement and PSF's notice of - copyright, i.e., "Copyright © 2001-2013 Python Software Foundation; All Rights + copyright, i.e., "Copyright © 2001-2014 Python Software Foundation; All Rights Reserved" are retained in Python |release| alone or in any derivative version prepared by Licensee. @@ -911,3 +845,36 @@ used for the build:: Jean-loup Gailly Mark Adler jloup@gzip.org madler@alumni.caltech.edu + +libmpdec +-------- + +The :mod:`_decimal` Module is built using an included copy of the libmpdec +library unless the build is configured ``--with-system-libmpdec``:: + + Copyright (c) 2008-2016 Stefan Krah. All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + 1. Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + + 2. Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in the + documentation and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS "AS IS" AND + ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE + IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE + ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE + FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS + OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) + HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT + LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY + OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + SUCH DAMAGE. + + diff --git a/Doc/make.bat b/Doc/make.bat index af3bade..d6f7074 100644 --- a/Doc/make.bat +++ b/Doc/make.bat @@ -34,10 +34,10 @@ echo. goto end :checkout -svn co %SVNROOT%/external/Sphinx-1.0.7/sphinx tools/sphinx -svn co %SVNROOT%/external/docutils-0.6/docutils tools/docutils +svn co %SVNROOT%/external/Sphinx-1.2/sphinx tools/sphinx +svn co %SVNROOT%/external/docutils-0.11/docutils tools/docutils svn co %SVNROOT%/external/Jinja-2.3.1/jinja2 tools/jinja2 -svn co %SVNROOT%/external/Pygments-1.3.1/pygments tools/pygments +svn co %SVNROOT%/external/Pygments-1.6/pygments tools/pygments goto end :update diff --git a/Doc/reference/compound_stmts.rst b/Doc/reference/compound_stmts.rst index d0d0646..8afc69e 100644 --- a/Doc/reference/compound_stmts.rst +++ b/Doc/reference/compound_stmts.rst @@ -535,17 +535,17 @@ function. The annotation values are available as values of a dictionary keyed by the parameters' names in the :attr:`__annotations__` attribute of the function object. -.. index:: pair: lambda; form +.. index:: pair: lambda; expression It is also possible to create anonymous functions (functions not bound to a -name), for immediate use in expressions. This uses lambda forms, described in -section :ref:`lambda`. Note that the lambda form is merely a shorthand for a +name), for immediate use in expressions. This uses lambda expressions, described in +section :ref:`lambda`. Note that the lambda expression is merely a shorthand for a simplified function definition; a function defined in a ":keyword:`def`" statement can be passed around or assigned to another name just like a function -defined by a lambda form. The ":keyword:`def`" form is actually more powerful +defined by a lambda expression. The ":keyword:`def`" form is actually more powerful since it allows the execution of multiple statements and annotations. -**Programmer's note:** Functions are first-class objects. A "``def``" form +**Programmer's note:** Functions are first-class objects. A "``def``" statement executed inside a function definition defines a local function that can be returned or passed around. Free variables used in the nested function can access the local variables of the function containing the def. See section diff --git a/Doc/reference/datamodel.rst b/Doc/reference/datamodel.rst index 4bad2d5..160af30 100644 --- a/Doc/reference/datamodel.rst +++ b/Doc/reference/datamodel.rst @@ -35,12 +35,19 @@ represented by objects.) Every object has an identity, a type and a value. An object's *identity* never changes once it has been created; you may think of it as the object's address in memory. The ':keyword:`is`' operator compares the identity of two objects; the -:func:`id` function returns an integer representing its identity (currently -implemented as its address). An object's :dfn:`type` is also unchangeable. [#]_ +:func:`id` function returns an integer representing its identity. + +.. impl-detail:: + + For CPython, ``id(x)`` is the memory address where ``x`` is stored. + An object's type determines the operations that the object supports (e.g., "does it have a length?") and also defines the possible values for objects of that type. The :func:`type` function returns an object's type (which is an object -itself). The *value* of some objects can change. Objects whose value can +itself). Like its identity, an object's :dfn:`type` is also unchangeable. +[#]_ + +The *value* of some objects can change. Objects whose value can change are said to be *mutable*; objects whose value is unchangeable once they are created are called *immutable*. (The value of an immutable container object that contains a reference to a mutable object can change when the latter's value @@ -194,7 +201,7 @@ Ellipsis single: True These represent the truth values False and True. The two objects representing - the values False and True are the only Boolean objects. The Boolean type is a + the values ``False`` and ``True`` are the only Boolean objects. The Boolean type is a subtype of the integer type, and Boolean values behave like the values 0 and 1, respectively, in almost all contexts, the exception being that when converted to a string, the strings ``"False"`` or ``"True"`` are returned, respectively. @@ -267,25 +274,27 @@ Sequences The following types are immutable sequences: + .. index:: + single: string; immutable sequences + Strings .. index:: builtin: chr builtin: ord - builtin: str single: character single: integer single: Unicode - The items of a string object are Unicode code units. A Unicode code - unit is represented by a string object of one item and can hold either - a 16-bit or 32-bit value representing a Unicode ordinal (the maximum - value for the ordinal is given in ``sys.maxunicode``, and depends on - how Python is configured at compile time). Surrogate pairs may be - present in the Unicode object, and will be reported as two separate - items. The built-in functions :func:`chr` and :func:`ord` convert - between code units and nonnegative integers representing the Unicode - ordinals as defined in the Unicode Standard 3.0. Conversion from and to - other encodings are possible through the string method :meth:`encode`. + A string is a sequence of values that represent Unicode codepoints. + All the codepoints in range ``U+0000 - U+10FFFF`` can be represented + in a string. Python doesn't have a :c:type:`chr` type, and + every character in the string is represented as a string object + with length ``1``. The built-in function :func:`ord` converts a + character to its codepoint (as an integer); :func:`chr` converts + an integer in range ``0 - 10FFFF`` to the corresponding character. + :meth:`str.encode` can be used to convert a :class:`str` to + :class:`bytes` using the given encoding, and :meth:`bytes.decode` can + be used to achieve the opposite. Tuples .. index:: @@ -307,7 +316,7 @@ Sequences represented by integers in the range 0 <= x < 256. Bytes literals (like ``b'abc'``) and the built-in function :func:`bytes` can be used to construct bytes objects. Also, bytes objects can be decoded to strings - via the :meth:`decode` method. + via the :meth:`~bytes.decode` method. Mutable sequences .. index:: @@ -369,7 +378,7 @@ Set types These represent a mutable set. They are created by the built-in :func:`set` constructor and can be modified afterwards by several methods, such as - :meth:`add`. + :meth:`~set.add`. Frozen sets .. index:: object: frozenset @@ -439,6 +448,8 @@ Callable types Special attributes: + .. tabularcolumns:: |l|L|l| + +-------------------------+-------------------------------+-----------+ | Attribute | Meaning | | +=========================+===============================+===========+ @@ -448,6 +459,11 @@ Callable types +-------------------------+-------------------------------+-----------+ | :attr:`__name__` | The function's name | Writable | +-------------------------+-------------------------------+-----------+ + | :attr:`__qualname__` | The function's | Writable | + | | :term:`qualified name` | | + | | | | + | | .. versionadded:: 3.3 | | + +-------------------------+-------------------------------+-----------+ | :attr:`__module__` | The name of the module the | Writable | | | function was defined in, or | | | | ``None`` if unavailable. | | @@ -479,7 +495,7 @@ Callable types | :attr:`__annotations__` | A dict containing annotations | Writable | | | of parameters. The keys of | | | | the dict are the parameter | | - | | names, or ``'return'`` for | | + | | names, and ``'return'`` for | | | | the return annotation, if | | | | provided. | | +-------------------------+-------------------------------+-----------+ @@ -588,7 +604,7 @@ Callable types A function or method which uses the :keyword:`yield` statement (see section :ref:`yield`) is called a :dfn:`generator function`. Such a function, when called, always returns an iterator object which can be used to execute the - body of the function: calling the iterator's :meth:`iterator__next__` + body of the function: calling the iterator's :meth:`iterator.__next__` method will cause the function to execute until it provides a value using the :keyword:`yield` statement. When the function executes a :keyword:`return` statement or falls off the end, a :exc:`StopIteration` @@ -639,17 +655,20 @@ Modules statement: import object: module - Modules are imported by the :keyword:`import` statement (see section - :ref:`import`). A module object has a - namespace implemented by a dictionary object (this is the dictionary referenced - by the __globals__ attribute of functions defined in the module). Attribute - references are translated to lookups in this dictionary, e.g., ``m.x`` is - equivalent to ``m.__dict__["x"]``. A module object does not contain the code - object used to initialize the module (since it isn't needed once the - initialization is done). - - Attribute assignment updates the module's namespace dictionary, e.g., ``m.x = - 1`` is equivalent to ``m.__dict__["x"] = 1``. + Modules are a basic organizational unit of Python code, and are created by + the :ref:`import system <importsystem>` as invoked either by the + :keyword:`import` statement (see :keyword:`import`), or by calling + functions such as :func:`importlib.import_module` and built-in + :func:`__import__`. A module object has a namespace implemented by a + dictionary object (this is the dictionary referenced by the ``__globals__`` + attribute of functions defined in the module). Attribute references are + translated to lookups in this dictionary, e.g., ``m.x`` is equivalent to + ``m.__dict__["x"]``. A module object does not contain the code object used + to initialize the module (since it isn't needed once the initialization is + done). + + Attribute assignment updates the module's namespace dictionary, e.g., + ``m.x = 1`` is equivalent to ``m.__dict__["x"] = 1``. .. index:: single: __dict__ (module attribute) @@ -671,11 +690,12 @@ Modules Predefined (writable) attributes: :attr:`__name__` is the module's name; :attr:`__doc__` is the module's documentation string, or ``None`` if - unavailable; :attr:`__file__` is the pathname of the file from which the module - was loaded, if it was loaded from a file. The :attr:`__file__` attribute is not - present for C modules that are statically linked into the interpreter; for - extension modules loaded dynamically from a shared library, it is the pathname - of the shared library file. + unavailable; :attr:`__file__` is the pathname of the file from which the + module was loaded, if it was loaded from a file. The :attr:`__file__` + attribute may be missing for certain types of modules, such as C modules + that are statically linked into the interpreter; for extension modules + loaded dynamically from a shared library, it is the pathname of the shared + library file. Custom classes Custom class types are typically created by class definitions (see section @@ -728,10 +748,10 @@ Custom classes Special attributes: :attr:`__name__` is the class name; :attr:`__module__` is the module name in which the class was defined; :attr:`__dict__` is the - dictionary containing the class's namespace; :attr:`__bases__` is a tuple - (possibly empty or a singleton) containing the base classes, in the order of - their occurrence in the base class list; :attr:`__doc__` is the class's - documentation string, or None if undefined. + dictionary containing the class's namespace; :attr:`~class.__bases__` is a + tuple (possibly empty or a singleton) containing the base classes, in the + order of their occurrence in the base class list; :attr:`__doc__` is the + class's documentation string, or None if undefined. Class instances .. index:: @@ -773,8 +793,8 @@ Class instances single: __dict__ (instance attribute) single: __class__ (instance attribute) - Special attributes: :attr:`__dict__` is the attribute dictionary; - :attr:`__class__` is the instance's class. + Special attributes: :attr:`~object.__dict__` is the attribute dictionary; + :attr:`~instance.__class__` is the instance's class. I/O objects (also known as file objects) .. index:: @@ -792,9 +812,9 @@ I/O objects (also known as file objects) A :term:`file object` represents an open file. Various shortcuts are available to create file objects: the :func:`open` built-in function, and - also :func:`os.popen`, :func:`os.fdopen`, and the :meth:`makefile` method - of socket objects (and perhaps by other functions or methods provided - by extension modules). + also :func:`os.popen`, :func:`os.fdopen`, and the + :meth:`~socket.socket.makefile` method of socket objects (and perhaps by + other functions or methods provided by extension modules). The objects ``sys.stdin``, ``sys.stdout`` and ``sys.stderr`` are initialized to file objects corresponding to the interpreter's standard @@ -963,9 +983,9 @@ Internal types single: stop (slice object attribute) single: step (slice object attribute) - Special read-only attributes: :attr:`start` is the lower bound; :attr:`stop` is - the upper bound; :attr:`step` is the step value; each is ``None`` if omitted. - These attributes can have any type. + Special read-only attributes: :attr:`~slice.start` is the lower bound; + :attr:`~slice.stop` is the upper bound; :attr:`~slice.step` is the step + value; each is ``None`` if omitted. These attributes can have any type. Slice objects support one method: @@ -1019,7 +1039,8 @@ When implementing a class that emulates any built-in type, it is important that the emulation only be implemented to the degree that it makes sense for the object being modelled. For example, some sequences may work well with retrieval of individual elements, but extracting a slice may not make sense. (One example -of this is the :class:`NodeList` interface in the W3C's Document Object Model.) +of this is the :class:`~xml.dom.NodeList` interface in the W3C's Document +Object Model.) .. _customization: @@ -1153,7 +1174,7 @@ Basic customization Called by :func:`str(object) <str>` and the built-in functions :func:`format` and :func:`print` to compute the "informal" or nicely printable string representation of an object. The return value must be a - :ref:`string <typesseq>` object. + :ref:`string <textseq>` object. This method differs from :meth:`object.__repr__` in that there is no expectation that :meth:`__str__` return a valid Python expression: a more @@ -1172,14 +1193,14 @@ Basic customization Called by :func:`bytes` to compute a byte-string representation of an object. This should return a ``bytes`` object. - -.. method:: object.__format__(self, format_spec) - .. index:: + single: string; __format__() (object method) pair: string; conversion - builtin: str builtin: print + +.. method:: object.__format__(self, format_spec) + Called by the :func:`format` built-in function (and by extension, the :meth:`str.format` method of class :class:`str`) to produce a "formatted" string representation of an object. The ``format_spec`` argument is @@ -1244,10 +1265,21 @@ Basic customization Called by built-in function :func:`hash` and for operations on members of hashed collections including :class:`set`, :class:`frozenset`, and - :class:`dict`. :meth:`__hash__` should return an integer. The only required - property is that objects which compare equal have the same hash value; it is - advised to somehow mix together (e.g. using exclusive or) the hash values for - the components of the object that also play a part in comparison of objects. + :class:`dict`. :meth:`__hash__` should return an integer. The only + required property is that objects which compare equal have the same hash + value; it is advised to somehow mix together (e.g. using exclusive or) the + hash values for the components of the object that also play a part in + comparison of objects. + + .. note:: + + :func:`hash` truncates the value returned from an object's custom + :meth:`__hash__` method to the size of a :c:type:`Py_ssize_t`. This is + typically 8 bytes on 64-bit builds and 4 bytes on 32-bit builds. If an + object's :meth:`__hash__` must interoperate on builds of different bit + sizes, be sure to check the width on all supported builds. An easy way + to do this is with + ``python -c "import sys; print(sys.hash_info.width)"`` If a class does not define an :meth:`__eq__` method it should not define a :meth:`__hash__` operation either; if it defines :meth:`__eq__` but not @@ -1258,10 +1290,10 @@ Basic customization immutable (if the object's hash value changes, it will be in the wrong hash bucket). - User-defined classes have :meth:`__eq__` and :meth:`__hash__` methods by default; with them, all objects compare unequal (except with themselves) - and ``x.__hash__()`` returns ``id(x)``. + and ``x.__hash__()`` returns an appropriate value such that ``x == y`` + implies both that ``x is y`` and ``hash(x) == hash(y)``. A class that overrides :meth:`__eq__` and does not define :meth:`__hash__` will have its :meth:`__hash__` implicitly set to ``None``. When the @@ -1280,7 +1312,27 @@ Basic customization a :exc:`TypeError` would be incorrectly identified as hashable by an ``isinstance(obj, collections.Hashable)`` call. - See also the :option:`-R` command-line option. + + .. note:: + + By default, the :meth:`__hash__` values of str, bytes and datetime + objects are "salted" with an unpredictable random value. Although they + remain constant within an individual Python process, they are not + predictable between repeated invocations of Python. + + This is intended to provide protection against a denial-of-service caused + by carefully-chosen inputs that exploit the worst case performance of a + dict insertion, O(n^2) complexity. See + http://www.ocert.org/advisories/ocert-2011-003.html for details. + + Changing hash values affects the iteration order of dicts, sets and + other mappings. Python has never made guarantees about this ordering + (and it typically varies between 32-bit and 64-bit builds). + + See also :envvar:`PYTHONHASHSEED`. + + .. versionchanged:: 3.3 + Hash randomization is enabled by default. .. method:: object.__bool__(self) @@ -1361,7 +1413,8 @@ access (use of, assignment to, or deletion of ``x.name``) for class instances. .. method:: object.__dir__(self) - Called when :func:`dir` is called on the object. A list must be returned. + Called when :func:`dir` is called on the object. A sequence must be + returned. :func:`dir` converts the returned sequence to a list and sorts it. .. _descriptors: @@ -1518,7 +1571,7 @@ Notes on using *__slots__* program undefined. In the future, a check may be added to prevent this. * Nonempty *__slots__* does not work for classes derived from "variable-length" - built-in types such as :class:`int`, :class:`str` and :class:`tuple`. + built-in types such as :class:`int`, :class:`bytes` and :class:`tuple`. * Any non-string iterable may be assigned to *__slots__*. Mappings may also be used; however, in the future, special meaning may be assigned to the values @@ -1532,53 +1585,115 @@ Notes on using *__slots__* Customizing class creation -------------------------- -By default, classes are constructed using :func:`type`. A class definition is -read into a separate namespace and the value of class name is bound to the -result of ``type(name, bases, dict)``. +By default, classes are constructed using :func:`type`. The class body is +executed in a new namespace and the class name is bound locally to the +result of ``type(name, bases, namespace)``. + +The class creation process can be customised by passing the ``metaclass`` +keyword argument in the class definition line, or by inheriting from an +existing class that included such an argument. In the following example, +both ``MyClass`` and ``MySubclass`` are instances of ``Meta``:: + + class Meta(type): + pass + + class MyClass(metaclass=Meta): + pass + + class MySubclass(MyClass): + pass + +Any other keyword arguments that are specified in the class definition are +passed through to all metaclass operations described below. + +When a class definition is executed, the following steps occur: + +* the appropriate metaclass is determined +* the class namespace is prepared +* the class body is executed +* the class object is created + +Determining the appropriate metaclass +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The appropriate metaclass for a class definition is determined as follows: + +* if no bases and no explicit metaclass are given, then :func:`type` is used +* if an explicit metaclass is given and it is *not* an instance of + :func:`type`, then it is used directly as the metaclass +* if an instance of :func:`type` is given as the explicit metaclass, or + bases are defined, then the most derived metaclass is used -When the class definition is read, if a callable ``metaclass`` keyword argument -is passed after the bases in the class definition, the callable given will be -called instead of :func:`type`. If other keyword arguments are passed, they -will also be passed to the metaclass. This allows classes or functions to be -written which monitor or alter the class creation process: +The most derived metaclass is selected from the explicitly specified +metaclass (if any) and the metaclasses (i.e. ``type(cls)``) of all specified +base classes. The most derived metaclass is one which is a subtype of *all* +of these candidate metaclasses. If none of the candidate metaclasses meets +that criterion, then the class definition will fail with ``TypeError``. -* Modifying the class dictionary prior to the class being created. -* Returning an instance of another class -- essentially performing the role of a - factory function. +Preparing the class namespace +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -These steps will have to be performed in the metaclass's :meth:`__new__` method --- :meth:`type.__new__` can then be called from this method to create a class -with different properties. This example adds a new element to the class -dictionary before creating the class:: +Once the appropriate metaclass has been identified, then the class namespace +is prepared. If the metaclass has a ``__prepare__`` attribute, it is called +as ``namespace = metaclass.__prepare__(name, bases, **kwds)`` (where the +additional keyword arguments, if any, come from the class definition). - class metacls(type): - def __new__(mcs, name, bases, dict): - dict['foo'] = 'metacls was here' - return type.__new__(mcs, name, bases, dict) +If the metaclass has no ``__prepare__`` attribute, then the class namespace +is initialised as an empty :func:`dict` instance. -You can of course also override other class methods (or add new methods); for -example defining a custom :meth:`__call__` method in the metaclass allows custom -behavior when the class is called, e.g. not always creating a new instance. +.. seealso:: + + :pep:`3115` - Metaclasses in Python 3000 + Introduced the ``__prepare__`` namespace hook + + +Executing the class body +^^^^^^^^^^^^^^^^^^^^^^^^ + +The class body is executed (approximately) as +``exec(body, globals(), namespace)``. The key difference from a normal +call to :func:`exec` is that lexical scoping allows the class body (including +any methods) to reference names from the current and outer scopes when the +class definition occurs inside a function. -If the metaclass has a :meth:`__prepare__` attribute (usually implemented as a -class or static method), it is called before the class body is evaluated with -the name of the class and a tuple of its bases for arguments. It should return -an object that supports the mapping interface that will be used to store the -namespace of the class. The default is a plain dictionary. This could be used, -for example, to keep track of the order that class attributes are declared in by -returning an ordered dictionary. +However, even when the class definition occurs inside the function, methods +defined inside the class still cannot see names defined at the class scope. +Class variables must be accessed through the first parameter of instance or +class methods, and cannot be accessed at all from static methods. -The appropriate metaclass is determined by the following precedence rules: -* If the ``metaclass`` keyword argument is passed with the bases, it is used. +Creating the class object +^^^^^^^^^^^^^^^^^^^^^^^^^ + +Once the class namespace has been populated by executing the class body, +the class object is created by calling +``metaclass(name, bases, namespace, **kwds)`` (the additional keywords +passed here are the same as those passed to ``__prepare__``). + +This class object is the one that will be referenced by the zero-argument +form of :func:`super`. ``__class__`` is an implicit closure reference +created by the compiler if any methods in a class body refer to either +``__class__`` or ``super``. This allows the zero argument form of +:func:`super` to correctly identify the class being defined based on +lexical scoping, while the class or instance that was used to make the +current call is identified based on the first argument passed to the method. + +After the class object is created, it is passed to the class decorators +included in the class definition (if any) and the resulting object is bound +in the local namespace as the defined class. + +.. seealso:: + + :pep:`3135` - New super + Describes the implicit ``__class__`` closure reference -* Otherwise, if there is at least one base class, its metaclass is used. -* Otherwise, the default metaclass (:class:`type`) is used. +Metaclass example +^^^^^^^^^^^^^^^^^ The potential uses for metaclasses are boundless. Some ideas that have been -explored including logging, interface checking, automatic delegation, automatic +explored include logging, interface checking, automatic delegation, automatic property creation, proxies, frameworks, and automatic resource locking/synchronization. @@ -1591,9 +1706,9 @@ to remember the order that class members were defined:: def __prepare__(metacls, name, bases, **kwds): return collections.OrderedDict() - def __new__(cls, name, bases, classdict): - result = type.__new__(cls, name, bases, dict(classdict)) - result.members = tuple(classdict) + def __new__(cls, name, bases, namespace, **kwds): + result = type.__new__(cls, name, bases, dict(namespace)) + result.members = tuple(namespace) return result class A(metaclass=OrderedClass): @@ -1649,10 +1764,10 @@ case the instance is itself a class. :pep:`3119` - Introducing Abstract Base Classes Includes the specification for customizing :func:`isinstance` and - :func:`issubclass` behavior through :meth:`__instancecheck__` and - :meth:`__subclasscheck__`, with motivation for this functionality in the - context of adding Abstract Base Classes (see the :mod:`abc` module) to the - language. + :func:`issubclass` behavior through :meth:`~class.__instancecheck__` and + :meth:`~class.__subclasscheck__`, with motivation for this functionality + in the context of adding Abstract Base Classes (see the :mod:`abc` + module) to the language. .. _callable-types: @@ -1682,9 +1797,10 @@ a sequence, the allowable keys should be the integers *k* for which ``0 <= k < N`` where *N* is the length of the sequence, or slice objects, which define a range of items. It is also recommended that mappings provide the methods :meth:`keys`, :meth:`values`, :meth:`items`, :meth:`get`, :meth:`clear`, -:meth:`setdefault`, :meth:`pop`, :meth:`popitem`, :meth:`copy`, and +:meth:`setdefault`, :meth:`pop`, :meth:`popitem`, :meth:`!copy`, and :meth:`update` behaving similar to those for Python's standard dictionary -objects. The :mod:`collections` module provides a :class:`MutableMapping` +objects. The :mod:`collections` module provides a +:class:`~collections.abc.MutableMapping` abstract base class to help create those methods from a base set of :meth:`__getitem__`, :meth:`__setitem__`, :meth:`__delitem__`, and :meth:`keys`. Mutable sequences should provide methods :meth:`append`, :meth:`count`, @@ -1907,11 +2023,13 @@ left undefined. ``&=``, ``^=``, ``|=``). These methods should attempt to do the operation in-place (modifying *self*) and return the result (which could be, but does not have to be, *self*). If a specific method is not defined, the augmented - assignment falls back to the normal methods. For instance, to execute the - statement ``x += y``, where *x* is an instance of a class that has an - :meth:`__iadd__` method, ``x.__iadd__(y)`` is called. If *x* is an instance - of a class that does not define a :meth:`__iadd__` method, ``x.__add__(y)`` - and ``y.__radd__(x)`` are considered, as with the evaluation of ``x + y``. + assignment falls back to the normal methods. For instance, if *x* is an + instance of a class with an :meth:`__iadd__` method, ``x += y`` is equivalent + to ``x = x.__iadd__(y)`` . Otherwise, ``x.__add__(y)`` and ``y.__radd__(x)`` + are considered, as with the evaluation of ``x + y``. In certain situations, + augmented assignment can result in unexpected errors (see + :ref:`faq-augmented-assignment-tuple-error`), but this behavior is in + fact part of the data model. .. method:: object.__neg__(self) diff --git a/Doc/reference/expressions.rst b/Doc/reference/expressions.rst index 68de3c8..06baba0 100644 --- a/Doc/reference/expressions.rst +++ b/Doc/reference/expressions.rst @@ -84,14 +84,13 @@ exception. definition begins with two or more underscore characters and does not end in two or more underscores, it is considered a :dfn:`private name` of that class. Private names are transformed to a longer form before code is generated for -them. The transformation inserts the class name in front of the name, with -leading underscores removed, and a single underscore inserted in front of the -class name. For example, the identifier ``__spam`` occurring in a class named -``Ham`` will be transformed to ``_Ham__spam``. This transformation is -independent of the syntactical context in which the identifier is used. If the -transformed name is extremely long (longer than 255 characters), implementation -defined truncation may happen. If the class name consists only of underscores, -no transformation is done. +them. The transformation inserts the class name, with leading underscores +removed and a single underscore inserted, in front of the name. For example, +the identifier ``__spam`` occurring in a class named ``Ham`` will be transformed +to ``_Ham__spam``. This transformation is independent of the syntactical +context in which the identifier is used. If the transformed name is extremely +long (longer than 255 characters), implementation defined truncation may happen. +If the class name consists only of underscores, no transformation is done. .. _atom-literals: @@ -318,25 +317,27 @@ Yield expressions .. productionlist:: yield_atom: "(" `yield_expression` ")" - yield_expression: "yield" [`expression_list`] + yield_expression: "yield" [`expression_list` | "from" `expression`] -The :keyword:`yield` expression is only used when defining a generator function, -and can only be used in the body of a function definition. Using a -:keyword:`yield` expression in a function definition is sufficient to cause that -definition to create a generator function instead of a normal function. +The yield expression is only used when defining a :term:`generator` function and +thus can only be used in the body of a function definition. Using a yield +expression in a function's body causes that function to be a generator. When a generator function is called, it returns an iterator known as a generator. That generator then controls the execution of a generator function. The execution starts when one of the generator's methods is called. At that -time, the execution proceeds to the first :keyword:`yield` expression, where it -is suspended again, returning the value of :token:`expression_list` to -generator's caller. By suspended we mean that all local state is retained, -including the current bindings of local variables, the instruction pointer, and -the internal evaluation stack. When the execution is resumed by calling one of -the generator's methods, the function can proceed exactly as if the -:keyword:`yield` expression was just another external call. The value of the -:keyword:`yield` expression after resuming depends on the method which resumed -the execution. +time, the execution proceeds to the first yield expression, where it is +suspended again, returning the value of :token:`expression_list` to generator's +caller. By suspended, we mean that all local state is retained, including the +current bindings of local variables, the instruction pointer, and the internal +evaluation stack. When the execution is resumed by calling one of the +generator's methods, the function can proceed exactly as if the yield expression +was just another external call. The value of the yield expression after +resuming depends on the method which resumed the execution. If +:meth:`~generator.__next__` is used (typically via either a :keyword:`for` or +the :func:`next` builtin) then the result is :const:`None`. Otherwise, if +:meth:`~generator.send` is used, then the result will be the value passed in to +that method. .. index:: single: coroutine @@ -346,14 +347,47 @@ suspended. The only difference is that a generator function cannot control where should the execution continue after it yields; the control is always transferred to the generator's caller. -The :keyword:`yield` statement is allowed in the :keyword:`try` clause of a -:keyword:`try` ... :keyword:`finally` construct. If the generator is not -resumed before it is finalized (by reaching a zero reference count or by being -garbage collected), the generator-iterator's :meth:`close` method will be -called, allowing any pending :keyword:`finally` clauses to execute. +yield expressions are allowed in the :keyword:`try` clause of a :keyword:`try` +... :keyword:`finally` construct. If the generator is not resumed before it is +finalized (by reaching a zero reference count or by being garbage collected), +the generator-iterator's :meth:`~generator.close` method will be called, +allowing any pending :keyword:`finally` clauses to execute. + +When ``yield from <expr>`` is used, it treats the supplied expression as +a subiterator. All values produced by that subiterator are passed directly +to the caller of the current generator's methods. Any values passed in with +:meth:`~generator.send` and any exceptions passed in with +:meth:`~generator.throw` are passed to the underlying iterator if it has the +appropriate methods. If this is not the case, then :meth:`~generator.send` +will raise :exc:`AttributeError` or :exc:`TypeError`, while +:meth:`~generator.throw` will just raise the passed in exception immediately. + +When the underlying iterator is complete, the :attr:`~StopIteration.value` +attribute of the raised :exc:`StopIteration` instance becomes the value of +the yield expression. It can be either set explicitly when raising +:exc:`StopIteration`, or automatically when the sub-iterator is a generator +(by returning a value from the sub-generator). + + .. versionchanged:: 3.3 + Added ``yield from <expr>`` to delegate control flow to a subiterator + +The parentheses may be omitted when the yield expression is the sole expression +on the right hand side of an assignment statement. -.. index:: object: generator +.. seealso:: + + :pep:`0255` - Simple Generators + The proposal for adding generators and the :keyword:`yield` statement to Python. + + :pep:`0342` - Coroutines via Enhanced Generators + The proposal to enhance the API and syntax of generators, making them + usable as simple coroutines. + :pep:`0380` - Syntax for Delegating to a Subgenerator + The proposal to introduce the :token:`yield_from` syntax, making delegation + to sub-generators easy. + +.. index:: object: generator Generator-iterator methods ^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -365,18 +399,18 @@ Note that calling any of the generator methods below when the generator is already executing raises a :exc:`ValueError` exception. .. index:: exception: StopIteration +.. class:: generator .. method:: generator.__next__() Starts the execution of a generator function or resumes it at the last - executed :keyword:`yield` expression. When a generator function is resumed - with a :meth:`~generator.__next__` method, the current :keyword:`yield` - expression always evaluates to :const:`None`. The execution then continues - to the next :keyword:`yield` expression, where the generator is suspended - again, and the value of the :token:`expression_list` is returned to - :meth:`next`'s caller. - If the generator exits without yielding another value, a :exc:`StopIteration` + executed yield expression. When a generator function is resumed with a + :meth:`~generator.__next__` method, the current yield expression always + evaluates to :const:`None`. The execution then continues to the next yield + expression, where the generator is suspended again, and the value of the + :token:`expression_list` is returned to :meth:`next`'s caller. If the + generator exits without yielding another value, a :exc:`StopIteration` exception is raised. This method is normally called implicitly, e.g. by a :keyword:`for` loop, or @@ -386,12 +420,12 @@ is already executing raises a :exc:`ValueError` exception. .. method:: generator.send(value) Resumes the execution and "sends" a value into the generator function. The - ``value`` argument becomes the result of the current :keyword:`yield` - expression. The :meth:`send` method returns the next value yielded by the - generator, or raises :exc:`StopIteration` if the generator exits without - yielding another value. When :meth:`send` is called to start the generator, - it must be called with :const:`None` as the argument, because there is no - :keyword:`yield` expression that could receive the value. + *value* argument becomes the result of the current yield expression. The + :meth:`send` method returns the next value yielded by the generator, or + raises :exc:`StopIteration` if the generator exits without yielding another + value. When :meth:`send` is called to start the generator, it must be called + with :const:`None` as the argument, because there is no yield expression that + could receive the value. .. method:: generator.throw(type[, value[, traceback]]) @@ -415,6 +449,13 @@ is already executing raises a :exc:`ValueError` exception. other exception, it is propagated to the caller. :meth:`close` does nothing if the generator has already exited due to an exception or normal exit. +.. class:: . + +.. index:: single: yield; examples + +Examples +^^^^^^^^ + Here is a simple example that demonstrates the behavior of generators and generator functions:: @@ -442,15 +483,8 @@ generator functions:: >>> generator.close() Don't forget to clean up when 'close()' is called. - -.. seealso:: - - :pep:`0255` - Simple Generators - The proposal for adding generators and the :keyword:`yield` statement to Python. - - :pep:`0342` - Coroutines via Enhanced Generators - The proposal to enhance the API and syntax of generators, making them - usable as simple coroutines. +For examples using ``yield from``, see :ref:`pep-380` in "What's New in +Python." .. _primaries: @@ -594,10 +628,10 @@ follows. If the slice list contains at least one comma, the key is a tuple containing the conversion of the slice items; otherwise, the conversion of the lone slice item is the key. The conversion of a slice item that is an expression is that expression. The conversion of a proper slice is a slice -object (see section :ref:`types`) whose :attr:`start`, :attr:`stop` and -:attr:`step` attributes are the values of the expressions given as lower bound, -upper bound and stride, respectively, substituting ``None`` for missing -expressions. +object (see section :ref:`types`) whose :attr:`~slice.start`, +:attr:`~slice.stop` and :attr:`~slice.step` attributes are the values of the +expressions given as lower bound, upper bound and stride, respectively, +substituting ``None`` for missing expressions. .. index:: @@ -876,7 +910,7 @@ repetition is performed; a negative repetition factor yields an empty sequence. The ``/`` (division) and ``//`` (floor division) operators yield the quotient of their arguments. The numeric arguments are first converted to a common type. -Integer division yields a float, while floor division of integers results in an +Division of integers yields a float, while floor division of integers results in an integer; the result is that of mathematical division with the 'floor' function applied to the result. Division by zero raises the :exc:`ZeroDivisionError` exception. @@ -936,8 +970,8 @@ the left or right by the number of bits given by the second argument. .. index:: exception: ValueError -A right shift by *n* bits is defined as division by ``pow(2,n)``. A left shift -by *n* bits is defined as multiplication with ``pow(2,n)``. +A right shift by *n* bits is defined as floor division by ``pow(2,n)``. A left +shift by *n* bits is defined as multiplication with ``pow(2,n)``. .. note:: @@ -1182,8 +1216,8 @@ Conditional expressions .. productionlist:: conditional_expression: `or_test` ["if" `or_test` "else" `expression`] - expression: `conditional_expression` | `lambda_form` - expression_nocond: `or_test` | `lambda_form_nocond` + expression: `conditional_expression` | `lambda_expr` + expression_nocond: `or_test` | `lambda_expr_nocond` Conditional expressions (sometimes called a "ternary operator") have the lowest priority of all Python operations. @@ -1207,10 +1241,10 @@ Lambdas pair: anonymous; function .. productionlist:: - lambda_form: "lambda" [`parameter_list`]: `expression` - lambda_form_nocond: "lambda" [`parameter_list`]: `expression_nocond` + lambda_expr: "lambda" [`parameter_list`]: `expression` + lambda_expr_nocond: "lambda" [`parameter_list`]: `expression_nocond` -Lambda forms (lambda expressions) have the same syntactic position as +Lambda expressions (sometimes called lambda forms) have the same syntactic position as expressions. They are a shorthand to create anonymous functions; the expression ``lambda arguments: expression`` yields a function object. The unnamed object behaves like a function object defined with :: @@ -1219,7 +1253,8 @@ behaves like a function object defined with :: return expression See section :ref:`function` for the syntax of parameter lists. Note that -functions created with lambda forms cannot contain statements or annotations. +functions created with lambda expressions cannot contain statements or +annotations. .. _exprlists: @@ -1298,7 +1333,7 @@ groups from right to left). | :keyword:`not` ``x`` | Boolean NOT | +-----------------------------------------------+-------------------------------------+ | :keyword:`in`, :keyword:`not in`, | Comparisons, including membership | -| :keyword:`is`, :keyword:`is not`, ``<``, | tests and identity tests, | +| :keyword:`is`, :keyword:`is not`, ``<``, | tests and identity tests | | ``<=``, ``>``, ``>=``, ``!=``, ``==`` | | +-----------------------------------------------+-------------------------------------+ | ``|`` | Bitwise OR | diff --git a/Doc/reference/import.rst b/Doc/reference/import.rst new file mode 100644 index 0000000..eb497a1 --- /dev/null +++ b/Doc/reference/import.rst @@ -0,0 +1,706 @@ + +.. _importsystem: + +***************** +The import system +***************** + +.. index:: single: import machinery + +Python code in one :term:`module` gains access to the code in another module +by the process of :term:`importing` it. The :keyword:`import` statement is +the most common way of invoking the import machinery, but it is not the only +way. Functions such as :func:`importlib.import_module` and built-in +:func:`__import__` can also be used to invoke the import machinery. + +The :keyword:`import` statement combines two operations; it searches for the +named module, then it binds the results of that search to a name in the local +scope. The search operation of the :keyword:`import` statement is defined as +a call to the :func:`__import__` function, with the appropriate arguments. +The return value of :func:`__import__` is used to perform the name +binding operation of the :keyword:`import` statement. See the +:keyword:`import` statement for the exact details of that name binding +operation. + +A direct call to :func:`__import__` performs only the module search and, if +found, the module creation operation. While certain side-effects may occur, +such as the importing of parent packages, and the updating of various caches +(including :data:`sys.modules`), only the :keyword:`import` statement performs +a name binding operation. + +When calling :func:`__import__` as part of an import statement, the +import system first checks the module global namespace for a function by +that name. If it is not found, then the standard builtin :func:`__import__` +is called. Other mechanisms for invoking the import system (such as +:func:`importlib.import_module`) do not perform this check and will always +use the standard import system. + +When a module is first imported, Python searches for the module and if found, +it creates a module object [#fnmo]_, initializing it. If the named module +cannot be found, an :exc:`ImportError` is raised. Python implements various +strategies to search for the named module when the import machinery is +invoked. These strategies can be modified and extended by using various hooks +described in the sections below. + +.. versionchanged:: 3.3 + The import system has been updated to fully implement the second phase + of :pep:`302`. There is no longer any implicit import machinery - the full + import system is exposed through :data:`sys.meta_path`. In addition, + native namespace package support has been implemented (see :pep:`420`). + + +:mod:`importlib` +================ + +The :mod:`importlib` module provides a rich API for interacting with the +import system. For example :func:`importlib.import_module` provides a +recommended, simpler API than built-in :func:`__import__` for invoking the +import machinery. Refer to the :mod:`importlib` library documentation for +additional detail. + + + +Packages +======== + +.. index:: + single: package + +Python has only one type of module object, and all modules are of this type, +regardless of whether the module is implemented in Python, C, or something +else. To help organize modules and provide a naming hierarchy, Python has a +concept of :term:`packages <package>`. + +You can think of packages as the directories on a file system and modules as +files within directories, but don't take this analogy too literally since +packages and modules need not originate from the file system. For the +purposes of this documentation, we'll use this convenient analogy of +directories and files. Like file system directories, packages are organized +hierarchically, and packages may themselves contain subpackages, as well as +regular modules. + +It's important to keep in mind that all packages are modules, but not all +modules are packages. Or put another way, packages are just a special kind of +module. Specifically, any module that contains a ``__path__`` attribute is +considered a package. + +All modules have a name. Subpackage names are separated from their parent +package name by dots, akin to Python's standard attribute access syntax. Thus +you might have a module called :mod:`sys` and a package called :mod:`email`, +which in turn has a subpackage called :mod:`email.mime` and a module within +that subpackage called :mod:`email.mime.text`. + + +Regular packages +---------------- + +.. index:: + pair: package; regular + +Python defines two types of packages, :term:`regular packages <regular +package>` and :term:`namespace packages <namespace package>`. Regular +packages are traditional packages as they existed in Python 3.2 and earlier. +A regular package is typically implemented as a directory containing an +``__init__.py`` file. When a regular package is imported, this +``__init__.py`` file is implicitly executed, and the objects it defines are +bound to names in the package's namespace. The ``__init__.py`` file can +contain the same Python code that any other module can contain, and Python +will add some additional attributes to the module when it is imported. + +For example, the following file system layout defines a top level ``parent`` +package with three subpackages:: + + parent/ + __init__.py + one/ + __init__.py + two/ + __init__.py + three/ + __init__.py + +Importing ``parent.one`` will implicitly execute ``parent/__init__.py`` and +``parent/one/__init__.py``. Subsequent imports of ``parent.two`` or +``parent.three`` will execute ``parent/two/__init__.py`` and +``parent/three/__init__.py`` respectively. + + +Namespace packages +------------------ + +.. index:: + pair:: package; namespace + pair:: package; portion + +A namespace package is a composite of various :term:`portions <portion>`, +where each portion contributes a subpackage to the parent package. Portions +may reside in different locations on the file system. Portions may also be +found in zip files, on the network, or anywhere else that Python searches +during import. Namespace packages may or may not correspond directly to +objects on the file system; they may be virtual modules that have no concrete +representation. + +Namespace packages do not use an ordinary list for their ``__path__`` +attribute. They instead use a custom iterable type which will automatically +perform a new search for package portions on the next import attempt within +that package if the path of their parent package (or :data:`sys.path` for a +top level package) changes. + +With namespace packages, there is no ``parent/__init__.py`` file. In fact, +there may be multiple ``parent`` directories found during import search, where +each one is provided by a different portion. Thus ``parent/one`` may not be +physically located next to ``parent/two``. In this case, Python will create a +namespace package for the top-level ``parent`` package whenever it or one of +its subpackages is imported. + +See also :pep:`420` for the namespace package specification. + + +Searching +========= + +To begin the search, Python needs the :term:`fully qualified <qualified name>` +name of the module (or package, but for the purposes of this discussion, the +difference is immaterial) being imported. This name may come from various +arguments to the :keyword:`import` statement, or from the parameters to the +:func:`importlib.import_module` or :func:`__import__` functions. + +This name will be used in various phases of the import search, and it may be +the dotted path to a submodule, e.g. ``foo.bar.baz``. In this case, Python +first tries to import ``foo``, then ``foo.bar``, and finally ``foo.bar.baz``. +If any of the intermediate imports fail, an :exc:`ImportError` is raised. + + +The module cache +---------------- + +.. index:: + single: sys.modules + +The first place checked during import search is :data:`sys.modules`. This +mapping serves as a cache of all modules that have been previously imported, +including the intermediate paths. So if ``foo.bar.baz`` was previously +imported, :data:`sys.modules` will contain entries for ``foo``, ``foo.bar``, +and ``foo.bar.baz``. Each key will have as its value the corresponding module +object. + +During import, the module name is looked up in :data:`sys.modules` and if +present, the associated value is the module satisfying the import, and the +process completes. However, if the value is ``None``, then an +:exc:`ImportError` is raised. If the module name is missing, Python will +continue searching for the module. + +:data:`sys.modules` is writable. Deleting a key may not destroy the +associated module (as other modules may hold references to it), +but it will invalidate the cache entry for the named module, causing +Python to search anew for the named module upon its next +import. The key can also be assigned to ``None``, forcing the next import +of the module to result in an :exc:`ImportError`. + +Beware though, as if you keep a reference to the module object, +invalidate its cache entry in :data:`sys.modules`, and then re-import the +named module, the two module objects will *not* be the same. By contrast, +:func:`imp.reload` will reuse the *same* module object, and simply +reinitialise the module contents by rerunning the module's code. + + +Finders and loaders +------------------- + +.. index:: + single: finder + single: loader + +If the named module is not found in :data:`sys.modules`, then Python's import +protocol is invoked to find and load the module. This protocol consists of +two conceptual objects, :term:`finders <finder>` and :term:`loaders <loader>`. +A finder's job is to determine whether it can find the named module using +whatever strategy it knows about. Objects that implement both of these +interfaces are referred to as :term:`importers <importer>` - they return +themselves when they find that they can load the requested module. + +Python includes a number of default finders and importers. The first one +knows how to locate built-in modules, and the second knows how to locate +frozen modules. A third default finder searches an :term:`import path` +for modules. The :term:`import path` is a list of locations that may +name file system paths or zip files. It can also be extended to search +for any locatable resource, such as those identified by URLs. + +The import machinery is extensible, so new finders can be added to extend the +range and scope of module searching. + +Finders do not actually load modules. If they can find the named module, they +return a :term:`loader`, which the import machinery then invokes to load the +module and create the corresponding module object. + +The following sections describe the protocol for finders and loaders in more +detail, including how you can create and register new ones to extend the +import machinery. + + +Import hooks +------------ + +.. index:: + single: import hooks + single: meta hooks + single: path hooks + pair: hooks; import + pair: hooks; meta + pair: hooks; path + +The import machinery is designed to be extensible; the primary mechanism for +this are the *import hooks*. There are two types of import hooks: *meta +hooks* and *import path hooks*. + +Meta hooks are called at the start of import processing, before any other +import processing has occurred, other than :data:`sys.modules` cache look up. +This allows meta hooks to override :data:`sys.path` processing, frozen +modules, or even built-in modules. Meta hooks are registered by adding new +finder objects to :data:`sys.meta_path`, as described below. + +Import path hooks are called as part of :data:`sys.path` (or +``package.__path__``) processing, at the point where their associated path +item is encountered. Import path hooks are registered by adding new callables +to :data:`sys.path_hooks` as described below. + + +The meta path +------------- + +.. index:: + single: sys.meta_path + pair: finder; find_module + pair: finder; find_loader + +When the named module is not found in :data:`sys.modules`, Python next +searches :data:`sys.meta_path`, which contains a list of meta path finder +objects. These finders are queried in order to see if they know how to handle +the named module. Meta path finders must implement a method called +:meth:`find_module()` which takes two arguments, a name and an import path. +The meta path finder can use any strategy it wants to determine whether it can +handle the named module or not. + +If the meta path finder knows how to handle the named module, it returns a +loader object. If it cannot handle the named module, it returns ``None``. If +:data:`sys.meta_path` processing reaches the end of its list without returning +a loader, then an :exc:`ImportError` is raised. Any other exceptions raised +are simply propagated up, aborting the import process. + +The :meth:`find_module()` method of meta path finders is called with two +arguments. The first is the fully qualified name of the module being +imported, for example ``foo.bar.baz``. The second argument is the path +entries to use for the module search. For top-level modules, the second +argument is ``None``, but for submodules or subpackages, the second +argument is the value of the parent package's ``__path__`` attribute. If +the appropriate ``__path__`` attribute cannot be accessed, an +:exc:`ImportError` is raised. + +The meta path may be traversed multiple times for a single import request. +For example, assuming none of the modules involved has already been cached, +importing ``foo.bar.baz`` will first perform a top level import, calling +``mpf.find_module("foo", None)`` on each meta path finder (``mpf``). After +``foo`` has been imported, ``foo.bar`` will be imported by traversing the +meta path a second time, calling +``mpf.find_module("foo.bar", foo.__path__)``. Once ``foo.bar`` has been +imported, the final traversal will call +``mpf.find_module("foo.bar.baz", foo.bar.__path__)``. + +Some meta path finders only support top level imports. These importers will +always return ``None`` when anything other than ``None`` is passed as the +second argument. + +Python's default :data:`sys.meta_path` has three meta path finders, one that +knows how to import built-in modules, one that knows how to import frozen +modules, and one that knows how to import modules from an :term:`import path` +(i.e. the :term:`path based finder`). + + +Loaders +======= + +If and when a module loader is found its +:meth:`~importlib.abc.Loader.load_module` method is called, with a single +argument, the fully qualified name of the module being imported. This method +has several responsibilities, and should return the module object it has +loaded [#fnlo]_. If it cannot load the module, it should raise an +:exc:`ImportError`, although any other exception raised during +:meth:`load_module()` will be propagated. + +In many cases, the finder and loader can be the same object; in such cases the +:meth:`finder.find_module()` would just return ``self``. + +Loaders must satisfy the following requirements: + + * If there is an existing module object with the given name in + :data:`sys.modules`, the loader must use that existing module. (Otherwise, + :func:`imp.reload` will not work correctly.) If the named module does + not exist in :data:`sys.modules`, the loader must create a new module + object and add it to :data:`sys.modules`. + + Note that the module *must* exist in :data:`sys.modules` before the loader + executes the module code. This is crucial because the module code may + (directly or indirectly) import itself; adding it to :data:`sys.modules` + beforehand prevents unbounded recursion in the worst case and multiple + loading in the best. + + If loading fails, the loader must remove any modules it has inserted into + :data:`sys.modules`, but it must remove **only** the failing module, and + only if the loader itself has loaded it explicitly. Any module already in + the :data:`sys.modules` cache, and any module that was successfully loaded + as a side-effect, must remain in the cache. + + * The loader may set the ``__file__`` attribute of the module. If set, this + attribute's value must be a string. The loader may opt to leave + ``__file__`` unset if it has no semantic meaning (e.g. a module loaded from + a database). If ``__file__`` is set, it may also be appropriate to set the + ``__cached__`` attribute which is the path to any compiled version of the + code (e.g. byte-compiled file). The file does not need to exist to set this + attribute; the path can simply point to whether the compiled file would + exist (see :pep:`3147`). + + * The loader may set the ``__name__`` attribute of the module. While not + required, setting this attribute is highly recommended so that the + :meth:`repr()` of the module is more informative. + + * If the module is a package (either regular or namespace), the loader must + set the module object's ``__path__`` attribute. The value must be + iterable, but may be empty if ``__path__`` has no further significance + to the loader. If ``__path__`` is not empty, it must produce strings + when iterated over. More details on the semantics of ``__path__`` are + given :ref:`below <package-path-rules>`. + + * The ``__loader__`` attribute must be set to the loader object that loaded + the module. This is mostly for introspection and reloading, but can be + used for additional loader-specific functionality, for example getting + data associated with a loader. + + * The module's ``__package__`` attribute should be set. Its value must be a + string, but it can be the same value as its ``__name__``. If the attribute + is set to ``None`` or is missing, the import system will fill it in with a + more appropriate value. When the module is a package, its ``__package__`` + value should be set to its ``__name__``. When the module is not a package, + ``__package__`` should be set to the empty string for top-level modules, or + for submodules, to the parent package's name. See :pep:`366` for further + details. + + This attribute is used instead of ``__name__`` to calculate explicit + relative imports for main modules, as defined in :pep:`366`. + + * If the module is a Python module (as opposed to a built-in module or a + dynamically loaded extension), the loader should execute the module's code + in the module's global name space (``module.__dict__``). + + +Module reprs +------------ + +By default, all modules have a usable repr, however depending on the +attributes set above, and hooks in the loader, you can more explicitly control +the repr of module objects. + +Loaders may implement a :meth:`module_repr()` method which takes a single +argument, the module object. When ``repr(module)`` is called for a module +with a loader supporting this protocol, whatever is returned from +``module.__loader__.module_repr(module)`` is returned as the module's repr +without further processing. This return value must be a string. + +If the module has no ``__loader__`` attribute, or the loader has no +:meth:`module_repr()` method, then the module object implementation itself +will craft a default repr using whatever information is available. It will +try to use the ``module.__name__``, ``module.__file__``, and +``module.__loader__`` as input into the repr, with defaults for whatever +information is missing. + +Here are the exact rules used: + + * If the module has a ``__loader__`` and that loader has a + :meth:`module_repr()` method, call it with a single argument, which is the + module object. The value returned is used as the module's repr. + + * If an exception occurs in :meth:`module_repr()`, the exception is caught + and discarded, and the calculation of the module's repr continues as if + :meth:`module_repr()` did not exist. + + * If the module has a ``__file__`` attribute, this is used as part of the + module's repr. + + * If the module has no ``__file__`` but does have a ``__loader__``, then the + loader's repr is used as part of the module's repr. + + * Otherwise, just use the module's ``__name__`` in the repr. + +This example, from :pep:`420` shows how a loader can craft its own module +repr:: + + class NamespaceLoader: + @classmethod + def module_repr(cls, module): + return "<module '{}' (namespace)>".format(module.__name__) + + +.. _package-path-rules: + +module.__path__ +--------------- + +By definition, if a module has an ``__path__`` attribute, it is a package, +regardless of its value. + +A package's ``__path__`` attribute is used during imports of its subpackages. +Within the import machinery, it functions much the same as :data:`sys.path`, +i.e. providing a list of locations to search for modules during import. +However, ``__path__`` is typically much more constrained than +:data:`sys.path`. + +``__path__`` must be an iterable of strings, but it may be empty. +The same rules used for :data:`sys.path` also apply to a package's +``__path__``, and :data:`sys.path_hooks` (described below) are +consulted when traversing a package's ``__path__``. + +A package's ``__init__.py`` file may set or alter the package's ``__path__`` +attribute, and this was typically the way namespace packages were implemented +prior to :pep:`420`. With the adoption of :pep:`420`, namespace packages no +longer need to supply ``__init__.py`` files containing only ``__path__`` +manipulation code; the namespace loader automatically sets ``__path__`` +correctly for the namespace package. + + +The Path Based Finder +===================== + +.. index:: + single: path based finder + +As mentioned previously, Python comes with several default meta path finders. +One of these, called the :term:`path based finder`, searches an :term:`import +path`, which contains a list of :term:`path entries <path entry>`. Each path +entry names a location to search for modules. + +The path based finder itself doesn't know how to import anything. Instead, it +traverses the individual path entries, associating each of them with a +path entry finder that knows how to handle that particular kind of path. + +The default set of path entry finders implement all the semantics for finding +modules on the file system, handling special file types such as Python source +code (``.py`` files), Python byte code (``.pyc`` and ``.pyo`` files) and +shared libraries (e.g. ``.so`` files). When supported by the :mod:`zipimport` +module in the standard library, the default path entry finders also handle +loading all of these file types (other than shared libraries) from zipfiles. + +Path entries need not be limited to file system locations. They can refer to +URLs, database queries, or any other location that can be specified as a +string. + +The path based finder provides additional hooks and protocols so that you +can extend and customize the types of searchable path entries. For example, +if you wanted to support path entries as network URLs, you could write a hook +that implements HTTP semantics to find modules on the web. This hook (a +callable) would return a :term:`path entry finder` supporting the protocol +described below, which was then used to get a loader for the module from the +web. + +A word of warning: this section and the previous both use the term *finder*, +distinguishing between them by using the terms :term:`meta path finder` and +:term:`path entry finder`. These two types of finders are very similar, +support similar protocols, and function in similar ways during the import +process, but it's important to keep in mind that they are subtly different. +In particular, meta path finders operate at the beginning of the import +process, as keyed off the :data:`sys.meta_path` traversal. + +By contrast, path entry finders are in a sense an implementation detail +of the path based finder, and in fact, if the path based finder were to be +removed from :data:`sys.meta_path`, none of the path entry finder semantics +would be invoked. + + +Path entry finders +------------------ + +.. index:: + single: sys.path + single: sys.path_hooks + single: sys.path_importer_cache + single: PYTHONPATH + +The :term:`path based finder` is responsible for finding and loading Python +modules and packages whose location is specified with a string :term:`path +entry`. Most path entries name locations in the file system, but they need +not be limited to this. + +As a meta path finder, the :term:`path based finder` implements the +:meth:`find_module()` protocol previously described, however it exposes +additional hooks that can be used to customize how modules are found and +loaded from the :term:`import path`. + +Three variables are used by the :term:`path based finder`, :data:`sys.path`, +:data:`sys.path_hooks` and :data:`sys.path_importer_cache`. The ``__path__`` +attributes on package objects are also used. These provide additional ways +that the import machinery can be customized. + +:data:`sys.path` contains a list of strings providing search locations for +modules and packages. It is initialized from the :data:`PYTHONPATH` +environment variable and various other installation- and +implementation-specific defaults. Entries in :data:`sys.path` can name +directories on the file system, zip files, and potentially other "locations" +(see the :mod:`site` module) that should be searched for modules, such as +URLs, or database queries. Only strings and bytes should be present on +:data:`sys.path`; all other data types are ignored. The encoding of bytes +entries is determined by the individual :term:`path entry finders <path entry +finder>`. + +The :term:`path based finder` is a :term:`meta path finder`, so the import +machinery begins the :term:`import path` search by calling the path +based finder's :meth:`find_module()` method as described previously. When +the ``path`` argument to :meth:`find_module()` is given, it will be a +list of string paths to traverse - typically a package's ``__path__`` +attribute for an import within that package. If the ``path`` argument +is ``None``, this indicates a top level import and :data:`sys.path` is used. + +The path based finder iterates over every entry in the search path, and +for each of these, looks for an appropriate :term:`path entry finder` for the +path entry. Because this can be an expensive operation (e.g. there may be +`stat()` call overheads for this search), the path based finder maintains +a cache mapping path entries to path entry finders. This cache is maintained +in :data:`sys.path_importer_cache` (despite the name, this cache actually +stores finder objects rather than being limited to :term:`importer` objects). +In this way, the expensive search for a particular :term:`path entry` +location's :term:`path entry finder` need only be done once. User code is +free to remove cache entries from :data:`sys.path_importer_cache` forcing +the path based finder to perform the path entry search again [#fnpic]_. + +If the path entry is not present in the cache, the path based finder iterates +over every callable in :data:`sys.path_hooks`. Each of the :term:`path entry +hooks <path entry hook>` in this list is called with a single argument, the +path entry to be searched. This callable may either return a :term:`path +entry finder` that can handle the path entry, or it may raise +:exc:`ImportError`. An :exc:`ImportError` is used by the path based finder to +signal that the hook cannot find a :term:`path entry finder` for that +:term:`path entry`. The exception is ignored and :term:`import path` +iteration continues. The hook should expect either a string or bytes object; +the encoding of bytes objects is up to the hook (e.g. it may be a file system +encoding, UTF-8, or something else), and if the hook cannot decode the +argument, it should raise :exc:`ImportError`. + +If :data:`sys.path_hooks` iteration ends with no :term:`path entry finder` +being returned, then the path based finder's :meth:`find_module()` method +will store ``None`` in :data:`sys.path_importer_cache` (to indicate that +there is no finder for this path entry) and return ``None``, indicating that +this :term:`meta path finder` could not find the module. + +If a :term:`path entry finder` *is* returned by one of the :term:`path entry +hook` callables on :data:`sys.path_hooks`, then the following protocol is used +to ask the finder for a module loader, which is then used to load the module. + + +Path entry finder protocol +-------------------------- + +In order to support imports of modules and initialized packages and also to +contribute portions to namespace packages, path entry finders must implement +the :meth:`find_loader()` method. + +:meth:`find_loader()` takes one argument, the fully qualified name of the +module being imported. :meth:`find_loader()` returns a 2-tuple where the +first item is the loader and the second item is a namespace :term:`portion`. +When the first item (i.e. the loader) is ``None``, this means that while the +path entry finder does not have a loader for the named module, it knows that the +path entry contributes to a namespace portion for the named module. This will +almost always be the case where Python is asked to import a namespace package +that has no physical presence on the file system. When a path entry finder +returns ``None`` for the loader, the second item of the 2-tuple return value +must be a sequence, although it can be empty. + +If :meth:`find_loader()` returns a non-``None`` loader value, the portion is +ignored and the loader is returned from the path based finder, terminating +the search through the path entries. + +For backwards compatibility with other implementations of the import +protocol, many path entry finders also support the same, +traditional :meth:`find_module()` method that meta path finders support. +However path entry finder :meth:`find_module()` methods are never called +with a ``path`` argument (they are expected to record the appropriate +path information from the initial call to the path hook). + +The :meth:`find_module()` method on path entry finders is deprecated, +as it does not allow the path entry finder to contribute portions to +namespace packages. Instead path entry finders should implement the +:meth:`find_loader()` method as described above. If it exists on the path +entry finder, the import system will always call :meth:`find_loader()` +in preference to :meth:`find_module()`. + + +Replacing the standard import system +==================================== + +The most reliable mechanism for replacing the entire import system is to +delete the default contents of :data:`sys.meta_path`, replacing them +entirely with a custom meta path hook. + +If it is acceptable to only alter the behaviour of import statements +without affecting other APIs that access the import system, then replacing +the builtin :func:`__import__` function may be sufficient. This technique +may also be employed at the module level to only alter the behaviour of +import statements within that module. + +To selectively prevent import of some modules from a hook early on the +meta path (rather than disabling the standard import system entirely), +it is sufficient to raise :exc:`ImportError` directly from +:meth:`find_module` instead of returning ``None``. The latter indicates +that the meta path search should continue. while raising an exception +terminates it immediately. + + +Open issues +=========== + +XXX It would be really nice to have a diagram. + +XXX * (import_machinery.rst) how about a section devoted just to the +attributes of modules and packages, perhaps expanding upon or supplanting the +related entries in the data model reference page? + +XXX runpy, pkgutil, et al in the library manual should all get "See Also" +links at the top pointing to the new import system section. + + +References +========== + +The import machinery has evolved considerably since Python's early days. The +original `specification for packages +<http://www.python.org/doc/essays/packages.html>`_ is still available to read, +although some details have changed since the writing of that document. + +The original specification for :data:`sys.meta_path` was :pep:`302`, with +subsequent extension in :pep:`420`. + +:pep:`420` introduced :term:`namespace packages <namespace package>` for +Python 3.3. :pep:`420` also introduced the :meth:`find_loader` protocol as an +alternative to :meth:`find_module`. + +:pep:`366` describes the addition of the ``__package__`` attribute for +explicit relative imports in main modules. + +:pep:`328` introduced absolute and explicit relative imports and initially +proposed ``__name__`` for semantics :pep:`366` would eventually specify for +``__package__``. + +:pep:`338` defines executing modules as scripts. + + +.. rubric:: Footnotes + +.. [#fnmo] See :class:`types.ModuleType`. + +.. [#fnlo] The importlib implementation avoids using the return value + directly. Instead, it gets the module object by looking the module name up + in :data:`sys.modules`. The indirect effect of this is that an imported + module may replace itself in :data:`sys.modules`. This is + implementation-specific behavior that is not guaranteed to work in other + Python implementations. + +.. [#fnpic] In legacy code, it is possible to find instances of + :class:`imp.NullImporter` in the :data:`sys.path_importer_cache`. It + is recommended that code be changed to use ``None`` instead. See + :ref:`portingpythoncode` for more details. diff --git a/Doc/reference/index.rst b/Doc/reference/index.rst index 513e445..a66673b 100644 --- a/Doc/reference/index.rst +++ b/Doc/reference/index.rst @@ -21,6 +21,7 @@ interfaces available to C/C++ programmers in detail. lexical_analysis.rst datamodel.rst executionmodel.rst + import.rst expressions.rst simple_stmts.rst compound_stmts.rst diff --git a/Doc/reference/lexical_analysis.rst b/Doc/reference/lexical_analysis.rst index 4b49738..0ed3d3b 100644 --- a/Doc/reference/lexical_analysis.rst +++ b/Doc/reference/lexical_analysis.rst @@ -401,7 +401,7 @@ String literals are described by the following lexical definitions: .. productionlist:: stringliteral: [`stringprefix`](`shortstring` | `longstring`) - stringprefix: "r" | "R" + stringprefix: "r" | "u" | "R" | "U" shortstring: "'" `shortstringitem`* "'" | '"' `shortstringitem`* '"' longstring: "'''" `longstringitem`* "'''" | '"""' `longstringitem`* '"""' shortstringitem: `shortstringchar` | `stringescapeseq` @@ -412,7 +412,7 @@ String literals are described by the following lexical definitions: .. productionlist:: bytesliteral: `bytesprefix`(`shortbytes` | `longbytes`) - bytesprefix: "b" | "B" | "br" | "Br" | "bR" | "BR" + bytesprefix: "b" | "B" | "br" | "Br" | "bR" | "BR" | "rb" | "rB" | "Rb" | "RB" shortbytes: "'" `shortbytesitem`* "'" | '"' `shortbytesitem`* '"' longbytes: "'''" `longbytesitem`* "'''" | '"""' `longbytesitem`* '"""' shortbytesitem: `shortbyteschar` | `bytesescapeseq` @@ -441,10 +441,24 @@ instance of the :class:`bytes` type instead of the :class:`str` type. They may only contain ASCII characters; bytes with a numeric value of 128 or greater must be expressed with escapes. +As of Python 3.3 it is possible again to prefix unicode strings with a +``u`` prefix to simplify maintenance of dual 2.x and 3.x codebases. + Both string and bytes literals may optionally be prefixed with a letter ``'r'`` or ``'R'``; such strings are called :dfn:`raw strings` and treat backslashes as literal characters. As a result, in string literals, ``'\U'`` and ``'\u'`` -escapes in raw strings are not treated specially. +escapes in raw strings are not treated specially. Given that Python 2.x's raw +unicode literals behave differently than Python 3.x's the ``'ur'`` syntax +is not supported. + + .. versionadded:: 3.3 + The ``'rb'`` prefix of raw bytes literals has been added as a synonym + of ``'br'``. + + .. versionadded:: 3.3 + Support for the unicode legacy literal (``u'value'``) was reintroduced + to simplify the maintenance of dual Python 2.x and 3.x codebases. + See :pep:`414` for more information. In triple-quoted strings, unescaped newlines and quotes are allowed (and are retained), except that three unescaped quotes in a row terminate the string. (A @@ -492,13 +506,13 @@ Escape sequences only recognized in string literals are: +-----------------+---------------------------------+-------+ | Escape Sequence | Meaning | Notes | +=================+=================================+=======+ -| ``\N{name}`` | Character named *name* in the | | +| ``\N{name}`` | Character named *name* in the | \(4) | | | Unicode database | | +-----------------+---------------------------------+-------+ -| ``\uxxxx`` | Character with 16-bit hex value | \(4) | +| ``\uxxxx`` | Character with 16-bit hex value | \(5) | | | *xxxx* | | +-----------------+---------------------------------+-------+ -| ``\Uxxxxxxxx`` | Character with 32-bit hex value | \(5) | +| ``\Uxxxxxxxx`` | Character with 32-bit hex value | \(6) | | | *xxxxxxxx* | | +-----------------+---------------------------------+-------+ @@ -516,13 +530,15 @@ Notes: with the given value. (4) + .. versionchanged:: 3.3 + Support for name aliases [#]_ has been added. + +(5) Individual code units which form parts of a surrogate pair can be encoded using this escape sequence. Exactly four hex digits are required. -(5) - Any Unicode character can be encoded this way, but characters outside the Basic - Multilingual Plane (BMP) will be encoded using a surrogate pair if Python is - compiled to use 16-bit code units (the default). Exactly eight hex digits +(6) + Any Unicode character can be encoded this way. Exactly eight hex digits are required. @@ -688,7 +704,7 @@ Delimiters The following tokens serve as delimiters in the grammar:: ( ) [ ] { } - , : . ; @ = + , : . ; @ = -> += -= *= /= //= %= &= |= ^= >>= <<= **= @@ -706,3 +722,8 @@ The following printing ASCII characters are not used in Python. Their occurrence outside string literals and comments is an unconditional error:: $ ? ` + + +.. rubric:: Footnotes + +.. [#] http://www.unicode.org/Public/6.1.0/ucd/NameAliases.txt diff --git a/Doc/reference/simple_stmts.rst b/Doc/reference/simple_stmts.rst index 73183d5..40bbc39 100644 --- a/Doc/reference/simple_stmts.rst +++ b/Doc/reference/simple_stmts.rst @@ -424,10 +424,10 @@ When :keyword:`return` passes control out of a :keyword:`try` statement with a :keyword:`finally` clause, that :keyword:`finally` clause is executed before really leaving the function. -In a generator function, the :keyword:`return` statement is not allowed to -include an :token:`expression_list`. In that context, a bare :keyword:`return` -indicates that the generator is done and will cause :exc:`StopIteration` to be -raised. +In a generator function, the :keyword:`return` statement indicates that the +generator is done and will cause :exc:`StopIteration` to be raised. The returned +value (if any) is used as an argument to construct :exc:`StopIteration` and +becomes the :attr:`StopIteration.value` attribute. .. _yield: @@ -445,38 +445,26 @@ The :keyword:`yield` statement .. productionlist:: yield_stmt: `yield_expression` -The :keyword:`yield` statement is only used when defining a generator function, -and is only used in the body of the generator function. Using a :keyword:`yield` -statement in a function definition is sufficient to cause that definition to -create a generator function instead of a normal function. -When a generator function is called, it returns an iterator known as a generator -iterator, or more commonly, a generator. The body of the generator function is -executed by calling the :func:`next` function on the generator repeatedly until -it raises an exception. - -When a :keyword:`yield` statement is executed, the state of the generator is -frozen and the value of :token:`expression_list` is returned to :meth:`next`'s -caller. By "frozen" we mean that all local state is retained, including the -current bindings of local variables, the instruction pointer, and the internal -evaluation stack: enough information is saved so that the next time :func:`next` -is invoked, the function can proceed exactly as if the :keyword:`yield` -statement were just another external call. - -The :keyword:`yield` statement is allowed in the :keyword:`try` clause of a -:keyword:`try` ... :keyword:`finally` construct. If the generator is not -resumed before it is finalized (by reaching a zero reference count or by being -garbage collected), the generator-iterator's :meth:`close` method will be -called, allowing any pending :keyword:`finally` clauses to execute. +A :keyword:`yield` statement is semantically equivalent to a :ref:`yield +expression <yieldexpr>`. The yield statement can be used to omit the parentheses +that would otherwise be required in the equivalent yield expression +statement. For example, the yield statements :: -.. seealso:: + yield <expr> + yield from <expr> + +are equivalent to the yield expression statements :: - :pep:`0255` - Simple Generators - The proposal for adding generators and the :keyword:`yield` statement to Python. + (yield <expr>) + (yield from <expr>) - :pep:`0342` - Coroutines via Enhanced Generators - The proposal that, among other generator enhancements, proposed allowing - :keyword:`yield` to appear inside a :keyword:`try` ... :keyword:`finally` block. +Yield expressions and statements are only used when defining a :term:`generator` +function, and are only used in the body of the generator function. Using yield +in a function definition is sufficient to cause that definition to create a +generator function instead of a normal function. +For full details of :keyword:`yield` semantics, refer to the +:ref:`yieldexpr` section. .. _raise: @@ -645,161 +633,82 @@ The :keyword:`import` statement relative_module: "."* `module` | "."+ name: `identifier` -Import statements are executed in two steps: (1) find a module, and initialize -it if necessary; (2) define a name or names in the local namespace (of the scope -where the :keyword:`import` statement occurs). The statement comes in two -forms differing on whether it uses the :keyword:`from` keyword. The first form -(without :keyword:`from`) repeats these steps for each identifier in the list. -The form with :keyword:`from` performs step (1) once, and then performs step -(2) repeatedly. For a reference implementation of step (1), see the -:mod:`importlib` module. - -.. index:: - single: package +The basic import statement (no :keyword:`from` clause) is executed in two +steps: -To understand how step (1) occurs, one must first understand how Python handles -hierarchical naming of modules. To help organize modules and provide a -hierarchy in naming, Python has a concept of packages. A package can contain -other packages and modules while modules cannot contain other modules or -packages. From a file system perspective, packages are directories and modules -are files. The original `specification for packages -<http://www.python.org/doc/essays/packages.html>`_ is still available to read, -although minor details have changed since the writing of that document. +#. find a module, loading and initializing it if necessary +#. define a name or names in the local namespace for the scope where + the :keyword:`import` statement occurs. -.. index:: - single: sys.modules +When the statement contains multiple clauses (separated by +commas) the two steps are carried out separately for each clause, just +as though the clauses had been separated out into individiual import +statements. -Once the name of the module is known (unless otherwise specified, the term -"module" will refer to both packages and modules), searching -for the module or package can begin. The first place checked is -:data:`sys.modules`, the cache of all modules that have been imported -previously. If the module is found there then it is used in step (2) of import -unless ``None`` is found in :data:`sys.modules`, in which case -:exc:`ImportError` is raised. +The details of the first step, finding and loading modules is described in +greater detail in the section on the :ref:`import system <importsystem>`, +which also describes the various types of packages and modules that can +be imported, as well as all the hooks that can be used to customize +the import system. Note that failures in this step may indicate either +that the module could not be located, *or* that an error occurred while +initializing the module, which includes execution of the module's code. -.. index:: - single: sys.meta_path - single: finder - pair: finder; find_module - single: __path__ - -If the module is not found in the cache, then :data:`sys.meta_path` is searched -(the specification for :data:`sys.meta_path` can be found in :pep:`302`). -The object is a list of :term:`finder` objects which are queried in order as to -whether they know how to load the module by calling their :meth:`find_module` -method with the name of the module. If the module happens to be contained -within a package (as denoted by the existence of a dot in the name), then a -second argument to :meth:`find_module` is given as the value of the -:attr:`__path__` attribute from the parent package (everything up to the last -dot in the name of the module being imported). If a finder can find the module -it returns a :term:`loader` (discussed later) or returns ``None``. +If the requested module is retrieved successfully, it will be made +available in the local namespace in one of three ways: -.. index:: - single: sys.path_hooks - single: sys.path_importer_cache - single: sys.path - -If none of the finders on :data:`sys.meta_path` are able to find the module -then some implicitly defined finders are queried. Implementations of Python -vary in what implicit meta path finders are defined. The one they all do -define, though, is one that handles :data:`sys.path_hooks`, -:data:`sys.path_importer_cache`, and :data:`sys.path`. - -The implicit finder searches for the requested module in the "paths" specified -in one of two places ("paths" do not have to be file system paths). If the -module being imported is supposed to be contained within a package then the -second argument passed to :meth:`find_module`, :attr:`__path__` on the parent -package, is used as the source of paths. If the module is not contained in a -package then :data:`sys.path` is used as the source of paths. - -Once the source of paths is chosen it is iterated over to find a finder that -can handle that path. The dict at :data:`sys.path_importer_cache` caches -finders for paths and is checked for a finder. If the path does not have a -finder cached then :data:`sys.path_hooks` is searched by calling each object in -the list with a single argument of the path, returning a finder or raises -:exc:`ImportError`. If a finder is returned then it is cached in -:data:`sys.path_importer_cache` and then used for that path entry. If no finder -can be found but the path exists then a value of ``None`` is -stored in :data:`sys.path_importer_cache` to signify that an implicit, -file-based finder that handles modules stored as individual files should be -used for that path. If the path does not exist then a finder which always -returns ``None`` is placed in the cache for the path. +* If the module name is followed by :keyword:`as`, then the name + following :keyword:`as` is bound directly to the imported module. +* If no other name is specified, and the module being imported is a top + level module, the module's name is bound in the local namespace as a + reference to the imported module +* If the module being imported is *not* a top level module, then the name + of the top level package that contains the module is bound in the local + namespace as a reference to the top level package. The imported module + must be accessed using its full qualified name rather than directly -.. index:: - single: loader - pair: loader; load_module - exception: ImportError - -If no finder can find the module then :exc:`ImportError` is raised. Otherwise -some finder returned a loader whose :meth:`load_module` method is called with -the name of the module to load (see :pep:`302` for the original definition of -loaders). A loader has several responsibilities to perform on a module it -loads. First, if the module already exists in :data:`sys.modules` (a -possibility if the loader is called outside of the import machinery) then it -is to use that module for initialization and not a new module. But if the -module does not exist in :data:`sys.modules` then it is to be added to that -dict before initialization begins. If an error occurs during loading of the -module and it was added to :data:`sys.modules` it is to be removed from the -dict. If an error occurs but the module was already in :data:`sys.modules` it -is left in the dict. .. index:: - single: __name__ - single: __file__ - single: __path__ - single: __package__ - single: __loader__ - -The loader must set several attributes on the module. :data:`__name__` is to be -set to the name of the module. :data:`__file__` is to be the "path" to the file -unless the module is built-in (and thus listed in -:data:`sys.builtin_module_names`) in which case the attribute is not set. -If what is being imported is a package then :data:`__path__` is to be set to a -list of paths to be searched when looking for modules and packages contained -within the package being imported. :data:`__package__` is optional but should -be set to the name of package that contains the module or package (the empty -string is used for module not contained in a package). :data:`__loader__` is -also optional but should be set to the loader object that is loading the -module. + pair: name; binding + keyword: from + exception: ImportError -.. index:: - exception: ImportError +The :keyword:`from` form uses a slightly more complex process: -If an error occurs during loading then the loader raises :exc:`ImportError` if -some other exception is not already being propagated. Otherwise the loader -returns the module that was loaded and initialized. +#. find the module specified in the :keyword:`from` clause loading and + initializing it if necessary; +#. for each of the identifiers specified in the :keyword:`import` clauses: -When step (1) finishes without raising an exception, step (2) can begin. + #. check if the imported module has an attribute by that name + #. if not, attempt to import a submodule with that name and then + check the imported module again for that attribute + #. if the attribute is not found, :exc:`ImportError` is raised. + #. otherwise, a reference to that value is bound in the local namespace, + using the name in the :keyword:`as` clause if it is present, + otherwise using the attribute name -The first form of :keyword:`import` statement binds the module name in the local -namespace to the module object, and then goes on to import the next identifier, -if any. If the module name is followed by :keyword:`as`, the name following -:keyword:`as` is used as the local name for the module. +Examples:: -.. index:: - pair: name; binding - exception: ImportError + import foo # foo imported and bound locally + import foo.bar.baz # foo.bar.baz imported, foo bound locally + import foo.bar.baz as fbb # foo.bar.baz imported and bound as fbb + from foo.bar import baz # foo.bar.baz imported and bound as baz + from foo import attr # foo imported and foo.attr bound as attr -The :keyword:`from` form does not bind the module name: it goes through the list -of identifiers, looks each one of them up in the module found in step (1), and -binds the name in the local namespace to the object thus found. As with the -first form of :keyword:`import`, an alternate local name can be supplied by -specifying ":keyword:`as` localname". If a name is not found, -:exc:`ImportError` is raised. If the list of identifiers is replaced by a star -(``'*'``), all public names defined in the module are bound in the local -namespace of the :keyword:`import` statement. +If the list of identifiers is replaced by a star (``'*'``), all public +names defined in the module are bound in the local namespace for the scope +where the :keyword:`import` statement occurs. .. index:: single: __all__ (optional module attribute) The *public names* defined by a module are determined by checking the module's -namespace for a variable named ``__all__``; if defined, it must be a sequence of -strings which are names defined or imported by that module. The names given in -``__all__`` are all considered public and are required to exist. If ``__all__`` -is not defined, the set of public names includes all names found in the module's -namespace which do not begin with an underscore character (``'_'``). -``__all__`` should contain the entire public API. It is intended to avoid -accidentally exporting items that are not part of the API (such as library -modules which were imported and used within the module). +namespace for a variable named ``__all__``; if defined, it must be a sequence +of strings which are names defined or imported by that module. The names +given in ``__all__`` are all considered public and are required to exist. If +``__all__`` is not defined, the set of public names includes all names found +in the module's namespace which do not begin with an underscore character +(``'_'``). ``__all__`` should contain the entire public API. It is intended +to avoid accidentally exporting items that are not part of the API (such as +library modules which were imported and used within the module). The :keyword:`from` form with ``*`` may only occur in a module scope. The wild card form of import --- ``import *`` --- is only allowed at the module level. diff --git a/Doc/tools/sphinx-build.py b/Doc/tools/sphinx-build.py index 91be6f0..d3fe702 100644 --- a/Doc/tools/sphinx-build.py +++ b/Doc/tools/sphinx-build.py @@ -15,9 +15,9 @@ warnings.filterwarnings('ignore', category=UserWarning, module='jinja2') if __name__ == '__main__': - if sys.version_info[:3] < (2, 4, 0): + if sys.version_info[:3] < (2, 4, 0) or sys.version_info[:3] > (3, 0, 0): sys.stderr.write("""\ -Error: Sphinx needs to be executed with Python 2.4 or newer (not 3.0 though). +Error: Sphinx needs to be executed with Python 2.4 or newer (not 3.x though). (If you run this from the Makefile, you can set the PYTHON variable to the path of an alternative interpreter executable, e.g., ``make html PYTHON=python2.5``). diff --git a/Doc/tools/sphinxext/c_annotations.py b/Doc/tools/sphinxext/c_annotations.py new file mode 100644 index 0000000..8b5167a --- /dev/null +++ b/Doc/tools/sphinxext/c_annotations.py @@ -0,0 +1,117 @@ +# -*- coding: utf-8 -*- +""" + c_annotations.py + ~~~~~~~~~~~~~~~~ + + Supports annotations for C API elements: + + * reference count annotations for C API functions. Based on + refcount.py and anno-api.py in the old Python documentation tools. + + * stable API annotations + + Usage: Set the `refcount_file` config value to the path to the reference + count data file. + + :copyright: Copyright 2007-2013 by Georg Brandl. + :license: Python license. +""" + +from os import path +from docutils import nodes +from docutils.parsers.rst import directives + +from sphinx import addnodes +from sphinx.domains.c import CObject + + +class RCEntry: + def __init__(self, name): + self.name = name + self.args = [] + self.result_type = '' + self.result_refs = None + + +class Annotations(dict): + @classmethod + def fromfile(cls, filename): + d = cls() + fp = open(filename, 'r') + try: + for line in fp: + line = line.strip() + if line[:1] in ("", "#"): + # blank lines and comments + continue + parts = line.split(":", 4) + if len(parts) != 5: + raise ValueError("Wrong field count in %r" % line) + function, type, arg, refcount, comment = parts + # Get the entry, creating it if needed: + try: + entry = d[function] + except KeyError: + entry = d[function] = RCEntry(function) + if not refcount or refcount == "null": + refcount = None + else: + refcount = int(refcount) + # Update the entry with the new parameter or the result + # information. + if arg: + entry.args.append((arg, type, refcount)) + else: + entry.result_type = type + entry.result_refs = refcount + finally: + fp.close() + return d + + def add_annotations(self, app, doctree): + for node in doctree.traverse(addnodes.desc_content): + par = node.parent + if par['domain'] != 'c': + continue + if par['stableabi']: + node.insert(0, nodes.emphasis(' Part of the stable ABI.', + ' Part of the stable ABI.', + classes=['stableabi'])) + if par['objtype'] != 'function': + continue + if not par[0].has_key('names') or not par[0]['names']: + continue + entry = self.get(par[0]['names'][0]) + if not entry: + continue + elif entry.result_type not in ("PyObject*", "PyVarObject*"): + continue + if entry.result_refs is None: + rc = 'Return value: Always NULL.' + elif entry.result_refs: + rc = 'Return value: New reference.' + else: + rc = 'Return value: Borrowed reference.' + node.insert(0, nodes.emphasis(rc, rc, classes=['refcount'])) + + +def init_annotations(app): + refcounts = Annotations.fromfile( + path.join(app.srcdir, app.config.refcount_file)) + app.connect('doctree-read', refcounts.add_annotations) + + +def setup(app): + app.add_config_value('refcount_file', '', True) + app.connect('builder-inited', init_annotations) + + # monkey-patch C object... + CObject.option_spec = { + 'noindex': directives.flag, + 'stableabi': directives.flag, + } + old_handle_signature = CObject.handle_signature + def new_handle_signature(self, sig, signode): + signode.parent['stableabi'] = 'stableabi' in self.options + return old_handle_signature(self, sig, signode) + CObject.handle_signature = new_handle_signature diff --git a/Doc/tools/sphinxext/indexsidebar.html b/Doc/tools/sphinxext/indexsidebar.html index 748cb91..a0ec32f 100644 --- a/Doc/tools/sphinxext/indexsidebar.html +++ b/Doc/tools/sphinxext/indexsidebar.html @@ -3,7 +3,7 @@ <h3>Docs for other versions</h3> <ul> <li><a href="http://docs.python.org/2.7/">Python 2.7 (stable)</a></li> - <li><a href="http://docs.python.org/3.3/">Python 3.3 (in development)</a></li> + <li><a href="http://docs.python.org/3.4/">Python 3.4 (in development)</a></li> <li><a href="http://www.python.org/doc/versions/">Old versions</a></li> </ul> diff --git a/Doc/tools/sphinxext/layout.html b/Doc/tools/sphinxext/layout.html index 4b16b3f..16a9212 100644 --- a/Doc/tools/sphinxext/layout.html +++ b/Doc/tools/sphinxext/layout.html @@ -16,6 +16,63 @@ <link rel="shortcut icon" type="image/png" href="{{ pathto('_static/py.png', 1) }}" /> {% if not embedded %}<script type="text/javascript" src="{{ pathto('_static/copybutton.js', 1) }}"></script>{% endif %} {% if versionswitcher is defined and not embedded %}<script type="text/javascript" src="{{ pathto('_static/version_switch.js', 1) }}"></script>{% endif %} + {% if pagename == 'whatsnew/changelog' %} + <script type="text/javascript"> + $(document).ready(function() { + // add the search form and bind the events + $('h1').after([ + '<p>Filter entries by content:', + '<input type="text" value="" id="searchbox" style="width: 50%">', + '<input type="submit" id="searchbox-submit" value="Filter"></p>' + ].join('\n')); + + function dofilter() { + try { + var query = new RegExp($('#searchbox').val(), 'i'); + } + catch (e) { + return; // not a valid regex (yet) + } + // find headers for the versions (What's new in Python X.Y.Z?) + $('#changelog h2').each(function(index1, h2) { + var h2_parent = $(h2).parent(); + var sections_found = 0; + // find headers for the sections (Core, Library, etc.) + h2_parent.find('h3').each(function(index2, h3) { + var h3_parent = $(h3).parent(); + var entries_found = 0; + // find all the entries + h3_parent.find('li').each(function(index3, li) { + var li = $(li); + // check if the query matches the entry + if (query.test(li.text())) { + li.show(); + entries_found++; + } + else { + li.hide(); + } + }); + // if there are entries, show the section, otherwise hide it + if (entries_found > 0) { + h3_parent.show(); + sections_found++; + } + else { + h3_parent.hide(); + } + }); + if (sections_found > 0) + h2_parent.show(); + else + h2_parent.hide(); + }); + } + $('#searchbox').keyup(dofilter); + $('#searchbox-submit').click(dofilter); + }); + </script> + {% endif %} {{ super() }} {% endblock %} {% block footer %} diff --git a/Doc/tools/sphinxext/patchlevel.py b/Doc/tools/sphinxext/patchlevel.py index b070d60..bca2eb8 100644 --- a/Doc/tools/sphinxext/patchlevel.py +++ b/Doc/tools/sphinxext/patchlevel.py @@ -34,8 +34,7 @@ def get_header_version_info(srcdir): release = version = '%s.%s' % (d['PY_MAJOR_VERSION'], d['PY_MINOR_VERSION']) micro = int(d['PY_MICRO_VERSION']) - if micro != 0: - release += '.' + str(micro) + release += '.' + str(micro) level = d['PY_RELEASE_LEVEL'] suffixes = { @@ -51,8 +50,7 @@ def get_header_version_info(srcdir): def get_sys_version_info(): major, minor, micro, level, serial = sys.version_info release = version = '%s.%s' % (major, minor) - if micro: - release += '.%s' % micro + release += '.%s' % micro if level != 'final': release += '%s%s' % (level[0], serial) return version, release diff --git a/Doc/tools/sphinxext/pydoctheme/static/pydoctheme.css b/Doc/tools/sphinxext/pydoctheme/static/pydoctheme.css index 9942ca6..3d995d8 100644 --- a/Doc/tools/sphinxext/pydoctheme/static/pydoctheme.css +++ b/Doc/tools/sphinxext/pydoctheme/static/pydoctheme.css @@ -122,7 +122,7 @@ div.body tt.xref, div.body a tt { font-weight: normal; } -p.deprecated { +.deprecated { border-radius: 3px; } @@ -168,3 +168,11 @@ div.footer { div.footer a:hover { color: #0095C4; } + +.refcount { + color: #060; +} + +.stableabi { + color: #229; +} diff --git a/Doc/tools/sphinxext/pyspecific.py b/Doc/tools/sphinxext/pyspecific.py index 9b2cc47..f2d3ca1 100644 --- a/Doc/tools/sphinxext/pyspecific.py +++ b/Doc/tools/sphinxext/pyspecific.py @@ -5,15 +5,21 @@ Sphinx extension with Python doc-specific markup. - :copyright: 2008, 2009, 2010 by Georg Brandl. + :copyright: 2008-2014 by Georg Brandl. :license: Python license. """ ISSUE_URI = 'http://bugs.python.org/issue%s' -SOURCE_URI = 'http://hg.python.org/cpython/file/3.2/%s' +SOURCE_URI = 'http://hg.python.org/cpython/file/3.3/%s' from docutils import nodes, utils + +import sphinx from sphinx.util.nodes import split_explicit_title +from sphinx.util.compat import Directive +from sphinx.writers.html import HTMLTranslator +from sphinx.writers.latex import LaTeXTranslator +from sphinx.locale import versionlabels # monkey-patch reST parser to disable alphabetic and roman enumerated lists from docutils.parsers.rst.states import Body @@ -22,21 +28,19 @@ Body.enum.converters['loweralpha'] = \ Body.enum.converters['lowerroman'] = \ Body.enum.converters['upperroman'] = lambda x: None -# monkey-patch HTML translator to give versionmodified paragraphs a class -def new_visit_versionmodified(self, node): - self.body.append(self.starttag(node, 'p', CLASS=node['type'])) - text = versionlabels[node['type']] % node['version'] - if len(node): - text += ':' - else: - text += '.' - self.body.append('<span class="versionmodified">%s</span> ' % text) +SPHINX11 = sphinx.__version__[:3] < '1.2' -from sphinx.writers.html import HTMLTranslator -from sphinx.writers.latex import LaTeXTranslator -from sphinx.locale import versionlabels -HTMLTranslator.visit_versionmodified = new_visit_versionmodified -HTMLTranslator.visit_versionmodified = new_visit_versionmodified +if SPHINX11: + # monkey-patch HTML translator to give versionmodified paragraphs a class + def new_visit_versionmodified(self, node): + self.body.append(self.starttag(node, 'p', CLASS=node['type'])) + text = versionlabels[node['type']] % node['version'] + if len(node): + text += ':' + else: + text += '.' + self.body.append('<span class="versionmodified">%s</span> ' % text) + HTMLTranslator.visit_versionmodified = new_visit_versionmodified # monkey-patch HTML and LaTeX translators to keep doctest blocks in the # doctest docs themselves @@ -87,8 +91,6 @@ def source_role(typ, rawtext, text, lineno, inliner, options={}, content=[]): # Support for marking up implementation details -from sphinx.util.compat import Directive - class ImplementationDetail(Directive): has_content = True @@ -141,12 +143,6 @@ class PyDecoratorMethod(PyDecoratorMixin, PyClassmember): # Support for documenting version of removal in deprecations -from sphinx.locale import versionlabels -from sphinx.util.compat import Directive - -versionlabels['deprecated-removed'] = \ - 'Deprecated since version %s, will be removed in version %s' - class DeprecatedRemoved(Directive): has_content = True required_arguments = 2 @@ -154,24 +150,84 @@ class DeprecatedRemoved(Directive): final_argument_whitespace = True option_spec = {} + _label = 'Deprecated since version %s, will be removed in version %s' + def run(self): node = addnodes.versionmodified() node.document = self.state.document node['type'] = 'deprecated-removed' version = (self.arguments[0], self.arguments[1]) node['version'] = version + text = self._label % version if len(self.arguments) == 3: inodes, messages = self.state.inline_text(self.arguments[2], self.lineno+1) - node.extend(inodes) - if self.content: - self.state.nested_parse(self.content, self.content_offset, node) - ret = [node] + messages + para = nodes.paragraph(self.arguments[2], '', *inodes) + node.append(para) else: - ret = [node] + messages = [] + if self.content: + self.state.nested_parse(self.content, self.content_offset, node) + if isinstance(node[0], nodes.paragraph) and node[0].rawsource: + content = nodes.inline(node[0].rawsource, translatable=True) + content.source = node[0].source + content.line = node[0].line + content += node[0].children + node[0].replace_self(nodes.paragraph('', '', content)) + if not SPHINX11: + node[0].insert(0, nodes.inline('', '%s: ' % text, + classes=['versionmodified'])) + elif not SPHINX11: + para = nodes.paragraph('', '', + nodes.inline('', '%s.' % text, classes=['versionmodified'])) + node.append(para) env = self.state.document.settings.env env.note_versionchange('deprecated', version[0], node, self.lineno) - return ret + return [node] + messages + +# for Sphinx < 1.2 +versionlabels['deprecated-removed'] = DeprecatedRemoved._label + + +# Support for including Misc/NEWS + +import re +import codecs + +issue_re = re.compile('([Ii])ssue #([0-9]+)') +whatsnew_re = re.compile(r"(?im)^what's new in (.*?)\??$") + +class MiscNews(Directive): + has_content = False + required_arguments = 1 + optional_arguments = 0 + final_argument_whitespace = False + option_spec = {} + + def run(self): + fname = self.arguments[0] + source = self.state_machine.input_lines.source( + self.lineno - self.state_machine.input_offset - 1) + source_dir = path.dirname(path.abspath(source)) + fpath = path.join(source_dir, fname) + self.state.document.settings.record_dependencies.add(fpath) + try: + fp = codecs.open(fpath, encoding='utf-8') + try: + content = fp.read() + finally: + fp.close() + except Exception: + text = 'The NEWS file is not available.' + node = nodes.strong(text, text) + return [node] + content = issue_re.sub(r'`\1ssue #\2 <http://bugs.python.org/\2>`__', + content) + content = whatsnew_re.sub(r'\1', content) + # remove first 3 lines as they are the main heading + lines = ['.. default-role:: obj', ''] + content.splitlines()[3:] + self.state_machine.insert_input(lines, fname) + return [] # Support for building "topic help" for pydoc @@ -230,11 +286,12 @@ class PydocTopicsBuilder(Builder): document.append(doctree.ids[labelid]) destination = StringOutput(encoding='utf-8') writer.write(document, destination) - self.topics[label] = str(writer.output) + self.topics[label] = writer.output.encode('utf-8') def finish(self): f = open(path.join(self.outdir, 'topics.py'), 'w') try: + f.write('# -*- coding: utf-8 -*-\n') f.write('# Autogenerated by Sphinx on %s\n' % asctime()) f.write('topics = ' + pformat(self.topics) + '\n') finally: @@ -304,3 +361,4 @@ def setup(app): app.add_description_unit('2to3fixer', '2to3fixer', '%s (2to3 fixer)') app.add_directive_to_domain('py', 'decorator', PyDecoratorFunction) app.add_directive_to_domain('py', 'decoratormethod', PyDecoratorMethod) + app.add_directive('miscnews', MiscNews) diff --git a/Doc/tools/sphinxext/static/basic.css b/Doc/tools/sphinxext/static/basic.css index 5708b4d..3242a81 100644 --- a/Doc/tools/sphinxext/static/basic.css +++ b/Doc/tools/sphinxext/static/basic.css @@ -38,6 +38,8 @@ div.related li.right { /* -- sidebar --------------------------------------------------------------- */ div.sphinxsidebarwrapper { + position: relative; + top: 0; padding: 10px 5px 0 10px; word-wrap: break-word; } @@ -334,10 +336,14 @@ dl.glossary dt { font-style: italic; } -p.deprecated, p.deprecated-removed { +.deprecated, .deprecated-removed { background-color: #ffe4e4; border: 1px solid #f66; - padding: 7px + padding: 7px; +} + +div.deprecated p, div.deprecated-removed p { + margin-bottom: 0; } .system-message { diff --git a/Doc/tools/sphinxext/static/sidebar.js b/Doc/tools/sphinxext/static/sidebar.js index 0c410e6..e8d58f4 100644 --- a/Doc/tools/sphinxext/static/sidebar.js +++ b/Doc/tools/sphinxext/static/sidebar.js @@ -2,7 +2,8 @@ * sidebar.js * ~~~~~~~~~~ * - * This script makes the Sphinx sidebar collapsible. + * This script makes the Sphinx sidebar collapsible and implements intelligent + * scrolling. * * .sphinxsidebar contains .sphinxsidebarwrapper. This script adds in * .sphixsidebar, after .sphinxsidebarwrapper, the #sidebarbutton used to @@ -24,6 +25,8 @@ $(function() { // global elements used by the functions. // the 'sidebarbutton' element is defined as global after its // creation, in the add_sidebar_button function + var jwindow = $(window); + var jdocument = $(document); var bodywrapper = $('.bodywrapper'); var sidebar = $('.sphinxsidebar'); var sidebarwrapper = $('.sphinxsidebarwrapper'); @@ -42,6 +45,13 @@ $(function() { var dark_color = '#AAAAAA'; var light_color = '#CCCCCC'; + function get_viewport_height() { + if (window.innerHeight) + return window.innerHeight; + else + return jwindow.height(); + } + function sidebar_is_collapsed() { return sidebarwrapper.is(':not(:visible)'); } @@ -51,6 +61,8 @@ $(function() { expand_sidebar(); else collapse_sidebar(); + // adjust the scrolling of the sidebar + scroll_sidebar(); } function collapse_sidebar() { @@ -95,11 +107,7 @@ $(function() { ); var sidebarbutton = $('#sidebarbutton'); // find the height of the viewport to center the '<<' in the page - var viewport_height; - if (window.innerHeight) - viewport_height = window.innerHeight; - else - viewport_height = $(window).height(); + var viewport_height = get_viewport_height(); var sidebar_offset = sidebar.offset().top; var sidebar_height = Math.max(bodywrapper.height(), sidebar.height()); sidebarbutton.find('span').css({ @@ -152,4 +160,34 @@ $(function() { add_sidebar_button(); var sidebarbutton = $('#sidebarbutton'); set_position_from_cookie(); + + + /* intelligent scrolling */ + function scroll_sidebar() { + var sidebar_height = sidebarwrapper.height(); + var viewport_height = get_viewport_height(); + var offset = sidebar.position()['top']; + var wintop = jwindow.scrollTop(); + var winbot = wintop + viewport_height; + var curtop = sidebarwrapper.position()['top']; + var curbot = curtop + sidebar_height; + // does sidebar fit in window? + if (sidebar_height < viewport_height) { + // yes: easy case -- always keep at the top + sidebarwrapper.css('top', $u.min([$u.max([0, wintop - offset - 10]), + jdocument.height() - sidebar_height - 200])); + } + else { + // no: only scroll if top/bottom edge of sidebar is at + // top/bottom edge of window + if (curtop > wintop && curbot > winbot) { + sidebarwrapper.css('top', $u.max([wintop - offset - 10, 0])); + } + else if (curtop < wintop && curbot < winbot) { + sidebarwrapper.css('top', $u.min([winbot - sidebar_height - offset - 20, + jdocument.height() - sidebar_height - 200])); + } + } + } + jwindow.scroll(scroll_sidebar); }); diff --git a/Doc/tools/sphinxext/static/version_switch.js b/Doc/tools/sphinxext/static/version_switch.js index cc7be1c..e5528eb 100644 --- a/Doc/tools/sphinxext/static/version_switch.js +++ b/Doc/tools/sphinxext/static/version_switch.js @@ -2,7 +2,8 @@ 'use strict'; var all_versions = { - '3.4': 'dev (3.4)', + '3.5': 'dev (3.5)', + '3.4': '3.4', '3.3': '3.3', '3.2': '3.2', '2.7': '2.7', diff --git a/Doc/tools/sphinxext/susp-ignored.csv b/Doc/tools/sphinxext/susp-ignored.csv index 87e05cb..abc3ca8 100644 --- a/Doc/tools/sphinxext/susp-ignored.csv +++ b/Doc/tools/sphinxext/susp-ignored.csv @@ -1,139 +1,236 @@ c-api/arg,,:ref,"PyArg_ParseTuple(args, ""O|O:ref"", &object, &callback)" c-api/list,,:high,list[low:high] -c-api/list,,:high,list[low:high] = itemlist -c-api/sequence,,:i2,o[i1:i2] -c-api/sequence,,:i2,o[i1:i2] = v c-api/sequence,,:i2,del o[i1:i2] +c-api/sequence,,:i2,o[i1:i2] c-api/unicode,,:end,str[start:end] +c-api/unicode,,:start,unicode[start:start+length] +distutils/examples,267,`,This is the description of the ``foobar`` package. distutils/setupscript,,::, extending/embedding,,:numargs,"if(!PyArg_ParseTuple(args, "":numargs""))" -extending/extending,,:set,"if (PyArg_ParseTuple(args, ""O:set_callback"", &temp)) {" extending/extending,,:myfunction,"PyArg_ParseTuple(args, ""D:myfunction"", &c);" +extending/extending,,:set,"if (PyArg_ParseTuple(args, ""O:set_callback"", &temp)) {" extending/newtypes,,:call,"if (!PyArg_ParseTuple(args, ""sss:call"", &arg1, &arg2, &arg3)) {" extending/windows,,:initspam,/export:initspam -howto/cporting,,:add,"if (!PyArg_ParseTuple(args, ""ii:add_ints"", &one, &two))" +faq/programming,,:chr,">=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(" +faq/programming,,::,for x in sequence[::-1]: +faq/programming,,:reduce,"print((lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y," +faq/programming,,:reduce,"Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro," +faq/windows,,:bd8afb90ebf2,"Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)] on win32" howto/cporting,,:encode,"if (!PyArg_ParseTuple(args, ""O:encode_object"", &myobj))" howto/cporting,,:say,"if (!PyArg_ParseTuple(args, ""U:say_hello"", &name))" -howto/curses,,:black,"They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and" -howto/curses,,:blue,"They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and" -howto/curses,,:cyan,"They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and" -howto/curses,,:green,"They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and" -howto/curses,,:magenta,"They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and" -howto/curses,,:red,"They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and" -howto/curses,,:white,"7:white." -howto/curses,,:yellow,"They are: 0:black, 1:red, 2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and" +howto/curses,,:black,"colors when it activates color mode. They are: 0:black, 1:red," +howto/curses,,:red,"colors when it activates color mode. They are: 0:black, 1:red," +howto/curses,,:green,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white. The" +howto/curses,,:yellow,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white. The" +howto/curses,,:blue,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white. The" +howto/curses,,:magenta,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white. The" +howto/curses,,:cyan,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white. The" +howto/curses,,:white,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white. The" +howto/ipaddress,,:DB8,>>> ipaddress.ip_address('2001:DB8::1') +howto/ipaddress,,::,>>> ipaddress.ip_address('2001:DB8::1') +howto/ipaddress,,:db8,IPv6Address('2001:db8::1') +howto/ipaddress,,::,IPv6Address('2001:db8::1') +howto/ipaddress,,::,IPv6Address('::1') +howto/ipaddress,,:db8,>>> ipaddress.ip_network('2001:db8::0/96') +howto/ipaddress,,::,>>> ipaddress.ip_network('2001:db8::0/96') +howto/ipaddress,,:db8,IPv6Network('2001:db8::/96') +howto/ipaddress,,::,IPv6Network('2001:db8::/96') +howto/ipaddress,,:db8,IPv6Network('2001:db8::/128') +howto/ipaddress,,::,IPv6Network('2001:db8::/128') +howto/ipaddress,,:db8,IPv6Interface('2001:db8::1/96') +howto/ipaddress,,::,IPv6Interface('2001:db8::1/96') +howto/ipaddress,,:db8,>>> addr6 = ipaddress.ip_address('2001:db8::1') +howto/ipaddress,,::,>>> addr6 = ipaddress.ip_address('2001:db8::1') +howto/ipaddress,,:db8,>>> host6 = ipaddress.ip_interface('2001:db8::1/96') +howto/ipaddress,,::,>>> host6 = ipaddress.ip_interface('2001:db8::1/96') +howto/ipaddress,,:db8,>>> net6 = ipaddress.ip_network('2001:db8::0/96') +howto/ipaddress,,::,>>> net6 = ipaddress.ip_network('2001:db8::0/96') +howto/ipaddress,,:ffff,IPv6Address('ffff:ffff:ffff:ffff:ffff:ffff::') +howto/ipaddress,,::,IPv6Address('ffff:ffff:ffff:ffff:ffff:ffff::') +howto/ipaddress,,::,IPv6Address('::ffff:ffff') +howto/ipaddress,,:ffff,IPv6Address('::ffff:ffff') +howto/ipaddress,,:db8,'2001:db8::/96' +howto/ipaddress,,::,'2001:db8::/96' +howto/ipaddress,,:db8,>>> ipaddress.ip_interface('2001:db8::1/96') +howto/ipaddress,,::,>>> ipaddress.ip_interface('2001:db8::1/96') +howto/ipaddress,,:db8,'2001:db8::1' +howto/ipaddress,,::,'2001:db8::1' +howto/ipaddress,,:db8,IPv6Address('2001:db8::ffff:ffff') +howto/ipaddress,,::,IPv6Address('2001:db8::ffff:ffff') +howto/ipaddress,,:ffff,IPv6Address('2001:db8::ffff:ffff') +howto/logging,,:And,"WARNING:And this, too" +howto/logging,,:And,"WARNING:root:And this, too" +howto/logging,,:Doing,INFO:root:Doing something +howto/logging,,:Finished,INFO:root:Finished +howto/logging,,:logger,severity:logger name:message +howto/logging,,:Look,WARNING:root:Look before you leap! +howto/logging,,:message,severity:logger name:message +howto/logging,,:root,DEBUG:root:This message should go to the log file +howto/logging,,:root,INFO:root:Doing something +howto/logging,,:root,INFO:root:Finished +howto/logging,,:root,INFO:root:So should this +howto/logging,,:root,INFO:root:Started +howto/logging,,:root,"WARNING:root:And this, too" +howto/logging,,:root,WARNING:root:Look before you leap! +howto/logging,,:root,WARNING:root:Watch out! +howto/logging,,:So,INFO:root:So should this +howto/logging,,:So,INFO:So should this +howto/logging,,:Started,INFO:root:Started +howto/logging,,:This,DEBUG:root:This message should go to the log file +howto/logging,,:This,DEBUG:This message should appear on the console +howto/logging,,:Watch,WARNING:root:Watch out! +howto/pyporting,75,::,# make sure to use :: Python *and* :: Python :: 3 so +howto/pyporting,75,::,"'Programming Language :: Python'," +howto/pyporting,75,::,'Programming Language :: Python :: 3' howto/regex,,::, howto/regex,,:foo,(?:foo) howto/urllib2,,:example,"for example ""joe@password:example.com""" -howto/webservers,,.. image:,.. image:: http.png library/audioop,,:ipos,"# factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)]," +library/bisect,32,:hi,all(val >= x for val in a[i:hi]) +library/bisect,42,:hi,all(val > x for val in a[i:hi]) +library/configparser,,:home,my_dir: ${Common:home_dir}/twosheds +library/configparser,,:option,${section:option} +library/configparser,,:path,python_dir: ${Frameworks:path}/Python/Versions/${Frameworks:Python} +library/configparser,,:Python,python_dir: ${Frameworks:path}/Python/Versions/${Frameworks:Python} +library/configparser,,:system,path: ${Common:system_dir}/Library/Frameworks/ library/datetime,,:MM, library/datetime,,:SS, library/decimal,,:optional,"trailneg:optional trailing minus indicator" library/difflib,,:ahi,a[alo:ahi] library/difflib,,:bhi,b[blo:bhi] +library/difflib,,:i1, library/difflib,,:i2, library/difflib,,:j2, -library/difflib,,:i1, -library/dis,,:TOS, -library/dis,,`,TOS = `TOS` library/doctest,,`,``factorial`` from the ``example`` module: library/doctest,,`,The ``example`` module library/doctest,,`,Using ``factorial`` +library/exceptions,,:err,err.object[err.start:err.end] library/functions,,:step,a[start:stop:step] library/functions,,:stop,"a[start:stop, i]" library/functions,,:stop,a[start:stop:step] -library/hotshot,,:lineno,"ncalls tottime percall cumtime percall filename:lineno(function)" -library/httplib,,:port,host:port -library/imaplib,,:MM,"""DD-Mmm-YYYY HH:MM:SS +HHMM""" -library/imaplib,,:SS,"""DD-Mmm-YYYY HH:MM:SS +HHMM""" -library/itertools,,:stop,elements from seq[start:stop:step] +library/http.client,,:port,host:port +library/http.cookies,,`,!#$%&'*+-.^_`|~: +library/imaplib,,:MM,"""DD-Mmm-YYYY HH:MM:SS" +library/imaplib,,:SS,"""DD-Mmm-YYYY HH:MM:SS" +library/inspect,,:int,">>> def foo(a, *, b:int, **kwargs):" +library/inspect,,:int,"'(a, *, b:int, **kwargs)'" +library/inspect,,:int,'b:int' +library/ipaddress,,:db8,>>> ipaddress.ip_address('2001:db8::') +library/ipaddress,,::,>>> ipaddress.ip_address('2001:db8::') +library/ipaddress,,:db8,IPv6Address('2001:db8::') +library/ipaddress,,::,IPv6Address('2001:db8::') +library/ipaddress,,:db8,>>> ipaddress.IPv6Address('2001:db8::1000') +library/ipaddress,,::,>>> ipaddress.IPv6Address('2001:db8::1000') +library/ipaddress,,:db8,IPv6Address('2001:db8::1000') +library/ipaddress,,::,IPv6Address('2001:db8::1000') +library/ipaddress,,::,"""::abc:7:def""" +library/ipaddress,,:def,"""::abc:7:def""" +library/ipaddress,,::,::FFFF/96 +library/ipaddress,,::,2002::/16 +library/ipaddress,,::,2001::/32 +library/ipaddress,,::,>>> str(ipaddress.IPv6Address('::1')) +library/ipaddress,,::,'::1' +library/ipaddress,,:ff00,ffff:ff00:: +library/ipaddress,,:db00,2001:db00::0/24 +library/ipaddress,,::,2001:db00::0/24 +library/ipaddress,,:db00,2001:db00::0/ffff:ff00:: +library/ipaddress,,::,2001:db00::0/ffff:ff00:: library/itertools,,:step,elements from seq[start:stop:step] +library/itertools,,:stop,elements from seq[start:stop:step] library/linecache,,:sys,"sys:x:3:3:sys:/dev:/bin/sh" -library/logging,,:And, -library/logging,,:package1, -library/logging,,:package2, -library/logging,,:root, -library/logging,,:This, -library/logging,,:port,host:port +library/logging.handlers,,:port,host:port library/mmap,,:i2,obj[i1:i2] -library/multiprocessing,,:queue,">>> QueueManager.register('get_queue', callable=lambda:queue)" -library/multiprocessing,,`,">>> l._callmethod('__getitem__', (20,)) # equiv to `l[20]`" -library/multiprocessing,,`,">>> l._callmethod('__getslice__', (2, 7)) # equiv to `l[2:7]`" -library/multiprocessing,,`,# `BaseManager`. -library/multiprocessing,,`,# `Pool.imap()` (which will save on the amount of code needed anyway). +library/multiprocessing,,`,# Add more tasks using `put()` library/multiprocessing,,`,# A test file for the `multiprocessing` package library/multiprocessing,,`,# A test of `multiprocessing.Pool` class -library/multiprocessing,,`,# Add more tasks using `put()` -library/multiprocessing,,`,# create server for a `HostManager` object -library/multiprocessing,,`,# Depends on `multiprocessing` package -- tested with `processing-0.60` +library/multiprocessing,,`,# `BaseManager`. library/multiprocessing,,`,# in the original order then consider using `Pool.map()` or +library/multiprocessing,,`,">>> l._callmethod('__getitem__', (20,)) # equiv to `l[20]`" +library/multiprocessing,,`,">>> l._callmethod('__getslice__', (2, 7)) # equiv to `l[2:7]`" library/multiprocessing,,`,# Not sure if we should synchronize access to `socket.accept()` method by library/multiprocessing,,`,# object. (We import `multiprocessing.reduction` to enable this pickling.) +library/multiprocessing,,`,# `Pool.imap()` (which will save on the amount of code needed anyway). +library/multiprocessing,,:queue,">>> QueueManager.register('get_queue', callable=lambda:queue)" library/multiprocessing,,`,# register the Foo class; make `f()` and `g()` accessible via proxy library/multiprocessing,,`,# register the Foo class; make `g()` and `_h()` accessible via proxy library/multiprocessing,,`,# register the generator function baz; use `GeneratorProxy` to make proxies -library/multiprocessing,,`,`Cluster` is a subclass of `SyncManager` so it allows creation of -library/multiprocessing,,`,`hostname` gives the name of the host. If hostname is not -library/multiprocessing,,`,`slots` is used to specify the number of slots for processes on +library/nntplib,,:bytes,:bytes +library/nntplib,,:lines,:lines library/optparse,,:len,"del parser.rargs[:len(value)]" library/os.path,,:foo,c:foo -library/parser,,`,"""Make a function that raises an argument to the exponent `exp`.""" +library/pdb,,:lineno,filename:lineno +library/pickle,,:memory,"conn = sqlite3.connect("":memory:"")" library/posix,,`,"CFLAGS=""`getconf LFS_CFLAGS`"" OPT=""-g -O2 $CFLAGS""" -library/profile,,:lineno,ncalls tottime percall cumtime percall filename:lineno(function) +library/pprint,209,::,"'classifiers': ['Development Status :: 4 - Beta'," +library/pprint,209,::,"'Intended Audience :: Developers'," +library/pprint,209,::,"'License :: OSI Approved :: MIT License'," +library/pprint,209,::,"'Natural Language :: English'," +library/pprint,209,::,"'Operating System :: OS Independent'," +library/pprint,209,::,"'Programming Language :: Python'," +library/pprint,209,::,"'Programming Language :: Python :: 2'," +library/pprint,209,::,"'Programming Language :: Python :: 2.6'," +library/pprint,209,::,"'Programming Language :: Python :: 2.7'," +library/pprint,209,::,"'Topic :: Software Development :: Libraries'," +library/pprint,209,::,"'Topic :: Software Development :: Libraries :: Python Modules']," library/profile,,:lineno,filename:lineno(function) library/pyexpat,,:elem1,<py:elem1 /> library/pyexpat,,:py,"xmlns:py = ""http://www.python.org/ns/"">" -library/repr,,`,"return `obj`" -library/smtplib,,:port,"as well as a regular host:port server." +library/smtplib,,:port,method must support that as well as a regular host:port +library/socket,,::,"(10, 1, 6, '', ('2001:888:2000:d::a2', 80, 0, 0))]" library/socket,,::,'5aef:2b::8' +library/socket,,:can,"return (can_id, can_dlc, data[:can_dlc])" +library/socket,,:len,fds.fromstring(cmsg_data[:len(cmsg_data) - (len(cmsg_data) % fds.itemsize)]) +library/sqlite3,,:age,"cur.execute(""select * from people where name_last=:who and age=:age"", {""who"": who, ""age"": age})" library/sqlite3,,:memory, library/sqlite3,,:who,"cur.execute(""select * from people where name_last=:who and age=:age"", {""who"": who, ""age"": age})" -library/sqlite3,,:age,"cur.execute(""select * from people where name_last=:who and age=:age"", {""who"": who, ""age"": age})" -library/ssl,,:My,"Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Organization, Inc." library/ssl,,:My,"Organizational Unit Name (eg, section) []:My Group" +library/ssl,,:My,"Organization Name (eg, company) [Internet Widgits Pty Ltd]:My Organization, Inc." library/ssl,,:myserver,"Common Name (eg, YOUR name) []:myserver.mygroup.myorganization.com" library/ssl,,:MyState,State or Province Name (full name) [Some-State]:MyState library/ssl,,:ops,Email Address []:ops@myserver.mygroup.myorganization.com library/ssl,,:Some,"Locality Name (eg, city) []:Some City" library/ssl,,:US,Country Name (2 letter code) [AU]:US +library/stdtypes,,::,>>> a[::-1].tolist() +library/stdtypes,,::,>>> a[::2].tolist() +library/stdtypes,,:end,s[start:end] +library/stdtypes,,::,>>> hash(v[::-2]) == hash(b'abcefg'[::-2]) library/stdtypes,,:len,s[len(s):len(s)] -library/stdtypes,,:len,s[len(s):len(s)] -library/string,,:end,s[start:end] -library/string,,:end,s[start:end] -library/subprocess,,`,"output=`mycmd myarg`" +library/stdtypes,,::,>>> y = m[::2] +library/stdtypes,,::,>>> z = y[::-2] library/subprocess,,`,"output=`dmesg | grep hda`" +library/subprocess,,`,"output=`mycmd myarg`" +library/tarfile,,:bz2, library/tarfile,,:compression,filemode[:compression] library/tarfile,,:gz, -library/tarfile,,:bz2, +library/tarfile,,:xz,'a:xz' +library/tarfile,,:xz,'r:xz' +library/tarfile,,:xz,'w:xz' library/time,,:mm, library/time,,:ss, library/turtle,,::,Example:: -library/urllib,,:port,:port -library/urllib2,,:password,"""joe:password@python.org""" +library/urllib.request,,:close,Connection:close +library/urllib.request,,:lang,"xmlns=""http://www.w3.org/1999/xhtml"" xml:lang=""en"" lang=""en"">\n\n<head>\n" +library/urllib.request,,:password,"""joe:password@python.org""" library/uuid,,:uuid,urn:uuid:12345678-1234-5678-1234-567812345678 +library/venv,,:param,":param nodist: If True, setuptools and pip are not installed into the" +library/venv,,:param,":param progress: If setuptools or pip are installed, the progress of the" +library/venv,,:param,":param nopip: If True, pip is not installed into the created" +library/venv,,:param,:param context: The information for the environment creation request library/xmlrpc.client,,:pass,http://user:pass@host:port/path -library/xmlrpc.client,,:port,http://user:pass@host:port/path library/xmlrpc.client,,:pass,user:pass +library/xmlrpc.client,,:port,http://user:pass@host:port/path +license,,`,"``Software''), to deal in the Software without restriction, including" +license,,`,"THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND," +license,,`,* THIS SOFTWARE IS PROVIDED BY ERIC YOUNG ``AS IS'' AND +license,,`,THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND +license,,`,* THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY license,,`,THIS SOFTWARE IS PROVIDED BY THE PROJECT AND CONTRIBUTORS ``AS IS'' AND license,,:zooko,mailto:zooko@zooko.com -license,,`,THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND -reference/datamodel,,:step,a[i:j:step] -reference/datamodel,,:max, reference/expressions,,:index,x[index:index] -reference/expressions,,:datum,{key:datum...} -reference/expressions,,`,`expressions...` -reference/grammar,,:output,#diagram:output -reference/grammar,,:rules,#diagram:rules -reference/grammar,,:token,#diagram:token -reference/grammar,,`,'`' testlist1 '`' +reference/lexical_analysis,,`,$ ? ` reference/lexical_analysis,,:fileencoding,# vim:fileencoding=<encoding-name> -reference/lexical_analysis,,`,", : . ` = ;" -tutorial/datastructures,,:value,key:value pairs within the braces adds initial key:value pairs tutorial/datastructures,,:value,It is also possible to delete a key:value -tutorial/stdlib2,,:start,"fields = struct.unpack('<IIIHH', data[start:start+16])" -tutorial/stdlib2,,:start,extra = data[start:start+extra_size] -tutorial/stdlib2,,:start,filename = data[start:start+filenamesize] +tutorial/datastructures,,:value,key:value pairs within the braces adds initial key:value pairs tutorial/stdlib2,,:config,"logging.warning('Warning:config file %s not found', 'server.conf')" tutorial/stdlib2,,:config,WARNING:root:Warning:config file server.conf not found tutorial/stdlib2,,:Critical,CRITICAL:root:Critical error -- shutting down @@ -141,15 +238,17 @@ tutorial/stdlib2,,:Error,ERROR:root:Error occurred tutorial/stdlib2,,:root,CRITICAL:root:Critical error -- shutting down tutorial/stdlib2,,:root,ERROR:root:Error occurred tutorial/stdlib2,,:root,WARNING:root:Warning:config file server.conf not found +tutorial/stdlib2,,:start,extra = data[start:start+extra_size] +tutorial/stdlib2,,:start,"fields = struct.unpack('<IIIHH', data[start:start+16])" +tutorial/stdlib2,,:start,filename = data[start:start+filenamesize] tutorial/stdlib2,,:Warning,WARNING:root:Warning:config file server.conf not found -using/cmdline,,:line,file:line: category: message using/cmdline,,:category,action:message:category:module:line +using/cmdline,,:errorhandler,:errorhandler using/cmdline,,:line,action:message:category:module:line +using/cmdline,,:line,file:line: category: message using/cmdline,,:message,action:message:category:module:line using/cmdline,,:module,action:message:category:module:line -using/cmdline,,:errorhandler,:errorhandler -using/windows,162,`,`` this fixes syntax highlighting errors in some editors due to the \\\\ hackery -using/windows,170,`,`` +using/unix,,:Packaging,http://en.opensuse.org/Portal:Packaging whatsnew/2.0,418,:len, whatsnew/2.3,,::, whatsnew/2.3,,:config, @@ -163,116 +262,29 @@ whatsnew/2.4,,:System, whatsnew/2.5,,:memory,:memory: whatsnew/2.5,,:step,[start:stop:step] whatsnew/2.5,,:stop,[start:stop:step] -distutils/examples,267,`,This is the description of the ``foobar`` package. -faq/programming,,:reduce,"print((lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(lambda x,y:x+y,map(lambda y," -faq/programming,,:reduce,"Sx=Sx,Sy=Sy:reduce(lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro," -faq/programming,,:chr,">=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):f(xc,yc,x,y,k,f):chr(" -faq/programming,,::,for x in sequence[::-1]: -faq/windows,,:bd8afb90ebf2,"Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)] on win32" -faq/windows,,:EOF,@setlocal enableextensions & python -x %~f0 %* & goto :EOF -faq/windows,,:REG,.py :REG_SZ: c:\<path to python>\python.exe -u %s %s -library/bisect,32,:hi,all(val >= x for val in a[i:hi]) -library/bisect,42,:hi,all(val > x for val in a[i:hi]) -library/http.client,,:port,host:port -library/http.cookies,,`,!#$%&'*+-.^_`|~ -library/nntplib,,:bytes,:bytes -library/nntplib,,:lines,:lines -library/nntplib,,:lines,"['xref', 'from', ':lines', ':bytes', 'references', 'date', 'message-id', 'subject']" -library/nntplib,,:bytes,"['xref', 'from', ':lines', ':bytes', 'references', 'date', 'message-id', 'subject']" -library/pickle,,:memory,"conn = sqlite3.connect("":memory:"")" -library/profile,,:lineno,"(sort by filename:lineno)," -library/socket,,::,"(10, 1, 6, '', ('2001:888:2000:d::a2', 80, 0, 0))]" -library/stdtypes,,:end,s[start:end] -library/stdtypes,,:end,s[start:end] -library/urllib.request,,:close,Connection:close -library/urllib.request,,:password,"""joe:password@python.org""" -library/urllib.request,,:lang,"xmlns=""http://www.w3.org/1999/xhtml"" xml:lang=""en"" lang=""en"">\n\n<head>\n" -license,,`,* THIS SOFTWARE IS PROVIDED BY THE OpenSSL PROJECT ``AS IS'' AND ANY -license,,`,* THIS SOFTWARE IS PROVIDED BY ERIC YOUNG ``AS IS'' AND -license,,`,"``Software''), to deal in the Software without restriction, including" -license,,`,"THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND," -reference/lexical_analysis,704,`,$ ? ` +whatsnew/2.7,1619,::,"ParseResult(scheme='http', netloc='[1080::8:800:200C:417A]'," +whatsnew/2.7,1619,::,>>> urlparse.urlparse('http://[1080::8:800:200C:417A]/foo') whatsnew/2.7,735,:Sunday,'2009:4:Sunday' -whatsnew/2.7,862,::,"export PYTHONWARNINGS=all,error:::Cookie:0" whatsnew/2.7,862,:Cookie,"export PYTHONWARNINGS=all,error:::Cookie:0" -whatsnew/2.7,1619,::,>>> urlparse.urlparse('http://[1080::8:800:200C:417A]/foo') -whatsnew/2.7,1619,::,"ParseResult(scheme='http', netloc='[1080::8:800:200C:417A]'," -library/configparser,,`,# Set the optional `raw` argument of get() to True if you wish to disable -library/configparser,,`,# The optional `vars` argument is a dict with members that will take -library/configparser,,`,# The optional `fallback` argument can be used to provide a fallback value -library/configparser,,:option,${section:option} -library/configparser,,:system,path: ${Common:system_dir}/Library/Frameworks/ -library/configparser,,:home,my_dir: ${Common:home_dir}/twosheds -library/configparser,,:path,python_dir: ${Frameworks:path}/Python/Versions/${Frameworks:Python} -library/configparser,,:Python,python_dir: ${Frameworks:path}/Python/Versions/${Frameworks:Python} -library/pdb,,:lineno,[filename:lineno | bpnumber [bpnumber ...]] -library/pdb,,:lineno,filename:lineno -library/logging,,:Watch,WARNING:root:Watch out! -library/logging,,:So,INFO:root:So should this -library/logging,,:Started,INFO:root:Started -library/logging,,:Doing,INFO:root:Doing something -library/logging,,:Finished,INFO:root:Finished -library/logging,,:Look,WARNING:root:Look before you leap! -library/logging,,:So,INFO:So should this -library/logging,,:logger,severity:logger name:message -library/logging,,:message,severity:logger name:message -whatsnew/3.2,,:directory,... ${buildout:directory}/downloads/dist -whatsnew/3.2,,:location,... zope9-location = ${zope9:location} -whatsnew/3.2,,:prefix,... zope-conf = ${custom:prefix}/etc/zope.conf -howto/logging,,:root,WARNING:root:Watch out! -howto/logging,,:Watch,WARNING:root:Watch out! -howto/logging,,:root,DEBUG:root:This message should go to the log file -howto/logging,,:This,DEBUG:root:This message should go to the log file -howto/logging,,:root,INFO:root:So should this -howto/logging,,:So,INFO:root:So should this -howto/logging,,:root,"WARNING:root:And this, too" -howto/logging,,:And,"WARNING:root:And this, too" -howto/logging,,:root,INFO:root:Started -howto/logging,,:Started,INFO:root:Started -howto/logging,,:root,INFO:root:Doing something -howto/logging,,:Doing,INFO:root:Doing something -howto/logging,,:root,INFO:root:Finished -howto/logging,,:Finished,INFO:root:Finished -howto/logging,,:root,WARNING:root:Look before you leap! -howto/logging,,:Look,WARNING:root:Look before you leap! -howto/logging,,:This,DEBUG:This message should appear on the console -howto/logging,,:So,INFO:So should this -howto/logging,,:And,"WARNING:And this, too" -howto/logging,,:logger,severity:logger name:message -howto/logging,,:message,severity:logger name:message -library/logging.handlers,,:port,host:port -library/imaplib,116,:MM,"""DD-Mmm-YYYY HH:MM:SS" -library/imaplib,116,:SS,"""DD-Mmm-YYYY HH:MM:SS" -whatsnew/3.2,,::,"$ export PYTHONWARNINGS='ignore::RuntimeWarning::,once::UnicodeWarning::'" -howto/pyporting,75,::,# make sure to use :: Python *and* :: Python :: 3 so -howto/pyporting,75,::,"'Programming Language :: Python'," -howto/pyporting,75,::,'Programming Language :: Python :: 3' -whatsnew/3.2,,:gz,">>> with tarfile.open(name='myarchive.tar.gz', mode='w:gz') as tf:" -whatsnew/3.2,,:directory,${buildout:directory}/downloads/dist -whatsnew/3.2,,:location,zope9-location = ${zope9:location} -whatsnew/3.2,,:prefix,zope-conf = ${custom:prefix}/etc/zope.conf -whatsnew/3.2,,:beef,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') -whatsnew/3.2,,:cafe,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') +whatsnew/2.7,862,::,"export PYTHONWARNINGS=all,error:::Cookie:0" +whatsnew/3.2,,:affe,"netloc='[dead:beef:cafe:5417:affe:8FA3:deaf:feed]'," whatsnew/3.2,,:affe,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') -whatsnew/3.2,,:deaf,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') -whatsnew/3.2,,:feed,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') whatsnew/3.2,,:beef,"netloc='[dead:beef:cafe:5417:affe:8FA3:deaf:feed]'," +whatsnew/3.2,,:beef,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') whatsnew/3.2,,:cafe,"netloc='[dead:beef:cafe:5417:affe:8FA3:deaf:feed]'," -whatsnew/3.2,,:affe,"netloc='[dead:beef:cafe:5417:affe:8FA3:deaf:feed]'," +whatsnew/3.2,,:cafe,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') whatsnew/3.2,,:deaf,"netloc='[dead:beef:cafe:5417:affe:8FA3:deaf:feed]'," +whatsnew/3.2,,:deaf,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') +whatsnew/3.2,,:directory,${buildout:directory}/downloads/dist +whatsnew/3.2,,::,"$ export PYTHONWARNINGS='ignore::RuntimeWarning::,once::UnicodeWarning::'" whatsnew/3.2,,:feed,"netloc='[dead:beef:cafe:5417:affe:8FA3:deaf:feed]'," -library/pprint,209,::,"'classifiers': ['Development Status :: 4 - Beta'," -library/pprint,209,::,"'Intended Audience :: Developers'," -library/pprint,209,::,"'License :: OSI Approved :: MIT License'," -library/pprint,209,::,"'Natural Language :: English'," -library/pprint,209,::,"'Operating System :: OS Independent'," -library/pprint,209,::,"'Programming Language :: Python'," -library/pprint,209,::,"'Programming Language :: Python :: 2'," -library/pprint,209,::,"'Programming Language :: Python :: 2.6'," -library/pprint,209,::,"'Programming Language :: Python :: 2.7'," -library/pprint,209,::,"'Topic :: Software Development :: Libraries'," -library/pprint,209,::,"'Topic :: Software Development :: Libraries :: Python Modules']," -library/http.client,,:port,host:port -library/http.cookies,,`,!#$%&'*+-.^_`|~ -library/sqlite3,,:who,"cur.execute(""select * from people where name_last=:who and age=:age"", {""who"": who, ""age"": age})" -library/sqlite3,,:age,"cur.execute(""select * from people where name_last=:who and age=:age"", {""who"": who, ""age"": age})" +whatsnew/3.2,,:feed,>>> urllib.parse.urlparse('http://[dead:beef:cafe:5417:affe:8FA3:deaf:feed]/foo/') +whatsnew/3.2,,:gz,">>> with tarfile.open(name='myarchive.tar.gz', mode='w:gz') as tf:" +whatsnew/3.2,,:location,zope9-location = ${zope9:location} +whatsnew/3.2,,:prefix,zope-conf = ${custom:prefix}/etc/zope.conf +whatsnew/changelog,,:platform,:platform: +whatsnew/changelog,,:PythonCmd,"With Tk < 8.5 _tkinter.c:PythonCmd() raised UnicodeDecodeError, caused" +whatsnew/changelog,,::,": Fix FTP tests for IPv6, bind to ""::1"" instead of ""localhost""." +whatsnew/changelog,,::,": Use ""127.0.0.1"" or ""::1"" instead of ""localhost"" as much as" +whatsnew/changelog,,:password,user:password +whatsnew/changelog,,:gz,w:gz diff --git a/Doc/tools/sphinxext/suspicious.py b/Doc/tools/sphinxext/suspicious.py index 888b231..ee87733 100644 --- a/Doc/tools/sphinxext/suspicious.py +++ b/Doc/tools/sphinxext/suspicious.py @@ -68,6 +68,10 @@ class Rule: # None -> don't care self.issue = issue # the markup fragment that triggered this rule self.line = line # text of the container element (single line only) + self.used = False + + def __repr__(self): + return '{0.docname},,{0.issue},{0.line}'.format(self) @@ -107,6 +111,12 @@ class CheckSuspiciousMarkupBuilder(Builder): doctree.walk(visitor) def finish(self): + unused_rules = [rule for rule in self.rules if not rule.used] + if unused_rules: + self.warn('Found %s/%s unused rules:' % + (len(unused_rules), len(self.rules))) + for rule in unused_rules: + self.info(repr(rule)) return def check_issue(self, line, lineno, issue): @@ -131,6 +141,7 @@ class CheckSuspiciousMarkupBuilder(Builder): if (rule.lineno is not None) and \ abs(rule.lineno - lineno) > 5: continue # if it came this far, the rule matched + rule.used = True return True return False diff --git a/Doc/tutorial/classes.rst b/Doc/tutorial/classes.rst index cff2710..08072a3 100644 --- a/Doc/tutorial/classes.rst +++ b/Doc/tutorial/classes.rst @@ -168,7 +168,6 @@ binding:: def do_global(): global spam spam = "global spam" - spam = "test spam" do_local() print("After local assignment:", spam) @@ -184,7 +183,6 @@ The output of the example code is: .. code-block:: none - After local assignment: test spam After nonlocal assignment: nonlocal spam After global assignment: nonlocal spam @@ -654,7 +652,7 @@ will do nicely:: A piece of Python code that expects a particular abstract data type can often be passed a class that emulates the methods of that data type instead. For instance, if you have a function that formats some data from a file object, you -can define a class with methods :meth:`read` and :meth:`readline` that get the +can define a class with methods :meth:`read` and :meth:`!readline` that get the data from a string buffer instead, and pass it as an argument. .. (Unfortunately, this technique has its limitations: a class can't define @@ -698,9 +696,9 @@ example, the following code will print B, C, D in that order:: class D(C): pass - for c in [B, C, D]: + for cls in [B, C, D]: try: - raise c() + raise cls() except D: print("D") except C: @@ -740,8 +738,8 @@ pervades and unifies Python. Behind the scenes, the :keyword:`for` statement calls :func:`iter` on the container object. The function returns an iterator object that defines the method :meth:`~iterator.__next__` which accesses elements in the container one at a time. When there are no more elements, -:meth:`__next__` raises a :exc:`StopIteration` exception which tells the -:keyword:`for` loop to terminate. You can call the :meth:`__next__` method +:meth:`~iterator.__next__` raises a :exc:`StopIteration` exception which tells the +:keyword:`for` loop to terminate. You can call the :meth:`~iterator.__next__` method using the :func:`next` built-in function; this example shows how it all works:: >>> s = 'abc' diff --git a/Doc/tutorial/controlflow.rst b/Doc/tutorial/controlflow.rst index 055f547..97aea4f 100644 --- a/Doc/tutorial/controlflow.rst +++ b/Doc/tutorial/controlflow.rst @@ -19,14 +19,14 @@ example:: >>> x = int(input("Please enter an integer: ")) Please enter an integer: 42 >>> if x < 0: - ... x = 0 - ... print('Negative changed to zero') + ... x = 0 + ... print('Negative changed to zero') ... elif x == 0: - ... print('Zero') + ... print('Zero') ... elif x == 1: - ... print('Single') + ... print('Single') ... else: - ... print('More') + ... print('More') ... More @@ -370,7 +370,7 @@ defined to allow. For example:: return False retries = retries - 1 if retries < 0: - raise IOError('refusenik user') + raise IOError('uncooperative user') print(complaint) This function can be called in several ways: @@ -583,17 +583,16 @@ In the same fashion, dictionaries can deliver keyword arguments with the ``**``\ .. _tut-lambda: -Lambda Forms ------------- +Lambda Expressions +------------------ -By popular demand, a few features commonly found in functional programming -languages like Lisp have been added to Python. With the :keyword:`lambda` -keyword, small anonymous functions can be created. Here's a function that -returns the sum of its two arguments: ``lambda a, b: a+b``. Lambda forms can be -used wherever function objects are required. They are syntactically restricted -to a single expression. Semantically, they are just syntactic sugar for a -normal function definition. Like nested function definitions, lambda forms can -reference variables from the containing scope:: +Small anonymous functions can be created with the :keyword:`lambda` keyword. +This function returns the sum of its two arguments: ``lambda a, b: a+b``. +Lambda functions can be used wherever function objects are required. They are +syntactically restricted to a single expression. Semantically, they are just +syntactic sugar for a normal function definition. Like nested function +definitions, lambda functions can reference variables from the containing +scope:: >>> def make_incrementor(n): ... return lambda x: x + n @@ -604,6 +603,14 @@ reference variables from the containing scope:: >>> f(1) 43 +The above example uses a lambda expression to return a function. Another use +is to pass a small function as an argument:: + + >>> pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')] + >>> pairs.sort(key=lambda pair: pair[1]) + >>> pairs + [(4, 'four'), (1, 'one'), (3, 'three'), (2, 'two')] + .. _tut-docstrings: @@ -749,4 +756,3 @@ extracted for you: .. [#] Actually, *call by object reference* would be a better description, since if a mutable object is passed, the caller will see any changes the callee makes to it (items inserted into a list). - diff --git a/Doc/tutorial/datastructures.rst b/Doc/tutorial/datastructures.rst index 0ee4dca..24d2d2e 100644 --- a/Doc/tutorial/datastructures.rst +++ b/Doc/tutorial/datastructures.rst @@ -19,13 +19,13 @@ objects: .. method:: list.append(x) :noindex: - Add an item to the end of the list; equivalent to ``a[len(a):] = [x]``. + Add an item to the end of the list. Equivalent to ``a[len(a):] = [x]``. .. method:: list.extend(L) :noindex: - Extend the list by appending all the items in the given list; equivalent to + Extend the list by appending all the items in the given list. Equivalent to ``a[len(a):] = L``. @@ -40,8 +40,8 @@ objects: .. method:: list.remove(x) :noindex: - Remove the first item from the list whose value is *x*. It is an error if there - is no such item. + Remove the first item from the list whose value is *x*. It is an error if + there is no such item. .. method:: list.pop([i]) @@ -54,6 +54,12 @@ objects: will see this notation frequently in the Python Library Reference.) +.. method:: list.clear() + :noindex: + + Remove all items from the list. Equivalent to ``del a[:]``. + + .. method:: list.index(x) :noindex: @@ -70,13 +76,20 @@ objects: .. method:: list.sort() :noindex: - Sort the items of the list, in place. + Sort the items of the list in place. .. method:: list.reverse() :noindex: - Reverse the elements of the list, in place. + Reverse the elements of the list in place. + + +.. method:: list.copy() + :noindex: + + Return a shallow copy of the list. Equivalent to ``a[:]``. + An example that uses most of the list methods:: @@ -99,6 +112,10 @@ An example that uses most of the list methods:: >>> a [-1, 1, 66.25, 333, 333, 1234.5] +You might have noticed that methods like ``insert``, ``remove`` or ``sort`` that +modify the list have no return value printed -- they return ``None``. [1]_ This +is a design principle for all mutable data structures in Python. + .. _tut-lists-as-stacks: @@ -480,7 +497,7 @@ using a non-existent key. Performing ``list(d.keys())`` on a dictionary returns a list of all the keys used in the dictionary, in arbitrary order (if you want it sorted, just use -``sorted(d.keys())`` instead). [1]_ To check whether a single key is in the +``sorted(d.keys())`` instead). [2]_ To check whether a single key is in the dictionary, use the :keyword:`in` keyword. Here is a small example using a dictionary:: @@ -677,6 +694,9 @@ interpreter will raise a :exc:`TypeError` exception. .. rubric:: Footnotes -.. [1] Calling ``d.keys()`` will return a :dfn:`dictionary view` object. It +.. [1] Other languages may return the mutated object, which allows method + chaining, such as ``d->insert("a")->remove("b")->sort();``. + +.. [2] Calling ``d.keys()`` will return a :dfn:`dictionary view` object. It supports operations like membership test and iteration, but its contents are not independent of the original dictionary -- it is only a *view*. diff --git a/Doc/tutorial/errors.rst b/Doc/tutorial/errors.rst index 2b76c32..4282151 100644 --- a/Doc/tutorial/errors.rst +++ b/Doc/tutorial/errors.rst @@ -45,7 +45,7 @@ programs, however, and result in error messages as shown here:: >>> 10 * (1/0) Traceback (most recent call last): File "<stdin>", line 1, in ? - ZeroDivisionError: int division or modulo by zero + ZeroDivisionError: division by zero >>> 4 + spam*3 Traceback (most recent call last): File "<stdin>", line 1, in ? diff --git a/Doc/tutorial/inputoutput.rst b/Doc/tutorial/inputoutput.rst index c804e25..b3bf0ef 100644 --- a/Doc/tutorial/inputoutput.rst +++ b/Doc/tutorial/inputoutput.rst @@ -213,10 +213,6 @@ operation. For example:: >>> print('The value of PI is approximately %5.3f.' % math.pi) The value of PI is approximately 3.142. -Since :meth:`str.format` is quite new, a lot of Python code still uses the ``%`` -operator. However, because this old style of formatting will eventually be -removed from the language, :meth:`str.format` should generally be used. - More information can be found in the :ref:`old-string-formatting` section. @@ -300,18 +296,8 @@ containing only a single newline. :: >>> f.readline() '' -``f.readlines()`` returns a list containing all the lines of data in the file. -If given an optional parameter *sizehint*, it reads that many bytes from the -file and enough more to complete a line, and returns the lines from that. This -is often used to allow efficient reading of a large file by lines, but without -having to load the entire file in memory. Only complete lines will be returned. -:: - - >>> f.readlines() - ['This is the first line of the file.\n', 'Second line of the file\n'] - -An alternative approach to reading lines is to loop over the file object. This is -memory efficient, fast, and leads to simpler code:: +For reading lines from a file, you can loop over the file object. This is memory +efficient, fast, and leads to simple code:: >>> for line in f: ... print(line, end='') @@ -319,9 +305,8 @@ memory efficient, fast, and leads to simpler code:: This is the first line of the file. Second line of the file -The alternative approach is simpler but does not provide as fine-grained -control. Since the two approaches manage line buffering differently, they -should not be mixed. +If you want to read all the lines of a file in a list you can also use +``list(f)`` or ``f.readlines()``. ``f.write(string)`` writes the contents of *string* to the file, returning the number of characters written. :: @@ -337,9 +322,11 @@ first:: >>> f.write(s) 18 -``f.tell()`` returns an integer giving the file object's current position in the -file, measured in bytes from the beginning of the file. To change the file -object's position, use ``f.seek(offset, from_what)``. The position is computed +``f.tell()`` returns an integer giving the file object's current position in the file +represented as number of bytes from the beginning of the file when in `binary mode` and +an opaque number when in `text mode`. + +To change the file object's position, use ``f.seek(offset, from_what)``. The position is computed from adding *offset* to a reference point; the reference point is selected by the *from_what* argument. A *from_what* value of 0 measures from the beginning of the file, 1 uses the current file position, and 2 uses the end of the file as @@ -360,7 +347,10 @@ beginning of the file as the reference point. :: In text files (those opened without a ``b`` in the mode string), only seeks relative to the beginning of the file are allowed (the exception being seeking -to the very file end with ``seek(0, 2)``). +to the very file end with ``seek(0, 2)``) and the only valid *offset* values are +those returned from the ``f.tell()``, or zero. Any other *offset* value produces +undefined behaviour. + When you're done with a file, call ``f.close()`` to close it and free up any system resources taken up by the open file. After calling ``f.close()``, @@ -387,47 +377,64 @@ File objects have some additional methods, such as :meth:`~file.isatty` and Reference for a complete guide to file objects. -.. _tut-pickle: +.. _tut-json: -The :mod:`pickle` Module ------------------------- +Saving structured data with :mod:`json` +--------------------------------------- -.. index:: module: pickle +.. index:: module: json -Strings can easily be written to and read from a file. Numbers take a bit more +Strings can easily be written to and read from a file. Numbers take a bit more effort, since the :meth:`read` method only returns strings, which will have to be passed to a function like :func:`int`, which takes a string like ``'123'`` -and returns its numeric value 123. However, when you want to save more complex -data types like lists, dictionaries, or class instances, things get a lot more -complicated. - -Rather than have users be constantly writing and debugging code to save -complicated data types, Python provides a standard module called :mod:`pickle`. -This is an amazing module that can take almost any Python object (even some -forms of Python code!), and convert it to a string representation; this process -is called :dfn:`pickling`. Reconstructing the object from the string -representation is called :dfn:`unpickling`. Between pickling and unpickling, -the string representing the object may have been stored in a file or data, or +and returns its numeric value 123. When you want to save more complex data +types like nested lists and dictionaries, parsing and serializing by hand +becomes complicated. + +Rather than having users constantly writing and debugging code to save +complicated data types to files, Python allows you to use the popular data +interchange format called `JSON (JavaScript Object Notation) +<http://json.org>`_. The standard module called :mod:`json` can take Python +data hierarchies, and convert them to string representations; this process is +called :dfn:`serializing`. Reconstructing the data from the string representation +is called :dfn:`deserializing`. Between serializing and deserializing, the +string representing the object may have been stored in a file or data, or sent over a network connection to some distant machine. -If you have an object ``x``, and a file object ``f`` that's been opened for -writing, the simplest way to pickle the object takes only one line of code:: +.. note:: + The JSON format is commonly used by modern applications to allow for data + exchange. Many programmers are already familiar with it, which makes + it a good choice for interoperability. + +If you have an object ``x``, you can view its JSON string representation with a +simple line of code:: + + >>> json.dumps([1, 'simple', 'list']) + '[1, "simple", "list"]' + +Another variant of the :func:`~json.dumps` function, called :func:`~json.dump`, +simply serializes the object to a :term:`text file`. So if ``f`` is a +:term:`text file` object opened for writing, we can do this:: + + json.dump(x, f) - pickle.dump(x, f) +To decode the object again, if ``f`` is a :term:`text file` object which has +been opened for reading:: -To unpickle the object again, if ``f`` is a file object which has been opened -for reading:: + x = json.load(f) - x = pickle.load(f) +This simple serialization technique can handle lists and dictionaries, but +serializing arbitrary class instances in JSON requires a bit of extra effort. +The reference for the :mod:`json` module contains an explanation of this. -(There are other variants of this, used when pickling many objects or when you -don't want to write the pickled data to a file; consult the complete -documentation for :mod:`pickle` in the Python Library Reference.) +.. seealso:: -:mod:`pickle` is the standard way to make Python objects which can be stored and -reused by other programs or by a future invocation of the same program; the -technical term for this is a :dfn:`persistent` object. Because :mod:`pickle` is -so widely used, many authors who write Python extensions take care to ensure -that new data types such as matrices can be properly pickled and unpickled. + :mod:`pickle` - the pickle module + Contrary to :ref:`JSON <tut-json>`, *pickle* is a protocol which allows + the serialization of arbitrarily complex Python objects. As such, it is + specific to Python and cannot be used to communicate with applications + written in other languages. It is also insecure by default: + deserializing pickle data coming from an untrusted source can execute + arbitrary code, if the data was crafted by a skilled attacker. diff --git a/Doc/tutorial/interpreter.rst b/Doc/tutorial/interpreter.rst index d61dafc..cdc2bf2 100644 --- a/Doc/tutorial/interpreter.rst +++ b/Doc/tutorial/interpreter.rst @@ -10,13 +10,13 @@ Using the Python Interpreter Invoking the Interpreter ======================== -The Python interpreter is usually installed as :file:`/usr/local/bin/python3.2` +The Python interpreter is usually installed as :file:`/usr/local/bin/python3.3` on those machines where it is available; putting :file:`/usr/local/bin` in your Unix shell's search path makes it possible to start it by typing the command: .. code-block:: text - python3.2 + python3.3 to the shell. [#]_ Since the choice of the directory where the interpreter lives is an installation option, other places are possible; check with your local @@ -24,11 +24,11 @@ Python guru or system administrator. (E.g., :file:`/usr/local/python` is a popular alternative location.) On Windows machines, the Python installation is usually placed in -:file:`C:\\Python32`, though you can change this when you're running the +:file:`C:\\Python33`, though you can change this when you're running the installer. To add this directory to your path, you can type the following command into the command prompt in a DOS box:: - set path=%path%;C:\python32 + set path=%path%;C:\python33 Typing an end-of-file character (:kbd:`Control-D` on Unix, :kbd:`Control-Z` on Windows) at the primary prompt causes the interpreter to exit with a zero exit @@ -95,8 +95,8 @@ with the *secondary prompt*, by default three dots (``...``). The interpreter prints a welcome message stating its version number and a copyright notice before printing the first prompt:: - $ python3.2 - Python 3.2.3 (default, May 3 2012, 15:54:42) + $ python3.3 + Python 3.3 (default, Sep 24 2012, 09:25:04) [GCC 4.6.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> @@ -149,7 +149,7 @@ Executable Python Scripts On BSD'ish Unix systems, Python scripts can be made directly executable, like shell scripts, by putting the line :: - #! /usr/bin/env python3.2 + #! /usr/bin/env python3.3 (assuming that the interpreter is on the user's :envvar:`PATH`) at the beginning of the script and giving the file an executable mode. The ``#!`` must be the diff --git a/Doc/tutorial/introduction.rst b/Doc/tutorial/introduction.rst index b6d94ac..9efd1ac 100644 --- a/Doc/tutorial/introduction.rst +++ b/Doc/tutorial/introduction.rst @@ -5,7 +5,7 @@ An Informal Introduction to Python ********************************** In the following examples, input and output are distinguished by the presence or -absence of prompts (``>>>`` and ``...``): to repeat the example, you must type +absence of prompts (:term:`>>>` and :term:`...`): to repeat the example, you must type everything after the prompt, when the prompt appears; lines that do not begin with a prompt are output from the interpreter. Note that a secondary prompt on a line by itself in an example means you must type a blank line; this is used to @@ -22,9 +22,9 @@ be omitted when typing in examples. Some examples:: # this is the first comment - SPAM = 1 # and this is the second comment - # ... and now a third! - STRING = "# This is not a comment." + spam = 1 # and this is the second comment + # ... and now a third! + text = "# This is not a comment because it's inside quotes." .. _tut-calculator: @@ -44,55 +44,53 @@ Numbers The interpreter acts as a simple calculator: you can type an expression at it and it will write the value. Expression syntax is straightforward: the operators ``+``, ``-``, ``*`` and ``/`` work just like in most other languages -(for example, Pascal or C); parentheses can be used for grouping. For example:: +(for example, Pascal or C); parentheses (``()``) can be used for grouping. +For example:: - >>> 2+2 + >>> 2 + 2 4 - >>> # This is a comment - ... 2+2 - 4 - >>> 2+2 # and a comment on the same line as code - 4 - >>> (50-5*6)/4 + >>> 50 - 5*6 + 20 + >>> (50 - 5*6) / 4 5.0 - >>> 8/5 # Fractions aren't lost when dividing integers + >>> 8 / 5 # division always returns a floating point number 1.6 -Note: You might not see exactly the same result; floating point results can -differ from one machine to another. We will say more later about controlling -the appearance of floating point output. See also :ref:`tut-fp-issues` for a -full discussion of some of the subtleties of floating point numbers and their -representations. +The integer numbers (e.g. ``2``, ``4``, ``20``) have type :class:`int`, +the ones with a fractional part (e.g. ``5.0``, ``1.6``) have type +:class:`float`. We will see more about numeric types later in the tutorial. -To do integer division and get an integer result, -discarding any fractional result, there is another operator, ``//``:: +Division (``/``) always returns a float. To do :term:`floor division` and +get an integer result (discarding any fractional result) you can use the ``//`` +operator; to calculate the remainder you can use ``%``:: - >>> # Integer division returns the floor: - ... 7//3 + >>> 17 / 3 # classic division returns a float + 5.666666666666667 + >>> + >>> 17 // 3 # floor division discards the fractional part + 5 + >>> 17 % 3 # the % operator returns the remainder of the division 2 - >>> 7//-3 - -3 + >>> 5 * 3 + 2 # result * divisor + remainder + 17 -The equal sign (``'='``) is used to assign a value to a variable. Afterwards, no +With Python, it is possible to use the ``**`` operator to calculate powers [#]_:: + + >>> 5 ** 2 # 5 squared + 25 + >>> 2 ** 7 # 2 to the power of 7 + 128 + +The equal sign (``=``) is used to assign a value to a variable. Afterwards, no result is displayed before the next interactive prompt:: >>> width = 20 - >>> height = 5*9 + >>> height = 5 * 9 >>> width * height 900 -A value can be assigned to several variables simultaneously:: - - >>> x = y = z = 0 # Zero x, y and z - >>> x - 0 - >>> y - 0 - >>> z - 0 - -Variables must be "defined" (assigned a value) before they can be used, or an -error will occur:: +If a variable is not "defined" (assigned a value), trying to use it will +give you an error:: >>> n # try to access an undefined variable Traceback (most recent call last): @@ -107,49 +105,6 @@ convert the integer operand to floating point:: >>> 7.0 / 2 3.5 -Complex numbers are also supported; imaginary numbers are written with a suffix -of ``j`` or ``J``. Complex numbers with a nonzero real component are written as -``(real+imagj)``, or can be created with the ``complex(real, imag)`` function. -:: - - >>> 1j * 1J - (-1+0j) - >>> 1j * complex(0, 1) - (-1+0j) - >>> 3+1j*3 - (3+3j) - >>> (3+1j)*3 - (9+3j) - >>> (1+2j)/(1+1j) - (1.5+0.5j) - -Complex numbers are always represented as two floating point numbers, the real -and imaginary part. To extract these parts from a complex number *z*, use -``z.real`` and ``z.imag``. :: - - >>> a=1.5+0.5j - >>> a.real - 1.5 - >>> a.imag - 0.5 - -The conversion functions to floating point and integer (:func:`float`, -:func:`int`) don't work for complex numbers --- there is not one correct way to -convert a complex number to a real number. Use ``abs(z)`` to get its magnitude -(as a float) or ``z.real`` to get its real part:: - - >>> a=3.0+4.0j - >>> float(a) - Traceback (most recent call last): - File "<stdin>", line 1, in ? - TypeError: can't convert complex to float; use abs(z) - >>> a.real - 3.0 - >>> a.imag - 4.0 - >>> abs(a) # sqrt(a.real**2 + a.imag**2) - 5.0 - In interactive mode, the last printed expression is assigned to the variable ``_``. This means that when you are using Python as a desk calculator, it is somewhat easier to continue calculations, for example:: @@ -167,20 +122,28 @@ This variable should be treated as read-only by the user. Don't explicitly assign a value to it --- you would create an independent local variable with the same name masking the built-in variable with its magic behavior. +In addition to :class:`int` and :class:`float`, Python supports other types of +numbers, such as :class:`~decimal.Decimal` and :class:`~fractions.Fraction`. +Python also has built-in support for :ref:`complex numbers <typesnumeric>`, +and uses the ``j`` or ``J`` suffix to indicate the imaginary part +(e.g. ``3+5j``). + .. _tut-strings: Strings ------- -Besides numbers, Python can also manipulate strings, which can be expressed in -several ways. They can be enclosed in single quotes or double quotes:: +Besides numbers, Python can also manipulate strings, which can be expressed +in several ways. They can be enclosed in single quotes (``'...'``) or +double quotes (``"..."``) with the same result [#]_. ``\`` can be used +to escape quotes:: - >>> 'spam eggs' + >>> 'spam eggs' # single quotes 'spam eggs' - >>> 'doesn\'t' + >>> 'doesn\'t' # use \' to escape the single quote... "doesn't" - >>> "doesn't" + >>> "doesn't" # ...or use double quotes instead "doesn't" >>> '"Yes," he said.' '"Yes," he said.' @@ -189,38 +152,40 @@ several ways. They can be enclosed in single quotes or double quotes:: >>> '"Isn\'t," she said.' '"Isn\'t," she said.' -The interpreter prints the result of string operations in the same way as they -are typed for input: inside quotes, and with quotes and other funny characters -escaped by backslashes, to show the precise value. The string is enclosed in -double quotes if the string contains a single quote and no double quotes, else -it's enclosed in single quotes. The :func:`print` function produces a more -readable output for such input strings. +In the interactive interpreter, the output string is enclosed in quotes and +special characters are escaped with backslashes. While this might sometimes +look different from the input (the enclosing quotes could change), the two +strings are equivalent. The string is enclosed in double quotes if +the string contains a single quote and no double quotes, otherwise it is +enclosed in single quotes. The :func:`print` function produces a more +readable output, by omitting the enclosing quotes and by printing escaped +and special characters:: -String literals can span multiple lines in several ways. Continuation lines can -be used, with a backslash as the last character on the line indicating that the -next line is a logical continuation of the line:: - - hello = "This is a rather long string containing\n\ - several lines of text just as you would do in C.\n\ - Note that whitespace at the beginning of the line is\ - significant." - - print(hello) - -Note that newlines still need to be embedded in the string using ``\n`` -- the -newline following the trailing backslash is discarded. This example would print -the following: - -.. code-block:: text - - This is a rather long string containing - several lines of text just as you would do in C. - Note that whitespace at the beginning of the line is significant. - -Or, strings can be surrounded in a pair of matching triple-quotes: ``"""`` or -``'''``. End of lines do not need to be escaped when using triple-quotes, but -they will be included in the string. So the following uses one escape to -avoid an unwanted initial blank line. :: + >>> '"Isn\'t," she said.' + '"Isn\'t," she said.' + >>> print('"Isn\'t," she said.') + "Isn't," she said. + >>> s = 'First line.\nSecond line.' # \n means newline + >>> s # without print(), \n is included in the output + 'First line.\nSecond line.' + >>> print(s) # with print(), \n produces a new line + First line. + Second line. + +If you don't want characters prefaced by ``\`` to be interpreted as +special characters, you can use *raw strings* by adding an ``r`` before +the first quote:: + + >>> print('C:\some\name') # here \n means newline! + C:\some + ame + >>> print(r'C:\some\name') # note the r before the quote + C:\some\name + +String literals can span multiple lines. One way is using triple-quotes: +``"""..."""`` or ``'''...'''``. End of lines are automatically +included in the string, but it's possible to prevent this by adding a ``\`` at +the end of the line. The following example:: print("""\ Usage: thingy [OPTIONS] @@ -228,7 +193,7 @@ avoid an unwanted initial blank line. :: -H hostname Hostname to connect to """) -produces the following output: +produces the following output (note that the initial newline is not included): .. code-block:: text @@ -236,143 +201,100 @@ produces the following output: -h Display this usage message -H hostname Hostname to connect to -If we make the string literal a "raw" string, ``\n`` sequences are not converted -to newlines, but the backslash at the end of the line, and the newline character -in the source, are both included in the string as data. Thus, the example:: - - hello = r"This is a rather long string containing\n\ - several lines of text much as you would do in C." - - print(hello) - -would print: - -.. code-block:: text - - This is a rather long string containing\n\ - several lines of text much as you would do in C. - Strings can be concatenated (glued together) with the ``+`` operator, and repeated with ``*``:: - >>> word = 'Help' + 'A' - >>> word - 'HelpA' - >>> '<' + word*5 + '>' - '<HelpAHelpAHelpAHelpAHelpA>' - -Two string literals next to each other are automatically concatenated; the first -line above could also have been written ``word = 'Help' 'A'``; this only works -with two literals, not with arbitrary string expressions:: - - >>> 'str' 'ing' # <- This is ok - 'string' - >>> 'str'.strip() + 'ing' # <- This is ok - 'string' - >>> 'str'.strip() 'ing' # <- This is invalid - File "<stdin>", line 1, in ? - 'str'.strip() 'ing' - ^ - SyntaxError: invalid syntax + >>> # 3 times 'un', followed by 'ium' + >>> 3 * 'un' + 'ium' + 'unununium' -Strings can be subscripted (indexed); like in C, the first character of a string -has subscript (index) 0. There is no separate character type; a character is -simply a string of size one. As in the Icon programming language, substrings -can be specified with the *slice notation*: two indices separated by a colon. -:: +Two or more *string literals* (i.e. the ones enclosed between quotes) next +to each other are automatically concatenated. :: - >>> word[4] - 'A' - >>> word[0:2] - 'He' - >>> word[2:4] - 'lp' + >>> 'Py' 'thon' + 'Python' -Slice indices have useful defaults; an omitted first index defaults to zero, an -omitted second index defaults to the size of the string being sliced. :: +This only works with two literals though, not with variables or expressions:: - >>> word[:2] # The first two characters - 'He' - >>> word[2:] # Everything except the first two characters - 'lpA' + >>> prefix = 'Py' + >>> prefix 'thon' # can't concatenate a variable and a string literal + ... + SyntaxError: invalid syntax + >>> ('un' * 3) 'ium' + ... + SyntaxError: invalid syntax -Unlike a C string, Python strings cannot be changed. Assigning to an indexed -position in the string results in an error:: +If you want to concatenate variables or a variable and a literal, use ``+``:: - >>> word[0] = 'x' - Traceback (most recent call last): - File "<stdin>", line 1, in ? - TypeError: 'str' object does not support item assignment - >>> word[:1] = 'Splat' - Traceback (most recent call last): - File "<stdin>", line 1, in ? - TypeError: 'str' object does not support slice assignment + >>> prefix + 'thon' + 'Python' -However, creating a new string with the combined content is easy and efficient:: +This feature is particularly useful when you want to break long strings:: - >>> 'x' + word[1:] - 'xelpA' - >>> 'Splat' + word[4] - 'SplatA' + >>> text = ('Put several strings within parentheses ' + 'to have them joined together.') + >>> text + 'Put several strings within parentheses to have them joined together.' -Here's a useful invariant of slice operations: ``s[:i] + s[i:]`` equals ``s``. -:: +Strings can be *indexed* (subscripted), with the first character having index 0. +There is no separate character type; a character is simply a string of size +one:: - >>> word[:2] + word[2:] - 'HelpA' - >>> word[:3] + word[3:] - 'HelpA' + >>> word = 'Python' + >>> word[0] # character in position 0 + 'P' + >>> word[5] # character in position 5 + 'n' -Degenerate slice indices are handled gracefully: an index that is too large is -replaced by the string size, an upper bound smaller than the lower bound returns -an empty string. :: +Indices may also be negative numbers, to start counting from the right:: - >>> word[1:100] - 'elpA' - >>> word[10:] - '' - >>> word[2:1] - '' + >>> word[-1] # last character + 'n' + >>> word[-2] # second-last character + 'o' + >>> word[-6] + 'P' -Indices may be negative numbers, to start counting from the right. For example:: +Note that since -0 is the same as 0, negative indices start from -1. - >>> word[-1] # The last character - 'A' - >>> word[-2] # The last-but-one character - 'p' - >>> word[-2:] # The last two characters - 'pA' - >>> word[:-2] # Everything except the last two characters - 'Hel' +In addition to indexing, *slicing* is also supported. While indexing is used +to obtain individual characters, *slicing* allows you to obtain substring:: -But note that -0 is really the same as 0, so it does not count from the right! -:: + >>> word[0:2] # characters from position 0 (included) to 2 (excluded) + 'Py' + >>> word[2:5] # characters from position 2 (included) to 5 (excluded) + 'tho' - >>> word[-0] # (since -0 equals 0) - 'H' +Note how the start is always included, and the end always excluded. This +makes sure that ``s[:i] + s[i:]`` is always equal to ``s``:: -Out-of-range negative slice indices are truncated, but don't try this for -single-element (non-slice) indices:: + >>> word[:2] + word[2:] + 'Python' + >>> word[:4] + word[4:] + 'Python' - >>> word[-100:] - 'HelpA' - >>> word[-10] # error - Traceback (most recent call last): - File "<stdin>", line 1, in ? - IndexError: string index out of range +Slice indices have useful defaults; an omitted first index defaults to zero, an +omitted second index defaults to the size of the string being sliced. :: + + >>> word[:2] # character from the beginning to position 2 (excluded) + 'Py' + >>> word[4:] # characters from position 4 (included) to the end + 'on' + >>> word[-2:] # characters from the second-last (included) to the end + 'on' One way to remember how slices work is to think of the indices as pointing *between* characters, with the left edge of the first character numbered 0. Then the right edge of the last character of a string of *n* characters has index *n*, for example:: - +---+---+---+---+---+ - | H | e | l | p | A | - +---+---+---+---+---+ - 0 1 2 3 4 5 - -5 -4 -3 -2 -1 + +---+---+---+---+---+---+ + | P | y | t | h | o | n | + +---+---+---+---+---+---+ + 0 1 2 3 4 5 6 + -6 -5 -4 -3 -2 -1 -The first row of numbers gives the position of the indices 0...5 in the string; +The first row of numbers gives the position of the indices 0...6 in the string; the second row gives the corresponding negative indices. The slice from *i* to *j* consists of all characters between the edges labeled *i* and *j*, respectively. @@ -381,6 +303,38 @@ For non-negative indices, the length of a slice is the difference of the indices, if both are within bounds. For example, the length of ``word[1:3]`` is 2. +Attempting to use a index that is too large will result in an error:: + + >>> word[42] # the word only has 7 characters + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + IndexError: string index out of range + +However, out of range slice indexes are handled gracefully when used for +slicing:: + + >>> word[4:42] + 'on' + >>> word[42:] + '' + +Python strings cannot be changed --- they are :term:`immutable`. +Therefore, assigning to an indexed position in the string results in an error:: + + >>> word[0] = 'J' + ... + TypeError: 'str' object does not support item assignment + >>> word[2:] = 'py' + ... + TypeError: 'str' object does not support item assignment + +If you need a different string, you should create a new one:: + + >>> 'J' + word[1:] + 'Jython' + >>> word[:2] + 'py' + 'Pypy' + The built-in function :func:`len` returns the length of a string:: >>> s = 'supercalifragilisticexpialidocious' @@ -390,7 +344,7 @@ The built-in function :func:`len` returns the length of a string:: .. seealso:: - :ref:`typesseq` + :ref:`textseq` Strings are examples of *sequence types*, and support the common operations supported by such types. @@ -407,149 +361,96 @@ The built-in function :func:`len` returns the length of a string:: the left operand of the ``%`` operator are described in more detail here. -.. _tut-unicodestrings: - -About Unicode -------------- - -.. sectionauthor:: Marc-André Lemburg <mal@lemburg.com> - - -Starting with Python 3.0 all strings support Unicode (see -http://www.unicode.org/). - -Unicode has the advantage of providing one ordinal for every character in every -script used in modern and ancient texts. Previously, there were only 256 -possible ordinals for script characters. Texts were typically bound to a code -page which mapped the ordinals to script characters. This lead to very much -confusion especially with respect to internationalization (usually written as -``i18n`` --- ``'i'`` + 18 characters + ``'n'``) of software. Unicode solves -these problems by defining one code page for all scripts. - -If you want to include special characters in a string, -you can do so by using the Python *Unicode-Escape* encoding. The following -example shows how:: +.. _tut-lists: - >>> 'Hello\u0020World !' - 'Hello World !' +Lists +----- -The escape sequence ``\u0020`` indicates to insert the Unicode character with -the ordinal value 0x0020 (the space character) at the given position. +Python knows a number of *compound* data types, used to group together other +values. The most versatile is the *list*, which can be written as a list of +comma-separated values (items) between square brackets. Lists might contain +items of different types, but usually the items all have the same type. :: -Other characters are interpreted by using their respective ordinal values -directly as Unicode ordinals. If you have literal strings in the standard -Latin-1 encoding that is used in many Western countries, you will find it -convenient that the lower 256 characters of Unicode are the same as the 256 -characters of Latin-1. + >>> squares = [1, 4, 9, 16, 25] + >>> squares + [1, 4, 9, 16, 25] -Apart from these standard encodings, Python provides a whole set of other ways -of creating Unicode strings on the basis of a known encoding. +Like strings (and all other built-in :term:`sequence` type), lists can be +indexed and sliced:: -To convert a string into a sequence of bytes using a specific encoding, -string objects provide an :func:`encode` method that takes one argument, the -name of the encoding. Lowercase names for encodings are preferred. :: + >>> squares[0] # indexing returns the item + 1 + >>> squares[-1] + 25 + >>> squares[-3:] # slicing returns a new list + [9, 16, 25] - >>> "Äpfel".encode('utf-8') - b'\xc3\x84pfel' +All slice operations return a new list containing the requested elements. This +means that the following slice returns a new (shallow) copy of the list:: -.. _tut-lists: + >>> squares[:] + [1, 4, 9, 16, 25] -Lists ------ +Lists also supports operations like concatenation:: -Python knows a number of *compound* data types, used to group together other -values. The most versatile is the *list*, which can be written as a list of -comma-separated values (items) between square brackets. List items need not all -have the same type. :: - - >>> a = ['spam', 'eggs', 100, 1234] - >>> a - ['spam', 'eggs', 100, 1234] - -Like string indices, list indices start at 0, and lists can be sliced, -concatenated and so on:: - - >>> a[0] - 'spam' - >>> a[3] - 1234 - >>> a[-2] - 100 - >>> a[1:-1] - ['eggs', 100] - >>> a[:2] + ['bacon', 2*2] - ['spam', 'eggs', 'bacon', 4] - >>> 3*a[:3] + ['Boo!'] - ['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!'] + >>> squares + [36, 49, 64, 81, 100] + [1, 4, 9, 16, 25, 36, 49, 64, 81, 100] -All slice operations return a new list containing the requested elements. This -means that the following slice returns a shallow copy of the list *a*:: +Unlike strings, which are :term:`immutable`, lists are a :term:`mutable` +type, i.e. it is possible to change their content:: - >>> a[:] - ['spam', 'eggs', 100, 1234] + >>> cubes = [1, 8, 27, 65, 125] # something's wrong here + >>> 4 ** 3 # the cube of 4 is 64, not 65! + 64 + >>> cubes[3] = 64 # replace the wrong value + >>> cubes + [1, 8, 27, 64, 125] -Unlike strings, which are *immutable*, it is possible to change individual -elements of a list:: +You can also add new items at the end of the list, by using +the :meth:`~list.append` *method* (we will see more about methods later):: - >>> a - ['spam', 'eggs', 100, 1234] - >>> a[2] = a[2] + 23 - >>> a - ['spam', 'eggs', 123, 1234] + >>> cubes.append(216) # add the cube of 6 + >>> cubes.append(7 ** 3) # and the cube of 7 + >>> cubes + [1, 8, 27, 64, 125, 216, 343] Assignment to slices is also possible, and this can even change the size of the list or clear it entirely:: - >>> # Replace some items: - ... a[0:2] = [1, 12] - >>> a - [1, 12, 123, 1234] - >>> # Remove some: - ... a[0:2] = [] - >>> a - [123, 1234] - >>> # Insert some: - ... a[1:1] = ['bletch', 'xyzzy'] - >>> a - [123, 'bletch', 'xyzzy', 1234] - >>> # Insert (a copy of) itself at the beginning - >>> a[:0] = a - >>> a - [123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234] - >>> # Clear the list: replace all items with an empty list - >>> a[:] = [] - >>> a + >>> letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g'] + >>> letters + ['a', 'b', 'c', 'd', 'e', 'f', 'g'] + >>> # replace some values + >>> letters[2:5] = ['C', 'D', 'E'] + >>> letters + ['a', 'b', 'C', 'D', 'E', 'f', 'g'] + >>> # now remove them + >>> letters[2:5] = [] + >>> letters + ['a', 'b', 'f', 'g'] + >>> # clear the list by replacing all the elements with an empty list + >>> letters[:] = [] + >>> letters [] The built-in function :func:`len` also applies to lists:: - >>> a = ['a', 'b', 'c', 'd'] - >>> len(a) + >>> letters = ['a', 'b', 'c', 'd'] + >>> len(letters) 4 It is possible to nest lists (create lists containing other lists), for example:: - >>> q = [2, 3] - >>> p = [1, q, 4] - >>> len(p) - 3 - >>> p[1] - [2, 3] - >>> p[1][0] - 2 - -You can add something to the end of the list:: - - >>> p[1].append('xtra') - >>> p - [1, [2, 3, 'xtra'], 4] - >>> q - [2, 3, 'xtra'] - -Note that in the last example, ``p[1]`` and ``q`` really refer to the same -object! We'll come back to *object semantics* later. - + >>> a = ['a', 'b', 'c'] + >>> n = [1, 2, 3] + >>> x = [a, n] + >>> x + [['a', 'b', 'c'], [1, 2, 3]] + >>> x[0] + ['a', 'b', 'c'] + >>> x[0][1] + 'b' .. _tut-firststeps: @@ -600,19 +501,19 @@ This example introduces several new features. guess when you have typed the last line). Note that each line within a basic block must be indented by the same amount. -* The :func:`print` function writes the value of the expression(s) it is - given. It differs from just writing the expression you want to write (as we did - earlier in the calculator examples) in the way it handles multiple - expressions, floating point quantities, - and strings. Strings are printed without quotes, and a space is inserted - between items, so you can format things nicely, like this:: +* The :func:`print` function writes the value of the argument(s) it is given. + It differs from just writing the expression you want to write (as we did + earlier in the calculator examples) in the way it handles multiple arguments, + floating point quantities, and strings. Strings are printed without quotes, + and a space is inserted between items, so you can format things nicely, like + this:: >>> i = 256*256 >>> print('The value of i is', i) The value of i is 65536 - The keyword *end* can be used to avoid the newline after the output, or end - the output with a different string:: + The keyword argument *end* can be used to avoid the newline after the output, + or end the output with a different string:: >>> a, b = 0, 1 >>> while b < 1000: @@ -620,3 +521,15 @@ This example introduces several new features. ... a, b = b, a+b ... 1,1,2,3,5,8,13,21,34,55,89,144,233,377,610,987, + + +.. rubric:: Footnotes + +.. [#] Since ``**`` has higher precedence than ``-``, ``-3**2`` will be + interpreted as ``-(3**2)`` and thus result in ``-9``. To avoid this + and get ``9``, you can use ``(-3)**2``. + +.. [#] Unlike other languages, special characters such as ``\n`` have the + same meaning with both single (``'...'``) and double (``"..."``) quotes. + The only difference between the two is that within single quotes you don't + need to escape ``"`` (but you have to escape ``\'``) and vice versa. diff --git a/Doc/tutorial/modules.rst b/Doc/tutorial/modules.rst index 86c1a0d..1902964 100644 --- a/Doc/tutorial/modules.rst +++ b/Doc/tutorial/modules.rst @@ -72,7 +72,8 @@ More on Modules A module can contain executable statements as well as function definitions. These statements are intended to initialize the module. They are executed only -the *first* time the module is imported somewhere. [#]_ +the *first* time the module name is encountered in an import statement. [#]_ +(They are also run if the file is executed as a script.) Each module has its own private symbol table, which is used as the global symbol table by all functions defined in the module. Thus, the author of a module can @@ -182,57 +183,45 @@ directory. This is an error unless the replacement is intended. See section "Compiled" Python files ----------------------- -As an important speed-up of the start-up time for short programs that use a lot -of standard modules, if a file called :file:`spam.pyc` exists in the directory -where :file:`spam.py` is found, this is assumed to contain an -already-"byte-compiled" version of the module :mod:`spam`. The modification time -of the version of :file:`spam.py` used to create :file:`spam.pyc` is recorded in -:file:`spam.pyc`, and the :file:`.pyc` file is ignored if these don't match. - -Normally, you don't need to do anything to create the :file:`spam.pyc` file. -Whenever :file:`spam.py` is successfully compiled, an attempt is made to write -the compiled version to :file:`spam.pyc`. It is not an error if this attempt -fails; if for any reason the file is not written completely, the resulting -:file:`spam.pyc` file will be recognized as invalid and thus ignored later. The -contents of the :file:`spam.pyc` file are platform independent, so a Python -module directory can be shared by machines of different architectures. +To speed up loading modules, Python caches the compiled version of each module +in the ``__pycache__`` directory under the name :file:`module.{version}.pyc`, +where the version encodes the format of the compiled file; it generally contains +the Python version number. For example, in CPython release 3.3 the compiled +version of spam.py would be cached as ``__pycache__/spam.cpython-33.pyc``. This +naming convention allows compiled modules from different releases and different +versions of Python to coexist. + +Python checks the modification date of the source against the compiled version +to see if it's out of date and needs to be recompiled. This is a completely +automatic process. Also, the compiled modules are platform-independent, so the +same library can be shared among systems with different architectures. + +Python does not check the cache in two circumstances. First, it always +recompiles and does not store the result for the module that's loaded directly +from the command line. Second, it does not check the cache if there is no +source module. To support a non-source (compiled only) distribution, the +compiled module must be in the source directory, and there must not be a source +module. Some tips for experts: -* When the Python interpreter is invoked with the :option:`-O` flag, optimized - code is generated and stored in :file:`.pyo` files. The optimizer currently - doesn't help much; it only removes :keyword:`assert` statements. When - :option:`-O` is used, *all* :term:`bytecode` is optimized; ``.pyc`` files are - ignored and ``.py`` files are compiled to optimized bytecode. +* You can use the :option:`-O` or :option:`-OO` switches on the Python command + to reduce the size of a compiled module. The ``-O`` switch removes assert + statements, the ``-OO`` switch removes both assert statements and __doc__ + strings. Since some programs may rely on having these available, you should + only use this option if you know what you're doing. "Optimized" modules have + a .pyo rather than a .pyc suffix and are usually smaller. Future releases may + change the effects of optimization. -* Passing two :option:`-O` flags to the Python interpreter (:option:`-OO`) will - cause the bytecode compiler to perform optimizations that could in some rare - cases result in malfunctioning programs. Currently only ``__doc__`` strings are - removed from the bytecode, resulting in more compact :file:`.pyo` files. Since - some programs may rely on having these available, you should only use this - option if you know what you're doing. +* A program doesn't run any faster when it is read from a ``.pyc`` or ``.pyo`` + file than when it is read from a ``.py`` file; the only thing that's faster + about ``.pyc`` or ``.pyo`` files is the speed with which they are loaded. -* A program doesn't run any faster when it is read from a :file:`.pyc` or - :file:`.pyo` file than when it is read from a :file:`.py` file; the only thing - that's faster about :file:`.pyc` or :file:`.pyo` files is the speed with which - they are loaded. +* The module :mod:`compileall` can create .pyc files (or .pyo files when + :option:`-O` is used) for all modules in a directory. -* When a script is run by giving its name on the command line, the bytecode for - the script is never written to a :file:`.pyc` or :file:`.pyo` file. Thus, the - startup time of a script may be reduced by moving most of its code to a module - and having a small bootstrap script that imports that module. It is also - possible to name a :file:`.pyc` or :file:`.pyo` file directly on the command - line. - -* It is possible to have a file called :file:`spam.pyc` (or :file:`spam.pyo` - when :option:`-O` is used) without a file :file:`spam.py` for the same module. - This can be used to distribute a library of Python code in a form that is - moderately hard to reverse engineer. - - .. index:: module: compileall - -* The module :mod:`compileall` can create :file:`.pyc` files (or :file:`.pyo` - files when :option:`-O` is used) for all modules in a directory. +* There is more detail on this process, including a flow chart of the + decisions, in PEP 3147. .. _tut-standardmodules: @@ -289,21 +278,24 @@ defines. It returns a sorted list of strings:: >>> dir(fibo) ['__name__', 'fib', 'fib2'] >>> dir(sys) # doctest: +NORMALIZE_WHITESPACE - ['__displayhook__', '__doc__', '__excepthook__', '__name__', '__package__', - '__stderr__', '__stdin__', '__stdout__', '_clear_type_cache', - '_current_frames', '_getframe', '_mercurial', '_xoptions', 'abiflags', - 'api_version', 'argv', 'builtin_module_names', 'byteorder', 'call_tracing', - 'callstats', 'copyright', 'displayhook', 'dont_write_bytecode', 'exc_info', + ['__displayhook__', '__doc__', '__egginsert', '__excepthook__', + '__loader__', '__name__', '__package__', '__plen', '__stderr__', + '__stdin__', '__stdout__', '_clear_type_cache', '_current_frames', + '_debugmallocstats', '_getframe', '_home', '_mercurial', '_xoptions', + 'abiflags', 'api_version', 'argv', 'base_exec_prefix', 'base_prefix', + 'builtin_module_names', 'byteorder', 'call_tracing', 'callstats', + 'copyright', 'displayhook', 'dont_write_bytecode', 'exc_info', 'excepthook', 'exec_prefix', 'executable', 'exit', 'flags', 'float_info', 'float_repr_style', 'getcheckinterval', 'getdefaultencoding', 'getdlopenflags', 'getfilesystemencoding', 'getobjects', 'getprofile', 'getrecursionlimit', 'getrefcount', 'getsizeof', 'getswitchinterval', - 'gettotalrefcount', 'gettrace', 'hash_info', 'hexversion', 'int_info', - 'intern', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path', - 'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1', - 'setcheckinterval', 'setdlopenflags', 'setprofile', 'setrecursionlimit', - 'setswitchinterval', 'settrace', 'stderr', 'stdin', 'stdout', 'subversion', - 'version', 'version_info', 'warnoptions'] + 'gettotalrefcount', 'gettrace', 'hash_info', 'hexversion', + 'implementation', 'int_info', 'intern', 'maxsize', 'maxunicode', + 'meta_path', 'modules', 'path', 'path_hooks', 'path_importer_cache', + 'platform', 'prefix', 'ps1', 'setcheckinterval', 'setdlopenflags', + 'setprofile', 'setrecursionlimit', 'setswitchinterval', 'settrace', + 'stderr', 'stdin', 'stdout', 'thread_info', 'version', 'version_info', + 'warnoptions'] Without arguments, :func:`dir` lists the names you have defined currently:: @@ -324,29 +316,34 @@ want a list of those, they are defined in the standard module >>> import builtins >>> dir(builtins) # doctest: +NORMALIZE_WHITESPACE ['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException', - 'BufferError', 'BytesWarning', 'DeprecationWarning', 'EOFError', - 'Ellipsis', 'EnvironmentError', 'Exception', 'False', 'FloatingPointError', + 'BlockingIOError', 'BrokenPipeError', 'BufferError', 'BytesWarning', + 'ChildProcessError', 'ConnectionAbortedError', 'ConnectionError', + 'ConnectionRefusedError', 'ConnectionResetError', 'DeprecationWarning', + 'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False', + 'FileExistsError', 'FileNotFoundError', 'FloatingPointError', 'FutureWarning', 'GeneratorExit', 'IOError', 'ImportError', - 'ImportWarning', 'IndentationError', 'IndexError', 'KeyError', - 'KeyboardInterrupt', 'LookupError', 'MemoryError', 'NameError', 'None', - 'NotImplemented', 'NotImplementedError', 'OSError', 'OverflowError', - 'PendingDeprecationWarning', 'ReferenceError', 'ResourceWarning', - 'RuntimeError', 'RuntimeWarning', 'StopIteration', 'SyntaxError', - 'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'True', - 'TypeError', 'UnboundLocalError', 'UnicodeDecodeError', - 'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError', - 'UnicodeWarning', 'UserWarning', 'ValueError', 'Warning', - 'ZeroDivisionError', '_', '__build_class__', '__debug__', '__doc__', - '__import__', '__name__', '__package__', 'abs', 'all', 'any', 'ascii', - 'bin', 'bool', 'bytearray', 'bytes', 'callable', 'chr', 'classmethod', - 'compile', 'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', - 'divmod', 'enumerate', 'eval', 'exec', 'exit', 'filter', 'float', 'format', - 'frozenset', 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex', 'id', - 'input', 'int', 'isinstance', 'issubclass', 'iter', 'len', 'license', - 'list', 'locals', 'map', 'max', 'memoryview', 'min', 'next', 'object', - 'oct', 'open', 'ord', 'pow', 'print', 'property', 'quit', 'range', 'repr', - 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 'staticmethod', - 'str', 'sum', 'super', 'tuple', 'type', 'vars', 'zip'] + 'ImportWarning', 'IndentationError', 'IndexError', 'InterruptedError', + 'IsADirectoryError', 'KeyError', 'KeyboardInterrupt', 'LookupError', + 'MemoryError', 'NameError', 'None', 'NotADirectoryError', 'NotImplemented', + 'NotImplementedError', 'OSError', 'OverflowError', + 'PendingDeprecationWarning', 'PermissionError', 'ProcessLookupError', + 'ReferenceError', 'ResourceWarning', 'RuntimeError', 'RuntimeWarning', + 'StopIteration', 'SyntaxError', 'SyntaxWarning', 'SystemError', + 'SystemExit', 'TabError', 'TimeoutError', 'True', 'TypeError', + 'UnboundLocalError', 'UnicodeDecodeError', 'UnicodeEncodeError', + 'UnicodeError', 'UnicodeTranslateError', 'UnicodeWarning', 'UserWarning', + 'ValueError', 'Warning', 'ZeroDivisionError', '_', '__build_class__', + '__debug__', '__doc__', '__import__', '__name__', '__package__', 'abs', + 'all', 'any', 'ascii', 'bin', 'bool', 'bytearray', 'bytes', 'callable', + 'chr', 'classmethod', 'compile', 'complex', 'copyright', 'credits', + 'delattr', 'dict', 'dir', 'divmod', 'enumerate', 'eval', 'exec', 'exit', + 'filter', 'float', 'format', 'frozenset', 'getattr', 'globals', 'hasattr', + 'hash', 'help', 'hex', 'id', 'input', 'int', 'isinstance', 'issubclass', + 'iter', 'len', 'license', 'list', 'locals', 'map', 'max', 'memoryview', + 'min', 'next', 'object', 'oct', 'open', 'ord', 'pow', 'print', 'property', + 'quit', 'range', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', + 'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'vars', + 'zip'] .. _tut-packages: @@ -370,7 +367,9 @@ There are also many different operations you might want to perform on sound data (such as mixing, adding echo, applying an equalizer function, creating an artificial stereo effect), so in addition you will be writing a never-ending stream of modules to perform these operations. Here's a possible structure for -your package (expressed in terms of a hierarchical filesystem):: +your package (expressed in terms of a hierarchical filesystem): + +.. code-block:: text sound/ Top-level package __init__.py Initialize the sound package @@ -467,7 +466,7 @@ list of module names that should be imported when ``from package import *`` is encountered. It is up to the package author to keep this list up-to-date when a new version of the package is released. Package authors may also decide not to support it, if they don't see a use for importing \* from their package. For -example, the file :file:`sounds/effects/__init__.py` could contain the following +example, the file :file:`sound/effects/__init__.py` could contain the following code:: __all__ = ["echo", "surround", "reverse"] @@ -542,6 +541,6 @@ modules found in a package. .. rubric:: Footnotes .. [#] In fact function definitions are also 'statements' that are 'executed'; the - execution of a module-level function enters the function name in the module's - global symbol table. + execution of a module-level function definition enters the function name in + the module's global symbol table. diff --git a/Doc/tutorial/stdlib.rst b/Doc/tutorial/stdlib.rst index 128e6a6..7e7a154 100644 --- a/Doc/tutorial/stdlib.rst +++ b/Doc/tutorial/stdlib.rst @@ -15,7 +15,7 @@ operating system:: >>> import os >>> os.getcwd() # Return the current working directory - 'C:\\Python31' + 'C:\\Python33' >>> os.chdir('/server/accesslogs') # Change current working directory >>> os.system('mkdir today') # Run the command mkdir in the system shell 0 @@ -203,7 +203,7 @@ Data Compression ================ Common data archiving and compression formats are directly supported by modules -including: :mod:`zlib`, :mod:`gzip`, :mod:`bz2`, :mod:`zipfile` and +including: :mod:`zlib`, :mod:`gzip`, :mod:`bz2`, :mod:`lzma`, :mod:`zipfile` and :mod:`tarfile`. :: >>> import zlib @@ -281,8 +281,10 @@ file:: def test_average(self): self.assertEqual(average([20, 30, 70]), 40.0) self.assertEqual(round(average([1, 5, 7]), 1), 4.3) - self.assertRaises(ZeroDivisionError, average, []) - self.assertRaises(TypeError, average, 20, 30, 70) + with self.assertRaises(ZeroDivisionError): + average([]) + with self.assertRaises(TypeError): + average(20, 30, 70) unittest.main() # Calling from the command line invokes all tests diff --git a/Doc/tutorial/stdlib2.rst b/Doc/tutorial/stdlib2.rst index 2265cd0..c1dd69a 100644 --- a/Doc/tutorial/stdlib2.rst +++ b/Doc/tutorial/stdlib2.rst @@ -71,9 +71,9 @@ formatting numbers with group separators:: Templating ========== -The :mod:`string` module includes a versatile :class:`Template` class with a -simplified syntax suitable for editing by end-users. This allows users to -customize their applications without having to alter the application. +The :mod:`string` module includes a versatile :class:`~string.Template` class +with a simplified syntax suitable for editing by end-users. This allows users +to customize their applications without having to alter the application. The format uses placeholder names formed by ``$`` with valid Python identifiers (alphanumeric characters and underscores). Surrounding the placeholder with @@ -85,11 +85,11 @@ spaces. Writing ``$$`` creates a single escaped ``$``:: >>> t.substitute(village='Nottingham', cause='the ditch fund') 'Nottinghamfolk send $10 to the ditch fund.' -The :meth:`substitute` method raises a :exc:`KeyError` when a placeholder is not -supplied in a dictionary or a keyword argument. For mail-merge style -applications, user supplied data may be incomplete and the -:meth:`safe_substitute` method may be more appropriate --- it will leave -placeholders unchanged if data is missing:: +The :meth:`~string.Template.substitute` method raises a :exc:`KeyError` when a +placeholder is not supplied in a dictionary or a keyword argument. For +mail-merge style applications, user supplied data may be incomplete and the +:meth:`~string.Template.safe_substitute` method may be more appropriate --- +it will leave placeholders unchanged if data is missing:: >>> t = Template('Return the $item to $owner.') >>> d = dict(item='unladen swallow') @@ -132,8 +132,9 @@ templates for XML files, plain text reports, and HTML web reports. Working with Binary Data Record Layouts ======================================= -The :mod:`struct` module provides :func:`pack` and :func:`unpack` functions for -working with variable length binary record formats. The following example shows +The :mod:`struct` module provides :func:`~struct.pack` and +:func:`~struct.unpack` functions for working with variable length binary +record formats. The following example shows how to loop through header information in a ZIP file without using the :mod:`zipfile` module. Pack codes ``"H"`` and ``"I"`` represent two and four byte unsigned numbers respectively. The ``"<"`` indicates that they are @@ -141,7 +142,9 @@ standard size and in little-endian byte order:: import struct - data = open('myfile.zip', 'rb').read() + with open('myfile.zip', 'rb') as f: + data = f.read() + start = 0 for i in range(3): # show the first 3 file headers start += 14 @@ -199,7 +202,7 @@ While those tools are powerful, minor design errors can result in problems that are difficult to reproduce. So, the preferred approach to task coordination is to concentrate all access to a resource in a single thread and then use the :mod:`queue` module to feed that thread with requests from other threads. -Applications using :class:`Queue` objects for inter-thread communication and +Applications using :class:`~queue.Queue` objects for inter-thread communication and coordination are easier to design, more readable, and more reliable. @@ -229,8 +232,9 @@ This produces the following output: By default, informational and debugging messages are suppressed and the output is sent to standard error. Other output options include routing messages through email, datagrams, sockets, or to an HTTP Server. New filters can select -different routing based on message priority: :const:`DEBUG`, :const:`INFO`, -:const:`WARNING`, :const:`ERROR`, and :const:`CRITICAL`. +different routing based on message priority: :const:`~logging.DEBUG`, +:const:`~logging.INFO`, :const:`~logging.WARNING`, :const:`~logging.ERROR`, +and :const:`~logging.CRITICAL`. The logging system can be configured directly from Python or can be loaded from a user editable configuration file for customized logging without altering the @@ -273,7 +277,7 @@ applications include caching objects that are expensive to create:: Traceback (most recent call last): File "<stdin>", line 1, in <module> d['primary'] # entry was automatically removed - File "C:/python31/lib/weakref.py", line 46, in __getitem__ + File "C:/python33/lib/weakref.py", line 46, in __getitem__ o = self.data[key]() KeyError: 'primary' @@ -287,11 +291,11 @@ Many data structure needs can be met with the built-in list type. However, sometimes there is a need for alternative implementations with different performance trade-offs. -The :mod:`array` module provides an :class:`array()` object that is like a list -that stores only homogeneous data and stores it more compactly. The following -example shows an array of numbers stored as two byte unsigned binary numbers -(typecode ``"H"``) rather than the usual 16 bytes per entry for regular lists of -Python int objects:: +The :mod:`array` module provides an :class:`~array.array()` object that is like +a list that stores only homogeneous data and stores it more compactly. The +following example shows an array of numbers stored as two byte unsigned binary +numbers (typecode ``"H"``) rather than the usual 16 bytes per entry for regular +lists of Python int objects:: >>> from array import array >>> a = array('H', [4000, 10, 700, 22222]) @@ -300,10 +304,10 @@ Python int objects:: >>> a[1:3] array('H', [10, 700]) -The :mod:`collections` module provides a :class:`deque()` object that is like a -list with faster appends and pops from the left side but slower lookups in the -middle. These objects are well suited for implementing queues and breadth first -tree searches:: +The :mod:`collections` module provides a :class:`~collections.deque()` object +that is like a list with faster appends and pops from the left side but slower +lookups in the middle. These objects are well suited for implementing queues +and breadth first tree searches:: >>> from collections import deque >>> d = deque(["task1", "task2", "task3"]) @@ -349,8 +353,8 @@ not want to run a full list sort:: Decimal Floating Point Arithmetic ================================= -The :mod:`decimal` module offers a :class:`Decimal` datatype for decimal -floating point arithmetic. Compared to the built-in :class:`float` +The :mod:`decimal` module offers a :class:`~decimal.Decimal` datatype for +decimal floating point arithmetic. Compared to the built-in :class:`float` implementation of binary floating point, the class is especially helpful for * financial applications and other uses which require exact decimal @@ -371,13 +375,15 @@ becomes significant if the results are rounded to the nearest cent:: >>> round(.70 * 1.05, 2) 0.73 -The :class:`Decimal` result keeps a trailing zero, automatically inferring four -place significance from multiplicands with two place significance. Decimal -reproduces mathematics as done by hand and avoids issues that can arise when -binary floating point cannot exactly represent decimal quantities. +The :class:`~decimal.Decimal` result keeps a trailing zero, automatically +inferring four place significance from multiplicands with two place +significance. Decimal reproduces mathematics as done by hand and avoids +issues that can arise when binary floating point cannot exactly represent +decimal quantities. -Exact representation enables the :class:`Decimal` class to perform modulo -calculations and equality tests that are unsuitable for binary floating point:: +Exact representation enables the :class:`~decimal.Decimal` class to perform +modulo calculations and equality tests that are unsuitable for binary floating +point:: >>> Decimal('1.00') % Decimal('.10') Decimal('0.00') diff --git a/Doc/using/cmdline.rst b/Doc/using/cmdline.rst index cd4d02f..4e7168f 100644 --- a/Doc/using/cmdline.rst +++ b/Doc/using/cmdline.rst @@ -24,7 +24,7 @@ Command line When invoking Python, you may specify any of these options:: - python [-bBdEhiORqsSuvVWx?] [-c command | -m module-name | script | - ] [args] + python [-bBdEhiOqsSuvVWx?] [-c command | -m module-name | script | - ] [args] The most common use case is, of course, a simple invocation of a script:: @@ -229,23 +229,22 @@ Miscellaneous options .. cmdoption:: -R - Turn on hash randomization, so that the :meth:`__hash__` values of str, bytes - and datetime objects are "salted" with an unpredictable random value. - Although they remain constant within an individual Python process, they are - not predictable between repeated invocations of Python. + Kept for compatibility. On Python 3.3 and greater, hash randomization is + turned on by default. - This is intended to provide protection against a denial-of-service caused by - carefully-chosen inputs that exploit the worst case performance of a dict - construction, O(n^2) complexity. See - http://www.ocert.org/advisories/ocert-2011-003.html for details. + On previous versions of Python, this option turns on hash randomization, + so that the :meth:`__hash__` values of str, bytes and datetime + are "salted" with an unpredictable random value. Although they remain + constant within an individual Python process, they are not predictable + between repeated invocations of Python. - Changing hash values affects the order in which keys are retrieved from a - dict. Although Python has never made guarantees about this ordering (and it - typically varies between 32-bit and 64-bit builds), enough real-world code - implicitly relies on this non-guaranteed behavior that the randomization is - disabled by default. + Hash randomization is intended to provide protection against a + denial-of-service caused by carefully-chosen inputs that exploit the worst + case performance of a dict construction, O(n^2) complexity. See + http://www.ocert.org/advisories/ocert-2011-003.html for details. - See also :envvar:`PYTHONHASHSEED`. + :envvar:`PYTHONHASHSEED` allows you to set a fixed value for the hash + seed secret. .. versionadded:: 3.2.3 @@ -263,13 +262,15 @@ Miscellaneous options .. cmdoption:: -S Disable the import of the module :mod:`site` and the site-dependent - manipulations of :data:`sys.path` that it entails. + manipulations of :data:`sys.path` that it entails. Also disable these + manipulations if :mod:`site` is explicitly imported later (call + :func:`site.main` if you want them to be triggered). .. cmdoption:: -u - Force the binary layer of the stdin, stdout and stderr streams (which is - available as their ``buffer`` attribute) to be unbuffered. The text I/O + Force the binary layer of the stdout and stderr streams (which is + available as their ``buffer`` attribute) to be unbuffered. The text I/O layer will still be line-buffered if writing to the console, or block-buffered if redirected to a non-interactive file. @@ -357,12 +358,16 @@ Miscellaneous options .. cmdoption:: -X Reserved for various implementation-specific options. CPython currently - defines none of them, but allows to pass arbitrary values and retrieve + defines just one, you can use ``-X faulthandler`` to enable + :mod:`faulthandler`. It also allows to pass arbitrary values and retrieve them through the :data:`sys._xoptions` dictionary. .. versionchanged:: 3.2 It is now allowed to pass :option:`-X` with CPython. + .. versionadded:: 3.3 + The ``-X faulthandler`` option. + Options you shouldn't use ~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -475,7 +480,7 @@ conflict. .. envvar:: PYTHONCASEOK If this is set, Python ignores case in :keyword:`import` statements. This - only works on Windows, OS X, and OS/2. + only works on Windows and OS X. .. envvar:: PYTHONDONTWRITEBYTECODE @@ -487,9 +492,8 @@ conflict. .. envvar:: PYTHONHASHSEED - If this variable is set to ``random``, the effect is the same as specifying - the :option:`-R` option: a random value is used to seed the hashes of str, - bytes and datetime objects. + If this variable is not set or set to ``random``, a random value is used + to seed the hashes of str, bytes and datetime objects. If :envvar:`PYTHONHASHSEED` is set to an integer value, it is used as a fixed seed for generating the hash() of the types covered by the hash @@ -500,8 +504,7 @@ conflict. values. The integer must be a decimal number in the range [0,4294967295]. Specifying - the value 0 will lead to the same hash values as when hash randomization is - disabled. + the value 0 will disable hash randomization. .. versionadded:: 3.2.3 @@ -531,8 +534,8 @@ conflict. Defines the :data:`user base directory <site.USER_BASE>`, which is used to compute the path of the :data:`user site-packages directory <site.USER_SITE>` - and :ref:`Distutils installation paths <inst-alt-install-user>` for ``python - setup.py install --user``. + and :ref:`Distutils installation paths <inst-alt-install-user>` for + ``python setup.py install --user``. .. seealso:: @@ -551,6 +554,16 @@ conflict. separated string, it is equivalent to specifying :option:`-W` multiple times. +.. envvar:: PYTHONFAULTHANDLER + + If this environment variable is set, :func:`faulthandler.enable` is called + at startup: install a handler for :const:`SIGSEGV`, :const:`SIGFPE`, + :const:`SIGABRT`, :const:`SIGBUS` and :const:`SIGILL` signals to dump the + Python traceback. This is equivalent to :option:`-X` ``faulthandler`` + option. + + .. versionadded:: 3.3 + Debug-mode variables ~~~~~~~~~~~~~~~~~~~~ diff --git a/Doc/using/index.rst b/Doc/using/index.rst index 1201153..502afa9 100644 --- a/Doc/using/index.rst +++ b/Doc/using/index.rst @@ -17,4 +17,4 @@ interpreter and things that make working with Python easier. unix.rst windows.rst mac.rst - + scripts.rst diff --git a/Doc/using/mac.rst b/Doc/using/mac.rst index 075a359..3e1b74d 100644 --- a/Doc/using/mac.rst +++ b/Doc/using/mac.rst @@ -17,14 +17,15 @@ the IDE and the Package Manager that are worth pointing out. Getting and Installing MacPython ================================ -Mac OS X 10.5 comes with Python 2.5.1 pre-installed by Apple. If you wish, you -are invited to install the most recent version of Python from the Python website -(http://www.python.org). A current "universal binary" build of Python, which -runs natively on the Mac's new Intel and legacy PPC CPU's, is available there. +Mac OS X 10.8 comes with Python 2.7 pre-installed by Apple. If you wish, you +are invited to install the most recent version of Python 3 from the Python +website (http://www.python.org). A current "universal binary" build of Python, +which runs natively on the Mac's new Intel and legacy PPC CPU's, is available +there. What you get after installing is a number of things: -* A :file:`MacPython 2.5` folder in your :file:`Applications` folder. In here +* A :file:`MacPython 3.3` folder in your :file:`Applications` folder. In here you find IDLE, the development environment that is a standard part of official Python distributions; PythonLauncher, which handles double-clicking Python scripts from the Finder; and the "Build Applet" tool, which allows you to @@ -92,7 +93,7 @@ aware of: programs that talk to the Aqua window manager (in other words, anything that has a GUI) need to be run in a special way. Use :program:`pythonw` instead of :program:`python` to start such scripts. -With Python 2.5, you can use either :program:`python` or :program:`pythonw`. +With Python 3.3, you can use either :program:`python` or :program:`pythonw`. Configuration @@ -125,13 +126,11 @@ Installing Additional Python Packages There are several methods to install additional Python packages: -* http://pythonmac.org/packages/ contains selected compiled packages for Python - 2.5, 2.4, and 2.3. - * Packages can be installed via the standard Python distutils mode (``python setup.py install``). -* Many packages can also be installed via the :program:`setuptools` extension. +* Many packages can also be installed via the :program:`setuptools` extension + or :program:`pip` wrapper, see http://www.pip-installer.org/. GUI Programming on the Mac @@ -159,7 +158,7 @@ http://www.riverbankcomputing.co.uk/software/pyqt/intro. Distributing Python Applications on the Mac =========================================== -The "Build Applet" tool that is placed in the MacPython 2.5 folder is fine for +The "Build Applet" tool that is placed in the MacPython 3.3 folder is fine for packaging small Python scripts on your own machine to run as a standard Mac application. This tool, however, is not robust enough to distribute Python applications to other users. diff --git a/Doc/using/scripts.rst b/Doc/using/scripts.rst new file mode 100644 index 0000000..2c87416 --- /dev/null +++ b/Doc/using/scripts.rst @@ -0,0 +1,12 @@ +.. _tools-and-scripts: + +Additional Tools and Scripts +============================ + +.. _scripts-pyvenv: + +pyvenv - Creating virtual environments +-------------------------------------- + +.. include:: venv-create.inc + diff --git a/Doc/using/unix.rst b/Doc/using/unix.rst index 773387b..40635c6 100644 --- a/Doc/using/unix.rst +++ b/Doc/using/unix.rst @@ -28,7 +28,7 @@ following links: http://www.debian.org/doc/manuals/maint-guide/first.en.html for Debian users - http://linuxmafia.com/pub/linux/suse-linux-internals/chapter35.html + http://en.opensuse.org/Portal:Packaging for OpenSuse users http://docs.fedoraproject.org/en-US/Fedora_Draft_Documentation/0.1/html/RPM_Guide/ch-creating-rpms.html for Fedora users diff --git a/Doc/using/venv-create.inc b/Doc/using/venv-create.inc new file mode 100644 index 0000000..5fdbc9b --- /dev/null +++ b/Doc/using/venv-create.inc @@ -0,0 +1,85 @@ +Creation of :ref:`virtual environments <venv-def>` is done by executing the +``pyvenv`` script:: + + pyvenv /path/to/new/virtual/environment + +Running this command creates the target directory (creating any parent +directories that don't exist already) and places a ``pyvenv.cfg`` file in it +with a ``home`` key pointing to the Python installation the command was run +from. It also creates a ``bin`` (or ``Scripts`` on Windows) subdirectory +containing a copy of the ``python`` binary (or binaries, in the case of +Windows). It also creates an (initially empty) ``lib/pythonX.Y/site-packages`` +subdirectory (on Windows, this is ``Lib\site-packages``). + +.. highlight:: none + +On Windows, you may have to invoke the ``pyvenv`` script as follows, if you +don't have the relevant PATH and PATHEXT settings:: + + c:\Temp>c:\Python33\python c:\Python33\Tools\Scripts\pyvenv.py myenv + +or equivalently:: + + c:\Temp>c:\Python33\python -m venv myenv + +The command, if run with ``-h``, will show the available options:: + + usage: pyvenv [-h] [--system-site-packages] [--symlinks] [--clear] + [--upgrade] ENV_DIR [ENV_DIR ...] + + Creates virtual Python environments in one or more target directories. + + positional arguments: + ENV_DIR A directory to create the environment in. + + optional arguments: + -h, --help show this help message and exit + --system-site-packages Give access to the global site-packages dir to the + virtual environment. + --symlinks Try to use symlinks rather than copies, when symlinks + are not the default for the platform. + --clear Delete the environment directory if it already exists. + If not specified and the directory exists, an error is + raised. + --upgrade Upgrade the environment directory to use this version + of Python, assuming Python has been upgraded in-place. + +If the target directory already exists an error will be raised, unless +the ``--clear`` or ``--upgrade`` option was provided. + +The created ``pyvenv.cfg`` file also includes the +``include-system-site-packages`` key, set to ``true`` if ``venv`` is +run with the ``--system-site-packages`` option, ``false`` otherwise. + +Multiple paths can be given to ``pyvenv``, in which case an identical +virtualenv will be created, according to the given options, at each +provided path. + +Once a venv has been created, it can be "activated" using a script in the +venv's binary directory. The invocation of the script is platform-specific: on +a Posix platform, you would typically do:: + + $ source <venv>/bin/activate + +whereas on Windows, you might do:: + + C:\> <venv>/Scripts/activate + +if you are using the ``cmd.exe`` shell, or perhaps:: + + PS C:\> <venv>/Scripts/Activate.ps1 + +if you use PowerShell. + +You don't specifically *need* to activate an environment; activation just +prepends the venv's binary directory to your path, so that "python" invokes the +venv's Python interpreter and you can run installed scripts without having to +use their full path. However, all scripts installed in a venv should be +runnable without activating it, and run with the venv's Python automatically. + +You can deactivate a venv by typing "deactivate" in your shell. The exact +mechanism is platform-specific: for example, the Bash activation script defines +a "deactivate" function, whereas on Windows there are separate scripts called +``deactivate.bat`` and ``Deactivate.ps1`` which are installed when the venv is +created. + diff --git a/Doc/using/windows.rst b/Doc/using/windows.rst index c85afd2..6c4d16e 100644 --- a/Doc/using/windows.rst +++ b/Doc/using/windows.rst @@ -24,10 +24,6 @@ With ongoing development of Python, some platforms that used to be supported earlier are no longer supported (due to the lack of users or developers). Check :pep:`11` for details on all unsupported platforms. -* Up to 2.5, Python was still compatible with Windows 95, 98 and ME (but already - raised a deprecation warning on installation). For Python 2.6 (and all - following releases), this support was dropped and new releases are just - expected to work on the Windows NT family. * `Windows CE <http://pythonce.sourceforge.net/>`_ is still supported. * The `Cygwin <http://cygwin.com/>`_ installer offers to install the `Python interpreter <http://cygwin.com/packages/python>`_ as well; it is located under @@ -36,8 +32,8 @@ Check :pep:`11` for details on all unsupported platforms. release/python>`_, `Maintainer releases <http://www.tishler.net/jason/software/python/>`_) -See `Python for Windows (and DOS) <http://www.python.org/download/windows/>`_ -for detailed information about platforms with precompiled installers. +See `Python for Windows <http://www.python.org/download/windows/>`_ +for detailed information about platforms with pre-compiled installers. .. seealso:: @@ -64,7 +60,7 @@ Besides the standard CPython distribution, there are modified packages including additional functionality. The following is a list of popular versions and their key features: -`ActivePython <http://www.activestate.com/Products/activepython/>`_ +`ActivePython <http://www.activestate.com/activepython/>`_ Installer with multi-platform compatibility, documentation, PyWin32 `Enthought Python Distribution <http://www.enthought.com/products/epd.php>`_ @@ -132,21 +128,33 @@ Consult :command:`set /?` for details on this behaviour. Setting Environment variables, Louis J. Farrugia +.. _windows-path-mod: + Finding the Python executable ----------------------------- +.. versionchanged:: 3.3 + Besides using the automatically created start menu entry for the Python -interpreter, you might want to start Python in the DOS prompt. To make this -work, you need to set your :envvar:`%PATH%` environment variable to include the -directory of your Python distribution, delimited by a semicolon from other -entries. An example variable could look like this (assuming the first two -entries are Windows' default):: +interpreter, you might want to start Python in the command prompt. As of +Python 3.3, the installer has an option to set that up for you. + +At the "Customize Python 3.3" screen, an option called +"Add python.exe to search path" can be enabled to have the installer place +your installation into the :envvar:`%PATH%`. This allows you to type +:command:`python` to run the interpreter. Thus, you can also execute your +scripts with command line options, see :ref:`using-on-cmdline` documentation. - C:\WINDOWS\system32;C:\WINDOWS;C:\Python25 +If you don't enable this option at install time, you can always re-run the +installer to choose it. -Typing :command:`python` on your command prompt will now fire up the Python -interpreter. Thus, you can also execute your scripts with command line options, -see :ref:`using-on-cmdline` documentation. +The alternative is manually modifying the :envvar:`%PATH%` using the +directions in :ref:`setting-envvars`. You need to set your :envvar:`%PATH%` +environment variable to include the directory of your Python distribution, +delimited by a semicolon from other entries. An example variable could look +like this (assuming the first two entries are Windows' default):: + + C:\WINDOWS\system32;C:\WINDOWS;C:\Python33 Finding modules @@ -205,13 +213,19 @@ The end result of all this is: Executing scripts ----------------- -Python scripts (files with the extension ``.py``) will be executed by -:program:`python.exe` by default. This executable opens a terminal, which stays -open even if the program uses a GUI. If you do not want this to happen, use the -extension ``.pyw`` which will cause the script to be executed by -:program:`pythonw.exe` by default (both executables are located in the top-level -of your Python installation directory). This suppresses the terminal window on -startup. +As of Python 3.3, Python includes a launcher which facilitates running Python +scripts. See :ref:`launcher` for more information. + +Executing scripts without the Python launcher +--------------------------------------------- + +Without the Python launcher installed, Python scripts (files with the extension +``.py``) will be executed by :program:`python.exe` by default. This executable +opens a terminal, which stays open even if the program uses a GUI. If you do +not want this to happen, use the extension ``.pyw`` which will cause the script +to be executed by :program:`pythonw.exe` by default (both executables are +located in the top-level of your Python installation directory). This +suppresses the terminal window on startup. You can also make all ``.py`` scripts execute with :program:`pythonw.exe`, setting this through the usual facilities, for example (might require @@ -227,6 +241,250 @@ administrative rights): ftype Python.File=C:\Path\to\pythonw.exe "%1" %* +.. _launcher: + +Python Launcher for Windows +=========================== + +.. versionadded:: 3.3 + +The Python launcher for Windows is a utility which aids in the location and +execution of different Python versions. It allows scripts (or the +command-line) to indicate a preference for a specific Python version, and +will locate and execute that version. + +Getting started +--------------- + +From the command-line +^^^^^^^^^^^^^^^^^^^^^ + +You should ensure the launcher is on your PATH - depending on how it was +installed it may already be there, but check just in case it is not. + +From a command-prompt, execute the following command: + +:: + + py + +You should find that the latest version of Python 2.x you have installed is +started - it can be exited as normal, and any additional command-line +arguments specified will be sent directly to Python. + +If you have multiple versions of Python 2.x installed (e.g., 2.6 and 2.7) you +will have noticed that Python 2.7 was started - to launch Python 2.6, try the +command: + +:: + + py -2.6 + +If you have a Python 3.x installed, try the command: + +:: + + py -3 + +You should find the latest version of Python 3.x starts. + +From a script +^^^^^^^^^^^^^ + +Let's create a test Python script - create a file called ``hello.py`` with the +following contents + +:: + + #! python + import sys + sys.stdout.write("hello from Python %s\n" % (sys.version,)) + +From the directory in which hello.py lives, execute the command: + +:: + + py hello.py + +You should notice the version number of your latest Python 2.x installation +is printed. Now try changing the first line to be: + +:: + + #! python3 + +Re-executing the command should now print the latest Python 3.x information. +As with the above command-line examples, you can specify a more explicit +version qualifier. Assuming you have Python 2.6 installed, try changing the +first line to ``#! python2.6`` and you should find the 2.6 version +information printed. + +From file associations +^^^^^^^^^^^^^^^^^^^^^^ + +The launcher should have been associated with Python files (i.e. ``.py``, +``.pyw``, ``.pyc``, ``.pyo`` files) when it was installed. This means that +when you double-click on one of these files from Windows explorer the launcher +will be used, and therefore you can use the same facilities described above to +have the script specify the version which should be used. + +The key benefit of this is that a single launcher can support multiple Python +versions at the same time depending on the contents of the first line. + +Shebang Lines +------------- + +If the first line of a script file starts with ``#!``, it is known as a +"shebang" line. Linux and other Unix like operating systems have native +support for such lines and are commonly used on such systems to indicate how +a script should be executed. This launcher allows the same facilities to be +using with Python scripts on Windows and the examples above demonstrate their +use. + +To allow shebang lines in Python scripts to be portable between Unix and +Windows, this launcher supports a number of 'virtual' commands to specify +which interpreter to use. The supported virtual commands are: + +* ``/usr/bin/env python`` +* ``/usr/bin/python`` +* ``/usr/local/bin/python`` +* ``python`` + +For example, if the first line of your script starts with + +:: + + #! /usr/bin/python + +The default Python will be located and used. As many Python scripts written +to work on Unix will already have this line, you should find these scripts can +be used by the launcher without modification. If you are writing a new script +on Windows which you hope will be useful on Unix, you should use one of the +shebang lines starting with ``/usr``. + +Arguments in shebang lines +-------------------------- + +The shebang lines can also specify additional options to be passed to the +Python interpreter. For example, if you have a shebang line: + +:: + + #! /usr/bin/python -v + +Then Python will be started with the ``-v`` option + +Customization +------------- + +Customization via INI files +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + + Two .ini files will be searched by the launcher - ``py.ini`` in the + current user's "application data" directory (i.e. the directory returned + by calling the Windows function SHGetFolderPath with CSIDL_LOCAL_APPDATA) + and ``py.ini`` in the same directory as the launcher. The same .ini + files are used for both the 'console' version of the launcher (i.e. + py.exe) and for the 'windows' version (i.e. pyw.exe) + + Customization specified in the "application directory" will have + precedence over the one next to the executable, so a user, who may not + have write access to the .ini file next to the launcher, can override + commands in that global .ini file) + +Customizing default Python versions +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In some cases, a version qualifier can be included in a command to dictate +which version of Python will be used by the command. A version qualifier +starts with a major version number and can optionally be followed by a period +('.') and a minor version specifier. If the minor qualifier is specified, it +may optionally be followed by "-32" to indicate the 32-bit implementation of +that version be used. + +For example, a shebang line of ``#!python`` has no version qualifier, while +``#!python3`` has a version qualifier which specifies only a major version. + +If no version qualifiers are found in a command, the environment variable +``PY_PYTHON`` can be set to specify the default version qualifier - the default +value is "2". Note this value could specify just a major version (e.g. "2") or +a major.minor qualifier (e.g. "2.6"), or even major.minor-32. + +If no minor version qualifiers are found, the environment variable +``PY_PYTHON{major}`` (where ``{major}`` is the current major version qualifier +as determined above) can be set to specify the full version. If no such option +is found, the launcher will enumerate the installed Python versions and use +the latest minor release found for the major version, which is likely, +although not guaranteed, to be the most recently installed version in that +family. + +On 64-bit Windows with both 32-bit and 64-bit implementations of the same +(major.minor) Python version installed, the 64-bit version will always be +preferred. This will be true for both 32-bit and 64-bit implementations of the +launcher - a 32-bit launcher will prefer to execute a 64-bit Python installation +of the specified version if available. This is so the behavior of the launcher +can be predicted knowing only what versions are installed on the PC and +without regard to the order in which they were installed (i.e., without knowing +whether a 32 or 64-bit version of Python and corresponding launcher was +installed last). As noted above, an optional "-32" suffix can be used on a +version specifier to change this behaviour. + +Examples: + +* If no relevant options are set, the commands ``python`` and + ``python2`` will use the latest Python 2.x version installed and + the command ``python3`` will use the latest Python 3.x installed. + +* The commands ``python3.1`` and ``python2.7`` will not consult any + options at all as the versions are fully specified. + +* If ``PY_PYTHON=3``, the commands ``python`` and ``python3`` will both use + the latest installed Python 3 version. + +* If ``PY_PYTHON=3.1-32``, the command ``python`` will use the 32-bit + implementation of 3.1 whereas the command ``python3`` will use the latest + installed Python (PY_PYTHON was not considered at all as a major + version was specified.) + +* If ``PY_PYTHON=3`` and ``PY_PYTHON3=3.1``, the commands + ``python`` and ``python3`` will both use specifically 3.1 + +In addition to environment variables, the same settings can be configured +in the .INI file used by the launcher. The section in the INI file is +called ``[defaults]`` and the key name will be the same as the +environment variables without the leading ``PY_`` prefix (and note that +the key names in the INI file are case insensitive.) The contents of +an environment variable will override things specified in the INI file. + +For example: + +* Setting ``PY_PYTHON=3.1`` is equivalent to the INI file containing: + +:: + + [defaults] + python=3.1 + +* Setting ``PY_PYTHON=3`` and ``PY_PYTHON3=3.1`` is equivalent to the INI file + containing: + +:: + + [defaults] + python=3 + python3=3.1 + +Diagnostics +----------- + +If an environment variable ``PYLAUNCH_DEBUG`` is set (to any value), the +launcher will print diagnostic information to stderr (i.e. to the console). +While this information manages to be simultaneously verbose *and* terse, it +should allow you to see what versions of Python were located, why a +particular version was chosen and the exact command-line used to execute the +target Python. + + Additional modules ================== @@ -265,13 +523,14 @@ shipped with PyWin32. It is an embeddable IDE with a built-in debugger. by David and Paul Boddie -Py2exe ------- +cx_Freeze +--------- -`Py2exe <http://www.py2exe.org/>`_ is a :mod:`distutils` extension (see -:ref:`extending-distutils`) which wraps Python scripts into executable Windows -programs (:file:`{*}.exe` files). When you have done this, you can distribute -your application without requiring your users to install Python. +`cx_Freeze <http://cx-freeze.sourceforge.net/>`_ is a :mod:`distutils` +extension (see :ref:`extending-distutils`) which wraps Python scripts into +executable Windows programs (:file:`{*}.exe` files). When you have done this, +you can distribute your application without requiring your users to install +Python. WConio @@ -294,9 +553,9 @@ If you want to compile CPython yourself, first thing you should do is get the latest release's source or just grab a fresh `checkout <http://docs.python.org/devguide/setup#checking-out-the-code>`_. -For Microsoft Visual C++, which is the compiler with which official Python -releases are built, the source tree contains solutions/project files. View the -:file:`readme.txt` in their respective directories: +The source tree contains a build solution and project files for Microsoft +Visual C++, which is the compiler used to build the official Python releases. +View the :file:`readme.txt` in their respective directories: +--------------------+--------------+-----------------------+ | Directory | MSVC version | Visual Studio version | @@ -307,14 +566,16 @@ releases are built, the source tree contains solutions/project files. View the +--------------------+--------------+-----------------------+ | :file:`PC/VS8.0/` | 8.0 | 2005 | +--------------------+--------------+-----------------------+ -| :file:`PCbuild/` | 9.0 | 2008 | +| :file:`PC/VS9.0/` | 9.0 | 2008 | ++--------------------+--------------+-----------------------+ +| :file:`PCbuild/` | 10.0 | 2010 | +--------------------+--------------+-----------------------+ -Note that not all of these build directories are fully supported. Read the -release notes to see which compiler version the official releases for your -version are built with. +Note that any build directories within the :file:`PC` directory are not +necessarily fully supported. The :file:`PCbuild` directory contains the files +for the compiler used to build the official release. -Check :file:`PC/readme.txt` for general information on the build process. +Check :file:`PCbuild/readme.txt` for general information on the build process. For extension modules, consult :ref:`building-on-windows`. @@ -343,3 +604,7 @@ Other resources `A Python for Windows Tutorial <http://www.imladris.com/Scripts/PythonForWindows.html>`_ by Amanda Birmingham, 2004 + :pep:`397` - Python launcher for Windows + The proposal for the launcher to be included in the Python distribution. + + diff --git a/Doc/whatsnew/2.0.rst b/Doc/whatsnew/2.0.rst index 850e57d..64b908b 100644 --- a/Doc/whatsnew/2.0.rst +++ b/Doc/whatsnew/2.0.rst @@ -166,7 +166,7 @@ encoding. Encodings are named by strings, such as ``'ascii'``, ``'utf-8'``, registering new encodings that are then available throughout a Python program. If an encoding isn't specified, the default encoding is usually 7-bit ASCII, though it can be changed for your Python installation by calling the -:func:`sys.setdefaultencoding(encoding)` function in a customised version of +``sys.setdefaultencoding(encoding)`` function in a customised version of :file:`site.py`. Combining 8-bit and Unicode strings always coerces to Unicode, using the default @@ -203,7 +203,7 @@ U+0660 is an Arabic number. The :mod:`codecs` module contains functions to look up existing encodings and register new ones. Unless you want to implement a new encoding, you'll most -often use the :func:`codecs.lookup(encoding)` function, which returns a +often use the ``codecs.lookup(encoding)`` function, which returns a 4-element tuple: ``(encode_func, decode_func, stream_reader, stream_writer)``. * *encode_func* is a function that takes a Unicode string, and returns a 2-tuple @@ -600,7 +600,7 @@ Python code is found to be improperly indented. Changes to Built-in Functions ----------------------------- -A new built-in, :func:`zip(seq1, seq2, ...)`, has been added. :func:`zip` +A new built-in, ``zip(seq1, seq2, ...)``, has been added. :func:`zip` returns a list of tuples where each tuple contains the i-th element from each of the argument sequences. The difference between :func:`zip` and ``map(None, seq1, seq2)`` is that :func:`map` pads the sequences with ``None`` if the @@ -619,7 +619,7 @@ level, serial)`` For example, in a hypothetical 2.0.1beta1, ``sys.version_info`` would be ``(2, 0, 1, 'beta', 1)``. *level* is a string such as ``"alpha"``, ``"beta"``, or ``"final"`` for a final release. -Dictionaries have an odd new method, :meth:`setdefault(key, default)`, which +Dictionaries have an odd new method, ``setdefault(key, default)``, which behaves similarly to the existing :meth:`get` method. However, if the key is missing, :meth:`setdefault` both returns the value of *default* as :meth:`get` would do, and also inserts it into the dictionary as the value for *key*. Thus, @@ -1038,7 +1038,7 @@ Brian Gallew contributed OpenSSL support for the :mod:`socket` module. OpenSSL is an implementation of the Secure Socket Layer, which encrypts the data being sent over a socket. When compiling Python, you can edit :file:`Modules/Setup` to include SSL support, which adds an additional function to the :mod:`socket` -module: :func:`socket.ssl(socket, keyfile, certfile)`, which takes a socket +module: ``socket.ssl(socket, keyfile, certfile)``, which takes a socket object and returns an SSL socket. The :mod:`httplib` and :mod:`urllib` modules were also changed to support ``https://`` URLs, though no one has implemented FTP or SMTP over SSL. diff --git a/Doc/whatsnew/2.1.rst b/Doc/whatsnew/2.1.rst index 117af10..b1ab48e 100644 --- a/Doc/whatsnew/2.1.rst +++ b/Doc/whatsnew/2.1.rst @@ -204,7 +204,7 @@ Each of these magic methods can return anything at all: a Boolean, a matrix, a list, or any other Python object. Alternatively they can raise an exception if the comparison is impossible, inconsistent, or otherwise meaningless. -The built-in :func:`cmp(A,B)` function can use the rich comparison machinery, +The built-in ``cmp(A,B)`` function can use the rich comparison machinery, and now accepts an optional argument specifying which comparison operation to use; this is given as one of the strings ``"<"``, ``"<="``, ``">"``, ``">="``, ``"=="``, or ``"!="``. If called without the optional third argument, @@ -350,7 +350,7 @@ where this behaviour is undesirable, object caches being the most common one, and another being circular references in data structures such as trees. For example, consider a memoizing function that caches the results of another -function :func:`f(x)` by storing the function's argument and its result in a +function ``f(x)`` by storing the function's argument and its result in a dictionary:: _cache = {} @@ -656,7 +656,7 @@ New and Improved Modules use :mod:`ftplib` to retrieve files and then don't work from behind a firewall. It's deemed unlikely that this will cause problems for anyone, because Netscape defaults to passive mode and few people complain, but if passive mode is - unsuitable for your application or network setup, call :meth:`set_pasv(0)` on + unsuitable for your application or network setup, call ``set_pasv(0)`` on FTP objects to disable passive mode. * Support for raw socket access has been added to the :mod:`socket` module, @@ -666,7 +666,7 @@ New and Improved Modules for displaying timing profiles for Python programs, invoked when the module is run as a script. Contributed by Eric S. Raymond. -* A new implementation-dependent function, :func:`sys._getframe([depth])`, has +* A new implementation-dependent function, ``sys._getframe([depth])``, has been added to return a given frame object from the current call stack. :func:`sys._getframe` returns the frame at the top of the call stack; if the optional integer argument *depth* is supplied, the function returns the frame diff --git a/Doc/whatsnew/2.2.rst b/Doc/whatsnew/2.2.rst index 1db1ee7..31b6df4 100644 --- a/Doc/whatsnew/2.2.rst +++ b/Doc/whatsnew/2.2.rst @@ -173,12 +173,12 @@ attributes of their own: * :attr:`__doc__` is the attribute's docstring. -* :meth:`__get__(object)` is a method that retrieves the attribute value from +* ``__get__(object)`` is a method that retrieves the attribute value from *object*. -* :meth:`__set__(object, value)` sets the attribute on *object* to *value*. +* ``__set__(object, value)`` sets the attribute on *object* to *value*. -* :meth:`__delete__(object, value)` deletes the *value* attribute of *object*. +* ``__delete__(object, value)`` deletes the *value* attribute of *object*. For example, when you write ``obj.x``, the steps that Python actually performs are:: @@ -288,7 +288,7 @@ Following this rule, referring to :meth:`D.save` will return :meth:`C.save`, which is the behaviour we're after. This lookup rule is the same as the one followed by Common Lisp. A new built-in function, :func:`super`, provides a way to get at a class's superclasses without having to reimplement Python's -algorithm. The most commonly used form will be :func:`super(class, obj)`, which +algorithm. The most commonly used form will be ``super(class, obj)``, which returns a bound superclass object (not the actual class object). This form will be used in methods to call a method in the superclass; for example, :class:`D`'s :meth:`save` method would look like this:: @@ -301,7 +301,7 @@ will be used in methods to call a method in the superclass; for example, ... :func:`super` can also return unbound superclass objects when called as -:func:`super(class)` or :func:`super(class1, class2)`, but this probably won't +``super(class)`` or ``super(class1, class2)``, but this probably won't often be useful. @@ -314,13 +314,13 @@ code more readable by automatically mapping an attribute access such as ``obj.parent`` into a method call such as ``obj.get_parent``. Python 2.2 adds some new ways of controlling attribute access. -First, :meth:`__getattr__(attr_name)` is still supported by new-style classes, +First, ``__getattr__(attr_name)`` is still supported by new-style classes, and nothing about it has changed. As before, it will be called when an attempt is made to access ``obj.foo`` and no attribute named ``foo`` is found in the instance's dictionary. New-style classes also support a new method, -:meth:`__getattribute__(attr_name)`. The difference between the two methods is +``__getattribute__(attr_name)``. The difference between the two methods is that :meth:`__getattribute__` is *always* called whenever any attribute is accessed, while the old :meth:`__getattr__` is only called if ``foo`` isn't found in the instance's dictionary. @@ -441,8 +441,8 @@ work, though it really should. In Python 2.2, iteration can be implemented separately, and :meth:`__getitem__` methods can be limited to classes that really do support random access. The -basic idea of iterators is simple. A new built-in function, :func:`iter(obj)` -or ``iter(C, sentinel)``, is used to get an iterator. :func:`iter(obj)` returns +basic idea of iterators is simple. A new built-in function, ``iter(obj)`` +or ``iter(C, sentinel)``, is used to get an iterator. ``iter(obj)`` returns an iterator for the object *obj*, while ``iter(C, sentinel)`` returns an iterator that will invoke the callable object *C* until it returns *sentinel* to signal that the iterator is done. @@ -450,9 +450,9 @@ signal that the iterator is done. Python classes can define an :meth:`__iter__` method, which should create and return a new iterator for the object; if the object is its own iterator, this method can just return ``self``. In particular, iterators will usually be their -own iterators. Extension types implemented in C can implement a :attr:`tp_iter` +own iterators. Extension types implemented in C can implement a :c:member:`~PyTypeObject.tp_iter` function in order to return an iterator, and extension types that want to behave -as iterators can define a :attr:`tp_iternext` function. +as iterators can define a :c:member:`~PyTypeObject.tp_iternext` function. So, after all this, what do iterators actually do? They have one required method, :meth:`next`, which takes no arguments and returns the next value. When @@ -478,7 +478,7 @@ there are no more values to be returned, calling :meth:`next` should raise the In 2.2, Python's :keyword:`for` statement no longer expects a sequence; it expects something for which :func:`iter` will return an iterator. For backward compatibility and convenience, an iterator is automatically constructed for -sequences that don't implement :meth:`__iter__` or a :attr:`tp_iter` slot, so +sequences that don't implement :meth:`__iter__` or a :c:member:`~PyTypeObject.tp_iter` slot, so ``for i in [1,2,3]`` will still work. Wherever the Python interpreter loops over a sequence, it's been changed to use the iterator protocol. This means you can do things like this:: @@ -793,7 +793,7 @@ further details. Another change is simpler to explain. Since their introduction, Unicode strings have supported an :meth:`encode` method to convert the string to a selected -encoding such as UTF-8 or Latin-1. A symmetric :meth:`decode([*encoding*])` +encoding such as UTF-8 or Latin-1. A symmetric ``decode([*encoding*])`` method has been added to 8-bit strings (though not to Unicode strings) in 2.2. :meth:`decode` assumes that the string is in the specified encoding and decodes it, returning whatever is returned by the codec. @@ -1203,7 +1203,7 @@ Some of the more notable changes are: to an MBCS encoded string, as used by the Microsoft file APIs. As MBCS is explicitly used by the file APIs, Python's choice of ASCII as the default encoding turns out to be an annoyance. On Unix, the locale's character set is - used if :func:`locale.nl_langinfo(CODESET)` is available. (Windows support was + used if ``locale.nl_langinfo(CODESET)`` is available. (Windows support was contributed by Mark Hammond with assistance from Marc-André Lemburg. Unix support was added by Martin von Löwis.) diff --git a/Doc/whatsnew/2.3.rst b/Doc/whatsnew/2.3.rst index f4c79e4..f0e48d9 100644 --- a/Doc/whatsnew/2.3.rst +++ b/Doc/whatsnew/2.3.rst @@ -504,8 +504,8 @@ This produces the following output:: ZeroDivisionError: integer division or modulo by zero Slightly more advanced programs will use a logger other than the root logger. -The :func:`getLogger(name)` function is used to get a particular log, creating -it if it doesn't exist yet. :func:`getLogger(None)` returns the root logger. :: +The ``getLogger(name)`` function is used to get a particular log, creating +it if it doesn't exist yet. ``getLogger(None)`` returns the root logger. :: log = logging.getLogger('server') ... @@ -724,10 +724,10 @@ module: objects to it. Additional built-in and frozen modules can be imported by an object added to this list. -Importer objects must have a single method, :meth:`find_module(fullname, -path=None)`. *fullname* will be a module or package name, e.g. ``string`` or +Importer objects must have a single method, ``find_module(fullname, +path=None)``. *fullname* will be a module or package name, e.g. ``string`` or ``distutils.core``. :meth:`find_module` must return a loader object that has a -single method, :meth:`load_module(fullname)`, that creates and returns the +single method, ``load_module(fullname)``, that creates and returns the corresponding module object. Pseudo-code for Python's new import logic, therefore, looks something like this @@ -935,7 +935,7 @@ Or use slice objects directly in subscripts:: [0, 2, 4] To simplify implementing sequences that support extended slicing, slice objects -now have a method :meth:`indices(length)` which, given the length of a sequence, +now have a method ``indices(length)`` which, given the length of a sequence, returns a ``(start, stop, step)`` tuple that can be passed directly to :func:`range`. :meth:`indices` handles omitted and out-of-bounds indices in a manner consistent with regular slices (and this innocuous phrase hides a welter @@ -984,7 +984,7 @@ Here are all of the changes that Python 2.3 makes to the core Python language. * Built-in types now support the extended slicing syntax, as described in section :ref:`section-slices` of this document. -* A new built-in function, :func:`sum(iterable, start=0)`, adds up the numeric +* A new built-in function, ``sum(iterable, start=0)``, adds up the numeric items in the iterable object and returns their sum. :func:`sum` only accepts numbers, meaning that you can't use it to concatenate a bunch of strings. (Contributed by Alex Martelli.) @@ -998,7 +998,7 @@ Here are all of the changes that Python 2.3 makes to the core Python language. its index, now takes optional *start* and *stop* arguments to limit the search to only part of the list. -* Dictionaries have a new method, :meth:`pop(key[, *default*])`, that returns +* Dictionaries have a new method, ``pop(key[, *default*])``, that returns the value corresponding to *key* and removes that key/value pair from the dictionary. If the requested key isn't present in the dictionary, *default* is returned if it's specified and :exc:`KeyError` raised if it isn't. :: @@ -1020,7 +1020,7 @@ Here are all of the changes that Python 2.3 makes to the core Python language. {} >>> - There's also a new class method, :meth:`dict.fromkeys(iterable, value)`, that + There's also a new class method, ``dict.fromkeys(iterable, value)``, that creates a dictionary with keys taken from the supplied iterator *iterable* and all values set to *value*, defaulting to ``None``. @@ -1093,7 +1093,7 @@ Here are all of the changes that Python 2.3 makes to the core Python language. 100 bytecodes, speeding up single-threaded applications by reducing the switching overhead. Some multithreaded applications may suffer slower response time, but that's easily fixed by setting the limit back to a lower number using - :func:`sys.setcheckinterval(N)`. The limit can be retrieved with the new + ``sys.setcheckinterval(N)``. The limit can be retrieved with the new :func:`sys.getcheckinterval` function. * One minor but far-reaching change is that the names of extension types defined @@ -1272,10 +1272,10 @@ complete list of changes, or look through the CVS logs for all the details. * Previously the :mod:`doctest` module would only search the docstrings of public methods and functions for test cases, but it now also examines private - ones as well. The :func:`DocTestSuite(` function creates a + ones as well. The :func:`DocTestSuite` function creates a :class:`unittest.TestSuite` object from a set of :mod:`doctest` tests. -* The new :func:`gc.get_referents(object)` function returns a list of all the +* The new ``gc.get_referents(object)`` function returns a list of all the objects referenced by *object*. * The :mod:`getopt` module gained a new function, :func:`gnu_getopt`, that @@ -1347,8 +1347,8 @@ complete list of changes, or look through the CVS logs for all the details. documentation for details. (Contributed by Raymond Hettinger.) -* Two new functions in the :mod:`math` module, :func:`degrees(rads)` and - :func:`radians(degs)`, convert between radians and degrees. Other functions in +* Two new functions in the :mod:`math` module, ``degrees(rads)`` and + ``radians(degs)``, convert between radians and degrees. Other functions in the :mod:`math` module such as :func:`math.sin` and :func:`math.cos` have always required input values measured in radians. Also, an optional *base* argument was added to :func:`math.log` to make it easier to compute logarithms for bases @@ -1405,7 +1405,7 @@ complete list of changes, or look through the CVS logs for all the details. and therefore faster performance. Setting the parser object's :attr:`buffer_text` attribute to :const:`True` will enable buffering. -* The :func:`sample(population, k)` function was added to the :mod:`random` +* The ``sample(population, k)`` function was added to the :mod:`random` module. *population* is a sequence or :class:`xrange` object containing the elements of a population, and :func:`sample` chooses *k* elements from the population without replacing chosen elements. *k* can be any value up to @@ -1451,7 +1451,7 @@ complete list of changes, or look through the CVS logs for all the details. encryption is not believed to be secure. If you need encryption, use one of the several AES Python modules that are available separately. -* The :mod:`shutil` module gained a :func:`move(src, dest)` function that +* The :mod:`shutil` module gained a ``move(src, dest)`` function that recursively moves a file or directory to a new location. * Support for more advanced POSIX signal handling was added to the :mod:`signal` @@ -1459,7 +1459,7 @@ complete list of changes, or look through the CVS logs for all the details. platforms. * The :mod:`socket` module now supports timeouts. You can call the - :meth:`settimeout(t)` method on a socket object to set a timeout of *t* seconds. + ``settimeout(t)`` method on a socket object to set a timeout of *t* seconds. Subsequent socket operations that take longer than *t* seconds to complete will abort and raise a :exc:`socket.timeout` exception. @@ -1480,9 +1480,9 @@ complete list of changes, or look through the CVS logs for all the details. :program:`tar`\ -format archive files. (Contributed by Lars Gustäbel.) * The new :mod:`textwrap` module contains functions for wrapping strings - containing paragraphs of text. The :func:`wrap(text, width)` function takes a + containing paragraphs of text. The ``wrap(text, width)`` function takes a string and returns a list containing the text split into lines of no more than - the chosen width. The :func:`fill(text, width)` function returns a single + the chosen width. The ``fill(text, width)`` function returns a single string, reformatted to fit into lines no longer than the chosen width. (As you can guess, :func:`fill` is built on top of :func:`wrap`. For example:: @@ -1903,7 +1903,7 @@ Changes to Python's build process and to the C API include: short int`, ``I`` for :c:type:`unsigned int`, and ``K`` for :c:type:`unsigned long long`. -* A new function, :c:func:`PyObject_DelItemString(mapping, char \*key)` was added +* A new function, ``PyObject_DelItemString(mapping, char *key)`` was added as shorthand for ``PyObject_DelItem(mapping, PyString_New(key))``. * File objects now manage their internal string buffer differently, increasing diff --git a/Doc/whatsnew/2.4.rst b/Doc/whatsnew/2.4.rst index 9d339a5..5973f3b 100644 --- a/Doc/whatsnew/2.4.rst +++ b/Doc/whatsnew/2.4.rst @@ -37,7 +37,7 @@ PEP 218: Built-In Set Objects Python 2.3 introduced the :mod:`sets` module. C implementations of set data types have now been added to the Python core as two new built-in types, -:func:`set(iterable)` and :func:`frozenset(iterable)`. They provide high speed +``set(iterable)`` and ``frozenset(iterable)``. They provide high speed operations for membership testing, for eliminating duplicates from sequences, and for mathematical operations like unions, intersections, differences, and symmetric differences. :: @@ -346,7 +346,7 @@ returned. PEP 322: Reverse Iteration ========================== -A new built-in function, :func:`reversed(seq)`, takes a sequence and returns an +A new built-in function, ``reversed(seq)``, takes a sequence and returns an iterator that loops over the elements of the sequence in reverse order. :: >>> for i in reversed(xrange(1,4)): @@ -384,7 +384,7 @@ PEP 324: New subprocess Module The standard library provides a number of ways to execute a subprocess, offering different features and different levels of complexity. -:func:`os.system(command)` is easy to use, but slow (it runs a shell process +``os.system(command)`` is easy to use, but slow (it runs a shell process which executes the command) and dangerous (you have to be careful about escaping the shell's metacharacters). The :mod:`popen2` module offers classes that can capture standard output and standard error from the subprocess, but the naming @@ -431,8 +431,8 @@ The constructor has a number of handy options: Once you've created the :class:`Popen` instance, you can call its :meth:`wait` method to pause until the subprocess has exited, :meth:`poll` to check if it's -exited without pausing, or :meth:`communicate(data)` to send the string *data* -to the subprocess's standard input. :meth:`communicate(data)` then reads any +exited without pausing, or ``communicate(data)`` to send the string *data* +to the subprocess's standard input. ``communicate(data)`` then reads any data that the subprocess has sent to its standard output or standard error, returning a tuple ``(stdout_data, stderr_data)``. @@ -749,10 +749,10 @@ numbers in the current locale. The solution described in the PEP is to add three new functions to the Python API that perform ASCII-only conversions, ignoring the locale setting: -* :c:func:`PyOS_ascii_strtod(str, ptr)` and :c:func:`PyOS_ascii_atof(str, ptr)` +* ``PyOS_ascii_strtod(str, ptr)`` and ``PyOS_ascii_atof(str, ptr)`` both convert a string to a C :c:type:`double`. -* :c:func:`PyOS_ascii_formatd(buffer, buf_len, format, d)` converts a +* ``PyOS_ascii_formatd(buffer, buf_len, format, d)`` converts a :c:type:`double` to an ASCII string. The code for these functions came from the GLib library @@ -778,7 +778,7 @@ Here are all of the changes that Python 2.4 makes to the core Python language. * Decorators for functions and methods were added (:pep:`318`). * Built-in :func:`set` and :func:`frozenset` types were added (:pep:`218`). - Other new built-ins include the :func:`reversed(seq)` function (:pep:`322`). + Other new built-ins include the ``reversed(seq)`` function (:pep:`322`). * Generator expressions were added (:pep:`289`). @@ -846,7 +846,7 @@ Here are all of the changes that Python 2.4 makes to the core Python language. ['A', 'b', 'c', 'D'] Finally, the *reverse* parameter takes a Boolean value. If the value is true, - the list will be sorted into reverse order. Instead of ``L.sort() ; + the list will be sorted into reverse order. Instead of ``L.sort(); L.reverse()``, you can now write ``L.sort(reverse=True)``. The results of sorting are now guaranteed to be stable. This means that two @@ -857,7 +857,7 @@ Here are all of the changes that Python 2.4 makes to the core Python language. (All changes to :meth:`sort` contributed by Raymond Hettinger.) -* There is a new built-in function :func:`sorted(iterable)` that works like the +* There is a new built-in function ``sorted(iterable)`` that works like the in-place :meth:`list.sort` method but can be used in expressions. The differences are: @@ -898,8 +898,8 @@ Here are all of the changes that Python 2.4 makes to the core Python language. For example, you can now run the Python profiler with ``python -m profile``. (Contributed by Nick Coghlan.) -* The :func:`eval(expr, globals, locals)` and :func:`execfile(filename, globals, - locals)` functions and the ``exec`` statement now accept any mapping type +* The ``eval(expr, globals, locals)`` and ``execfile(filename, globals, + locals)`` functions and the ``exec`` statement now accept any mapping type for the *locals* parameter. Previously this had to be a regular Python dictionary. (Contributed by Raymond Hettinger.) @@ -1090,7 +1090,7 @@ complete list of changes, or look through the CVS logs for all the details. Yves Dionne) and new :meth:`deleteacl` and :meth:`myrights` methods (contributed by Arnaud Mazin). -* The :mod:`itertools` module gained a :func:`groupby(iterable[, *func*])` +* The :mod:`itertools` module gained a ``groupby(iterable[, *func*])`` function. *iterable* is something that can be iterated over to return a stream of elements, and the optional *func* parameter is a function that takes an element and returns a key value; if omitted, the key is simply the element @@ -1139,7 +1139,7 @@ complete list of changes, or look through the CVS logs for all the details. (Contributed by Hye-Shik Chang.) -* :mod:`itertools` also gained a function named :func:`tee(iterator, N)` that +* :mod:`itertools` also gained a function named ``tee(iterator, N)`` that returns *N* independent iterators that replicate *iterator*. If *N* is omitted, the default is 2. :: @@ -1177,7 +1177,7 @@ complete list of changes, or look through the CVS logs for all the details. level=0, # Log all messages format='%(levelname):%(process):%(thread):%(message)') - Other additions to the :mod:`logging` package include a :meth:`log(level, msg)` + Other additions to the :mod:`logging` package include a ``log(level, msg)`` convenience method, as well as a :class:`TimedRotatingFileHandler` class that rotates its log files at a timed interval. The module already had :class:`RotatingFileHandler`, which rotated logs once the file exceeded a @@ -1196,7 +1196,7 @@ complete list of changes, or look through the CVS logs for all the details. group or for a range of groups. (Contributed by Jürgen A. Erhard.) * Two new functions were added to the :mod:`operator` module, - :func:`attrgetter(attr)` and :func:`itemgetter(index)`. Both functions return + ``attrgetter(attr)`` and ``itemgetter(index)``. Both functions return callables that take a single argument and return the corresponding attribute or item; these callables make excellent data extractors when used with :func:`map` or :func:`sorted`. For example:: @@ -1223,14 +1223,14 @@ complete list of changes, or look through the CVS logs for all the details. replacement for :func:`rfc822.formatdate`. You may want to write new e-mail processing code with this in mind. (Change implemented by Anthony Baxter.) -* A new :func:`urandom(n)` function was added to the :mod:`os` module, returning +* A new ``urandom(n)`` function was added to the :mod:`os` module, returning a string containing *n* bytes of random data. This function provides access to platform-specific sources of randomness such as :file:`/dev/urandom` on Linux or the Windows CryptoAPI. (Contributed by Trevor Perrin.) -* Another new function: :func:`os.path.lexists(path)` returns true if the file +* Another new function: ``os.path.lexists(path)`` returns true if the file specified by *path* exists, whether or not it's a symbolic link. This differs - from the existing :func:`os.path.exists(path)` function, which returns false if + from the existing ``os.path.exists(path)`` function, which returns false if *path* is a symlink that points to a destination that doesn't exist. (Contributed by Beni Cherniavsky.) @@ -1243,7 +1243,7 @@ complete list of changes, or look through the CVS logs for all the details. * The :mod:`profile` module can now profile C extension functions. (Contributed by Nick Bastin.) -* The :mod:`random` module has a new method called :meth:`getrandbits(N)` that +* The :mod:`random` module has a new method called ``getrandbits(N)`` that returns a long integer *N* bits in length. The existing :meth:`randrange` method now uses :meth:`getrandbits` where appropriate, making generation of arbitrarily large random numbers more efficient. (Contributed by Raymond @@ -1272,7 +1272,7 @@ complete list of changes, or look through the CVS logs for all the details. this, but 2.4 will raise a :exc:`RuntimeError` exception. * Two new functions were added to the :mod:`socket` module. :func:`socketpair` - returns a pair of connected sockets and :func:`getservbyport(port)` looks up the + returns a pair of connected sockets and ``getservbyport(port)`` looks up the service name for a given port number. (Contributed by Dave Cole and Barry Warsaw.) @@ -1454,11 +1454,11 @@ Some of the changes to Python's build process and to the C API are: * Another new macro, :c:macro:`Py_CLEAR(obj)`, decreases the reference count of *obj* and sets *obj* to the null pointer. (Contributed by Jim Fulton.) -* A new function, :c:func:`PyTuple_Pack(N, obj1, obj2, ..., objN)`, constructs +* A new function, ``PyTuple_Pack(N, obj1, obj2, ..., objN)``, constructs tuples from a variable length argument list of Python objects. (Contributed by Raymond Hettinger.) -* A new function, :c:func:`PyDict_Contains(d, k)`, implements fast dictionary +* A new function, ``PyDict_Contains(d, k)``, implements fast dictionary lookups without masking exceptions raised during the look-up process. (Contributed by Raymond Hettinger.) diff --git a/Doc/whatsnew/2.5.rst b/Doc/whatsnew/2.5.rst index e059cd5..683630a 100644 --- a/Doc/whatsnew/2.5.rst +++ b/Doc/whatsnew/2.5.rst @@ -171,7 +171,7 @@ method, where the first argument has been provided. :: popup_menu.append( ("Open", open_func, 1) ) Another function in the :mod:`functools` module is the -:func:`update_wrapper(wrapper, wrapped)` function that helps you write well- +``update_wrapper(wrapper, wrapped)`` function that helps you write well- behaved decorators. :func:`update_wrapper` copies the name, module, and docstring attribute to a wrapper function so that tracebacks inside the wrapped function are easier to understand. For example, you might write:: @@ -286,7 +286,7 @@ Python's standard :mod:`string` module? There's no clean way to ignore :mod:`pkg.string` and look for the standard module; generally you had to look at the contents of ``sys.modules``, which is slightly unclean. Holger Krekel's :mod:`py.std` package provides a tidier way to perform imports from the standard -library, ``import py ; py.std.string.join()``, but that package isn't available +library, ``import py; py.std.string.join()``, but that package isn't available on all Python installations. Reading code which relies on relative imports is also less clear, because a @@ -454,7 +454,7 @@ expression on the right-hand side of an assignment. This means you can write ``val = yield i`` but have to use parentheses when there's an operation, as in ``val = (yield i) + 12``.) -Values are sent into a generator by calling its :meth:`send(value)` method. The +Values are sent into a generator by calling its ``send(value)`` method. The generator's code is then resumed and the :keyword:`yield` expression returns the specified *value*. If the regular :meth:`next` method is called, the :keyword:`yield` returns :const:`None`. @@ -496,7 +496,7 @@ function. In addition to :meth:`send`, there are two other new methods on generators: -* :meth:`throw(type, value=None, traceback=None)` is used to raise an exception +* ``throw(type, value=None, traceback=None)`` is used to raise an exception inside the generator; the exception is raised by the :keyword:`yield` expression where the generator's execution is paused. @@ -660,7 +660,7 @@ A high-level explanation of the context management protocol is: * The code in *BLOCK* is executed. -* If *BLOCK* raises an exception, the :meth:`__exit__(type, value, traceback)` +* If *BLOCK* raises an exception, the ``__exit__(type, value, traceback)`` is called with the exception details, the same values returned by :func:`sys.exc_info`. The method's return value controls whether the exception is re-raised: any false value re-raises the exception, and ``True`` will result @@ -773,7 +773,7 @@ decorator as:: with db_transaction(db) as cursor: ... -The :mod:`contextlib` module also has a :func:`nested(mgr1, mgr2, ...)` function +The :mod:`contextlib` module also has a ``nested(mgr1, mgr2, ...)`` function that combines a number of context managers so you don't need to write nested ':keyword:`with`' statements. In this example, the single ':keyword:`with`' statement both starts a database transaction and acquires a thread lock:: @@ -782,7 +782,7 @@ statement both starts a database transaction and acquires a thread lock:: with nested (db_transaction(db), lock) as (cursor, locked): ... -Finally, the :func:`closing(object)` function returns *object* so that it can be +Finally, the ``closing(object)`` function returns *object* so that it can be bound to a variable, and calls ``object.close`` at the end of the block. :: import urllib, sys @@ -955,7 +955,7 @@ interpreter will check that the type returned is correct, and raises a A corresponding :attr:`nb_index` slot was added to the C-level :c:type:`PyNumberMethods` structure to let C extensions implement this protocol. -:c:func:`PyNumber_Index(obj)` can be used in extension code to call the +``PyNumber_Index(obj)`` can be used in extension code to call the :meth:`__index__` function and retrieve its result. @@ -976,7 +976,7 @@ Here are all of the changes that Python 2.5 makes to the core Python language. * The :class:`dict` type has a new hook for letting subclasses provide a default value when a key isn't contained in the dictionary. When a key isn't found, the - dictionary's :meth:`__missing__(key)` method will be called. This hook is used + dictionary's ``__missing__(key)`` method will be called. This hook is used to implement the new :class:`defaultdict` class in the :mod:`collections` module. The following example defines a dictionary that returns zero for any missing key:: @@ -989,16 +989,16 @@ Here are all of the changes that Python 2.5 makes to the core Python language. print d[1], d[2] # Prints 1, 2 print d[3], d[4] # Prints 0, 0 -* Both 8-bit and Unicode strings have new :meth:`partition(sep)` and - :meth:`rpartition(sep)` methods that simplify a common use case. +* Both 8-bit and Unicode strings have new ``partition(sep)`` and + ``rpartition(sep)`` methods that simplify a common use case. - The :meth:`find(S)` method is often used to get an index which is then used to + The ``find(S)`` method is often used to get an index which is then used to slice the string and obtain the pieces that are before and after the separator. - :meth:`partition(sep)` condenses this pattern into a single method call that + ``partition(sep)`` condenses this pattern into a single method call that returns a 3-tuple containing the substring before the separator, the separator itself, and the substring after the separator. If the separator isn't found, the first element of the tuple is the entire string and the other two elements - are empty. :meth:`rpartition(sep)` also returns a 3-tuple but starts searching + are empty. ``rpartition(sep)`` also returns a 3-tuple but starts searching from the end of the string; the ``r`` stands for 'reverse'. Some examples:: @@ -1157,7 +1157,7 @@ marked in the following list. .. Patch 1313939, 1359618 -* The :func:`long(str, base)` function is now faster on long digit strings +* The ``long(str, base)`` function is now faster on long digit strings because fewer intermediate results are calculated. The peak is for strings of around 800--1000 digits where the function is 6 times faster. (Contributed by Alan McIntyre and committed at the NeedForSpeed sprint.) @@ -1268,7 +1268,7 @@ complete list of changes, or look through the SVN logs for all the details. (Contributed by Guido van Rossum.) * The :class:`deque` double-ended queue type supplied by the :mod:`collections` - module now has a :meth:`remove(value)` method that removes the first occurrence + module now has a ``remove(value)`` method that removes the first occurrence of *value* in the queue, raising :exc:`ValueError` if the value isn't found. (Contributed by Raymond Hettinger.) @@ -1291,7 +1291,7 @@ complete list of changes, or look through the SVN logs for all the details. * The :mod:`csv` module, which parses files in comma-separated value format, received several enhancements and a number of bugfixes. You can now set the maximum size in bytes of a field by calling the - :meth:`csv.field_size_limit(new_limit)` function; omitting the *new_limit* + ``csv.field_size_limit(new_limit)`` function; omitting the *new_limit* argument will return the currently-set limit. The :class:`reader` class now has a :attr:`line_num` attribute that counts the number of physical lines read from the source; records can span multiple physical lines, so :attr:`line_num` is not @@ -1308,7 +1308,7 @@ complete list of changes, or look through the SVN logs for all the details. (Contributed by Skip Montanaro and Andrew McNamara.) * The :class:`datetime` class in the :mod:`datetime` module now has a - :meth:`strptime(string, format)` method for parsing date strings, contributed + ``strptime(string, format)`` method for parsing date strings, contributed by Josh Spoerri. It uses the same format characters as :func:`time.strptime` and :func:`time.strftime`:: @@ -1403,7 +1403,7 @@ complete list of changes, or look through the SVN logs for all the details. * The :mod:`mailbox` module underwent a massive rewrite to add the capability to modify mailboxes in addition to reading them. A new set of classes that include :class:`mbox`, :class:`MH`, and :class:`Maildir` are used to read mailboxes, and - have an :meth:`add(message)` method to add messages, :meth:`remove(key)` to + have an ``add(message)`` method to add messages, ``remove(key)`` to remove messages, and :meth:`lock`/:meth:`unlock` to lock/unlock the mailbox. The following example converts a maildir-format mailbox into an mbox-format one:: @@ -1458,7 +1458,7 @@ complete list of changes, or look through the SVN logs for all the details. :func:`wait4` return additional information. :func:`wait3` doesn't take a process ID as input, so it waits for any child process to exit and returns a 3-tuple of *process-id*, *exit-status*, *resource-usage* as returned from the - :func:`resource.getrusage` function. :func:`wait4(pid)` does take a process ID. + :func:`resource.getrusage` function. ``wait4(pid)`` does take a process ID. (Contributed by Chad J. Schroeder.) On FreeBSD, the :func:`os.stat` function now returns times with nanosecond @@ -1532,8 +1532,8 @@ complete list of changes, or look through the SVN logs for all the details. In Python code, netlink addresses are represented as a tuple of 2 integers, ``(pid, group_mask)``. - Two new methods on socket objects, :meth:`recv_into(buffer)` and - :meth:`recvfrom_into(buffer)`, store the received data in an object that + Two new methods on socket objects, ``recv_into(buffer)`` and + ``recvfrom_into(buffer)``, store the received data in an object that supports the buffer protocol instead of returning the data as a string. This means you can put the data directly into an array or a memory-mapped file. @@ -1557,8 +1557,8 @@ complete list of changes, or look through the SVN logs for all the details. year, number, name = s.unpack(data) You can also pack and unpack data to and from buffer objects directly using the - :meth:`pack_into(buffer, offset, v1, v2, ...)` and :meth:`unpack_from(buffer, - offset)` methods. This lets you store data directly into an array or a memory- + ``pack_into(buffer, offset, v1, v2, ...)`` and ``unpack_from(buffer, + offset)`` methods. This lets you store data directly into an array or a memory- mapped file. (:class:`Struct` objects were implemented by Bob Ippolito at the NeedForSpeed @@ -1592,7 +1592,7 @@ complete list of changes, or look through the SVN logs for all the details. .. patch 918101 * The :mod:`threading` module now lets you set the stack size used when new - threads are created. The :func:`stack_size([*size*])` function returns the + threads are created. The ``stack_size([*size*])`` function returns the currently configured stack size, and supplying the optional *size* parameter sets a new value. Not all platforms support changing the stack size, but Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.) @@ -1911,7 +1911,7 @@ differently. :: h = hashlib.new('md5') # Provide algorithm as a string Once a hash object has been created, its methods are the same as before: -:meth:`update(string)` hashes the specified string into the current digest +``update(string)`` hashes the specified string into the current digest state, :meth:`digest` and :meth:`hexdigest` return the digest value as a binary string or a string of hex digits, and :meth:`copy` returns a new hashing object with the same digest state. @@ -2168,20 +2168,20 @@ Changes to Python's build process and to the C API include: * Two new macros can be used to indicate C functions that are local to the current file so that a faster calling convention can be used. - :c:func:`Py_LOCAL(type)` declares the function as returning a value of the + ``Py_LOCAL(type)`` declares the function as returning a value of the specified *type* and uses a fast-calling qualifier. - :c:func:`Py_LOCAL_INLINE(type)` does the same thing and also requests the + ``Py_LOCAL_INLINE(type)`` does the same thing and also requests the function be inlined. If :c:func:`PY_LOCAL_AGGRESSIVE` is defined before :file:`python.h` is included, a set of more aggressive optimizations are enabled for the module; you should benchmark the results to find out if these optimizations actually make the code faster. (Contributed by Fredrik Lundh at the NeedForSpeed sprint.) -* :c:func:`PyErr_NewException(name, base, dict)` can now accept a tuple of base +* ``PyErr_NewException(name, base, dict)`` can now accept a tuple of base classes as its *base* argument. (Contributed by Georg Brandl.) * The :c:func:`PyErr_Warn` function for issuing warnings is now deprecated in - favour of :c:func:`PyErr_WarnEx(category, message, stacklevel)` which lets you + favour of ``PyErr_WarnEx(category, message, stacklevel)`` which lets you specify the number of stack frames separating this function and the caller. A *stacklevel* of 1 is the function calling :c:func:`PyErr_WarnEx`, 2 is the function above that, and so forth. (Added by Neal Norwitz.) diff --git a/Doc/whatsnew/2.6.rst b/Doc/whatsnew/2.6.rst index bdd7ff7..3ae6c77 100644 --- a/Doc/whatsnew/2.6.rst +++ b/Doc/whatsnew/2.6.rst @@ -1891,7 +1891,7 @@ changes, or look through the Subversion logs for all the details. >>> dq=deque(maxlen=3) >>> dq deque([], maxlen=3) - >>> dq.append(1) ; dq.append(2) ; dq.append(3) + >>> dq.append(1); dq.append(2); dq.append(3) >>> dq deque([1, 2, 3], maxlen=3) >>> dq.append(4) @@ -2783,12 +2783,12 @@ http://www.json.org. types. The following example encodes and decodes a dictionary:: >>> import json - >>> data = {"spam" : "foo", "parrot" : 42} + >>> data = {"spam": "foo", "parrot": 42} >>> in_json = json.dumps(data) # Encode the data >>> in_json '{"parrot": 42, "spam": "foo"}' >>> json.loads(in_json) # Decode into a Python object - {"spam" : "foo", "parrot" : 42} + {"spam": "foo", "parrot": 42} It's also possible to write your own decoders and encoders to support more types. Pretty-printing of the JSON strings is also supported. diff --git a/Doc/whatsnew/3.0.rst b/Doc/whatsnew/3.0.rst index 133ecab..71b87b8 100644 --- a/Doc/whatsnew/3.0.rst +++ b/Doc/whatsnew/3.0.rst @@ -810,7 +810,7 @@ Builtins * The :func:`round` function rounding strategy and return type have changed. Exact halfway cases are now rounded to the nearest even result instead of away from zero. (For example, ``round(2.5)`` now - returns ``2`` rather than ``3``.) :func:`round(x[, n])` now + returns ``2`` rather than ``3``.) ``round(x[, n])`` now delegates to ``x.__round__([n])`` instead of always returning a float. It generally returns an integer when called with a single argument and a value of the same type as ``x`` when called with two diff --git a/Doc/whatsnew/3.2.rst b/Doc/whatsnew/3.2.rst index 47d477f..aa69df2 100644 --- a/Doc/whatsnew/3.2.rst +++ b/Doc/whatsnew/3.2.rst @@ -50,7 +50,7 @@ This article explains the new features in Python 3.2 as compared to 3.1. It focuses on a few highlights and gives a few examples. For full details, see the -:source:`Misc/NEWS <Misc/NEWS>` file. +`Misc/NEWS <http://hg.python.org/cpython/file/3.2/Misc/NEWS>`_ file. .. seealso:: @@ -268,7 +268,7 @@ launch of four parallel threads for copying files:: e.submit(shutil.copy, 'src1.txt', 'dest1.txt') e.submit(shutil.copy, 'src2.txt', 'dest2.txt') e.submit(shutil.copy, 'src3.txt', 'dest3.txt') - e.submit(shutil.copy, 'src4.txt', 'dest4.txt') + e.submit(shutil.copy, 'src3.txt', 'dest4.txt') .. seealso:: @@ -2352,7 +2352,7 @@ A number of small performance enhancements have been added: (Contributed by Antoine Pitrou; :issue:`3001`.) * The fast-search algorithm in stringlib is now used by the :meth:`split`, - :meth:`splitlines` and :meth:`replace` methods on + :meth:`rsplit`, :meth:`splitlines` and :meth:`replace` methods on :class:`bytes`, :class:`bytearray` and :class:`str` objects. Likewise, the algorithm is also used by :meth:`rfind`, :meth:`rindex`, :meth:`rsplit` and :meth:`rpartition`. @@ -2469,14 +2469,14 @@ Code Repository In addition to the existing Subversion code repository at http://svn.python.org there is now a `Mercurial <http://mercurial.selenic.com/>`_ repository at -http://hg.python.org/\. +http://hg.python.org/\ . After the 3.2 release, there are plans to switch to Mercurial as the primary repository. This distributed version control system should make it easier for members of the community to create and share external changesets. See :pep:`385` for details. -To learn the new version control system, see the `tutorial by Joel +To learn to use the new version control system, see the `tutorial by Joel Spolsky <http://hginit.com>`_ or the `Guide to Mercurial Workflows <http://mercurial.selenic.com/guide/>`_. diff --git a/Doc/whatsnew/3.3.rst b/Doc/whatsnew/3.3.rst new file mode 100644 index 0000000..cda63e4 --- /dev/null +++ b/Doc/whatsnew/3.3.rst @@ -0,0 +1,2516 @@ +**************************** + What's New In Python 3.3 +**************************** + +.. Rules for maintenance: + + * Anyone can add text to this document. Do not spend very much time + on the wording of your changes, because your text will probably + get rewritten to some degree. + + * The maintainer will go through Misc/NEWS periodically and add + changes; it's therefore more important to add your changes to + Misc/NEWS than to this file. + + * This is not a complete list of every single change; completeness + is the purpose of Misc/NEWS. Some changes I consider too small + or esoteric to include. If such a change is added to the text, + I'll just remove it. (This is another reason you shouldn't spend + too much time on writing your addition.) + + * If you want to draw your new text to the attention of the + maintainer, add 'XXX' to the beginning of the paragraph or + section. + + * It's OK to just add a fragmentary note about a change. For + example: "XXX Describe the transmogrify() function added to the + socket module." The maintainer will research the change and + write the necessary text. + + * You can comment out your additions if you like, but it's not + necessary (especially when a final release is some months away). + + * Credit the author of a patch or bugfix. Just the name is + sufficient; the e-mail address isn't necessary. + + * It's helpful to add the bug/patch number as a comment: + + XXX Describe the transmogrify() function added to the socket + module. + (Contributed by P.Y. Developer in :issue:`12345`.) + + This saves the maintainer the effort of going through the Mercurial log + when researching a change. + +This article explains the new features in Python 3.3, compared to 3.2. +Python 3.3 was released on September 29, 2012. For full details, +see the `changelog <http://docs.python.org/3.3/whatsnew/changelog.html>`_. + +.. seealso:: + + :pep:`398` - Python 3.3 Release Schedule + + +Summary -- Release highlights +============================= + +.. This section singles out the most important changes in Python 3.3. + Brevity is key. + +New syntax features: + +* New ``yield from`` expression for :ref:`generator delegation <pep-380>`. +* The ``u'unicode'`` syntax is accepted again for :class:`str` objects. + +New library modules: + +* :mod:`faulthandler` (helps debugging low-level crashes) +* :mod:`ipaddress` (high-level objects representing IP addresses and masks) +* :mod:`lzma` (compress data using the XZ / LZMA algorithm) +* :mod:`unittest.mock` (replace parts of your system under test with mock objects) +* :mod:`venv` (Python :ref:`virtual environments <pep-405>`, as in the + popular ``virtualenv`` package) + +New built-in features: + +* Reworked :ref:`I/O exception hierarchy <pep-3151>`. + +Implementation improvements: + +* Rewritten :ref:`import machinery <importlib>` based on :mod:`importlib`. +* More compact :ref:`unicode strings <pep-393>`. +* More compact :ref:`attribute dictionaries <pep-412>`. + +Significantly Improved Library Modules: + +* C Accelerator for the :ref:`decimal <new-decimal>` module. +* Better unicode handling in the :ref:`email <new-email>` module + (:term:`provisional <provisional package>`). + +Security improvements: + +* Hash randomization is switched on by default. + +Please read on for a comprehensive list of user-facing changes. + + +.. _pep-405: + +PEP 405: Virtual Environments +============================= + +Virtual environments help create separate Python setups while sharing a +system-wide base install, for ease of maintenance. Virtual environments +have their own set of private site packages (i.e. locally-installed +libraries), and are optionally segregated from the system-wide site +packages. Their concept and implementation are inspired by the popular +``virtualenv`` third-party package, but benefit from tighter integration +with the interpreter core. + +This PEP adds the :mod:`venv` module for programmatic access, and the +:ref:`pyvenv <scripts-pyvenv>` script for command-line access and +administration. The Python interpreter checks for a ``pyvenv.cfg``, +file whose existence signals the base of a virtual environment's directory +tree. + +.. seealso:: + + :pep:`405` - Python Virtual Environments + PEP written by Carl Meyer; implementation by Carl Meyer and Vinay Sajip + + +PEP 420: Implicit Namespace Packages +==================================== + +Native support for package directories that don't require ``__init__.py`` +marker files and can automatically span multiple path segments (inspired by +various third party approaches to namespace packages, as described in +:pep:`420`) + +.. seealso:: + + :pep:`420` - Implicit Namespace Packages + PEP written by Eric V. Smith; implementation by Eric V. Smith + and Barry Warsaw + + +.. _pep-3118-update: + +PEP 3118: New memoryview implementation and buffer protocol documentation +========================================================================= + +The implementation of :pep:`3118` has been significantly improved. + +The new memoryview implementation comprehensively fixes all ownership and +lifetime issues of dynamically allocated fields in the Py_buffer struct +that led to multiple crash reports. Additionally, several functions that +crashed or returned incorrect results for non-contiguous or multi-dimensional +input have been fixed. + +The memoryview object now has a PEP-3118 compliant getbufferproc() +that checks the consumer's request type. Many new features have been +added, most of them work in full generality for non-contiguous arrays +and arrays with suboffsets. + +The documentation has been updated, clearly spelling out responsibilities +for both exporters and consumers. Buffer request flags are grouped into +basic and compound flags. The memory layout of non-contiguous and +multi-dimensional NumPy-style arrays is explained. + +Features +-------- + +* All native single character format specifiers in struct module syntax + (optionally prefixed with '@') are now supported. + +* With some restrictions, the cast() method allows changing of format and + shape of C-contiguous arrays. + +* Multi-dimensional list representations are supported for any array type. + +* Multi-dimensional comparisons are supported for any array type. + +* One-dimensional memoryviews of hashable (read-only) types with formats B, + b or c are now hashable. (Contributed by Antoine Pitrou in :issue:`13411`) + +* Arbitrary slicing of any 1-D arrays type is supported. For example, it + is now possible to reverse a memoryview in O(1) by using a negative step. + +API changes +----------- + +* The maximum number of dimensions is officially limited to 64. + +* The representation of empty shape, strides and suboffsets is now + an empty tuple instead of None. + +* Accessing a memoryview element with format 'B' (unsigned bytes) + now returns an integer (in accordance with the struct module syntax). + For returning a bytes object the view must be cast to 'c' first. + +* memoryview comparisons now use the logical structure of the operands + and compare all array elements by value. All format strings in struct + module syntax are supported. Views with unrecognised format strings + are still permitted, but will always compare as unequal, regardless + of view contents. + +* For further changes see `Build and C API Changes`_ and `Porting C code`_. + +(Contributed by Stefan Krah in :issue:`10181`) + +.. seealso:: + + :pep:`3118` - Revising the Buffer Protocol + + +.. _pep-393: + +PEP 393: Flexible String Representation +======================================= + +The Unicode string type is changed to support multiple internal +representations, depending on the character with the largest Unicode ordinal +(1, 2, or 4 bytes) in the represented string. This allows a space-efficient +representation in common cases, but gives access to full UCS-4 on all +systems. For compatibility with existing APIs, several representations may +exist in parallel; over time, this compatibility should be phased out. + +On the Python side, there should be no downside to this change. + +On the C API side, PEP 393 is fully backward compatible. The legacy API +should remain available at least five years. Applications using the legacy +API will not fully benefit of the memory reduction, or - worse - may use +a bit more memory, because Python may have to maintain two versions of each +string (in the legacy format and in the new efficient storage). + +Functionality +------------- + +Changes introduced by :pep:`393` are the following: + +* Python now always supports the full range of Unicode codepoints, including + non-BMP ones (i.e. from ``U+0000`` to ``U+10FFFF``). The distinction between + narrow and wide builds no longer exists and Python now behaves like a wide + build, even under Windows. + +* With the death of narrow builds, the problems specific to narrow builds have + also been fixed, for example: + + * :func:`len` now always returns 1 for non-BMP characters, + so ``len('\U0010FFFF') == 1``; + + * surrogate pairs are not recombined in string literals, + so ``'\uDBFF\uDFFF' != '\U0010FFFF'``; + + * indexing or slicing non-BMP characters returns the expected value, + so ``'\U0010FFFF'[0]`` now returns ``'\U0010FFFF'`` and not ``'\uDBFF'``; + + * all other functions in the standard library now correctly handle + non-BMP codepoints. + +* The value of :data:`sys.maxunicode` is now always ``1114111`` (``0x10FFFF`` + in hexadecimal). The :c:func:`PyUnicode_GetMax` function still returns + either ``0xFFFF`` or ``0x10FFFF`` for backward compatibility, and it should + not be used with the new Unicode API (see :issue:`13054`). + +* The :file:`./configure` flag ``--with-wide-unicode`` has been removed. + +Performance and resource usage +------------------------------ + +The storage of Unicode strings now depends on the highest codepoint in the string: + +* pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per codepoint; + +* BMP strings (``U+0000-U+FFFF``) use 2 bytes per codepoint; + +* non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per codepoint. + +The net effect is that for most applications, memory usage of string +storage should decrease significantly - especially compared to former +wide unicode builds - as, in many cases, strings will be pure ASCII +even in international contexts (because many strings store non-human +language data, such as XML fragments, HTTP headers, JSON-encoded data, +etc.). We also hope that it will, for the same reasons, increase CPU +cache efficiency on non-trivial applications. The memory usage of +Python 3.3 is two to three times smaller than Python 3.2, and a little +bit better than Python 2.7, on a Django benchmark (see the PEP for +details). + +.. seealso:: + + :pep:`393` - Flexible String Representation + PEP written by Martin von Löwis; implementation by Torsten Becker + and Martin von Löwis. + + +.. _pep-397: + +PEP 397: Python Launcher for Windows +==================================== + +The Python 3.3 Windows installer now includes a ``py`` launcher application +that can be used to launch Python applications in a version independent +fashion. + +This launcher is invoked implicitly when double-clicking ``*.py`` files. +If only a single Python version is installed on the system, that version +will be used to run the file. If multiple versions are installed, the most +recent version is used by default, but this can be overridden by including +a Unix-style "shebang line" in the Python script. + +The launcher can also be used explicitly from the command line as the ``py`` +application. Running ``py`` follows the same version selection rules as +implicitly launching scripts, but a more specific version can be selected +by passing appropriate arguments (such as ``-3`` to request Python 3 when +Python 2 is also installed, or ``-2.6`` to specifclly request an earlier +Python version when a more recent version is installed). + +In addition to the launcher, the Windows installer now includes an +option to add the newly installed Python to the system PATH (contributed +by Brian Curtin in :issue:`3561`). + +.. seealso:: + + :pep:`397` - Python Launcher for Windows + PEP written by Mark Hammond and Martin v. Löwis; implementation by + Vinay Sajip. + + Launcher documentation: :ref:`launcher` + + Installer PATH modification: :ref:`windows-path-mod` + + +.. _pep-3151: + +PEP 3151: Reworking the OS and IO exception hierarchy +===================================================== + +The hierarchy of exceptions raised by operating system errors is now both +simplified and finer-grained. + +You don't have to worry anymore about choosing the appropriate exception +type between :exc:`OSError`, :exc:`IOError`, :exc:`EnvironmentError`, +:exc:`WindowsError`, :exc:`mmap.error`, :exc:`socket.error` or +:exc:`select.error`. All these exception types are now only one: +:exc:`OSError`. The other names are kept as aliases for compatibility +reasons. + +Also, it is now easier to catch a specific error condition. Instead of +inspecting the ``errno`` attribute (or ``args[0]``) for a particular +constant from the :mod:`errno` module, you can catch the adequate +:exc:`OSError` subclass. The available subclasses are the following: + +* :exc:`BlockingIOError` +* :exc:`ChildProcessError` +* :exc:`ConnectionError` +* :exc:`FileExistsError` +* :exc:`FileNotFoundError` +* :exc:`InterruptedError` +* :exc:`IsADirectoryError` +* :exc:`NotADirectoryError` +* :exc:`PermissionError` +* :exc:`ProcessLookupError` +* :exc:`TimeoutError` + +And the :exc:`ConnectionError` itself has finer-grained subclasses: + +* :exc:`BrokenPipeError` +* :exc:`ConnectionAbortedError` +* :exc:`ConnectionRefusedError` +* :exc:`ConnectionResetError` + +Thanks to the new exceptions, common usages of the :mod:`errno` can now be +avoided. For example, the following code written for Python 3.2:: + + from errno import ENOENT, EACCES, EPERM + + try: + with open("document.txt") as f: + content = f.read() + except IOError as err: + if err.errno == ENOENT: + print("document.txt file is missing") + elif err.errno in (EACCES, EPERM): + print("You are not allowed to read document.txt") + else: + raise + +can now be written without the :mod:`errno` import and without manual +inspection of exception attributes:: + + try: + with open("document.txt") as f: + content = f.read() + except FileNotFoundError: + print("document.txt file is missing") + except PermissionError: + print("You are not allowed to read document.txt") + +.. seealso:: + + :pep:`3151` - Reworking the OS and IO Exception Hierarchy + PEP written and implemented by Antoine Pitrou + + +.. index:: + single: yield; yield from (in What's New) + +.. _pep-380: + +PEP 380: Syntax for Delegating to a Subgenerator +================================================ + +PEP 380 adds the ``yield from`` expression, allowing a :term:`generator` to +delegate +part of its operations to another generator. This allows a section of code +containing :keyword:`yield` to be factored out and placed in another generator. +Additionally, the subgenerator is allowed to return with a value, and the +value is made available to the delegating generator. + +While designed primarily for use in delegating to a subgenerator, the ``yield +from`` expression actually allows delegation to arbitrary subiterators. + +For simple iterators, ``yield from iterable`` is essentially just a shortened +form of ``for item in iterable: yield item``:: + + >>> def g(x): + ... yield from range(x, 0, -1) + ... yield from range(x) + ... + >>> list(g(5)) + [5, 4, 3, 2, 1, 0, 1, 2, 3, 4] + +However, unlike an ordinary loop, ``yield from`` allows subgenerators to +receive sent and thrown values directly from the calling scope, and +return a final value to the outer generator:: + + >>> def accumulate(): + ... tally = 0 + ... while 1: + ... next = yield + ... if next is None: + ... return tally + ... tally += next + ... + >>> def gather_tallies(tallies): + ... while 1: + ... tally = yield from accumulate() + ... tallies.append(tally) + ... + >>> tallies = [] + >>> acc = gather_tallies(tallies) + >>> next(acc) # Ensure the accumulator is ready to accept values + >>> for i in range(4): + ... acc.send(i) + ... + >>> acc.send(None) # Finish the first tally + >>> for i in range(5): + ... acc.send(i) + ... + >>> acc.send(None) # Finish the second tally + >>> tallies + [6, 10] + +The main principle driving this change is to allow even generators that are +designed to be used with the ``send`` and ``throw`` methods to be split into +multiple subgenerators as easily as a single large function can be split into +multiple subfunctions. + +.. seealso:: + + :pep:`380` - Syntax for Delegating to a Subgenerator + PEP written by Greg Ewing; implementation by Greg Ewing, integrated into + 3.3 by Renaud Blanch, Ryan Kelly and Nick Coghlan; documentation by + Zbigniew Jędrzejewski-Szmek and Nick Coghlan + + +PEP 409: Suppressing exception context +====================================== + +PEP 409 introduces new syntax that allows the display of the chained +exception context to be disabled. This allows cleaner error messages in +applications that convert between exception types:: + + >>> class D: + ... def __init__(self, extra): + ... self._extra_attributes = extra + ... def __getattr__(self, attr): + ... try: + ... return self._extra_attributes[attr] + ... except KeyError: + ... raise AttributeError(attr) from None + ... + >>> D({}).x + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + File "<stdin>", line 8, in __getattr__ + AttributeError: x + +Without the ``from None`` suffix to suppress the cause, the original +exception would be displayed by default:: + + >>> class C: + ... def __init__(self, extra): + ... self._extra_attributes = extra + ... def __getattr__(self, attr): + ... try: + ... return self._extra_attributes[attr] + ... except KeyError: + ... raise AttributeError(attr) + ... + >>> C({}).x + Traceback (most recent call last): + File "<stdin>", line 6, in __getattr__ + KeyError: 'x' + + During handling of the above exception, another exception occurred: + + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + File "<stdin>", line 8, in __getattr__ + AttributeError: x + +No debugging capability is lost, as the original exception context remains +available if needed (for example, if an intervening library has incorrectly +suppressed valuable underlying details):: + + >>> try: + ... D({}).x + ... except AttributeError as exc: + ... print(repr(exc.__context__)) + ... + KeyError('x',) + +.. seealso:: + + :pep:`409` - Suppressing exception context + PEP written by Ethan Furman; implemented by Ethan Furman and Nick + Coghlan. + + +PEP 414: Explicit Unicode literals +====================================== + +To ease the transition from Python 2 for Unicode aware Python applications +that make heavy use of Unicode literals, Python 3.3 once again supports the +"``u``" prefix for string literals. This prefix has no semantic significance +in Python 3, it is provided solely to reduce the number of purely mechanical +changes in migrating to Python 3, making it easier for developers to focus on +the more significant semantic changes (such as the stricter default +separation of binary and text data). + +.. seealso:: + + :pep:`414` - Explicit Unicode literals + PEP written by Armin Ronacher. + + +PEP 3155: Qualified name for classes and functions +================================================== + +Functions and class objects have a new ``__qualname__`` attribute representing +the "path" from the module top-level to their definition. For global functions +and classes, this is the same as ``__name__``. For other functions and classes, +it provides better information about where they were actually defined, and +how they might be accessible from the global scope. + +Example with (non-bound) methods:: + + >>> class C: + ... def meth(self): + ... pass + >>> C.meth.__name__ + 'meth' + >>> C.meth.__qualname__ + 'C.meth' + +Example with nested classes:: + + >>> class C: + ... class D: + ... def meth(self): + ... pass + ... + >>> C.D.__name__ + 'D' + >>> C.D.__qualname__ + 'C.D' + >>> C.D.meth.__name__ + 'meth' + >>> C.D.meth.__qualname__ + 'C.D.meth' + +Example with nested functions:: + + >>> def outer(): + ... def inner(): + ... pass + ... return inner + ... + >>> outer().__name__ + 'inner' + >>> outer().__qualname__ + 'outer.<locals>.inner' + +The string representation of those objects is also changed to include the +new, more precise information:: + + >>> str(C.D) + "<class '__main__.C.D'>" + >>> str(C.D.meth) + '<function C.D.meth at 0x7f46b9fe31e0>' + +.. seealso:: + + :pep:`3155` - Qualified name for classes and functions + PEP written and implemented by Antoine Pitrou. + + +.. _pep-412: + +PEP 412: Key-Sharing Dictionary +=============================== + +Dictionaries used for the storage of objects' attributes are now able to +share part of their internal storage between each other (namely, the part +which stores the keys and their respective hashes). This reduces the memory +consumption of programs creating many instances of non-builtin types. + +.. seealso:: + + :pep:`412` - Key-Sharing Dictionary + PEP written and implemented by Mark Shannon. + + +PEP 362: Function Signature Object +================================== + +A new function :func:`inspect.signature` makes introspection of python +callables easy and straightforward. A broad range of callables is supported: +python functions, decorated or not, classes, and :func:`functools.partial` +objects. New classes :class:`inspect.Signature`, :class:`inspect.Parameter` +and :class:`inspect.BoundArguments` hold information about the call signatures, +such as, annotations, default values, parameters kinds, and bound arguments, +which considerably simplifies writing decorators and any code that validates +or amends calling signatures or arguments. + +.. seealso:: + + :pep:`362`: - Function Signature Object + PEP written by Brett Cannon, Yury Selivanov, Larry Hastings, Jiwon Seo; + implemented by Yury Selivanov. + + +PEP 421: Adding sys.implementation +================================== + +A new attribute on the :mod:`sys` module exposes details specific to the +implementation of the currently running interpreter. The initial set of +attributes on :attr:`sys.implementation` are ``name``, ``version``, +``hexversion``, and ``cache_tag``. + +The intention of ``sys.implementation`` is to consolidate into one namespace +the implementation-specific data used by the standard library. This allows +different Python implementations to share a single standard library code base +much more easily. In its initial state, ``sys.implementation`` holds only a +small portion of the implementation-specific data. Over time that ratio will +shift in order to make the standard library more portable. + +One example of improved standard library portability is ``cache_tag``. As of +Python 3.3, ``sys.implementation.cache_tag`` is used by :mod:`importlib` to +support :pep:`3147` compliance. Any Python implementation that uses +``importlib`` for its built-in import system may use ``cache_tag`` to control +the caching behavior for modules. + +SimpleNamespace +--------------- + +The implementation of ``sys.implementation`` also introduces a new type to +Python: :class:`types.SimpleNamespace`. In contrast to a mapping-based +namespace, like :class:`dict`, ``SimpleNamespace`` is attribute-based, like +:class:`object`. However, unlike ``object``, ``SimpleNamespace`` instances +are writable. This means that you can add, remove, and modify the namespace +through normal attribute access. + +.. seealso:: + + :pep:`421` - Adding sys.implementation + PEP written and implemented by Eric Snow. + + +.. _importlib: + +Using importlib as the Implementation of Import +=============================================== +:issue:`2377` - Replace __import__ w/ importlib.__import__ +:issue:`13959` - Re-implement parts of :mod:`imp` in pure Python +:issue:`14605` - Make import machinery explicit +:issue:`14646` - Require loaders set __loader__ and __package__ + +The :func:`__import__` function is now powered by :func:`importlib.__import__`. +This work leads to the completion of "phase 2" of :pep:`302`. There are +multiple benefits to this change. First, it has allowed for more of the +machinery powering import to be exposed instead of being implicit and hidden +within the C code. It also provides a single implementation for all Python VMs +supporting Python 3.3 to use, helping to end any VM-specific deviations in +import semantics. And finally it eases the maintenance of import, allowing for +future growth to occur. + +For the common user, there should be no visible change in semantics. For +those whose code currently manipulates import or calls import +programmatically, the code changes that might possibly be required are covered +in the `Porting Python code`_ section of this document. + +New APIs +-------- +One of the large benefits of this work is the exposure of what goes into +making the import statement work. That means the various importers that were +once implicit are now fully exposed as part of the :mod:`importlib` package. + +The abstract base classes defined in :mod:`importlib.abc` have been expanded +to properly delineate between :term:`meta path finders <meta path finder>` +and :term:`path entry finders <path entry finder>` by introducing +:class:`importlib.abc.MetaPathFinder` and +:class:`importlib.abc.PathEntryFinder`, respectively. The old ABC of +:class:`importlib.abc.Finder` is now only provided for backwards-compatibility +and does not enforce any method requirements. + +In terms of finders, :class:`importlib.machinery.FileFinder` exposes the +mechanism used to search for source and bytecode files of a module. Previously +this class was an implicit member of :attr:`sys.path_hooks`. + +For loaders, the new abstract base class :class:`importlib.abc.FileLoader` helps +write a loader that uses the file system as the storage mechanism for a module's +code. The loader for source files +(:class:`importlib.machinery.SourceFileLoader`), sourceless bytecode files +(:class:`importlib.machinery.SourcelessFileLoader`), and extension modules +(:class:`importlib.machinery.ExtensionFileLoader`) are now available for +direct use. + +:exc:`ImportError` now has ``name`` and ``path`` attributes which are set when +there is relevant data to provide. The message for failed imports will also +provide the full name of the module now instead of just the tail end of the +module's name. + +The :func:`importlib.invalidate_caches` function will now call the method with +the same name on all finders cached in :attr:`sys.path_importer_cache` to help +clean up any stored state as necessary. + +Visible Changes +--------------- + +For potential required changes to code, see the `Porting Python code`_ +section. + +Beyond the expanse of what :mod:`importlib` now exposes, there are other +visible changes to import. The biggest is that :attr:`sys.meta_path` and +:attr:`sys.path_hooks` now store all of the meta path finders and path entry +hooks used by import. Previously the finders were implicit and hidden within +the C code of import instead of being directly exposed. This means that one can +now easily remove or change the order of the various finders to fit one's needs. + +Another change is that all modules have a ``__loader__`` attribute, storing the +loader used to create the module. :pep:`302` has been updated to make this +attribute mandatory for loaders to implement, so in the future once 3rd-party +loaders have been updated people will be able to rely on the existence of the +attribute. Until such time, though, import is setting the module post-load. + +Loaders are also now expected to set the ``__package__`` attribute from +:pep:`366`. Once again, import itself is already setting this on all loaders +from :mod:`importlib` and import itself is setting the attribute post-load. + +``None`` is now inserted into :attr:`sys.path_importer_cache` when no finder +can be found on :attr:`sys.path_hooks`. Since :class:`imp.NullImporter` is not +directly exposed on :attr:`sys.path_hooks` it could no longer be relied upon to +always be available to use as a value representing no finder found. + +All other changes relate to semantic changes which should be taken into +consideration when updating code for Python 3.3, and thus should be read about +in the `Porting Python code`_ section of this document. + +(Implementation by Brett Cannon) + + +Other Language Changes +====================== + +Some smaller changes made to the core Python language are: + +* Added support for Unicode name aliases and named sequences. + Both :func:`unicodedata.lookup()` and ``'\N{...}'`` now resolve name aliases, + and :func:`unicodedata.lookup()` resolves named sequences too. + + (Contributed by Ezio Melotti in :issue:`12753`) + +* Unicode database updated to UCD version 6.1.0 + +* Equality comparisons on :func:`range` objects now return a result reflecting + the equality of the underlying sequences generated by those range objects. + (:issue:`13201`) + +* The ``count()``, ``find()``, ``rfind()``, ``index()`` and ``rindex()`` + methods of :class:`bytes` and :class:`bytearray` objects now accept an + integer between 0 and 255 as their first argument. + + (Contributed by Petri Lehtinen in :issue:`12170`) + +* The ``rjust()``, ``ljust()``, and ``center()`` methods of :class:`bytes` + and :class:`bytearray` now accept a :class:`bytearray` for the ``fill`` + argument. (Contributed by Petri Lehtinen in :issue:`12380`.) + +* New methods have been added to :class:`list` and :class:`bytearray`: + ``copy()`` and ``clear()`` (:issue:`10516`). Consequently, + :class:`~collections.abc.MutableSequence` now also defines a + :meth:`~collections.abc.MutableSequence.clear` method (:issue:`11388`). + +* Raw bytes literals can now be written ``rb"..."`` as well as ``br"..."``. + + (Contributed by Antoine Pitrou in :issue:`13748`.) + +* :meth:`dict.setdefault` now does only one lookup for the given key, making + it atomic when used with built-in types. + + (Contributed by Filip Gruszczyński in :issue:`13521`.) + +* The error messages produced when a function call does not match the function + signature have been significantly improved. + + (Contributed by Benjamin Peterson.) + + +A Finer-Grained Import Lock +=========================== + +Previous versions of CPython have always relied on a global import lock. +This led to unexpected annoyances, such as deadlocks when importing a module +would trigger code execution in a different thread as a side-effect. +Clumsy workarounds were sometimes employed, such as the +:c:func:`PyImport_ImportModuleNoBlock` C API function. + +In Python 3.3, importing a module takes a per-module lock. This correctly +serializes importation of a given module from multiple threads (preventing +the exposure of incompletely initialized modules), while eliminating the +aforementioned annoyances. + +(Contributed by Antoine Pitrou in :issue:`9260`.) + + +Builtin functions and types +=========================== + +* :func:`open` gets a new *opener* parameter: the underlying file descriptor + for the file object is then obtained by calling *opener* with (*file*, + *flags*). It can be used to use custom flags like :data:`os.O_CLOEXEC` for + example. The ``'x'`` mode was added: open for exclusive creation, failing if + the file already exists. +* :func:`print`: added the *flush* keyword argument. If the *flush* keyword + argument is true, the stream is forcibly flushed. +* :func:`hash`: hash randomization is enabled by default, see + :meth:`object.__hash__` and :envvar:`PYTHONHASHSEED`. +* The :class:`str` type gets a new :meth:`~str.casefold` method: return a + casefolded copy of the string, casefolded strings may be used for caseless + matching. For example, ``'ß'.casefold()`` returns ``'ss'``. +* The sequence documentation has been substantially rewritten to better + explain the binary/text sequence distinction and to provide specific + documentation sections for the individual builtin sequence types + (:issue:`4966`) + + +New Modules +=========== + +faulthandler +------------ + +This new debug module :mod:`faulthandler` contains functions to dump Python tracebacks explicitly, +on a fault (a crash like a segmentation fault), after a timeout, or on a user +signal. Call :func:`faulthandler.enable` to install fault handlers for the +:const:`SIGSEGV`, :const:`SIGFPE`, :const:`SIGABRT`, :const:`SIGBUS`, and +:const:`SIGILL` signals. You can also enable them at startup by setting the +:envvar:`PYTHONFAULTHANDLER` environment variable or by using :option:`-X` +``faulthandler`` command line option. + +Example of a segmentation fault on Linux: :: + + $ python -q -X faulthandler + >>> import ctypes + >>> ctypes.string_at(0) + Fatal Python error: Segmentation fault + + Current thread 0x00007fb899f39700: + File "/home/python/cpython/Lib/ctypes/__init__.py", line 486 in string_at + File "<stdin>", line 1 in <module> + Segmentation fault + + +ipaddress +--------- + +The new :mod:`ipaddress` module provides tools for creating and manipulating +objects representing IPv4 and IPv6 addresses, networks and interfaces (i.e. +an IP address associated with a specific IP subnet). + +(Contributed by Google and Peter Moody in :pep:`3144`) + +lzma +---- + +The newly-added :mod:`lzma` module provides data compression and decompression +using the LZMA algorithm, including support for the ``.xz`` and ``.lzma`` +file formats. + +(Contributed by Nadeem Vawda and Per Øyvind Karlsen in :issue:`6715`) + + +Improved Modules +================ + +abc +--- + +Improved support for abstract base classes containing descriptors composed with +abstract methods. The recommended approach to declaring abstract descriptors is +now to provide :attr:`__isabstractmethod__` as a dynamically updated +property. The built-in descriptors have been updated accordingly. + + * :class:`abc.abstractproperty` has been deprecated, use :class:`property` + with :func:`abc.abstractmethod` instead. + * :class:`abc.abstractclassmethod` has been deprecated, use + :class:`classmethod` with :func:`abc.abstractmethod` instead. + * :class:`abc.abstractstaticmethod` has been deprecated, use + :class:`staticmethod` with :func:`abc.abstractmethod` instead. + +(Contributed by Darren Dale in :issue:`11610`) + +:meth:`abc.ABCMeta.register` now returns the registered subclass, which means +it can now be used as a class decorator (:issue:`10868`). + + +array +----- + +The :mod:`array` module supports the :c:type:`long long` type using ``q`` and +``Q`` type codes. + +(Contributed by Oren Tirosh and Hirokazu Yamamoto in :issue:`1172711`) + + +base64 +------ + +ASCII-only Unicode strings are now accepted by the decoding functions of the +:mod:`base64` modern interface. For example, ``base64.b64decode('YWJj')`` +returns ``b'abc'``. (Contributed by Catalin Iacob in :issue:`13641`.) + + +binascii +-------- + +In addition to the binary objects they normally accept, the ``a2b_`` functions +now all also accept ASCII-only strings as input. (Contributed by Antoine +Pitrou in :issue:`13637`.) + + +bz2 +--- + +The :mod:`bz2` module has been rewritten from scratch. In the process, several +new features have been added: + +* New :func:`bz2.open` function: open a bzip2-compressed file in binary or + text mode. + +* :class:`bz2.BZ2File` can now read from and write to arbitrary file-like + objects, by means of its constructor's *fileobj* argument. + + (Contributed by Nadeem Vawda in :issue:`5863`) + +* :class:`bz2.BZ2File` and :func:`bz2.decompress` can now decompress + multi-stream inputs (such as those produced by the :program:`pbzip2` tool). + :class:`bz2.BZ2File` can now also be used to create this type of file, using + the ``'a'`` (append) mode. + + (Contributed by Nir Aides in :issue:`1625`) + +* :class:`bz2.BZ2File` now implements all of the :class:`io.BufferedIOBase` API, + except for the :meth:`detach` and :meth:`truncate` methods. + + +codecs +------ + +The :mod:`~encodings.mbcs` codec has been rewritten to handle correctly +``replace`` and ``ignore`` error handlers on all Windows versions. The +:mod:`~encodings.mbcs` codec now supports all error handlers, instead of only +``replace`` to encode and ``ignore`` to decode. + +A new Windows-only codec has been added: ``cp65001`` (:issue:`13216`). It is the +Windows code page 65001 (Windows UTF-8, ``CP_UTF8``). For example, it is used +by ``sys.stdout`` if the console output code page is set to cp65001 (e.g., using +``chcp 65001`` command). + +Multibyte CJK decoders now resynchronize faster. They only ignore the first +byte of an invalid byte sequence. For example, ``b'\xff\n'.decode('gb2312', +'replace')`` now returns a ``\n`` after the replacement character. + +(:issue:`12016`) + +Incremental CJK codec encoders are no longer reset at each call to their +encode() methods. For example:: + + $ ./python -q + >>> import codecs + >>> encoder = codecs.getincrementalencoder('hz')('strict') + >>> b''.join(encoder.encode(x) for x in '\u52ff\u65bd\u65bc\u4eba\u3002 Bye.') + b'~{NpJ)l6HK!#~} Bye.' + +This example gives ``b'~{Np~}~{J)~}~{l6~}~{HK~}~{!#~} Bye.'`` with older Python +versions. + +(:issue:`12100`) + +The ``unicode_internal`` codec has been deprecated. + + +collections +----------- + +Addition of a new :class:`~collections.ChainMap` class to allow treating a +number of mappings as a single unit. (Written by Raymond Hettinger for +:issue:`11089`, made public in :issue:`11297`) + +The abstract base classes have been moved in a new :mod:`collections.abc` +module, to better differentiate between the abstract and the concrete +collections classes. Aliases for ABCs are still present in the +:mod:`collections` module to preserve existing imports. (:issue:`11085`) + +.. XXX addition of __slots__ to ABCs not recorded here: internal detail + +The :class:`~collections.Counter` class now supports the unary ``+`` and ``-`` +operators, as well as the in-place operators ``+=``, ``-=``, ``|=``, and +``&=``. (Contributed by Raymond Hettinger in :issue:`13121`.) + + +contextlib +---------- + +:class:`~contextlib.ExitStack` now provides a solid foundation for +programmatic manipulation of context managers and similar cleanup +functionality. Unlike the previous ``contextlib.nested`` API (which was +deprecated and removed), the new API is designed to work correctly +regardless of whether context managers acquire their resources in +their ``__init__`` method (for example, file objects) or in their +``__enter__`` method (for example, synchronisation objects from the +:mod:`threading` module). + +(:issue:`13585`) + + +crypt +----- + +Addition of salt and modular crypt format (hashing method) and the :func:`~crypt.mksalt` +function to the :mod:`crypt` module. + +(:issue:`10924`) + +curses +------ + + * If the :mod:`curses` module is linked to the ncursesw library, use Unicode + functions when Unicode strings or characters are passed (e.g. + :c:func:`waddwstr`), and bytes functions otherwise (e.g. :c:func:`waddstr`). + * Use the locale encoding instead of ``utf-8`` to encode Unicode strings. + * :class:`curses.window` has a new :attr:`curses.window.encoding` attribute. + * The :class:`curses.window` class has a new :meth:`~curses.window.get_wch` + method to get a wide character + * The :mod:`curses` module has a new :meth:`~curses.unget_wch` function to + push a wide character so the next :meth:`~curses.window.get_wch` will return + it + +(Contributed by Iñigo Serna in :issue:`6755`) + +datetime +-------- + + * Equality comparisons between naive and aware :class:`~datetime.datetime` + instances now return :const:`False` instead of raising :exc:`TypeError` + (:issue:`15006`). + * New :meth:`datetime.datetime.timestamp` method: Return POSIX timestamp + corresponding to the :class:`~datetime.datetime` instance. + * The :meth:`datetime.datetime.strftime` method supports formatting years + older than 1000. + * The :meth:`datetime.datetime.astimezone` method can now be + called without arguments to convert datetime instance to the system + timezone. + + +.. _new-decimal: + +decimal +------- + +:issue:`7652` - integrate fast native decimal arithmetic. + C-module and libmpdec written by Stefan Krah. + +The new C version of the decimal module integrates the high speed libmpdec +library for arbitrary precision correctly-rounded decimal floating point +arithmetic. libmpdec conforms to IBM's General Decimal Arithmetic Specification. + +Performance gains range from 10x for database applications to 100x for +numerically intensive applications. These numbers are expected gains +for standard precisions used in decimal floating point arithmetic. Since +the precision is user configurable, the exact figures may vary. For example, +in integer bignum arithmetic the differences can be significantly higher. + +The following table is meant as an illustration. Benchmarks are available +at http://www.bytereef.org/mpdecimal/quickstart.html. + + +---------+-------------+--------------+-------------+ + | | decimal.py | _decimal | speedup | + +=========+=============+==============+=============+ + | pi | 42.02s | 0.345s | 120x | + +---------+-------------+--------------+-------------+ + | telco | 172.19s | 5.68s | 30x | + +---------+-------------+--------------+-------------+ + | psycopg | 3.57s | 0.29s | 12x | + +---------+-------------+--------------+-------------+ + +Features +~~~~~~~~ + +* The :exc:`~decimal.FloatOperation` signal optionally enables stricter + semantics for mixing floats and Decimals. + +* If Python is compiled without threads, the C version automatically + disables the expensive thread local context machinery. In this case, + the variable :data:`~decimal.HAVE_THREADS` is set to ``False``. + +API changes +~~~~~~~~~~~ + +* The C module has the following context limits, depending on the machine + architecture: + + +-------------------+---------------------+------------------------------+ + | | 32-bit | 64-bit | + +===================+=====================+==============================+ + | :const:`MAX_PREC` | :const:`425000000` | :const:`999999999999999999` | + +-------------------+---------------------+------------------------------+ + | :const:`MAX_EMAX` | :const:`425000000` | :const:`999999999999999999` | + +-------------------+---------------------+------------------------------+ + | :const:`MIN_EMIN` | :const:`-425000000` | :const:`-999999999999999999` | + +-------------------+---------------------+------------------------------+ + +* In the context templates (:class:`~decimal.DefaultContext`, + :class:`~decimal.BasicContext` and :class:`~decimal.ExtendedContext`) + the magnitude of :attr:`~decimal.Context.Emax` and + :attr:`~decimal.Context.Emin` has changed to :const:`999999`. + +* The :class:`~decimal.Decimal` constructor in decimal.py does not observe + the context limits and converts values with arbitrary exponents or precision + exactly. Since the C version has internal limits, the following scheme is + used: If possible, values are converted exactly, otherwise + :exc:`~decimal.InvalidOperation` is raised and the result is NaN. In the + latter case it is always possible to use :meth:`~decimal.Context.create_decimal` + in order to obtain a rounded or inexact value. + + +* The power function in decimal.py is always correctly-rounded. In the + C version, it is defined in terms of the correctly-rounded + :meth:`~decimal.Decimal.exp` and :meth:`~decimal.Decimal.ln` functions, + but the final result is only "almost always correctly rounded". + + +* In the C version, the context dictionary containing the signals is a + :class:`~collections.abc.MutableMapping`. For speed reasons, + :attr:`~decimal.Context.flags` and :attr:`~decimal.Context.traps` always + refer to the same :class:`~collections.abc.MutableMapping` that the context + was initialized with. If a new signal dictionary is assigned, + :attr:`~decimal.Context.flags` and :attr:`~decimal.Context.traps` + are updated with the new values, but they do not reference the RHS + dictionary. + + +* Pickling a :class:`~decimal.Context` produces a different output in order + to have a common interchange format for the Python and C versions. + + +* The order of arguments in the :class:`~decimal.Context` constructor has been + changed to match the order displayed by :func:`repr`. + + +* The ``watchexp`` parameter in the :meth:`~decimal.Decimal.quantize` method + is deprecated. + + +.. _new-email: + +email +----- + +Policy Framework +~~~~~~~~~~~~~~~~ + +The email package now has a :mod:`~email.policy` framework. A +:class:`~email.policy.Policy` is an object with several methods and properties +that control how the email package behaves. The primary policy for Python 3.3 +is the :class:`~email.policy.Compat32` policy, which provides backward +compatibility with the email package in Python 3.2. A ``policy`` can be +specified when an email message is parsed by a :mod:`~email.parser`, or when a +:class:`~email.message.Message` object is created, or when an email is +serialized using a :mod:`~email.generator`. Unless overridden, a policy passed +to a ``parser`` is inherited by all the ``Message`` object and sub-objects +created by the ``parser``. By default a ``generator`` will use the policy of +the ``Message`` object it is serializing. The default policy is +:data:`~email.policy.compat32`. + +The minimum set of controls implemented by all ``policy`` objects are: + + .. tabularcolumns:: |l|L| + + =============== ======================================================= + max_line_length The maximum length, excluding the linesep character(s), + individual lines may have when a ``Message`` is + serialized. Defaults to 78. + + linesep The character used to separate individual lines when a + ``Message`` is serialized. Defaults to ``\n``. + + cte_type ``7bit`` or ``8bit``. ``8bit`` applies only to a + ``Bytes`` ``generator``, and means that non-ASCII may + be used where allowed by the protocol (or where it + exists in the original input). + + raise_on_defect Causes a ``parser`` to raise error when defects are + encountered instead of adding them to the ``Message`` + object's ``defects`` list. + =============== ======================================================= + +A new policy instance, with new settings, is created using the +:meth:`~email.policy.Policy.clone` method of policy objects. ``clone`` takes +any of the above controls as keyword arguments. Any control not specified in +the call retains its default value. Thus you can create a policy that uses +``\r\n`` linesep characters like this:: + + mypolicy = compat32.clone(linesep='\r\n') + +Policies can be used to make the generation of messages in the format needed by +your application simpler. Instead of having to remember to specify +``linesep='\r\n'`` in all the places you call a ``generator``, you can specify +it once, when you set the policy used by the ``parser`` or the ``Message``, +whichever your program uses to create ``Message`` objects. On the other hand, +if you need to generate messages in multiple forms, you can still specify the +parameters in the appropriate ``generator`` call. Or you can have custom +policy instances for your different cases, and pass those in when you create +the ``generator``. + + +Provisional Policy with New Header API +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +While the policy framework is worthwhile all by itself, the main motivation for +introducing it is to allow the creation of new policies that implement new +features for the email package in a way that maintains backward compatibility +for those who do not use the new policies. Because the new policies introduce a +new API, we are releasing them in Python 3.3 as a :term:`provisional policy +<provisional package>`. Backwards incompatible changes (up to and including +removal of the code) may occur if deemed necessary by the core developers. + +The new policies are instances of :class:`~email.policy.EmailPolicy`, +and add the following additional controls: + + .. tabularcolumns:: |l|L| + + =============== ======================================================= + refold_source Controls whether or not headers parsed by a + :mod:`~email.parser` are refolded by the + :mod:`~email.generator`. It can be ``none``, ``long``, + or ``all``. The default is ``long``, which means that + source headers with a line longer than + ``max_line_length`` get refolded. ``none`` means no + line get refolded, and ``all`` means that all lines + get refolded. + + header_factory A callable that take a ``name`` and ``value`` and + produces a custom header object. + =============== ======================================================= + +The ``header_factory`` is the key to the new features provided by the new +policies. When one of the new policies is used, any header retrieved from +a ``Message`` object is an object produced by the ``header_factory``, and any +time you set a header on a ``Message`` it becomes an object produced by +``header_factory``. All such header objects have a ``name`` attribute equal +to the header name. Address and Date headers have additional attributes +that give you access to the parsed data of the header. This means you can now +do things like this:: + + >>> m = Message(policy=SMTP) + >>> m['To'] = 'Éric <foo@example.com>' + >>> m['to'] + 'Éric <foo@example.com>' + >>> m['to'].addresses + (Address(display_name='Éric', username='foo', domain='example.com'),) + >>> m['to'].addresses[0].username + 'foo' + >>> m['to'].addresses[0].display_name + 'Éric' + >>> m['Date'] = email.utils.localtime() + >>> m['Date'].datetime + datetime.datetime(2012, 5, 25, 21, 39, 24, 465484, tzinfo=datetime.timezone(datetime.timedelta(-1, 72000), 'EDT')) + >>> m['Date'] + 'Fri, 25 May 2012 21:44:27 -0400' + >>> print(m) + To: =?utf-8?q?=C3=89ric?= <foo@example.com> + Date: Fri, 25 May 2012 21:44:27 -0400 + +You will note that the unicode display name is automatically encoded as +``utf-8`` when the message is serialized, but that when the header is accessed +directly, you get the unicode version. This eliminates any need to deal with +the :mod:`email.header` :meth:`~email.header.decode_header` or +:meth:`~email.header.make_header` functions. + +You can also create addresses from parts:: + + >>> m['cc'] = [Group('pals', [Address('Bob', 'bob', 'example.com'), + ... Address('Sally', 'sally', 'example.com')]), + ... Address('Bonzo', addr_spec='bonz@laugh.com')] + >>> print(m) + To: =?utf-8?q?=C3=89ric?= <foo@example.com> + Date: Fri, 25 May 2012 21:44:27 -0400 + cc: pals: Bob <bob@example.com>, Sally <sally@example.com>;, Bonzo <bonz@laugh.com> + +Decoding to unicode is done automatically:: + + >>> m2 = message_from_string(str(m)) + >>> m2['to'] + 'Éric <foo@example.com>' + +When you parse a message, you can use the ``addresses`` and ``groups`` +attributes of the header objects to access the groups and individual +addresses:: + + >>> m2['cc'].addresses + (Address(display_name='Bob', username='bob', domain='example.com'), Address(display_name='Sally', username='sally', domain='example.com'), Address(display_name='Bonzo', username='bonz', domain='laugh.com')) + >>> m2['cc'].groups + (Group(display_name='pals', addresses=(Address(display_name='Bob', username='bob', domain='example.com'), Address(display_name='Sally', username='sally', domain='example.com')), Group(display_name=None, addresses=(Address(display_name='Bonzo', username='bonz', domain='laugh.com'),)) + +In summary, if you use one of the new policies, header manipulation works the +way it ought to: your application works with unicode strings, and the email +package transparently encodes and decodes the unicode to and from the RFC +standard Content Transfer Encodings. + +Other API Changes +~~~~~~~~~~~~~~~~~ + +New :class:`~email.parser.BytesHeaderParser`, added to the :mod:`~email.parser` +module to complement :class:`~email.parser.HeaderParser` and complete the Bytes +API. + +New utility functions: + + * :func:`~email.utils.format_datetime`: given a :class:`~datetime.datetime`, + produce a string formatted for use in an email header. + + * :func:`~email.utils.parsedate_to_datetime`: given a date string from + an email header, convert it into an aware :class:`~datetime.datetime`, + or a naive :class:`~datetime.datetime` if the offset is ``-0000``. + + * :func:`~email.utils.localtime`: With no argument, returns the + current local time as an aware :class:`~datetime.datetime` using the local + :class:`~datetime.timezone`. Given an aware :class:`~datetime.datetime`, + converts it into an aware :class:`~datetime.datetime` using the + local :class:`~datetime.timezone`. + + +ftplib +------ + +* :class:`ftplib.FTP` now accepts a ``source_address`` keyword argument to + specify the ``(host, port)`` to use as the source address in the bind call + when creating the outgoing socket. (Contributed by Giampaolo Rodolà + in :issue:`8594`.) + +* The :class:`~ftplib.FTP_TLS` class now provides a new + :func:`~ftplib.FTP_TLS.ccc` function to revert control channel back to + plaintext. This can be useful to take advantage of firewalls that know how + to handle NAT with non-secure FTP without opening fixed ports. (Contributed + by Giampaolo Rodolà in :issue:`12139`) + +* Added :meth:`ftplib.FTP.mlsd` method which provides a parsable directory + listing format and deprecates :meth:`ftplib.FTP.nlst` and + :meth:`ftplib.FTP.dir`. (Contributed by Giampaolo Rodolà in :issue:`11072`) + + +functools +--------- + +The :func:`functools.lru_cache` decorator now accepts a ``typed`` keyword +argument (that defaults to ``False`` to ensure that it caches values of +different types that compare equal in separate cache slots. (Contributed +by Raymond Hettinger in :issue:`13227`.) + + +gc +-- + +It is now possible to register callbacks invoked by the garbage collector +before and after collection using the new :data:`~gc.callbacks` list. + + +hmac +---- + +A new :func:`~hmac.compare_digest` function has been added to prevent side +channel attacks on digests through timing analysis. (Contributed by Nick +Coghlan and Christian Heimes in :issue:`15061`) + + +http +---- + +:class:`http.server.BaseHTTPRequestHandler` now buffers the headers and writes +them all at once when :meth:`~http.server.BaseHTTPRequestHandler.end_headers` is +called. A new method :meth:`~http.server.BaseHTTPRequestHandler.flush_headers` +can be used to directly manage when the accumlated headers are sent. +(Contributed by Andrew Schaaf in :issue:`3709`.) + +:class:`http.server` now produces valid ``HTML 4.01 strict`` output. +(Contributed by Ezio Melotti in :issue:`13295`.) + +:class:`http.client.HTTPResponse` now has a +:meth:`~http.client.HTTPResponse.readinto` method, which means it can be used +as a :class:`io.RawIOBase` class. (Contributed by John Kuhn in +:issue:`13464`.) + + +html +---- + +:class:`html.parser.HTMLParser` is now able to parse broken markup without +raising errors, therefore the *strict* argument of the constructor and the +:exc:`~html.parser.HTMLParseError` exception are now deprecated. +The ability to parse broken markup is the result of a number of bug fixes that +are also available on the latest bug fix releases of Python 2.7/3.2. +(Contributed by Ezio Melotti in :issue:`15114`, and :issue:`14538`, +:issue:`13993`, :issue:`13960`, :issue:`13358`, :issue:`1745761`, +:issue:`755670`, :issue:`13357`, :issue:`12629`, :issue:`1200313`, +:issue:`670664`, :issue:`13273`, :issue:`12888`, :issue:`7311`) + +A new :data:`~html.entities.html5` dictionary that maps HTML5 named character +references to the equivalent Unicode character(s) (e.g. ``html5['gt;'] == +'>'``) has been added to the :mod:`html.entities` module. The dictionary is +now also used by :class:`~html.parser.HTMLParser`. (Contributed by Ezio +Melotti in :issue:`11113` and :issue:`15156`) + + +imaplib +------- + +The :class:`~imaplib.IMAP4_SSL` constructor now accepts an SSLContext +parameter to control parameters of the secure channel. + +(Contributed by Sijin Joseph in :issue:`8808`) + + +inspect +------- + +A new :func:`~inspect.getclosurevars` function has been added. This function +reports the current binding of all names referenced from the function body and +where those names were resolved, making it easier to verify correct internal +state when testing code that relies on stateful closures. + +(Contributed by Meador Inge and Nick Coghlan in :issue:`13062`) + +A new :func:`~inspect.getgeneratorlocals` function has been added. This +function reports the current binding of local variables in the generator's +stack frame, making it easier to verify correct internal state when testing +generators. + +(Contributed by Meador Inge in :issue:`15153`) + +io +-- + +The :func:`~io.open` function has a new ``'x'`` mode that can be used to +exclusively create a new file, and raise a :exc:`FileExistsError` if the file +already exists. It is based on the C11 'x' mode to fopen(). + +(Contributed by David Townshend in :issue:`12760`) + +The constructor of the :class:`~io.TextIOWrapper` class has a new +*write_through* optional argument. If *write_through* is ``True``, calls to +:meth:`~io.TextIOWrapper.write` are guaranteed not to be buffered: any data +written on the :class:`~io.TextIOWrapper` object is immediately handled to its +underlying binary buffer. + + +itertools +--------- + +:func:`~itertools.accumulate` now takes an optional ``func`` argument for +providing a user-supplied binary function. + + +logging +------- + +The :func:`~logging.basicConfig` function now supports an optional ``handlers`` +argument taking an iterable of handlers to be added to the root logger. + +A class level attribute :attr:`~logging.handlers.SysLogHandler.append_nul` has +been added to :class:`~logging.handlers.SysLogHandler` to allow control of the +appending of the ``NUL`` (``\000``) byte to syslog records, since for some +deamons it is required while for others it is passed through to the log. + + + +math +---- + +The :mod:`math` module has a new function, :func:`~math.log2`, which returns +the base-2 logarithm of *x*. + +(Written by Mark Dickinson in :issue:`11888`). + + +mmap +---- + +The :meth:`~mmap.mmap.read` method is now more compatible with other file-like +objects: if the argument is omitted or specified as ``None``, it returns the +bytes from the current file position to the end of the mapping. (Contributed +by Petri Lehtinen in :issue:`12021`.) + + +multiprocessing +--------------- + +The new :func:`multiprocessing.connection.wait` function allows to poll +multiple objects (such as connections, sockets and pipes) with a timeout. +(Contributed by Richard Oudkerk in :issue:`12328`.) + +:class:`multiprocessing.Connection` objects can now be transferred over +multiprocessing connections. +(Contributed by Richard Oudkerk in :issue:`4892`.) + +:class:`multiprocessing.Process` now accepts a ``daemon`` keyword argument +to override the default behavior of inheriting the ``daemon`` flag from +the parent process (:issue:`6064`). + +New attribute attribute :data:`multiprocessing.Process.sentinel` allows a +program to wait on multiple :class:`~multiprocessing.Process` objects at one +time using the appropriate OS primitives (for example, :mod:`select` on +posix systems). + +New methods :meth:`multiprocessing.pool.Pool.starmap` and +:meth:`~multiprocessing.pool.Pool.starmap_async` provide +:func:`itertools.starmap` equivalents to the existing +:meth:`multiprocessing.pool.Pool.map` and +:meth:`~multiprocessing.pool.Pool.map_async` functions. (Contributed by Hynek +Schlawack in :issue:`12708`.) + + +nntplib +------- + +The :class:`nntplib.NNTP` class now supports the context manager protocol to +unconditionally consume :exc:`socket.error` exceptions and to close the NNTP +connection when done:: + + >>> from nntplib import NNTP + >>> with NNTP('news.gmane.org') as n: + ... n.group('gmane.comp.python.committers') + ... + ('211 1755 1 1755 gmane.comp.python.committers', 1755, 1, 1755, 'gmane.comp.python.committers') + >>> + +(Contributed by Giampaolo Rodolà in :issue:`9795`) + + +os +-- + +* The :mod:`os` module has a new :func:`~os.pipe2` function that makes it + possible to create a pipe with :data:`~os.O_CLOEXEC` or + :data:`~os.O_NONBLOCK` flags set atomically. This is especially useful to + avoid race conditions in multi-threaded programs. + +* The :mod:`os` module has a new :func:`~os.sendfile` function which provides + an efficent "zero-copy" way for copying data from one file (or socket) + descriptor to another. The phrase "zero-copy" refers to the fact that all of + the copying of data between the two descriptors is done entirely by the + kernel, with no copying of data into userspace buffers. :func:`~os.sendfile` + can be used to efficiently copy data from a file on disk to a network socket, + e.g. for downloading a file. + + (Patch submitted by Ross Lagerwall and Giampaolo Rodolà in :issue:`10882`.) + +* To avoid race conditions like symlink attacks and issues with temporary + files and directories, it is more reliable (and also faster) to manipulate + file descriptors instead of file names. Python 3.3 enhances existing functions + and introduces new functions to work on file descriptors (:issue:`4761`, + :issue:`10755` and :issue:`14626`). + + - The :mod:`os` module has a new :func:`~os.fwalk` function similar to + :func:`~os.walk` except that it also yields file descriptors referring to the + directories visited. This is especially useful to avoid symlink races. + + - The following functions get new optional *dir_fd* (:ref:`paths relative to + directory descriptors <dir_fd>`) and/or *follow_symlinks* (:ref:`not + following symlinks <follow_symlinks>`): + :func:`~os.access`, :func:`~os.chflags`, :func:`~os.chmod`, :func:`~os.chown`, + :func:`~os.link`, :func:`~os.lstat`, :func:`~os.mkdir`, :func:`~os.mkfifo`, + :func:`~os.mknod`, :func:`~os.open`, :func:`~os.readlink`, :func:`~os.remove`, + :func:`~os.rename`, :func:`~os.replace`, :func:`~os.rmdir`, :func:`~os.stat`, + :func:`~os.symlink`, :func:`~os.unlink`, :func:`~os.utime`. Platform + support for using these parameters can be checked via the sets + :data:`os.supports_dir_fd` and :data:`os.supports_follows_symlinks`. + + - The following functions now support a file descriptor for their path argument: + :func:`~os.chdir`, :func:`~os.chmod`, :func:`~os.chown`, + :func:`~os.execve`, :func:`~os.listdir`, :func:`~os.pathconf`, :func:`~os.path.exists`, + :func:`~os.stat`, :func:`~os.statvfs`, :func:`~os.utime`. Platform support + for this can be checked via the :data:`os.supports_fd` set. + +* :func:`~os.access` accepts an ``effective_ids`` keyword argument to turn on + using the effective uid/gid rather than the real uid/gid in the access check. + Platform support for this can be checked via the + :data:`~os.supports_effective_ids` set. + +* The :mod:`os` module has two new functions: :func:`~os.getpriority` and + :func:`~os.setpriority`. They can be used to get or set process + niceness/priority in a fashion similar to :func:`os.nice` but extended to all + processes instead of just the current one. + + (Patch submitted by Giampaolo Rodolà in :issue:`10784`.) + +* The new :func:`os.replace` function allows cross-platform renaming of a + file with overwriting the destination. With :func:`os.rename`, an existing + destination file is overwritten under POSIX, but raises an error under + Windows. + (Contributed by Antoine Pitrou in :issue:`8828`.) + +* The stat family of functions (:func:`~os.stat`, :func:`~os.fstat`, + and :func:`~os.lstat`) now support reading a file's timestamps + with nanosecond precision. Symmetrically, :func:`~os.utime` + can now write file timestamps with nanosecond precision. (Contributed by + Larry Hastings in :issue:`14127`.) + +* The new :func:`os.get_terminal_size` function queries the size of the + terminal attached to a file descriptor. See also + :func:`shutil.get_terminal_size`. + (Contributed by Zbigniew Jędrzejewski-Szmek in :issue:`13609`.) + +.. XXX sort out this mess after beta1 + +* New functions to support Linux extended attributes (:issue:`12720`): + :func:`~os.getxattr`, :func:`~os.listxattr`, :func:`~os.removexattr`, + :func:`~os.setxattr`. + +* New interface to the scheduler. These functions + control how a process is allocated CPU time by the operating system. New + functions: + :func:`~os.sched_get_priority_max`, :func:`~os.sched_get_priority_min`, + :func:`~os.sched_getaffinity`, :func:`~os.sched_getparam`, + :func:`~os.sched_getscheduler`, :func:`~os.sched_rr_get_interval`, + :func:`~os.sched_setaffinity`, :func:`~os.sched_setparam`, + :func:`~os.sched_setscheduler`, :func:`~os.sched_yield`, + +* New functions to control the file system: + + * :func:`~os.posix_fadvise`: Announces an intention to access data in a + specific pattern thus allowing the kernel to make optimizations. + * :func:`~os.posix_fallocate`: Ensures that enough disk space is allocated + for a file. + * :func:`~os.sync`: Force write of everything to disk. + +* Additional new posix functions: + + * :func:`~os.lockf`: Apply, test or remove a POSIX lock on an open file descriptor. + * :func:`~os.pread`: Read from a file descriptor at an offset, the file + offset remains unchanged. + * :func:`~os.pwrite`: Write to a file descriptor from an offset, leaving + the file offset unchanged. + * :func:`~os.readv`: Read from a file descriptor into a number of writable buffers. + * :func:`~os.truncate`: Truncate the file corresponding to *path*, so that + it is at most *length* bytes in size. + * :func:`~os.waitid`: Wait for the completion of one or more child processes. + * :func:`~os.writev`: Write the contents of *buffers* to a file descriptor, + where *buffers* is an arbitrary sequence of buffers. + * :func:`~os.getgrouplist` (:issue:`9344`): Return list of group ids that + specified user belongs to. + +* :func:`~os.times` and :func:`~os.uname`: Return type changed from a tuple to + a tuple-like object with named attributes. + +* Some platforms now support additional constants for the :func:`~os.lseek` + function, such as ``os.SEEK_HOLE`` and ``os.SEEK_DATA``. + +* New constants :data:`~os.RTLD_LAZY`, :data:`~os.RTLD_NOW`, + :data:`~os.RTLD_GLOBAL`, :data:`~os.RTLD_LOCAL`, :data:`~os.RTLD_NODELETE`, + :data:`~os.RTLD_NOLOAD`, and :data:`~os.RTLD_DEEPBIND` are available on + platforms that support them. These are for use with the + :func:`sys.setdlopenflags` function, and supersede the similar constants + defined in :mod:`ctypes` and :mod:`DLFCN`. (Contributed by Victor Stinner + in :issue:`13226`.) + +* :func:`os.symlink` now accepts (and ignores) the ``target_is_directory`` + keyword argument on non-Windows platforms, to ease cross-platform support. + + +pdb +--- + +Tab-completion is now available not only for command names, but also their +arguments. For example, for the ``break`` command, function and file names +are completed. + +(Contributed by Georg Brandl in :issue:`14210`) + + +pickle +------ + +:class:`pickle.Pickler` objects now have an optional +:attr:`~pickle.Pickler.dispatch_table` attribute allowing to set per-pickler +reduction functions. + +(Contributed by Richard Oudkerk in :issue:`14166`.) + + +pydoc +----- + +The Tk GUI and the :func:`~pydoc.serve` function have been removed from the +:mod:`pydoc` module: ``pydoc -g`` and :func:`~pydoc.serve` have been deprecated +in Python 3.2. + + +re +-- + +:class:`str` regular expressions now support ``\u`` and ``\U`` escapes. + +(Contributed by Serhiy Storchaka in :issue:`3665`.) + + +sched +----- + +* :meth:`~sched.scheduler.run` now accepts a *blocking* parameter which when + set to False makes the method execute the scheduled events due to expire + soonest (if any) and then return immediately. + This is useful in case you want to use the :class:`~sched.scheduler` in + non-blocking applications. (Contributed by Giampaolo Rodolà in :issue:`13449`) + +* :class:`~sched.scheduler` class can now be safely used in multi-threaded + environments. (Contributed by Josiah Carlson and Giampaolo Rodolà in + :issue:`8684`) + +* *timefunc* and *delayfunct* parameters of :class:`~sched.scheduler` class + constructor are now optional and defaults to :func:`time.time` and + :func:`time.sleep` respectively. (Contributed by Chris Clark in + :issue:`13245`) + +* :meth:`~sched.scheduler.enter` and :meth:`~sched.scheduler.enterabs` + *argument* parameter is now optional. (Contributed by Chris Clark in + :issue:`13245`) + +* :meth:`~sched.scheduler.enter` and :meth:`~sched.scheduler.enterabs` + now accept a *kwargs* parameter. (Contributed by Chris Clark in + :issue:`13245`) + + +select +------ + +Solaris and derivative platforms have a new class :class:`select.devpoll` +for high performance asynchronous sockets via :file:`/dev/poll`. +(Contributed by Jesús Cea Avión in :issue:`6397`.) + + +shlex +----- + +The previously undocumented helper function ``quote`` from the +:mod:`pipes` modules has been moved to the :mod:`shlex` module and +documented. :func:`~shlex.quote` properly escapes all characters in a string +that might be otherwise given special meaning by the shell. + + +shutil +------ + +* New functions: + + * :func:`~shutil.disk_usage`: provides total, used and free disk space + statistics. (Contributed by Giampaolo Rodolà in :issue:`12442`) + * :func:`~shutil.chown`: allows one to change user and/or group of the given + path also specifying the user/group names and not only their numeric + ids. (Contributed by Sandro Tosi in :issue:`12191`) + * :func:`shutil.get_terminal_size`: returns the size of the terminal window + to which the interpreter is attached. (Contributed by Zbigniew + Jędrzejewski-Szmek in :issue:`13609`.) + +* :func:`~shutil.copy2` and :func:`~shutil.copystat` now preserve file + timestamps with nanosecond precision on platforms that support it. + They also preserve file "extended attributes" on Linux. (Contributed + by Larry Hastings in :issue:`14127` and :issue:`15238`.) + +* Several functions now take an optional ``symlinks`` argument: when that + parameter is true, symlinks aren't dereferenced and the operation instead + acts on the symlink itself (or creates one, if relevant). + (Contributed by Hynek Schlawack in :issue:`12715`.) + +* When copying files to a different file system, :func:`~shutil.move` now + handles symlinks the way the posix ``mv`` command does, recreating the + symlink rather than copying the target file contents. (Contributed by + Jonathan Niehof in :issue:`9993`.) :func:`~shutil.move` now also returns + the ``dst`` argument as its result. + +* :func:`~shutil.rmtree` is now resistant to symlink attacks on platforms + which support the new ``dir_fd`` parameter in :func:`os.open` and + :func:`os.unlink`. (Contributed by Martin von Löwis and Hynek Schlawack + in :issue:`4489`.) + + +signal +------ + +* The :mod:`signal` module has new functions: + + * :func:`~signal.pthread_sigmask`: fetch and/or change the signal mask of the + calling thread (Contributed by Jean-Paul Calderone in :issue:`8407`); + * :func:`~signal.pthread_kill`: send a signal to a thread; + * :func:`~signal.sigpending`: examine pending functions; + * :func:`~signal.sigwait`: wait a signal; + * :func:`~signal.sigwaitinfo`: wait for a signal, returning detailed + information about it; + * :func:`~signal.sigtimedwait`: like :func:`~signal.sigwaitinfo` but with a + timeout. + +* The signal handler writes the signal number as a single byte instead of + a nul byte into the wakeup file descriptor. So it is possible to wait more + than one signal and know which signals were raised. + +* :func:`signal.signal` and :func:`signal.siginterrupt` raise an OSError, + instead of a RuntimeError: OSError has an errno attribute. + + +smtpd +----- + +The :mod:`smtpd` module now supports :rfc:`5321` (extended SMTP) and :rfc:`1870` +(size extension). Per the standard, these extensions are enabled if and only +if the client initiates the session with an ``EHLO`` command. + +(Initial ``ELHO`` support by Alberto Trevino. Size extension by Juhana +Jauhiainen. Substantial additional work on the patch contributed by Michele +Orrù and Dan Boswell. :issue:`8739`) + + +smtplib +------- + +The :class:`~smtplib.SMTP`, :class:`~smtplib.SMTP_SSL`, and +:class:`~smtplib.LMTP` classes now accept a ``source_address`` keyword argument +to specify the ``(host, port)`` to use as the source address in the bind call +when creating the outgoing socket. (Contributed by Paulo Scardine in +:issue:`11281`.) + +:class:`~smtplib.SMTP` now supports the context manager protocol, allowing an +``SMTP`` instance to be used in a ``with`` statement. (Contributed +by Giampaolo Rodolà in :issue:`11289`.) + +The :class:`~smtplib.SMTP_SSL` constructor and the :meth:`~smtplib.SMTP.starttls` +method now accept an SSLContext parameter to control parameters of the secure +channel. (Contributed by Kasun Herath in :issue:`8809`) + + +socket +------ + +* The :class:`~socket.socket` class now exposes additional methods to process + ancillary data when supported by the underlying platform: + + * :func:`~socket.socket.sendmsg` + * :func:`~socket.socket.recvmsg` + * :func:`~socket.socket.recvmsg_into` + + (Contributed by David Watson in :issue:`6560`, based on an earlier patch by + Heiko Wundram) + +* The :class:`~socket.socket` class now supports the PF_CAN protocol family + (http://en.wikipedia.org/wiki/Socketcan), on Linux + (http://lwn.net/Articles/253425). + + (Contributed by Matthias Fuchs, updated by Tiago Gonçalves in :issue:`10141`) + +* The :class:`~socket.socket` class now supports the PF_RDS protocol family + (http://en.wikipedia.org/wiki/Reliable_Datagram_Sockets and + http://oss.oracle.com/projects/rds/). + +* The :class:`~socket.socket` class now supports the ``PF_SYSTEM`` protocol + family on OS X. (Contributed by Michael Goderbauer in :issue:`13777`.) + +* New function :func:`~socket.sethostname` allows the hostname to be set + on unix systems if the calling process has sufficient privileges. + (Contributed by Ross Lagerwall in :issue:`10866`.) + + +socketserver +------------ + +:class:`~socketserver.BaseServer` now has an overridable method +:meth:`~socketserver.BaseServer.service_actions` that is called by the +:meth:`~socketserver.BaseServer.serve_forever` method in the service loop. +:class:`~socketserver.ForkingMixIn` now uses this to clean up zombie +child proceses. (Contributed by Justin Warkentin in :issue:`11109`.) + + +sqlite3 +------- + +New :class:`sqlite3.Connection` method +:meth:`~sqlite3.Connection.set_trace_callback` can be used to capture a trace of +all sql commands processed by sqlite. (Contributed by Torsten Landschoff +in :issue:`11688`.) + + +ssl +--- + +* The :mod:`ssl` module has two new random generation functions: + + * :func:`~ssl.RAND_bytes`: generate cryptographically strong + pseudo-random bytes. + * :func:`~ssl.RAND_pseudo_bytes`: generate pseudo-random bytes. + + (Contributed by Victor Stinner in :issue:`12049`) + +* The :mod:`ssl` module now exposes a finer-grained exception hierarchy + in order to make it easier to inspect the various kinds of errors. + (Contributed by Antoine Pitrou in :issue:`11183`) + +* :meth:`~ssl.SSLContext.load_cert_chain` now accepts a *password* argument + to be used if the private key is encrypted. + (Contributed by Adam Simpkins in :issue:`12803`) + +* Diffie-Hellman key exchange, both regular and Elliptic Curve-based, is + now supported through the :meth:`~ssl.SSLContext.load_dh_params` and + :meth:`~ssl.SSLContext.set_ecdh_curve` methods. + (Contributed by Antoine Pitrou in :issue:`13626` and :issue:`13627`) + +* SSL sockets have a new :meth:`~ssl.SSLSocket.get_channel_binding` method + allowing the implementation of certain authentication mechanisms such as + SCRAM-SHA-1-PLUS. (Contributed by Jacek Konieczny in :issue:`12551`) + +* You can query the SSL compression algorithm used by an SSL socket, thanks + to its new :meth:`~ssl.SSLSocket.compression` method. The new attribute + :attr:`~ssl.OP_NO_COMPRESSION` can be used to disable compression. + (Contributed by Antoine Pitrou in :issue:`13634`) + +* Support has been added for the Next Procotol Negotiation extension using + the :meth:`ssl.SSLContext.set_npn_protocols` method. + (Contributed by Colin Marc in :issue:`14204`) + +* SSL errors can now be introspected more easily thanks to + :attr:`~ssl.SSLError.library` and :attr:`~ssl.SSLError.reason` attributes. + (Contributed by Antoine Pitrou in :issue:`14837`) + +* The :func:`~ssl.get_server_certificate` function now supports IPv6. + (Contributed by Charles-François Natali in :issue:`11811`.) + +* New attribute :attr:`~ssl.OP_CIPHER_SERVER_PREFERENCE` allows setting + SSLv3 server sockets to use the server's cipher ordering preference rather + than the client's (:issue:`13635`). + + +stat +---- + +The undocumented tarfile.filemode function has been moved to +:func:`stat.filemode`. It can be used to convert a file's mode to a string of +the form '-rwxrwxrwx'. + +(Contributed by Giampaolo Rodolà in :issue:`14807`) + + +struct +------ + +The :mod:`struct` module now supports ``ssize_t`` and ``size_t`` via the +new codes ``n`` and ``N``, respectively. (Contributed by Antoine Pitrou +in :issue:`3163`.) + + +subprocess +---------- + +Command strings can now be bytes objects on posix platforms. (Contributed by +Victor Stinner in :issue:`8513`.) + +A new constant :data:`~subprocess.DEVNULL` allows suppressing output in a +platform-independent fashion. (Contributed by Ross Lagerwall in +:issue:`5870`.) + + +sys +--- + +The :mod:`sys` module has a new :data:`~sys.thread_info` :term:`struct +sequence` holding informations about the thread implementation +(:issue:`11223`). + + +tarfile +------- + +:mod:`tarfile` now supports ``lzma`` encoding via the :mod:`lzma` module. +(Contributed by Lars Gustäbel in :issue:`5689`.) + + +tempfile +-------- + +:class:`tempfile.SpooledTemporaryFile`\'s +:meth:`~tempfile.SpooledTemporaryFile.truncate` method now accepts +a ``size`` parameter. (Contributed by Ryan Kelly in :issue:`9957`.) + + +textwrap +-------- + +The :mod:`textwrap` module has a new :func:`~textwrap.indent` that makes +it straightforward to add a common prefix to selected lines in a block +of text (:issue:`13857`). + + +threading +--------- + +:class:`threading.Condition`, :class:`threading.Semaphore`, +:class:`threading.BoundedSemaphore`, :class:`threading.Event`, and +:class:`threading.Timer`, all of which used to be factory functions returning a +class instance, are now classes and may be subclassed. (Contributed by Éric +Araujo in :issue:`10968`). + +The :class:`threading.Thread` constructor now accepts a ``daemon`` keyword +argument to override the default behavior of inheriting the ``deamon`` flag +value from the parent thread (:issue:`6064`). + +The formerly private function ``_thread.get_ident`` is now available as the +public function :func:`threading.get_ident`. This eliminates several cases of +direct access to the ``_thread`` module in the stdlib. Third party code that +used ``_thread.get_ident`` should likewise be changed to use the new public +interface. + + +time +---- + +The :pep:`418` added new functions to the :mod:`time` module: + +* :func:`~time.get_clock_info`: Get information on a clock. +* :func:`~time.monotonic`: Monotonic clock (cannot go backward), not affected + by system clock updates. +* :func:`~time.perf_counter`: Performance counter with the highest available + resolution to measure a short duration. +* :func:`~time.process_time`: Sum of the system and user CPU time of the + current process. + +Other new functions: + +* :func:`~time.clock_getres`, :func:`~time.clock_gettime` and + :func:`~time.clock_settime` functions with ``CLOCK_xxx`` constants. + (Contributed by Victor Stinner in :issue:`10278`) + +To improve cross platform consistency, :func:`~time.sleep` now raises a +:exc:`ValueError` when passed a negative sleep value. Previously this was an +error on posix, but produced an infinite sleep on Windows. + + +types +----- + +Add a new :class:`types.MappingProxyType` class: Read-only proxy of a mapping. +(:issue:`14386`) + + +The new functions `types.new_class` and `types.prepare_class` provide support +for PEP 3115 compliant dynamic type creation. (:issue:`14588`) + + +unittest +-------- + +:meth:`.assertRaises`, :meth:`.assertRaisesRegex`, :meth:`.assertWarns`, and +:meth:`.assertWarnsRegex` now accept a keyword argument *msg* when used as +context managers. (Contributed by Ezio Melotti and Winston Ewert in +:issue:`10775`) + +:meth:`unittest.TestCase.run` now returns the :class:`~unittest.TestResult` +object. + + +urllib +------ + +The :class:`~urllib.request.Request` class, now accepts a *method* argument +used by :meth:`~urllib.request.Request.get_method` to determine what HTTP method +should be used. For example, this will send a ``'HEAD'`` request:: + + >>> urlopen(Request('http://www.python.org', method='HEAD')) + +(:issue:`1673007`) + + +webbrowser +---------- + +The :mod:`webbrowser` module supports more "browsers": Google Chrome (named +:program:`chrome`, :program:`chromium`, :program:`chrome-browser` or +:program:`chromium-browser` depending on the version and operating system), +and the generic launchers :program:`xdg-open`, from the FreeDesktop.org +project, and :program:`gvfs-open`, which is the default URI handler for GNOME +3. (The former contributed by Arnaud Calmettes in :issue:`13620`, the latter +by Matthias Klose in :issue:`14493`) + + +xml.etree.ElementTree +--------------------- + +The :mod:`xml.etree.ElementTree` module now imports its C accelerator by +default; there is no longer a need to explicitly import +:mod:`xml.etree.cElementTree` (this module stays for backwards compatibility, +but is now deprecated). In addition, the ``iter`` family of methods of +:class:`~xml.etree.ElementTree.Element` has been optimized (rewritten in C). +The module's documentation has also been greatly improved with added examples +and a more detailed reference. + + +zlib +---- + +New attribute :attr:`zlib.Decompress.eof` makes it possible to distinguish +between a properly-formed compressed stream and an incomplete or truncated one. +(Contributed by Nadeem Vawda in :issue:`12646`.) + +New attribute :attr:`zlib.ZLIB_RUNTIME_VERSION` reports the version string of +the underlying ``zlib`` library that is loaded at runtime. (Contributed by +Torsten Landschoff in :issue:`12306`.) + + +Optimizations +============= + +Major performance enhancements have been added: + +* Thanks to :pep:`393`, some operations on Unicode strings have been optimized: + + * the memory footprint is divided by 2 to 4 depending on the text + * encode an ASCII string to UTF-8 doesn't need to encode characters anymore, + the UTF-8 representation is shared with the ASCII representation + * the UTF-8 encoder has been optimized + * repeating a single ASCII letter and getting a substring of a ASCII strings + is 4 times faster + +* UTF-8 is now 2x to 4x faster. UTF-16 encoding is now up to 10x faster. + + (contributed by Serhiy Storchaka, :issue:`14624`, :issue:`14738` and + :issue:`15026`.) + + +Build and C API Changes +======================= + +Changes to Python's build process and to the C API include: + +* New :pep:`3118` related function: + + * :c:func:`PyMemoryView_FromMemory` + +* :pep:`393` added new Unicode types, macros and functions: + + * High-level API: + + * :c:func:`PyUnicode_CopyCharacters` + * :c:func:`PyUnicode_FindChar` + * :c:func:`PyUnicode_GetLength`, :c:macro:`PyUnicode_GET_LENGTH` + * :c:func:`PyUnicode_New` + * :c:func:`PyUnicode_Substring` + * :c:func:`PyUnicode_ReadChar`, :c:func:`PyUnicode_WriteChar` + + * Low-level API: + + * :c:type:`Py_UCS1`, :c:type:`Py_UCS2`, :c:type:`Py_UCS4` types + * :c:type:`PyASCIIObject` and :c:type:`PyCompactUnicodeObject` structures + * :c:macro:`PyUnicode_READY` + * :c:func:`PyUnicode_FromKindAndData` + * :c:func:`PyUnicode_AsUCS4`, :c:func:`PyUnicode_AsUCS4Copy` + * :c:macro:`PyUnicode_DATA`, :c:macro:`PyUnicode_1BYTE_DATA`, + :c:macro:`PyUnicode_2BYTE_DATA`, :c:macro:`PyUnicode_4BYTE_DATA` + * :c:macro:`PyUnicode_KIND` with :c:type:`PyUnicode_Kind` enum: + :c:data:`PyUnicode_WCHAR_KIND`, :c:data:`PyUnicode_1BYTE_KIND`, + :c:data:`PyUnicode_2BYTE_KIND`, :c:data:`PyUnicode_4BYTE_KIND` + * :c:macro:`PyUnicode_READ`, :c:macro:`PyUnicode_READ_CHAR`, :c:macro:`PyUnicode_WRITE` + * :c:macro:`PyUnicode_MAX_CHAR_VALUE` + +* :c:macro:`PyArg_ParseTuple` now accepts a :class:`bytearray` for the ``c`` + format (:issue:`12380`). + + + +Deprecated +========== + +Unsupported Operating Systems +----------------------------- + +OS/2 and VMS are no longer supported due to the lack of a maintainer. + +Windows 2000 and Windows platforms which set ``COMSPEC`` to ``command.com`` +are no longer supported due to maintenance burden. + +OSF support, which was deprecated in 3.2, has been completely removed. + + +Deprecated Python modules, functions and methods +------------------------------------------------ + +* Passing a non-empty string to ``object.__format__()`` is deprecated, and + will produce a :exc:`TypeError` in Python 3.4 (:issue:`9856`). +* The ``unicode_internal`` codec has been deprecated because of the + :pep:`393`, use UTF-8, UTF-16 (``utf-16-le`` or ``utf-16-be``), or UTF-32 + (``utf-32-le`` or ``utf-32-be``) +* :meth:`ftplib.FTP.nlst` and :meth:`ftplib.FTP.dir`: use + :meth:`ftplib.FTP.mlsd` +* :func:`platform.popen`: use the :mod:`subprocess` module. Check especially + the :ref:`subprocess-replacements` section (:issue:`11377`). +* :issue:`13374`: The Windows bytes API has been deprecated in the :mod:`os` + module. Use Unicode filenames, instead of bytes filenames, to not depend on + the ANSI code page anymore and to support any filename. +* :issue:`13988`: The :mod:`xml.etree.cElementTree` module is deprecated. The + accelerator is used automatically whenever available. +* The behaviour of :func:`time.clock` depends on the platform: use the new + :func:`time.perf_counter` or :func:`time.process_time` function instead, + depending on your requirements, to have a well defined behaviour. +* The :func:`os.stat_float_times` function is deprecated. +* :mod:`abc` module: + + * :class:`abc.abstractproperty` has been deprecated, use :class:`property` + with :func:`abc.abstractmethod` instead. + * :class:`abc.abstractclassmethod` has been deprecated, use + :class:`classmethod` with :func:`abc.abstractmethod` instead. + * :class:`abc.abstractstaticmethod` has been deprecated, use + :class:`staticmethod` with :func:`abc.abstractmethod` instead. + +* :mod:`importlib` package: + + * :meth:`importlib.abc.SourceLoader.path_mtime` is now deprecated in favour of + :meth:`importlib.abc.SourceLoader.path_stats` as bytecode files now store + both the modification time and size of the source file the bytecode file was + compiled from. + + + + + +Deprecated functions and types of the C API +------------------------------------------- + +The :c:type:`Py_UNICODE` has been deprecated by :pep:`393` and will be +removed in Python 4. All functions using this type are deprecated: + +Unicode functions and methods using :c:type:`Py_UNICODE` and +:c:type:`Py_UNICODE*` types: + +* :c:macro:`PyUnicode_FromUnicode`: use :c:func:`PyUnicode_FromWideChar` or + :c:func:`PyUnicode_FromKindAndData` +* :c:macro:`PyUnicode_AS_UNICODE`, :c:func:`PyUnicode_AsUnicode`, + :c:func:`PyUnicode_AsUnicodeAndSize`: use :c:func:`PyUnicode_AsWideCharString` +* :c:macro:`PyUnicode_AS_DATA`: use :c:macro:`PyUnicode_DATA` with + :c:macro:`PyUnicode_READ` and :c:macro:`PyUnicode_WRITE` +* :c:macro:`PyUnicode_GET_SIZE`, :c:func:`PyUnicode_GetSize`: use + :c:macro:`PyUnicode_GET_LENGTH` or :c:func:`PyUnicode_GetLength` +* :c:macro:`PyUnicode_GET_DATA_SIZE`: use + ``PyUnicode_GET_LENGTH(str) * PyUnicode_KIND(str)`` (only work on ready + strings) +* :c:func:`PyUnicode_AsUnicodeCopy`: use :c:func:`PyUnicode_AsUCS4Copy` or + :c:func:`PyUnicode_AsWideCharString` +* :c:func:`PyUnicode_GetMax` + + +Functions and macros manipulating Py_UNICODE* strings: + +* :c:macro:`Py_UNICODE_strlen`: use :c:func:`PyUnicode_GetLength` or + :c:macro:`PyUnicode_GET_LENGTH` +* :c:macro:`Py_UNICODE_strcat`: use :c:func:`PyUnicode_CopyCharacters` or + :c:func:`PyUnicode_FromFormat` +* :c:macro:`Py_UNICODE_strcpy`, :c:macro:`Py_UNICODE_strncpy`, + :c:macro:`Py_UNICODE_COPY`: use :c:func:`PyUnicode_CopyCharacters` or + :c:func:`PyUnicode_Substring` +* :c:macro:`Py_UNICODE_strcmp`: use :c:func:`PyUnicode_Compare` +* :c:macro:`Py_UNICODE_strncmp`: use :c:func:`PyUnicode_Tailmatch` +* :c:macro:`Py_UNICODE_strchr`, :c:macro:`Py_UNICODE_strrchr`: use + :c:func:`PyUnicode_FindChar` +* :c:macro:`Py_UNICODE_FILL`: use :c:func:`PyUnicode_Fill` +* :c:macro:`Py_UNICODE_MATCH` + +Encoders: + +* :c:func:`PyUnicode_Encode`: use :c:func:`PyUnicode_AsEncodedObject` +* :c:func:`PyUnicode_EncodeUTF7` +* :c:func:`PyUnicode_EncodeUTF8`: use :c:func:`PyUnicode_AsUTF8` or + :c:func:`PyUnicode_AsUTF8String` +* :c:func:`PyUnicode_EncodeUTF32` +* :c:func:`PyUnicode_EncodeUTF16` +* :c:func:`PyUnicode_EncodeUnicodeEscape:` use + :c:func:`PyUnicode_AsUnicodeEscapeString` +* :c:func:`PyUnicode_EncodeRawUnicodeEscape:` use + :c:func:`PyUnicode_AsRawUnicodeEscapeString` +* :c:func:`PyUnicode_EncodeLatin1`: use :c:func:`PyUnicode_AsLatin1String` +* :c:func:`PyUnicode_EncodeASCII`: use :c:func:`PyUnicode_AsASCIIString` +* :c:func:`PyUnicode_EncodeCharmap` +* :c:func:`PyUnicode_TranslateCharmap` +* :c:func:`PyUnicode_EncodeMBCS`: use :c:func:`PyUnicode_AsMBCSString` or + :c:func:`PyUnicode_EncodeCodePage` (with ``CP_ACP`` code_page) +* :c:func:`PyUnicode_EncodeDecimal`, + :c:func:`PyUnicode_TransformDecimalToASCII` + + +Deprecated features +------------------- + +The :mod:`array` module's ``'u'`` format code is now deprecated and will be +removed in Python 4 together with the rest of the (:c:type:`Py_UNICODE`) API. + + +Porting to Python 3.3 +===================== + +This section lists previously described changes and other bugfixes +that may require changes to your code. + +.. _portingpythoncode: + +Porting Python code +------------------- + +* Hash randomization is enabled by default. Set the :envvar:`PYTHONHASHSEED` + environment variable to ``0`` to disable hash randomization. See also the + :meth:`object.__hash__` method. + +* :issue:`12326`: On Linux, sys.platform doesn't contain the major version + anymore. It is now always 'linux', instead of 'linux2' or 'linux3' depending + on the Linux version used to build Python. Replace sys.platform == 'linux2' + with sys.platform.startswith('linux'), or directly sys.platform == 'linux' if + you don't need to support older Python versions. + +* :issue:`13847`, :issue:`14180`: :mod:`time` and :mod:`datetime`: + :exc:`OverflowError` is now raised instead of :exc:`ValueError` if a + timestamp is out of range. :exc:`OSError` is now raised if C functions + :c:func:`gmtime` or :c:func:`localtime` failed. + +* The default finders used by import now utilize a cache of what is contained + within a specific directory. If you create a Python source file or sourceless + bytecode file, make sure to call :func:`importlib.invalidate_caches` to clear + out the cache for the finders to notice the new file. + +* :exc:`ImportError` now uses the full name of the module that was attemped to + be imported. Doctests that check ImportErrors' message will need to be + updated to use the full name of the module instead of just the tail of the + name. + +* The *index* argument to :func:`__import__` now defaults to 0 instead of -1 + and no longer support negative values. It was an oversight when :pep:`328` was + implemented that the default value remained -1. If you need to continue to + perform a relative import followed by an absolute import, then perform the + relative import using an index of 1, followed by another import using an + index of 0. It is preferred, though, that you use + :func:`importlib.import_module` rather than call :func:`__import__` directly. + +* :func:`__import__` no longer allows one to use an index value other than 0 + for top-level modules. E.g. ``__import__('sys', level=1)`` is now an error. + +* Because :attr:`sys.meta_path` and :attr:`sys.path_hooks` now have finders on + them by default, you will most likely want to use :meth:`list.insert` instead + of :meth:`list.append` to add to those lists. + +* Because ``None`` is now inserted into :attr:`sys.path_importer_cache`, if you + are clearing out entries in the dictionary of paths that do not have a + finder, you will need to remove keys paired with values of ``None`` **and** + :class:`imp.NullImporter` to be backwards-compatible. This will lead to extra + overhead on older versions of Python that re-insert ``None`` into + :attr:`sys.path_importer_cache` where it repesents the use of implicit + finders, but semantically it should not change anything. + +* :class:`importlib.abc.Finder` no longer specifies a `find_module()` abstract + method that must be implemented. If you were relying on subclasses to + implement that method, make sure to check for the method's existence first. + You will probably want to check for `find_loader()` first, though, in the + case of working with :term:`path entry finders <path entry finder>`. + +* :mod:`pkgutil` has been converted to use :mod:`importlib` internally. This + eliminates many edge cases where the old behaviour of the PEP 302 import + emulation failed to match the behaviour of the real import system. The + import emulation itself is still present, but is now deprecated. The + :func:`pkgutil.iter_importers` and :func:`pkgutil.walk_packages` functions + special case the standard import hooks so they are still supported even + though they do not provide the non-standard ``iter_modules()`` method. + +* A longstanding RFC-compliance bug (:issue:`1079`) in the parsing done by + :func:`email.header.decode_header` has been fixed. Code that uses the + standard idiom to convert encoded headers into unicode + (``str(make_header(decode_header(h))``) will see no change, but code that + looks at the individual tuples returned by decode_header will see that + whitespace that precedes or follows ``ASCII`` sections is now included in the + ``ASCII`` section. Code that builds headers using ``make_header`` should + also continue to work without change, since ``make_header`` continues to add + whitespace between ``ASCII`` and non-``ASCII`` sections if it is not already + present in the input strings. + +* :func:`email.utils.formataddr` now does the correct content transfer + encoding when passed non-``ASCII`` display names. Any code that depended on + the previous buggy behavior that preserved the non-``ASCII`` unicode in the + formatted output string will need to be changed (:issue:`1690608`). + +* :meth:`poplib.POP3.quit` may now raise protocol errors like all other + ``poplib`` methods. Code that assumes ``quit`` does not raise + :exc:`poplib.error_proto` errors may need to be changed if errors on ``quit`` + are encountered by a particular application (:issue:`11291`). + +* The ``strict`` argument to :class:`email.parser.Parser`, deprecated since + Python 2.4, has finally been removed. + +* The deprecated method ``unittest.TestCase.assertSameElements`` has been + removed. + +* The deprecated variable ``time.accept2dyear`` has been removed. + +* The deprecated ``Context._clamp`` attribute has been removed from the + :mod:`decimal` module. It was previously replaced by the public attribute + :attr:`~decimal.Context.clamp`. (See :issue:`8540`.) + +* The undocumented internal helper class ``SSLFakeFile`` has been removed + from :mod:`smtplib`, since its functionality has long been provided directly + by :meth:`socket.socket.makefile`. + +* Passing a negative value to :func:`time.sleep` on Windows now raises an + error instead of sleeping forever. It has always raised an error on posix. + +* The ``ast.__version__`` constant has been removed. If you need to + make decisions affected by the AST version, use :attr:`sys.version_info` + to make the decision. + +* Code that used to work around the fact that the :mod:`threading` module used + factory functions by subclassing the private classes will need to change to + subclass the now-public classes. + +* The undocumented debugging machinery in the threading module has been + removed, simplifying the code. This should have no effect on production + code, but is mentioned here in case any application debug frameworks were + interacting with it (:issue:`13550`). + + +Porting C code +-------------- + +* In the course of changes to the buffer API the undocumented + :c:member:`~Py_buffer.smalltable` member of the + :c:type:`Py_buffer` structure has been removed and the + layout of the :c:type:`PyMemoryViewObject` has changed. + + All extensions relying on the relevant parts in ``memoryobject.h`` + or ``object.h`` must be rebuilt. + +* Due to :ref:`PEP 393 <pep-393>`, the :c:type:`Py_UNICODE` type and all + functions using this type are deprecated (but will stay available for + at least five years). If you were using low-level Unicode APIs to + construct and access unicode objects and you want to benefit of the + memory footprint reduction provided by PEP 393, you have to convert + your code to the new :doc:`Unicode API <../c-api/unicode>`. + + However, if you only have been using high-level functions such as + :c:func:`PyUnicode_Concat()`, :c:func:`PyUnicode_Join` or + :c:func:`PyUnicode_FromFormat()`, your code will automatically take + advantage of the new unicode representations. + +* :c:func:`PyImport_GetMagicNumber` now returns -1 upon failure. + +* As a negative value for the *level* argument to :func:`__import__` is no + longer valid, the same now holds for :c:func:`PyImport_ImportModuleLevel`. + This also means that the value of *level* used by + :c:func:`PyImport_ImportModuleEx` is now 0 instead of -1. + + +Building C extensions +--------------------- + +* The range of possible file names for C extensions has been narrowed. + Very rarely used spellings have been suppressed: under POSIX, files + named ``xxxmodule.so``, ``xxxmodule.abi3.so`` and + ``xxxmodule.cpython-*.so`` are no longer recognized as implementing + the ``xxx`` module. If you had been generating such files, you have + to switch to the other spellings (i.e., remove the ``module`` string + from the file names). + + (implemented in :issue:`14040`.) + + +Command Line Switch Changes +--------------------------- + +* The -Q command-line flag and related artifacts have been removed. Code + checking sys.flags.division_warning will need updating. + + (:issue:`10998`, contributed by Éric Araujo.) + +* When :program:`python` is started with :option:`-S`, ``import site`` + will no longer add site-specific paths to the module search paths. In + previous versions, it did. + + (:issue:`11591`, contributed by Carl Meyer with editions by Éric Araujo.) diff --git a/Doc/whatsnew/changelog.rst b/Doc/whatsnew/changelog.rst new file mode 100644 index 0000000..57e2dab --- /dev/null +++ b/Doc/whatsnew/changelog.rst @@ -0,0 +1,6 @@ ++++++++++ +Changelog ++++++++++ + +.. miscnews:: ../../Misc/NEWS + diff --git a/Doc/whatsnew/index.rst b/Doc/whatsnew/index.rst index 8220bd2..bc1206b 100644 --- a/Doc/whatsnew/index.rst +++ b/Doc/whatsnew/index.rst @@ -11,6 +11,7 @@ anyone wishing to stay up-to-date after a new release. .. toctree:: :maxdepth: 2 + 3.3.rst 3.2.rst 3.1.rst 3.0.rst @@ -22,3 +23,11 @@ anyone wishing to stay up-to-date after a new release. 2.2.rst 2.1.rst 2.0.rst + +The "Changelog" is a HTML version of the file :source:`Misc/NEWS` which +contains *all* nontrivial changes to Python for the current version. + +.. toctree:: + :maxdepth: 2 + + changelog.rst |