author: R David Murray <rdmurray@bitdance.com> 2011-04-18 17:59:37 (GMT)
committer: R David Murray <rdmurray@bitdance.com> 2011-04-18 17:59:37 (GMT)
commit: 3edd22ac950d3a2bcc1ad2e5a83554970aef3369
tree: b4661afc1be45e0d072c1c83ab354b2362f05afb /Doc/library
parent: ce16be91dc68597b0c5bfc7b4b1c5136fe5697a6
#11731: simplify/enhance parser/generator API by introducing policy objects.
This new interface will also allow for future planned enhancements in control over the parser/generator without requiring any additional complexity in the parser/generator API. Patch reviewed by Éric Araujo and Barry Warsaw.
Diffstat (limited to 'Doc/library')
-rw-r--r-- Doc/library/email.generator.rst | 55
-rw-r--r-- Doc/library/email.parser.rst | 54
-rw-r--r-- Doc/library/email.policy.rst | 179
3 files changed, 256 insertions, 32 deletions
diff --git a/Doc/library/email.generator.rst b/Doc/library/email.generator.rst
index 85b32fe..847d7e4 100644
--- a/Doc/library/email.generator.rst
+++ b/Doc/library/email.generator.rst
@@ -32,7 +32,8 @@ Here are the public methods of the :class:`Generator` class, imported from the
:mod:`email.generator` module:
-.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78)
+.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78, *, \
+ policy=policy.default)
The constructor for the :class:`Generator` class takes a :term:`file-like object`
called *outfp* for an argument. *outfp* must support the :meth:`write` method
@@ -53,10 +54,16 @@ Here are the public methods of the :class:`Generator` class, imported from the
:class:`~email.header.Header` class. Set to zero to disable header wrapping.
The default is 78, as recommended (but not required) by :rfc:`2822`.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the generator's operation. The default policy
+ maintains backward compatibility.
+
+ .. versionchanged:: 3.3 Added the *policy* keyword.
+
The other public :class:`Generator` methods are:
- .. method:: flatten(msg, unixfrom=False, linesep='\\n')
+ .. method:: flatten(msg, unixfrom=False, linesep=None)
Print the textual representation of the message object structure rooted at
*msg* to the output file specified when the :class:`Generator` instance
@@ -72,12 +79,13 @@ Here are the public methods of the :class:`Generator` class, imported from the
Note that for subparts, no envelope header is ever printed.
Optional *linesep* specifies the line separator character used to
- terminate lines in the output. It defaults to ``\n`` because that is
- the most useful value for Python application code (other library packages
- expect ``\n`` separated lines). ``linesep=\r\n`` can be used to
- generate output with RFC-compliant line separators.
+ terminate lines in the output. If specified it overrides the value
+ specified by the ``Generator``\'s ``policy``.
- Messages parsed with a Bytes parser that have a
+ Because strings cannot represent non-ASCII bytes, ``Generator`` ignores
+ the value of the :attr:`~email.policy.Policy.must_be_7bit`
+ :mod:`~email.policy` setting and operates as if it were set ``True``.
+ This means that messages parsed with a Bytes parser that have a
:mailheader:`Content-Transfer-Encoding` of 8bit will be converted to
use a 7bit Content-Transfer-Encoding. Non-ASCII bytes in the headers
will be :rfc:`2047` encoded with a charset of `unknown-8bit`.
@@ -103,7 +111,8 @@ As a convenience, see the :class:`~email.message.Message` methods
formatted string representation of a message object. For more detail, see
:mod:`email.message`.
-.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78)
+.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78, *, \
+ policy=policy.default)
The constructor for the :class:`BytesGenerator` class takes a binary
:term:`file-like object` called *outfp* for an argument. *outfp* must
@@ -125,19 +134,31 @@ formatted string representation of a message object. For more detail, see
wrapping. The default is 78, as recommended (but not required) by
:rfc:`2822`.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the generator's operation. The default policy
+ maintains backward compatibility.
+
+ .. versionchanged:: 3.3 Added the *policy* keyword.
+
The other public :class:`BytesGenerator` methods are:
- .. method:: flatten(msg, unixfrom=False, linesep='\n')
+ .. method:: flatten(msg, unixfrom=False, linesep=None)
Print the textual representation of the message object structure rooted
at *msg* to the output file specified when the :class:`BytesGenerator`
instance was created. Subparts are visited depth-first and the resulting
- text will be properly MIME encoded. If the input that created the *msg*
- contained bytes with the high bit set and those bytes have not been
- modified, they will be copied faithfully to the output, even if doing so
- is not strictly RFC compliant. (To produce strictly RFC compliant
- output, use the :class:`Generator` class.)
+ text will be properly MIME encoded. If the :mod:`~email.policy` option
+ :attr:`~email.policy.Policy.must_be_7bit` is ``False`` (the default),
+ then any bytes with the high bit set in the original parsed message that
+ have not been modified will be copied faithfully to the output. If
+ ``must_be_7bit`` is true, the bytes will be converted as needed using an
+ ASCII content-transfer-encoding. In particular, RFC-invalid non-ASCII
+ bytes in headers will be encoded using the MIME ``unknown-8bit``
+ character set, thus rendering them RFC-compliant.
+
+ .. XXX: There should be a complementary option that just does the RFC
+ compliance transformation but leaves CTE 8bit parts alone.
Messages parsed with a Bytes parser that have a
:mailheader:`Content-Transfer-Encoding` of 8bit will be reconstructed
@@ -152,10 +173,8 @@ formatted string representation of a message object. For more detail, see
Note that for subparts, no envelope header is ever printed.
Optional *linesep* specifies the line separator character used to
- terminate lines in the output. It defaults to ``\n`` because that is
- the most useful value for Python application code (other library packages
- expect ``\n`` separated lines). ``linesep=\r\n`` can be used to
- generate output with RFC-compliant line separators.
+ terminate lines in the output. If specified it overrides the value
+ specified by the ``Generator``\ 's ``policy``.
.. method:: clone(fp)
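
The interaction between the *policy* keyword and the *linesep* argument described in the hunks above can be sketched as follows. This is a minimal illustration, assuming a current Python 3 standard library in which ``email.policy.SMTP`` exists and ``Generator`` accepts ``policy`` as a keyword; the sample message is invented for the example.

```python
import io
from email import message_from_string, policy
from email.generator import Generator

# A tiny, made-up message parsed with the library's default settings.
msg = message_from_string("Subject: hello\n\nbody line\n")

# The line separator comes from the policy given to the constructor.
buf = io.StringIO()
Generator(buf, policy=policy.SMTP).flatten(msg)
smtp_text = buf.getvalue()

# An explicit linesep argument to flatten() overrides the policy's value.
buf = io.StringIO()
Generator(buf, policy=policy.SMTP).flatten(msg, linesep="\n")
lf_text = buf.getvalue()

print(repr(smtp_text))
print(repr(lf_text))
```

The first call should produce CRLF-terminated lines, while the second should fall back to bare ``\n`` even though the generator was built with the SMTP policy.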
diff --git a/Doc/library/email.parser.rst b/Doc/library/email.parser.rst
index c72d3d4..c5e43a9 100644
--- a/Doc/library/email.parser.rst
+++ b/Doc/library/email.parser.rst
@@ -58,12 +58,18 @@ list of defects that it can find.
Here is the API for the :class:`FeedParser`:
-.. class:: FeedParser(_factory=email.message.Message)
+.. class:: FeedParser(_factory=email.message.Message, *, policy=policy.default)
Create a :class:`FeedParser` instance. Optional *_factory* is a no-argument
callable that will be called whenever a new message object is needed. It
defaults to the :class:`email.message.Message` class.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the parser's operation. The default policy maintains
+ backward compatibility.
+
+ .. versionchanged:: 3.3 Added the *policy* keyword.
+
.. method:: feed(data)
Feed the :class:`FeedParser` some more data. *data* should be a string
@@ -104,7 +110,7 @@ have the same API as the :class:`Parser` and :class:`BytesParser` classes.
.. versionadded:: 3.3 BytesHeaderParser
-.. class:: Parser(_class=email.message.Message)
+.. class:: Parser(_class=email.message.Message, *, policy=policy.default)
The constructor for the :class:`Parser` class takes an optional argument
*_class*. This must be a callable factory (such as a function or a class), and
@@ -112,8 +118,13 @@ have the same API as the :class:`Parser` and :class:`BytesParser` classes.
:class:`~email.message.Message` (see :mod:`email.message`). The factory will
be called without arguments.
- .. versionchanged:: 3.2
- Removed the *strict* argument that was deprecated in 2.4.
+ The *policy* keyword specifies a :mod:`~email.policy` object that controls a
+ number of aspects of the parser's operation. The default policy maintains
+ backward compatibility.
+
+ .. versionchanged:: 3.3
+ Removed the *strict* argument that was deprecated in 2.4. Added the
+ *policy* keyword.
The other public :class:`Parser` methods are:
@@ -144,12 +155,18 @@ have the same API as the :class:`Parser` and :class:`BytesParser` classes.
the entire contents of the file.
-.. class:: BytesParser(_class=email.message.Message, strict=None)
+.. class:: BytesParser(_class=email.message.Message, *, policy=policy.default)
This class is exactly parallel to :class:`Parser`, but handles bytes input.
The *_class* and *strict* arguments are interpreted in the same way as for
- the :class:`Parser` constructor. *strict* is supported only to make porting
- code easier; it is deprecated.
+ the :class:`Parser` constructor.
+
+ The *policy* keyword specifies a :mod:`~email.policy` object that
+ controls a number of aspects of the parser's operation. The default
+ policy maintains backward compatibility.
+
+ .. versionchanged:: 3.3
+ Removed the *strict* argument. Added the *policy* keyword.
.. method:: parse(fp, headeronly=False)
@@ -187,12 +204,15 @@ in the top-level :mod:`email` package namespace.
.. currentmodule:: email
-.. function:: message_from_string(s, _class=email.message.Message, strict=None)
+.. function:: message_from_string(s, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure from a string. This is exactly equivalent to
- ``Parser().parsestr(s)``. Optional *_class* and *strict* are interpreted as
+ ``Parser().parsestr(s)``. *_class* and *policy* are interpreted as
with the :class:`Parser` class constructor.
+ .. versionchanged:: 3.3 removed *strict*, added *policy*
+
.. function:: message_from_bytes(s, _class=email.message.Message, strict=None)
Return a message object structure from a byte string. This is exactly
@@ -200,21 +220,27 @@ in the top-level :mod:`email` package namespace.
*strict* are interpreted as with the :class:`Parser` class constructor.
.. versionadded:: 3.2
+ .. versionchanged:: 3.3 removed *strict*, added *policy*
-.. function:: message_from_file(fp, _class=email.message.Message, strict=None)
+.. function:: message_from_file(fp, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure tree from an open :term:`file object`.
- This is exactly equivalent to ``Parser().parse(fp)``. Optional *_class*
- and *strict* are interpreted as with the :class:`Parser` class constructor.
+ This is exactly equivalent to ``Parser().parse(fp)``. *_class*
+ and *policy* are interpreted as with the :class:`Parser` class constructor.
+
+ .. versionchanged:: 3.3 removed *strict*, added *policy*
-.. function:: message_from_binary_file(fp, _class=email.message.Message, strict=None)
+.. function:: message_from_binary_file(fp, _class=email.message.Message, *, \
+ policy=policy.default)
Return a message object structure tree from an open binary :term:`file
object`. This is exactly equivalent to ``BytesParser().parse(fp)``.
- Optional *_class* and *strict* are interpreted as with the :class:`Parser`
+ *_class* and *policy* are interpreted as with the :class:`Parser`
class constructor.
.. versionadded:: 3.2
+ .. versionchanged:: 3.3 removed *strict*, added *policy*
Here's an example of how you might use this at an interactive Python prompt::
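
The hunk ends before the interactive example that this context line introduces. Separately, as a hedged sketch of the *policy* keyword added to the parser functions above, the following compares lenient and strict parsing of a deliberately defective message. It assumes a current Python 3 standard library; the message text is invented, and the exact set of defects reported may differ between versions.

```python
from email import errors, message_from_string, policy

# Declares a multipart boundary that never appears in the body.
text = (
    "Content-Type: multipart/mixed; boundary=XYZ\n"
    "\n"
    "no boundary lines here\n"
)

# Lenient parsing: defects are recorded on the message object.
lenient = message_from_string(text, policy=policy.default)
defect_names = [type(d).__name__ for d in lenient.defects]

# Strict parsing: the first defect is raised as an exception instead.
try:
    message_from_string(text, policy=policy.strict)
    raised = False
except errors.MessageDefect:
    raised = True

print(defect_names, raised)
```

Choosing the policy at the parsing boundary keeps the rest of the program unchanged: the same call either collects problems for later inspection or fails fast.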
diff --git a/Doc/library/email.policy.rst b/Doc/library/email.policy.rst
new file mode 100644
index 0000000..f7eb471
--- /dev/null
+++ b/Doc/library/email.policy.rst
@@ -0,0 +1,179 @@
+:mod:`email`: Policy Objects
+----------------------------
+
+.. module:: email.policy
+ :synopsis: Controlling the parsing and generating of messages
+
+
+The :mod:`email` package's prime focus is the handling of email messages as
+described by the various email and MIME RFCs. However, the general format of
+email messages (a block of header fields each consisting of a name followed by
+a colon followed by a value, the whole block followed by a blank line and an
+arbitrary 'body'), is a format that has found utility outside of the realm of
+email. Some of these uses conform fairly closely to the main RFCs, some do
+not. And even when working with email, there are times when it is desirable to
+break strict compliance with the RFCs.
+
+Policy objects are the mechanism used to provide the email package with the
+flexibility to handle all these disparate use cases,
+
+A :class:`Policy` object encapsulates a set of attributes and methods that
+control the behavior of various components of the email package during use.
+:class:`Policy` instances can be passed to various classes and methods in the
+email package to alter the default behavior. The settable values and their
+defaults are described below. The :mod:`policy` module also provides some
+pre-created :class:`Policy` instances. In addition to a :const:`default`
+instance, there are instances tailored for certain applications. For example
+there is an :const:`SMTP` :class:`Policy` with defaults appropriate for
+generating output to be sent to an SMTP server. These are listed :ref:`below
+<Policy Instances>`.
+
+In general an application will only need to deal with setting the policy at the
+input and output boundaries. Once parsed, a message is represented by a
+:class:`~email.message.Message` object, which is designed to be independent of
+the format that the message has "on the wire" when it is received, transmitted,
+or displayed. Thus, a :class:`Policy` can be specified when parsing a message
+to create a :class:`~email.message.Message`, and again when turning the
+:class:`~email.message.Message` into some other representation. While often a
+program will use the same :class:`Policy` for both input and output, the two
+can be different.
+
+As an example, the following code could be used to read an email message from a
+file on disk and pass it to the system ``sendmail`` program on a ``unix``
+system::
+
+ >>> from email import message_from_binary_file
+ >>> from email.generator import BytesGenerator
+ >>> import email.policy
+ >>> from subprocess import Popen, PIPE
+ >>> with open('mymsg.txt', 'rb') as f:
+ ...     msg = message_from_binary_file(f, policy=email.policy.default)
+ >>> p = Popen(['sendmail', msg['To'][0].address], stdin=PIPE)
+ >>> g = BytesGenerator(p.stdin, policy=email.policy.SMTP)
+ >>> g.flatten(msg)
+ >>> p.stdin.close()
+ >>> rc = p.wait()
+
+Some email package methods accept a *policy* keyword argument, allowing the
+policy to be overridden for that method. For example, the following code uses
+the :meth:`email.message.Message.as_string` method of the *msg* object from the
+previous example and re-writes it to a file using the native line separators for
+the platform on which it is running::
+
+ >>> import os
+ >>> mypolicy = email.policy.Policy(linesep=os.linesep)
+ >>> with open('converted.txt', 'w', newline='') as f:
+ ... f.write(msg.as_string(policy=mypolicy))
+
+Policy instances are immutable, but they can be cloned, accepting the same
+keyword arguments as the class constructor and returning a new :class:`Policy`
+instance that is a copy of the original but with the specified attribute
+values changed. For example, the following creates an SMTP policy that will
+raise any defects detected as errors::
+
+ >>> strict_SMTP = email.policy.SMTP.clone(raise_on_defect=True)
+
+Policy objects can also be combined using the addition operator, producing a
+policy object whose settings are a combination of the non-default values of the
+summed objects::
+
+ >>> strict_SMTP = email.policy.SMTP + email.policy.strict
+
+This operation is not commutative; that is, the order in which the objects are
+added matters. To illustrate::
+
+ >>> Policy = email.policy.Policy
+ >>> apolicy = Policy(max_line_length=100) + Policy(max_line_length=80)
+ >>> apolicy.max_line_length
+ 80
+ >>> apolicy = Policy(max_line_length=80) + Policy(max_line_length=100)
+ >>> apolicy.max_line_length
+ 100
+
+
+.. class:: Policy(**kw)
+
+ The valid constructor keyword arguments are any of the attributes listed
+ below.
+
+ .. attribute:: max_line_length
+
+ The maximum length of any line in the serialized output, not counting the
+ end of line character(s). Default is 78, per :rfc:`5322`. A value of
+ ``0`` or :const:`None` indicates that no line wrapping should be
+ done at all.
+
+ .. attribute:: linesep
+
+ The string to be used to terminate lines in serialized output. The
+ default is '\\n' because that's the internal end-of-line discipline used
+ by Python, though '\\r\\n' is required by the RFCs. See `Policy
+ Instances`_ for policies that use an RFC conformant linesep. Setting it
+ to :attr:`os.linesep` may also be useful.
+
+ .. attribute:: must_be_7bit
+
+ If :const:`True`, data output by a bytes generator is limited to ASCII
+ characters. If :const:`False` (the default), then bytes with the high
+ bit set are preserved and/or allowed in certain contexts (for example,
+ where possible a content transfer encoding of ``8bit`` will be used).
+ String generators act as if ``must_be_7bit`` is ``True`` regardless of the
+ policy in effect, since a string cannot represent non-ASCII bytes.
+
+ .. attribute:: raise_on_defect
+
+ If :const:`True`, any defects encountered will be raised as errors. If
+ :const:`False` (the default), defects will be passed to the
+ :meth:`register_defect` method.
+
+ .. method:: handle_defect(obj, defect)
+
+ *obj* is the object on which to register the defect. *defect* should be
+ an instance of a subclass of :class:`~email.errors.Defect`.
+ If :attr:`raise_on_defect`
+ is ``True`` the defect is raised as an exception. Otherwise *obj* and
+ *defect* are passed to :meth:`register_defect`. This method is intended
+ to be called by parsers when they encounter defects, and will not be
+ called by code that uses the email library unless that code is
+ implementing an alternate parser.
+
+ .. method:: register_defect(obj, defect)
+
+ *obj* is the object on which to register the defect. *defect* should be
+ an instance of a subclass of :class:`~email.errors.Defect`. This method is part of the
+ public API so that custom ``Policy`` subclasses can implement alternate
+ handling of defects. The default implementation calls the ``append``
+ method of the ``defects`` attribute of *obj*.
+
+ .. method:: clone(**kw)
+
+ Return a new :class:`Policy` instance whose attributes have the same
+ values as the current instance, except where those attributes are
+ given new values by the keyword arguments.
+
+
+Policy Instances
+................
+
+The following instances of :class:`Policy` provide defaults suitable for
+specific common application domains.
+
+.. data:: default
+
+ An instance of :class:`Policy` with all defaults unchanged.
+
+.. data:: SMTP
+
+ Output serialized from a message will conform to the email and SMTP
+ RFCs. The only changed attribute is :attr:`linesep`, which is set to
+ ``\r\n``.
+
+.. data:: HTTP
+
+ Suitable for use when serializing headers for use in HTTP traffic.
+ :attr:`linesep` is set to ``\r\n``, and :attr:`max_line_length` is set to
+ :const:`None` (unlimited).
+
+.. data:: strict
+
+ :attr:`raise_on_defect` is set to :const:`True`.
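
The cloning, addition, and immutability rules documented in the new file can be exercised with a short sketch. It assumes a current Python 3 standard library, where ``email.policy.Policy`` is an abstract base class and cannot be instantiated directly, so the sketch clones ``policy.default`` instead of calling ``Policy(...)`` as the 2011 text does. The variable names are illustrative only.

```python
from email import policy

# clone() returns a new policy; the original is left untouched.
strict_smtp = policy.SMTP.clone(raise_on_defect=True)

# Addition combines explicitly set values; the right operand wins on conflict.
combined = policy.SMTP + policy.strict
wide = policy.default.clone(max_line_length=100)
narrow = policy.default.clone(max_line_length=80)
wide_then_narrow = wide + narrow
narrow_then_wide = narrow + wide

# Policies are immutable: attribute assignment is rejected.
try:
    policy.SMTP.linesep = "\n"
    assignment_rejected = False
except AttributeError:
    assignment_rejected = True

print(wide_then_narrow.max_line_length, narrow_then_wide.max_line_length)
```

Because addition is not commutative, it is usually clearer to put the more specific policy on the right-hand side.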