diff options
Diffstat (limited to 'Doc/library/email.generator.rst')
-rw-r--r-- | Doc/library/email.generator.rst | 376 |
1 files changed, 208 insertions, 168 deletions
diff --git a/Doc/library/email.generator.rst b/Doc/library/email.generator.rst index d596ed8..ab0fbc2 100644 --- a/Doc/library/email.generator.rst +++ b/Doc/library/email.generator.rst @@ -8,210 +8,244 @@ -------------- -One of the most common tasks is to generate the flat text of the email message -represented by a message object structure. You will need to do this if you want -to send your message via the :mod:`smtplib` module or the :mod:`nntplib` module, -or print the message on the console. Taking a message object structure and -producing a flat text document is the job of the :class:`Generator` class. - -Again, as with the :mod:`email.parser` module, you aren't limited to the -functionality of the bundled generator; you could write one from scratch -yourself. However the bundled generator knows how to generate most email in a -standards-compliant way, should handle MIME and non-MIME email messages just -fine, and is designed so that the transformation from flat text, to a message -structure via the :class:`~email.parser.Parser` class, and back to flat text, -is idempotent (the input is identical to the output) [#]_. On the other hand, -using the Generator on a :class:`~email.message.Message` constructed by program -may result in changes to the :class:`~email.message.Message` object as defaults -are filled in. - -:class:`bytes` output can be generated using the :class:`BytesGenerator` class. -If the message object structure contains non-ASCII bytes, this generator's -:meth:`~BytesGenerator.flatten` method will emit the original bytes. Parsing a -binary message and then flattening it with :class:`BytesGenerator` should be -idempotent for standards compliant messages. - -Here are the public methods of the :class:`Generator` class, imported from the -:mod:`email.generator` module: - - -.. class:: Generator(outfp, mangle_from_=True, maxheaderlen=78, *, policy=None) - - The constructor for the :class:`Generator` class takes a :term:`file-like object` - called *outfp* for an argument. *outfp* must support the :meth:`write` method - and be usable as the output file for the :func:`print` function. - - Optional *mangle_from_* is a flag that, when ``True``, puts a ``>`` character in - front of any line in the body that starts exactly as ``From``, i.e. ``From`` - followed by a space at the beginning of the line. This is the only guaranteed - portable way to avoid having such lines be mistaken for a Unix mailbox format - envelope header separator (see `WHY THE CONTENT-LENGTH FORMAT IS BAD - <https://www.jwz.org/doc/content-length.html>`_ for details). *mangle_from_* - defaults to ``True``, but you might want to set this to ``False`` if you are not - writing Unix mailbox format files. - - Optional *maxheaderlen* specifies the longest length for a non-continued header. - When a header line is longer than *maxheaderlen* (in characters, with tabs - expanded to 8 spaces), the header will be split as defined in the - :class:`~email.header.Header` class. Set to zero to disable header wrapping. - The default is 78, as recommended (but not required) by :rfc:`2822`. - - The *policy* keyword specifies a :mod:`~email.policy` object that controls a - number of aspects of the generator's operation. If no *policy* is specified, - then the *policy* attached to the message object passed to :attr:`flatten` - is used. +One of the most common tasks is to generate the flat (serialized) version of +the email message represented by a message object structure. You will need to +do this if you want to send your message via :meth:`smtplib.SMTP.sendmail` or +the :mod:`nntplib` module, or print the message on the console. Taking a +message object structure and producing a serialized representation is the job +of the generator classes. + +As with the :mod:`email.parser` module, you aren't limited to the functionality +of the bundled generator; you could write one from scratch yourself. However +the bundled generator knows how to generate most email in a standards-compliant +way, should handle MIME and non-MIME email messages just fine, and is designed +so that the bytes-oriented parsing and generation operations are inverses, +assuming the same non-transforming :mod:`~email.policy` is used for both. That +is, parsing the serialized byte stream via the +:class:`~email.parser.BytesParser` class and then regenerating the serialized +byte stream using :class:`BytesGenerator` should produce output identical to +the input [#]_. (On the other hand, using the generator on an +:class:`~email.message.EmailMessage` constructed by program may result in +changes to the :class:`~email.message.EmailMessage` object as defaults are +filled in.) + +The :class:`Generator` class can be used to flatten a message into a text (as +opposed to binary) serialized representation, but since Unicode cannot +represent binary data directly, the message is of necessity transformed into +something that contains only ASCII characters, using the standard email RFC +Content Transfer Encoding techniques for encoding email messages for transport +over channels that are not "8 bit clean". + + +.. class:: BytesGenerator(outfp, mangle_from_=None, maxheaderlen=None, *, \ + policy=None) - .. versionchanged:: 3.3 Added the *policy* keyword. + Return a :class:`BytesGenerator` object that will write any message provided + to the :meth:`flatten` method, or any surrogateescape encoded text provided + to the :meth:`write` method, to the :term:`file-like object` *outfp*. + *outfp* must support a ``write`` method that accepts binary data. + + If optional *mangle_from_* is ``True``, put a ``>`` character in front of + any line in the body that starts with the exact string ``"From "``, that is + ``From`` followed by a space at the beginning of a line. *mangle_from_* + defaults to the value of the :attr:`~email.policy.Policy.mangle_from_` + setting of the *policy* (which is ``True`` for the + :data:`~email.policy.compat32` policy and ``False`` for all others). + *mangle_from_* is intended for use when messages are stored in unix mbox + format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD + <http://www.jwz.org/doc/content-length.html>`_). + + If *maxheaderlen* is not ``None``, refold any header lines that are longer + than *maxheaderlen*, or if ``0``, do not rewrap any headers. If + *manheaderlen* is ``None`` (the default), wrap headers and other message + lines according to the *policy* settings. + + If *policy* is specified, use that policy to control message generation. If + *policy* is ``None`` (the default), use the policy associated with the + :class:`~email.message.Message` or :class:`~email.message.EmailMessage` + object passed to ``flatten`` to control the message generation. See + :mod:`email.policy` for details on what *policy* controls. - The other public :class:`Generator` methods are: + .. versionadded:: 3.2 + .. versionchanged:: 3.3 Added the *policy* keyword. - .. method:: flatten(msg, unixfrom=False, linesep=None) + .. versionchanged:: 3.6 The default behavior of the *mangle_from_* + and *maxheaderlen* parameters is to follow the policy. - Print the textual representation of the message object structure rooted at - *msg* to the output file specified when the :class:`Generator` instance - was created. Subparts are visited depth-first and the resulting text will - be properly MIME encoded. - Optional *unixfrom* is a flag that forces the printing of the envelope - header delimiter before the first :rfc:`2822` header of the root message - object. If the root object has no envelope header, a standard one is - crafted. By default, this is set to ``False`` to inhibit the printing of - the envelope delimiter. + .. method:: flatten(msg, unixfrom=False, linesep=None) + Print the textual representation of the message object structure rooted + at *msg* to the output file specified when the :class:`BytesGenerator` + instance was created. + + If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type` + is ``8bit`` (the default), copy any headers in the original parsed + message that have not been modified to the output with any bytes with the + high bit set reproduced as in the original, and preserve the non-ASCII + :mailheader:`Content-Transfer-Encoding` of any body parts that have them. + If ``cte_type`` is ``7bit``, convert the bytes with the high bit set as + needed using an ASCII-compatible :mailheader:`Content-Transfer-Encoding`. + That is, transform parts with non-ASCII + :mailheader:`Cotnent-Transfer-Encoding` + (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII compatibile + :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII + bytes in headers using the MIME ``unknown-8bit`` character set, thus + rendering them RFC-compliant. + + .. XXX: There should be an option that just does the RFC + compliance transformation on headers but leaves CTE 8bit parts alone. + + If *unixfrom* is ``True``, print the envelope header delimiter used by + the Unix mailbox format (see :mod:`mailbox`) before the first of the + :rfc:`5322` headers of the root message object. If the root object has + no envelope header, craft a standard one. The default is ``False``. Note that for subparts, no envelope header is ever printed. - Optional *linesep* specifies the line separator character used to - terminate lines in the output. If specified it overrides the value - specified by the *msg*\'s or ``Generator``\'s ``policy``. + If *linesep* is not ``None``, use it as the separator character between + all the lines of the flattened message. If *linesep* is ``None`` (the + default), use the value specified in the *policy*. - Because strings cannot represent non-ASCII bytes, if the policy that - applies when ``flatten`` is run has :attr:`~email.policy.Policy.cte_type` - set to ``8bit``, ``Generator`` will operate as if it were set to - ``7bit``. This means that messages parsed with a Bytes parser that have - a :mailheader:`Content-Transfer-Encoding` of ``8bit`` will be converted - to a use a ``7bit`` Content-Transfer-Encoding. Non-ASCII bytes in the - headers will be :rfc:`2047` encoded with a charset of ``unknown-8bit``. + .. XXX: flatten should take a *policy* keyword. - .. versionchanged:: 3.2 - Added support for re-encoding ``8bit`` message bodies, and the - *linesep* argument. .. method:: clone(fp) - Return an independent clone of this :class:`Generator` instance with the - exact same options. - - .. method:: write(s) - - Write the string *s* to the underlying file object, i.e. *outfp* passed to - :class:`Generator`'s constructor. This provides just enough file-like API - for :class:`Generator` instances to be used in the :func:`print` function. + Return an independent clone of this :class:`BytesGenerator` instance with + the exact same option settings, and *fp* as the new *outfp*. -As a convenience, see the :class:`~email.message.Message` methods -:meth:`~email.message.Message.as_string` and ``str(aMessage)``, a.k.a. -:meth:`~email.message.Message.__str__`, which simplify the generation of a -formatted string representation of a message object. For more detail, see -:mod:`email.message`. -.. class:: BytesGenerator(outfp, mangle_from_=True, maxheaderlen=78, *, \ - policy=None) + .. method:: write(s) - The constructor for the :class:`BytesGenerator` class takes a binary - :term:`file-like object` called *outfp* for an argument. *outfp* must - support a :meth:`write` method that accepts binary data. + Encode *s* using the ``ASCII`` codec and the ``surrogateescape`` error + handler, and pass it to the *write* method of the *outfp* passed to the + :class:`BytesGenerator`'s constructor. - Optional *mangle_from_* is a flag that, when ``True``, puts a ``>`` - character in front of any line in the body that starts exactly as ``From``, - i.e. ``From`` followed by a space at the beginning of the line. This is the - only guaranteed portable way to avoid having such lines be mistaken for a - Unix mailbox format envelope header separator (see `WHY THE CONTENT-LENGTH - FORMAT IS BAD <https://www.jwz.org/doc/content-length.html>`_ for details). - *mangle_from_* defaults to ``True``, but you might want to set this to - ``False`` if you are not writing Unix mailbox format files. - Optional *maxheaderlen* specifies the longest length for a non-continued - header. When a header line is longer than *maxheaderlen* (in characters, - with tabs expanded to 8 spaces), the header will be split as defined in the - :class:`~email.header.Header` class. Set to zero to disable header - wrapping. The default is 78, as recommended (but not required) by - :rfc:`2822`. +As a convenience, :class:`~email.message.EmailMessage` provides the methods +:meth:`~email.message.EmailMessage.as_bytes` and ``bytes(aMessage)`` (a.k.a. +:meth:`~email.message.EmailMessage.__bytes__`), which simplify the generation of +a serialized binary representation of a message object. For more detail, see +:mod:`email.message`. - The *policy* keyword specifies a :mod:`~email.policy` object that controls a - number of aspects of the generator's operation. If no *policy* is specified, - then the *policy* attached to the message object passed to :attr:`flatten` - is used. +Because strings cannot represent binary data, the :class:`Generator` class must +convert any binary data in any message it flattens to an ASCII compatible +format, by converting them to an ASCII compatible +:mailheader:`Content-Transfer_Encoding`. Using the terminology of the email +RFCs, you can think of this as :class:`Generator` serializing to an I/O stream +that is not "8 bit clean". In other words, most applications will want +to be using :class:`BytesGenerator`, and not :class:`Generator`. + +.. class:: Generator(outfp, mangle_from_=None, maxheaderlen=None, *, \ + policy=None) + + Return a :class:`Generator` object that will write any message provided + to the :meth:`flatten` method, or any text provided to the :meth:`write` + method, to the :term:`file-like object` *outfp*. *outfp* must support a + ``write`` method that accepts string data. + + If optional *mangle_from_* is ``True``, put a ``>`` character in front of + any line in the body that starts with the exact string ``"From "``, that is + ``From`` followed by a space at the beginning of a line. *mangle_from_* + defaults to the value of the :attr:`~email.policy.Policy.mangle_from_` + setting of the *policy* (which is ``True`` for the + :data:`~email.policy.compat32` policy and ``False`` for all others). + *mangle_from_* is intended for use when messages are stored in unix mbox + format (see :mod:`mailbox` and `WHY THE CONTENT-LENGTH FORMAT IS BAD + <http://www.jwz.org/doc/content-length.html>`_). + + If *maxheaderlen* is not ``None``, refold any header lines that are longer + than *maxheaderlen*, or if ``0``, do not rewrap any headers. If + *manheaderlen* is ``None`` (the default), wrap headers and other message + lines according to the *policy* settings. + + If *policy* is specified, use that policy to control message generation. If + *policy* is ``None`` (the default), use the policy associated with the + :class:`~email.message.Message` or :class:`~email.message.EmailMessage` + object passed to ``flatten`` to control the message generation. See + :mod:`email.policy` for details on what *policy* controls. .. versionchanged:: 3.3 Added the *policy* keyword. - The other public :class:`BytesGenerator` methods are: + .. versionchanged:: 3.6 The default behavior of the *mangle_from_* + and *maxheaderlen* parameters is to follow the policy. .. method:: flatten(msg, unixfrom=False, linesep=None) Print the textual representation of the message object structure rooted - at *msg* to the output file specified when the :class:`BytesGenerator` - instance was created. Subparts are visited depth-first and the resulting - text will be properly MIME encoded. If the :mod:`~email.policy` option - :attr:`~email.policy.Policy.cte_type` is ``8bit`` (the default), - then any bytes with the high bit set in the original parsed message that - have not been modified will be copied faithfully to the output. If - ``cte_type`` is ``7bit``, the bytes will be converted as needed - using an ASCII-compatible Content-Transfer-Encoding. In particular, - RFC-invalid non-ASCII bytes in headers will be encoded using the MIME - ``unknown-8bit`` character set, thus rendering them RFC-compliant. - - .. XXX: There should be a complementary option that just does the RFC - compliance transformation but leaves CTE 8bit parts alone. - - Messages parsed with a Bytes parser that have a - :mailheader:`Content-Transfer-Encoding` of 8bit will be reconstructed - as 8bit if they have not been modified. - - Optional *unixfrom* is a flag that forces the printing of the envelope - header delimiter before the first :rfc:`2822` header of the root message - object. If the root object has no envelope header, a standard one is - crafted. By default, this is set to ``False`` to inhibit the printing of - the envelope delimiter. - + at *msg* to the output file specified when the :class:`Generator` + instance was created. + + If the :mod:`~email.policy` option :attr:`~email.policy.Policy.cte_type` + is ``8bit``, generate the message as if the option were set to ``7bit``. + (This is required because strings cannot represent non-ASCII bytes.) + Convert any bytes with the high bit set as needed using an + ASCII-compatible :mailheader:`Content-Transfer-Encoding`. That is, + transform parts with non-ASCII :mailheader:`Cotnent-Transfer-Encoding` + (:mailheader:`Content-Transfer-Encoding: 8bit`) to an ASCII compatibile + :mailheader:`Content-Transfer-Encoding`, and encode RFC-invalid non-ASCII + bytes in headers using the MIME ``unknown-8bit`` character set, thus + rendering them RFC-compliant. + + If *unixfrom* is ``True``, print the envelope header delimiter used by + the Unix mailbox format (see :mod:`mailbox`) before the first of the + :rfc:`5322` headers of the root message object. If the root object has + no envelope header, craft a standard one. The default is ``False``. Note that for subparts, no envelope header is ever printed. - Optional *linesep* specifies the line separator character used to - terminate lines in the output. If specified it overrides the value - specified by the ``Generator``\ or *msg*\ 's ``policy``. + If *linesep* is not ``None``, use it as the separator character between + all the lines of the flattened message. If *linesep* is ``None`` (the + default), use the value specified in the *policy*. + + .. XXX: flatten should take a *policy* keyword. + + .. versionchanged:: 3.2 + Added support for re-encoding ``8bit`` message bodies, and the + *linesep* argument. + .. method:: clone(fp) - Return an independent clone of this :class:`BytesGenerator` instance with - the exact same options. + Return an independent clone of this :class:`Generator` instance with the + exact same options, and *fp* as the new *outfp*. + .. method:: write(s) - Write the string *s* to the underlying file object. *s* is encoded using - the ``ASCII`` codec and written to the *write* method of the *outfp* - *outfp* passed to the :class:`BytesGenerator`'s constructor. This - provides just enough file-like API for :class:`BytesGenerator` instances - to be used in the :func:`print` function. + Write *s* to the *write* method of the *outfp* passed to the + :class:`Generator`'s constructor. This provides just enough file-like + API for :class:`Generator` instances to be used in the :func:`print` + function. - .. versionadded:: 3.2 -The :mod:`email.generator` module also provides a derived class, called -:class:`DecodedGenerator` which is like the :class:`Generator` base class, -except that non-\ :mimetype:`text` parts are substituted with a format string -representing the part. +As a convenience, :class:`~email.message.EmailMessage` provides the methods +:meth:`~email.message.EmailMessage.as_string` and ``str(aMessage)`` (a.k.a. +:meth:`~email.message.EmailMessage.__str__`), which simplify the generation of +a formatted string representation of a message object. For more detail, see +:mod:`email.message`. + +The :mod:`email.generator` module also provides a derived class, +:class:`DecodedGenerator`, which is like the :class:`Generator` base class, +except that non-\ :mimetype:`text` parts are not serialized, but are instead +represented in the output stream by a string derived from a template filled +in with information about the part. -.. class:: DecodedGenerator(outfp, mangle_from_=True, maxheaderlen=78, fmt=None) +.. class:: DecodedGenerator(outfp, mangle_from_=None, maxheaderlen=None, \ + fmt=None, *, policy=None) - This class, derived from :class:`Generator` walks through all the subparts of a - message. If the subpart is of main type :mimetype:`text`, then it prints the - decoded payload of the subpart. Optional *_mangle_from_* and *maxheaderlen* are - as with the :class:`Generator` base class. + Act like :class:`Generator`, except that for any subpart of the message + passed to :meth:`Generator.flatten`, if the subpart is of main type + :mimetype:`text`, print the decoded payload of the subpart, and if the main + type is not :mimetype:`text`, instead of printing it fill in the string + *fmt* using information from the part and print the resulting + filled-in string. - If the subpart is not of main type :mimetype:`text`, optional *fmt* is a format - string that is used instead of the message payload. *fmt* is expanded with the - following keywords, ``%(keyword)s`` format: + To fill in *fmt*, execute ``fmt % part_info``, where ``part_info`` + is a dictionary composed of the following keys and values: * ``type`` -- Full MIME type of the non-\ :mimetype:`text` part @@ -225,15 +259,21 @@ representing the part. * ``encoding`` -- Content transfer encoding of the non-\ :mimetype:`text` part - The default value for *fmt* is ``None``, meaning :: + If *fmt* is ``None``, use the following default *fmt*: + + "[Non-text (%(type)s) part of message omitted, filename %(filename)s]" - [Non-text (%(type)s) part of message omitted, filename %(filename)s] + Optional *_mangle_from_* and *maxheaderlen* are as with the + :class:`Generator` base class. .. rubric:: Footnotes -.. [#] This statement assumes that you use the appropriate setting for the - ``unixfrom`` argument, and that you set maxheaderlen=0 (which will - preserve whatever the input line lengths were). It is also not strictly - true, since in many cases runs of whitespace in headers are collapsed - into single blanks. The latter is a bug that will eventually be fixed. +.. [#] This statement assumes that you use the appropriate setting for + ``unixfrom``, and that there are no :mod:`policy` settings calling for + automatic adjustments (for example, + :attr:`~email.policy.Policy.refold_source` must be ``none``, which is + *not* the default). It is also not 100% true, since if the message + does not conform to the RFC standards occasionally information about the + exact original text is lost during parsing error recovery. It is a goal + to fix these latter edge cases when possible. |