Diffstat (limited to 'Doc/library/email.header.rst')
-rw-r--r-- | Doc/library/email.header.rst | 110 |
1 files changed, 40 insertions, 70 deletions
diff --git a/Doc/library/email.header.rst b/Doc/library/email.header.rst
index 07152c2..4e585fc 100644
--- a/Doc/library/email.header.rst
+++ b/Doc/library/email.header.rst
@@ -4,17 +4,6 @@
 .. module:: email.header
    :synopsis: Representing non-ASCII headers
 
-**Source code:** :source:`Lib/email/header.py`
-
---------------
-
-This module is part of the legacy (``Compat32``) email API. In the current API
-encoding and decoding of headers is handled transparently by the
-dictionary-like API of the :class:`~email.message.EmailMessage` class. In
-addition to uses in legacy code, this module can be useful in applications that
-need to completely control the character sets used when encoding headers.
-
-The remaining text in this section is the original documentation of the module.
 
 :rfc:`2822` is the base standard that describes the format of email messages.
 It derives from the older :rfc:`822` standard which came into widespread use at
@@ -42,8 +31,8 @@ For example::
    >>> msg = Message()
    >>> h = Header('p\xf6stal', 'iso-8859-1')
    >>> msg['Subject'] = h
-   >>> msg.as_string()
-   'Subject: =?iso-8859-1?q?p=F6stal?=\n\n'
+   >>> print msg.as_string()
+   Subject: =?iso-8859-1?q?p=F6stal?=
 
 
 
@@ -54,18 +43,20 @@ the character set that the byte string was encoded in. When the subsequent
 field was properly :rfc:`2047` encoded. MIME-aware mail readers would show this
 header using the embedded ISO-8859-1 character.
 
+.. versionadded:: 2.2.2
+
 Here is the :class:`Header` class description:
 
 
-.. class:: Header(s=None, charset=None, maxlinelen=None, header_name=None, continuation_ws=' ', errors='strict')
+.. class:: Header([s[, charset[, maxlinelen[, header_name[, continuation_ws[, errors]]]]]])
 
    Create a MIME-compliant header that can contain strings in different character
    sets.
 
    Optional *s* is the initial header value. If ``None`` (the default), the
    initial header value is not set. You can later append to the header with
-   :meth:`append` method calls. *s* may be an instance of :class:`bytes` or
-   :class:`str`, but see the :meth:`append` documentation for semantics.
+   :meth:`append` method calls. *s* may be a byte string or a Unicode string, but
+   see the :meth:`append` documentation for semantics.
 
    Optional *charset* serves two purposes: it has the same meaning as the *charset*
    argument to the :meth:`append` method. It also sets the default character set
@@ -81,15 +72,15 @@ Here is the :class:`Header` class description:
    for *header_name* is ``None``, meaning it is not taken into account for the
    first line of a long, split header.
 
-   Optional *continuation_ws* must be :rfc:`2822`\ -compliant folding
-   whitespace, and is usually either a space or a hard tab character. This
-   character will be prepended to continuation lines. *continuation_ws*
-   defaults to a single space character.
+   Optional *continuation_ws* must be :rfc:`2822`\ -compliant folding whitespace,
+   and is usually either a space or a hard tab character. This character will be
+   prepended to continuation lines. *continuation_ws* defaults to a single
+   space character (" ").
 
    Optional *errors* is passed straight through to the :meth:`append` method.
 
 
-   .. method:: append(s, charset=None, errors='strict')
+   .. method:: append(s[, charset[, errors]])
 
       Append the string *s* to the MIME header.
 
@@ -99,64 +90,43 @@ Here is the :class:`Header` class description:
       of ``None`` (the default) means that the *charset* given in the constructor
       is used.
 
-      *s* may be an instance of :class:`bytes` or :class:`str`. If it is an
-      instance of :class:`bytes`, then *charset* is the encoding of that byte
-      string, and a :exc:`UnicodeError` will be raised if the string cannot be
-      decoded with that character set.
+      *s* may be a byte string or a Unicode string. If it is a byte string
+      (i.e. ``isinstance(s, str)`` is true), then *charset* is the encoding of
+      that byte string, and a :exc:`UnicodeError` will be raised if the string
+      cannot be decoded with that character set.
 
-      If *s* is an instance of :class:`str`, then *charset* is a hint specifying
-      the character set of the characters in the string.
+      If *s* is a Unicode string, then *charset* is a hint specifying the
+      character set of the characters in the string. In this case, when
+      producing an :rfc:`2822`\ -compliant header using :rfc:`2047` rules, the
+      Unicode string will be encoded using the following charsets in order:
+      ``us-ascii``, the *charset* hint, ``utf-8``. The first character set to
+      not provoke a :exc:`UnicodeError` is used.
 
-      In either case, when producing an :rfc:`2822`\ -compliant header using
-      :rfc:`2047` rules, the string will be encoded using the output codec of
-      the charset. If the string cannot be encoded using the output codec, a
-      UnicodeError will be raised.
+      Optional *errors* is passed through to any :func:`unicode` or
+      :meth:`unicode.encode` call, and defaults to "strict".
 
-      Optional *errors* is passed as the errors argument to the decode call
-      if *s* is a byte string.
-
 
-   .. method:: encode(splitchars=';, \\t', maxlinelen=None, linesep='\\n')
+   .. method:: encode([splitchars])
 
       Encode a message header into an RFC-compliant format, possibly wrapping
       long lines and encapsulating non-ASCII parts in base64 or quoted-printable
-      encodings.
-
-      Optional *splitchars* is a string containing characters which should be
-      given extra weight by the splitting algorithm during normal header
-      wrapping. This is in very rough support of :RFC:`2822`\'s 'higher level
-      syntactic breaks': split points preceded by a splitchar are preferred
-      during line splitting, with the characters preferred in the order in
-      which they appear in the string. Space and tab may be included in the
-      string to indicate whether preference should be given to one over the
-      other as a split point when other split chars do not appear in the line
-      being split. Splitchars does not affect :RFC:`2047` encoded lines.
-
-      *maxlinelen*, if given, overrides the instance's value for the maximum
-      line length.
-
-      *linesep* specifies the characters used to separate the lines of the
-      folded header. It defaults to the most useful value for Python
-      application code (``\n``), but ``\r\n`` can be specified in order
-      to produce headers with RFC-compliant line separators.
-
-      .. versionchanged:: 3.2
-         Added the *linesep* argument.
-
+      encodings. Optional *splitchars* is a string containing characters to
+      split long ASCII lines on, in rough support of :rfc:`2822`'s *highest
+      level syntactic breaks*. This doesn't affect :rfc:`2047` encoded lines.
 
    The :class:`Header` class also provides a number of methods to support
    standard operators and built-in functions.
 
+
    .. method:: __str__()
 
-      Returns an approximation of the :class:`Header` as a string, using an
-      unlimited line length. All pieces are converted to unicode using the
-      specified encoding and joined together appropriately. Any pieces with a
-      charset of ``'unknown-8bit'`` are decoded as ASCII using the ``'replace'``
-      error handler.
+      A synonym for :meth:`Header.encode`. Useful for ``str(aHeader)``.
+
+
+   .. method:: __unicode__()
 
-      .. versionchanged:: 3.2
-         Added handling for the ``'unknown-8bit'`` charset.
+      A helper for the built-in :func:`unicode` function. Returns the header as
+      a Unicode string.
 
 
    .. method:: __eq__(other)
@@ -187,10 +157,10 @@ The :mod:`email.header` module also provides the following convenient functions.
 
    >>> from email.header import decode_header
    >>> decode_header('=?iso-8859-1?q?p=F6stal?=')
-   [(b'p\xf6stal', 'iso-8859-1')]
+   [('p\xf6stal', 'iso-8859-1')]
 
 
-.. function:: make_header(decoded_seq, maxlinelen=None, header_name=None, continuation_ws=' ')
+.. function:: make_header(decoded_seq[, maxlinelen[, header_name[, continuation_ws]]])
 
    Create a :class:`Header` instance from a sequence of pairs as returned by
    :func:`decode_header`.
@@ -199,7 +169,7 @@ The :mod:`email.header` module also provides the following convenient functions.
    pairs of the format ``(decoded_string, charset)`` where *charset* is the name of
    the character set.
 
-   This function takes one of those sequence of pairs and returns a
-   :class:`Header` instance. Optional *maxlinelen*, *header_name*, and
-   *continuation_ws* are as in the :class:`Header` constructor.
+   This function takes one of those sequence of pairs and returns a :class:`Header`
+   instance. Optional *maxlinelen*, *header_name*, and *continuation_ws* are as in
+   the :class:`Header` constructor.
 
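
Reviewer note: the behavioural difference between the two sides of this diff is easiest to see by running the documented example. The sketch below is not part of the patch. It assumes a Python 3 interpreter, so it exercises the semantics described on the "-" side (``decode_header`` returning ``bytes``, ``str(header)`` returning decoded text). The variable names are illustrative only.

```python
from email.header import Header, decode_header, make_header
from email.message import Message

# Build a header holding a non-ASCII value, as in the documented example.
h = Header('p\xf6stal', 'iso-8859-1')
msg = Message()
msg['Subject'] = h

# Header.encode() produces the RFC 2047 encoded-word form.
encoded = h.encode()
print(encoded)

# On Python 3, decode_header() yields (bytes, charset) pairs.
pairs = decode_header(encoded)
print(pairs)

# make_header() accepts such pairs and rebuilds a Header instance;
# str() on it gives the decoded text rather than the encoded form.
rebuilt = make_header(pairs)
text = str(rebuilt)
print(text == 'p\xf6stal')
```

On the "+" side (the Python 2 text), the same calls would return a plain byte string from ``decode_header`` and ``str(header)`` would be a synonym for ``encode()``, as the changed paragraphs state.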