author | R. David Murray <rdmurray@bitdance.com> | 2011-01-07 23:25:30 (GMT) |
---|---|---|
committer | R. David Murray <rdmurray@bitdance.com> | 2011-01-07 23:25:30 (GMT) |
commit | 9253214fd9fe22b8b2b4ca5bb28952df8cab3e8c (patch) | |
tree | 30d925a75c0b3bd542c00d6dbd667e72178056a7 /Doc | |
parent | 6f0022d84af15d51ffa1606991f2b6e9e56448ed (diff) | |
#10686: recode non-ASCII headers to 'unknown-8bit' instead of ?s.
This applies only when generating strings from non-RFC compliant binary
input; it makes the existing recoding behavior more consistent (i.e.,
no data is now lost when recoding).
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/library/email.generator.rst | 4 |
-rw-r--r-- | Doc/library/email.header.rst | 10 |
-rw-r--r-- | Doc/library/email.message.rst | 7 |
-rw-r--r-- | Doc/whatsnew/3.2.rst | 2 |
4 files changed, 16 insertions, 7 deletions
diff --git a/Doc/library/email.generator.rst b/Doc/library/email.generator.rst
index 22d8b09..85b32fe 100644
--- a/Doc/library/email.generator.rst
+++ b/Doc/library/email.generator.rst
@@ -79,8 +79,8 @@ Here are the public methods of the :class:`Generator` class, imported from the
 Messages parsed with a Bytes parser that have a
 :mailheader:`Content-Transfer-Encoding` of 8bit will be converted to a
- use a 7bit Content-Transfer-Encoding. Any other non-ASCII bytes in the
- message structure will be converted to '?' characters.
+ use a 7bit Content-Transfer-Encoding. Non-ASCII bytes in the headers
+ will be :rfc:`2047` encoded with a charset of `unknown-8bit`.
 .. versionchanged:: 3.2
 Added support for re-encoding 8bit message bodies, and the *linesep*
diff --git a/Doc/library/email.header.rst b/Doc/library/email.header.rst
index ff2b484..29752c4 100644
--- a/Doc/library/email.header.rst
+++ b/Doc/library/email.header.rst
@@ -130,8 +130,14 @@ Here is the :class:`Header` class description:
 .. method:: __str__()
- A helper for :class:`str`'s :func:`encode` method. Returns the header as
- a Unicode string.
+ Returns an approximation of the :class:`Header` as a string, using an
+ unlimited line length. All pieces are converted to unicode using the
+ specified encoding and joined together appropriately. Any pieces with a
+ charset of `unknown-8bit` are decoded as `ASCII` using the `replace`
+ error handler.
+
+ .. versionchanged:: 3.2
+ Added handling for the `unknown-8bit` charset.
 .. method:: __eq__(other)
diff --git a/Doc/library/email.message.rst b/Doc/library/email.message.rst
index e76e689..29f7ba3 100644
--- a/Doc/library/email.message.rst
+++ b/Doc/library/email.message.rst
@@ -169,9 +169,10 @@ Here are the methods of the :class:`Message` class:
 Note that in all cases, any envelope header present in the message is not
 included in the mapping interface.
- In a model generated from bytes, any header values that (in contravention
- of the RFCs) contain non-ASCII bytes will have those bytes transformed
- into '?' characters when the values are retrieved through this interface.
+ In a model generated from bytes, any header values that (in contravention of
+ the RFCs) contain non-ASCII bytes will, when retrieved through this
+ interface, be represented as :class:`~email.header.Header` objects with
+ a charset of `unknown-8bit`.
 .. method:: __len__()
diff --git a/Doc/whatsnew/3.2.rst b/Doc/whatsnew/3.2.rst
index b6e2550..69b318e 100644
--- a/Doc/whatsnew/3.2.rst
+++ b/Doc/whatsnew/3.2.rst
@@ -618,6 +618,8 @@ format.
 * Given bytes input to the model, :class:`~email.generator.Generator` will
 convert message bodies that have a :mailheader:`Content-Transfer-Encoding` of
 *8bit* to instead have a *7bit* :mailheader:`Content-Transfer-Encoding`.
+ XXX: Headers with Un-encoded non-ASCII bytes will be :rfc:`2047`\ -encoded
+ using the charset `unknown-8bit`.
 * A new class :class:`~email.generator.BytesGenerator` produces bytes as output,
 preserving any unchanged non-ASCII data that was present in the input used to
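Illustration (not part of the commit): a minimal sketch of the behavior the documentation changes describe, assuming a Python version that includes this change and the default (legacy) policy. The sample message bytes and the variable names are made up for the example.

```python
import email
from email.header import Header

# Hypothetical non-RFC-compliant input: a raw 0xE9 byte in the Subject header.
raw = b"Subject: caf\xe9\nTo: someone@example.com\n\nbody\n"
msg = email.message_from_bytes(raw)

# Retrieving the header through the mapping interface gives a Header object
# (charset unknown-8bit) rather than a string with '?' substituted.
subject = msg["Subject"]
print(type(subject).__name__)

# str() decodes unknown-8bit pieces as ASCII with the 'replace' error
# handler, so the undecodable byte shows up as U+FFFD.
print(ascii(str(subject)))

# Flattening to a string RFC 2047-encodes the header with the unknown-8bit
# charset, so the original byte value is kept in the encoded word.
flattened = msg.as_string()
print(flattened)
```

The string form is only an approximation for display; the original byte survives in the flattened output, which is the "no data is lost" point made in the commit message.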