Diffstat (limited to 'Doc/library/rfc822.rst')
-rw-r--r-- | Doc/library/rfc822.rst | 351 |
1 files changed, 351 insertions, 0 deletions
diff --git a/Doc/library/rfc822.rst b/Doc/library/rfc822.rst
new file mode 100644
index 0000000..fa25ba5
--- /dev/null
+++ b/Doc/library/rfc822.rst
@@ -0,0 +1,351 @@
+
+:mod:`rfc822` --- Parse RFC 2822 mail headers
+=============================================
+
+.. module:: rfc822
+   :synopsis: Parse 2822 style mail messages.
+
+
+.. deprecated:: 2.3
+   The :mod:`email` package should be used in preference to the :mod:`rfc822`
+   module. This module is present only to maintain backward compatibility.
+
+This module defines a class, :class:`Message`, which represents an "email
+message" as defined by the Internet standard :rfc:`2822`. [#]_ Such messages
+consist of a collection of message headers, and a message body. This module
+also defines a helper class :class:`AddressList` for parsing :rfc:`2822`
+addresses. Please refer to the RFC for information on the specific syntax of
+:rfc:`2822` messages.
+
+.. index:: module: mailbox
+
+The :mod:`mailbox` module provides classes to read mailboxes produced by
+various end-user mail programs.
+
+
+.. class:: Message(file[, seekable])
+
+   A :class:`Message` instance is instantiated with an input object as parameter.
+   Message relies only on the input object having a :meth:`readline` method; in
+   particular, ordinary file objects qualify. Instantiation reads headers from the
+   input object up to a delimiter line (normally a blank line) and stores them in
+   the instance. The message body, following the headers, is not consumed.
+
+   This class can work with any input object that supports a :meth:`readline`
+   method. If the input object has seek and tell capability, the
+   :meth:`rewindbody` method will work; also, illegal lines will be pushed back
+   onto the input stream. If the input object lacks seek but has an :meth:`unread`
+   method that can push back a line of input, :class:`Message` will use that to
+   push back illegal lines. Thus this class can be used to parse messages coming
+   from a buffered stream.
+
+   The optional *seekable* argument is provided as a workaround for certain stdio
+   libraries in which :cfunc:`tell` discards buffered data before discovering that
+   the :cfunc:`lseek` system call doesn't work. For maximum portability, you
+   should set the seekable argument to zero to prevent that initial :meth:`tell`
+   when passing in an unseekable object such as a file object created from a socket
+   object.
+
+   Input lines as read from the file may either be terminated by CR-LF or by a
+   single linefeed; a terminating CR-LF is replaced by a single linefeed before the
+   line is stored.
+
+   All header matching is done independent of upper or lower case; e.g.
+   ``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result.
+
+
+.. class:: AddressList(field)
+
+   You may instantiate the :class:`AddressList` helper class using a single string
+   parameter, a comma-separated list of :rfc:`2822` addresses to be parsed. (The
+   parameter ``None`` yields an empty list.)
+
+
+.. function:: quote(str)
+
+   Return a new string with backslashes in *str* replaced by two backslashes and
+   double quotes replaced by backslash-double quote.
+
+
+.. function:: unquote(str)
+
+   Return a new string which is an *unquoted* version of *str*. If *str* ends and
+   begins with double quotes, they are stripped off. Likewise if *str* ends and
+   begins with angle brackets, they are stripped off.
+
+
+.. function:: parseaddr(address)
+
+   Parse *address*, which should be the value of some address-containing field such
+   as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and
+   "email address" parts. Returns a tuple of that information, unless the parse
+   fails, in which case a 2-tuple ``(None, None)`` is returned.
+
+
+.. function:: dump_address_pair(pair)
+
+   The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
+   email_address)`` and returns the string value suitable for a :mailheader:`To` or
+   :mailheader:`Cc` header. If the first element of *pair* is false, then the
+   second element is returned unmodified.
+
+
+.. function:: parsedate(date)
+
+   Attempts to parse a date according to the rules in :rfc:`2822`. However, some
+   mailers don't follow that format as specified, so :func:`parsedate` tries to
+   guess correctly in such cases. *date* is a string containing an :rfc:`2822`
+   date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. If it succeeds in parsing
+   the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
+   :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
+   7, and 8 of the result tuple are not usable.
+
+
+.. function:: parsedate_tz(date)
+
+   Performs the same function as :func:`parsedate`, but returns either ``None`` or
+   a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
+   :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
+   (which is the official term for Greenwich Mean Time). (Note that the sign of
+   the timezone offset is the opposite of the sign of the ``time.timezone``
+   variable for the same timezone; the latter variable follows the POSIX standard
+   while this module follows :rfc:`2822`.) If the input string has no timezone,
+   the last element of the tuple returned is ``None``. Note that indexes 6, 7, and
+   8 of the result tuple are not usable.
+
+
+.. function:: mktime_tz(tuple)
+
+   Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If
+   the timezone item in the tuple is ``None``, assume local time. Minor
+   deficiency: this first interprets the first 8 elements as a local time and then
+   compensates for the timezone difference; this may yield a slight error around
+   daylight saving time switch dates. Not enough to worry about for common use.
+
+
+.. seealso::
+
+   Module :mod:`email`
+      Comprehensive email handling package; supersedes the :mod:`rfc822` module.
+
+   Module :mod:`mailbox`
+      Classes to read various mailbox formats produced by end-user mail programs.
+
+   Module :mod:`mimetools`
+      Subclass of :class:`rfc822.Message` that handles MIME encoded messages.
+
+
+.. _message-objects:
+
+Message Objects
+---------------
+
+A :class:`Message` instance has the following methods:
+
+
+.. method:: Message.rewindbody()
+
+   Seek to the start of the message body. This only works if the file object is
+   seekable.
+
+
+.. method:: Message.isheader(line)
+
+   Returns a line's canonicalized fieldname (the dictionary key that will be used
+   to index it) if the line is a legal :rfc:`2822` header; otherwise returns
+   ``None`` (implying that parsing should stop here and the line be pushed back on
+   the input stream). It is sometimes useful to override this method in a
+   subclass.
+
+
+.. method:: Message.islast(line)
+
+   Return true if the given line is a delimiter on which Message should stop. The
+   delimiter line is consumed, and the file object's read location positioned
+   immediately after it. By default this method just checks that the line is
+   blank, but you can override it in a subclass.
+
+
+.. method:: Message.iscomment(line)
+
+   Return ``True`` if the given line should be ignored entirely, just skipped. By
+   default this is a stub that always returns ``False``, but you can override it in
+   a subclass.
+
+
+.. method:: Message.getallmatchingheaders(name)
+
+   Return a list of lines consisting of all headers matching *name*, if any. Each
+   physical line, whether it is a continuation line or not, is a separate list
+   item. Return the empty list if no header matches *name*.
+
+
+.. method:: Message.getfirstmatchingheader(name)
+
+   Return a list of lines comprising the first header matching *name*, and its
+   continuation line(s), if any. Return ``None`` if there is no header matching
+   *name*.
+
+
+.. method:: Message.getrawheader(name)
+
+   Return a single string consisting of the text after the colon in the first
+   header matching *name*. This includes leading whitespace, the trailing
+   linefeed, and internal linefeeds and whitespace if any continuation
+   line(s) were present. Return ``None`` if there is no header matching *name*.
+
+
+.. method:: Message.getheader(name[, default])
+
+   Like ``getrawheader(name)``, but strip leading and trailing whitespace.
+   Internal whitespace is not stripped. The optional *default* argument can be
+   used to specify a different default to be returned when there is no header
+   matching *name*.
+
+
+.. method:: Message.get(name[, default])
+
+   An alias for :meth:`getheader`, to make the interface more compatible with
+   regular dictionaries.
+
+
+.. method:: Message.getaddr(name)
+
+   Return a pair ``(full name, email address)`` parsed from the string returned by
+   ``getheader(name)``. If no header matching *name* exists, return ``(None,
+   None)``; otherwise both the full name and the address are (possibly empty)
+   strings.
+
+   Example: If *m*'s first :mailheader:`From` header contains the string
+   ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
+   ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
+   <jack@cwi.nl>'`` instead, it would yield the exact same result.
+
+
+.. method:: Message.getaddrlist(name)
+
+   This is similar to ``getaddr(name)``, but parses a header containing a list of
+   email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full
+   name, email address)`` pairs (even if there was only one address in the header).
+   If there is no header matching *name*, return an empty list.
+
+   If multiple headers exist that match the named header (e.g. if there are several
+   :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines
+   the named headers contain are also parsed.
+
+
+.. method:: Message.getdate(name)
+
+   Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible
+   with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable. If
+   there is no header matching *name*, or it is unparsable, return ``None``.
+
+   Date parsing appears to be a black art, and not all mailers adhere to the
+   standard. While it has been tested and found correct on a large collection of
+   email from many sources, it is still possible that this function may
+   occasionally yield an incorrect result.
+
+
+.. method:: Message.getdate_tz(name)
+
+   Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the
+   first 9 elements will make a tuple compatible with :func:`time.mktime`, and the
+   10th is a number giving the offset of the date's timezone from UTC. Note that
+   fields 6, 7, and 8 are not usable. Similarly to :meth:`getdate`, if there is
+   no header matching *name*, or it is unparsable, return ``None``.
+
+:class:`Message` instances also support a limited mapping interface. In
+particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError`
+if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
+``m.has_key(name)``, ``m.keys()``, ``m.values()``, ``m.items()``, and
+``m.setdefault(name[, default])`` act as expected, with the one difference
+that :meth:`setdefault` uses an empty string as the default value.
+:class:`Message` instances also support the mapping writable interface ``m[name]
+= value`` and ``del m[name]``. :class:`Message` objects do not support the
+:meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the
+mapping interface. (Support for :meth:`get` and :meth:`setdefault` was only
+added in Python 2.2.)
+
+Finally, :class:`Message` instances have some public instance variables:
+
+
+.. attribute:: Message.headers
+
+   A list containing the entire set of header lines, in the order in which they
+   were read (except that setitem calls may disturb this order). Each line contains
+   a trailing newline. The blank line terminating the headers is not contained in
+   the list.
+
+
+.. attribute:: Message.fp
+
+   The file or file-like object passed at instantiation time. This can be used to
+   read the message content.
+
+
+.. attribute:: Message.unixfrom
+
+   The Unix ``From`` line, if the message had one, or an empty string. This is
+   needed to regenerate the message in some contexts, such as an ``mbox``\ -style
+   mailbox file.
+
+
+.. _addresslist-objects:
+
+AddressList Objects
+-------------------
+
+An :class:`AddressList` instance has the following methods:
+
+
+.. method:: AddressList.__len__()
+
+   Return the number of addresses in the address list.
+
+
+.. method:: AddressList.__str__()
+
+   Return a canonicalized string representation of the address list. Addresses are
+   rendered in "name" <host@domain> form, comma-separated.
+
+
+.. method:: AddressList.__add__(alist)
+
+   Return a new :class:`AddressList` instance that contains all addresses in both
+   :class:`AddressList` operands, with duplicates removed (set union).
+
+
+.. method:: AddressList.__iadd__(alist)
+
+   In-place version of :meth:`__add__`; turns this :class:`AddressList` instance
+   into the union of itself and the right-hand instance, *alist*.
+
+
+.. method:: AddressList.__sub__(alist)
+
+   Return a new :class:`AddressList` instance that contains every address in the
+   left-hand :class:`AddressList` operand that is not present in the right-hand
+   address operand (set difference).
+
+
+.. method:: AddressList.__isub__(alist)
+
+   In-place version of :meth:`__sub__`, removing addresses in this list which are
+   also in *alist*.
+
+Finally, :class:`AddressList` instances have one public instance variable:
+
+
+.. attribute:: AddressList.addresslist
+
+   A list of tuple string pairs, one per address. In each member, the first is the
+   canonicalized name part, the second is the actual route-address (``'@'``\
+   -separated username-host.domain pair).
+
+.. rubric:: Footnotes
+
+.. [#] This module originally conformed to :rfc:`822`, hence the name. Since then,
+   :rfc:`2822` has been released as an update to :rfc:`822`. This module should be
+   considered :rfc:`2822`\ -conformant, especially in cases where the syntax or
+   semantics have changed since :rfc:`822`.
+
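The address and date helpers documented in this file can be illustrated with a short sketch. It is not part of the patch. Because :mod:`rfc822` itself is not available in Python 3, the sketch assumes the identically named functions in ``email.utils`` (from the :mod:`email` package that the deprecation note recommends) as stand-ins; their results for these inputs should match the behaviour described above, but that equivalence is an assumption of this example.

```python
# Sketch only: uses email.utils as a stand-in for the rfc822 functions of the
# same names (an assumption; rfc822 does not exist in Python 3).
from email.utils import mktime_tz, parseaddr, parsedate_tz

# Both address spellings from the getaddr() example give the same pair.
pair_comment = parseaddr('jack@cwi.nl (Jack Jansen)')
pair_angle = parseaddr('Jack Jansen <jack@cwi.nl>')

# parsedate_tz() returns a 10-tuple; the tenth item is the offset from UTC
# in seconds. Its sign is the opposite of time.timezone for the same zone.
parsed = parsedate_tz('Mon, 20 Nov 1995 19:12:08 -0500')
offset = parsed[9]

# mktime_tz() folds the offset in and yields a UTC timestamp.
timestamp = mktime_tz(parsed)

print(pair_comment, pair_angle)
print(parsed[:6], offset, timestamp)
```

Indexes 6, 7, and 8 of ``parsed`` are deliberately left unused here, in line with the note that they are not usable.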