Diffstat (limited to 'Doc/library/rfc822.rst')
-rw-r--r-- Doc/library/rfc822.rst 351
1 files changed, 351 insertions, 0 deletions
diff --git a/Doc/library/rfc822.rst b/Doc/library/rfc822.rst
new file mode 100644
index 0000000..fa25ba5
--- /dev/null
+++ b/Doc/library/rfc822.rst
@@ -0,0 +1,351 @@
+
+:mod:`rfc822` --- Parse RFC 2822 mail headers
+=============================================
+
+.. module:: rfc822
+ :synopsis: Parse RFC 2822 style mail messages.
+
+
+.. deprecated:: 2.3
+ The :mod:`email` package should be used in preference to the :mod:`rfc822`
+ module. This module is present only to maintain backward compatibility.
+
+This module defines a class, :class:`Message`, which represents an "email
+message" as defined by the Internet standard :rfc:`2822`. [#]_ Such messages
+consist of a collection of message headers, and a message body. This module
+also defines a helper class :class:`AddressList` for parsing :rfc:`2822`
+addresses. Please refer to the RFC for information on the specific syntax of
+:rfc:`2822` messages.
+
+.. index:: module: mailbox
+
+The :mod:`mailbox` module provides classes to read mailboxes produced by
+various end-user mail programs.
+
+
+.. class:: Message(file[, seekable])
+
+ A :class:`Message` instance is instantiated with an input object as parameter.
+ :class:`Message` relies only on the input object having a :meth:`readline` method; in
+ particular, ordinary file objects qualify. Instantiation reads headers from the
+ input object up to a delimiter line (normally a blank line) and stores them in
+ the instance. The message body, following the headers, is not consumed.
+
+ This class can work with any input object that supports a :meth:`readline`
+ method. If the input object has seek and tell capability, the
+ :meth:`rewindbody` method will work; also, illegal lines will be pushed back
+ onto the input stream. If the input object lacks seek but has an :meth:`unread`
+ method that can push back a line of input, :class:`Message` will use that to
+ push back illegal lines. Thus this class can be used to parse messages coming
+ from a buffered stream.
+
+ The optional *seekable* argument is provided as a workaround for certain stdio
+ libraries in which :cfunc:`tell` discards buffered data before discovering that
+ the :cfunc:`lseek` system call doesn't work. For maximum portability, you
+ should set the *seekable* argument to zero to prevent that initial :meth:`tell`
+ when passing in an unseekable object such as a file object created from a socket
+ object.
+
+ Input lines as read from the file may either be terminated by CR-LF or by a
+ single linefeed; a terminating CR-LF is replaced by a single linefeed before the
+ line is stored.
+
+ All header matching is done independent of upper or lower case; e.g.
+ ``m['From']``, ``m['from']`` and ``m['FROM']`` all yield the same result.
+
+
+.. class:: AddressList(field)
+
+ You may instantiate the :class:`AddressList` helper class using a single string
+ parameter, a comma-separated list of :rfc:`2822` addresses to be parsed. (The
+ parameter ``None`` yields an empty list.)
+
+
+.. function:: quote(str)
+
+ Return a new string with backslashes in *str* replaced by two backslashes and
+ double quotes replaced by backslash-double quote.
+
+
+.. function:: unquote(str)
+
+ Return a new string which is an *unquoted* version of *str*. If *str* ends and
+ begins with double quotes, they are stripped off. Likewise if *str* ends and
+ begins with angle brackets, they are stripped off.
+
+
+.. function:: parseaddr(address)
+
+ Parse *address*, which should be the value of some address-containing field such
+ as :mailheader:`To` or :mailheader:`Cc`, into its constituent "realname" and
+ "email address" parts. Returns a tuple of that information, unless the parse
+ fails, in which case a 2-tuple ``(None, None)`` is returned.
+
+
+.. function:: dump_address_pair(pair)
+
+ The inverse of :func:`parseaddr`, this takes a 2-tuple of the form ``(realname,
+ email_address)`` and returns the string value suitable for a :mailheader:`To` or
+ :mailheader:`Cc` header. If the first element of *pair* is false, then the
+ second element is returned unmodified.
+
+
+.. function:: parsedate(date)
+
+ Attempts to parse a date according to the rules in :rfc:`2822`. However, some
+ mailers don't follow that format as specified, so :func:`parsedate` tries to
+ guess correctly in such cases. *date* is a string containing an :rfc:`2822`
+ date, such as ``'Mon, 20 Nov 1995 19:12:08 -0500'``. If it succeeds in parsing
+ the date, :func:`parsedate` returns a 9-tuple that can be passed directly to
+ :func:`time.mktime`; otherwise ``None`` will be returned. Note that indexes 6,
+ 7, and 8 of the result tuple are not usable.
+
+
+.. function:: parsedate_tz(date)
+
+ Performs the same function as :func:`parsedate`, but returns either ``None`` or
+ a 10-tuple; the first 9 elements make up a tuple that can be passed directly to
+ :func:`time.mktime`, and the tenth is the offset of the date's timezone from UTC
+ (which is the official term for Greenwich Mean Time). (Note that the sign of
+ the timezone offset is the opposite of the sign of the ``time.timezone``
+ variable for the same timezone; the latter variable follows the POSIX standard
+ while this module follows :rfc:`2822`.) If the input string has no timezone,
+ the last element of the tuple returned is ``None``. Note that indexes 6, 7, and
+ 8 of the result tuple are not usable.
+
+
+.. function:: mktime_tz(tuple)
+
+ Turn a 10-tuple as returned by :func:`parsedate_tz` into a UTC timestamp. If
+ the timezone item in the tuple is ``None``, assume local time. Minor
+ deficiency: this first interprets the first 8 elements as a local time and then
+ compensates for the timezone difference; this may yield a slight error around
+ daylight saving time switch dates. This is not enough to worry about for common use.
+
+
+.. seealso::
+
+ Module :mod:`email`
+ Comprehensive email handling package; supersedes the :mod:`rfc822` module.
+
+ Module :mod:`mailbox`
+ Classes to read various mailbox formats produced by end-user mail programs.
+
+ Module :mod:`mimetools`
+ Subclass of :class:`rfc822.Message` that handles MIME encoded messages.
+
+
+.. _message-objects:
+
+Message Objects
+---------------
+
+A :class:`Message` instance has the following methods:
+
+
+.. method:: Message.rewindbody()
+
+ Seek to the start of the message body. This only works if the file object is
+ seekable.
+
+
+.. method:: Message.isheader(line)
+
+ Returns a line's canonicalized fieldname (the dictionary key that will be used
+ to index it) if the line is a legal :rfc:`2822` header; otherwise returns
+ ``None`` (implying that parsing should stop here and the line be pushed back on
+ the input stream). It is sometimes useful to override this method in a
+ subclass.
+
+
+.. method:: Message.islast(line)
+
+ Return true if the given line is a delimiter on which :class:`Message` should stop. The
+ delimiter line is consumed, and the file object's read location positioned
+ immediately after it. By default this method just checks that the line is
+ blank, but you can override it in a subclass.
+
+
+.. method:: Message.iscomment(line)
+
+ Return ``True`` if the given line should be ignored entirely, just skipped. By
+ default this is a stub that always returns ``False``, but you can override it in
+ a subclass.
+
+
+.. method:: Message.getallmatchingheaders(name)
+
+ Return a list of lines consisting of all headers matching *name*, if any. Each
+ physical line, whether it is a continuation line or not, is a separate list
+ item. Return the empty list if no header matches *name*.
+
+
+.. method:: Message.getfirstmatchingheader(name)
+
+ Return a list of lines comprising the first header matching *name*, and its
+ continuation line(s), if any. Return ``None`` if there is no header matching
+ *name*.
+
+
+.. method:: Message.getrawheader(name)
+
+ Return a single string consisting of the text after the colon in the first
+ header matching *name*. This includes leading whitespace, the trailing
+ linefeed, and internal linefeeds and whitespace if any continuation
+ line(s) were present. Return ``None`` if there is no header matching *name*.
+
+
+.. method:: Message.getheader(name[, default])
+
+ Like ``getrawheader(name)``, but strip leading and trailing whitespace.
+ Internal whitespace is not stripped. The optional *default* argument can be
+ used to specify a different default to be returned when there is no header
+ matching *name*.
+
+
+.. method:: Message.get(name[, default])
+
+ An alias for :meth:`getheader`, to make the interface more compatible with
+ regular dictionaries.
+
+
+.. method:: Message.getaddr(name)
+
+ Return a pair ``(full name, email address)`` parsed from the string returned by
+ ``getheader(name)``. If no header matching *name* exists, return ``(None,
+ None)``; otherwise both the full name and the address are (possibly empty)
+ strings.
+
+ Example: If *m*'s first :mailheader:`From` header contains the string
+ ``'jack@cwi.nl (Jack Jansen)'``, then ``m.getaddr('From')`` will yield the pair
+ ``('Jack Jansen', 'jack@cwi.nl')``. If the header contained ``'Jack Jansen
+ <jack@cwi.nl>'`` instead, it would yield the exact same result.
+
+
+.. method:: Message.getaddrlist(name)
+
+ This is similar to ``getaddr(name)``, but parses a header containing a list of
+ email addresses (e.g. a :mailheader:`To` header) and returns a list of ``(full
+ name, email address)`` pairs (even if there was only one address in the header).
+ If there is no header matching *name*, return an empty list.
+
+ If multiple headers exist that match the named header (e.g. if there are several
+ :mailheader:`Cc` headers), all are parsed for addresses. Any continuation lines
+ the named headers contain are also parsed.
+
+
+.. method:: Message.getdate(name)
+
+ Retrieve a header using :meth:`getheader` and parse it into a 9-tuple compatible
+ with :func:`time.mktime`; note that fields 6, 7, and 8 are not usable. If
+ there is no header matching *name*, or it is unparsable, return ``None``.
+
+ Date parsing appears to be a black art, and not all mailers adhere to the
+ standard. While this function has been tested and found correct on a large
+ collection of email from many sources, it is still possible that it may
+ occasionally yield an incorrect result.
+
+
+.. method:: Message.getdate_tz(name)
+
+ Retrieve a header using :meth:`getheader` and parse it into a 10-tuple; the
+ first 9 elements will make a tuple compatible with :func:`time.mktime`, and the
+ 10th is a number giving the offset of the date's timezone from UTC. Note that
+ fields 6, 7, and 8 are not usable. Similarly to :meth:`getdate`, if there is
+ no header matching *name*, or it is unparsable, return ``None``.
+
+:class:`Message` instances also support a limited mapping interface. In
+particular: ``m[name]`` is like ``m.getheader(name)`` but raises :exc:`KeyError`
+if there is no matching header; and ``len(m)``, ``m.get(name[, default])``,
+``m.has_key(name)``, ``m.keys()``, ``m.values()``, ``m.items()``, and
+``m.setdefault(name[, default])`` act as expected, with the one difference
+that :meth:`setdefault` uses an empty string as the default value.
+:class:`Message` instances also support the mapping writable interface ``m[name]
+= value`` and ``del m[name]``. :class:`Message` objects do not support the
+:meth:`clear`, :meth:`copy`, :meth:`popitem`, or :meth:`update` methods of the
+mapping interface. (Support for :meth:`get` and :meth:`setdefault` was only
+added in Python 2.2.)
+
+Finally, :class:`Message` instances have some public instance variables:
+
+
+.. attribute:: Message.headers
+
+ A list containing the entire set of header lines, in the order in which they
+ were read (except that :meth:`__setitem__` calls may disturb this order). Each line contains
+ a trailing newline. The blank line terminating the headers is not contained in
+ the list.
+
+
+.. attribute:: Message.fp
+
+ The file or file-like object passed at instantiation time. This can be used to
+ read the message content.
+
+
+.. attribute:: Message.unixfrom
+
+ The Unix ``From`` line, if the message had one, or an empty string. This is
+ needed to regenerate the message in some contexts, such as an ``mbox``\ -style
+ mailbox file.
+
+
+.. _addresslist-objects:
+
+AddressList Objects
+-------------------
+
+An :class:`AddressList` instance has the following methods:
+
+
+.. method:: AddressList.__len__()
+
+ Return the number of addresses in the address list.
+
+
+.. method:: AddressList.__str__()
+
+ Return a canonicalized string representation of the address list. Addresses are
+ rendered in ``"name" <user@domain>`` form, comma-separated.
+
+
+.. method:: AddressList.__add__(alist)
+
+ Return a new :class:`AddressList` instance that contains all addresses in both
+ :class:`AddressList` operands, with duplicates removed (set union).
+
+
+.. method:: AddressList.__iadd__(alist)
+
+ In-place version of :meth:`__add__`; turns this :class:`AddressList` instance
+ into the union of itself and the right-hand instance, *alist*.
+
+
+.. method:: AddressList.__sub__(alist)
+
+ Return a new :class:`AddressList` instance that contains every address in the
+ left-hand :class:`AddressList` operand that is not present in the right-hand
+ address operand (set difference).
+
+
+.. method:: AddressList.__isub__(alist)
+
+ In-place version of :meth:`__sub__`, removing addresses in this list which are
+ also in *alist*.
+
+Finally, :class:`AddressList` instances have one public instance variable:
+
+
+.. attribute:: AddressList.addresslist
+
+ A list of 2-tuples of strings, one per address. In each tuple, the first item is
+ the canonicalized name part and the second is the actual route-address (a ``'@'``\
+ -separated username and host.domain pair).
+
+.. rubric:: Footnotes
+
+.. [#] This module originally conformed to :rfc:`822`, hence the name. Since then,
+ :rfc:`2822` has been released as an update to :rfc:`822`. This module should be
+ considered :rfc:`2822`\ -conformant, especially in cases where the syntax or
+ semantics have changed since :rfc:`822`.
+