Diffstat (limited to 'Doc/lib/email.tex')
-rw-r--r-- | Doc/lib/email.tex | 358 |
1 files changed, 358 insertions, 0 deletions
diff --git a/Doc/lib/email.tex b/Doc/lib/email.tex new file mode 100644 index 0000000..eba0684 --- /dev/null +++ b/Doc/lib/email.tex @@ -0,0 +1,358 @@ +% Copyright (C) 2001 Python Software Foundation +% Author: barry@zope.com (Barry Warsaw) + +\section{\module{email} -- + An email and MIME handling package} + +\declaremodule{standard}{email} +\modulesynopsis{Package supporting the parsing, manipulating, and + generating of email messages, including MIME documents.} +\moduleauthor{Barry A. Warsaw}{barry@zope.com} + +\versionadded{2.2} + +The \module{email} package is a library for managing email messages, +including MIME and other \rfc{2822}-based message documents. It +subsumes most of the functionality in several older standard modules +such as \module{rfc822}, \module{mimetools}, \module{multifile}, and +other non-standard packages such as \module{mimecntl}. + +The primary distinguishing feature of the \module{email} package is +that it splits the parsing and generating of email messages from the +internal \emph{object model} representation of email. Applications +using the \module{email} package deal primarily with objects; you can +add sub-objects to messages, remove sub-objects from messages, +completely re-arrange the contents, etc. There is a separate parser +and a separate generator which handle the transformation from flat +text to the object model, and then back to flat text again. There +are also handy subclasses for some common MIME object types, and a few +miscellaneous utilities that help with such common tasks as extracting +and parsing message field values, creating RFC-compliant dates, etc. + +The following sections describe the functionality of the +\module{email} package. 
The ordering follows a progression that +should be common in applications: an email message is read as flat +text from a file or other source, the text is parsed to produce an +object model representation of the email message, this model is +manipulated, and finally the model is rendered back into +flat text. + +It is perfectly feasible to create the object model out of whole cloth +-- i.e. completely from scratch. From there, a similar progression can +be taken as above. + +Also included are detailed specifications of all the classes and +modules that the \module{email} package provides, the exception +classes you might encounter while using the \module{email} package, +some auxiliary utilities, and a few examples. For users of the older +\module{mimelib} package, from which the \module{email} package is +descended, a section on differences and porting is provided. + +\subsection{Representing an email message} + +The primary object in the \module{email} package is the +\class{Message} class, provided in the \refmodule{email.Message} +module. \class{Message} is the base class for the \module{email} +object model. It provides the core functionality for setting and +querying header fields, and for accessing message bodies. + +Conceptually, a \class{Message} object consists of \emph{headers} and +\emph{payloads}. Headers are \rfc{2822} style field names and +values, where each field name and value are separated by a colon. The +colon is not part of either the field name or the field value. + +Headers are stored and returned in case-preserving form but are +matched case-insensitively. There may also be a single +\emph{Unix-From} header, also known as the envelope header or the +\code{From_} header. The payload is either a string in the case of +simple message objects, a list of \class{Message} objects for +multipart MIME documents, or a single \class{Message} instance for +\code{message/rfc822} type objects. 
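As a rough illustration of the header and payload model just described, the sketch below builds a simple, non-multipart message by hand. It is written against a current Python 3 interpreter, where the module is spelled `email.message` rather than the 2.2-era `email.Message` named in this document; that spelling, the addresses, and the header values are assumptions made for the example, not part of the original text.

```python
from email.message import Message

# Build a simple (non-multipart) message by hand.
msg = Message()
msg["Subject"] = "Meeting notes"        # hypothetical header values
msg["From"] = "alice@example.com"
msg.set_payload("See you at noon.\n")   # a string payload for a simple message

# The optional Unix-From (envelope) line is kept apart from the headers.
msg.set_unixfrom("From alice@example.com Sat Jan  1 00:00:00 2000")

# Header lookup is case-insensitive ...
subject_lower = msg["subject"]
# ... but the names come back in the case they were stored in.
header_names = msg.keys()

unix_from = msg.get_unixfrom()
payload = msg.get_payload()
multipart = msg.is_multipart()
```

Because the payload here is a plain string, `is_multipart()` reports false; a multipart MIME document would instead hold a list of `Message` objects as its payload.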
+ +\class{Message} objects provide a mapping style interface for +accessing the message headers, and an explicit interface for accessing +both the headers and the payload. They also provide convenience methods for +generating a flat text representation of the message object tree, for +accessing commonly used header parameters, and for recursively walking +over the object tree. + +\subsection{Parsing email messages} +Message object trees can be created in one of two ways: they can be +created from whole cloth by instantiating \class{Message} objects and +stringing them together via \method{add_payload()} and +\method{set_payload()} calls, or they can be created by parsing a flat text +representation of the email message. + +The \module{email} package provides a standard parser that understands +most email document structures, including MIME documents. You can +pass the parser a string or a file object, and the parser will return +to you the root \class{Message} instance of the object tree. For +simple, non-MIME messages the payload of this root object will likely +be a string (e.g. containing the text of the message). For MIME +messages, the root object will return 1 from its +\method{is_multipart()} method, and the subparts can be accessed via +the \method{get_payload()} and \method{walk()} methods. + +Note that the parser can be extended in limited ways, and of course +you can implement your own parser completely from scratch. There is +no magical connection between the \module{email} package's bundled +parser and the +\class{Message} class, so your custom parser can create message object +trees in any way it finds necessary. The \module{email} package's +parser is described in detail in the \refmodule{email.Parser} module +documentation. + +\subsection{Generating MIME documents} +One of the most common tasks is to generate the flat text of the email +message represented by a message object tree. 
You will need to do +this if you want to send your message via the \refmodule{smtplib} +module or the \refmodule{nntplib} module, or print the message on the +console. Taking a message object tree and producing a flat text +document is the job of the \refmodule{email.Generator} module. + +Again, as with the \refmodule{email.Parser} module, you aren't limited +to the functionality of the bundled generator; you could write one +from scratch yourself. However the bundled generator knows how to +generate most email in a standards-compliant way, should handle MIME +and non-MIME email messages just fine, and is designed so that the +transformation from flat text, to an object tree via the +\class{Parser} class, +and back to flat text, be idempotent (the input is identical to the +output). + +\subsection{Creating email and MIME objects from scratch} + +Ordinarily, you get a message object tree by passing some text to a +parser, which parses the text and returns the root of the message +object tree. However you can also build a complete object tree from +scratch, or even individual \class{Message} objects by hand. In fact, +you can also take an existing tree and add new \class{Message} +objects, move them around, etc. This makes a very convenient +interface for slicing-and-dicing MIME messages. + +You can create a new object tree by creating \class{Message} +instances, adding payloads and all the appropriate headers manually. +For MIME messages though, the \module{email} package provides some +convenient classes to make things easier. Each of these classes +should be imported from a module with the same name as the class, from +within the \module{email} package. 
E.g.: + +\begin{verbatim} +from email.MIMEImage import MIMEImage +\end{verbatim} + +or + +\begin{verbatim} +from email.MIMEText import MIMEText +\end{verbatim} + +Here are the classes: + +\begin{classdesc}{MIMEBase}{_maintype, _subtype, **_params} +This is the base class for all the MIME-specific subclasses of +\class{Message}. Ordinarily you won't create instances specifically +of \class{MIMEBase}, although you could. \class{MIMEBase} is provided +primarily as a convenient base class for more specific MIME-aware +subclasses. + +\var{_maintype} is the \code{Content-Type:} major type (e.g. \code{text} or +\code{image}), and \var{_subtype} is the \code{Content-Type:} minor type +(e.g. \code{plain} or \code{gif}). \var{_params} is a parameter +key/value dictionary and is passed directly to +\method{Message.add_header()}. + +The \class{MIMEBase} class always adds a \code{Content-Type:} header +(based on \var{_maintype}, \var{_subtype}, and \var{_params}), and a +\code{MIME-Version:} header (always set to \code{1.0}). +\end{classdesc} + +\begin{classdesc}{MIMEImage}{_imagedata\optional{, _subtype\optional{, + _encoder\optional{, **_params}}}} + +A subclass of \class{MIMEBase}, the \class{MIMEImage} class is used to +create MIME message objects of major type \code{image}. +\var{_imagedata} is a string containing the raw image data. If this +data can be decoded by the standard Python module \refmodule{imghdr}, +then the subtype will be automatically included in the +\code{Content-Type:} header. Otherwise you can explicitly specify the +image subtype via the \var{_subtype} parameter. If the minor type could +not be guessed and \var{_subtype} was not given, then \exception{TypeError} +is raised. + +Optional \var{_encoder} is a callable (i.e. function) which will +perform the actual encoding of the image data for transport. This +callable takes one argument, which is the \class{MIMEImage} instance. 
+It should use \method{get_payload()} and \method{set_payload()} to +change the payload to encoded form. It should also add any +\code{Content-Transfer-Encoding:} or other headers to the message +object as necessary. The default encoding is \emph{Base64}. See the +\refmodule{email.Encoders} module for a list of the built-in encoders. + +\var{_params} are passed straight through to the \class{MIMEBase} +constructor. +\end{classdesc} + +\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{, + _charset\optional{, _encoder}}}} +A subclass of \class{MIMEBase}, the \class{MIMEText} class is used to +create MIME objects of major type \code{text}. \var{_text} is the string +for the payload. \var{_subtype} is the minor type and defaults to +\code{plain}. \var{_charset} is the character set of the text and is +passed as a parameter to the \class{MIMEBase} constructor; it defaults +to \code{us-ascii}. No guessing or encoding is performed on the text +data, but a newline is appended to \var{_text} if it doesn't already +end with a newline. + +The \var{_encoder} argument is as with the \class{MIMEImage} class +constructor, except that the default encoding for \class{MIMEText} +objects is one that doesn't actually modify the payload, but does set +the \code{Content-Transfer-Encoding:} header to \code{7bit} or +\code{8bit} as appropriate. +\end{classdesc} + +\begin{classdesc}{MIMEMessage}{_msg\optional{, _subtype}} +A subclass of \class{MIMEBase}, the \class{MIMEMessage} class is used to +create MIME objects of main type \code{message}. \var{_msg} is used as +the payload, and must be an instance of class \class{Message} (or a +subclass thereof), otherwise a \exception{TypeError} is raised. + +Optional \var{_subtype} sets the subtype of the message; it defaults +to \code{rfc822}. 
+\end{classdesc} + +\subsection{Encoders, Exceptions, Utilities, and Iterators} + +The \module{email} package provides various encoders for safe +transport of binary payloads in \class{MIMEImage} and \class{MIMEText} +instances. See the \refmodule{email.Encoders} module for more +details. + +All of the class exceptions that the \module{email} package can raise +are available in the \refmodule{email.Errors} module. + +Some miscellaneous utility functions are available in the +\refmodule{email.Utils} module. + +Iterating over a message object tree is easy with the +\method{Message.walk()} method; some additional helper iterators are +available in the \refmodule{email.Iterators} module. + +\subsection{Differences from \module{mimelib}} + +The \module{email} package was originally prototyped as a separate +library called \module{mimelib}. Changes have been made so that +method names are more consistent, and some methods or modules have +either been added or removed. The semantics of some of the methods +have also changed. For the most part, any functionality available in +\module{mimelib} is still available in the \module{email} package, +albeit often in a different way. + +Here is a brief description of the differences between the +\module{mimelib} and the \module{email} packages, along with hints on +how to port your applications. + +Of course, the most visible difference between the two packages is +that the package name has been changed to \module{email}. In +addition, the top-level package has the following differences: + +\begin{itemize} +\item \function{messageFromString()} has been renamed to + \function{message_from_string()}. +\item \function{messageFromFile()} has been renamed to + \function{message_from_file()}. +\end{itemize} + +The \class{Message} class has the following differences: + +\begin{itemize} +\item The method \method{asString()} was renamed to \method{as_string()}. 
+\item The method \method{ismultipart()} was renamed to + \method{is_multipart()}. +\item The \method{get_payload()} method has grown a \var{decode} + optional argument. +\item The method \method{getall()} was renamed to \method{get_all()}. +\item The method \method{addheader()} was renamed to \method{add_header()}. +\item The method \method{gettype()} was renamed to \method{get_type()}. +\item The method \method{getmaintype()} was renamed to + \method{get_main_type()}. +\item The method \method{getsubtype()} was renamed to + \method{get_subtype()}. +\item The method \method{getparams()} was renamed to + \method{get_params()}. + Also, whereas \method{getparams()} returned a list of strings, + \method{get_params()} returns a list of 2-tuples, effectively + the key/value pairs of the parameters, split on the \samp{=} + sign. +\item The method \method{getparam()} was renamed to \method{get_param()}. +\item The method \method{getcharsets()} was renamed to + \method{get_charsets()}. +\item The method \method{getfilename()} was renamed to + \method{get_filename()}. +\item The method \method{getboundary()} was renamed to + \method{get_boundary()}. +\item The method \method{setboundary()} was renamed to + \method{set_boundary()}. +\item The method \method{getdecodedpayload()} was removed. To get + similar functionality, pass the value 1 to the \var{decode} flag + of the \method{get_payload()} method. +\item The method \method{getpayloadastext()} was removed. Similar + functionality + is supported by the \class{DecodedGenerator} class in the + \refmodule{email.Generator} module. +\item The method \method{getbodyastext()} was removed. You can get + similar functionality by creating an iterator with + \function{typed_subpart_iterator()} in the + \refmodule{email.Iterators} module. +\end{itemize} + +The \class{Parser} class has no differences in its public interface. 
+It does have some additional smarts to recognize +\code{message/delivery-status} type messages, which it represents as +a \class{Message} instance containing separate \class{Message} +subparts for each header block in the delivery status +notification\footnote{Delivery Status Notifications (DSN) are defined +in \rfc{1894}}. + +The \class{Generator} class has no differences in its public +interface. There is a new class in the \refmodule{email.Generator} +module though, called \class{DecodedGenerator} which provides most of +the functionality previously available in the +\method{Message.getpayloadastext()} method. + +The following modules and classes have been changed: + +\begin{itemize} +\item The \class{MIMEBase} class constructor arguments \var{_major} + and \var{_minor} have changed to \var{_maintype} and + \var{_subtype} respectively. +\item The \code{Image} class/module has been renamed to + \code{MIMEImage}. The \var{_minor} argument has been renamed to + \var{_subtype}. +\item The \code{Text} class/module has been renamed to + \code{MIMEText}. The \var{_minor} argument has been renamed to + \var{_subtype}. +\item The \code{MessageRFC822} class/module has been renamed to + \code{MIMEMessage}. Note that an earlier version of + \module{mimelib} called this class/module \code{RFC822}, but + that clashed with the Python standard library module + \refmodule{rfc822} on some case-insensitive file systems. + + Also, the \class{MIMEMessage} class now represents any kind of + MIME message with main type \code{message}. It takes an + optional argument \var{_subtype} which is used to set the MIME + subtype. \var{_subtype} defaults to \code{rfc822}. +\end{itemize} + +\module{mimelib} provided some utility functions in its +\module{address} and \module{date} modules. All of these functions +have been moved to the \refmodule{email.Utils} module. + +The \code{MsgReader} class/module has been removed. 
Its functionality +is most closely supported in the \function{body_line_iterator()} +function in the \refmodule{email.Iterators} module. + +\subsection{Examples} + +Coming soon... + |
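Until the author's examples are filled in, the progression described in the introduction (create or parse, manipulate, render back to flat text) might be sketched as follows. This is only an illustration run against a current Python 3 interpreter, so it assumes the modern spellings `email.mime.text` and `get_content_type()` in place of the 2.2-era `email.MIMEText` and `get_type()` documented here; the address and the `X-Reviewed` header are made up for the example.

```python
import email
from email.mime.text import MIMEText

# Create a text/plain message from scratch.
original = MIMEText("Hello, world!\n")
original["Subject"] = "Greetings"       # hypothetical header values
original["To"] = "bob@example.com"

# Generate the flat text form, e.g. for handing to smtplib.
flat_text = original.as_string()

# Parse the flat text back into a message object tree.
parsed = email.message_from_string(flat_text)
content_type = parsed.get_content_type()
body = parsed.get_payload()

# Manipulate the parsed tree, then render it to flat text again.
parsed["X-Reviewed"] = "yes"
regenerated = parsed.as_string()
```

For a message this simple, parsing `flat_text` and immediately calling `as_string()` again should reproduce the same text, which is the round-trip property the generator section describes.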