Diffstat (limited to 'Doc/lib/email.tex')
-rw-r--r-- Doc/lib/email.tex 358
1 files changed, 358 insertions, 0 deletions
diff --git a/Doc/lib/email.tex b/Doc/lib/email.tex
new file mode 100644
index 0000000..eba0684
--- /dev/null
+++ b/Doc/lib/email.tex
@@ -0,0 +1,358 @@
+% Copyright (C) 2001 Python Software Foundation
+% Author: barry@zope.com (Barry Warsaw)
+
+\section{\module{email} --
+ An email and MIME handling package}
+
+\declaremodule{standard}{email}
+\modulesynopsis{Package supporting the parsing, manipulation, and
+ generation of email messages, including MIME documents.}
+\moduleauthor{Barry A. Warsaw}{barry@zope.com}
+
+\versionadded{2.2}
+
+The \module{email} package is a library for managing email messages,
+including MIME and other \rfc{2822}-based message documents. It
+subsumes most of the functionality in several older standard modules
+such as \module{rfc822}, \module{mimetools}, \module{multifile}, and
+other non-standard packages such as \module{mimecntl}.
+
+The primary distinguishing feature of the \module{email} package is
+that it splits the parsing and generating of email messages from the
+internal \emph{object model} representation of email. Applications
+using the \module{email} package deal primarily with objects; you can
+add sub-objects to messages, remove sub-objects from messages,
+completely re-arrange the contents, etc. There is a separate parser
+and a separate generator which handle the transformation from flat
+text to the object model, and then back to flat text again. There
+are also handy subclasses for some common MIME object types, and a few
+miscellaneous utilities that help with such common tasks as extracting
+and parsing message field values, creating RFC-compliant dates, etc.
+
+The following sections describe the functionality of the
+\module{email} package. The ordering follows a progression that
+should be common in applications: an email message is read as flat
+text from a file or other source, the text is parsed to produce an
+object model representation of the email message, this model is
+manipulated, and finally the model is rendered back into
+flat text.
+
+It is perfectly feasible to create the object model out of whole cloth
+-- i.e. completely from scratch. From there, a similar progression can
+be taken as above.
+
+Also included are detailed specifications of all the classes and
+modules that the \module{email} package provides, the exception
+classes you might encounter while using the \module{email} package,
+some auxiliary utilities, and a few examples. For users of the older
+\module{mimelib} package, from which the \module{email} package is
+descended, a section on differences and porting is provided.
+
+\subsection{Representing an email message}
+
+The primary object in the \module{email} package is the
+\class{Message} class, provided in the \refmodule{email.Message}
+module. \class{Message} is the base class for the \module{email}
+object model. It provides the core functionality for setting and
+querying header fields, and for accessing message bodies.
+
+Conceptually, a \class{Message} object consists of \emph{headers} and
+\emph{payloads}. Headers are \rfc{2822} style field names and
+values where the field name and value are separated by a colon. The
+colon is not part of either the field name or the field value.
+
+Headers are stored and returned in case-preserving form but are
+matched case-insensitively. There may also be a single
+\emph{Unix-From} header, also known as the envelope header or the
+\code{From_} header. The payload is either a string in the case of
+simple message objects, a list of \class{Message} objects for
+multipart MIME documents, or a single \class{Message} instance for
+\code{message/rfc822} type objects.
+
+\class{Message} objects provide a mapping style interface for
+accessing the message headers, and an explicit interface for accessing
+both the headers and the payload. They also provide convenience methods for
+generating a flat text representation of the message object tree, for
+accessing commonly used header parameters, and for recursively walking
+over the object tree.
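+
+As a minimal sketch of the mapping style interface (the message text
+and the variable names below are illustrative only and are not part of
+the package), a header can be looked up under any capitalization:
+
+\begin{verbatim}
+import email
+
+# Hypothetical message text: two headers, a blank line, then the body.
+text = 'From: alice@example.com\nSubject: Hello\n\nHi there.\n'
+msg = email.message_from_string(text)
+
+# Header lookup matches case-insensitively; a missing header gives None.
+subject = msg['subject']          # 'Hello'
+missing = msg['x-no-such-field']  # None
+
+# For a simple, non-multipart message the payload is a string.
+body = msg.get_payload()
+\end{verbatim}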
+
+\subsection{Parsing email messages}
+Message object trees can be created in one of two ways: they can be
+created from whole cloth by instantiating \class{Message} objects and
+stringing them together via \method{add_payload()} and
+\method{set_payload()} calls, or they can be created by parsing a flat text
+representation of the email message.
+
+The \module{email} package provides a standard parser that understands
+most email document structures, including MIME documents. You can
+pass the parser a string or a file object, and the parser will return
+to you the root \class{Message} instance of the object tree. For
+simple, non-MIME messages the payload of this root object will likely
+be a string (e.g. containing the text of the message). For multipart MIME
+messages, the root object will return 1 from its
+\method{is_multipart()} method, and the subparts can be accessed via
+the \method{get_payload()} and \method{walk()} methods.
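+
+The following sketch shows one way this might look in practice. The
+two-part message text is made up for illustration, and the variable
+names are arbitrary:
+
+\begin{verbatim}
+import email
+
+# Hypothetical two-part MIME message used only for this example.
+text = '\n'.join([
+    'MIME-Version: 1.0',
+    'Content-Type: multipart/mixed; boundary="XXXX"',
+    '',
+    '--XXXX',
+    'Content-Type: text/plain',
+    '',
+    'first part',
+    '--XXXX',
+    'Content-Type: text/plain',
+    '',
+    'second part',
+    '--XXXX--',
+    ''])
+
+root = email.message_from_string(text)
+
+if root.is_multipart():
+    # For a multipart message, get_payload() returns the list of subparts.
+    subparts = root.get_payload()
+
+# walk() visits the root first and then every subpart, depth first.
+parts = []
+for part in root.walk():
+    parts.append(part)
+\end{verbatim}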
+
+Note that the parser can be extended in limited ways, and of course
+you can implement your own parser completely from scratch. There is
+no magical connection between the \module{email} package's bundled
+parser and the
+\class{Message} class, so your custom parser can create message object
+trees in any way it finds necessary. The \module{email} package's
+parser is described in detail in the \refmodule{email.Parser} module
+documentation.
+
+\subsection{Generating MIME documents}
+One of the most common tasks is to generate the flat text of the email
+message represented by a message object tree. You will need to do
+this if you want to send your message via the \refmodule{smtplib}
+module or the \refmodule{nntplib} module, or print the message on the
+console. Taking a message object tree and producing a flat text
+document is the job of the \refmodule{email.Generator} module.
+
+Again, as with the \refmodule{email.Parser} module, you aren't limited
+to the functionality of the bundled generator; you could write one
+from scratch yourself. However, the bundled generator knows how to
+generate most email in a standards-compliant way, should handle MIME
+and non-MIME email messages just fine, and is designed so that the
+transformation from flat text, to an object tree via the
+\class{Parser} class,
+and back to flat text, is idempotent (the input is identical to the
+output).
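+
+A hedged sketch of that round trip follows. It uses the
+\method{as_string()} convenience method of \class{Message} rather than
+driving the \class{Generator} class directly, and the sample text is
+hypothetical:
+
+\begin{verbatim}
+import email
+
+# Hypothetical flat text for a simple, non-MIME message.
+text = 'From: alice@example.com\nSubject: Hello\n\nHi there.\n'
+
+# Flat text -> object tree -> flat text again.
+msg = email.message_from_string(text)
+flat = msg.as_string()
+
+# For a simple message like this one, the output is expected to match
+# the input.
+same = (flat == text)
+\end{verbatim}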
+
+\subsection{Creating email and MIME objects from scratch}
+
+Ordinarily, you get a message object tree by passing some text to a
+parser, which parses the text and returns the root of the message
+object tree. However, you can also build a complete object tree from
+scratch, or even individual \class{Message} objects by hand. In fact,
+you can also take an existing tree and add new \class{Message}
+objects, move them around, etc. This makes a very convenient
+interface for slicing-and-dicing MIME messages.
+
+You can create a new object tree by creating \class{Message}
+instances, adding payloads and all the appropriate headers manually.
+For MIME messages though, the \module{email} package provides some
+convenient classes to make things easier. Each of these classes
+should be imported from a module with the same name as the class, from
+within the \module{email} package. E.g.:
+
+\begin{verbatim}
+import email.MIMEImage    # the class is then email.MIMEImage.MIMEImage
+\end{verbatim}
+
+or
+
+\begin{verbatim}
+from email.MIMEText import MIMEText
+\end{verbatim}
+
+Here are the classes:
+
+\begin{classdesc}{MIMEBase}{_maintype, _subtype, **_params}
+This is the base class for all the MIME-specific subclasses of
+\class{Message}. Ordinarily you won't create instances specifically
+of \class{MIMEBase}, although you could. \class{MIMEBase} is provided
+primarily as a convenient base class for more specific MIME-aware
+subclasses.
+
+\var{_maintype} is the \code{Content-Type:} major type (e.g. \code{text} or
+\code{image}), and \var{_subtype} is the \code{Content-Type:} minor type
+(e.g. \code{plain} or \code{gif}). \var{_params} is a parameter
+key/value dictionary and is passed directly to
+\method{Message.add_header()}.
+
+The \class{MIMEBase} class always adds a \code{Content-Type:} header
+(based on \var{_maintype}, \var{_subtype}, and \var{_params}), and a
+\code{MIME-Version:} header (always set to \code{1.0}).
+\end{classdesc}
+
+\begin{classdesc}{MIMEImage}{_imagedata\optional{, _subtype\optional{,
+ _encoder\optional{, **_params}}}}
+
+A subclass of \class{MIMEBase}, the \class{MIMEImage} class is used to
+create MIME message objects of major type \code{image}.
+\var{_imagedata} is a string containing the raw image data. If this
+data can be decoded by the standard Python module \refmodule{imghdr},
+then the subtype will be automatically included in the
+\code{Content-Type:} header. Otherwise you can explicitly specify the
+image subtype via the \var{_subtype} parameter. If the minor type could
+not be guessed and \var{_subtype} was not given, then \exception{TypeError}
+is raised.
+
+Optional \var{_encoder} is a callable (i.e. function) which will
+perform the actual encoding of the image data for transport. This
+callable takes one argument, which is the \class{MIMEImage} instance.
+It should use \method{get_payload()} and \method{set_payload()} to
+change the payload to encoded form. It should also add any
+\code{Content-Transfer-Encoding:} or other headers to the message
+object as necessary. The default encoding is \emph{Base64}. See the
+\refmodule{email.Encoders} module for a list of the built-in encoders.
+
+\var{_params} are passed straight through to the \class{MIMEBase}
+constructor.
+\end{classdesc}
+
+\begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{,
+ _charset\optional{, _encoder}}}}
+A subclass of \class{MIMEBase}, the \class{MIMEText} class is used to
+create MIME objects of major type \code{text}. \var{_text} is the string
+for the payload. \var{_subtype} is the minor type and defaults to
+\code{plain}. \var{_charset} is the character set of the text and is
+passed as a parameter to the \class{MIMEBase} constructor; it defaults
+to \code{us-ascii}. No guessing or encoding is performed on the text
+data, but a newline is appended to \var{_text} if it doesn't already
+end with a newline.
+
+The \var{_encoder} argument is as with the \class{MIMEImage} class
+constructor, except that the default encoding for \class{MIMEText}
+objects is one that doesn't actually modify the payload, but does set
+the \code{Content-Transfer-Encoding:} header to \code{7bit} or
+\code{8bit} as appropriate.
+\end{classdesc}
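+
+As an illustration, a plain-text message might be built as follows.
+This is a sketch only: the addresses and text are placeholders, and the
+import follows the module layout described in this section, so it would
+need adjusting wherever that layout differs.
+
+\begin{verbatim}
+from email.MIMEText import MIMEText
+
+# _subtype defaults to 'plain' and _charset to 'us-ascii'.
+msg = MIMEText('Hello, world')
+
+# Headers are set through the mapping style interface.
+msg['Subject'] = 'Greetings'
+msg['From'] = 'alice@example.com'
+msg['To'] = 'bob@example.com'
+
+# Render the object as flat text, e.g. for handing to smtplib.
+flat = msg.as_string()
+\end{verbatim}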
+
+\begin{classdesc}{MIMEMessage}{_msg\optional{, _subtype}}
+A subclass of \class{MIMEBase}, the \class{MIMEMessage} class is used to
+create MIME objects of main type \code{message}. \var{_msg} is used as
+the payload, and must be an instance of class \class{Message} (or a
+subclass thereof), otherwise a \exception{TypeError} is raised.
+
+Optional \var{_subtype} sets the subtype of the message; it defaults
+to \code{rfc822}.
+\end{classdesc}
+
+\subsection{Encoders, Exceptions, Utilities, and Iterators}
+
+The \module{email} package provides various encoders for safe
+transport of binary payloads in \class{MIMEImage} and \class{MIMEText}
+instances. See the \refmodule{email.Encoders} module for more
+details.
+
+All of the exception classes that the \module{email} package can raise
+are available in the \refmodule{email.Errors} module.
+
+Some miscellaneous utility functions are available in the
+\refmodule{email.Utils} module.
+
+Iterating over a message object tree is easy with the
+\method{Message.walk()} method; some additional helper iterators are
+available in the \refmodule{email.Iterators} module.
+
+\subsection{Differences from \module{mimelib}}
+
+The \module{email} package was originally prototyped as a separate
+library called \module{mimelib}. Changes have been made so that
+method names are more consistent, and some methods or modules have
+either been added or removed. The semantics of some of the methods
+have also changed. For the most part, any functionality available in
+\module{mimelib} is still available in the \module{email} package,
+albeit often in a different way.
+
+Here is a brief description of the differences between the
+\module{mimelib} and the \module{email} packages, along with hints on
+how to port your applications.
+
+Of course, the most visible difference between the two packages is
+that the package name has been changed to \module{email}. In
+addition, the top-level package has the following differences:
+
+\begin{itemize}
+\item \function{messageFromString()} has been renamed to
+ \function{message_from_string()}.
+\item \function{messageFromFile()} has been renamed to
+ \function{message_from_file()}.
+\end{itemize}
+
+The \class{Message} class has the following differences:
+
+\begin{itemize}
+\item The method \method{asString()} was renamed to \method{as_string()}.
+\item The method \method{ismultipart()} was renamed to
+ \method{is_multipart()}.
+\item The \method{get_payload()} method has grown a \var{decode}
+ optional argument.
+\item The method \method{getall()} was renamed to \method{get_all()}.
+\item The method \method{addheader()} was renamed to \method{add_header()}.
+\item The method \method{gettype()} was renamed to \method{get_type()}.
+\item The method \method{getmaintype()} was renamed to
+ \method{get_main_type()}.
+\item The method \method{getsubtype()} was renamed to
+ \method{get_subtype()}.
+\item The method \method{getparams()} was renamed to
+ \method{get_params()}.
+ Also, whereas \method{getparams()} returned a list of strings,
+ \method{get_params()} returns a list of 2-tuples, effectively
+ the key/value pairs of the parameters, split on the \samp{=}
+ sign.
+\item The method \method{getparam()} was renamed to \method{get_param()}.
+\item The method \method{getcharsets()} was renamed to
+ \method{get_charsets()}.
+\item The method \method{getfilename()} was renamed to
+ \method{get_filename()}.
+\item The method \method{getboundary()} was renamed to
+ \method{get_boundary()}.
+\item The method \method{setboundary()} was renamed to
+ \method{set_boundary()}.
+\item The method \method{getdecodedpayload()} was removed. To get
+ similar functionality, pass the value 1 to the \var{decode} flag
+ of the \method{get_payload()} method.
+\item The method \method{getpayloadastext()} was removed. Similar
+ functionality
+ is supported by the \class{DecodedGenerator} class in the
+ \refmodule{email.Generator} module.
+\item The method \method{getbodyastext()} was removed. You can get
+ similar functionality by creating an iterator with
+ \function{typed_subpart_iterator()} in the
+ \refmodule{email.Iterators} module.
+\end{itemize}
+
+The \class{Parser} class has no differences in its public interface.
+It does have some additional smarts to recognize
+\code{message/delivery-status} type messages, which it represents as
+a \class{Message} instance containing separate \class{Message}
+subparts for each header block in the delivery status
+notification\footnote{Delivery Status Notifications (DSN) are defined
+in \rfc{1894}}.
+
+The \class{Generator} class has no differences in its public
+interface. There is a new class in the \refmodule{email.Generator}
+module though, called \class{DecodedGenerator}, which provides most of
+the functionality previously available in the
+\method{Message.getpayloadastext()} method.
+
+The following modules and classes have been changed:
+
+\begin{itemize}
+\item The \class{MIMEBase} class constructor arguments \var{_major}
+ and \var{_minor} have changed to \var{_maintype} and
+ \var{_subtype} respectively.
+\item The \code{Image} class/module has been renamed to
+ \code{MIMEImage}. The \var{_minor} argument has been renamed to
+ \var{_subtype}.
+\item The \code{Text} class/module has been renamed to
+ \code{MIMEText}. The \var{_minor} argument has been renamed to
+ \var{_subtype}.
+\item The \code{MessageRFC822} class/module has been renamed to
+ \code{MIMEMessage}. Note that an earlier version of
+ \module{mimelib} called this class/module \code{RFC822}, but
+ that clashed with the Python standard library module
+ \refmodule{rfc822} on some case-insensitive file systems.
+
+ Also, the \class{MIMEMessage} class now represents any kind of
+ MIME message with main type \code{message}. It takes an
+ optional argument \var{_subtype} which is used to set the MIME
+ subtype. \var{_subtype} defaults to \code{rfc822}.
+\end{itemize}
+
+\module{mimelib} provided some utility functions in its
+\module{address} and \module{date} modules. All of these functions
+have been moved to the \refmodule{email.Utils} module.
+
+The \code{MsgReader} class/module has been removed. Its functionality
+is most closely supported in the \function{body_line_iterator()}
+function in the \refmodule{email.Iterators} module.
+
+\subsection{Examples}
+
+Coming soon...
+