-rw-r--r-- | Doc/lib/email.tex | 120
-rw-r--r-- | Doc/lib/emailencoders.tex | 18
-rw-r--r-- | Doc/lib/emailexc.tex | 17
-rw-r--r-- | Doc/lib/emailgenerator.tex | 34
-rw-r--r-- | Doc/lib/emailiter.tex | 10
-rw-r--r-- | Doc/lib/emailmessage.tex | 115
-rw-r--r-- | Doc/lib/emailparser.tex | 54
-rw-r--r-- | Doc/lib/emailutil.tex | 18
8 files changed, 161 insertions, 225 deletions
diff --git a/Doc/lib/email.tex b/Doc/lib/email.tex index 4d955cc..afa13ff 100644 --- a/Doc/lib/email.tex +++ b/Doc/lib/email.tex @@ -14,8 +14,9 @@ The \module{email} package is a library for managing email messages, including MIME and other \rfc{2822}-based message documents. It subsumes most of the functionality in several older standard modules -such as \module{rfc822}, \module{mimetools}, \module{multifile}, and -other non-standard packages such as \module{mimecntl}. +such as \refmodule{rfc822}, \refmodule{mimetools}, +\refmodule{multifile}, and other non-standard packages such as +\module{mimecntl}. The primary distinguishing feature of the \module{email} package is that it splits the parsing and generating of email messages from the @@ -38,8 +39,8 @@ manipulated, and finally the model is rendered back into flat text. It is perfectly feasible to create the object model out of whole cloth --- i.e. completely from scratch. From there, a similar progression can -be taken as above. +--- i.e. completely from scratch. From there, a similar progression +can be taken as above. Also included are detailed specifications of all the classes and modules that the \module{email} package provides, the exception @@ -49,76 +50,13 @@ some auxiliary utilities, and a few examples. For users of the older descendent, a section on differences and porting is provided. \subsection{Representing an email message} - -The primary object in the \module{email} package is the -\class{Message} class, provided in the \refmodule{email.Message} -module. \class{Message} is the base class for the \module{email} -object model. It provides the core functionality for setting and -querying header fields, and for accessing message bodies. - -Conceptually, a \class{Message} object consists of \emph{headers} and -\emph{payloads}. Headers are \rfc{2822} style field name and -values where the field name and value are separated by a colon. The -colon is not part of either the field name or the field value. 
- -Headers are stored and returned in case-preserving form but are -matched case-insensitively. There may also be a single -\emph{Unix-From} header, also known as the envelope header or the -\mailheader{From_} header. The payload is either a string in the case of -simple message objects, a list of \class{Message} objects for -multipart MIME documents, or a single \class{Message} instance for -\mimetype{message/rfc822} type objects. - -\class{Message} objects provide a mapping style interface for -accessing the message headers, and an explicit interface for accessing -both the headers and the payload. It provides convenience methods for -generating a flat text representation of the message object tree, for -accessing commonly used header parameters, and for recursively walking -over the object tree. +\input{emailmessage} \subsection{Parsing email messages} -Message object trees can be created in one of two ways: they can be -created from whole cloth by instantiating \class{Message} objects and -stringing them together via \method{add_payload()} and -\method{set_payload()} calls, or they can be created by parsing a flat text -representation of the email message. - -The \module{email} package provides a standard parser that understands -most email document structures, including MIME documents. You can -pass the parser a string or a file object, and the parser will return -to you the root \class{Message} instance of the object tree. For -simple, non-MIME messages the payload of this root object will likely -be a string (e.g. containing the text of the message). For MIME -messages, the root object will return 1 from its -\method{is_multipart()} method, and the subparts can be accessed via -the \method{get_payload()} and \method{walk()} methods. - -Note that the parser can be extended in limited ways, and of course -you can implement your own parser completely from scratch. 
There is -no magical connection between the \module{email} package's bundled -parser and the -\class{Message} class, so your custom parser can create message object -trees in any way it find necessary. The \module{email} package's -parser is described in detail in the \refmodule{email.Parser} module -documentation. +\input{emailparser} \subsection{Generating MIME documents} -One of the most common tasks is to generate the flat text of the email -message represented by a message object tree. You will need to do -this if you want to send your message via the \refmodule{smtplib} -module or the \refmodule{nntplib} module, or print the message on the -console. Taking a message object tree and producing a flat text -document is the job of the \refmodule{email.Generator} module. - -Again, as with the \refmodule{email.Parser} module, you aren't limited -to the functionality of the bundled generator; you could write one -from scratch yourself. However the bundled generator knows how to -generate most email in a standards-compliant way, should handle MIME -and non-MIME email messages just fine, and is designed so that the -transformation from flat text, to an object tree via the -\class{Parser} class, -and back to flat text, be idempotent (the input is identical to the -output). +\input{emailgenerator} \subsection{Creating email and MIME objects from scratch} @@ -156,9 +94,10 @@ of \class{MIMEBase}, although you could. \class{MIMEBase} is provided primarily as a convenient base class for more specific MIME-aware subclasses. -\var{_maintype} is the \code{Content-Type:} major type (e.g. \code{text} or -\code{image}), and \var{_subtype} is the \code{Content-Type:} minor type -(e.g. \code{plain} or \code{gif}). \var{_params} is a parameter +\var{_maintype} is the \mailheader{Content-Type} major type +(e.g. \mimetype{text} or \mimetype{image}), and \var{_subtype} is the +\mailheader{Content-Type} minor type +(e.g. \mimetype{plain} or \mimetype{gif}). 
\var{_params} is a parameter key/value dictionary and is passed directly to \method{Message.add_header()}. @@ -195,10 +134,11 @@ constructor. \begin{classdesc}{MIMEText}{_text\optional{, _subtype\optional{, _charset\optional{, _encoder}}}} + A subclass of \class{MIMEBase}, the \class{MIMEText} class is used to -create MIME objects of major type \mimetype{text}. \var{_text} is the string -for the payload. \var{_subtype} is the minor type and defaults to -\mimetype{plain}. \var{_charset} is the character set of the text and is +create MIME objects of major type \mimetype{text}. \var{_text} is the +string for the payload. \var{_subtype} is the minor type and defaults +to \mimetype{plain}. \var{_charset} is the character set of the text and is passed as a parameter to the \class{MIMEBase} constructor; it defaults to \code{us-ascii}. No guessing or encoding is performed on the text data, but a newline is appended to \var{_text} if it doesn't already @@ -221,27 +161,24 @@ Optional \var{_subtype} sets the subtype of the message; it defaults to \mimetype{rfc822}. \end{classdesc} -\subsection{Encoders, Exceptions, Utilities, and Iterators} +\subsection{Encoders} +\input{emailencoders} -The \module{email} package provides various encoders for safe -transport of binary payloads in \class{MIMEImage} and \class{MIMEText} -instances. See the \refmodule{email.Encoders} module for more -details. +\subsection{Exception classes} +\input{emailexc} -All of the class exceptions that the \module{email} package can raise -are available in the \refmodule{email.Errors} module. +\subsection{Miscellaneous utilities} +\input{emailutil} -Some miscellaneous utility functions are available in the -\refmodule{email.Utils} module. - -Iterating over a message object tree is easy with the -\method{Message.walk()} method; some additional helper iterators are -available in the \refmodule{email.Iterators} module. 
+\subsection{Iterators} +\input{emailiter} \subsection{Differences from \module{mimelib}} The \module{email} package was originally prototyped as a separate -library called \module{mimelib}. Changes have been made so that +library called +\ulink{\module{mimelib}}{http://mimelib.sf.net/}. +Changes have been made so that method names are more consistent, and some methods or modules have either been added or removed. The semantics of some of the methods have also changed. For the most part, any functionality available in @@ -282,7 +219,7 @@ The \class{Message} class has the following differences: \method{get_params()}. Also, whereas \method{getparams()} returned a list of strings, \method{get_params()} returns a list of 2-tuples, effectively - the key/value pairs of the parameters, split on the \samp{=} + the key/value pairs of the parameters, split on the \character{=} sign. \item The method \method{getparam()} was renamed to \method{get_param()}. \item The method \method{getcharsets()} was renamed to @@ -355,4 +292,3 @@ function in the \refmodule{email.Iterators} module. \subsection{Examples} Coming soon... - diff --git a/Doc/lib/emailencoders.tex b/Doc/lib/emailencoders.tex index 6ebb302..3e247a9 100644 --- a/Doc/lib/emailencoders.tex +++ b/Doc/lib/emailencoders.tex @@ -1,16 +1,10 @@ -\section{\module{email.Encoders} --- - Email message payload encoders} - \declaremodule{standard}{email.Encoders} \modulesynopsis{Encoders for email message payloads.} -\sectionauthor{Barry A. Warsaw}{barry@zope.com} - -\versionadded{2.2} When creating \class{Message} objects from scratch, you often need to encode the payloads for transport through compliant mail servers. -This is especially true for \code{image/*} and \code{text/*} type -messages containing binary data. +This is especially true for \mimetype{image/*} and \mimetype{text/*} +type messages containing binary data. The \module{email} package provides some convenient encodings in its \module{Encoders} module. 
These encoders are actually used by the @@ -18,7 +12,7 @@ The \module{email} package provides some convenient encodings in its encodings. All encoder functions take exactly one argument, the message object to encode. They usually extract the payload, encode it, and reset the payload to this newly encoded value. They should also -set the \code{Content-Transfer-Encoding:} header as appropriate. +set the \mailheader{Content-Transfer-Encoding} header as appropriate. Here are the encoding functions provided: @@ -34,7 +28,7 @@ printable data, but contains a few unprintable characters. \begin{funcdesc}{encode_base64}{msg} Encodes the payload into \emph{Base64} form and sets the -\code{Content-Transfer-Encoding:} header to +\mailheader{Content-Transfer-Encoding} header to \code{base64}. This is a good encoding to use when most of your payload is unprintable data since it is a more compact form than Quoted-Printable. The drawback of Base64 encoding is that it @@ -43,11 +37,11 @@ renders the text non-human readable. \begin{funcdesc}{encode_7or8bit}{msg} This doesn't actually modify the message's payload, but it does set -the \code{Content-Transfer-Encoding:} header to either \code{7bit} or +the \mailheader{Content-Transfer-Encoding} header to either \code{7bit} or \code{8bit} as appropriate, based on the payload data. \end{funcdesc} \begin{funcdesc}{encode_noop}{msg} This does nothing; it doesn't even set the -\code{Content-Transfer-Encoding:} header. +\mailheader{Content-Transfer-Encoding} header. \end{funcdesc} diff --git a/Doc/lib/emailexc.tex b/Doc/lib/emailexc.tex index 8b2d189..4929244 100644 --- a/Doc/lib/emailexc.tex +++ b/Doc/lib/emailexc.tex @@ -1,11 +1,5 @@ -\section{\module{email.Errors} --- - email package exception classes} - -\declaremodule{standard}{email.Exceptions} +\declaremodule{standard}{email.Errors} \modulesynopsis{The exception classes used by the email package.} -\sectionauthor{Barry A. 
Warsaw}{barry@zope.com} - -\versionadded{2.2} The following exception classes are defined in the \module{email.Errors} module: @@ -41,13 +35,14 @@ It can be raised from the \method{Parser.parse()} or \method{Parser.parsestr()} methods. Situations where it can be raised include not being able to find the -starting or terminating boundary in a \code{multipart/*} message. +starting or terminating boundary in a \mimetype{multipart/*} message. \end{excclassdesc} \begin{excclassdesc}{MultipartConversionError}{} Raised when a payload is added to a \class{Message} object using \method{add_payload()}, but the payload is already a scalar and the -message's \code{Content-Type:} main type is not either \code{multipart} -or missing. \exception{MultipartConversionError} multiply inherits -from \exception{MessageError} and the built-in \exception{TypeError}. +message's \mailheader{Content-Type} main type is not either +\mimetype{multipart} or missing. \exception{MultipartConversionError} +multiply inherits from \exception{MessageError} and the built-in +\exception{TypeError}. \end{excclassdesc} diff --git a/Doc/lib/emailgenerator.tex b/Doc/lib/emailgenerator.tex index 2cb58ec..6ded8d1 100644 --- a/Doc/lib/emailgenerator.tex +++ b/Doc/lib/emailgenerator.tex @@ -1,17 +1,24 @@ -\section{\module{email.Generator} --- - Generating flat text from an email message object tree} - \declaremodule{standard}{email.Generator} -\modulesynopsis{Generate flat text email messages to from a message - object tree.} -\sectionauthor{Barry A. Warsaw}{barry@zope.com} +\modulesynopsis{Generate flat text email messages from a message object tree.} + +One of the most common tasks is to generate the flat text of the email +message represented by a message object tree. You will need to do +this if you want to send your message via the \refmodule{smtplib} +module or the \refmodule{nntplib} module, or print the message on the +console. 
Taking a message object tree and producing a flat text +document is the job of the \class{Generator} class. -\versionadded{2.2} +Again, as with the \refmodule{email.Parser} module, you aren't limited +to the functionality of the bundled generator; you could write one +from scratch yourself. However the bundled generator knows how to +generate most email in a standards-compliant way, should handle MIME +and non-MIME email messages just fine, and is designed so that the +transformation from flat text, to an object tree via the +\class{Parser} class, +and back to flat text, be idempotent (the input is identical to the +output). -The \class{Generator} class is used to render a message object model -into its flat text representation, including MIME encoding any -sub-messages, generating the correct \rfc{2822} headers, etc. Here -are the public methods of the \class{Generator} class. +Here are the public methods of the \class{Generator} class: \begin{classdesc}{Generator}{outfp\optional{, mangle_from_\optional{, maxheaderlen}}} @@ -25,8 +32,9 @@ character in front of any line in the body that starts exactly as \samp{From } (i.e. \code{From} followed by a space at the front of the line). This is the only guaranteed portable way to avoid having such lines be mistaken for \emph{Unix-From} headers (see -\url{http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html} - for details). +\ulink{WHY THE CONTENT-LENGTH FORMAT IS BAD} +{http://home.netscape.com/eng/mozilla/2.0/relnotes/demo/content-length.html} +for details). Optional \var{maxheaderlen} specifies the longest length for a non-continued header. 
When a header line is longer than diff --git a/Doc/lib/emailiter.tex b/Doc/lib/emailiter.tex index fbaafbb..eed98be 100644 --- a/Doc/lib/emailiter.tex +++ b/Doc/lib/emailiter.tex @@ -1,11 +1,5 @@ -\section{\module{email.Iterators} --- - Message object tree iterators} - \declaremodule{standard}{email.Iterators} \modulesynopsis{Iterate over a message object tree.} -\sectionauthor{Barry A. Warsaw}{barry@zope.com} - -\versionadded{2.2} Iterating over a message object tree is fairly easy with the \method{Message.walk()} method. The \module{email.Iterators} module @@ -29,9 +23,9 @@ subparts that match the MIME type specified by \var{maintype} and Note that \var{subtype} is optional; if omitted, then subpart MIME type matching is done only with the main type. \var{maintype} is -optional too; it defaults to \code{text}. +optional too; it defaults to \mimetype{text}. Thus, by default \function{typed_subpart_iterator()} returns each -subpart that has a MIME type of \code{text/*}. +subpart that has a MIME type of \mimetype{text/*}. \end{funcdesc} diff --git a/Doc/lib/emailmessage.tex b/Doc/lib/emailmessage.tex index bc9c0ce..ea2d0df 100644 --- a/Doc/lib/emailmessage.tex +++ b/Doc/lib/emailmessage.tex @@ -1,30 +1,37 @@ -\section{\module{email.Message} --- - The Message class} - \declaremodule{standard}{email.Message} \modulesynopsis{The base class representing email messages.} -\sectionauthor{Barry A. Warsaw}{barry@zope.com} - -\versionadded{2.2} - -The \module{Message} module provides a single class, the -\class{Message} class. This class is the base class for the -\module{email} package object model. It has a fairly extensive set of -methods to get and set email headers and email payloads. For an -introduction of the \module{email} package, please read the -\refmodule{email} package overview. - -\class{Message} instances can be created either directly, or -indirectly by using a \refmodule{email.Parser}. 
\class{Message} -objects provide a mapping style interface for accessing the message -headers, and an explicit interface for accessing both the headers and -the payload. It provides convenience methods for generating a flat -text representation of the message object tree, for accessing commonly -used header parameters, and for recursively walking over the object -tree. + +The central class in the \module{email} package is the +\class{Message} class; it is the base class for the \module{email} +object model. \class{Message} provides the core functionality for +setting and querying header fields, and for accessing message bodies. + +Conceptually, a \class{Message} object consists of \emph{headers} and +\emph{payloads}. Headers are \rfc{2822} style field names and +values where the field name and value are separated by a colon. The +colon is not part of either the field name or the field value. + +Headers are stored and returned in case-preserving form but are +matched case-insensitively. There may also be a single +\emph{Unix-From} header, also known as the envelope header or the +\code{From_} header. The payload is either a string in the case of +simple message objects, a list of \class{Message} objects for +multipart MIME documents, or a single \class{Message} instance for +\mimetype{message/rfc822} type objects. + +\class{Message} objects provide a mapping style interface for +accessing the message headers, and an explicit interface for accessing +both the headers and the payload. It provides convenience methods for +generating a flat text representation of the message object tree, for +accessing commonly used header parameters, and for recursively walking +over the object tree. Here are the methods of the \class{Message} class: +\begin{classdesc}{Message}{} +The constructor takes no arguments. +\end{classdesc} + \begin{methoddesc}[Message]{as_string}{\optional{unixfrom}} Return the entire formatted message as a string. 
Optional \var{unixfrom}, when true, specifies to include the \emph{Unix-From} @@ -66,8 +73,9 @@ For any other type of existing payload, \method{add_payload()} will transform the new payload into a list consisting of the old payload and \var{payload}, but only if the document is already a MIME multipart document. This condition is satisfied if the message's -\code{Content-Type:} header's main type is either \var{multipart}, or -there is no \code{Content-Type:} header. In any other situation, +\mailheader{Content-Type} header's main type is either +\mimetype{multipart}, or there is no \mailheader{Content-Type} +header. In any other situation, \exception{MultipartConversionError} is raised. \end{methoddesc} @@ -83,18 +91,18 @@ string or a single \class{Message} instance) when With optional \var{i}, \method{get_payload()} will return the \var{i}-th element of the payload, counting from zero, if -\method{is_multipart()} returns 1. An \code{IndexError} will be raised +\method{is_multipart()} returns 1. An \exception{IndexError} will be raised if \var{i} is less than 0 or greater than or equal to the number of items in the payload. If the payload is scalar (i.e. \method{is_multipart()} returns 0) and \var{i} is given, a -\code{TypeError} is raised. +\exception{TypeError} is raised. Optional \var{decode} is a flag indicating whether the payload should be -decoded or not, according to the \code{Content-Transfer-Encoding:} header. +decoded or not, according to the \mailheader{Content-Transfer-Encoding} header. When true and the message is not a multipart, the payload will be decoded if this header's value is \samp{quoted-printable} or \samp{base64}. If some other encoding is used, or -\code{Content-Transfer-Encoding:} header is +\mailheader{Content-Transfer-Encoding} header is missing, the payload is returned as-is (undecoded). If the message is a multipart and the \var{decode} flag is true, then \code{None} is returned. 
@@ -137,7 +145,7 @@ if 'message-id' in myMessage: \begin{methoddesc}[Message]{__getitem__}{name} Return the value of the named header field. \var{name} should not include the colon field separator. If the header is missing, -\code{None} is returned; a \code{KeyError} is never raised. +\code{None} is returned; a \exception{KeyError} is never raised. Note that if the named field appears more than once in the message's headers, exactly which of those field values will be returned is @@ -243,10 +251,11 @@ Content-Disposition: attachment; filename="bud.gif" \begin{methoddesc}[Message]{get_type}{\optional{failobj}} Return the message's content type, as a string of the form -``maintype/subtype'' as taken from the \code{Content-Type:} header. +\mimetype{maintype/subtype} as taken from the +\mailheader{Content-Type} header. The returned string is coerced to lowercase. -If there is no \code{Content-Type:} header in the message, +If there is no \mailheader{Content-Type} header in the message, \var{failobj} is returned (defaults to \code{None}). \end{methoddesc} @@ -263,46 +272,46 @@ same semantics for \var{failobj}. \end{methoddesc} \begin{methoddesc}[Message]{get_params}{\optional{failobj\optional{, header}}} -Return the message's \code{Content-Type:} parameters, as a list. The +Return the message's \mailheader{Content-Type} parameters, as a list. The elements of the returned list are 2-tuples of key/value pairs, as -split on the \samp{=} sign. The left hand side of the \samp{=} is the -key, while the right hand side is the value. If there is no \samp{=} -sign in the parameter the value is the empty string. The value is -always unquoted with \method{Utils.unquote()}. +split on the \character{=} sign. The left hand side of the +\character{=} is the key, while the right hand side is the value. If +there is no \character{=} sign in the parameter the value is the empty +string. The value is always unquoted with \method{Utils.unquote()}. 
Optional \var{failobj} is the object to return if there is no -\code{Content-Type:} header. Optional \var{header} is the header to -search instead of \code{Content-Type:}. +\mailheader{Content-Type} header. Optional \var{header} is the header to +search instead of \mailheader{Content-Type}. \end{methoddesc} \begin{methoddesc}[Message]{get_param}{param\optional{, failobj\optional{, header}}} -Return the value of the \code{Content-Type:} header's parameter -\var{param} as a string. If the message has no \code{Content-Type:} +Return the value of the \mailheader{Content-Type} header's parameter +\var{param} as a string. If the message has no \mailheader{Content-Type} header or if there is no such parameter, then \var{failobj} is returned (defaults to \code{None}). Optional \var{header} if given, specifies the message header to use -instead of \code{Content-Type:}. +instead of \mailheader{Content-Type}. \end{methoddesc} \begin{methoddesc}[Message]{get_charsets}{\optional{failobj}} Return a list containing the character set names in the message. If -the message is a \code{multipart}, then the list will contain one +the message is a \mimetype{multipart}, then the list will contain one element for each subpart in the payload, otherwise, it will be a list of length 1. Each item in the list will be a string which is the value of the -\code{charset} parameter in the \code{Content-Type:} header for the +\code{charset} parameter in the \mailheader{Content-Type} header for the represented subpart. However, if the subpart has no -\code{Content-Type:} header, no \code{charset} parameter, or is not of -the \code{text} main MIME type, then that item in the returned list +\mailheader{Content-Type} header, no \code{charset} parameter, or is not of +the \mimetype{text} main MIME type, then that item in the returned list will be \var{failobj}. 
\end{methoddesc} \begin{methoddesc}[Message]{get_filename}{\optional{failobj}} Return the value of the \code{filename} parameter of the -\code{Content-Disposition:} header of the message, or \var{failobj} if +\mailheader{Content-Disposition} header of the message, or \var{failobj} if either the header is missing, or has no \code{filename} parameter. The returned string will always be unquoted as per \method{Utils.unquote()}. @@ -310,25 +319,25 @@ The returned string will always be unquoted as per \begin{methoddesc}[Message]{get_boundary}{\optional{failobj}} Return the value of the \code{boundary} parameter of the -\code{Content-Type:} header of the message, or \var{failobj} if either +\mailheader{Content-Type} header of the message, or \var{failobj} if either the header is missing, or has no \code{boundary} parameter. The returned string will always be unquoted as per \method{Utils.unquote()}. \end{methoddesc} \begin{methoddesc}[Message]{set_boundary}{boundary} -Set the \code{boundary} parameter of the \code{Content-Type:} header +Set the \code{boundary} parameter of the \mailheader{Content-Type} header to \var{boundary}. \method{set_boundary()} will always quote \var{boundary} so you should not quote it yourself. A -\code{HeaderParseError} is raised if the message object has no -\code{Content-Type:} header. +\exception{HeaderParseError} is raised if the message object has no +\mailheader{Content-Type} header. Note that using this method is subtly different than deleting the old -\code{Content-Type:} header and adding a new one with the new boundary +\mailheader{Content-Type} header and adding a new one with the new boundary via \method{add_header()}, because \method{set_boundary()} preserves the -order of the \code{Content-Type:} header in the list of headers. +order of the \mailheader{Content-Type} header in the list of headers. However, it does \emph{not} preserve any continuation lines which may -have been present in the original \code{Content-Type:} header. 
+have been present in the original \mailheader{Content-Type} header. \end{methoddesc} \begin{methoddesc}[Message]{walk}{} diff --git a/Doc/lib/emailparser.tex b/Doc/lib/emailparser.tex index c96c3b3..0b2cec0 100644 --- a/Doc/lib/emailparser.tex +++ b/Doc/lib/emailparser.tex @@ -1,24 +1,30 @@ -\section{\module{email.Parser} --- - Parsing flat text email messages} - \declaremodule{standard}{email.Parser} \modulesynopsis{Parse flat text email messages to produce a message object tree.} -\sectionauthor{Barry A. Warsaw}{barry@zope.com} - -\versionadded{2.2} - -The \module{Parser} module provides a single class, the \class{Parser} -class, which is used to take a message in flat text form and create -the associated object model. The resulting object tree can then be -manipulated using the \class{Message} class interface as described in -\refmodule{email.Message}, and turned over -to a generator (as described in \refmodule{emamil.Generator}) to -return the textual representation of the message. It is intended that -the \class{Parser} to \class{Generator} path be idempotent if the -object model isn't modified in between. -\subsection{Parser class API} +Message object trees can be created in one of two ways: they can be +created from whole cloth by instantiating \class{Message} objects and +stringing them together via \method{add_payload()} and +\method{set_payload()} calls, or they can be created by parsing a flat text +representation of the email message. + +The \module{email} package provides a standard parser that understands +most email document structures, including MIME documents. You can +pass the parser a string or a file object, and the parser will return +to you the root \class{Message} instance of the object tree. For +simple, non-MIME messages the payload of this root object will likely +be a string (e.g. containing the text of the message). 
For MIME +messages, the root object will return 1 from its +\method{is_multipart()} method, and the subparts can be accessed via +the \method{get_payload()} and \method{walk()} methods. + +Note that the parser can be extended in limited ways, and of course +you can implement your own parser completely from scratch. There is +no magical connection between the \module{email} package's bundled +parser and the \class{Message} class, so your custom parser can create +message object trees in any way it find necessary. + +\subsubsection{Parser class API} \begin{classdesc}{Parser}{\optional{_class}} The constructor for the \class{Parser} class takes a single optional @@ -75,22 +81,22 @@ prompt: >>> msg = email.message_from_string(myString) \end{verbatim} -\subsection{Additional notes} +\subsubsection{Additional notes} Here are some notes on the parsing semantics: \begin{itemize} -\item Most non-\code{multipart} type messages are parsed as a single +\item Most non-\mimetype{multipart} type messages are parsed as a single message object with a string payload. These objects will return 0 for \method{is_multipart()}. -\item One exception is for \code{message/delivery-status} type - messages. Because such the body of such messages consist of +\item One exception is for \mimetype{message/delivery-status} type + messages. Because the body of such messages consist of blocks of headers, \class{Parser} will create a non-multipart object containing non-multipart subobjects for each header block. -\item Another exception is for \code{message/*} types (i.e. more - general than \code{message/delivery-status}. These are - typically \code{message/rfc822} type messages, represented as a +\item Another exception is for \mimetype{message/*} types (i.e. more + general than \mimetype{message/delivery-status}). These are + typically \mimetype{message/rfc822} type messages, represented as a non-multipart object containing a singleton payload, another non-multipart \class{Message} instance. 
\end{itemize} diff --git a/Doc/lib/emailutil.tex b/Doc/lib/emailutil.tex index e028fcd..d4a7313 100644 --- a/Doc/lib/emailutil.tex +++ b/Doc/lib/emailutil.tex @@ -1,11 +1,5 @@ -\section{\module{email.Utils} --- - Miscellaneous email package utilities} - \declaremodule{standard}{email.Utils} \modulesynopsis{Miscellaneous email package utilities.} -\sectionauthor{Barry A. Warsaw}{barry@zope.com} - -\versionadded{2.2} There are several useful utilities provided with the \module{email} package. @@ -24,8 +18,8 @@ are stripped off. \begin{funcdesc}{parseaddr}{address} Parse address -- which should be the value of some address-containing -field such as \code{To:} or \code{Cc:} -- into its constituent -``realname'' and ``email address'' parts. Returns a tuple of that +field such as \mailheader{To} or \mailheader{Cc} -- into its constituent +\emph{realname} and \emph{email address} parts. Returns a tuple of that information, unless the parse fails, in which case a 2-tuple of \code{(None, None)} is returned. \end{funcdesc} @@ -33,7 +27,7 @@ information, unless the parse fails, in which case a 2-tuple of \begin{funcdesc}{dump_address_pair}{pair} The inverse of \method{parseaddr()}, this takes a 2-tuple of the form \code{(realname, email_address)} and returns the string value suitable -for a \code{To:} or \code{Cc:} header. If the first element of +for a \mailheader{To} or \mailheader{Cc} header. If the first element of \var{pair} is false, then the second element is returned unmodified. \end{funcdesc} @@ -68,9 +62,9 @@ in the \var{charset} character set (Python can't reliably guess what character set a string might be encoded in). The default \var{charset} is \samp{iso-8859-1}. -\var{encoding} must be either the letter \samp{q} for -Quoted-Printable or \samp{b} for Base64 encoding. If -neither, a \code{ValueError} is raised. Both the \var{charset} and +\var{encoding} must be either the letter \character{q} for +Quoted-Printable or \character{b} for Base64 encoding. 
If +neither, a \exception{ValueError} is raised. Both the \var{charset} and the \var{encoding} strings are case-insensitive, and coerced to lower case in the returned string. \end{funcdesc}
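The parse, inspect, and regenerate cycle that the reorganized documentation describes can be sketched in Python. This is a minimal illustration, not part of the commit. It assumes a current Python 3 standard library, where the modules are spelled `email.generator` and `email.message` rather than the `email.Generator` and `email.Message` names used in the diff. The sample message text and the `round_trip` helper are hypothetical.

```python
import email
from email.generator import Generator
from io import StringIO

# Hypothetical sample message: three headers, a blank line, then the body.
RAW = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Hello\n"
    "\n"
    "Just a short note.\n"
)


def round_trip(text):
    """Parse flat text into a Message object, then render it back to flat text."""
    msg = email.message_from_string(text)
    out = StringIO()
    # mangle_from_=False leaves body lines that start with "From " untouched.
    Generator(out, mangle_from_=False).flatten(msg)
    return msg, out.getvalue()


msg, flat = round_trip(RAW)

# Headers are matched case-insensitively; a missing header yields None
# instead of raising KeyError.
subject = msg["SUBJECT"]
missing = msg["X-Missing"]

# A simple, non-MIME message has a string payload and is not multipart.
body = msg.get_payload()
multipart = msg.is_multipart()
```

For a multipart message, `is_multipart()` would return a true value, and the subparts would be reached through `get_payload()` or `walk()`, as the parser section of the diff describes.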