diff options
author | Barry Warsaw <barry@python.org> | 2004-08-25 02:22:30 (GMT) |
---|---|---|
committer | Barry Warsaw <barry@python.org> | 2004-08-25 02:22:30 (GMT) |
commit | 8bee76106e8da9fd6011432d2f60861a94c623db (patch) | |
tree | 813980417d605a18837bead1951ff22a8e2b10b6 /Doc | |
parent | c8854434790d3a281f0ea5a4714e5c01414413b0 (diff) | |
download | cpython-8bee76106e8da9fd6011432d2f60861a94c623db.zip cpython-8bee76106e8da9fd6011432d2f60861a94c623db.tar.gz cpython-8bee76106e8da9fd6011432d2f60861a94c623db.tar.bz2 |
PEP 292 classes Template and SafeTemplate are added to the string module.
This patch includes test cases and documentation updates, as well as NEWS file
updates.
This patch also updates the sre modules so that they don't import the string
module, breaking direct circular imports.
Diffstat (limited to 'Doc')
-rw-r--r-- | Doc/lib/libstring.tex | 150 |
1 files changed, 122 insertions, 28 deletions
diff --git a/Doc/lib/libstring.tex b/Doc/lib/libstring.tex index 48d7fc4..2824aeb 100644 --- a/Doc/lib/libstring.tex +++ b/Doc/lib/libstring.tex @@ -4,11 +4,23 @@ \declaremodule{standard}{string} \modulesynopsis{Common string operations.} +The \module{string} package contains a number of useful constants and classes, +as well as some deprecated legacy functions that are also available as methods +on strings. See the module \refmodule{re}\refstmodindex{re} for string +functions based on regular expressions. -This module defines some constants useful for checking character -classes and some useful string functions. See the module -\refmodule{re}\refstmodindex{re} for string functions based on regular -expressions. +In general, all of these objects are exposed directly in the \module{string} +package so users need only import the \module{string} package to begin using +these constants, classes, and functions. + +\begin{notice} +Starting with Python 2.4, the traditional \module{string} module was turned +into a package, however backward compatibility with existing code has been +retained. Code using the \module{string} module that worked prior to Python +2.4 should continue to work unchanged. +\end{notice} + +\subsection{String constants} The constants defined in this module are: @@ -86,11 +98,113 @@ The constants defined in this module are: is undefined. \end{datadesc} +\subsection{Template strings} + +Templates are Unicode strings that can be used to provide string substitutions +as described in \pep{292}. There is a \class{Template} class that is a +subclass of \class{unicode}, overriding the default \method{__mod__()} method. +Instead of the normal \samp{\%}-based substitutions, Template strings support +\samp{\$}-based substitutions, using the following rules: + +\begin{itemize} +\item \samp{\$\$} is an escape; it is replaced with a single \samp{\$}. + +\item \samp{\$identifier} names a substitution placeholder matching a mapping + key of "identifier". By default, "identifier" must spell a Python + identifier. The first non-identifier character after the \samp{\$} + character terminates this placeholder specification. + +\item \samp{\$\{identifier\}} is equivalent to \samp{\$identifier}. It is + required when valid identifier characters follow the placeholder but are + not part of the placeholder, e.g. "\$\{noun\}ification". +\end{itemize} + +Any other appearance of \samp{\$} in the string will result in a +\exception{ValueError} being raised. + +Template strings are used just like normal strings, in that the modulus +operator is used to interpolate a dictionary of values into a Template string, +e.g.: + +\begin{verbatim} +>>> from string import Template +>>> s = Template('$who likes $what') +>>> print s % dict(who='tim', what='kung pao') +tim likes kung pao +>>> Template('Give $who $100') % dict(who='tim') +Traceback (most recent call last): +[...] +ValueError: Invalid placeholder at index 10 +\end{verbatim} + +There is also a \class{SafeTemplate} class, derived from \class{Template} +which acts the same as \class{Template}, except that if placeholders are +missing in the interpolation dictionary, no \exception{KeyError} will be +raised. Instead the original placeholder (with or without the braces, as +appropriate) will be used: + +\begin{verbatim} +>>> from string import SafeTemplate +>>> s = SafeTemplate('$who likes $what for ${meal}') +>>> print s % dict(who='tim') +tim likes $what for ${meal} +\end{verbatim} + +The values in the mapping will automatically be converted to Unicode strings, +using the built-in \function{unicode()} function, which will be called without +optional arguments \var{encoding} or \var{errors}. + +Advanced usage: you can derive subclasses of \class{Template} or +\class{SafeTemplate} to use application-specific placeholder rules. To do +this, you override the class attribute \member{pattern}; the value must be a +compiled regular expression object with four named capturing groups. The +capturing groups correspond to the rules given above, along with the invalid +placeholder rule: + +\begin{itemize} +\item \var{escaped} -- This group matches the escape sequence, i.e. \samp{\$\$} + in the default pattern. +\item \var{named} -- This group matches the unbraced placeholder name; it + should not include the \samp{\$} in capturing group. +\item \var{braced} -- This group matches the brace delimited placeholder name; + it should not include either the \samp{\$} or braces in the capturing + group. +\item \var{bogus} -- This group matches any other \samp{\$}. It usually just + matches a single \samp{\$} and should appear last. +\end{itemize} + +\subsection{String functions} + +The following functions are available to operate on string and Unicode +objects. They are not available as string methods. + +\begin{funcdesc}{capwords}{s} + Split the argument into words using \function{split()}, capitalize + each word using \function{capitalize()}, and join the capitalized + words using \function{join()}. Note that this replaces runs of + whitespace characters by a single space, and removes leading and + trailing whitespace. +\end{funcdesc} + +\begin{funcdesc}{maketrans}{from, to} + Return a translation table suitable for passing to + \function{translate()} or \function{regex.compile()}, that will map + each character in \var{from} into the character at the same position + in \var{to}; \var{from} and \var{to} must have the same length. + + \warning{Don't use strings derived from \constant{lowercase} + and \constant{uppercase} as arguments; in some locales, these don't have + the same length. For case conversions, always use + \function{lower()} and \function{upper()}.} +\end{funcdesc} -Many of the functions provided by this module are also defined as -methods of string and Unicode objects; see ``String Methods'' (section -\ref{string-methods}) for more information on those. -The functions defined in this module are: +\subsection{Deprecated string functions} + +The following list of functions are also defined as methods of string and +Unicode objects; see ``String Methods'' (section +\ref{string-methods}) for more information on those. You should consider +these functions as deprecated, although they will not be removed until Python +3.0. The functions defined in this module are: \begin{funcdesc}{atof}{s} \deprecated{2.0}{Use the \function{float()} built-in function.} @@ -138,14 +252,6 @@ The functions defined in this module are: Return a copy of \var{word} with only its first character capitalized. \end{funcdesc} -\begin{funcdesc}{capwords}{s} - Split the argument into words using \function{split()}, capitalize - each word using \function{capitalize()}, and join the capitalized - words using \function{join()}. Note that this replaces runs of - whitespace characters by a single space, and removes leading and - trailing whitespace. -\end{funcdesc} - \begin{funcdesc}{expandtabs}{s\optional{, tabsize}} Expand tabs in a string, i.e.\ replace them by one or more spaces, depending on the current column and the given tab size. The column @@ -188,18 +294,6 @@ The functions defined in this module are: lower case. \end{funcdesc} -\begin{funcdesc}{maketrans}{from, to} - Return a translation table suitable for passing to - \function{translate()} or \function{regex.compile()}, that will map - each character in \var{from} into the character at the same position - in \var{to}; \var{from} and \var{to} must have the same length. - - \warning{Don't use strings derived from \constant{lowercase} - and \constant{uppercase} as arguments; in some locales, these don't have - the same length. For case conversions, always use - \function{lower()} and \function{upper()}.} -\end{funcdesc} - \begin{funcdesc}{split}{s\optional{, sep\optional{, maxsplit}}} Return a list of the words of the string \var{s}. If the optional second argument \var{sep} is absent or \code{None}, the words are |