Move reference material on PyArg_Parse*() out of the Extending & Embedding document to the C API reference.

Move some instructional text from the API reference to the Extending & Embedding manual. Fix the descriptions of the es and es# formats for PyArg_Parse*(). This closes SF bug #536516.
author: Fred Drake <fdrake@acm.org> 2002-04-05 23:01:14 (GMT)
committer: Fred Drake <fdrake@acm.org> 2002-04-05 23:01:14 (GMT)
commit: 68304ccce381d056b6346dac04404c559698027c (patch)
tree: 0cae3c449b574398e97ce5f8d17a250251ccd49b /Doc/api/utilities.tex
parent: 6b8ab74c8aecef19314375c440669b4364a236fe (diff)
1 files changed, 404 insertions, 9 deletions
diff --git a/Doc/api/utilities.tex b/Doc/api/utilities.tex
index a5ffe3a..96ff816 100644
--- a/Doc/api/utilities.tex
+++ b/Doc/api/utilities.tex
@@ -357,13 +357,291 @@ and methods.  Additional information and examples are available in
 \citetitle[../ext/ext.html]{Extending and Embedding the Python
 Interpreter}.
 
+The first three functions described here,
+\cfunction{PyArg_ParseTuple()},
+\cfunction{PyArg_ParseTupleAndKeywords()}, and
+\cfunction{PyArg_Parse()}, all use \emph{format strings} which are
+used to tell the function about the expected arguments.  The format
+strings use the same syntax for each of these functions.
+
+A format string consists of zero or more ``format units.''  A format
+unit describes one Python object; it is usually a single character or
+a parenthesized sequence of format units.  With a few exceptions, a
+format unit that is not a parenthesized sequence normally corresponds
+to a single address argument to these functions.  In the following
+description, the quoted form is the format unit; the entry in (round)
+parentheses is the Python object type that matches the format unit;
+and the entry in [square] brackets is the type of the C variable(s)
+whose address should be passed.
+
+\begin{description}
+  \item[\samp{s} (string or Unicode object) {[char *]}]
+  Convert a Python string or Unicode object to a C pointer to a
+  character string.  You must not provide storage for the string
+  itself; a pointer to an existing string is stored into the character
+  pointer variable whose address you pass.  The C string is
+  NUL-terminated.  The Python string must not contain embedded NUL
+  bytes; if it does, a \exception{TypeError} exception is raised.
+  Unicode objects are converted to C strings using the default
+  encoding.  If this conversion fails, a \exception{UnicodeError} is
+  raised.
+
+  \item[\samp{s\#} (string, Unicode or any read buffer compatible object)
+  {[char *, int]}]
+  This variant on \samp{s} stores into two C variables, the first one
+  a pointer to a character string, the second one its length.  In this
+  case the Python string may contain embedded NUL bytes.  Unicode
+  objects pass back a pointer to the default encoded string version of
+  the object if such a conversion is possible.  All other read-buffer
+  compatible objects pass back a reference to the raw internal data
+  representation.
+
+  \item[\samp{z} (string or \code{None}) {[char *]}]
+  Like \samp{s}, but the Python object may also be \code{None}, in
+  which case the C pointer is set to \NULL.
+
+  \item[\samp{z\#} (string or \code{None} or any read buffer
+  compatible object) {[char *, int]}]
+  This is to \samp{s\#} as \samp{z} is to \samp{s}.
+
+  \item[\samp{u} (Unicode object) {[Py_UNICODE *]}]
+  Convert a Python Unicode object to a C pointer to a NUL-terminated
+  buffer of 16-bit Unicode (UTF-16) data.  As with \samp{s}, there is
+  no need to provide storage for the Unicode data buffer; a pointer to
+  the existing Unicode data is stored into the \ctype{Py_UNICODE}
+  pointer variable whose address you pass.
+
+  \item[\samp{u\#} (Unicode object) {[Py_UNICODE *, int]}]
+  This variant on \samp{u} stores into two C variables, the first one
+  a pointer to a Unicode data buffer, the second one its length.
+  Non-Unicode objects are handled by interpreting their read-buffer
+  pointer as a pointer to a \ctype{Py_UNICODE} array.
+
+  \item[\samp{es} (string, Unicode object or character buffer
+  compatible object) {[const char *encoding, char **buffer]}]
+  This variant on \samp{s} is used for encoding Unicode and objects
+  convertible to Unicode into a character buffer. It only works for
+  encoded data without embedded NUL bytes.
+
+  This format requires two arguments.  The first is only used as
+  input, and must be a \ctype{char*} which points to the name of an
+  encoding as a NUL-terminated string, or \NULL, in which case the
+  default encoding is used.  An exception is raised if the named
+  encoding is not known to Python.  The second argument must be a
+  \ctype{char**}; the value of the pointer it references will be set
+  to a buffer with the contents of the argument text.  The text will
+  be encoded in the encoding specified by the first argument.
+
+  \cfunction{PyArg_ParseTuple()} will allocate a buffer of the needed
+  size, copy the encoded data into this buffer and adjust
+  \var{*buffer} to reference the newly allocated storage.  The caller
+  is responsible for calling \cfunction{PyMem_Free()} to free the
+  allocated buffer after use.
+
+  \item[\samp{et} (string, Unicode object or character buffer
+  compatible object) {[const char *encoding, char **buffer]}]
+  Same as \samp{es} except that 8-bit string objects are passed
+  through without recoding them.  Instead, the implementation assumes
+  that the string object uses the encoding passed in as parameter.
+
+  \item[\samp{es\#} (string, Unicode object or character buffer compatible
+  object) {[const char *encoding, char **buffer, int *buffer_length]}]
+  This variant on \samp{s\#} is used for encoding Unicode and objects
+  convertible to Unicode into a character buffer.  Unlike the
+  \samp{es} format, this variant allows input data which contains NUL
+  characters.
+
+  It requires three arguments.  The first is only used as input, and
+  must be a \ctype{char*} which points to the name of an encoding as a
+  NUL-terminated string, or \NULL, in which case the default encoding
+  is used.  An exception is raised if the named encoding is not known
+  to Python.  The second argument must be a \ctype{char**}; the value
+  of the pointer it references will be set to a buffer with the
+  contents of the argument text.  The text will be encoded in the
+  encoding specified by the first argument.  The third argument must
+  be a pointer to an integer; the referenced integer will be set to
+  the number of bytes in the output buffer.
+
+  There are two modes of operation:
+
+  If \var{*buffer} points to a \NULL{} pointer, the function will
+  allocate a buffer of the needed size, copy the encoded data into
+  this buffer and set \var{*buffer} to reference the newly allocated
+  storage.  The caller is responsible for calling
+  \cfunction{PyMem_Free()} to free the allocated buffer after usage.
+
+  If \var{*buffer} points to a non-\NULL{} pointer (an already
+  allocated buffer), \cfunction{PyArg_ParseTuple()} will use this
+  location as the buffer and interpret the initial value of
+  \var{*buffer_length} as the buffer size.  It will then copy the
+  encoded data into the buffer and NUL-terminate it.  If the buffer
+  is not large enough, a \exception{ValueError} will be set.
+
+  In both cases, \var{*buffer_length} is set to the length of the
+  encoded data without the trailing NUL byte.
+
+  \item[\samp{et\#} (string, Unicode object or character buffer compatible
+  object) {[const char *encoding, char **buffer, int *buffer_length]}]
+  Same as \samp{es\#} except that string objects are passed through
+  without recoding them. Instead, the implementation assumes that the
+  string object uses the encoding passed in as parameter.
+
+  \item[\samp{b} (integer) {[char]}]
+  Convert a Python integer to a tiny int, stored in a C \ctype{char}.
+
+  \item[\samp{h} (integer) {[short int]}]
+  Convert a Python integer to a C \ctype{short int}.
+
+  \item[\samp{i} (integer) {[int]}]
+  Convert a Python integer to a plain C \ctype{int}.
+
+  \item[\samp{l} (integer) {[long int]}]
+  Convert a Python integer to a C \ctype{long int}.
+
+  \item[\samp{L} (integer) {[LONG_LONG]}]
+  Convert a Python integer to a C \ctype{long long}.  This format is
+  only available on platforms that support \ctype{long long} (or
+  \ctype{__int64} on Windows).
+
+  \item[\samp{c} (string of length 1) {[char]}]
+  Convert a Python character, represented as a string of length 1, to
+  a C \ctype{char}.
+
+  \item[\samp{f} (float) {[float]}]
+  Convert a Python floating point number to a C \ctype{float}.
+
+  \item[\samp{d} (float) {[double]}]
+  Convert a Python floating point number to a C \ctype{double}.
+
+  \item[\samp{D} (complex) {[Py_complex]}]
+  Convert a Python complex number to a C \ctype{Py_complex} structure.
+
+  \item[\samp{O} (object) {[PyObject *]}]
+  Store a Python object (without any conversion) in a C object
+  pointer.  The C program thus receives the actual object that was
+  passed.  The object's reference count is not increased.  The pointer
+  stored is not \NULL.
+
+  \item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}]
+  Store a Python object in a C object pointer.  This is similar to
+  \samp{O}, but takes two C arguments: the first is the address of a
+  Python type object, the second is the address of the C variable (of
+  type \ctype{PyObject*}) into which the object pointer is stored.  If
+  the Python object does not have the required type,
+  \exception{TypeError} is raised.
+
+  \item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
+  Convert a Python object to a C variable through a \var{converter}
+  function.  This takes two arguments: the first is a function, the
+  second is the address of a C variable (of arbitrary type), converted
+  to \ctype{void *}.  The \var{converter} function in turn is called
+  as follows:
+
+  \var{status}\code{ = }\var{converter}\code{(}\var{object},
+  \var{address}\code{);}
+
+  where \var{object} is the Python object to be converted and
+  \var{address} is the \ctype{void*} argument that was passed to the
+  \cfunction{PyArg_Parse*()} function.  The returned \var{status}
+  should be \code{1} for a successful conversion and \code{0} if the
+  conversion has failed.  When the conversion fails, the
+  \var{converter} function should raise an exception.
+
+  \item[\samp{S} (string) {[PyStringObject *]}]
+  Like \samp{O} but requires that the Python object is a string
+  object.  Raises \exception{TypeError} if the object is not a string
+  object.  The C variable may also be declared as \ctype{PyObject*}.
+
+  \item[\samp{U} (Unicode string) {[PyUnicodeObject *]}]
+  Like \samp{O} but requires that the Python object is a Unicode
+  object.  Raises \exception{TypeError} if the object is not a Unicode
+  object.  The C variable may also be declared as \ctype{PyObject*}.
+
+  \item[\samp{t\#} (read-only character buffer) {[char *, int]}]
+  Like \samp{s\#}, but accepts any object which implements the
+  read-only buffer interface.  The \ctype{char*} variable is set to
+  point to the first byte of the buffer, and the \ctype{int} is set to
+  the length of the buffer.  Only single-segment buffer objects are
+  accepted; \exception{TypeError} is raised for all others.
+
+  \item[\samp{w} (read-write character buffer) {[char *]}]
+  Similar to \samp{s}, but accepts any object which implements the
+  read-write buffer interface.  The caller must determine the length
+  of the buffer by other means, or use \samp{w\#} instead.  Only
+  single-segment buffer objects are accepted; \exception{TypeError} is
+  raised for all others.
+
+  \item[\samp{w\#} (read-write character buffer) {[char *, int]}]
+  Like \samp{s\#}, but accepts any object which implements the
+  read-write buffer interface.  The \ctype{char *} variable is set to
+  point to the first byte of the buffer, and the \ctype{int} is set to
+  the length of the buffer.  Only single-segment buffer objects are
+  accepted; \exception{TypeError} is raised for all others.
+
+  \item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
+  The object must be a Python sequence whose length is the number of
+  format units in \var{items}.  The C arguments must correspond to the
+  individual format units in \var{items}.  Format units for sequences
+  may be nested.
+
+  \note{Prior to Python version 1.5.2, this format specifier only
+  accepted a tuple containing the individual parameters, not an
+  arbitrary sequence.  Code which previously caused
+  \exception{TypeError} to be raised here may now proceed without an
+  exception.  This is not expected to be a problem for existing code.}
+\end{description}
+
+It is possible to pass Python long integers where integers are
+requested; however no proper range checking is done --- the most
+significant bits are silently truncated when the receiving field is
+too small to receive the value (actually, the semantics are inherited
+from downcasts in C --- your mileage may vary).
+
+A few other characters have a meaning in a format string.  These may
+not occur inside nested parentheses.  They are:
+
+\begin{description}
+  \item[\samp{|}]
+  Indicates that the remaining arguments in the Python argument list
+  are optional.  The C variables corresponding to optional arguments
+  should be initialized to their default value --- when an optional
+  argument is not specified, \cfunction{PyArg_ParseTuple()} does not
+  touch the contents of the corresponding C variable(s).
+
+  \item[\samp{:}]
+  The list of format units ends here; the string after the colon is
+  used as the function name in error messages (the ``associated
+  value'' of the exception that \cfunction{PyArg_ParseTuple()}
+  raises).
+
+  \item[\samp{;}]
+  The list of format units ends here; the string after the semicolon
+  is used as the error message \emph{instead} of the default error
+  message.  Clearly, \samp{:} and \samp{;} mutually exclude each
+  other.
+\end{description}
+
+Note that any Python object references which are provided to the
+caller are \emph{borrowed} references; do not decrement their
+reference count!
+
+Additional arguments passed to these functions must be addresses of
+variables whose type is determined by the format string; these are
+used to store values from the input tuple.  There are a few cases, as
+described in the list of format units above, where these parameters
+are used as input values; they should match what is specified for the
+corresponding format unit in that case.
+
+For the conversion to succeed, the \var{arg} object must match the
+format and the format must be exhausted.  On success, the
+\cfunction{PyArg_Parse*()} functions return true, otherwise they
+return false and raise an appropriate exception.
+
 \begin{cfuncdesc}{int}{PyArg_ParseTuple}{PyObject *args, char *format,
                                          \moreargs}
   Parse the parameters of a function that takes only positional
   parameters into local variables.  Returns true on success; on
-  failure, it returns false and raises the appropriate exception.  See
-  \citetitle[../ext/parseTuple.html]{Extending and Embedding the
-  Python Interpreter} for more information.
+  failure, it returns false and raises the appropriate exception.
 \end{cfuncdesc}
 
 \begin{cfuncdesc}{int}{PyArg_ParseTupleAndKeywords}{PyObject *args,
@@ -372,8 +650,6 @@ Interpreter}.
   Parse the parameters of a function that takes both positional and
   keyword parameters into local variables.  Returns true on success;
   on failure, it returns false and raises the appropriate exception.
-  See \citetitle[../ext/parseTupleAndKeywords.html]{Extending and
-  Embedding the Python Interpreter} for more information.
 \end{cfuncdesc}
 
 \begin{cfuncdesc}{int}{PyArg_Parse}{PyObject *args, char *format,
@@ -440,8 +716,127 @@ PyArg_ParseTuple(args, "O|O:ref", &object, &callback)
   Create a new value based on a format string similar to those
   accepted by the \cfunction{PyArg_Parse*()} family of functions and a
   sequence of values.  Returns the value or \NULL{} in the case of an
-  error; an exception will be raised if \NULL{} is returned.  For more
-  information on the format string and additional parameters, see
-  \citetitle[../ext/buildValue.html]{Extending and Embedding the
-  Python Interpreter}.
+  error; an exception will be raised if \NULL{} is returned.
+
+  \cfunction{Py_BuildValue()} does not always build a tuple.  It
+  builds a tuple only if its format string contains two or more format
+  units.  If the format string is empty, it returns \code{None}; if it
+  contains exactly one format unit, it returns whatever object is
+  described by that format unit.  To force it to return a tuple of
+  size 0 or 1, parenthesize the format string.
+
+  When memory buffers are passed as parameters to supply data to build
+  objects, as for the \samp{s} and \samp{s\#} formats, the required
+  data is copied.  Buffers provided by the caller are never referenced
+  by the objects created by \cfunction{Py_BuildValue()}.  In other
+  words, if your code invokes \cfunction{malloc()} and passes the
+  allocated memory to \cfunction{Py_BuildValue()}, your code is
+  responsible for calling \cfunction{free()} for that memory once
+  \cfunction{Py_BuildValue()} returns.
+
+  In the following description, the quoted form is the format unit;
+  the entry in (round) parentheses is the Python object type that the
+  format unit will return; and the entry in [square] brackets is the
+  type of the C value(s) to be passed.
+
+  The characters space, tab, colon and comma are ignored in format
+  strings (but not within format units such as \samp{s\#}).  This can
+  be used to make long format strings a tad more readable.
+
+  \begin{description}
+    \item[\samp{s} (string) {[char *]}]
+    Convert a null-terminated C string to a Python object.  If the C
+    string pointer is \NULL, \code{None} is used.
+
+    \item[\samp{s\#} (string) {[char *, int]}]
+    Convert a C string and its length to a Python object.  If the C
+    string pointer is \NULL, the length is ignored and \code{None} is
+    returned.
+
+    \item[\samp{z} (string or \code{None}) {[char *]}]
+    Same as \samp{s}.
+
+    \item[\samp{z\#} (string or \code{None}) {[char *, int]}]
+    Same as \samp{s\#}.
+
+    \item[\samp{u} (Unicode string) {[Py_UNICODE *]}]
+    Convert a null-terminated buffer of Unicode (UCS-2) data to a
+    Python Unicode object.  If the Unicode buffer pointer is \NULL,
+    \code{None} is returned.
+
+    \item[\samp{u\#} (Unicode string) {[Py_UNICODE *, int]}]
+    Convert a Unicode (UCS-2) data buffer and its length to a Python
+    Unicode object.   If the Unicode buffer pointer is \NULL, the
+    length is ignored and \code{None} is returned.
+
+    \item[\samp{i} (integer) {[int]}]
+    Convert a plain C \ctype{int} to a Python integer object.
+
+    \item[\samp{b} (integer) {[char]}]
+    Same as \samp{i}.
+
+    \item[\samp{h} (integer) {[short int]}]
+    Same as \samp{i}.
+
+    \item[\samp{l} (integer) {[long int]}]
+    Convert a C \ctype{long int} to a Python integer object.
+
+    \item[\samp{c} (string of length 1) {[char]}]
+    Convert a C \ctype{int} representing a character to a Python
+    string of length 1.
+
+    \item[\samp{d} (float) {[double]}]
+    Convert a C \ctype{double} to a Python floating point number.
+
+    \item[\samp{f} (float) {[float]}]
+    Same as \samp{d}.
+
+    \item[\samp{D} (complex) {[Py_complex *]}]
+    Convert a C \ctype{Py_complex} structure to a Python complex
+    number.
+
+    \item[\samp{O} (object) {[PyObject *]}]
+    Pass a Python object untouched (except for its reference count,
+    which is incremented by one).  If the object passed in is a
+    \NULL{} pointer, it is assumed that this was caused because the
+    call producing the argument found an error and set an exception.
+    Therefore, \cfunction{Py_BuildValue()} will return \NULL{} but
+    won't raise an exception.  If no exception has been raised yet,
+    \exception{SystemError} is set.
+
+    \item[\samp{S} (object) {[PyObject *]}]
+    Same as \samp{O}.
+
+    \item[\samp{U} (object) {[PyObject *]}]
+    Same as \samp{O}.
+
+    \item[\samp{N} (object) {[PyObject *]}]
+    Same as \samp{O}, except it doesn't increment the reference count
+    on the object.  Useful when the object is created by a call to an
+    object constructor in the argument list.
+
+    \item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}]
+    Convert \var{anything} to a Python object through a
+    \var{converter} function.  The function is called with
+    \var{anything} (which should be compatible with \ctype{void *}) as
+    its argument and should return a ``new'' Python object, or \NULL{}
+    if an error occurred.
+
+    \item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}]
+    Convert a sequence of C values to a Python tuple with the same
+    number of items.
+
+    \item[\samp{[\var{items}]} (list) {[\var{matching-items}]}]
+    Convert a sequence of C values to a Python list with the same
+    number of items.
+
+    \item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}]
+    Convert a sequence of C values to a Python dictionary.  Each pair
+    of consecutive C values adds one item to the dictionary, serving
+    as key and value, respectively.
+
+  \end{description}
+
+  If there is an error in the format string, the
+  \exception{SystemError} exception is set and \NULL{} returned.
 \end{cfuncdesc}
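
The rule in the patch for when Py_BuildValue() returns a tuple can be checked from Python itself. The following is a minimal sketch, not part of the commit: it assumes a CPython interpreter where ctypes.pythonapi exposes Py_BuildValue, and the helper name build_value is hypothetical. It passes only C int arguments and uses only the integer and container format units, because the string units behave differently on current Python versions than in the 2002 text.

```python
import ctypes

# Assumption: running on CPython, where ctypes.pythonapi wraps the
# interpreter's own C API and keeps the GIL held during the call.
_build = ctypes.pythonapi.Py_BuildValue
_build.restype = ctypes.py_object    # the result is a new reference
_build.argtypes = [ctypes.c_char_p]  # only the fixed format argument is declared


def build_value(fmt, *ints):
    """Hypothetical helper: call Py_BuildValue with C int arguments only."""
    c_args = [ctypes.c_int(value) for value in ints]
    return _build(fmt.encode("ascii"), *c_args)


# Empty format string: None, not an empty tuple.
print(build_value(""))
# Exactly one format unit: the object itself, not a 1-tuple.
print(build_value("i", 5))
# Two or more format units: a tuple.
print(build_value("ii", 1, 2))
# Parentheses force a tuple of size 0 or 1.
print(build_value("()"), build_value("(i)", 5))
# List and dictionary units; colon and comma are ignored as separators.
print(build_value("[ii]", 1, 2), build_value("{i:i,i:i}", 1, 2, 3, 4))
```

Declaring only the format argument in argtypes leaves the remaining arguments variadic, which matches the C prototype. Each extra argument is wrapped in ctypes.c_int so that it matches the "i" unit exactly.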