author | Fred Drake <fdrake@acm.org> | 2002-04-05 23:01:14 (GMT) |
---|---|---|
committer | Fred Drake <fdrake@acm.org> | 2002-04-05 23:01:14 (GMT) |
commit | 68304ccce381d056b6346dac04404c559698027c (patch) | |
tree | 0cae3c449b574398e97ce5f8d17a250251ccd49b /Doc/api/utilities.tex | |
parent | 6b8ab74c8aecef19314375c440669b4364a236fe (diff) | |
Move reference material on PyArg_Parse*() out of the Extending & Embedding
document to the C API reference. Move some instructional text from the API
reference to the Extending & Embedding manual.
Fix the descriptions of the es and es# formats for PyArg_Parse*().
This closes SF bug #536516.
Diffstat (limited to 'Doc/api/utilities.tex')
-rw-r--r-- | Doc/api/utilities.tex | 413 |
1 files changed, 404 insertions, 9 deletions
diff --git a/Doc/api/utilities.tex b/Doc/api/utilities.tex index a5ffe3a..96ff816 100644 --- a/Doc/api/utilities.tex +++ b/Doc/api/utilities.tex @@ -357,13 +357,291 @@ and methods. Additional information and examples are available in \citetitle[../ext/ext.html]{Extending and Embedding the Python Interpreter}. +The first three of these functions described, +\cfunction{PyArg_ParseTuple()}, +\cfunction{PyArg_ParseTupleAndKeywords()}, and +\cfunction{PyArg_Parse()}, all use \emph{format strings} which are +used to tell the function about the expected arguments. The format +strings use the same syntax for each of these functions. + +A format string consists of zero or more ``format units.'' A format +unit describes one Python object; it is usually a single character or +a parenthesized sequence of format units. With a few exceptions, a +format unit that is not a parenthesized sequence normally corresponds +to a single address argument to these functions. In the following +description, the quoted form is the format unit; the entry in (round) +parentheses is the Python object type that matches the format unit; +and the entry in [square] brackets is the type of the C variable(s) +whose address should be passed. + +\begin{description} + \item[\samp{s} (string or Unicode object) {[char *]}] + Convert a Python string or Unicode object to a C pointer to a + character string. You must not provide storage for the string + itself; a pointer to an existing string is stored into the character + pointer variable whose address you pass. The C string is + NUL-terminated. The Python string must not contain embedded NUL + bytes; if it does, a \exception{TypeError} exception is raised. + Unicode objects are converted to C strings using the default + encoding. If this conversion fails, a \exception{UnicodeError} is + raised. 
+ + \item[\samp{s\#} (string, Unicode or any read buffer compatible object) + {[char *, int]}] + This variant on \samp{s} stores into two C variables, the first one + a pointer to a character string, the second one its length. In this + case the Python string may contain embedded NUL bytes. Unicode + objects pass back a pointer to the default encoded string version of + the object if such a conversion is possible. All other read-buffer + compatible objects pass back a reference to the raw internal data + representation. + + \item[\samp{z} (string or \code{None}) {[char *]}] + Like \samp{s}, but the Python object may also be \code{None}, in + which case the C pointer is set to \NULL. + + \item[\samp{z\#} (string or \code{None} or any read buffer + compatible object) {[char *, int]}] + This is to \samp{s\#} as \samp{z} is to \samp{s}. + + \item[\samp{u} (Unicode object) {[Py_UNICODE *]}] + Convert a Python Unicode object to a C pointer to a NUL-terminated + buffer of 16-bit Unicode (UTF-16) data. As with \samp{s}, there is + no need to provide storage for the Unicode data buffer; a pointer to + the existing Unicode data is stored into the \ctype{Py_UNICODE} + pointer variable whose address you pass. + + \item[\samp{u\#} (Unicode object) {[Py_UNICODE *, int]}] + This variant on \samp{u} stores into two C variables, the first one + a pointer to a Unicode data buffer, the second one its length. + Non-Unicode objects are handled by interpreting their read-buffer + pointer as a pointer to a \ctype{Py_UNICODE} array. + + \item[\samp{es} (string, Unicode object or character buffer + compatible object) {[const char *encoding, char **buffer]}] + This variant on \samp{s} is used for encoding Unicode and objects + convertible to Unicode into a character buffer. It only works for + encoded data without embedded NUL bytes. + + This format requires two arguments. 
The first is only used as + input, and must be a \ctype{char*} which points to the name of an + encoding as a NUL-terminated string, or \NULL, in which case the + default encoding is used. An exception is raised if the named + encoding is not known to Python. The second argument must be a + \ctype{char**}; the value of the pointer it references will be set + to a buffer with the contents of the argument text. The text will + be encoded in the encoding specified by the first argument. + + \cfunction{PyArg_ParseTuple()} will allocate a buffer of the needed + size, copy the encoded data into this buffer and adjust + \var{*buffer} to reference the newly allocated storage. The caller + is responsible for calling \cfunction{PyMem_Free()} to free the + allocated buffer after use. + + \item[\samp{et} (string, Unicode object or character buffer + compatible object) {[const char *encoding, char **buffer]}] + Same as \samp{es} except that 8-bit string objects are passed + through without recoding them. Instead, the implementation assumes + that the string object uses the encoding passed in as parameter. + + \item[\samp{es\#} (string, Unicode object or character buffer compatible + object) {[const char *encoding, char **buffer, int *buffer_length]}] + This variant on \samp{s\#} is used for encoding Unicode and objects + convertible to Unicode into a character buffer. Unlike the + \samp{es} format, this variant allows input data which contains NUL + characters. + + It requires three arguments. The first is only used as input, and + must be a \ctype{char*} which points to the name of an encoding as a + NUL-terminated string, or \NULL, in which case the default encoding + is used. An exception is raised if the named encoding is not known + to Python. The second argument must be a \ctype{char**}; the value + of the pointer it references will be set to a buffer with the + contents of the argument text. The text will be encoded in the + encoding specified by the first argument. 
The third argument must + be a pointer to an integer; the referenced integer will be set to + the number of bytes in the output buffer. + + There are two modes of operation: + + If \var{*buffer} points to a \NULL{} pointer, the function will + allocate a buffer of the needed size, copy the encoded data into + this buffer and set \var{*buffer} to reference the newly allocated + storage. The caller is responsible for calling + \cfunction{PyMem_Free()} to free the allocated buffer after usage. + + If \var{*buffer} points to a non-\NULL{} pointer (an already + allocated buffer), \cfunction{PyArg_ParseTuple()} will use this + location as the buffer and interpret the initial value of + \var{*buffer_length} as the buffer size. It will then copy the + encoded data into the buffer and NUL-terminate it. If the buffer + is not large enough, a \exception{ValueError} will be set. + + In both cases, \var{*buffer_length} is set to the length of the + encoded data without the trailing NUL byte. + + \item[\samp{et\#} (string, Unicode object or character buffer compatible + object) {[const char *encoding, char **buffer, int *buffer_length]}] + Same as \samp{es\#} except that string objects are passed through + without recoding them. Instead, the implementation assumes that the + string object uses the encoding passed in as a parameter. + + \item[\samp{b} (integer) {[char]}] + Convert a Python integer to a tiny int, stored in a C \ctype{char}. + + \item[\samp{h} (integer) {[short int]}] + Convert a Python integer to a C \ctype{short int}. + + \item[\samp{i} (integer) {[int]}] + Convert a Python integer to a plain C \ctype{int}. + + \item[\samp{l} (integer) {[long int]}] + Convert a Python integer to a C \ctype{long int}. + + \item[\samp{L} (integer) {[LONG_LONG]}] + Convert a Python integer to a C \ctype{long long}. This format is + only available on platforms that support \ctype{long long} (or + \ctype{_int64} on Windows). 
+ + \item[\samp{c} (string of length 1) {[char]}] + Convert a Python character, represented as a string of length 1, to + a C \ctype{char}. + + \item[\samp{f} (float) {[float]}] + Convert a Python floating point number to a C \ctype{float}. + + \item[\samp{d} (float) {[double]}] + Convert a Python floating point number to a C \ctype{double}. + + \item[\samp{D} (complex) {[Py_complex]}] + Convert a Python complex number to a C \ctype{Py_complex} structure. + + \item[\samp{O} (object) {[PyObject *]}] + Store a Python object (without any conversion) in a C object + pointer. The C program thus receives the actual object that was + passed. The object's reference count is not increased. The pointer + stored is not \NULL. + + \item[\samp{O!} (object) {[\var{typeobject}, PyObject *]}] + Store a Python object in a C object pointer. This is similar to + \samp{O}, but takes two C arguments: the first is the address of a + Python type object, the second is the address of the C variable (of + type \ctype{PyObject*}) into which the object pointer is stored. If + the Python object does not have the required type, + \exception{TypeError} is raised. + + \item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}] + Convert a Python object to a C variable through a \var{converter} + function. This takes two arguments: the first is a function, the + second is the address of a C variable (of arbitrary type), converted + to \ctype{void *}. The \var{converter} function in turn is called + as follows: + + \var{status}\code{ = }\var{converter}\code{(}\var{object}, + \var{address}\code{);} + + where \var{object} is the Python object to be converted and + \var{address} is the \ctype{void*} argument that was passed to the + \cfunction{PyArg_Parse*()} function. The returned \var{status} + should be \code{1} for a successful conversion and \code{0} if the + conversion has failed. When the conversion fails, the + \var{converter} function should raise an exception. 
+ + \item[\samp{S} (string) {[PyStringObject *]}] + Like \samp{O} but requires that the Python object is a string + object. Raises \exception{TypeError} if the object is not a string + object. The C variable may also be declared as \ctype{PyObject*}. + + \item[\samp{U} (Unicode string) {[PyUnicodeObject *]}] + Like \samp{O} but requires that the Python object is a Unicode + object. Raises \exception{TypeError} if the object is not a Unicode + object. The C variable may also be declared as \ctype{PyObject*}. + + \item[\samp{t\#} (read-only character buffer) {[char *, int]}] + Like \samp{s\#}, but accepts any object which implements the + read-only buffer interface. The \ctype{char*} variable is set to + point to the first byte of the buffer, and the \ctype{int} is set to + the length of the buffer. Only single-segment buffer objects are + accepted; \exception{TypeError} is raised for all others. + + \item[\samp{w} (read-write character buffer) {[char *]}] + Similar to \samp{s}, but accepts any object which implements the + read-write buffer interface. The caller must determine the length + of the buffer by other means, or use \samp{w\#} instead. Only + single-segment buffer objects are accepted; \exception{TypeError} is + raised for all others. + + \item[\samp{w\#} (read-write character buffer) {[char *, int]}] + Like \samp{s\#}, but accepts any object which implements the + read-write buffer interface. The \ctype{char *} variable is set to + point to the first byte of the buffer, and the \ctype{int} is set to + the length of the buffer. Only single-segment buffer objects are + accepted; \exception{TypeError} is raised for all others. + + \item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}] + The object must be a Python sequence whose length is the number of + format units in \var{items}. The C arguments must correspond to the + individual format units in \var{items}. Format units for sequences + may be nested. 
+ + \note{Prior to Python version 1.5.2, this format specifier only + accepted a tuple containing the individual parameters, not an + arbitrary sequence. Code which previously caused + \exception{TypeError} to be raised here may now proceed without an + exception. This is not expected to be a problem for existing code.} +\end{description} + +It is possible to pass Python long integers where integers are +requested; however no proper range checking is done --- the most +significant bits are silently truncated when the receiving field is +too small to receive the value (actually, the semantics are inherited +from downcasts in C --- your mileage may vary). + +A few other characters have a meaning in a format string. These may +not occur inside nested parentheses. They are: + +\begin{description} + \item[\samp{|}] + Indicates that the remaining arguments in the Python argument list + are optional. The C variables corresponding to optional arguments + should be initialized to their default value --- when an optional + argument is not specified, \cfunction{PyArg_ParseTuple()} does not + touch the contents of the corresponding C variable(s). + + \item[\samp{:}] + The list of format units ends here; the string after the colon is + used as the function name in error messages (the ``associated + value'' of the exception that \cfunction{PyArg_ParseTuple()} + raises). + + \item[\samp{;}] + The list of format units ends here; the string after the semicolon + is used as the error message \emph{instead} of the default error + message. Clearly, \samp{:} and \samp{;} mutually exclude each + other. +\end{description} + +Note that any Python object references which are provided to the +caller are \emph{borrowed} references; do not decrement their +reference count! + +Additional arguments passed to these functions must be addresses of +variables whose type is determined by the format string; these are +used to store values from the input tuple. 
There are a few cases, as +described in the list of format units above, where these parameters +are used as input values; they should match what is specified for the +corresponding format unit in that case. + +For the conversion to succeed, the \var{arg} object must match the +format and the format must be exhausted. On success, the +\cfunction{PyArg_Parse*()} functions return true, otherwise they +return false and raise an appropriate exception. + \begin{cfuncdesc}{int}{PyArg_ParseTuple}{PyObject *args, char *format, \moreargs} Parse the parameters of a function that takes only positional parameters into local variables. Returns true on success; on - failure, it returns false and raises the appropriate exception. See - \citetitle[../ext/parseTuple.html]{Extending and Embedding the - Python Interpreter} for more information. + failure, it returns false and raises the appropriate exception. \end{cfuncdesc} \begin{cfuncdesc}{int}{PyArg_ParseTupleAndKeywords}{PyObject *args, @@ -372,8 +650,6 @@ Interpreter}. Parse the parameters of a function that takes both positional and keyword parameters into local variables. Returns true on success; on failure, it returns false and raises the appropriate exception. - See \citetitle[../ext/parseTupleAndKeywords.html]{Extending and - Embedding the Python Interpreter} for more information. \end{cfuncdesc} \begin{cfuncdesc}{int}{PyArg_Parse}{PyObject *args, char *format, @@ -440,8 +716,127 @@ PyArg_ParseTuple(args, "O|O:ref", &object, &callback) Create a new value based on a format string similar to those accepted by the \cfunction{PyArg_Parse*()} family of functions and a sequence of values. Returns the value or \NULL{} in the case of an - error; an exception will be raised if \NULL{} is returned. For more - information on the format string and additional parameters, see - \citetitle[../ext/buildValue.html]{Extending and Embedding the - Python Interpreter}. + error; an exception will be raised if \NULL{} is returned. 
+ + \cfunction{Py_BuildValue()} does not always build a tuple. It + builds a tuple only if its format string contains two or more format + units. If the format string is empty, it returns \code{None}; if it + contains exactly one format unit, it returns whatever object is + described by that format unit. To force it to return a tuple of + size 0 or one, parenthesize the format string. + + When memory buffers are passed as parameters to supply data to build + objects, as for the \samp{s} and \samp{s\#} formats, the required + data is copied. Buffers provided by the caller are never referenced + by the objects created by \cfunction{Py_BuildValue()}. In other + words, if your code invokes \cfunction{malloc()} and passes the + allocated memory to \cfunction{Py_BuildValue()}, your code is + responsible for calling \cfunction{free()} for that memory once + \cfunction{Py_BuildValue()} returns. + + In the following description, the quoted form is the format unit; + the entry in (round) parentheses is the Python object type that the + format unit will return; and the entry in [square] brackets is the + type of the C value(s) to be passed. + + The characters space, tab, colon and comma are ignored in format + strings (but not within format units such as \samp{s\#}). This can + be used to make long format strings a tad more readable. + + \begin{description} + \item[\samp{s} (string) {[char *]}] + Convert a null-terminated C string to a Python object. If the C + string pointer is \NULL, \code{None} is used. + + \item[\samp{s\#} (string) {[char *, int]}] + Convert a C string and its length to a Python object. If the C + string pointer is \NULL, the length is ignored and \code{None} is + returned. + + \item[\samp{z} (string or \code{None}) {[char *]}] + Same as \samp{s}. + + \item[\samp{z\#} (string or \code{None}) {[char *, int]}] + Same as \samp{s\#}. 
+ + \item[\samp{u} (Unicode string) {[Py_UNICODE *]}] + Convert a null-terminated buffer of Unicode (UCS-2) data to a + Python Unicode object. If the Unicode buffer pointer is \NULL, + \code{None} is returned. + + \item[\samp{u\#} (Unicode string) {[Py_UNICODE *, int]}] + Convert a Unicode (UCS-2) data buffer and its length to a Python + Unicode object. If the Unicode buffer pointer is \NULL, the + length is ignored and \code{None} is returned. + + \item[\samp{i} (integer) {[int]}] + Convert a plain C \ctype{int} to a Python integer object. + + \item[\samp{b} (integer) {[char]}] + Same as \samp{i}. + + \item[\samp{h} (integer) {[short int]}] + Same as \samp{i}. + + \item[\samp{l} (integer) {[long int]}] + Convert a C \ctype{long int} to a Python integer object. + + \item[\samp{c} (string of length 1) {[char]}] + Convert a C \ctype{int} representing a character to a Python + string of length 1. + + \item[\samp{d} (float) {[double]}] + Convert a C \ctype{double} to a Python floating point number. + + \item[\samp{f} (float) {[float]}] + Same as \samp{d}. + + \item[\samp{D} (complex) {[Py_complex *]}] + Convert a C \ctype{Py_complex} structure to a Python complex + number. + + \item[\samp{O} (object) {[PyObject *]}] + Pass a Python object untouched (except for its reference count, + which is incremented by one). If the object passed in is a + \NULL{} pointer, it is assumed that this was caused because the + call producing the argument found an error and set an exception. + Therefore, \cfunction{Py_BuildValue()} will return \NULL{} but + won't raise an exception. If no exception has been raised yet, + \exception{SystemError} is set. + + \item[\samp{S} (object) {[PyObject *]}] + Same as \samp{O}. + + \item[\samp{U} (object) {[PyObject *]}] + Same as \samp{O}. + + \item[\samp{N} (object) {[PyObject *]}] + Same as \samp{O}, except it doesn't increment the reference count + on the object. 
Useful when the object is created by a call to an + object constructor in the argument list. + + \item[\samp{O\&} (object) {[\var{converter}, \var{anything}]}] + Convert \var{anything} to a Python object through a + \var{converter} function. The function is called with + \var{anything} (which should be compatible with \ctype{void *}) as + its argument and should return a ``new'' Python object, or \NULL{} + if an error occurred. + + \item[\samp{(\var{items})} (tuple) {[\var{matching-items}]}] + Convert a sequence of C values to a Python tuple with the same + number of items. + + \item[\samp{[\var{items}]} (list) {[\var{matching-items}]}] + Convert a sequence of C values to a Python list with the same + number of items. + + \item[\samp{\{\var{items}\}} (dictionary) {[\var{matching-items}]}] + Convert a sequence of C values to a Python dictionary. Each pair + of consecutive C values adds one item to the dictionary, serving + as key and value, respectively. + + \end{description} + + If there is an error in the format string, the + \exception{SystemError} exception is set and \NULL{} returned. \end{cfuncdesc} |
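
The added text in this diff warns that passing a Python long integer where a C integer is requested involves no proper range checking, and that the semantics are inherited from downcasts in C. The sketch below illustrates only that C-level behaviour; it does not call any \cfunction{PyArg_Parse*()} function. The helper names (`narrow_to_uchar`, `checked_narrow_to_uchar`, `narrowing_demo`) are hypothetical and not part of the Python C API. The checked variant borrows the 1/0 status convention that the diff describes for `O&` converter functions.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not a Python C API function): narrow a wider value
 * with a plain C cast. Conversion to an unsigned type is defined by the C
 * standard as reduction modulo (UCHAR_MAX + 1), so the most significant
 * bits are silently dropped. */
static unsigned char narrow_to_uchar(unsigned long value)
{
    return (unsigned char)value;
}

/* Hypothetical range-checked alternative. Returns 1 and stores the value
 * on success; returns 0 and leaves *out untouched if the value does not
 * fit in an unsigned char. */
static int checked_narrow_to_uchar(unsigned long value, unsigned char *out)
{
    if (value > UCHAR_MAX)
        return 0;
    *out = (unsigned char)value;
    return 1;
}

/* Small self-check showing both behaviours side by side. */
static void narrowing_demo(void)
{
    unsigned long too_big = (unsigned long)UCHAR_MAX + 1UL;
    unsigned char out = 7;

    /* The unchecked cast wraps around: UCHAR_MAX + 1 becomes 0. */
    assert(narrow_to_uchar(too_big) == 0);

    /* The checked variant reports failure and does not modify 'out'. */
    assert(checked_narrow_to_uchar(too_big, &out) == 0);
    assert(out == 7);

    /* A value that fits is stored unchanged. */
    assert(checked_narrow_to_uchar(200UL, &out) == 1);
    assert(out == 200);
}
```

Calling `narrowing_demo()` from a `main()` runs the checks. Code that needs a guaranteed range could apply a test of this kind after parsing into a wider C type, under the assumption that the wider type can hold every value of interest.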