This patch changes the way the string .encode() method works slightly and introduces a new method .decode().

The major change is that strg.encode() will no longer try to convert Unicode returns from the codec into a string, but instead pass along the Unicode object as-is. The same is now true for all other codec return types. The underlying C APIs were changed accordingly.

Note that even though this does have the potential of breaking existing code, the chances are low: conversion from Unicode previously took place using the default encoding, which is normally set to ASCII, rendering this auto-conversion mechanism useless for most Unicode encodings.

The good news is that you can now use .encode() and .decode() with much greater ease and that the door was opened for better accessibility of the builtin codecs.

As a demonstration of the new feature, the patch includes a few new codecs which allow string to string encoding and decoding (rot13, hex, zip, uu, base64).

Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
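The behaviour described above, where the codec's return value is passed along without being coerced to a string, can be sketched in present-day Python. This is only an illustration under stated assumptions, not the code from this patch: it targets Python 3, where the string methods accept text encodings only, so it goes through `codecs.encode()` and `codecs.decode()` instead of the 2001 `.encode()`/`.decode()` methods. The helper name `roundtrip` is hypothetical.

```python
import codecs


def roundtrip(data, encoding):
    """Encode data with the named codec, then decode the result.

    Hypothetical helper for illustration. The return types are whatever
    the codec produces; nothing is converted to a string afterwards.
    """
    encoded = codecs.encode(data, encoding)
    decoded = codecs.decode(encoded, encoding)
    return encoded, decoded


# rot13 is a text-to-text codec: str in, str out.
encoded, decoded = roundtrip("Hello, World", "rot13")
assert isinstance(encoded, str)
assert decoded == "Hello, World"

# hex and base64 are bytes-to-bytes codecs: bytes in, bytes out.
for name in ("hex", "base64"):
    encoded, decoded = roundtrip(b"abc", name)
    assert isinstance(encoded, bytes)
    assert decoded == b"abc"
```

The type checks are the point of the sketch: the caller gets back the object the codec returned, which is the contract this patch introduces for the string methods and the underlying C APIs.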
author: Marc-André Lemburg <mal@egenix.com> 2001-05-15 12:00:02 (GMT)
committer: Marc-André Lemburg <mal@egenix.com> 2001-05-15 12:00:02 (GMT)
commit: 2d9204199fe8913cca9890f1822413d981587ee5 (patch)
tree: f0734f9c8721508ebbd472cbc46abd9aa66c44dd /Doc
parent: 2e0a654f6edeb58bef3cccffa42c2a236117a88c (diff)
download: cpython-2d9204199fe8913cca9890f1822413d981587ee5.zip
cpython-2d9204199fe8913cca9890f1822413d981587ee5.tar.gz
cpython-2d9204199fe8913cca9890f1822413d981587ee5.tar.bz2
1 files changed, 21 insertions, 7 deletions
diff --git a/Doc/api/api.tex b/Doc/api/api.tex
index 07a2263..e7ba299 100644
--- a/Doc/api/api.tex
+++ b/Doc/api/api.tex
@@ -2326,30 +2326,44 @@ interned string object with the same value.
                                                int size,
                                                const char *encoding,
                                                const char *errors}
-Create a string object by decoding \var{size} bytes of the encoded
-buffer \var{s}. \var{encoding} and \var{errors} have the same meaning
+Creates an object by decoding \var{size} bytes of the encoded
+buffer \var{s} using the codec registered
+for \var{encoding}. \var{encoding} and \var{errors} have the same meaning
 as the parameters of the same name in the unicode() builtin
 function. The codec to be used is looked up using the Python codec
 registry. Returns \NULL{} in case an exception was raised by the
 codec.
 \end{cfuncdesc}
 
-\begin{cfuncdesc}{PyObject*}{PyString_Encode}{const Py_UNICODE *s,
+\begin{cfuncdesc}{PyObject*}{PyString_AsDecodedObject}{PyObject *str,
+                                               const char *encoding,
+                                               const char *errors}
+Decodes a string object by passing it to the codec registered
+for \var{encoding} and returns the result as Python 
+object. \var{encoding} and \var{errors} have the same meaning as the
+parameters of the same name in the string .encode() method. The codec
+to be used is looked up using the Python codec registry. Returns
+\NULL{} in case an exception was raised by the codec.
+\end{cfuncdesc}
+
+\begin{cfuncdesc}{PyObject*}{PyString_Encode}{const char *s,
                                                int size,
                                                const char *encoding,
                                                const char *errors}
-Encodes the \ctype{Py_UNICODE} buffer of the given size and returns a
-Python string object. \var{encoding} and \var{errors} have the same
+Encodes the \ctype{char} buffer of the given size by passing it to 
+the codec registered for \var{encoding} and returns a Python object. 
+\var{encoding} and \var{errors} have the same
 meaning as the parameters of the same name in the string .encode()
 method. The codec to be used is looked up using the Python codec
 registry. Returns \NULL{} in case an exception was raised by the
 codec.
 \end{cfuncdesc}
 
-\begin{cfuncdesc}{PyObject*}{PyString_AsEncodedString}{PyObject *unicode,
+\begin{cfuncdesc}{PyObject*}{PyString_AsEncodedObject}{PyObject *str,
                                                const char *encoding,
                                                const char *errors}
-Encodes a string object and returns the result as Python string
+Encodes a string object using the codec registered
+for \var{encoding} and returns the result as Python 
 object. \var{encoding} and \var{errors} have the same meaning as the
 parameters of the same name in the string .encode() method. The codec
 to be used is looked up using the Python codec registry. Returns