author | Marc-André Lemburg <mal@egenix.com> | 2001-05-15 12:00:02 (GMT) |
---|---|---|
committer | Marc-André Lemburg <mal@egenix.com> | 2001-05-15 12:00:02 (GMT) |
commit | 2d9204199fe8913cca9890f1822413d981587ee5 (patch) | |
tree | f0734f9c8721508ebbd472cbc46abd9aa66c44dd /Include/stringobject.h | |
parent | 2e0a654f6edeb58bef3cccffa42c2a236117a88c (diff) | |
This patch changes the way the string .encode() method works slightly
and introduces a new method .decode().
The major change is that strg.encode() will no longer try to convert
Unicode returns from the codec into a string, but instead pass along
the Unicode object as-is. The same is now true for all other codec
return types. The underlying C APIs were changed accordingly.
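
The pass-through behaviour described above can be sketched in present-day Python. This is only an illustration under stated assumptions: it runs on Python 3, where the module-level codecs.encode() function stands in for the Python 2 string method this patch changes, and the rot13 and hex codec names are the ones still shipped with the standard library. The point it shows is that the caller receives whatever object type the codec returns, with no forced conversion.

```python
import codecs

# Assumption: Python 3, where codecs.encode() takes the place of the
# Python 2 str.encode() method discussed in this commit message.

# rot13 is a text-to-text codec: str in, str out.
text_result = codecs.encode("Hello", "rot13")

# hex is a bytes-to-bytes codec: bytes in, bytes out.
bytes_result = codecs.encode(b"Hello", "hex")

# The result type is decided by the codec, not by the calling function.
print(type(text_result).__name__, text_result)
print(type(bytes_result).__name__, bytes_result)
```
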
Note that even though this does have the potential of breaking
existing code, the chances are low since conversion from Unicode
previously took place using the default encoding, which is normally
set to ASCII, rendering this auto-conversion mechanism useless for
most Unicode encodings.
The good news is that you can now use .encode() and .decode() with
much greater ease and that the door was opened for better accessibility
of the builtin codecs.
As a demonstration of the new feature, the patch includes a few new
codecs which allow string-to-string encoding and decoding (rot13,
hex, zip, uu, base64).
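
A round trip through some of these codecs might look like the following sketch. It assumes Python 3 and the codec names hex, base64, zlib and rot13 as they exist in today's standard library (zlib being the present-day name for the compression codec listed here as zip); round_trip is a hypothetical helper introduced only for this example and is not part of the patch.

```python
import codecs

def round_trip(data, encoding):
    """Encode and then decode `data` with the same codec (hypothetical helper)."""
    encoded = codecs.encode(data, encoding)
    return codecs.decode(encoded, encoding)

payload = b"spam and eggs"

# Bytes-to-bytes codecs: decoding the encoded form restores the input.
results = {name: round_trip(payload, name) for name in ("hex", "base64", "zlib")}

# rot13 works on text rather than bytes and is its own inverse.
rot13_result = round_trip("spam and eggs", "rot13")

for name, value in results.items():
    print(name, value == payload)
print("rot13", rot13_result == "spam and eggs")
```
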
Written by Marc-Andre Lemburg. Copyright assigned to the PSF.
Diffstat (limited to 'Include/stringobject.h')
-rw-r--r-- | Include/stringobject.h | 43 |
1 files changed, 40 insertions, 3 deletions
diff --git a/Include/stringobject.h b/Include/stringobject.h
index cadd78e..12df75a 100644
--- a/Include/stringobject.h
+++ b/Include/stringobject.h
@@ -78,7 +78,7 @@ extern DL_IMPORT(void) _Py_ReleaseInternedStrings(void);
 
 /* --- Generic Codecs ----------------------------------------------------- */
 
-/* Create a string object by decoding the encoded string s of the
+/* Create an object by decoding the encoded string s of the
    given size. */
 
 extern DL_IMPORT(PyObject*) PyString_Decode(
@@ -89,7 +89,7 @@ extern DL_IMPORT(PyObject*) PyString_Decode(
     );
 
 /* Encodes a char buffer of the given size and returns a
-   Python string object. */
+   Python object. */
 
 extern DL_IMPORT(PyObject*) PyString_Encode(
     const char *s,              /* string char buffer */
@@ -98,15 +98,52 @@ extern DL_IMPORT(PyObject*) PyString_Encode(
     const char *errors          /* error handling */
     );
 
-/* Encodes a string object and returns the result as Python string
+/* Encodes a string object and returns the result as Python
    object. */
 
+extern DL_IMPORT(PyObject*) PyString_AsEncodedObject(
+    PyObject *str,              /* string object */
+    const char *encoding,       /* encoding */
+    const char *errors          /* error handling */
+    );
+
+/* Encodes a string object and returns the result as Python string
+   object.
+
+   If the codec returns an Unicode object, the object is converted
+   back to a string using the default encoding.
+
+   DEPRECATED - use PyString_AsEncodedObject() instead. */
+
 extern DL_IMPORT(PyObject*) PyString_AsEncodedString(
     PyObject *str,              /* string object */
     const char *encoding,       /* encoding */
     const char *errors          /* error handling */
     );
 
+/* Decodes a string object and returns the result as Python
+   object. */
+
+extern DL_IMPORT(PyObject*) PyString_AsDecodedObject(
+    PyObject *str,              /* string object */
+    const char *encoding,       /* encoding */
+    const char *errors          /* error handling */
+    );
+
+/* Decodes a string object and returns the result as Python string
+   object.
+
+   If the codec returns an Unicode object, the object is converted
+   back to a string using the default encoding.
+
+   DEPRECATED - use PyString_AsDecodedObject() instead. */
+
+extern DL_IMPORT(PyObject*) PyString_AsDecodedString(
+    PyObject *str,              /* string object */
+    const char *encoding,       /* encoding */
+    const char *errors          /* error handling */
+    );
+
 /* Provides access to the internal data buffer and size of a string
    object or the default encoded version of an Unicode object. Passing
    NULL as *len parameter will force the string buffer to be