author | Andrew M. Kuchling <amk@amk.ca> | 2002-10-09 12:11:10 (GMT) |
---|---|---|
committer | Andrew M. Kuchling <amk@amk.ca> | 2002-10-09 12:11:10 (GMT) |
commit | 0a6fa9619e7d9a08698b58c6b4f89e7b0fd8cbaf (patch) | |
tree | 8dec2d20a7ad19297dee957abfaa3a557f817f09 | |
parent | 26bc25a6c4c377f553ff401dd993d8196da9438d (diff) | |
Minor edits and markup fixes
-rw-r--r-- | Doc/whatsnew/whatsnew23.tex | 59 |
1 files changed, 31 insertions, 28 deletions
diff --git a/Doc/whatsnew/whatsnew23.tex b/Doc/whatsnew/whatsnew23.tex
index aced4e1..cfc0b94 100644
--- a/Doc/whatsnew/whatsnew23.tex
+++ b/Doc/whatsnew/whatsnew23.tex
@@ -316,24 +316,25 @@ Hisao and Martin von L\"owis.}
 \section{PEP 277: Unicode file name support for Windows NT}
 
 On Windows NT, 2000, and XP, the system stores file names as Unicode
-strings. Traditionally, Python has represented file names are byte
-strings, which is inadequate since it renders some file names
+strings. Traditionally, Python has represented file names as byte
+strings, which is inadequate because it renders some file names
 inaccessible.
 
-Python allows now to use arbitrary Unicode strings (within limitations
-of the file system) for all functions that expect file names, in
-particular \function{open}. If a Unicode string is passed to
-\function{os.listdir}, Python returns now a list of Unicode strings.
-A new function \function{getcwdu} returns the current directory as a
-Unicode string.
+Python now allows using arbitrary Unicode strings (within the
+limitations of the file system) for all functions that expect file
+names, in particular the \function{open()} built-in. If a Unicode
+string is passed to \function{os.listdir}, Python now returns a list
+of Unicode strings. A new function, \function{os.getcwdu()}, returns
+the current directory as a Unicode string.
 
-Byte strings continue to work as file names, the system will
-transparently convert them to Unicode using the \code{mbcs} encoding.
+Byte strings still work as file names, and Python will transparently
+convert them to Unicode using the \code{mbcs} encoding.
 
-Other systems allow Unicode strings as file names as well, but convert
-them to byte strings before passing them to the system, which may
-cause UnicodeErrors. Applications can test whether arbitrary Unicode
-strings are supported as file names with \code{os.path.unicode_file_names}.
+Other systems also allow Unicode strings as file names, but convert
+them to byte strings before passing them to the system which may cause
+a \exception{UnicodeError} to be raised. Applications can test whether
+arbitrary Unicode strings are supported as file names by checking
+\member{os.path.unicode_file_names}, a Boolean value.
 
 \begin{seealso}
 
@@ -493,31 +494,33 @@ strings \samp{True} and \samp{False} instead of \samp{1} and \samp{0}.
 \section{PEP 293: Codec Error Handling Callbacks}
 
 When encoding a Unicode string into a byte string, unencodable
-characters may be encountered. So far, Python allowed to specify the
-error processing as either ``strict'' (raise \code{UnicodeError},
-default), ``ignore'' (skip the character), or ``replace'' (with
-question mark). It may be desirable to specify an alternative
-processing of the error, e.g. by inserting an XML character reference
-or HTML entity reference into the converted string.
+characters may be encountered. So far, Python has allowed specifying
+the error processing as either ``strict'' (raising
+\exception{UnicodeError}), ``ignore'' (skip the character), or
+``replace'' (with question mark), defaulting to ``strict''. It may be
+desirable to specify an alternative processing of the error, e.g. by
+inserting an XML character reference or HTML entity reference into the
+converted string.
 
 Python now has a flexible framework to add additional processing
-strategies; new error handlers can be added with
+strategies. New error handlers can be added with
 \function{codecs.register_error}. Codecs then can access the error
-handler with \code{codecs.lookup_error}. An equivalent C API has been
-added for codecs written in C. The error handler gets various state
-information, such as the string being converted, the position in the
-string where the error was detected, and the target encoding. It can
-then either raise an exception, or return a replacement string.
+handler with \function{codecs.lookup_error}. An equivalent C API has
+been added for codecs written in C. The error handler gets the
+necessary state information, such as the string being converted, the
+position in the string where the error was detected, and the target
+encoding. The handler can then either raise an exception, or return a
+replacement string.
 
 Two additional error handlers have been implemented using this
-framework: ``backslashreplace'' using Python backslash quoting to
+framework: ``backslashreplace'' uses Python backslash quoting to
 represent the unencodable character, and ``xmlcharrefreplace'' emits
 XML character references.
 
 \begin{seealso}
 
 \seepep{293}{Codec Error Handling Callbacks}{Written and implemented by
-Walter Dörwald.}
+Walter D\"orwald.}
 
 \end{seealso}
 
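The PEP 293 text touched by the second hunk describes registering a custom error handler and looking it up by name. A minimal sketch of that flow follows. It is not part of this commit; it assumes present-day Python 3 syntax rather than the Python 2.3 of the commit date, and the handler name `hex-placeholder` and the function `hex_placeholder` are hypothetical names chosen for illustration.

```python
import codecs


def hex_placeholder(exc):
    """Hypothetical handler: replace each unencodable character with [U+XXXX]."""
    # Only encoding errors are handled here; anything else is re-raised.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # exc.object is the string being converted; exc.start and exc.end
    # delimit the unencodable run; exc.encoding names the target encoding.
    bad = exc.object[exc.start:exc.end]
    replacement = "".join("[U+%04X]" % ord(ch) for ch in bad)
    # Return the replacement string and the position at which to resume.
    return replacement, exc.end


# Register the handler under a name, then use that name as the errors argument.
codecs.register_error("hex-placeholder", hex_placeholder)

text = "caf\u00e9"
encoded = text.encode("ascii", "hex-placeholder")          # b'caf[U+00E9]'

# The two handlers named in the diffed text, for comparison.
builtin_backslash = text.encode("ascii", "backslashreplace")  # b'caf\\xe9'
builtin_xml = text.encode("ascii", "xmlcharrefreplace")       # b'caf&#233;'

# A codec retrieves a registered handler by name with lookup_error.
same_handler = codecs.lookup_error("hex-placeholder")
```

The handler either raises or returns a `(replacement, resume_position)` pair, which matches the "raise an exception, or return a replacement string" behaviour described in the diffed paragraph.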