Explain source encodings. Fixes #683486.

author: Martin v. Löwis <martin@v.loewis.de> 2003-06-28 08:11:55 (GMT)
committer: Martin v. Löwis <martin@v.loewis.de> 2003-06-28 08:11:55 (GMT)
commit: 7928f388c445223cef857a9cee94bb20f61d9286 (patch)
tree: 6441f13ea10e8bd8c5aef3a64456facc8200b190 /Doc/tut/tut.tex
parent: ab1e5858eea540e50e8acccdbd37ff86a5afdd19 (diff)
download: cpython-7928f388c445223cef857a9cee94bb20f61d9286.zip
cpython-7928f388c445223cef857a9cee94bb20f61d9286.tar.gz
cpython-7928f388c445223cef857a9cee94bb20f61d9286.tar.bz2
1 files changed, 33 insertions, 0 deletions
diff --git a/Doc/tut/tut.tex b/Doc/tut/tut.tex
index 119e075..2104775 100644
--- a/Doc/tut/tut.tex
+++ b/Doc/tut/tut.tex
@@ -303,6 +303,39 @@ beginning of the script and giving the file an executable mode.  The
 the hash, or pound, character, \character{\#}, is used to start a
 comment in Python.
 
+\subsection{Source Code Encoding}
+
+It is possible to use encodings other than ASCII in Python source
+files. The best way to do so is to put one more special comment line
+right after the \code{\#!} line to declare the encoding:
+
+\begin{verbatim}
+# -*- coding: iso-8859-1 -*- 
+\end{verbatim}
+
+With that declaration, all characters in the source file will be
+treated as \code{iso-8859-1}, and it will be possible to write
+Unicode string literals directly in the selected encoding. The list
+of possible encodings can be found in the
+\citetitle[../lib/lib.html]{Python Library Reference}, in the section
+on \module{codecs}.
+
+If your editor supports saving files as \code{UTF-8} with a UTF-8
+signature (also known as a BOM, or byte order mark), you can use that
+instead of an encoding declaration. IDLE supports such saving if
+\code{Options/General/Default Source Encoding/UTF-8} is set. Note
+that this signature is not understood by older Python releases (2.2
+and earlier), nor by the operating system when it processes the
+\code{\#!} line of a script.
+
+By using UTF-8 (either through the signature or an encoding
+declaration), characters of most languages in the world can be used
+simultaneously in string literals and comments. Using non-ASCII
+characters in identifiers is not supported. To display all these
+characters properly, your editor must recognize that the file is
+UTF-8, and it must use a font that supports all the characters in the
+file.
+
 \subsection{The Interactive Startup File \label{startup}}
 
 % XXX This should probably be dumped in an appendix, since most people
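
The two mechanisms described in the new subsection (an encoding comment after the #! line, or a UTF-8 signature at the start of the file) can be checked with a short script. This is a minimal sketch, not part of the patch. It assumes a modern Python 3 interpreter rather than the 2.3 release the text documents, and it uses the standard library's tokenize.detect_encoding; the helper name declared_encoding and the sample sources are made up for illustration. Note that Python 3 falls back to UTF-8 when nothing is declared, which differs from the ASCII default the tutorial text assumes.

```python
import codecs
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding Python 3 would use to decode source_bytes.

    Hypothetical helper for illustration; it only wraps
    tokenize.detect_encoding, which inspects a leading UTF-8 signature
    and the first two lines of the source.
    """
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


# Encoding comment on the second line, right after the #! line.
# The byte 0xF6 is "o with diaeresis" in iso-8859-1.
with_comment = (
    b"#!/usr/bin/env python\n"
    b"# -*- coding: iso-8859-1 -*-\n"
    b"name = 'L\xf6wis'\n"
)

# No encoding comment, but the file starts with the UTF-8 signature (BOM).
with_signature = codecs.BOM_UTF8 + "name = 'L\u00f6wis'\n".encode("utf-8")

print(declared_encoding(with_comment))
print(declared_encoding(with_signature))

# Decoding with the detected encoding recovers the intended text.
decoded = with_comment.decode(declared_encoding(with_comment))
print(decoded.splitlines()[2])
```

Working on bytes held in memory keeps the sketch independent of any editor or file system settings; the same helper could be given the contents of a real file opened in binary mode.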