author | Georg Brandl <georg@python.org> | 2007-09-28 13:13:35 (GMT) |
---|---|---|
committer | Georg Brandl <georg@python.org> | 2007-09-28 13:13:35 (GMT) |
commit | 2d2590de49081eea38152e0a25496069a059c69e (patch) | |
tree | b80eb1ae3784fd11b7edfbd64092824483838ef3 /Doc/tutorial/interpreter.rst | |
parent | 7c77f753b452cfd423c2cb790977ddcfdabfa162 (diff) | |
#1211, #1212, #1213: py3k fixes to the tutorial.
Diffstat (limited to 'Doc/tutorial/interpreter.rst')
-rw-r--r-- | Doc/tutorial/interpreter.rst | 59 |
1 files changed, 25 insertions, 34 deletions
diff --git a/Doc/tutorial/interpreter.rst b/Doc/tutorial/interpreter.rst
index ce78399..5c67ba9 100644
--- a/Doc/tutorial/interpreter.rst
+++ b/Doc/tutorial/interpreter.rst
@@ -101,11 +101,14 @@ with the *secondary prompt*, by default three dots (``...``).
 The interpreter prints a welcome message stating its version number and a
 copyright notice before printing the first prompt::
 
-   python
-   Python 1.5.2b2 (#1, Feb 28 1999, 00:02:06) [GCC 2.8.1] on sunos5
-   Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
+   $ python
+   Python 3.0a1 (py3k, Sep 12 2007, 12:21:02)
+   [GCC 3.4.6 20060404 (Red Hat 3.4.6-8)] on linux2
+   Type "help", "copyright", "credits" or "license" for more information.
    >>>
 
+.. XXX update for final release of Python 3.0
+
 Continuation lines are needed when entering a multi-line construct. As an
 example, take a look at this :keyword:`if` statement::
 
@@ -170,44 +173,32 @@ The script can be given an executable mode, or permission, using the
 Source Code Encoding
 --------------------
 
-.. XXX out of date!
+By default, Python source files are treated as encoded in UTF-8. In that
+encoding, characters of most languages in the world can be used simultaneously
+in string literals, identifiers and comments --- although the standard library
+only uses ASCII characters for identifiers, a convention that any portable code
+should follow. To display all these characters properly, your editor must
+recognize that the file is UTF-8, and it must use a font that supports all the
+characters in the file.
 
-It is possible to use encodings different than ASCII in Python source files. The
-best way to do it is to put one more special comment line right after the ``#!``
-line to define the source file encoding::
+It is also possible to specify a different encoding for source files. In order
+to do this, put one more special comment line right after the ``#!`` line to
+define the source file encoding::
 
    # -*- coding: encoding -*-
 
+With that declaration, everything in the source file will be treated as having
+the encoding *encoding* instead of UTF-8. The list of possible encodings can be
+found in the Python Library Reference, in the section on :mod:`codecs`.
 
-With that declaration, all characters in the source file will be treated as
-having the encoding *encoding*, and it will be possible to directly write
-Unicode string literals in the selected encoding. The list of possible
-encodings can be found in the Python Library Reference, in the section on
-:mod:`codecs`.
-
-For example, to write Unicode literals including the Euro currency symbol, the
-ISO-8859-15 encoding can be used, with the Euro symbol having the ordinal value
-164. This script will print the value 8364 (the Unicode codepoint corresponding
-to the Euro symbol) and then exit::
-
-   # -*- coding: iso-8859-15 -*-
-
-   currency = u"€"
-   print(ord(currency))
+For example, if your editor of choice does not support UTF-8 encoded files and
+insists on using some other encoding, say Windows-1252, you can write::
 
-If your editor supports saving files as ``UTF-8`` with a UTF-8 *byte order mark*
-(aka BOM), you can use that instead of an encoding declaration. IDLE supports
-this capability if ``Options/General/Default Source Encoding/UTF-8`` is set.
-Notice that this signature is not understood in older Python releases (2.2 and
-earlier), and also not understood by the operating system for script files with
-``#!`` lines (only used on Unix systems).
+   # -*- coding: cp-1252 -*-
 
-By using UTF-8 (either through the signature or an encoding declaration),
-characters of most languages in the world can be used simultaneously in string
-literals and comments. Using non-ASCII characters in identifiers is not
-supported. To display all these characters properly, your editor must recognize
-that the file is UTF-8, and it must use a font that supports all the characters
-in the file.
+and still use all characters in the Windows-1252 character set in the source
+files. The special encoding comment must be in the *first or second* line
+within the file.
 
 
 .. _tut-startup:
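
The encoding rules described in the patched text can be checked with a minimal sketch like the one below. It is illustrative only and not part of the commit. It assumes a current Python 3 interpreter, uses the standard library's ``tokenize.detect_encoding`` to read the declaration, and spells the codec as ``cp1252`` (an alias listed in the :mod:`codecs` documentation) rather than the hyphenated form shown in the diff. The script text and variable names are hypothetical.

```python
import io
import tokenize

# Hypothetical script: a "#!" line followed by the special encoding comment.
source_text = '#!/usr/bin/env python\n# -*- coding: cp1252 -*-\nprice = "€"\n'

# Encode the way a Windows-1252 editor would save it; the Euro sign is byte 0x80.
source_bytes = source_text.encode("cp1252")

# detect_encoding() inspects at most the first two lines, which matches the
# "first or second line" rule stated in the tutorial text.
declared, _ = tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
print(declared)

# With no declaration (and no BOM), the default source encoding is UTF-8.
default, _ = tokenize.detect_encoding(io.BytesIO(b"x = 1\n").readline)
print(default)

# Decoding with the declared encoding recovers the original text.
decoded = source_bytes.decode(declared)
```

Because only the first two lines are inspected, a declaration placed on the third line or later would be ignored and the file would be read as UTF-8.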