author     Trent Nelson <trent.nelson@snakebite.org>  2008-03-18 22:41:35 (GMT)
committer  Trent Nelson <trent.nelson@snakebite.org>  2008-03-18 22:41:35 (GMT)
commit     428de65ca99492436130165bfbaeb56d6d1daec7 (patch)
tree       d6c11516a28d8ca658e1f35ac6d7cc802958e336 /Lib/test/tokenize_tests-utf8-coding-cookie-and-utf8-bom-sig.txt
parent     112367a980481d54f8c21802ee2538a3485fdd41 (diff)
- Issue #719888: Updated tokenize to use a bytes API. generate_tokens has been
renamed tokenize and now works with bytes rather than strings. A new
detect_encoding function has been added for determining source file encoding
according to PEP-0263. Token sequences returned by tokenize always start
with an ENCODING token which specifies the encoding used to decode the file.
This token is used to encode the output of untokenize back to bytes.
Credit goes to Michael "I'm-going-to-name-my-first-child-unittest" Foord from Resolver Systems for this work.
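The API change described above can be illustrated with a short sketch. This is not part of the commit itself, just a hedged usage example of the post-change `tokenize` module: `detect_encoding` inspects a bytes readline for a PEP 263 coding cookie (or BOM), `tokenize` consumes bytes and emits an ENCODING token first, and `untokenize` uses that token to produce bytes again. The sample source bytes are made up for illustration.

```python
import io
import tokenize

# Hypothetical source with a utf-8 coding cookie, as raw bytes.
source = b"# -*- coding: utf-8 -*-\nx = 1\n"

# detect_encoding reads at most two lines from the readline callable
# and returns the encoding plus the lines it consumed.
encoding, consumed = tokenize.detect_encoding(io.BytesIO(source).readline)
# encoding is "utf-8" here

# tokenize now takes a bytes readline; the first token it yields
# is an ENCODING token naming the encoding used to decode the file.
tokens = list(tokenize.tokenize(io.BytesIO(source).readline))
# tokens[0].type == tokenize.ENCODING

# untokenize sees the leading ENCODING token and encodes its
# output back to bytes rather than returning a string.
roundtrip = tokenize.untokenize(tokens)
```

Note that the old string-based entry point survives as `generate_tokens` in later releases, but this commit's renaming made the bytes API the primary interface.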
Diffstat (limited to 'Lib/test/tokenize_tests-utf8-coding-cookie-and-utf8-bom-sig.txt')
-rw-r--r--  Lib/test/tokenize_tests-utf8-coding-cookie-and-utf8-bom-sig.txt  12
1 file changed, 12 insertions(+), 0 deletions(-)
diff --git a/Lib/test/tokenize_tests-utf8-coding-cookie-and-utf8-bom-sig.txt b/Lib/test/tokenize_tests-utf8-coding-cookie-and-utf8-bom-sig.txt
new file mode 100644
index 0000000..99b1399
--- /dev/null
+++ b/Lib/test/tokenize_tests-utf8-coding-cookie-and-utf8-bom-sig.txt
@@ -0,0 +1,12 @@
+# -*- coding: utf-8 -*-
+# IMPORTANT: this file has the utf-8 BOM signature '\xef\xbb\xbf'
+# at the start of it. Make sure this is preserved if any changes
+# are made!
+
+# Arbitrary encoded utf-8 text (stolen from test_doctest2.py).
+x = 'ЉЊЈЁЂ'
+def y():
+ """
+ And again in a comment. ЉЊЈЁЂ
+ """
+ pass