path: root/Lib/test/test_unicode.py
Commit message  Author  Age  Files  Lines
* bpo-28146: Fix a confusing error message in str.format() (GH-24213)  Irit Katriel  2021-05-13  1  -2/+5
|
|   Automerge-Triggered-By: GH:pitrou
* bpo-44029: Remove Py_UNICODE APIs (GH-25881)  Inada Naoki  2021-05-07  1  -34/+0
|
|   Remove deprecated `Py_UNICODE` APIs: `PyUnicode_Encode`, `PyUnicode_EncodeUTF7`, `PyUnicode_EncodeUTF8`, `PyUnicode_EncodeUTF16`, `PyUnicode_EncodeUTF32`, `PyUnicode_EncodeLatin1`, `PyUnicode_EncodeMBCS`, `PyUnicode_EncodeDecimal`, `PyUnicode_EncodeRawUnicodeEscape`, `PyUnicode_EncodeCharmap`, `PyUnicode_EncodeUnicodeEscape`, `PyUnicode_TransformDecimalToASCII`, `PyUnicode_TranslateCharmap`, `PyUnicodeEncodeError_Create`, `PyUnicodeTranslateError_Create`.
|
|   See :pep:`393` and :pep:`624` for reference.
* bpo-38659: [Enum] add _simple_enum decorator (GH-25497)  Ethan Furman  2021-04-21  1  -4/+5
|
|   add:
|
|   * `_simple_enum` decorator to transform a normal class into an enum
|   * `_test_simple_enum` function to compare
|   * `_old_convert_` to enable checking `_convert_` generated enums
|
|   `_simple_enum` takes a normal class and converts it into an enum:
|
|       @_simple_enum(Enum)
|       class Color:
|           RED = 1
|           GREEN = 2
|           BLUE = 3
|
|   `_old_convert_` works much like `_convert_` does, using the original logic:
|
|       # in a test file
|       import socket, _socket, enum
|       CheckedAddressFamily = enum._old_convert_(
|               enum.IntEnum, 'AddressFamily', 'socket',
|               lambda C: C.isupper() and C.startswith('AF_'),
|               source=_socket,
|               )
|
|   `_test_simple_enum` takes a traditional enum and a simple enum and compares the two:
|
|       # in the REPL or the same module as Color
|       class CheckedColor(Enum):
|           RED = 1
|           GREEN = 2
|           BLUE = 3
|
|       _test_simple_enum(CheckedColor, Color)
|
|       _test_simple_enum(CheckedAddressFamily, socket.AddressFamily)
|
|   Any important differences will raise a TypeError.
* Revert "bpo-38659: [Enum] add _simple_enum decorator (GH-25285)" (GH-25476)  Ethan Furman  2021-04-20  1  -5/+4
|
|   This reverts commit dbac8f40e81eb0a29dc833e6409a1abf47467da6.
* bpo-38659: [Enum] add _simple_enum decorator (GH-25285)  Ethan Furman  2021-04-20  1  -4/+5
|
|   add:
|
|   _simple_enum decorator to transform a normal class into an enum
|   _test_simple_enum function to compare
|   _old_convert_ to enable checking _convert_ generated enums
|
|   _simple_enum takes a normal class and converts it into an enum:
|
|       @_simple_enum(Enum)
|       class Color:
|           RED = 1
|           GREEN = 2
|           BLUE = 3
|
|   _old_convert_ works much like _convert_ does, using the original logic:
|
|       # in a test file
|       import socket, _socket, enum
|       CheckedAddressFamily = enum._old_convert_(
|               enum.IntEnum, 'AddressFamily', 'socket',
|               lambda C: C.isupper() and C.startswith('AF_'),
|               source=_socket,
|               )
|
|   _test_simple_enum takes a traditional enum and a simple enum and compares the two:
|
|       # in the REPL or the same module as Color
|       class CheckedColor(Enum):
|           RED = 1
|           GREEN = 2
|           BLUE = 3
|
|       _test_simple_enum(CheckedColor, Color)
|
|       _test_simple_enum(CheckedAddressFamily, socket.AddressFamily)
|
|   Any important differences will raise a TypeError.
* bpo-40066: Enum: modify `repr()` and `str()` (GH-22392)  Ethan Furman  2021-03-31  1  -4/+4
|
|   * Enum: streamline repr() and str(); improve docs
|
|   - repr() is now ``enum_class.member_name``
|   - stdlib global enums are ``module_name.member_name``
|   - str() is now ``member_name``
|   - add HOW-TO section for ``Enum``
|   - change main documentation to be an API reference
* bpo-43405: Fix DeprecationWarnings in test_unicode (GH-24754)  Zackery Spytz  2021-03-07  1  -20/+24
|
|   DeprecationWarnings were being raised in the test_encode_decimal() and test_transform_decimal() methods after 91a639a0949.
* bpo-36346: Emit DeprecationWarning for PyArg_Parse() with 'u' or 'Z'. (GH-20927)  Inada Naoki  2021-02-22  1  -2/+4
|
|   Emit DeprecationWarning when PyArg_Parse*() is called with the 'u' or 'Z' format. See PEP 623.
* bpo-27772: Make preceding width with 0 valid in string format. (GH-11270)  Serhiy Storchaka  2021-01-25  1  -0/+6
|
|   Previously it was an error with a confusing error message.
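A small sketch of what this change means in practice, assuming a Python 3.10 or later interpreter (the first release line to include this commit). On older interpreters the same call raises ValueError, so the helper reports that instead of assuming a result. The helper name `pad_with_zero` is made up for this illustration.

```python
def pad_with_zero(text, width):
    """Format a string with a width that is preceded by '0', e.g. spec '06'."""
    spec = "0{}".format(width)
    try:
        # Strings keep their default left alignment; '0' is only the fill.
        return format(text, spec)
    except ValueError as exc:
        # Interpreters that predate this change reject the spec for strings.
        return "ValueError: {}".format(exc)


result = pad_with_zero("abc", 6)
print(result)
```

On an interpreter that includes the change, the string is left-aligned and padded with zeros on the right.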
* bpo-41100: Support macOS 11 and Apple Silicon (GH-22855)  Ronald Oussoren  2020-11-08  1  -0/+2
|
|   Co-authored-by: Lawrence D’Anna <lawrence_danna@apple.com>
|
|   * Add support for macOS 11 and Apple Silicon (aka arm64)
|
|     As a side effect of this work use the system copy of libffi on macOS, and remove the vendored copy
|
|   * Support building on recent versions of macOS while deploying to older versions
|
|     This allows building installers on macOS 11 while still supporting macOS 10.9.
* bpo-41919, test_codecs: Move codecs.register calls to setUp() (GH-22513)  Hai Shi  2020-10-16  1  -1/+4
|
|   * Move the codecs' (un)register operation to testcases.
|   * Remove _codecs._forget_codec() and _PyCodec_Forget()
* bpo-36346: Make using the legacy Unicode C API optional (GH-21437)  Serhiy Storchaka  2020-07-10  1  -0/+4
|
|   Add compile-time option USE_UNICODE_WCHAR_CACHE. Setting it to 0 makes the interpreter not use the wchar_t cache and the legacy Unicode C API.
* bpo-40275: Use new test.support helper submodules in tests (GH-21317)  Hai Shi  2020-07-06  1  -5/+7
|
* bpo-36346: Raise DeprecationWarning when creating legacy Unicode (GH-20933)  Inada Naoki  2020-06-30  1  -1/+3
|
* bpo-41055: Remove outdated tests for the tp_print slot. (GH-21006)  Serhiy Storchaka  2020-06-21  1  -16/+0
|
* bpo-40596: Fix str.isidentifier() for non-canonicalized strings containing non-BMP characters on Windows. (GH-20053)  Serhiy Storchaka  2020-05-12  1  -0/+7
|
* Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer()" (GH-18985)  Inada Naoki  2020-03-14  1  -22/+0
|
|   * Revert "bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)"
|
|     This reverts commit c7ad974d341d3edb6b9d2a2dcae4d3d4794ada6b.
|
|   * Update unicodeobject.h
* bpo-39087: Add _PyUnicode_GetUTF8Buffer() (GH-17659)  Inada Naoki  2020-03-14  1  -0/+22
|
|   Co-authored-by: Victor Stinner <vstinner@python.org>
* Update some www.unicode.org URLs to use HTTPS. (GH-18912)  Benjamin Peterson  2020-03-11  1  -1/+1
|
* bpo-15999: Clean up of handling boolean arguments. (GH-15610)  Serhiy Storchaka  2019-09-01  1  -8/+8
|
|   * Use the 'p' format unit instead of manually calling PyObject_IsTrue().
|   * Pass boolean values instead of 0/1 integers to functions that need a boolean.
|   * Convert some arguments to boolean only once.
* bpo-36502: Correct documentation of str.isspace() (GH-15019)  Greg Price  2019-08-14  1  -1/+12
|
|   The documented definition was much broader than the real one: there are tons of characters with general category "Other", and we don't (and shouldn't) treat most of them as whitespace.
|
|   Rewrite the definition to agree with the comment on _PyUnicode_IsWhitespace, and with the logic in makeunicodedata.py, which is what generates that function and so ultimately governs.
|
|   Add suitable breadcrumbs so that a reader who wants to pin down exactly what this definition means (what's a "bidirectional class" of "B"?) can do so. The `unicodedata` module documentation is an appropriate central place for our references to Unicode's own copious documentation, so point there.
|
|   Also add to the isspace() test a thorough check that the implementation agrees with the intended definition.
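The kind of check this commit describes can be sketched as follows. This is an illustration and not the test from the commit: it assumes the corrected definition is "general category Zs, or bidirectional class WS, B, or S", and it only scans the Basic Multilingual Plane. The helper name `is_space_by_definition` is made up for this example.

```python
import unicodedata


def is_space_by_definition(char):
    """Whitespace per the assumed definition: category Zs or bidi class WS/B/S."""
    return (
        unicodedata.category(char) == "Zs"
        or unicodedata.bidirectional(char) in ("WS", "B", "S")
    )


# Collect every BMP code point where str.isspace() disagrees with the definition.
mismatches = [
    code_point
    for code_point in range(0x10000)
    if chr(code_point).isspace() != is_space_by_definition(chr(code_point))
]
print("mismatches:", mismatches)
```

An empty list means the implementation and the documented definition agree over the scanned range.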
* bpo-37476: Adding tests for asutf8 and asutf8andsize (GH-14531)  Hai Shi  2019-07-20  1  -0/+28
|
* bpo-37388: Development mode check encoding and errors (GH-14341)  Victor Stinner  2019-06-25  1  -0/+62
|
|   In development mode and in debug builds, encoding and errors arguments are now checked on string encoding and decoding operations. Examples: open(), str.encode() and bytes.decode().
|
|   By default, for best performance, the errors argument is only checked at the first encoding/decoding error, and the encoding argument is sometimes ignored for empty strings.
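A rough way to observe the difference, assuming Python 3.8 or later and a regular (non-debug) build started without PYTHONDEVMODE set. The sketch runs the same one-line snippet in a child interpreter with and without `-X dev`. The error handler name `no-such-handler` and the helper `exit_status` are made up for this illustration.

```python
import subprocess
import sys

# An ASCII-only string never triggers an encoding error, so by default the
# bogus error handler name is never looked up.
SNIPPET = "'abc'.encode('utf-8', 'no-such-handler')"


def exit_status(dev_mode):
    """Run SNIPPET in a child interpreter and return its exit status."""
    command = [sys.executable]
    if dev_mode:
        command += ["-X", "dev"]
    command += ["-c", SNIPPET]
    completed = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode


default_status = exit_status(dev_mode=False)
dev_status = exit_status(dev_mode=True)
print("default:", default_status, "dev mode:", dev_status)
```

Under the stated assumptions the default run is expected to succeed, while the development-mode run fails because the unknown handler name is rejected up front.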
* bpo-36549: str.capitalize now titlecases the first character instead of uppercasing it (GH-12804)  Kingsley M  2019-04-12  1  -1/+1
|
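The difference between titlecasing and uppercasing shows up with characters whose two mappings differ. A minimal sketch, assuming Python 3.8 or later; the sample word is chosen for illustration and starts with U+FB01 (LATIN SMALL LIGATURE FI).

```python
word = "\ufb01nnish"  # first character is the single-code-point ligature "fi"

# str.capitalize() titlecases the first character: the ligature becomes "Fi".
capitalized = word.capitalize()

# Uppercasing the first character by hand expands the ligature to "FI".
uppercased_first = word[0].upper() + word[1:]

print(capitalized, uppercased_first)
```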
* bpo-36297: remove "unicode_internal" codec (GH-12342)  Inada Naoki  2019-03-18  1  -22/+14
|
* bpo-33817: Fix _PyBytes_Resize() for empty bytes object. (GH-11516)  Serhiy Storchaka  2019-01-12  1  -0/+6
|
|   Also add tests for PyUnicode_FromFormat() and PyBytes_FromFormat() with empty result.
* Revert "bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)" (GH-9187)  Victor Stinner  2018-09-11  1  -4/+0
|
|   This reverts commit 886483e2b9bbabf60ab769683269b873381dd5ee.
* bpo-34595: Add %T format to PyUnicode_FromFormatV() (GH-9080)  Victor Stinner  2018-09-07  1  -0/+4
|
|   * Add %T format to PyUnicode_FromFormatV(), and so to PyUnicode_FromFormat() and PyErr_Format(), to format an object type name: equivalent to "%s" with Py_TYPE(obj)->tp_name.
|   * Replace Py_TYPE(obj)->tp_name with %T format in unicodeobject.c.
|   * Add unit test on %T format.
|   * Rename unicode_fromformat_write_cstr() to unicode_fromformat_write_utf8(), to make the intent more explicit.
* bpo-22602: Raise an exception in the UTF-7 decoder for ill-formed sequences starting with "+". (GH-8741)  Zackery Spytz  2018-08-19  1  -0/+4
|
|   The UTF-7 decoder now raises UnicodeDecodeError for ill-formed sequences starting with "+" (as specified in RFC 2152).
* bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342)  INADA Naoki  2018-01-27  1  -0/+5
|
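A short illustration of the method this commit adds, assuming Python 3.7 or later. The sample values are chosen for this example only.

```python
samples = ["", "hello", "caf\u00e9", b"bytes", b"\x80", bytearray(b"ok")]

# isascii() is True when every character or byte is in the range 0..127;
# an empty string or bytes object counts as ASCII.
results = [sample.isascii() for sample in samples]
print(results)
```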
* bpo-31979: Simplify transforming decimals to ASCII (#4336)  Serhiy Storchaka  2017-11-13  1  -5/+8
|
|   in int(), float() and complex() parsers.
|
|   This also speeds up parsing non-ASCII numbers by around 20%.
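For context, the "non-ASCII numbers" mentioned here are strings made of Unicode decimal digits, which the numeric parsers accept. A minimal sketch with sample strings chosen for illustration:

```python
arabic_indic = "\u0661\u0662\u0663"      # ARABIC-INDIC DIGITS ONE, TWO, THREE
fullwidth = "\uff13.\uff11\uff14"        # FULLWIDTH DIGITS THREE, ONE, FOUR

# Both parsers map Unicode decimal digits to their ASCII equivalents first.
as_int = int(arabic_indic)
as_float = float(fullwidth)
print(as_int, as_float)
```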
* bpo-30978: str.format_map() now passes key lookup exceptions through. (#2790)  Serhiy Storchaka  2017-08-03  1  -0/+7
|
|   Previously any exception was replaced with a KeyError exception.
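A sketch of the behaviour after this change, assuming an interpreter that includes it. The class `FailingMapping` and the helper `exception_type_from_lookup` are hypothetical names used only for this illustration.

```python
class FailingMapping:
    """Mapping-like object whose key lookup fails with a non-KeyError exception."""

    def __getitem__(self, key):
        raise ZeroDivisionError("lookup of {!r} failed".format(key))


def exception_type_from_lookup():
    """Return the type of exception that escapes str.format_map()."""
    try:
        "{a}".format_map(FailingMapping())
    except Exception as exc:
        return type(exc)
    return None


raised = exception_type_from_lookup()
print(raised.__name__ if raised else None)
```

The original exception type reaches the caller instead of being replaced with KeyError.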
* bpo-29919: Remove unused imports found by pyflakes (#137)  Victor Stinner  2017-03-27  1  -1/+0
|
|   Also make minor PEP8 coding style fixes on modified imports.
* bpo-28598: Support __rmod__ for RHS subclasses of str in % string formatting operations (#51)  Martijn Pieters  2017-02-23  1  -0/+9
|
|   When you use `'%s' % SubClassOfStr()`, where `SubClassOfStr.__rmod__` exists, the reverse operation is ignored as normally such string formatting operations use the `PyUnicode_Format()` fast path. This patch tests for subclasses of `str` first and picks the slow path in that case.
|
|   Patch by Martijn Pieters.
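The situation described above can be reproduced with a small `str` subclass. This is a sketch assuming an interpreter that includes the fix; the class name `Tagged` is made up for this illustration.

```python
class Tagged(str):
    """str subclass that overrides the reflected % operator."""

    def __rmod__(self, template):
        # Called as Tagged.__rmod__(right_operand, left_operand).
        return "rmod({!r}, {!r})".format(template, str(self))


# The right operand is a str subclass with its own __rmod__, so the
# reflected method takes precedence over ordinary %-formatting.
result = "%s" % Tagged("x")
print(result)
```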
* Issue #29145: Merge test from 3.6  Martin Panter  2017-01-14  1  -0/+7
|\
| * Merge tests from 3.5  Martin Panter  2017-01-14  1  -0/+7
| |\
| | * Issues #1621, #29145: Test for str.join() overflow  Martin Panter  2017-01-12  1  -0/+7
| | |
* | | Issue #28992: Use bytes.fromhex().  Serhiy Storchaka  2016-12-21  1  -7/+4
| | |
* | | Issue #28822: Adjust indices handling of PyUnicode_FindChar().  Xiang Zhang  2016-12-20  1  -0/+23
|/ /
* | Merge spelling and grammar from 3.5  Martin Panter  2016-12-18  1  -1/+1
|\ \
| |/
| * Fix spelling and grammar in code comments and documentation  Martin Panter  2016-12-18  1  -1/+1
| |
* | Issue 28128: Print out better error/warning messages for invalid string escapes.  Eric V. Smith  2016-10-31  1  -7/+0
| |
| |   Backport to 3.6.
* | Merge from 3.5.  Serhiy Storchaka  2016-10-08  1  -1/+44
|\ \
| |/
| * Issue #28379: Added sanity checks and tests for PyUnicode_CopyCharacters().  Serhiy Storchaka  2016-10-08  1  -1/+44
| |
| |   Patch by Xiang Zhang.
* | test_invalid_sequences doesn't seem to need to stay in CAPITest.  Serhiy Storchaka  2016-10-02  1  -7/+7
| |
| |   Reported by Xiang Zhang.
* | Issue #28295: Fixed the documentation and added tests for PyUnicode_AsUCS4().  Serhiy Storchaka  2016-10-02  1  -0/+17
|\ \
| |/
| |   Original patch by Xiang Zhang.
| * Issue #28295: Fixed the documentation and added tests for PyUnicode_AsUCS4().  Serhiy Storchaka  2016-10-02  1  -0/+17
| |
| |   Original patch by Xiang Zhang.
* | Moved Unicode C API related tests to separate test class.  Serhiy Storchaka  2016-10-02  1  -114/+117
|\ \
| |/
| * Moved Unicode C API related tests to separate test class.  Serhiy Storchaka  2016-10-02  1  -114/+117
| |
* | #27364: Deprecate invalid escape strings in str/bytes.  R David Murray  2016-09-08  1  -0/+7
| |
| |   Patch by Emanuel Barry, reviewed by Serhiy Storchaka and Martin Panter.
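The deprecation can be observed by compiling a literal that contains an unrecognised escape. This sketch assumes Python 3.6 or later; the warning category is DeprecationWarning on the releases this commit targets, and newer releases may report SyntaxWarning instead, so no single category is assumed. The helper name `warnings_for` is made up for this illustration.

```python
import warnings


def warnings_for(source):
    """Compile an expression and return the names of the warnings it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<example>", "eval")
    return [warning.category.__name__ for warning in caught]


invalid_escape = warnings_for(r"'\d'")   # "\d" is not a recognised escape
valid_escape = warnings_for(r"'\\d'")    # an escaped backslash is fine
print(invalid_escape, valid_escape)
```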