author: Victor Stinner <vstinner@python.org> 2022-09-30 12:58:30 (GMT)
committer: GitHub <noreply@github.com> 2022-09-30 12:58:30 (GMT)
commit: 9f2f1dd131b912e224cd0269adde8879799686c4
tree: b9976f4716a607ec95f03d02bff872b0e33e710b /Misc
parent: ff54dd96cbe589635ed95c8b5b26bc768166b07d
gh-94526: getpath_dirname() no longer encodes the path (#97645)
Fix the Python path configuration used to initialize sys.path at Python startup. Paths are no longer encoded to UTF-8/strict, which avoids encoding errors when a path contains surrogate characters (bytes paths are decoded with the surrogateescape error handler).

The getpath_basename() and getpath_dirname() functions no longer encode the path to UTF-8/strict; they work directly on Unicode strings. They now use PyUnicode_FindChar() and PyUnicode_Substring() on the Unicode path, rather than strrchr() on the encoded bytes string.
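The failure mode described above can be reproduced in pure Python. The following is only a sketch of the idea, not the C code from this commit: the sample path and the helpers dirname_str() and basename_str() are hypothetical, and str.rfind() with slicing stands in for PyUnicode_FindChar() and PyUnicode_Substring().

```python
# Sketch only: the path and helper names below are illustrative assumptions,
# not the getpath.c implementation.

# A bytes path containing one byte (0xff) that is not valid UTF-8.
raw = b"/opt/py\xff/lib/python3.12"

# surrogateescape maps the undecodable byte to the lone surrogate U+DCFF.
path = raw.decode("utf-8", "surrogateescape")

# Encoding that string back with the strict handler fails, which is the
# kind of error the commit avoids by not encoding at all.
try:
    path.encode("utf-8", "strict")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False


def dirname_str(p, sep="/"):
    """Hypothetical helper: text before the last separator, found on the str itself."""
    pos = p.rfind(sep)
    return p[:pos] if pos >= 0 else ""


def basename_str(p, sep="/"):
    """Hypothetical helper: text after the last separator, found on the str itself."""
    pos = p.rfind(sep)
    return p[pos + 1:] if pos >= 0 else p


# Splitting works on the Unicode string even though it holds a surrogate,
# and the original bytes are still recoverable with surrogateescape.
parent = dirname_str(path)
leaf = basename_str(path)
roundtrip = path.encode("utf-8", "surrogateescape")
```

Because the split never leaves the Unicode domain, the surrogate character is carried through untouched; only a later conversion back to bytes needs the surrogateescape handler.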
Diffstat (limited to 'Misc')
-rw-r--r-- Misc/NEWS.d/next/Core and Builtins/2022-09-29-15-19-29.gh-issue-94526.wq5m6T.rst | 4
1 files changed, 4 insertions, 0 deletions
diff --git a/Misc/NEWS.d/next/Core and Builtins/2022-09-29-15-19-29.gh-issue-94526.wq5m6T.rst b/Misc/NEWS.d/next/Core and Builtins/2022-09-29-15-19-29.gh-issue-94526.wq5m6T.rst
new file mode 100644
index 0000000..59e389a
--- /dev/null
+++ b/Misc/NEWS.d/next/Core and Builtins/2022-09-29-15-19-29.gh-issue-94526.wq5m6T.rst
@@ -0,0 +1,4 @@
+Fix the Python path configuration used to initialize :data:`sys.path` at
+Python startup. Paths are no longer encoded to UTF-8/strict to avoid encoding
+errors if a path contains surrogate characters (bytes paths are decoded with
+the surrogateescape error handler). Patch by Victor Stinner.