cpython.git - https://github.com/python/cpython.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	gh-126662: harmonize naming for three namedtuple base classes in ↵	Stephen Morton	2024-11-24	1	-3/+3
\| \| \| \| \|	urllib.parse (GH-126663)
*	gh-116897: Deprecate generic false values in urllib.parse.parse_qsl() ↵	Serhiy Storchaka	2024-11-12	1	-8/+17
\| \| \| \| \| \| \| \|	(GH-116903) Accepting objects with false values (like 0 and []) except empty strings and byte-like objects and None in urllib.parse functions parse_qsl() and parse_qs() is now deprecated.
*	gh-125926: Fix urllib.parse.urljoin() for base URI with undefined authority ↵	Serhiy Storchaka	2024-11-07	1	-2/+2
\| \| \| \| \| \| \|	(GH-125989) Although this goes beyond the application of RFC 3986, urljoin() should support relative base URIs for backward compatibility.
*	gh-76960: Fix urljoin() and urldefrag() for URIs with empty components ↵	Serhiy Storchaka	2024-08-31	1	-38/+62
\| \| \| \| \| \| \| \| \| \| \| \|	(GH-123273) * urljoin() with relative reference "?" sets empty query and removes fragment. * Preserve empty components (authority, params, query, fragment) in urljoin(). * Preserve empty components (authority, params, query) in urldefrag(). Also refactor the code and get rid of double _coerce_args() and _coerce_result() calls in urljoin(), urldefrag(), urlparse() and urlunparse().
*	gh-85110: Preserve relative path in URL without netloc in ↵	Serhiy Storchaka	2024-08-21	1	-2/+6
\| \| \| \|	urllib.parse.urlunsplit() (GH-123179)
*	gh-118827: Remove `Quoter` from `urllib.parse` (#118828)	Nikita Sobolev	2024-06-03	1	-8/+0
\| \| \| \|	Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com> Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
*	gh-67693: Fix urlunparse() and urlunsplit() for URIs with path starting with ↵	Serhiy Storchaka	2024-05-14	1	-1/+1
\| \| \| \|	multiple slashes and no authority (GH-113563)
*	gh-116764: Fix regressions in urllib.parse.parse_qsl() (GH-116801)	Serhiy Storchaka	2024-03-16	1	-1/+5
\| \| \| \| \| \| \| \|	* Restore support of None and other false values. * Raise TypeError for non-zero integers and non-empty sequences. The regressions were introduced in gh-74668 (bdba8ef42b15e651dc23374a08143cc2b4c4657d).
*	gh-74668: Fix support of bytes in urllib.parse.parse_qsl() (GH-115771)	Serhiy Storchaka	2024-03-05	1	-24/+26
\| \| \| \|	urllib.parse functions parse_qs() and parse_qsl() now support bytes arguments containing raw and percent-encoded non-ASCII data.
*	GH-104554: Add RTSPS support to `urllib/parse.py` (#104605)	zentarim	2023-06-13	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	RTSPS is the permanent scheme defined in https://www.iana.org/assignments/uri-schemes/uri-schemes.xhtml alongside RTSP and RTSPU schemes. Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
*	gh-102153: Start stripping C0 control and space chars in `urlsplit` (#102508)	Illia Volochii	2023-05-17	1	-0/+12
\| \| \| \| \| \| \| \| \|	`urllib.parse.urlsplit` has already been respecting the WHATWG spec a bit #25595. This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/#url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329). --------- Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
*	gh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are ↵	JohnJamesUtley	2023-05-10	1	-1/+15
\| \| \| \| \| \| \| \| \|	of IPv6 or IPvFuture format (#103849) Co-authored-by: Gregory P. Smith <greg@krypto.org>
*	gh-104139: Add itms-services to uses_netloc urllib.parse. (#104312)	Gregory P. Smith	2023-05-09	1	-1/+1
\| \| \| \|	Teach unsplit to retain the `"//"` when assembling `itms-services://?action=generate-bugs` style [Apple Platform Deployment](https://support.apple.com/en-gb/guide/deployment/depce7cefc4d/web) URLs.
*	gh-88500: Reduce memory use of `urllib.unquote` (#96763)	Gregory P. Smith	2022-12-11	1	-11/+19
\| \| \| \| \| \| \| \| \| \| \|	`urllib.unquote_to_bytes` and `urllib.unquote` could both potentially generate `O(len(string))` intermediate `bytes` or `str` objects while computing the unquoted final result depending on the input provided. As Python objects are relatively large, this could consume a lot of RAM. This switches the implementation to using an expanding `bytearray` and a generator internally instead of precomputed `split()` style operations. Microbenchmarks with some antagonistic inputs like `mess = "\u0141%%%20a%fe"*1000` show this is 10-20% slower for unquote and unquote_to_bytes and no different for typical inputs that are short or lack much unicode or % escaping. But the functions are already quite fast anyways so not a big deal. The slowdown scales consistently linear with input size as expected. Memory usage observed manually using `/usr/bin/time -v` on `python -m timeit` runs of larger inputs. Unittesting memory consumption is difficult and does not seem worthwhile. Observed memory usage is ~1/2 for `unquote()` and <1/3 for `unquote_to_bytes()` using `python -m timeit -s 'from urllib.parse import unquote, unquote_to_bytes; v="\u0141%01\u0161%20"*500_000' 'unquote_to_bytes(v)'` as a test.
*	gh-99418: Make urllib.parse.urlparse enforce that a scheme must begin with ↵	Ben Kallus	2022-11-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	an alphabetical ASCII character. (#99421) Prevent urllib.parse.urlparse from accepting schemes that don't begin with an alphabetical ASCII character. RFC 3986 defines a scheme like this: `scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )` RFC 2234 defines an ALPHA like this: `ALPHA = %x41-5A / %x61-7A` The WHATWG URL spec defines a scheme like this: `"A URL-scheme string must be one ASCII alpha, followed by zero or more of ASCII alphanumeric, U+002B (+), U+002D (-), and U+002E (.)."`
*	gh-96035: Make urllib.parse.urlparse reject non-numeric ports (#98273)	Ben Kallus	2022-10-20	1	-9/+8
\| \| \|	Co-authored-by: Jelle Zijlstra <jelle.zijlstra@gmail.com>
*	gh-95865: Further reduce quote_from_bytes memory consumption (#96860)	Gregory P. Smith	2022-09-19	1	-1/+9
\| \| \|	on large input values. Based on Dennis Sweeney's chunking idea.
*	gh-95865: Speed up urllib.parse.quote_from_bytes() (GH-95872)	Dennis Sweeney	2022-08-31	1	-1/+1
\|
*	gh-84623: Remove unused imports in stdlib (#93773)	Victor Stinner	2022-06-13	1	-1/+0
\|
*	Replace with_traceback() with exception chaining and reraising (GH-32074)	Oleg Iarygin	2022-03-30	1	-3/+2
\|
*	bpo-45874: Handle empty query string correctly in urllib.parse.parse_qsl ↵	Christian Sattler	2021-12-12	1	-2/+3
\| \| \| \|	(#29716)
*	bpo-44002: Switch to lru_cache in urllib.parse. (GH-25798)	Gregory P. Smith	2021-05-12	1	-29/+29
\| \| \| \| \| \| \| \| \| \| \| \|	urllib.parse now uses functools.lru_cache for its internal URL splitting and quoting caches instead of rolling its own like it's the 90s. The undocumented internal Quoter class API is now deprecated as it had no reason to be public and no existing OSS users were found. The clear_cache() API remains undocumented but gets an explicit test as it is used in a few projects' (twisted, gevent) tests as well as our own regrtest.
*	bpo-43882 Remove the newline, and tab early. From query and fragments. ↵	Senthil Kumaran	2021-05-05	1	-3/+5
\| \| \| \|	(GH-25921)
*	bpo-43979: Remove unnecessary operation from urllib.parse.parse_qsl (GH-25756)	Dong-hee Na	2021-04-30	1	-2/+1
\| \| \|	Automerge-Triggered-By: GH:gpshead
*	bpo-43882 - urllib.parse should sanitize urls containing ASCII newline and ↵	Senthil Kumaran	2021-04-29	1	-0/+6
\| \| \| \| \| \| \| \|	tabs. (GH-25595) Co-authored-by: Gregory P. Smith <greg@krypto.org> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
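A minimal sketch of the sanitizing described in the bpo-43882 entries above, assuming a Python release that already includes the fix. The URL is made up for illustration; ASCII tab, carriage return and newline characters are dropped before the URL is split.

```python
from urllib.parse import urlsplit

# Hypothetical input with an embedded newline, tab, and trailing CR/LF.
dirty = "http://exa\nmple.com/pa\tth?q=1\r\n"
parts = urlsplit(dirty)

# The tab/CR/LF characters are removed before splitting.
print(parts.netloc, parts.path, parts.query)
```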
*	bpo-42967: coerce bytes separator to string in urllib.parse_qs(l) (#24818)	Ken Jin	2021-04-11	1	-0/+1
\| \| \| \| \| \| \|	* coerce bytes separator to string * Add news * Update Misc/NEWS.d/next/Library/2021-03-11-00-31-41.bpo-42967.2PeQRw.rst
*	bpo-42967: Fix urllib.parse docs and make logic clearer (GH-24536)	Ken Jin	2021-02-15	1	-2/+1
\|
*	bpo-42967: only use '&' as a query string separator (#24297)	Adam Goldschmidt	2021-02-14	1	-5/+15
\| \| \| \| \| \| \| \| \| \| \|	bpo-42967: [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl(). urllib.parse will only use "&" as query string separator by default instead of both ";" and "&" as allowed in earlier versions. An optional argument separator with default value "&" is added to specify the separator. Co-authored-by: Éric Araujo <merwok@netwok.org> Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com> Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
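To illustrate the separator change described above, here is a short sketch assuming a Python release that includes this fix. The query strings are invented for the example; `separator` is the keyword argument the commit message describes.

```python
from urllib.parse import parse_qsl

# By default only "&" splits fields, so ";" stays inside the value.
default_pairs = parse_qsl("a=1;b=2&c=3")

# Callers that still need ";" as the delimiter must ask for it explicitly.
semicolon_pairs = parse_qsl("a=1;b=2", separator=";")

print(default_pairs)
print(semicolon_pairs)
```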
*	bpo-39481: PEP 585 for a variety of modules (GH-19423)	Batuhan Taşkaya	2020-04-10	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	- concurrent.futures - ctypes - http.cookies - multiprocessing - queue - tempfile - unittest.case - urllib.parse
*	bpo-37970: update and improve urlparse and urlsplit doc-strings (GH-16458)	idomic	2020-02-16	1	-6/+35
\|
*	bpo-39057: Fix urllib.request.proxy_bypass_environment(). (GH-17619)	Serhiy Storchaka	2020-01-05	1	-2/+2
\| \| \|	Ignore leading dots and no longer ignore a trailing newline.
*	bpo-27657: Fix urlparse() with numeric paths (#661)	Tim Graham	2019-10-18	1	-21/+1
\| \| \| \| \| \| \| \| \| \|	Revert parsing decision from bpo-754016 in favor of the documented consensus in bpo-16932 of how to treat strings without a // to designate the netloc. * bpo-22891: Remove urlsplit() optimization for 'http' prefixed inputs.
*	bpo-32498: urllib.parse.unquote also accepts bytes (GH-7768)	Stein Karlsen	2019-10-14	1	-0/+2
\|
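A small sketch of the behavior named in the bpo-32498 entry, assuming a Python release that includes it. The percent-encoded value is an arbitrary example; a bytes argument is unquoted and then decoded (UTF-8 by default), so the result is a str.

```python
from urllib.parse import unquote

# The same percent-encoded data passed as str and as bytes.
from_str = unquote("caf%C3%A9")
from_bytes = unquote(b"caf%C3%A9")

print(from_str, from_bytes)
```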
*	bpo-36742: Corrects fix to handle decomposition in usernames (#13812)	Steve Dower	2019-06-04	1	-3/+3
\|
*	bpo-35397: Remove deprecation and document urllib.parse.unwrap (GH-11481)	Rémi Lapeyre	2019-05-27	1	-7/+5
\|
*	bpo-36742: Fixes handling of pre-normalization characters in urlsplit() ↵	Steve Dower	2019-04-30	1	-4/+7
\| \| \| \|	(GH-13017)
*	bpo-12910: update and correct quote docstring (#2568)	Jörn Hees	2019-04-10	1	-13/+20
\| \| \| \| \| \|	Fixes some mistakes and misleading statements in the quote function docstring: - reserved chars are never actually used by quote code, unreserved chars are - reserved chars were wrong and incomplete - mentioned that use-case is not minimal quoting wrt. RFC, but cautious quoting
*	bpo-36216: Add check for characters in netloc that normalize to separators ↵	Steve Dower	2019-03-07	1	-0/+17
\| \| \| \|	(GH-12201)
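The check added for bpo-36216 can be sketched as follows, assuming a Python release that includes it. The host names are hypothetical; U+FF03 (FULLWIDTH NUMBER SIGN) is used because NFKC normalization turns it into "#", which is a URL separator.

```python
from urllib.parse import urlsplit

# Hypothetical URL whose netloc contains U+FF03, which NFKC-normalizes to "#".
suspicious = "http://example.com\uff03@evil.example/"

try:
    urlsplit(suspicious)
    rejected = False
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```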
*	bpo-34866: Adding max_num_fields to cgi.FieldStorage (GH-9660)	matthewbelisle-wf	2018-10-19	1	-3/+19
\| \| \| \|	Adding `max_num_fields` to `cgi.FieldStorage` to make DOS attacks harder by limiting the number of `MiniFieldStorage` objects created by `FieldStorage`.
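On the urllib.parse side, the limit is exposed as a `max_num_fields` keyword on parse_qs() and parse_qsl(). A minimal sketch, assuming a Python release that includes this change; the query string and the limits are arbitrary.

```python
from urllib.parse import parse_qsl

query = "a=1&b=2&c=3"

# Within the limit: parsed normally.
allowed = parse_qsl(query, max_num_fields=3)

# Over the limit: ValueError is raised before any field is parsed.
try:
    parse_qsl(query, max_num_fields=2)
    too_many = False
except ValueError:
    too_many = True

print(allowed, too_many)
```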
*	bpo-27485: Rename and deprecate undocumented functions in urllib.parse (GH-2205)	Cheryl Sabella	2018-04-25	1	-4/+99
\|
*	bpo-33034: Improve exception message when cast fails for ↵	Matt Eaton	2018-03-20	1	-1/+5
\| \| \| \|	{Parse,Split}Result.port (GH-6078)
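A sketch of where that exception surfaces, using a made-up URL. The exact message text varies between Python versions, so the example only relies on ValueError being raised when the port cannot be converted to an integer.

```python
from urllib.parse import urlsplit

try:
    # "http" is not a number, so asking for the port fails with ValueError.
    port = urlsplit("http://example.com:http/").port
    bad_port = False
except ValueError as exc:
    bad_port = True
    print("invalid port:", exc)
```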
*	bpo-32323: urllib.parse.urlsplit() must not lowercase() IPv6 scope value (#4867)	Коренберг Марк	2017-12-21	1	-4/+6
\|
*	remove a redundant lower in urllib.parse.urlsplit (#3008)	Oren Milman	2017-09-03	1	-2/+1
\|
*	urllib: Simplify splithost by calling into urlparse. (#1849)	postmasters	2017-06-20	1	-1/+1
\| \| \| \| \| \| \| \|	The current regex based splitting produces a wrong result. For example:: http://abc#@def Web browsers parse that URL as ``http://abc/#@def``, that is, the host is ``abc``, the path is ``/``, and the fragment is ``#@def``.
*	bpo-29976: urllib.parse clarify '' in scheme values. (GH-984)	Senthil Kumaran	2017-05-18	1	-11/+19
\|
*	correct parse_qs and parse_qsl test case descriptions. (#968)	Senthil Kumaran	2017-04-05	1	-13/+17
*	bpo-16285: Update urllib quoting to RFC 3986 (#173)	Ratnadeep Debnath	2017-02-25	1	-3/+6
\| \| \| \| \| \| \| \| \| \|	urllib.parse.quote is now based on RFC 3986, and hence includes `'~'` in the set of characters that is not escaped by default. Patch by Christian Theune and Ratnadeep Debnath.
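The effect of the RFC 3986 update can be seen in a short sketch, assuming a Python release that includes this change. The path is an arbitrary example: "~" is left alone, "/" is safe by default, and the space is still escaped.

```python
from urllib.parse import quote

quoted = quote("/~user/my file.txt")
print(quoted)
```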
*	Issue #28992: Use bytes.fromhex().	Serhiy Storchaka	2016-12-21	1	-1/+1
\|
*	Issue #25895: Merge from 3.5	Berker Peksag	2016-09-16	1	-2/+3
\|\
\| *	Issue #25895: Enable WebSocket URL schemes in urllib.parse.urljoin	Berker Peksag	2016-09-16	1	-2/+3
\| \| \| \| \| \| \| \|	Patch by Gergely Imreh and Markus Holtermann.
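With the WebSocket schemes enabled, urljoin() resolves relative references against ws:// and wss:// bases the same way it does for http://. A minimal sketch with hypothetical URLs:

```python
from urllib.parse import urljoin

# A relative reference replaces the last path segment of the base.
ws_url = urljoin("ws://example.com/chat/room", "lobby")

# ".." segments are resolved as usual.
wss_url = urljoin("wss://example.com/a/b/c", "../d")

print(ws_url)
print(wss_url)
```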