path: root/Lib/urllib/parse.py
Commit message  Author  Age  Files  Lines
* [3.7] gh-102153: Start stripping C0 control and space chars in `urlsplit` (GH-104896)  stratakis  2023-06-05  1  -0/+12
|
|   `urllib.parse.urlsplit` has already been respecting the WHATWG spec a bit (GH-25595). This adds more sanitizing to respect the "Remove any leading C0 control or space from input" [rule](https://url.spec.whatwg.org/#url-parsing:~:text=Remove%20any%20leading%20and%20trailing%20C0%20control%20or%20space%20from%20input.) in response to [CVE-2023-24329](https://nvd.nist.gov/vuln/detail/CVE-2023-24329).
|
|   (cherry picked from commit d7f8a5fe07b0ff3a419ccec434cc405b21a5a304)
|   (cherry picked from commit 2f630e1ce18ad2e07428296532a68b11dc66ad10)
|   (cherry picked from commit 610cc0ab1b760b2abaac92bd256b96191c46b941)
|   (cherry picked from commit f48a96a28012d28ae37a2f4587a780a5eb779946)
|
|   Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
|   Co-authored-by: Illia Volochii <illia.volochii@gmail.com>
|   Co-authored-by: Gregory P. Smith [Google] <greg@krypto.org>
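The stripping rule quoted in this commit can be sketched as follows. This is a minimal illustration, not the patched `urlsplit` code; the helper name `strip_leading_c0_or_space` and the constant name are hypothetical, and the character set (U+0000 through U+0020) is assumed from the WHATWG definition of "C0 control or space".

```python
from urllib.parse import urlsplit

# C0 control characters (U+0000..U+001F) plus the space character (U+0020).
C0_CONTROL_OR_SPACE = "".join(chr(i) for i in range(0x21))


def strip_leading_c0_or_space(url: str) -> str:
    """Hypothetical helper: drop leading C0 control or space characters."""
    return url.lstrip(C0_CONTROL_OR_SPACE)


raw = " \x00\thttps://example.com/path"
cleaned = strip_leading_c0_or_space(raw)
parts = urlsplit(cleaned)
print(parts.scheme, parts.netloc)
```

Stripping before parsing keeps a leading blank from hiding the scheme from code that filters URLs by scheme.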
* [3.7] bpo-43882 - urllib.parse should sanitize urls containing ASCII newline and tabs. (GH-25923)  Miss Islington (bot)  2021-05-06  1  -0/+10
|
|   Co-authored-by: Gregory P. Smith <greg@krypto.org>
|   Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
|   (cherry picked from commit 76cd81d60310d65d01f9d7b48a8985d8ab89c8b4)
|
|   Co-authored-by: Senthil Kumaran <senthil@uthcode.com>
|   (cherry picked from commit 515a7bc4e13645d0945b46a8e1d9102b918cd407)
|
|   Co-authored-by: Miss Islington (bot) <31488909+miss-islington@users.noreply.github.com>
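The sanitizing named in this commit can be sketched as removing tab, carriage return and newline characters wherever they occur in the URL. The helper and constant names below are hypothetical, and the exact character list is an assumption based on the commit subject.

```python
# Characters assumed to be removed: ASCII tab, carriage return, newline.
UNSAFE_URL_CHARS = ("\t", "\r", "\n")


def remove_tabs_and_newlines(url: str) -> str:
    """Hypothetical helper: delete tab and newline characters anywhere in url."""
    for ch in UNSAFE_URL_CHARS:
        url = url.replace(ch, "")
    return url


cleaned_url = remove_tabs_and_newlines("http://exam\nple.com/a\tb\r\n")
print(cleaned_url)
```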
* [3.7] bpo-42967: only use '&' as a query string separator (GH-24297) (GH-24531)  Senthil Kumaran  2021-02-15  1  -5/+14
|
|   bpo-42967: [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl(). urllib.parse will only use "&" as query string separator by default instead of both ";" and "&" as allowed in earlier versions. An optional argument separator with default value "&" is added to specify the separator.
|
|   Co-authored-by: Éric Araujo <merwok@netwok.org>
|   Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
|   Co-authored-by: Adam Goldschmidt <adamgold7@gmail.com>
|   (cherry picked from commit fcbe0cb04d35189401c0c880ebfb4311e952d776)
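A short illustration of the separator change, assuming a Python version that includes this fix. The query strings are made up for the example.

```python
from urllib.parse import parse_qsl

# By default only "&" splits fields, so ";" stays inside the value.
default_pairs = parse_qsl("a=1;b=2&c=3")

# Callers that still need ";" can opt in explicitly.
semicolon_pairs = parse_qsl("a=1;b=2", separator=";")

print(default_pairs)
print(semicolon_pairs)
```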
* Revert "bpo-27657: Fix urlparse() with numeric paths (GH-661)" (#18526)  Senthil Kumaran  2020-02-16  1  -1/+21
|
|   This reverts commit 82b5f6b16e051f8a2ac6e87ba86b082fa1c4a77f.
|
|   The change broke the backwards compatibility of parsing behavior in a patch release of Python (3.7.6). A decision was taken to revert this patch in 3.7.7.
|
|   In https://bugs.python.org/issue27657 it was decided that the previous behavior like
|
|   >>> urlparse('localhost:8080')
|   ParseResult(scheme='', netloc='', path='localhost:8080', params='', query='', fragment='')
|
|   >>> urlparse('undefined:8080')
|   ParseResult(scheme='', netloc='', path='undefined:8080', params='', query='', fragment='')
|
|   needs to be preserved in patch releases as a number of users rely upon it.
|
|   Explicitly mention the releases involved with the revert in NEWS. Adopt the wording suggested by @ned-deily.
* bpo-39057: Fix urllib.request.proxy_bypass_environment(). (GH-17619)  Miss Islington (bot)  2020-01-05  1  -2/+2
|
|   Ignore leading dots and no longer ignore a trailing newline.
|   (cherry picked from commit 6a265f0d0c0a4b3b8fecf4275d49187a384167f4)
|
|   Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
* bpo-27657: Fix urlparse() with numeric paths (GH-661)  Miss Islington (bot)  2019-10-18  1  -21/+1
|
|   * bpo-27657: Fix urlparse() with numeric paths
|
|     Revert parsing decision from bpo-754016 in favor of the documented consensus in bpo-16932 of how to treat strings without a // to designate the netloc.
|
|   * bpo-22891: Remove urlsplit() optimization for 'http' prefixed inputs.
|
|   (cherry picked from commit 5a88d50ff013a64fbdb25b877c87644a9034c969)
|
|   Co-authored-by: Tim Graham <timograham@gmail.com>
* bpo-36742: Corrects fix to handle decomposition in usernames (GH-13812)  Miss Islington (bot)  2019-06-04  1  -3/+3
|
|   (cherry picked from commit 8d0ef0b5edeae52960c7ed05ae8a12388324f87e)
|
|   Co-authored-by: Steve Dower <steve.dower@python.org>
* bpo-36742: Fixes handling of pre-normalization characters in urlsplit() (GH-13017)  Miss Islington (bot)  2019-04-30  1  -4/+7
|
|   (cherry picked from commit d537ab0ff9767ef024f26246899728f0116b1ec3)
|
|   Co-authored-by: Steve Dower <steve.dower@python.org>
* bpo-12910: update and correct quote docstring (GH-2568)  Miss Islington (bot)  2019-04-10  1  -13/+20
|
|   Fixes some mistakes and misleading statements in the quote function docstring:
|   - reserved chars are never actually used by quote code, unreserved chars are
|   - reserved chars were wrong and incomplete
|   - mentioned that use-case is not minimal quoting wrt. RFC, but cautious quoting
|   (cherry picked from commit 750d74fac5c510e39958b3f79641fe54096ee54f)
|
|   Co-authored-by: Jörn Hees <joernhees@users.noreply.github.com>
* bpo-36216: Add check for characters in netloc that normalize to separators (GH-12201)  Steve Dower  2019-03-07  1  -0/+17
|
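The idea behind this check can be sketched as follows: a non-ASCII character in the netloc may turn into a URL separator once NFKC normalization is applied. This is a hedged sketch, not the code added by the commit; the function name `check_netloc` and the exact separator set are assumptions.

```python
import unicodedata


def check_netloc(netloc: str) -> None:
    """Hypothetical sketch: reject netlocs whose NFKC form gains a separator."""
    if not netloc or netloc.isascii():
        return
    # Ignore separators that are already literally present in the netloc.
    stripped = netloc
    for sep in "@:#?":
        stripped = stripped.replace(sep, "")
    normalized = unicodedata.normalize("NFKC", stripped)
    if normalized == stripped:
        return
    for sep in "/?#@:":
        if sep in normalized:
            raise ValueError(
                f"netloc {netloc!r} contains characters that normalize to {sep!r}"
            )


check_netloc("example.com")  # plain ASCII passes
try:
    # U+FF03 (FULLWIDTH NUMBER SIGN) normalizes to "#" under NFKC.
    check_netloc("example.com\uff03@evil.test")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```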
* bpo-34866: Adding max_num_fields to cgi.FieldStorage (GH-9660)  Miss Islington (bot)  2018-10-19  1  -3/+19
|
|   Adding `max_num_fields` to `cgi.FieldStorage` to make DOS attacks harder by limiting the number of `MiniFieldStorage` objects created by `FieldStorage`.
|   (cherry picked from commit 209144831b0a19715bda3bd72b14a3e6192d9cc1)
|
|   Co-authored-by: matthewbelisle-wf <matthew.belisle@workiva.com>
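The commit message names `cgi.FieldStorage`, but the same limit is exposed in `urllib.parse` through the `max_num_fields` parameter of `parse_qs` and `parse_qsl`. A small sketch with a made-up query string:

```python
from urllib.parse import parse_qsl

query = "a=1&b=2&c=3"

# Three fields fit within a limit of three.
pairs = parse_qsl(query, max_num_fields=3)

# A lower limit makes the parser refuse the input instead of parsing it.
try:
    parse_qsl(query, max_num_fields=2)
    rejected = False
except ValueError:
    rejected = True

print(pairs, rejected)
```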
* bpo-32323: urllib.parse.urlsplit() must not lowercase() IPv6 scope value (#4867)  Коренберг Марк  2017-12-21  1  -4/+6
|
* remove a redundant lower in urllib.parse.urlsplit (#3008)  Oren Milman  2017-09-03  1  -2/+1
|
* urllib: Simplify splithost by calling into urlparse. (#1849)  postmasters  2017-06-20  1  -1/+1
|
|   The current regex based splitting produces a wrong result. For example::
|
|     http://abc#@def
|
|   Web browsers parse that URL as ``http://abc/#@def``, that is, the host is ``abc``, the path is ``/``, and the fragment is ``#@def``.
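The example URL from this commit can be checked directly with `urlsplit`. Note that `urlsplit` reports the fragment without its leading `#` and leaves the path empty rather than `/`.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://abc#@def")

# The "#" ends the authority, so "@def" belongs to the fragment, not the host.
print(parts.netloc, repr(parts.path), parts.fragment)
```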
* bpo-29976: urllib.parse clarify '' in scheme values. (GH-984)  Senthil Kumaran  2017-05-18  1  -11/+19
|
* correct parse_qs and parse_qsl test case descriptions. (#968)  Senthil Kumaran  2017-04-05  1  -13/+17
|
* bpo-16285: Update urllib quoting to RFC 3986 (#173)  Ratnadeep Debnath  2017-02-25  1  -3/+6
|
|   urllib.parse.quote is now based on RFC 3986, and hence includes `'~'` in the set of characters that is not escaped by default.
|
|   Patch by Christian Theune and Ratnadeep Debnath.
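A quick illustration of the quoting change, assuming Python 3.7 or later. The sample path is made up.

```python
from urllib.parse import quote

# "~" is an RFC 3986 unreserved character and is left alone;
# the space is still percent-encoded, and "/" is safe by default.
quoted = quote("~user/a b")
print(quoted)
```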
* Issue #28992: Use bytes.fromhex().  Serhiy Storchaka  2016-12-21  1  -1/+1
|
* Issue #25895: Merge from 3.5  Berker Peksag  2016-09-16  1  -2/+3
|\
| * Issue #25895: Enable WebSocket URL schemes in urllib.parse.urljoin  Berker Peksag  2016-09-16  1  -2/+3
| |
| |   Patch by Gergely Imreh and Markus Holtermann.
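With WebSocket schemes enabled, `urljoin` resolves relative references against `ws://` and `wss://` bases the same way it does for `http://`. The host and paths below are placeholders.

```python
from urllib.parse import urljoin

# Sibling resource relative to a ws:// base URL.
ws_joined = urljoin("ws://example.com/chat/room", "lobby")

# Parent-directory reference relative to a wss:// base URL.
wss_joined = urljoin("wss://example.com/chat/", "../status")

print(ws_joined)
print(wss_joined)
```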
* | merge from 3.5  Senthil Kumaran  2016-01-26  1  -15/+0
|\ \
| |/
| |   Remove unnecessary test case comment in urllib.parse.py. These are asserted as test cases.
| * Remove unnecessary test case comment in urllib.parse.py. These are asserted as test cases.  Senthil Kumaran  2016-01-26  1  -15/+0
| |
* | Issue #25822: Add docstrings to the fields of urllib.parse results.  Senthil Kumaran  2016-01-14  1  -2/+65
| |
| |   Patch contributed by Swati Jaiswal.
* | Issue #20059: urllib.parse raises ValueError on all invalid ports.  Robert Collins  2015-08-09  1  -2/+1
|/
|   Patch by Martin Panter.
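The port validation happens when the `port` attribute is read, not when the URL is split. A hedged sketch follows; the wrapper name `port_or_error` is hypothetical.

```python
from urllib.parse import urlsplit


def port_or_error(url: str):
    """Hypothetical wrapper: return the port, or a message if it is invalid."""
    try:
        return urlsplit(url).port
    except ValueError as exc:
        return f"invalid port: {exc}"


valid = port_or_error("http://example.com:8080/")
out_of_range = port_or_error("http://example.com:99999/")
non_numeric = port_or_error("http://example.com:abc/")
print(valid, out_of_range, non_numeric, sep="\n")
```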
* Issue #13866: add *quote_via* argument to urlencode.  R David Murray  2015-05-18  1  -14/+15
|
|   Patch by samwyse, completed by Arnon Yaari, and reviewed by Martin Panter.
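A brief illustration of `quote_via`: `urlencode` uses `quote_plus` by default, and passing `quote` switches the space encoding from `+` to `%20`. The field name and value are made up.

```python
from urllib.parse import quote, urlencode

params = {"q": "a b"}

plus_encoded = urlencode(params)                      # default: quote_plus
percent_encoded = urlencode(params, quote_via=quote)  # spaces become %20

print(plus_encoded)
print(percent_encoded)
```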
* Issue #23703: Fix a regression in urljoin() introduced in 901e4e52b20a.  Berker Peksag  2015-04-15  1  -2/+1
|
|   Patch by Demian Brecht.
* Issue #23411: Added DefragResult, ParseResult, SplitResult, DefragResultBytes, ParseResultBytes, and SplitResultBytes to urllib.parse.__all__.  Serhiy Storchaka  2015-04-07  1  -1/+3
|
|   Patch by Martin Panter.
* Issue #23563: Optimized utility functions in urllib.parse.  Serhiy Storchaka  2015-03-03  1  -60/+28
|
* Merge: #23040: Clarify treatment of encoding and errors when component is bytes.  R David Murray  2014-12-25  1  -4/+5
|\
| * #23040: Clarify treatment of encoding and errors when component is bytes.  R David Murray  2014-12-25  1  -4/+5
| |
| |   Patch by Wojtek Ruszczewski.
* | Issue #22278: Fix urljoin problem with relative urls, a regression observed after changes to issue22118 were submitted.  Senthil Kumaran  2014-09-22  1  -1/+5
| |
| |   Patch contributed by Demian Brecht and reviewed by Antoine Pitrou.
* | Issue #22118: Switch urllib.parse to use RFC 3986 semantics for the resolution of relative URLs, rather than RFCs 1808 and 2396.  Antoine Pitrou  2014-08-21  1  -25/+38
| |
| |   Patch by Demian Brecht.
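The RFC 3986 resolution rules can be seen with the base URL used in that RFC's reference-resolution examples. Under RFC 3986, excess `..` segments are dropped instead of being kept in the result.

```python
from urllib.parse import urljoin

BASE = "http://a/b/c/d;p?q"  # base URL from the RFC 3986 examples

sibling = urljoin(BASE, "g")
parent = urljoin(BASE, "../g")
above_root = urljoin(BASE, "../../../g")  # more ".." than path segments

print(sibling, parent, above_root, sep="\n")
```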
* | Issue #22033: Reprs of most Python implemented classes now contain actual class name instead of hardcoded one.  Serhiy Storchaka  2014-07-25  1  -1/+1
|/
* Issue #20879: Delay the initialization of encoding and decoding tables for base32, ascii85 and base85 codecs in the base64 module, and delay the initialization of the unquote_to_bytes() table of the urllib.parse module, to not waste memory if these modules are not used.  Victor Stinner  2014-03-17  1  -2/+7
|
* Issue #20270: urllib.urlparse now supports empty ports.  Serhiy Storchaka  2014-01-18  1  -14/+17
|\
| * Issue #20270: urllib.urlparse now supports empty ports.  Serhiy Storchaka  2014-01-18  1  -14/+17
| |
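An empty port, meaning a colon with nothing after it, is accepted and reported as no port at all. The URL is a placeholder.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://example.com:/path")

# The trailing ":" is tolerated; hostname is parsed and port is None.
hostname, port = parts.hostname, parts.port
print(hostname, port)
```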
* | merge from 3.3  Senthil Kumaran  2013-09-06  1  -4/+4
|\ \
| |/
| |   Improve urlencode docstring. Patch by Brian Brazil. Closes issue #15350
| * Improve urlencode docstring. Patch by Brian Brazil.  Senthil Kumaran  2013-09-06  1  -4/+4
| |
* | Remove redundant imports  Raymond Hettinger  2013-04-07  1  -9/+0
|/
* Issue #1285086: Get rid of the refcounting hack and speed up urllib.parse.unquote() and urllib.parse.unquote_to_bytes().  Serhiy Storchaka  2013-03-14  1  -36/+27
|
* Fix issue16713 - tel url parsing with params  Senthil Kumaran  2012-12-24  1  -1/+1
|
* Closes #9374: add back now-unused module attributes; removing them is a backward compatibility issue, since they have a public-seeming name.  Georg Brandl  2012-08-24  1  -0/+10
|
* urllib.parse cleanup. rename keywords used as variables  Senthil Kumaran  2012-06-29  1  -7/+7
|
* Issue #14920: Fix the help(urllib.parse) failure on locale C terminals. Just have ascii in help msg  Senthil Kumaran  2012-05-26  1  -1/+1
|
* Issue #14036: return None when port in urlparse exceeds 65535  Senthil Kumaran  2012-05-24  1  -0/+3
|
* #14072: Fix parsing of tel URIs in urlparse by making the check for ports stricter.  Ezio Melotti  2012-05-19  1  -6/+6
|
* Issue9374 - Generic parsing of query and fragment portion of urls for any scheme  Senthil Kumaran  2012-05-19  1  -9/+2
|
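Generic parsing means the query and fragment are split off even when the scheme is unknown to `urllib.parse`. The scheme name `myscheme` below is made up for the example.

```python
from urllib.parse import urlsplit

parts = urlsplit("myscheme://host/path?x=1#frag")

# Query and fragment are separated regardless of the scheme.
print(parts.scheme, parts.netloc, parts.path, parts.query, parts.fragment)
```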
* Fix closes issue12683 - urljoin to work with relative join of svn scheme.  Senthil Kumaran  2011-08-03  1  -1/+2
|
* merge from 3.1  Senthil Kumaran  2011-04-15  1  -5/+10
|\
| * Issue #11467: Fix urlparse behavior when handling urls which contains scheme specific part only digits.  Senthil Kumaran  2011-04-15  1  -5/+10
| |