path: root/Lib/urllib
Commit message    Author    Age    Files    Lines
* [3.8] bpo-42967: only use '&' as a query string separator (GH-24297) (#24529)    Senthil Kumaran    2021-02-15    1    -5/+14
| * bpo-42967: only use '&' as a query string separator (#24297)
|
|   bpo-42967: [security] Address a web cache-poisoning issue reported in
|   urllib.parse.parse_qsl().
|
|   urllib.parse will only use "&" as query string separator by default
|   instead of both ";" and "&" as allowed in earlier versions. An optional
|   argument separator with default value "&" is added to specify the
|   separator.
|
|   Co-authored-by: Éric Araujo <merwok@netwok.org>
|   Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
|   Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
|   (cherry picked from commit fcbe0cb04d35189401c0c880ebfb4311e952d776)
|
|   Co-authored-by: Adam Goldschmidt <adamgold7@gmail.com>
|
| * Update correct version information.
|
| * fix docs and make logic clearer
|
| Co-authored-by: Adam Goldschmidt <adamgold7@gmail.com>
| Co-authored-by: Fidget-Spinner <28750310+Fidget-Spinner@users.noreply.github.com>
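A minimal sketch of the behaviour this commit describes, assuming a Python release that already contains the fix (the `separator` keyword does not exist in older releases). The query strings are made up for illustration.

```python
from urllib.parse import parse_qsl

query = "a=1;b=2&c=3"

# Default behaviour after the fix: only "&" separates fields, so the
# ";" stays inside the value of "a".
default_pairs = parse_qsl(query)

# Opting in to ";" as the separator for a legacy query string.
legacy_pairs = parse_qsl("a=1;b=2", separator=";")

print(default_pairs)
print(legacy_pairs)
```

Code that still needs both separators has to normalise the query string itself before parsing, since `separator` takes a single value.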
* Allow / character in username,password fields in _PROXY envvars. (GH-23973) (#23992)    Miss Islington (bot)    2020-12-29    1    -1/+5
| (cherry picked from commit 030a713183084594659aefd77b76fe30178e23c8)
|
| Co-authored-by: Senthil Kumaran <senthil@uthcode.com>
* bpo-41471: Ignore invalid prefix lengths in system proxy settings on macOS (GH-22762) (GH-22774)    Miss Skeleton (bot)    2020-10-20    1    -0/+5
| (cherry picked from commit 93a1ccabdede416425473329b8c718d507c55e29)
|
| Co-authored-by: Ronald Oussoren <ronaldoussoren@mac.com>
* [3.8] bpo-32498: Improve exception message on passing bytes to urllib.parse.unquote (GH-22746)    Irit Katriel    2020-10-18    1    -0/+2
* bpo-39503: CVE-2020-8492: Fix AbstractBasicAuthHandler (GH-18284) (GH-19296)    Miss Islington (bot)    2020-04-02    1    -19/+50
| The AbstractBasicAuthHandler class of the urllib.request module uses
| an inefficient regular expression which can be exploited by an
| attacker to cause a denial of service. Fix the regex to prevent the
| catastrophic backtracking. Vulnerability reported by Ben Caller
| and Matt Schwager.
|
| AbstractBasicAuthHandler of urllib.request now parses all
| WWW-Authenticate HTTP headers and accepts multiple challenges per
| header: use the realm of the first Basic challenge.
|
| Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
| Co-authored-by: Victor Stinner <vstinner@python.org>
| (cherry picked from commit 0b297d4ff1c0e4480ad33acae793fbaf4bf015b4)
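The parsing rule in the message above (scan every WWW-Authenticate header, accept several challenges per header, take the realm of the first Basic one) can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: `CHALLENGE_RE` and `first_basic_realm` are hypothetical names, and the pattern is a simplified one in which adjacent repetitions consume disjoint character classes.

```python
import re

# Hypothetical pattern for one "<scheme> realm=<value>" challenge, anchored
# at the start of the header or right after a comma.
CHALLENGE_RE = re.compile(
    r"(?:^|,)"                      # start of header or a comma
    r"[ \t]*"                       # optional whitespace
    r"([^ \t,]+)"                   # scheme, e.g. "Basic"
    r"[ \t]+"                       # mandatory whitespace
    r"realm=([\"']?)([^\"']*)\2",   # realm=xxx, realm='xxx' or realm="xxx"
    re.IGNORECASE,
)


def first_basic_realm(www_authenticate_headers):
    """Return the realm of the first Basic challenge, or None."""
    for header in www_authenticate_headers:
        for match in CHALLENGE_RE.finditer(header):
            scheme, _quote, realm = match.groups()
            if scheme.lower() == "basic":
                return realm
    return None


headers = [
    'Digest realm="d", nonce="abc"',
    'Bearer realm="x", Basic realm="site"',
]
print(first_basic_realm(headers))
```

The sketch only looks at `realm` and ignores every other challenge parameter; an unquoted realm value is read up to the next quote character or the end of the header.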
* bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest Auth (GH-18338)    Miss Islington (bot)    2020-02-29    1    -3/+3
| * bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest authentication
|
|   - The 'qop' value in the 'WWW-Authenticate' header is optional. The
|     presence of 'qop' in the header should be checked before its value
|     is parsed with 'split'.
|
|   Signed-off-by: Stephen Balousek <stephen@balousek.net>
|
| * bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest authentication
|
|   - Add NEWS item
|
|   Signed-off-by: Stephen Balousek <stephen@balousek.net>
|
| * Update Misc/NEWS.d/next/Library/2020-02-06-05-33-52.bpo-39548.DF4FFe.rst
|
|   Co-Authored-By: Brandt Bucher <brandtbucher@gmail.com>
|
| Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
| (cherry picked from commit 5e260e0fde211829fcb67060cfd602f4b679f802)
|
| Co-authored-by: Stephen Balousek <sbalousek@users.noreply.github.com>
* Revert "[3.8] bpo-27657: Fix urlparse() with numeric paths (GH-16839)" (GH-18525)    Senthil Kumaran    2020-02-16    1    -1/+21
| This reverts commit 0f3187c1ce3b3ace60f6c1691dfa3d4e744f0384.
|
| The change broke the backwards compatibility of parsing behavior in a
| patch release of Python (3.8.1). A decision was taken to revert this
| patch in 3.8.2.
|
| In https://bugs.python.org/issue27657 it was decided that the previous
| behavior like
|
| >>> urlparse('localhost:8080')
| ParseResult(scheme='', netloc='', path='localhost:8080', params='', query='', fragment='')
|
| >>> urlparse('undefined:8080')
| ParseResult(scheme='', netloc='', path='undefined:8080', params='', query='', fragment='')
|
| needs to be preserved in patch releases as a number of users rely upon it.
|
| Explicitly mention the releases involved with the revert in NEWS.
| Adopt the wording suggested by @ned-deily.
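Because the interpretation of a bare `host:port` string is exactly what this revert is about, callers who want the host and port can sidestep the question by marking the network location explicitly. A small sketch, assuming the caller controls the input and can prepend `//`:

```python
from urllib.parse import urlparse

# A leading "//" marks what follows as the network location, so the result
# does not depend on how a bare "host:port" string is interpreted.
result = urlparse("//localhost:8080")

print(result.netloc, result.hostname, result.port)
```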
* bpo-39057: Fix urllib.request.proxy_bypass_environment(). (GH-17619)    Miss Islington (bot)    2020-01-05    2    -13/+15
| Ignore leading dots and no longer ignore a trailing newline.
| (cherry picked from commit 6a265f0d0c0a4b3b8fecf4275d49187a384167f4)
|
| Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
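The matching rule named in the message (leading dots in a `no_proxy` entry are ignored, and a host with a trailing newline is not treated as a match) can be illustrated with a stand-alone helper. `bypasses_proxy` is a hypothetical function written for this sketch; it is not the library implementation and leaves out port handling and the `*` wildcard.

```python
def bypasses_proxy(host, no_proxy):
    """Hypothetical helper: True if host matches an entry of no_proxy."""
    host = host.lower()
    for name in no_proxy.split(","):
        name = name.strip().lstrip(".").lower()  # ignore leading dots
        if not name:
            continue
        # Exact match, or host is a subdomain of the entry.
        if host == name or host.endswith("." + name):
            return True
    return False


print(bypasses_proxy("a.b.c", ".b.c"))
print(bypasses_proxy("xb.c", ".b.c"))
```

Using plain string comparison rather than a regular expression ending in `$` is what keeps a trailing newline in the host from matching in this sketch.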
* bpo-38686: fix HTTP Digest handling in request.py (GH-17045)    Miss Islington (bot)    2019-11-22    1    -2/+4
| * fix HTTP Digest handling in request.py
|
|   There is a bug triggered when a server replies to a request with
|   `WWW-Authenticate: Digest` where `qop="auth,auth-int"` rather than
|   mere `qop="auth"`.
|
|   Having both `auth` and `auth-int` is legitimate according to the
|   `qop-options` rule in §3.2.1 of
|   [[https://www.ietf.org/rfc/rfc2617.txt|RFC 2617]]:
|
|   > qop-options = "qop" "=" <"> 1#qop-value <">
|   > qop-value = "auth" | "auth-int" | token
|   > **qop-options**: [...] If present, it is a quoted string **of one or
|   > more** tokens indicating the "quality of protection" values supported
|   > by the server. The value `"auth"` indicates authentication; the value
|   > `"auth-int"` indicates authentication with integrity protection
|
|   This description is confirmed by the definition of the
|   [//n//]`#`[//m//]//rule// extended-BNF pattern defined in §2.1 of
|   [[https://www.ietf.org/rfc/rfc2616.txt|RFC 2616]] as 'a comma-separated
|   list of //rule// with at least //n// and at most //m// items'.
|
|   When this reply is parsed by `get_authorization`, request.py only tests
|   for identity with `'auth'`, failing to recognize it as one of the
|   supported modes the server announced, and claims that
|   `"qop 'auth,auth-int' is not supported"`.
|
| * 📜🤖 Added by blurb_it.
|
| * bpo-38686 review fix: remember why.
|
| * fix trailing space in Lib/urllib/request.py
|
| Co-Authored-By: Brandt Bucher <brandtbucher@gmail.com>
| (cherry picked from commit 14a89c47983f2fb9e7fdf33c769e622eefd3a14a)
|
| Co-authored-by: PypeBros <PypeBros@users.noreply.github.com>
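The rule described above, treating `qop` as a comma-separated list and looking for `auth` among its tokens rather than comparing the whole string, might look like the sketch below. `select_qop` and the plain-dict challenge are assumptions made for illustration; they do not mirror the structure of `get_authorization`.

```python
def select_qop(challenge):
    """Hypothetical helper: choose a qop mode from parsed challenge parameters."""
    qop = challenge.get("qop")
    if qop is None:
        return None  # qop is optional; the server did not send it
    # RFC 2617 allows a comma-separated list such as "auth,auth-int".
    offered = [token.strip() for token in qop.split(",")]
    if "auth" in offered:
        return "auth"
    raise ValueError(f"qop {qop!r} is not supported")


print(select_qop({"qop": "auth,auth-int"}))
print(select_qop({}))
```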
* [3.8] bpo-27657: Fix urlparse() with numeric paths (GH-661) (#16839)    Senthil Kumaran    2019-10-18    1    -21/+1
| * bpo-27657: Fix urlparse() with numeric paths
|
|   Revert parsing decision from bpo-754016 in favor of the documented
|   consensus in bpo-16932 of how to treat strings without a // to
|   designate the netloc.
|
| * bpo-22891: Remove urlsplit() optimization for 'http' prefixed inputs.
|
| (cherry picked from commit 5a88d50ff013a64fbdb25b877c87644a9034c969)
|
| Co-authored-by: Tim Graham <timograham@gmail.com>
* bpo-25068: urllib.request.ProxyHandler now lowercases the dict keys (GH-13489)    Miss Islington (bot)    2019-09-13    1    -0/+1
| (cherry picked from commit b761e3aed1fbada4572a776f6a0d3c4be491d595)
|
| Co-authored-by: Zackery Spytz <zspytz@gmail.com>
* bpo-35922: Fix RobotFileParser when robots.txt has no relevant crawl delay or request rate (GH-11791)    Miss Islington (bot)    2019-06-16    1    -2/+6
| Co-Authored-By: Tal Einat <taleinat+github@gmail.com>
| (cherry picked from commit 8047e0e1c620f69cc21f9ca48b24bf2cdd5c3668)
|
| Co-authored-by: Rémi Lapeyre <remi.lapeyre@henki.fr>
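A short sketch of the situation this fix addresses, assuming a Python release that includes it: a robots.txt in which one agent has a crawl delay and everyone else has none. The agent names and rules are made up.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: slowbot",
    "Crawl-delay: 5",
    "",
    "User-agent: *",
    "Disallow:",
])

# The named agent gets its own delay; any other agent falls back to the
# "*" entry, which defines neither a delay nor a request rate.
slow_delay = parser.crawl_delay("slowbot")
other_delay = parser.crawl_delay("otherbot")
other_rate = parser.request_rate("otherbot")

print(slow_delay, other_delay, other_rate)
```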
* bpo-36742: Corrects fix to handle decomposition in usernames (#13812)    Steve Dower    2019-06-04    1    -3/+3
|
* bpo-35397: Remove deprecation and document urllib.parse.unwrap (GH-11481)    Rémi Lapeyre    2019-05-27    2    -11/+9
|
* bpo-36842: Implement PEP 578 (GH-12613)    Steve Dower    2019-05-23    1    -0/+1
| Adds sys.audit, sys.addaudithook, io.open_code, and associated C APIs.
* bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474)    Victor Stinner    2019-05-22    1    -1/+1
| CVE-2019-9948: Avoid file reading by disallowing the unnecessary URL
| scheme in URLopener().open() and URLopener().retrieve()
| of urllib.request.
|
| Co-Authored-By: SH <push0ebp@gmail.com>
* bpo-36948: Fix NameError in urllib.request.URLopener.retrieve (GH-13389)    Xtreak    2019-05-19    1    -5/+5
|
* bpo-36742: Fixes handling of pre-normalization characters in urlsplit() (GH-13017)    Steve Dower    2019-04-30    1    -4/+7
* bpo-12910: update and correct quote docstring (#2568)    Jörn Hees    2019-04-10    1    -13/+20
| Fixes some mistakes and misleading statements in the quote function docstring:
| - reserved chars are never actually used by quote code, unreserved chars are
| - reserved chars were wrong and incomplete
| - mentioned that use-case is not minimal quoting wrt. RFC, but cautious quoting
* bpo-36431: Use PEP 448 dict unpacking for merging two dicts. (GH-12553)    Serhiy Storchaka    2019-03-27    1    -2/+1
|
* bpo-36216: Add check for characters in netloc that normalize to separators (GH-12201)    Steve Dower    2019-03-07    1    -0/+17
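To make "characters in netloc that normalize to separators" concrete, here is a sketch assuming a Python release that includes this check. U+2100 (ACCOUNT OF) is used as the example character; the host name is made up.

```python
import unicodedata
from urllib.parse import urlsplit

# U+2100 expands to "a/c" under NFKC normalization, which would introduce
# a path separator into the host part of the URL.
expanded = unicodedata.normalize("NFKC", "\u2100")

try:
    urlsplit("http://exa\u2100mple.com/")
    rejected = False
except ValueError:
    rejected = True

print(expanded, rejected)
```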
* closes bpo-35309: cpath should be capath (GH-10699)    Boštjan Mejak    2018-11-25    1    -1/+1
|
* bpo-34866: Adding max_num_fields to cgi.FieldStorage (GH-9660)    matthewbelisle-wf    2018-10-19    1    -3/+19
| Adding `max_num_fields` to `cgi.FieldStorage` to make DOS attacks harder by
| limiting the number of `MiniFieldStorage` objects created by `FieldStorage`.
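The message talks about `cgi.FieldStorage`; `urllib.parse.parse_qsl` also accepts a `max_num_fields` keyword in current Python releases, which is the part sketched here. `parse_limited` is a hypothetical wrapper, and the limit of 2 is arbitrary.

```python
from urllib.parse import parse_qsl


def parse_limited(query, limit):
    """Hypothetical wrapper: parse query, or return None if it has too many fields."""
    try:
        return parse_qsl(query, max_num_fields=limit)
    except ValueError:
        # Raised when the query contains more than `limit` fields.
        return None


print(parse_limited("a=1&b=2", 2))
print(parse_limited("a=1&b=2&c=3", 2))
```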
* bpo-21475: Support the Sitemap extension in robotparser (GH-6883)    Christopher Beacham    2018-05-16    1    -0/+12
|
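A brief sketch of the Sitemap support, assuming Python 3.8 or later where `RobotFileParser.site_maps()` is available. The URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "Sitemap: https://example.com/sitemap.xml",
    "User-agent: *",
    "Disallow: /private",
])

# Sitemap lines are collected independently of any User-agent section.
sitemaps = parser.site_maps()

# A robots.txt without Sitemap lines yields None rather than an empty list.
empty = RobotFileParser()
empty.parse(["User-agent: *", "Disallow:"])
no_sitemaps = empty.site_maps()

print(sitemaps, no_sitemaps)
```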
* bpo-32861: urllib.robotparser fix incomplete __str__ methods. (GH-5711)    Michael Lazar    2018-05-14    1    -5/+12
| The urllib.robotparser's __str__ representation now includes wildcard
| entries and the "Crawl-delay" and "Request-rate" fields. Also removes extra
| newlines that were being appended to the end of the string.
* bpo-27485: Rename and deprecate undocumented functions in urllib.parse (GH-2205)    Cheryl Sabella    2018-04-25    2    -57/+152
|
* bpo-33034: Improve exception message when cast fails for {Parse,Split}Result.port (GH-6078)    Matt Eaton    2018-03-20    1    -1/+5
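The cast this commit refers to happens when the `port` attribute is read, not when the URL is split. A small sketch with a hypothetical helper `safe_port` and made-up URLs:

```python
from urllib.parse import urlsplit


def safe_port(url):
    """Hypothetical helper: return the port as an int, or None if it is invalid."""
    parts = urlsplit(url)
    try:
        return parts.port
    except ValueError as exc:
        # The exception message names the offending port value.
        print(f"bad port in {url!r}: {exc}")
        return None


print(safe_port("http://example.com:8080/"))
print(safe_port("http://example.com:http/"))
```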
* Revert unnecessary changes made in bpo-30296 and apply other improvements. (GH-2624)    Serhiy Storchaka    2018-02-26    1    -1/+2
* urllib.request: Remove unused import (GH-5268)    INADA Naoki    2018-01-22    1    -1/+0
|
* bpo-32323: urllib.parse.urlsplit() must not lowercase() IPv6 scope value (#4867)    Коренберг Марк    2017-12-21    1    -4/+6
|
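A sketch of the behaviour named in the subject line, assuming a Python release with this fix: the address part of the host is lowercased while the scope (zone) identifier after `%` keeps its case. The address and zone name are made up.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://[FE80::1%Eth0]:8080/")

# hostname lowercases the address but leaves the zone identifier untouched.
print(parts.hostname, parts.port)
```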
* bpo-31325: Fix usage of namedtuple in RobotFileParser.parse() (#4529)    Berker Peksag    2017-11-23    1    -5/+4
|
* remove a redundant lower in urllib.parse.urlsplit (#3008)    Oren Milman    2017-09-03    1    -2/+1
|
* urllib: Simplify splithost by calling into urlparse. (#1849)    postmasters    2017-06-20    1    -1/+1
| The current regex based splitting produces a wrong result. For example::
|
|   http://abc#@def
|
| Web browsers parse that URL as ``http://abc/#@def``, that is, the host
| is ``abc``, the path is ``/``, and the fragment is ``#@def``.
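The example URL from the message can be checked directly with `urlsplit`. Note that `urlsplit` reports the fragment without its leading `#` and leaves the path empty rather than filling in `/`.

```python
from urllib.parse import urlsplit

parts = urlsplit("http://abc#@def")

# The "#" ends the network location, so "@def" belongs to the fragment
# and is not treated as a host following userinfo.
print(parts.netloc, repr(parts.path), parts.fragment)
```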
* bpo-30296 Remove unnecessary tuples, lists, sets, and dicts (#1489)    Jon Dufresne    2017-05-18    1    -6/+5
| * Replaced list(<generator expression>) with list comprehension
| * Replaced dict(<generator expression>) with dict comprehension
| * Replaced set(<list literal>) with set literal
| * Replaced builtin func(<list comprehension>) with func(<generator
|   expression>) when supported (e.g. any(), all(), tuple(), min(), &
|   max())
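The four replacements listed above, shown side by side on made-up data. Each new form produces the same value as the old one while skipping an intermediate object.

```python
values = [3, 1, 2]

# list(<generator expression>) -> list comprehension
squares_list = [v * v for v in values]

# dict(<generator expression>) -> dict comprehension
squares_dict = {v: v * v for v in values}

# set(<list literal>) -> set literal
unique = {1, 2, 3}

# func(<list comprehension>) -> func(<generator expression>)
has_even = any(v % 2 == 0 for v in values)

print(squares_list, squares_dict, unique, has_even)
```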
* bpo-29976: urllib.parse clarify '' in scheme values. (GH-984)    Senthil Kumaran    2017-05-18    1    -11/+19
|
* bpo-30022: Get rid of using EnvironmentError and IOError (except test… (#1051)    Serhiy Storchaka    2017-04-16    1    -1/+1
|
* Remove superfluous comment in urllib.error. (#1076)    Senthil Kumaran    2017-04-11    1    -4/+0
|
* Remove OSError related comment in urllib.request. (#1070)    Senthil Kumaran    2017-04-10    1    -1/+0
|
* Remove invalid comment in urllib.request. (#1054)    Senthil Kumaran    2017-04-09    1    -6/+2
|
* correct parse_qs and parse_qsl test case descriptions. (#968)    Senthil Kumaran    2017-04-05    1    -13/+17
| * correct parse_qs and parse_qsl test case descriptions.
* bpo-16285: Update urllib quoting to RFC 3986 (#173)    Ratnadeep Debnath    2017-02-25    1    -3/+6
| * bpo-16285: Update urllib quoting to RFC 3986
|
|   urllib.parse.quote is now based on RFC 3986, and hence
|   includes `'~'` in the set of characters that is not escaped
|   by default.
|
|   Patch by Christian Theune and Ratnadeep Debnath.
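A one-line illustration of the quoting change, assuming Python 3.7 or later: `~` is left alone, while a space is still percent-encoded. The path is made up.

```python
from urllib.parse import quote

quoted = quote("/~user/my file.txt")

print(quoted)
```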
* Issue #29142: Merge 3.6.    Xiang Zhang    2017-01-09    1    -0/+1
|\
| * Issue #29142: Merge 3.5.    Xiang Zhang    2017-01-09    1    -0/+1
| |\
| | * Issue #29142: Fix suffixes in no_proxy handling in urllib.    Xiang Zhang    2017-01-09    1    -0/+1
| | | In urllib.request, suffixes in no_proxy environment variable with
| | | leading dots could match related hostnames again (e.g. .b.c matches
| | | a.b.c).
| | |
| | | Patch by Milan Oberkirch.
* | | Issue #28992: Use bytes.fromhex().    Serhiy Storchaka    2016-12-21    1    -1/+1
| | |
* | | Remove unused imports.    Serhiy Storchaka    2016-12-16    1    -1/+0
|/ /
* | Issue #25400: RobotFileParser now correctly returns default values for crawl_delay and request_rate    Berker Peksag    2016-09-18    1    -2/+6
| | Initial patch by Peter Wirtz.
* | Issue #25895: Merge from 3.5    Berker Peksag    2016-09-16    1    -2/+3
|\ \
| |/
| * Issue #25895: Enable WebSocket URL schemes in urllib.parse.urljoin    Berker Peksag    2016-09-16    1    -2/+3
| | Patch by Gergely Imreh and Markus Holtermann.
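With the `ws` and `wss` schemes enabled for relative resolution, `urljoin` treats WebSocket URLs like HTTP ones. A sketch with made-up URLs, assuming a Python release that includes this change:

```python
from urllib.parse import urljoin

# Relative references are resolved against the WebSocket base URL.
sibling = urljoin("ws://example.com/a/b", "c")
parent = urljoin("wss://example.com/chat/room", "../status")

print(sibling)
print(parent)
```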
| * Issue #22450: Use "Accept: */*" in the default headers for urllib.request    Raymond Hettinger    2016-09-09    1    -1/+1
| |