| | Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|---|
| * | [3.13] gh-86155: Fix data loss after unclosed script or style tag in HTMLParser (GH-22658) (GH-133845) | Miss Islington (bot) | 2025-05-10 | 1 | -1/+1 |
| | When calling .close() the HTMLParser should flush all remaining content, even when that content is in an unclosed script or style tag. (cherry picked from commit 53383e90e4df7029f792b7aa81aa2e4cff348ed0) Co-authored-by: Waylan Limberg <waylan.limberg@icloud.com> | | | | |
| * | [3.13] gh-77057: Fix handling of invalid markup declarations in HTMLParser (GH-9295) (GH-133834) | Miss Islington (bot) | 2025-05-10 | 1 | -2/+2 |
| | (cherry picked from commit 76c0b01bc401c3e976011bbc69cec56dbebe0ad5) Co-authored-by: Ezio Melotti <ezio.melotti@gmail.com> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com> | | | | |
| * | [3.13] gh-69426: HTMLParser: only unescape properly terminated character entities in attribute values (GH-95215) (GH-133586) | Miss Islington (bot) | 2025-05-09 | 1 | -1/+19 |
| | According to the HTML5 spec, named character references in attribute values should only be processed if they are not followed by an ASCII alphanumeric, or an equals sign. (cherry picked from commit 77b14a6d58e527f915966446eb0866652a46feb5) https://html.spec.whatwg.org/multipage/parsing.html#named-character-reference-state Co-authored-by: Sascha Ißbrücker <sascha.issbruecker@googlemail.com> | | | | |
| * | gh-100210: Correct the comment link for unescaping HTML (#100212) | Jean-Christophe Amiel | 2023-02-19 | 1 | -1/+1 |
| | gh-100210: correct the comment link for unescaping HTML | | | | |
| * | gh-97669: Create Tools/build/ directory (#97963) | Victor Stinner | 2022-10-17 | 1 | -1/+1 |
| | Create Tools/build/ directory. Move the following scripts from Tools/scripts/ to Tools/build/: check_extension_modules.py, deepfreeze.py, freeze_modules.py, generate_global_objects.py, generate_levenshtein_examples.py, generate_opcode_h.py, generate_re_casefix.py, generate_sre_constants.py, generate_stdlib_module_names.py, generate_token.py, parse_html5_entities.py, smelly.py, stable_abi.py, umarshal.py, update_file.py, verify_ensurepip_wheels.py. Update references to these scripts. | | | | |
| * | gh-95813: Improve HTMLParser from the view of inheritance (#95874) | Dong-hee Na | 2022-08-18 | 1 | -1/+2 |
| | gh-95813: Improve HTMLParser from the view of inheritance; gh-95813: Add unittest; Address code review | | | | |
| * | gh-82927: Update files related to HTML entities. (GH-92504) | Ezio Melotti | 2022-06-21 | 1 | -3/+6 |
| * | Add source for character mappings (#92014) | slateny | 2022-05-06 | 1 | -0/+1 |
| * | bpo-45421: Remove dead code from html.parser (GH-28847) | Alberto Mardegan | 2021-10-12 | 1 | -7/+0 |
| | Support for HtmlParserError was removed back in 2014 with commit 73a4359eb0eb624c588c5d52083ea4944f9787ea, however this small block was missed. | | | | |
| * | Fix typos in the Lib directory (GH-28775) | Christian Clauss | 2021-10-06 | 1 | -1/+1 |
| | Fix typos in the Lib directory as identified by codespell. Co-authored-by: Terry Jan Reedy <tjreedy@udel.edu> | | | | |
| * | bpo-41748: Handles unquoted attributes with commas (#24072) | Karl Dubost | 2021-02-01 | 1 | -1/+1 |
| | bpo-41748: Adds tests for unquoted attributes with comma; bpo-41748: Handles unquoted attributes with comma; bpo-41748: Addresses review comments; bpo-41748: Addresses review comments; Adds more test cases; Simplifies the regex for handling spaces; bpo-41748: Moves attributes tests under the right class; bpo-41748: Addresses review about duplicate attributes; bpo-41748: Adds NEWS.d entry for this patch | | | | |
| * | bpo-37328: remove deprecated HTMLParser.unescape (GH-14186) | Inada Naoki | 2019-08-27 | 1 | -8/+0 |
| | It is deprecated since Python 3.4. | | | | |
| * | bpo-30629: Remove second call of str.lower() in html.parser.parse_endtag. (#2099) | Motoki Naruse | 2017-06-17 | 1 | -1/+1 |
| | elem is the result of .lower() 6 lines above the handle_endtag call. Patch by Motoki Naruse | | | | |
| * | Revert "Fixed a typo in the HTMLParser.feed docstrings" (#1771) | Serhiy Storchaka | 2017-05-24 | 1 | -1/+1 |
| | * Revert "Fixed a typo in the HTMLParser.feed docstrings. The docstring started with an 'r', like a The docstring was correct. I read the patch in opposite direction, as *adding* the "r" prefix. This reverts commit 5ba185039f1bd465d3f82531324fd3fe1ee42f0c. | | | | |
| * | Fixed a typo in the HTMLParser.feed docstrings. The docstring started with an 'r', like a rawstring. (#1759) | Jani Šumak | 2017-05-23 | 1 | -1/+1 |
| * | #27364: fix "incorrect" uses of escape character in the stdlib. | R David Murray | 2016-09-08 | 1 | -2/+2 |
| | And most of the tools. Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and Martin Panter. | | | | |
| * | Issue #27076: Doc, comment and tests spelling fixes | Martin Panter | 2016-05-26 | 1 | -1/+1 |
| | Most fixes to Doc/ and Lib/ directories by Ville Skyttä. | | | | |
| * | Merge spelling fixes from 3.4 into 3.5 | Martin Panter | 2015-10-31 | 1 | -1/+1 |
| \| * | Fix some spelling errors in documentation and code comments | Martin Panter | 2015-10-31 | 1 | -1/+1 |
| * \| | #23144: merge with 3.4. | Ezio Melotti | 2015-09-06 | 1 | -1/+9 |
| \| * | #23144: Make sure that HTMLParser.feed() returns all the data, even when convert_charrefs is True. | Ezio Melotti | 2015-09-06 | 1 | -1/+9 |
| * \| | Issue #23181: More "codepoint" -> "code point". | Serhiy Storchaka | 2015-01-18 | 1 | -2/+2 |
| \| * | Issue #23181: More "codepoint" -> "code point". | Serhiy Storchaka | 2015-01-18 | 1 | -2/+2 |
| * \| | #21047: set the default value for the *convert_charrefs* argument of HTMLParser to True. Patch by Berker Peksag. | Ezio Melotti | 2014-08-02 | 1 | -8/+2 |
| * \| | Add an `__all__` to html.entities. | Ezio Melotti | 2014-08-02 | 1 | -0/+3 |
| * \| | #15114: the strict mode and argument of HTMLParser, HTMLParser.error, and the HTMLParserError exception have been removed. | Ezio Melotti | 2014-08-02 | 1 | -94/+12 |
| * | #20288: merge with 3.3. | Ezio Melotti | 2014-02-01 | 1 | -3/+3 |
| \| * | #20288: fix handling of invalid numeric charrefs in HTMLParser. | Ezio Melotti | 2014-02-01 | 1 | -3/+3 |
| * \| | #13633: Added a new convert_charrefs keyword arg to HTMLParser that, when True, automatically converts all character references. | Ezio Melotti | 2013-11-23 | 1 | -17/+45 |
| * \| | #19688: add back and deprecate the internal HTMLParser.unescape() method. | Ezio Melotti | 2013-11-22 | 1 | -0/+7 |
| * \| | #2927: Added the unescape() function to the html module. | Ezio Melotti | 2013-11-19 | 2 | -34/+118 |
| * \| | #19480: merge with 3.3. | Ezio Melotti | 2013-11-07 | 1 | -9/+12 |
| \| * | #19480: HTMLParser now accepts all valid start-tag names as defined by the HTML5 standard. | Ezio Melotti | 2013-11-07 | 1 | -9/+12 |
| * \| | #15114: The html.parser module now raises a DeprecationWarning when the strict argument of HTMLParser or the HTMLParser.error method are used. | Ezio Melotti | 2013-11-02 | 1 | -4/+10 |
| * \| | #18020: improve html.escape speed by an order of magnitude. Patch by Matt Bryant. | Ezio Melotti | 2013-07-07 | 1 | -7/+6 |
| * \| | #17802: merge with 3.3. | Ezio Melotti | 2013-05-01 | 1 | -0/+1 |
| \| * | #17802: Fix an UnboundLocalError in html.parser. Initial tests by Thomas Barlow. | Ezio Melotti | 2013-05-01 | 1 | -0/+1 |
| * \| | #14679: add an `__all__` (that contains only HTMLParser) to html.parser. | Ezio Melotti | 2013-05-01 | 1 | -0/+2 |
| * | #16245: Fix the value of a few entities in html.entities.html5. | Ezio Melotti | 2012-10-23 | 1 | -12/+12 |
| * | Reorder html.entities.html5 entities to make updates easier. Patch by Iuliia Proskurnia. | Ezio Melotti | 2012-10-23 | 1 | -109/+109 |
| * | #15156: HTMLParser now uses the new "html.entities.html5" dictionary. | Ezio Melotti | 2012-06-24 | 1 | -17/+15 |
| * | #11113: add a new "html5" dictionary containing the named character references defined by the HTML5 standard and the equivalent Unicode character(s) to the html.entities module. | Ezio Melotti | 2012-06-24 | 1 | -0/+2236 |
| * | #15114: the strict mode of HTMLParser and the HTMLParseError exception are deprecated now that the parser is able to parse invalid markup. | Ezio Melotti | 2012-06-23 | 1 | -9/+12 |
| * | #14538: HTMLParser can now parse correctly start tags that contain a bare /. | Ezio Melotti | 2012-04-19 | 1 | -3/+3 |
| * | HTMLParser is now able to handle slashes in the start tag. | Ezio Melotti | 2012-02-21 | 1 | -7/+11 |
| * | Fix an index and clean up comments. | Ezio Melotti | 2012-02-13 | 1 | -1/+2 |
| * | Improve handling of declarations in HTMLParser. | Ezio Melotti | 2012-02-13 | 1 | -8/+22 |
| * | #13993: HTMLParser is now able to handle broken end tags when strict=False. | Ezio Melotti | 2012-02-13 | 1 | -15/+27 |
| * | #13960: HTMLParser is now able to handle broken comments when strict=False. | Ezio Melotti | 2012-02-10 | 1 | -1/+24 |
| * | #13358: HTMLParser now calls handle_data only once for each CDATA. | Ezio Melotti | 2011-11-18 | 1 | -3/+4 |
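
Several entries in this log (#13633, #21047, #2927) concern the `convert_charrefs` argument of `HTMLParser` and the `html.unescape()` function. The following is a minimal sketch of how the two settings differ from a caller's point of view. `EventCollector` and `collect` are hypothetical helper names introduced only for this illustration; they are not part of the standard library or of any commit listed here.

```python
from html import unescape
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records parser callbacks as (kind, value) tuples."""

    def __init__(self, *, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        # Only called when convert_charrefs is False.
        self.events.append(("entityref", name))


def collect(markup, *, convert_charrefs=True):
    """Parse markup and return the recorded events (illustrative helper)."""
    parser = EventCollector(convert_charrefs=convert_charrefs)
    parser.feed(markup)
    parser.close()  # ask the parser to process any text it is still buffering
    return parser.events


# With convert_charrefs=True the reference is decoded inside the text data.
converted = collect("<p>a &amp; b</p>", convert_charrefs=True)
# With convert_charrefs=False the reference is reported via handle_entityref().
raw = collect("<p>a &amp; b</p>", convert_charrefs=False)

print(converted)
print(raw)
print(unescape("&lt;p&gt; &amp; &#65;"))
```

Calling `close()` after the last `feed()` matters most for input that ends in the middle of a construct; the exact behavior for unclosed `script` or `style` elements depends on whether the interpreter includes the gh-86155 fix, so the sketch deliberately avoids that case.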
