Graph | Commit message | Author | Date | Files | Lines |
---|---|---|---|---|---|
* | bpo-37328: remove deprecated HTMLParser.unescape (GH-14186). It is deprecated since Python 3.4. | Inada Naoki | 2019-08-27 | 1 | -8/+0 |
* | bpo-30629: Remove second call of str.lower() in html.parser.parse_endtag. (#2099) elem is the result of .lower() 6 lines above the handle_endtag call. Patch by Motoki Naruse | Motoki Naruse | 2017-06-17 | 1 | -1/+1 |
* | Revert "Fixed a typo in the HTMLParser.feed docstrings" (#1771). The docstring was correct. I read the patch in opposite direction, as *adding* the "r" prefix. This reverts commit 5ba185039f1bd465d3f82531324fd3fe1ee42f0c. | Serhiy Storchaka | 2017-05-24 | 1 | -1/+1 |
* | Fixed a typo in the HTMLParser.feed docstrings. The docstring started with an 'r', like a rawstring. (#1759) | Jani Šumak | 2017-05-23 | 1 | -1/+1 |
* | #27364: fix "incorrect" uses of escape character in the stdlib. And most of the tools. Patch by Emanual Barry, reviewed by me, Serhiy Storchaka, and Martin Panter. | R David Murray | 2016-09-08 | 1 | -2/+2 |
* | Issue #27076: Doc, comment and tests spelling fixes. Most fixes to Doc/ and Lib/ directories by Ville Skyttä. | Martin Panter | 2016-05-26 | 1 | -1/+1 |
* | #23144: merge with 3.4. | Ezio Melotti | 2015-09-06 | 1 | -1/+9 |
|\ | |||||
| * | #23144: Make sure that HTMLParser.feed() returns all the data, even when convert_charrefs is True. | Ezio Melotti | 2015-09-06 | 1 | -1/+9 |
* | | #21047: set the default value for the *convert_charrefs* argument of HTMLParser to True. Patch by Berker Peksag. | Ezio Melotti | 2014-08-02 | 1 | -8/+2 |
* | | #15114: the strict mode and argument of HTMLParser, HTMLParser.error, and the HTMLParseError exception have been removed. | Ezio Melotti | 2014-08-02 | 1 | -94/+12 |
|/ | |||||
* | #20288: merge with 3.3. | Ezio Melotti | 2014-02-01 | 1 | -3/+3 |
|\ | |||||
| * | #20288: fix handling of invalid numeric charrefs in HTMLParser. | Ezio Melotti | 2014-02-01 | 1 | -3/+3 |
| | | |||||
* | | #13633: Added a new convert_charrefs keyword arg to HTMLParser that, when True, automatically converts all character references. | Ezio Melotti | 2013-11-23 | 1 | -17/+45 |
* | | #19688: add back and deprecate the internal HTMLParser.unescape() method. | Ezio Melotti | 2013-11-22 | 1 | -0/+7 |
| | | |||||
* | | #2927: Added the unescape() function to the html module. | Ezio Melotti | 2013-11-19 | 1 | -33/+5 |
| | | |||||
* | | #19480: merge with 3.3. | Ezio Melotti | 2013-11-07 | 1 | -9/+12 |
|\ \ | |/ | |||||
| * | #19480: HTMLParser now accepts all valid start-tag names as defined by the HTML5 standard. | Ezio Melotti | 2013-11-07 | 1 | -9/+12 |
* | | #15114: The html.parser module now raises a DeprecationWarning when the strict argument of HTMLParser or the HTMLParser.error method are used. | Ezio Melotti | 2013-11-02 | 1 | -4/+10 |
* | | #17802: merge with 3.3. | Ezio Melotti | 2013-05-01 | 1 | -0/+1 |
|\ \ | |/ | |||||
| * | #17802: Fix an UnboundLocalError in html.parser. Initial tests by Thomas Barlow. | Ezio Melotti | 2013-05-01 | 1 | -0/+1 |
* | | #14679: add an __all__ (that contains only HTMLParser) to html.parser. | Ezio Melotti | 2013-05-01 | 1 | -0/+2 |
|/ | |||||
* | #15156: HTMLParser now uses the new "html.entities.html5" dictionary. | Ezio Melotti | 2012-06-24 | 1 | -17/+15 |
| | |||||
* | #15114: the strict mode of HTMLParser and the HTMLParseError exception are deprecated now that the parser is able to parse invalid markup. | Ezio Melotti | 2012-06-23 | 1 | -9/+12 |
* | #14538: HTMLParser can now parse correctly start tags that contain a bare /. | Ezio Melotti | 2012-04-19 | 1 | -3/+3 |
| | |||||
* | HTMLParser is now able to handle slashes in the start tag. | Ezio Melotti | 2012-02-21 | 1 | -7/+11 |
| | |||||
* | Fix an index and clean up comments. | Ezio Melotti | 2012-02-13 | 1 | -1/+2 |
| | |||||
* | Improve handling of declarations in HTMLParser. | Ezio Melotti | 2012-02-13 | 1 | -8/+22 |
| | |||||
* | #13993: HTMLParser is now able to handle broken end tags when strict=False. | Ezio Melotti | 2012-02-13 | 1 | -15/+27 |
| | |||||
* | #13960: HTMLParser is now able to handle broken comments when strict=False. | Ezio Melotti | 2012-02-10 | 1 | -1/+24 |
| | |||||
* | #13358: HTMLParser now calls handle_data only once for each CDATA. | Ezio Melotti | 2011-11-18 | 1 | -3/+4 |
| | |||||
* | #1745761, #755670, #13357, #12629, #1200313: improve attribute handling in HTMLParser. | Ezio Melotti | 2011-11-14 | 1 | -9/+10 |
* | #670664: Fix HTMLParser to correctly handle the content of ``<script>...</script>`` and ``<style>...</style>``. | Ezio Melotti | 2011-11-01 | 1 | -4/+18 |
* | #13273: fix a bug that prevented HTMLParser to properly detect some tags when strict=False. | Ezio Melotti | 2011-10-28 | 1 | -3/+2 |
* | #12888: Fix a bug in HTMLParser.unescape that prevented it to escape more than 128 entities. Patch by Peter Otten. | Ezio Melotti | 2011-09-05 | 1 | -1/+1 |
* | Merge 3.1 | Éric Araujo | 2011-05-25 | 1 | -1/+1 |
|\ | |||||
| * | Fix display of html.parser.HTMLParser.feed docstring | Éric Araujo | 2011-05-04 | 1 | -1/+1 |
| | | |||||
| * | Merged revisions 87542 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k: r87542 (senthil.kumaran, 2010-12-28 23:55:16 +0800): Fix Issue10759 - html.parser.unescape() fails on HTML entities with incorrect syntax | Senthil Kumaran | 2010-12-28 | 1 | -7/+10 |
| * | Merged revisions 81504 via svnmerge from svn+ssh://pythondev@svn.python.org/python/branches/py3k: r81504 (victor.stinner, 2010-05-24 23:46:25 +0200): Recorded merge of revisions 81500-81501 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk: r81500 (victor.stinner, 2010-05-24 23:33:24 +0200): Issue #6662: Fix parsing of malformatted charref (&#bad;); r81501 (victor.stinner, 2010-05-24 23:37:28 +0200): Add the author of the last fix (Issue #6662) | Victor Stinner | 2010-05-24 | 1 | -0/+3 |
* | | #7311: fix html.parser to accept non-ASCII attribute values. | Ezio Melotti | 2011-04-07 | 1 | -1/+1 |
| | | |||||
* | | Fix Issue10759 - html.parser.unescape() fails on HTML entities with incorrect syntax | Senthil Kumaran | 2010-12-28 | 1 | -7/+10 |
* | | #1486713: Add a tolerant mode to HTMLParser. The motivation for adding this option is that the functionality it provides used to be provided by sgmllib in Python2, and was used by, for example, BeautifulSoup. Without this option, the Python3 version of BeautifulSoup and the many programs that use it are crippled. The original patch was by 'kxroberto'. I modified it heavily but kept his heuristics and test. I also added additional heuristics to fix #975556, #1046092, and part of #6191. This patch should be completely backward compatible: the behavior with the default strict=True is unchanged. | R. David Murray | 2010-12-03 | 1 | -16/+83 |
* | | Recorded merge of revisions 81500-81501 via svnmerge from svn+ssh://pythondev@svn.python.org/python/trunk: r81500 (victor.stinner, 2010-05-24 23:33:24 +0200): Issue #6662: Fix parsing of malformatted charref (&#bad;); r81501 (victor.stinner, 2010-05-24 23:37:28 +0200): Add the author of the last fix (Issue #6662) | Victor Stinner | 2010-05-24 | 1 | -0/+3 |
|/ | |||||
* | #2834: Change re module semantics, so that str and bytes mixing is forbidden, and str (unicode) patterns get full unicode matching by default. The re.ASCII flag is also introduced to ask for ASCII matching instead. | Antoine Pitrou | 2008-08-19 | 1 | -1/+1 |
* | Change test_htmlparser to reflect the HTMLParser -> html.parser rename in r63439. Also fix one occurrence of unichr() in html.parser. | Mark Dickinson | 2008-05-21 | 1 | -1/+1 |
* | rename HTMLParser to html.parser and htmlentitydefs to html.entities; includes merge of trunk revision 63432 | Fred Drake | 2008-05-17 | 1 | -0/+388 |
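
Several entries in this log concern character-reference handling: #2927 added `html.unescape()`, #13633 added the `convert_charrefs` argument, #21047 made it default to True, and bpo-37328 removed `HTMLParser.unescape`. The following is a minimal sketch of how those pieces appear to fit together on a current Python 3, not code taken from any of the commits. The class `TextCollector` and the helper `collect` are hypothetical names introduced only for illustration.

```python
import html
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical subclass that records text and unconverted references."""

    def __init__(self, *, convert_charrefs=True):
        super().__init__(convert_charrefs=convert_charrefs)
        self.chunks = []    # text passed to handle_data
        self.charrefs = []  # names seen by handle_entityref / handle_charref

    def handle_data(self, data):
        self.chunks.append(data)

    def handle_entityref(self, name):
        # Only called when convert_charrefs is False.
        self.charrefs.append(name)

    def handle_charref(self, name):
        # Only called when convert_charrefs is False.
        self.charrefs.append(name)


def collect(markup, convert_charrefs=True):
    """Hypothetical helper: parse markup, return (joined text, references)."""
    parser = TextCollector(convert_charrefs=convert_charrefs)
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks), parser.charrefs


markup = "<p>a &amp; b &#62; c</p>"

# Default since #21047: references are converted before handle_data sees them.
print(collect(markup))

# Opting out routes references to handle_entityref / handle_charref instead.
print(collect(markup, convert_charrefs=False))

# Module-level replacement for the removed HTMLParser.unescape method.
print(html.unescape("&lt;b&gt; &amp; &#x41;"))
```

With `convert_charrefs=True` the parser buffers text until it reaches a tag or `close()` is called, which is why the helper calls `close()` before reading the collected chunks.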