path: root/Lib/html/parser.py
Commit message | Author | Age | Files | Lines
* Fix Issue10759 - html.parser.unescape() fails on HTML entities with incorrect syntax
  Author: Senthil Kumaran | Age: 2010-12-28 | Files: 1 | Lines: -7/+10
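As an illustration of the behavior this fix targets, here is a minimal sketch using `html.unescape()`, the standard-library function available in current Python versions. It is not the `unescape()` method the commit patched (that method was later deprecated and removed), so treat it as an approximation of the intended behavior rather than the patched code itself. The input strings are made up for the example.

```python
import html

# Well-formed character and entity references are decoded.
decoded = html.unescape("&lt;b&gt; &amp; &#65;")

# References with incorrect syntax are left as-is instead of raising.
untouched = html.unescape("&#bad; &bogus;")

print(decoded)
print(untouched)
```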
* #1486713: Add a tolerant mode to HTMLParser.
  Author: R. David Murray | Age: 2010-12-03 | Files: 1 | Lines: -16/+83

  The motivation for adding this option is that the functionality it provides used to be provided by sgmllib in Python2, and was used by, for example, BeautifulSoup. Without this option, the Python3 version of BeautifulSoup and the many programs that use it are crippled.

  The original patch was by 'kxroberto'. I modified it heavily but kept his heuristics and test. I also added additional heuristics to fix #975556, #1046092, and part of #6191.

  This patch should be completely backward compatible: the behavior with the default strict=True is unchanged.
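A rough sketch of what tolerant parsing looks like in practice. At the time of this commit the mode was selected with `strict=False`. In later Python versions the `strict` parameter was deprecated and then removed, and the parser is always tolerant, so the sketch below omits it to stay runnable on current interpreters. `TagCollector` and the sample markup are hypothetical names and data chosen for the example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Hypothetical helper that records every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


collector = TagCollector()
# Unclosed elements and a stray valueless attribute: sloppy, but common, markup.
collector.feed("<p>unclosed <b>bold<p class=x y>next")
collector.close()
print(collector.tags)
```

The parser reports the tags it can recognize and keeps going, which is the behavior libraries such as BeautifulSoup relied on.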
* Recorded merge of revisions 81500-81501 via svnmerge from
  Author: Victor Stinner | Age: 2010-05-24 | Files: 1 | Lines: -0/+3

  svn+ssh://pythondev@svn.python.org/python/trunk

  ........
  r81500 | victor.stinner | 2010-05-24 23:33:24 +0200 (lun., 24 mai 2010) | 2 lines

  Issue #6662: Fix parsing of malformatted charref (&#bad;)
  ........
  r81501 | victor.stinner | 2010-05-24 23:37:28 +0200 (lun., 24 mai 2010) | 2 lines

  Add the author of the last fix (Issue #6662)
  ........
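To show the kind of input Issue #6662 is about, the sketch below feeds a malformed character reference through `HTMLParser` on a current Python version. It assumes the default `convert_charrefs=True` behavior of modern releases, which differs from the 2010 code path. `TextCollector` and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper that gathers all text data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


parser = TextCollector()
# "&#65;" is a valid numeric reference; "&#bad;" is malformed.
parser.feed("<p>ok &#65; bad &#bad; end</p>")
parser.close()
text = "".join(parser.chunks)
print(text)
```

The valid reference is decoded, while the malformed one passes through as literal text instead of aborting the parse.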
* #2834: Change re module semantics, so that str and bytes mixing is forbidden,
  Author: Antoine Pitrou | Age: 2008-08-19 | Files: 1 | Lines: -1/+1

  and str (unicode) patterns get full unicode matching by default. The re.ASCII flag is also introduced to ask for ASCII matching instead.
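The semantics described in this commit message can be sketched as follows. The sample strings are made up for the example; the calls themselves are standard `re` module API.

```python
import re

# str patterns match Unicode word characters by default.
unicode_match = re.match(r"\w+", "héllo").group()

# re.ASCII restricts \w to ASCII, so the match stops before "é".
ascii_match = re.match(r"\w+", "héllo", re.ASCII).group()

# Mixing a bytes pattern with a str subject is rejected.
try:
    re.search(b"a", "abc")
    mixing_error = None
except TypeError as exc:
    mixing_error = exc

print(unicode_match, ascii_match, type(mixing_error).__name__)
```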
* Change test_htmlparser to reflect the HTMLParser -> html.parser
  Author: Mark Dickinson | Age: 2008-05-21 | Files: 1 | Lines: -1/+1

  rename in r63439. Also fix one occurrence of unichr() in html.parser.
* rename HTMLParser to html.parser and htmlentitydefs to html.entities;
  Author: Fred Drake | Age: 2008-05-17 | Files: 1 | Lines: -0/+388

  includes merge of trunk revision 63432
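For readers porting older code, a brief sketch of the post-rename imports. The Python 2 names appear only in comments, and the `eacute` lookup is just an illustrative use of the entities table. It also uses `chr()`, the Python 3 replacement for the `unichr()` call mentioned in the neighboring log entry.

```python
# Python 2: from HTMLParser import HTMLParser
from html.parser import HTMLParser

# Python 2: from htmlentitydefs import name2codepoint
from html.entities import name2codepoint

# Python 2 would use unichr() here; Python 3 uses chr().
eacute = chr(name2codepoint["eacute"])
print(eacute, HTMLParser.__module__)
```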