Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | bpo-30011: Fixed race condition in HTMLParser.unescape(). (#1140) | Serhiy Storchaka | 2017-04-15 | 1 | -2/+3 |
| | |||||
* | #20288: fix handling of invalid numeric charrefs in HTMLParser. | Ezio Melotti | 2014-02-01 | 1 | -3/+3 |
| | |||||
* | #19480: HTMLParser now accepts all valid start-tag names as defined by the HTML5 standard. | Ezio Melotti | 2013-11-07 | 1 | -4/+7 |
* | #14538: HTMLParser can now correctly parse start tags that contain a bare /. | Ezio Melotti | 2012-04-19 | 1 | -3/+3 |
| | |||||
* | HTMLParser is now able to handle slashes in the start tag. | Ezio Melotti | 2012-02-21 | 1 | -5/+5 |
| | |||||
* | #13987: HTMLParser is now able to handle malformed start tags. | Ezio Melotti | 2012-02-15 | 1 | -4/+6 |
| | |||||
* | #13987: HTMLParser is now able to handle EOFs in the middle of a construct. | Ezio Melotti | 2012-02-15 | 1 | -3/+10 |
| | |||||
* | Fix an index, add more tests, avoid raising errors for unknown declarations, and clean up comments. | Ezio Melotti | 2012-02-13 | 1 | -2/+3 |
* | #13993: HTMLParser is now able to handle broken end tags. | Ezio Melotti | 2012-02-13 | 1 | -8/+26 |
| | |||||
* | #13960: HTMLParser is now able to handle broken comments. | Ezio Melotti | 2012-02-13 | 1 | -1/+35 |
| | |||||
* | #13358: HTMLParser now calls handle_data only once for each CDATA. | Ezio Melotti | 2011-11-18 | 1 | -3/+4 |
| | |||||
* | #1745761, #755670, #13357, #12629, #1200313: improve attribute handling in HTMLParser. | Ezio Melotti | 2011-11-14 | 1 | -9/+11 |
* | #670664: Fix HTMLParser to correctly handle the content of ``<script>...</script>`` and ``<style>...</style>``. | Ezio Melotti | 2011-11-01 | 1 | -4/+18 |
* | Fix display of html.parser.HTMLParser.feed docstring | Éric Araujo | 2011-05-25 | 1 | -1/+1 |
| | |||||
* | #7311: fix HTMLParser to accept non-ASCII attribute values. | Ezio Melotti | 2011-04-05 | 1 | -1/+1 |
| | |||||
* | Fix Issue10759: make HTMLParser.unescape() handle malformed charrefs. | Senthil Kumaran | 2010-12-28 | 1 | -7/+10 |
| | |||||
* | Issue #6662: Fix parsing of malformatted charref (&#bad;) | Victor Stinner | 2010-05-24 | 1 | -0/+3 |
| | |||||
* | revert creation of the html.entities and html.parser modules | Fred Drake | 2008-05-20 | 1 | -0/+387 |
| | | | | (http://bugs.python.org/issue2882) | ||||
* | rename HTMLParser to html.parser, htmlentitydefs to html.entities | Fred Drake | 2008-05-17 | 1 | -387/+0 |
| | | | | (http://bugs.python.org/issue2882) | ||||
* | Patch #912410: Replace HTML entity references for attribute values | Martin v. Löwis | 2007-03-06 | 1 | -6/+24 |
| | | | | in HTMLParser. | ||||
* | Reverting previous checkin. This breaks too much of HTMLParser to be applied | Georg Brandl | 2005-09-01 | 1 | -1/+1 |
| | | | | | without thought. Anyway, such malformed HTML is better handled by something like BeautifulSoup. | ||||
* | bug [ 761452 ] HTMLParser chokes on my.yahoo.com output | Georg Brandl | 2005-08-31 | 1 | -1/+1 |
| | |||||
* | remove unnecessary override of base class method | Fred Drake | 2004-09-08 | 1 | -13/+0 |
| | |||||
* | [Bug #921657] Allow '@' in unquoted HTML attributes. Not strictly legal according to the HTML REC, but HTMLParser is already a pretty loose parser. Reported by Bernd Zimmermann. | Andrew M. Kuchling | 2004-06-05 | 1 | -1/+1 |
* | Replace backticks with repr() or "%r" | Walter Dörwald | 2004-02-12 | 1 | -4/+4 |
| | | | | From SF patch #852334. | ||||
* | Accept commas in unquoted attribute values. | Fred Drake | 2003-03-14 | 1 | -1/+1 |
| | | | | This closes SF patch #669683. | ||||
* | Simplify code to remove an unnecessary test. | Fred Drake | 2002-05-14 | 1 | -2/+1 |
| | |||||
* | Convert to using string methods instead of the string module. | Fred Drake | 2001-12-03 | 1 | -29/+25 |
| | | | | | | In goahead(), use a bound version of rawdata.startswith() since we use the same method all the time and never change the value of rawdata. This can save a lot of bound method creation. | ||||
* | Re-factor the HTMLParser class to use the new markupbase.ParserBase class. | Fred Drake | 2001-09-24 | 1 | -305/+19 |
| | | | | | Use a new internal method, error(), consistently to raise parse errors; the new base class also uses this. | ||||
* | Whitespace normalization. | Tim Peters | 2001-09-18 | 1 | -1/+1 |
| | |||||
* | HTMLParser is allowed to be more strict than sgmllib, so let's not | Fred Drake | 2001-09-04 | 1 | -31/+16 |
| | | | | | change their basic behavior: When parsing something that cannot possibly be valid in either HTML or XHTML, raise an exception. | ||||
* | Added reasonable parsing of the DOCTYPE declaration, fixed edge cases | Fred Drake | 2001-09-04 | 1 | -12/+260 |
| | | | | regarding bare ampersands in content. | ||||
* | Deal more appropriately with bare ampersands and pointy brackets; this | Fred Drake | 2001-08-20 | 1 | -12/+12 |
| | | | | | | | | module has to deal with "classic" HTML-as-deployed as well as XHTML, so we cannot be as strict as XHTML allows. This closes SF bug #453059, but uses a different fix than suggested in the bug comments. | ||||
* | Change some comments into docstrings. | Fred Drake | 2001-08-03 | 1 | -27/+31 |
| | | | | | | Fix handling of hexadecimal character references (legal in XHTML) so that they are properly interpreted as character references. This fixes SF bug #445196. | ||||
* | Merge my changes to the offending comment with Guido's changes. | Fred Drake | 2001-05-23 | 1 | -6/+10 |
| | |||||
* | Removed incorrect comment left over from sgmllib.py. | Guido van Rossum | 2001-05-22 | 1 | -7/+7 |
| | |||||
* | A much improved HTML parser -- a replacement for sgmllib. The API is | Guido van Rossum | 2001-05-18 | 1 | -0/+432 |
| | | | | derived from but not quite compatible with that of sgmllib, so it's a new file. I suppose it needs documentation, and htmllib needs to be changed to use this instead of sgmllib, and sgmllib needs to be declared obsolete. But that can all be done later. This code was first published as part of TAL (part of Zope Page Templates), but that was strongly based on sgmllib anyway. Authors are Fred Drake and Guido van Rossum. | ||||
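
Several entries above describe parser behaviour that is easy to observe directly: entity references replaced in attribute values (patch #912410), the content of `<script>` handled as raw text (#670664, #13358), and hexadecimal character references interpreted as characters (SF bug #445196). The sketch below is not taken from any of these commits. It assumes Python 3, where the module is imported as `html.parser` rather than `HTMLParser`, and the names `EventCollector` and `parse` are hypothetical helpers introduced only for illustration.

```python
from html.parser import HTMLParser


class EventCollector(HTMLParser):
    """Hypothetical helper: records each parser callback as a tuple."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, attrs))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def parse(markup):
    """Feed a complete document to a fresh parser and return its events."""
    parser = EventCollector()
    parser.feed(markup)
    parser.close()  # flush any text the parser is still buffering
    return parser.events


# Entity reference inside an attribute value (compare patch #912410).
attr_events = parse('<a href="x?a=1&amp;b=2">link</a>')

# A bare "<" inside <script> is data, not the start of a tag (compare #670664).
script_events = parse("<script>if (a < b) { run(); }</script>")

# Hexadecimal and decimal character references (compare SF bug #445196).
charref_events = parse("<p>&#x41;&#66;</p>")

if __name__ == "__main__":
    for events in (attr_events, script_events, charref_events):
        print(events)
```

On a current Python 3 interpreter the `href` value should arrive in `handle_starttag` with `&amp;` already replaced by `&`, and the script body should be reported as data rather than as markup. Exact chunking of data across `handle_data` calls can differ between versions, so code that depends on it should join the pieces.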