path: root/Lib/HTMLParser.py
Commit message | Author | Age | Files | Lines
* Re-factor the HTMLParser class to use the new markupbase.ParserBase class. Use a new internal method, error(), consistently to raise parse errors; the new base class also uses this. | Fred Drake | 2001-09-24 | 1 | -305/+19
* Whitespace normalization. | Tim Peters | 2001-09-18 | 1 | -1/+1
* HTMLParser is allowed to be more strict than sgmllib, so let's not change their basic behavior: When parsing something that cannot possibly be valid in either HTML or XHTML, raise an exception. | Fred Drake | 2001-09-04 | 1 | -31/+16
* Added reasonable parsing of the DOCTYPE declaration, fixed edge cases regarding bare ampersands in content. | Fred Drake | 2001-09-04 | 1 | -12/+260
* Deal more appropriately with bare ampersands and pointy brackets; this module has to deal with "class" HTML-as-deployed as well as XHTML, so we cannot be as strict as XHTML allows. This closes SF bug #453059, but uses a different fix than suggested in the bug comments. | Fred Drake | 2001-08-20 | 1 | -12/+12
* Change some comments into docstrings. Fix handling of hexadecimal character references (legal in XHTML) so that they are properly interpreted as character references. This fixes SF bug #445196. | Fred Drake | 2001-08-03 | 1 | -27/+31
* Merge my changes to the offending comment with Guido's changes. | Fred Drake | 2001-05-23 | 1 | -6/+10
* Removed incorrect comment left over from sgmllib.py. | Guido van Rossum | 2001-05-22 | 1 | -7/+7
* A much improved HTML parser -- a replacement for sgmllib. The API is derived from but not quite compatible with that of sgmllib, so it's a new file. I suppose it needs documentation, and htmllib needs to be changed to use this instead of sgmllib, and sgmllib needs to be declared obsolete. But that can all be done later. This code was first published as part of TAL (part of Zope Page Templates), but that was strongly based on sgmllib anyway. Authors are Fred Drake and Guido van Rossum. | Guido van Rossum | 2001-05-18 | 1 | -0/+432