| Commit message (Collapse) | Author | Age | Files | Lines |
|
without thought. Anyway, such malformed HTML is better handled by something
like BeautifulSoup.
|
according to the HTML REC, but HTMLParser is already a pretty loose parser. Reported by Bernd Zimmermann.
|
From SF patch #852334.
|
This closes SF patch #669683.
|
In goahead(), use a bound version of rawdata.startswith() since we use the
same method all the time and never change the value of rawdata. This can
save a lot of bound method creation.
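
The optimization described above can be sketched roughly as follows. This is a minimal illustration and not the actual goahead() code: the function name count_markup_starts and its loop are hypothetical, and only the idea of binding rawdata.startswith once comes from the commit message.

```python
def count_markup_starts(rawdata):
    """Count the positions in rawdata that start with '<' (hypothetical helper)."""
    # Look the method up once. rawdata is never rebound inside the loop,
    # so the same bound method can be reused on every iteration instead
    # of creating a new bound method object each time.
    startswith = rawdata.startswith
    count = 0
    for i in range(len(rawdata)):
        if startswith("<", i):
            count += 1
    return count
```

The saving comes from skipping the repeated attribute lookup, so it only matters in loops that call the method many times on the same object.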
|
Use a new internal method, error(), consistently to raise parse errors;
the new base class also uses this.
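
The pattern of routing every parse failure through one error() method might look like the sketch below. The class names SketchParseError and SketchParser and the check_tag_name method are assumptions made for illustration; they are not the module's real classes or its real validation logic.

```python
class SketchParseError(Exception):
    """Hypothetical exception type carrying the line of the failure."""

    def __init__(self, message, lineno):
        super().__init__("%s, at line %d" % (message, lineno))
        self.lineno = lineno


class SketchParser:
    """Hypothetical parser showing a single point where errors are raised."""

    def __init__(self):
        self.lineno = 1

    def error(self, message):
        # Every parse failure goes through here, so position information
        # is attached in one place and subclasses can override the policy.
        raise SketchParseError(message, self.lineno)

    def check_tag_name(self, name):
        # Illustrative check only: a tag name must start with a letter.
        if not name or not name[0].isalpha():
            self.error("malformed tag name: %r" % (name,))
        return name.lower()
```

Keeping the raise in one method means a base class and its subclasses report errors the same way without duplicating the bookkeeping.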
|
change their basic behavior: When parsing something that cannot possibly
be valid in either HTML or XHTML, raise an exception.
|
regarding bare ampersands in content.
|
module has to deal with "classic" HTML-as-deployed as well as XHTML, so we
cannot be as strict as XHTML allows.
This closes SF bug #453059, but uses a different fix than suggested in
the bug comments.
|
Fix handling of hexadecimal character references (legal in XHTML) so that
they are properly interpreted as character references.
This fixes SF bug #445196.
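
For context, a decimal reference such as &#65; and a hexadecimal one such as &#x41; name the same character. A minimal sketch of decoding both forms is shown here; the helper decode_charref is hypothetical, takes the text between "&#" and ";", and is not the code used in the module.

```python
def decode_charref(name):
    """Decode the body of a numeric character reference (hypothetical helper).

    name is the text between '&#' and ';', for example '65' or 'x41'.
    """
    if name[:1] in ("x", "X"):
        # Hexadecimal form: drop the leading marker and parse as base 16.
        return chr(int(name[1:], 16))
    # Decimal form.
    return chr(int(name, 10))
```

With this sketch, decode_charref("65") and decode_charref("x41") both return "A".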
|
derived from but not quite compatible with that of sgmllib, so it's a
new file. I suppose it needs documentation, and htmllib needs to be
changed to use this instead of sgmllib, and sgmllib needs to be
declared obsolete. But that can all be done later.
This code was first published as part of TAL (part of Zope Page
Templates), but that was strongly based on sgmllib anyway. Authors
are Fred Drake and Guido van Rossum.
|