| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | Although it's hard to be sure, I *think* this is a working conversion | Guido van Rossum | 1997-10-23 | 1 | -67/+67 |
| | from regex to re style regular expressions. This should make sgmllib and htmllib threadsafe, so I can now create a threaded version of webchecker... | | | | |
* | (sgmllib.py): Partial acceptance of patch from David Leonard | Fred Drake | 1996-12-16 | 1 | -1/+1 |
| | <leonard@dstc.edu.au>; allows hyphen and period in the middle of attribute names. Still not allowed as first character; as first character these are illegal in the Reference Concrete Syntax, and we've not identified any use of these characters as the first char in an attribute name in deployment on the web. | | | | |
* | Reformatted with 4-space tab stops. | Guido van Rossum | 1996-03-28 | 1 | -286/+406 |
| | Allow '=' and '~' in unquoted attribute values. Added overridable methods handle_starttag(tag, method, attrs) and handle_endtag(tag, method) so subclasses can decide whether they really want to call the method (e.g. when suppressing some portion of the document). Added support for a number of SGML shortcuts (shorthand -> full notation): <tag>...<>... -> <tag>...<tag>...; <tag>...</> -> <tag>...</tag>; <tag/.../ -> <tag>...</tag>; <tag1<tag2> -> <tag1><tag2>; </tag1</tag2> -> </tag1></tag2>; </tag1<tag2> -> </tag1><tag2>. This required factoring out some common actions and rationalizing the interface to parse_endtag(), so as to make the code more readable. Fixed syntax for &entity and &#char references so the trailing semicolon is optional; removed explicit support for trailing period (which was a TBL mistake in HTML 0.0). Generalized the test program. Tried to speed things up a little. (More to come after the profile results are in.) Fix error recovery: call the end methods popped from the stack instead of the one that triggers. (Plus some complications because of the way HTML extensions are handled in Grail.) | | | | |
* | typos in attrfind regex | Guido van Rossum | 1995-10-06 | 1 | -1/+1 |
* | allow _ in attr names (Netscape!) | Guido van Rossum | 1995-09-30 | 1 | -1/+1 |
* | fix <!...!> parsing; added verbose option; don't lowercase entityrefs | Guido van Rossum | 1995-09-22 | 1 | -5/+7 |
* | support value-less attributes, using regex.group() | Guido van Rossum | 1995-09-01 | 1 | -14/+8 |
* | added note about missing features | Guido van Rossum | 1995-08-10 | 1 | -0/+2 |
* | changed comment parsing | Guido van Rossum | 1995-08-04 | 1 | -13/+14 |
* | make reporting unbalanced tags an overridable method | Guido van Rossum | 1995-06-22 | 1 | -2/+7 |
* | remove redundant backslashes; some cosnetics | Guido van Rossum | 1995-03-04 | 1 | -9/+10 |
* | added html parser and supporting cast | Guido van Rossum | 1995-02-27 | 1 | -0/+321 |