Commit message

Remove unnecessary redirection limit code that is already in FancyURLopener.
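FancyURLopener already caps the number of redirects it will follow, so a robots.txt fetcher built on it needs no redirection counter of its own. A minimal sketch of such a fetcher, assuming an older Python where urllib.request.FancyURLopener still exists (it is deprecated and has been removed in the newest releases); the RobotsOpener name is illustrative, not the stdlib class:

    import urllib.request

    class RobotsOpener(urllib.request.FancyURLopener):
        """Opener that remembers the last HTTP error code.

        Redirect handling, including the built-in cap on how many
        redirects are followed, is inherited from FancyURLopener,
        so no extra redirection-limit code is needed here.
        """

        def __init__(self, *args):
            super().__init__(*args)
            self.errcode = 200

        def http_error_default(self, url, fp, errcode, errmsg, headers):
            self.errcode = errcode
            return super().http_error_default(url, fp, errcode, errmsg, headers)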
- Use substring search, not re search, for user agents and paths.
- Consider the * entry last. Unquote, then requote URLs.
- Treat an empty Disallow as "allow everything".
Add test cases. Fixes #523041.
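A compact sketch of the matching behaviour described above, in modern Python; the class and function names are illustrative rather than the stdlib implementation:

    from urllib.parse import quote, unquote

    class RuleLine:
        """One Allow/Disallow line: a path prefix plus an allowance flag."""

        def __init__(self, path, allowance):
            if path == "" and not allowance:
                allowance = True          # empty Disallow means "allow everything"
            # unquote, then requote, so comparisons use one canonical form
            self.path = path if path == "*" else quote(unquote(path))
            self.allowance = allowance

        def applies_to(self, path):
            # plain prefix test, not a regex search
            return self.path == "*" or path.startswith(self.path)

    class Entry:
        """One robots.txt record: user agents plus their rule lines."""

        def __init__(self):
            self.useragents = []
            self.rulelines = []

        def applies_to(self, useragent):
            # substring comparison on the agent's short name, case-insensitive
            name = useragent.split("/")[0].lower()
            return any(a == "*" or a.lower() in name for a in self.useragents)

        def allowance(self, path):
            for line in self.rulelines:
                if line.applies_to(path):
                    return line.allowance
            return True

    def can_fetch(entries, useragent, path):
        """Check specific entries first; the '*' entry is consulted last."""
        path = quote(unquote(path)) or "/"
        default = None
        for entry in entries:
            if entry.useragents == ["*"]:
                default = entry              # save the catch-all for last
            elif entry.applies_to(useragent):
                return entry.allowance(path)
        return default.allowance(path) if default else True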
* restores urllib as the file fetcher (closes bug #132000)
* allows checking URLs with empty paths (closes patches #103511 and #103721)
* properly handles user agents with versions (e.g., SpamMeister/1.5)
* adds several more tests
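A sketch of those three behaviours, using today's urllib.request and urllib.parse names rather than the urllib of that era; the helper names are hypothetical:

    import urllib.error
    import urllib.parse
    import urllib.request

    def read_robots(url):
        """Fetch robots.txt with urllib; return its lines, or None if access is denied."""
        try:
            with urllib.request.urlopen(url) as f:
                return f.read().decode("utf-8", "replace").splitlines()
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                return None   # forbidden: treat as "disallow everything"
            return []         # e.g. 404: no robots.txt, so nothing is disallowed

    def normalized_path(url):
        """An empty path (e.g. "http://example.com") is checked as "/"."""
        parsed = urllib.parse.urlparse(url)
        return urllib.parse.quote(urllib.parse.unquote(parsed.path)) or "/"

    def agent_rule_applies(rule_agent, useragent):
        """A rule for "spammeister" also covers "SpamMeister/1.5"."""
        name = useragent.split("/")[0].lower()
        return rule_agent == "*" or rule_agent.lower() in name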
Added __all__ lists; added a test script and expected output file as well.
This closes patch 103297.
__all__ attributes will be added to other modules without first submitting
a patch, just by adding the necessary line to the test script to verify a
more-or-less correct implementation.
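A simplified stand-in for the kind of check such a test script performs; check_all is a hypothetical helper here, and the module now lives at urllib.robotparser:

    import urllib.robotparser   # "robotparser" at the time of this commit

    def check_all(module, expected):
        """Verify that `from <module> import *` exposes exactly the expected names."""
        namespace = {}
        exec("from %s import *" % module.__name__, namespace)
        exported = sorted(n for n in namespace if not n.startswith("_"))
        assert exported == sorted(expected), exported

    check_all(urllib.robotparser, ["RobotFileParser"])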
The robotparser.py module currently lives in Tools/webchecker. In
preparation for its migration to Lib, I made the following changes:
* renamed the test() function to _test()
* corrected the URLs in _test() so they refer to actual documents
* added an "if __name__ == '__main__'" catcher to invoke _test() when run
  as a main program
* added docstrings for the two main methods, parse and can_fetch
* replaced usage of regsub and regex with the corresponding re code
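A brief sketch of the last two items, with illustrative line content; the old regsub call is paraphrased in a comment:

    import re

    def _test():
        """Module self-test; the real one fetches live robots.txt documents."""
        line = "Disallow: /private   # staff only"
        # previously something like regsub.sub('#.*', '', line); now the re module:
        line = re.sub(r"#.*", "", line).strip()
        field, value = [part.strip() for part in line.split(":", 1)]
        assert field.lower() == "disallow" and value == "/private"
        print("ok")

    if __name__ == "__main__":
        _test()   # only runs when the module is executed as a main program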