* Repair the broken link to norobots-rfc.txt.
* HTTP response codes >= 500 are treated as a failed read rather than as a
  "not found". Not found means that we can assume the entire site is
  allowed; a 5xx server error tells us nothing.
* A successful read() or parse() updates the mtime (which is defined to be
  "the time the robots.txt file was last fetched").
* The can_fetch() method returns False unless we've had a read() with a 2xx
  or 4xx response. This avoids false positives in the case where a user
  calls can_fetch() before calling read().
* I don't see any easy way to test this patch without hitting internet
  resources that might change, or without using mock objects that wouldn't
  provide much reassurance.
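
A minimal sketch of the resulting behavior, using the Python 3 module path
urllib.robotparser (the bot name and example.com URLs are illustrative):

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser("https://example.com/robots.txt")

    # No fetch has happened yet, so can_fetch() conservatively reports
    # False rather than guessing that the whole site is allowed.
    print(rp.can_fetch("MyBot/1.0", "https://example.com/page.html"))  # False

    rp.read()          # a 2xx or 4xx response settles the allow/disallow
                       # state; a 5xx response leaves can_fetch() at False
    print(rp.mtime())  # the time the robots.txt file was last fetched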
This helps in handling certain types of invalid URLs in a conservative manner.
Merged revisions 83238 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
  r83238 | georg.brandl | 2010-07-29 19:55:01 +0200 (Thu, 29 Jul 2010) | 1 line

  #4108: the first default entry (User-agent: *) wins.
........
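
A small demonstration of the rule (robots.txt content and bot name are
invented): when two User-agent: * groups are present, only the first one is
consulted.

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: /private/",
        "",
        "User-agent: *",
        "Disallow:",
    ])

    # The first default entry wins, so /private/ stays disallowed even
    # though the second entry would allow everything.
    print(rp.can_fetch("AnyBot", "https://example.com/private/page"))  # False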
Merged revisions 83209 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k
........
  r83209 | senthil.kumaran | 2010-07-28 21:57:56 +0530 (Wed, 28 Jul 2010) | 3 lines

  Fix Issue6325 - robotparser to honor URLs with query strings.
........
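
A sketch of what the fix enables (example URLs invented): rules whose paths
carry a query string now take part in matching.

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Disallow: /some/path?name=value",
    ])

    print(rp.can_fetch("*", "http://example.com/some/path?name=value"))  # False
    print(rp.can_fetch("*", "http://example.com/some/path"))             # True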
Adds test cases which use Allow: as well.
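
For instance (robots.txt contents invented for illustration), an Allow:
line can open up a subtree inside a broader Disallow:

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.parse([
        "User-agent: *",
        "Allow: /public/",
        "Disallow: /",
    ])

    print(rp.can_fetch("*", "http://example.com/public/page.html"))  # True
    print(rp.can_fetch("*", "http://example.com/secret.html"))       # False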
Remove tests that are no longer needed (a better set is available in
Lib/test/test_robotparser.py). Clean up a few PEP 8 nits (compound
statements on a single line, whitespace around operators).
concatenation in robotparser.
(Contributed by George Yoshida.)
not updated after 2.2).
unnecessary redirection limit code which is already in FancyURLopener.
- Use substring search, not re search, for user-agent and paths.
- Consider the * entry last. Unquote, then requote URLs.
- Treat an empty Disallow as "allow everything".
Add test cases. Fixes #523041.
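
Two of these rules are easy to demonstrate with the Python 3 module path
urllib.robotparser (robots.txt contents and bot names invented): an empty
Disallow allows everything, and a specific user-agent entry is consulted
before the * entry.

    import urllib.robotparser

    # An empty Disallow means "allow everything".
    rp = urllib.robotparser.RobotFileParser()
    rp.parse(["User-agent: *", "Disallow:"])
    print(rp.can_fetch("AnyBot", "http://example.com/anything"))  # True

    # A matching specific entry takes precedence over the * entry.
    rp = urllib.robotparser.RobotFileParser()
    rp.parse([
        "User-agent: GoodBot",
        "Disallow:",
        "",
        "User-agent: *",
        "Disallow: /",
    ])
    print(rp.can_fetch("GoodBot", "http://example.com/page"))   # True
    print(rp.can_fetch("OtherBot", "http://example.com/page"))  # False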
* restores urllib as the file fetcher (closes bug #132000)
* allows checking URLs with empty paths (closes patches #103511 and #103721)
* properly handles user agents with versions (e.g., SpamMeister/1.5)
* added several more tests
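
For example, an entry for SpamMeister also matches the versioned agent
string from the commit message (sketched against the current Python 3
module path urllib.robotparser; the commit itself predates the urllib
package):

    import urllib.robotparser

    rp = urllib.robotparser.RobotFileParser()
    rp.parse(["User-agent: SpamMeister", "Disallow: /"])

    # The /1.5 version suffix is stripped before matching.
    print(rp.can_fetch("SpamMeister/1.5", "http://example.com/"))  # False

    # A URL with an empty path is treated as "/".
    print(rp.can_fetch("SpamMeister/1.5", "http://example.com"))   # False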
Added an __all__ attribute to robotparser; added a test script and expected
output file as well. This closes patch 103297. __all__ attributes will be
added to other modules without first submitting a patch, just adding the
necessary line to the test script to verify a more-or-less correct
implementation.
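
A minimal version of the kind of check such a test script performs, shown
against the Python 3 module path:

    import urllib.robotparser as mod

    # Every name exported via __all__ should exist in the module.
    for name in mod.__all__:
        assert hasattr(mod, name), name

    print(mod.__all__)  # ['RobotFileParser']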
The robotparser.py module currently lives in Tools/webchecker. In
preparation for its migration to Lib, I made the following changes:
* renamed the test() function to _test()
* corrected the URLs in _test() so they refer to actual documents
* added an "if __name__ == '__main__'" catcher to invoke _test()
  when run as a main program
* added docstrings for the two main methods, parse and can_fetch
* replaced usage of regsub and regex with corresponding re code
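
The catcher mentioned above is the standard idiom, sketched here with a
stub _test():

    def _test():
        # exercise the parser against a few known documents
        ...

    if __name__ == '__main__':
        _test()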