Diffstat (limited to 'Doc')
-rw-r--r-- Doc/library/urllib.robotparser.rst 30
-rw-r--r-- Doc/whatsnew/3.6.rst 8
2 files changed, 36 insertions, 2 deletions
diff --git a/Doc/library/urllib.robotparser.rst b/Doc/library/urllib.robotparser.rst
index f179de2..c2e1bef 100644
--- a/Doc/library/urllib.robotparser.rst
+++ b/Doc/library/urllib.robotparser.rst
@@ -53,15 +53,41 @@ structure of :file:`robots.txt` files, see http://www.robotstxt.org/orig.html.
Sets the time the ``robots.txt`` file was last fetched to the current
time.
+ .. method:: crawl_delay(useragent)
-The following example demonstrates basic use of the RobotFileParser class.
+ Returns the value of the ``Crawl-delay`` parameter from ``robots.txt``
+ for the given *useragent*. Returns ``None`` if there is no such
+ parameter, if it does not apply to the given *useragent*, or if the
+ ``robots.txt`` entry for this parameter has invalid syntax.
+
+ .. versionadded:: 3.6
+
+ .. method:: request_rate(useragent)
+
+ Returns the contents of the ``Request-rate`` parameter from
+ ``robots.txt`` as a :func:`~collections.namedtuple`
+ ``(requests, seconds)``. Returns ``None`` if there is no such parameter,
+ if it does not apply to the given *useragent*, or if the ``robots.txt``
+ entry for this parameter has invalid syntax.
+
+ .. versionadded:: 3.6
+
+
+The following example demonstrates basic use of the :class:`RobotFileParser`
+class::
>>> import urllib.robotparser
>>> rp = urllib.robotparser.RobotFileParser()
>>> rp.set_url("http://www.musi-cal.com/robots.txt")
>>> rp.read()
+ >>> rrate = rp.request_rate("*")
+ >>> rrate.requests
+ 3
+ >>> rrate.seconds
+ 20
+ >>> rp.crawl_delay("*")
+ 6
>>> rp.can_fetch("*", "http://www.musi-cal.com/cgi-bin/search?city=San+Francisco")
False
>>> rp.can_fetch("*", "http://www.musi-cal.com/")
True
-
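The two methods added in the hunk above can also be tried without network access. The following is a minimal sketch, not part of the commit: it assumes a hypothetical ``robots.txt`` body and a hypothetical ``examplebot`` user agent, and feeds the lines to the existing ``parse()`` method instead of calling ``read()``. A named user agent is used rather than ``*`` so that the sketch does not depend on how wildcard entries are looked up.

```python
import urllib.robotparser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: examplebot
Crawl-delay: 6
Request-rate: 3/20
Disallow: /private/
"""

rp = urllib.robotparser.RobotFileParser()
# parse() takes an iterable of lines, so nothing is fetched over the network.
rp.parse(ROBOTS_TXT.splitlines())

# Crawl-delay value for the matching entry, as an int.
delay = rp.crawl_delay("examplebot")

# Request-rate as a named tuple with ``requests`` and ``seconds`` fields.
rate = rp.request_rate("examplebot")

# No entry applies to this (also hypothetical) agent, so the result is None.
missing = rp.crawl_delay("otherbot")
```

A crawler would typically check the results for ``None`` before using them, for example by sleeping ``delay`` seconds between requests only when ``delay`` is not ``None``.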
diff --git a/Doc/whatsnew/3.6.rst b/Doc/whatsnew/3.6.rst
index dd35c9a..3080820 100644
--- a/Doc/whatsnew/3.6.rst
+++ b/Doc/whatsnew/3.6.rst
@@ -119,6 +119,14 @@ datetime
(Contributed by Ashley Anderson in :issue:`12006`.)
+urllib.robotparser
+------------------
+
+:class:`~urllib.robotparser.RobotFileParser` now supports ``Crawl-delay`` and
+``Request-rate`` extensions.
+(Contributed by Nikolay Bogoychev in :issue:`16099`.)
+
+
Optimizations
=============