summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorMartin Panter <vadmium+py@gmail.com>2016-08-27 01:39:26 (GMT)
committerMartin Panter <vadmium+py@gmail.com>2016-08-27 01:39:26 (GMT)
commitef91bb26604ddfae22aac56b3cfdaabf237db37a (patch)
tree491cbeadc0488a4e203de39ca42f8b132653195f
parent8f96a30630b5f6a984af3ee53c63bce3b16077e0 (diff)
downloadcpython-ef91bb26604ddfae22aac56b3cfdaabf237db37a.zip
cpython-ef91bb26604ddfae22aac56b3cfdaabf237db37a.tar.gz
cpython-ef91bb26604ddfae22aac56b3cfdaabf237db37a.tar.bz2
Issue #12319: Always send file request bodies using chunked encoding
The previous attempt to determine the file’s Content-Length gave a false positive for pipes on Windows. Also, drop the special case for sending zero-length iterable bodies.
-rw-r--r--Doc/library/http.client.rst31
-rw-r--r--Doc/library/urllib.request.rst15
-rw-r--r--Doc/whatsnew/3.6.rst11
-rw-r--r--Lib/http/client.py31
-rw-r--r--Lib/test/test_httplib.py27
-rw-r--r--Lib/test/test_urllib2.py56
-rw-r--r--Misc/NEWS7
7 files changed, 96 insertions, 82 deletions
diff --git a/Doc/library/http.client.rst b/Doc/library/http.client.rst
index 9429fb6..d1b4450 100644
--- a/Doc/library/http.client.rst
+++ b/Doc/library/http.client.rst
@@ -240,17 +240,17 @@ HTTPConnection Objects
The *headers* argument should be a mapping of extra HTTP headers to send
with the request.
- If *headers* contains neither Content-Length nor Transfer-Encoding, a
- Content-Length header will be added automatically if possible. If
+ If *headers* contains neither Content-Length nor Transfer-Encoding,
+ but there is a request body, one of those
+ header fields will be added automatically. If
*body* is ``None``, the Content-Length header is set to ``0`` for
methods that expect a body (``PUT``, ``POST``, and ``PATCH``). If
- *body* is a string or bytes-like object, the Content-Length header is
- set to its length. If *body* is a binary :term:`file object`
- supporting :meth:`~io.IOBase.seek`, this will be used to determine
- its size. Otherwise, the Content-Length header is not added
- automatically. In cases where determining the Content-Length up
- front is not possible, the body will be chunk-encoded and the
- Transfer-Encoding header will automatically be set.
+ *body* is a string or a bytes-like object that is not also a
+ :term:`file <file object>`, the Content-Length header is
+ set to its length. Any other type of *body* (files
+ and iterables in general) will be chunk-encoded, and the
+ Transfer-Encoding header will automatically be set instead of
+ Content-Length.
The *encode_chunked* argument is only relevant if Transfer-Encoding is
specified in *headers*. If *encode_chunked* is ``False``, the
@@ -260,19 +260,18 @@ HTTPConnection Objects
.. note::
Chunked transfer encoding has been added to the HTTP protocol
version 1.1. Unless the HTTP server is known to handle HTTP 1.1,
- the caller must either specify the Content-Length or must use a
- body representation whose length can be determined automatically.
+ the caller must either specify the Content-Length, or must pass a
+ :class:`str` or bytes-like object that is not also a file as the
+ body representation.
.. versionadded:: 3.2
*body* can now be an iterable.
.. versionchanged:: 3.6
If neither Content-Length nor Transfer-Encoding are set in
- *headers* and Content-Length cannot be determined, *body* will now
- be automatically chunk-encoded. The *encode_chunked* argument
- was added.
- The Content-Length for binary file objects is determined with seek.
- No attempt is made to determine the Content-Length for text file
+ *headers*, file and iterable *body* objects are now chunk-encoded.
+ The *encode_chunked* argument was added.
+ No attempt is made to determine the Content-Length for file
objects.
.. method:: HTTPConnection.getresponse()
diff --git a/Doc/library/urllib.request.rst b/Doc/library/urllib.request.rst
index e619cc1..d288165 100644
--- a/Doc/library/urllib.request.rst
+++ b/Doc/library/urllib.request.rst
@@ -187,12 +187,11 @@ The following classes are provided:
server, or ``None`` if no such data is needed. Currently HTTP
requests are the only ones that use *data*. The supported object
types include bytes, file-like objects, and iterables. If no
- ``Content-Length`` header has been provided, :class:`HTTPHandler` will
- try to determine the length of *data* and set this header accordingly.
- If this fails, ``Transfer-Encoding: chunked`` as specified in
- :rfc:`7230`, Section 3.3.1 will be used to send the data. See
- :meth:`http.client.HTTPConnection.request` for details on the
- supported object types and on how the content length is determined.
+ ``Content-Length`` nor ``Transfer-Encoding`` header field
+ has been provided, :class:`HTTPHandler` will set these headers according
+ to the type of *data*. ``Content-Length`` will be used to send
+ bytes objects, while ``Transfer-Encoding: chunked`` as specified in
+ :rfc:`7230`, Section 3.3.1 will be used to send files and other iterables.
For an HTTP POST request method, *data* should be a buffer in the
standard :mimetype:`application/x-www-form-urlencoded` format. The
@@ -256,8 +255,8 @@ The following classes are provided:
.. versionchanged:: 3.6
Do not raise an error if the ``Content-Length`` has not been
- provided and could not be determined. Fall back to use chunked
- transfer encoding instead.
+ provided and *data* is neither ``None`` nor a bytes object.
+ Fall back to use chunked transfer encoding instead.
.. class:: OpenerDirector()
diff --git a/Doc/whatsnew/3.6.rst b/Doc/whatsnew/3.6.rst
index ff3861b..6ef82d4 100644
--- a/Doc/whatsnew/3.6.rst
+++ b/Doc/whatsnew/3.6.rst
@@ -579,8 +579,8 @@ The :class:`~unittest.mock.Mock` class has the following improvements:
urllib.request
--------------
-If a HTTP request has a non-empty body but no Content-Length header
-and the content length cannot be determined up front, rather than
+If a HTTP request has a file or iterable body (other than a
+bytes object) but no Content-Length header, rather than
throwing an error, :class:`~urllib.request.AbstractHTTPHandler` now
falls back to use chunked transfer encoding.
(Contributed by Demian Brecht and Rolf Krahl in :issue:`12319`.)
@@ -935,6 +935,13 @@ Changes in the Python API
This behavior has also been backported to earlier Python versions
by Setuptools 26.0.0.
+* In the :mod:`urllib.request` module and the
+ :meth:`http.client.HTTPConnection.request` method, if no Content-Length
+ header field has been specified and the request body is a file object,
+ it is now sent with HTTP 1.1 chunked encoding. If a file object has to
+ be sent to a HTTP 1.0 server, the Content-Length value now has to be
+ specified by the caller. See :issue:`12319`.
+
Changes in the C API
--------------------
diff --git a/Lib/http/client.py b/Lib/http/client.py
index b242ba6..9d5cf45 100644
--- a/Lib/http/client.py
+++ b/Lib/http/client.py
@@ -805,35 +805,21 @@ class HTTPConnection:
def _get_content_length(body, method):
"""Get the content-length based on the body.
- If the body is "empty", we set Content-Length: 0 for methods
- that expect a body (RFC 7230, Section 3.3.2). If the body is
- set for other methods, we set the header provided we can
- figure out what the length is.
+ If the body is None, we set Content-Length: 0 for methods that expect
+ a body (RFC 7230, Section 3.3.2). We also set the Content-Length for
+ any method if the body is a str or bytes-like object and not a file.
"""
- if not body:
+ if body is None:
# do an explicit check for not None here to distinguish
# between unset and set but empty
- if method.upper() in _METHODS_EXPECTING_BODY or body is not None:
+ if method.upper() in _METHODS_EXPECTING_BODY:
return 0
else:
return None
if hasattr(body, 'read'):
# file-like object.
- if HTTPConnection._is_textIO(body):
- # text streams are unpredictable because it depends on
- # character encoding and line ending translation.
- return None
- else:
- # Is it seekable?
- try:
- curpos = body.tell()
- sz = body.seek(0, io.SEEK_END)
- except (TypeError, AttributeError, OSError):
- return None
- else:
- body.seek(curpos)
- return sz - curpos
+ return None
try:
# does it implement the buffer protocol (bytes, bytearray, array)?
@@ -1266,8 +1252,7 @@ class HTTPConnection:
# the caller passes encode_chunked=True or the following
# conditions hold:
# 1. content-length has not been explicitly set
- # 2. the length of the body cannot be determined
- # (e.g. it is a generator or unseekable file)
+ # 2. the body is a file or iterable, but not a str or bytes-like
# 3. Transfer-Encoding has NOT been explicitly set by the caller
if 'content-length' not in header_names:
@@ -1280,7 +1265,7 @@ class HTTPConnection:
encode_chunked = False
content_length = self._get_content_length(body, method)
if content_length is None:
- if body:
+ if body is not None:
if self.debuglevel > 0:
print('Unable to determine size of %r' % body)
encode_chunked = True
diff --git a/Lib/test/test_httplib.py b/Lib/test/test_httplib.py
index a179612..359e0bb 100644
--- a/Lib/test/test_httplib.py
+++ b/Lib/test/test_httplib.py
@@ -381,6 +381,16 @@ class TransferEncodingTest(TestCase):
# same request
self.assertNotIn('content-length', [k.lower() for k in headers])
+ def test_empty_body(self):
+ # Zero-length iterable should be treated like any other iterable
+ conn = client.HTTPConnection('example.com')
+ conn.sock = FakeSocket(b'')
+ conn.request('POST', '/', ())
+ _, headers, body = self._parse_request(conn.sock.data)
+ self.assertEqual(headers['Transfer-Encoding'], 'chunked')
+ self.assertNotIn('content-length', [k.lower() for k in headers])
+ self.assertEqual(body, b"0\r\n\r\n")
+
def _make_body(self, empty_lines=False):
lines = self.expected_body.split(b' ')
for idx, line in enumerate(lines):
@@ -652,7 +662,9 @@ class BasicTest(TestCase):
def test_send_file(self):
expected = (b'GET /foo HTTP/1.1\r\nHost: example.com\r\n'
- b'Accept-Encoding: identity\r\nContent-Length:')
+ b'Accept-Encoding: identity\r\n'
+ b'Transfer-Encoding: chunked\r\n'
+ b'\r\n')
with open(__file__, 'rb') as body:
conn = client.HTTPConnection('example.com')
@@ -1717,7 +1729,7 @@ class RequestBodyTest(TestCase):
self.assertEqual("5", message.get("content-length"))
self.assertEqual(b'body\xc1', f.read())
- def test_file_body(self):
+ def test_text_file_body(self):
self.addCleanup(support.unlink, support.TESTFN)
with open(support.TESTFN, "w") as f:
f.write("body")
@@ -1726,10 +1738,8 @@ class RequestBodyTest(TestCase):
message, f = self.get_headers_and_fp()
self.assertEqual("text/plain", message.get_content_type())
self.assertIsNone(message.get_charset())
- # Note that the length of text files is unpredictable
- # because it depends on character encoding and line ending
- # translation. No content-length will be set, the body
- # will be sent using chunked transfer encoding.
+ # No content-length will be determined for files; the body
+ # will be sent using chunked transfer encoding instead.
self.assertIsNone(message.get("content-length"))
self.assertEqual("chunked", message.get("transfer-encoding"))
self.assertEqual(b'4\r\nbody\r\n0\r\n\r\n', f.read())
@@ -1743,8 +1753,9 @@ class RequestBodyTest(TestCase):
message, f = self.get_headers_and_fp()
self.assertEqual("text/plain", message.get_content_type())
self.assertIsNone(message.get_charset())
- self.assertEqual("5", message.get("content-length"))
- self.assertEqual(b'body\xc1', f.read())
+ self.assertEqual("chunked", message.get("Transfer-Encoding"))
+ self.assertNotIn("Content-Length", message)
+ self.assertEqual(b'5\r\nbody\xc1\r\n0\r\n\r\n', f.read())
class HTTPResponseTest(TestCase):
diff --git a/Lib/test/test_urllib2.py b/Lib/test/test_urllib2.py
index 0eea0c7..34329f8 100644
--- a/Lib/test/test_urllib2.py
+++ b/Lib/test/test_urllib2.py
@@ -913,40 +913,50 @@ class HandlerTests(unittest.TestCase):
self.assertEqual(req.unredirected_hdrs["Spam"], "foo")
def test_http_body_file(self):
- # A regular file - Content Length is calculated unless already set.
+ # A regular file - chunked encoding is used unless Content Length is
+ # already set.
h = urllib.request.AbstractHTTPHandler()
o = h.parent = MockOpener()
file_obj = tempfile.NamedTemporaryFile(mode='w+b', delete=False)
file_path = file_obj.name
- file_obj.write(b"Something\nSomething\nSomething\n")
file_obj.close()
+ self.addCleanup(os.unlink, file_path)
- for headers in {}, {"Content-Length": 30}:
- with open(file_path, "rb") as f:
- req = Request("http://example.com/", f, headers)
- newreq = h.do_request_(req)
- self.assertEqual(int(newreq.get_header('Content-length')), 30)
+ with open(file_path, "rb") as f:
+ req = Request("http://example.com/", f, {})
+ newreq = h.do_request_(req)
+ te = newreq.get_header('Transfer-encoding')
+ self.assertEqual(te, "chunked")
+ self.assertFalse(newreq.has_header('Content-length'))
- os.unlink(file_path)
+ with open(file_path, "rb") as f:
+ req = Request("http://example.com/", f, {"Content-Length": 30})
+ newreq = h.do_request_(req)
+ self.assertEqual(int(newreq.get_header('Content-length')), 30)
+ self.assertFalse(newreq.has_header("Transfer-encoding"))
def test_http_body_fileobj(self):
- # A file object - Content Length is calculated unless already set.
+ # A file object - chunked encoding is used
+ # unless Content Length is already set.
# (Note that there are some subtle differences to a regular
# file, that is why we are testing both cases.)
h = urllib.request.AbstractHTTPHandler()
o = h.parent = MockOpener()
-
file_obj = io.BytesIO()
- file_obj.write(b"Something\nSomething\nSomething\n")
- for headers in {}, {"Content-Length": 30}:
- file_obj.seek(0)
- req = Request("http://example.com/", file_obj, headers)
- newreq = h.do_request_(req)
- self.assertEqual(int(newreq.get_header('Content-length')), 30)
+ req = Request("http://example.com/", file_obj, {})
+ newreq = h.do_request_(req)
+ self.assertEqual(newreq.get_header('Transfer-encoding'), 'chunked')
+ self.assertFalse(newreq.has_header('Content-length'))
+
+ headers = {"Content-Length": 30}
+ req = Request("http://example.com/", file_obj, headers)
+ newreq = h.do_request_(req)
+ self.assertEqual(int(newreq.get_header('Content-length')), 30)
+ self.assertFalse(newreq.has_header("Transfer-encoding"))
file_obj.close()
@@ -959,9 +969,7 @@ class HandlerTests(unittest.TestCase):
h = urllib.request.AbstractHTTPHandler()
o = h.parent = MockOpener()
- cmd = [sys.executable, "-c",
- r"import sys; "
- r"sys.stdout.buffer.write(b'Something\nSomething\nSomething\n')"]
+ cmd = [sys.executable, "-c", r"pass"]
for headers in {}, {"Content-Length": 30}:
with subprocess.Popen(cmd, stdout=subprocess.PIPE) as proc:
req = Request("http://example.com/", proc.stdout, headers)
@@ -983,8 +991,6 @@ class HandlerTests(unittest.TestCase):
def iterable_body():
yield b"one"
- yield b"two"
- yield b"three"
for headers in {}, {"Content-Length": 11}:
req = Request("http://example.com/", iterable_body(), headers)
@@ -996,6 +1002,14 @@ class HandlerTests(unittest.TestCase):
else:
self.assertEqual(int(newreq.get_header('Content-length')), 11)
+ def test_http_body_empty_seq(self):
+ # Zero-length iterable body should be treated like any other iterable
+ h = urllib.request.AbstractHTTPHandler()
+ h.parent = MockOpener()
+ req = h.do_request_(Request("http://example.com/", ()))
+ self.assertEqual(req.get_header("Transfer-encoding"), "chunked")
+ self.assertFalse(req.has_header("Content-length"))
+
def test_http_body_array(self):
# array.array Iterable - Content Length is calculated
diff --git a/Misc/NEWS b/Misc/NEWS
index d5eca5b..fc83452 100644
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -52,10 +52,9 @@ Library
- Issue #12319: Chunked transfer encoding support added to
http.client.HTTPConnection requests. The
urllib.request.AbstractHTTPHandler class does not enforce a Content-Length
- header any more. If a HTTP request has a non-empty body, but no
- Content-Length header, and the content length cannot be determined
- up front, rather than throwing an error, the library now falls back
- to use chunked transfer encoding.
+ header any more. If a HTTP request has a file or iterable body, but no
+ Content-Length header, the library now falls back to use chunked transfer-
+ encoding.
- A new version of typing.py from https://github.com/python/typing:
- Collection (only for 3.6) (Issue #27598)