author Alexandre Vassalotti <alexandre@peadrop.com> 2013-12-07 09:09:27 (GMT)
committer Alexandre Vassalotti <alexandre@peadrop.com> 2013-12-07 09:09:27 (GMT)
commit d05c9ff84501d93b13de40a9c7b0360c7d2ebada
tree ae840ca5e91d21e53cc60e6c3e7fdd64b5a9fec4
parent ee07b94788e5e3e79f6632e92a5295adc3937bf4
Issue #6784: Strings from Python 2 can now be unpickled as bytes objects.
Initial patch by Merlijn van Deen. I've added a few unrelated docstring fixes in the patch while I was at it, which makes the documentation for pickle a bit more consistent.
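
For orientation, here is a minimal sketch of how the new behaviour might be exercised from user code. The pickle byte strings are copied from the tests added in this commit; the helper name `load_py2_str_as_bytes` and the variable names are illustrative assumptions and not part of the patch.

```python
import pickle

# Pickles that Python 2 produces for the 8-bit str 'a\x00\xa0'
# (byte strings taken from the tests added in this commit).
PY2_PROTO0 = b"S'a\\x00\\xa0'\n."
PY2_PROTO1 = b"U\x03a\x00\xa0."
PY2_PROTO2 = b"\x80\x02U\x03a\x00\xa0."


def load_py2_str_as_bytes(data):
    """Hypothetical helper: unpickle, keeping Python 2 str values as bytes."""
    return pickle.loads(data, encoding="bytes")


# Each protocol variant yields the same bytes object.
results = [load_py2_str_as_bytes(p)
           for p in (PY2_PROTO0, PY2_PROTO1, PY2_PROTO2)]

# With the default encoding="ASCII" and errors="strict", the \xa0 byte
# cannot be decoded, so the same pickle fails to load.
try:
    pickle.loads(PY2_PROTO1)
except UnicodeDecodeError:
    default_failed = True
else:
    default_failed = False

# Python 2 unicode objects use different opcodes and still load as str,
# even when encoding="bytes" is given.
unicode_result = pickle.loads(b"X\x02\x00\x00\x00\xcf\x80.", encoding="bytes")
```

Only the STRING, BINSTRING and SHORT_BINSTRING opcodes are affected, so `encoding="bytes"` looks like a reasonable choice when a Python 2 pickle carries binary data in `str` objects that has no meaningful text encoding.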
-rw-r--r-- Doc/library/pickle.rst 88
-rw-r--r-- Lib/pickle.py 71
-rw-r--r-- Lib/pickletools.py 185
-rw-r--r-- Lib/test/pickletester.py 30
-rw-r--r-- Lib/test/test_pickle.py 4
-rw-r--r-- Misc/ACKS 1
-rw-r--r-- Misc/NEWS 4
-rw-r--r-- Modules/_pickle.c 406
8 files changed, 435 insertions, 354 deletions
diff --git a/Doc/library/pickle.rst b/Doc/library/pickle.rst
index 1f35b60..8976211 100644
--- a/Doc/library/pickle.rst
+++ b/Doc/library/pickle.rst
@@ -173,7 +173,7 @@ The :mod:`pickle` module provides the following constants:
An integer, the default :ref:`protocol version <pickle-protocols>` used
for pickling. May be less than :data:`HIGHEST_PROTOCOL`. Currently the
- default protocol is 3, a new protocol designed for Python 3.0.
+ default protocol is 3, a new protocol designed for Python 3.
The :mod:`pickle` module provides the following functions to make the pickling
@@ -184,9 +184,9 @@ process more convenient:
Write a pickled representation of *obj* to the open :term:`file object` *file*.
This is equivalent to ``Pickler(file, protocol).dump(obj)``.
- The optional *protocol* argument tells the pickler to use the given protocol;
- supported protocols are 0, 1, 2, 3. The default protocol is 3; a
- backward-incompatible protocol designed for Python 3.0.
+ The optional *protocol* argument tells the pickler to use the given
+ protocol; supported protocols are 0, 1, 2, 3. The default protocol is 3; a
+ backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version
supported. The higher the protocol used, the more recent the version of
@@ -198,64 +198,66 @@ process more convenient:
interface.
If *fix_imports* is true and *protocol* is less than 3, pickle will try to
- map the new Python 3.x names to the old module names used in Python 2.x,
- so that the pickle data stream is readable with Python 2.x.
+ map the new Python 3 names to the old module names used in Python 2, so
+ that the pickle data stream is readable with Python 2.
.. function:: dumps(obj, protocol=None, \*, fix_imports=True)
- Return the pickled representation of the object as a :class:`bytes`
- object, instead of writing it to a file.
+ Return the pickled representation of the object as a :class:`bytes` object,
+ instead of writing it to a file.
- The optional *protocol* argument tells the pickler to use the given protocol;
- supported protocols are 0, 1, 2, 3. The default protocol is 3; a
- backward-incompatible protocol designed for Python 3.0.
+ The optional *protocol* argument tells the pickler to use the given
+ protocol; supported protocols are 0, 1, 2, 3 and 4. The default protocol
+ is 3; a backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version
supported. The higher the protocol used, the more recent the version of
Python needed to read the pickle produced.
If *fix_imports* is true and *protocol* is less than 3, pickle will try to
- map the new Python 3.x names to the old module names used in Python 2.x,
- so that the pickle data stream is readable with Python 2.x.
+ map the new Python 3 names to the old module names used in Python 2, so
+ that the pickle data stream is readable with Python 2.
.. function:: load(file, \*, fix_imports=True, encoding="ASCII", errors="strict")
- Read a pickled object representation from the open :term:`file object` *file*
- and return the reconstituted object hierarchy specified therein. This is
- equivalent to ``Unpickler(file).load()``.
+ Read a pickled object representation from the open :term:`file object`
+ *file* and return the reconstituted object hierarchy specified therein.
+ This is equivalent to ``Unpickler(file).load()``.
- The protocol version of the pickle is detected automatically, so no protocol
- argument is needed. Bytes past the pickled object's representation are
- ignored.
+ The protocol version of the pickle is detected automatically, so no
+ protocol argument is needed. Bytes past the pickled object's
+ representation are ignored.
The argument *file* must have two methods, a read() method that takes an
integer argument, and a readline() method that requires no arguments. Both
- methods should return bytes. Thus *file* can be an on-disk file opened
- for binary reading, a :class:`io.BytesIO` object, or any other custom object
+ methods should return bytes. Thus *file* can be an on-disk file opened for
+ binary reading, a :class:`io.BytesIO` object, or any other custom object
that meets this interface.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatibility support for pickle stream generated
- by Python 2.x. If *fix_imports* is true, pickle will try to map the old
- Python 2.x names to the new names used in Python 3.x. The *encoding* and
+ by Python 2. If *fix_imports* is true, pickle will try to map the old
+ Python 2 names to the new names used in Python 3. The *encoding* and
*errors* tell pickle how to decode 8-bit string instances pickled by Python
- 2.x; these default to 'ASCII' and 'strict', respectively.
+ 2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
+ be 'bytes' to read these 8-bit string instances as bytes objects.
.. function:: loads(bytes_object, \*, fix_imports=True, encoding="ASCII", errors="strict")
Read a pickled object hierarchy from a :class:`bytes` object and return the
reconstituted object hierarchy specified therein
- The protocol version of the pickle is detected automatically, so no protocol
- argument is needed. Bytes past the pickled object's representation are
- ignored.
+ The protocol version of the pickle is detected automatically, so no
+ protocol argument is needed. Bytes past the pickled object's
+ representation are ignored.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatibility support for pickle stream generated
- by Python 2.x. If *fix_imports* is true, pickle will try to map the old
- Python 2.x names to the new names used in Python 3.x. The *encoding* and
+ by Python 2. If *fix_imports* is true, pickle will try to map the old
+ Python 2 names to the new names used in Python 3. The *encoding* and
*errors* tell pickle how to decode 8-bit string instances pickled by Python
- 2.x; these default to 'ASCII' and 'strict', respectively.
+ 2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
+ be 'bytes' to read these 8-bit string instances as bytes objects.
The :mod:`pickle` module defines three exceptions:
@@ -290,9 +292,9 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
This takes a binary file for writing a pickle data stream.
- The optional *protocol* argument tells the pickler to use the given protocol;
- supported protocols are 0, 1, 2, 3. The default protocol is 3; a
- backward-incompatible protocol designed for Python 3.0.
+ The optional *protocol* argument tells the pickler to use the given
+ protocol; supported protocols are 0, 1, 2, 3 and 4. The default protocol
+ is 3; a backward-incompatible protocol designed for Python 3.
Specifying a negative protocol version selects the highest protocol version
supported. The higher the protocol used, the more recent the version of
@@ -300,11 +302,12 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
The *file* argument must have a write() method that accepts a single bytes
argument. It can thus be an on-disk file opened for binary writing, a
- :class:`io.BytesIO` instance, or any other custom object that meets this interface.
+ :class:`io.BytesIO` instance, or any other custom object that meets this
+ interface.
If *fix_imports* is true and *protocol* is less than 3, pickle will try to
- map the new Python 3.x names to the old module names used in Python 2.x,
- so that the pickle data stream is readable with Python 2.x.
+ map the new Python 3 names to the old module names used in Python 2, so
+ that the pickle data stream is readable with Python 2.
.. method:: dump(obj)
@@ -366,16 +369,17 @@ The :mod:`pickle` module exports two classes, :class:`Pickler` and
The argument *file* must have two methods, a read() method that takes an
integer argument, and a readline() method that requires no arguments. Both
- methods should return bytes. Thus *file* can be an on-disk file object opened
- for binary reading, a :class:`io.BytesIO` object, or any other custom object
- that meets this interface.
+ methods should return bytes. Thus *file* can be an on-disk file object
+ opened for binary reading, a :class:`io.BytesIO` object, or any other
+ custom object that meets this interface.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatibility support for pickle stream generated
- by Python 2.x. If *fix_imports* is true, pickle will try to map the old
- Python 2.x names to the new names used in Python 3.x. The *encoding* and
+ by Python 2. If *fix_imports* is true, pickle will try to map the old
+ Python 2 names to the new names used in Python 3. The *encoding* and
*errors* tell pickle how to decode 8-bit string instances pickled by Python
- 2.x; these default to 'ASCII' and 'strict', respectively.
+ 2; these default to 'ASCII' and 'strict', respectively. The *encoding* can
+ be 'bytes' to read these 8-bit string instances as bytes objects.
.. method:: load()
diff --git a/Lib/pickle.py b/Lib/pickle.py
index c57149a..9cd0132 100644
--- a/Lib/pickle.py
+++ b/Lib/pickle.py
@@ -348,24 +348,25 @@ class _Pickler:
def __init__(self, file, protocol=None, *, fix_imports=True):
"""This takes a binary file for writing a pickle data stream.
- The optional protocol argument tells the pickler to use the
+ The optional *protocol* argument tells the pickler to use the
given protocol; supported protocols are 0, 1, 2, 3 and 4. The
- default protocol is 3; a backward-incompatible protocol designed for
- Python 3.
+ default protocol is 3; a backward-incompatible protocol designed
+ for Python 3.
Specifying a negative protocol version selects the highest
protocol version supported. The higher the protocol used, the
more recent the version of Python needed to read the pickle
produced.
- The file argument must have a write() method that accepts a single
- bytes argument. It can thus be a file object opened for binary
- writing, a io.BytesIO instance, or any other custom object that
- meets this interface.
+ The *file* argument must have a write() method that accepts a
+ single bytes argument. It can thus be a file object opened for
+ binary writing, a io.BytesIO instance, or any other custom
+ object that meets this interface.
- If fix_imports is True and protocol is less than 3, pickle will try to
- map the new Python 3 names to the old module names used in Python 2,
- so that the pickle data stream is readable with Python 2.
+ If *fix_imports* is True and *protocol* is less than 3, pickle
+ will try to map the new Python 3 names to the old module names
+ used in Python 2, so that the pickle data stream is readable
+ with Python 2.
"""
if protocol is None:
protocol = DEFAULT_PROTOCOL
@@ -389,10 +390,9 @@ class _Pickler:
"""Clears the pickler's "memo".
The memo is the data structure that remembers which objects the
- pickler has already seen, so that shared or recursive objects are
- pickled by reference and not by value. This method is useful when
- re-using picklers.
-
+ pickler has already seen, so that shared or recursive objects
+ are pickled by reference and not by value. This method is
+ useful when re-using picklers.
"""
self.memo.clear()
@@ -975,8 +975,14 @@ class _Unpickler:
encoding="ASCII", errors="strict"):
"""This takes a binary file for reading a pickle data stream.
- The protocol version of the pickle is detected automatically, so no
- proto argument is needed.
+ The protocol version of the pickle is detected automatically, so
+ no proto argument is needed.
+
+ The argument *file* must have two methods, a read() method that
+ takes an integer argument, and a readline() method that requires
+ no arguments. Both methods should return bytes. Thus *file*
+ can be a binary file object opened for reading, a io.BytesIO
+ object, or any other custom object that meets this interface.
The file-like object must have two methods, a read() method
that takes an integer argument, and a readline() method that
@@ -985,13 +991,14 @@ class _Unpickler:
reading, a BytesIO object, or any other custom object that
meets this interface.
- Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
- which are used to control compatiblity support for pickle stream
- generated by Python 2.x. If *fix_imports* is True, pickle will try to
- map the old Python 2.x names to the new names used in Python 3.x. The
- *encoding* and *errors* tell pickle how to decode 8-bit string
- instances pickled by Python 2.x; these default to 'ASCII' and
- 'strict', respectively.
+ Optional keyword arguments are *fix_imports*, *encoding* and
+ *errors*, which are used to control compatiblity support for
+ pickle stream generated by Python 2. If *fix_imports* is True,
+ pickle will try to map the old Python 2 names to the new names
+ used in Python 3. The *encoding* and *errors* tell pickle how
+ to decode 8-bit string instances pickled by Python 2; these
+ default to 'ASCII' and 'strict', respectively. *encoding* can be
+ 'bytes' to read theses 8-bit string instances as bytes objects.
"""
self._file_readline = file.readline
self._file_read = file.read
@@ -1139,6 +1146,15 @@ class _Unpickler:
self.append(unpack('>d', self.read(8))[0])
dispatch[BINFLOAT[0]] = load_binfloat
+ def _decode_string(self, value):
+ # Used to allow strings from Python 2 to be decoded either as
+ # bytes or Unicode strings. This should be used only with the
+ # STRING, BINSTRING and SHORT_BINSTRING opcodes.
+ if self.encoding == "bytes":
+ return value
+ else:
+ return value.decode(self.encoding, self.errors)
+
def load_string(self):
data = self.readline()[:-1]
# Strip outermost quotes
@@ -1146,8 +1162,7 @@ class _Unpickler:
data = data[1:-1]
else:
raise UnpicklingError("the STRING opcode argument must be quoted")
- self.append(codecs.escape_decode(data)[0]
- .decode(self.encoding, self.errors))
+ self.append(self._decode_string(codecs.escape_decode(data)[0]))
dispatch[STRING[0]] = load_string
def load_binstring(self):
@@ -1156,8 +1171,7 @@ class _Unpickler:
if len < 0:
raise UnpicklingError("BINSTRING pickle has negative byte count")
data = self.read(len)
- value = str(data, self.encoding, self.errors)
- self.append(value)
+ self.append(self._decode_string(data))
dispatch[BINSTRING[0]] = load_binstring
def load_binbytes(self):
@@ -1191,8 +1205,7 @@ class _Unpickler:
def load_short_binstring(self):
len = self.read(1)[0]
data = self.read(len)
- value = str(data, self.encoding, self.errors)
- self.append(value)
+ self.append(self._decode_string(data))
dispatch[SHORT_BINSTRING[0]] = load_short_binstring
def load_short_binbytes(self):
diff --git a/Lib/pickletools.py b/Lib/pickletools.py
index a2480f6..71c2aa1 100644
--- a/Lib/pickletools.py
+++ b/Lib/pickletools.py
@@ -969,113 +969,107 @@ class StackObject(object):
return self.name
-pyint = StackObject(
- name='int',
- obtype=int,
- doc="A short (as opposed to long) Python integer object.")
-
-pylong = StackObject(
- name='long',
- obtype=int,
- doc="A long (as opposed to short) Python integer object.")
+pyint = pylong = StackObject(
+ name='int',
+ obtype=int,
+ doc="A Python integer object.")
pyinteger_or_bool = StackObject(
- name='int_or_bool',
- obtype=(int, bool),
- doc="A Python integer object (short or long), or "
- "a Python bool.")
+ name='int_or_bool',
+ obtype=(int, bool),
+ doc="A Python integer or boolean object.")
pybool = StackObject(
- name='bool',
- obtype=(bool,),
- doc="A Python bool object.")
+ name='bool',
+ obtype=bool,
+ doc="A Python boolean object.")
pyfloat = StackObject(
- name='float',
- obtype=float,
- doc="A Python float object.")
+ name='float',
+ obtype=float,
+ doc="A Python float object.")
-pystring = StackObject(
- name='string',
- obtype=bytes,
- doc="A Python (8-bit) string object.")
+pybytes_or_str = pystring = StackObject(
+ name='bytes_or_str',
+ obtype=(bytes, str),
+ doc="A Python bytes or (Unicode) string object.")
pybytes = StackObject(
- name='bytes',
- obtype=bytes,
- doc="A Python bytes object.")
+ name='bytes',
+ obtype=bytes,
+ doc="A Python bytes object.")
pyunicode = StackObject(
- name='str',
- obtype=str,
- doc="A Python (Unicode) string object.")
+ name='str',
+ obtype=str,
+ doc="A Python (Unicode) string object.")
pynone = StackObject(
- name="None",
- obtype=type(None),
- doc="The Python None object.")
+ name="None",
+ obtype=type(None),
+ doc="The Python None object.")
pytuple = StackObject(
- name="tuple",
- obtype=tuple,
- doc="A Python tuple object.")
+ name="tuple",
+ obtype=tuple,
+ doc="A Python tuple object.")
pylist = StackObject(
- name="list",
- obtype=list,
- doc="A Python list object.")
+ name="list",
+ obtype=list,
+ doc="A Python list object.")
pydict = StackObject(
- name="dict",
- obtype=dict,
- doc="A Python dict object.")
+ name="dict",
+ obtype=dict,
+ doc="A Python dict object.")
pyset = StackObject(
- name="set",
- obtype=set,
- doc="A Python set object.")
+ name="set",
+ obtype=set,
+ doc="A Python set object.")
pyfrozenset = StackObject(
- name="frozenset",
- obtype=set,
- doc="A Python frozenset object.")
+ name="frozenset",
+ obtype=set,
+ doc="A Python frozenset object.")
anyobject = StackObject(
- name='any',
- obtype=object,
- doc="Any kind of object whatsoever.")
+ name='any',
+ obtype=object,
+ doc="Any kind of object whatsoever.")
markobject = StackObject(
- name="mark",
- obtype=StackObject,
- doc="""'The mark' is a unique object.
-
- Opcodes that operate on a variable number of objects
- generally don't embed the count of objects in the opcode,
- or pull it off the stack. Instead the MARK opcode is used
- to push a special marker object on the stack, and then
- some other opcodes grab all the objects from the top of
- the stack down to (but not including) the topmost marker
- object.
- """)
+ name="mark",
+ obtype=StackObject,
+ doc="""'The mark' is a unique object.
+
+Opcodes that operate on a variable number of objects
+generally don't embed the count of objects in the opcode,
+or pull it off the stack. Instead the MARK opcode is used
+to push a special marker object on the stack, and then
+some other opcodes grab all the objects from the top of
+the stack down to (but not including) the topmost marker
+object.
+""")
stackslice = StackObject(
- name="stackslice",
- obtype=StackObject,
- doc="""An object representing a contiguous slice of the stack.
+ name="stackslice",
+ obtype=StackObject,
+ doc="""An object representing a contiguous slice of the stack.
- This is used in conjunction with markobject, to represent all
- of the stack following the topmost markobject. For example,
- the POP_MARK opcode changes the stack from
+This is used in conjunction with markobject, to represent all
+of the stack following the topmost markobject. For example,
+the POP_MARK opcode changes the stack from
- [..., markobject, stackslice]
- to
- [...]
+ [..., markobject, stackslice]
+to
+ [...]
- No matter how many object are on the stack after the topmost
- markobject, POP_MARK gets rid of all of them (including the
- topmost markobject too).
- """)
+No matter how many object are on the stack after the topmost
+markobject, POP_MARK gets rid of all of them (including the
+topmost markobject too).
+""")
##############################################################################
# Descriptors for pickle opcodes.
@@ -1212,7 +1206,7 @@ opcodes = [
code='L',
arg=decimalnl_long,
stack_before=[],
- stack_after=[pylong],
+ stack_after=[pyint],
proto=0,
doc="""Push a long integer.
@@ -1230,7 +1224,7 @@ opcodes = [
code='\x8a',
arg=long1,
stack_before=[],
- stack_after=[pylong],
+ stack_after=[pyint],
proto=2,
doc="""Long integer using one-byte length.
@@ -1241,7 +1235,7 @@ opcodes = [
code='\x8b',
arg=long4,
stack_before=[],
- stack_after=[pylong],
+ stack_after=[pyint],
proto=2,
doc="""Long integer using found-byte length.
@@ -1254,45 +1248,50 @@ opcodes = [
code='S',
arg=stringnl,
stack_before=[],
- stack_after=[pystring],
+ stack_after=[pybytes_or_str],
proto=0,
doc="""Push a Python string object.
The argument is a repr-style string, with bracketing quote characters,
and perhaps embedded escapes. The argument extends until the next
- newline character. (Actually, they are decoded into a str instance
+ newline character. These are usually decoded into a str instance
using the encoding given to the Unpickler constructor. or the default,
- 'ASCII'.)
+ 'ASCII'. If the encoding given was 'bytes' however, they will be
+ decoded as bytes object instead.
"""),
I(name='BINSTRING',
code='T',
arg=string4,
stack_before=[],
- stack_after=[pystring],
+ stack_after=[pybytes_or_str],
proto=1,
doc="""Push a Python string object.
- There are two arguments: the first is a 4-byte little-endian signed int
- giving the number of bytes in the string, and the second is that many
- bytes, which are taken literally as the string content. (Actually,
- they are decoded into a str instance using the encoding given to the
- Unpickler constructor. or the default, 'ASCII'.)
+ There are two arguments: the first is a 4-byte little-endian
+ signed int giving the number of bytes in the string, and the
+ second is that many bytes, which are taken literally as the string
+ content. These are usually decoded into a str instance using the
+ encoding given to the Unpickler constructor. or the default,
+ 'ASCII'. If the encoding given was 'bytes' however, they will be
+ decoded as bytes object instead.
"""),
I(name='SHORT_BINSTRING',
code='U',
arg=string1,
stack_before=[],
- stack_after=[pystring],
+ stack_after=[pybytes_or_str],
proto=1,
doc="""Push a Python string object.
- There are two arguments: the first is a 1-byte unsigned int giving
- the number of bytes in the string, and the second is that many bytes,
- which are taken literally as the string content. (Actually, they
- are decoded into a str instance using the encoding given to the
- Unpickler constructor. or the default, 'ASCII'.)
+ There are two arguments: the first is a 1-byte unsigned int giving
+ the number of bytes in the string, and the second is that many
+ bytes, which are taken literally as the string content. These are
+ usually decoded into a str instance using the encoding given to
+ the Unpickler constructor. or the default, 'ASCII'. If the
+ encoding given was 'bytes' however, they will be decoded as bytes
+ object instead.
"""),
# Bytes (protocol 3 only; older protocols don't support bytes at all)
diff --git a/Lib/test/pickletester.py b/Lib/test/pickletester.py
index 040c26f..05befbf 100644
--- a/Lib/test/pickletester.py
+++ b/Lib/test/pickletester.py
@@ -1305,6 +1305,35 @@ class AbstractPickleTests(unittest.TestCase):
dumped = self.dumps(set([3]), 2)
self.assertEqual(dumped, DATA6)
+ def test_load_python2_str_as_bytes(self):
+ # From Python 2: pickle.dumps('a\x00\xa0', protocol=0)
+ self.assertEqual(self.loads(b"S'a\\x00\\xa0'\n.",
+ encoding="bytes"), b'a\x00\xa0')
+ # From Python 2: pickle.dumps('a\x00\xa0', protocol=1)
+ self.assertEqual(self.loads(b'U\x03a\x00\xa0.',
+ encoding="bytes"), b'a\x00\xa0')
+ # From Python 2: pickle.dumps('a\x00\xa0', protocol=2)
+ self.assertEqual(self.loads(b'\x80\x02U\x03a\x00\xa0.',
+ encoding="bytes"), b'a\x00\xa0')
+
+ def test_load_python2_unicode_as_str(self):
+ # From Python 2: pickle.dumps(u'π', protocol=0)
+ self.assertEqual(self.loads(b'V\\u03c0\n.',
+ encoding='bytes'), 'π')
+ # From Python 2: pickle.dumps(u'π', protocol=1)
+ self.assertEqual(self.loads(b'X\x02\x00\x00\x00\xcf\x80.',
+ encoding="bytes"), 'π')
+ # From Python 2: pickle.dumps(u'π', protocol=2)
+ self.assertEqual(self.loads(b'\x80\x02X\x02\x00\x00\x00\xcf\x80.',
+ encoding="bytes"), 'π')
+
+ def test_load_long_python2_str_as_bytes(self):
+ # From Python 2: pickle.dumps('x' * 300, protocol=1)
+ self.assertEqual(self.loads(pickle.BINSTRING +
+ struct.pack("<I", 300) +
+ b'x' * 300 + pickle.STOP,
+ encoding='bytes'), b'x' * 300)
+
def test_large_pickles(self):
# Test the correctness of internal buffering routines when handling
# large data.
@@ -1566,7 +1595,6 @@ class AbstractPickleTests(unittest.TestCase):
unpickled = self.loads(self.dumps(method, proto))
self.assertEqual(method(obj), unpickled(obj))
-
def test_c_methods(self):
global Subclass
class Subclass(tuple):
diff --git a/Lib/test/test_pickle.py b/Lib/test/test_pickle.py
index fbe96ac..0b2fe1e 100644
--- a/Lib/test/test_pickle.py
+++ b/Lib/test/test_pickle.py
@@ -83,13 +83,17 @@ class PyPicklerUnpicklerObjectTests(AbstractPicklerUnpicklerObjectTests):
class PyDispatchTableTests(AbstractDispatchTableTests):
+
pickler_class = pickle._Pickler
+
def get_dispatch_table(self):
return pickle.dispatch_table.copy()
class PyChainDispatchTableTests(AbstractDispatchTableTests):
+
pickler_class = pickle._Pickler
+
def get_dispatch_table(self):
return collections.ChainMap({}, pickle.dispatch_table)
diff --git a/Misc/ACKS b/Misc/ACKS
index 5e60f9b..798aaa0 100644
--- a/Misc/ACKS
+++ b/Misc/ACKS
@@ -293,6 +293,7 @@ Kushal Das
Jonathan Dasteel
Pierre-Yves David
A. Jesse Jiryu Davis
+Merlijn van Deen
John DeGood
Ned Deily
Vincent Delft
diff --git a/Misc/NEWS b/Misc/NEWS
index 58c4eec..f623a19 100644
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -23,6 +23,10 @@ Library
- Issue #19296: Silence compiler warning in dbm_open
+- Issue #6784: Strings from Python 2 can now be unpickled as bytes
+ objects by setting the encoding argument of Unpickler to be 'bytes'.
+ Initial patch by Merlijn van Deen.
+
- Issue #19839: Fix regression in bz2 module's handling of non-bzip2 data at
EOF, and analogous bug in lzma module.
diff --git a/Modules/_pickle.c b/Modules/_pickle.c
index 330b17d..d61c8ab 100644
--- a/Modules/_pickle.c
+++ b/Modules/_pickle.c
@@ -4016,48 +4016,44 @@ _pickle.Pickler.__init__
This takes a binary file for writing a pickle data stream.
-The optional protocol argument tells the pickler to use the
-given protocol; supported protocols are 0, 1, 2, 3 and 4. The
-default protocol is 3; a backward-incompatible protocol designed for
-Python 3.
+The optional *protocol* argument tells the pickler to use the given
+protocol; supported protocols are 0, 1, 2, 3 and 4. The default
+protocol is 3; a backward-incompatible protocol designed for Python 3.
-Specifying a negative protocol version selects the highest
-protocol version supported. The higher the protocol used, the
-more recent the version of Python needed to read the pickle
-produced.
+Specifying a negative protocol version selects the highest protocol
+version supported. The higher the protocol used, the more recent the
+version of Python needed to read the pickle produced.
-The file argument must have a write() method that accepts a single
+The *file* argument must have a write() method that accepts a single
bytes argument. It can thus be a file object opened for binary
-writing, a io.BytesIO instance, or any other custom object that
-meets this interface.
+writing, a io.BytesIO instance, or any other custom object that meets
+this interface.
-If fix_imports is True and protocol is less than 3, pickle will try to
-map the new Python 3 names to the old module names used in Python 2,
-so that the pickle data stream is readable with Python 2.
+If *fix_imports* is True and protocol is less than 3, pickle will try
+to map the new Python 3 names to the old module names used in Python
+2, so that the pickle data stream is readable with Python 2.
[clinic]*/
PyDoc_STRVAR(_pickle_Pickler___init____doc__,
"__init__(file, protocol=None, fix_imports=True)\n"
"This takes a binary file for writing a pickle data stream.\n"
"\n"
-"The optional protocol argument tells the pickler to use the\n"
-"given protocol; supported protocols are 0, 1, 2, 3 and 4. The\n"
-"default protocol is 3; a backward-incompatible protocol designed for\n"
-"Python 3.\n"
+"The optional *protocol* argument tells the pickler to use the given\n"
+"protocol; supported protocols are 0, 1, 2, 3 and 4. The default\n"
+"protocol is 3; a backward-incompatible protocol designed for Python 3.\n"
"\n"
-"Specifying a negative protocol version selects the highest\n"
-"protocol version supported. The higher the protocol used, the\n"
-"more recent the version of Python needed to read the pickle\n"
-"produced.\n"
+"Specifying a negative protocol version selects the highest protocol\n"
+"version supported. The higher the protocol used, the more recent the\n"
+"version of Python needed to read the pickle produced.\n"
"\n"
-"The file argument must have a write() method that accepts a single\n"
+"The *file* argument must have a write() method that accepts a single\n"
"bytes argument. It can thus be a file object opened for binary\n"
-"writing, a io.BytesIO instance, or any other custom object that\n"
-"meets this interface.\n"
+"writing, a io.BytesIO instance, or any other custom object that meets\n"
+"this interface.\n"
"\n"
-"If fix_imports is True and protocol is less than 3, pickle will try to\n"
-"map the new Python 3 names to the old module names used in Python 2,\n"
-"so that the pickle data stream is readable with Python 2.");
+"If *fix_imports* is True and protocol is less than 3, pickle will try\n"
+"to map the new Python 3 names to the old module names used in Python\n"
+"2, so that the pickle data stream is readable with Python 2.");
#define _PICKLE_PICKLER___INIT___METHODDEF \
{"__init__", (PyCFunction)_pickle_Pickler___init__, METH_VARARGS|METH_KEYWORDS, _pickle_Pickler___init____doc__},
@@ -4086,7 +4082,7 @@ exit:
static PyObject *
_pickle_Pickler___init___impl(PicklerObject *self, PyObject *file, PyObject *protocol, int fix_imports)
-/*[clinic checksum: c99ff417bd703a74affc4b708167e56e135e8969]*/
+/*[clinic checksum: 2b5ce6452544600478cf9f4b701ab9d9b5efbab9]*/
{
_Py_IDENTIFIER(persistent_id);
_Py_IDENTIFIER(dispatch_table);
@@ -4831,7 +4827,7 @@ static int
load_string(UnpicklerObject *self)
{
PyObject *bytes;
- PyObject *str = NULL;
+ PyObject *obj;
Py_ssize_t len;
char *s, *p;
@@ -4857,19 +4853,28 @@ load_string(UnpicklerObject *self)
bytes = PyBytes_DecodeEscape(p, len, NULL, 0, NULL);
if (bytes == NULL)
return -1;
- str = PyUnicode_FromEncodedObject(bytes, self->encoding, self->errors);
- Py_DECREF(bytes);
- if (str == NULL)
- return -1;
- PDATA_PUSH(self->stack, str, -1);
+ /* Leave the Python 2.x strings as bytes if the *encoding* given to the
+ Unpickler was 'bytes'. Otherwise, convert them to unicode. */
+ if (strcmp(self->encoding, "bytes") == 0) {
+ obj = bytes;
+ }
+ else {
+ obj = PyUnicode_FromEncodedObject(bytes, self->encoding, self->errors);
+ Py_DECREF(bytes);
+ if (obj == NULL) {
+ return -1;
+ }
+ }
+
+ PDATA_PUSH(self->stack, obj, -1);
return 0;
}
static int
-load_counted_binbytes(UnpicklerObject *self, int nbytes)
+load_counted_binstring(UnpicklerObject *self, int nbytes)
{
- PyObject *bytes;
+ PyObject *obj;
Py_ssize_t size;
char *s;
@@ -4878,8 +4883,9 @@ load_counted_binbytes(UnpicklerObject *self, int nbytes)
size = calc_binsize(s, nbytes);
if (size < 0) {
- PyErr_Format(PyExc_OverflowError,
- "BINBYTES exceeds system's maximum size of %zd bytes",
+ PickleState *st = _Pickle_GetGlobalState();
+ PyErr_Format(st->UnpicklingError,
+ "BINSTRING exceeds system's maximum size of %zd bytes",
PY_SSIZE_T_MAX);
return -1;
}
@@ -4887,18 +4893,26 @@ load_counted_binbytes(UnpicklerObject *self, int nbytes)
if (_Unpickler_Read(self, &s, size) < 0)
return -1;
- bytes = PyBytes_FromStringAndSize(s, size);
- if (bytes == NULL)
+ /* Convert Python 2.x strings to bytes if the *encoding* given to the
+ Unpickler was 'bytes'. Otherwise, convert them to unicode. */
+ if (strcmp(self->encoding, "bytes") == 0) {
+ obj = PyBytes_FromStringAndSize(s, size);
+ }
+ else {
+ obj = PyUnicode_Decode(s, size, self->encoding, self->errors);
+ }
+ if (obj == NULL) {
return -1;
+ }
- PDATA_PUSH(self->stack, bytes, -1);
+ PDATA_PUSH(self->stack, obj, -1);
return 0;
}
static int
-load_counted_binstring(UnpicklerObject *self, int nbytes)
+load_counted_binbytes(UnpicklerObject *self, int nbytes)
{
- PyObject *str;
+ PyObject *bytes;
Py_ssize_t size;
char *s;
@@ -4907,21 +4921,20 @@ load_counted_binstring(UnpicklerObject *self, int nbytes)
size = calc_binsize(s, nbytes);
if (size < 0) {
- PickleState *st = _Pickle_GetGlobalState();
- PyErr_Format(st->UnpicklingError,
- "BINSTRING exceeds system's maximum size of %zd bytes",
+ PyErr_Format(PyExc_OverflowError,
+ "BINBYTES exceeds system's maximum size of %zd bytes",
PY_SSIZE_T_MAX);
return -1;
}
if (_Unpickler_Read(self, &s, size) < 0)
return -1;
- /* Convert Python 2.x strings to unicode. */
- str = PyUnicode_Decode(s, size, self->encoding, self->errors);
- if (str == NULL)
+
+ bytes = PyBytes_FromStringAndSize(s, size);
+ if (bytes == NULL)
return -1;
- PDATA_PUSH(self->stack, str, -1);
+ PDATA_PUSH(self->stack, bytes, -1);
return 0;
}
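The effect of the `strcmp(self->encoding, "bytes")` branches above can be seen from the Python side. The following is a minimal sketch, assuming Python 3.4 or later; the two hand-written pickles are illustrative stand-ins for what Python 2 would emit for the one-byte str `'\xe9'`.

```python
import pickle

# Hand-written pickles of a Python 2 str holding the single byte 0xE9.
# Protocol 0 uses the STRING opcode ('S'); protocol 1 and later use
# SHORT_BINSTRING ('U') followed by a one-byte length.
py2_string = b"S'\\xe9'\n."
py2_short_binstring = b"U\x01\xe9."

for data in (py2_string, py2_short_binstring):
    # encoding='bytes' leaves the 8-bit string untouched.
    as_bytes = pickle.loads(data, encoding="bytes")
    # Any other encoding decodes it to str.
    as_text = pickle.loads(data, encoding="latin-1")
    assert as_bytes == b"\xe9"
    assert as_text == "\xe9"

# The default encoding is ASCII with errors='strict', so a non-ASCII
# byte is rejected.
try:
    pickle.loads(py2_string)
except UnicodeDecodeError:
    default_failed = True
else:
    default_failed = False

# The BINBYTES family (written by Python 3 for bytes objects) always
# yields bytes, whatever encoding is requested.
py3_bytes = pickle.dumps(b"\xe9", protocol=3)
from_py3 = pickle.loads(py3_bytes, encoding="latin-1")
```

Only the STRING and BINSTRING opcodes consult *encoding*; data pickled as bytes by Python 3 round-trips unchanged.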
@@ -6258,25 +6271,25 @@ _pickle.Unpickler.load
Load a pickle.
-Read a pickled object representation from the open file object given in
-the constructor, and return the reconstituted object hierarchy specified
-therein.
+Read a pickled object representation from the open file object given
+in the constructor, and return the reconstituted object hierarchy
+specified therein.
[clinic]*/
PyDoc_STRVAR(_pickle_Unpickler_load__doc__,
"load()\n"
"Load a pickle.\n"
"\n"
-"Read a pickled object representation from the open file object given in\n"
-"the constructor, and return the reconstituted object hierarchy specified\n"
-"therein.");
+"Read a pickled object representation from the open file object given\n"
+"in the constructor, and return the reconstituted object hierarchy\n"
+"specified therein.");
#define _PICKLE_UNPICKLER_LOAD_METHODDEF \
{"load", (PyCFunction)_pickle_Unpickler_load, METH_NOARGS, _pickle_Unpickler_load__doc__},
static PyObject *
_pickle_Unpickler_load(PyObject *self)
-/*[clinic checksum: 9a30ba4e4d9221d4dcd705e1471ab11b2c9e3ac6]*/
+/*[clinic checksum: c2ae1263f0dd000f34ccf0fe59d7c544464babc4]*/
{
UnpicklerObject *unpickler = (UnpicklerObject*)self;
@@ -6310,8 +6323,9 @@ _pickle.Unpickler.find_class
Return an object from a specified module.
-If necessary, the module will be imported. Subclasses may override this
-method (e.g. to restrict unpickling of arbitrary classes and functions).
+If necessary, the module will be imported. Subclasses may override
+this method (e.g. to restrict unpickling of arbitrary classes and
+functions).
This method is called whenever a class or a function object is
needed. Both arguments passed are str objects.
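The docstring above mentions overriding `find_class` to restrict what may be unpickled. A minimal sketch of that idea follows; the allow-list `ALLOWED_BUILTINS` and the helper `restricted_loads` are names chosen for illustration and are not part of this patch.

```python
import builtins
import io
import pickle

# Hypothetical allow-list: only these builtins may be looked up.
ALLOWED_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Both arguments arrive as str objects.
        if module == "builtins" and name in ALLOWED_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but with the allow-list applied."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps(range(3)))

try:
    restricted_loads(pickle.dumps(len))
except pickle.UnpicklingError:
    blocked = True
else:
    blocked = False
```

An allow-list like this narrows the attack surface but does not by itself make unpickling untrusted data safe.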
@@ -6321,8 +6335,9 @@ PyDoc_STRVAR(_pickle_Unpickler_find_class__doc__,
"find_class(module_name, global_name)\n"
"Return an object from a specified module.\n"
"\n"
-"If necessary, the module will be imported. Subclasses may override this\n"
-"method (e.g. to restrict unpickling of arbitrary classes and functions).\n"
+"If necessary, the module will be imported. Subclasses may override\n"
+"this method (e.g. to restrict unpickling of arbitrary classes and\n"
+"functions).\n"
"\n"
"This method is called whenever a class or a function object is\n"
"needed. Both arguments passed are str objects.");
@@ -6352,7 +6367,7 @@ exit:
static PyObject *
_pickle_Unpickler_find_class_impl(UnpicklerObject *self, PyObject *module_name, PyObject *global_name)
-/*[clinic checksum: b7d05d4dd8adc698e5780c1ac2be0f5062d33915]*/
+/*[clinic checksum: 1f353d13a32c9d94feb1466b3c2d0529a7e5650e]*/
{
PyObject *global;
PyObject *modules_dict;
@@ -6515,23 +6530,23 @@ _pickle.Unpickler.__init__
This takes a binary file for reading a pickle data stream.
The protocol version of the pickle is detected automatically, so no
-proto argument is needed.
+protocol argument is needed. Bytes past the pickled object's
+representation are ignored.
-The file-like object must have two methods, a read() method
-that takes an integer argument, and a readline() method that
-requires no arguments. Both methods should return bytes.
-Thus file-like object can be a binary file object opened for
-reading, a BytesIO object, or any other custom object that
-meets this interface.
+The argument *file* must have two methods, a read() method that takes
+an integer argument, and a readline() method that requires no
+arguments. Both methods should return bytes. Thus *file* can be a
+binary file object opened for reading, an io.BytesIO object, or any
+other custom object that meets this interface.
Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
which are used to control compatibility support for pickle stream
-generated by Python 2.x. If *fix_imports* is True, pickle will try to
-map the old Python 2.x names to the new names used in Python 3.x. The
+generated by Python 2. If *fix_imports* is True, pickle will try to
+map the old Python 2 names to the new names used in Python 3. The
*encoding* and *errors* tell pickle how to decode 8-bit string
-instances pickled by Python 2.x; these default to 'ASCII' and
-'strict', respectively.
-
+instances pickled by Python 2; these default to 'ASCII' and 'strict',
+respectively. The *encoding* can be 'bytes' to read these 8-bit
+string instances as bytes objects.
[clinic]*/
PyDoc_STRVAR(_pickle_Unpickler___init____doc__,
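The interface described in this docstring (a `read()` taking an integer and a `readline()` taking no arguments, both returning bytes) can be exercised with a small custom object. This is a sketch only; `MinimalReader` is a hypothetical class written for the example.

```python
import io
import pickle


class MinimalReader:
    """Hypothetical wrapper exposing only read() and readline()."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, n):
        return self._buf.read(n)

    def readline(self):
        return self._buf.readline()


# Bytes after the pickled object's representation are ignored.
payload = pickle.dumps({"answer": 42}) + b"trailing bytes"

from_custom = pickle.Unpickler(MinimalReader(payload)).load()
from_loads = pickle.loads(payload)
```

No protocol argument is passed on the reading side; the version is detected from the stream itself.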
@@ -6539,22 +6554,23 @@ PyDoc_STRVAR(_pickle_Unpickler___init____doc__,
"This takes a binary file for reading a pickle data stream.\n"
"\n"
"The protocol version of the pickle is detected automatically, so no\n"
-"proto argument is needed.\n"
+"protocol argument is needed. Bytes past the pickled object\'s\n"
+"representation are ignored.\n"
"\n"
-"The file-like object must have two methods, a read() method\n"
-"that takes an integer argument, and a readline() method that\n"
-"requires no arguments. Both methods should return bytes.\n"
-"Thus file-like object can be a binary file object opened for\n"
-"reading, a BytesIO object, or any other custom object that\n"
-"meets this interface.\n"
+"The argument *file* must have two methods, a read() method that takes\n"
+"an integer argument, and a readline() method that requires no\n"
+"arguments. Both methods should return bytes. Thus *file* can be a\n"
+"binary file object opened for reading, an io.BytesIO object, or any\n"
+"other custom object that meets this interface.\n"
"\n"
"Optional keyword arguments are *fix_imports*, *encoding* and *errors*,\n"
"which are used to control compatibility support for pickle stream\n"
-"generated by Python 2.x. If *fix_imports* is True, pickle will try to\n"
-"map the old Python 2.x names to the new names used in Python 3.x. The\n"
+"generated by Python 2. If *fix_imports* is True, pickle will try to\n"
+"map the old Python 2 names to the new names used in Python 3. The\n"
"*encoding* and *errors* tell pickle how to decode 8-bit string\n"
-"instances pickled by Python 2.x; these default to \'ASCII\' and\n"
-"\'strict\', respectively.");
+"instances pickled by Python 2; these default to \'ASCII\' and \'strict\',\n"
+"respectively. The *encoding* can be \'bytes\' to read these 8-bit\n"
+"string instances as bytes objects.");
#define _PICKLE_UNPICKLER___INIT___METHODDEF \
{"__init__", (PyCFunction)_pickle_Unpickler___init__, METH_VARARGS|METH_KEYWORDS, _pickle_Unpickler___init____doc__},
@@ -6584,7 +6600,7 @@ exit:
static PyObject *
_pickle_Unpickler___init___impl(UnpicklerObject *self, PyObject *file, int fix_imports, const char *encoding, const char *errors)
-/*[clinic checksum: bed0d8bbe1c647960ccc6f997b33bf33935fa56f]*/
+/*[clinic checksum: 9ce6783224e220573d42a94fe1bb7199d6f1c5a6]*/
{
_Py_IDENTIFIER(persistent_load);
@@ -7033,48 +7049,50 @@ _pickle.dump
Write a pickled representation of obj to the open file object file.
-This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may be more
-efficient.
+This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may
+be more efficient.
-The optional protocol argument tells the pickler to use the given protocol
-supported protocols are 0, 1, 2, 3. The default protocol is 3; a
-backward-incompatible protocol designed for Python 3.0.
+The optional *protocol* argument tells the pickler to use the given
+protocol; supported protocols are 0, 1, 2, 3 and 4. The default
+protocol is 3; a backward-incompatible protocol designed for Python 3.
-Specifying a negative protocol version selects the highest protocol version
-supported. The higher the protocol used, the more recent the version of
-Python needed to read the pickle produced.
+Specifying a negative protocol version selects the highest protocol
+version supported. The higher the protocol used, the more recent the
+version of Python needed to read the pickle produced.
-The file argument must have a write() method that accepts a single bytes
-argument. It can thus be a file object opened for binary writing, a
-io.BytesIO instance, or any other custom object that meets this interface.
+The *file* argument must have a write() method that accepts a single
+bytes argument. It can thus be a file object opened for binary
+writing, an io.BytesIO instance, or any other custom object that meets
+this interface.
-If fix_imports is True and protocol is less than 3, pickle will try to
-map the new Python 3.x names to the old module names used in Python 2.x,
-so that the pickle data stream is readable with Python 2.x.
+If *fix_imports* is True and protocol is less than 3, pickle will try
+to map the new Python 3 names to the old module names used in Python
+2, so that the pickle data stream is readable with Python 2.
[clinic]*/
PyDoc_STRVAR(_pickle_dump__doc__,
"dump(obj, file, protocol=None, *, fix_imports=True)\n"
"Write a pickled representation of obj to the open file object file.\n"
"\n"
-"This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may be more\n"
-"efficient.\n"
+"This is equivalent to ``Pickler(file, protocol).dump(obj)``, but may\n"
+"be more efficient.\n"
"\n"
-"The optional protocol argument tells the pickler to use the given protocol\n"
-"supported protocols are 0, 1, 2, 3. The default protocol is 3; a\n"
-"backward-incompatible protocol designed for Python 3.0.\n"
+"The optional *protocol* argument tells the pickler to use the given\n"
+"protocol; supported protocols are 0, 1, 2, 3 and 4. The default\n"
+"protocol is 3; a backward-incompatible protocol designed for Python 3.\n"
"\n"
-"Specifying a negative protocol version selects the highest protocol version\n"
-"supported. The higher the protocol used, the more recent the version of\n"
-"Python needed to read the pickle produced.\n"
+"Specifying a negative protocol version selects the highest protocol\n"
+"version supported. The higher the protocol used, the more recent the\n"
+"version of Python needed to read the pickle produced.\n"
"\n"
-"The file argument must have a write() method that accepts a single bytes\n"
-"argument. It can thus be a file object opened for binary writing, a\n"
-"io.BytesIO instance, or any other custom object that meets this interface.\n"
+"The *file* argument must have a write() method that accepts a single\n"
+"bytes argument. It can thus be a file object opened for binary\n"
+"writing, an io.BytesIO instance, or any other custom object that meets\n"
+"this interface.\n"
"\n"
-"If fix_imports is True and protocol is less than 3, pickle will try to\n"
-"map the new Python 3.x names to the old module names used in Python 2.x,\n"
-"so that the pickle data stream is readable with Python 2.x.");
+"If *fix_imports* is True and protocol is less than 3, pickle will try\n"
+"to map the new Python 3 names to the old module names used in Python\n"
+"2, so that the pickle data stream is readable with Python 2.");
#define _PICKLE_DUMP_METHODDEF \
{"dump", (PyCFunction)_pickle_dump, METH_VARARGS|METH_KEYWORDS, _pickle_dump__doc__},
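The *fix_imports* paragraph of the `dump` docstring can be illustrated by pickling a builtin function, whose module is named `builtins` in Python 3 and `__builtin__` in Python 2. A minimal sketch; `dump_to_bytes` is a hypothetical helper introduced for the example.

```python
import io
import pickle


def dump_to_bytes(obj, protocol, fix_imports):
    """Hypothetical helper: run pickle.dump() into an in-memory file."""
    buf = io.BytesIO()
    pickle.dump(obj, buf, protocol=protocol, fix_imports=fix_imports)
    return buf.getvalue()


# Protocol 2 with fix_imports=True writes the Python 2 module name.
py2_friendly = dump_to_bytes(len, 2, True)
# With fix_imports=False the Python 3 name is written as-is.
py3_only = dump_to_bytes(len, 2, False)
# For protocol 3 and above the mapping is not applied at all.
proto3 = dump_to_bytes(len, 3, True)
```

Here `io.BytesIO` stands in for any object with a `write()` method that accepts bytes.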
@@ -7104,7 +7122,7 @@ exit:
static PyObject *
_pickle_dump_impl(PyModuleDef *module, PyObject *obj, PyObject *file, PyObject *protocol, int fix_imports)
-/*[clinic checksum: e442721b16052d921b5e3fbd146d0a62e94a459e]*/
+/*[clinic checksum: eb5c23e64da34477178230b704d2cc9c6b6650ea]*/
{
PicklerObject *pickler = _Pickler_New();
@@ -7142,34 +7160,34 @@ _pickle.dumps
Return the pickled representation of the object as a bytes object.
-The optional protocol argument tells the pickler to use the given protocol;
-supported protocols are 0, 1, 2, 3. The default protocol is 3; a
-backward-incompatible protocol designed for Python 3.0.
+The optional *protocol* argument tells the pickler to use the given
+protocol; supported protocols are 0, 1, 2, 3 and 4. The default
+protocol is 3; a backward-incompatible protocol designed for Python 3.
-Specifying a negative protocol version selects the highest protocol version
-supported. The higher the protocol used, the more recent the version of
-Python needed to read the pickle produced.
+Specifying a negative protocol version selects the highest protocol
+version supported. The higher the protocol used, the more recent the
+version of Python needed to read the pickle produced.
-If fix_imports is True and *protocol* is less than 3, pickle will try to
-map the new Python 3.x names to the old module names used in Python 2.x,
-so that the pickle data stream is readable with Python 2.x.
+If *fix_imports* is True and *protocol* is less than 3, pickle will
+try to map the new Python 3 names to the old module names used in
+Python 2, so that the pickle data stream is readable with Python 2.
[clinic]*/
PyDoc_STRVAR(_pickle_dumps__doc__,
"dumps(obj, protocol=None, *, fix_imports=True)\n"
"Return the pickled representation of the object as a bytes object.\n"
"\n"
-"The optional protocol argument tells the pickler to use the given protocol;\n"
-"supported protocols are 0, 1, 2, 3. The default protocol is 3; a\n"
-"backward-incompatible protocol designed for Python 3.0.\n"
+"The optional *protocol* argument tells the pickler to use the given\n"
+"protocol; supported protocols are 0, 1, 2, 3 and 4. The default\n"
+"protocol is 3; a backward-incompatible protocol designed for Python 3.\n"
"\n"
-"Specifying a negative protocol version selects the highest protocol version\n"
-"supported. The higher the protocol used, the more recent the version of\n"
-"Python needed to read the pickle produced.\n"
+"Specifying a negative protocol version selects the highest protocol\n"
+"version supported. The higher the protocol used, the more recent the\n"
+"version of Python needed to read the pickle produced.\n"
"\n"
-"If fix_imports is True and *protocol* is less than 3, pickle will try to\n"
-"map the new Python 3.x names to the old module names used in Python 2.x,\n"
-"so that the pickle data stream is readable with Python 2.x.");
+"If *fix_imports* is True and *protocol* is less than 3, pickle will\n"
+"try to map the new Python 3 names to the old module names used in\n"
+"Python 2, so that the pickle data stream is readable with Python 2.");
#define _PICKLE_DUMPS_METHODDEF \
{"dumps", (PyCFunction)_pickle_dumps, METH_VARARGS|METH_KEYWORDS, _pickle_dumps__doc__},
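The protocol selection rules in the `dumps` docstring can be checked directly. This sketch relies on the fact that pickles written with protocol 2 or later begin with the PROTO opcode (`0x80`) followed by the version byte.

```python
import pickle

obj = [1, 2, 3]

# A negative protocol selects the highest one this interpreter supports.
data_highest = pickle.dumps(obj, protocol=-1)
data_explicit = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)

# The second byte of the stream is the protocol version.
version_byte = data_highest[1]

# Protocol 0 is the original text-based format; for this object the
# output contains only ASCII bytes.
data_proto0 = pickle.dumps(obj, protocol=0)
proto0_is_ascii = all(byte < 128 for byte in data_proto0)
```

A pickle written with the highest protocol may not be readable by older interpreters, which is the trade-off the docstring warns about.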
@@ -7198,7 +7216,7 @@ exit:
static PyObject *
_pickle_dumps_impl(PyModuleDef *module, PyObject *obj, PyObject *protocol, int fix_imports)
-/*[clinic checksum: df6262c4c487f537f47aec8a1709318204c1e174]*/
+/*[clinic checksum: e9b915d61202a9692cb6c6718db74fe54fc9c4d1]*/
{
PyObject *result;
PicklerObject *pickler = _Pickler_New();
@@ -7231,50 +7249,56 @@ _pickle.load
encoding: str = 'ASCII'
errors: str = 'strict'
-Return a reconstituted object from the pickle data stored in a file.
+Read and return an object from the pickle data stored in a file.
-This is equivalent to ``Unpickler(file).load()``, but may be more efficient.
+This is equivalent to ``Unpickler(file).load()``, but may be more
+efficient.
-The protocol version of the pickle is detected automatically, so no protocol
-argument is needed. Bytes past the pickled object's representation are
-ignored.
+The protocol version of the pickle is detected automatically, so no
+protocol argument is needed. Bytes past the pickled object's
+representation are ignored.
-The argument file must have two methods, a read() method that takes an
-integer argument, and a readline() method that requires no arguments. Both
-methods should return bytes. Thus *file* can be a binary file object opened
-for reading, a BytesIO object, or any other custom object that meets this
-interface.
+The argument *file* must have two methods, a read() method that takes
+an integer argument, and a readline() method that requires no
+arguments. Both methods should return bytes. Thus *file* can be a
+binary file object opened for reading, an io.BytesIO object, or any
+other custom object that meets this interface.
-Optional keyword arguments are fix_imports, encoding and errors,
-which are used to control compatiblity support for pickle stream generated
-by Python 2.x. If fix_imports is True, pickle will try to map the old
-Python 2.x names to the new names used in Python 3.x. The encoding and
-errors tell pickle how to decode 8-bit string instances pickled by Python
-2.x; these default to 'ASCII' and 'strict', respectively.
+Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
+which are used to control compatibility support for pickle stream
+generated by Python 2. If *fix_imports* is True, pickle will try to
+map the old Python 2 names to the new names used in Python 3. The
+*encoding* and *errors* tell pickle how to decode 8-bit string
+instances pickled by Python 2; these default to 'ASCII' and 'strict',
+respectively. The *encoding* can be 'bytes' to read these 8-bit
+string instances as bytes objects.
[clinic]*/
PyDoc_STRVAR(_pickle_load__doc__,
"load(file, *, fix_imports=True, encoding=\'ASCII\', errors=\'strict\')\n"
-"Return a reconstituted object from the pickle data stored in a file.\n"
+"Read and return an object from the pickle data stored in a file.\n"
"\n"
-"This is equivalent to ``Unpickler(file).load()``, but may be more efficient.\n"
+"This is equivalent to ``Unpickler(file).load()``, but may be more\n"
+"efficient.\n"
"\n"
-"The protocol version of the pickle is detected automatically, so no protocol\n"
-"argument is needed. Bytes past the pickled object\'s representation are\n"
-"ignored.\n"
+"The protocol version of the pickle is detected automatically, so no\n"
+"protocol argument is needed. Bytes past the pickled object\'s\n"
+"representation are ignored.\n"
"\n"
-"The argument file must have two methods, a read() method that takes an\n"
-"integer argument, and a readline() method that requires no arguments. Both\n"
-"methods should return bytes. Thus *file* can be a binary file object opened\n"
-"for reading, a BytesIO object, or any other custom object that meets this\n"
-"interface.\n"
+"The argument *file* must have two methods, a read() method that takes\n"
+"an integer argument, and a readline() method that requires no\n"
+"arguments. Both methods should return bytes. Thus *file* can be a\n"
+"binary file object opened for reading, an io.BytesIO object, or any\n"
+"other custom object that meets this interface.\n"
"\n"
-"Optional keyword arguments are fix_imports, encoding and errors,\n"
-"which are used to control compatiblity support for pickle stream generated\n"
-"by Python 2.x. If fix_imports is True, pickle will try to map the old\n"
-"Python 2.x names to the new names used in Python 3.x. The encoding and\n"
-"errors tell pickle how to decode 8-bit string instances pickled by Python\n"
-"2.x; these default to \'ASCII\' and \'strict\', respectively.");
+"Optional keyword arguments are *fix_imports*, *encoding* and *errors*,\n"
+"which are used to control compatibility support for pickle stream\n"
+"generated by Python 2. If *fix_imports* is True, pickle will try to\n"
+"map the old Python 2 names to the new names used in Python 3. The\n"
+"*encoding* and *errors* tell pickle how to decode 8-bit string\n"
+"instances pickled by Python 2; these default to \'ASCII\' and \'strict\',\n"
+"respectively. The *encoding* can be \'bytes\' to read these 8-bit\n"
+"string instances as bytes objects.");
#define _PICKLE_LOAD_METHODDEF \
{"load", (PyCFunction)_pickle_load, METH_VARARGS|METH_KEYWORDS, _pickle_load__doc__},
@@ -7304,7 +7328,7 @@ exit:
static PyObject *
_pickle_load_impl(PyModuleDef *module, PyObject *file, int fix_imports, const char *encoding, const char *errors)
-/*[clinic checksum: e10796f6765b22ce48dca6940f11b3933853ca35]*/
+/*[clinic checksum: b41f06970e57acf2fd602e4b7f88e3f3e1e53087]*/
{
PyObject *result;
UnpicklerObject *unpickler = _Unpickler_New();
@@ -7339,34 +7363,38 @@ _pickle.loads
encoding: str = 'ASCII'
errors: str = 'strict'
-Return a reconstituted object from the given pickle data.
+Read and return an object from the given pickle data.
-The protocol version of the pickle is detected automatically, so no protocol
-argument is needed. Bytes past the pickled object's representation are
-ignored.
+The protocol version of the pickle is detected automatically, so no
+protocol argument is needed. Bytes past the pickled object's
+representation are ignored.
-Optional keyword arguments are fix_imports, encoding and errors, which
-are used to control compatiblity support for pickle stream generated
-by Python 2.x. If fix_imports is True, pickle will try to map the old
-Python 2.x names to the new names used in Python 3.x. The encoding and
-errors tell pickle how to decode 8-bit string instances pickled by Python
-2.x; these default to 'ASCII' and 'strict', respectively.
+Optional keyword arguments are *fix_imports*, *encoding* and *errors*,
+which are used to control compatibility support for pickle stream
+generated by Python 2. If *fix_imports* is True, pickle will try to
+map the old Python 2 names to the new names used in Python 3. The
+*encoding* and *errors* tell pickle how to decode 8-bit string
+instances pickled by Python 2; these default to 'ASCII' and 'strict',
+respectively. The *encoding* can be 'bytes' to read these 8-bit
+string instances as bytes objects.
[clinic]*/
PyDoc_STRVAR(_pickle_loads__doc__,
"loads(data, *, fix_imports=True, encoding=\'ASCII\', errors=\'strict\')\n"
-"Return a reconstituted object from the given pickle data.\n"
+"Read and return an object from the given pickle data.\n"
"\n"
-"The protocol version of the pickle is detected automatically, so no protocol\n"
-"argument is needed. Bytes past the pickled object\'s representation are\n"
-"ignored.\n"
+"The protocol version of the pickle is detected automatically, so no\n"
+"protocol argument is needed. Bytes past the pickled object\'s\n"
+"representation are ignored.\n"
"\n"
-"Optional keyword arguments are fix_imports, encoding and errors, which\n"
-"are used to control compatiblity support for pickle stream generated\n"
-"by Python 2.x. If fix_imports is True, pickle will try to map the old\n"
-"Python 2.x names to the new names used in Python 3.x. The encoding and\n"
-"errors tell pickle how to decode 8-bit string instances pickled by Python\n"
-"2.x; these default to \'ASCII\' and \'strict\', respectively.");
+"Optional keyword arguments are *fix_imports*, *encoding* and *errors*,\n"
+"which are used to control compatibility support for pickle stream\n"
+"generated by Python 2. If *fix_imports* is True, pickle will try to\n"
+"map the old Python 2 names to the new names used in Python 3. The\n"
+"*encoding* and *errors* tell pickle how to decode 8-bit string\n"
+"instances pickled by Python 2; these default to \'ASCII\' and \'strict\',\n"
+"respectively. The *encoding* can be \'bytes\' to read these 8-bit\n"
+"string instances as bytes objects.");
#define _PICKLE_LOADS_METHODDEF \
{"loads", (PyCFunction)_pickle_loads, METH_VARARGS|METH_KEYWORDS, _pickle_loads__doc__},
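On the reading side, *fix_imports* and *errors* behave as the `loads` docstring describes. The sketch below uses two hand-written protocol 0 pickles as assumed examples of Python 2 output: a reference to `__builtin__.len`, and the str `'\xe9'`.

```python
import pickle

# GLOBAL opcode ('c') naming the Python 2 module __builtin__.
py2_global = b"c__builtin__\nlen\n."

# fix_imports=True (the default) maps __builtin__ to builtins.
resolved = pickle.loads(py2_global)

# Without the mapping, the Python 2 module name cannot be imported.
try:
    pickle.loads(py2_global, fix_imports=False)
except ImportError:
    unmapped_failed = True
else:
    unmapped_failed = False

# *errors* is passed to the decoder for 8-bit strings; 'replace'
# substitutes U+FFFD for the byte that ASCII cannot decode.
replaced = pickle.loads(b"S'\\xe9'\n.", errors="replace")
```

Passing `encoding="bytes"` instead would skip decoding and return the 8-bit string as a bytes object.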
@@ -7396,7 +7424,7 @@ exit:
static PyObject *
_pickle_loads_impl(PyModuleDef *module, PyObject *data, int fix_imports, const char *encoding, const char *errors)
-/*[clinic checksum: 29ee725efcbf51a3533c19cb8261a8e267b7080a]*/
+/*[clinic checksum: 0663de43aca6c21508a777e29d98c9c3a6e7f72d]*/
{
PyObject *result;
UnpicklerObject *unpickler = _Unpickler_New();