:mod:`pickle` --- Python object serialization
=============================================

.. index::
   single: persistence
   pair: persistent; objects
   pair: serializing; objects
   pair: marshalling; objects
   pair: flattening; objects
   pair: pickling; objects

.. module:: pickle
   :synopsis: Convert Python objects to streams of bytes and back.
.. sectionauthor:: Jim Kerr
.. sectionauthor:: Barry Warsaw


The :mod:`pickle` module implements a fundamental but powerful algorithm for
serializing and de-serializing a Python object structure. "Pickling" is the
process whereby a Python object hierarchy is converted into a byte stream, and
"unpickling" is the inverse operation, whereby a byte stream is converted back
into an object hierarchy. Pickling (and unpickling) is alternatively known as
"serialization", "marshalling", [#]_ or "flattening"; however, to avoid
confusion, the terms used here are "pickling" and "unpickling".


Relationship to other Python modules
------------------------------------

The :mod:`pickle` module has a transparent optimizer (:mod:`_pickle`) written
in C. It is used whenever available. Otherwise the pure Python implementation
is used.

Python has a more primitive serialization module called :mod:`marshal`, but in
general :mod:`pickle` should always be the preferred way to serialize Python
objects. :mod:`marshal` exists primarily to support Python's :file:`.pyc`
files.

The :mod:`pickle` module differs from :mod:`marshal` in several significant ways:

* The :mod:`pickle` module keeps track of the objects it has already serialized,
  so that later references to the same object won't be serialized again.
  :mod:`marshal` doesn't do this.

  This has implications both for recursive objects and object sharing. Recursive
  objects are objects that contain references to themselves. These are not
  handled by :mod:`marshal`, and in fact, attempting to marshal recursive
  objects will crash your Python interpreter.
  Object sharing happens when there are multiple references to the same object in
  different places in the object hierarchy being serialized. :mod:`pickle` stores
  such objects only once, and ensures that all other references point to the
  master copy. Shared objects remain shared, which can be very important for
  mutable objects.

* :mod:`marshal` cannot be used to serialize user-defined classes and their
  instances. :mod:`pickle` can save and restore class instances transparently;
  however, the class definition must be importable and live in the same module
  as when the object was stored.

* The :mod:`marshal` serialization format is not guaranteed to be portable across
  Python versions. Because its primary job in life is to support :file:`.pyc`
  files, the Python implementers reserve the right to change the serialization
  format in non-backwards compatible ways should the need arise. The
  :mod:`pickle` serialization format is guaranteed to be backwards compatible
  across Python releases.

.. warning::

   The :mod:`pickle` module is not intended to be secure against erroneous or
   maliciously constructed data. Never unpickle data received from an untrusted
   or unauthenticated source.

Note that serialization is a more primitive notion than persistence; although
:mod:`pickle` reads and writes file objects, it does not handle the issue of
naming persistent objects, nor the (even more complicated) issue of concurrent
access to persistent objects. The :mod:`pickle` module can transform a complex
object into a byte stream and it can transform the byte stream into an object
with the same internal structure. Perhaps the most obvious thing to do with
these byte streams is to write them to a file, but it is also conceivable to
send them across a network or store them in a database. The module
:mod:`shelve` provides a simple interface to pickle and unpickle objects on
DBM-style database files.


Data stream format
------------------

..

.. index::
   single: XDR
   single: External Data Representation

The data format used by :mod:`pickle` is Python-specific. This has the
advantage that there are no restrictions imposed by external standards such as
XDR (which can't represent pointer sharing); however, it means that non-Python
programs may not be able to reconstruct pickled Python objects.

By default, the :mod:`pickle` data format uses a compact binary representation.
The module :mod:`pickletools` contains tools for analyzing data streams
generated by :mod:`pickle`.

There are currently 4 different protocols which can be used for pickling.

* Protocol version 0 is the original ASCII protocol and is backwards compatible
  with earlier versions of Python.

* Protocol version 1 is the old binary format which is also compatible with
  earlier versions of Python.

* Protocol version 2 was introduced in Python 2.3. It provides much more
  efficient pickling of :term:`new-style class`\es.

* Protocol version 3 was added in Python 3.0. It has explicit support for bytes
  and cannot be unpickled by Python 2.x pickle modules. This is the current
  recommended protocol; use it whenever possible.

Refer to :pep:`307` for information about improvements brought by protocol 2.
See :mod:`pickletools`'s source code for extensive comments about opcodes used
by pickle protocols.

If a *protocol* is not specified, protocol 3 is used. If *protocol* is
specified as a negative value or :const:`HIGHEST_PROTOCOL`, the highest
protocol version available will be used.


Module Interface
----------------

To serialize an object hierarchy, you first create a pickler, then you call the
pickler's :meth:`dump` method. To de-serialize a data stream, you first create
an unpickler, then you call the unpickler's :meth:`load` method. The
:mod:`pickle` module provides the following constant:


.. data:: HIGHEST_PROTOCOL

   The highest protocol version available. This value can be passed as a
   *protocol* value.

..

.. note::

   Be sure to always open pickle files created with protocols >= 1 in binary
   mode. For the old ASCII-based pickle protocol 0 you can use either text mode
   or binary mode as long as you stay consistent.

   A pickle file written with protocol 0 in binary mode will contain lone
   linefeeds as line terminators and therefore will look "funny" when viewed in
   Notepad or other editors which do not support this format.

.. data:: DEFAULT_PROTOCOL

   The default protocol used for pickling. May be less than
   :data:`HIGHEST_PROTOCOL`. Currently the default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.


The :mod:`pickle` module provides the following functions to make the pickling
process more convenient:

.. function:: dump(obj, file[, protocol])

   Write a pickled representation of *obj* to the open file object *file*. This
   is equivalent to ``Pickler(file, protocol).dump(obj)``.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported. The higher the protocol used, the more recent the version of
   Python needed to read the pickle produced.

   The *file* argument must have a ``write()`` method that accepts a single
   bytes argument. It can thus be a file object opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

.. function:: dumps(obj[, protocol])

   Return the pickled representation of the object as a :class:`bytes` object,
   instead of writing it to a file.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported.
   The higher the protocol used, the more recent the version of Python needed
   to read the pickle produced.

.. function:: load(file, [\*, encoding="ASCII", errors="strict"])

   Read a pickled object representation from the open file object *file* and
   return the reconstituted object hierarchy specified therein. This is
   equivalent to ``Unpickler(file).load()``.

   The protocol version of the pickle is detected automatically, so no protocol
   argument is needed. Bytes past the pickled object's representation are
   ignored.

   The argument *file* must have two methods, a ``read()`` method that takes an
   integer argument, and a ``readline()`` method that requires no arguments.
   Both methods should return bytes. Thus *file* can be a binary file object
   opened for reading, an :class:`io.BytesIO` object, or any other custom
   object that meets this interface.

   Optional keyword arguments are *encoding* and *errors*, which are used to
   decode 8-bit string instances pickled by Python 2.x. These default to
   'ASCII' and 'strict', respectively.

.. function:: loads(bytes_object, [\*, encoding="ASCII", errors="strict"])

   Read a pickled object hierarchy from a :class:`bytes` object and return the
   reconstituted object hierarchy specified therein.

   The protocol version of the pickle is detected automatically, so no protocol
   argument is needed. Bytes past the pickled object's representation are
   ignored.

   Optional keyword arguments are *encoding* and *errors*, which are used to
   decode 8-bit string instances pickled by Python 2.x. These default to
   'ASCII' and 'strict', respectively.


The :mod:`pickle` module defines three exceptions:

.. exception:: PickleError

   Common base class for the other pickling exceptions. It inherits
   :exc:`Exception`.

.. exception:: PicklingError

   Error raised when an unpicklable object is encountered by :class:`Pickler`.
   It inherits :exc:`PickleError`.

   Refer to :ref:`pickle-picklable` to learn what kinds of objects can be
   pickled.

..

.. exception:: UnpicklingError

   Error raised when there is a problem unpickling an object, such as data
   corruption or a security violation. It inherits :exc:`PickleError`.

   Note that other exceptions may also be raised during unpickling, including
   (but not necessarily limited to) :exc:`AttributeError`, :exc:`EOFError`,
   :exc:`ImportError`, and :exc:`IndexError`.


The :mod:`pickle` module exports two classes, :class:`Pickler` and
:class:`Unpickler`:

.. class:: Pickler(file[, protocol])

   This takes a binary file for writing a pickle data stream.

   The optional *protocol* argument tells the pickler to use the given protocol;
   supported protocols are 0, 1, 2, 3. The default protocol is 3, a
   backward-incompatible protocol designed for Python 3.0.

   Specifying a negative protocol version selects the highest protocol version
   supported. The higher the protocol used, the more recent the version of
   Python needed to read the pickle produced.

   The *file* argument must have a ``write()`` method that accepts a single
   bytes argument. It can thus be a file object opened for binary writing, an
   :class:`io.BytesIO` instance, or any other custom object that meets this
   interface.

   .. method:: dump(obj)

      Write a pickled representation of *obj* to the open file object given in
      the constructor.

   .. method:: persistent_id(obj)

      Do nothing by default. This exists so a subclass can override it.

      If :meth:`persistent_id` returns ``None``, *obj* is pickled as usual. Any
      other value causes :class:`Pickler` to emit the returned value as a
      persistent ID for *obj*. The meaning of this persistent ID should be
      defined by :meth:`Unpickler.persistent_load`. Note that the value
      returned by :meth:`persistent_id` cannot itself have a persistent ID.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: clear_memo()

      Deprecated. Use the :meth:`clear` method on :attr:`memo` instead.
      Clear the pickler's memo, useful when reusing picklers.

   .. attribute:: fast

      Enable fast mode if set to a true value.
      The fast mode disables the usage of memo, therefore speeding up the
      pickling process by not generating superfluous PUT opcodes. It should not
      be used with self-referential objects; doing otherwise will cause
      :class:`Pickler` to recurse infinitely.

      Use :func:`pickletools.optimize` if you need more compact pickles.

   .. attribute:: memo

      Dictionary holding previously pickled objects to allow shared or
      recursive objects to be pickled by reference as opposed to by value.

   It is possible to make multiple calls to the :meth:`dump` method of the same
   :class:`Pickler` instance. These must then be matched to the same number of
   calls to the :meth:`load` method of the corresponding :class:`Unpickler`
   instance. If the same object is pickled by multiple :meth:`dump` calls, the
   :meth:`load` calls will all yield references to the same object.

   Please note, this is intended for pickling multiple objects without
   intervening modifications to the objects or their parts. If you modify an
   object and then pickle it again using the same :class:`Pickler` instance, the
   object is not pickled again; instead, a reference to it is pickled and the
   :class:`Unpickler` will return the old value, not the modified one.


.. class:: Unpickler(file, [\*, encoding="ASCII", errors="strict"])

   This takes a binary file for reading a pickle data stream.

   The protocol version of the pickle is detected automatically, so no
   protocol argument is needed.

   The argument *file* must have two methods, a ``read()`` method that takes an
   integer argument, and a ``readline()`` method that requires no arguments.
   Both methods should return bytes. Thus *file* can be a binary file object
   opened for reading, an :class:`io.BytesIO` object, or any other custom
   object that meets this interface.

   Optional keyword arguments are *encoding* and *errors*, which are used to
   decode 8-bit string instances pickled by Python 2.x. These default to
   'ASCII' and 'strict', respectively.

   ..

   .. method:: load()

      Read a pickled object representation from the open file object given in
      the constructor, and return the reconstituted object hierarchy specified
      therein. Bytes past the pickled object's representation are ignored.

   .. method:: persistent_load(pid)

      Raise an :exc:`UnpicklingError` by default.

      If defined, :meth:`persistent_load` should return the object specified by
      the persistent ID *pid*. If an invalid persistent ID is encountered, an
      :exc:`UnpicklingError` should be raised.

      See :ref:`pickle-persistent` for details and examples of uses.

   .. method:: find_class(module, name)

      Import *module* if necessary and return the object called *name* from it,
      where the *module* and *name* arguments are :class:`str` objects. Note
      that, despite its name, :meth:`find_class` is also used for finding
      functions.

      Subclasses may override this to gain control over what type of objects and
      how they can be loaded, potentially reducing security risks. Refer to
      :ref:`pickle-restrict` for details.


.. _pickle-picklable:

What can be pickled and unpickled?
----------------------------------

The following types can be pickled:

* ``None``, ``True``, and ``False``

* integers, floating point numbers, complex numbers

* strings, bytes, bytearrays

* tuples, lists, sets, and dictionaries containing only picklable objects

* functions defined at the top level of a module

* built-in functions defined at the top level of a module

* classes that are defined at the top level of a module

* instances of such classes whose :attr:`__dict__` or the result of calling
  :meth:`__getstate__` is picklable (see section :ref:`pickle-protocol` for
  details)

Attempts to pickle unpicklable objects will raise the :exc:`PicklingError`
exception; when this happens, an unspecified number of bytes may have already
been written to the underlying file. Trying to pickle a highly recursive data
structure may exceed the maximum recursion depth; a :exc:`RuntimeError` will be
raised in this case.
You can carefully raise this limit with :func:`sys.setrecursionlimit`.

Note that functions (built-in and user-defined) are pickled by "fully qualified"
name reference, not by value. This means that only the function name is
pickled, along with the name of the module the function is defined in. Neither
the function's code, nor any of its function attributes are pickled. Thus the
defining module must be importable in the unpickling environment, and the module
must contain the named object, otherwise an exception will be raised. [#]_

Similarly, classes are pickled by named reference, so the same restrictions in
the unpickling environment apply. Note that none of the class's code or data is
pickled, so in the following example the class attribute ``attr`` is not
restored in the unpickling environment::

   import pickle

   class Foo:
       attr = 'A class attribute'

   picklestring = pickle.dumps(Foo)

These restrictions are why picklable functions and classes must be defined in
the top level of a module.

Similarly, when class instances are pickled, their class's code and data are not
pickled along with them. Only the instance data are pickled. This is done on
purpose, so you can fix bugs in a class or add methods to the class and still
load objects that were created with an earlier version of the class. If you
plan to have long-lived objects that will see many versions of a class, it may
be worthwhile to put a version number in the objects so that suitable
conversions can be made by the class's :meth:`__setstate__` method.


.. _pickle-protocol:

The pickle protocol
-------------------

This section describes the "pickling protocol" that defines the interface
between the pickler/unpickler and the objects that are being serialized. This
protocol provides a standard way for you to define, customize, and control how
your objects are serialized and de-serialized.

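
As a first illustration of such customization, the following minimal sketch
applies the earlier advice about storing a version number in long-lived
objects. The ``Account`` class, its attributes, the ``_version`` key, and the
assumed "version 1" layout with a floating point ``balance`` are all
hypothetical and exist only for this example; they are not part of the
:mod:`pickle` module.

```python
import pickle


class Account:
    """Hypothetical class whose pickled state carries a layout version."""

    STATE_VERSION = 2  # assumed current layout: 'owner' and 'balance_cents'

    def __init__(self, owner, balance_cents=0):
        self.owner = owner
        self.balance_cents = balance_cents

    def __getstate__(self):
        # Copy the instance dictionary and tag it with the layout version.
        state = self.__dict__.copy()
        state['_version'] = self.STATE_VERSION
        return state

    def __setstate__(self, state):
        state = dict(state)
        # State written without a tag is treated as the assumed version 1.
        version = state.pop('_version', 1)
        if version < 2:
            # Assumed older layout: a float 'balance' instead of integer cents.
            state['balance_cents'] = round(state.pop('balance', 0.0) * 100)
        self.__dict__.update(state)


# Round trip through the current layout.
restored = pickle.loads(pickle.dumps(Account('alice', 1250)))
```

Because only instance data is pickled, the conversion logic lives in the class
and can evolve together with it; a pickle written with the assumed older layout
would be upgraded when :meth:`__setstate__` runs during unpickling.
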
The description in this section doesn't cover specific customizations that you
can employ to make the unpickling environment slightly safer from untrusted
pickle data streams; see section :ref:`pickle-restrict` for more details.


.. _pickle-inst:

Pickling and unpickling normal class instances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __getinitargs__() (copy protocol)
   single: __init__() (instance constructor)

.. XXX is __getinitargs__ only used with old-style classes?
.. XXX update w.r.t Py3k's classes

When a pickled class instance is unpickled, its :meth:`__init__` method is
normally *not* invoked. If it is desirable that the :meth:`__init__` method be
called on unpickling, an old-style class can define a method
:meth:`__getinitargs__`, which should return a *tuple* containing the arguments
to be passed to the class constructor (:meth:`__init__` for example). The
:meth:`__getinitargs__` method is called at pickle time; the tuple it returns is
incorporated in the pickle for the instance.

.. index:: single: __getnewargs__() (copy protocol)

New-style types can provide a :meth:`__getnewargs__` method that is used for
protocol 2. Implementing this method is needed if the type establishes some
internal invariants when the instance is created, or if the memory allocation
is affected by the values passed to the :meth:`__new__` method for the type (as
it is for tuples and strings). Instances of a :term:`new-style class`
:class:`C` are created using ::

   obj = C.__new__(C, *args)

where *args* is the result of calling :meth:`__getnewargs__` on the original
object; if there is no :meth:`__getnewargs__`, an empty tuple is assumed.

..

.. index::
   single: __getstate__() (copy protocol)
   single: __setstate__() (copy protocol)
   single: __dict__ (instance attribute)

Classes can further influence how their instances are pickled; if the class
defines the method :meth:`__getstate__`, it is called and the returned state is
pickled as the contents for the instance, instead of the contents of the
instance's dictionary. If there is no :meth:`__getstate__` method, the
instance's :attr:`__dict__` is pickled.

Upon unpickling, if the class also defines the method :meth:`__setstate__`, it
is called with the unpickled state. [#]_ If there is no :meth:`__setstate__`
method, the pickled state must be a dictionary and its items are assigned to the
new instance's dictionary. If a class defines both :meth:`__getstate__` and
:meth:`__setstate__`, the state object needn't be a dictionary and these methods
can do what they want. [#]_

.. warning::

   If :meth:`__getstate__` returns a false value, the :meth:`__setstate__`
   method will not be called.


Pickling and unpickling extension types
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: __reduce__() (pickle protocol)
   single: __reduce_ex__() (pickle protocol)
   single: __safe_for_unpickling__ (pickle protocol)

When the :class:`Pickler` encounters an object of a type it knows nothing about
(such as an extension type), it looks in two places for a hint of how to pickle
it. One alternative is for the object to implement a :meth:`__reduce__` method.
If provided, at pickling time :meth:`__reduce__` will be called with no
arguments, and it must return either a string or a tuple.

If a string is returned, it names a global variable whose contents are pickled
as normal. The string returned by :meth:`__reduce__` should be the object's
local name relative to its module; the pickle module searches the module
namespace to determine the object's module.

When a tuple is returned, it must be between two and five elements long.
Optional elements can either be omitted, or ``None`` can be provided as their
value. The contents of this tuple are pickled as normal and used to
reconstruct the object at unpickling time. The semantics of each element are:

* A callable object that will be called to create the initial version of the
  object. The next element of the tuple will provide arguments for this
  callable, and later elements provide additional state information that will
  subsequently be used to fully reconstruct the pickled data.

  In the unpickling environment this object must be either a class, a callable
  registered as a "safe constructor" (see below), or it must have an attribute
  :attr:`__safe_for_unpickling__` with a true value. Otherwise, an
  :exc:`UnpicklingError` will be raised in the unpickling environment. Note that
  as usual, the callable itself is pickled by name.

* A tuple of arguments for the callable object, not ``None``.

* Optionally, the object's state, which will be passed to the object's
  :meth:`__setstate__` method as described in section :ref:`pickle-inst`. If
  the object has no :meth:`__setstate__` method, then, as above, the value
  must be a dictionary and it will be added to the object's :attr:`__dict__`.

* Optionally, an iterator (and not a sequence) yielding successive list items.
  These list items will be pickled, and appended to the object using either
  ``obj.append(item)`` or ``obj.extend(list_of_items)``. This is primarily used
  for list subclasses, but may be used by other classes as long as they have
  :meth:`append` and :meth:`extend` methods with the appropriate signature.
  (Whether :meth:`append` or :meth:`extend` is used depends on which pickle
  protocol version is used as well as the number of items to append, so both
  must be supported.)

* Optionally, an iterator (not a sequence) yielding successive dictionary items,
  which should be tuples of the form ``(key, value)``. These items will be
  pickled and stored to the object using ``obj[key] = value``.
  This is primarily used for dictionary subclasses, but may be used by other
  classes as long as they implement :meth:`__setitem__`.

It is sometimes useful to know the protocol version when implementing
:meth:`__reduce__`. This can be done by implementing a method named
:meth:`__reduce_ex__` instead of :meth:`__reduce__`. :meth:`__reduce_ex__`,
when it exists, is called in preference over :meth:`__reduce__` (you may still
provide :meth:`__reduce__` for backwards compatibility). The
:meth:`__reduce_ex__` method will be called with a single integer argument, the
protocol version.

The :class:`object` class implements both :meth:`__reduce__` and
:meth:`__reduce_ex__`; however, if a subclass overrides :meth:`__reduce__` but
not :meth:`__reduce_ex__`, the :meth:`__reduce_ex__` implementation detects this
and calls :meth:`__reduce__`.

An alternative to implementing a :meth:`__reduce__` method on the object to be
pickled is to register the callable with the :mod:`copyreg` module. This
module provides a way for programs to register "reduction functions" and
constructors for user-defined types. Reduction functions have the same
semantics and interface as the :meth:`__reduce__` method described above, except
that they are called with a single argument, the object to be pickled.

The registered constructor is deemed a "safe constructor" for purposes of
unpickling as described above.


.. _pickle-persistent:

Pickling and unpickling external objects
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. index::
   single: persistent_id (pickle protocol)
   single: persistent_load (pickle protocol)

For the benefit of object persistence, the :mod:`pickle` module supports the
notion of a reference to an object outside the pickled data stream. Such
objects are referenced by a persistent ID, which should be either a string of
alphanumeric characters (for protocol 0) [#]_ or just an arbitrary object (for
any newer protocol).
The resolution of such persistent IDs is not defined by the :mod:`pickle`
module; it delegates this resolution to the user-defined methods on the
pickler and unpickler, :meth:`persistent_id` and :meth:`persistent_load`
respectively.

To pickle objects that have an external persistent ID, the pickler must have a
custom :meth:`persistent_id` method that takes an object as an argument and
returns either ``None`` or the persistent ID for that object. When ``None`` is
returned, the pickler simply pickles the object as normal. When a persistent ID
is returned, the pickler will pickle that ID in place of the object, along with
a marker so that the unpickler will recognize it as a persistent ID.

To unpickle external objects, the unpickler must have a custom
:meth:`persistent_load` method that takes a persistent ID object and returns the
referenced object.

Example:

.. XXX Work around for some bug in sphinx/pygments.
.. highlightlang:: python
.. literalinclude:: ../includes/dbpickle.py
.. highlightlang:: python3

.. _pickle-restrict:

Restricting Globals
^^^^^^^^^^^^^^^^^^^

.. index::
   single: find_class() (pickle protocol)

By default, unpickling will import any class or function that it finds in the
pickle data. For many applications, this behaviour is unacceptable as it
permits the unpickler to import and invoke arbitrary code. Just consider what
this hand-crafted pickle data stream does when loaded::

    >>> import pickle
    >>> pickle.loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
    hello world
    0

In this example, the unpickler imports the :func:`os.system` function and then
applies it to the string argument "echo hello world". Although this example is
inoffensive, it is not difficult to imagine one that could damage your system.

For this reason, you may want to control what gets unpickled by customizing
:meth:`Unpickler.find_class`. Despite its name, :meth:`find_class` is called
whenever a global (i.e., a class or a function) is requested.
Thus it is possible to either forbid globals completely or restrict them to a
safe subset.

Here is an example of an unpickler allowing only a few safe classes from the
:mod:`builtins` module to be loaded::

   import builtins
   import io
   import pickle

   safe_builtins = {
       'range',
       'complex',
       'set',
       'frozenset',
       'slice',
   }

   class RestrictedUnpickler(pickle.Unpickler):

       def find_class(self, module, name):
           # Only allow safe classes from builtins.
           if module == "builtins" and name in safe_builtins:
               return getattr(builtins, name)
           # Forbid everything else.
           raise pickle.UnpicklingError("global '%s.%s' is forbidden" %
                                        (module, name))

   def restricted_loads(s):
       """Helper function analogous to pickle.loads()."""
       return RestrictedUnpickler(io.BytesIO(s)).load()

A sample usage of our unpickler working as intended::

    >>> restricted_loads(pickle.dumps([1, 2, range(15)]))
    [1, 2, range(0, 15)]
    >>> restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
    Traceback (most recent call last):
      ...
    pickle.UnpicklingError: global 'os.system' is forbidden
    >>> restricted_loads(b'cbuiltins\neval\n'
    ...                  b'(S\'getattr(__import__("os"), "system")'
    ...                  b'("echo hello world")\'\ntR.')
    Traceback (most recent call last):
      ...
    pickle.UnpicklingError: global 'builtins.eval' is forbidden

As our examples show, you have to be careful with what you allow to be
unpickled. Therefore if security is a concern, you may want to consider
alternatives such as the marshalling API in :mod:`xmlrpc.client` or
third-party solutions.


.. _pickle-example:

Example
-------

For the simplest code, use the :func:`dump` and :func:`load` functions. Note
that a self-referencing list is pickled and restored correctly. ::

   import pickle

   data1 = {'a': [1, 2.0, 3, 4+6j],
            'b': ("string", "string using Unicode features \u0394"),
            'c': None}

   selfref_list = [1, 2, 3]
   selfref_list.append(selfref_list)

   output = open('data.pkl', 'wb')

   # Pickle dictionary using protocol 2.
   pickle.dump(data1, output, 2)

   # Pickle the list using the highest protocol available.
   pickle.dump(selfref_list, output, -1)

   output.close()

The following example reads the resulting pickled data. When reading a
pickle-containing file, you should open the file in binary mode because you
can't be sure if the ASCII or binary format was used. ::

   import pickle
   import pprint

   pkl_file = open('data.pkl', 'rb')

   data1 = pickle.load(pkl_file)
   pprint.pprint(data1)

   data2 = pickle.load(pkl_file)
   pprint.pprint(data2)

   pkl_file.close()

Here's a larger example that shows how to modify pickling behavior for a class.
The :class:`TextReader` class opens a text file, and returns the line number and
line contents each time its :meth:`readline` method is called. If a
:class:`TextReader` instance is pickled, all attributes *except* the file object
member are saved. When the instance is unpickled, the file is reopened, and
reading resumes from the last location. The :meth:`__setstate__` and
:meth:`__getstate__` methods are used to implement this behavior. ::

   #!/usr/local/bin/python

   class TextReader:
       """Print and number lines in a text file."""

       def __init__(self, file):
           self.file = file
           self.fh = open(file)
           self.lineno = 0

       def readline(self):
           self.lineno = self.lineno + 1
           line = self.fh.readline()
           if not line:
               return None
           if line.endswith("\n"):
               line = line[:-1]
           return "%d: %s" % (self.lineno, line)

       def __getstate__(self):
           odict = self.__dict__.copy() # copy the dict since we change it
           del odict['fh']              # remove filehandle entry
           return odict

       def __setstate__(self, dict):
           fh = open(dict['file'])      # reopen file
           count = dict['lineno']       # read from file...
           while count:                 # until line count is restored
               fh.readline()
               count = count - 1
           self.__dict__.update(dict)   # update attributes
           self.fh = fh                 # save the file object

A sample usage might be something like this::

   >>> import TextReader
   >>> obj = TextReader.TextReader("TextReader.py")
   >>> obj.readline()
   '1: #!/usr/local/bin/python'
   >>> obj.readline()
   '2: '
   >>> obj.readline()
   '3: class TextReader:'
   >>> import pickle
   >>> pickle.dump(obj, open('save.p', 'wb'))

If you want to see that :mod:`pickle` works across Python processes, start
another Python session before continuing. What follows can happen from either
the same process or a new process. ::

   >>> import pickle
   >>> reader = pickle.load(open('save.p', 'rb'))
   >>> reader.readline()
   '4:     """Print and number lines in a text file."""'


.. seealso::

   Module :mod:`copyreg`
      Pickle interface constructor registration for extension types.

   Module :mod:`shelve`
      Indexed databases of objects; uses :mod:`pickle`.

   Module :mod:`copy`
      Shallow and deep object copying.

   Module :mod:`marshal`
      High-performance serialization of built-in types.


.. rubric:: Footnotes

.. [#] Don't confuse this with the :mod:`marshal` module.

.. [#] The exception raised will likely be an :exc:`ImportError` or an
   :exc:`AttributeError` but it could be something else.

.. [#] These methods can also be used to implement copying class instances.

.. [#] This protocol is also used by the shallow and deep copying operations
   defined in the :mod:`copy` module.

.. [#] The limitation on alphanumeric characters is due to the fact that
   persistent IDs, in protocol 0, are delimited by the newline character.
   Therefore if any kind of newline character occurs in a persistent ID, the
   resulting pickle will become unreadable.
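
To make the persistent ID mechanism from the section on external objects more
concrete, here is a minimal in-memory sketch. It is not the
:file:`dbpickle.py` example referenced earlier; the ``Record`` class, the
``STORE`` dictionary standing in for an external database, and the
``('Record', key)`` tuple used as a persistent ID are assumptions made purely
for illustration.

```python
import io
import pickle


class Record:
    """Hypothetical externally stored object, identified by a key."""

    def __init__(self, key, payload):
        self.key = key
        self.payload = payload


# Stand-in for an external store (an assumption for this sketch).
STORE = {'r1': Record('r1', 'first record')}


class StorePickler(pickle.Pickler):

    def persistent_id(self, obj):
        # Records are written as a reference; everything else is pickled
        # as usual because None is returned.
        if isinstance(obj, Record):
            return ('Record', obj.key)
        return None


class StoreUnpickler(pickle.Unpickler):

    def persistent_load(self, pid):
        # Resolve the reference against the store, rejecting unknown IDs.
        kind, key = pid
        if kind == 'Record' and key in STORE:
            return STORE[key]
        raise pickle.UnpicklingError('unsupported persistent id: %r' % (pid,))


buffer = io.BytesIO()
# A binary protocol is used because the persistent ID here is a tuple.
StorePickler(buffer, protocol=2).dump({'ref': STORE['r1'], 'n': 3})
buffer.seek(0)
result = StoreUnpickler(buffer).load()
```

In this sketch the unpickled dictionary refers to the very object held in
``STORE`` rather than to a copy, since only the reference travelled through
the pickle.
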