:mod:`shelve` --- Python object persistence
===========================================

.. module:: shelve
   :synopsis: Python object persistence.


.. index:: module: pickle

A "shelf" is a persistent, dictionary-like object.  Unlike in ``dbm``
databases, the values (not the keys!) in a shelf can be essentially arbitrary
Python objects, that is, anything that the :mod:`pickle` module can handle.
This includes most class instances, recursive data types, and objects containing
lots of shared sub-objects.  The keys are ordinary strings.


.. function:: open(filename, flag='c', protocol=None, writeback=False)

   Open a persistent dictionary.  The filename specified is the base filename for
   the underlying database.  As a side-effect, an extension may be added to the
   filename and more than one file may be created.  By default, the underlying
   database file is opened for reading and writing.  The optional *flag* parameter
   has the same interpretation as the *flag* parameter of :func:`dbm.open`.

   By default, version 3 pickles are used to serialize values.  The version of the
   pickle protocol can be specified with the *protocol* parameter.

   Because of Python semantics, a shelf cannot know when a mutable
   persistent-dictionary entry is modified.  By default, modified objects are
   written only when assigned to the shelf (see :ref:`shelve-example`).  If
   the optional *writeback* parameter is set to ``True``, all entries accessed
   are also cached in memory and written back at close time.  This makes it
   handier to mutate mutable entries in the persistent dictionary.  However,
   if many entries are accessed, the cache can consume vast amounts of memory,
   and the close operation can become very slow because all accessed entries
   are written back (there is no way to determine which accessed entries are
   mutable, nor which ones were actually mutated).

Shelf objects support all methods supported by dictionaries.  This eases the
transition from dictionary-based scripts to those requiring persistent storage.
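As a minimal sketch of the dictionary-style interface, the following uses a few ordinary mapping methods on a shelf.  The temporary directory, the ``inventory`` base filename, and the keys are illustrative assumptions, not part of the module.

```python
import os
import shelve
import tempfile

# Hypothetical scratch location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "inventory")

shelf = shelve.open(path)
try:
    shelf["apples"] = 3                 # plain item assignment
    shelf.update({"pears": 5})          # mapping-style bulk update
    pears = shelf.get("pears", 0)       # lookup with a default
    missing = shelf.get("plums", 0)     # absent key falls back to the default
    removed = shelf.pop("apples")       # remove a key and return its value
    remaining = sorted(shelf.keys())    # keys that are still stored
    count = len(shelf)
finally:
    shelf.close()
```

The ``try``/``finally`` is only there so the shelf is closed even if one of the operations raises.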

One additional method is supported:


.. method:: Shelf.sync()

   Write back all entries in the cache if the shelf was opened with *writeback*
   set to ``True``.  Also empty the cache and synchronize the persistent
   dictionary on disk, if feasible.  This is called automatically when the
   shelf is closed with :meth:`close`.
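The sketch below illustrates one way :meth:`sync` might be used with a shelf opened with ``writeback=True``: an entry is mutated in place, flushed explicitly, and then read back after reopening.  The path and the ``numbers`` key are assumptions made for the illustration.

```python
import os
import shelve
import tempfile

# Hypothetical scratch location for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "cache_demo")

shelf = shelve.open(path, writeback=True)
try:
    shelf["numbers"] = [1, 2]
    shelf["numbers"].append(3)   # mutates the cached object in memory
    shelf.sync()                 # write cached entries back and empty the cache
finally:
    shelf.close()                # close() also calls sync()

# Reopen the shelf to check what was actually persisted.
reopened = shelve.open(path)
try:
    numbers = reopened["numbers"]
finally:
    reopened.close()
```

Calling :meth:`sync` periodically in a long-running program keeps the cache from growing without bound, at the cost of rewriting every entry accessed since the previous call.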

.. seealso::

   `Persistent dictionary recipe <http://code.activestate.com/recipes/576642/>`_
   with widely supported storage formats and having the speed of native
   dictionaries.


Restrictions
------------

  .. index::
     module: dbm.ndbm
     module: dbm.gnu

* The choice of which database package will be used (such as :mod:`dbm.ndbm` or
  :mod:`dbm.gnu`) depends on which interface is available.  Therefore it is not
  safe to open the database directly using :mod:`dbm`.  The database is also
  (unfortunately) subject to the limitations of :mod:`dbm`, if it is used ---
  this means that (the pickled representation of) the objects stored in the
  database should be fairly small, and in rare cases key collisions may cause
  the database to refuse updates.

* Depending on the implementation, closing a persistent dictionary may or may
  not be necessary to flush changes to disk.  The :meth:`__del__` method of the
  :class:`Shelf` class calls the :meth:`close` method, so the programmer generally
  need not do this explicitly.

* The :mod:`shelve` module does not support *concurrent* read/write access to
  shelved objects.  (Multiple simultaneous read accesses are safe.)  When a
  program has a shelf open for writing, no other program should have it open for
  reading or writing.  Unix file locking can be used to solve this, but this
  differs across Unix versions and requires knowledge about the database
  implementation used.
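Since flushing behaviour depends on the underlying database, closing the shelf explicitly is the cautious choice.  A minimal sketch, assuming a scratch path and a ``theme`` key chosen purely for illustration, uses :func:`contextlib.closing` to guarantee the call to :meth:`close`:

```python
import contextlib
import os
import shelve
import tempfile

# Hypothetical scratch location for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "settings")

# closing() calls shelf.close() on exit, even if the body raises.
with contextlib.closing(shelve.open(path)) as shelf:
    shelf["theme"] = "dark"

# The shelf was closed above, so reopening it here is safe.
with contextlib.closing(shelve.open(path)) as shelf:
    theme = shelf["theme"]
```

This keeps only one handle open at a time, which also respects the restriction on concurrent access within a single program.  It does not protect against other processes opening the same shelf.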


.. class:: Shelf(dict, protocol=None, writeback=False)

   A subclass of :class:`collections.MutableMapping` which stores pickled values
   in the *dict* object.

   By default, version 3 pickles are used to serialize values.  The version of the
   pickle protocol can be specified with the *protocol* parameter. See the
   :mod:`pickle` documentation for a discussion of the pickle protocols.

   If the *writeback* parameter is ``True``, the object will hold a cache of all
   entries accessed and write them back to the *dict* at sync and close times.
   This allows natural operations on mutable entries, but can consume much more
   memory and make sync and close take a long time.
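To make the role of the *dict* argument concrete, here is a small sketch that wraps an ordinary in-memory dictionary and inspects what the shelf stored in it.  The ``backing`` dictionary, the ``point`` key, and the choice of protocol 2 are assumptions for the example; the exact form of the stored key may differ between Python versions, so the sketch does not depend on it.

```python
import pickle
import shelve

backing = {}                            # any dict-like object can serve as storage
shelf = shelve.Shelf(backing, protocol=2)
shelf["point"] = {"x": 1, "y": 2}
shelf.close()

# The backing mapping holds the pickled representation, not the object itself.
raw = next(iter(backing.values()))
restored = pickle.loads(raw)
```

Because the backing object here is a plain dictionary, nothing is written to disk; persistence comes entirely from whatever mapping is passed in.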


.. class:: BsdDbShelf(dict, protocol=None, writeback=False)

   A subclass of :class:`Shelf` which exposes :meth:`first`, :meth:`!next`,
   :meth:`previous`, :meth:`last` and :meth:`set_location` which are available
   in the third-party :mod:`bsddb` module from `pybsddb
   <http://www.jcea.es/programacion/pybsddb.htm>`_ but not in other database
   modules.  The *dict* object passed to the constructor must support those
   methods.  This is generally accomplished by calling one of
   :func:`bsddb.hashopen`, :func:`bsddb.btopen` or :func:`bsddb.rnopen`.  The
   optional *protocol* and *writeback* parameters have the same interpretation
   as for the :class:`Shelf` class.


.. class:: DbfilenameShelf(filename, flag='c', protocol=None, writeback=False)

   A subclass of :class:`Shelf` which accepts a *filename* instead of a dict-like
   object.  The underlying file will be opened using :func:`dbm.open`.  By
   default, the file will be created and opened for both read and write.  The
   optional *flag* parameter has the same interpretation as for the :func:`.open`
   function.  The optional *protocol* and *writeback* parameters have the same
   interpretation as for the :class:`Shelf` class.


.. _shelve-example:

Example
-------

To summarize the interface (``key`` is a string, ``data`` is an arbitrary
object)::

   import shelve

   d = shelve.open(filename) # open -- file may get suffix added by low-level
                             # library

   d[key] = data   # store data at key (overwrites old data if
                   # using an existing key)
   data = d[key]   # retrieve a COPY of data at key (raise KeyError if no
                   # such key)
   del d[key]      # delete data stored at key (raises KeyError
                   # if no such key)
   flag = key in d        # true if the key exists
   klist = list(d.keys()) # a list of all existing keys (slow!)

   # as d was opened WITHOUT writeback=True, beware:
   d['xx'] = [0, 1, 2, 3]  # this works as expected, but...
   d['xx'].append(5)       # *this doesn't!* -- d['xx'] is STILL [0, 1, 2, 3]!

   # having opened d without writeback=True, you need to code carefully:
   temp = d['xx']      # extracts the copy
   temp.append(5)      # mutates the copy
   d['xx'] = temp      # stores the copy right back, to persist it

   # or, d=shelve.open(filename,writeback=True) would let you just code
   # d['xx'].append(5) and have it work as expected, BUT it would also
   # consume more memory and make the d.close() operation slower.

   d.close()       # close it
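The summary above uses placeholder names.  As a self-contained sketch of the mutation pitfall it describes, the following can be run as-is; the temporary path is an assumption made so that the example does not touch existing files.

```python
import os
import shelve
import tempfile

# Hypothetical scratch location for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "pitfall")

shelf = shelve.open(path)            # writeback defaults to False
try:
    shelf["xx"] = [0, 1, 2]
    shelf["xx"].append(3)            # mutates a temporary copy, which is discarded
    after_in_place = shelf["xx"]

    temp = shelf["xx"]               # fetch a copy
    temp.append(3)                   # mutate the copy
    shelf["xx"] = temp               # assign it back so the change is stored
    after_reassign = shelf["xx"]
finally:
    shelf.close()
```

Here ``after_in_place`` is still ``[0, 1, 2]``, while ``after_reassign`` is ``[0, 1, 2, 3]``.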


.. seealso::

   Module :mod:`dbm`
      Generic interface to ``dbm``-style databases.

   Module :mod:`pickle`
      Object serialization used by :mod:`shelve`.

   Module :mod:`cPickle`
      High-performance version of :mod:`pickle`.