summaryrefslogtreecommitdiffstats
path: root/Objects/dictobject.c
Commit message (Collapse)AuthorAgeFilesLines
* SF patch #659536: Use PyArg_UnpackTuple where possible.Raymond Hettinger2002-12-291-4/+4
| | | | | | | Obtain cleaner coding and a system wide performance boost by using the fast, pre-parsed PyArg_Unpack function instead of PyArg_ParseTuple function which is driven by a format string.
* Constify char* API. Fixes #651363. 2.2 candidate.Martin v. Löwis2002-12-111-3/+3
|
* SF 548651: Fix the METH_CLASS implementation.Tim Peters2002-12-091-3/+2
| | | | | | | Most of these patches are from Thomas Heller, with long lines folded by Tim. The change to test_descr.py is from Guido. See the bug report. Not a bugfix candidate -- METH_CLASS is new in 2.3.
* Remove assumption that cls is a subclass of dict.Raymond Hettinger2002-12-071-7/+1
| | | | Simplifies the code and gets Just van Rossum's example to work.
* Replace BadInternalCall with TypeError. Add a test case. Fix whitespace.Raymond Hettinger2002-12-041-2/+3
| | | | | | | Just van Rossum showed a weird, but clever way for pure python code to trigger the BadInternalCall. The C code had assumed that calling a class constructor would return an instance of that class; however, classes that abuse __new__ can invalidate that assumption.
* Add missing decrefNeal Norwitz2002-11-271-0/+1
|
* SF Patch 643443. Added dict.fromkeys(iterable, value=None), a classRaymond Hettinger2002-11-271-0/+56
| | | | method for constructing new dictionaries from sequences of keys.
* Patch #642500 with slight modifications: allow keyword arguments inJust van Rossum2002-11-231-5/+7
| | | | | | | dict() constructor. Example: >>> dict(a=1, b=2) {'a': 1, 'b': 2} >>>
* In doc strings, use 'k in D' rather than D.has_key(k).Guido van Rossum2002-09-041-2/+2
|
* SF patch 576101, by Oren Tirosh: alternative implementation ofGuido van Rossum2002-08-191-9/+3
| | | | | | | | interning. I modified Oren's patch significantly, but the basic idea and most of the implementation is unchanged. Interned strings created with PyString_InternInPlace() are now mortal, and you must keep a reference to the resulting string around; use the new function PyString_InternImmortal() to create immortal interned strings.
* staticforward bites the dust.Jeremy Hylton2002-07-171-1/+1
| | | | | | | | | | | | | | | The staticforward define was needed to support certain broken C compilers (notably SCO ODT 3.0, perhaps early AIX as well) botched the static keyword when it was used with a forward declaration of a static initialized structure. Standard C allows the forward declaration with static, and we've decided to stop catering to broken C compilers. (In fact, we expect that the compilers are all fixed eight years later.) I'm leaving staticforward and statichere defined in object.h as static. This is only for backwards compatibility with C extensions that might still use it. XXX I haven't updated the documentation.
* Make StopIteration a sink state. This is done by clearing out theGuido van Rossum2002-07-161-28/+11
| | | | | | | | | | di_dict field when the end of the list is reached. Also make the error ("dictionary changed size during iteration") a sticky state. Also remove the next() method -- one is supplied automatically by PyType_Ready() because the tp_iternext slot is set. That's a good thing, because the implementation given here was buggy (it never raised StopIteration).
* Patch #568124: Add doc string macros.Martin v. Löwis2002-06-131-30/+30
|
* Add Raymond Hettinger's d.pop(). See SF patch 539949.Guido van Rossum2002-04-121-0/+38
|
* PyObject_GC_Del and PyObject_Del can now be used as a functionNeil Schemenauer2002-04-121-3/+3
| | | | | | designators. Remove PyMalloc_New.
* Add the 'bool' type and its values 'False' and 'True', as described inGuido van Rossum2002-04-031-1/+1
| | | | | | | | | | | | | PEP 285. Everything described in the PEP is here, and there is even some documentation. I had to fix 12 unit tests; all but one of these were printing Boolean outcomes that changed from 0/1 to False/True. (The exception is test_unicode.py, which did a type(x) == type(y) style comparison. I could've fixed that with a single line using issubtype(x, type(y)), but instead chose to be explicit about those places where a bool is expected. Still to do: perhaps more documentation; change standard library modules to return False/True from predicates.
* Remove the CACHE_HASH and INTERN_STRINGS preprocessor symbols.Tim Peters2002-03-291-34/+8
|
* This is Neil's fix for SF bug 535905 (Evil Trashcan and GC interaction).Guido van Rossum2002-03-281-1/+1
| | | | | | | | The fix makes it possible to call PyObject_GC_UnTrack() more than once on the same object, and then move the PyObject_GC_UnTrack() call to *before* the trashcan code is invoked. BUGFIX CANDIDATE!
* Use pymalloc if it's enabled.Neil Schemenauer2002-03-221-2/+2
|
* SF bug #491415 PyDict_UpdateFromSeq2() unusedTim Peters2001-12-111-8/+2
| | | | | | | PyDict_UpdateFromSeq2(): removed it. PyDict_MergeFromSeq2(): made it public and documented it. PyDict_Merge() docs: updated to reveal <wink> that the second argument can be any mapping object.
* Fix of SF bug #475877 (Mutable subtype instances are hashable).Guido van Rossum2001-12-031-1/+8
| | | | | | | | | | | | | | | | | Rather than tweaking the inheritance of type object slots (which turns out to be too messy to try), this fix adds a __hash__ to the list and dict types (the only mutable types I'm aware of) that explicitly raises an error. This has the advantage that list.__hash__([]) also raises an error (previously, this would invoke object.__hash__([]), returning the argument's address); ditto for dict.__hash__. The disadvantage for this fix is that 3rd party mutable types aren't automatically fixed. This should be added to the rules for creating subclassable extension types: if you don't want your object to be hashable, add a tp_hash function that raises an exception. Also, it's possible that I've forgotten about other mutable types for which this should be done.
* Rename "dictionary" (type and constructor) to "dict".Tim Peters2001-10-291-5/+5
|
* dictionary() constructor:Tim Peters2001-10-271-7/+5
| | | | | | + Change keyword arg name from "x" to "items". People passing a mapping object can stretch their imaginations <wink>. + Simplify the docstring text.
* Generalize dictionary() to accept a sequence of 2-sequences. At theTim Peters2001-10-261-16/+102
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | outer level, the iterator protocol is used for memory-efficiency (the outer sequence may be very large if fully materialized); at the inner level, PySequence_Fast() is used for time-efficiency (these should always be sequences of length 2). dictobject.c, new functions PyDict_{Merge,Update}FromSeq2. These are wholly analogous to PyDict_{Merge,Update}, but process a sequence-of-2- sequences argument instead of a mapping object. For now, I left these functions file static, so no corresponding doc changes. It's tempting to change dict.update() to allow a sequence-of-2-seqs argument too. Also changed the name of dictionary's keyword argument from "mapping" to "x". Got a better name? "mapping_or_sequence_of_pairs" isn't attractive, although more so than "mosop" <wink>. abstract.h, abstract.tex: Added new PySequence_Fast_GET_SIZE function, much faster than going thru the all-purpose PySequence_Size. libfuncs.tex: - Document dictionary(). - Fiddle tuple() and list() to admit that their argument is optional. - The long-winded repetitions of "a sequence, a container that supports iteration, or an iterator object" is getting to be a PITA. Many months ago I suggested factoring this out into "iterable object", where the definition of that could include being explicit about generators too (as is, I'm not sure a reader outside of PythonLabs could guess that "an iterator object" includes a generator call). - Please check my curly braces -- I'm going blind <0.9 wink>. abstract.c, PySequence_Tuple(): When PyObject_GetIter() fails, leave its error msg alone now (the msg it produces has improved since PySequence_Tuple was generalized to accept iterable objects, and PySequence_Tuple was also stomping on the msg in cases it shouldn't have even before PyObject_GetIter grew a better msg).
* Enable GC for new-style instances. This touches lots of files, sinceGuido van Rossum2001-10-051-1/+2
| | | | | | | | | | | | | | | | | | | | | | many types were subclassable but had a xxx_dealloc function that called PyObject_DEL(self) directly instead of deferring to self->ob_type->tp_free(self). It is permissible to set tp_free in the type object directly to _PyObject_Del, for non-GC types, or to _PyObject_GC_Del, for GC types. Still, PyObject_DEL was a tad faster, so I'm fearing that our pystone rating is going down again. I'm not sure if doing something like void xxx_dealloc(PyObject *self) { if (PyXxxCheckExact(self)) PyObject_DEL(self); else self->ob_type->tp_free(self); } is any faster than always calling the else branch, so I haven't attempted that -- however those types whose own dealloc is fancier (int, float, unicode) do use this pattern.
* Changed the dict implementation to take "string shortcuts" only whenTim Peters2001-09-141-13/+16
| | | | | | | | | | | | | | | | keys are true strings -- no subclasses need apply. This may be debatable. The problem is that a str subclass may very well want to override __eq__ and/or __hash__ (see the new example of case-insensitive strings in test_descr), but go-fast shortcuts for strings are ubiquitous in our dicts (and subclass overrides aren't even looked for then). Another go-fast reason for the change is that PyCheck_StringExact() is a quicker test than PyCheck_String(), and we make such a test on virtually every access to every dict. OTOH, a str subclass may also be perfectly happy using the base str eq and hash, and this change slows them a lot. But those cases are still hypothetical, while Python's own reliance on true-string dicts is not.
* Repair typo in comment.Tim Peters2001-09-021-1/+1
|
* Make dictionary() a real constructor. Accepts at most one argument, "aTim Peters2001-09-021-2/+29
| | | | | | | | | | | | mapping object", in the same sense dict.update(x) requires of x (that x has a keys() method and a getitem). Questionable: The other type constructors accept a keyword argument, so I did that here too (e.g., dictionary(mapping={1:2}) works). But type_call doesn't pass the keyword args to the tp_new slot (it passes NULL), it only passes them to the tp_init slot, so getting at them required adding a tp_init slot to dicts. Looks like that makes the normal case (i.e., no args at all) a little slower (the time it takes to call dict.tp_init and have it figure out there's nothing to do).
* Use new GC API.Neil Schemenauer2001-08-291-7/+6
|
* Patch #427190: Implement and use METH_NOARGS and METH_O.Martin v. Löwis2001-08-161-50/+25
|
* Add PyDict_Merge(a, b, override):Guido van Rossum2001-08-101-2/+18
| | | | | | PyDict_Merge(a, b, 1) is the same as PyDict_Update(a, b). PyDict_Merge(a, b, 0) does something similar but leaves existing items unchanged.
* Merge of descr-branch back into trunk.Tim Peters2001-08-021-123/+107
|
* dict_update(): Generalize this method so {}.update() accepts anyBarry Warsaw2001-06-261-17/+70
| | | | | | | | | | | | | | | "mapping" object, specifically one that supports PyMapping_Keys() and PyObject_GetItem(). This allows you to say e.g. {}.update(UserDict()) We keep the special case for concrete dict objects, although that seems moderately questionable. OTOH, the code exists and works, so why change that? .update()'s docstring already claims that D.update(E) implies calling E.keys() so it's appropriate not to transform AttributeErrors in PyMapping_Keys() to TypeErrors. Patch eyeballed by Tim.
* dict_repr: Reuse one of the int vars (minor code simplification).Tim Peters2001-06-161-3/+3
|
* SF bug 433228: repr(list) woes when len(list) big.Tim Peters2001-06-161-30/+68
| | | | | | | | | | | | Gave Python linear-time repr() implementations for dicts, lists, strings. This means, e.g., that repr(range(50000)) is no longer 50x slower than pprint.pprint() in 2.2 <wink>. I don't consider this a bugfix candidate, as it's a performance boost. Added _PyString_Join() to the internal string API. If we want that in the public API, fine, but then it requires runtime error checks instead of asserts.
* Store the mask instead of the size in dictobjects. The mask is moreTim Peters2001-06-041-23/+29
| | | | | | | frequently used, and in particular this allows to drop the last remaining obvious time-waster in the crucial lookdict() and lookdict_string() functions. Other changes consist mostly of changing "i < ma_size" to "i <= ma_mask" everywhere.
* lookdict: stop more insane core-dump mutating comparison cases. ShouldTim Peters2001-06-031-6/+31
| | | | | | | be possible to provoke unbounded recursion now, but leaving that to someone else to provoke and repair. Bugfix candidate -- although this is getting harder to backstitch, and the cases it's protecting against are mondo contrived.
* lookdict: Reduce obfuscating code duplication with a judicious goto.Tim Peters2001-06-031-25/+21
| | | | | This code is likely to get even hairier to squash core dumps due to mutating comparisons, and it's hard enough to follow without that.
* Finish the dict->string coredump fix. Need sleep.Tim Peters2001-06-021-1/+1
| | | | Bugfix candidate.
* Coredumpers from Michael Hudson, mutating dicts while printing orTim Peters2001-06-021-7/+19
| | | | | converting to string. Critical bugfix candidate -- if you take this seriously <wink>.
* dict_popitem(): Repaired last-second 2.1 comment, which misidentified theTim Peters2001-06-021-5/+8
| | | | true reason for allocating the tuple before checking the dict size.
* New collision resolution scheme: no polynomials, simpler, faster, lessTim Peters2001-06-021-163/+124
| | | | | | | code, less memory. Tests have uncovered no drawbacks. Christian and Vladimir are the other two people who have burned many brain cells on the dict code in recent years, and they like the approach too, so I'm checking it in without further ado.
* Implement an old idea of Christian Tismer's: use polynomial divisionTim Peters2001-05-271-18/+72
| | | | | | | | | | | | | | | instead of multiplication to generate the probe sequence. The idea is recorded in Python-Dev for Dec 2000, but that version is prone to rare infinite loops. The value is in getting *all* the bits of the hash code to participate; and, e.g., this speeds up querying every key in a dict with keys [i << 16 for i in range(20000)] by a factor of 500. Should be equally valuable in any bad case where the high-order hash bits were getting ignored. Also wrote up some of the motivations behind Python's ever-more-subtle hash table strategy.
* Patch #424335: Implement string_richcompare, remove string_compare.Martin v. Löwis2001-05-241-4/+3
| | | | Use new _PyString_Eq in lookdict_string.
* dictresize(): Rebuild small tables if there are any dummies, not just ifTim Peters2001-05-241-7/+11
| | | | | they're entirely full. Not a question of correctness, but of temporarily misplaced common sense.
* Jack Jansen hit a bug in the new dict code, reported on python-dev.Tim Peters2001-05-231-9/+28
| | | | | | | | | | dictresize() was too aggressive about never ever resizing small dicts. If a small dict is entirely full, it needs to rebuild it despite that it won't actually resize it, in order to purge old dummy entries thus creating at least one virgin slot (lookdict assumes at least one such exists). Also took the opportunity to add some high-level comments to dictresize.
* Remove unused variable.Fred Drake2001-05-221-1/+0
|
* SF patch #425242: Patch which "inlines" small dictionaries.Tim Peters2001-05-221-81/+145
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea is Marc-Andre Lemburg's, the implementation is Tim's. Add a new ma_smalltable member to dictobjects, an embedded vector of MINSIZE (8) dictentry structs. Short course is that this lets us avoid additional malloc(s) for dicts with no more than 5 entries. The changes are widespread but mostly small. Long course: WRT speed, all scalar operations (getitem, setitem, delitem) on non-empty dicts benefit from no longer needing NULL-pointer checks (ma_table is never NULL anymore). Bulk operations (copy, update, resize, clearing slots during dealloc) benefit in some cases from now looping on the ma_fill count rather than on ma_size, but that was an unexpected benefit: the original reason to loop on ma_fill was to let bulk operations on empty dicts end quickly (since the NULL-pointer checks went away, empty dicts aren't special-cased any more). Special considerations: For dicts that remain empty, this change is a lose on two counts: the dict object contains 8 new dictentry slots now that weren't needed before, and dict object creation also spends time memset'ing these doomed-to-be-unsused slots to NULLs. For dicts with one or two entries that never get larger than 2, it's a mix: a malloc()/free() pair is no longer needed, and the 2-entry case gets to use 8 slots (instead of 4) thus decreasing the chance of collision. Against that, dict object creation spends time memset'ing 4 slots that aren't strictly needed in this case. For dicts with 3 through 5 entries that never get larger than 5, it's a pure win: the dict is created with all the space they need, and they never need to resize. Before they suffered two malloc()/free() calls, plus 1 dict resize, to get enough space. In addition, the 8-slot table they ended with consumed more memory overall, because of the hidden overhead due to the additional malloc. For dicts with 6 or more entries, the ma_smalltable member is wasted space, but then these are large(r) dicts so 8 slots more or less doesn't make much difference. They still benefit all the time from removing ubiquitous dynamic null-pointer checks, and get a small benefit (but relatively smaller the larger the dict) from not having to do two mallocs, two frees, and a resize on the way *to* getting their sixth entry. All in all it appears a small but definite general win, with larger benefits in specific cases. It's especially nice that it allowed to get rid of several branches, gotos and labels, and overall made the code smaller.
* Bugfix candidate.Tim Peters2001-05-191-2/+3
| | | | | | | | | | | | Two exceedingly unlikely errors in dictresize(): 1. The loop for finding the new size had an off-by-one error at the end (could over-index the polys[] vector). 2. The polys[] vector ended with a 0, apparently intended as a sentinel value but never used as such; i.e., it was never checked, so 0 could have been used *as* a polynomial. Neither bug could trigger unless a dict grew to 2**30 slots; since that would consume at least 12GB of memory just to hold the dict pointers, I'm betting it's not the cause of the bug Fred's tracking down <wink>.
* Speed dictresize by collapsing its two passes into one; the reason givenTim Peters2001-05-171-8/+9
| | | | | | | in the comments for using two passes was bogus, as the only object that can get decref'ed due to the copy is the dummy key, and decref'ing dummy can't have side effects (for one thing, dummy is immortal! for another, it's a string object, not a potentially dangerous user-defined object).