diff options
author | Raymond Hettinger <rhettinger@users.noreply.github.com> | 2020-10-23 19:55:39 (GMT) |
---|---|---|
committer | GitHub <noreply@github.com> | 2020-10-23 19:55:39 (GMT) |
commit | 8d3d7314d44d762a6fb42d079f57b6b5273473d6 (patch) | |
tree | cf01bd788cfa99aad36ed95f5b960d335528405a /Doc/howto | |
parent | 7c4065d305228aa406675224d631f81964d12855 (diff) | |
download | cpython-8d3d7314d44d762a6fb42d079f57b6b5273473d6.zip cpython-8d3d7314d44d762a6fb42d079f57b6b5273473d6.tar.gz cpython-8d3d7314d44d762a6fb42d079f57b6b5273473d6.tar.bz2 |
Create a primer section for the descriptor howto guide (GH-22906)
Diffstat (limited to 'Doc/howto')
-rw-r--r-- | Doc/howto/descriptor.rst | 552 |
1 files changed, 494 insertions, 58 deletions
diff --git a/Doc/howto/descriptor.rst b/Doc/howto/descriptor.rst index b792b6c..4a53b9e 100644 --- a/Doc/howto/descriptor.rst +++ b/Doc/howto/descriptor.rst @@ -1,3 +1,5 @@ +.. _descriptorhowto: + ====================== Descriptor HowTo Guide ====================== @@ -7,6 +9,415 @@ Descriptor HowTo Guide .. Contents:: + +:term:`Descriptors <descriptor>` let objects customize attribute lookup, +storage, and deletion. + +This HowTo guide has three major sections: + +1) The "primer" gives a basic overview, moving gently from simple examples, + adding one feature at a time. It is a great place to start. + +2) The second section shows a complete, practical descriptor example. If you + already know the basics, start there. + +3) The third section provides a more technical tutorial that goes into the + detailed mechanics of how descriptors work. Most people don't need this + level of detail. + + +Primer +^^^^^^ + +In this primer, we start with most basic possible example and then we'll add +new capabilities one by one. + + +Simple example: A descriptor that returns a constant +---------------------------------------------------- + +The :class:`Ten` class is a descriptor that always returns the constant ``10``:: + + + class Ten: + def __get__(self, obj, objtype=None): + return 10 + +To use the descriptor, it must be stored as a class variable in another class:: + + class A: + x = 5 # Regular class attribute + y = Ten() # Descriptor + +An interactive session shows the difference between normal attribute lookup +and descriptor lookup:: + + >>> a = A() # Make an instance of class A + >>> a.x # Normal attribute lookup + 5 + >>> a.y # Descriptor lookup + 10 + +In the ``a.x`` attribute lookup, the dot operator finds the value ``5`` stored +in the class dictionary. In the ``a.y`` descriptor lookup, the dot operator +calls the descriptor's :meth:`__get__()` method. That method returns ``10``. +Note that the value ``10`` is not stored in either the class dictionary or the +instance dictionary. Instead, the value ``10`` is computed on demand. + +This example shows how a simple descriptor works, but it isn't very useful. +For retrieving constants, normal attribute lookup would be better. + +In the next section, we'll create something more useful, a dynamic lookup. + + +Dynamic lookups +--------------- + +Interesting descriptors typically run computations instead of doing lookups:: + + + import os + + class DirectorySize: + + def __get__(self, obj, objtype=None): + return len(os.listdir(obj.dirname)) + + class Directory: + + size = DirectorySize() # Descriptor + + def __init__(self, dirname): + self.dirname = dirname # Regular instance attribute + +An interactive session shows that the lookup is dynamic — it computes +different, updated answers each time:: + + >>> g = Directory('games') + >>> s = Directory('songs') + >>> g.size # The games directory has three files + 3 + >>> os.system('touch games/newfile') # Add a fourth file to the directory + 0 + >>> g.size + 4 + >>> s.size # The songs directory has twenty files + 20 + +Besides showing how descriptors can run computations, this example also +reveals the purpose of the parameters to :meth:`__get__`. The *self* +parameter is *size*, an instance of *DirectorySize*. The *obj* parameter is +either *g* or *s*, an instance of *Directory*. It is *obj* parameter that +lets the :meth:`__get__` method learn the target directory. The *objtype* +parameter is the class *Directory*. + + +Managed attributes +------------------ + +A popular use for descriptors is managing access to instance data. The +descriptor is assigned to a public attribute in the class dictionary while the +actual data is stored as a private attribute in the instance dictionary. The +descriptor's :meth:`__get__` and :meth:`__set__` methods are triggered when +the public attribute is accessed. + +In the following example, *age* is the public attribute and *_age* is the +private attribute. When the public attribute is accessed, the descriptor logs +the lookup or update:: + + import logging + + logging.basicConfig(level=logging.INFO) + + class LoggedAgeAccess: + + def __get__(self, obj, objtype=None): + value = obj._age + logging.info('Accessing %r giving %r', 'age', value) + return value + + def __set__(self, obj, value): + logging.info('Updating %r to %r', 'age', value) + obj._age = value + + class Person: + + age = LoggedAgeAccess() # Descriptor + + def __init__(self, name, age): + self.name = name # Regular instance attribute + self.age = age # Calls the descriptor + + def birthday(self): + self.age += 1 # Calls both __get__() and __set__() + + +An interactive session shows that all access to the managed attribute *age* is +logged, but that the regular attribute *name* is not logged:: + + >>> mary = Person('Mary M', 30) # The initial age update is logged + INFO:root:Updating 'age' to 30 + >>> dave = Person('David D', 40) + INFO:root:Updating 'age' to 40 + + >>> vars(mary) # The actual data is in a private attribute + {'name': 'Mary M', '_age': 30} + >>> vars(dave) + {'name': 'David D', '_age': 40} + + >>> mary.age # Access the data and log the lookup + INFO:root:Accessing 'age' giving 30 + 30 + >>> mary.birthday() # Updates are logged as well + INFO:root:Accessing 'age' giving 30 + INFO:root:Updating 'age' to 31 + + >>> dave.name # Regular attribute lookup isn't logged + 'David D' + >>> dave.age # Only the managed attribute is logged + INFO:root:Accessing 'age' giving 40 + 40 + +One major issue with this example is the private name *_age* is hardwired in +the *LoggedAgeAccess* class. That means that each instance can only have one +logged attribute and that its name is unchangeable. In the next example, +we'll fix that problem. + + +Customized Names +---------------- + +When a class uses descriptors, it can inform each descriptor about what +variable name was used. + +In this example, the :class:`Person` class has two descriptor instances, +*name* and *age*. When the :class:`Person` class is defined, it makes a +callback to :meth:`__set_name__` in *LoggedAccess* so that the field names can +be recorded, giving each descriptor its own *public_name* and *private_name*:: + + import logging + + logging.basicConfig(level=logging.INFO) + + class LoggedAccess: + + def __set_name__(self, owner, name): + self.public_name = name + self.private_name = f'_{name}' + + def __get__(self, obj, objtype=None): + value = getattr(obj, self.private_name) + logging.info('Accessing %r giving %r', self.public_name, value) + return value + + def __set__(self, obj, value): + logging.info('Updating %r to %r', self.public_name, value) + setattr(obj, self.private_name, value) + + class Person: + + name = LoggedAccess() # First descriptor + age = LoggedAccess() # Second descriptor + + def __init__(self, name, age): + self.name = name # Calls the first descriptor + self.age = age # Calls the second descriptor + + def birthday(self): + self.age += 1 + +An interactive session shows that the :class:`Person` class has called +:meth:`__set_name__` so that the field names would be recorded. Here +we call :func:`vars` to lookup the descriptor without triggering it:: + + >>> vars(vars(Person)['name']) + {'public_name': 'name', 'private_name': '_name'} + >>> vars(vars(Person)['age']) + {'public_name': 'age', 'private_name': '_age'} + +The new class now logs access to both *name* and *age*:: + + >>> pete = Person('Peter P', 10) + INFO:root:Updating 'name' to 'Peter P' + INFO:root:Updating 'age' to 10 + >>> kate = Person('Catherine C', 20) + INFO:root:Updating 'name' to 'Catherine C' + INFO:root:Updating 'age' to 20 + +The two *Person* instances contain only the private names:: + + >>> vars(pete) + {'_name': 'Peter P', '_age': 10} + >>> vars(kate) + {'_name': 'Catherine C', '_age': 20} + + +Closing thoughts +---------------- + +A :term:`descriptor` is what we call any object that defines :meth:`__get__`, +:meth:`__set__`, or :meth:`__delete__`. + +Descriptors get invoked by the dot operator during attribute lookup. If a +descriptor is accessed indirectly with ``vars(some_class)[descriptor_name]``, +the descriptor instance is returned without invoking it. + +Descriptors only work when used as class variables. When put in instances, +they have no effect. + +The main motivation for descriptors is to provide a hook allowing objects +stored in class variables to control what happens during dotted lookup. + +Traditionally, the calling class controls what happens during lookup. +Descriptors invert that relationship and allow the data being looked-up to +have a say in the matter. + +Descriptors are used throughout the language. It is how functions turn into +bound methods. Common tools like :func:`classmethod`, :func:`staticmethod`, +:func:`property`, and :func:`functools.cached_property` are all implemented as +descriptors. + + +Complete Practical Example +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In this example, we create a practical and powerful tool for locating +notoriously hard to find data corruption bugs. + + +Validator class +--------------- + +A validator is a descriptor for managed attribute access. Prior to storing +any data, it verifies that the new value meets various type and range +restrictions. If those restrictions aren't met, it raises an exception to +prevents data corruption at its source. + +This :class:`Validator` class is both an :term:`abstract base class` and a +managed attribute descriptor:: + + from abc import ABC, abstractmethod + + class Validator(ABC): + + def __set_name__(self, owner, name): + self.private_name = f'_{name}' + + def __get__(self, obj, objtype=None): + return getattr(obj, self.private_name) + + def __set__(self, obj, value): + self.validate(value) + setattr(obj, self.private_name, value) + + @abstractmethod + def validate(self, value): + pass + +Custom validators need to subclass from :class:`Validator` and supply a +:meth:`validate` method to test various restrictions as needed. + + +Custom validators +----------------- + +Here are three practical data validation utilities: + +1) :class:`OneOf` verifies that a value is one of a restricted set of options. + +2) :class:`Number` verifies that a value is either an :class:`int` or + :class:`float`. Optionally, it verifies that a value is between a given + minimum or maximum. + +3) :class:`String` verifies that a value is a :class:`str`. Optionally, it + validates a given minimum or maximum length. Optionally, it can test for + another predicate as well. + +:: + + class OneOf(Validator): + + def __init__(self, *options): + self.options = set(options) + + def validate(self, value): + if value not in self.options: + raise ValueError(f'Expected {value!r} to be one of {self.options!r}') + + class Number(Validator): + + def __init__(self, minvalue=None, maxvalue=None): + self.minvalue = minvalue + self.maxvalue = maxvalue + + def validate(self, value): + if not isinstance(value, (int, float)): + raise TypeError(f'Expected {value!r} to be an int or float') + if self.minvalue is not None and value < self.minvalue: + raise ValueError( + f'Expected {value!r} to be at least {self.minvalue!r}' + ) + if self.maxvalue is not None and value > self.maxvalue: + raise ValueError( + f'Expected {value!r} to be no more than {self.maxvalue!r}' + ) + + class String(Validator): + + def __init__(self, minsize=None, maxsize=None, predicate=None): + self.minsize = minsize + self.maxsize = maxsize + self.predicate = predicate + + def validate(self, value): + if not isinstance(value, str): + raise TypeError(f'Expected {value!r} to be an str') + if self.minsize is not None and len(value) < self.minsize: + raise ValueError( + f'Expected {value!r} to be no smaller than {self.minsize!r}' + ) + if self.maxsize is not None and len(value) > self.maxsize: + raise ValueError( + f'Expected {value!r} to be no bigger than {self.maxsize!r}' + ) + if self.predicate is not None and not self.predicate(value): + raise ValueError( + f'Expected {self.predicate} to be true for {value!r}' + ) + + +Practical use +------------- + +Here's how the data validators can be used in a real class:: + + class Component: + + name = String(minsize=3, maxsize=10, predicate=str.isupper) + kind = OneOf('plastic', 'metal') + quantity = Number(minvalue=0) + + def __init__(self, name, kind, quantity): + self.name = name + self.kind = kind + self.quantity = quantity + +The descriptors prevent invalid instances from being created:: + + Component('WIDGET', 'metal', 5) # Allowed. + Component('Widget', 'metal', 5) # Blocked: 'Widget' is not all uppercase + Component('WIDGET', 'metle', 5) # Blocked: 'metle' is misspelled + Component('WIDGET', 'metal', -5) # Blocked: -5 is negative + Component('WIDGET', 'metal', 'V') # Blocked: 'V' isn't a number + + +Technical Tutorial +^^^^^^^^^^^^^^^^^^ + +What follows is a more technical tutorial for the mechanics and details of how +descriptors work. + + Abstract -------- @@ -39,10 +450,10 @@ Where this occurs in the precedence chain depends on which descriptor methods were defined. Descriptors are a powerful, general purpose protocol. They are the mechanism -behind properties, methods, static methods, class methods, and :func:`super()`. -They are used throughout Python itself to implement the new style classes -introduced in version 2.2. Descriptors simplify the underlying C-code and offer -a flexible set of new tools for everyday Python programs. +behind properties, methods, static methods, class methods, and +:func:`super()`. They are used throughout Python itself. Descriptors +simplify the underlying C code and offer a flexible set of new tools for +everyday Python programs. Descriptor Protocol @@ -132,11 +543,29 @@ The implementation details are in :c:func:`super_getattro()` in The details above show that the mechanism for descriptors is embedded in the :meth:`__getattribute__()` methods for :class:`object`, :class:`type`, and :func:`super`. Classes inherit this machinery when they derive from -:class:`object` or if they have a meta-class providing similar functionality. +:class:`object` or if they have a metaclass providing similar functionality. Likewise, classes can turn-off descriptor invocation by overriding :meth:`__getattribute__()`. +Automatic Name Notification +--------------------------- + +Sometimes it is desirable for a descriptor to know what class variable name it +was assigned to. When a new class is created, the :class:`type` metaclass +scans the dictionary of the new class. If any of the entries are descriptors +and if they define :meth:`__set_name__`, that method is called with two +arguments. The *owner* is the class where the descriptor is used, the *name* +is class variable the descriptor was assigned to. + +The implementation details are in :c:func:`type_new()` and +:c:func:`set_names()` in :source:`Objects/typeobject.c`. + +Since the update logic is in :meth:`type.__new__`, notifications only take +place at the time of class creation. If descriptors are added to the class +afterwards, :meth:`__set_name__` will need to be called manually. + + Descriptor Example ------------------ @@ -154,7 +583,7 @@ descriptor is useful for monitoring just a few chosen attributes:: self.val = initval self.name = name - def __get__(self, obj, objtype): + def __get__(self, obj, objtype=None): print('Retrieving', self.name) return self.val @@ -162,11 +591,11 @@ descriptor is useful for monitoring just a few chosen attributes:: print('Updating', self.name) self.val = val - >>> class MyClass: - ... x = RevealAccess(10, 'var "x"') - ... y = 5 - ... - >>> m = MyClass() + class B: + x = RevealAccess(10, 'var "x"') + y = 5 + + >>> m = B() >>> m.x Retrieving var "x" 10 @@ -251,12 +680,13 @@ affect existing client code accessing the attribute directly. The solution is to wrap access to the value attribute in a property data descriptor:: class Cell: - . . . - def getvalue(self): + ... + + @property + def value(self): "Recalculate the cell before returning value" self.recalc() return self._value - value = property(getvalue) Functions and Methods @@ -278,42 +708,48 @@ non-data descriptors which return bound methods when they are invoked from an object. In pure Python, it works like this:: class Function: - . . . + ... + def __get__(self, obj, objtype=None): "Simulate func_descr_get() in Objects/funcobject.c" if obj is None: return self return types.MethodType(self, obj) -Running the interpreter shows how the function descriptor works in practice:: +Running the following in class in the interpreter shows how the function +descriptor works in practice:: - >>> class D: - ... def f(self, x): - ... return x - ... - >>> d = D() + class D: + def f(self, x): + return x + +Access through the class dictionary does not invoke :meth:`__get__`. Instead, +it just returns the underlying function object:: - # Access through the class dictionary does not invoke __get__. - # It just returns the underlying function object. >>> D.__dict__['f'] <function D.f at 0x00C45070> - # Dotted access from a class calls __get__() which just returns - # the underlying function unchanged. +Dotted access from a class calls :meth:`__get__` which just returns the +underlying function unchanged:: + >>> D.f <function D.f at 0x00C45070> - # The function has a __qualname__ attribute to support introspection +The function has a :term:`qualified name` attribute to support introspection:: + >>> D.f.__qualname__ 'D.f' - # Dotted access from an instance calls __get__() which returns the - # function wrapped in a bound method object +Dotted access from an instance calls :meth:`__get__` which returns a bound +method object:: + + >>> d = D() >>> d.f <bound method D.f of <__main__.D object at 0x00B18C90>> - # Internally, the bound method stores the underlying function and - # the bound instance. +Internally, the bound method stores the underlying function and the bound +instance:: + >>> d.f.__func__ <function D.f at 0x1012e5ae8> >>> d.f.__self__ @@ -328,20 +764,20 @@ patterns of binding functions into methods. To recap, functions have a :meth:`__get__` method so that they can be converted to a method when accessed as attributes. The non-data descriptor transforms an -``obj.f(*args)`` call into ``f(obj, *args)``. Calling ``klass.f(*args)`` +``obj.f(*args)`` call into ``f(obj, *args)``. Calling ``cls.f(*args)`` becomes ``f(*args)``. This chart summarizes the binding and its two most useful variants: +-----------------+----------------------+------------------+ | Transformation | Called from an | Called from a | - | | Object | Class | + | | object | class | +=================+======================+==================+ | function | f(obj, \*args) | f(\*args) | +-----------------+----------------------+------------------+ | staticmethod | f(\*args) | f(\*args) | +-----------------+----------------------+------------------+ - | classmethod | f(type(obj), \*args) | f(klass, \*args) | + | classmethod | f(type(obj), \*args) | f(cls, \*args) | +-----------------+----------------------+------------------+ Static methods return the underlying function without changes. Calling either @@ -365,11 +801,11 @@ It can be called either from an object or the class: ``s.erf(1.5) --> .9332`` o Since staticmethods return the underlying function with no changes, the example calls are unexciting:: - >>> class E: - ... def f(x): - ... print(x) - ... f = staticmethod(f) - ... + class E: + @staticmethod + def f(x): + print(x) + >>> E.f(3) 3 >>> E().f(3) @@ -391,32 +827,33 @@ Unlike static methods, class methods prepend the class reference to the argument list before calling the function. This format is the same for whether the caller is an object or a class:: - >>> class E: - ... def f(klass, x): - ... return klass.__name__, x - ... f = classmethod(f) - ... - >>> print(E.f(3)) - ('E', 3) - >>> print(E().f(3)) - ('E', 3) + class F: + @classmethod + def f(cls, x): + return cls.__name__, x + + >>> print(F.f(3)) + ('F', 3) + >>> print(F().f(3)) + ('F', 3) This behavior is useful whenever the function only needs to have a class -reference and does not care about any underlying data. One use for classmethods -is to create alternate class constructors. In Python 2.3, the classmethod +reference and does not care about any underlying data. One use for +classmethods is to create alternate class constructors. The classmethod :func:`dict.fromkeys` creates a new dictionary from a list of keys. The pure Python equivalent is:: class Dict: - . . . - def fromkeys(klass, iterable, value=None): + ... + + @classmethod + def fromkeys(cls, iterable, value=None): "Emulate dict_fromkeys() in Objects/dictobject.c" - d = klass() + d = cls() for key in iterable: d[key] = value return d - fromkeys = classmethod(fromkeys) Now a new dictionary of unique keys can be constructed like this:: @@ -432,10 +869,9 @@ Using the non-data descriptor protocol, a pure Python version of def __init__(self, f): self.f = f - def __get__(self, obj, klass=None): - if klass is None: - klass = type(obj) + def __get__(self, obj, cls=None): + if cls is None: + cls = type(obj) def newfunc(*args): - return self.f(klass, *args) + return self.f(cls, *args) return newfunc - |