author: Guido van Rossum <guido@python.org>, 1992-08-07 16:06:24 (GMT)
committer: Guido van Rossum <guido@python.org>, 1992-08-07 16:06:24 (GMT)
commit: 5e0759d351ed6ceb3d0c28475df214d5e5ffa626
tree: 7457d36c8e2b8b8f13149457472a00427b3e05d4 /Doc/tut.tex
parent: 2d4aa4f5d43842bc049752ab8d5ffa3d2880cbe6
Add chapter on classes (mostly from ../misc/CLASSES).
Diffstat (limited to 'Doc/tut.tex')
-rw-r--r-- Doc/tut.tex 598
1 files changed, 598 insertions, 0 deletions
diff --git a/Doc/tut.tex b/Doc/tut.tex
index 83a0d8b..ac6c5f5 100644
--- a/Doc/tut.tex
+++ b/Doc/tut.tex
@@ -57,6 +57,7 @@ a more formal definition of the language.
\pagenumbering{arabic}
+
\chapter{Whetting Your Appetite}
If you ever wrote a large shell script, you probably know this
@@ -141,6 +142,7 @@ should read the Library Reference, which gives complete (though terse)
reference material about built-in and standard types, functions and
modules that can save you a lot of time when writing Python programs.
+
\chapter{Using the Python Interpreter}
\section{Invoking the Interpreter}
@@ -380,6 +382,7 @@ completion mechanism might use the interpreter's symbol table. A
command to check (or even suggest) matching parentheses, quotes etc.
would also be useful.
+
\chapter{An Informal Introduction to Python}
In the following examples, input and output are distinguished by the
@@ -786,6 +789,7 @@ prompt if the last line was not completed.
\end{itemize}
+
\chapter{More Control Flow Tools}
Besides the {\tt while} statement just introduced, Python knows the
@@ -1065,6 +1069,7 @@ it is equivalent to {\tt result = result + [b]}, but more efficient.
\end{itemize}
+
\chapter{Odds and Ends}
This chapter describes some things you've learned about already in
@@ -1359,6 +1364,7 @@ to their numeric value, so 0 equals 0.0, etc.%
the language.
}
+
\chapter{Modules}
If you quit from the Python interpreter and enter it again, the
@@ -1581,6 +1587,7 @@ meError', 'SystemError', 'TypeError', 'abs', 'chr', 'dir', 'divmod', 'eval',
>>>
\end{verbatim}\ecode
+
\chapter{Output Formatting}
So far we've encountered two ways of writing values: {\em expression
@@ -1675,6 +1682,7 @@ signs:%
>>>
\end{verbatim}\ecode
+
\chapter{Errors and Exceptions}
Until now error messages haven't been more than mentioned, but if you
@@ -1963,4 +1971,594 @@ handler (and even if another exception occurred in the handler).
It is also executed when the {\tt try} statement is left via a
{\tt break} or {\tt return} statement.
+
+\chapter{Classes}
+
+Python's class mechanism adds classes to the language with a minimum
+of new syntax and semantics. It is a mixture of the class mechanisms
+found in C++ and Modula-3. As is true for modules, classes in Python
+do not put an absolute barrier between definition and user, but rather
+rely on the politeness of the user not to ``break into the
+definition.'' The most important features of classes are retained
+with full power, however: the class inheritance mechanism allows
+multiple base classes, a derived class can override any methods of its
+base class(es), and a method can call the method of a base class with
+the same name. Objects can contain an arbitrary amount of private data.
+
+In C++ terminology, all class members (including the data members) are
+{\em public}, and all member functions are {\em virtual}. There are
+no special constructors or destructors. As in Modula-3, there are no
+shorthands for referencing the object's members from its methods: the
+method function is declared with an explicit first argument
+representing the object, which is provided implicitly by the call. As
+in Smalltalk, classes themselves are objects, albeit in the wider
+sense of the word: in Python, all data types are objects. This
+provides semantics for importing and renaming. But, just like in C++
+or Modula-3, built-in types cannot be used as base classes for
+extension by the user. Also, like in Modula-3 but unlike in C++, the
+built-in operators with special syntax (arithmetic operators,
+subscripting etc.) cannot be redefined for class members.
+
+
+\section{A word about terminology}
+
+Lacking universally accepted terminology to talk about classes, I'll
+make occasional use of Smalltalk and C++ terms. (I'd use Modula-3
+terms, since its object-oriented semantics are closer to those of
+Python than C++, but I expect that few readers have heard of it...)
+
+I also have to warn you that there's a terminological pitfall for
+object-oriented readers: the word ``object'' in Python does not
+necessarily mean a class instance. Like C++ and Modula-3, and unlike
+Smalltalk, not all types in Python are classes: the basic built-in
+types like integers and lists aren't, and even somewhat more exotic
+types like files aren't. However, {\em all} Python types share a little
+bit of common semantics that is best described by using the word
+object.
+
+Objects have individuality, and multiple names (in multiple scopes)
+can be bound to the same object. This is known as aliasing in other
+languages. This is usually not appreciated on a first glance at
+Python, and can be safely ignored when dealing with immutable basic
+types (numbers, strings, tuples). However, aliasing has an
+(intended!) effect on the semantics of Python code involving mutable
+objects such as lists, dictionaries, and most types representing
+entities outside the program (files, windows, etc.). This is usually
+used to the benefit of the program, since aliases behave like pointers
+in some respects. For example, passing an object is cheap since only
+a pointer is passed by the implementation; and if a function modifies
+an object passed as an argument, the caller will see the change --- this
+obviates the need for two different argument passing mechanisms as in
+Pascal.
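+
The effect of aliasing on mutable objects can be sketched as follows. This is only an illustration, written so that it also runs under present-day Python 3; the names `append_marker`, `a` and `b` are made up for the example.

```python
def append_marker(items):
    # Mutates the list object the caller passed in; no copy is made.
    items.append('marker')

a = [1, 2, 3]
b = a                   # b is a second name (an alias) for the same list
append_marker(b)        # the change is visible through both names

same_object = a is b    # both names are bound to one object
```

After the call, `a` also ends in `'marker'`, because `a` and `b` never referred to different lists.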
+
+
+\section{Python scopes and name spaces}
+
+Before introducing classes, I first have to tell you something about
+Python's scope rules. Class definitions play some neat tricks with
+name spaces, and you need to know how scopes and name spaces work to
+fully understand what's going on. Incidentally, knowledge about this
+subject is useful for any advanced Python programmer.
+
+Let's begin with some definitions.
+
+A {\em name space} is a mapping from names to objects. Most name
+spaces are currently implemented as Python dictionaries, but that's
+normally not noticeable in any way (except for performance), and it
+may change in the future. Examples of name spaces are: the set of
+built-in names (functions such as \verb\abs()\, and built-in exception
+names); the global names in a module; and the local names in a
+function invocation. In a sense the set of attributes of an object
+also forms a name space. The important thing to know about name
+spaces is that there is absolutely no relation between names in
+different name spaces; for instance, two different modules may both
+define a function ``maximize'' without confusion --- users of the
+modules must prefix it with the module name.
+
+By the way, I use the word {\em attribute} for any name following a
+dot --- for example, in the expression \verb\z.real\, \verb\real\ is
+an attribute of the object \verb\z\. Strictly speaking, references to
+names in modules are attribute references: in the expression
+\verb\modname.funcname\, \verb\modname\ is a module object and
+\verb\funcname\ is an attribute of it. In this case there happens to
+be a straightforward mapping between the module's attributes and the
+global names defined in the module: they share the same name space!%
+\footnote{
+ Except for one thing. Module objects have a secret read-only
+ attribute called {\tt __dict__} which returns the dictionary
+ used to implement the module's name space; the name
+ {\tt __dict__} is an attribute but not a global name.
+ Obviously, using this violates the abstraction of name space
+ implementation, and should be restricted to things like
+ post-mortem debuggers...
+}
+
+Attributes may be read-only or writable. In the latter case,
+assignment to attributes is possible. Module attributes are writable:
+you can write \verb\modname.the_answer = 42\. Writable attributes may
+also be deleted with the \verb\del\ statement, e.g.
+\verb\del modname.the_answer\.
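+
A short sketch of writable module attributes, in present-day Python 3. To keep the example self-contained, the module object is created with `types.ModuleType` instead of being imported; `modname` and `the_answer` are the placeholder names used in the text.

```python
import types

# Hypothetical stand-in for an imported module.
modname = types.ModuleType('modname')

modname.the_answer = 42                        # module attributes are writable
answer = modname.the_answer
in_dict = 'the_answer' in modname.__dict__     # the dictionary behind the name space

del modname.the_answer                         # writable attributes can be deleted
still_there = hasattr(modname, 'the_answer')
```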
+
+Name spaces are created at different moments and have different
+lifetimes. The name space containing the built-in names is created
+when the Python interpreter starts up, and is never deleted. The
+global name space for a module is created when the module definition
+is read in; normally, module name spaces also last until the
+interpreter quits. The statements executed by the top-level
+invocation of the interpreter, either read from a script file or
+interactively, are considered part of a module called \verb\__main__\,
+so they have their own global name space. (The built-in names
+actually also live in a module; this is called \verb\builtin\,
+although it should really have been called \verb\__builtin__\.)
+
+The local name space for a function is created when the function is
+called, and deleted when the function returns or raises an exception
+that is not handled within the function. (Actually, forgetting would
+be a better way to describe what actually happens.) Of course,
+recursive invocations each have their own local name space.
+
+A {\em scope} is a textual region of a Python program where a name space
+is directly accessible. ``Directly accessible'' here means that an
+unqualified reference to a name attempts to find the name in the name
+space.
+
+Although scopes are determined statically, they are used dynamically.
+At any time during execution, exactly three nested scopes are in use
+(i.e., exactly three name spaces are directly accessible): the
+innermost scope, which is searched first, contains the local names,
+the middle scope, searched next, contains the current module's global
+names, and the outermost scope (searched last) is the name space
+containing built-in names.
+
+Usually, the local scope references the local names of the (textually)
+current function. Outside functions, the local scope references
+the same name space as the global scope: the module's name space.
+Class definitions place yet another name space in the local scope.
+
+It is important to realize that scopes are determined textually: the
+global scope of a function defined in a module is that module's name
+space, no matter from where or by what alias the function is called.
+On the other hand, the actual search for names is done dynamically, at
+run time --- however, the language definition is evolving towards
+static name resolution, at ``compile'' time, so don't rely on dynamic
+name resolution! (In fact, local variables are already determined
+statically.)
+
+A special quirk of Python is that assignments always go into the
+innermost scope. Assignments do not copy data --- they just
+bind names to objects. The same is true for deletions: the statement
+\verb\del x\ removes the binding of x from the name space referenced by the
+local scope. In fact, all operations that introduce new names use the
+local scope: in particular, import statements and function definitions
+bind the module or function name in the local scope. (The
+\verb\global\ statement can be used to indicate that particular
+variables live in the global scope.)
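+
The rule that assignments go into the innermost scope, and the way the `global` statement changes this, might be illustrated like this. It is a sketch meant to be run as a module's top-level code; the names `counter`, `bump_local` and `bump_global` are invented for the example.

```python
counter = 0              # a global name in this module

def bump_local():
    counter = 100        # binds a new local name; the global is untouched
    return counter

def bump_global():
    global counter       # declare that 'counter' lives in the global scope
    counter = counter + 1
    return counter

local_result = bump_local()
after_local = counter    # the global still has its original value here
global_result = bump_global()
after_global = counter   # now the global binding itself was changed
```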
+
+
+\section{A first look at classes}
+
+Classes introduce a little bit of new syntax, three new object types,
+and some new semantics.
+
+
+\subsection{Class definition syntax}
+
+The simplest form of class definition looks like this:
+
+\begin{verbatim}
+    class ClassName:
+        <statement-1>
+        .
+        .
+        .
+        <statement-N>
+\end{verbatim}
+
+Class definitions, like function definitions (\verb\def\ statements),
+must be executed before they have any effect. (You could conceivably
+place a class definition in a branch of an \verb\if\ statement, or
+inside a function.)
+
+In practice, the statements inside a class definition will usually be
+function definitions, but other statements are allowed, and sometimes
+useful --- we'll come back to this later. The function definitions
+inside a class normally have a peculiar form of argument list,
+dictated by the calling conventions for methods --- again, this is
+explained later.
+
+When a class definition is entered, a new name space is created, and
+used as the local scope --- thus, all assignments to local variables
+go into this new name space. In particular, function definitions bind
+the name of the new function here.
+
+When a class definition is left normally (via the end), a {\em class
+object} is created. This is basically a wrapper around the contents
+of the name space created by the class definition; we'll learn more
+about class objects in the next section. The original local scope
+(the one in effect just before the class definition was entered) is
+reinstated, and the class object is bound here to the class name given in
+the class definition header (\verb\ClassName\ in the example).
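+
As an illustration of the class body being executed in its own name space, consider the following sketch in present-day Python 3. The class `Example` and its attribute names are hypothetical.

```python
class Example:
    greeting = 'hello'           # an assignment in the class's local scope
    shout = greeting.upper()     # ordinary statements run while the body executes
    def method(self):            # a function definition binds its name here too
        return self.greeting

# Once the definition is left, the names are available as class attributes.
names = [n for n in ('greeting', 'shout', 'method') if n in Example.__dict__]
```

None of the three names leaks into the surrounding scope; only `Example` itself is bound there.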
+
+
+\subsection{Class objects}
+
+Class objects support two kinds of operations: attribute references
+and instantiation.
+
+{\em Attribute references} use the standard syntax used for all
+attribute references in Python: \verb\obj.name\. Valid attribute
+names are all the names that were in the class's name space when the
+class object was created. So, if the class definition looked like
+this:
+
+\begin{verbatim}
+    class MyClass:
+        i = 12345
+        def f(x):
+            return 'hello world'
+\end{verbatim}
+
+then \verb\MyClass.i\ and \verb\MyClass.f\ are valid attribute
+references, returning an integer and a function object, respectively.
+Class attributes can also be assigned to, so you can change the
+value of \verb\MyClass.i\ by assignment.
+
+Class {\em instantiation} uses function notation. Just pretend that
+the class object is a parameterless function that returns a new
+instance of the class. For example (assuming the above class):
+
+\begin{verbatim}
+ x = MyClass()
+\end{verbatim}
+
+creates a new {\em instance} of the class and assigns this object to
+the local variable \verb\x\.
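+
Both operations on class objects can be tried together. The sketch below repeats the `MyClass` definition from above so that it runs on its own (also under present-day Python 3); the variable `original` is introduced only for the illustration.

```python
class MyClass:
    i = 12345
    def f(x):
        return 'hello world'

original = MyClass.i     # attribute reference on the class object
MyClass.i = 54321        # class attributes can be assigned to
x = MyClass()            # instantiation uses function-call notation
```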
+
+
+\subsection{Instance objects}
+
+Now what can we do with instance objects? The only operations
+understood by instance objects are attribute references. There are
+two kinds of valid attribute names.
+
+The first I'll call {\em data attributes}. These correspond to
+``instance variables'' in Smalltalk, and to ``data members'' in C++.
+Data attributes need not be declared; like local variables, they
+spring into existence when they are first assigned to. For example,
+if \verb\x\ is the instance of \verb\MyClass\ created above, the
+following piece of code will print the value 16, without leaving a
+trace:
+
+\begin{verbatim}
+    x.counter = 1
+    while x.counter < 10:
+        x.counter = x.counter * 2
+    print x.counter
+    del x.counter
+\end{verbatim}
+
+The second kind of attribute reference understood by instance objects
+is the {\em method}. A method is a function that ``belongs to'' an
+object. (In Python, the term method is not unique to class instances:
+other object types can have methods as well, e.g., list objects have
+methods called append, insert, remove, sort, and so on. However,
+below, we'll use the term method exclusively to mean methods of class
+instance objects, unless explicitly stated otherwise.)
+
+Valid method names of an instance object depend on its class. By
+definition, all attributes of a class that are (user-defined) function
+objects define corresponding methods of its instances. So in our
+example, \verb\x.f\ is a valid method reference, since
+\verb\MyClass.f\ is a function, but \verb\x.i\ is not, since
+\verb\MyClass.i\ is not. But \verb\x.f\ is not the
+same thing as \verb\MyClass.f\ --- it is a {\em method object}, not a
+function object.
+
+
+\subsection{Method objects}
+
+Usually, a method is called immediately, e.g.:
+
+\begin{verbatim}
+ x.f()
+\end{verbatim}
+
+In our example, this will return the string \verb\'hello world'\.
+However, it is not necessary to call a method right away: \verb\x.f\
+is a method object, and can be stored away and called at a later
+moment, for example:
+
+\begin{verbatim}
+    xf = x.f
+    while 1:
+        print xf()
+\end{verbatim}
+
+will continue to print \verb\hello world\ until the end of time.
+
+What exactly happens when a method is called? You may have noticed
+that \verb\x.f()\ was called without an argument above, even though
+the function definition for \verb\f\ specified an argument. What
+happened to the argument? Surely Python raises an exception when a
+function that requires an argument is called without any --- even if
+the argument isn't actually used...
+
+Actually, you may have guessed the answer: the special thing about
+methods is that the object is passed as the first argument of the
+function. In our example, the call \verb\x.f()\ is exactly equivalent
+to \verb\MyClass.f(x)\. In general, calling a method with a list of
+{\em n} arguments is equivalent to calling the corresponding function
+with an argument list that is created by inserting the method's object
+before the first argument.
+
+If you still don't understand how methods work, a look at the
+implementation can perhaps clarify matters. When an instance
+attribute is referenced that isn't a data attribute, its class is
+searched. If the name denotes a valid class attribute that is a
+function object, a method object is created by packing (pointers to)
+the instance object and the function object just found together in an
+abstract object: this is the method object. When the method object is
+called with an argument list, it is unpacked again, a new argument
+list is constructed from the instance object and the original argument
+list, and the function object is called with this new argument list.
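+
The packing and unpacking described above can be observed directly. This sketch uses present-day Python 3, where the two halves of a method object are spelled `__self__` and `__func__`; the names `via_method`, `via_function`, `packed_instance` and `packed_function` are invented for the example.

```python
class MyClass:
    def f(x):
        return 'hello world'

x = MyClass()
xf = x.f                        # a method object: instance and function packed together

via_method = xf()               # the instance is supplied implicitly
via_function = MyClass.f(x)     # the same call written out explicitly

packed_instance = xf.__self__   # the instance object inside the method object
packed_function = xf.__func__   # the function object inside the method object
```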
+
+
+\section{Random remarks}
+
+
+[These should perhaps be placed more carefully...]
+
+
+Data attributes override method attributes with the same name; to
+avoid accidental name conflicts, which may cause hard-to-find bugs in
+large programs, it is wise to use some kind of convention that
+minimizes the chance of conflicts, e.g., capitalize method names,
+prefix data attribute names with a small unique string (perhaps just
+an underscore), or use verbs for methods and nouns for data attributes.
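+
A minimal sketch of such a conflict, assuming a hypothetical class `Counter` whose method and data attribute share the name `count` (runs under present-day Python 3):

```python
class Counter:
    def count(self):
        return 'method called'

c = Counter()
before = c.count()      # found through the class: this is the method
c.count = 0             # a data attribute with the same name now hides it
after = c.count         # the integer, no longer the method
```

From this point on, `c.count()` fails, because the name refers to an integer rather than a method.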
+
+
+Data attributes may be referenced by methods as well as by ordinary
+users (``clients'') of an object. In other words, classes are not
+usable to implement pure abstract data types. In fact, nothing in
+Python makes it possible to enforce data hiding --- it is all based
+upon convention. (On the other hand, the Python implementation,
+written in C, can completely hide implementation details and control
+access to an object if necessary; this can be used by extensions to
+Python written in C.)
+
+
+Clients should use data attributes with care --- clients may mess up
+invariants maintained by the methods by stamping on their data
+attributes. Note that clients may add data attributes of their own to
+an instance object without affecting the validity of the methods, as
+long as name conflicts are avoided --- again, a naming convention can
+save a lot of headaches here.
+
+
+There is no shorthand for referencing data attributes (or other
+methods!) from within methods. I find that this actually increases
+the readability of methods: there is no chance of confusing local
+variables and instance variables when glancing through a method.
+
+
+By convention, the first argument of a method is called
+\verb\self\. This is nothing more than a convention: the name
+\verb\self\ has absolutely no special meaning to Python. (Note,
+however, that by not following the convention your code may be less
+readable by other Python programmers, and it is also conceivable that
+a {\em class browser} program be written which relies upon such a
+convention.)
+
+
+Any function object that is a class attribute defines a method for
+instances of that class. It is not necessary that the function
+definition is textually enclosed in the class definition: assigning a
+function object to a local variable in the class is also ok. For
+example:
+
+\begin{verbatim}
+    # Function defined outside the class
+    def f1(self, x, y):
+        return min(x, x+y)
+
+    class C:
+        f = f1
+        def g(self):
+            return 'hello world'
+        h = g
+\end{verbatim}
+
+Now \verb\f\, \verb\g\ and \verb\h\ are all attributes of class
+\verb\C\ that refer to function objects, and consequently they are all
+methods of instances of \verb\C\ --- \verb\h\ being exactly equivalent
+to \verb\g\. Note that this practice usually only serves to confuse
+the reader of a program.
+
+
+Methods may call other methods by using method attributes of the
+\verb\self\ argument, e.g.:
+
+\begin{verbatim}
+    class Bag:
+        def empty(self):
+            self.data = []
+        def add(self, x):
+            self.data.append(x)
+        def addtwice(self, x):
+            self.add(x)
+            self.add(x)
+\end{verbatim}
+
+
+The instantiation operation (``calling'' a class object) creates an
+empty object. Many classes like to create objects in a known initial
+state. There is no special syntax to enforce this, but a convention
+works almost as well: add a method named \verb\init\ to the class,
+which initializes the instance (by assigning to some important data
+attributes) and returns the instance itself. For example, class
+\verb\Bag\ above could have the following method:
+
+\begin{verbatim}
+    def init(self):
+        self.empty()
+        return self
+\end{verbatim}
+
+The client can then create and initialize an instance in one
+statement, as follows:
+
+\begin{verbatim}
+ x = Bag().init()
+\end{verbatim}
+
+Of course, the \verb\init\ method may have arguments for greater
+flexibility.
+
+Warning: a common mistake is to forget the \verb\return self\ at the
+end of an init method!
+
+
+Methods may reference global names in the same way as ordinary
+functions. The global scope associated with a method is the module
+containing the class definition. (The class itself is never used as a
+global scope!) While one rarely encounters a good reason for using
+global data in a method, there are many legitimate uses of the global
+scope: for one thing, functions and modules imported into the global
+scope can be used by methods, as well as functions and classes defined
+in it. Usually, the class containing the method is itself defined in
+this global scope, and in the next section we'll find some good
+reasons why a method would want to reference its own class!
+
+
+\section{Inheritance}
+
+Of course, a language feature would not be worthy of the name ``class''
+without supporting inheritance. The syntax for a derived class
+definition looks as follows:
+
+\begin{verbatim}
+    class DerivedClassName(BaseClassName):
+        <statement-1>
+        .
+        .
+        .
+        <statement-N>
+\end{verbatim}
+
+The name \verb\BaseClassName\ must be defined in a scope containing
+the derived class definition. Instead of a base class name, an
+expression is also allowed. This is useful when the base class is
+defined in another module, e.g.,
+
+\begin{verbatim}
+ class DerivedClassName(modname.BaseClassName):
+\end{verbatim}
+
+Execution of a derived class definition proceeds the same as for a
+base class. When the class object is constructed, the base class is
+remembered. This is used for resolving attribute references: if a
+requested attribute is not found in the class, it is searched in the
+base class. This rule is applied recursively if the base class itself
+is derived from some other class.
+
+There's nothing special about instantiation of derived classes:
+\verb\DerivedClassName()\ creates a new instance of the class. Method
+references are resolved as follows: the corresponding class attribute
+is searched, descending down the chain of base classes if necessary,
+and the method reference is valid if this yields a function object.
+
+Derived classes may override methods of their base classes. Because
+methods have no special privileges when calling other methods of the
+same object, a method of a base class that calls another method
+defined in the same base class may in fact end up calling a method of
+a derived class that overrides it. (For C++ programmers: all methods
+in Python are ``virtual functions''.)
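+
The ``virtual function'' behaviour can be sketched with two hypothetical classes, `Base` and `Derived`. The method `describe` is defined only in the base class, yet it ends up calling whichever `name` method belongs to the instance's own class.

```python
class Base:
    def describe(self):
        return 'I am ' + self.name()   # looked up starting at the instance's class
    def name(self):
        return 'Base'

class Derived(Base):
    def name(self):                    # overrides Base.name
        return 'Derived'

base_text = Base().describe()
derived_text = Derived().describe()    # Base.describe calls Derived.name
```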
+
+An overriding method in a derived class may in fact want to extend
+rather than simply replace the base class method of the same name.
+There is a simple way to call the base class method directly: just
+call \verb\BaseClassName.methodname(self, arguments)\. This is
+occasionally useful to clients as well. (Note that this only works if
+the base class is defined or imported directly in the global scope.)
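+
Extending rather than replacing a base class method might look like this. The classes `Logger` and `LoudLogger` are made up for the illustration, and both are assumed to live in the same global scope.

```python
class Logger:
    def format(self, message):
        return 'log: ' + message

class LoudLogger(Logger):
    def format(self, message):
        # Call the base class method directly, passing self explicitly,
        # then extend its result.
        return Logger.format(self, message).upper()

text = LoudLogger().format('disk full')
```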
+
+
+\subsection{Multiple inheritance}
+
+Python supports a limited form of multiple inheritance as well. A
+class definition with multiple base classes looks as follows:
+
+\begin{verbatim}
+    class DerivedClassName(Base1, Base2, Base3):
+        <statement-1>
+        .
+        .
+        .
+        <statement-N>
+\end{verbatim}
+
+The only rule necessary to explain the semantics is the resolution
+rule used for class attribute references. This is depth-first,
+left-to-right. Thus, if an attribute is not found in
+\verb\DerivedClassName\, it is searched in \verb\Base1\, then
+(recursively) in the base classes of \verb\Base1\, and only if it is
+not found there, it is searched in \verb\Base2\, and so on.
+
+(To some people breadth first --- searching \verb\Base2\ and
+\verb\Base3\ before the base classes of \verb\Base1\ --- looks more
+natural. However, this would require you to know whether a particular
+attribute of \verb\Base1\ is actually defined in \verb\Base1\ or in
+one of its base classes before you can figure out the consequences of
+a name conflict with an attribute of \verb\Base2\. The depth-first
+rule makes no difference between direct and inherited attributes of
+\verb\Base1\.)
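+
A small sketch of the depth-first, left-to-right rule. The class `Root1` is a hypothetical base class of `Base1`. One caveat: present-day Python 3 computes the search order with a different algorithm (C3 linearization), which gives the same order as the rule described here when, as in this example, the bases share no common user-defined ancestor.

```python
class Root1:
    def who(self):
        return 'Root1'

class Base1(Root1):
    pass

class Base2:
    def who(self):
        return 'Base2'

class DerivedClassName(Base1, Base2):
    pass

# Not found in DerivedClassName or Base1, so the base classes of Base1
# are searched before Base2 is considered.
found = DerivedClassName().who()
```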
+
+It is clear that indiscriminate use of multiple inheritance is a
+maintenance nightmare, given the reliance in Python on conventions to
+avoid accidental name conflicts. A well-known problem with multiple
+inheritance is a class derived from two classes that happen to have a
+common base class. While it is easy enough to figure out what happens
+in this case (the instance will have a single copy of ``instance
+variables'' or data attributes used by the common base class), it is
+not clear that these semantics are in any way useful.
+
+
+\section{Odds and ends}
+
+Sometimes it is useful to have a data type similar to the Pascal
+``record'' or C ``struct'', bundling together a couple of named data
+items. An empty class definition will do nicely, e.g.:
+
+\begin{verbatim}
+    class Employee:
+        pass
+
+    john = Employee() # Create an empty employee record
+
+    # Fill the fields of the record
+    john.name = 'John Doe'
+    john.dept = 'computer lab'
+    john.salary = 1000
+\end{verbatim}
+
+
+A piece of Python code that expects a particular abstract data type
+can often be passed a class that emulates the methods of that data
+type instead. For instance, if you have a function that formats some
+data from a file object, you can define a class with methods
+\verb\read()\ and \verb\readline()\ that gets the data from a string
+buffer instead, and pass it as an argument. (Unfortunately, this
+technique has its limitations: a class can't define operations that
+are accessed by special syntax such as sequence subscripting or
+arithmetic operators, and assigning such a ``pseudo-file'' to
+\verb\sys.stdin\ will not cause the interpreter to read further input
+from it.)
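+
A possible shape for such a pseudo-file is sketched below. The class `StringFile` and the function `first_line_upper` are hypothetical, and the `init` method follows the convention described earlier in this chapter (it returns the instance). The code also runs under present-day Python 3.

```python
class StringFile:
    # Hypothetical pseudo-file that reads from a string buffer.
    def init(self, text):
        self.buffer = text
        self.pos = 0
        return self
    def read(self):
        # Return everything that has not been read yet.
        data = self.buffer[self.pos:]
        self.pos = len(self.buffer)
        return data
    def readline(self):
        # Return the next line, including its trailing newline if present.
        end = self.buffer.find('\n', self.pos)
        if end < 0:
            end = len(self.buffer)
        else:
            end = end + 1
        line = self.buffer[self.pos:end]
        self.pos = end
        return line

def first_line_upper(f):
    # Hypothetical client code: it only needs an object with readline().
    return f.readline().strip().upper()

pseudo = StringFile().init('alpha\nbeta\n')
heading = first_line_upper(pseudo)
rest = pseudo.read()
```

The client function never checks the type of its argument, so a real file object and the string-backed class are interchangeable here.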
+
+
+Instance method objects have attributes, too: \verb\m.im_self\ is the
+instance object to which the method belongs, and \verb\m.im_func\ is the
+function object corresponding to the method.
+
+
+XXX Mention bw compat hacks.
+
+
\end{document}