author | Andrew M. Kuchling <amk@amk.ca> | 2005-08-30 01:25:05 (GMT) |
---|---|---|
committer | Andrew M. Kuchling <amk@amk.ca> | 2005-08-30 01:25:05 (GMT) |
commit | e8f44d683e79c7a9659a4480736d55193da4a7b1 (patch) | |
tree | 37e8b05066aa1caf85f6b25d52f1576366e45e8e /Doc/howto/doanddont.tex | |
parent | f1b2ba6aa1751c5325e8fb87a28e54a857796bfa (diff) | |
Commit the howto source to the main Python repository, with Fred's approval
Diffstat (limited to 'Doc/howto/doanddont.tex')
-rw-r--r-- | Doc/howto/doanddont.tex | 343 |
1 files changed, 343 insertions, 0 deletions
diff --git a/Doc/howto/doanddont.tex b/Doc/howto/doanddont.tex
new file mode 100644
index 0000000..adbde66
--- /dev/null
+++ b/Doc/howto/doanddont.tex
@@ -0,0 +1,343 @@
+\documentclass{howto}
+
+\title{Idioms and Anti-Idioms in Python}
+
+\release{0.00}
+
+\author{Moshe Zadka}
+\authoraddress{howto@zadka.site.co.il}
+
+\begin{document}
+\maketitle
+
+This document is placed in the public domain.
+
+\begin{abstract}
+\noindent
+This document can be considered a companion to the tutorial. It
+shows how to use Python, and even more importantly, how {\em not}
+to use Python.
+\end{abstract}
+
+\tableofcontents
+
+\section{Language Constructs You Should Not Use}
+
+While Python has relatively few gotchas compared to other languages, it
+still has some constructs which are only useful in corner cases, or are
+plain dangerous.
+
+\subsection{from module import *}
+
+\subsubsection{Inside Function Definitions}
+
+\code{from module import *} is {\em invalid} inside function definitions.
+While many versions of Python do not check for the invalidity, it does not
+make it more valid, no more than having a smart lawyer makes a man innocent.
+Do not use it like that ever. Even in versions where it was accepted, it made
+the function execution slower, because the compiler could not be certain
+which names are local and which are global. In Python 2.1 this construct
+causes warnings, and sometimes even errors.
+
+\subsubsection{At Module Level}
+
+While it is valid to use \code{from module import *} at module level it
+is usually a bad idea. For one, this loses an important property Python
+otherwise has --- you can know where each toplevel name is defined by
+a simple "search" function in your favourite editor. You also open yourself
+to trouble in the future, if some module grows additional functions or
+classes.
+
+One of the most awful questions asked on the newsgroup is why this code:
+
+\begin{verbatim}
+f = open("www")
+f.read()
+\end{verbatim}
+
+does not work. Of course, it works just fine (assuming you have a file
+called "www".) But it does not work if somewhere in the module, the
+statement \code{from os import *} is present. The \module{os} module
+has a function called \function{open()} which returns an integer. While
+it is very useful, shadowing builtins is one of its least useful properties.
+
+Remember, you can never know for sure what names a module exports, so either
+take what you need --- \code{from module import name1, name2}, or keep them in
+the module and access on a per-need basis ---
+\code{import module;print module.name}.
+
+\subsubsection{When It Is Just Fine}
+
+There are situations in which \code{from module import *} is just fine:
+
+\begin{itemize}
+
+\item The interactive prompt. For example, \code{from math import *} makes
+      Python an amazing scientific calculator.
+
+\item When extending a module in C with a module in Python.
+
+\item When the module advertises itself as \code{from import *} safe.
+
+\end{itemize}
+
+\subsection{Unadorned \keyword{exec}, \function{execfile} and friends}
+
+The word ``unadorned'' refers to the use without an explicit dictionary,
+in which case those constructs evaluate code in the {\em current} environment.
+This is dangerous for the same reasons \code{from import *} is dangerous ---
+it might step over variables you are counting on and mess up things for
+the rest of your code. Simply do not do that.
+
+Bad examples:
+
+\begin{verbatim}
+>>> for name in sys.argv[1:]:
+>>>     exec "%s=1" % name
+>>> def func(s, **kw):
+>>>     for var, val in kw.items():
+>>>         exec "s.%s=val" % var  # invalid!
+>>> execfile("handler.py")
+>>> handle()
+\end{verbatim}
+
+Good examples:
+
+\begin{verbatim}
+>>> d = {}
+>>> for name in sys.argv[1:]:
+>>>     d[name] = 1
+>>> def func(s, **kw):
+>>>     for var, val in kw.items():
+>>>         setattr(s, var, val)
+>>> d={}
+>>> execfile("handler.py", d, d)
+>>> handle = d['handle']
+>>> handle()
+\end{verbatim}
+
+\subsection{from module import name1, name2}
+
+This is a ``don't'' which is much weaker than the previous ``don't''s
+but is still something you should not do if you don't have good reasons
+to do that. The reason it is usually a bad idea is because you suddenly
+have an object which lives in two separate namespaces. When the binding
+in one namespace changes, the binding in the other will not, so there
+will be a discrepancy between them. This happens when, for example,
+one module is reloaded, or changes the definition of a function at runtime.
+
+Bad example:
+
+\begin{verbatim}
+# foo.py
+a = 1
+
+# bar.py
+from foo import a
+if something():
+    a = 2 # danger: foo.a != a
+\end{verbatim}
+
+Good example:
+
+\begin{verbatim}
+# foo.py
+a = 1
+
+# bar.py
+import foo
+if something():
+    foo.a = 2
+\end{verbatim}
+
+\subsection{except:}
+
+Python has the \code{except:} clause, which catches all exceptions.
+Since {\em every} error in Python raises an exception, this makes many
+programming errors look like runtime problems, and hinders
+the debugging process.
+
+The following code shows a great example:
+
+\begin{verbatim}
+try:
+    foo = opne("file") # misspelled "open"
+except:
+    sys.exit("could not open file!")
+\end{verbatim}
+
+The second line triggers a \exception{NameError} which is caught by the
+except clause. The program will exit, and you will have no idea that
+this has nothing to do with the readability of \code{"file"}.
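To see the masking in action, here is a minimal sketch in Python 3 syntax (the howto itself uses Python 2). The helper name `load` is hypothetical and not part of the howto. It repeats the misspelled `opne` call and uses `sys.exc_info()` to report what the bare `except:` actually swallowed.

```python
import sys


def load(path):
    """Hypothetical helper that repeats the misspelled call from the howto."""
    try:
        return opne(path)  # deliberate typo for "open"
    except:  # bare except: catches NameError along with everything else
        # sys.exc_info()[0] is the class of the exception being handled.
        return sys.exc_info()[0]


caught = load("file")
print(caught.__name__)  # prints "NameError", not an I/O error
```

The caller sees only a generic failure path, although the file was never touched.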
+
+The example above is better written
+
+\begin{verbatim}
+try:
+    foo = opne("file") # will be changed to "open" as soon as we run it
+except IOError:
+    sys.exit("could not open file")
+\end{verbatim}
+
+There are some situations in which the \code{except:} clause is useful:
+for example, in a framework when running callbacks, it is good not to
+let any callback disturb the framework.
+
+\section{Exceptions}
+
+Exceptions are a useful feature of Python. You should learn to raise
+them whenever something unexpected occurs, and catch them only where
+you can do something about them.
+
+The following is a very popular anti-idiom
+
+\begin{verbatim}
+def get_status(file):
+    if not os.path.exists(file):
+        print "file not found"
+        sys.exit(1)
+    return open(file).readline()
+\end{verbatim}
+
+Consider the case where the file gets deleted between the time the call to
+\function{os.path.exists} is made and the time \function{open} is called.
+That means the last line will throw an \exception{IOError}. The same would
+happen if \var{file} exists but has no read permission. Since testing this
+on a normal machine on existing and non-existing files makes it seem bugless,
+that means in testing the results will seem fine, and the code will get
+shipped. Then an unhandled \exception{IOError} escapes to the user, who
+has to watch the ugly traceback.
+
+Here is a better way to do it.
+
+\begin{verbatim}
+def get_status(file):
+    try:
+        return open(file).readline()
+    except (IOError, OSError):
+        print "file not found"
+        sys.exit(1)
+\end{verbatim}
+
+In this version, {\em either} the file gets opened and the line is read
+(so it works even on flaky NFS or SMB connections), or the message
+is printed and the application aborted.
+
+Still, \function{get_status} makes too many assumptions --- that it
+will only be used in a short running script, and not, say, in a long
+running server. Sure, the caller could do something like
+
+\begin{verbatim}
+try:
+    status = get_status(log)
+except SystemExit:
+    status = None
+\end{verbatim}
+
+So, try to use as few \code{except} clauses as possible in your code --- those
+will usually be a catch-all in the \function{main}, or inside calls which
+should always succeed.
+
+So, the best version is probably
+
+\begin{verbatim}
+def get_status(file):
+    return open(file).readline()
+\end{verbatim}
+
+The caller can deal with the exception if it wants (for example, if it
+tries several files in a loop), or just let the exception filter upwards
+to {\em its} caller.
+
+The last version is not very good either --- due to implementation details,
+the file would not be closed when an exception is raised until the handler
+finishes, and perhaps not at all in non-C implementations (e.g., Jython).
+
+\begin{verbatim}
+def get_status(file):
+    fp = open(file)
+    try:
+        return fp.readline()
+    finally:
+        fp.close()
+\end{verbatim}
+
+\section{Using the Batteries}
+
+Every so often, people seem to be writing stuff in the Python library
+again, usually poorly. While the occasional module has a poor interface,
+it is usually much better to use the rich standard library and data
+types that come with Python than inventing your own.
+
+A useful module very few people know about is \module{os.path}. It
+always has the correct path arithmetic for your operating system, and
+will usually be much better than whatever you come up with yourself.
+
+Compare:
+
+\begin{verbatim}
+# ugh!
+return dir+"/"+file
+# better
+return os.path.join(dir, file)
+\end{verbatim}
+
+More useful functions in \module{os.path}: \function{basename},
+\function{dirname} and \function{splitext}.
+
+There are also many useful builtin functions people seem not to be
+aware of for some reason: \function{min()} and \function{max()} can
+find the minimum/maximum of any sequence with comparable semantics,
+for example, yet many people write their own max/min. Another highly
+useful function is \function{reduce()}. Classical use of \function{reduce()}
+is something like
+
+\begin{verbatim}
+import sys, operator
+nums = map(float, sys.argv[1:])
+print reduce(operator.add, nums)/len(nums)
+\end{verbatim}
+
+This cute little script prints the average of all numbers given on the
+command line. The \function{reduce()} adds up all the numbers, and
+the rest is just some pre- and postprocessing.
+
+Similarly, note that \function{float()}, \function{int()} and
+\function{long()} all accept arguments of type string, and so are
+suited to parsing --- assuming you are ready to deal with the
+\exception{ValueError} they raise.
+
+\section{Using Backslash to Continue Statements}
+
+Since Python treats a newline as a statement terminator,
+and since statements are often longer than is comfortable to put
+in one line, many people do:
+
+\begin{verbatim}
+if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
+   calculate_number(10, 20) != forbulate(500, 360):
+      pass
+\end{verbatim}
+
+You should realize that this is dangerous: a stray space after the
+\code{\\} would make this line wrong, and stray spaces are notoriously
+hard to see in editors. In this case, at least it would be a syntax
+error, but if the code was:
+
+\begin{verbatim}
+value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
+        + calculate_number(10, 20)*forbulate(500, 360)
+\end{verbatim}
+
+then it would just be subtly wrong.
+
+It is usually much better to use the implicit continuation inside parentheses:
+
+This version is bulletproof:
+
+\begin{verbatim}
+value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
+        + calculate_number(10, 20)*forbulate(500, 360))
+\end{verbatim}
+
+\end{document}
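The closing section of the howto, on continuing statements, can be checked with a short sketch in Python 3 syntax. It assumes simple stand-in expressions (`1` and `2`) in place of the howto's `foo.bar()` and `baz.quux()` calls, and the helper name `compiles` is hypothetical. It compiles three variants of one statement to show that a stray space breaks a backslash continuation but is harmless inside parentheses.

```python
def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Backslash continuation, written correctly.
backslash_ok = "value = 1 \\\n        + 2\n"
# The same statement with a stray space after the backslash.
backslash_space = "value = 1 \\ \n        + 2\n"
# Parenthesized continuation with a stray space ending the first line.
paren_space = "value = (1 \n        + 2)\n"

results = [compiles(s) for s in (backslash_ok, backslash_space, paren_space)]
print(results)  # [True, False, True]

# The parenthesized form still evaluates the whole expression.
namespace = {}
exec(paren_space, namespace)
print(namespace["value"])  # 3
```

Only the backslash form is sensitive to invisible trailing whitespace, which is the reason to prefer parentheses.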