Diffstat (limited to 'Doc/howto/doanddont.rst')
-rw-r--r-- | Doc/howto/doanddont.rst | 308 |
1 files changed, 308 insertions, 0 deletions
diff --git a/Doc/howto/doanddont.rst b/Doc/howto/doanddont.rst
new file mode 100644
index 0000000..a322c53
--- /dev/null
+++ b/Doc/howto/doanddont.rst
@@ -0,0 +1,308 @@
+************************************
+  Idioms and Anti-Idioms in Python
+************************************
+
+:Author: Moshe Zadka
+
+This document is placed in the public domain.
+
+
+.. topic:: Abstract
+
+   This document can be considered a companion to the tutorial. It shows how to use
+   Python, and even more importantly, how *not* to use Python.
+
+
+Language Constructs You Should Not Use
+======================================
+
+While Python has relatively few gotchas compared to other languages, it still
+has some constructs which are only useful in corner cases, or are plain
+dangerous.
+
+
+from module import \*
+---------------------
+
+
+Inside Function Definitions
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``from module import *`` is *invalid* inside function definitions. While many
+versions of Python do not check for the invalidity, it does not make it more
+valid, no more than having a smart lawyer makes a man innocent. Do not use it
+like that ever. Even in versions where it was accepted, it made the function
+execution slower, because the compiler could not be certain which names are
+local and which are global. In Python 2.1 this construct causes warnings, and
+sometimes even errors.
+
+
+At Module Level
+^^^^^^^^^^^^^^^
+
+While it is valid to use ``from module import *`` at module level it is usually
+a bad idea. For one, this loses an important property Python otherwise has ---
+you can know where each toplevel name is defined by a simple "search" function
+in your favourite editor. You also open yourself to trouble in the future, if
+some module grows additional functions or classes.
+
+One of the most awful questions asked on the newsgroup is why this code::
+
+   f = open("www")
+   f.read()
+
+does not work. Of course, it works just fine (assuming you have a file called "www".)
+But it does not work if somewhere in the module, the statement ``from os
+import *`` is present. The :mod:`os` module has a function called :func:`open`
+which returns an integer. While it is very useful, shadowing builtins is one of
+its least useful properties.
+
+Remember, you can never know for sure what names a module exports, so either
+take what you need --- ``from module import name1, name2``, or keep them in the
+module and access on a per-need basis --- ``import module;print module.name``.
+
+
+When It Is Just Fine
+^^^^^^^^^^^^^^^^^^^^
+
+There are situations in which ``from module import *`` is just fine:
+
+* The interactive prompt. For example, ``from math import *`` makes Python an
+  amazing scientific calculator.
+
+* When extending a module in C with a module in Python.
+
+* When the module advertises itself as ``from import *`` safe.
+
+
+Unadorned :keyword:`exec` and friends
+-------------------------------------
+
+The word "unadorned" refers to the use without an explicit dictionary, in which
+case those constructs evaluate code in the *current* environment. This is
+dangerous for the same reasons ``from import *`` is dangerous --- it might step
+over variables you are counting on and mess up things for the rest of your code.
+Simply do not do that.
+
+Bad examples::
+
+   >>> for name in sys.argv[1:]:
+   >>>     exec "%s=1" % name
+   >>> def func(s, **kw):
+   >>>     for var, val in kw.items():
+   >>>         exec "s.%s=val" % var  # invalid!
+   >>> exec(open("handler.py").read())
+   >>> handle()
+
+Good examples::
+
+   >>> d = {}
+   >>> for name in sys.argv[1:]:
+   >>>     d[name] = 1
+   >>> def func(s, **kw):
+   >>>     for var, val in kw.items():
+   >>>         setattr(s, var, val)
+   >>> d={}
+   >>> exec(open("handler.py").read(), d, d)
+   >>> handle = d['handle']
+   >>> handle()
+
+
+from module import name1, name2
+-------------------------------
+
+This is a "don't" which is much weaker than the previous "don't"s but is still
+something you should not do if you don't have good reasons to do that. The
+reason it is usually a bad idea is that you suddenly have an object which lives
+in two separate namespaces. When the binding in one namespace changes, the
+binding in the other will not, so there will be a discrepancy between them. This
+happens when, for example, one module is reloaded, or changes the definition of
+a function at runtime.
+
+Bad example::
+
+   # foo.py
+   a = 1
+
+   # bar.py
+   from foo import a
+   if something():
+       a = 2 # danger: foo.a != a
+
+Good example::
+
+   # foo.py
+   a = 1
+
+   # bar.py
+   import foo
+   if something():
+       foo.a = 2
+
+
+except:
+-------
+
+Python has the ``except:`` clause, which catches all exceptions. Since *every*
+error in Python raises an exception, this makes many programming errors look
+like runtime problems, and hinders the debugging process.
+
+The following code shows a great example::
+
+   try:
+       foo = opne("file") # misspelled "open"
+   except:
+       sys.exit("could not open file!")
+
+The second line triggers a :exc:`NameError` which is caught by the except
+clause. The program will exit, and you will have no idea that this has nothing
+to do with the readability of ``"file"``.
+
+The example above is better written ::
+
+   try:
+       foo = opne("file") # will be changed to "open" as soon as we run it
+   except IOError:
+       sys.exit("could not open file")
+
+There are some situations in which the ``except:`` clause is useful: for
+example, in a framework when running callbacks, it is good not to let any
+callback disturb the framework.
+
+
+Exceptions
+==========
+
+Exceptions are a useful feature of Python. You should learn to raise them
+whenever something unexpected occurs, and catch them only where you can do
+something about them.
+
+The following is a very popular anti-idiom ::
+
+   def get_status(file):
+       if not os.path.exists(file):
+           print "file not found"
+           sys.exit(1)
+       return open(file).readline()
+
+Consider the case where the file gets deleted between the time the call to
+:func:`os.path.exists` is made and the time :func:`open` is called. That means
+the last line will throw an :exc:`IOError`. The same would happen if *file*
+exists but has no read permission. Since testing this on a normal machine on
+existing and non-existing files makes it seem bugless, the test results will
+seem fine, and the code will get shipped. Then an unhandled
+:exc:`IOError` escapes to the user, who has to watch the ugly traceback.
+
+Here is a better way to do it. ::
+
+   def get_status(file):
+       try:
+           return open(file).readline()
+       except (IOError, OSError):
+           print "file not found"
+           sys.exit(1)
+
+In this version, *either* the file gets opened and the line is read (so it
+works even on flaky NFS or SMB connections), or the message is printed and the
+application aborted.
+
+Still, :func:`get_status` makes too many assumptions --- that it will only be
+used in a short running script, and not, say, in a long running server.
+Sure, the caller could do something like ::
+
+   try:
+       status = get_status(log)
+   except SystemExit:
+       status = None
+
+So, try to write as few ``except`` clauses as you can --- those will usually be
+a catch-all in the :func:`main`, or inside calls which should always succeed.
+
+So, the best version is probably ::
+
+   def get_status(file):
+       return open(file).readline()
+
+The caller can deal with the exception if it wants (for example, if it tries
+several files in a loop), or just let the exception filter upwards to *its*
+caller.
+
+The last version is not very good either --- due to implementation details, the
+file would not be closed when an exception is raised until the handler finishes,
+and perhaps not at all in non-C implementations (e.g., Jython). ::
+
+   def get_status(file):
+       fp = open(file)
+       try:
+           return fp.readline()
+       finally:
+           fp.close()
+
+
+Using the Batteries
+===================
+
+Every so often, people seem to be writing stuff in the Python library again,
+usually poorly. While the occasional module has a poor interface, it is usually
+much better to use the rich standard library and data types that come with
+Python than inventing your own.
+
+A useful module very few people know about is :mod:`os.path`. It always has the
+correct path arithmetic for your operating system, and will usually be much
+better than whatever you come up with yourself.
+
+Compare::
+
+   # ugh!
+   return dir+"/"+file
+   # better
+   return os.path.join(dir, file)
+
+More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
+:func:`splitext`.
+
+There are also many useful builtin functions people seem not to be aware of for
+some reason: :func:`min` and :func:`max` can find the minimum/maximum of any
+sequence with comparable semantics, for example, yet many people write their own
+:func:`max`/:func:`min`. Another highly useful function is :func:`reduce`.
+A classical use of :func:`reduce` is something like ::
+
+   import sys, operator
+   nums = map(float, sys.argv[1:])
+   print reduce(operator.add, nums)/len(nums)
+
+This cute little script prints the average of all numbers given on the command
+line. The :func:`reduce` adds up all the numbers, and the rest is just some
+pre- and postprocessing.
+
+On a related note, :func:`float`, :func:`int` and :func:`long` all
+accept arguments of type string, and so are suited to parsing --- assuming you
+are ready to deal with the :exc:`ValueError` they raise.
+
+
+Using Backslash to Continue Statements
+======================================
+
+Since Python treats a newline as a statement terminator, and since statements
+are often longer than is comfortable to put in one line, many people do::
+
+   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
+           calculate_number(10, 20) != forbulate(500, 360):
+       pass
+
+You should realize that this is dangerous: a stray space after the ``\`` would
+make this line wrong, and stray spaces are notoriously hard to see in editors.
+In this case, at least it would be a syntax error, but if the code was::
+
+   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
+           + calculate_number(10, 20)*forbulate(500, 360)
+
+then it would just be subtly wrong.
+
+It is usually much better to use the implicit continuation inside parentheses.
+
+This version is bulletproof::
+
+   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
+           + calculate_number(10, 20)*forbulate(500, 360))
+
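The fragility of backslash continuation can be checked directly. What follows is a minimal sketch and not part of the patch: the helper names ``run`` and ``compiles`` and the small arithmetic expression are made up for illustration, and the snippets are held in strings only so that the trailing whitespace is visible. It assumes a current Python 3 interpreter. It compares a correct backslash continuation, one with a stray space after the backslash, the parenthesized form with trailing spaces, and a variant where the backslash has gone missing altogether.

```python
# The same two-line assignment in four spellings, kept as source strings so
# that trailing whitespace shows up in the literals.
backslash_ok = "value = 1 * 2 \\\n        + 3 * 4\n"
backslash_space = "value = 1 * 2 \\ \n        + 3 * 4\n"    # space after the backslash
parenthesized = "value = (1 * 2   \n        + 3 * 4)   \n"   # trailing spaces everywhere
backslash_missing = "value = 1 * 2\n+ 3 * 4\n"              # backslash lost entirely


def run(source):
    """Compile and execute source in a fresh namespace; return its 'value'."""
    namespace = {}
    exec(compile(source, "<example>", "exec"), namespace)
    return namespace["value"]


def compiles(source):
    """Return True if source compiles, False if it is a syntax error."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(run(backslash_ok))          # the intended result
print(compiles(backslash_space))  # the stray space is rejected by the compiler
print(run(parenthesized))         # trailing spaces inside parentheses are harmless
print(run(backslash_missing))     # two valid statements; the second line is discarded
```

In this sketch the missing backslash is the case that fails silently: the second line becomes a separate expression statement whose result is thrown away, so ``value`` ends up holding only the first product. The parenthesized form cannot go wrong in either way, because the open parenthesis keeps the statement alive until it is closed.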