diff options
Diffstat (limited to 'Doc/howto')
-rw-r--r-- | Doc/howto/cporting.rst | 16 | ||||
-rw-r--r-- | Doc/howto/descriptor.rst | 6 | ||||
-rw-r--r-- | Doc/howto/doanddont.rst | 290 | ||||
-rw-r--r-- | Doc/howto/index.rst | 4 | ||||
-rw-r--r-- | Doc/howto/logging-cookbook.rst | 1039 | ||||
-rw-r--r-- | Doc/howto/logging.rst | 1026 | ||||
-rw-r--r-- | Doc/howto/pyporting.rst | 703 | ||||
-rw-r--r-- | Doc/howto/sorting.rst | 12 | ||||
-rw-r--r-- | Doc/howto/unicode.rst | 102 | ||||
-rw-r--r-- | Doc/howto/webservers.rst | 2 |
10 files changed, 2837 insertions, 363 deletions
diff --git a/Doc/howto/cporting.rst b/Doc/howto/cporting.rst index d20e4a6..7184496 100644 --- a/Doc/howto/cporting.rst +++ b/Doc/howto/cporting.rst @@ -22,7 +22,7 @@ Conditional compilation ======================= The easiest way to compile only some code for 3.0 is to check if -:cmacro:`PY_MAJOR_VERSION` is greater than or equal to 3. :: +:c:macro:`PY_MAJOR_VERSION` is greater than or equal to 3. :: #if PY_MAJOR_VERSION >= 3 #define IS_PY3K @@ -47,12 +47,12 @@ Python 3.0's :func:`str` (``PyString_*`` functions in C) type is equivalent to 2.x's :func:`unicode` (``PyUnicode_*``). The old 8-bit string type has become :func:`bytes`. Python 2.6 and later provide a compatibility header, :file:`bytesobject.h`, mapping ``PyBytes`` names to ``PyString`` ones. For best -compatibility with 3.0, :ctype:`PyUnicode` should be used for textual data and -:ctype:`PyBytes` for binary data. It's also important to remember that -:ctype:`PyBytes` and :ctype:`PyUnicode` in 3.0 are not interchangeable like -:ctype:`PyString` and :ctype:`PyUnicode` are in 2.x. The following example -shows best practices with regards to :ctype:`PyUnicode`, :ctype:`PyString`, -and :ctype:`PyBytes`. :: +compatibility with 3.0, :c:type:`PyUnicode` should be used for textual data and +:c:type:`PyBytes` for binary data. It's also important to remember that +:c:type:`PyBytes` and :c:type:`PyUnicode` in 3.0 are not interchangeable like +:c:type:`PyString` and :c:type:`PyUnicode` are in 2.x. The following example +shows best practices with regards to :c:type:`PyUnicode`, :c:type:`PyString`, +and :c:type:`PyBytes`. :: #include "stdlib.h" #include "Python.h" @@ -120,7 +120,7 @@ can also be used in some cases. :: Module initialization and state =============================== -Python 3.0 has a revamped extension module initialization system. (See PEP +Python 3.0 has a revamped extension module initialization system. (See :pep:`3121`.) Instead of storing module state in globals, they should be stored in an interpreter specific structure. Creating modules that act correctly in both 2.x and 3.0 is tricky. The following simple example demonstrates how. :: diff --git a/Doc/howto/descriptor.rst b/Doc/howto/descriptor.rst index cdb6a8e..81bb8ca 100644 --- a/Doc/howto/descriptor.rst +++ b/Doc/howto/descriptor.rst @@ -97,7 +97,7 @@ transforms ``b.x`` into ``type(b).__dict__['x'].__get__(b, type(b))``. The implementation works through a precedence chain that gives data descriptors priority over instance variables, instance variables priority over non-data descriptors, and assigns lowest priority to :meth:`__getattr__` if provided. The -full C implementation can be found in :cfunc:`PyObject_GenericGetAttr()` in +full C implementation can be found in :c:func:`PyObject_GenericGetAttr()` in `Objects/object.c <http://svn.python.org/view/python/trunk/Objects/object.c?view=markup>`_\. For classes, the machinery is in :meth:`type.__getattribute__` which transforms @@ -131,7 +131,7 @@ search using :meth:`object.__getattribute__`. Note, in Python 2.2, ``super(B, obj).m()`` would only invoke :meth:`__get__` if ``m`` was a data descriptor. In Python 2.3, non-data descriptors also get invoked unless an old-style class is involved. The implementation details are -in :cfunc:`super_getattro()` in +in :c:func:`super_getattro()` in `Objects/typeobject.c <http://svn.python.org/view/python/trunk/Objects/typeobject.c?view=markup>`_ and a pure Python equivalent can be found in `Guido's Tutorial`_. @@ -297,7 +297,7 @@ Running the interpreter shows how the function descriptor works in practice:: The output suggests that bound and unbound methods are two different types. While they could have been implemented that way, the actual C implementation of -:ctype:`PyMethod_Type` in +:c:type:`PyMethod_Type` in `Objects/classobject.c <http://svn.python.org/view/python/trunk/Objects/classobject.c?view=markup>`_ is a single object with two different representations depending on whether the :attr:`im_self` field is set or is *NULL* (the C equivalent of *None*). diff --git a/Doc/howto/doanddont.rst b/Doc/howto/doanddont.rst deleted file mode 100644 index 365a6209d..0000000 --- a/Doc/howto/doanddont.rst +++ /dev/null @@ -1,290 +0,0 @@ -************************************ - Idioms and Anti-Idioms in Python -************************************ - -:Author: Moshe Zadka - -This document is placed in the public domain. - - -.. topic:: Abstract - - This document can be considered a companion to the tutorial. It shows how to use - Python, and even more importantly, how *not* to use Python. - - -Language Constructs You Should Not Use -====================================== - -While Python has relatively few gotchas compared to other languages, it still -has some constructs which are only useful in corner cases, or are plain -dangerous. - - -from module import \* ---------------------- - - -Inside Function Definitions -^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -``from module import *`` is *invalid* inside function definitions. While many -versions of Python do not check for the invalidity, it does not make it more -valid, no more than having a smart lawyer makes a man innocent. Do not use it -like that ever. Even in versions where it was accepted, it made the function -execution slower, because the compiler could not be certain which names are -local and which are global. In Python 2.1 this construct causes warnings, and -sometimes even errors. - - -At Module Level -^^^^^^^^^^^^^^^ - -While it is valid to use ``from module import *`` at module level it is usually -a bad idea. For one, this loses an important property Python otherwise has --- -you can know where each toplevel name is defined by a simple "search" function -in your favourite editor. You also open yourself to trouble in the future, if -some module grows additional functions or classes. - -One of the most awful question asked on the newsgroup is why this code:: - - f = open("www") - f.read() - -does not work. Of course, it works just fine (assuming you have a file called -"www".) But it does not work if somewhere in the module, the statement ``from os -import *`` is present. The :mod:`os` module has a function called :func:`open` -which returns an integer. While it is very useful, shadowing builtins is one of -its least useful properties. - -Remember, you can never know for sure what names a module exports, so either -take what you need --- ``from module import name1, name2``, or keep them in the -module and access on a per-need basis --- ``import module; print(module.name)``. - - -When It Is Just Fine -^^^^^^^^^^^^^^^^^^^^ - -There are situations in which ``from module import *`` is just fine: - -* The interactive prompt. For example, ``from math import *`` makes Python an - amazing scientific calculator. - -* When extending a module in C with a module in Python. - -* When the module advertises itself as ``from import *`` safe. - - -from module import name1, name2 -------------------------------- - -This is a "don't" which is much weaker than the previous "don't"s but is still -something you should not do if you don't have good reasons to do that. The -reason it is usually bad idea is because you suddenly have an object which lives -in two separate namespaces. When the binding in one namespace changes, the -binding in the other will not, so there will be a discrepancy between them. This -happens when, for example, one module is reloaded, or changes the definition of -a function at runtime. - -Bad example:: - - # foo.py - a = 1 - - # bar.py - from foo import a - if something(): - a = 2 # danger: foo.a != a - -Good example:: - - # foo.py - a = 1 - - # bar.py - import foo - if something(): - foo.a = 2 - - -except: -------- - -Python has the ``except:`` clause, which catches all exceptions. Since *every* -error in Python raises an exception, using ``except:`` can make many -programming errors look like runtime problems, which hinders the debugging -process. - -The following code shows a great example of why this is bad:: - - try: - foo = opne("file") # misspelled "open" - except: - sys.exit("could not open file!") - -The second line triggers a :exc:`NameError`, which is caught by the except -clause. The program will exit, and the error message the program prints will -make you think the problem is the readability of ``"file"`` when in fact -the real error has nothing to do with ``"file"``. - -A better way to write the above is :: - - try: - foo = opne("file") - except IOError: - sys.exit("could not open file") - -When this is run, Python will produce a traceback showing the :exc:`NameError`, -and it will be immediately apparent what needs to be fixed. - -.. index:: bare except, except; bare - -Because ``except:`` catches *all* exceptions, including :exc:`SystemExit`, -:exc:`KeyboardInterrupt`, and :exc:`GeneratorExit` (which is not an error and -should not normally be caught by user code), using a bare ``except:`` is almost -never a good idea. In situations where you need to catch all "normal" errors, -such as in a framework that runs callbacks, you can catch the base class for -all normal exceptions, :exc:`Exception`. - - -Exceptions -========== - -Exceptions are a useful feature of Python. You should learn to raise them -whenever something unexpected occurs, and catch them only where you can do -something about them. - -The following is a very popular anti-idiom :: - - def get_status(file): - if not os.path.exists(file): - print("file not found") - sys.exit(1) - return open(file).readline() - -Consider the case where the file gets deleted between the time the call to -:func:`os.path.exists` is made and the time :func:`open` is called. In that -case the last line will raise an :exc:`IOError`. The same thing would happen -if *file* exists but has no read permission. Since testing this on a normal -machine on existent and non-existent files makes it seem bugless, the test -results will seem fine, and the code will get shipped. Later an unhandled -:exc:`IOError` (or perhaps some other :exc:`EnvironmentError`) escapes to the -user, who gets to watch the ugly traceback. - -Here is a somewhat better way to do it. :: - - def get_status(file): - try: - return open(file).readline() - except EnvironmentError as err: - print("Unable to open file: {}".format(err)) - sys.exit(1) - -In this version, *either* the file gets opened and the line is read (so it -works even on flaky NFS or SMB connections), or an error message is printed -that provides all the available information on why the open failed, and the -application is aborted. - -However, even this version of :func:`get_status` makes too many assumptions --- -that it will only be used in a short running script, and not, say, in a long -running server. Sure, the caller could do something like :: - - try: - status = get_status(log) - except SystemExit: - status = None - -But there is a better way. You should try to use as few ``except`` clauses in -your code as you can --- the ones you do use will usually be inside calls which -should always succeed, or a catch-all in a main function. - -So, an even better version of :func:`get_status()` is probably :: - - def get_status(file): - return open(file).readline() - -The caller can deal with the exception if it wants (for example, if it tries -several files in a loop), or just let the exception filter upwards to *its* -caller. - -But the last version still has a serious problem --- due to implementation -details in CPython, the file would not be closed when an exception is raised -until the exception handler finishes; and, worse, in other implementations -(e.g., Jython) it might not be closed at all regardless of whether or not -an exception is raised. - -The best version of this function uses the ``open()`` call as a context -manager, which will ensure that the file gets closed as soon as the -function returns:: - - def get_status(file): - with open(file) as fp: - return fp.readline() - - -Using the Batteries -=================== - -Every so often, people seem to be writing stuff in the Python library again, -usually poorly. While the occasional module has a poor interface, it is usually -much better to use the rich standard library and data types that come with -Python than inventing your own. - -A useful module very few people know about is :mod:`os.path`. It always has the -correct path arithmetic for your operating system, and will usually be much -better than whatever you come up with yourself. - -Compare:: - - # ugh! - return dir+"/"+file - # better - return os.path.join(dir, file) - -More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and -:func:`splitext`. - -There are also many useful built-in functions people seem not to be aware of -for some reason: :func:`min` and :func:`max` can find the minimum/maximum of -any sequence with comparable semantics, for example, yet many people write -their own :func:`max`/:func:`min`. Another highly useful function is -:func:`functools.reduce` which can be used to repeatly apply a binary -operation to a sequence, reducing it to a single value. For example, compute -a factorial with a series of multiply operations:: - - >>> n = 4 - >>> import operator, functools - >>> functools.reduce(operator.mul, range(1, n+1)) - 24 - -When it comes to parsing numbers, note that :func:`float`, :func:`int` and -:func:`long` all accept string arguments and will reject ill-formed strings -by raising an :exc:`ValueError`. - - -Using Backslash to Continue Statements -====================================== - -Since Python treats a newline as a statement terminator, and since statements -are often more than is comfortable to put in one line, many people do:: - - if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \ - calculate_number(10, 20) != forbulate(500, 360): - pass - -You should realize that this is dangerous: a stray space after the ``\`` would -make this line wrong, and stray spaces are notoriously hard to see in editors. -In this case, at least it would be a syntax error, but if the code was:: - - value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \ - + calculate_number(10, 20)*forbulate(500, 360) - -then it would just be subtly wrong. - -It is usually much better to use the implicit continuation inside parenthesis: - -This version is bulletproof:: - - value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9] - + calculate_number(10, 20)*forbulate(500, 360)) - diff --git a/Doc/howto/index.rst b/Doc/howto/index.rst index 417ae00..11fe108 100644 --- a/Doc/howto/index.rst +++ b/Doc/howto/index.rst @@ -14,11 +14,13 @@ Currently, the HOWTOs are: :maxdepth: 1 advocacy.rst + pyporting.rst cporting.rst curses.rst descriptor.rst - doanddont.rst functional.rst + logging.rst + logging-cookbook.rst regex.rst sockets.rst sorting.rst diff --git a/Doc/howto/logging-cookbook.rst b/Doc/howto/logging-cookbook.rst new file mode 100644 index 0000000..198d892 --- /dev/null +++ b/Doc/howto/logging-cookbook.rst @@ -0,0 +1,1039 @@ +.. _logging-cookbook: + +================ +Logging Cookbook +================ + +:Author: Vinay Sajip <vinay_sajip at red-dove dot com> + +This page contains a number of recipes related to logging, which have been found +useful in the past. + +.. currentmodule:: logging + +Using logging in multiple modules +--------------------------------- + +Multiple calls to ``logging.getLogger('someLogger')`` return a reference to the +same logger object. This is true not only within the same module, but also +across modules as long as it is in the same Python interpreter process. It is +true for references to the same object; additionally, application code can +define and configure a parent logger in one module and create (but not +configure) a child logger in a separate module, and all logger calls to the +child will pass up to the parent. Here is a main module:: + + import logging + import auxiliary_module + + # create logger with 'spam_application' + logger = logging.getLogger('spam_application') + logger.setLevel(logging.DEBUG) + # create file handler which logs even debug messages + fh = logging.FileHandler('spam.log') + fh.setLevel(logging.DEBUG) + # create console handler with a higher log level + ch = logging.StreamHandler() + ch.setLevel(logging.ERROR) + # create formatter and add it to the handlers + formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') + fh.setFormatter(formatter) + ch.setFormatter(formatter) + # add the handlers to the logger + logger.addHandler(fh) + logger.addHandler(ch) + + logger.info('creating an instance of auxiliary_module.Auxiliary') + a = auxiliary_module.Auxiliary() + logger.info('created an instance of auxiliary_module.Auxiliary') + logger.info('calling auxiliary_module.Auxiliary.do_something') + a.do_something() + logger.info('finished auxiliary_module.Auxiliary.do_something') + logger.info('calling auxiliary_module.some_function()') + auxiliary_module.some_function() + logger.info('done with auxiliary_module.some_function()') + +Here is the auxiliary module:: + + import logging + + # create logger + module_logger = logging.getLogger('spam_application.auxiliary') + + class Auxiliary: + def __init__(self): + self.logger = logging.getLogger('spam_application.auxiliary.Auxiliary') + self.logger.info('creating an instance of Auxiliary') + def do_something(self): + self.logger.info('doing something') + a = 1 + 1 + self.logger.info('done doing something') + + def some_function(): + module_logger.info('received a call to "some_function"') + +The output looks like this:: + + 2005-03-23 23:47:11,663 - spam_application - INFO - + creating an instance of auxiliary_module.Auxiliary + 2005-03-23 23:47:11,665 - spam_application.auxiliary.Auxiliary - INFO - + creating an instance of Auxiliary + 2005-03-23 23:47:11,665 - spam_application - INFO - + created an instance of auxiliary_module.Auxiliary + 2005-03-23 23:47:11,668 - spam_application - INFO - + calling auxiliary_module.Auxiliary.do_something + 2005-03-23 23:47:11,668 - spam_application.auxiliary.Auxiliary - INFO - + doing something + 2005-03-23 23:47:11,669 - spam_application.auxiliary.Auxiliary - INFO - + done doing something + 2005-03-23 23:47:11,670 - spam_application - INFO - + finished auxiliary_module.Auxiliary.do_something + 2005-03-23 23:47:11,671 - spam_application - INFO - + calling auxiliary_module.some_function() + 2005-03-23 23:47:11,672 - spam_application.auxiliary - INFO - + received a call to 'some_function' + 2005-03-23 23:47:11,673 - spam_application - INFO - + done with auxiliary_module.some_function() + +Multiple handlers and formatters +-------------------------------- + +Loggers are plain Python objects. The :func:`addHandler` method has no minimum +or maximum quota for the number of handlers you may add. Sometimes it will be +beneficial for an application to log all messages of all severities to a text +file while simultaneously logging errors or above to the console. To set this +up, simply configure the appropriate handlers. The logging calls in the +application code will remain unchanged. Here is a slight modification to the +previous simple module-based configuration example:: + + import logging + + logger = logging.getLogger('simple_example') + logger.setLevel(logging.DEBUG) + # create file handler which logs even debug messages + fh = logging.FileHandler('spam.log') + fh.setLevel(logging.DEBUG) + # create console handler with a higher log level + ch = logging.StreamHandler() + ch.setLevel(logging.ERROR) + # create formatter and add it to the handlers + formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') + ch.setFormatter(formatter) + fh.setFormatter(formatter) + # add the handlers to logger + logger.addHandler(ch) + logger.addHandler(fh) + + # 'application' code + logger.debug('debug message') + logger.info('info message') + logger.warn('warn message') + logger.error('error message') + logger.critical('critical message') + +Notice that the 'application' code does not care about multiple handlers. All +that changed was the addition and configuration of a new handler named *fh*. + +The ability to create new handlers with higher- or lower-severity filters can be +very helpful when writing and testing an application. Instead of using many +``print`` statements for debugging, use ``logger.debug``: Unlike the print +statements, which you will have to delete or comment out later, the logger.debug +statements can remain intact in the source code and remain dormant until you +need them again. At that time, the only change that needs to happen is to +modify the severity level of the logger and/or handler to debug. + +.. _multiple-destinations: + +Logging to multiple destinations +-------------------------------- + +Let's say you want to log to console and file with different message formats and +in differing circumstances. Say you want to log messages with levels of DEBUG +and higher to file, and those messages at level INFO and higher to the console. +Let's also assume that the file should contain timestamps, but the console +messages should not. Here's how you can achieve this:: + + import logging + + # set up logging to file - see previous section for more details + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)s %(name)-12s %(levelname)-8s %(message)s', + datefmt='%m-%d %H:%M', + filename='/temp/myapp.log', + filemode='w') + # define a Handler which writes INFO messages or higher to the sys.stderr + console = logging.StreamHandler() + console.setLevel(logging.INFO) + # set a format which is simpler for console use + formatter = logging.Formatter('%(name)-12s: %(levelname)-8s %(message)s') + # tell the handler to use this format + console.setFormatter(formatter) + # add the handler to the root logger + logging.getLogger('').addHandler(console) + + # Now, we can log to the root logger, or any other logger. First the root... + logging.info('Jackdaws love my big sphinx of quartz.') + + # Now, define a couple of other loggers which might represent areas in your + # application: + + logger1 = logging.getLogger('myapp.area1') + logger2 = logging.getLogger('myapp.area2') + + logger1.debug('Quick zephyrs blow, vexing daft Jim.') + logger1.info('How quickly daft jumping zebras vex.') + logger2.warning('Jail zesty vixen who grabbed pay from quack.') + logger2.error('The five boxing wizards jump quickly.') + +When you run this, on the console you will see :: + + root : INFO Jackdaws love my big sphinx of quartz. + myapp.area1 : INFO How quickly daft jumping zebras vex. + myapp.area2 : WARNING Jail zesty vixen who grabbed pay from quack. + myapp.area2 : ERROR The five boxing wizards jump quickly. + +and in the file you will see something like :: + + 10-22 22:19 root INFO Jackdaws love my big sphinx of quartz. + 10-22 22:19 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim. + 10-22 22:19 myapp.area1 INFO How quickly daft jumping zebras vex. + 10-22 22:19 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack. + 10-22 22:19 myapp.area2 ERROR The five boxing wizards jump quickly. + +As you can see, the DEBUG message only shows up in the file. The other messages +are sent to both destinations. + +This example uses console and file handlers, but you can use any number and +combination of handlers you choose. + + +Configuration server example +---------------------------- + +Here is an example of a module using the logging configuration server:: + + import logging + import logging.config + import time + import os + + # read initial config file + logging.config.fileConfig('logging.conf') + + # create and start listener on port 9999 + t = logging.config.listen(9999) + t.start() + + logger = logging.getLogger('simpleExample') + + try: + # loop through logging calls to see the difference + # new configurations make, until Ctrl+C is pressed + while True: + logger.debug('debug message') + logger.info('info message') + logger.warn('warn message') + logger.error('error message') + logger.critical('critical message') + time.sleep(5) + except KeyboardInterrupt: + # cleanup + logging.config.stopListening() + t.join() + +And here is a script that takes a filename and sends that file to the server, +properly preceded with the binary-encoded length, as the new logging +configuration:: + + #!/usr/bin/env python + import socket, sys, struct + + with open(sys.argv[1], 'rb') as f: + data_to_send = f.read() + + HOST = 'localhost' + PORT = 9999 + s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) + print('connecting...') + s.connect((HOST, PORT)) + print('sending config...') + s.send(struct.pack('>L', len(data_to_send))) + s.send(data_to_send) + s.close() + print('complete') + + +Dealing with handlers that block +-------------------------------- + +.. currentmodule:: logging.handlers + +Sometimes you have to get your logging handlers to do their work without +blocking the thread you’re logging from. This is common in Web applications, +though of course it also occurs in other scenarios. + +A common culprit which demonstrates sluggish behaviour is the +:class:`SMTPHandler`: sending emails can take a long time, for a +number of reasons outside the developer’s control (for example, a poorly +performing mail or network infrastructure). But almost any network-based +handler can block: Even a :class:`SocketHandler` operation may do a +DNS query under the hood which is too slow (and this query can be deep in the +socket library code, below the Python layer, and outside your control). + +One solution is to use a two-part approach. For the first part, attach only a +:class:`QueueHandler` to those loggers which are accessed from +performance-critical threads. They simply write to their queue, which can be +sized to a large enough capacity or initialized with no upper bound to their +size. The write to the queue will typically be accepted quickly, though you +will probably need to catch the :exc:`queue.Full` exception as a precaution +in your code. If you are a library developer who has performance-critical +threads in their code, be sure to document this (together with a suggestion to +attach only ``QueueHandlers`` to your loggers) for the benefit of other +developers who will use your code. + +The second part of the solution is :class:`QueueListener`, which has been +designed as the counterpart to :class:`QueueHandler`. A +:class:`QueueListener` is very simple: it’s passed a queue and some handlers, +and it fires up an internal thread which listens to its queue for LogRecords +sent from ``QueueHandlers`` (or any other source of ``LogRecords``, for that +matter). The ``LogRecords`` are removed from the queue and passed to the +handlers for processing. + +The advantage of having a separate :class:`QueueListener` class is that you +can use the same instance to service multiple ``QueueHandlers``. This is more +resource-friendly than, say, having threaded versions of the existing handler +classes, which would eat up one thread per handler for no particular benefit. + +An example of using these two classes follows (imports omitted):: + + que = queue.Queue(-1) # no limit on size + queue_handler = QueueHandler(que) + handler = logging.StreamHandler() + listener = QueueListener(que, handler) + root = logging.getLogger() + root.addHandler(queue_handler) + formatter = logging.Formatter('%(threadName)s: %(message)s') + handler.setFormatter(formatter) + listener.start() + # The log output will display the thread which generated + # the event (the main thread) rather than the internal + # thread which monitors the internal queue. This is what + # you want to happen. + root.warning('Look out!') + listener.stop() + +which, when run, will produce:: + + MainThread: Look out! + + +.. _network-logging: + +Sending and receiving logging events across a network +----------------------------------------------------- + +Let's say you want to send logging events across a network, and handle them at +the receiving end. A simple way of doing this is attaching a +:class:`SocketHandler` instance to the root logger at the sending end:: + + import logging, logging.handlers + + rootLogger = logging.getLogger('') + rootLogger.setLevel(logging.DEBUG) + socketHandler = logging.handlers.SocketHandler('localhost', + logging.handlers.DEFAULT_TCP_LOGGING_PORT) + # don't bother with a formatter, since a socket handler sends the event as + # an unformatted pickle + rootLogger.addHandler(socketHandler) + + # Now, we can log to the root logger, or any other logger. First the root... + logging.info('Jackdaws love my big sphinx of quartz.') + + # Now, define a couple of other loggers which might represent areas in your + # application: + + logger1 = logging.getLogger('myapp.area1') + logger2 = logging.getLogger('myapp.area2') + + logger1.debug('Quick zephyrs blow, vexing daft Jim.') + logger1.info('How quickly daft jumping zebras vex.') + logger2.warning('Jail zesty vixen who grabbed pay from quack.') + logger2.error('The five boxing wizards jump quickly.') + +At the receiving end, you can set up a receiver using the :mod:`socketserver` +module. Here is a basic working example:: + + import pickle + import logging + import logging.handlers + import socketserver + import struct + + + class LogRecordStreamHandler(socketserver.StreamRequestHandler): + """Handler for a streaming logging request. + + This basically logs the record using whatever logging policy is + configured locally. + """ + + def handle(self): + """ + Handle multiple requests - each expected to be a 4-byte length, + followed by the LogRecord in pickle format. Logs the record + according to whatever policy is configured locally. + """ + while True: + chunk = self.connection.recv(4) + if len(chunk) < 4: + break + slen = struct.unpack('>L', chunk)[0] + chunk = self.connection.recv(slen) + while len(chunk) < slen: + chunk = chunk + self.connection.recv(slen - len(chunk)) + obj = self.unPickle(chunk) + record = logging.makeLogRecord(obj) + self.handleLogRecord(record) + + def unPickle(self, data): + return pickle.loads(data) + + def handleLogRecord(self, record): + # if a name is specified, we use the named logger rather than the one + # implied by the record. + if self.server.logname is not None: + name = self.server.logname + else: + name = record.name + logger = logging.getLogger(name) + # N.B. EVERY record gets logged. This is because Logger.handle + # is normally called AFTER logger-level filtering. If you want + # to do filtering, do it at the client end to save wasting + # cycles and network bandwidth! + logger.handle(record) + + class LogRecordSocketReceiver(socketserver.ThreadingTCPServer): + """ + Simple TCP socket-based logging receiver suitable for testing. + """ + + allow_reuse_address = 1 + + def __init__(self, host='localhost', + port=logging.handlers.DEFAULT_TCP_LOGGING_PORT, + handler=LogRecordStreamHandler): + socketserver.ThreadingTCPServer.__init__(self, (host, port), handler) + self.abort = 0 + self.timeout = 1 + self.logname = None + + def serve_until_stopped(self): + import select + abort = 0 + while not abort: + rd, wr, ex = select.select([self.socket.fileno()], + [], [], + self.timeout) + if rd: + self.handle_request() + abort = self.abort + + def main(): + logging.basicConfig( + format='%(relativeCreated)5d %(name)-15s %(levelname)-8s %(message)s') + tcpserver = LogRecordSocketReceiver() + print('About to start TCP server...') + tcpserver.serve_until_stopped() + + if __name__ == '__main__': + main() + +First run the server, and then the client. On the client side, nothing is +printed on the console; on the server side, you should see something like:: + + About to start TCP server... + 59 root INFO Jackdaws love my big sphinx of quartz. + 59 myapp.area1 DEBUG Quick zephyrs blow, vexing daft Jim. + 69 myapp.area1 INFO How quickly daft jumping zebras vex. + 69 myapp.area2 WARNING Jail zesty vixen who grabbed pay from quack. + 69 myapp.area2 ERROR The five boxing wizards jump quickly. + +Note that there are some security issues with pickle in some scenarios. If +these affect you, you can use an alternative serialization scheme by overriding +the :meth:`makePickle` method and implementing your alternative there, as +well as adapting the above script to use your alternative serialization. + + +.. _context-info: + +Adding contextual information to your logging output +---------------------------------------------------- + +Sometimes you want logging output to contain contextual information in +addition to the parameters passed to the logging call. For example, in a +networked application, it may be desirable to log client-specific information +in the log (e.g. remote client's username, or IP address). Although you could +use the *extra* parameter to achieve this, it's not always convenient to pass +the information in this way. While it might be tempting to create +:class:`Logger` instances on a per-connection basis, this is not a good idea +because these instances are not garbage collected. While this is not a problem +in practice, when the number of :class:`Logger` instances is dependent on the +level of granularity you want to use in logging an application, it could +be hard to manage if the number of :class:`Logger` instances becomes +effectively unbounded. + + +Using LoggerAdapters to impart contextual information +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +An easy way in which you can pass contextual information to be output along +with logging event information is to use the :class:`LoggerAdapter` class. +This class is designed to look like a :class:`Logger`, so that you can call +:meth:`debug`, :meth:`info`, :meth:`warning`, :meth:`error`, +:meth:`exception`, :meth:`critical` and :meth:`log`. These methods have the +same signatures as their counterparts in :class:`Logger`, so you can use the +two types of instances interchangeably. + +When you create an instance of :class:`LoggerAdapter`, you pass it a +:class:`Logger` instance and a dict-like object which contains your contextual +information. When you call one of the logging methods on an instance of +:class:`LoggerAdapter`, it delegates the call to the underlying instance of +:class:`Logger` passed to its constructor, and arranges to pass the contextual +information in the delegated call. Here's a snippet from the code of +:class:`LoggerAdapter`:: + + def debug(self, msg, *args, **kwargs): + """ + Delegate a debug call to the underlying logger, after adding + contextual information from this adapter instance. + """ + msg, kwargs = self.process(msg, kwargs) + self.logger.debug(msg, *args, **kwargs) + +The :meth:`process` method of :class:`LoggerAdapter` is where the contextual +information is added to the logging output. It's passed the message and +keyword arguments of the logging call, and it passes back (potentially) +modified versions of these to use in the call to the underlying logger. The +default implementation of this method leaves the message alone, but inserts +an 'extra' key in the keyword argument whose value is the dict-like object +passed to the constructor. Of course, if you had passed an 'extra' keyword +argument in the call to the adapter, it will be silently overwritten. + +The advantage of using 'extra' is that the values in the dict-like object are +merged into the :class:`LogRecord` instance's __dict__, allowing you to use +customized strings with your :class:`Formatter` instances which know about +the keys of the dict-like object. If you need a different method, e.g. if you +want to prepend or append the contextual information to the message string, +you just need to subclass :class:`LoggerAdapter` and override :meth:`process` +to do what you need. Here's an example script which uses this class, which +also illustrates what dict-like behaviour is needed from an arbitrary +'dict-like' object for use in the constructor:: + + import logging + + class ConnInfo: + """ + An example class which shows how an arbitrary class can be used as + the 'extra' context information repository passed to a LoggerAdapter. + """ + + def __getitem__(self, name): + """ + To allow this instance to look like a dict. + """ + from random import choice + if name == 'ip': + result = choice(['127.0.0.1', '192.168.0.1']) + elif name == 'user': + result = choice(['jim', 'fred', 'sheila']) + else: + result = self.__dict__.get(name, '?') + return result + + def __iter__(self): + """ + To allow iteration over keys, which will be merged into + the LogRecord dict before formatting and output. + """ + keys = ['ip', 'user'] + keys.extend(self.__dict__.keys()) + return keys.__iter__() + + if __name__ == '__main__': + from random import choice + levels = (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL) + a1 = logging.LoggerAdapter(logging.getLogger('a.b.c'), + { 'ip' : '123.231.231.123', 'user' : 'sheila' }) + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s') + a1.debug('A debug message') + a1.info('An info message with %s', 'some parameters') + a2 = logging.LoggerAdapter(logging.getLogger('d.e.f'), ConnInfo()) + for x in range(10): + lvl = choice(levels) + lvlname = logging.getLevelName(lvl) + a2.log(lvl, 'A message at %s level with %d %s', lvlname, 2, 'parameters') + +When this script is run, the output should look something like this:: + + 2008-01-18 14:49:54,023 a.b.c DEBUG IP: 123.231.231.123 User: sheila A debug message + 2008-01-18 14:49:54,023 a.b.c INFO IP: 123.231.231.123 User: sheila An info message with some parameters + 2008-01-18 14:49:54,023 d.e.f CRITICAL IP: 192.168.0.1 User: jim A message at CRITICAL level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: jim A message at INFO level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: fred A message at ERROR level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f ERROR IP: 127.0.0.1 User: sheila A message at ERROR level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: jim A message at WARNING level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f INFO IP: 192.168.0.1 User: fred A message at INFO level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 192.168.0.1 User: sheila A message at WARNING level with 2 parameters + 2008-01-18 14:49:54,033 d.e.f WARNING IP: 127.0.0.1 User: jim A message at WARNING level with 2 parameters + + +.. _filters-contextual: + +Using Filters to impart contextual information +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +You can also add contextual information to log output using a user-defined +:class:`Filter`. ``Filter`` instances are allowed to modify the ``LogRecords`` +passed to them, including adding additional attributes which can then be output +using a suitable format string, or if needed a custom :class:`Formatter`. + +For example in a web application, the request being processed (or at least, +the interesting parts of it) can be stored in a threadlocal +(:class:`threading.local`) variable, and then accessed from a ``Filter`` to +add, say, information from the request - say, the remote IP address and remote +user's username - to the ``LogRecord``, using the attribute names 'ip' and +'user' as in the ``LoggerAdapter`` example above. In that case, the same format +string can be used to get similar output to that shown above. Here's an example +script:: + + import logging + from random import choice + + class ContextFilter(logging.Filter): + """ + This is a filter which injects contextual information into the log. + + Rather than use actual contextual information, we just use random + data in this demo. + """ + + USERS = ['jim', 'fred', 'sheila'] + IPS = ['123.231.231.123', '127.0.0.1', '192.168.0.1'] + + def filter(self, record): + + record.ip = choice(ContextFilter.IPS) + record.user = choice(ContextFilter.USERS) + return True + + if __name__ == '__main__': + levels = (logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, logging.CRITICAL) + logging.basicConfig(level=logging.DEBUG, + format='%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s') + a1 = logging.getLogger('a.b.c') + a2 = logging.getLogger('d.e.f') + + f = ContextFilter() + a1.addFilter(f) + a2.addFilter(f) + a1.debug('A debug message') + a1.info('An info message with %s', 'some parameters') + for x in range(10): + lvl = choice(levels) + lvlname = logging.getLevelName(lvl) + a2.log(lvl, 'A message at %s level with %d %s', lvlname, 2, 'parameters') + +which, when run, produces something like:: + + 2010-09-06 22:38:15,292 a.b.c DEBUG IP: 123.231.231.123 User: fred A debug message + 2010-09-06 22:38:15,300 a.b.c INFO IP: 192.168.0.1 User: sheila An info message with some parameters + 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 127.0.0.1 User: sheila A message at CRITICAL level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f ERROR IP: 127.0.0.1 User: jim A message at ERROR level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f DEBUG IP: 127.0.0.1 User: sheila A message at DEBUG level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f ERROR IP: 123.231.231.123 User: fred A message at ERROR level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 192.168.0.1 User: jim A message at CRITICAL level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f CRITICAL IP: 127.0.0.1 User: sheila A message at CRITICAL level with 2 parameters + 2010-09-06 22:38:15,300 d.e.f DEBUG IP: 192.168.0.1 User: jim A message at DEBUG level with 2 parameters + 2010-09-06 22:38:15,301 d.e.f ERROR IP: 127.0.0.1 User: sheila A message at ERROR level with 2 parameters + 2010-09-06 22:38:15,301 d.e.f DEBUG IP: 123.231.231.123 User: fred A message at DEBUG level with 2 parameters + 2010-09-06 22:38:15,301 d.e.f INFO IP: 123.231.231.123 User: fred A message at INFO level with 2 parameters + + +.. _multiple-processes: + +Logging to a single file from multiple processes +------------------------------------------------ + +Although logging is thread-safe, and logging to a single file from multiple +threads in a single process *is* supported, logging to a single file from +*multiple processes* is *not* supported, because there is no standard way to +serialize access to a single file across multiple processes in Python. If you +need to log to a single file from multiple processes, one way of doing this is +to have all the processes log to a :class:`SocketHandler`, and have a separate +process which implements a socket server which reads from the socket and logs +to file. (If you prefer, you can dedicate one thread in one of the existing +processes to perform this function.) The following section documents this +approach in more detail and includes a working socket receiver which can be +used as a starting point for you to adapt in your own applications. + +If you are using a recent version of Python which includes the +:mod:`multiprocessing` module, you could write your own handler which uses the +:class:`Lock` class from this module to serialize access to the file from +your processes. The existing :class:`FileHandler` and subclasses do not make +use of :mod:`multiprocessing` at present, though they may do so in the future. +Note that at present, the :mod:`multiprocessing` module does not provide +working lock functionality on all platforms (see +http://bugs.python.org/issue3770). + +.. currentmodule:: logging.handlers + +Alternatively, you can use a ``Queue`` and a :class:`QueueHandler` to send +all logging events to one of the processes in your multi-process application. +The following example script demonstrates how you can do this; in the example +a separate listener process listens for events sent by other processes and logs +them according to its own logging configuration. Although the example only +demonstrates one way of doing it (for example, you may want to use a listener +thread rather than a separate listener process -- the implementation would be +analogous) it does allow for completely different logging configurations for +the listener and the other processes in your application, and can be used as +the basis for code meeting your own specific requirements:: + + # You'll need these imports in your own code + import logging + import logging.handlers + import multiprocessing + + # Next two import lines for this demo only + from random import choice, random + import time + + # + # Because you'll want to define the logging configurations for listener and workers, the + # listener and worker process functions take a configurer parameter which is a callable + # for configuring logging for that process. These functions are also passed the queue, + # which they use for communication. + # + # In practice, you can configure the listener however you want, but note that in this + # simple example, the listener does not apply level or filter logic to received records. + # In practice, you would probably want to do this logic in the worker processes, to avoid + # sending events which would be filtered out between processes. + # + # The size of the rotated files is made small so you can see the results easily. + def listener_configurer(): + root = logging.getLogger() + h = logging.handlers.RotatingFileHandler('/tmp/mptest.log', 'a', 300, 10) + f = logging.Formatter('%(asctime)s %(processName)-10s %(name)s %(levelname)-8s %(message)s') + h.setFormatter(f) + root.addHandler(h) + + # This is the listener process top-level loop: wait for logging events + # (LogRecords)on the queue and handle them, quit when you get a None for a + # LogRecord. + def listener_process(queue, configurer): + configurer() + while True: + try: + record = queue.get() + if record is None: # We send this as a sentinel to tell the listener to quit. + break + logger = logging.getLogger(record.name) + logger.handle(record) # No level or filter logic applied - just do it! + except (KeyboardInterrupt, SystemExit): + raise + except: + import sys, traceback + print >> sys.stderr, 'Whoops! Problem:' + traceback.print_exc(file=sys.stderr) + + # Arrays used for random selections in this demo + + LEVELS = [logging.DEBUG, logging.INFO, logging.WARNING, + logging.ERROR, logging.CRITICAL] + + LOGGERS = ['a.b.c', 'd.e.f'] + + MESSAGES = [ + 'Random message #1', + 'Random message #2', + 'Random message #3', + ] + + # The worker configuration is done at the start of the worker process run. + # Note that on Windows you can't rely on fork semantics, so each process + # will run the logging configuration code when it starts. + def worker_configurer(queue): + h = logging.handlers.QueueHandler(queue) # Just the one handler needed + root = logging.getLogger() + root.addHandler(h) + root.setLevel(logging.DEBUG) # send all messages, for demo; no other level or filter logic applied. + + # This is the worker process top-level loop, which just logs ten events with + # random intervening delays before terminating. + # The print messages are just so you know it's doing something! + def worker_process(queue, configurer): + configurer(queue) + name = multiprocessing.current_process().name + print('Worker started: %s' % name) + for i in range(10): + time.sleep(random()) + logger = logging.getLogger(choice(LOGGERS)) + level = choice(LEVELS) + message = choice(MESSAGES) + logger.log(level, message) + print('Worker finished: %s' % name) + + # Here's where the demo gets orchestrated. Create the queue, create and start + # the listener, create ten workers and start them, wait for them to finish, + # then send a None to the queue to tell the listener to finish. + def main(): + queue = multiprocessing.Queue(-1) + listener = multiprocessing.Process(target=listener_process, + args=(queue, listener_configurer)) + listener.start() + workers = [] + for i in range(10): + worker = multiprocessing.Process(target=worker_process, + args=(queue, worker_configurer)) + workers.append(worker) + worker.start() + for w in workers: + w.join() + queue.put_nowait(None) + listener.join() + + if __name__ == '__main__': + main() + +A variant of the above script keeps the logging in the main process, in a +separate thread:: + + import logging + import logging.config + import logging.handlers + from multiprocessing import Process, Queue + import random + import threading + import time + + def logger_thread(q): + while True: + record = q.get() + if record is None: + break + logger = logging.getLogger(record.name) + logger.handle(record) + + + def worker_process(q): + qh = logging.handlers.QueueHandler(q) + root = logging.getLogger() + root.setLevel(logging.DEBUG) + root.addHandler(qh) + levels = [logging.DEBUG, logging.INFO, logging.WARNING, logging.ERROR, + logging.CRITICAL] + loggers = ['foo', 'foo.bar', 'foo.bar.baz', + 'spam', 'spam.ham', 'spam.ham.eggs'] + for i in range(100): + lvl = random.choice(levels) + logger = logging.getLogger(random.choice(loggers)) + logger.log(lvl, 'Message no. %d', i) + + if __name__ == '__main__': + q = Queue() + d = { + 'version': 1, + 'formatters': { + 'detailed': { + 'class': 'logging.Formatter', + 'format': '%(asctime)s %(name)-15s %(levelname)-8s %(processName)-10s %(message)s' + } + }, + 'handlers': { + 'console': { + 'class': 'logging.StreamHandler', + 'level': 'INFO', + }, + 'file': { + 'class': 'logging.FileHandler', + 'filename': 'mplog.log', + 'mode': 'w', + 'formatter': 'detailed', + }, + 'foofile': { + 'class': 'logging.FileHandler', + 'filename': 'mplog-foo.log', + 'mode': 'w', + 'formatter': 'detailed', + }, + 'errors': { + 'class': 'logging.FileHandler', + 'filename': 'mplog-errors.log', + 'mode': 'w', + 'level': 'ERROR', + 'formatter': 'detailed', + }, + }, + 'loggers': { + 'foo': { + 'handlers' : ['foofile'] + } + }, + 'root': { + 'level': 'DEBUG', + 'handlers': ['console', 'file', 'errors'] + }, + } + workers = [] + for i in range(5): + wp = Process(target=worker_process, name='worker %d' % (i + 1), args=(q,)) + workers.append(wp) + wp.start() + logging.config.dictConfig(d) + lp = threading.Thread(target=logger_thread, args=(q,)) + lp.start() + # At this point, the main process could do some useful work of its own + # Once it's done that, it can wait for the workers to terminate... + for wp in workers: + wp.join() + # And now tell the logging thread to finish up, too + q.put(None) + lp.join() + +This variant shows how you can e.g. apply configuration for particular loggers +- e.g. the ``foo`` logger has a special handler which stores all events in the +``foo`` subsystem in a file ``mplog-foo.log``. This will be used by the logging +machinery in the main process (even though the logging events are generated in +the worker processes) to direct the messages to the appropriate destinations. + +Using file rotation +------------------- + +.. sectionauthor:: Doug Hellmann, Vinay Sajip (changes) +.. (see <http://blog.doughellmann.com/2007/05/pymotw-logging.html>) + +Sometimes you want to let a log file grow to a certain size, then open a new +file and log to that. You may want to keep a certain number of these files, and +when that many files have been created, rotate the files so that the number of +files and the size of the files both remain bounded. For this usage pattern, the +logging package provides a :class:`RotatingFileHandler`:: + + import glob + import logging + import logging.handlers + + LOG_FILENAME = 'logging_rotatingfile_example.out' + + # Set up a specific logger with our desired output level + my_logger = logging.getLogger('MyLogger') + my_logger.setLevel(logging.DEBUG) + + # Add the log message handler to the logger + handler = logging.handlers.RotatingFileHandler( + LOG_FILENAME, maxBytes=20, backupCount=5) + + my_logger.addHandler(handler) + + # Log some messages + for i in range(20): + my_logger.debug('i = %d' % i) + + # See what files are created + logfiles = glob.glob('%s*' % LOG_FILENAME) + + for filename in logfiles: + print(filename) + +The result should be 6 separate files, each with part of the log history for the +application:: + + logging_rotatingfile_example.out + logging_rotatingfile_example.out.1 + logging_rotatingfile_example.out.2 + logging_rotatingfile_example.out.3 + logging_rotatingfile_example.out.4 + logging_rotatingfile_example.out.5 + +The most current file is always :file:`logging_rotatingfile_example.out`, +and each time it reaches the size limit it is renamed with the suffix +``.1``. Each of the existing backup files is renamed to increment the suffix +(``.1`` becomes ``.2``, etc.) and the ``.6`` file is erased. + +Obviously this example sets the log length much much too small as an extreme +example. You would want to set *maxBytes* to an appropriate value. + +.. _zeromq-handlers: + +Subclassing QueueHandler - a ZeroMQ example +------------------------------------------- + +You can use a :class:`QueueHandler` subclass to send messages to other kinds +of queues, for example a ZeroMQ 'publish' socket. In the example below,the +socket is created separately and passed to the handler (as its 'queue'):: + + import zmq # using pyzmq, the Python binding for ZeroMQ + import json # for serializing records portably + + ctx = zmq.Context() + sock = zmq.Socket(ctx, zmq.PUB) # or zmq.PUSH, or other suitable value + sock.bind('tcp://*:5556') # or wherever + + class ZeroMQSocketHandler(QueueHandler): + def enqueue(self, record): + data = json.dumps(record.__dict__) + self.queue.send(data) + + handler = ZeroMQSocketHandler(sock) + + +Of course there are other ways of organizing this, for example passing in the +data needed by the handler to create the socket:: + + class ZeroMQSocketHandler(QueueHandler): + def __init__(self, uri, socktype=zmq.PUB, ctx=None): + self.ctx = ctx or zmq.Context() + socket = zmq.Socket(self.ctx, socktype) + socket.bind(uri) + QueueHandler.__init__(self, socket) + + def enqueue(self, record): + data = json.dumps(record.__dict__) + self.queue.send(data) + + def close(self): + self.queue.close() + + +Subclassing QueueListener - a ZeroMQ example +-------------------------------------------- + +You can also subclass :class:`QueueListener` to get messages from other kinds +of queues, for example a ZeroMQ 'subscribe' socket. Here's an example:: + + class ZeroMQSocketListener(QueueListener): + def __init__(self, uri, *handlers, **kwargs): + self.ctx = kwargs.get('ctx') or zmq.Context() + socket = zmq.Socket(self.ctx, zmq.SUB) + socket.setsockopt(zmq.SUBSCRIBE, '') # subscribe to everything + socket.connect(uri) + + def dequeue(self): + msg = self.queue.recv() + return logging.makeLogRecord(json.loads(msg)) + + +.. seealso:: + + Module :mod:`logging` + API reference for the logging module. + + Module :mod:`logging.config` + Configuration API for the logging module. + + Module :mod:`logging.handlers` + Useful handlers included with the logging module. + + :ref:`A basic logging tutorial <logging-basic-tutorial>` + + :ref:`A more advanced logging tutorial <logging-advanced-tutorial>` diff --git a/Doc/howto/logging.rst b/Doc/howto/logging.rst new file mode 100644 index 0000000..a7d6024 --- /dev/null +++ b/Doc/howto/logging.rst @@ -0,0 +1,1026 @@ +============= +Logging HOWTO +============= + +:Author: Vinay Sajip <vinay_sajip at red-dove dot com> + +.. _logging-basic-tutorial: + +.. currentmodule:: logging + +Basic Logging Tutorial +---------------------- + +Logging is a means of tracking events that happen when some software runs. The +software's developer adds logging calls to their code to indicate that certain +events have occurred. An event is described by a descriptive message which can +optionally contain variable data (i.e. data that is potentially different for +each occurrence of the event). Events also have an importance which the +developer ascribes to the event; the importance can also be called the *level* +or *severity*. + +When to use logging +^^^^^^^^^^^^^^^^^^^ + +Logging provides a set of convenience functions for simple logging usage. These +are :func:`debug`, :func:`info`, :func:`warning`, :func:`error` and +:func:`critical`. To determine when to use logging, see the table below, which +states, for each of a set of common tasks, the best tool to use for it. + ++-------------------------------------+--------------------------------------+ +| Task you want to perform | The best tool for the task | ++=====================================+======================================+ +| Display console output for ordinary | :func:`print` | +| usage of a command line script or | | +| program | | ++-------------------------------------+--------------------------------------+ +| Report events that occur during | :func:`logging.info` (or | +| normal operation of a program (e.g. | :func:`logging.debug` for very | +| for status monitoring or fault | detailed output for diagnostic | +| investigation) | purposes) | ++-------------------------------------+--------------------------------------+ +| Issue a warning regarding a | :func:`warnings.warn` in library | +| particular runtime event | code if the issue is avoidable and | +| | the client application should be | +| | modified to eliminate the warning | +| | | +| | :func:`logging.warning` if there is | +| | nothing the client application can do| +| | about the situation, but the event | +| | should still be noted | ++-------------------------------------+--------------------------------------+ +| Report an error regarding a | Raise an exception | +| particular runtime event | | ++-------------------------------------+--------------------------------------+ +| Report suppression of an error | :func:`logging.error`, | +| without raising an exception (e.g. | :func:`logging.exception` or | +| error handler in a long-running | :func:`logging.critical` as | +| server process) | appropriate for the specific error | +| | and application domain | ++-------------------------------------+--------------------------------------+ + +The logging functions are named after the level or severity of the events +they are used to track. The standard levels and their applicability are +described below (in increasing order of severity): + ++--------------+---------------------------------------------+ +| Level | When it's used | ++==============+=============================================+ +| ``DEBUG`` | Detailed information, typically of interest | +| | only when diagnosing problems. | ++--------------+---------------------------------------------+ +| ``INFO`` | Confirmation that things are working as | +| | expected. | ++--------------+---------------------------------------------+ +| ``WARNING`` | An indication that something unexpected | +| | happened, or indicative of some problem in | +| | the near future (e.g. 'disk space low'). | +| | The software is still working as expected. | ++--------------+---------------------------------------------+ +| ``ERROR`` | Due to a more serious problem, the software | +| | has not been able to perform some function. | ++--------------+---------------------------------------------+ +| ``CRITICAL`` | A serious error, indicating that the program| +| | itself may be unable to continue running. | ++--------------+---------------------------------------------+ + +The default level is ``WARNING``, which means that only events of this level +and above will be tracked, unless the logging package is configured to do +otherwise. + +Events that are tracked can be handled in different ways. The simplest way of +handling tracked events is to print them to the console. Another common way +is to write them to a disk file. + + +.. _howto-minimal-example: + +A simple example +^^^^^^^^^^^^^^^^ + +A very simple example is:: + + import logging + logging.warning('Watch out!') # will print a message to the console + logging.info('I told you so') # will not print anything + +If you type these lines into a script and run it, you'll see:: + + WARNING:root:Watch out! + +printed out on the console. The ``INFO`` message doesn't appear because the +default level is ``WARNING``. The printed message includes the indication of +the level and the description of the event provided in the logging call, i.e. +'Watch out!'. Don't worry about the 'root' part for now: it will be explained +later. The actual output can be formatted quite flexibly if you need that; +formatting options will also be explained later. + + +Logging to a file +^^^^^^^^^^^^^^^^^ + +A very common situation is that of recording logging events in a file, so let's +look at that next:: + + import logging + logging.basicConfig(filename='example.log',level=logging.DEBUG) + logging.debug('This message should go to the log file') + logging.info('So should this') + logging.warning('And this, too') + +And now if we open the file and look at what we have, we should find the log +messages:: + + DEBUG:root:This message should go to the log file + INFO:root:So should this + WARNING:root:And this, too + +This example also shows how you can set the logging level which acts as the +threshold for tracking. In this case, because we set the threshold to +``DEBUG``, all of the messages were printed. + +If you want to set the logging level from a command-line option such as:: + + --log=INFO + +and you have the value of the parameter passed for ``--log`` in some variable +*loglevel*, you can use:: + + getattr(logging, loglevel.upper()) + +to get the value which you'll pass to :func:`basicConfig` via the *level* +argument. You may want to error check any user input value, perhaps as in the +following example:: + + # assuming loglevel is bound to the string value obtained from the + # command line argument. Convert to upper case to allow the user to + # specify --log=DEBUG or --log=debug + numeric_level = getattr(logging, loglevel.upper(), None) + if not isinstance(numeric_level, int): + raise ValueError('Invalid log level: %s' % loglevel) + logging.basicConfig(level=numeric_level, ...) + +The call to :func:`basicConfig` should come *before* any calls to :func:`debug`, +:func:`info` etc. As it's intended as a one-off simple configuration facility, +only the first call will actually do anything: subsequent calls are effectively +no-ops. + +If you run the above script several times, the messages from successive runs +are appended to the file *example.log*. If you want each run to start afresh, +not remembering the messages from earlier runs, you can specify the *filemode* +argument, by changing the call in the above example to:: + + logging.basicConfig(filename='example.log', filemode='w', level=logging.DEBUG) + +The output will be the same as before, but the log file is no longer appended +to, so the messages from earlier runs are lost. + + +Logging from multiple modules +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If your program consists of multiple modules, here's an example of how you +could organize logging in it:: + + # myapp.py + import logging + import mylib + + def main(): + logging.basicConfig(filename='myapp.log', level=logging.INFO) + logging.info('Started') + mylib.do_something() + logging.info('Finished') + + if __name__ == '__main__': + main() + +:: + + # mylib.py + import logging + + def do_something(): + logging.info('Doing something') + +If you run *myapp.py*, you should see this in *myapp.log*:: + + INFO:root:Started + INFO:root:Doing something + INFO:root:Finished + +which is hopefully what you were expecting to see. You can generalize this to +multiple modules, using the pattern in *mylib.py*. Note that for this simple +usage pattern, you won't know, by looking in the log file, *where* in your +application your messages came from, apart from looking at the event +description. If you want to track the location of your messages, you'll need +to refer to the documentation beyond the tutorial level -- see +:ref:`logging-advanced-tutorial`. + + +Logging variable data +^^^^^^^^^^^^^^^^^^^^^ + +To log variable data, use a format string for the event description message and +append the variable data as arguments. For example:: + + import logging + logging.warning('%s before you %s', 'Look', 'leap!') + +will display:: + + WARNING:root:Look before you leap! + +As you can see, merging of variable data into the event description message +uses the old, %-style of string formatting. This is for backwards +compatibility: the logging package pre-dates newer formatting options such as +:meth:`str.format` and :class:`string.Template`. These newer formatting +options *are* supported, but exploring them is outside the scope of this +tutorial. + + +Changing the format of displayed messages +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To change the format which is used to display messages, you need to +specify the format you want to use:: + + import logging + logging.basicConfig(format='%(levelname)s:%(message)s', level=logging.DEBUG) + logging.debug('This message should appear on the console') + logging.info('So should this') + logging.warning('And this, too') + +which would print:: + + DEBUG:This message should appear on the console + INFO:So should this + WARNING:And this, too + +Notice that the 'root' which appeared in earlier examples has disappeared. For +a full set of things that can appear in format strings, you can refer to the +documentation for :ref:`logrecord-attributes`, but for simple usage, you just +need the *levelname* (severity), *message* (event description, including +variable data) and perhaps to display when the event occurred. This is +described in the next section. + + +Displaying the date/time in messages +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To display the date and time of an event, you would place '%(asctime)s' in +your format string:: + + import logging + logging.basicConfig(format='%(asctime)s %(message)s') + logging.warning('is when this event was logged.') + +which should print something like this:: + + 2010-12-12 11:41:42,612 is when this event was logged. + +The default format for date/time display (shown above) is ISO8601. If you need +more control over the formatting of the date/time, provide a *datefmt* +argument to ``basicConfig``, as in this example:: + + import logging + logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y %I:%M:%S %p') + logging.warning('is when this event was logged.') + +which would display something like this:: + + 12/12/2010 11:46:36 AM is when this event was logged. + +The format of the *datefmt* argument is the same as supported by +:func:`time.strftime`. + + +Next Steps +^^^^^^^^^^ + +That concludes the basic tutorial. It should be enough to get you up and +running with logging. There's a lot more that the logging package offers, but +to get the best out of it, you'll need to invest a little more of your time in +reading the following sections. If you're ready for that, grab some of your +favourite beverage and carry on. + +If your logging needs are simple, then use the above examples to incorporate +logging into your own scripts, and if you run into problems or don't +understand something, please post a question on the comp.lang.python Usenet +group (available at http://groups.google.com/group/comp.lang.python) and you +should receive help before too long. + +Still here? You can carry on reading the next few sections, which provide a +slightly more advanced/in-depth tutorial than the basic one above. After that, +you can take a look at the :ref:`logging-cookbook`. + +.. _logging-advanced-tutorial: + + +Advanced Logging Tutorial +------------------------- + +The logging library takes a modular approach and offers several categories +of components: loggers, handlers, filters, and formatters. + +* Loggers expose the interface that application code directly uses. +* Handlers send the log records (created by loggers) to the appropriate + destination. +* Filters provide a finer grained facility for determining which log records + to output. +* Formatters specify the layout of log records in the final output. + +Logging is performed by calling methods on instances of the :class:`Logger` +class (hereafter called :dfn:`loggers`). Each instance has a name, and they are +conceptually arranged in a namespace hierarchy using dots (periods) as +separators. For example, a logger named 'scan' is the parent of loggers +'scan.text', 'scan.html' and 'scan.pdf'. Logger names can be anything you want, +and indicate the area of an application in which a logged message originates. + +A good convention to use when naming loggers is to use a module-level logger, +in each module which uses logging, named as follows:: + + logger = logging.getLogger(__name__) + +This means that logger names track the package/module hierarchy, and it's +intuitively obvious where events are logged just from the logger name. + +The root of the hierarchy of loggers is called the root logger. That's the +logger used by the functions :func:`debug`, :func:`info`, :func:`warning`, +:func:`error` and :func:`critical`, which just call the same-named method of +the root logger. The functions and the methods have the same signatures. The +root logger's name is printed as 'root' in the logged output. + +It is, of course, possible to log messages to different destinations. Support +is included in the package for writing log messages to files, HTTP GET/POST +locations, email via SMTP, generic sockets, queues, or OS-specific logging +mechanisms such as syslog or the Windows NT event log. Destinations are served +by :dfn:`handler` classes. You can create your own log destination class if +you have special requirements not met by any of the built-in handler classes. + +By default, no destination is set for any logging messages. You can specify +a destination (such as console or file) by using :func:`basicConfig` as in the +tutorial examples. If you call the functions :func:`debug`, :func:`info`, +:func:`warning`, :func:`error` and :func:`critical`, they will check to see +if no destination is set; and if one is not set, they will set a destination +of the console (``sys.stderr``) and a default format for the displayed +message before delegating to the root logger to do the actual message output. + +The default format set by :func:`basicConfig` for messages is:: + + severity:logger name:message + +You can change this by passing a format string to :func:`basicConfig` with the +*format* keyword argument. For all options regarding how a format string is +constructed, see :ref:`formatter-objects`. + + +Loggers +^^^^^^^ + +:class:`Logger` objects have a threefold job. First, they expose several +methods to application code so that applications can log messages at runtime. +Second, logger objects determine which log messages to act upon based upon +severity (the default filtering facility) or filter objects. Third, logger +objects pass along relevant log messages to all interested log handlers. + +The most widely used methods on logger objects fall into two categories: +configuration and message sending. + +These are the most common configuration methods: + +* :meth:`Logger.setLevel` specifies the lowest-severity log message a logger + will handle, where debug is the lowest built-in severity level and critical + is the highest built-in severity. For example, if the severity level is + INFO, the logger will handle only INFO, WARNING, ERROR, and CRITICAL messages + and will ignore DEBUG messages. + +* :meth:`Logger.addHandler` and :meth:`Logger.removeHandler` add and remove + handler objects from the logger object. Handlers are covered in more detail + in :ref:`handler-basic`. + +* :meth:`Logger.addFilter` and :meth:`Logger.removeFilter` add and remove filter + objects from the logger object. Filters are covered in more detail in + :ref:`filter`. + +You don't need to always call these methods on every logger you create. See the +last two paragraphs in this section. + +With the logger object configured, the following methods create log messages: + +* :meth:`Logger.debug`, :meth:`Logger.info`, :meth:`Logger.warning`, + :meth:`Logger.error`, and :meth:`Logger.critical` all create log records with + a message and a level that corresponds to their respective method names. The + message is actually a format string, which may contain the standard string + substitution syntax of :const:`%s`, :const:`%d`, :const:`%f`, and so on. The + rest of their arguments is a list of objects that correspond with the + substitution fields in the message. With regard to :const:`**kwargs`, the + logging methods care only about a keyword of :const:`exc_info` and use it to + determine whether to log exception information. + +* :meth:`Logger.exception` creates a log message similar to + :meth:`Logger.error`. The difference is that :meth:`Logger.exception` dumps a + stack trace along with it. Call this method only from an exception handler. + +* :meth:`Logger.log` takes a log level as an explicit argument. This is a + little more verbose for logging messages than using the log level convenience + methods listed above, but this is how to log at custom log levels. + +:func:`getLogger` returns a reference to a logger instance with the specified +name if it is provided, or ``root`` if not. The names are period-separated +hierarchical structures. Multiple calls to :func:`getLogger` with the same name +will return a reference to the same logger object. Loggers that are further +down in the hierarchical list are children of loggers higher up in the list. +For example, given a logger with a name of ``foo``, loggers with names of +``foo.bar``, ``foo.bar.baz``, and ``foo.bam`` are all descendants of ``foo``. + +Loggers have a concept of *effective level*. If a level is not explicitly set +on a logger, the level of its parent is used instead as its effective level. +If the parent has no explicit level set, *its* parent is examined, and so on - +all ancestors are searched until an explicitly set level is found. The root +logger always has an explicit level set (``WARNING`` by default). When deciding +whether to process an event, the effective level of the logger is used to +determine whether the event is passed to the logger's handlers. + +Child loggers propagate messages up to the handlers associated with their +ancestor loggers. Because of this, it is unnecessary to define and configure +handlers for all the loggers an application uses. It is sufficient to +configure handlers for a top-level logger and create child loggers as needed. +(You can, however, turn off propagation by setting the *propagate* +attribute of a logger to *False*.) + + +.. _handler-basic: + +Handlers +^^^^^^^^ + +:class:`~logging.Handler` objects are responsible for dispatching the +appropriate log messages (based on the log messages' severity) to the handler's +specified destination. Logger objects can add zero or more handler objects to +themselves with an :func:`addHandler` method. As an example scenario, an +application may want to send all log messages to a log file, all log messages +of error or higher to stdout, and all messages of critical to an email address. +This scenario requires three individual handlers where each handler is +responsible for sending messages of a specific severity to a specific location. + +The standard library includes quite a few handler types (see +:ref:`useful-handlers`); the tutorials use mainly :class:`StreamHandler` and +:class:`FileHandler` in its examples. + +There are very few methods in a handler for application developers to concern +themselves with. The only handler methods that seem relevant for application +developers who are using the built-in handler objects (that is, not creating +custom handlers) are the following configuration methods: + +* The :meth:`Handler.setLevel` method, just as in logger objects, specifies the + lowest severity that will be dispatched to the appropriate destination. Why + are there two :func:`setLevel` methods? The level set in the logger + determines which severity of messages it will pass to its handlers. The level + set in each handler determines which messages that handler will send on. + +* :func:`setFormatter` selects a Formatter object for this handler to use. + +* :func:`addFilter` and :func:`removeFilter` respectively configure and + deconfigure filter objects on handlers. + +Application code should not directly instantiate and use instances of +:class:`Handler`. Instead, the :class:`Handler` class is a base class that +defines the interface that all handlers should have and establishes some +default behavior that child classes can use (or override). + + +Formatters +^^^^^^^^^^ + +Formatter objects configure the final order, structure, and contents of the log +message. Unlike the base :class:`logging.Handler` class, application code may +instantiate formatter classes, although you could likely subclass the formatter +if your application needs special behavior. The constructor takes three +optional arguments -- a message format string, a date format string and a style +indicator. + +.. method:: logging.Formatter.__init__(fmt=None, datefmt=None, style='%') + +If there is no message format string, the default is to use the +raw message. If there is no date format string, the default date format is:: + + %Y-%m-%d %H:%M:%S + +with the milliseconds tacked on at the end. The ``style`` is one of `%`, '{' +or '$'. If one of these is not specified, then '%' will be used. + +If the ``style`` is '%', the message format string uses +``%(<dictionary key>)s`` styled string substitution; the possible keys are +documented in :ref:`logrecord-attributes`. If the style is '{', the message +format string is assumed to be compatible with :meth:`str.format` (using +keyword arguments), while if the style is '$' then the message format string +should conform to what is expected by :meth:`string.Template.substitute`. + +.. versionchanged:: 3.2 + Added the ``style`` parameter. + +The following message format string will log the time in a human-readable +format, the severity of the message, and the contents of the message, in that +order:: + + '%(asctime)s - %(levelname)s - %(message)s' + +Formatters use a user-configurable function to convert the creation time of a +record to a tuple. By default, :func:`time.localtime` is used; to change this +for a particular formatter instance, set the ``converter`` attribute of the +instance to a function with the same signature as :func:`time.localtime` or +:func:`time.gmtime`. To change it for all formatters, for example if you want +all logging times to be shown in GMT, set the ``converter`` attribute in the +Formatter class (to ``time.gmtime`` for GMT display). + + +Configuring Logging +^^^^^^^^^^^^^^^^^^^ + +.. currentmodule:: logging.config + +Programmers can configure logging in three ways: + +1. Creating loggers, handlers, and formatters explicitly using Python + code that calls the configuration methods listed above. +2. Creating a logging config file and reading it using the :func:`fileConfig` + function. +3. Creating a dictionary of configuration information and passing it + to the :func:`dictConfig` function. + +For the reference documentation on the last two options, see +:ref:`logging-config-api`. The following example configures a very simple +logger, a console handler, and a simple formatter using Python code:: + + import logging + + # create logger + logger = logging.getLogger('simple_example') + logger.setLevel(logging.DEBUG) + + # create console handler and set level to debug + ch = logging.StreamHandler() + ch.setLevel(logging.DEBUG) + + # create formatter + formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s') + + # add formatter to ch + ch.setFormatter(formatter) + + # add ch to logger + logger.addHandler(ch) + + # 'application' code + logger.debug('debug message') + logger.info('info message') + logger.warn('warn message') + logger.error('error message') + logger.critical('critical message') + +Running this module from the command line produces the following output:: + + $ python simple_logging_module.py + 2005-03-19 15:10:26,618 - simple_example - DEBUG - debug message + 2005-03-19 15:10:26,620 - simple_example - INFO - info message + 2005-03-19 15:10:26,695 - simple_example - WARNING - warn message + 2005-03-19 15:10:26,697 - simple_example - ERROR - error message + 2005-03-19 15:10:26,773 - simple_example - CRITICAL - critical message + +The following Python module creates a logger, handler, and formatter nearly +identical to those in the example listed above, with the only difference being +the names of the objects:: + + import logging + import logging.config + + logging.config.fileConfig('logging.conf') + + # create logger + logger = logging.getLogger('simpleExample') + + # 'application' code + logger.debug('debug message') + logger.info('info message') + logger.warn('warn message') + logger.error('error message') + logger.critical('critical message') + +Here is the logging.conf file:: + + [loggers] + keys=root,simpleExample + + [handlers] + keys=consoleHandler + + [formatters] + keys=simpleFormatter + + [logger_root] + level=DEBUG + handlers=consoleHandler + + [logger_simpleExample] + level=DEBUG + handlers=consoleHandler + qualname=simpleExample + propagate=0 + + [handler_consoleHandler] + class=StreamHandler + level=DEBUG + formatter=simpleFormatter + args=(sys.stdout,) + + [formatter_simpleFormatter] + format=%(asctime)s - %(name)s - %(levelname)s - %(message)s + datefmt= + +The output is nearly identical to that of the non-config-file-based example:: + + $ python simple_logging_config.py + 2005-03-19 15:38:55,977 - simpleExample - DEBUG - debug message + 2005-03-19 15:38:55,979 - simpleExample - INFO - info message + 2005-03-19 15:38:56,054 - simpleExample - WARNING - warn message + 2005-03-19 15:38:56,055 - simpleExample - ERROR - error message + 2005-03-19 15:38:56,130 - simpleExample - CRITICAL - critical message + +You can see that the config file approach has a few advantages over the Python +code approach, mainly separation of configuration and code and the ability of +noncoders to easily modify the logging properties. + +.. currentmodule:: logging + +Note that the class names referenced in config files need to be either relative +to the logging module, or absolute values which can be resolved using normal +import mechanisms. Thus, you could use either +:class:`~logging.handlers.WatchedFileHandler` (relative to the logging module) or +``mypackage.mymodule.MyHandler`` (for a class defined in package ``mypackage`` +and module ``mymodule``, where ``mypackage`` is available on the Python import +path). + +In Python 3.2, a new means of configuring logging has been introduced, using +dictionaries to hold configuration information. This provides a superset of the +functionality of the config-file-based approach outlined above, and is the +recommended configuration method for new applications and deployments. Because +a Python dictionary is used to hold configuration information, and since you +can populate that dictionary using different means, you have more options for +configuration. For example, you can use a configuration file in JSON format, +or, if you have access to YAML processing functionality, a file in YAML +format, to populate the configuration dictionary. Or, of course, you can +construct the dictionary in Python code, receive it in pickled form over a +socket, or use whatever approach makes sense for your application. + +Here's an example of the same configuration as above, in YAML format for +the new dictionary-based approach:: + + version: 1 + formatters: + simple: + format: format=%(asctime)s - %(name)s - %(levelname)s - %(message)s + handlers: + console: + class: logging.StreamHandler + level: DEBUG + formatter: simple + stream: ext://sys.stdout + loggers: + simpleExample: + level: DEBUG + handlers: [console] + propagate: no + root: + level: DEBUG + handlers: [console] + +For more information about logging using a dictionary, see +:ref:`logging-config-api`. + +What happens if no configuration is provided +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If no logging configuration is provided, it is possible to have a situation +where a logging event needs to be output, but no handlers can be found to +output the event. The behaviour of the logging package in these +circumstances is dependent on the Python version. + +For versions of Python prior to 3.2, the behaviour is as follows: + +* If *logging.raiseExceptions* is *False* (production mode), the event is + silently dropped. + +* If *logging.raiseExceptions* is *True* (development mode), a message + 'No handlers could be found for logger X.Y.Z' is printed once. + +In Python 3.2 and later, the behaviour is as follows: + +* The event is output using a 'handler of last resort', stored in + ``logging.lastResort``. This internal handler is not associated with any + logger, and acts like a :class:`~logging.StreamHandler` which writes the + event description message to the current value of ``sys.stderr`` (therefore + respecting any redirections which may be in effect). No formatting is + done on the message - just the bare event description message is printed. + The handler's level is set to ``WARNING``, so all events at this and + greater severities will be output. + +To obtain the pre-3.2 behaviour, ``logging.lastResort`` can be set to *None*. + +.. _library-config: + +Configuring Logging for a Library +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +When developing a library which uses logging, you should take care to +document how the library uses logging - for example, the names of loggers +used. Some consideration also needs to be given to its logging configuration. +If the using application does not use logging, and library code makes logging +calls, then (as described in the previous section) events of severity +``WARNING`` and greater will be printed to ``sys.stderr``. This is regarded as +the best default behaviour. + +If for some reason you *don't* want these messages printed in the absence of +any logging configuration, you can attach a do-nothing handler to the top-level +logger for your library. This avoids the message being printed, since a handler +will be always be found for the library's events: it just doesn't produce any +output. If the library user configures logging for application use, presumably +that configuration will add some handlers, and if levels are suitably +configured then logging calls made in library code will send output to those +handlers, as normal. + +A do-nothing handler is included in the logging package: +:class:`~logging.NullHandler` (since Python 3.1). An instance of this handler +could be added to the top-level logger of the logging namespace used by the +library (*if* you want to prevent your library's logged events being output to +``sys.stderr`` in the absence of logging configuration). If all logging by a +library *foo* is done using loggers with names matching 'foo.x', 'foo.x.y', +etc. then the code:: + + import logging + logging.getLogger('foo').addHandler(logging.NullHandler()) + +should have the desired effect. If an organisation produces a number of +libraries, then the logger name specified can be 'orgname.foo' rather than +just 'foo'. + +**PLEASE NOTE:** It is strongly advised that you *do not add any handlers other +than* :class:`~logging.NullHandler` *to your library's loggers*. This is +because the configuration of handlers is the prerogative of the application +developer who uses your library. The application developer knows their target +audience and what handlers are most appropriate for their application: if you +add handlers 'under the hood', you might well interfere with their ability to +carry out unit tests and deliver logs which suit their requirements. + + +Logging Levels +-------------- + +The numeric values of logging levels are given in the following table. These are +primarily of interest if you want to define your own levels, and need them to +have specific values relative to the predefined levels. If you define a level +with the same numeric value, it overwrites the predefined value; the predefined +name is lost. + ++--------------+---------------+ +| Level | Numeric value | ++==============+===============+ +| ``CRITICAL`` | 50 | ++--------------+---------------+ +| ``ERROR`` | 40 | ++--------------+---------------+ +| ``WARNING`` | 30 | ++--------------+---------------+ +| ``INFO`` | 20 | ++--------------+---------------+ +| ``DEBUG`` | 10 | ++--------------+---------------+ +| ``NOTSET`` | 0 | ++--------------+---------------+ + +Levels can also be associated with loggers, being set either by the developer or +through loading a saved logging configuration. When a logging method is called +on a logger, the logger compares its own level with the level associated with +the method call. If the logger's level is higher than the method call's, no +logging message is actually generated. This is the basic mechanism controlling +the verbosity of logging output. + +Logging messages are encoded as instances of the :class:`~logging.LogRecord` +class. When a logger decides to actually log an event, a +:class:`~logging.LogRecord` instance is created from the logging message. + +Logging messages are subjected to a dispatch mechanism through the use of +:dfn:`handlers`, which are instances of subclasses of the :class:`Handler` +class. Handlers are responsible for ensuring that a logged message (in the form +of a :class:`LogRecord`) ends up in a particular location (or set of locations) +which is useful for the target audience for that message (such as end users, +support desk staff, system administrators, developers). Handlers are passed +:class:`LogRecord` instances intended for particular destinations. Each logger +can have zero, one or more handlers associated with it (via the +:meth:`~Logger.addHandler` method of :class:`Logger`). In addition to any +handlers directly associated with a logger, *all handlers associated with all +ancestors of the logger* are called to dispatch the message (unless the +*propagate* flag for a logger is set to a false value, at which point the +passing to ancestor handlers stops). + +Just as for loggers, handlers can have levels associated with them. A handler's +level acts as a filter in the same way as a logger's level does. If a handler +decides to actually dispatch an event, the :meth:`~Handler.emit` method is used +to send the message to its destination. Most user-defined subclasses of +:class:`Handler` will need to override this :meth:`~Handler.emit`. + +.. _custom-levels: + +Custom Levels +^^^^^^^^^^^^^ + +Defining your own levels is possible, but should not be necessary, as the +existing levels have been chosen on the basis of practical experience. +However, if you are convinced that you need custom levels, great care should +be exercised when doing this, and it is possibly *a very bad idea to define +custom levels if you are developing a library*. That's because if multiple +library authors all define their own custom levels, there is a chance that +the logging output from such multiple libraries used together will be +difficult for the using developer to control and/or interpret, because a +given numeric value might mean different things for different libraries. + +.. _useful-handlers: + +Useful Handlers +--------------- + +In addition to the base :class:`Handler` class, many useful subclasses are +provided: + +#. :class:`StreamHandler` instances send messages to streams (file-like + objects). + +#. :class:`FileHandler` instances send messages to disk files. + +#. :class:`~handlers.BaseRotatingHandler` is the base class for handlers that + rotate log files at a certain point. It is not meant to be instantiated + directly. Instead, use :class:`~handlers.RotatingFileHandler` or + :class:`~handlers.TimedRotatingFileHandler`. + +#. :class:`~handlers.RotatingFileHandler` instances send messages to disk + files, with support for maximum log file sizes and log file rotation. + +#. :class:`~handlers.TimedRotatingFileHandler` instances send messages to + disk files, rotating the log file at certain timed intervals. + +#. :class:`~handlers.SocketHandler` instances send messages to TCP/IP + sockets. + +#. :class:`~handlers.DatagramHandler` instances send messages to UDP + sockets. + +#. :class:`~handlers.SMTPHandler` instances send messages to a designated + email address. + +#. :class:`~handlers.SysLogHandler` instances send messages to a Unix + syslog daemon, possibly on a remote machine. + +#. :class:`~handlers.NTEventLogHandler` instances send messages to a + Windows NT/2000/XP event log. + +#. :class:`~handlers.MemoryHandler` instances send messages to a buffer + in memory, which is flushed whenever specific criteria are met. + +#. :class:`~handlers.HTTPHandler` instances send messages to an HTTP + server using either ``GET`` or ``POST`` semantics. + +#. :class:`~handlers.WatchedFileHandler` instances watch the file they are + logging to. If the file changes, it is closed and reopened using the file + name. This handler is only useful on Unix-like systems; Windows does not + support the underlying mechanism used. + +#. :class:`~handlers.QueueHandler` instances send messages to a queue, such as + those implemented in the :mod:`queue` or :mod:`multiprocessing` modules. + +#. :class:`NullHandler` instances do nothing with error messages. They are used + by library developers who want to use logging, but want to avoid the 'No + handlers could be found for logger XXX' message which can be displayed if + the library user has not configured logging. See :ref:`library-config` for + more information. + +.. versionadded:: 3.1 + The :class:`NullHandler` class. + +.. versionadded:: 3.2 + The :class:`~handlers.QueueHandler` class. + +The :class:`NullHandler`, :class:`StreamHandler` and :class:`FileHandler` +classes are defined in the core logging package. The other handlers are +defined in a sub- module, :mod:`logging.handlers`. (There is also another +sub-module, :mod:`logging.config`, for configuration functionality.) + +Logged messages are formatted for presentation through instances of the +:class:`Formatter` class. They are initialized with a format string suitable for +use with the % operator and a dictionary. + +For formatting multiple messages in a batch, instances of +:class:`BufferingFormatter` can be used. In addition to the format string (which +is applied to each message in the batch), there is provision for header and +trailer format strings. + +When filtering based on logger level and/or handler level is not enough, +instances of :class:`Filter` can be added to both :class:`Logger` and +:class:`Handler` instances (through their :meth:`addFilter` method). Before +deciding to process a message further, both loggers and handlers consult all +their filters for permission. If any filter returns a false value, the message +is not processed further. + +The basic :class:`Filter` functionality allows filtering by specific logger +name. If this feature is used, messages sent to the named logger and its +children are allowed through the filter, and all others dropped. + + +.. _logging-exceptions: + +Exceptions raised during logging +-------------------------------- + +The logging package is designed to swallow exceptions which occur while logging +in production. This is so that errors which occur while handling logging events +- such as logging misconfiguration, network or other similar errors - do not +cause the application using logging to terminate prematurely. + +:class:`SystemExit` and :class:`KeyboardInterrupt` exceptions are never +swallowed. Other exceptions which occur during the :meth:`emit` method of a +:class:`Handler` subclass are passed to its :meth:`handleError` method. + +The default implementation of :meth:`handleError` in :class:`Handler` checks +to see if a module-level variable, :data:`raiseExceptions`, is set. If set, a +traceback is printed to :data:`sys.stderr`. If not set, the exception is swallowed. + +**Note:** The default value of :data:`raiseExceptions` is ``True``. This is because +during development, you typically want to be notified of any exceptions that +occur. It's advised that you set :data:`raiseExceptions` to ``False`` for production +usage. + +.. currentmodule:: logging + +.. _arbitrary-object-messages: + +Using arbitrary objects as messages +----------------------------------- + +In the preceding sections and examples, it has been assumed that the message +passed when logging the event is a string. However, this is not the only +possibility. You can pass an arbitrary object as a message, and its +:meth:`__str__` method will be called when the logging system needs to convert +it to a string representation. In fact, if you want to, you can avoid +computing a string representation altogether - for example, the +:class:`SocketHandler` emits an event by pickling it and sending it over the +wire. + + +Optimization +------------ + +Formatting of message arguments is deferred until it cannot be avoided. +However, computing the arguments passed to the logging method can also be +expensive, and you may want to avoid doing it if the logger will just throw +away your event. To decide what to do, you can call the :meth:`isEnabledFor` +method which takes a level argument and returns true if the event would be +created by the Logger for that level of call. You can write code like this:: + + if logger.isEnabledFor(logging.DEBUG): + logger.debug('Message with %s, %s', expensive_func1(), + expensive_func2()) + +so that if the logger's threshold is set above ``DEBUG``, the calls to +:func:`expensive_func1` and :func:`expensive_func2` are never made. + +There are other optimizations which can be made for specific applications which +need more precise control over what logging information is collected. Here's a +list of things you can do to avoid processing during logging which you don't +need: + ++-----------------------------------------------+----------------------------------------+ +| What you don't want to collect | How to avoid collecting it | ++===============================================+========================================+ +| Information about where calls were made from. | Set ``logging._srcfile`` to ``None``. | ++-----------------------------------------------+----------------------------------------+ +| Threading information. | Set ``logging.logThreads`` to ``0``. | ++-----------------------------------------------+----------------------------------------+ +| Process information. | Set ``logging.logProcesses`` to ``0``. | ++-----------------------------------------------+----------------------------------------+ + +Also note that the core logging module only includes the basic handlers. If +you don't import :mod:`logging.handlers` and :mod:`logging.config`, they won't +take up any memory. + +.. seealso:: + + Module :mod:`logging` + API reference for the logging module. + + Module :mod:`logging.config` + Configuration API for the logging module. + + Module :mod:`logging.handlers` + Useful handlers included with the logging module. + + :ref:`A logging cookbook <logging-cookbook>` + diff --git a/Doc/howto/pyporting.rst b/Doc/howto/pyporting.rst new file mode 100644 index 0000000..124ef33 --- /dev/null +++ b/Doc/howto/pyporting.rst @@ -0,0 +1,703 @@ +.. _pyporting-howto: + +********************************* +Porting Python 2 Code to Python 3 +********************************* + +:author: Brett Cannon + +.. topic:: Abstract + + With Python 3 being the future of Python while Python 2 is still in active + use, it is good to have your project available for both major releases of + Python. This guide is meant to help you choose which strategy works best + for your project to support both Python 2 & 3 along with how to execute + that strategy. + + If you are looking to port an extension module instead of pure Python code, + please see :ref:`cporting-howto`. + + +Choosing a Strategy +=================== + +When a project makes the decision that it's time to support both Python 2 & 3, +a decision needs to be made as to how to go about accomplishing that goal. +The chosen strategy will depend on how large the project's existing +codebase is and how much divergence you want from your Python 2 codebase from +your Python 3 one (e.g., starting a new version with Python 3). + +If your project is brand-new or does not have a large codebase, then you may +want to consider writing/porting :ref:`all of your code for Python 3 +and use 3to2 <use_3to2>` to port your code for Python 2. + +If you would prefer to maintain a codebase which is semantically **and** +syntactically compatible with Python 2 & 3 simultaneously, you can write +:ref:`use_same_source`. While this tends to lead to somewhat non-idiomatic +code, it does mean you keep a rapid development process for you, the developer. + +Finally, you do have the option of :ref:`using 2to3 <use_2to3>` to translate +Python 2 code into Python 3 code (with some manual help). This can take the +form of branching your code and using 2to3 to start a Python 3 branch. You can +also have users perform the translation as installation time automatically so +that you only have to maintain a Python 2 codebase. + +Regardless of which approach you choose, porting is not as hard or +time-consuming as you might initially think. You can also tackle the problem +piece-meal as a good portion of porting is simply updating your code to follow +current best practices in a Python 2/3 compatible way. + + +Universal Bits of Advice +------------------------ + +Regardless of what strategy you pick, there are a few things you should +consider. + +One is make sure you have a robust test suite. You need to make sure everything +continues to work, just like when you support a new minor version of Python. +This means making sure your test suite is thorough and is ported properly +between Python 2 & 3. You will also most likely want to use something like tox_ +to automate testing between both a Python 2 and Python 3 VM. + +Two, once your project has Python 3 support, make sure to add the proper +classifier on the Cheeseshop_ (PyPI_). To have your project listed as Python 3 +compatible it must have the +`Python 3 classifier <http://pypi.python.org/pypi?:action=browse&c=533>`_ +(from +http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/):: + + setup( + name='Your Library', + version='1.0', + classifiers=[ + # make sure to use :: Python *and* :: Python :: 3 so + # that pypi can list the package on the python 3 page + 'Programming Language :: Python', + 'Programming Language :: Python :: 3' + ], + packages=['yourlibrary'], + # make sure to add custom_fixers to the MANIFEST.in + include_package_data=True, + # ... + ) + + +Doing so will cause your project to show up in the +`Python 3 packages list +<http://pypi.python.org/pypi?:action=browse&c=533&show=all>`_. You will know +you set the classifier properly as visiting your project page on the Cheeseshop +will show a Python 3 logo in the upper-left corner of the page. + +Three, the six_ project provides a library which helps iron out differences +between Python 2 & 3. If you find there is a sticky point that is a continual +point of contention in your translation or maintenance of code, consider using +a source-compatible solution relying on six. If you have to create your own +Python 2/3 compatible solution, you can use ``sys.version_info[0] >= 3`` as a +guard. + +Four, read all the approaches. Just because some bit of advice applies to one +approach more than another doesn't mean that some advice doesn't apply to other +strategies. + +Five, drop support for older Python versions if possible. `Python 2.5`_ +introduced a lot of useful syntax and libraries which have become idiomatic +in Python 3. `Python 2.6`_ introduced future statements which makes +compatibility much easier if you are going from Python 2 to 3. +`Python 2.7`_ continues the trend in the stdlib. So choose the newest version +of Python which you believe can be your minimum support version +and work from there. + + +.. _tox: http://codespeak.net/tox/ +.. _Cheeseshop: +.. _PyPI: http://pypi.python.org/ +.. _six: http://packages.python.org/six +.. _Python 2.7: http://www.python.org/2.7.x +.. _Python 2.6: http://www.python.org/2.6.x +.. _Python 2.5: http://www.python.org/2.5.x +.. _Python 2.4: http://www.python.org/2.4.x +.. _Python 2.3: http://www.python.org/2.3.x +.. _Python 2.2: http://www.python.org/2.2.x + + +.. _use_3to2: + +Python 3 and 3to2 +================= + +If you are starting a new project or your codebase is small enough, you may +want to consider writing your code for Python 3 and backporting to Python 2 +using 3to2_. Thanks to Python 3 being more strict about things than Python 2 +(e.g., bytes vs. strings), the source translation can be easier and more +straightforward than from Python 2 to 3. Plus it gives you more direct +experience developing in Python 3 which, since it is the future of Python, is a +good thing long-term. + +A drawback of this approach is that 3to2 is a third-party project. This means +that the Python core developers (and thus this guide) can make no promises +about how well 3to2 works at any time. There is nothing to suggest, though, +that 3to2 is not a high-quality project. + + +.. _3to2: https://bitbucket.org/amentajo/lib3to2/overview + + +.. _use_2to3: + +Python 2 and 2to3 +================= + +Included with Python since 2.6, the 2to3_ tool (and :mod:`lib2to3` module) +helps with porting Python 2 to Python 3 by performing various source +translations. This is a perfect solution for projects which wish to branch +their Python 3 code from their Python 2 codebase and maintain them as +independent codebases. You can even begin preparing to use this approach +today by writing future-compatible Python code which works cleanly in +Python 2 in conjunction with 2to3; all steps outlined below will work +with Python 2 code up to the point when the actual use of 2to3 occurs. + +Use of 2to3 as an on-demand translation step at install time is also possible, +preventing the need to maintain a separate Python 3 codebase, but this approach +does come with some drawbacks. While users will only have to pay the +translation cost once at installation, you as a developer will need to pay the +cost regularly during development. If your codebase is sufficiently large +enough then the translation step ends up acting like a compilation step, +robbing you of the rapid development process you are used to with Python. +Obviously the time required to translate a project will vary, so do an +experimental translation just to see how long it takes to evaluate whether you +prefer this approach compared to using :ref:`use_same_source` or simply keeping +a separate Python 3 codebase. + +Below are the typical steps taken by a project which uses a 2to3-based approach +to supporting Python 2 & 3. + + +Support Python 2.7 +------------------ + +As a first step, make sure that your project is compatible with `Python 2.7`_. +This is just good to do as Python 2.7 is the last release of Python 2 and thus +will be used for a rather long time. It also allows for use of the ``-3`` flag +to Python to help discover places in your code which 2to3 cannot handle but are +known to cause issues. + +Try to Support `Python 2.6`_ and Newer Only +------------------------------------------- + +While not possible for all projects, if you can support `Python 2.6`_ and newer +**only**, your life will be much easier. Various future statements, stdlib +additions, etc. exist only in Python 2.6 and later which greatly assist in +porting to Python 3. But if you project must keep support for `Python 2.5`_ (or +even `Python 2.4`_) then it is still possible to port to Python 3. + +Below are the benefits you gain if you only have to support Python 2.6 and +newer. Some of these options are personal choice while others are +**strongly** recommended (the ones that are more for personal choice are +labeled as such). If you continue to support older versions of Python then you +at least need to watch out for situations that these solutions fix. + + +``from __future__ import print_function`` +''''''''''''''''''''''''''''''''''''''''' + +This is a personal choice. 2to3 handles the translation from the print +statement to the print function rather well so this is an optional step. This +future statement does help, though, with getting used to typing +``print('Hello, World')`` instead of ``print 'Hello, World'``. + + +``from __future__ import unicode_literals`` +''''''''''''''''''''''''''''''''''''''''''' + +Another personal choice. You can always mark what you want to be a (unicode) +string with a ``u`` prefix to get the same effect. But regardless of whether +you use this future statement or not, you **must** make sure you know exactly +which Python 2 strings you want to be bytes, and which are to be strings. This +means you should, **at minimum** mark all strings that are meant to be text +strings with a ``u`` prefix if you do not use this future statement. + + +Bytes literals +'''''''''''''' + +This is a **very** important one. The ability to prefix Python 2 strings that +are meant to contain bytes with a ``b`` prefix help to very clearly delineate +what is and is not a Python 3 string. When you run 2to3 on code, all Python 2 +strings become Python 3 strings **unless** they are prefixed with ``b``. + +There are some differences between byte literals in Python 2 and those in +Python 3 thanks to the bytes type just being an alias to ``str`` in Python 2. +Probably the biggest "gotcha" is that indexing results in different values. In +Python 2, the value of ``b'py'[1]`` is ``'y'``, while in Python 3 it's ``121``. +You can avoid this disparity by always slicing at the size of a single element: +``b'py'[1:2]`` is ``'y'`` in Python 2 and ``b'y'`` in Python 3 (i.e., close +enough). + +You cannot concatenate bytes and strings in Python 3. But since in Python +2 has bytes aliased to ``str``, it will succeed: ``b'a' + u'b'`` works in +Python 2, but ``b'a' + 'b'`` in Python 3 is a :exc:`TypeError`. A similar issue +also comes about when doing comparisons between bytes and strings. + + +Supporting `Python 2.5`_ and Newer Only +--------------------------------------- + +If you are supporting `Python 2.5`_ and newer there are still some features of +Python that you can utilize. + + +``from __future__ import absolute_import`` +'''''''''''''''''''''''''''''''''''''''''' + +Implicit relative imports (e.g., importing ``spam.bacon`` from within +``spam.eggs`` with the statement ``import bacon``) does not work in Python 3. +This future statement moves away from that and allows the use of explicit +relative imports (e.g., ``from . import bacon``). + +In `Python 2.5`_ you must use +the __future__ statement to get to use explicit relative imports and prevent +implicit ones. In `Python 2.6`_ explicit relative imports are available without +the statement, but you still want the __future__ statement to prevent implicit +relative imports. In `Python 2.7`_ the __future__ statement is not needed. In +other words, unless you are only supporting Python 2.7 or a version earlier +than Python 2.5, use the __future__ statement. + + + +Handle Common "Gotchas" +----------------------- + +There are a few things that just consistently come up as sticking points for +people which 2to3 cannot handle automatically or can easily be done in Python 2 +to help modernize your code. + + +``from __future__ import division`` +''''''''''''''''''''''''''''''''''' + +While the exact same outcome can be had by using the ``-Qnew`` argument to +Python, using this future statement lifts the requirement that your users use +the flag to get the expected behavior of division in Python 3 +(e.g., ``1/2 == 0.5; 1//2 == 0``). + + + +Specify when opening a file as binary +''''''''''''''''''''''''''''''''''''' + +Unless you have been working on Windows, there is a chance you have not always +bothered to add the ``b`` mode when opening a binary file (e.g., ``rb`` for +binary reading). Under Python 3, binary files and text files are clearly +distinct and mutually incompatible; see the :mod:`io` module for details. +Therefore, you **must** make a decision of whether a file will be used for +binary access (allowing to read and/or write bytes data) or text access +(allowing to read and/or write unicode data). + +Text files +'''''''''' + +Text files created using ``open()`` under Python 2 return byte strings, +while under Python 3 they return unicode strings. Depending on your porting +strategy, this can be an issue. + +If you want text files to return unicode strings in Python 2, you have two +possibilities: + +* Under Python 2.6 and higher, use :func:`io.open`. Since :func:`io.open` + is essentially the same function in both Python 2 and Python 3, it will + help iron out any issues that might arise. + +* If pre-2.6 compatibility is needed, then you should use :func:`codecs.open` + instead. This will make sure that you get back unicode strings in Python 2. + +Subclass ``object`` +''''''''''''''''''' + +New-style classes have been around since `Python 2.2`_. You need to make sure +you are subclassing from ``object`` to avoid odd edge cases involving method +resolution order, etc. This continues to be totally valid in Python 3 (although +unneeded as all classes implicitly inherit from ``object``). + + +Deal With the Bytes/String Dichotomy +'''''''''''''''''''''''''''''''''''' + +One of the biggest issues people have when porting code to Python 3 is handling +the bytes/string dichotomy. Because Python 2 allowed the ``str`` type to hold +textual data, people have over the years been rather loose in their delineation +of what ``str`` instances held text compared to bytes. In Python 3 you cannot +be so care-free anymore and need to properly handle the difference. The key +handling this issue to to make sure that **every** string literal in your +Python 2 code is either syntactically of functionally marked as either bytes or +text data. After this is done you then need to make sure your APIs are designed +to either handle a specific type or made to be properly polymorphic. + + +Mark Up Python 2 String Literals +******************************** + +First thing you must do is designate every single string literal in Python 2 +as either textual or bytes data. If you are only supporting Python 2.6 or +newer, this can be accomplished by marking bytes literals with a ``b`` prefix +and then designating textual data with a ``u`` prefix or using the +``unicode_literals`` future statement. + +If your project supports versions of Python pre-dating 2.6, then you should use +the six_ project and its ``b()`` function to denote bytes literals. For text +literals you can either use six's ``u()`` function or use a ``u`` prefix. + + +Decide what APIs Will Accept +**************************** + +In Python 2 it was very easy to accidentally create an API that accepted both +bytes and textual data. But in Python 3, thanks to the more strict handling of +disparate types, this loose usage of bytes and text together tends to fail. + +Take the dict ``{b'a': 'bytes', u'a': 'text'}`` in Python 2.6. It creates the +dict ``{u'a': 'text'}`` since ``b'a' == u'a'``. But in Python 3 the equivalent +dict creates ``{b'a': 'bytes', 'a': 'text'}``, i.e., no lost data. Similar +issues can crop up when transitioning Python 2 code to Python 3. + +This means you need to choose what an API is going to accept and create and +consistently stick to that API in both Python 2 and 3. + + +Bytes / Unicode Comparison +************************** + +In Python 3, mixing bytes and unicode is forbidden in most situations; it +will raise a :class:`TypeError` where Python 2 would have attempted an implicit +coercion between types. However, there is one case where it doesn't and +it can be very misleading:: + + >>> b"" == "" + False + +This is because an equality comparison is required by the language to always +succeed (and return ``False`` for incompatible types). However, this also +means that code incorrectly ported to Python 3 can display buggy behaviour +if such comparisons are silently executed. To detect such situations, +Python 3 has a ``-b`` flag that will display a warning:: + + $ python3 -b + >>> b"" == "" + __main__:1: BytesWarning: Comparison between bytes and string + False + +To turn the warning into an exception, use the ``-bb`` flag instead:: + + $ python3 -bb + >>> b"" == "" + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + BytesWarning: Comparison between bytes and string + + +Indexing bytes objects +'''''''''''''''''''''' + +Another potentially surprising change is the indexing behaviour of bytes +objects in Python 3:: + + >>> b"xyz"[0] + 120 + +Indeed, Python 3 bytes objects (as well as :class:`bytearray` objects) +are sequences of integers. But code converted from Python 2 will often +assume that indexing a bytestring produces another bytestring, not an +integer. To reconcile both behaviours, use slicing:: + + >>> b"xyz"[0:1] + b'x' + >>> n = 1 + >>> b"xyz"[n:n+1] + b'y' + +The only remaining gotcha is that an out-of-bounds slice returns an empty +bytes object instead of raising ``IndexError``: + + >>> b"xyz"[3] + Traceback (most recent call last): + File "<stdin>", line 1, in <module> + IndexError: index out of range + >>> b"xyz"[3:4] + b'' + + +``__str__()``/``__unicode__()`` +''''''''''''''''''''''''''''''' + +In Python 2, objects can specify both a string and unicode representation of +themselves. In Python 3, though, there is only a string representation. This +becomes an issue as people can inadvertently do things in their ``__str__()`` +methods which have unpredictable results (e.g., infinite recursion if you +happen to use the ``unicode(self).encode('utf8')`` idiom as the body of your +``__str__()`` method). + +There are two ways to solve this issue. One is to use a custom 2to3 fixer. The +blog post at http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/ +specifies how to do this. That will allow 2to3 to change all instances of ``def +__unicode(self): ...`` to ``def __str__(self): ...``. This does require you +define your ``__str__()`` method in Python 2 before your ``__unicode__()`` +method. + +The other option is to use a mixin class. This allows you to only define a +``__unicode__()`` method for your class and let the mixin derive +``__str__()`` for you (code from +http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/):: + + import sys + + class UnicodeMixin(object): + + """Mixin class to handle defining the proper __str__/__unicode__ + methods in Python 2 or 3.""" + + if sys.version_info[0] >= 3: # Python 3 + def __str__(self): + return self.__unicode__() + else: # Python 2 + def __str__(self): + return self.__unicode__().encode('utf8') + + + class Spam(UnicodeMixin): + + def __unicode__(self): + return u'spam-spam-bacon-spam' # 2to3 will remove the 'u' prefix + + +Don't Index on Exceptions +''''''''''''''''''''''''' + +In Python 2, the following worked:: + + >>> exc = Exception(1, 2, 3) + >>> exc.args[1] + 2 + >>> exc[1] # Python 2 only! + 2 + +But in Python 3, indexing directly on an exception is an error. You need to +make sure to only index on the :attr:`BaseException.args` attribute which is a +sequence containing all arguments passed to the :meth:`__init__` method. + +Even better is to use the documented attributes the exception provides. + +Don't use ``__getslice__`` & Friends +'''''''''''''''''''''''''''''''''''' + +Been deprecated for a while, but Python 3 finally drops support for +``__getslice__()``, etc. Move completely over to :meth:`__getitem__` and +friends. + + +Updating doctests +''''''''''''''''' + +2to3_ will attempt to generate fixes for doctests that it comes across. It's +not perfect, though. If you wrote a monolithic set of doctests (e.g., a single +docstring containing all of your doctests), you should at least consider +breaking the doctests up into smaller pieces to make it more manageable to fix. +Otherwise it might very well be worth your time and effort to port your tests +to :mod:`unittest`. + + +Eliminate ``-3`` Warnings +------------------------- + +When you run your application's test suite, run it using the ``-3`` flag passed +to Python. This will cause various warnings to be raised during execution about +things that 2to3 cannot handle automatically (e.g., modules that have been +removed). Try to eliminate those warnings to make your code even more portable +to Python 3. + + +Run 2to3 +-------- + +Once you have made your Python 2 code future-compatible with Python 3, it's +time to use 2to3_ to actually port your code. + + +Manually +'''''''' + +To manually convert source code using 2to3_, you use the ``2to3`` script that +is installed with Python 2.6 and later.:: + + 2to3 <directory or file to convert> + +This will cause 2to3 to write out a diff with all of the fixers applied for the +converted source code. If you would like 2to3 to go ahead and apply the changes +you can pass it the ``-w`` flag:: + + 2to3 -w <stuff to convert> + +There are other flags available to control exactly which fixers are applied, +etc. + + +During Installation +''''''''''''''''''' + +When a user installs your project for Python 3, you can have either +:mod:`distutils` or Distribute_ run 2to3_ on your behalf. +For distutils, use the following idiom:: + + try: # Python 3 + from distutils.command.build_py import build_py_2to3 as build_py + except ImportError: # Python 2 + from distutils.command.build_py import build_py + + setup(cmdclass = {'build_py': build_py}, + # ... + ) + +For Distribute:: + + setup(use_2to3=True, + # ... + ) + +This will allow you to not have to distribute a separate Python 3 version of +your project. It does require, though, that when you perform development that +you at least build your project and use the built Python 3 source for testing. + + +Verify & Test +------------- + +At this point you should (hopefully) have your project converted in such a way +that it works in Python 3. Verify it by running your unit tests and making sure +nothing has gone awry. If you miss something then figure out how to fix it in +Python 3, backport to your Python 2 code, and run your code through 2to3 again +to verify the fix transforms properly. + + +.. _2to3: http://docs.python.org/py3k/library/2to3.html +.. _Distribute: http://packages.python.org/distribute/ + + +.. _use_same_source: + +Python 2/3 Compatible Source +============================ + +While it may seem counter-intuitive, you can write Python code which is +source-compatible between Python 2 & 3. It does lead to code that is not +entirely idiomatic Python (e.g., having to extract the currently raised +exception from ``sys.exc_info()[1]``), but it can be run under Python 2 +**and** Python 3 without using 2to3_ as a translation step (although the tool +should be used to help find potential portability problems). This allows you to +continue to have a rapid development process regardless of whether you are +developing under Python 2 or Python 3. Whether this approach or using +:ref:`use_2to3` works best for you will be a per-project decision. + +To get a complete idea of what issues you will need to deal with, see the +`What's New in Python 3.0`_. Others have reorganized the data in other formats +such as http://docs.pythonsprints.com/python3_porting/py-porting.html . + +The following are some steps to take to try to support both Python 2 & 3 from +the same source code. + + +.. _What's New in Python 3.0: http://docs.python.org/release/3.0/whatsnew/3.0.html + + +Follow The Steps for Using 2to3_ +-------------------------------- + +All of the steps outlined in how to +:ref:`port Python 2 code with 2to3 <use_2to3>` apply +to creating a Python 2/3 codebase. This includes trying only support Python 2.6 +or newer (the :mod:`__future__` statements work in Python 3 without issue), +eliminating warnings that are triggered by ``-3``, etc. + +You should even consider running 2to3_ over your code (without committing the +changes). This will let you know where potential pain points are within your +code so that you can fix them properly before they become an issue. + + +Use six_ +-------- + +The six_ project contains many things to help you write portable Python code. +You should make sure to read its documentation from beginning to end and use +any and all features it provides. That way you will minimize any mistakes you +might make in writing cross-version code. + + +Capturing the Currently Raised Exception +---------------------------------------- + +One change between Python 2 and 3 that will require changing how you code (if +you support `Python 2.5`_ and earlier) is +accessing the currently raised exception. In Python 2.5 and earlier the syntax +to access the current exception is:: + + try: + raise Exception() + except Exception, exc: + # Current exception is 'exc' + pass + +This syntax changed in Python 3 (and backported to `Python 2.6`_ and later) +to:: + + try: + raise Exception() + except Exception as exc: + # Current exception is 'exc' + # In Python 3, 'exc' is restricted to the block; Python 2.6 will "leak" + pass + +Because of this syntax change you must change to capturing the current +exception to:: + + try: + raise Exception() + except Exception: + import sys + exc = sys.exc_info()[1] + # Current exception is 'exc' + pass + +You can get more information about the raised exception from +:func:`sys.exc_info` than simply the current exception instance, but you most +likely don't need it. + +.. note:: + In Python 3, the traceback is attached to the exception instance + through the ``__traceback__`` attribute. If the instance is saved in + a local variable that persists outside of the ``except`` block, the + traceback will create a reference cycle with the current frame and its + dictionary of local variables. This will delay reclaiming dead + resources until the next cyclic :term:`garbage collection` pass. + + In Python 2, this problem only occurs if you save the traceback itself + (e.g. the third element of the tuple returned by :func:`sys.exc_info`) + in a variable. + + +Other Resources +=============== + +The authors of the following blog posts, wiki pages, and books deserve special +thanks for making public their tips for porting Python 2 code to Python 3 (and +thus helping provide information for this document): + +* http://python3porting.com/ +* http://docs.pythonsprints.com/python3_porting/py-porting.html +* http://techspot.zzzeek.org/2011/01/24/zzzeek-s-guide-to-python-3-porting/ +* http://dabeaz.blogspot.com/2011/01/porting-py65-and-my-superboard-to.html +* http://lucumr.pocoo.org/2011/1/22/forwards-compatible-python/ +* http://lucumr.pocoo.org/2010/2/11/porting-to-python-3-a-guide/ +* http://wiki.python.org/moin/PortingPythonToPy3k + +If you feel there is something missing from this document that should be added, +please email the python-porting_ mailing list. + +.. _python-porting: http://mail.python.org/mailman/listinfo/python-porting diff --git a/Doc/howto/sorting.rst b/Doc/howto/sorting.rst index 351713f..d9c70e2 100644 --- a/Doc/howto/sorting.rst +++ b/Doc/howto/sorting.rst @@ -23,7 +23,7 @@ returns a new sorted list:: >>> sorted([5, 2, 3, 1, 4]) [1, 2, 3, 4, 5] -You can also use the :meth:`list.sort` method of a list. It modifies the list +You can also use the :meth:`list.sort` method. It modifies the list in-place (and returns *None* to avoid confusion). Usually it's less convenient than :func:`sorted` - but if you don't need the original list, it's slightly more efficient. @@ -87,9 +87,9 @@ Operator Module Functions ========================= The key-function patterns shown above are very common, so Python provides -convenience functions to make accessor functions easier and faster. The operator -module has :func:`operator.itemgetter`, :func:`operator.attrgetter`, and -an :func:`operator.methodcaller` function. +convenience functions to make accessor functions easier and faster. The +:mod:`operator` module has :func:`~operator.itemgetter`, +:func:`~operator.attrgetter`, and an :func:`~operator.methodcaller` function. Using those functions, the above examples become simpler and faster: @@ -247,6 +247,8 @@ To convert to a key function, just wrap the old comparison function: >>> sorted([5, 2, 4, 1, 3], key=cmp_to_key(reverse_numeric)) [5, 4, 3, 2, 1] +In Python 3.2, the :func:`functools.cmp_to_key` function was added to the +:mod:`functools` module in the standard library. Odd and Ends ============ @@ -254,7 +256,7 @@ Odd and Ends * For locale aware sorting, use :func:`locale.strxfrm` for a key function or :func:`locale.strcoll` for a comparison function. -* The *reverse* parameter still maintains sort stability (i.e. records with +* The *reverse* parameter still maintains sort stability (so that records with equal keys retain the original order). Interestingly, that effect can be simulated without the parameter by using the builtin :func:`reversed` function twice: diff --git a/Doc/howto/unicode.rst b/Doc/howto/unicode.rst index 13efa76..77fcd26 100644 --- a/Doc/howto/unicode.rst +++ b/Doc/howto/unicode.rst @@ -4,13 +4,11 @@ Unicode HOWTO ***************** -:Release: 1.11 +:Release: 1.12 -This HOWTO discusses Python 2.x's support for Unicode, and explains +This HOWTO discusses Python support for Unicode, and explains various problems that people commonly encounter when trying to work -with Unicode. (This HOWTO has not yet been updated to cover the 3.x -versions of Python.) - +with Unicode. Introduction to Unicode ======================= @@ -44,14 +42,14 @@ In the 1980s, almost all personal computers were 8-bit, meaning that bytes could hold values ranging from 0 to 255. ASCII codes only went up to 127, so some machines assigned values between 128 and 255 to accented characters. Different machines had different codes, however, which led to problems exchanging files. -Eventually various commonly used sets of values for the 128-255 range emerged. +Eventually various commonly used sets of values for the 128--255 range emerged. Some were true standards, defined by the International Standards Organization, and some were **de facto** conventions that were invented by one company or another and managed to catch on. 255 characters aren't very many. For example, you can't fit both the accented characters used in Western Europe and the Cyrillic alphabet used for Russian -into the 128-255 range because there are more than 127 such characters. +into the 128--255 range because there are more than 127 such characters. You could write files using different codes (all your Russian files in a coding system called KOI8, all your French files in a different coding system called @@ -64,8 +62,8 @@ bits means you have 2^16 = 65,536 distinct values available, making it possible to represent many different characters from many different alphabets; an initial goal was to have Unicode contain the alphabets for every single human language. It turns out that even 16 bits isn't enough to meet that goal, and the modern -Unicode specification uses a wider range of codes, 0-1,114,111 (0x10ffff in -base-16). +Unicode specification uses a wider range of codes, 0 through 1,114,111 (0x10ffff +in base 16). There's a related ISO standard, ISO 10646. Unicode and ISO 10646 were originally separate efforts, but the specifications were merged with the 1.1 @@ -90,7 +88,7 @@ meanings. The Unicode standard describes how characters are represented by **code points**. A code point is an integer value, usually denoted in base 16. In the standard, a code point is written using the notation U+12ca to mean the -character with value 0x12ca (4810 decimal). The Unicode standard contains a lot +character with value 0x12ca (4,810 decimal). The Unicode standard contains a lot of tables listing characters and their corresponding code points:: 0061 'a'; LATIN SMALL LETTER A @@ -117,10 +115,10 @@ Encodings --------- To summarize the previous section: a Unicode string is a sequence of code -points, which are numbers from 0 to 0x10ffff. This sequence needs to be -represented as a set of bytes (meaning, values from 0-255) in memory. The rules -for translating a Unicode string into a sequence of bytes are called an -**encoding**. +points, which are numbers from 0 through 0x10ffff (1,114,111 decimal). This +sequence needs to be represented as a set of bytes (meaning, values +from 0 through 255) in memory. The rules for translating a Unicode string +into a sequence of bytes are called an **encoding**. The first encoding you might think of is an array of 32-bit integers. In this representation, the string "Python" would look like this:: @@ -164,7 +162,7 @@ encoding, for example, are simple; for each code point: case.) Latin-1, also known as ISO-8859-1, is a similar encoding. Unicode code points -0-255 are identical to the Latin-1 values, so converting to this encoding simply +0--255 are identical to the Latin-1 values, so converting to this encoding simply requires converting code points to byte values; if a code point larger than 255 is encountered, the string can't be encoded into Latin-1. @@ -226,8 +224,8 @@ Wikipedia entries are often helpful; see the entries for "character encoding" <http://en.wikipedia.org/wiki/UTF-8>, for example. -Python 2.x's Unicode Support -============================ +Python's Unicode Support +======================== Now that you've learned the rudiments of Unicode, we can look at Python's Unicode features. @@ -284,10 +282,10 @@ that contains the corresponding code point. The reverse operation is the built-in :func:`ord` function that takes a one-character Unicode string and returns the code point value:: - >>> chr(40960) - '\ua000' - >>> ord('\ua000') - 40960 + >>> chr(57344) + '\ue000' + >>> ord('\ue000') + 57344 Converting to Bytes ------------------- @@ -329,7 +327,8 @@ Unicode Literals in Python Source Code In Python source code, specific Unicode code points can be written using the ``\u`` escape sequence, which is followed by four hex digits giving the code -point. The ``\U`` escape sequence is similar, but expects 8 hex digits, not 4:: +point. The ``\U`` escape sequence is similar, but expects eight hex digits, +not four:: >>> s = "a\xac\u1234\u20ac\U00008000" ^^^^ two-digit hex escape @@ -409,7 +408,7 @@ These are grouped into categories such as "Letter", "Number", "Punctuation", or from the above output, ``'Ll'`` means 'Letter, lowercase', ``'No'`` means "Number, other", ``'Mn'`` is "Mark, nonspacing", and ``'So'`` is "Symbol, other". See -<http://unicode.org/Public/5.1.0/ucd/UCD.html#General_Category_Values> for a +<http://www.unicode.org/reports/tr44/#General_Category_Values> for a list of category codes. References @@ -468,18 +467,17 @@ like those in string objects' :meth:`encode` and :meth:`decode` methods. Reading Unicode from a file is therefore simple:: - f = open('unicode.rst', encoding='utf-8') - for line in f: - print(repr(line)) + with open('unicode.rst', encoding='utf-8') as f: + for line in f: + print(repr(line)) It's also possible to open files in update mode, allowing both reading and writing:: - f = open('test', encoding='utf-8', mode='w+') - f.write('\u4500 blah blah blah\n') - f.seek(0) - print(repr(f.readline()[:1])) - f.close() + with open('test', encoding='utf-8', mode='w+') as f: + f.write('\u4500 blah blah blah\n') + f.seek(0) + print(repr(f.readline()[:1])) The Unicode character U+FEFF is used as a byte-order mark (BOM), and is often written as the first character of a file in order to assist with autodetection @@ -516,14 +514,13 @@ usually just provide the Unicode string as the filename, and it will be automatically converted to the right encoding for you:: filename = 'filename\u4500abc' - f = open(filename, 'w') - f.write('blah\n') - f.close() + with open(filename, 'w') as f: + f.write('blah\n') Functions in the :mod:`os` module such as :func:`os.stat` will also accept Unicode filenames. -:func:`os.listdir`, which returns filenames, raises an issue: should it return +Function :func:`os.listdir`, which returns filenames, raises an issue: should it return the Unicode version of filenames, or should it return byte strings containing the encoded versions? :func:`os.listdir` will do both, depending on whether you provided the directory path as a byte string or a Unicode string. If you pass a @@ -572,14 +569,6 @@ strings, you will find your program vulnerable to bugs wherever you combine the two different kinds of strings. There is no automatic encoding or decoding if you do e.g. ``str + bytes``, a :exc:`TypeError` is raised for this expression. -It's easy to miss such problems if you only test your software with data that -doesn't contain any accents; everything will seem to work, but there's actually -a bug in your program waiting for the first user who attempts to use characters -> 127. A second tip, therefore, is: - - Include characters > 127 and, even better, characters > 255 in your test - data. - When using data coming from a web browser or some other untrusted source, a common technique is to check for illegal characters in a string before using the string in a generated command line or storing it in a database. If you're doing @@ -597,8 +586,8 @@ this code:: if '/' in filename: raise ValueError("'/' not allowed in filenames") unicode_name = filename.decode(encoding) - f = open(unicode_name, 'r') - # ... return contents of file ... + with open(unicode_name, 'r') as f: + # ... return contents of file ... However, if an attacker could specify the ``'base64'`` encoding, they could pass ``'L2V0Yy9wYXNzd2Q='``, which is the base-64 encoded form of the string @@ -613,27 +602,30 @@ The PDF slides for Marc-André Lemburg's presentation "Writing Unicode-aware Applications in Python" are available at <http://downloads.egenix.com/python/LSM2005-Developing-Unicode-aware-applications-in-Python.pdf> and discuss questions of character encodings as well as how to internationalize -and localize an application. +and localize an application. These slides cover Python 2.x only. -Revision History and Acknowledgements -===================================== +Acknowledgements +================ Thanks to the following people who have noted errors or offered suggestions on this article: Nicholas Bastin, Marius Gedminas, Kent Johnson, Ken Krugler, Marc-André Lemburg, Martin von Löwis, Chad Whitacre. -Version 1.0: posted August 5 2005. +.. comment + Revision History + + Version 1.0: posted August 5 2005. -Version 1.01: posted August 7 2005. Corrects factual and markup errors; adds -several links. + Version 1.01: posted August 7 2005. Corrects factual and markup errors; adds + several links. -Version 1.02: posted August 16 2005. Corrects factual errors. + Version 1.02: posted August 16 2005. Corrects factual errors. -Version 1.1: Feb-Nov 2008. Updates the document with respect to Python 3 changes. + Version 1.1: Feb-Nov 2008. Updates the document with respect to Python 3 changes. -Version 1.11: posted June 20 2010. Notes that Python 3.x is not covered, -and that the HOWTO only covers 2.x. + Version 1.11: posted June 20 2010. Notes that Python 3.x is not covered, + and that the HOWTO only covers 2.x. .. comment Describe Python 3.x support (new section? new document?) .. comment Additional topic: building Python w/ UCS2 or UCS4 support diff --git a/Doc/howto/webservers.rst b/Doc/howto/webservers.rst index caf0ad6..c4ac2b2 100644 --- a/Doc/howto/webservers.rst +++ b/Doc/howto/webservers.rst @@ -293,7 +293,7 @@ following WSGI-application:: # -*- coding: UTF-8 -*- import sys, os - from cgi import escape + from html import escape from flup.server.fcgi import WSGIServer def app(environ, start_response): |