author | Raymond Hettinger <python@rcn.com> | 2004-05-26 13:52:59 (GMT) |
---|---|---|
committer | Raymond Hettinger <python@rcn.com> | 2004-05-26 13:52:59 (GMT) |
commit | 846865bba63d65e3f25331bc8349233b6961f538 (patch) | |
tree | eff35acfcdf7c9ba824d9e2c87e18ebd60923f09 | |
parent | a8aebcedf94192c80d95ca3ce3501c8481f9e41b (diff) | |
Add a further tour of the standard library.
-rw-r--r-- | Doc/tut/tut.tex | 290 |
1 files changed, 290 insertions, 0 deletions
diff --git a/Doc/tut/tut.tex b/Doc/tut/tut.tex
index b635aef..86ba0c3 100644
--- a/Doc/tut/tut.tex
+++ b/Doc/tut/tut.tex
@@ -4763,6 +4763,296 @@ data interchange between python applications and other tools.
+\chapter{Brief Tour of the Standard Library -- Part II\label{briefTourTwo}}
+
+
+\section{Output Formatting\label{output-formatting}}
+
+The \ulink{\module{repr}}{../lib/module-repr.html} module provides a
+version of \function{repr()} for abbreviated displays of large or deeply
+nested containers:
+
+\begin{verbatim}
+    >>> import repr
+    >>> repr.repr(set('supercalifragilisticexpialidocious'))
+    "set(['a', 'c', 'd', 'e', 'f', 'g', ...])"
+\end{verbatim}
+
+The \ulink{\module{pprint}}{../lib/module-pprint.html} module offers
+more sophisticated control over printing both built-in and user defined
+objects in a way that is readable by the interpreter. When the result
+is longer than one line, the ``pretty printer'' adds line breaks and
+indentation to more clearly reveal data structure:
+
+\begin{verbatim}
+    >>> import pprint
+    >>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
+    ...     'yellow'], 'blue']]]
+    ...
+    >>> pprint.pprint(t, width=30)
+    [[[['black', 'cyan'],
+       'white',
+       ['green', 'red']],
+      [['magenta', 'yellow'],
+       'blue']]]
+\end{verbatim}
+
+The \ulink{\module{textwrap}}{../lib/module-textwrap.html} module
+formats paragraphs of text to fit a given screen width:
+
+\begin{verbatim}
+    >>> import textwrap
+    >>> doc = """The wrap() method is just like fill() except that it returns
+    ... a list of strings instead of one big string with newlines to separate
+    ... the wrapped lines."""
+    ...
+    >>> print textwrap.fill(doc, width=40)
+    The wrap() method is just like fill()
+    except that it returns a list of strings
+    instead of one big string with newlines
+    to separate the wrapped lines.
+\end{verbatim}
+
+The \ulink{\module{locale}}{../lib/module-locale.html} module accesses
+a database of culture specific data formats. The grouping attribute
+of locale's format function provides a direct way of formatting numbers
+with group separators:
+
+\begin{verbatim}
+    >>> import locale
+    >>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
+    'English_United States.1252'
+    >>> conv = locale.localeconv()          # get a mapping of conventions
+    >>> x = 1234567.8
+    >>> locale.format("%d", x, grouping=True)
+    '1,234,567'
+    >>> locale.format("%s%.*f", (conv['currency_symbol'],
+    ...               conv['int_frac_digits'], x), grouping=True)
+    '$1,234,567.80'
+\end{verbatim}
+
+
+\section{Working with Binary Data Record Layouts\label{binary-formats}}
+
+The \ulink{\module{struct}}{../lib/module-struct.html} module provides
+\function{pack()} and \function{unpack()} functions for working with
+variable length binary record formats. The following example shows how
+to loop through header information in a ZIP file (with pack codes
+\code{"H"} and \code{"L"} representing two and four byte unsigned
+numbers respectively):
+
+\begin{verbatim}
+    import struct
+
+    data = open('myfile.zip', 'rb').read()
+    start = 0
+    for i in range(3):                      # show the first 3 file headers
+        start += 14
+        fields = struct.unpack('LLLHH', data[start:start+16])
+        crc32, comp_size, uncomp_size, filenamesize, extra_size = fields
+
+        start += 16
+        filename = data[start:start+filenamesize]
+        start += filenamesize
+        extra = data[start:start+extra_size]
+        print filename, hex(crc32), comp_size, uncomp_size
+
+        start += extra_size + comp_size     # skip to the next header
+\end{verbatim}
+
+
+\section{Multi-threading\label{multi-threading}}
+
+Threading is a technique for decoupling tasks which are not sequentially
+dependent. Python threads are driven by the operating system and run
+in a single process and share memory space in a single interpreter.
+
+Threads can be used to improve the responsiveness of applications that
+accept user input while other tasks run in the background. The
+following code shows how the high level
+\ulink{\module{threading}}{../lib/module-threading.html} module can run
+tasks in background while the main program continues to run:
+
+\begin{verbatim}
+    import threading, zipfile
+
+    class AsyncZip(threading.Thread):
+        def __init__(self, infile, outfile):
+            threading.Thread.__init__(self)
+            self.infile = infile
+            self.outfile = outfile
+        def run(self):
+            f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
+            f.write(self.infile)
+            f.close()
+            print 'Finished background zip of: ', self.infile
+
+    AsyncZip('mydata.txt', 'myarchive.zip').start()
+    print 'The main program continues to run'
+\end{verbatim}
+
+The principal challenge of multi-thread applications is coordinating
+threads that share data or other resources. To that end, the threading
+module provides a number of synchronization primitives including locks,
+events, condition variables, and semaphores.
+
+While those tools are powerful, minor design errors can result in
+problems that are difficult to reproduce. A simpler and more robust
+approach to task coordination is concentrating all access to a resource
+in a single thread and then using the
+\ulink{\module{Queue}}{../lib/module-Queue.html} module to feed that
+thread with requests from other threads. Applications that use
+\class{Queue} objects for inter-thread communication and coordination
+tend to be easier to design, more readable, and more reliable.
+
+
+\section{Logging\label{logging}}
+
+The \ulink{\module{logging}}{../lib/module-logging.html} module offers
+a full featured and flexible logging system. At its simplest, log
+messages are sent to a file or to \code{sys.stderr}:
+
+\begin{verbatim}
+    import logging
+    logging.debug('Debugging information')
+    logging.info('Informational message')
+    logging.warning('Warning:config file %s not found', 'server.conf')
+    logging.error('Error occurred')
+    logging.critical('Critical error -- shutting down')
+\end{verbatim}
+
+This produces the following output:
+
+\begin{verbatim}
+    WARNING:root:Warning:config file server.conf not found
+    ERROR:root:Error occurred
+    CRITICAL:root:Critical error -- shutting down
+\end{verbatim}
+
+By default, informational and debugging messages are suppressed and the
+output is sent to standard error. Other output options include routing
+messages through email, datagrams, sockets, or to an HTTP Server. New
+filters select different routing based on message priority: DEBUG,
+INFO, WARNING, ERROR, and CRITICAL.
+
+The logging system can be configured directly from Python or can be
+loaded from a user editable configuration file for customized logging
+without altering the application.
+
+
+\section{Weak References\label{weak-references}}
+
+Python does automatic memory management (reference counting for most
+objects and garbage collection to eliminate cycles). The memory is
+freed shortly after the last reference to it has been eliminated.
+
+This approach works fine for most applications but occasionally there
+is a need to track objects only as long as they are being used by
+something else. Unfortunately, just tracking them creates a reference
+that makes them permanent. The
+\ulink{\module{weakref}}{../lib/module-weakref.html} module provides
+tools for tracking objects without creating a reference. When the
+object is no longer needed, it is automatically removed from a weakref
+table and a callback is triggered for weakref objects. Typical
+applications include caching objects that are expensive to create:
+
+\begin{verbatim}
+    >>> import weakref, gc
+    >>> class A:
+    ...     def __init__(self, value):
+    ...             self.value = value
+    ...     def __repr__(self):
+    ...             return str(self.value)
+    ...
+    >>> a = A(10)                   # create a reference
+    >>> d = weakref.WeakValueDictionary()
+    >>> d['primary'] = a            # does not create a reference
+    >>> d['primary']                # fetch the object if it is still alive
+    10
+    >>> del a                       # remove the one reference
+    >>> gc.collect()                # run garbage collection right away
+    0
+    >>> d['primary']                # entry was automatically removed
+    Traceback (most recent call last):
+      File "<pyshell#108>", line 1, in -toplevel-
+        d['primary']                # entry was automatically removed
+      File "C:/PY24/lib/weakref.py", line 46, in __getitem__
+        o = self.data[key]()
+    KeyError: 'primary'
+\end{verbatim}
+
+\section{Tools for Working with Lists\label{list-tools}}
+
+Many data structure needs can be met with the built-in list type.
+However, sometimes there is a need for alternative implementations
+with different performance trade-offs.
+
+The \ulink{\module{array}}{../lib/module-array.html} module provides an
+\class{array()} object that is like a list that stores only homogeneous
+data but stores it more compactly. The following example shows an array
+of numbers stored as two byte unsigned binary numbers (typecode
+\code{"H"}) rather than the usual 16 bytes per entry for regular lists
+of python int objects:
+
+\begin{verbatim}
+    >>> from array import array
+    >>> a = array('H', [4000, 10, 700, 22222])
+    >>> sum(a)
+    26932
+    >>> a[1:3]
+    array('H', [10, 700])
+\end{verbatim}
+
+The \ulink{\module{collections}}{../lib/module-collections.html} module
+provides a \class{deque()} object that is like a list with faster
+appends and pops from the left side but slower lookups in the middle.
+These objects are well suited for implementing queues and breadth first
+tree searches:
+
+\begin{verbatim}
+    >>> from collections import deque
+    >>> d = deque(["task1", "task2", "task3"])
+    >>> d.append("task4")
+    >>> print "Handling", d.popleft()
+    Handling task1
+
+    unsearched = deque([starting_node])
+    def breadth_first_search(unsearched):
+        node = unsearched.popleft()
+        for m in gen_moves(node):
+            if is_goal(m):
+                return m
+            unsearched.append(m)
+\end{verbatim}
+
+In addition to alternative list implementations, the library also offers
+other tools such as the \ulink{\module{bisect}}{../lib/module-bisect.html}
+module with functions for manipulating sorted lists:
+
+\begin{verbatim}
+    >>> import bisect
+    >>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
+    >>> bisect.insort(scores, (300, 'ruby'))
+    >>> scores
+    [(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]
+\end{verbatim}
+
+The \ulink{\module{heapq}}{../lib/module-heapq.html} module provides
+functions for implementing heaps based on regular lists. The lowest
+valued entry is always kept at position zero. This is useful for
+applications which repeatedly access the smallest element but do not
+want to run a full list sort:
+
+\begin{verbatim}
+    >>> from heapq import heapify, heappop, heappush
+    >>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
+    >>> heapify(data)                      # rearrange the list into heap order
+    >>> heappush(data, -5)                 # add a new entry
+    >>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
+    [-5, 0, 1]
+\end{verbatim}
+
+
 \chapter{What Now? \label{whatNow}}
 Reading this tutorial has probably reinforced your interest in using
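Note on the Multi-threading section of this patch: it recommends concentrating all access to a resource in a single thread and feeding that thread through a queue, but gives no code for it. The following is a minimal sketch of that pattern, not part of the commit. It assumes Python 3, where the module is named `queue` (the patch targets Python 2.4, where it is `Queue`), and the names `run_logger`, `demo`, and `sink` are hypothetical.

```python
import queue
import threading


def run_logger(requests, sink):
    """Worker loop: the only code that ever touches `sink`."""
    while True:
        item = requests.get()          # blocks until a request arrives
        if item is None:               # sentinel value: no more requests
            break
        sink.append(item)              # the shared resource, single-threaded


def demo(num_producers=4, per_producer=25):
    """Run several producer threads that feed one worker through a queue."""
    requests = queue.Queue()
    sink = []                          # stands in for a file, socket, etc.

    worker = threading.Thread(target=run_logger, args=(requests, sink))
    worker.start()

    def produce(producer_id):
        # Producers never touch `sink`; they only enqueue requests.
        for n in range(per_producer):
            requests.put((producer_id, n))

    producers = [threading.Thread(target=produce, args=(i,))
                 for i in range(num_producers)]
    for t in producers:
        t.start()
    for t in producers:
        t.join()

    requests.put(None)                 # tell the worker to stop
    worker.join()
    return sink
```

Calling `demo()` returns every request exactly once. Requests from different producers may interleave, but each producer's own requests stay in order because the queue is first-in, first-out. No explicit lock is needed around `sink` since only the worker thread modifies it.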