summaryrefslogtreecommitdiffstats
path: root/Doc/lib/libasyncore.tex
blob: 05528965bf0a7f9ce17b46dd8072f97c86fe3819 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
\section{\module{asyncore} ---
         Asynchronous socket handler}

\declaremodule{builtin}{asyncore}
\modulesynopsis{A base class for developing asynchronous socket 
                handling services.}
\moduleauthor{Sam Rushing}{rushing@nightmare.com}
\sectionauthor{Christopher Petrilli}{petrilli@amber.org}
\sectionauthor{Steve Holden}{sholden@holdenweb.com}
% Heavily adapted from original documentation by Sam Rushing.

This module provides the basic infrastructure for writing asynchronous 
socket service clients and servers.

There are only two ways to have a program on a single processor do 
``more than one thing at a time.'' Multi-threaded programming is the 
simplest and most popular way to do it, but there is another very 
different technique, that lets you have nearly all the advantages of 
multi-threading, without actually using multiple threads.  It's really 
only practical if your program is largely I/O bound.  If your program 
is processor bound, then pre-emptive scheduled threads are probably what 
you really need. Network servers are rarely processor bound, however.

If your operating system supports the \cfunction{select()} system call 
in its I/O library (and nearly all do), then you can use it to juggle 
multiple communication channels at once; doing other work while your 
I/O is taking place in the ``background.''  Although this strategy can 
seem strange and complex, especially at first, it is in many ways 
easier to understand and control than multi-threaded programming.  
The \module{asyncore} module solves many of the difficult problems for 
you, making the task of building sophisticated high-performance 
network servers and clients a snap. For ``conversational'' applications
and protocols the companion  \refmodule{asynchat} module is invaluable.

The basic idea behind both modules is to create one or more network
\emph{channels}, instances of class \class{asyncore.dispatcher} and
\class{asynchat.async_chat}. Creating the channels adds them to a global
map, used by the \function{loop()} function if you do not provide it
with your own \var{map}.

Once the initial channel(s) is(are) created, calling the \function{loop()}
function activates channel service, which continues until the last
channel (including any that have been added to the map during asynchronous
service) is closed.

\begin{funcdesc}{loop}{\optional{timeout\optional{, use_poll\optional{,
                       map\optional{,count}}}}}
  Enter a polling loop that terminates after count passes or all open
  channels have been closed.  All arguments are optional.  The \var{count}
  parameter defaults to None, resulting in the loop terminating only
  when all channels have been closed.  The \var{timeout} argument sets the
  timeout parameter for the appropriate \function{select()} or
  \function{poll()} call, measured in seconds; the default is 30 seconds.
  The \var{use_poll} parameter, if true, indicates that \function{poll()}
  should be used in preference to \function{select()} (the default is
  \code{False}).  

  The \var{map} parameter is a dictionary whose items are
  the channels to watch.  As channels are closed they are deleted from their
  map.  If \var{map} is omitted, a global map is used.
  Channels (instances of \class{asyncore.dispatcher}, \class{asynchat.async_chat}
  and subclasses thereof) can freely be mixed in the map.
\end{funcdesc}

\begin{classdesc}{dispatcher}{}
  The \class{dispatcher} class is a thin wrapper around a low-level socket object.
  To make it more useful, it has a few methods for event-handling  which are called
  from the asynchronous loop.  
  Otherwise, it can be treated as a normal non-blocking socket object.

  Two class attributes can be modified, to improve performance,
  or possibly even to conserve memory.

  \begin{datadesc}{ac_in_buffer_size}
  The asynchronous input buffer size (default \code{4096}).
  \end{datadesc}

  \begin{datadesc}{ac_out_buffer_size}
  The asynchronous output buffer size (default \code{4096}).
  \end{datadesc}

  The firing of low-level events at certain times or in certain connection
  states tells the asynchronous loop that certain higher-level events have
  taken place. For example, if we have asked for a socket to connect to
  another host, we know that the connection has been made when the socket
  becomes writable for the first time (at this point you know that you may
  write to it with the expectation of success). The implied higher-level
  events are:

  \begin{tableii}{l|l}{code}{Event}{Description}
    \lineii{handle_connect()}{Implied by the first write event}
    \lineii{handle_close()}{Implied by a read event with no data available}
    \lineii{handle_accept()}{Implied by a read event on a listening socket}
  \end{tableii}

  During asynchronous processing, each mapped channel's \method{readable()}
  and \method{writable()} methods are used to determine whether the channel's
  socket should be added to the list of channels \cfunction{select()}ed or
  \cfunction{poll()}ed for read and write events.

\end{classdesc}

Thus, the set of channel events is larger than the basic socket events.
The full set of methods that can be overridden in your subclass follows:

\begin{methoddesc}{handle_read}{}
  Called when the asynchronous loop detects that a \method{read()}
  call on the channel's socket will succeed.
\end{methoddesc}

\begin{methoddesc}{handle_write}{}
  Called when the asynchronous loop detects that a writable socket
  can be written.  
  Often this method will implement the necessary buffering for 
  performance.  For example:

\begin{verbatim}
def handle_write(self):
    sent = self.send(self.buffer)
    self.buffer = self.buffer[sent:]
\end{verbatim}
\end{methoddesc}

\begin{methoddesc}{handle_expt}{}
  Called when there is out of band (OOB) data for a socket 
  connection.  This will almost never happen, as OOB is 
  tenuously supported and rarely used.
\end{methoddesc}

\begin{methoddesc}{handle_connect}{}
  Called when the active opener's socket actually makes a connection.
  Might send a ``welcome'' banner, or initiate a protocol
  negotiation with the remote endpoint, for example.
\end{methoddesc}

\begin{methoddesc}{handle_close}{}
  Called when the socket is closed.
\end{methoddesc}

\begin{methoddesc}{handle_error}{}
  Called when an exception is raised and not otherwise handled.  The default
  version prints a condensed traceback.
\end{methoddesc}

\begin{methoddesc}{handle_accept}{}
  Called on listening channels (passive openers) when a  
  connection can be established with a new remote endpoint that
  has issued a \method{connect()} call for the local endpoint.
\end{methoddesc}

\begin{methoddesc}{readable}{}
  Called each time around the asynchronous loop to determine whether a
  channel's socket should be added to the list on which read events can
  occur.  The default method simply returns \code{True}, 
  indicating that by default, all channels will be interested in
  read events.
\end{methoddesc}

\begin{methoddesc}{writable}{}
  Called each time around the asynchronous loop to determine whether a
  channel's socket should be added to the list on which write events can
  occur.  The default method simply returns \code{True}, 
  indicating that by default, all channels will be interested in
  write events.
\end{methoddesc}

In addition, each channel delegates or extends many of the socket methods.
Most of these are nearly identical to their socket partners.

\begin{methoddesc}{create_socket}{family, type}
  This is identical to the creation of a normal socket, and 
  will use the same options for creation.  Refer to the
  \refmodule{socket} documentation for information on creating
  sockets.
\end{methoddesc}

\begin{methoddesc}{connect}{address}
  As with the normal socket object, \var{address} is a 
  tuple with the first element the host to connect to, and the 
  second the port number.
\end{methoddesc}

\begin{methoddesc}{send}{data}
  Send \var{data} to the remote end-point of the socket.
\end{methoddesc}

\begin{methoddesc}{recv}{buffer_size}
  Read at most \var{buffer_size} bytes from the socket's remote end-point.
  An empty string implies that the channel has been closed from the other
  end.
\end{methoddesc}

\begin{methoddesc}{listen}{backlog}
  Listen for connections made to the socket.  The \var{backlog}
  argument specifies the maximum number of queued connections
  and should be at least 1; the maximum value is
  system-dependent (usually 5).
\end{methoddesc}

\begin{methoddesc}{bind}{address}
  Bind the socket to \var{address}.  The socket must not already be
  bound.  (The format of \var{address} depends on the address family
  --- see above.)  To mark the socket as re-usable (setting the
  \constant{SO_REUSEADDR} option), call the \class{dispatcher}
  object's \method{set_reuse_addr()} method.
\end{methoddesc}

\begin{methoddesc}{accept}{}
  Accept a connection.  The socket must be bound to an address
  and listening for connections.  The return value is a pair
  \code{(\var{conn}, \var{address})} where \var{conn} is a
  \emph{new} socket object usable to send and receive data on
  the connection, and \var{address} is the address bound to the
  socket on the other end of the connection.
\end{methoddesc}

\begin{methoddesc}{close}{}
  Close the socket.  All future operations on the socket object
  will fail.  The remote end-point will receive no more data (after
  queued data is flushed).  Sockets are automatically closed
  when they are garbage-collected.
\end{methoddesc}


\subsection{asyncore Example basic HTTP client \label{asyncore-example}}

Here is a very basic HTTP client that uses the \class{dispatcher}
class to implement its socket handling:

\begin{verbatim}
import asyncore, socket

class http_client(asyncore.dispatcher):

    def __init__(self, host, path):
        asyncore.dispatcher.__init__(self)
        self.create_socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connect( (host, 80) )
        self.buffer = 'GET %s HTTP/1.0\r\n\r\n' % path

    def handle_connect(self):
        pass

    def handle_close(self):
        self.close()

    def handle_read(self):
        print self.recv(8192)

    def writable(self):
        return (len(self.buffer) > 0)

    def handle_write(self):
        sent = self.send(self.buffer)
        self.buffer = self.buffer[sent:]

c = http_client('www.python.org', '/')

asyncore.loop()
\end{verbatim}