\section{\module{socket} --- Low-level networking interface} \declaremodule{builtin}{socket} \modulesynopsis{Low-level networking interface.} This module provides access to the BSD \emph{socket} interface. It is available on all modern \UNIX{} systems, Windows, MacOS, BeOS, OS/2, and probably additional platforms. For an introduction to socket programming (in C), see the following papers: \citetitle{An Introductory 4.3BSD Interprocess Communication Tutorial}, by Stuart Sechrest and \citetitle{An Advanced 4.3BSD Interprocess Communication Tutorial}, by Samuel J. Leffler et al, both in the \citetitle{\UNIX{} Programmer's Manual, Supplementary Documents 1} (sections PS1:7 and PS1:8). The platform-specific reference material for the various socket-related system calls are also a valuable source of information on the details of socket semantics. For \UNIX, refer to the manual pages; for Windows, see the WinSock (or Winsock 2) specification. For IPv6-ready APIs, readers may want to refer to RFC2553 titled \cite{Basic Socket Interface Extensions for IPv6}. The Python interface is a straightforward transliteration of the \UNIX{} system call and library interface for sockets to Python's object-oriented style: the \function{socket()} function returns a \dfn{socket object}\obindex{socket} whose methods implement the various socket system calls. Parameter types are somewhat higher-level than in the C interface: as with \method{read()} and \method{write()} operations on Python files, buffer allocation on receive operations is automatic, and buffer length is implicit on send operations. Socket addresses are represented as follows: A single string is used for the \constant{AF_UNIX} address family. A pair \code{(\var{host}, \var{port})} is used for the \constant{AF_INET} address family, where \var{host} is a string representing either a hostname in Internet domain notation like \code{'daring.cwi.nl'} or an IPv4 address like \code{'100.50.200.5'}, and \var{port} is an integral port number. For \constant{AF_INET6} address family, a four-tuple \code{(\var{host}, \var{port}, \var{flowinfo}, \var{scopeid})} is used, where \var{flowinfo} and \var{scopeid} represents \code{sin6_flowinfo} and \code{sin6_scope_id} member in \constant{struct sockaddr_in6} in C. For \module{socket} module methods, \var{flowinfo} and \var{scopeid} can be omitted just for backward compatibility. Note, however, omission of \var{scopeid} can cause problems in manipulating scoped IPv6 addresses. Other address families are currently not supported. The address format required by a particular socket object is automatically selected based on the address family specified when the socket object was created. For IPv4 addresses, two special forms are accepted instead of a host address: the empty string represents \constant{INADDR_ANY}, and the string \code{''} represents \constant{INADDR_BROADCAST}. The behavior is not available for IPv6 for backward compatibility, therefore, you may want to avoid these if you intend to support IPv6 with your Python programs. If you use a hostname in the \var{host} portion of IPv4/v6 socket address, the program may show a nondeterministic behavior, as Python uses the first address returned from the DNS resolution. The socket address will be resolved differently into an actual IPv4/v6 address, depending on the results from DNS resolution and/or the host configuration. For deterministic behavior use a numeric address in \var{host} portion. All errors raise exceptions. The normal exceptions for invalid argument types and out-of-memory conditions can be raised; errors related to socket or address semantics raise the error \exception{socket.error}. Non-blocking mode is supported through the \method{setblocking()} method. The module \module{socket} exports the following constants and functions: \begin{excdesc}{error} This exception is raised for socket-related errors. The accompanying value is either a string telling what went wrong or a pair \code{(\var{errno}, \var{string})} representing an error returned by a system call, similar to the value accompanying \exception{os.error}. See the module \refmodule{errno}\refbimodindex{errno}, which contains names for the error codes defined by the underlying operating system. \end{excdesc} \begin{excdesc}{herror} This exception is raised for address-related errors, i.e. for functions that use \var{h_errno} in C API, including \function{gethostbyname_ex} and \function{gethostbyaddr}. The accompanying value is a pair \code{(\var{h_errno}, \var{string})} representing an error returned by a library call. \var{string} represents the description of \var{h_errno}, as returned by \cfunction{hstrerror} C API. \end{excdesc} \begin{excdesc}{gaierror} This exception is raised for address-related errors, for \function{getaddrinfo} and \function{getnameinfo}. The accompanying value is a pair \code{(\var{error}, \var{string})} representing an error returned by a library call. \var{string} represents the description of \var{error}, as returned by \cfunction{gai_strerror} C API. \end{excdesc} \begin{datadesc}{AF_UNIX} \dataline{AF_INET} \dataline{AF_INET6} These constants represent the address (and protocol) families, used for the first argument to \function{socket()}. If the \constant{AF_UNIX} constant is not defined then this protocol is unsupported. \end{datadesc} \begin{datadesc}{SOCK_STREAM} \dataline{SOCK_DGRAM} \dataline{SOCK_RAW} \dataline{SOCK_RDM} \dataline{SOCK_SEQPACKET} These constants represent the socket types, used for the second argument to \function{socket()}. (Only \constant{SOCK_STREAM} and \constant{SOCK_DGRAM} appear to be generally useful.) \end{datadesc} \begin{datadesc}{SO_*} \dataline{SOMAXCONN} \dataline{MSG_*} \dataline{SOL_*} \dataline{IPPROTO_*} \dataline{IPPORT_*} \dataline{INADDR_*} \dataline{IP_*} \dataline{IPV6_*} \dataline{EAI_*} \dataline{AI_*} \dataline{NI_*} Many constants of these forms, documented in the \UNIX{} documentation on sockets and/or the IP protocol, are also defined in the socket module. They are generally used in arguments to the \method{setsockopt()} and \method{getsockopt()} methods of socket objects. In most cases, only those symbols that are defined in the \UNIX{} header files are defined; for a few symbols, default values are provided. \end{datadesc} \begin{funcdesc}{getaddrinfo}{host, port\optional{, family, socktype, proto, flags}} Resolves the \var{host}/\var{port} argument, into a sequence of 5-tuples that contain all the necessary argument for the sockets manipulation. \var{host} is a domain name, a string representation of IPv4/v6 address or \code{None}. \var{port} is a string service name (like \code{``http''}), a numeric port number or \code{None}. The rest of the arguments are optional and must be numeric if specified. For \var{host} and \var{port}, by passing either an empty string or \code{None}, you can pass \code{NULL} to the C API. The \function{getaddrinfo()} function returns a list of 5-tuples with the following structure: \code{(\var{family}, \var{socktype}, \var{proto}, \var{canonname}, \var{sockaddr})}. \var{family}, \var{socktype}, \var{proto} are all integer and are meant to be passed to the \function{socket()} function. \var{canonname} is a string representing the canonical name of the \var{host}. It can be a numeric IPv4/v6 address when \code{AI_CANONNAME} is specified for a numeric \var{host}. \var{sockaddr} is a tuple describing a socket address, as described above. See \code{Lib/httplib.py} and other library files for a typical usage of the function. \versionadded{2.2} \end{funcdesc} \begin{funcdesc}{getfqdn}{\optional{name}} Return a fully qualified domain name for \var{name}. If \var{name} is omitted or empty, it is interpreted as the local host. To find the fully qualified name, the hostname returned by \function{gethostbyaddr()} is checked, then aliases for the host, if available. The first name which includes a period is selected. In case no fully qualified domain name is available, the hostname is returned. \versionadded{2.0} \end{funcdesc} \begin{funcdesc}{gethostbyname}{hostname} Translate a host name to IPv4 address format. The IPv4 address is returned as a string, e.g., \code{'100.50.200.5'}. If the host name is an IPv4 address itself it is returned unchanged. See \function{gethostbyname_ex()} for a more complete interface. \function{gethostbyname()} does not support IPv6 name resolution, and \function{getaddrinfo()} should be used instead for IPv4/v6 dual stack support. \end{funcdesc} \begin{funcdesc}{gethostbyname_ex}{hostname} Translate a host name to IPv4 address format, extended interface. Return a triple \code{(hostname, aliaslist, ipaddrlist)} where \code{hostname} is the primary host name responding to the given \var{ip_address}, \code{aliaslist} is a (possibly empty) list of alternative host names for the same address, and \code{ipaddrlist} is a list of IPv4 addresses for the same interface on the same host (often but not always a single address). \function{gethostbyname_ex()} does not support IPv6 name resolution, and \function{getaddrinfo()} should be used instead for IPv4/v6 dual stack support. \end{funcdesc} \begin{funcdesc}{gethostname}{} Return a string containing the hostname of the machine where the Python interpreter is currently executing. If you want to know the current machine's IP address, you may want to use \code{gethostbyname(gethostname())}. This operation assumes that there is a valid address-to-host mapping for the host, and the assumption does not always hold. Note: \function{gethostname()} doesn't always return the fully qualified domain name; use \code{gethostbyaddr(gethostname())} (see below). \end{funcdesc} \begin{funcdesc}{gethostbyaddr}{ip_address} Return a triple \code{(\var{hostname}, \var{aliaslist}, \var{ipaddrlist})} where \var{hostname} is the primary host name responding to the given \var{ip_address}, \var{aliaslist} is a (possibly empty) list of alternative host names for the same address, and \var{ipaddrlist} is a list of IPv4/v6 addresses for the same interface on the same host (most likely containing only a single address). To find the fully qualified domain name, use the function \function{getfqdn()}. \function{gethostbyaddr} supports both IPv4 and IPv6. \end{funcdesc} \begin{funcdesc}{getnameinfo}{sockaddr, flags} Translate a socket address \var{sockaddr} into a 2-tuple \code{(\var{host}, \var{port})}. Depending on the settings of \var{flags}, the result can contain a fully-qualified domain name or numeric address representation in \var{host}. Similarly, \var{port} can contain a string port name or a numeric port number. \versionadded{2.2} \end{funcdesc} \begin{funcdesc}{getprotobyname}{protocolname} Translate an Internet protocol name (e.g.\ \code{'icmp'}) to a constant suitable for passing as the (optional) third argument to the \function{socket()} function. This is usually only needed for sockets opened in ``raw'' mode (\constant{SOCK_RAW}); for the normal socket modes, the correct protocol is chosen automatically if the protocol is omitted or zero. \end{funcdesc} \begin{funcdesc}{getservbyname}{servicename, protocolname} Translate an Internet service name and protocol name to a port number for that service. The protocol name should be \code{'tcp'} or \code{'udp'}. \end{funcdesc} \begin{funcdesc}{socket}{family, type\optional{, proto}} Create a new socket using the given address family, socket type and protocol number. The address family should be \constant{AF_INET}, \constant{AF_INET6} or \constant{AF_UNIX}. The socket type should be \constant{SOCK_STREAM}, \constant{SOCK_DGRAM} or perhaps one of the other \samp{SOCK_} constants. The protocol number is usually zero and may be omitted in that case. \end{funcdesc} \begin{funcdesc}{fromfd}{fd, family, type\optional{, proto}} Build a socket object from an existing file descriptor (an integer as returned by a file object's \method{fileno()} method). Address family, socket type and protocol number are as for the \function{socket()} function above. The file descriptor should refer to a socket, but this is not checked --- subsequent operations on the object may fail if the file descriptor is invalid. This function is rarely needed, but can be used to get or set socket options on a socket passed to a program as standard input or output (e.g.\ a server started by the \UNIX{} inet daemon). \end{funcdesc} \begin{funcdesc}{ntohl}{x} Convert 32-bit integers from network to host byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 4-byte swap operation. \end{funcdesc} \begin{funcdesc}{ntohs}{x} Convert 16-bit integers from network to host byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 2-byte swap operation. \end{funcdesc} \begin{funcdesc}{htonl}{x} Convert 32-bit integers from host to network byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 4-byte swap operation. \end{funcdesc} \begin{funcdesc}{htons}{x} Convert 16-bit integers from host to network byte order. On machines where the host byte order is the same as network byte order, this is a no-op; otherwise, it performs a 2-byte swap operation. \end{funcdesc} \begin{funcdesc}{inet_aton}{ip_string} Convert an IPv4 address from dotted-quad string format (e.g.\ '123.45.67.89') to 32-bit packed binary format, as a string four characters in length. Useful when conversing with a program that uses the standard C library and needs objects of type \ctype{struct in_addr}, which is the C type for the 32-bit packed binary this function returns. If the IPv4 address string passed to this function is invalid, \exception{socket.error} will be raised. Note that exactly what is valid depends on the underlying C implementation of \cfunction{inet_aton()}. \function{inet_aton} does not support IPv6, and \function{getnameinfo()} should be used instead for IPv4/v6 dual stack support. \end{funcdesc} \begin{funcdesc}{inet_ntoa}{packed_ip} Convert a 32-bit packed IPv4 address (a string four characters in length) to its standard dotted-quad string representation (e.g. '123.45.67.89'). Useful when conversing with a program that uses the standard C library and needs objects of type \ctype{struct in_addr}, which is the C type for the 32-bit packed binary this function takes as an argument. If the string passed to this function is not exactly 4 bytes in length, \exception{socket.error} will be raised. \function{inet_ntoa} does not support IPv6, and \function{getnameinfo()} should be used instead for IPv4/v6 dual stack support. \end{funcdesc} \begin{datadesc}{SocketType} This is a Python type object that represents the socket object type. It is the same as \code{type(socket(...))}. \end{datadesc} \begin{seealso} \seemodule{SocketServer}{Classes that simplify writing network servers.} \end{seealso} \subsection{Socket Objects \label{socket-objects}} Socket objects have the following methods. Except for \method{makefile()} these correspond to \UNIX{} system calls applicable to sockets. \begin{methoddesc}[socket]{accept}{} Accept a connection. The socket must be bound to an address and listening for connections. The return value is a pair \code{(\var{conn}, \var{address})} where \var{conn} is a \emph{new} socket object usable to send and receive data on the connection, and \var{address} is the address bound to the socket on the other end of the connection. \end{methoddesc} \begin{methoddesc}[socket]{bind}{address} Bind the socket to \var{address}. The socket must not already be bound. (The format of \var{address} depends on the address family --- see above.) \strong{Note:} This method has historically accepted a pair of parameters for \constant{AF_INET} addresses instead of only a tuple. This was never intentional and is no longer be available in Python 2.0. \end{methoddesc} \begin{methoddesc}[socket]{close}{} Close the socket. All future operations on the socket object will fail. The remote end will receive no more data (after queued data is flushed). Sockets are automatically closed when they are garbage-collected. \end{methoddesc} \begin{methoddesc}[socket]{connect}{address} Connect to a remote socket at \var{address}. (The format of \var{address} depends on the address family --- see above.) \strong{Note:} This method has historically accepted a pair of parameters for \constant{AF_INET} addresses instead of only a tuple. This was never intentional and is no longer available in Python 2.0 and later. \end{methoddesc} \begin{methoddesc}[socket]{connect_ex}{address} Like \code{connect(\var{address})}, but return an error indicator instead of raising an exception for errors returned by the C-level \cfunction{connect()} call (other problems, such as ``host not found,'' can still raise exceptions). The error indicator is \code{0} if the operation succeeded, otherwise the value of the \cdata{errno} variable. This is useful, e.g., for asynchronous connects. \strong{Note:} This method has historically accepted a pair of parameters for \constant{AF_INET} addresses instead of only a tuple. This was never intentional and is no longer be available in Python 2.0 and later. \end{methoddesc} \begin{methoddesc}[socket]{fileno}{} Return the socket's file descriptor (a small integer). This is useful with \function{select.select()}. \end{methoddesc} \begin{methoddesc}[socket]{getpeername}{} Return the remote address to which the socket is connected. This is useful to find out the port number of a remote IPv4/v6 socket, for instance. (The format of the address returned depends on the address family --- see above.) On some systems this function is not supported. \end{methoddesc} \begin{methoddesc}[socket]{getsockname}{} Return the socket's own address. This is useful to find out the port number of an IPv4/v6 socket, for instance. (The format of the address returned depends on the address family --- see above.) \end{methoddesc} \begin{methoddesc}[socket]{getsockopt}{level, optname\optional{, buflen}} Return the value of the given socket option (see the \UNIX{} man page \manpage{getsockopt}{2}). The needed symbolic constants (\constant{SO_*} etc.) are defined in this module. If \var{buflen} is absent, an integer option is assumed and its integer value is returned by the function. If \var{buflen} is present, it specifies the maximum length of the buffer used to receive the option in, and this buffer is returned as a string. It is up to the caller to decode the contents of the buffer (see the optional built-in module \refmodule{struct} for a way to decode C structures encoded as strings). \end{methoddesc} \begin{methoddesc}[socket]{listen}{backlog} Listen for connections made to the socket. The \var{backlog} argument specifies the maximum number of queued connections and should be at least 1; the maximum value is system-dependent (usually 5). \end{methoddesc} \begin{methoddesc}[socket]{makefile}{\optional{mode\optional{, bufsize}}} Return a \dfn{file object} associated with the socket. (File objects are described in \ref{bltin-file-objects}, ``File Objects.'') The file object references a \cfunction{dup()}ped version of the socket file descriptor, so the file object and socket object may be closed or garbage-collected independently. \index{I/O control!buffering}The optional \var{mode} and \var{bufsize} arguments are interpreted the same way as by the built-in \function{open()} function. \end{methoddesc} \begin{methoddesc}[socket]{recv}{bufsize\optional{, flags}} Receive data from the socket. The return value is a string representing the data received. The maximum amount of data to be received at once is specified by \var{bufsize}. See the \UNIX{} manual page \manpage{recv}{2} for the meaning of the optional argument \var{flags}; it defaults to zero. \end{methoddesc} \begin{methoddesc}[socket]{recvfrom}{bufsize\optional{, flags}} Receive data from the socket. The return value is a pair \code{(\var{string}, \var{address})} where \var{string} is a string representing the data received and \var{address} is the address of the socket sending the data. The optional \var{flags} argument has the same meaning as for \method{recv()} above. (The format of \var{address} depends on the address family --- see above.) \end{methoddesc} \begin{methoddesc}[socket]{send}{string\optional{, flags}} Send data to the socket. The socket must be connected to a remote socket. The optional \var{flags} argument has the same meaning as for \method{recv()} above. Returns the number of bytes sent. \end{methoddesc} \begin{methoddesc}[socket]{sendto}{string\optional{, flags}, address} Send data to the socket. The socket should not be connected to a remote socket, since the destination socket is specified by \var{address}. The optional \var{flags} argument has the same meaning as for \method{recv()} above. Return the number of bytes sent. (The format of \var{address} depends on the address family --- see above.) \end{methoddesc} \begin{methoddesc}[socket]{setblocking}{flag} Set blocking or non-blocking mode of the socket: if \var{flag} is 0, the socket is set to non-blocking, else to blocking mode. Initially all sockets are in blocking mode. In non-blocking mode, if a \method{recv()} call doesn't find any data, or if a \method{send()} call can't immediately dispose of the data, a \exception{error} exception is raised; in blocking mode, the calls block until they can proceed. \end{methoddesc} \begin{methoddesc}[socket]{setsockopt}{level, optname, value} Set the value of the given socket option (see the \UNIX{} manual page \manpage{setsockopt}{2}). The needed symbolic constants are defined in the \module{socket} module (\code{SO_*} etc.). The value can be an integer or a string representing a buffer. In the latter case it is up to the caller to ensure that the string contains the proper bits (see the optional built-in module \refmodule{struct}\refbimodindex{struct} for a way to encode C structures as strings). \end{methoddesc} \begin{methoddesc}[socket]{shutdown}{how} Shut down one or both halves of the connection. If \var{how} is \code{0}, further receives are disallowed. If \var{how} is \code{1}, further sends are disallowed. If \var{how} is \code{2}, further sends and receives are disallowed. \end{methoddesc} Note that there are no methods \method{read()} or \method{write()}; use \method{recv()} and \method{send()} without \var{flags} argument instead. \subsection{Example \label{socket-example}} Here are four minimal example programs using the TCP/IP protocol:\ a server that echoes all data that it receives back (servicing only one client), and a client using it. Note that a server must perform the sequence \function{socket()}, \method{bind()}, \method{listen()}, \method{accept()} (possibly repeating the \method{accept()} to service more than one client), while a client only needs the sequence \function{socket()}, \method{connect()}. Also note that the server does not \method{send()}/\method{recv()} on the socket it is listening on but on the new socket returned by \method{accept()}. The first two examples support IPv4 only. \begin{verbatim} # Echo server program import socket HOST = '' # Symbolic name meaning the local host PORT = 50007 # Arbitrary non-privileged port s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.bind((HOST, PORT)) s.listen(1) conn, addr = s.accept() print 'Connected by', addr while 1: data = conn.recv(1024) if not data: break conn.send(data) conn.close() \end{verbatim} \begin{verbatim} # Echo client program import socket HOST = 'daring.cwi.nl' # The remote host PORT = 50007 # The same port as used by the server s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect((HOST, PORT)) s.send('Hello, world') data = s.recv(1024) s.close() print 'Received', `data` \end{verbatim} The next two examples are identical to the above two, but support both IPv4 and IPv6. The server side will listen to the first address family available (it should listen to both instead). On most of IPv6-ready systems, IPv6 will take precedence and the server may not accept IPv4 traffic. The client side will try to connect to the all addresses returned as a result of the name resolution, and sends traffic to the first one connected successfully. \begin{verbatim} # Echo server program import socket import sys HOST = '' # Symbolic name meaning the local host PORT = 50007 # Arbitrary non-privileged port s = None for res in socket.getaddrinfo(HOST, PORT, socket.AF_UNSPEC, socket.SOCK_STREAM, 0, socket.AI_PASSIVE): af, socktype, proto, canonname, sa = res try: s = socket.socket(af, socktype, proto) except socket.error, msg: s = None continue try: s.bind(sa) s.listen(1) except socket.error, msg: s.close() s = None continue break if s is None: print 'could not open socket' sys.exit(1) conn, addr = s.accept() print 'Connected by', addr while 1: data = conn.recv(1024) if not data: break conn.send(data) conn.close() \end{verbatim} \begin{verbatim} # Echo client program import socket import sys HOST = 'daring.cwi.nl' # The remote host PORT = 50007 # The same port as used by the server s = None for res in socket.getaddrinfo(HOST, PORT, socket.AF_UNSPEC, socket.SOCK_STREAM): af, socktype, proto, canonname, sa = res try: s = socket.socket(af, socktype, proto) except socket.error, msg: s = None continue try: s.connect(sa) except socket.error, msg: s.close() s = None continue break if s is None: print 'could not open socket' sys.exit(1) s.send('Hello, world') data = s.recv(1024) s.close() print 'Received', `data` \end{verbatim}