author | Fred Drake <fdrake@acm.org> | 1998-05-07 01:49:07 (GMT) |
---|---|---|
committer | Fred Drake <fdrake@acm.org> | 1998-05-07 01:49:07 (GMT) |
commit | cda63cc875f54b047018cad362aa23d5493b97f3 (patch) | |
tree | f43f888293bb4046a7622dffefd561b669e993c2 /Doc/liburllib.tex | |
parent | bbe33c559403c7e06642111c494bd32d9abe528f (diff) | |
Relocating file to Doc/lib/
Diffstat (limited to 'Doc/liburllib.tex')
-rw-r--r-- | Doc/liburllib.tex | 132 |
1 files changed, 0 insertions, 132 deletions
diff --git a/Doc/liburllib.tex b/Doc/liburllib.tex
deleted file mode 100644
index 8995798..0000000
--- a/Doc/liburllib.tex
+++ /dev/null
@@ -1,132 +0,0 @@
-\section{Standard Module \module{urllib}}
-\label{module-urllib}
-\stmodindex{urllib}
-\index{WWW}
-\index{World-Wide Web}
-\index{URL}
-
-
-This module provides a high-level interface for fetching data across
-the World-Wide Web. In particular, the \function{urlopen()} function
-is similar to the built-in function \function{open()}, but accepts
-Universal Resource Locators (URLs) instead of filenames. Some
-restrictions apply --- it can only open URLs for reading, and no seek
-operations are available.
-
-It defines the following public functions:
-
-\begin{funcdesc}{urlopen}{url}
-Open a network object denoted by a URL for reading. If the URL does
-not have a scheme identifier, or if it has \file{file:} as its scheme
-identifier, this opens a local file; otherwise it opens a socket to a
-server somewhere on the network. If the connection cannot be made, or
-if the server returns an error code, the \exception{IOError} exception
-is raised. If all went well, a file-like object is returned. This
-supports the following methods: \method{read()}, \method{readline()},
-\method{readlines()}, \method{fileno()}, \method{close()} and
-\method{info()}.
-Except for the last one, these methods have the same interface as for
-file objects --- see section \ref{bltin-file-objects} in this
-manual. (It is not a built-in file object, however, so it can't be
-used at those few places where a true built-in file object is
-required.)
-
-The \method{info()} method returns an instance of the class
-\class{mimetools.Message} containing the headers received from the
-server, if the protocol uses such headers (currently the only
-supported protocol that uses this is HTTP). See the description of
-the \module{mimetools}\refstmodindex{mimetools} module.
-\end{funcdesc}
-
-\begin{funcdesc}{urlretrieve}{url}
-Copy a network object denoted by a URL to a local file, if necessary.
-If the URL points to a local file, or a valid cached copy of the
-object exists, the object is not copied. Return a tuple
-\code{(\var{filename}, \var{headers})} where \var{filename} is the
-local file name under which the object can be found, and \var{headers}
-is either \code{None} (for a local object) or whatever the
-\method{info()} method of the object returned by \function{urlopen()}
-returned (for a remote object, possibly cached). Exceptions are the
-same as for \function{urlopen()}.
-\end{funcdesc}
-
-\begin{funcdesc}{urlcleanup}{}
-Clear the cache that may have been built up by previous calls to
-\function{urlretrieve()}.
-\end{funcdesc}
-
-\begin{funcdesc}{quote}{string\optional{, addsafe}}
-Replace special characters in \var{string} using the \samp{\%xx} escape.
-Letters, digits, and the characters \character{_,.-} are never quoted.
-The optional \var{addsafe} parameter specifies additional characters
-that should not be quoted --- its default value is \code{'/'}.
-
-Example: \code{quote('/\~connolly/')} yields \code{'/\%7econnolly/'}.
-\end{funcdesc}
-
-\begin{funcdesc}{quote_plus}{string\optional{, addsafe}}
-Like \function{quote()}, but also replaces spaces by plus signs, as
-required for quoting HTML form values.
-\end{funcdesc}
-
-\begin{funcdesc}{unquote}{string}
-Replace \samp{\%xx} escapes by their single-character equivalent.
-
-Example: \code{unquote('/\%7Econnolly/')} yields \code{'/\~connolly/'}.
-\end{funcdesc}
-
-\begin{funcdesc}{unquote_plus}{string}
-Like \function{unquote()}, but also replaces plus signs by spaces, as
-required for unquoting HTML form values.
-\end{funcdesc}
-
-Restrictions:
-
-\begin{itemize}
-
-\item
-Currently, only the following protocols are supported: HTTP, (versions
-0.9 and 1.0), Gopher (but not Gopher-+), FTP, and local files.
-\indexii{HTTP}{protocol}
-\indexii{Gopher}{protocol}
-\indexii{FTP}{protocol}
-
-\item
-The caching feature of \function{urlretrieve()} has been disabled
-until I find the time to hack proper processing of Expiration time
-headers.
-
-\item
-There should be a function to query whether a particular URL is in
-the cache.
-
-\item
-For backward compatibility, if a URL appears to point to a local file
-but the file can't be opened, the URL is re-interpreted using the FTP
-protocol. This can sometimes cause confusing error messages.
-
-\item
-The \function{urlopen()} and \function{urlretrieve()} functions can
-cause arbitrarily long delays while waiting for a network connection
-to be set up. This means that it is difficult to build an interactive
-web client using these functions without using threads.
-
-\item
-The data returned by \function{urlopen()} or \function{urlretrieve()}
-is the raw data returned by the server. This may be binary data
-(e.g. an image), plain text or (for example) HTML. The HTTP protocol
-provides type information in the reply header, which can be inspected
-by looking at the \code{content-type} header. For the Gopher protocol,
-type information is encoded in the URL; there is currently no easy way
-to extract it. If the returned data is HTML, you can use the module
-\module{htmllib}\refstmodindex{htmllib} to parse it.
-\index{HTML}
-\indexii{HTTP}{protocol}
-\indexii{Gopher}{protocol}
-
-\item
-Although the \module{urllib} module contains (undocumented) routines
-to parse and unparse URL strings, the recommended interface for URL
-manipulation is in module \module{urlparse}\refstmodindex{urlparse}.
-
-\end{itemize}
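
The quoting functions described in the removed file can be tried out with a minimal sketch like the one below. It rests on one assumption: it uses the present-day Python 3 names in `urllib.parse`, not the 1998 module, where these functions lived directly in `urllib`. The helper `roundtrip` and the sample string are hypothetical and exist only for illustration. Current Python versions also differ from the text above in some details: `quote_plus()` quotes `/` by default, and `~` is left unquoted.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus


def roundtrip(text):
    """Quote text in path style and form style, then decode both again."""
    path_form = quote(text)       # '/' is left alone by default
    form_form = quote_plus(text)  # spaces become '+', '/' is quoted too
    return path_form, form_form, unquote(path_form), unquote_plus(form_form)


# Hypothetical sample input containing a space and an ampersand.
path_form, form_form, path_back, form_back = roundtrip("/docs/a b&c")

print(path_form)   # /docs/a%20b%26c
print(form_form)   # %2Fdocs%2Fa+b%26c
print(path_back)   # /docs/a b&c
print(form_back)   # /docs/a b&c
```

Decoding is case-insensitive with respect to the hex digits, so `unquote('/%7Econnolly/')` and `unquote('/%7econnolly/')` both give `'/~connolly/'`, in line with the example in the removed documentation.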