| Commit message | Author | Age | Files | Lines |
|
splittype(): Always lower-case the URL scheme; these are supposed to be
normalized according to RFC 1738 anyway.
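As a rough illustration of the normalization described above, the sketch below uses urllib.parse.urlsplit from present-day Python 3 rather than the splittype() helper this commit changed. That substitution is an assumption made for the example, and the URL is made up.

```python
from urllib.parse import urlsplit

# Hypothetical URL written with an upper-case scheme.
parts = urlsplit("HTTP://www.example.com/index.html")

# The parser reports the scheme in lower case, whatever the input used.
scheme = parts.scheme
print(scheme)
```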
|
often, ftp URLs hang in the final close. Further analysis suggests
that this is because the close hook in addclosehook() calls the hook
before actually closing the connection. The hook, in this case, waits
for the '226 Transfer complete' status from the server on the command
socket. However, more and more ftp servers only send this status when
the data socket has actually been closed -- causing a deadlock.
The fix is simple: in addclosehook.close(), call addbase.close()
*before* calling the closehook.
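The ordering described above can be sketched as follows. The two classes are simplified stand-ins written for this example (they are not the real addbase and addclosehook), and the event log exists only to make the order visible.

```python
import io


class Base:
    """Stand-in for addbase: owns a file-like object and closes it."""

    def __init__(self, fp, log):
        self.fp = fp
        self.log = log

    def close(self):
        if self.fp is not None:
            self.fp.close()
            self.fp = None
            self.log.append("connection closed")


class CloseHook(Base):
    """Stand-in for addclosehook: runs a callback when closed."""

    def __init__(self, fp, log, hook):
        super().__init__(fp, log)
        self.hook = hook

    def close(self):
        # Close the underlying connection first ...
        super().close()
        # ... and only then run the hook, so a hook that waits for the
        # server's final status is not blocked by a still-open socket.
        hook, self.hook = self.hook, None
        if hook is not None:
            hook()


events = []
wrapper = CloseHook(io.BytesIO(b"data"), events,
                    lambda: events.append("hook ran"))
wrapper.close()
print(events)
```

With the calls in the opposite order, a hook that blocks until the server reports completion would never return, because the data connection it is waiting on would still be open.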
|
The fix also adds support for POSTing to an https URL
|
The attached patches update the standard library so that all modules
have docstrings beginning with one-line summaries.
A new docstring was added to formatter. The docstring for os.py
was updated to mention nt, os2, ce in addition to posix, dos, mac.
|
Fixed a TypeError: not enough arguments; expected 4, got 3.
When authentication is needed, the default http_error_401 method calls
retry_http_basic_auth. The default version of that method expected a
data argument which wasn't provided, so now we provide the argument if
it was given and we also made the data argument optional.
Also changed other calls where data was optional to not pass data if
it was not passed to the calling method (in line with other similar
occurrences).
|
Brian E Gallew, which were improved and adapted to OpenSSL 0.9.4 by
Laszlo Kovacs of HP. Both have kindly given permission to include
the patches in the Python distribution. Final formatting by GvR.
|
ftp://user@host//root/path: the double slash in the pathname means to
go to the root directory even if the initial directory isn't the root.
|
In splithost, accept empty host part in URLs. This is required for
file URLs that can have an empty host part. For such URLs, we should
not return the initial 2 slashes as part of the file name.
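For comparison, this is how an empty host part looks with urllib.parse.urlsplit in present-day Python 3. It is a sketch, not the splithost() code this commit touched, and the path is only an example.

```python
from urllib.parse import urlsplit

# A file URL with an empty host part: "file://" + "" + "/etc/motd".
parts = urlsplit("file:///etc/motd")

host = parts.netloc  # empty string, since no host was given
path = parts.path    # the two slashes after "file:" are not part of it
print(repr(host), path)
```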
|
Urllib makes the URL of the opened file available through the geturl
method of the returned object. For local files, this consists of
file: plus the name of the file. This results in an invalid URL if
the file name was relative. This patch fixes this so that the
returned URL is just a relative URL in that case. When the file name
is absolute, the URL returned is of the form file:///absolute/path.
[I guess that a URL of the form "file:foo.html" is illegal... GvR]
|
and quote_plus() can be optimized tenfold.
|
right thing "just happens" (basejoin() with old URL).
|
Don't convert URLs to URLs using pathname2url.
|
The filename to URL conversion didn't properly quote special
characters.
The URL to filename conversion didn't properly unquote special characters.
|
extra argument if data is None.
|
extra argument if data is None.
|
urlopen is used to specify form data, make sure the second argument is
threaded through all of the http_error_NNN calls. This allows error
handlers like the redirect and authorization handlers to properly
re-start the connection.
|
calls to addinfourl() in open_file().
|
File names with "funny" characters get translated wrong by
pathname2url (any variety). E.g. the (Unix) file "/ufs/sjoerd/#tmp"
gets translated into "/ufs/sjoerd/#tmp" which, when interpreted as a
URL is file "/ufs/sjoerd/" with fragment ID "tmp".
Here's an easy fix. (An alternative fix would be to change the
various implementations of pathname2url and url2pathname to include
calls to quote and unquote.)
[The main problem is with the normal use of URLs:
url = pathname2url(file)
transmit url
url, tag = splittag(url)
urlopen(url)
]
In addition, this patch fixes some uses of unquote:
- the host part of URLs should be unquoted
- the file path in the FTP URL should be unquoted before it is split
into components.
- because of the latter, I removed all unquoting from ftpwrapper,
and moved it to the caller, but that is not essential
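The "#tmp" problem and the quoting fix can be reproduced with the present-day urllib.parse functions. This sketch uses quote(), unquote() and urldefrag() directly rather than the pathname2url() and splittag() helpers named in the message, which is an assumption made to keep it platform-independent.

```python
from urllib.parse import quote, unquote, urldefrag

filename = "/ufs/sjoerd/#tmp"

# Unquoted, the '#' is read as the start of a fragment identifier.
naive_url, naive_tag = urldefrag(filename)

# Quoting first turns '#' into %23, so no fragment is split off ...
quoted = quote(filename)
safe_url, safe_tag = urldefrag(quoted)

# ... and unquoting on the receiving side restores the file name.
restored = unquote(safe_url)

print(naive_url, naive_tag)
print(quoted, restored)
```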
|
1. Generate a correct Content-Length header visible through the info() method
if a request to open an FTP URL gets a length in the response to RETR.
2. Take a third argument to urlretrieve() that makes it possible to
progress-meter a urlretrieve call (this is what I needed the above change for).
See the second patch band below for details.
3. To avoid spurious errors, I commented out the gopher test. The target
document no longer exists.
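A progress meter of the kind item 2 enables might look like the sketch below. The factory function and message format are made up for illustration. The hook is driven by hand here, with invented block counts and sizes, so that the example needs no network access.

```python
def make_progress_hook(log):
    """Return a hook that records progress messages in `log`."""

    def hook(block_count, block_size, total_size):
        done = block_count * block_size
        if total_size > 0:
            # The last block is usually short, so cap at the total.
            done = min(done, total_size)
            log.append(f"{done}/{total_size} bytes")
        else:
            # The total is unknown when no length was reported.
            log.append(f"{done} bytes")

    return hook


messages = []
hook = make_progress_hook(messages)

# Simulated calls: block count so far, block size, total size.
hook(0, 8192, 20000)
hook(1, 8192, 20000)
hook(3, 8192, 20000)
print(messages)
```

In present-day Python 3 such a hook would be passed as urllib.request.urlretrieve(url, filename, reporthook=hook). The total size comes from the Content-Length header, which is why item 1 matters for FTP transfers.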
|
Also added two XXX comments about lingering thread unsafeness.
|
Fix the implementation of quote_plus(). (It wouldn't treat '+' in the
original data right.)
Add urlencode(dict) which is handy to create the data for sending a
POST request with urlopen().
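A short sketch of both functions as they exist in present-day Python 3 (urllib.parse). The field name and value are invented for the example.

```python
from urllib.parse import quote_plus, urlencode

# A literal '+' must be escaped so it is not mistaken for an encoded space.
encoded = quote_plus("a+b c")

# urlencode() applies the same quoting to each key and value of a mapping.
form = urlencode({"q": "a+b c"})

# POST data is sent as bytes in Python 3.
data = form.encode("ascii")
print(encoded, form)
```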
|
the '%' should be put back in.
|
retrieve one or more URLs to stdout. Use -t to run the self-test.
|
code here.
|
as soon as I change things even just a little bit? :-) Even works
when accessing a password-protected page through the proxy. Prompted
by complaints from, and correct operation verified by, Nigel O'Brian.
|
when splithost() returned no useable host, to avoid calling
splituser() on None.
|
guess the mime type of a local file.
Change suggested by Sjoerd (with different implementation):
when retrieve() creates a temporary file, preserve the suffix.
Corollary of the first change:
also return the mime type of a local file in retrieve().
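Both ideas can be sketched with the standard library as it exists today. The use of the mimetypes module here is an assumption (the message above does not say how the guess is made), and the URL and file name are invented.

```python
import mimetypes
import os
from urllib.parse import urlsplit

url = "http://www.example.com/docs/report.html"

# Preserve the suffix of the remote name for a temporary file.
suffix = os.path.splitext(urlsplit(url).path)[1]

# Guess the MIME type of a local file from its name alone.
mime_type, encoding = mimetypes.guess_type("report.html")
print(suffix, mime_type)
```

The suffix could then be handed to tempfile.mkstemp(suffix=suffix) so that the temporary copy keeps a recognizable extension.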
|
most recently opened URL in self.openedurl of the URLopener instance.
This doesn't really work if multiple threads share the same opener
instance!
Fix: openedurl was actually simply the type prefix (e.g. "http:")
followed by the rest of the URL; since the rest of the URL is
available and the type is effectively determined by where you are in
the code, I can reconstruct the full URL easily, e.g. "http:" + url.
|
- use the tempcache in the open() method, too.
- use the "unwrap"ped url as key for the tempcache.
|
|
(2) Provisional hack to avoid dying when trying to turn echo on or off
on Macs, where os.system() doesn't exist.
|
would set the transfer to text mode instead of the specified mode.
|
retrieving files from the same host and directory, you had to close
the previous instance before opening a new one; and retrieving a
non-existent file would return an empty file. (The latter fix relies
on maybe an undocumented property of NLST -- NLST of a file returns
just that file, while NLST of a non-existent file returns nothing. A
side effect, unfortunately, seems to be that now ftp-retrieving an
*empty* directory may fail. Ah well.)
|
is indeed a dictionary (or a mapping).
|
problems with this module, even if an instance of a derived class is
kept alive longer than the urllib module itself...
|
Use "re" module, making it threadsafe.
|
a shared class variable -- but each instance will attempt to clean it
up entirely on cleanup).
|
Sjoerd: add separate administration of temporary files created by
URLopener.retrieve() so cleanup can properly remove them. The old
code removed everything in tempcache which was a bad idea if the user
had passed a non-temp file into it. (I added a line to delete the
tempcache in cleanup() -- it still seems to make sense.)
Jack: in basejoin(), interpret relative paths starting in "../". This
is necessary if the server uses symbolic links.
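The "../" handling can be seen with urllib.parse.urljoin, the present-day Python 3 function for joining a base URL and a relative one. It is used here in place of basejoin() as an assumption, and the URLs are made up.

```python
from urllib.parse import urljoin

base = "http://www.example.com/docs/guide/intro.html"

# A leading "../" climbs one directory above the base document's directory.
joined = urljoin(base, "../images/logo.png")
print(joined)
```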
|