-rw-r--r--  doc/cookiejar.n        167
-rw-r--r--  doc/http.n             110
-rw-r--r--  library/http/http.tcl   13
3 files changed, 288 insertions, 2 deletions
diff --git a/doc/cookiejar.n b/doc/cookiejar.n
new file mode 100644
index 0000000..61f2243
--- /dev/null
+++ b/doc/cookiejar.n
@@ -0,0 +1,167 @@
+'\"
+'\" Copyright (c) 2014-2018 Donal K. Fellows.
+'\"
+'\" See the file "license.terms" for information on usage and redistribution
+'\" of this file, and for a DISCLAIMER OF ALL WARRANTIES.
+'\"
+.TH "cookiejar" n 0.1 cookiejar "Tcl Bundled Packages"
+.so man.macros
+.BS
+'\" Note: do not modify the .SH NAME line immediately below!
+.SH NAME
+cookiejar \- Implementation of the Tcl http package cookie jar protocol
+.SH SYNOPSIS
+.nf
+\fBpackage require\fR \fBcookiejar\fR ?\fB0.1\fR?
+
+\fB::http::cookiejar configure\fR ?\fIoptionName\fR? ?\fIoptionValue\fR?
+\fB::http::cookiejar create\fR \fIname\fR ?\fIfilename\fR?
+\fB::http::cookiejar new\fR ?\fIfilename\fR?
+
+\fIcookiejar\fR \fBdestroy\fR
+\fIcookiejar\fR \fBforceLoadDomainData\fR
+\fIcookiejar\fR \fBgetCookies\fR \fIprotocol host path\fR
+\fIcookiejar\fR \fBstoreCookie\fR \fIoptions\fR
+\fIcookiejar\fR \fBlookup\fR ?\fIhost\fR? ?\fIkey\fR?
+.fi
+.SH DESCRIPTION
+.PP
+The cookiejar package provides an implementation of the http package's cookie
+jar protocol using an SQLite database. It provides one main command,
+\fB::http::cookiejar\fR, which is a TclOO class that should be instantiated to
+create a cookie jar that manages a particular HTTP session.
+.PP
+The database management policy can be controlled at the package level by the
+\fBconfigure\fR method on the \fB::http::cookiejar\fR class object:
+.TP
+\fB::http::cookiejar configure\fR ?\fIoptionName\fR? ?\fIoptionValue\fR?
+.
+If neither \fIoptionName\fR nor \fIoptionValue\fR are supplied, this returns a
+copy of the configuration as a Tcl dictionary. If just \fIoptionName\fR is
+supplied, just the value of the named option is returned. If both
+\fIoptionName\fR and \fIoptionValue\fR are given, the named option is changed
+to be the given value.
+.RS
+.PP
+Supported options are:
+.TP
+\fB\-domainfile \fIfilename\fR
+.
+A file (defaulting to within the cookiejar package) with a description of the
+list of top-level domains (e.g., \fB.com\fR or \fB.co.jp\fR). Such domains
+\fImust not\fR accept cookies set upon them. Note that the list of such
+domains is both security-sensitive and \fInot\fR constant and should be
+periodically refetched. Cookie jars maintain their own cache of the domain
+list.
+.TP
+\fB\-domainlist \fIurl\fR
+.
+A URL to fetch the list of top-level domains (e.g., \fB.com\fR or
+\fB.co.jp\fR) from. Such domains \fImust not\fR accept cookies set upon
+them. Note that the list of such domains is both security-sensitive and
+\fInot\fR constant and should be periodically refetched. Cookie jars maintain
+their own cache of the domain list.
+.TP
+\fB\-domainrefresh \fIintervalMilliseconds\fR
+.
+The number of milliseconds between checks of the \fI\-domainlist\fR for new
+domains.
+.TP
+\fB\-loglevel \fIlevel\fR
+.
+The logging level of this package. The logging level must be (in order of
+decreasing verbosity) one of \fBdebug\fR, \fBinfo\fR, \fBwarn\fR, or
+\fBerror\fR.
+.TP
+\fB\-offline \fIflag\fR
+.
+Allows the cookie managment engine to be placed into offline mode. In offline
+mode, the list of domains is read immediately from the file configured in the
+\fB\-domainfile\fR option, and the \fB\-domainlist\fR option is not used; it
+also makes the \fB\-domainrefresh\fR option be effectively ignored.
+.TP
+\fB\-purgeold \fIintervalMilliseconds\fR
+.
+The number of milliseconds between checks of the database for expired
+cookies; expired cookies are deleted.
+.TP
+\fB\-retain \fIcookieCount\fR
+.
+The maximum number of cookies to retain in the database.
+.TP
+\fB\-vacuumtrigger \fIdeletionCount\fR
+.
+A count of the number of persistent cookie deletions to go between vacuuming
+the database.
+.RE
+.PP
+Cookie jar instances may be made with any of the standard TclOO instance
+creation methods (\fBcreate\fR or \fRnew\fR).
+.TP
+\fB::http::cookiejar new\fR ?\fIfilename\fR?
+.
+If a \fIfilename\fR argument is provided, it is the name of a file containing
+an SQLite database that will contain the persistent cookies maintained by the
+cookie jar; the database will be created if the file does not already
+exist. If \fIfilename\fR is not supplied, the database will be held entirely within
+memory, which effectively forces all cookies within it to be session cookies.
+.SS "INSTANCE METHODS"
+.PP
+The following methods are supported on the instances:
+.TP
+\fIcookiejar\fR \fBdestroy\fR
+.
+This is the standard TclOO destruction method. It does \fInot\fR delete the
+SQLite database if it is written to disk. Callers are responsible for ensuring
+that the cookie jar is not in use by the http package at the time of
+destruction.
+.TP
+\fIcookiejar\fR \fBforceLoadDomainData\fR
+.
+This method causes the cookie jar to immediately load (and cache) the domain
+list data. The domain list will be loaded from the \fB\-domainlist\fR
+configured a the package level if that is enabled, and otherwise will be
+obtained from the \fB\-domainfile\fR configured at the package level.
+.TP
+\fIcookiejar\fR \fBgetCookies\fR \fIprotocol host path\fR
+.
+This method obtains the cookies for a particular HTTP request. \fIThis
+implements the http cookie jar protocol.\fR
+.TP
+\fIcookiejar\fR \fBstoreCookie\fR \fIoptions\fR
+.
+This method stores a single cookie from a particular HTTP response. Cookies
+that fail security checks are ignored. \fIThis implements the http cookie jar
+protocol.\fR
+.TP
+\fIcookiejar\fR \fBlookup\fR ?\fIhost\fR? ?\fIkey\fR?
+.
+This method looks a cookie by exact host (or domain) matching. If neither
+\fIhost\fR nor \fIkey\fR are supplied, the list of hosts for which a cookie is
+stored is returned. If just \fIhost\fR (which may be a hostname or a domain
+name) is supplied, the list of cookie keys stored for that host is returned.
+If both \fIhost\fR and \fIkey\fR are supplied, the value for that key is
+returned; it is an error if no such host or key match exactly.
+.SH "EXAMPLE"
+.PP
+The simplest way of using a cookie jar is to just permanently configure it at
+the start of the application.
+.PP
+.CS
+package require http
+\fBpackage require cookiejar\fR
+
+set cookiedb ~/.tclcookies.db
+http::configure -cookiejar [\fBhttp::cookiejar new\fR $cookiedb]
+
+# No further explicit steps are required to use cookies
+set tok [http::geturl http://core.tcl.tk/]
+.CE
+.SH "SEE ALSO"
+http(n), oo::class(n), sqlite3(n)
+.SH KEYWORDS
+cookie, internet, security policy, www
+'\" Local Variables:
+'\" mode: nroff
+'\" fill-column: 78
+'\" End:
diff --git a/doc/http.n b/doc/http.n
--- a/doc/http.n
+++ b/doc/http.n
@@ -99,6 +99,15 @@ comma-separated list of mime type patterns that you are willing to
 receive. For example,
 .QW "image/gif, image/jpeg, text/*" .
 .TP
+\fB\-cookiejar\fR \fIcommand\fR
+.VS TIP406
+The cookie store for the package to use to manage HTTP cookies.
+\fIcommand\fR is a command prefix list; if the empty list (the
+default value) is used, no cookies will be sent by requests or stored
+from responses. The command indicated by \fIcommand\fR, if supplied,
+must obey the \fBCOOKIE JAR PROTOCOL\fR described below.
+.VE TIP406
+.TP
 \fB\-pipeline\fR \fIboolean\fR
 .
 Specifies whether HTTP/1.1 transactions on a persistent socket will be
@@ -770,6 +779,107 @@ Subsequent GET and HEAD requests in a failed pipeline will also be retried.
 that the retry is appropriate\fR - specifically, the application must know
 that if the failed POST successfully modified the state of the server, a repeat POST
 would have no adverse effect.
+.VS TIP406
+.SH "COOKIE JAR PROTOCOL"
+.PP
+Cookies are short key-value pairs used to implement sessions within the
+otherwise-stateless HTTP protocol. \fB(TODO: CITE RFC)\fR
+.PP
+Cookie storage managment commands \(em
+.QW "cookie jars"
+\(em must support these subcommands which form the HTTP cookie storage
+management protocol. Note that \fIcookieJar\fR below does not have to be a
+command name; it is properly a command prefix (a Tcl list of words that will
+be expanded in place) and admits many possible implementations.
+.PP
+Though not formally part of the protocol, it is expected that particular
+values of \fIcookieJar\fR will correspond to sessions; it is up to the caller
+of \fB::http::config\fR to decide what session applies and to manage the
+deletion of said sessions when they are no longer desired (which should be
+when they not configured as the current cookie jar).
+.TP
+\fIcookieJar \fBgetCookies \fIprotocol host requestPath\fR
+.
+This command asks the cookie jar what cookies should be supplied for a
+particular request. It should take the \fIprotocol\fR (typically \fBhttp\fR or
+\fBhttps\fR), \fIhost\fR name and \fIrequestPath\fR (parsed from the \fIurl\fR
+argument to \fB::http::geturl\fR) and return a list of cookie keys and values
+that describe the cookies to supply to the remote host. The list must have an
+even number of elements.
+.RS
+.PP
+There should only ever be at most one cookie with a particular key for any
+request (typically the one with the most specific \fIhost\fR/domain match and
+most specific \fIrequestPath\fR/path match), but there may be many cookies
+with different names in any request.
+.RE
+.TP
+\fIcookieJar \fBstoreCookie \fIcookieDictionary\fR
+.
+This command asks the cookie jar to store a particular cookie that was
+returned by a request. The cookie (which will have been parsed by the http
+package) is described by a dictionary, \fIcookieDictionary\fR, that may have
+the following keys:
+.RS
+.TP
+\fBdomain\fR
+.
+This is always present. Its value describes the domain hostname \fIor
+prefix\fR that the cookie should be returned for. The checking of the domain
+against the origin (below) should be careful since sites that issue cookies
+should only do so for domains related to themselves. Cookies that do not obey
+a relevant origin matching rule should be ignored.
+.TP
+\fBexpires\fR
+.
+This is optional. If present, the cookie is intended to be a persistent cookie
+and the value of the option is the Tcl timestamp (in seconds from the same
+base as \fBclock seconds\fR) of when the cookie expires (which may be in the
+past, which should result in the cookie being deleted immediately). If absent,
+the cookie is intended to be a session cookie that should be not persisted
+beyond the lifetime of the cookie jar.
+.TP
+\fBhostonly\fR
+.
+This is always present. Its value is a boolean that describes whether the
+cookie is a single host cookie (true) or a domain-level cookie (false).
+.TP
+\fBhttponly\fR
+.
+This is always present. Its value is a boolean that is true when the site
+wishes the cookie to only ever be used with HTTP (or HTTPS) traffic.
+.TP
+\fBkey\fR
+.
+This is always present. Its value is the \fIkey\fR of the cookie, which is
+part of the information that must be return when sending this cookie back in a
+future request.
+.TP
+\fBorigin\fR
+.
+This is always present. Its value describes where the http package believes it
+received the cookie from, which may be useful for checking whether the
+cookie's domain is valid.
+.TP
+\fBpath\fR
+.
+This is always present. Its value describes the path prefix of requests to the
+cookie domain where the cookie should be returned.
+.TP
+\fBsecure\fR
+.
+This is always present. Its value is a boolean that is true when the cookie
+should only used on requests sent over secure channels (typically HTTPS).
+.TP
+\fBvalue\fR
+.
+This is always present. Its value is the value of the cookie, which is part of
+the information that must be return when sending this cookie back in a future
+request.
+.PP
+Other keys may always be ignored; they have no meaning in this protocol.
+.RE
+.VE TIP406
 .SH EXAMPLE
 .PP
 This example creates a procedure to copy a URL to a file while printing a
diff --git a/library/http/http.tcl b/library/http/http.tcl
index e843ec8..7236bae 100644
--- a/library/http/http.tcl
+++ b/library/http/http.tcl
@@ -20,6 +20,7 @@ namespace eval http {
     if {![info exists http]} {
         array set http {
             -accept */*
+            -cookiejar {}
             -pipeline 1
             -postfresh 0
             -proxyhost {}
@@ -28,7 +29,6 @@ namespace eval http {
             -repost 0
             -urlencoding utf-8
             -zip 1
-            -cookiejar {}
         }
         # We need a useragent string of this style or various servers will
         # refuse to send us compressed content even when we ask for it. This
@@ -129,7 +129,16 @@ namespace eval http {
     }
 
     # Regular expression used to parse cookies
-    variable CookieRE {\s*([^][\u0000- ()<>@,;:\\""/?={}\u007f-\uffff]+)=([!\u0023-+\u002D-:<-\u005B\u005D-~]*)(?:\s*;\s*([^\u0000]+))?}
+    variable CookieRE {(?x)                            # EXPANDED SYNTAX
+        \s*                                            # Ignore leading spaces
+        ([^][\u0000- ()<>@,;:\\""/?={}\u007f-\uffff]+) # Match the name
+        =                                              # LITERAL: Equal sign
+        ([!\u0023-+\u002D-:<-\u005B\u005D-~]*)         # Match the value
+        (?:
+            \s* ; \s*                                  # LITERAL: semicolon
+            ([^\u0000]+)                               # Match the options
+        )?
+    }
 
     namespace export geturl config reset wait formatQuery quoteString
     namespace export register unregister registerError
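
Note on the COOKIE JAR PROTOCOL section added to doc/http.n: the protocol only requires a command prefix that accepts the getCookies and storeCookie subcommands, so a jar does not have to be the SQLite-backed cookiejar class. The following is a minimal sketch of an in-memory jar, not the package's implementation. The name ::demojar and its storage layout are hypothetical, and it assumes exact host matching only; it ignores the hostonly, httponly, secure and origin keys, which a real jar should check.

```tcl
# Hypothetical session-only cookie jar; ::demojar is an illustrative name
# and is not part of the http or cookiejar packages.
namespace eval ::demojar {
    # Maps [list domain key] -> dict with "path" and "value" entries.
    variable store [dict create]

    # Return a flat key/value list of the cookies that apply to a request.
    proc getCookies {protocol host requestPath} {
        variable store
        set result {}
        dict for {id info} $store {
            lassign $id domain key
            # Assumption: exact host match plus a simple path-prefix test.
            if {$domain eq $host
                    && [string first [dict get $info path] $requestPath] == 0} {
                lappend result $key [dict get $info value]
            }
        }
        return $result
    }

    # Store (or delete) one cookie described by a protocol dictionary.
    proc storeCookie {cookie} {
        variable store
        set id [list [dict get $cookie domain] [dict get $cookie key]]
        # An expiry time in the past means the cookie should be deleted.
        if {[dict exists $cookie expires]
                && [dict get $cookie expires] < [clock seconds]} {
            dict unset store $id
            return
        }
        dict set store $id [dict create \
            path [dict get $cookie path] value [dict get $cookie value]]
    }

    namespace export getCookies storeCookie
    namespace ensemble create
}

demojar storeCookie {domain example.com key sid value abc123 path /
    origin example.com hostonly 1 httponly 0 secure 0}
puts [demojar getCookies http example.com /index.html]   ;# prints: sid abc123
```

Because the jar is keyed on domain and key, at most one cookie per key is returned for a request, which matches the constraint stated for getCookies. Under the patch, such a command would be installed through the -cookiejar option of ::http::config.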