:mod:`cookielib` --- Cookie handling for HTTP clients
=====================================================

.. module:: cookielib
   :synopsis: Classes for automatic handling of HTTP cookies.
.. moduleauthor:: John J. Lee <jjl@pobox.com>
.. sectionauthor:: John J. Lee <jjl@pobox.com>

.. note::
   The :mod:`cookielib` module has been renamed to :mod:`http.cookiejar` in
   Python 3. The :term:`2to3` tool will automatically adapt imports when
   converting your sources to Python 3.

.. versionadded:: 2.4

**Source code:** :source:`Lib/cookielib.py`

--------------

The :mod:`cookielib` module defines classes for automatic handling of HTTP
cookies. It is useful for accessing web sites that require small pieces of data
-- :dfn:`cookies` -- to be set on the client machine by an HTTP response from a
web server, and then returned to the server in later HTTP requests.

Both the regular Netscape cookie protocol and the protocol defined by
:rfc:`2965` are handled. RFC 2965 handling is switched off by default.
:rfc:`2109` cookies are parsed as Netscape cookies and subsequently treated
either as Netscape or RFC 2965 cookies according to the 'policy' in effect.
Note that the great majority of cookies on the Internet are Netscape cookies.
:mod:`cookielib` attempts to follow the de-facto Netscape cookie protocol (which
differs substantially from that set out in the original Netscape specification),
including taking note of the ``max-age`` and ``port`` cookie-attributes
introduced with RFC 2965.

.. note::

   The various named parameters found in :mailheader:`Set-Cookie` and
   :mailheader:`Set-Cookie2` headers (e.g. ``domain`` and ``expires``) are
   conventionally referred to as :dfn:`attributes`. To distinguish them from
   Python attributes, the documentation for this module uses the term
   :dfn:`cookie-attribute` instead.


The module defines the following exception:


.. exception:: LoadError

   Instances of :class:`FileCookieJar` raise this exception on failure to load
   cookies from a file.

   .. note::

      For backwards-compatibility with Python 2.4 (which raised an :exc:`IOError`),
      :exc:`LoadError` is a subclass of :exc:`IOError`.


The following classes are provided:


.. class:: CookieJar(policy=None)

   *policy* is an object implementing the :class:`CookiePolicy` interface.

   The :class:`CookieJar` class stores HTTP cookies. It extracts cookies from HTTP
   responses, and returns them in HTTP requests. :class:`CookieJar` instances
   automatically expire contained cookies when necessary. Subclasses are also
   responsible for storing and retrieving cookies from a file or database.


.. class:: FileCookieJar(filename, delayload=None, policy=None)

   *policy* is an object implementing the :class:`CookiePolicy` interface. For the
   other arguments, see the documentation for the corresponding attributes.

   A :class:`CookieJar` which can load cookies from, and perhaps save cookies to, a
   file on disk. Cookies are **NOT** loaded from the named file until either the
   :meth:`load` or :meth:`revert` method is called. Subclasses of this class are
   documented in section :ref:`file-cookie-jar-classes`.


.. class:: CookiePolicy()

   This class is responsible for deciding whether each cookie should be accepted
   from / returned to the server.


.. class:: DefaultCookiePolicy(blocked_domains=None, allowed_domains=None, netscape=True, rfc2965=False, rfc2109_as_netscape=None, hide_cookie2=False, strict_domain=False, strict_rfc2965_unverifiable=True, strict_ns_unverifiable=False, strict_ns_domain=DefaultCookiePolicy.DomainLiberal, strict_ns_set_initial_dollar=False, strict_ns_set_path=False)

   Constructor arguments should be passed as keyword arguments only.
   *blocked_domains* is a sequence of domain names that we never accept cookies
   from, nor return cookies to. If *allowed_domains* is not :const:`None`, it is a
   sequence of the only domains for which we accept and return cookies. For all
   other arguments, see the documentation for :class:`CookiePolicy` and
   :class:`DefaultCookiePolicy` objects.

   :class:`DefaultCookiePolicy` implements the standard accept / reject rules for
   Netscape and RFC 2965 cookies. By default, RFC 2109 cookies (i.e. cookies
   received in a :mailheader:`Set-Cookie` header with a version cookie-attribute of
   1) are treated according to the RFC 2965 rules. However, if RFC 2965 handling
   is turned off or :attr:`rfc2109_as_netscape` is True, RFC 2109 cookies are
   'downgraded' by the :class:`CookieJar` instance to Netscape cookies, by
   setting the :attr:`version` attribute of the :class:`Cookie` instance to 0.
   :class:`DefaultCookiePolicy` also provides some parameters to allow some
   fine-tuning of policy.


.. class:: Cookie()

   This class represents Netscape, RFC 2109 and RFC 2965 cookies. It is not
   expected that users of :mod:`cookielib` construct their own :class:`Cookie`
   instances. Instead, if necessary, call :meth:`make_cookies` on a
   :class:`CookieJar` instance.


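The downgrading of RFC 2109 cookies described above can be observed without any
network traffic. The following is only a sketch: ``FakeHeaders`` and
``FakeResponse`` are hypothetical stand-ins (not part of :mod:`cookielib`) for
the objects normally returned by :mod:`urllib2`, and the import fallback assumes
the Python 3 names :mod:`http.cookiejar` and :mod:`urllib.request`.

```python
try:
    import cookielib                      # Python 2
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module
    from urllib.request import Request


class FakeHeaders(object):
    """Hypothetical stand-in for the header object returned by response.info()."""

    def __init__(self, values):
        # Keys must be spelled exactly "Set-Cookie" / "Set-Cookie2".
        self._values = values

    def getheaders(self, name):             # lookup used by Python 2's cookielib
        return self._values.get(name, [])

    def get_all(self, name, default=None):  # lookup used by http.cookiejar
        return self._values.get(name, default)


class FakeResponse(object):
    """Hypothetical stand-in for the object returned by urllib2.urlopen()."""

    def __init__(self, values):
        self._headers = FakeHeaders(values)

    def info(self):
        return self._headers


def first_cookie(policy):
    """Parse one Set-Cookie header carrying Version=1 under the given policy."""
    jar = cookielib.CookieJar(policy)
    response = FakeResponse({"Set-Cookie": ["session=abc; Version=1"]})
    request = Request("http://www.example.com/")
    return jar.make_cookies(response, request)[0]


# Default policy: RFC 2965 handling is off, so the cookie is downgraded.
downgraded = first_cookie(cookielib.DefaultCookiePolicy())
# With RFC 2965 handling on, the cookie keeps version 1.
kept = first_cookie(cookielib.DefaultCookiePolicy(rfc2965=True))

print("default policy: version=%s rfc2109=%s" % (downgraded.version, downgraded.rfc2109))
print("rfc2965=True:   version=%s rfc2109=%s" % (kept.version, kept.rfc2109))
```

:meth:`make_cookies` is used here rather than :meth:`extract_cookies` because it
returns the parsed cookies directly, without consulting the policy's
:meth:`set_ok` method.
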
.. seealso::

   Module :mod:`urllib2`
      URL opening with automatic cookie handling.

   Module :mod:`Cookie`
      HTTP cookie classes, principally useful for server-side code. The
      :mod:`cookielib` and :mod:`Cookie` modules do not depend on each other.

   http://wp.netscape.com/newsref/std/cookie_spec.html
      The specification of the original Netscape cookie protocol. Though this is
      still the dominant protocol, the 'Netscape cookie protocol' implemented by all
      the major browsers (and :mod:`cookielib`) only bears a passing resemblance to
      the one sketched out in ``cookie_spec.html``.

   :rfc:`2109` - HTTP State Management Mechanism
      Obsoleted by RFC 2965. Uses :mailheader:`Set-Cookie` with version=1.

   :rfc:`2965` - HTTP State Management Mechanism
      The Netscape protocol with the bugs fixed. Uses :mailheader:`Set-Cookie2` in
      place of :mailheader:`Set-Cookie`. Not widely used.

   http://kristol.org/cookie/errata.html
      Unfinished errata to RFC 2965.

   :rfc:`2964` - Use of HTTP State Management

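As a rough illustration of the automatic cookie handling mentioned for
:mod:`urllib2`, a :class:`CookieJar` is typically attached to an opener through
``HTTPCookieProcessor``. The sketch below only builds the opener and makes no
request, so it runs offline; the import fallback assumes the Python 3 names.

```python
try:
    import cookielib                      # Python 2
    import urllib2 as urlrequest
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module
    import urllib.request as urlrequest

jar = cookielib.CookieJar()
processor = urlrequest.HTTPCookieProcessor(jar)
opener = urlrequest.build_opener(processor)

# opener.open(url) would now store cookies from responses in `jar` and send
# them back on later requests; no request is made in this sketch.
print("cookies held so far: %d" % len(jar))
```
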
.. _cookie-jar-objects:

CookieJar and FileCookieJar Objects
-----------------------------------

:class:`CookieJar` objects support the :term:`iterator` protocol for iterating over
contained :class:`Cookie` objects.

:class:`CookieJar` has the following methods:


.. method:: CookieJar.add_cookie_header(request)

   Add correct :mailheader:`Cookie` header to *request*.

   If policy allows (i.e. the :attr:`rfc2965` and :attr:`hide_cookie2` attributes of
   the :class:`CookieJar`'s :class:`CookiePolicy` instance are true and false
   respectively), the :mailheader:`Cookie2` header is also added when appropriate.

   The *request* object (usually a :class:`urllib2.Request` instance) must support
   the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`get_type`,
   :meth:`unverifiable`, :meth:`get_origin_req_host`, :meth:`has_header`,
   :meth:`get_header`, :meth:`header_items`, and :meth:`add_unredirected_header`, as
   documented by :mod:`urllib2`.


.. method:: CookieJar.extract_cookies(response, request)

   Extract cookies from HTTP *response* and store them in the :class:`CookieJar`,
   where allowed by policy.

   The :class:`CookieJar` will look for allowable :mailheader:`Set-Cookie` and
   :mailheader:`Set-Cookie2` headers in the *response* argument, and store cookies
   as appropriate (subject to the :meth:`CookiePolicy.set_ok` method's approval).

   The *response* object (usually the result of a call to :meth:`urllib2.urlopen`,
   or similar) should support an :meth:`info` method, which returns an object with
   a :meth:`getallmatchingheaders` method (usually a :class:`mimetools.Message`
   instance).

   The *request* object (usually a :class:`urllib2.Request` instance) must support
   the methods :meth:`get_full_url`, :meth:`get_host`, :meth:`unverifiable`, and
   :meth:`get_origin_req_host`, as documented by :mod:`urllib2`. The request is
   used to set default values for cookie-attributes as well as for checking that
   the cookie is allowed to be set.


.. method:: CookieJar.set_policy(policy)

   Set the :class:`CookiePolicy` instance to be used.


.. method:: CookieJar.make_cookies(response, request)

   Return sequence of :class:`Cookie` objects extracted from *response* object.

   See the documentation for :meth:`extract_cookies` for the interfaces required of
   the *response* and *request* arguments.

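The interplay of :meth:`extract_cookies` and :meth:`add_cookie_header` can be
sketched as a round trip that needs no network access. ``FakeHeaders`` and
``FakeResponse`` are hypothetical stand-ins written for this example, exposing
only the header lookups the cookie jar is assumed to call; the import fallback
assumes the Python 3 module names.

```python
try:
    import cookielib                      # Python 2
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module
    from urllib.request import Request


class FakeHeaders(object):
    """Hypothetical stand-in for the header object returned by response.info()."""

    def __init__(self, values):
        # Keys must be spelled exactly "Set-Cookie" / "Set-Cookie2".
        self._values = values

    def getheaders(self, name):             # lookup used by Python 2's cookielib
        return self._values.get(name, [])

    def get_all(self, name, default=None):  # lookup used by http.cookiejar
        return self._values.get(name, default)


class FakeResponse(object):
    """Hypothetical stand-in for the object returned by urllib2.urlopen()."""

    def __init__(self, values):
        self._headers = FakeHeaders(values)

    def info(self):
        return self._headers


jar = cookielib.CookieJar()

# Step 1: the server's reply to the first request sets a cookie.
first = Request("http://www.example.com/login")
reply = FakeResponse({"Set-Cookie": ["sid=abc123; Path=/"]})
jar.extract_cookies(reply, first)

# Step 2: a later request to the same host gets the cookie attached.
second = Request("http://www.example.com/account")
jar.add_cookie_header(second)
cookie_header = second.get_header("Cookie")
print("Cookie header on second request: %s" % cookie_header)
```
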

.. method:: CookieJar.set_cookie_if_ok(cookie, request)

   Set a :class:`Cookie` if policy says it's OK to do so.


.. method:: CookieJar.set_cookie(cookie)

   Set a :class:`Cookie`, without checking with policy to see whether or not it
   should be set.


.. method:: CookieJar.clear([domain[, path[, name]]])

   Clear some cookies.

   If invoked without arguments, clear all cookies. If given a single argument,
   only cookies belonging to that *domain* will be removed. If given two arguments,
   cookies belonging to the specified *domain* and URL *path* are removed. If
   given three arguments, then the cookie with the specified *domain*, *path* and
   *name* is removed.

   Raises :exc:`KeyError` if no matching cookie exists.


.. method:: CookieJar.clear_session_cookies()

   Discard all session cookies.

   Discards all contained cookies that have a true :attr:`discard` attribute
   (usually because they had either no ``max-age`` or ``expires`` cookie-attribute,
   or an explicit ``discard`` cookie-attribute). For interactive browsers, the end
   of a session usually corresponds to closing the browser window.

   Note that the :meth:`save` method won't save session cookies anyway, unless you
   ask otherwise by passing a true *ignore_discard* argument.

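The different forms of :meth:`clear` and the effect of
:meth:`clear_session_cookies` can be illustrated with a few hand-built cookies.
This is only a sketch: ``make_cookie`` is a hypothetical helper, and it calls
the :class:`Cookie` constructor directly with positional arguments (an
assumption about the constructor's argument order, which this documentation does
not describe), something normal client code is not expected to do.

```python
import time

try:
    import cookielib                      # Python 2
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module


def make_cookie(name, domain, path="/", expires=None):
    """Hypothetical helper: a cookie without *expires* is a session cookie."""
    return cookielib.Cookie(
        0, name, "value",        # version, name, value
        None, False,             # port, port_specified
        domain, False, False,    # domain, domain_specified, domain_initial_dot
        path, True,              # path, path_specified
        False,                   # secure
        expires,                 # expires (seconds since epoch, or None)
        expires is None,         # discard
        None, None, {})          # comment, comment_url, rest


jar = cookielib.CookieJar()
later = int(time.time()) + 3600
jar.set_cookie(make_cookie("a", "www.example.com"))                 # session cookie
jar.set_cookie(make_cookie("b", "www.example.com", expires=later))
jar.set_cookie(make_cookie("c", "www.example.org", expires=later))

jar.clear_session_cookies()                  # removes only "a"
after_session = sorted(c.name for c in jar)

jar.clear("www.example.org")                 # one argument: whole domain
after_domain = sorted(c.name for c in jar)

try:
    jar.clear("www.example.com", "/", "missing")   # three arguments: one cookie
    missing_raised = False
except KeyError:
    missing_raised = True

jar.clear()                                  # no arguments: everything
print("remaining cookies: %d" % len(jar))
```
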
:class:`FileCookieJar` implements the following additional methods:


.. method:: FileCookieJar.save(filename=None, ignore_discard=False, ignore_expires=False)

   Save cookies to a file.

   This base class raises :exc:`NotImplementedError`. Subclasses may leave this
   method unimplemented.

   *filename* is the name of the file in which to save cookies. If *filename* is not
   specified, :attr:`self.filename` is used (whose default is the value passed to
   the constructor, if any); if :attr:`self.filename` is :const:`None`,
   :exc:`ValueError` is raised.

   *ignore_discard*: save even cookies set to be discarded. *ignore_expires*: save
   even cookies that have expired.

   The file is overwritten if it already exists, thus wiping all the cookies it
   contains. Saved cookies can be restored later using the :meth:`load` or
   :meth:`revert` methods.


.. method:: FileCookieJar.load(filename=None, ignore_discard=False, ignore_expires=False)

   Load cookies from a file.

   Old cookies are kept unless overwritten by newly loaded ones.

   Arguments are as for :meth:`save`.

   The named file must be in the format understood by the class, or
   :exc:`LoadError` will be raised. Also, :exc:`IOError` may be raised, for
   example if the file does not exist.

   .. note::

      For backwards-compatibility with Python 2.4 (which raised an :exc:`IOError`),
      :exc:`LoadError` is a subclass of :exc:`IOError`.


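The two failure modes of :meth:`load` can be told apart by catching
:exc:`LoadError` before :exc:`IOError`. The sketch below uses
:class:`LWPCookieJar` and a temporary file with deliberately invalid content;
the file name is made up for the example, and the import fallback assumes the
Python 3 module name.

```python
import os
import tempfile

try:
    import cookielib                      # Python 2
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module

# Write a file that is not in any cookie file format.
path = os.path.join(tempfile.mkdtemp(), "cookies.lwp")
with open(path, "w") as handle:
    handle.write("this is not a cookie file\n")

jar = cookielib.LWPCookieJar(path)
try:
    jar.load()
    outcome = "loaded"
except cookielib.LoadError:
    # Must come first: LoadError is a subclass of IOError.
    outcome = "LoadError"
except IOError:
    outcome = "IOError"

print("outcome: %s" % outcome)
```
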
.. method:: FileCookieJar.revert(filename=None, ignore_discard=False, ignore_expires=False)

   Clear all cookies and reload cookies from a saved file.

   :meth:`revert` can raise the same exceptions as :meth:`load`. If there is a
   failure, the object's state will not be altered.

:class:`FileCookieJar` instances have the following public attributes:


.. attribute:: FileCookieJar.filename

   Filename of default file in which to keep cookies. This attribute may be
   assigned to.


.. attribute:: FileCookieJar.delayload

   If true, load cookies lazily from disk. This attribute should not be assigned
   to. This is only a hint, since this only affects performance, not behaviour
   (unless the cookies on disk are changing). A :class:`CookieJar` object may
   ignore it. None of the :class:`FileCookieJar` classes included in the standard
   library lazily loads cookies.


.. _file-cookie-jar-classes:

FileCookieJar subclasses and co-operation with web browsers
-----------------------------------------------------------

The following :class:`CookieJar` subclasses are provided for reading and
writing cookie files.

.. class:: MozillaCookieJar(filename, delayload=None, policy=None)

   A :class:`FileCookieJar` that can load from and save cookies to disk in the
   Mozilla ``cookies.txt`` file format (which is also used by the Lynx and Netscape
   browsers).

   .. note::

      Version 3 of the Firefox web browser no longer writes cookies in the
      ``cookies.txt`` file format.

   .. note::

      This loses information about RFC 2965 cookies, and also about newer or
      non-standard cookie-attributes such as ``port``.

   .. warning::

      Back up your cookies before saving if you have cookies whose loss / corruption
      would be inconvenient (there are some subtleties which may lead to slight
      changes in the file over a load / save round-trip).

   Also note that cookies saved while Mozilla is running will get clobbered by
   Mozilla.


.. class:: LWPCookieJar(filename, delayload=None, policy=None)

   A :class:`FileCookieJar` that can load from and save cookies to disk in a format
   compatible with the libwww-perl library's ``Set-Cookie3`` file format. This is
   convenient if you want to store cookies in a human-readable file.


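A save / load round trip with :class:`LWPCookieJar` might look like the
following sketch, which also shows the effect of *ignore_discard*.
``make_cookie`` is a hypothetical helper that calls the :class:`Cookie`
constructor directly (an assumption about its positional argument order), and
the file is written to a temporary directory chosen for the example.

```python
import os
import tempfile
import time

try:
    import cookielib                      # Python 2
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module


def make_cookie(name, expires=None):
    """Hypothetical helper: a cookie without *expires* is a session cookie."""
    return cookielib.Cookie(
        0, name, "value",                  # version, name, value
        None, False,                       # port, port_specified
        "www.example.com", False, False,   # domain, domain_specified, domain_initial_dot
        "/", True,                         # path, path_specified
        False,                             # secure
        expires,                           # expires (seconds since epoch, or None)
        expires is None,                   # discard
        None, None, {})                    # comment, comment_url, rest


path = os.path.join(tempfile.mkdtemp(), "cookies.lwp")

jar = cookielib.LWPCookieJar(path)
jar.set_cookie(make_cookie("session_only"))
jar.set_cookie(make_cookie("persistent", expires=int(time.time()) + 3600))

# By default the session cookie is not written to the file.
jar.save()
restored = cookielib.LWPCookieJar(path)
restored.load()
default_names = sorted(c.name for c in restored)

# With ignore_discard=True on both sides, session cookies survive as well.
jar.save(ignore_discard=True)
restored.revert(ignore_discard=True)
all_names = sorted(c.name for c in restored)

print("default save:        %s" % default_names)
print("ignore_discard=True: %s" % all_names)
```
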
.. _cookie-policy-objects:

CookiePolicy Objects
--------------------

Objects implementing the :class:`CookiePolicy` interface have the following
methods:


.. method:: CookiePolicy.set_ok(cookie, request)

   Return boolean value indicating whether cookie should be accepted from server.

   *cookie* is a :class:`cookielib.Cookie` instance. *request* is an object
   implementing the interface defined by the documentation for
   :meth:`CookieJar.extract_cookies`.


.. method:: CookiePolicy.return_ok(cookie, request)

   Return boolean value indicating whether cookie should be returned to server.

   *cookie* is a :class:`cookielib.Cookie` instance. *request* is an object
   implementing the interface defined by the documentation for
   :meth:`CookieJar.add_cookie_header`.


.. method:: CookiePolicy.domain_return_ok(domain, request)

   Return false if cookies should not be returned, given cookie domain.

   This method is an optimization. It removes the need for checking every cookie
   with a particular domain (which might involve reading many files). Returning
   true from :meth:`domain_return_ok` and :meth:`path_return_ok` leaves all the
   work to :meth:`return_ok`.

   If :meth:`domain_return_ok` returns true for the cookie domain,
   :meth:`path_return_ok` is called for the cookie path. Otherwise,
   :meth:`path_return_ok` and :meth:`return_ok` are never called for that cookie
   domain. If :meth:`path_return_ok` returns true, :meth:`return_ok` is called
   with the :class:`Cookie` object itself for a full check. Otherwise,
   :meth:`return_ok` is never called for that cookie path.

   Note that :meth:`domain_return_ok` is called for every *cookie* domain, not just
   for the *request* domain. For example, the function might be called with both
   ``".example.com"`` and ``"www.example.com"`` if the request domain is
   ``"www.example.com"``. The same goes for :meth:`path_return_ok`.

   The *request* argument is as documented for :meth:`return_ok`.


.. method:: CookiePolicy.path_return_ok(path, request)

   Return false if cookies should not be returned, given cookie path.

   See the documentation for :meth:`domain_return_ok`.

In addition to implementing the methods above, implementations of the
:class:`CookiePolicy` interface must also supply the following attributes,
indicating which protocols should be used, and how. All of these attributes may
be assigned to.


.. attribute:: CookiePolicy.netscape

   Implement Netscape protocol.


.. attribute:: CookiePolicy.rfc2965

   Implement RFC 2965 protocol.


.. attribute:: CookiePolicy.hide_cookie2

   Don't add :mailheader:`Cookie2` header to requests (the presence of this header
   indicates to the server that we understand RFC 2965 cookies).

The most useful way to define a :class:`CookiePolicy` class is by subclassing
from :class:`DefaultCookiePolicy` and overriding some or all of the methods
above. :class:`CookiePolicy` itself may be used as a 'null policy' to allow
setting and receiving any and all cookies (this is unlikely to be useful).


.. _default-cookie-policy-objects:

DefaultCookiePolicy Objects
---------------------------

Implements the standard rules for accepting and returning cookies.

Both RFC 2965 and Netscape cookies are covered. RFC 2965 handling is switched
off by default.

The easiest way to provide your own policy is to subclass this class and call
its methods in your overridden implementations before adding your own additional
checks::

   import cookielib

   class MyCookiePolicy(cookielib.DefaultCookiePolicy):
       def set_ok(self, cookie, request):
           # Apply the standard checks first.
           if not cookielib.DefaultCookiePolicy.set_ok(self, cookie, request):
               return False
           # i_dont_want_to_store_this_cookie() is a placeholder for your own test.
           if i_dont_want_to_store_this_cookie(cookie):
               return False
           return True

In addition to the features required to implement the :class:`CookiePolicy`
interface, this class allows you to block and allow domains from setting and
receiving cookies. There are also some strictness switches that allow you to
tighten up the rather loose Netscape protocol rules a little bit (at the cost of
blocking some benign cookies).

A domain blacklist and whitelist is provided (both off by default). Only domains
not in the blacklist and present in the whitelist (if the whitelist is active)
participate in cookie setting and returning. Use the *blocked_domains*
constructor argument, and :meth:`blocked_domains` and
:meth:`set_blocked_domains` methods (and the corresponding argument and methods
for *allowed_domains*). If you set a whitelist, you can turn it off again by
setting it to :const:`None`.

Domains in block or allow lists that do not start with a dot must equal the
cookie domain to be matched. For example, ``"example.com"`` matches a blacklist
entry of ``"example.com"``, but ``"www.example.com"`` does not. Domains that do
start with a dot are matched by more specific domains too. For example, both
``"www.example.com"`` and ``"www.coyote.example.com"`` match ``".example.com"``
(but ``"example.com"`` itself does not). IP addresses are an exception, and
must match exactly. For example, if blocked_domains contains ``"192.168.1.2"``
and ``".168.1.2"``, 192.168.1.2 is blocked, but 193.168.1.2 is not.

:class:`DefaultCookiePolicy` implements the following additional methods:


.. method:: DefaultCookiePolicy.blocked_domains()

   Return the sequence of blocked domains (as a tuple).


.. method:: DefaultCookiePolicy.set_blocked_domains(blocked_domains)

   Set the sequence of blocked domains.


.. method:: DefaultCookiePolicy.is_blocked(domain)

   Return whether *domain* is on the blacklist for setting or receiving cookies.


.. method:: DefaultCookiePolicy.allowed_domains()

   Return :const:`None`, or the sequence of allowed domains (as a tuple).


.. method:: DefaultCookiePolicy.set_allowed_domains(allowed_domains)

   Set the sequence of allowed domains, or :const:`None`.


.. method:: DefaultCookiePolicy.is_not_allowed(domain)

   Return whether *domain* is not on the whitelist for setting or receiving
   cookies.

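The domain-matching rules for the block and allow lists can be checked directly
with :meth:`is_blocked` and :meth:`is_not_allowed`. The sketch below uses
made-up domain names, and the import fallback assumes the Python 3 module name.

```python
try:
    import cookielib                      # Python 2
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module

policy = cookielib.DefaultCookiePolicy(
    blocked_domains=["example.com", ".example.org", "192.168.1.2"])

# Entry without a leading dot: only the exact domain matches.
exact_match = policy.is_blocked("example.com")
subdomain_of_exact = policy.is_blocked("www.example.com")

# Entry with a leading dot: more specific domains match, the bare domain does not.
subdomain_of_dotted = policy.is_blocked("www.example.org")
bare_of_dotted = policy.is_blocked("example.org")

# IP addresses must match exactly.
same_ip = policy.is_blocked("192.168.1.2")
other_ip = policy.is_blocked("193.168.1.2")

# The allow list is off (None) until set, and can be switched off again.
off_by_default = policy.allowed_domains() is None
policy.set_allowed_domains(["example.net"])
listed = policy.is_not_allowed("example.net")
unlisted = policy.is_not_allowed("example.com")
policy.set_allowed_domains(None)
after_reset = policy.is_not_allowed("example.com")

print("blocked domains: %s" % (policy.blocked_domains(),))
```
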
:class:`DefaultCookiePolicy` instances have the following attributes, which are
all initialised from the constructor arguments of the same name, and which may
all be assigned to.


.. attribute:: DefaultCookiePolicy.rfc2109_as_netscape

   If true, request that the :class:`CookieJar` instance downgrade RFC 2109 cookies
   (i.e. cookies received in a :mailheader:`Set-Cookie` header with a version
   cookie-attribute of 1) to Netscape cookies by setting the version attribute of
   the :class:`Cookie` instance to 0. The default value is :const:`None`, in which
   case RFC 2109 cookies are downgraded if and only if RFC 2965 handling is turned
   off. Therefore, RFC 2109 cookies are downgraded by default.

   .. versionadded:: 2.5

General strictness switches:


.. attribute:: DefaultCookiePolicy.strict_domain

   Don't allow sites to set two-component domains with country-code top-level
   domains like ``.co.uk``, ``.gov.uk``, ``.co.nz``, etc. This is far from perfect
   and isn't guaranteed to work!

RFC 2965 protocol strictness switches:


.. attribute:: DefaultCookiePolicy.strict_rfc2965_unverifiable

   Follow RFC 2965 rules on unverifiable transactions (usually, an unverifiable
   transaction is one resulting from a redirect or a request for an image hosted on
   another site). If this is false, cookies are *never* blocked on the basis of
   verifiability.

Netscape protocol strictness switches:


.. attribute:: DefaultCookiePolicy.strict_ns_unverifiable

   Apply RFC 2965 rules on unverifiable transactions even to Netscape cookies.


.. attribute:: DefaultCookiePolicy.strict_ns_domain

   Flags indicating how strict to be with domain-matching rules for Netscape
   cookies. See below for acceptable values.


.. attribute:: DefaultCookiePolicy.strict_ns_set_initial_dollar

   Ignore cookies in :mailheader:`Set-Cookie` headers that have names starting
   with ``'$'``.


.. attribute:: DefaultCookiePolicy.strict_ns_set_path

   Don't allow setting cookies whose path doesn't path-match request URI.

:attr:`strict_ns_domain` is a collection of flags. Its value is constructed by
or-ing together (for example, ``DomainStrictNoDots|DomainStrictNonDomain`` means
both flags are set).


.. attribute:: DefaultCookiePolicy.DomainStrictNoDots

   When setting cookies, the 'host prefix' must not contain a dot (e.g.
   ``www.foo.bar.com`` can't set a cookie for ``.bar.com``, because ``www.foo``
   contains a dot).


.. attribute:: DefaultCookiePolicy.DomainStrictNonDomain

   Cookies that did not explicitly specify a ``domain`` cookie-attribute can only
   be returned to a domain equal to the domain that set the cookie (e.g.
   ``spam.example.com`` won't be returned cookies from ``example.com`` that had no
   ``domain`` cookie-attribute).


.. attribute:: DefaultCookiePolicy.DomainRFC2965Match

   When setting cookies, require a full RFC 2965 domain-match.

The following attributes are provided for convenience, and are the most useful
combinations of the above flags:


.. attribute:: DefaultCookiePolicy.DomainLiberal

   Equivalent to 0 (i.e. all of the above Netscape domain strictness flags switched
   off).


.. attribute:: DefaultCookiePolicy.DomainStrict

   Equivalent to ``DomainStrictNoDots|DomainStrictNonDomain``.


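The effect of the :attr:`strict_ns_domain` flags can be sketched by asking two
policies about the same cookie, reusing the ``www.foo.bar.com`` example from the
description of ``DomainStrictNoDots``. The cookie is built by calling the
:class:`Cookie` constructor directly, which is an assumption about its
positional argument order made for illustration only; the import fallback
assumes the Python 3 module names.

```python
try:
    import cookielib                      # Python 2
    from urllib2 import Request
except ImportError:
    import http.cookiejar as cookielib    # Python 3 name of the same module
    from urllib.request import Request

Policy = cookielib.DefaultCookiePolicy

# Flags are combined by or-ing them together.
strict = Policy.DomainStrictNoDots | Policy.DomainStrictNonDomain

# A Netscape (version 0) cookie that www.foo.bar.com tries to set for .bar.com.
cookie = cookielib.Cookie(
    0, "k", "v",             # version, name, value
    None, False,             # port, port_specified
    ".bar.com", True, True,  # domain, domain_specified, domain_initial_dot
    "/", True,               # path, path_specified
    False,                   # secure
    None, True,              # expires, discard
    None, None, {})          # comment, comment_url, rest
request = Request("http://www.foo.bar.com/")

liberal_policy = Policy(strict_ns_domain=Policy.DomainLiberal)
no_dots_policy = Policy(strict_ns_domain=Policy.DomainStrictNoDots)

accepted_liberal = liberal_policy.set_ok(cookie, request)
accepted_no_dots = no_dots_policy.set_ok(cookie, request)

print("DomainLiberal accepts the cookie:      %s" % accepted_liberal)
print("DomainStrictNoDots accepts the cookie: %s" % accepted_no_dots)
```
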
| 606 | .. _cookielib-cookie-objects:
|
---|
| 607 |
|
---|
| 608 | Cookie Objects
|
---|
| 609 | --------------
|
---|
| 610 |
|
---|
| 611 | :class:`Cookie` instances have Python attributes roughly corresponding to the
|
---|
| 612 | standard cookie-attributes specified in the various cookie standards. The
|
---|
| 613 | correspondence is not one-to-one, because there are complicated rules for
|
---|
| 614 | assigning default values, because the ``max-age`` and ``expires``
|
---|
| 615 | cookie-attributes contain equivalent information, and because RFC 2109 cookies
|
---|
| 616 | may be 'downgraded' by :mod:`cookielib` from version 1 to version 0 (Netscape)
|
---|
| 617 | cookies.
|
---|
| 618 |
|
---|
| 619 | Assignment to these attributes should not be necessary other than in rare
|
---|
| 620 | circumstances in a :class:`CookiePolicy` method. The class does not enforce
|
---|
| 621 | internal consistency, so you should know what you're doing if you do that.
|
---|
| 622 |
|
---|
| 623 |
|
---|
.. attribute:: Cookie.version

   Integer or :const:`None`.  Netscape cookies have :attr:`version` 0.  RFC 2965
   and RFC 2109 cookies have a ``version`` cookie-attribute of 1.  However, note
   that :mod:`cookielib` may 'downgrade' RFC 2109 cookies to Netscape cookies, in
   which case :attr:`version` is 0.


.. attribute:: Cookie.name

   Cookie name (a string).


.. attribute:: Cookie.value

   Cookie value (a string), or :const:`None`.


.. attribute:: Cookie.port

   String representing a port or a set of ports (e.g. ``'80'`` or
   ``'80,8080'``), or :const:`None`.


.. attribute:: Cookie.path

   Cookie path (a string, e.g. ``'/acme/rocket_launchers'``).


.. attribute:: Cookie.secure

   True if the cookie should only be returned over a secure connection.


.. attribute:: Cookie.expires

   Integer expiry date in seconds since the epoch, or :const:`None`.  See also
   the :meth:`is_expired` method.


.. attribute:: Cookie.discard

   True if this is a session cookie.


.. attribute:: Cookie.comment

   String comment from the server explaining the function of this cookie, or
   :const:`None`.


.. attribute:: Cookie.comment_url

   URL linking to a comment from the server explaining the function of this
   cookie, or :const:`None`.


.. attribute:: Cookie.rfc2109

   True if this cookie was received as an RFC 2109 cookie (i.e. the cookie
   arrived in a :mailheader:`Set-Cookie` header, and the value of the Version
   cookie-attribute in that header was 1).  This attribute is provided because
   :mod:`cookielib` may 'downgrade' RFC 2109 cookies to Netscape cookies, in
   which case :attr:`version` is 0.

   .. versionadded:: 2.5


.. attribute:: Cookie.port_specified

   True if a port or set of ports was explicitly specified by the server (in the
   :mailheader:`Set-Cookie` / :mailheader:`Set-Cookie2` header).


.. attribute:: Cookie.domain_specified

   True if a domain was explicitly specified by the server.


.. attribute:: Cookie.domain_initial_dot

   True if the domain explicitly specified by the server began with a dot
   (``'.'``).

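
The attributes above can be read from any :class:`Cookie` instance.  The sketch
below builds one by hand purely for illustration; in normal use cookies are
created by a :class:`CookieJar` from server responses rather than constructed
directly.  The cookie values are made up, and the ``try``/``except`` import is
an assumption that lets the snippet run on Python 3 as well::

   try:
       import cookielib                      # Python 2
   except ImportError:
       import http.cookiejar as cookielib    # Python 3 name of the same module

   # Hypothetical cookie, as if the server had sent:
   #   Set-Cookie: session=abc123; Domain=.example.com; Path=/acme; Secure
   cookie = cookielib.Cookie(
       version=0, name="session", value="abc123",
       port=None, port_specified=False,
       domain=".example.com", domain_specified=True, domain_initial_dot=True,
       path="/acme", path_specified=True,
       secure=True, expires=None, discard=True,
       comment=None, comment_url=None, rest={})

   summary = (cookie.name, cookie.value, cookie.version, cookie.path)

   # No port was given, so port is None and port_specified is false.
   port_info = (cookie.port, cookie.port_specified)

   # expires is None and discard is true: this is a session cookie.
   is_session = cookie.discard and cookie.expires is None
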
Cookies may have additional non-standard cookie-attributes.  These may be
accessed using the following methods:


.. method:: Cookie.has_nonstandard_attr(name)

   Return true if cookie has the named cookie-attribute.


.. method:: Cookie.get_nonstandard_attr(name, default=None)

   If cookie has the named cookie-attribute, return its value.  Otherwise, return
   *default*.


.. method:: Cookie.set_nonstandard_attr(name, value)

   Set the value of the named cookie-attribute.

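
A minimal sketch of these three methods follows.  It assumes that the ``rest``
argument of the :class:`Cookie` constructor is the mapping that holds the
non-standard cookie-attributes; the attribute names ``HttpOnly`` and
``SameSite`` and the hand-built cookie are illustrative only, and the
``try``/``except`` import is an assumption for Python 3 compatibility::

   try:
       import cookielib                      # Python 2
   except ImportError:
       import http.cookiejar as cookielib    # Python 3 name of the same module

   # Hypothetical cookie carrying one non-standard cookie-attribute.
   cookie = cookielib.Cookie(
       version=0, name="session", value="abc123",
       port=None, port_specified=False,
       domain="example.com", domain_specified=False, domain_initial_dot=False,
       path="/", path_specified=False,
       secure=False, expires=None, discard=True,
       comment=None, comment_url=None, rest={"HttpOnly": None})

   has_httponly = cookie.has_nonstandard_attr("HttpOnly")

   # A missing attribute falls back to the supplied default.
   missing = cookie.get_nonstandard_attr("SameSite", "unset")

   # After setting it, the value is returned instead of the default.
   cookie.set_nonstandard_attr("SameSite", "Lax")
   samesite = cookie.get_nonstandard_attr("SameSite", "unset")
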
The :class:`Cookie` class also defines the following method:


.. method:: Cookie.is_expired(now=None)

   True if the cookie has passed the time at which the server requested it
   should expire.  If *now* is given (in seconds since the epoch), return whether
   the cookie has expired at the specified time.


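
The effect of the *now* argument can be sketched as follows.  The helper
``make_cookie`` and the expiry time of 1000 seconds after the epoch are
hypothetical, chosen only to keep the example short, and the ``try``/``except``
import is an assumption for Python 3 compatibility::

   try:
       import cookielib                      # Python 2
   except ImportError:
       import http.cookiejar as cookielib    # Python 3 name of the same module

   def make_cookie(expires):
       """Hypothetical helper: build a cookie with the given expiry time."""
       return cookielib.Cookie(
           version=0, name="token", value="xyz",
           port=None, port_specified=False,
           domain="example.com", domain_specified=False,
           domain_initial_dot=False,
           path="/", path_specified=False,
           secure=False, expires=expires, discard=(expires is None),
           comment=None, comment_url=None, rest={})

   persistent = make_cookie(expires=1000)
   not_yet = persistent.is_expired(now=999)     # before the expiry time
   already = persistent.is_expired(now=1001)    # after the expiry time

   # A session cookie has expires set to None, so it has no expiry time
   # to pass.
   session = make_cookie(expires=None)
   session_expired = session.is_expired()
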
.. _cookielib-examples:

Examples
--------

The first example shows the most common usage of :mod:`cookielib`::

   import cookielib, urllib2
   cj = cookielib.CookieJar()
   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
   r = opener.open("http://example.com/")

This example illustrates how to open a URL using your Netscape, Mozilla, or Lynx
cookies (assumes Unix/Netscape convention for location of the cookies file)::

   import os, cookielib, urllib2
   cj = cookielib.MozillaCookieJar()
   cj.load(os.path.join(os.path.expanduser("~"), ".netscape", "cookies.txt"))
   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
   r = opener.open("http://example.com/")

The next example illustrates the use of :class:`DefaultCookiePolicy`.  It turns
on RFC 2965 cookies, is stricter about domains when setting and returning
Netscape cookies, and blocks some domains from setting cookies or having them
returned::

   import urllib2
   from cookielib import CookieJar, DefaultCookiePolicy
   policy = DefaultCookiePolicy(
       rfc2965=True, strict_ns_domain=DefaultCookiePolicy.DomainStrict,
       blocked_domains=["ads.net", ".ads.net"])
   cj = CookieJar(policy)
   opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
   r = opener.open("http://example.com/")

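
The blocking behaviour of such a policy can be checked without making a network
request.  The sketch below builds the same policy and queries it with
:meth:`DefaultCookiePolicy.is_blocked`; the host names tested are illustrative,
and the ``try``/``except`` import is an assumption that lets the snippet run on
Python 3, where the module is named :mod:`http.cookiejar`::

   try:
       from cookielib import DefaultCookiePolicy          # Python 2
   except ImportError:
       from http.cookiejar import DefaultCookiePolicy     # Python 3

   policy = DefaultCookiePolicy(
       rfc2965=True, strict_ns_domain=DefaultCookiePolicy.DomainStrict,
       blocked_domains=["ads.net", ".ads.net"])

   # "ads.net" blocks that exact host; ".ads.net" also blocks its subdomains.
   blocked = [policy.is_blocked(host)
              for host in ("ads.net", "www.ads.net", "example.com")]

   # The settings passed to the constructor are available as attributes.
   settings = (policy.rfc2965,
               policy.strict_ns_domain == DefaultCookiePolicy.DomainStrict)
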