:mod:`urllib2` --- extensible library for opening URLs
======================================================

.. module:: urllib2
   :synopsis: Next generation URL opening library.
.. moduleauthor:: Jeremy Hylton <jhylton@users.sourceforge.net>
.. sectionauthor:: Moshe Zadka <moshez@users.sourceforge.net>


.. note::
   The :mod:`urllib2` module has been split across several modules in
   Python 3 named :mod:`urllib.request` and :mod:`urllib.error`.
   The :term:`2to3` tool will automatically adapt imports when converting
   your sources to Python 3.


The :mod:`urllib2` module defines functions and classes which help in opening
URLs (mostly HTTP) in a complex world --- basic and digest authentication,
redirections, cookies and more.


The :mod:`urllib2` module defines the following functions:


.. function:: urlopen(url[, data][, timeout])

   Open the URL *url*, which can be either a string or a :class:`Request` object.

   .. warning::
      HTTPS requests do not do any verification of the server's certificate.

   *data* may be a string specifying additional data to send to the server, or
   ``None`` if no such data is needed. Currently HTTP requests are the only ones
   that use *data*; the HTTP request will be a POST instead of a GET when the
   *data* parameter is provided. *data* should be a buffer in the standard
   :mimetype:`application/x-www-form-urlencoded` format. The
   :func:`urllib.urlencode` function takes a mapping or sequence of 2-tuples and
   returns a string in this format. The :mod:`urllib2` module sends HTTP/1.1
   requests with a ``Connection: close`` header included.

   The optional *timeout* parameter specifies a timeout in seconds for blocking
   operations like the connection attempt (if not specified, the global default
   timeout setting will be used). This only works for HTTP, HTTPS and
   FTP connections.

   This function returns a file-like object with three additional methods:

   * :meth:`geturl` --- return the URL of the resource retrieved, commonly used to
     determine if a redirect was followed

   * :meth:`info` --- return the meta-information of the page, such as headers,
     in the form of an :class:`mimetools.Message` instance
     (see `Quick Reference to HTTP Headers <http://www.cs.tut.fi/~jkorpela/http.html>`_)

   * :meth:`getcode` --- return the HTTP status code of the response.

   Raises :exc:`URLError` on errors.

   Note that ``None`` may be returned if no handler handles the request (though the
   default installed global :class:`OpenerDirector` uses :class:`UnknownHandler` to
   ensure this never happens).

   In addition, if proxy settings are detected (for example, when a ``*_proxy``
   environment variable like :envvar:`http_proxy` is set),
   :class:`ProxyHandler` is installed by default and makes sure the requests are
   handled through the proxy.

   .. versionchanged:: 2.6
      *timeout* was added.


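As a rough illustration of the object returned by :func:`urlopen`, the sketch below opens a temporary local file through a ``file:`` URL so that it needs no network access. The temporary file, its contents and the Python 3 fallback imports are assumptions made for this example only.

```python
import os
import tempfile

try:
    import urllib2                      # Python 2: the module documented here
    from urllib import pathname2url
except ImportError:                     # assumption: Python 3 stand-ins
    import urllib.request as urllib2
    from urllib.request import pathname2url

# Create a throwaway local file so the example needs no network access.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as handle:
    handle.write(b"hello from urllib2\n")

url = "file:" + pathname2url(path)
response = urllib2.urlopen(url)
try:
    body = response.read()              # ordinary file-like interface
    final_url = response.geturl()       # URL of the resource retrieved
    headers = response.info()           # meta-information (headers)
finally:
    response.close()
    os.remove(path)

content_length = headers.get("Content-Length")
```

For HTTP URLs the same three calls apply, and :meth:`getcode` additionally reports the status code.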
.. function:: install_opener(opener)

   Install an :class:`OpenerDirector` instance as the default global opener.
   Installing an opener is only necessary if you want :func:`urlopen` to use that
   opener; otherwise, simply call :meth:`OpenerDirector.open` instead of
   :func:`urlopen`. The code does not check for a real :class:`OpenerDirector`,
   and any class with the appropriate interface will work.


.. function:: build_opener([handler, ...])

   Return an :class:`OpenerDirector` instance, which chains the handlers in the
   order given. *handler*\s can be either instances of :class:`BaseHandler`, or
   subclasses of :class:`BaseHandler` (in which case it must be possible to call
   the constructor without any parameters). Instances of the following classes
   will be in front of the *handler*\s, unless the *handler*\s contain them,
   instances of them or subclasses of them: :class:`ProxyHandler` (if proxy
   settings are detected),
   :class:`UnknownHandler`, :class:`HTTPHandler`, :class:`HTTPDefaultErrorHandler`,
   :class:`HTTPRedirectHandler`, :class:`FTPHandler`, :class:`FileHandler`,
   :class:`HTTPErrorProcessor`.

   If the Python installation has SSL support (i.e., if the :mod:`ssl` module can be imported),
   :class:`HTTPSHandler` will also be added.

   Beginning in Python 2.3, a :class:`BaseHandler` subclass may also change its
   :attr:`handler_order` attribute to modify its position in the handlers
   list.

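The difference between using an opener directly and installing it globally can be sketched as follows. ``GreetingHandler`` and the ``greet:`` scheme are hypothetical, ``addinfourl`` is a helper that the module imports but does not document here, and the Python 3 fallback import is an assumption that keeps the sketch runnable on both major versions.

```python
import io

try:
    import urllib2                      # Python 2: the module documented here
except ImportError:                     # assumption: Python 3 stand-in
    import urllib.request as urllib2


class GreetingHandler(urllib2.BaseHandler):
    """Hypothetical handler for a made-up ``greet:`` URL scheme."""

    def greet_open(self, req):
        # Return a canned, file-like response instead of touching the network.
        body = io.BytesIO(b"hello")
        return urllib2.addinfourl(body, {}, req.get_full_url())


# A class (not an instance) is accepted because its constructor takes no arguments.
opener = urllib2.build_opener(GreetingHandler)

# Option 1: use the opener directly.
direct = opener.open("greet://example/").read()

# Option 2: install it so that plain urlopen() uses it as well.
urllib2.install_opener(opener)
via_urlopen = urllib2.urlopen("greet://example/").read()
```

Installing changes process-wide state, so calling ``opener.open()`` directly is usually the less intrusive choice in library code.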
The following exceptions are raised as appropriate:


.. exception:: URLError

   The handlers raise this exception (or derived exceptions) when they run into a
   problem. It is a subclass of :exc:`IOError`.

   .. attribute:: reason

      The reason for this error. It can be a message string or another exception
      instance (:exc:`socket.error` for remote URLs, :exc:`OSError` for local
      URLs).


.. exception:: HTTPError

   Though being an exception (a subclass of :exc:`URLError`), an :exc:`HTTPError`
   can also function as a non-exceptional file-like return value (the same thing
   that :func:`urlopen` returns). This is useful when handling exotic HTTP
   errors, such as requests for authentication.

   .. attribute:: code

      An HTTP status code as defined in `RFC 2616 <http://www.faqs.org/rfcs/rfc2616.html>`_.
      This numeric value corresponds to a value found in the dictionary of
      codes as found in :attr:`BaseHTTPServer.BaseHTTPRequestHandler.responses`.

   .. attribute:: reason

      The reason for this error. It can be a message string or another exception
      instance.

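Because :exc:`HTTPError` derives from :exc:`URLError`, the more specific exception has to be caught first. The following sketch triggers a :exc:`URLError` without network access by opening a ``file:`` URL for a path that does not exist; the path and the Python 3 fallback imports are assumptions for illustration.

```python
import os
import tempfile

try:
    import urllib2                      # Python 2: the module documented here
    from urllib import pathname2url
except ImportError:                     # assumption: Python 3 stand-ins
    import urllib.request as urllib2
    from urllib.request import pathname2url

# Build a file: URL for a path inside a fresh, empty directory.
missing_dir = tempfile.mkdtemp()
missing_url = "file:" + pathname2url(os.path.join(missing_dir, "no-such-file"))

caught = None
try:
    urllib2.urlopen(missing_url)
except urllib2.HTTPError as err:
    # HTTPError is a URLError subclass, so it has to be caught first.
    caught = ("http", err.code)
except urllib2.URLError as err:
    # For local URLs the reason is typically an OSError instance.
    caught = ("url", err.reason)
finally:
    os.rmdir(missing_dir)
```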
The following classes are provided:


.. class:: Request(url[, data][, headers][, origin_req_host][, unverifiable])

   This class is an abstraction of a URL request.

   *url* should be a string containing a valid URL.

   *data* may be a string specifying additional data to send to the server, or
   ``None`` if no such data is needed. Currently HTTP requests are the only ones
   that use *data*; the HTTP request will be a POST instead of a GET when the
   *data* parameter is provided. *data* should be a buffer in the standard
   :mimetype:`application/x-www-form-urlencoded` format. The
   :func:`urllib.urlencode` function takes a mapping or sequence of 2-tuples and
   returns a string in this format.

   *headers* should be a dictionary, and will be treated as if :meth:`add_header`
   was called with each key and value as arguments. This is often used to "spoof"
   the ``User-Agent`` header, which is used by a browser to identify itself --
   some HTTP servers only allow requests coming from common browsers as opposed
   to scripts. For example, Mozilla Firefox may identify itself as ``"Mozilla/5.0
   (X11; U; Linux i686) Gecko/20071127 Firefox/2.0.0.11"``, while :mod:`urllib2`'s
   default user agent string is ``"Python-urllib/2.6"`` (on Python 2.6).

   The final two arguments are only of interest for correct handling of third-party
   HTTP cookies:

   *origin_req_host* should be the request-host of the origin transaction, as
   defined by :rfc:`2965`. It defaults to ``cookielib.request_host(self)``. This
   is the host name or IP address of the original request that was initiated by the
   user. For example, if the request is for an image in an HTML document, this
   should be the request-host of the request for the page containing the image.

   *unverifiable* should indicate whether the request is unverifiable, as defined
   by :rfc:`2965`. It defaults to ``False``. An unverifiable request is one whose URL
   the user did not have the option to approve. For example, if the request is for
   an image in an HTML document, and the user had no option to approve the
   automatic fetching of the image, this should be true.


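How the *data* argument switches a :class:`Request` from GET to POST can be sketched like this. The URLs, form fields and user agent string are hypothetical, and the Python 3 fallback imports are an assumption; nothing is sent over the network because the requests are only constructed.

```python
try:
    import urllib2                      # Python 2: the module documented here
    from urllib import urlencode
except ImportError:                     # assumption: Python 3 stand-ins
    import urllib.request as urllib2
    from urllib.parse import urlencode

# Hypothetical form fields, encoded as application/x-www-form-urlencoded.
form = urlencode([("name", "Ada"), ("lang", "python")]).encode("ascii")
headers = {"User-Agent": "example-client/0.1"}

get_req = urllib2.Request("http://www.example.com/search")
post_req = urllib2.Request("http://www.example.com/submit", form, headers)

# Nothing is sent until a request is passed to urlopen() or an opener.
methods = (get_req.get_method(), post_req.get_method())
```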
.. class:: OpenerDirector()

   The :class:`OpenerDirector` class opens URLs via :class:`BaseHandler`\ s chained
   together. It manages the chaining of handlers, and recovery from errors.


.. class:: BaseHandler()

   This is the base class for all registered handlers --- and handles only the
   simple mechanics of registration.


.. class:: HTTPDefaultErrorHandler()

   A class which defines a default handler for HTTP error responses; all responses
   are turned into :exc:`HTTPError` exceptions.


.. class:: HTTPRedirectHandler()

   A class to handle redirections.


.. class:: HTTPCookieProcessor([cookiejar])

   A class to handle HTTP Cookies.


.. class:: ProxyHandler([proxies])

   Cause requests to go through a proxy. If *proxies* is given, it must be a
   dictionary mapping protocol names to URLs of proxies. The default is to read
   the list of proxies from the environment variables
   :envvar:`<protocol>_proxy`. If no proxy environment variables are set, then
   in a Windows environment proxy settings are obtained from the registry's
   Internet Settings section, and in a Mac OS X environment proxy information
   is retrieved from the OS X System Configuration Framework.

   To disable autodetected proxies, pass an empty dictionary.


.. class:: HTTPPasswordMgr()

   Keep a database of ``(realm, uri) -> (user, password)`` mappings.


.. class:: HTTPPasswordMgrWithDefaultRealm()

   Keep a database of ``(realm, uri) -> (user, password)`` mappings. A realm of
   ``None`` is considered a catch-all realm, which is searched if no other realm
   fits.


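The catch-all realm can be illustrated with a short sketch. It relies on the password manager methods ``add_password()`` and ``find_user_password()``, which belong to the interface covered in :ref:`http-password-mgr`; the URLs, realm name and credentials are hypothetical, and the Python 3 fallback import is an assumption.

```python
try:
    import urllib2                      # Python 2: the module documented here
except ImportError:                     # assumption: Python 3 stand-in
    import urllib.request as urllib2

manager = urllib2.HTTPPasswordMgrWithDefaultRealm()

# realm=None registers catch-all credentials (hypothetical URL, user, password).
manager.add_password(None, "http://www.example.com/", "alice", "s3cret")

# Any realm below that URL falls back to the catch-all entry.
found = manager.find_user_password("Members Area", "http://www.example.com/private/")
unknown = manager.find_user_password("Members Area", "http://other.example.org/")

# The manager is normally handed to an authentication handler.
auth_handler = urllib2.HTTPBasicAuthHandler(manager)
opener = urllib2.build_opener(auth_handler)
```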
.. class:: AbstractBasicAuthHandler([password_mgr])

   This is a mixin class that helps with HTTP authentication, both to the remote
   host and to a proxy. *password_mgr*, if given, should be something that is
   compatible with :class:`HTTPPasswordMgr`; refer to section
   :ref:`http-password-mgr` for information on the interface that must be
   supported.


.. class:: HTTPBasicAuthHandler([password_mgr])

   Handle authentication with the remote host. *password_mgr*, if given, should be
   something that is compatible with :class:`HTTPPasswordMgr`; refer to section
   :ref:`http-password-mgr` for information on the interface that must be
   supported.


.. class:: ProxyBasicAuthHandler([password_mgr])

   Handle authentication with the proxy. *password_mgr*, if given, should be
   something that is compatible with :class:`HTTPPasswordMgr`; refer to section
   :ref:`http-password-mgr` for information on the interface that must be
   supported.


.. class:: AbstractDigestAuthHandler([password_mgr])

   This is a mixin class that helps with HTTP authentication, both to the remote
   host and to a proxy. *password_mgr*, if given, should be something that is
   compatible with :class:`HTTPPasswordMgr`; refer to section
   :ref:`http-password-mgr` for information on the interface that must be
   supported.


.. class:: HTTPDigestAuthHandler([password_mgr])

   Handle authentication with the remote host. *password_mgr*, if given, should be
   something that is compatible with :class:`HTTPPasswordMgr`; refer to section
   :ref:`http-password-mgr` for information on the interface that must be
   supported.


.. class:: ProxyDigestAuthHandler([password_mgr])

   Handle authentication with the proxy. *password_mgr*, if given, should be
   something that is compatible with :class:`HTTPPasswordMgr`; refer to section
   :ref:`http-password-mgr` for information on the interface that must be
   supported.


.. class:: HTTPHandler()

   A class to handle opening of HTTP URLs.


.. class:: HTTPSHandler()

   A class to handle opening of HTTPS URLs.


.. class:: FileHandler()

   Open local files.


.. class:: FTPHandler()

   Open FTP URLs.


.. class:: CacheFTPHandler()

   Open FTP URLs, keeping a cache of open FTP connections to minimize delays.


.. class:: UnknownHandler()

   A catch-all class to handle unknown URLs.


.. class:: HTTPErrorProcessor()

   Process HTTP error responses.


.. _request-objects:

Request Objects
---------------

The following methods describe all of :class:`Request`'s public interface, and
so all must be overridden in subclasses.


.. method:: Request.add_data(data)

   Set the :class:`Request` data to *data*. This is ignored by all handlers except
   HTTP handlers --- and there it should be a byte string, and will change the
   request to be ``POST`` rather than ``GET``.


.. method:: Request.get_method()

   Return a string indicating the HTTP request method. This is only meaningful for
   HTTP requests, and currently always returns ``'GET'`` or ``'POST'``.


.. method:: Request.has_data()

   Return whether the instance has a non-\ ``None`` data.


.. method:: Request.get_data()

   Return the instance's data.


.. method:: Request.add_header(key, val)

   Add another header to the request. Headers are currently ignored by all
   handlers except HTTP handlers, where they are added to the list of headers sent
   to the server. Note that there cannot be more than one header with the same
   name, and later calls will overwrite previous calls in case the *key* collides.
   Currently, this is no loss of HTTP functionality, since all headers which have
   meaning when used more than once have a (header-specific) way of gaining the
   same functionality using only one header.


.. method:: Request.add_unredirected_header(key, header)

   Add a header that will not be added to a redirected request.

   .. versionadded:: 2.4


.. method:: Request.has_header(header)

   Return whether the instance has the named header (checks both regular and
   unredirected).

   .. versionadded:: 2.4


.. method:: Request.get_full_url()

   Return the URL given in the constructor.


.. method:: Request.get_type()

   Return the type of the URL --- also known as the scheme.


.. method:: Request.get_host()

   Return the host to which a connection will be made.


.. method:: Request.get_selector()

   Return the selector --- the part of the URL that is sent to the server.


.. method:: Request.get_header(header_name, default=None)

   Return the value of the given header. If the header is not present, return
   the default value.


.. method:: Request.header_items()

   Return a list of tuples (header_name, header_value) of the Request headers.


.. method:: Request.set_proxy(host, type)

   Prepare the request by connecting to a proxy server. The *host* and *type* will
   replace those of the instance, and the instance's selector will be the original
   URL given in the constructor.


.. method:: Request.get_origin_req_host()

   Return the request-host of the origin transaction, as defined by :rfc:`2965`.
   See the documentation for the :class:`Request` constructor.


.. method:: Request.is_unverifiable()

   Return whether the request is unverifiable, as defined by :rfc:`2965`. See the
   documentation for the :class:`Request` constructor.


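The header-related methods can be exercised without sending anything, as in the sketch below. The URL and header values are hypothetical and the Python 3 fallback import is an assumption. The lookups use the capitalized spelling (``X-token``) because, in the implementations this sketch was checked against, header names are stored in that form.

```python
try:
    import urllib2                      # Python 2: the module documented here
except ImportError:                     # assumption: Python 3 stand-in
    import urllib.request as urllib2

req = urllib2.Request("http://www.example.com/")   # hypothetical URL

req.add_header("Accept", "text/html")
req.add_header("Accept", "application/json")       # same key: replaces the value
req.add_unredirected_header("X-Token", "abc123")   # not copied to a redirected request

accept = req.get_header("Accept")
has_token = req.has_header("X-token")              # checks regular and unredirected
missing = req.get_header("X-missing", "fallback")  # default for an absent header
all_headers = dict(req.header_items())
```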
.. _opener-director-objects:

OpenerDirector Objects
----------------------

:class:`OpenerDirector` instances have the following methods:


.. method:: OpenerDirector.add_handler(handler)

   *handler* should be an instance of :class:`BaseHandler`. The following
   methods are searched, and added to the possible chains (note that HTTP errors
   are a special case).

   * :samp:`{protocol}_open` --- signal that the handler knows how to open
     *protocol* URLs.

   * :samp:`http_error_{type}` --- signal that the handler knows how to handle
     HTTP errors with HTTP error code *type*.

   * :samp:`{protocol}_error` --- signal that the handler knows how to handle
     errors from (non-\ ``http``) *protocol*.

   * :samp:`{protocol}_request` --- signal that the handler knows how to
     pre-process *protocol* requests.

   * :samp:`{protocol}_response` --- signal that the handler knows how to
     post-process *protocol* responses.


.. method:: OpenerDirector.open(url[, data][, timeout])

   Open the given *url* (which can be a request object or a string), optionally
   passing the given *data*. Arguments, return values and exceptions raised are
   the same as those of :func:`urlopen` (which simply calls the :meth:`open`
   method on the currently installed global :class:`OpenerDirector`). The
   optional *timeout* parameter specifies a timeout in seconds for blocking
   operations like the connection attempt (if not specified, the global default
   timeout setting will be used). The timeout feature works only for
   HTTP, HTTPS and FTP connections.

   .. versionchanged:: 2.6
      *timeout* was added.


.. method:: OpenerDirector.error(proto[, arg[, ...]])

   Handle an error of the given protocol. This will call the registered error
   handlers for the given protocol with the given arguments (which are protocol
   specific). The HTTP protocol is a special case which uses the HTTP response
   code to determine the specific error handler; refer to the :meth:`http_error_\*`
   methods of the handler classes.

   Return values and exceptions raised are the same as those of :func:`urlopen`.

:class:`OpenerDirector` objects open URLs in three stages. Within each stage,
the order in which the handler methods are called is determined by sorting the
handler instances.

#. Every handler with a method named like :samp:`{protocol}_request` has that
   method called to pre-process the request.

#. Handlers with a method named like :samp:`{protocol}_open` are called to handle
   the request. This stage ends when a handler either returns a non-\ :const:`None`
   value (i.e., a response), or raises an exception (usually :exc:`URLError`).
   Exceptions are allowed to propagate.

   In fact, the above algorithm is first tried for methods named
   :meth:`default_open`. If all such methods return :const:`None`, the
   algorithm is repeated for methods named like :samp:`{protocol}_open`. If all
   such methods return :const:`None`, the algorithm is repeated for methods
   named :meth:`unknown_open`.

   Note that the implementation of these methods may involve calls of the parent
   :class:`OpenerDirector` instance's :meth:`~OpenerDirector.open` and
   :meth:`~OpenerDirector.error` methods.

#. Every handler with a method named like :samp:`{protocol}_response` has that
   method called to post-process the response.


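The three stages can be observed with a small tracing handler. ``TraceHandler`` and the ``demo:`` scheme are hypothetical, ``addinfourl`` is an undocumented helper used here only to build a file-like response, and the Python 3 fallback import is an assumption.

```python
import io

try:
    import urllib2                      # Python 2: the module documented here
except ImportError:                     # assumption: Python 3 stand-in
    import urllib.request as urllib2

calls = []   # records the order in which the opener invokes the methods


class TraceHandler(urllib2.BaseHandler):
    """Hypothetical handler for a made-up ``demo:`` scheme."""

    def demo_request(self, req):
        calls.append("request")          # stage 1: pre-process the request
        return req

    def demo_open(self, req):
        calls.append("open")             # stage 2: produce a response
        return urllib2.addinfourl(io.BytesIO(b"payload"), {}, req.get_full_url())

    def demo_response(self, req, response):
        calls.append("response")         # stage 3: post-process the response
        return response


opener = urllib2.OpenerDirector()
opener.add_handler(TraceHandler())
body = opener.open("demo://example/resource").read()
```

A bare :class:`OpenerDirector` is used instead of :func:`build_opener` so that only the tracing handler takes part in the chain.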
.. _base-handler-objects:

BaseHandler Objects
-------------------

:class:`BaseHandler` objects provide a couple of methods that are directly
useful, and others that are meant to be used by derived classes. These are
intended for direct use:


.. method:: BaseHandler.add_parent(director)

   Add a director as parent.


.. method:: BaseHandler.close()

   Remove any parents.

The following attributes and methods should only be used by classes derived from
:class:`BaseHandler`.

.. note::

   The convention has been adopted that subclasses defining
   :meth:`protocol_request` or :meth:`protocol_response` methods are named
   :class:`\*Processor`; all others are named :class:`\*Handler`.


.. attribute:: BaseHandler.parent

   A valid :class:`OpenerDirector`, which can be used to open using a different
   protocol, or handle errors.


.. method:: BaseHandler.default_open(req)

   This method is *not* defined in :class:`BaseHandler`, but subclasses should
   define it if they want to catch all URLs.

   This method, if implemented, will be called by the parent
   :class:`OpenerDirector`. It should return a file-like object as described in
   the return value of the :meth:`open` of :class:`OpenerDirector`, or ``None``.
   It should raise :exc:`URLError`, unless a truly exceptional thing happens (for
   example, :exc:`MemoryError` should not be mapped to :exc:`URLError`).

   This method will be called before any protocol-specific open method.


.. method:: BaseHandler.protocol_open(req)
   :noindex:

   ("protocol" is to be replaced by the protocol name.)

   This method is *not* defined in :class:`BaseHandler`, but subclasses should
   define it if they want to handle URLs with the given *protocol*.

   This method, if defined, will be called by the parent :class:`OpenerDirector`.
   Return values should be the same as for :meth:`default_open`.


.. method:: BaseHandler.unknown_open(req)

   This method is *not* defined in :class:`BaseHandler`, but subclasses should
   define it if they want to catch all URLs with no specific registered handler to
   open it.

   This method, if implemented, will be called by the :attr:`parent`
   :class:`OpenerDirector`. Return values should be the same as for
   :meth:`default_open`.


.. method:: BaseHandler.http_error_default(req, fp, code, msg, hdrs)

   This method is *not* defined in :class:`BaseHandler`, but subclasses should
   override it if they intend to provide a catch-all for otherwise unhandled HTTP
   errors. It will be called automatically by the :class:`OpenerDirector` getting
   the error, and should not normally be called in other circumstances.

   *req* will be a :class:`Request` object, *fp* will be a file-like object with
   the HTTP error body, *code* will be the three-digit code of the error, *msg*
   will be the user-visible explanation of the code and *hdrs* will be a mapping
   object with the headers of the error.

   Return values and exceptions raised should be the same as those of
   :func:`urlopen`.


.. method:: BaseHandler.http_error_nnn(req, fp, code, msg, hdrs)

   *nnn* should be a three-digit HTTP error code. This method is also not defined
   in :class:`BaseHandler`, but will be called, if it exists, on an instance of a
   subclass, when an HTTP error with code *nnn* occurs.

   Subclasses should override this method to handle specific HTTP errors.

   Arguments, return values and exceptions raised should be the same as for
   :meth:`http_error_default`.


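How an ``http_error_nnn`` method gets dispatched can be sketched by calling :meth:`OpenerDirector.error` by hand, which simulates what happens when an HTTP handler reports an error status. ``NotFoundFallback``, the URL and the placeholder body are hypothetical, ``addinfourl`` is an undocumented helper, and the Python 3 fallback import is an assumption.

```python
import io

try:
    import urllib2                      # Python 2: the module documented here
except ImportError:                     # assumption: Python 3 stand-in
    import urllib.request as urllib2


class NotFoundFallback(urllib2.BaseHandler):
    """Hypothetical handler that turns 404 responses into a placeholder body."""

    def http_error_404(self, req, fp, code, msg, hdrs):
        placeholder = io.BytesIO(b"(no such page)")
        return urllib2.addinfourl(placeholder, hdrs, req.get_full_url(), code)


opener = urllib2.OpenerDirector()
opener.add_handler(NotFoundFallback())

# Simulate the call the opener makes when a 404 response arrives.
req = urllib2.Request("http://www.example.com/missing")
error_body = io.BytesIO(b"Not Found")
result = opener.error("http", req, error_body, 404, "Not Found", {})

status = result.getcode()
text = result.read()
```

In normal use the method is invoked automatically during :meth:`OpenerDirector.open`; the direct call above exists only to keep the sketch free of network access.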
.. method:: BaseHandler.protocol_request(req)
   :noindex:

   ("protocol" is to be replaced by the protocol name.)

   This method is *not* defined in :class:`BaseHandler`, but subclasses should
   define it if they want to pre-process requests of the given *protocol*.

   This method, if defined, will be called by the parent :class:`OpenerDirector`.
   *req* will be a :class:`Request` object. The return value should be a
   :class:`Request` object.


.. method:: BaseHandler.protocol_response(req, response)
   :noindex:

   ("protocol" is to be replaced by the protocol name.)

   This method is *not* defined in :class:`BaseHandler`, but subclasses should
   define it if they want to post-process responses of the given *protocol*.

   This method, if defined, will be called by the parent :class:`OpenerDirector`.
   *req* will be a :class:`Request` object. *response* will be an object
   implementing the same interface as the return value of :func:`urlopen`. The
   return value should implement the same interface as the return value of
   :func:`urlopen`.


| 631 | .. _http-redirect-handler:
|
---|
| 632 |
|
---|
| 633 | HTTPRedirectHandler Objects
|
---|
| 634 | ---------------------------
|
---|
| 635 |
|
---|
| 636 | .. note::
|
---|
| 637 |
|
---|
| 638 | Some HTTP redirections require action from this module's client code. If this
|
---|
| 639 | is the case, :exc:`HTTPError` is raised. See :rfc:`2616` for details of the
|
---|
| 640 | precise meanings of the various redirection codes.
|
---|
| 641 |
|
---|
| 642 |
|
---|
| 643 | .. method:: HTTPRedirectHandler.redirect_request(req, fp, code, msg, hdrs, newurl)
|
---|
| 644 |
|
---|
| 645 | Return a :class:`Request` or ``None`` in response to a redirect. This is called
|
---|
| 646 | by the default implementations of the :meth:`http_error_30\*` methods when a
|
---|
| 647 | redirection is received from the server. If a redirection should take place,
|
---|
| 648 | return a new :class:`Request` to allow :meth:`http_error_30\*` to perform the
|
---|
| 649 | redirect to *newurl*. Otherwise, raise :exc:`HTTPError` if no other handler
|
---|
| 650 | should try to handle this URL, or return ``None`` if you can't but another
|
---|
| 651 | handler might.
|
---|
| 652 |
|
---|
| 653 | .. note::
|
---|
| 654 |
|
---|
| 655 | The default implementation of this method does not strictly follow :rfc:`2616`,
|
---|
| 656 | which says that 301 and 302 responses to ``POST`` requests must not be
|
---|
| 657 | automatically redirected without confirmation by the user. In reality, browsers
|
---|
| 658 | do allow automatic redirection of these responses, changing the ``POST`` to a
|
---|
| 659 | ``GET``, and the default implementation reproduces this behavior.
|
---|
| 660 |
|
---|
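Client code that prefers not to follow redirects for ``POST`` requests can override :meth:`redirect_request`. The following is a minimal sketch under that assumption. The class name ``NoPostRedirectHandler`` is hypothetical, and the fallback import is only there so that the sketch also runs on Python 3.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 fallback (assumption)


class NoPostRedirectHandler(urllib2.HTTPRedirectHandler):
    """Hypothetical handler that refuses to redirect POST requests."""

    def redirect_request(self, req, fp, code, msg, hdrs, newurl):
        if req.get_method() == 'POST':
            # No handler should follow this redirect: raise HTTPError.
            raise urllib2.HTTPError(req.get_full_url(), code, msg, hdrs, fp)
        # Otherwise defer to the default implementation.
        return urllib2.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, hdrs, newurl)


# An instance passed to build_opener() takes the place of the default
# HTTPRedirectHandler; no request is sent here.
opener = urllib2.build_opener(NoPostRedirectHandler())
```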
| 661 |
|
---|
| 662 | .. method:: HTTPRedirectHandler.http_error_301(req, fp, code, msg, hdrs)
|
---|
| 663 |
|
---|
| 664 | Redirect to the ``Location:`` or ``URI:`` URL. This method is called by the
|
---|
| 665 | parent :class:`OpenerDirector` when getting an HTTP 'moved permanently' response.
|
---|
| 666 |
|
---|
| 667 |
|
---|
| 668 | .. method:: HTTPRedirectHandler.http_error_302(req, fp, code, msg, hdrs)
|
---|
| 669 |
|
---|
| 670 | The same as :meth:`http_error_301`, but called for the 'found' response.
|
---|
| 671 |
|
---|
| 672 |
|
---|
| 673 | .. method:: HTTPRedirectHandler.http_error_303(req, fp, code, msg, hdrs)
|
---|
| 674 |
|
---|
| 675 | The same as :meth:`http_error_301`, but called for the 'see other' response.
|
---|
| 676 |
|
---|
| 677 |
|
---|
| 678 | .. method:: HTTPRedirectHandler.http_error_307(req, fp, code, msg, hdrs)
|
---|
| 679 |
|
---|
| 680 | The same as :meth:`http_error_301`, but called for the 'temporary redirect'
|
---|
| 681 | response.
|
---|
| 682 |
|
---|
| 683 |
|
---|
| 684 | .. _http-cookie-processor:
|
---|
| 685 |
|
---|
| 686 | HTTPCookieProcessor Objects
|
---|
| 687 | ---------------------------
|
---|
| 688 |
|
---|
| 689 | .. versionadded:: 2.4
|
---|
| 690 |
|
---|
| 691 | :class:`HTTPCookieProcessor` instances have one attribute:
|
---|
| 692 |
|
---|
| 693 |
|
---|
| 694 | .. attribute:: HTTPCookieProcessor.cookiejar
|
---|
| 695 |
|
---|
| 696 | The :class:`cookielib.CookieJar` in which cookies are stored.
|
---|
| 697 |
|
---|
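A short sketch of how the :attr:`cookiejar` attribute relates to the jar passed to the constructor. The variable names are illustrative, and the Python 3 fallback imports are an assumption added for portability.

```python
try:
    import urllib2                      # Python 2
    import cookielib
except ImportError:
    import urllib.request as urllib2    # Python 3 fallback (assumption)
    import http.cookiejar as cookielib

jar = cookielib.CookieJar()
processor = urllib2.HTTPCookieProcessor(jar)
opener = urllib2.build_opener(processor)

# The processor exposes the jar it was given; cookies set by servers
# contacted through this opener are stored there.
same_jar = processor.cookiejar is jar
```

Keeping a reference to the jar makes it possible to inspect the stored cookies by iterating over it after requests have been made.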
| 698 |
|
---|
| 699 | .. _proxy-handler:
|
---|
| 700 |
|
---|
| 701 | ProxyHandler Objects
|
---|
| 702 | --------------------
|
---|
| 703 |
|
---|
| 704 |
|
---|
| 705 | .. method:: ProxyHandler.protocol_open(request)
|
---|
| 706 | :noindex:
|
---|
| 707 |
|
---|
| 708 | ("protocol" is to be replaced by the protocol name.)
|
---|
| 709 |
|
---|
| 710 | The :class:`ProxyHandler` will have a method :samp:`{protocol}_open` for every
|
---|
| 711 | *protocol* which has a proxy in the *proxies* dictionary given in the
|
---|
| 712 | constructor. The method will modify requests to go through the proxy, by
|
---|
| 713 | calling ``request.set_proxy()``, and call the next handler in the chain to
|
---|
| 714 | actually execute the protocol.
|
---|
| 715 |
|
---|
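The per-protocol methods can be observed directly. This is a minimal sketch, assuming a proxy at the placeholder address ``http://www.example.com:3128/``. The Python 3 fallback import is an assumption added for portability.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 fallback (assumption)

# Only the 'http' scheme is given a proxy here.
proxy_handler = urllib2.ProxyHandler({'http': 'http://www.example.com:3128/'})

# A method exists for each protocol listed in the proxies dictionary...
has_http_open = hasattr(proxy_handler, 'http_open')
# ...but not for protocols that were left out.
has_ftp_open = hasattr(proxy_handler, 'ftp_open')
```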
| 716 |
|
---|
| 717 | .. _http-password-mgr:
|
---|
| 718 |
|
---|
| 719 | HTTPPasswordMgr Objects
|
---|
| 720 | -----------------------
|
---|
| 721 |
|
---|
| 722 | These methods are available on :class:`HTTPPasswordMgr` and
|
---|
| 723 | :class:`HTTPPasswordMgrWithDefaultRealm` objects.
|
---|
| 724 |
|
---|
| 725 |
|
---|
| 726 | .. method:: HTTPPasswordMgr.add_password(realm, uri, user, passwd)
|
---|
| 727 |
|
---|
| 728 | *uri* can be either a single URI, or a sequence of URIs. *realm*, *user* and
|
---|
| 729 | *passwd* must be strings. This causes ``(user, passwd)`` to be used as
|
---|
| 730 | authentication tokens when authentication for *realm* and a super-URI of any of
|
---|
| 731 | the given URIs is requested.
|
---|
| 732 |
|
---|
| 733 |
|
---|
| 734 | .. method:: HTTPPasswordMgr.find_user_password(realm, authuri)
|
---|
| 735 |
|
---|
| 736 | Get user/password for given realm and URI, if any. This method will return
|
---|
| 737 | ``(None, None)`` if there is no matching user/password.
|
---|
| 738 |
|
---|
| 739 | For :class:`HTTPPasswordMgrWithDefaultRealm` objects, the realm ``None`` will be
|
---|
| 740 | searched if the given *realm* has no matching user/password.
|
---|
| 741 |
|
---|
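The lookup rules above can be tried without any network access. The following sketch uses placeholder credentials, realms and URLs. The Python 3 fallback import is an assumption added for portability.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 fallback (assumption)

# A plain manager only matches the exact realm it was given.
plain = urllib2.HTTPPasswordMgr()
plain.add_password('PDQ Application', 'http://www.example.com/',
                   'klem', 'secret')
match = plain.find_user_password('PDQ Application',
                                 'http://www.example.com/docs/')
no_match = plain.find_user_password('Other Realm',
                                    'http://www.example.com/docs/')

# With a default realm, credentials stored under None are used as a
# fallback when the requested realm has no entry.
fallback = urllib2.HTTPPasswordMgrWithDefaultRealm()
fallback.add_password(None, 'http://www.example.com/private/',
                      'klem', 'secret')
default_match = fallback.find_user_password(
    'Any Realm', 'http://www.example.com/private/page.html')
```

Here ``match`` and ``default_match`` hold the stored ``(user, passwd)`` pair, while ``no_match`` is ``(None, None)``.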
| 742 |
|
---|
| 743 | .. _abstract-basic-auth-handler:
|
---|
| 744 |
|
---|
| 745 | AbstractBasicAuthHandler Objects
|
---|
| 746 | --------------------------------
|
---|
| 747 |
|
---|
| 748 |
|
---|
| 749 | .. method:: AbstractBasicAuthHandler.http_error_auth_reqed(authreq, host, req, headers)
|
---|
| 750 |
|
---|
| 751 | Handle an authentication request by getting a user/password pair, and re-trying
|
---|
| 752 | the request. *authreq* should be the name of the header where the information
|
---|
| 753 | about the realm is included in the request, *host* specifies the URL and path to
|
---|
| 754 | authenticate for, *req* should be the (failed) :class:`Request` object, and
|
---|
| 755 | *headers* should be the error headers.
|
---|
| 756 |
|
---|
| 757 | *host* is either an authority (e.g. ``"python.org"``) or a URL containing an
|
---|
| 758 | authority component (e.g. ``"http://python.org/"``). In either case, the
|
---|
| 759 | authority must not contain a userinfo component (so, ``"python.org"`` and
|
---|
| 760 | ``"python.org:80"`` are fine, ``"joe:password@python.org"`` is not).
|
---|
| 761 |
|
---|
| 762 |
|
---|
| 763 | .. _http-basic-auth-handler:
|
---|
| 764 |
|
---|
| 765 | HTTPBasicAuthHandler Objects
|
---|
| 766 | ----------------------------
|
---|
| 767 |
|
---|
| 768 |
|
---|
| 769 | .. method:: HTTPBasicAuthHandler.http_error_401(req, fp, code, msg, hdrs)
|
---|
| 770 |
|
---|
| 771 | Retry the request with authentication information, if available.
|
---|
| 772 |
|
---|
| 773 |
|
---|
| 774 | .. _proxy-basic-auth-handler:
|
---|
| 775 |
|
---|
| 776 | ProxyBasicAuthHandler Objects
|
---|
| 777 | -----------------------------
|
---|
| 778 |
|
---|
| 779 |
|
---|
| 780 | .. method:: ProxyBasicAuthHandler.http_error_407(req, fp, code, msg, hdrs)
|
---|
| 781 |
|
---|
| 782 | Retry the request with authentication information, if available.
|
---|
| 783 |
|
---|
| 784 |
|
---|
| 785 | .. _abstract-digest-auth-handler:
|
---|
| 786 |
|
---|
| 787 | AbstractDigestAuthHandler Objects
|
---|
| 788 | ---------------------------------
|
---|
| 789 |
|
---|
| 790 |
|
---|
| 791 | .. method:: AbstractDigestAuthHandler.http_error_auth_reqed(authreq, host, req, headers)
|
---|
| 792 |
|
---|
| 793 | *authreq* should be the name of the header where the information about the realm
|
---|
| 794 | is included in the request, *host* should be the host to authenticate to, *req*
|
---|
| 795 | should be the (failed) :class:`Request` object, and *headers* should be the
|
---|
| 796 | error headers.
|
---|
| 797 |
|
---|
| 798 |
|
---|
| 799 | .. _http-digest-auth-handler:
|
---|
| 800 |
|
---|
| 801 | HTTPDigestAuthHandler Objects
|
---|
| 802 | -----------------------------
|
---|
| 803 |
|
---|
| 804 |
|
---|
| 805 | .. method:: HTTPDigestAuthHandler.http_error_401(req, fp, code, msg, hdrs)
|
---|
| 806 |
|
---|
| 807 | Retry the request with authentication information, if available.
|
---|
| 808 |
|
---|
| 809 |
|
---|
| 810 | .. _proxy-digest-auth-handler:
|
---|
| 811 |
|
---|
| 812 | ProxyDigestAuthHandler Objects
|
---|
| 813 | ------------------------------
|
---|
| 814 |
|
---|
| 815 |
|
---|
| 816 | .. method:: ProxyDigestAuthHandler.http_error_407(req, fp, code, msg, hdrs)
|
---|
| 817 |
|
---|
| 818 | Retry the request with authentication information, if available.
|
---|
| 819 |
|
---|
| 820 |
|
---|
| 821 | .. _http-handler-objects:
|
---|
| 822 |
|
---|
| 823 | HTTPHandler Objects
|
---|
| 824 | -------------------
|
---|
| 825 |
|
---|
| 826 |
|
---|
| 827 | .. method:: HTTPHandler.http_open(req)
|
---|
| 828 |
|
---|
| 829 | Send an HTTP request, which can be either GET or POST, depending on
|
---|
| 830 | ``req.has_data()``.
|
---|
| 831 |
|
---|
| 832 |
|
---|
| 833 | .. _https-handler-objects:
|
---|
| 834 |
|
---|
| 835 | HTTPSHandler Objects
|
---|
| 836 | --------------------
|
---|
| 837 |
|
---|
| 838 |
|
---|
| 839 | .. method:: HTTPSHandler.https_open(req)
|
---|
| 840 |
|
---|
| 841 | Send an HTTPS request, which can be either GET or POST, depending on
|
---|
| 842 | ``req.has_data()``.
|
---|
| 843 |
|
---|
| 844 |
|
---|
| 845 | .. _file-handler-objects:
|
---|
| 846 |
|
---|
| 847 | FileHandler Objects
|
---|
| 848 | -------------------
|
---|
| 849 |
|
---|
| 850 |
|
---|
| 851 | .. method:: FileHandler.file_open(req)
|
---|
| 852 |
|
---|
| 853 | Open the file locally, if there is no host name, or the host name is
|
---|
| 854 | ``'localhost'``. Change the protocol to ``ftp`` otherwise, and retry opening it
|
---|
| 855 | using :attr:`parent`.
|
---|
| 856 |
|
---|
| 857 |
|
---|
| 858 | .. _ftp-handler-objects:
|
---|
| 859 |
|
---|
| 860 | FTPHandler Objects
|
---|
| 861 | ------------------
|
---|
| 862 |
|
---|
| 863 |
|
---|
| 864 | .. method:: FTPHandler.ftp_open(req)
|
---|
| 865 |
|
---|
| 866 | Open the FTP file indicated by *req*. The login is always done with empty
|
---|
| 867 | username and password.
|
---|
| 868 |
|
---|
| 869 |
|
---|
| 870 | .. _cacheftp-handler-objects:
|
---|
| 871 |
|
---|
| 872 | CacheFTPHandler Objects
|
---|
| 873 | -----------------------
|
---|
| 874 |
|
---|
| 875 | :class:`CacheFTPHandler` objects are :class:`FTPHandler` objects with the
|
---|
| 876 | following additional methods:
|
---|
| 877 |
|
---|
| 878 |
|
---|
| 879 | .. method:: CacheFTPHandler.setTimeout(t)
|
---|
| 880 |
|
---|
| 881 | Set timeout of connections to *t* seconds.
|
---|
| 882 |
|
---|
| 883 |
|
---|
| 884 | .. method:: CacheFTPHandler.setMaxConns(m)
|
---|
| 885 |
|
---|
| 886 | Set maximum number of cached connections to *m*.
|
---|
| 887 |
|
---|
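A brief sketch of configuring a :class:`CacheFTPHandler` before building an opener. The values ``30`` and ``4`` are arbitrary, and the Python 3 fallback import is an assumption added for portability.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 fallback (assumption)

ftp_handler = urllib2.CacheFTPHandler()
ftp_handler.setTimeout(30)   # cached connections time out after 30 seconds
ftp_handler.setMaxConns(4)   # keep at most 4 cached connections

# The configured instance replaces the default FTPHandler in the opener;
# no connection is made here.
opener = urllib2.build_opener(ftp_handler)
```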
| 888 |
|
---|
| 889 | .. _unknown-handler-objects:
|
---|
| 890 |
|
---|
| 891 | UnknownHandler Objects
|
---|
| 892 | ----------------------
|
---|
| 893 |
|
---|
| 894 |
|
---|
| 895 | .. method:: UnknownHandler.unknown_open(req)
|
---|
| 896 |
|
---|
| 897 | Raise a :exc:`URLError` exception.
|
---|
| 898 |
|
---|
| 899 |
|
---|
| 900 | .. _http-error-processor-objects:
|
---|
| 901 |
|
---|
| 902 | HTTPErrorProcessor Objects
|
---|
| 903 | --------------------------
|
---|
| 904 |
|
---|
| 905 | .. versionadded:: 2.4
|
---|
| 906 |
|
---|
| 907 |
|
---|
[391] | 908 | .. method:: HTTPErrorProcessor.http_response(request, response)
|
---|
[2] | 909 |
|
---|
| 910 | Process HTTP error responses.
|
---|
| 911 |
|
---|
| 912 | For 200 status codes, the response object is returned immediately.
|
---|
| 913 |
|
---|
| 914 | For non-200 status codes, this simply passes the job on to the
|
---|
| 915 | :samp:`{protocol}_error_code` handler methods, via
|
---|
| 916 | :meth:`OpenerDirector.error`. Eventually,
|
---|
| 917 | :class:`urllib2.HTTPDefaultErrorHandler` will raise an :exc:`HTTPError` if no
|
---|
| 918 | other handler handles the error.
|
---|
| 919 |
|
---|
[391] | 920 | .. method:: HTTPErrorProcessor.https_response(request, response)
|
---|
[2] | 921 |
|
---|
[391] | 922 | Process HTTPS error responses.
|
---|
| 923 |
|
---|
| 924 | The behavior is the same as :meth:`http_response`.
|
---|
| 925 |
|
---|
| 926 |
|
---|
[2] | 927 | .. _urllib2-examples:
|
---|
| 928 |
|
---|
| 929 | Examples
|
---|
| 930 | --------
|
---|
| 931 |
|
---|
| 932 | This example gets the python.org main page and displays the first 100 bytes of
|
---|
| 933 | it::
|
---|
| 934 |
|
---|
| 935 | >>> import urllib2
|
---|
| 936 | >>> f = urllib2.urlopen('http://www.python.org/')
|
---|
| 937 | >>> print f.read(100)
|
---|
| 938 | <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
|
---|
| 939 | <?xml-stylesheet href="./css/ht2html
|
---|
| 940 |
|
---|
| 941 | Here we are sending a data-stream to the stdin of a CGI and reading the data it
|
---|
| 942 | returns to us. Note that this example will only work when the Python
|
---|
| 943 | installation supports SSL. ::
|
---|
| 944 |
|
---|
| 945 | >>> import urllib2
|
---|
| 946 | >>> req = urllib2.Request(url='https://localhost/cgi-bin/test.cgi',
|
---|
| 947 | ... data='This data is passed to stdin of the CGI')
|
---|
| 948 | >>> f = urllib2.urlopen(req)
|
---|
| 949 | >>> print f.read()
|
---|
| 950 | Got Data: "This data is passed to stdin of the CGI"
|
---|
| 951 |
|
---|
| 952 | The code for the sample CGI used in the above example is::
|
---|
| 953 |
|
---|
| 954 | #!/usr/bin/env python
|
---|
| 955 | import sys
|
---|
| 956 | data = sys.stdin.read()
|
---|
| 957 | print 'Content-type: text/plain\n\nGot Data: "%s"' % data
|
---|
| 958 |
|
---|
| 959 | Use of Basic HTTP Authentication::
|
---|
| 960 |
|
---|
| 961 | import urllib2
|
---|
| 962 | # Create an OpenerDirector with support for Basic HTTP Authentication...
|
---|
| 963 | auth_handler = urllib2.HTTPBasicAuthHandler()
|
---|
| 964 | auth_handler.add_password(realm='PDQ Application',
|
---|
| 965 | uri='https://mahler:8092/site-updates.py',
|
---|
| 966 | user='klem',
|
---|
| 967 | passwd='kadidd!ehopper')
|
---|
| 968 | opener = urllib2.build_opener(auth_handler)
|
---|
| 969 | # ...and install it globally so it can be used with urlopen.
|
---|
| 970 | urllib2.install_opener(opener)
|
---|
| 971 | urllib2.urlopen('http://www.example.com/login.html')
|
---|
| 972 |
|
---|
| 973 | :func:`build_opener` provides many handlers by default, including a
|
---|
| 974 | :class:`ProxyHandler`. By default, :class:`ProxyHandler` uses the environment
|
---|
| 975 | variables named ``<scheme>_proxy``, where ``<scheme>`` is the URL scheme
|
---|
| 976 | involved. For example, the :envvar:`http_proxy` environment variable is read to
|
---|
| 977 | obtain the HTTP proxy's URL.
|
---|
| 978 |
|
---|
| 979 | This example replaces the default :class:`ProxyHandler` with one that uses
|
---|
| 980 | programmatically-supplied proxy URLs, and adds proxy authorization support with
|
---|
| 981 | :class:`ProxyBasicAuthHandler`. ::
|
---|
| 982 |
|
---|
| 983 | proxy_handler = urllib2.ProxyHandler({'http': 'http://www.example.com:3128/'})
|
---|
| 984 | proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
|
---|
| 985 | proxy_auth_handler.add_password('realm', 'host', 'username', 'password')
|
---|
| 986 |
|
---|
| 987 | opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
|
---|
| 988 | # This time, rather than install the OpenerDirector, we use it directly:
|
---|
| 989 | opener.open('http://www.example.com/login.html')
|
---|
| 990 |
|
---|
| 991 | Adding HTTP headers:
|
---|
| 992 |
|
---|
| 993 | Use the *headers* argument to the :class:`Request` constructor, or::
|
---|
| 994 |
|
---|
| 995 | import urllib2
|
---|
| 996 | req = urllib2.Request('http://www.example.com/')
|
---|
| 997 | req.add_header('Referer', 'http://www.python.org/')
|
---|
| 998 | r = urllib2.urlopen(req)
|
---|
| 999 |
|
---|
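The *headers* argument mentioned above takes a dictionary. The following sketch shows the equivalent of the :meth:`add_header` call, without sending a request. The Python 3 fallback import is an assumption added for portability.

```python
try:
    import urllib2                      # Python 2
except ImportError:
    import urllib.request as urllib2    # Python 3 fallback (assumption)

# Supply the header up front instead of calling add_header() afterwards.
req = urllib2.Request('http://www.example.com/',
                      headers={'Referer': 'http://www.python.org/'})

# The header is stored on the Request and can be read back.
referer = req.get_header('Referer')
```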
| 1000 | :class:`OpenerDirector` automatically adds a :mailheader:`User-Agent` header to
|
---|
| 1001 | every :class:`Request`. To change this::
|
---|
| 1002 |
|
---|
| 1003 | import urllib2
|
---|
| 1004 | opener = urllib2.build_opener()
|
---|
| 1005 | opener.addheaders = [('User-agent', 'Mozilla/5.0')]
|
---|
| 1006 | opener.open('http://www.example.com/')
|
---|
| 1007 |
|
---|
| 1008 | Also, remember that a few standard headers (:mailheader:`Content-Length`,
|
---|
| 1009 | :mailheader:`Content-Type` and :mailheader:`Host`) are added when the
|
---|
| 1010 | :class:`Request` is passed to :func:`urlopen` (or :meth:`OpenerDirector.open`).
|
---|
| 1011 |
|
---|