:mod:`zlib` --- Compression compatible with :program:`gzip`
===========================================================

.. module:: zlib
   :synopsis: Low-level interface to compression and decompression routines compatible with
              gzip.


For applications that require data compression, the functions in this module
allow compression and decompression, using the zlib library. The zlib library
has its own home page at http://www.zlib.net. There are known
incompatibilities between the Python module and versions of the zlib library
earlier than 1.1.3; 1.1.3 has a security vulnerability, so we recommend using
1.1.4 or later.

zlib's functions have many options and often need to be used in a particular
order. This documentation doesn't attempt to cover all of the permutations;
consult the zlib manual at http://www.zlib.net/manual.html for authoritative
information.

For reading and writing ``.gz`` files see the :mod:`gzip` module.

The available exception and functions in this module are:


.. exception:: error

   Exception raised on compression and decompression errors.


.. function:: adler32(data[, value])

   Computes an Adler-32 checksum of *data*. (An Adler-32 checksum is almost as
   reliable as a CRC32 but can be computed much more quickly.) If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   This function always returns an integer object.

   .. note::
      To generate the same numeric value across all Python versions and
      platforms, use ``adler32(data) & 0xffffffff``. If you are only using
      the checksum in packed binary format, this is not necessary, as the
      return value is the correct 32-bit binary representation
      regardless of sign.

   .. versionchanged:: 2.6
      The return value is in the range [-2**31, 2**31-1]
      regardless of platform. In older versions the value is
      signed on some platforms and unsigned on others.

   .. versionchanged:: 3.0
      The return value is unsigned and in the range [0, 2**32-1]
      regardless of platform.
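
As an illustration of the running-checksum behaviour described for :func:`adler32`, the following minimal sketch feeds the result of one call into the next as *value* and compares it with a one-shot checksum. The variable names and sample data are made up for this example, and bytes literals are used so that it runs on both Python 2.6+ and Python 3.

```python
import zlib

# Hypothetical input split into pieces, as if read from a stream.
chunks = [b"hello, ", b"world"]

# Start with the default value, then pass each result back in as *value*.
running = zlib.adler32(chunks[0])
for chunk in chunks[1:]:
    running = zlib.adler32(chunk, running)

# Checksum of the concatenated data, computed in one call.
whole = zlib.adler32(b"".join(chunks))

# Mask both values so the comparison does not depend on sign conventions.
assert (running & 0xffffffff) == (whole & 0xffffffff)
```

Masking with ``0xffffffff`` is only needed when the numeric value must agree across Python versions.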


.. function:: compress(string[, level])

   Compresses the data in *string*, returning a string containing compressed data.
   *level* is an integer from ``0`` to ``9`` controlling the level of compression;
   ``1`` is fastest and produces the least compression, ``9`` is slowest and
   produces the most. ``0`` is no compression. The default value is ``6``.
   Raises the :exc:`error` exception if any error occurs.
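
A short sketch of one-shot compression at two different levels may help here. The sample data is invented for the example, and the only claims checked are that both outputs decompress back to the input and that highly repetitive input shrinks.

```python
import zlib

# Hypothetical, highly repetitive input so that compression is visible.
original = b"spam and eggs " * 100

fast = zlib.compress(original, 1)   # fastest, least compression
small = zlib.compress(original, 9)  # slowest, most compression

# Both levels produce streams that decompress to the same data.
assert zlib.decompress(fast) == original
assert zlib.decompress(small) == original
assert len(small) < len(original)
```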


.. function:: compressobj([level[, method[, wbits[, memlevel[, strategy]]]]])

   Returns a compression object, to be used for compressing data streams that won't
   fit into memory at once. *level* is an integer from ``0`` to ``9`` controlling
   the level of compression; ``1`` is fastest and produces the least compression,
   ``9`` is slowest and produces the most. ``0`` is no compression. The default
   value is ``6``.

   *method* is the compression algorithm. Currently, the only supported value is
   ``DEFLATED``.

   *wbits* is the base two logarithm of the size of the window buffer. This
   should be an integer from ``8`` to ``15``. Higher values give better
   compression, but use more memory. The default is ``15``.

   *memlevel* controls the amount of memory used for internal compression state.
   Valid values range from ``1`` to ``9``. Higher values use more memory,
   but are faster and produce smaller output. The default is ``8``.

   *strategy* is used to tune the compression algorithm. Possible values are
   ``Z_DEFAULT_STRATEGY``, ``Z_FILTERED``, and ``Z_HUFFMAN_ONLY``. The default
   is ``Z_DEFAULT_STRATEGY``.


.. function:: crc32(data[, value])

   .. index::
      single: Cyclic Redundancy Check
      single: checksum; Cyclic Redundancy Check

   Computes a CRC (Cyclic Redundancy Check) checksum of *data*. If *value* is
   present, it is used as the starting value of the checksum; otherwise, a fixed
   default value is used. This allows computing a running checksum over the
   concatenation of several inputs. The algorithm is not cryptographically
   strong, and should not be used for authentication or digital signatures. Since
   the algorithm is designed for use as a checksum algorithm, it is not suitable
   for use as a general hash algorithm.

   This function always returns an integer object.

   .. note::
      To generate the same numeric value across all Python versions and
      platforms, use ``crc32(data) & 0xffffffff``. If you are only using
      the checksum in packed binary format, this is not necessary, as the
      return value is the correct 32-bit binary representation
      regardless of sign.

   .. versionchanged:: 2.6
      The return value is in the range [-2**31, 2**31-1]
      regardless of platform. In older versions the value would be
      signed on some platforms and unsigned on others.

   .. versionchanged:: 3.0
      The return value is unsigned and in the range [0, 2**32-1]
      regardless of platform.
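
The note on masking and packed binary format for :func:`crc32` can be sketched as follows. The use of the :mod:`struct` module and the little-endian byte order are choices made for this example only; they are not prescribed by this module.

```python
import struct
import zlib

data = b"123456789"

# Mask to obtain the unsigned value regardless of Python version or platform.
checksum = zlib.crc32(data) & 0xffffffff

# Pack as a 4-byte unsigned integer; "<I" (little-endian) is an arbitrary
# choice for this sketch.
packed = struct.pack("<I", checksum)

# A running checksum over two pieces matches the one-shot result.
running = zlib.crc32(data[:4])
running = zlib.crc32(data[4:], running) & 0xffffffff
assert running == checksum
```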


.. function:: decompress(string[, wbits[, bufsize]])

   Decompresses the data in *string*, returning a string containing the
   uncompressed data. The *wbits* parameter controls the size of the window
   buffer, and is discussed further below.
   If *bufsize* is given, it is used as the initial size of the output
   buffer. Raises the :exc:`error` exception if any error occurs.

   The absolute value of *wbits* is the base two logarithm of the size of the
   history buffer (the "window size") used when compressing data. Its absolute
   value should be between 8 and 15 for the most recent versions of the zlib
   library; larger values result in better compression at the expense of greater
   memory usage. When decompressing a stream, *wbits* must not be smaller
   than the size originally used to compress the stream; using a too-small
   value will result in an exception. The default value is therefore the
   highest value, 15. When *wbits* is negative, the standard
   :program:`gzip` header is suppressed.

   *bufsize* is the initial size of the buffer used to hold decompressed data. If
   more space is required, the buffer size will be increased as needed, so you
   don't have to get this value exactly right; tuning it will only save a few calls
   to :c:func:`malloc`. The default size is 16384.
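
The effect of a negative *wbits* can be sketched with a headerless stream. This example assumes that :func:`compressobj` also accepts a negative *wbits* to produce such a stream, as the underlying zlib library does; that behaviour is not stated in the text above. The sample data is invented.

```python
import zlib

data = b"raw deflate example " * 20

# Assumption: a negative wbits asks the compressor for a stream without
# the standard header and trailing checksum.
compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
raw = compressor.compress(data) + compressor.flush()

# The same negative wbits tells decompress() not to expect a header.
restored = zlib.decompress(raw, -15)
assert restored == data
```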


.. function:: decompressobj([wbits])

   Returns a decompression object, to be used for decompressing data streams that
   won't fit into memory at once. The *wbits* parameter controls the size of the
   window buffer.

Compression objects support the following methods:


.. method:: Compress.compress(string)

   Compress *string*, returning a string containing compressed data for at least
   part of the data in *string*. This data should be concatenated to the output
   produced by any preceding calls to the :meth:`compress` method. Some input may
   be kept in internal buffers for later processing.


.. method:: Compress.flush([mode])

   All pending input is processed, and a string containing the remaining compressed
   output is returned. *mode* can be selected from the constants
   :const:`Z_SYNC_FLUSH`, :const:`Z_FULL_FLUSH`, or :const:`Z_FINISH`,
   defaulting to :const:`Z_FINISH`. :const:`Z_SYNC_FLUSH` and
   :const:`Z_FULL_FLUSH` allow compressing further strings of data, while
   :const:`Z_FINISH` finishes the compressed stream and prevents compressing any
   more data. After calling :meth:`flush` with *mode* set to :const:`Z_FINISH`,
   the :meth:`compress` method cannot be called again; the only realistic action is
   to delete the object.
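
The :meth:`compress` and :meth:`flush` methods are typically used together in a loop. The following is a minimal sketch of that pattern; the helper name ``compress_chunks`` and the sample chunks are hypothetical and exist only for this example.

```python
import zlib


def compress_chunks(chunks, level=6):
    """Compress an iterable of byte strings into one zlib stream."""
    compressor = zlib.compressobj(level)
    pieces = []
    for chunk in chunks:
        # Each call may return only part of the output (possibly nothing);
        # the rest stays in internal buffers.
        pieces.append(compressor.compress(chunk))
    # The default mode, Z_FINISH, emits whatever is still buffered and
    # ends the stream.
    pieces.append(compressor.flush())
    return b"".join(pieces)


chunks = [b"first block ", b"second block ", b"third block"]
stream = compress_chunks(chunks)
assert zlib.decompress(stream) == b"".join(chunks)
```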


.. method:: Compress.copy()

   Returns a copy of the compression object. This can be used to efficiently
   compress a set of data that share a common initial prefix.

   .. versionadded:: 2.5
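
One way to use :meth:`Compress.copy` for data with a shared prefix is sketched below: the prefix is compressed once, and each variant continues from a copy of that state. The prefix and suffix values are invented for the example.

```python
import zlib

prefix = b"common header " * 50
suffixes = [b"variant A", b"variant B"]

# Compress the shared prefix once and keep the output produced so far.
base = zlib.compressobj()
head = base.compress(prefix)

results = []
for suffix in suffixes:
    # Each copy continues from the state reached after the prefix.
    branch = base.copy()
    results.append(head + branch.compress(suffix) + branch.flush())

for suffix, stream in zip(suffixes, results):
    assert zlib.decompress(stream) == prefix + suffix
```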

Decompression objects support the following methods, and two attributes:


.. attribute:: Decompress.unused_data

   A string which contains any bytes past the end of the compressed data. That is,
   this remains ``""`` until the last byte that contains compression data is
   available. If the whole string turned out to contain compressed data, this is
   ``""``, the empty string.

   The only way to determine where a string of compressed data ends is by actually
   decompressing it. This means that when compressed data is part of a
   larger file, you can only find the end of it by reading data and feeding it
   followed by some non-empty string into a decompression object's
   :meth:`decompress` method until the :attr:`unused_data` attribute is no longer
   the empty string.
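
A minimal sketch of how :attr:`unused_data` locates the end of a compressed stream embedded in a larger byte string follows. The payload and trailer contents are hypothetical.

```python
import zlib

payload = b"compressed part " * 10
trailer = b"TRAILING BYTES"

# A compressed stream followed by unrelated bytes, as in a larger file.
blob = zlib.compress(payload) + trailer

decompressor = zlib.decompressobj()
output = decompressor.decompress(blob)

# Everything after the end of the compressed stream is left untouched.
assert output == payload
assert decompressor.unused_data == trailer
```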


.. attribute:: Decompress.unconsumed_tail

   A string that contains any data that was not consumed by the last
   :meth:`decompress` call because it exceeded the limit for the uncompressed data
   buffer. This data has not yet been seen by the zlib machinery, so you must feed
   it (possibly with further data concatenated to it) back to a subsequent
   :meth:`decompress` method call in order to get correct output.


.. method:: Decompress.decompress(string[, max_length])

   Decompress *string*, returning a string containing the uncompressed data
   corresponding to at least part of the data in *string*. This data should be
   concatenated to the output produced by any preceding calls to the
   :meth:`decompress` method. Some of the input data may be preserved in internal
   buffers for later processing.

   If the optional parameter *max_length* is supplied then the return value will be
   no longer than *max_length*. This may mean that not all of the compressed input
   can be processed, and unconsumed data will be stored in the attribute
   :attr:`unconsumed_tail`. This string must be passed to a subsequent call to
   :meth:`decompress` if decompression is to continue. If *max_length* is not
   supplied then the whole input is decompressed, and :attr:`unconsumed_tail` is an
   empty string.
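
The interaction between *max_length* and :attr:`unconsumed_tail` can be sketched as a loop that bounds the size of each returned piece. The limit ``LIMIT`` and the sample data are arbitrary values chosen for this example.

```python
import zlib

LIMIT = 4096  # hypothetical cap on the size of each decompressed piece

original = b"x" * 100000
compressed = zlib.compress(original)

decompressor = zlib.decompressobj()
pieces = []
pending = compressed
while pending:
    piece = decompressor.decompress(pending, LIMIT)
    assert len(piece) <= LIMIT
    pieces.append(piece)
    # Input that could not be processed within the limit must be fed back.
    pending = decompressor.unconsumed_tail

# Collect any output still held by the decompressor.
pieces.append(decompressor.flush())
result = b"".join(pieces)
assert result == original
```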


.. method:: Decompress.flush([length])

   All pending input is processed, and a string containing the remaining
   uncompressed output is returned. After calling :meth:`flush`, the
   :meth:`decompress` method cannot be called again; the only realistic action is
   to delete the object.

   The optional parameter *length* sets the initial size of the output buffer.


.. method:: Decompress.copy()

   Returns a copy of the decompression object. This can be used to save the state
   of the decompressor midway through the data stream in order to speed up random
   seeks into the stream at a future point.

   .. versionadded:: 2.5


.. seealso::

   Module :mod:`gzip`
      Reading and writing :program:`gzip`\ -format files.

   http://www.zlib.net
      The zlib library home page.

   http://www.zlib.net/manual.html
      The zlib manual explains the semantics and usage of the library's many
      functions.