\section{\module{bz2} ---
         Compression compatible with \program{bzip2}}

\declaremodule{builtin}{bz2}
\modulesynopsis{Interface to compression and decompression
                routines compatible with \program{bzip2}.}
\moduleauthor{Gustavo Niemeyer}{niemeyer@conectiva.com}
\sectionauthor{Gustavo Niemeyer}{niemeyer@conectiva.com}

\versionadded{2.3}

This module provides a comprehensive interface for the bz2 compression library.
It implements a complete file interface, one-shot (de)compression functions,
and types for sequential (de)compression.

Here is a summary of the features offered by the \module{bz2} module:

\begin{itemize}
\item The \class{BZ2File} class implements a complete file interface, including
      \method{readline()}, \method{readlines()},
      \method{writelines()}, \method{seek()}, etc.;
\item The \class{BZ2File} class implements emulated \method{seek()} support;
\item The \class{BZ2File} class implements universal newline support;
\item The \class{BZ2File} class offers optimized line iteration using
      the readahead algorithm borrowed from file objects;
\item Sequential (de)compression is supported by the \class{BZ2Compressor} and
      \class{BZ2Decompressor} classes;
\item One-shot (de)compression is supported by the \function{compress()} and
      \function{decompress()} functions;
\item Thread safety is provided by an individual locking mechanism;
\item Complete inline documentation.
\end{itemize}


\subsection{(De)compression of files}

Handling of compressed files is offered by the \class{BZ2File} class.

\begin{classdesc}{BZ2File}{filename\optional{, mode\optional{,
                  buffering\optional{, compresslevel}}}}
Open a bz2 file. The \var{mode} can be either \code{'r'} or \code{'w'}, for
reading (default) or writing. When opened for writing, the file will be
created if it doesn't exist, and truncated otherwise. If \var{buffering} is
given, \code{0} means unbuffered, and larger numbers specify the buffer size;
the default is \code{0}. If
\var{compresslevel} is given, it must be a number between \code{1} and
\code{9}; the default is \code{9}.
Add a \character{U} to \var{mode} to open the file for input with universal
newline support. Any line ending in the input file will be seen as a
\character{\e n} in Python. Also, a file so opened gains the
attribute \member{newlines}; the value for this attribute is one of
\code{None} (no newline read yet), \code{'\e r'}, \code{'\e n'},
\code{'\e r\e n'} or a tuple containing all the newline types
seen. Universal newlines are available only when reading.
Instances support iteration in the same way as normal \class{file}
instances.
\end{classdesc}

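The write-then-read cycle described above can be sketched as follows. This is a minimal illustration, assuming a current Python 3 interpreter, where \class{BZ2File} exchanges \code{bytes} objects instead of strings. The helper name \code{roundtrip\_through\_file} and the use of \module{tempfile} are illustrative choices, not part of this module.

```python
import bz2
import os
import tempfile


def roundtrip_through_file(payload):
    """Write payload to a temporary bz2 file, then read it back."""
    fd, path = tempfile.mkstemp(suffix=".bz2")
    os.close(fd)
    try:
        # Mode 'w' creates the file, or truncates it if it exists.
        writer = bz2.BZ2File(path, "w")
        try:
            writer.write(payload)
        finally:
            writer.close()

        # Mode 'r' is the default; it is spelled out here for clarity.
        reader = bz2.BZ2File(path, "r")
        try:
            return reader.read()
        finally:
            reader.close()
    finally:
        os.remove(path)


restored = roundtrip_through_file(b"first line\nsecond line\n")
```

The writer is closed before the file is reopened, so that all buffered data reaches the disk.
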
\begin{methoddesc}[BZ2File]{close}{}
Close the file. Sets data attribute \member{closed} to true. A closed file
cannot be used for further I/O operations. \method{close()} may be called
more than once without error.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{read}{\optional{size}}
Read at most \var{size} uncompressed bytes, returned as a string. If the
\var{size} argument is negative or omitted, read until EOF is reached.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{readline}{\optional{size}}
Return the next line from the file, as a string, retaining newline.
A non-negative \var{size} argument limits the maximum number of bytes to
return (an incomplete line may be returned then). Return an empty
string at EOF.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{readlines}{\optional{size}}
Return a list of lines read. The optional \var{size} argument, if given,
is an approximate bound on the total number of bytes in the lines returned.
\end{methoddesc}

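The line-oriented methods above can be combined as in the following sketch. It assumes a current Python 3 interpreter, where lines are returned as \code{bytes} objects; the helper name \code{read\_lines\_demo} is hypothetical and exists only for this illustration.

```python
import bz2
import os
import tempfile


def read_lines_demo(payload):
    """Return (first line, remaining lines, result of readline() at EOF)."""
    fd, path = tempfile.mkstemp(suffix=".bz2")
    os.close(fd)
    try:
        writer = bz2.BZ2File(path, "w")
        try:
            writer.write(payload)
        finally:
            writer.close()

        reader = bz2.BZ2File(path, "r")
        try:
            first = reader.readline()   # keeps the trailing newline
            rest = reader.readlines()   # every line after the first
            at_eof = reader.readline()  # empty once EOF is reached
        finally:
            reader.close()
    finally:
        os.remove(path)
    return first, rest, at_eof


first, rest, at_eof = read_lines_demo(b"alpha\nbeta\ngamma\n")
```
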
\begin{methoddesc}[BZ2File]{xreadlines}{}
For backward compatibility. \class{BZ2File} objects now include the
performance optimizations previously implemented in the
\module{xreadlines} module.
\deprecated{2.3}{This exists only for compatibility with the method by
                 this name on \class{file} objects, which is
                 deprecated. Use \code{for line in file} instead.}
\end{methoddesc}

\begin{methoddesc}[BZ2File]{seek}{offset\optional{, whence}}
Move to new file position. Argument \var{offset} is a byte count. Optional
argument \var{whence} defaults to \code{0} (offset from start of file,
offset should be \code{>= 0}); other values are \code{1} (move relative to
current position, positive or negative), and \code{2} (move relative to end
of file, usually negative, although many platforms allow seeking beyond
the end of a file).

Note that seeking of bz2 files is emulated, and depending on the parameters
the operation may be extremely slow.
\end{methoddesc}

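The three values of \var{whence} can be illustrated with the sketch below, which assumes a current Python 3 interpreter and \code{bytes} data. The helper name \code{seek\_demo} is hypothetical. Offsets always refer to positions in the uncompressed data.

```python
import bz2
import os
import tempfile


def seek_demo(payload):
    """Exercise seek() with whence 0, 1 and 2 on a compressed file."""
    fd, path = tempfile.mkstemp(suffix=".bz2")
    os.close(fd)
    try:
        writer = bz2.BZ2File(path, "w")
        try:
            writer.write(payload)
        finally:
            writer.close()

        reader = bz2.BZ2File(path, "r")
        try:
            reader.seek(10)              # whence 0: offset from the start
            from_start = reader.read(3)
            position = reader.tell()     # 10 + 3 bytes consumed
            reader.seek(-4, 1)           # whence 1: relative to current position
            relative = reader.read(2)
            reader.seek(-2, 2)           # whence 2: relative to the end
            from_end = reader.read()
        finally:
            reader.close()
    finally:
        os.remove(path)
    return from_start, position, relative, from_end


results = seek_demo(b"0123456789abcdef")
```

Because seeking is emulated, a backward seek on a large file may require decompressing the data again from the beginning.
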
\begin{methoddesc}[BZ2File]{tell}{}
Return the current file position, an integer (may be a long integer).
\end{methoddesc}

\begin{methoddesc}[BZ2File]{write}{data}
Write string \var{data} to file. Note that due to buffering, \method{close()}
may be needed before the file on disk reflects the data written.
\end{methoddesc}

\begin{methoddesc}[BZ2File]{writelines}{sequence_of_strings}
Write the sequence of strings to the file. Note that newlines are not added.
The sequence can be any iterable object producing strings. This is equivalent
to calling \method{write()} for each string.
\end{methoddesc}


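Since \method{writelines()} adds no newlines, the items are written back to back, as the following sketch shows. It assumes a current Python 3 interpreter, where the items are \code{bytes} objects; the helper name \code{write\_parts} is hypothetical.

```python
import bz2
import os
import tempfile


def write_parts(parts):
    """Write parts with writelines() and return the file's decompressed content."""
    fd, path = tempfile.mkstemp(suffix=".bz2")
    os.close(fd)
    try:
        writer = bz2.BZ2File(path, "w")
        try:
            # Equivalent to calling writer.write(part) for each part.
            writer.writelines(parts)
        finally:
            writer.close()

        reader = bz2.BZ2File(path, "r")
        try:
            return reader.read()
        finally:
            reader.close()
    finally:
        os.remove(path)


content = write_parts([b"no newline here,", b" same line\n", b"next line\n"])
```
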
\subsection{Sequential (de)compression}

Sequential compression and decompression are done using the classes
\class{BZ2Compressor} and \class{BZ2Decompressor}.

\begin{classdesc}{BZ2Compressor}{\optional{compresslevel}}
Create a new compressor object. This object may be used to compress
data sequentially. If you want to compress data in one shot, use the
\function{compress()} function instead. The \var{compresslevel} parameter,
if given, must be a number between \code{1} and \code{9}; the default
is \code{9}.
\end{classdesc}

\begin{methoddesc}[BZ2Compressor]{compress}{data}
Provide more data to the compressor object. It will return chunks of compressed
data whenever possible. When you've finished providing data to compress, call
the \method{flush()} method to finish the compression process, and return what
is left in internal buffers.
\end{methoddesc}

\begin{methoddesc}[BZ2Compressor]{flush}{}
Finish the compression process and return what is left in internal buffers. You
must not use the compressor object after calling this method.
\end{methoddesc}

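The \method{compress()} and \method{flush()} protocol described above can be sketched as follows, assuming a current Python 3 interpreter and \code{bytes} data. The helper name \code{compress\_chunks} is hypothetical.

```python
import bz2


def compress_chunks(chunks, compresslevel=9):
    """Compress an iterable of chunks into a single bz2 stream."""
    compressor = bz2.BZ2Compressor(compresslevel)
    pieces = []
    for chunk in chunks:
        # compress() may return an empty result while data is still buffered.
        pieces.append(compressor.compress(chunk))
    # flush() ends the stream; the compressor must not be used afterwards.
    pieces.append(compressor.flush())
    return b"".join(pieces)


chunks = [b"spam " * 100, b"eggs " * 100, b"ham " * 100]
stream = compress_chunks(chunks)
```

The joined output is an ordinary bz2 stream, so it can be passed to \function{decompress()} or written to a file for \program{bzip2} to read.
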
\begin{classdesc}{BZ2Decompressor}{}
Create a new decompressor object. This object may be used to decompress
data sequentially. If you want to decompress data in one shot, use the
\function{decompress()} function instead.
\end{classdesc}

\begin{methoddesc}[BZ2Decompressor]{decompress}{data}
Provide more data to the decompressor object. It will return chunks of
decompressed data whenever possible. If you try to decompress data after the
end of stream is found, \exception{EOFError} will be raised. If any data was
found after the end of stream, it will be ignored and saved in the
\member{unused\_data} attribute.
\end{methoddesc}


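The behaviour at the end of the stream can be illustrated with the sketch below, which assumes a current Python 3 interpreter and \code{bytes} data. The variable names and the \code{b"TRAILING"} marker are illustrative only. The trailing bytes are passed in the same call that completes the stream, because any later call raises \exception{EOFError}.

```python
import bz2

original = b"payload " * 50
compressed = bz2.compress(original)
trailer = b"TRAILING"  # bytes that follow the end of the bz2 stream

decompressor = bz2.BZ2Decompressor()
half = len(compressed) // 2

# Feed the stream in two pieces; each call returns whatever can be decoded so far.
output = decompressor.decompress(compressed[:half])
output += decompressor.decompress(compressed[half:] + trailer)

# Data found after the end of the stream is kept, not decompressed.
leftover = decompressor.unused_data

# Any further call after the end of the stream raises EOFError.
try:
    decompressor.decompress(b"more")
    raised = False
except EOFError:
    raised = True
```
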
\subsection{One-shot (de)compression}

One-shot compression and decompression are provided through the
\function{compress()} and \function{decompress()} functions.

\begin{funcdesc}{compress}{data\optional{, compresslevel}}
Compress \var{data} in one shot. If you want to compress data sequentially,
use an instance of \class{BZ2Compressor} instead. The \var{compresslevel}
parameter, if given, must be a number between \code{1} and \code{9};
the default is \code{9}.
\end{funcdesc}

\begin{funcdesc}{decompress}{data}
Decompress \var{data} in one shot. If you want to decompress data
sequentially, use an instance of \class{BZ2Decompressor} instead.
\end{funcdesc}
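
A round trip through the two one-shot functions can be sketched as follows, assuming a current Python 3 interpreter and \code{bytes} data. The variable names are illustrative only.

```python
import bz2

original = b"to be or not to be " * 200

# Level 1 is the lowest compression level and 9, the default, the highest.
fast = bz2.compress(original, 1)
best = bz2.compress(original)

# Either result decompresses to the same data.
restored_fast = bz2.decompress(fast)
restored_best = bz2.decompress(best)
```

For data that does not fit comfortably in memory, the sequential classes or \class{BZ2File} are the better choice.
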