\documentclass{howto}

% $Id: whatsnew20.tex 50964 2006-07-30 03:03:43Z fred.drake $

\title{What's New in Python 2.0}
\release{1.02}
\author{A.M. Kuchling and Moshe Zadka}
\authoraddress{
    \strong{Python Software Foundation}\\
    Email: \email{amk@amk.ca}, \email{moshez@twistedmatrix.com}
}
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

Python 2.0 was released on October 16, 2000. This
article covers the exciting new features in 2.0, highlights some other
useful changes, and points out a few incompatible changes that may require
rewriting code.

Python's development never completely stops between releases, and a
steady flow of bug fixes and improvements is always being submitted.
A host of minor fixes, a few optimizations, additional docstrings, and
better error messages went into 2.0; to list them all would be
impossible, but they're certainly significant. Consult the
publicly-available CVS logs if you want to see the full list. This
progress is due to the five developers working for
PythonLabs now getting paid to spend their days fixing bugs,
and also due to the improved communication resulting
from moving to SourceForge.

% ======================================================================
\section{What About Python 1.6?}

Python 1.6 can be thought of as the Contractual Obligations Python
release. After the core development team left CNRI in May 2000, CNRI
requested that a 1.6 release be created, containing all the work on
Python that had been performed at CNRI. Python 1.6 therefore
represents the state of the CVS tree as of May 2000, with the most
significant new feature being Unicode support. Development continued
after May, of course, so the 1.6 tree received a few fixes to ensure
that it's forward-compatible with Python 2.0. Python 1.6 is therefore
part of Python's evolution, and not a side branch.

So, should you take much interest in Python 1.6? Probably not. The
1.6final and 2.0beta1 releases were made on the same day (September 5,
2000), the plan being to finalize Python 2.0 within a month or so. If
you have applications to maintain, there seems little point in
breaking things by moving to 1.6, fixing them, and then having another
round of breakage within a month by moving to 2.0; you're better off
just going straight to 2.0. Most of the really interesting features
described in this document are only in 2.0, because a lot of work was
done between May and September.

% ======================================================================
\section{New Development Process}

The most important change in Python 2.0 may not be to the code at all,
but to how Python is developed: in May 2000 the Python developers
began using the tools made available by SourceForge for storing
source code, tracking bug reports, and managing the queue of patch
submissions. To report bugs or submit patches for Python 2.0, use the
bug tracking and patch manager tools available from Python's project
page, located at \url{http://sourceforge.net/projects/python/}.

The most important of the services now hosted at SourceForge is the
Python CVS tree, the version-controlled repository containing the
source code for Python. Previously, there were roughly 7 people
who had write access to the CVS tree, and all patches had to be
inspected and checked in by one of the people on this short list.
Obviously, this wasn't very scalable. By moving the CVS tree to
SourceForge, it became possible to grant write access to more people;
as of September 2000 there were 27 people able to check in changes, a
fourfold increase. This makes possible large-scale changes that
wouldn't be attempted if they'd have to be filtered through the small
group of core developers. For example, one day Peter Schneider-Kamp
took it into his head to drop K\&R C compatibility and convert the C
source for Python to ANSI C. After getting approval on the python-dev
mailing list, he launched into a flurry of checkins that lasted about
a week; other developers joined in to help, and the job was done. If
there were only 5 people with write access, that task would probably
have been viewed as ``nice, but not worth the time and effort needed''
and it would never have gotten done.

The shift to using SourceForge's services has resulted in a remarkable
increase in the speed of development. Patches now get submitted,
commented on, revised by people other than the original submitter, and
bounced back and forth between people until the patch is deemed worth
checking in. Bugs are tracked in one central location and can be
assigned to a specific person for fixing, and we can count the number
of open bugs to measure progress. This didn't come without a cost:
developers now have more e-mail to deal with, more mailing lists to
follow, and special tools had to be written for the new environment.
For example, SourceForge sends default patch and bug notification
e-mail messages that are completely unhelpful, so Ka-Ping Yee wrote an
HTML screen-scraper that sends more useful messages.

The ease of adding code caused a few initial growing pains, such as
code being checked in before it was ready or without getting clear
agreement from the developer group. The approval process that has
emerged is somewhat similar to that used by the Apache group.
Developers can vote +1, +0, -0, or -1 on a patch; +1 and -1 denote
acceptance or rejection, while +0 and -0 mean the developer is mostly
indifferent to the change, though with a slight positive or negative
slant. The most significant change from the Apache model is that the
voting is essentially advisory, letting Guido van Rossum, who has
Benevolent Dictator For Life status, know what the general opinion is.
He can still ignore the result of a vote, and approve or
reject a change even if the community disagrees with him.

Producing an actual patch is the last step in adding a new feature,
and is usually easy compared to the earlier task of coming up with a
good design. Discussions of new features can often explode into
lengthy mailing list threads, making the discussion hard to follow,
and no one can read every posting to python-dev. Therefore, a
relatively formal process has been set up to write Python Enhancement
Proposals (PEPs), modelled on the Internet RFC process. PEPs are
draft documents that describe a proposed new feature, and are
continually revised until the community reaches a consensus, either
accepting or rejecting the proposal. Quoting from the introduction to
PEP 1, ``PEP Purpose and Guidelines'':

\begin{quotation}
  PEP stands for Python Enhancement Proposal. A PEP is a design
  document providing information to the Python community, or
  describing a new feature for Python. The PEP should provide a
  concise technical specification of the feature and a rationale for
  the feature.

  We intend PEPs to be the primary mechanisms for proposing new
  features, for collecting community input on an issue, and for
  documenting the design decisions that have gone into Python. The
  PEP author is responsible for building consensus within the
  community and documenting dissenting opinions.
\end{quotation}

Read the rest of PEP 1 for the details of the PEP editorial process,
style, and format. PEPs are kept in the Python CVS tree on
SourceForge, though they're not part of the Python 2.0 distribution,
and are also available in HTML form from
\url{http://www.python.org/peps/}. As of September 2000,
there are 25 PEPs, ranging from PEP 201, ``Lockstep Iteration'', to
PEP 225, ``Elementwise/Objectwise Operators''.

% ======================================================================
\section{Unicode}

The largest new feature in Python 2.0 is a new fundamental data type:
Unicode strings. Unicode uses 16-bit numbers to represent characters
instead of the 8-bit numbers used by ASCII, meaning that 65,536
distinct characters can be supported.

The final interface for Unicode support was arrived at through
countless often-stormy discussions on the python-dev mailing list, and
mostly implemented by Marc-Andr\'e Lemburg, based on a Unicode string
type implementation by Fredrik Lundh. A detailed explanation of the
interface was written up as \pep{100}, ``Python Unicode Integration''.
This article will simply cover the most significant points about the
Unicode interfaces.

In Python source code, Unicode strings are written as
\code{u"string"}. Arbitrary Unicode characters can be written using a
new escape sequence, \code{\e u\var{HHHH}}, where \var{HHHH} is a
4-digit hexadecimal number from 0000 to FFFF. The existing
\code{\e x\var{HHHH}} escape sequence can also be used, and octal
escapes can be used for characters up to U+01FF, which is represented
by \code{\e 777}.

Unicode strings, just like regular strings, are an immutable sequence
type. They can be indexed and sliced, but not modified in place.
Unicode strings have an \method{encode( \optional{encoding} )} method
that returns an 8-bit string in the desired encoding. Encodings are
named by strings, such as \code{'ascii'}, \code{'utf-8'}, or
\code{'iso-8859-1'}. A codec API is defined for
implementing and registering new encodings that are then available
throughout a Python program. If an encoding isn't specified, the
default encoding is usually 7-bit ASCII, though it can be changed for
your Python installation by calling the
\function{sys.setdefaultencoding(\var{encoding})} function in a
customised version of \file{site.py}.

Combining 8-bit and Unicode strings always coerces to Unicode, using
the default ASCII encoding; the result of \code{'a' + u'bc'} is
\code{u'abc'}.

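As a rough illustration of the \method{encode()} method described
above, the following sketch builds a Unicode string with a
\code{\e u} escape and encodes it in two ways. The sample text and the
variable names are made up for this example, and the code avoids
\keyword{print} so that it should also run unchanged on later Python
versions.

\begin{verbatim}
# Illustrative sketch; the sample text is arbitrary.
text = u'caf\u00e9'                  # four characters; the last is U+00E9

utf8 = text.encode('utf-8')          # U+00E9 takes two bytes in UTF-8
latin1 = text.encode('iso-8859-1')   # ...but only one byte in Latin-1

lengths = (len(text), len(utf8), len(latin1))   # (4, 5, 4)
\end{verbatim}

The same character therefore occupies a different number of bytes
depending on the encoding chosen, which is why the encoding has to be
named explicitly whenever non-ASCII text is converted to an 8-bit
string.
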
New built-in functions have been added, and existing built-ins
modified to support Unicode:

\begin{itemize}
\item \code{unichr(\var{ch})} returns a Unicode string 1 character
long, containing the character \var{ch}.

\item \code{ord(\var{u})}, where \var{u} is a 1-character regular or
Unicode string, returns the number of the character as an integer.

\item \code{unicode(\var{string} \optional{, \var{encoding}}
\optional{, \var{errors}} ) } creates a Unicode string from an 8-bit
string. \code{encoding} is a string naming the encoding to use.
The \code{errors} parameter specifies the treatment of characters that
are invalid for the current encoding; passing \code{'strict'} as the
value causes an exception to be raised on any encoding error, while
\code{'ignore'} causes errors to be silently ignored and
\code{'replace'} uses U+FFFD, the official replacement character, in
case of any problems.

\item The \keyword{exec} statement, and various built-ins such as
\code{eval()}, \code{getattr()}, and \code{setattr()} will
accept Unicode strings as well as regular strings. (It's possible
that the process of fixing this missed some built-ins; if you find a
built-in function that accepts strings but doesn't accept Unicode
strings at all, please report it as a bug.)

\end{itemize}

A new module, \module{unicodedata}, provides an interface to Unicode
character properties. For example, \code{unicodedata.category(u'A')}
returns the 2-character string 'Lu', the 'L' denoting it's a letter,
and 'u' meaning that it's uppercase.
\code{unicodedata.bidirectional(u'\e u0660')} returns 'AN', meaning
that U+0660 is an Arabic number.

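A slightly longer sketch of the \module{unicodedata} interface follows.
The list of sample characters and the variable names are assumptions
chosen for illustration; only \function{category()} and
\function{decimal()} are used.

\begin{verbatim}
import unicodedata

# Sample characters chosen for illustration only.
samples = [u'A', u'a', u'5', u' ', u'\u0660']

categories = []
for ch in samples:
    categories.append(unicodedata.category(ch))
# categories is now ['Lu', 'Ll', 'Nd', 'Zs', 'Nd']

# U+0660 is ARABIC-INDIC DIGIT ZERO, so its decimal value is 0.
zero = unicodedata.decimal(u'\u0660')
\end{verbatim}
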
The \module{codecs} module contains functions to look up existing encodings
and register new ones. Unless you want to implement a
new encoding, you'll most often use the
\function{codecs.lookup(\var{encoding})} function, which returns a
4-element tuple: \code{(\var{encode_func},
\var{decode_func}, \var{stream_reader}, \var{stream_writer})}.

\begin{itemize}
\item \var{encode_func} is a function that takes a Unicode string, and
returns a 2-tuple \code{(\var{string}, \var{length})}. \var{string}
is an 8-bit string containing a portion (perhaps all) of the Unicode
string converted into the given encoding, and \var{length} tells you
how much of the Unicode string was converted.

\item \var{decode_func} is the opposite of \var{encode_func}, taking
an 8-bit string and returning a 2-tuple \code{(\var{ustring},
\var{length})}, consisting of the resulting Unicode string
\var{ustring} and the integer \var{length} telling how much of the
8-bit string was consumed.

\item \var{stream_reader} is a class that supports decoding input from
a stream. \var{stream_reader(\var{file_obj})} returns an object that
supports the \method{read()}, \method{readline()}, and
\method{readlines()} methods. These methods will all translate from
the given encoding and return Unicode strings.

\item \var{stream_writer}, similarly, is a class that supports
encoding output to a stream. \var{stream_writer(\var{file_obj})}
returns an object that supports the \method{write()} and
\method{writelines()} methods. These methods expect Unicode strings,
translating them to the given encoding on output.
\end{itemize}

For example, the following code writes a Unicode string into a file,
encoding it as UTF-8:

\begin{verbatim}
import codecs

unistr = u'\u0660\u2000ab ...'

(UTF8_encode, UTF8_decode,
 UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')

output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
output.write( unistr )
output.close()
\end{verbatim}

The following code would then read UTF-8 input from the file:

\begin{verbatim}
input = UTF8_streamreader( open( '/tmp/output', 'rb') )
print repr(input.read())
input.close()
\end{verbatim}

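The first two elements returned by \function{codecs.lookup()} can also
be called directly, without going through a stream. The following is a
minimal sketch of that usage; the sample string and the names
\code{chars_consumed} and \code{bytes_consumed} are invented for this
example.

\begin{verbatim}
import codecs

# Unpack the 4-element result described above; only the first two
# entries are used in this sketch.
(encode_func, decode_func,
 stream_reader, stream_writer) = codecs.lookup('utf-8')

# Encoding returns the 8-bit data and the number of characters consumed.
data, chars_consumed = encode_func(u'\u0660ab')

# Decoding returns the Unicode string and the number of bytes consumed.
text, bytes_consumed = decode_func(data)
\end{verbatim}

Here three characters are consumed on encoding and four bytes on
decoding, because U+0660 needs two bytes in UTF-8.
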
Unicode-aware regular expressions are available through the
\module{re} module, which has a new underlying implementation called
SRE written by Fredrik Lundh of Secret Labs AB.

A \code{-U} command line option was added which causes the Python
compiler to interpret all string literals as Unicode string literals.
This is intended to be used in testing and future-proofing your Python
code, since some future version of Python may drop support for 8-bit
strings and provide only Unicode strings.

% ======================================================================
\section{List Comprehensions}

Lists are a workhorse data type in Python, and many programs
manipulate a list at some point. Two common operations on lists are
to loop over them, and either pick out the elements that meet a
certain criterion, or apply some function to each element. For
example, given a list of strings, you might want to pull out all the
strings containing a given substring, or strip off trailing whitespace
from each line.

The existing \function{map()} and \function{filter()} functions can be
used for this purpose, but they require a function as one of their
arguments. This is fine if there's an existing built-in function that
can be passed directly, but if there isn't, you have to create a
little function to do the required work, and Python's scoping rules
make the result ugly if the little function needs additional
information. Take the first example in the previous paragraph,
finding all the strings in the list containing a given substring. You
could write the following to do it:

\begin{verbatim}
# Given the list L, make a list of all strings
# containing the substring S.
sublist = filter( lambda s, substring=S:
                     string.find(s, substring) != -1,
                  L)
\end{verbatim}

Because of Python's scoping rules, a default argument is used so that
the anonymous function created by the \keyword{lambda} expression knows
what substring is being searched for. List comprehensions make this
cleaner:

\begin{verbatim}
sublist = [ s for s in L if string.find(s, S) != -1 ]
\end{verbatim}

List comprehensions have the form:

\begin{verbatim}
[ expression for expr1 in sequence1
             for expr2 in sequence2 ...
             for exprN in sequenceN
             if condition ]
\end{verbatim}

The \keyword{for}...\keyword{in} clauses contain the sequences to be
iterated over. The sequences do not have to be the same length,
because they are \emph{not} iterated over in parallel, but
from left to right; this is explained more clearly in the following
paragraphs. The elements of the generated list will be the successive
values of \var{expression}. The final \keyword{if} clause is
optional; if present, \var{expression} is only evaluated and added to
the result if \var{condition} is true.

To make the semantics very clear, a list comprehension is equivalent
to the following Python code:

\begin{verbatim}
for expr1 in sequence1:
    for expr2 in sequence2:
    ...
        for exprN in sequenceN:
             if (condition):
                  # Append the value of
                  # the expression to the
                  # resulting list.
\end{verbatim}

This means that when there are multiple \keyword{for}...\keyword{in}
clauses and no \keyword{if} clause, the length of the resulting list
will be equal to the product of the lengths of all the sequences. If
you have two lists of length 3, the output list is 9 elements long:

\begin{verbatim}
>>> seq1 = 'abc'
>>> seq2 = (1,2,3)
>>> [ (x,y) for x in seq1 for y in seq2]
[('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1),
('c', 2), ('c', 3)]
\end{verbatim}

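The equivalence with nested loops can be checked directly. The sketch
below reuses \code{seq1} and \code{seq2} from the example above and
adds an \keyword{if} clause; the condition and the names \code{pairs}
and \code{expected} are arbitrary choices made for illustration.

\begin{verbatim}
seq1 = 'abc'
seq2 = (1, 2, 3)

# Comprehension with an if clause: skip every pair whose number is 2.
pairs = [(x, y) for x in seq1 for y in seq2 if y != 2]

# The same result built with explicit nested loops.
expected = []
for x in seq1:
    for y in seq2:
        if y != 2:
            expected.append((x, y))

same = (pairs == expected)   # true; both lists hold 6 pairs
\end{verbatim}

With the \keyword{if} clause present, the result is shorter than the
product of the sequence lengths: 6 elements here rather than 9.
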
To avoid introducing an ambiguity into Python's grammar, if
\var{expression} is creating a tuple, it must be surrounded with
parentheses. The first list comprehension below is a syntax error,
while the second one is correct:

\begin{verbatim}
# Syntax error
[ x,y for x in seq1 for y in seq2]
# Correct
[ (x,y) for x in seq1 for y in seq2]
\end{verbatim}

The idea of list comprehensions originally comes from the functional
programming language Haskell (\url{http://www.haskell.org}). Greg
Ewing argued most effectively for adding them to Python and wrote the
initial list comprehension patch, which was then discussed for a
seemingly endless time on the python-dev mailing list and kept
up-to-date by Skip Montanaro.

% ======================================================================
\section{Augmented Assignment}

Augmented assignment operators, another long-requested feature, have
been added to Python 2.0. Augmented assignment operators include
\code{+=}, \code{-=}, \code{*=}, and so forth. For example, the
statement \code{a += 2} increments the value of the variable
\code{a} by 2, equivalent to the slightly lengthier \code{a = a + 2}.

% The empty groups below prevent conversion to guillemets.
The full list of supported assignment operators is \code{+=},
\code{-=}, \code{*=}, \code{/=}, \code{\%=}, \code{**=}, \code{\&=},
\code{|=}, \verb|^=|, \code{>>=}, and \code{<<=}. Python classes can
override the augmented assignment operators by defining methods named
\method{__iadd__}, \method{__isub__}, etc. For example, the following
\class{Number} class stores a number and supports using += to create a
new instance with an incremented value.

\begin{verbatim}
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        return Number( self.value + increment)

n = Number(5)
n += 3
print n.value
\end{verbatim}

The \method{__iadd__} special method is called with the value of the
increment, and should return a new instance with an appropriately
modified value; this return value is bound as the new value of the
variable on the left-hand side.

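Returning a new instance is not the only option: \method{__iadd__}
may instead modify the object in place and return \code{self}. The
\class{Total} class below is a hypothetical variation on
\class{Number}, sketched only to show that alternative.

\begin{verbatim}
class Total:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        # Update this object in place and hand it back.
        self.value = self.value + increment
        return self

t = Total(5)
alias = t
t += 3          # t is rebound to what __iadd__ returned: the same object
\end{verbatim}

After the augmented assignment, \code{t.value} is 8 and \code{alias}
still refers to the same object as \code{t}, so the change is visible
through both names. Which behaviour is appropriate depends on whether
the class is meant to be treated as immutable.
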
Augmented assignment operators were first introduced in the C
programming language, and most C-derived languages, such as
\program{awk}, \Cpp, Java, Perl, and PHP, also support them. The augmented
assignment patch was implemented by Thomas Wouters.

% ======================================================================
\section{String Methods}

Until now, string-manipulation functionality was in the \module{string}
module, which was usually a front-end for the \module{strop}
module written in C. The addition of Unicode posed a difficulty for
the \module{strop} module, because the functions would all need to be
rewritten in order to accept either 8-bit or Unicode strings. For
functions such as \function{string.replace()}, which takes 3 string
arguments, that means eight possible permutations, and correspondingly
complicated code.

Instead, Python 2.0 pushes the problem onto the string type, making
string manipulation functionality available through methods on both
8-bit strings and Unicode strings.

\begin{verbatim}
>>> 'andrew'.capitalize()
'Andrew'
>>> 'hostname'.replace('os', 'linux')
'hlinuxtname'
>>> 'moshe'.find('sh')
2
\end{verbatim}

One thing that hasn't changed, a noteworthy April Fools' joke
notwithstanding, is that Python strings are immutable. Thus, the
string methods return new strings, and do not modify the string on
which they operate.

The old \module{string} module is still around for backwards
compatibility, but it mostly acts as a front-end to the new string
methods.

Two methods which have no parallel in pre-2.0 versions, although they
did exist in JPython for quite some time, are \method{startswith()}
and \method{endswith()}. \code{s.startswith(t)} is equivalent to
\code{s[:len(t)] == t}, while \code{s.endswith(t)} is equivalent to
\code{s[-len(t):] == t} (for non-empty \var{t}).

One other method which deserves special mention is \method{join()}. The
\method{join()} method of a string receives one parameter, a sequence of
strings, and is equivalent to the \function{string.join()} function from
the old \module{string} module, with the arguments reversed. In other
words, \code{s.join(seq)} is equivalent to the old
\code{string.join(seq, s)}.

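The methods mentioned above can be tried out with a short sketch such
as the following. The sample strings and variable names are
assumptions made for the example.

\begin{verbatim}
words = ['spam', 'eggs', 'ham']

# The separator is the string whose method is called;
# this is the same as the old string.join(words, ', ').
joined = ', '.join(words)

filename = 'whatsnew20.tex'
is_tex = filename.endswith('.tex')
is_whatsnew = filename.startswith('whatsnew')

# Methods return new strings; the original is left untouched.
padded = '  padded  '
stripped = padded.strip()
\end{verbatim}

Because strings are immutable, \code{padded} still contains its
surrounding spaces after \method{strip()} has been called.
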
% ======================================================================
\section{Garbage Collection of Cycles}

The C implementation of Python uses reference counting to implement
garbage collection. Every Python object maintains a count of the
number of references pointing to itself, and adjusts the count as
references are created or destroyed. Once the reference count reaches
zero, the object is no longer accessible, since you need to have a
reference to an object to access it, and if the count is zero, no
references exist any longer.

Reference counting has some pleasant properties: it's easy to
understand and implement, and the resulting implementation is
portable, fairly fast, and interacts well with other libraries that
implement their own memory handling schemes. The major problem with
reference counting is that it sometimes doesn't realise that objects
are no longer accessible, resulting in a memory leak. This happens
when there are cycles of references.

Consider the simplest possible cycle,
a class instance which has a reference to itself:

\begin{verbatim}
instance = SomeClass()
instance.myself = instance
\end{verbatim}

After the above two lines of code have been executed, the reference
count of \code{instance} is 2; one reference is from the variable
named \samp{instance}, and the other is from the \samp{myself}
attribute of the instance.

If the next line of code is \code{del instance}, what happens? The
reference count of \code{instance} is decreased by 1, so it has a
reference count of 1; the reference in the \samp{myself} attribute
still exists. Yet the instance is no longer accessible through Python
code, and it could be deleted. Several objects can participate in a
cycle if they have references to each other, causing all of the
objects to be leaked.

Python 2.0 fixes this problem by periodically executing a cycle
detection algorithm which looks for inaccessible cycles and deletes
the objects involved. A new \module{gc} module provides functions to
perform a garbage collection, obtain debugging statistics, and tune
the collector's parameters.

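The self-referencing instance discussed above can be handed to the
collector explicitly. The following is only a sketch: it assumes an
empty \class{SomeClass} definition, and it switches automatic
collection off for a moment so that the explicit call is the one that
finds the cycle.

\begin{verbatim}
import gc

class SomeClass:
    pass

gc.disable()            # keep automatic collections out of the way
gc.collect()            # clear out any garbage that already exists

instance = SomeClass()
instance.myself = instance
del instance            # the cycle is now unreachable but still allocated

found = gc.collect()    # number of unreachable objects found (at least 1)
gc.enable()
\end{verbatim}

The exact value of \code{found} depends on how the interpreter
represents instances internally, so the sketch only relies on it being
non-zero.
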
Running the cycle detection algorithm takes some time, and therefore
will result in some additional overhead. It is hoped that after we've
gotten experience with the cycle collection from using 2.0, Python 2.1
will be able to minimize the overhead with careful tuning. It's not
yet obvious how much performance is lost, because benchmarking this is
tricky and depends crucially on how often the program creates and
destroys objects. The detection of cycles can be disabled when Python
is compiled, if you can't afford even a tiny speed penalty or suspect
that the cycle collection is buggy, by specifying the
\longprogramopt{without-cycle-gc} switch when running the
\program{configure} script.

Several people tackled this problem and contributed to a solution. An
early implementation of the cycle detection approach was written by
Toby Kelsey. The current algorithm was suggested by Eric Tiedemann
during a visit to CNRI, and Guido van Rossum and Neil Schemenauer
wrote two different implementations, which were later integrated by
Neil. Lots of other people offered suggestions along the way; the
March 2000 archives of the python-dev mailing list contain most of the
relevant discussion, especially in the threads titled ``Reference
cycle collection for Python'' and ``Finalization again''.

% ======================================================================
546 | \section{Other Core Changes}
|
---|
547 |
|
---|
548 | Various minor changes have been made to Python's syntax and built-in
|
---|
549 | functions. None of the changes are very far-reaching, but they're
|
---|
550 | handy conveniences.
|
---|
551 |
|
---|
552 | \subsection{Minor Language Changes}
|
---|
553 |
|
---|
554 | A new syntax makes it more convenient to call a given function
|
---|
555 | with a tuple of arguments and/or a dictionary of keyword arguments.
|
---|
556 | In Python 1.5 and earlier, you'd use the \function{apply()}
|
---|
557 | built-in function: \code{apply(f, \var{args}, \var{kw})} calls the
|
---|
558 | function \function{f()} with the argument tuple \var{args} and the
|
---|
559 | keyword arguments in the dictionary \var{kw}. \function{apply()}
|
---|
560 | is the same in 2.0, but thanks to a patch from
|
---|
561 | Greg Ewing, \code{f(*\var{args}, **\var{kw})} is now available as a shorter
|
---|
562 | and clearer way to achieve the same effect. This syntax is
|
---|
563 | symmetrical with the syntax for defining functions:
|
---|
564 |
|
---|
565 | \begin{verbatim}
|
---|
566 | def f(*args, **kw):
|
---|
567 |     # args is a tuple of positional args,
|
---|
568 |     # kw is a dictionary of keyword args
|
---|
569 |     ...
|
---|
570 | \end{verbatim}
|
---|
571 |
|
---|
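As a minimal sketch of the call syntax, the following unpacks a tuple and a dictionary at the call site. The function \function{describe()} and its arguments are made up for illustration; they are not part of Python.

```python
def describe(*args, **kw):
    # Return what was received so the effect of the call is visible.
    return args, kw

args = (1, 2)
kw = {'sep': '-'}

# Equivalent to the older spelling apply(describe, args, kw).
result = describe(*args, **kw)

# Explicit arguments can be mixed with the unpacked ones.
mixed = describe(0, *args, **kw)
```

Here \code{result} holds the pair \code{((1, 2), \{'sep': '-'\})}, exactly what the function would have received from an explicit call \code{describe(1, 2, sep='-')}.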
572 | The \keyword{print} statement can now have its output directed to a
|
---|
573 | file-like object by following the \keyword{print} with
|
---|
574 | \verb|>> file|, similar to the redirection operator in \UNIX{} shells.
|
---|
575 | Previously you'd either have to use the \method{write()} method of the
|
---|
576 | file-like object, which lacks the convenience and simplicity of
|
---|
577 | \keyword{print}, or you could assign a new value to
|
---|
578 | \code{sys.stdout} and then restore the old value. For sending output to standard error,
|
---|
579 | it's much easier to write this:
|
---|
580 |
|
---|
581 | \begin{verbatim}
|
---|
582 | print >> sys.stderr, "Warning: action field not supplied"
|
---|
583 | \end{verbatim}
|
---|
584 |
|
---|
585 | Modules can now be renamed on importing them, using the syntax
|
---|
586 | \code{import \var{module} as \var{name}} or \code{from \var{module}
|
---|
587 | import \var{name} as \var{othername}}. The patch was submitted by
|
---|
588 | Thomas Wouters.
|
---|
589 |
|
---|
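A short sketch of both forms, using the standard \module{os.path} module. The local names \code{osp} and \code{join_path} are arbitrary choices made for this example.

```python
# Bind the os.path module to a shorter local name.
import os.path as osp

# Bind a single function to a different local name.
from os.path import join as join_path

# Both names refer to the same function object.
same_function = osp.join is join_path
```

Only the local name changes; the module itself is still registered in \code{sys.modules} under its real name.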
590 | A new format style is available when using the \code{\%} operator;
|
---|
591 | '\%r' will insert the \function{repr()} of its argument. This was
|
---|
592 | also added from symmetry considerations, this time for symmetry with
|
---|
593 | the existing '\%s' format style, which inserts the \function{str()} of
|
---|
594 | its argument. For example, \code{'\%r \%s' \% ('abc', 'abc')} returns a
|
---|
595 | string containing \verb|'abc' abc|.
|
---|
596 |
|
---|
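The example from the paragraph above can be checked directly. This is only a sketch, and the variable names are illustrative.

```python
value = 'abc'

# %r inserts repr(value), quotes included; %s inserts str(value).
formatted = '%r %s' % (value, value)

# The same string built by calling the two functions explicitly.
same = formatted == repr(value) + ' ' + str(value)
```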
597 | Previously there was no way to implement a class that overrode
|
---|
598 | Python's built-in \keyword{in} operator and implemented a custom
|
---|
599 | version. \code{\var{obj} in \var{seq}} returns true if \var{obj} is
|
---|
600 | present in the sequence \var{seq}; Python computes this by simply
|
---|
601 | trying every index of the sequence until either \var{obj} is found or
|
---|
602 | an \exception{IndexError} is encountered. Moshe Zadka contributed a
|
---|
603 | patch which adds a \method{__contains__} magic method for providing a
|
---|
604 | custom implementation for \keyword{in}. Additionally, new built-in
|
---|
605 | objects written in C can define what \keyword{in} means for them via a
|
---|
606 | new slot in the sequence protocol.
|
---|
607 |
|
---|
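A minimal sketch of a class that supplies its own meaning for \keyword{in}. The class \class{EvenNumbers} is hypothetical and exists only for this example; it answers membership tests without storing any elements, something the index-probing fallback could not do.

```python
class EvenNumbers:
    """Hypothetical container: conceptually holds every even integer."""

    def __contains__(self, value):
        # Called for "value in instance"; no indexing is attempted.
        return value % 2 == 0

evens = EvenNumbers()
has_four = 4 in evens
has_three = 3 in evens
```

With this definition \code{has_four} is true and \code{has_three} is false.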
608 | Earlier versions of Python used a recursive algorithm for deleting
|
---|
609 | objects. Deeply nested data structures could cause the interpreter to
|
---|
610 | fill up the C stack and crash; Christian Tismer rewrote the deletion
|
---|
611 | logic to fix this problem. On a related note, comparing recursive
|
---|
612 | objects recursed infinitely and crashed; Jeremy Hylton rewrote the
|
---|
613 | code to no longer crash, producing a useful result instead. For
|
---|
614 | example, after this code:
|
---|
615 |
|
---|
616 | \begin{verbatim}
|
---|
617 | a = []
|
---|
618 | b = []
|
---|
619 | a.append(a)
|
---|
620 | b.append(b)
|
---|
621 | \end{verbatim}
|
---|
622 |
|
---|
623 | the comparison \code{a==b} returns true, because the two recursive
|
---|
624 | data structures are isomorphic. See the thread ``trashcan
|
---|
625 | and PR\#7'' in the April 2000 archives of the python-dev mailing list
|
---|
626 | for the discussion leading up to this implementation, and some useful
|
---|
627 | relevant links.
|
---|
628 | % Starting URL:
|
---|
629 | % http://www.python.org/pipermail/python-dev/2000-April/004834.html
|
---|
630 |
|
---|
631 | Note that comparisons can now also raise exceptions. In earlier
|
---|
632 | versions of Python, a comparison operation such as \code{cmp(a,b)}
|
---|
633 | would always produce an answer, even if a user-defined
|
---|
634 | \method{__cmp__} method encountered an error, since the resulting
|
---|
635 | exception would simply be silently swallowed.
|
---|
636 |
|
---|
637 | Work has been done on porting Python to 64-bit Windows on the Itanium
|
---|
638 | processor, mostly by Trent Mick of ActiveState. (Confusingly,
|
---|
639 | \code{sys.platform} is still \code{'win32'} on Win64 because it seems
|
---|
640 | that for ease of porting, MS Visual \Cpp{} treats code as 32 bit on Itanium.)
|
---|
641 | PythonWin also supports Windows CE; see the Python CE page at
|
---|
642 | \url{http://starship.python.net/crew/mhammond/ce/} for more
|
---|
643 | information.
|
---|
644 |
|
---|
645 | Another new platform is Darwin/MacOS X; initial support for it is in
|
---|
646 | Python 2.0. Dynamic loading works, if you specify ``configure
|
---|
647 | --with-dyld --with-suffix=.x''. Consult the README in the Python
|
---|
648 | source distribution for more instructions.
|
---|
649 |
|
---|
650 | An attempt has been made to alleviate one of Python's warts, the
|
---|
651 | often-confusing \exception{NameError} exception when code refers to a
|
---|
652 | local variable before the variable has been assigned a value. For
|
---|
653 | example, the following code raises an exception on the \keyword{print}
|
---|
654 | statement in both 1.5.2 and 2.0; in 1.5.2 a \exception{NameError}
|
---|
655 | exception is raised, while 2.0 raises a new
|
---|
656 | \exception{UnboundLocalError} exception.
|
---|
657 | \exception{UnboundLocalError} is a subclass of \exception{NameError},
|
---|
658 | so any existing code that expects \exception{NameError} to be raised
|
---|
659 | should still work.
|
---|
660 |
|
---|
661 | \begin{verbatim}
|
---|
662 | def f():
|
---|
663 |     print "i=",i
|
---|
664 |     i = i + 1
|
---|
665 | f()
|
---|
666 | \end{verbatim}
|
---|
667 |
|
---|
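The compatibility claim can be illustrated with a handler written for \exception{NameError}. This is a sketch; it fetches the exception through \function{sys.exc_info()} simply to avoid depending on any particular \keyword{except} syntax.

```python
import sys

def f():
    i = i + 1      # i is local to f(), but is read before assignment

try:
    f()
except NameError:
    # A handler written for NameError still catches the new exception.
    caught = sys.exc_info()[1]

is_unbound = isinstance(caught, UnboundLocalError)
```

The handler runs unchanged, while code that wants to distinguish the two cases can test for \exception{UnboundLocalError} specifically.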
668 | Two new exceptions, \exception{TabError} and
|
---|
669 | \exception{IndentationError}, have been introduced. They're both
|
---|
670 | subclasses of \exception{SyntaxError}, and are raised when Python code
|
---|
671 | is found to be improperly indented.
|
---|
672 |
|
---|
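As a sketch of how the new exceptions surface, the following compiles a deliberately mis-indented source string. The string and the file name \code{<example>} are invented for this example.

```python
import sys

# The body of f() is not indented, so compiling this source fails.
source = "def f():\nreturn 1\n"

try:
    compile(source, "<example>", "exec")
except SyntaxError:
    # Existing SyntaxError handlers keep working.
    caught = sys.exc_info()[1]

is_indentation_error = isinstance(caught, IndentationError)
```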
673 | \subsection{Changes to Built-in Functions}
|
---|
674 |
|
---|
675 | A new built-in, \function{zip(\var{seq1}, \var{seq2}, ...)}, has been
|
---|
676 | added. \function{zip()} returns a list of tuples where each tuple
|
---|
677 | contains the i-th element from each of the argument sequences. The
|
---|
678 | difference between \function{zip()} and \code{map(None, \var{seq1},
|
---|
679 | \var{seq2})} is that \function{map()} pads the sequences with
|
---|
680 | \code{None} if the sequences aren't all of the same length, while
|
---|
681 | \function{zip()} truncates the returned list to the length of the
|
---|
682 | shortest argument sequence.
|
---|
683 |
|
---|
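A small sketch of the truncating behaviour, with made-up input lists. The \function{list()} call is an addition of this sketch, so that the result is a plain list whatever \function{zip()} itself returns.

```python
letters = ['a', 'b', 'c']
numbers = [1, 2]

# The third letter has no partner, so it is dropped.
pairs = list(zip(letters, numbers))
```

Here \code{pairs} is \code{[('a', 1), ('b', 2)]}; the unmatched \code{'c'} does not appear.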
684 | The \function{int()} and \function{long()} functions now accept an
|
---|
685 | optional ``base'' parameter when the first argument is a string.
|
---|
686 | \code{int('123', 10)} returns 123, while \code{int('123', 16)} returns
|
---|
687 | 291. \code{int(123, 16)} raises a \exception{TypeError} exception
|
---|
688 | with the message ``can't convert non-string with explicit base''.
|
---|
689 |
|
---|
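The three calls mentioned above, written out as a sketch with illustrative variable names:

```python
decimal_value = int('123', 10)
hex_value = int('123', 16)

try:
    # A base only makes sense when the first argument is a string.
    int(123, 16)
    rejected = 0
except TypeError:
    rejected = 1
```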
690 | A new variable holding more detailed version information has been
|
---|
691 | added to the \module{sys} module. \code{sys.version_info} is a tuple
|
---|
692 | \code{(\var{major}, \var{minor}, \var{micro}, \var{level},
|
---|
693 | \var{serial})}. For example, in a hypothetical 2.0.1beta1,
|
---|
694 | \code{sys.version_info} would be \code{(2, 0, 1, 'beta', 1)}.
|
---|
695 | \var{level} is a string such as \code{"alpha"}, \code{"beta"}, or
|
---|
696 | \code{"final"} for a final release.
|
---|
697 |
|
---|
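Because \code{sys.version_info} is a tuple, version checks reduce to ordinary tuple comparison. A minimal sketch follows; the variable names are illustrative.

```python
import sys

# Unpack the five fields described above.
major, minor, micro, level, serial = sys.version_info

# Tuples compare element by element, so a version check is a single test.
at_least_2_0 = sys.version_info[:2] >= (2, 0)
```

This is more robust than parsing the free-form string in \code{sys.version}.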
698 | Dictionaries have an odd new method, \method{setdefault(\var{key},
|
---|
699 | \var{default})}, which behaves similarly to the existing
|
---|
700 | \method{get()} method. However, if the key is missing,
|
---|
701 | \method{setdefault()} both returns the value of \var{default} as
|
---|
702 | \method{get()} would do, and also inserts it into the dictionary as
|
---|
703 | the value for \var{key}. Thus, the following lines of code:
|
---|
704 |
|
---|
705 | \begin{verbatim}
|
---|
706 | if dict.has_key( key ): return dict[key]
|
---|
707 | else:
|
---|
708 |     dict[key] = []
|
---|
709 |     return dict[key]
|
---|
710 | \end{verbatim}
|
---|
711 |
|
---|
712 | can be reduced to a single \code{return dict.setdefault(key, [])} statement.
|
---|
713 |
|
---|
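A typical use is building a dictionary of lists. The helper \function{group_by_first_letter()} below is a hypothetical function written for this sketch.

```python
def group_by_first_letter(words):
    index = {}
    for word in words:
        # Creates the list the first time a letter is seen, then reuses it.
        index.setdefault(word[0], []).append(word)
    return index

index = group_by_first_letter(['apple', 'avocado', 'banana'])
```

The result maps \code{'a'} to \code{['apple', 'avocado']} and \code{'b'} to \code{['banana']}, with no explicit test for a missing key.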
714 | The interpreter sets a maximum recursion depth in order to catch
|
---|
715 | runaway recursion before filling the C stack and causing a core dump
|
---|
716 | or GPF. Previously this limit was fixed when you compiled Python,
|
---|
717 | but in 2.0 the maximum recursion depth can be read and modified using
|
---|
718 | \function{sys.getrecursionlimit()} and \function{sys.setrecursionlimit()}.
|
---|
719 | The default value is 1000, and a rough maximum value for a given
|
---|
720 | platform can be found by running a new script,
|
---|
721 | \file{Misc/find_recursionlimit.py}.
|
---|
722 |
|
---|
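The following sketch shows the limit catching runaway recursion, and then reads, raises, and restores the limit. The function \function{runaway()} and the increment of 500 are arbitrary choices for this example.

```python
import sys

original_limit = sys.getrecursionlimit()

def runaway(n):
    # Unbounded recursion; the interpreter stops it at the limit.
    return runaway(n + 1)

try:
    runaway(0)
    stopped = 0
except RuntimeError:
    stopped = 1

# Raise the limit, read it back, then restore the original value.
sys.setrecursionlimit(original_limit + 500)
raised_limit = sys.getrecursionlimit()
sys.setrecursionlimit(original_limit)
```

Raising the limit too far defeats its purpose, since the C stack can then overflow before the interpreter notices.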
723 | % ======================================================================
|
---|
724 | \section{Porting to 2.0}
|
---|
725 |
|
---|
726 | New Python releases try hard to be compatible with previous releases,
|
---|
727 | and the record has been pretty good. However, some changes are
|
---|
728 | considered useful enough, usually because they fix initial design decisions that
|
---|
729 | turned out to be actively mistaken, that breaking backward compatibility
|
---|
730 | can't always be avoided. This section lists the changes in Python 2.0
|
---|
731 | that may cause old Python code to break.
|
---|
732 |
|
---|
733 | The change which will probably break the most code is tightening up
|
---|
734 | the arguments accepted by some methods. Some methods would take
|
---|
735 | multiple arguments and treat them as a tuple, particularly various
|
---|
736 | list methods such as \method{.append()} and \method{.insert()}.
|
---|
737 | In earlier versions of Python, if \code{L} is a list, \code{L.append(
|
---|
738 | 1,2 )} appends the tuple \code{(1,2)} to the list. In Python 2.0 this
|
---|
739 | causes a \exception{TypeError} exception to be raised, with the
|
---|
740 | message: 'append requires exactly 1 argument; 2 given'. The fix is to
|
---|
741 | simply add an extra set of parentheses to pass both values as a tuple:
|
---|
742 | \code{L.append( (1,2) )}.
|
---|
743 |
|
---|
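A sketch of the corrected call next to the rejected one:

```python
L = []
L.append((1, 2))       # one argument: the tuple (1, 2)

try:
    L.append(1, 2)     # two arguments: no longer accepted
    rejected = 0
except TypeError:
    rejected = 1
```

After this runs, \code{L} contains the single tuple \code{(1, 2)} and the two-argument call has been rejected.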
744 | The earlier versions of these methods were more forgiving because they
|
---|
745 | used an old function in Python's C interface to parse their arguments;
|
---|
746 | 2.0 modernizes them to use \function{PyArg_ParseTuple}, the current
|
---|
747 | argument parsing function, which provides more helpful error messages
|
---|
748 | and treats multi-argument calls as errors. If you absolutely must use
|
---|
749 | 2.0 but can't fix your code, you can edit \file{Objects/listobject.c}
|
---|
750 | and define the preprocessor symbol \code{NO_STRICT_LIST_APPEND} to
|
---|
751 | preserve the old behaviour; this isn't recommended.
|
---|
752 |
|
---|
753 | Some of the functions in the \module{socket} module are still
|
---|
754 | forgiving in this way. For example, \function{socket.connect(
|
---|
755 | ('hostname', 25) )} is the correct form, passing a tuple representing
|
---|
756 | an IP address, but \function{socket.connect( 'hostname', 25 )} also
|
---|
757 | works. \function{socket.connect_ex()} and \function{socket.bind()} are
|
---|
758 | similarly easy-going. 2.0alpha1 tightened these functions up, but
|
---|
759 | because the documentation actually used the erroneous multiple
|
---|
760 | argument form, many people wrote code which would break with the
|
---|
761 | stricter checking. GvR backed out the changes in the face of public
|
---|
762 | reaction, so for the \module{socket} module, the documentation was
|
---|
763 | fixed and the multiple argument form is simply marked as deprecated;
|
---|
764 | it \emph{will} be tightened up again in a future Python version.
|
---|
765 |
|
---|
766 | The \code{\e x} escape in string literals now takes exactly 2 hex
|
---|
767 | digits. Previously it would consume all the hex digits following the
|
---|
768 | 'x' and take the lowest 8 bits of the result, so \code{\e x123456} was
|
---|
769 | equivalent to \code{\e x56}.
|
---|
770 |
|
---|
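Under the new rule the same literal is read as the single character \code{\e x12} followed by four ordinary digits, which this sketch confirms:

```python
s = '\x123456'

# Only "12" belongs to the escape; "3456" are ordinary characters.
same = s == chr(0x12) + '3456'
length = len(s)
```

The string is therefore five characters long, not one.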
771 | The \exception{AttributeError} and \exception{NameError} exceptions
|
---|
772 | have a more friendly error message, whose text will be something like
|
---|
773 | \code{'Spam' instance has no attribute 'eggs'} or \code{name 'eggs' is
|
---|
774 | not defined}. Previously the error message was just the missing
|
---|
775 | attribute name \code{eggs}, and code written to take advantage of this
|
---|
776 | fact will break in 2.0.
|
---|
777 |
|
---|
778 | Some work has been done to make integers and long integers a bit more
|
---|
779 | interchangeable. In 1.5.2, large-file support was added for Solaris,
|
---|
780 | to allow reading files larger than 2~GiB; this made the \method{tell()}
|
---|
781 | method of file objects return a long integer instead of a regular
|
---|
782 | integer. Some code would subtract two file offsets and attempt to use
|
---|
783 | the result to multiply a sequence or slice a string, but this raised a
|
---|
784 | \exception{TypeError}. In 2.0, long integers can be used to multiply
|
---|
785 | or slice a sequence, and it'll behave as you'd intuitively expect it
|
---|
786 | to; \code{3L * 'abc'} produces 'abcabcabc', and \code{
|
---|
787 | (0,1,2,3)[2L:4L]} produces (2,3). Long integers can also be used in
|
---|
788 | various contexts where previously only integers were accepted, such
|
---|
789 | as in the \method{seek()} method of file objects, and in the formats
|
---|
790 | supported by the \verb|%| operator (\verb|%d|, \verb|%i|, \verb|%x|,
|
---|
791 | etc.). For example, \code{"\%d" \% 2L**64} will produce the string
|
---|
792 | \samp{18446744073709551616}.
|
---|
793 |
|
---|
794 | The subtlest long integer change of all is that the \function{str()}
|
---|
795 | of a long integer no longer has a trailing 'L' character, though
|
---|
796 | \function{repr()} still includes it. The 'L' annoyed many people who
|
---|
797 | wanted to print long integers that looked just like regular integers,
|
---|
798 | since they had to go out of their way to chop off the character. This
|
---|
799 | is no longer a problem in 2.0, but code which does \code{str(longval)[:-1]} and assumes the 'L' is there will now lose
|
---|
800 | the final digit.
|
---|
801 |
|
---|
802 | Taking the \function{repr()} of a float now uses a different
|
---|
803 | formatting precision than \function{str()}. \function{repr()} uses
|
---|
804 | the \code{\%.17g} format string for C's \function{sprintf()}, while
|
---|
805 | \function{str()} uses \code{\%.12g} as before. The effect is that
|
---|
806 | \function{repr()} may occasionally show more decimal places than
|
---|
807 | \function{str()}, for certain numbers.
|
---|
808 | For example, the number 8.1 can't be represented exactly in binary, so
|
---|
809 | \code{repr(8.1)} is \code{'8.0999999999999996'}, while \code{str(8.1)} is
|
---|
810 | \code{'8.1'}.
|
---|
811 |
|
---|
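The two precisions can be reproduced with the \code{\%} operator. This sketch applies the format strings directly instead of calling \function{repr()} and \function{str()}, so it does not depend on how a particular interpreter formats floats.

```python
value = 8.1

# 17 significant digits: the precision described above for repr().
as_repr_format = '%.17g' % value

# 12 significant digits: the precision described above for str().
as_str_format = '%.12g' % value

# 17 digits are enough to recover the identical float.
round_trips = float(as_repr_format) == value
```

The longer form is less pretty, but converting it back with \function{float()} yields exactly the original number.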
812 | The \code{-X} command-line option, which turned all standard
|
---|
813 | exceptions into strings instead of classes, has been removed; the
|
---|
814 | standard exceptions will now always be classes. The
|
---|
815 | \module{exceptions} module containing the standard exceptions was
|
---|
816 | translated from Python to a built-in C module, written by Barry Warsaw
|
---|
817 | and Fredrik Lundh.
|
---|
818 |
|
---|
819 | % Commented out for now -- I don't think anyone will care.
|
---|
820 | %The pattern and match objects provided by SRE are C types, not Python
|
---|
821 | %class instances as in 1.5. This means you can no longer inherit from
|
---|
822 | %\class{RegexObject} or \class{MatchObject}, but that shouldn't be much
|
---|
823 | %of a problem since no one should have been doing that in the first
|
---|
824 | %place.
|
---|
825 |
|
---|
826 | % ======================================================================
|
---|
827 | \section{Extending/Embedding Changes}
|
---|
828 |
|
---|
829 | Some of the changes are under the covers, and will only be apparent to
|
---|
830 | people writing C extension modules or embedding a Python interpreter
|
---|
831 | in a larger application. If you aren't dealing with Python's C API,
|
---|
832 | you can safely skip this section.
|
---|
833 |
|
---|
834 | The version number of the Python C API was incremented, so C
|
---|
835 | extensions compiled for 1.5.2 must be recompiled in order to work with
|
---|
836 | 2.0. On Windows, it's not possible for Python 2.0 to import a third
|
---|
837 | party extension built for Python 1.5.x due to how Windows DLLs work,
|
---|
838 | so Python will raise an exception and the import will fail.
|
---|
839 |
|
---|
840 | Users of Jim Fulton's ExtensionClass module will be pleased to find
|
---|
841 | out that hooks have been added so that ExtensionClasses are now
|
---|
842 | supported by \function{isinstance()} and \function{issubclass()}.
|
---|
843 | This means you no longer have to remember to write code such as
|
---|
844 | \code{if type(obj) == myExtensionClass}, but can use the more natural
|
---|
845 | \code{if isinstance(obj, myExtensionClass)}.
|
---|
846 |
|
---|
847 | The \file{Python/importdl.c} file, which was a mass of \#ifdefs to
|
---|
848 | support dynamic loading on many different platforms, was cleaned up
|
---|
849 | and reorganised by Greg Stein. \file{importdl.c} is now quite small,
|
---|
850 | and platform-specific code has been moved into a bunch of
|
---|
851 | \file{Python/dynload_*.c} files. Another cleanup: there were also a
|
---|
852 | number of \file{my*.h} files in the Include/ directory that held
|
---|
853 | various portability hacks; they've been merged into a single file,
|
---|
854 | \file{Include/pyport.h}.
|
---|
855 |
|
---|
856 | Vladimir Marangozov's long-awaited malloc restructuring was completed,
|
---|
857 | to make it easy to have the Python interpreter use a custom allocator
|
---|
858 | instead of C's standard \function{malloc()}. For documentation, read
|
---|
859 | the comments in \file{Include/pymem.h} and
|
---|
860 | \file{Include/objimpl.h}. For the lengthy discussions during which
|
---|
861 | the interface was hammered out, see the Web archives of the 'patches'
|
---|
862 | and 'python-dev' lists at python.org.
|
---|
863 |
|
---|
864 | Recent versions of the GUSI development environment for MacOS support
|
---|
865 | POSIX threads. Therefore, Python's POSIX threading support now works
|
---|
866 | on the Macintosh. Threading support using the user-space GNU \texttt{pth}
|
---|
867 | library was also contributed.
|
---|
868 |
|
---|
869 | Threading support on Windows was enhanced, too. Windows supports
|
---|
870 | thread locks that use kernel objects only in case of contention; in
|
---|
871 | the common case when there's no contention, they use simpler functions
|
---|
872 | which are an order of magnitude faster. A threaded version of Python
|
---|
873 | 1.5.2 on NT is twice as slow as an unthreaded version; with the 2.0
|
---|
874 | changes, the difference is only 10\%. These improvements were
|
---|
875 | contributed by Yakov Markovitch.
|
---|
876 |
|
---|
877 | Python 2.0's source now uses only ANSI C prototypes, so compiling Python now
|
---|
878 | requires an ANSI C compiler, and can no longer be done using a compiler that
|
---|
879 | only supports K\&R C.
|
---|
880 |
|
---|
881 | Previously the Python virtual machine used 16-bit numbers in its
|
---|
882 | bytecode, limiting the size of source files. In particular, this
|
---|
883 | affected the maximum size of literal lists and dictionaries in Python
|
---|
884 | source; occasionally people who are generating Python code would run
|
---|
885 | into this limit. A patch by Charles G. Waldman raises the limit from
|
---|
886 | $2^{16}$ to $2^{32}$.
|
---|
887 |
|
---|
888 | Three new convenience functions intended for adding constants to a
|
---|
889 | module's dictionary at module initialization time were added:
|
---|
890 | \function{PyModule_AddObject()}, \function{PyModule_AddIntConstant()},
|
---|
891 | and \function{PyModule_AddStringConstant()}. Each of these functions
|
---|
892 | takes a module object, a null-terminated C string containing the name
|
---|
893 | to be added, and a third argument for the value to be assigned to the
|
---|
894 | name. This third argument is, respectively, a Python object, a C
|
---|
895 | long, or a C string.
|
---|
896 |
|
---|
897 | A wrapper API was added for \UNIX-style signal handlers.
|
---|
898 | \function{PyOS_getsig()} gets a signal handler and
|
---|
899 | \function{PyOS_setsig()} will set a new handler.
|
---|
900 |
|
---|
901 | % ======================================================================
|
---|
902 | \section{Distutils: Making Modules Easy to Install}
|
---|
903 |
|
---|
904 | Before Python 2.0, installing modules was a tedious affair -- there
|
---|
905 | was no way to figure out automatically where Python is installed, or
|
---|
906 | what compiler options to use for extension modules. Software authors
|
---|
907 | had to go through an arduous ritual of editing Makefiles and
|
---|
908 | configuration files, which only really work on \UNIX{} and leave Windows
|
---|
909 | and MacOS unsupported. Python users faced wildly differing
|
---|
910 | installation instructions which varied between different extension
|
---|
911 | packages, which made administering a Python installation something of
|
---|
912 | a chore.
|
---|
913 |
|
---|
914 | The SIG for distribution utilities, shepherded by Greg Ward, has
|
---|
915 | created the Distutils, a system to make package installation much
|
---|
916 | easier. They form the \module{distutils} package, a new part of
|
---|
917 | Python's standard library. In the best case, installing a Python
|
---|
918 | module from source will require the same steps: first you simply
|
---|
919 | unpack the tarball or zip archive, and then run ``\code{python setup.py
|
---|
920 | install}''. The platform will be automatically detected, the compiler
|
---|
921 | will be recognized, C extension modules will be compiled, and the
|
---|
922 | distribution installed into the proper directory. Optional
|
---|
923 | command-line arguments provide more control over the installation
|
---|
924 | process; the distutils package offers many places to override defaults
|
---|
925 | -- separating the build from the install, building or installing in
|
---|
926 | non-default directories, and more.
|
---|
927 |
|
---|
928 | In order to use the Distutils, you need to write a \file{setup.py}
|
---|
929 | script. For the simple case, when the software contains only .py
|
---|
930 | files, a minimal \file{setup.py} can be just a few lines long:
|
---|
931 |
|
---|
932 | \begin{verbatim}
|
---|
933 | from distutils.core import setup
|
---|
934 | setup (name = "foo", version = "1.0",
|
---|
935 | py_modules = ["module1", "module2"])
|
---|
936 | \end{verbatim}
|
---|
937 |
|
---|
938 | The \file{setup.py} file isn't much more complicated if the software
|
---|
939 | consists of a few packages:
|
---|
940 |
|
---|
941 | \begin{verbatim}
|
---|
942 | from distutils.core import setup
|
---|
943 | setup (name = "foo", version = "1.0",
|
---|
944 | packages = ["package", "package.subpackage"])
|
---|
945 | \end{verbatim}
|
---|
946 |
|
---|
947 | A C extension can be the most complicated case; here's an example taken from
|
---|
948 | the PyXML package:
|
---|
949 |
|
---|
950 |
|
---|
951 | \begin{verbatim}
|
---|
952 | from distutils.core import setup, Extension
|
---|
953 |
|
---|
954 | expat_extension = Extension('xml.parsers.pyexpat',
|
---|
955 | define_macros = [('XML_NS', None)],
|
---|
956 | include_dirs = [ 'extensions/expat/xmltok',
|
---|
957 | 'extensions/expat/xmlparse' ],
|
---|
958 | sources = [ 'extensions/pyexpat.c',
|
---|
959 | 'extensions/expat/xmltok/xmltok.c',
|
---|
960 | 'extensions/expat/xmltok/xmlrole.c',
|
---|
961 | ]
|
---|
962 | )
|
---|
963 | setup (name = "PyXML", version = "0.5.4",
|
---|
964 | ext_modules =[ expat_extension ] )
|
---|
965 | \end{verbatim}
|
---|
966 |
|
---|
967 | The Distutils can also take care of creating source and binary
|
---|
968 | distributions. The ``sdist'' command, run by ``\code{python setup.py
|
---|
969 | sdist}'', builds a source distribution such as \file{foo-1.0.tar.gz}.
|
---|
970 | Adding new commands isn't difficult; ``bdist_rpm'' and
|
---|
971 | ``bdist_wininst'' commands have already been contributed to create an
|
---|
972 | RPM distribution and a Windows installer for the software,
|
---|
973 | respectively. Commands to create other distribution formats such as
|
---|
974 | Debian packages and Solaris \file{.pkg} files are in various stages of
|
---|
975 | development.
|
---|
976 |
|
---|
977 | All this is documented in a new manual, \textit{Distributing Python
|
---|
978 | Modules}, that joins the basic set of Python documentation.
|
---|
979 |
|
---|
980 | % ======================================================================
|
---|
981 | \section{XML Modules}
|
---|
982 |
|
---|
983 | Python 1.5.2 included a simple XML parser in the form of the
|
---|
984 | \module{xmllib} module, contributed by Sjoerd Mullender. Since
|
---|
985 | 1.5.2's release, two different interfaces for processing XML have
|
---|
986 | become common: SAX2 (version 2 of the Simple API for XML) provides an
|
---|
987 | event-driven interface with some similarities to \module{xmllib}, and
|
---|
988 | the DOM (Document Object Model) provides a tree-based interface,
|
---|
989 | transforming an XML document into a tree of nodes that can be
|
---|
990 | traversed and modified. Python 2.0 includes a SAX2 interface and a
|
---|
991 | stripped-down DOM interface as part of the \module{xml} package.
|
---|
992 | Here we will give a brief overview of these new interfaces; consult
|
---|
993 | the Python documentation or the source code for complete details.
|
---|
994 | The Python XML SIG is also working on improved documentation.
|
---|
995 |
|
---|
996 | \subsection{SAX2 Support}
|
---|
997 |
|
---|
998 | SAX defines an event-driven interface for parsing XML. To use SAX,
|
---|
999 | you must write a SAX handler class. Handler classes inherit from
|
---|
1000 | various classes provided by SAX, and override various methods that
|
---|
1001 | will then be called by the XML parser. For example, the
|
---|
1002 | \method{startElement} and \method{endElement} methods are called for
|
---|
1003 | every starting and end tag encountered by the parser, the
|
---|
1004 | \method{characters()} method is called for every chunk of character
|
---|
1005 | data, and so forth.
|
---|
1006 |
|
---|
1007 | The advantage of the event-driven approach is that the whole
|
---|
1008 | document doesn't have to be resident in memory at any one time, which
|
---|
1009 | matters if you are processing really huge documents. However, writing
|
---|
1010 | the SAX handler class can get very complicated if you're trying to
|
---|
1011 | modify the document structure in some elaborate way.
|
---|
1012 |
|
---|
1013 | For example, this little program defines a handler that prints
|
---|
1014 | a message for every starting and ending tag, and then parses the file
|
---|
1015 | \file{hamlet.xml} using it:
|
---|
1016 |
|
---|
1017 | \begin{verbatim}
|
---|
1018 | from xml import sax
|
---|
1019 |
|
---|
1020 | class SimpleHandler(sax.ContentHandler):
|
---|
1021 |     def startElement(self, name, attrs):
|
---|
1022 |         print 'Start of element:', name, attrs.keys()
|
---|
1023 |
|
---|
1024 |     def endElement(self, name):
|
---|
1025 |         print 'End of element:', name
|
---|
1026 |
|
---|
1027 | # Create a parser object
|
---|
1028 | parser = sax.make_parser()
|
---|
1029 |
|
---|
1030 | # Tell it what handler to use
|
---|
1031 | handler = SimpleHandler()
|
---|
1032 | parser.setContentHandler( handler )
|
---|
1033 |
|
---|
1034 | # Parse a file!
|
---|
1035 | parser.parse( 'hamlet.xml' )
|
---|
1036 | \end{verbatim}
|
---|
1037 |
|
---|
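For experimenting without a file on disk, a handler can also be driven from a string with \function{xml.sax.parseString()}. The following is a sketch only: the \class{TagCollector} class and the tiny document are invented for this example, and the handler records events instead of printing them.

```python
from xml import sax

class TagCollector(sax.ContentHandler):
    def __init__(self):
        sax.ContentHandler.__init__(self)
        self.events = []
        self.text = []

    def startElement(self, name, attrs):
        self.events.append(('start', name))

    def endElement(self, name):
        self.events.append(('end', name))

    def characters(self, content):
        # May be called several times for one run of text.
        self.text.append(content)

handler = TagCollector()
sax.parseString('<PLAY><TITLE>Hamlet</TITLE></PLAY>', handler)
title = ''.join(handler.text)
```

The events arrive in document order, so \code{handler.events} begins with the start of \code{PLAY} and ends with its end tag, and \code{title} is the string \code{'Hamlet'}.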
1038 | For more information, consult the Python documentation, or the XML
|
---|
1039 | HOWTO at \url{http://pyxml.sourceforge.net/topics/howto/xml-howto.html}.
|
---|
1040 |
|
---|
1041 | \subsection{DOM Support}
|
---|
1042 |
|
---|
1043 | The Document Object Model is a tree-based representation for an XML
|
---|
1044 | document. A top-level \class{Document} instance is the root of the
|
---|
1045 | tree, and has a single child which is the top-level \class{Element}
|
---|
1046 | instance. This \class{Element} has children nodes representing
|
---|
1047 | character data and any sub-elements, which may have further children
|
---|
1048 | of their own, and so forth. Using the DOM you can traverse the
|
---|
1049 | resulting tree any way you like, access element and attribute values,
|
---|
1050 | insert and delete nodes, and convert the tree back into XML.
|
---|
1051 |
|
---|
1052 | The DOM is useful for modifying XML documents, because you can create
|
---|
1053 | a DOM tree, modify it by adding new nodes or rearranging subtrees, and
|
---|
1054 | then produce a new XML document as output. You can also construct a
|
---|
1055 | DOM tree manually and convert it to XML, which can be a more flexible
|
---|
1056 | way of producing XML output than simply writing
|
---|
1057 | \code{<tag1>}...\code{</tag1>} to a file.
|
---|
1058 |
|
---|
1059 | The DOM implementation included with Python lives in the
|
---|
1060 | \module{xml.dom.minidom} module. It's a lightweight implementation of
|
---|
1061 | the Level 1 DOM with support for XML namespaces. The
|
---|
1062 | \function{parse()} and \function{parseString()} convenience
|
---|
1063 | functions are provided for generating a DOM tree:
|
---|
1064 |
|
---|
1065 | \begin{verbatim}
|
---|
1066 | from xml.dom import minidom
|
---|
1067 | doc = minidom.parse('hamlet.xml')
|
---|
1068 | \end{verbatim}
|
---|
1069 |
|
---|
1070 | \code{doc} is a \class{Document} instance. \class{Document}, like all
|
---|
1071 | the other DOM classes such as \class{Element} and \class{Text}, is a
|
---|
1072 | subclass of the \class{Node} base class. All the nodes in a DOM tree
|
---|
1073 | therefore support certain common methods, such as \method{toxml()}
|
---|
1074 | which returns a string containing the XML representation of the node
|
---|
1075 | and its children. Each class also has special methods of its own; for
|
---|
1076 | example, \class{Element} and \class{Document} instances have a method
|
---|
1077 | to find all child elements with a given tag name. Continuing from the
|
---|
1078 | previous 2-line example:
|
---|
1079 |
|
---|
1080 | \begin{verbatim}
|
---|
1081 | perslist = doc.getElementsByTagName( 'PERSONA' )
|
---|
1082 | print perslist[0].toxml()
|
---|
1083 | print perslist[1].toxml()
|
---|
1084 | \end{verbatim}
|
---|
1085 |
|
---|
1086 | For the \textit{Hamlet} XML file, the above few lines output:
|
---|
1087 |
|
---|
1088 | \begin{verbatim}
|
---|
1089 | <PERSONA>CLAUDIUS, king of Denmark. </PERSONA>
|
---|
1090 | <PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA>
|
---|
1091 | \end{verbatim}
|
---|
1092 |
|
---|
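The same query can be tried on a small in-memory document with \function{parseString()}. This is a sketch; the two-element \code{CAST} document is invented and is far smaller than the real \textit{Hamlet} file.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<CAST><PERSONA>CLAUDIUS</PERSONA><PERSONA>HAMLET</PERSONA></CAST>')

# Find every PERSONA element, however deeply nested.
perslist = doc.getElementsByTagName('PERSONA')

# Serialize the first match, and read the text node inside each match.
first = perslist[0].toxml()
names = [node.firstChild.data for node in perslist]
```

Here \code{first} is the string \code{<PERSONA>CLAUDIUS</PERSONA>} and \code{names} lists the two character names in document order.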
1093 | The root element of the document is available as
|
---|
1094 | \code{doc.documentElement}, and its children can be easily modified
|
---|
1095 | by adding, removing, or moving nodes:
|
---|
1096 |
|
---|
1097 | \begin{verbatim}
|
---|
1098 | root = doc.documentElement
|
---|
1099 |
|
---|
1100 | # Remove the first child
|
---|
1101 | root.removeChild( root.childNodes[0] )
|
---|
1102 |
|
---|
1103 | # Move the new first child to the end
|
---|
1104 | root.appendChild( root.childNodes[0] )
|
---|
1105 |
|
---|
1106 | # Insert the new first child (originally,
|
---|
1107 | # the third child) before the child at index 20.
|
---|
1108 | root.insertBefore( root.childNodes[0], root.childNodes[20] )
|
---|
1109 | \end{verbatim}
|
---|
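The same three operations can be tried without the \textit{Hamlet} file. The following is a minimal sketch, assuming a made-up four-element document passed to \function{parseString()}; the \code{names()} helper and the element names are hypothetical, and the behaviour shown is that of \module{xml.dom.minidom} in a current Python release, where appending or inserting a node that is already in the tree moves it.

```python
from xml.dom import minidom

# A tiny stand-in document; the element names are invented for this sketch.
SOURCE = "<cast><p>A</p><p>B</p><p>C</p><p>D</p></cast>"

doc = minidom.parseString(SOURCE)
root = doc.documentElement


def names(node):
    """Return the text of each child element, in document order."""
    return [child.firstChild.data for child in node.childNodes]


# Remove the first child (A); removeChild() returns the detached node.
removed = root.removeChild(root.childNodes[0])

# Appending a node that is already in the tree moves it: B goes to the end.
root.appendChild(root.childNodes[0])

# Likewise, insertBefore() moves the current first child (C)
# so that it sits just before the last child (B).
root.insertBefore(root.childNodes[0], root.childNodes[2])

print(names(root))
print(root.toxml())
```

Because the reference node is looked up before the call, \method{insertBefore()} still finds it after the moved node has been detached from its old position.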
1110 |
|
---|
1111 | Again, I will refer you to the Python documentation for a complete
|
---|
1112 | listing of the different \class{Node} classes and their various methods.
|
---|
1113 |
|
---|
1114 | \subsection{Relationship to PyXML}
|
---|
1115 |
|
---|
1116 | The XML Special Interest Group has been working on XML-related Python
|
---|
1117 | code for a while. Its code distribution, called PyXML, is available
|
---|
1118 | from the SIG's Web pages at \url{http://www.python.org/sigs/xml-sig/}.
|
---|
1119 | The PyXML distribution also used the package name \samp{xml}. If
|
---|
1120 | you've written programs that used PyXML, you're probably wondering
|
---|
1121 | about its compatibility with the 2.0 \module{xml} package.
|
---|
1122 |
|
---|
1123 | The answer is that Python 2.0's \module{xml} package isn't compatible
|
---|
1124 | with PyXML, but can be made compatible by installing a recent version of
|
---|
1125 | PyXML. Many applications can get by with the XML support that is
|
---|
1126 | included with Python 2.0, but more complicated applications will
|
---|
1127 | require that the full PyXML package be installed. When
|
---|
1128 | installed, PyXML versions 0.6.0 or greater will replace the
|
---|
1129 | \module{xml} package shipped with Python, and will be a strict
|
---|
1130 | superset of the standard package, adding a bunch of additional
|
---|
1131 | features. Some of the additional features in PyXML include:
|
---|
1132 |
|
---|
1133 | \begin{itemize}
|
---|
1134 | \item 4DOM, a full DOM implementation
|
---|
1135 | from Fourthought, Inc.
|
---|
1136 | \item The xmlproc validating parser, written by Lars Marius Garshol.
|
---|
1137 | \item The \module{sgmlop} parser accelerator module, written by Fredrik Lundh.
|
---|
1138 | \end{itemize}
|
---|
1139 |
|
---|
1140 | % ======================================================================
|
---|
1141 | \section{Module changes}
|
---|
1142 |
|
---|
1143 | Lots of improvements and bugfixes were made to Python's extensive
|
---|
1144 | standard library; some of the affected modules include
|
---|
1145 | \module{readline}, \module{ConfigParser}, \module{cgi},
|
---|
1146 | \module{calendar}, \module{posix}, \module{xmllib},
|
---|
1147 | \module{aifc}, \module{chunk}, \module{wave}, \module{random}, \module{shelve},
|
---|
1148 | and \module{nntplib}. Consult the CVS logs for the exact
|
---|
1149 | patch-by-patch details.
|
---|
1150 |
|
---|
1151 | Brian Gallew contributed OpenSSL support for the \module{socket}
|
---|
1152 | module. OpenSSL is an implementation of the Secure Sockets Layer,
|
---|
1153 | which encrypts the data being sent over a socket. When compiling
|
---|
1154 | Python, you can edit \file{Modules/Setup} to include SSL support,
|
---|
1155 | which adds an additional function to the \module{socket} module:
|
---|
1156 | \function{socket.ssl(\var{socket}, \var{keyfile}, \var{certfile})},
|
---|
1157 | which takes a socket object and returns an SSL socket. The
|
---|
1158 | \module{httplib} and \module{urllib} modules were also changed to
|
---|
1159 | support ``https://'' URLs, though no one has implemented FTP or SMTP
|
---|
1160 | over SSL.
|
---|
1161 |
|
---|
1162 | The \module{httplib} module has been rewritten by Greg Stein to
|
---|
1163 | support HTTP/1.1. Backward compatibility with the 1.5 version of
|
---|
1164 | \module{httplib} is provided, though using HTTP/1.1 features such as
|
---|
1165 | pipelining will require rewriting code to use a different set of
|
---|
1166 | interfaces.
|
---|
1167 |
|
---|
1168 | The \module{Tkinter} module now supports Tcl/Tk versions 8.1, 8.2, or
|
---|
1169 | 8.3, and support for the older 7.x versions has been dropped. The
|
---|
1170 | \module{Tkinter} module now supports displaying Unicode strings in Tk widgets.
|
---|
1171 | Also, Fredrik Lundh contributed an optimization which makes operations
|
---|
1172 | like \code{create_line} and \code{create_polygon} much faster,
|
---|
1173 | especially when using lots of coordinates.
|
---|
1174 |
|
---|
1175 | The \module{curses} module has been greatly extended, starting from
|
---|
1176 | Oliver Andrich's enhanced version, to provide many additional
|
---|
1177 | functions from ncurses and SYSV curses, such as colour, alternative
|
---|
1178 | character set support, pads, and mouse support. This means the module
|
---|
1179 | is no longer compatible with operating systems that only have BSD
|
---|
1180 | curses, but there don't seem to be any currently maintained OSes that
|
---|
1181 | fall into this category.
|
---|
1182 |
|
---|
1183 | As mentioned in the earlier discussion of 2.0's Unicode support, the
|
---|
1184 | underlying implementation of the regular expressions provided by the
|
---|
1185 | \module{re} module has been changed. SRE, a new regular expression
|
---|
1186 | engine written by Fredrik Lundh and partially funded by Hewlett
|
---|
1187 | Packard, supports matching against both 8-bit strings and Unicode
|
---|
1188 | strings.
|
---|
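As a small illustration of matching against both kinds of string, the sketch below compiles the same pattern once as a Unicode string and once as an 8-bit string. It is written for a current Python interpreter (the \code{b''} literal did not exist in 2.0), and the sample text is invented for the example.

```python
import re

# The same pattern, once as a Unicode string and once as an 8-bit string.
unicode_word = re.compile(u"\\w+", re.UNICODE)
byte_word = re.compile(b"\\w+")

# With re.UNICODE, \w also matches non-ASCII letters such as e-acute.
unicode_match = unicode_word.match(u"caf\u00e9 au lait")

# An 8-bit pattern is matched against an 8-bit string.
byte_match = byte_word.match(b"cafe au lait")

print(repr(unicode_match.group()))
print(repr(byte_match.group()))
```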
1189 |
|
---|
1190 | % ======================================================================
|
---|
1191 | \section{New modules}
|
---|
1192 |
|
---|
1193 | A number of new modules were added. We'll simply list them with brief
|
---|
1194 | descriptions; consult the 2.0 documentation for the details of a
|
---|
1195 | particular module.
|
---|
1196 |
|
---|
1197 | \begin{itemize}
|
---|
1198 |
|
---|
1199 | \item{\module{atexit}:}
|
---|
1200 | For registering functions to be called before the Python interpreter exits.
|
---|
1201 | Code that currently sets
|
---|
1202 | \code{sys.exitfunc} directly should be changed to
|
---|
1203 | use the \module{atexit} module instead, importing \module{atexit}
|
---|
1204 | and calling \function{atexit.register()} with
|
---|
1205 | the function to be called on exit.
|
---|
1206 | (Contributed by Skip Montanaro.)
|
---|
1207 |
|
---|
1208 | \item{\module{codecs}, \module{encodings}, \module{unicodedata}:} Added as part of the new Unicode support.
|
---|
1209 |
|
---|
1210 | \item{\module{filecmp}:} Supersedes the old \module{cmp}, \module{cmpcache} and
|
---|
1211 | \module{dircmp} modules, which have now become deprecated.
|
---|
1212 | (Contributed by Gordon McMillan and Moshe Zadka.)
|
---|
1213 |
|
---|
1214 | \item{\module{gettext}:} This module provides internationalization
|
---|
1215 | (I18N) and localization (L10N) support for Python programs by
|
---|
1216 | providing an interface to the GNU gettext message catalog library.
|
---|
1217 | (Integrated by Barry Warsaw, from separate contributions by Martin
|
---|
1218 | von~L\"owis, Peter Funk, and James Henstridge.)
|
---|
1219 |
|
---|
1220 | \item{\module{linuxaudiodev}:} Support for the \file{/dev/audio}
|
---|
1221 | device on Linux, a twin to the existing \module{sunaudiodev} module.
|
---|
1222 | (Contributed by Peter Bosch, with fixes by Jeremy Hylton.)
|
---|
1223 |
|
---|
1224 | \item{\module{mmap}:} An interface to memory-mapped files on both
|
---|
1225 | Windows and \UNIX. A file's contents can be mapped directly into
|
---|
1226 | memory, at which point it behaves like a mutable string, so its
|
---|
1227 | contents can be read and modified. The mapped object can even be passed to
|
---|
1228 | functions that expect ordinary strings, such as the \module{re}
|
---|
1229 | module. (Contributed by Sam Rushing, with some extensions by
|
---|
1230 | A.M. Kuchling.)
|
---|
1231 |
|
---|
1232 | \item{\module{pyexpat}:} An interface to the Expat XML parser.
|
---|
1233 | (Contributed by Paul Prescod.)
|
---|
1234 |
|
---|
1235 | \item{\module{robotparser}:} Parses a \file{robots.txt} file, which is
|
---|
1236 | useful when writing Web spiders that politely avoid certain areas of a
|
---|
1237 | Web site. The parser accepts the contents of a \file{robots.txt} file,
|
---|
1238 | builds a set of rules from it, and can then answer questions about
|
---|
1239 | the fetchability of a given URL. (Contributed by Skip Montanaro.)
|
---|
1240 |
|
---|
1241 | \item{\module{tabnanny}:} A module/script to
|
---|
1242 | check Python source code for ambiguous indentation.
|
---|
1243 | (Contributed by Tim Peters.)
|
---|
1244 |
|
---|
1245 | \item{\module{UserString}:} A base class useful for deriving objects that behave like strings.
|
---|
1246 |
|
---|
1247 | \item{\module{webbrowser}:} A module that provides a platform-independent
|
---|
1248 | way to launch a web browser on a specific URL. For each platform, various
|
---|
1249 | browsers are tried in a specific order. The user can alter which browser
|
---|
1250 | is launched by setting the \var{BROWSER} environment variable.
|
---|
1251 | (Originally inspired by Eric S. Raymond's patch to \module{urllib}
|
---|
1252 | which added similar functionality, but
|
---|
1253 | the final module comes from code originally
|
---|
1254 | implemented by Fred Drake as \file{Tools/idle/BrowserControl.py},
|
---|
1255 | and adapted for the standard library by Fred.)
|
---|
1256 |
|
---|
1257 | \item{\module{_winreg}:} An interface to the
|
---|
1258 | Windows registry. \module{_winreg} is an adaptation of functions that
|
---|
1259 | have been part of PythonWin since 1995, but has now been added to the core
|
---|
1260 | distribution, and enhanced to support Unicode.
|
---|
1261 | \module{_winreg} was written by Bill Tutt and Mark Hammond.
|
---|
1262 |
|
---|
1263 | \item{\module{zipfile}:} A module for reading and writing ZIP-format
|
---|
1264 | archives. These are archives produced by \program{PKZIP} on
|
---|
1265 | DOS/Windows or \program{zip} on \UNIX, not to be confused with
|
---|
1266 | \program{gzip}-format files (which are supported by the \module{gzip}
|
---|
1267 | module).
|
---|
1268 | (Contributed by James C. Ahlstrom.)
|
---|
1269 |
|
---|
1270 | \item{\module{imputil}:} A module that provides a simpler way for
|
---|
1271 | writing customised import hooks, in comparison to the existing
|
---|
1272 | \module{ihooks} module. (Implemented by Greg Stein, with much
|
---|
1273 | discussion on python-dev along the way.)
|
---|
1274 |
|
---|
1275 | \end{itemize}
|
---|
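A few of the modules listed above can be exercised together in one short script. This is only a sketch, written for a current Python interpreter rather than 2.0; the scratch directory and the file names \file{demo.zip}, \file{hello.txt} and \file{data.bin} are hypothetical. It registers a cleanup function with \function{atexit.register()}, round-trips one member through a \module{zipfile} archive, and uses \module{mmap} to search a file with \module{re} and modify it in place.

```python
import atexit
import mmap
import os
import re
import shutil
import tempfile
import zipfile

# Scratch directory for the demonstration (file names below are made up).
workdir = tempfile.mkdtemp()

# atexit: register a cleanup callable rather than assigning sys.exitfunc.
# Extra arguments to register() are passed on to the callable at exit.
atexit.register(shutil.rmtree, workdir, True)

# zipfile: write a one-member ZIP archive, then read the member back.
archive_path = os.path.join(workdir, "demo.zip")
archive = zipfile.ZipFile(archive_path, "w")
archive.writestr("hello.txt", "hello from zipfile")
archive.close()

archive = zipfile.ZipFile(archive_path, "r")
member_names = archive.namelist()
member_data = archive.read("hello.txt")
archive.close()

# mmap: map a file into memory, search it with re, and modify it in place.
data_path = os.path.join(workdir, "data.bin")
f = open(data_path, "wb")
f.write(b"spam eggs spam")
f.close()

f = open(data_path, "r+b")
mapped = mmap.mmap(f.fileno(), 0)    # a length of 0 maps the whole file
match = re.search(b"eggs", mapped)   # the mapping is searched like a string
start = match.start()
mapped[start:start + 4] = b"EGGS"    # slice assignment must keep the length
mapped.flush()
mapped.close()
f.close()

# Re-read the file to confirm that the change reached the disk.
f = open(data_path, "rb")
final_contents = f.read()
f.close()
```

Registering \function{shutil.rmtree()} with \module{atexit} means the scratch directory is removed when the interpreter exits, even if a later part of the script fails.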
1276 |
|
---|
1277 | % ======================================================================
|
---|
1278 | \section{IDLE Improvements}
|
---|
1279 |
|
---|
1280 | IDLE is the official Python cross-platform IDE, written using Tkinter.
|
---|
1281 | Python 2.0 includes IDLE 0.6, which adds a number of new features and
|
---|
1282 | improvements. A partial list:
|
---|
1283 |
|
---|
1284 | \begin{itemize}
|
---|
1285 | \item UI improvements and optimizations,
|
---|
1286 | especially in the area of syntax highlighting and auto-indentation.
|
---|
1287 |
|
---|
1288 | \item The class browser now shows more information, such as the
|
---|
1289 | top-level functions in a module.
|
---|
1290 |
|
---|
1291 | \item Tab width is now a user-settable option. When opening an existing Python
|
---|
1292 | file, IDLE automatically detects the indentation conventions, and adapts.
|
---|
1293 |
|
---|
1294 | \item There is now support for calling browsers on various platforms,
|
---|
1295 | used to open the Python documentation in a browser.
|
---|
1296 |
|
---|
1297 | \item IDLE now has a command line, which is largely similar to
|
---|
1298 | the vanilla Python interpreter.
|
---|
1299 |
|
---|
1300 | \item Call tips were added in many places.
|
---|
1301 |
|
---|
1302 | \item IDLE can now be installed as a package.
|
---|
1303 |
|
---|
1304 | \item In the editor window, there is now a line/column bar at the bottom.
|
---|
1305 |
|
---|
1306 | \item Three new keystroke commands: Check module (Alt-F5), Import
|
---|
1307 | module (F5) and Run script (Ctrl-F5).
|
---|
1308 |
|
---|
1309 | \end{itemize}
|
---|
1310 |
|
---|
1311 | % ======================================================================
|
---|
1312 | \section{Deleted and Deprecated Modules}
|
---|
1313 |
|
---|
1314 | A few modules have been dropped because they're obsolete, or because
|
---|
1315 | there are now better ways to do the same thing. The \module{stdwin}
|
---|
1316 | module is gone; it was for a platform-independent windowing toolkit
|
---|
1317 | that's no longer developed.
|
---|
1318 |
|
---|
1319 | A number of modules have been moved to the
|
---|
1320 | \file{lib-old} subdirectory:
|
---|
1321 | \module{cmp}, \module{cmpcache}, \module{dircmp}, \module{dump},
|
---|
1322 | \module{find}, \module{grep}, \module{packmail},
|
---|
1323 | \module{poly}, \module{util}, \module{whatsound}, \module{zmod}.
|
---|
1324 | If you have code which relies on a module that's been moved to
|
---|
1325 | \file{lib-old}, you can simply add that directory to \code{sys.path}
|
---|
1326 | to get it back, but you're encouraged to update any code that uses
|
---|
1327 | these modules.
|
---|
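A minimal sketch of that workaround follows. It assumes that \file{lib-old} sits directly inside the standard library directory, which may not hold for every installation, so adjust the path to wherever your copy actually lives. The directory is appended rather than prepended so that current modules take precedence over the old ones.

```python
import os
import sys

# Assumed location: a "lib-old" directory inside the standard library
# directory. This is a guess; change it to match your installation.
stdlib_dir = os.path.dirname(os.__file__)
lib_old = os.path.join(stdlib_dir, "lib-old")

# Append (not prepend) so that current modules shadow the old ones.
if lib_old not in sys.path:
    sys.path.append(lib_old)
```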
1328 |
|
---|
1329 | \section{Acknowledgements}
|
---|
1330 |
|
---|
1331 | The authors would like to thank the following people for offering
|
---|
1332 | suggestions on various drafts of this article: David Bolen, Mark
|
---|
1333 | Hammond, Gregg Hauser, Jeremy Hylton, Fredrik Lundh, Detlef Lannert,
|
---|
1334 | Aahz Maruch, Skip Montanaro, Vladimir Marangozov, Tobias Polzin, Guido
|
---|
1335 | van Rossum, Neil Schemenauer, and Russ Schmidt.
|
---|
1336 |
|
---|
1337 | \end{document}
|
---|