\documentclass{howto}

% $Id: whatsnew22.tex 37315 2004-09-10 19:33:00Z akuchling $

\title{What's New in Python 2.2}
\release{1.02}
\author{A.M. Kuchling}
\authoraddress{
    \strong{Python Software Foundation}\\
    Email: \email{amk@amk.ca}
}
\begin{document}
\maketitle\tableofcontents

\section{Introduction}

This article explains the new features in Python 2.2.2, released on
October 14, 2002. Python 2.2.2 is a bugfix release of Python 2.2,
originally released on December 21, 2001.

Python 2.2 can be thought of as the ``cleanup release''. There are some
features such as generators and iterators that are completely new, but
most of the changes, significant and far-reaching though they may be,
are aimed at cleaning up irregularities and dark corners of the
language design.

This article doesn't attempt to provide a complete specification of
the new features, but instead provides a convenient overview. For
full details, you should refer to the documentation for Python 2.2,
such as the
\citetitle[http://www.python.org/doc/2.2/lib/lib.html]{Python
Library Reference} and the
\citetitle[http://www.python.org/doc/2.2/ref/ref.html]{Python
Reference Manual}. If you want to understand the complete
implementation and design rationale for a change, refer to the PEP for
a particular new feature.

\begin{seealso}

\seeurl{http://www.unixreview.com/documents/s=1356/urm0109h/0109h.htm}
{``What's So Special About Python 2.2?'' is also about the new 2.2
features, and was written by Cameron Laird and Kathryn Soraiz.}

\end{seealso}


%======================================================================
\section{PEPs 252 and 253: Type and Class Changes}

The largest and most far-reaching changes in Python 2.2 are to
Python's model of objects and classes. The changes should be backward
compatible, so it's likely that your code will continue to run
unchanged, but the changes provide some amazing new capabilities.
Before beginning this, the longest and most complicated section of
this article, I'll provide an overview of the changes and offer some
comments.

A long time ago I wrote a Web page
(\url{http://www.amk.ca/python/writing/warts.html}) listing flaws in
Python's design. One of the most significant flaws was that it's
impossible to subclass Python types implemented in C. In particular,
it's not possible to subclass built-in types, so you can't just
subclass, say, lists in order to add a single useful method to them.
The \module{UserList} module provides a class that supports all of the
methods of lists and that can be subclassed further, but there's lots
of C code that expects a regular Python list and won't accept a
\class{UserList} instance.

Python 2.2 fixes this, and in the process adds some exciting new
capabilities. A brief summary:

\begin{itemize}

\item You can subclass built-in types such as lists and even integers,
and your subclasses should work in every place that requires the
original type.

\item It's now possible to define static and class methods, in addition
to the instance methods available in previous versions of Python.

\item It's also possible to automatically call methods on accessing or
setting an instance attribute by using a new mechanism called
\dfn{properties}. Many uses of \method{__getattr__} can be rewritten
to use properties instead, making the resulting code simpler and
faster. As a small side benefit, attributes can now have docstrings,
too.

\item The list of legal attributes for an instance can be limited to a
particular set using \dfn{slots}, making it possible to safeguard
against typos and perhaps make more optimizations possible in future
versions of Python.

\end{itemize}

Some users have voiced concern about all these changes. Sure, they
say, the new features are neat and lend themselves to all sorts of
tricks that weren't possible in previous versions of Python, but
they also make the language more complicated. Some people have said
that they've always recommended Python for its simplicity, and feel
that its simplicity is being lost.

Personally, I think there's no need to worry. Many of the new
features are quite esoteric, and you can write a lot of Python code
without ever needing to be aware of them. Writing a simple class is no
more difficult than it ever was, so you don't need to bother learning
or teaching them unless they're actually needed. Some very
complicated tasks that were previously only possible from C will now
be possible in pure Python, and to my mind that's all for the better.

I'm not going to attempt to cover every single corner case and small
change that were required to make the new features work. Instead this
section will paint only the broad strokes. See section~\ref{sect-rellinks},
``Related Links'', for further sources of information about Python 2.2's new
object model.


\subsection{Old and New Classes}

First, you should know that Python 2.2 really has two kinds of
classes: classic or old-style classes, and new-style classes. The
old-style class model is exactly the same as the class model in
earlier versions of Python. All the new features described in this
section apply only to new-style classes. This divergence isn't
intended to last forever; eventually old-style classes will be
dropped, possibly in Python 3.0.

So how do you define a new-style class? You do it by subclassing an
existing new-style class. Most of Python's built-in types, such as
integers, lists, dictionaries, and even files, are new-style classes
now. A new-style class named \class{object}, the base class for all
built-in types, has also been added so if no built-in type is
suitable, you can just subclass \class{object}:

\begin{verbatim}
class C(object):
    def __init__ (self):
        ...
    ...
\end{verbatim}

This means that \keyword{class} statements that don't have any base
classes are always classic classes in Python 2.2. (Actually you can
also change this by setting a module-level variable named
\member{__metaclass__}; see \pep{253} for the details. It's easier,
though, to just subclass \class{object}.)

The type objects for the built-in types are available as built-ins,
named using a clever trick. Python has always had built-in functions
named \function{int()}, \function{float()}, and \function{str()}. In
2.2, they aren't functions any more, but type objects that behave as
factories when called.

\begin{verbatim}
>>> int
<type 'int'>
>>> int('123')
123
\end{verbatim}

To make the set of types complete, new type objects such as
\function{dict} and \function{file} have been added. Here's a
more interesting example, adding a \method{lock()} method to file
objects:

\begin{verbatim}
class LockableFile(file):
    def lock (self, operation, length=0, start=0, whence=0):
        import fcntl
        return fcntl.lockf(self.fileno(), operation,
                           length, start, whence)
\end{verbatim}

The now-obsolete \module{posixfile} module contained a class that
emulated all of a file object's methods and also added a
\method{lock()} method, but this class couldn't be passed to internal
functions that expected a built-in file, something which is possible
with our new \class{LockableFile}.

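As a minimal sketch of the same idea applied to another built-in type,
the class below subclasses \class{list} to add one method. The names
\class{Stack} and \method{peek()} are hypothetical and chosen only for
illustration; instances remain real lists, so code that checks for a
list should accept them.

```python
class Stack(list):
    """Hypothetical list subclass that adds a single method."""

    def peek(self):
        # Return the last item without removing it.
        return self[-1]


s = Stack([1, 2, 3])
s.append(4)                          # inherited list methods still work
is_real_list = isinstance(s, list)   # the subclass passes list checks
top = s.peek()
```

Nothing here depends on \module{UserList}; the subclass inherits every
list method directly from the built-in type.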

\subsection{Descriptors}

In previous versions of Python, there was no consistent way to
discover what attributes and methods were supported by an object.
There were some informal conventions, such as defining
\member{__members__} and \member{__methods__} attributes that were
lists of names, but often the author of an extension type or a class
wouldn't bother to define them. You could fall back on inspecting the
\member{__dict__} of an object, but when class inheritance or an
arbitrary \method{__getattr__} hook were in use this could still be
inaccurate.

The one big idea underlying the new class model is that an API for
describing the attributes of an object using \dfn{descriptors} has
been formalized. Descriptors specify the value of an attribute,
stating whether it's a method or a field. With the descriptor API,
static methods and class methods become possible, as well as more
exotic constructs.

Attribute descriptors are objects that live inside class objects, and
have a few attributes of their own:

\begin{itemize}

\item \member{__name__} is the attribute's name.

\item \member{__doc__} is the attribute's docstring.

\item \method{__get__(\var{object})} is a method that retrieves the
attribute value from \var{object}.

\item \method{__set__(\var{object}, \var{value})} sets the attribute
on \var{object} to \var{value}.

\item \method{__delete__(\var{object})} deletes the attribute
from \var{object}.
\end{itemize}

For example, when you write \code{obj.x}, the steps that Python
actually performs are:

\begin{verbatim}
descriptor = obj.__class__.x
descriptor.__get__(obj)
\end{verbatim}

For methods, \method{descriptor.__get__} returns a temporary object that's
callable, and wraps up the instance and the method to be called on it.
This is also why static methods and class methods are now possible;
they have descriptors that wrap up just the method, or the method and
the class. As a brief explanation of these new kinds of methods,
static methods aren't passed the instance, and therefore resemble
regular functions. Class methods are passed the class of the object,
but not the object itself. Static and class methods are defined like
this:

\begin{verbatim}
class C(object):
    def f(arg1, arg2):
        ...
    f = staticmethod(f)

    def g(cls, arg1, arg2):
        ...
    g = classmethod(g)
\end{verbatim}

The \function{staticmethod()} function takes the function
\function{f}, and returns it wrapped up in a descriptor so it can be
stored in the class object. You might expect there to be special
syntax for creating such methods (\code{def static f()},
\code{defstatic f()}, or something like that) but no such syntax has
been defined yet; that's been left for future versions of Python.

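To make the calling conventions concrete, here is a small sketch that
uses the same \function{staticmethod()} and \function{classmethod()}
wrapping. The \class{Temperature} class and its method names are
hypothetical, invented only for this illustration.

```python
class Temperature(object):
    def __init__(self, celsius):
        self.celsius = celsius

    def from_fahrenheit(cls, degrees):
        # cls is the class the method was called on, not an instance.
        return cls((degrees - 32) * 5.0 / 9.0)
    from_fahrenheit = classmethod(from_fahrenheit)

    def is_valid(celsius):
        # No instance or class is passed in; this is a plain function
        # that merely lives in the class's namespace.
        return celsius >= -273.15
    is_valid = staticmethod(is_valid)


boiling = Temperature.from_fahrenheit(212)
ok_on_class = Temperature.is_valid(-300)
ok_on_instance = boiling.is_valid(20)   # also callable through an instance
```

Both kinds of method can be called on the class or on an instance; only
the class method receives an implicit first argument.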
More new features, such as slots and properties, are also implemented
as new kinds of descriptors, and it's not difficult to write a
descriptor class that does something novel. For example, it would be
possible to write a descriptor class that made it possible to write
Eiffel-style preconditions and postconditions for a method. A class
that used this feature might be defined like this:

\begin{verbatim}
from eiffel import eiffelmethod

class C(object):
    def f(self, arg1, arg2):
        # The actual function
        ...
    def pre_f(self):
        # Check preconditions
        ...
    def post_f(self):
        # Check postconditions
        ...

    f = eiffelmethod(f, pre_f, post_f)
\end{verbatim}

Note that a person using the new \function{eiffelmethod()} doesn't
have to understand anything about descriptors. This is why I think
the new features don't increase the basic complexity of the language.
There will be a few wizards who need to know about it in order to
write \function{eiffelmethod()} or the ZODB or whatever, but most
users will just write code on top of the resulting libraries and
ignore the implementation details.

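For readers who do want to see what a hand-written descriptor class
might look like, here is a minimal sketch. The \class{Positive} and
\class{Account} classes are hypothetical, and the method signatures
(including the extra \var{objtype} argument to \method{__get__}, which
the simplified list earlier in this section leaves out) are an
assumption based on the descriptor protocol as it is documented
for Python.

```python
class Positive(object):
    """Hypothetical descriptor that only accepts positive numbers."""

    def __init__(self, name):
        self.name = name                # key used in the instance __dict__

    def __get__(self, obj, objtype=None):
        if obj is None:                 # looked up on the class itself
            return self
        return obj.__dict__[self.name]

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError('%s must be positive' % self.name)
        obj.__dict__[self.name] = value

    def __delete__(self, obj):
        del obj.__dict__[self.name]


class Account(object):
    balance = Positive('balance')       # the descriptor lives in the class


acct = Account()
acct.balance = 10                       # routed through Positive.__set__
try:
    acct.balance = -5
    rejected = False
except ValueError:
    rejected = True
```

Code that uses \class{Account} just reads and assigns
\code{acct.balance}; only the author of \class{Positive} has to know
about the descriptor methods.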

\subsection{Multiple Inheritance: The Diamond Rule}

Multiple inheritance has also been made more useful through changing
the rules under which names are resolved. Consider this set of classes
(diagram taken from \pep{253} by Guido van Rossum):

\begin{verbatim}
              class A:
                ^ ^  def save(self): ...
               /   \
              /     \
             /       \
            /         \
        class B     class C:
            ^         ^  def save(self): ...
             \       /
              \     /
               \   /
                \ /
              class D
\end{verbatim}

The lookup rule for classic classes is simple but not very smart; the
base classes are searched depth-first, going from left to right. A
reference to \method{D.save} will search the classes \class{D},
\class{B}, and then \class{A}, where \method{save()} would be found
and returned. \method{C.save()} would never be found at all. This is
bad, because if \class{C}'s \method{save()} method is saving some
internal state specific to \class{C}, not calling it will result in
that state never getting saved.

New-style classes follow a different algorithm that's a bit more
complicated to explain, but does the right thing in this situation.
(Note that Python 2.3 changes this algorithm to one that produces the
same results in most cases, but produces more useful results for
really complicated inheritance graphs.)

\begin{enumerate}

\item List all the base classes, following the classic lookup rule and
including a class multiple times if it's visited repeatedly. In the
above example, the list of visited classes is [\class{D}, \class{B},
\class{A}, \class{C}, \class{A}].

\item Scan the list for duplicated classes. If any are found, remove
all but one occurrence, leaving the \emph{last} one in the list. In
the above example, the list becomes [\class{D}, \class{B}, \class{C},
\class{A}] after dropping duplicates.

\end{enumerate}

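The two steps can be sketched in a few lines of Python. This is only
an illustration of the rule as described here, not the interpreter's
actual implementation; the \code{BASES} table and both function names
are hypothetical, and the hierarchy is written as plain strings so the
sketch doesn't depend on which class model is in use.

```python
# Hypothetical table of direct base classes for the diagram above.
BASES = {'A': [], 'B': ['A'], 'C': ['A'], 'D': ['B', 'C']}


def classic_order(name):
    # Step 1: depth-first, left-to-right, keeping repeated visits.
    order = [name]
    for base in BASES[name]:
        order.extend(classic_order(base))
    return order


def drop_duplicates(order):
    # Step 2: keep only the last occurrence of each class.
    result = []
    for i in range(len(order)):
        if order[i] not in order[i + 1:]:
            result.append(order[i])
    return result


visited = classic_order('D')        # ['D', 'B', 'A', 'C', 'A']
lookup = drop_duplicates(visited)   # ['D', 'B', 'C', 'A']
```

Because \class{C} now comes before \class{A} in the final list, a
lookup of \method{save} starting at \class{D} reaches \class{C}'s
version first.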
Following this rule, referring to \method{D.save()} will return
\method{C.save()}, which is the behaviour we're after. This lookup
rule is the same as the one followed by Common Lisp. A new built-in
function, \function{super()}, provides a way to get at a class's
superclasses without having to reimplement Python's algorithm.
The most commonly used form will be
\function{super(\var{class}, \var{obj})}, which returns
a bound superclass object (not the actual class object). This form
will be used in methods to call a method in the superclass; for
example, \class{D}'s \method{save()} method would look like this:

\begin{verbatim}
class D (B,C):
    def save (self):
        # Call superclass .save()
        super(D, self).save()
        # Save D's private information here
        ...
\end{verbatim}

\function{super()} can also return unbound superclass objects
when called as \function{super(\var{class})} or
\function{super(\var{class1}, \var{class2})}, but this probably won't
often be useful.

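Here is a runnable sketch of the whole diamond using
\function{super()}. Having each \method{save()} return a list of class
names is purely an assumption made for illustration, so that the order
of the calls is visible.

```python
class A(object):
    def save(self):
        return ['A']


class B(A):
    pass                       # B defines no save() of its own


class C(A):
    def save(self):
        # Continue along the lookup order, then add C's own part.
        return super(C, self).save() + ['C']


class D(B, C):
    def save(self):
        return super(D, self).save() + ['D']


saved = D().save()             # ['A', 'C', 'D']
```

\class{C}'s \method{save()} is reached even though \class{B} is listed
first among \class{D}'s bases, because \class{B} doesn't define the
method and \class{C} precedes \class{A} in the lookup order.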

\subsection{Attribute Access}

A fair number of sophisticated Python classes define hooks for
attribute access using \method{__getattr__}; most commonly this is
done for convenience, to make code more readable by automatically
mapping an attribute access such as \code{obj.parent} into a method
call such as \code{obj.get_parent()}. Python 2.2 adds some new ways
of controlling attribute access.

First, \method{__getattr__(\var{attr_name})} is still supported by
new-style classes, and nothing about it has changed. As before, it
will be called when an attempt is made to access \code{obj.foo} and no
attribute named \samp{foo} is found in the instance's dictionary.

New-style classes also support a new method,
\method{__getattribute__(\var{attr_name})}. The difference between
the two methods is that \method{__getattribute__} is \emph{always}
called whenever any attribute is accessed, while the old
\method{__getattr__} is only called if \samp{foo} isn't found in the
instance's dictionary.

However, Python 2.2's support for \dfn{properties} will often be a
simpler way to trap attribute references. Writing a
\method{__getattr__} method is complicated because to avoid recursion
you can't use regular attribute accesses inside it, and instead have
to mess around with the contents of \member{__dict__}.
\method{__getattr__} methods also end up being called by Python when
it checks for other methods such as \method{__repr__} or
\method{__coerce__}, and so have to be written with this in mind.
Finally, calling a function on every attribute access results in a
sizable performance loss.

\class{property} is a new built-in type that packages up three
functions that get, set, or delete an attribute, and a docstring. For
example, if you want to define a \member{size} attribute that's
computed, but also settable, you could write:

\begin{verbatim}
class C(object):
    def get_size (self):
        result = ... computation ...
        return result
    def set_size (self, size):
        ... compute something based on the size
        and set internal state appropriately ...

    # Define a property.  The 'delete this attribute'
    # method is defined as None, so the attribute
    # can't be deleted.
    size = property(get_size, set_size,
                    None,
                    "Storage size of this instance")
\end{verbatim}

That is certainly clearer and easier to write than a pair of
\method{__getattr__}/\method{__setattr__} methods that check for the
\member{size} attribute and handle it specially while retrieving all
other attributes from the instance's \member{__dict__}. Accesses to
\member{size} are also the only ones which have to perform the work of
calling a function, so references to other attributes run at
their usual speed.

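Since the example above elides the actual computation, here is one
fully runnable sketch of the same pattern. The \class{Buffer} class and
the way it pads or truncates its storage are assumptions made only for
illustration.

```python
class Buffer(object):
    def __init__(self):
        self._items = []

    def get_size(self):
        return len(self._items)

    def set_size(self, size):
        # Truncate when shrinking, pad with None when growing.
        del self._items[size:]
        self._items.extend([None] * (size - len(self._items)))

    # No 'delete' function is supplied, so the attribute
    # can't be deleted.
    size = property(get_size, set_size, None,
                    "Number of items held by this buffer")


buf = Buffer()
buf.size = 3                  # calls set_size(3)
grown = buf.size              # calls get_size()
buf.size = 1
shrunk = buf.size
try:
    del buf.size
    deleted = True
except AttributeError:
    deleted = False
```

Callers see an ordinary attribute; the get and set functions run only
when \member{size} itself is touched.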
Finally, it's possible to constrain the list of attributes that can be
referenced on an object using the new \member{__slots__} class attribute.
Python objects are usually very dynamic; at any time it's possible to
define a new attribute on an instance by just doing
\code{obj.new_attr=1}. A new-style class can define a class attribute named
\member{__slots__} to limit the legal attributes
to a particular set of names. An example will make this clear:

\begin{verbatim}
>>> class C(object):
...     __slots__ = ('template', 'name')
...
>>> obj = C()
>>> print obj.template
None
>>> obj.template = 'Test'
>>> print obj.template
Test
>>> obj.newattr = None
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'C' object has no attribute 'newattr'
\end{verbatim}

Note how you get an \exception{AttributeError} on the attempt to
assign to an attribute not listed in \member{__slots__}.



\subsection{Related Links}
\label{sect-rellinks}

This section has just been a quick overview of the new features,
giving enough of an explanation to start you programming, but many
details have been simplified or ignored. Where should you go to get a
more complete picture?

\url{http://www.python.org/2.2/descrintro.html} is a lengthy tutorial
introduction to the descriptor features, written by Guido van Rossum.
If my description has whetted your appetite, go read this tutorial
next, because it goes into much more detail about the new features
while still remaining quite easy to read.

Next, there are two relevant PEPs, \pep{252} and \pep{253}. \pep{252}
is titled ``Making Types Look More Like Classes'', and covers the
descriptor API. \pep{253} is titled ``Subtyping Built-in Types'', and
describes the changes to type objects that make it possible to subtype
built-in objects. \pep{253} is the more complicated PEP of the two,
and at a few points the necessary explanations of types and meta-types
may cause your head to explode. Both PEPs were written and
implemented by Guido van Rossum, with substantial assistance from the
rest of the Zope Corp. team.

Finally, there's the ultimate authority: the source code. Most of the
machinery for the type handling is in \file{Objects/typeobject.c}, but
you should only resort to it after all other avenues have been
exhausted, including posting a question to python-list or python-dev.


%======================================================================
\section{PEP 234: Iterators}

Another significant addition to 2.2 is an iteration interface at both
the C and Python levels. Objects can define how they can be looped
over by callers.

In Python versions up to 2.1, the usual way to make \code{for item in
obj} work is to define a \method{__getitem__()} method that looks
something like this:

\begin{verbatim}
def __getitem__(self, index):
    return <next item>
\end{verbatim}

\method{__getitem__()} is more properly used to define an indexing
operation on an object so that you can write \code{obj[5]} to retrieve
the sixth element. It's a bit misleading when you're using this only
to support \keyword{for} loops. Consider some file-like object that
wants to be looped over; the \var{index} parameter is essentially
meaningless, as the class probably assumes that a series of
\method{__getitem__()} calls will be made with \var{index}
incrementing by one each time. In other words, the presence of the
\method{__getitem__()} method doesn't mean that using \code{file[5]}
to randomly access the sixth element will work, though it really should.

In Python 2.2, iteration can be implemented separately, and
\method{__getitem__()} methods can be limited to classes that really
do support random access. The basic idea of iterators is
simple. A new built-in function, \function{iter(obj)} or
\code{iter(\var{C}, \var{sentinel})}, is used to get an iterator.
\function{iter(obj)} returns an iterator for the object \var{obj},
while \code{iter(\var{C}, \var{sentinel})} returns an iterator that
will invoke the callable object \var{C} until it returns
\var{sentinel} to signal that the iterator is done.

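The two-argument form is the less familiar one, so here is a minimal
sketch of it. The list \code{pending} and the choice of an empty string
as the sentinel are assumptions made for illustration; any callable
that takes no arguments would do.

```python
# The empty string at the front of the list marks the end of the data.
pending = ['', 'third', 'second', 'first']

# pending.pop is the callable; '' is the sentinel.  The loop calls
# pending.pop() repeatedly and stops as soon as it returns ''.
items = []
for item in iter(pending.pop, ''):
    items.append(item)
```

The sentinel itself is consumed by the final call but is never handed
to the loop body.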
Python classes can define an \method{__iter__()} method, which should
create and return a new iterator for the object; if the object is its
own iterator, this method can just return \code{self}. In particular,
iterators will usually be their own iterators. Extension types
implemented in C can implement a \member{tp_iter} function in order to
return an iterator, and extension types that want to behave as
iterators can define a \member{tp_iternext} function.

So, after all this, what do iterators actually do? They have one
required method, \method{next()}, which takes no arguments and returns
the next value. When there are no more values to be returned, calling
\method{next()} should raise the \exception{StopIteration} exception.

\begin{verbatim}
>>> L = [1,2,3]
>>> i = iter(L)
>>> print i
<iterator object at 0x8116870>
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
StopIteration
>>>
\end{verbatim}

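A class of your own can follow the same protocol. The sketch below
defines a hypothetical \class{Countdown} iterator with the
\method{__iter__()} and \method{next()} methods described above. The
\code{__next__} alias is an assumption added so that the example also
runs on later Python versions, which spell the method differently; it
is not part of the 2.2 protocol.

```python
class Countdown(object):
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # The object is its own iterator.
        return self

    def next(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

    __next__ = next     # assumption: alias for later Python versions


values = []
for n in Countdown(3):
    values.append(n)
```

Because \method{__iter__()} returns \code{self}, a
\class{Countdown} instance can be used up only once; a container class
would instead return a fresh iterator on each call.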
In 2.2, Python's \keyword{for} statement no longer expects a sequence;
it expects something for which \function{iter()} will return an iterator.
For backward compatibility and convenience, an iterator is
automatically constructed for sequences that don't implement
\method{__iter__()} or a \member{tp_iter} slot, so \code{for i in
[1,2,3]} will still work. Wherever the Python interpreter loops over
a sequence, it's been changed to use the iterator protocol. This
means you can do things like this:

\begin{verbatim}
>>> L = [1,2,3]
>>> i = iter(L)
>>> a,b,c = i
>>> a,b,c
(1, 2, 3)
\end{verbatim}

Iterator support has been added to some of Python's basic types.
Calling \function{iter()} on a dictionary will return an iterator
which loops over its keys:

\begin{verbatim}
>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
...      'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
>>> for key in m: print key, m[key]
...
Mar 3
Feb 2
Aug 8
Sep 9
May 5
Jun 6
Jul 7
Jan 1
Apr 4
Nov 11
Dec 12
Oct 10
\end{verbatim}

That's just the default behaviour. If you want to iterate over keys,
values, or key/value pairs, you can explicitly call the
\method{iterkeys()}, \method{itervalues()}, or \method{iteritems()}
methods to get an appropriate iterator. In a minor related change,
the \keyword{in} operator now works on dictionaries, so
\code{\var{key} in dict} is now equivalent to
\code{dict.has_key(\var{key})}.

Files also provide an iterator, which calls the \method{readline()}
method until there are no more lines in the file. This means you can
now read each line of a file using code like this:

\begin{verbatim}
for line in file:
    # do something for each line
    ...
\end{verbatim}

Note that you can only go forward in an iterator; there's no way to
get the previous element, reset the iterator, or make a copy of it.
An iterator object could provide such additional capabilities, but the
iterator protocol only requires a \method{next()} method.

\begin{seealso}

\seepep{234}{Iterators}{Written by Ka-Ping Yee and GvR; implemented
by the Python Labs crew, mostly by GvR and Tim Peters.}

\end{seealso}


624 | %======================================================================
|
---|
625 | \section{PEP 255: Simple Generators}
|
---|
626 |
|
---|
627 | Generators are another new feature, one that interacts with the
|
---|
628 | introduction of iterators.
|
---|
629 |
|
---|
630 | You're doubtless familiar with how function calls work in Python or
|
---|
631 | C. When you call a function, it gets a private namespace where its local
|
---|
632 | variables are created. When the function reaches a \keyword{return}
|
---|
633 | statement, the local variables are destroyed and the resulting value
|
---|
634 | is returned to the caller. A later call to the same function will get
|
---|
635 | a fresh new set of local variables. But, what if the local variables
|
---|
636 | weren't thrown away on exiting a function? What if you could later
|
---|
637 | resume the function where it left off? This is what generators
|
---|
638 | provide; they can be thought of as resumable functions.
|
---|
639 |
|
---|
640 | Here's the simplest example of a generator function:
|
---|
641 |
|
---|
642 | \begin{verbatim}
|
---|
643 | def generate_ints(N):
|
---|
644 |     for i in range(N):
|
---|
645 |         yield i
|
---|
646 | \end{verbatim}
|
---|
647 |
|
---|
648 | A new keyword, \keyword{yield}, was introduced for generators. Any
|
---|
649 | function containing a \keyword{yield} statement is a generator
|
---|
650 | function; this is detected by Python's bytecode compiler which
|
---|
651 | compiles the function specially as a result. Because a new keyword was
|
---|
652 | introduced, generators must be explicitly enabled in a module by
|
---|
653 | including a \code{from __future__ import generators} statement near
|
---|
654 | the top of the module's source code. In Python 2.3 this statement
|
---|
655 | will become unnecessary.
|
---|
656 |
|
---|
657 | When you call a generator function, it doesn't return a single value;
|
---|
658 | instead it returns a generator object that supports the iterator
|
---|
659 | protocol. On executing the \keyword{yield} statement, the generator
|
---|
660 | outputs the value of \code{i}, similar to a \keyword{return}
|
---|
661 | statement. The big difference between \keyword{yield} and a
|
---|
662 | \keyword{return} statement is that on reaching a \keyword{yield} the
|
---|
663 | generator's state of execution is suspended and local variables are
|
---|
664 | preserved. On the next call to the generator's \code{next()} method,
|
---|
665 | the function will resume executing immediately after the
|
---|
666 | \keyword{yield} statement. (For complicated reasons, the
|
---|
667 | \keyword{yield} statement isn't allowed inside the \keyword{try} block
|
---|
668 | of a \keyword{try}...\keyword{finally} statement; read \pep{255} for a full
|
---|
669 | explanation of the interaction between \keyword{yield} and
|
---|
670 | exceptions.)
|
---|
671 |
|
---|
672 | Here's a sample usage of the \function{generate_ints} generator:
|
---|
673 |
|
---|
674 | \begin{verbatim}
|
---|
675 | >>> gen = generate_ints(3)
|
---|
676 | >>> gen
|
---|
677 | <generator object at 0x8117f90>
|
---|
678 | >>> gen.next()
|
---|
679 | 0
|
---|
680 | >>> gen.next()
|
---|
681 | 1
|
---|
682 | >>> gen.next()
|
---|
683 | 2
|
---|
684 | >>> gen.next()
|
---|
685 | Traceback (most recent call last):
|
---|
686 | File "<stdin>", line 1, in ?
|
---|
687 | File "<stdin>", line 2, in generate_ints
|
---|
688 | StopIteration
|
---|
689 | \end{verbatim}
|
---|
690 |
|
---|
691 | You could equally write \code{for i in generate_ints(5)}, or
|
---|
692 | \code{a,b,c = generate_ints(3)}.
|
---|
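Spelled out in full, those two uses might look like the following sketch.
The \code{squares} list is only there to give the loop something to do;
note that the unpacking form works only when the generator produces
exactly as many values as there are names on the left-hand side.

```python
def generate_ints(N):
    for i in range(N):
        yield i

# A for loop calls the generator's iterator until it is exhausted.
squares = []
for i in generate_ints(5):
    squares.append(i * i)

# Sequence unpacking also accepts any iterator with the right length.
a, b, c = generate_ints(3)
```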
693 |
|
---|
694 | Inside a generator function, the \keyword{return} statement can only
|
---|
695 | be used without a value, and signals the end of the procession of
|
---|
696 | values; afterwards the generator cannot return any further values.
|
---|
697 | \keyword{return} with a value, such as \code{return 5}, is a syntax
|
---|
698 | error inside a generator function. The end of the generator's results
|
---|
699 | can also be indicated by raising \exception{StopIteration} manually,
|
---|
700 | or by just letting the flow of execution fall off the bottom of the
|
---|
701 | function.
|
---|
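A bare \keyword{return} is the most common of these in practice. The
following sketch uses a hypothetical helper,
\function{take_until_negative()}, that stops producing values as soon as
it sees a negative number. It deliberately sticks to the bare
\keyword{return} form, since later Python versions changed the rules
for the other two spellings.

```python
def take_until_negative(values):
    # Hypothetical helper: yield values until a negative one appears.
    for v in values:
        if v < 0:
            return      # bare return: ends the generator
        yield v

result = list(take_until_negative([3, 1, 4, -1, 5, 9]))
```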
702 |
|
---|
703 | You could achieve the effect of generators manually by writing your
|
---|
704 | own class and storing all the local variables of the generator as
|
---|
705 | instance variables. For example, returning a list of integers could
|
---|
706 | be done by setting \code{self.count} to 0, and having the
|
---|
707 | \method{next()} method increment \code{self.count} and return it.
|
---|
708 | However, for a moderately complicated generator, writing a
|
---|
709 | corresponding class would be much messier.
|
---|
710 | \file{Lib/test/test_generators.py} contains a number of more
|
---|
711 | interesting examples. The simplest one implements an in-order
|
---|
712 | traversal of a tree using generators recursively.
|
---|
713 |
|
---|
714 | \begin{verbatim}
|
---|
715 | # A recursive generator that generates Tree leaves in in-order.
|
---|
716 | def inorder(t):
|
---|
717 |     if t:
|
---|
718 |         for x in inorder(t.left):
|
---|
719 |             yield x
|
---|
720 |         yield t.label
|
---|
721 |         for x in inorder(t.right):
|
---|
722 |             yield x
|
---|
723 | \end{verbatim}
|
---|
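For comparison, here is a rough sketch of the hand-written class
alternative mentioned earlier in this section, reproducing what
\function{generate_ints()} does. The class name \class{IntCounter} is
invented for this example, and the \method{__next__} alias is an
assumption added so that the sketch also runs on newer interpreters that
spell the method differently.

```python
class IntCounter:
    """Hand-written equivalent of generate_ints(N)."""

    def __init__(self, N):
        self.N = N
        self.count = 0      # the generator's local state lives here

    def __iter__(self):
        return self

    def next(self):         # spelling used by the Python 2.2 protocol
        if self.count >= self.N:
            raise StopIteration
        value = self.count
        self.count += 1
        return value

    __next__ = next         # assumption: alias for newer interpreters

values = list(IntCounter(3))
```

Even for this trivial case the class needs three methods and explicit
bookkeeping, which is the messiness the generator version avoids.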
724 |
|
---|
725 | Two other examples in \file{Lib/test/test_generators.py} produce
|
---|
726 | solutions for the N-Queens problem (placing $N$ queens on an $N\times N$
|
---|
727 | chess board so that no queen threatens another) and the Knight's Tour
|
---|
728 | (a route that takes a knight to every square of an $N\times N$ chessboard
|
---|
729 | without visiting any square twice).
|
---|
730 |
|
---|
731 | The idea of generators comes from other programming languages,
|
---|
732 | especially Icon (\url{http://www.cs.arizona.edu/icon/}), where the
|
---|
733 | idea of generators is central. In Icon, every
|
---|
734 | expression and function call behaves like a generator. One example
|
---|
735 | from ``An Overview of the Icon Programming Language'' at
|
---|
736 | \url{http://www.cs.arizona.edu/icon/docs/ipd266.htm} gives an idea of
|
---|
737 | what this looks like:
|
---|
738 |
|
---|
739 | \begin{verbatim}
|
---|
740 | sentence := "Store it in the neighboring harbor"
|
---|
741 | if (i := find("or", sentence)) > 5 then write(i)
|
---|
742 | \end{verbatim}
|
---|
743 |
|
---|
744 | In Icon the \function{find()} function returns the indexes at which the
|
---|
745 | substring ``or'' is found: 3, 23, 33. In the \keyword{if} statement,
|
---|
746 | \code{i} is first assigned a value of 3, but 3 is less than 5, so the
|
---|
747 | comparison fails, and Icon retries it with the second value of 23. 23
|
---|
748 | is greater than 5, so the comparison now succeeds, and the code prints
|
---|
749 | the value 23 to the screen.
|
---|
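A loose Python approximation of the Icon example can be written with a
generator. This is only a sketch: \function{find_all()} is a made-up
helper, it reports 1-based positions to match Icon's numbering, and the
retrying that Icon does implicitly has to be written out as an explicit
loop.

```python
def find_all(sub, text):
    # Hypothetical helper: yield each 1-based position of sub in text,
    # mirroring the positions that Icon's find() reports.
    start = text.find(sub)
    while start != -1:
        yield start + 1
        start = text.find(sub, start + 1)

sentence = "Store it in the neighboring harbor"

# Keep asking the generator for values until one passes the test.
first_match = None
for i in find_all("or", sentence):
    if i > 5:
        first_match = i
        break
```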
750 |
|
---|
751 | Python doesn't go nearly as far as Icon in adopting generators as a
|
---|
752 | central concept. Generators are considered a new part of the core
|
---|
753 | Python language, but learning or using them isn't compulsory; if they
|
---|
754 | don't solve any problems that you have, feel free to ignore them.
|
---|
755 | One novel feature of Python's interface as compared to
|
---|
756 | Icon's is that a generator's state is represented as a concrete object
|
---|
757 | (the iterator) that can be passed around to other functions or stored
|
---|
758 | in a data structure.
|
---|
759 |
|
---|
760 | \begin{seealso}
|
---|
761 |
|
---|
762 | \seepep{255}{Simple Generators}{Written by Neil Schemenauer, Tim
|
---|
763 | Peters, Magnus Lie Hetland. Implemented mostly by Neil Schemenauer
|
---|
764 | and Tim Peters, with other fixes from the Python Labs crew.}
|
---|
765 |
|
---|
766 | \end{seealso}
|
---|
767 |
|
---|
768 |
|
---|
769 | %======================================================================
|
---|
770 | \section{PEP 237: Unifying Long Integers and Integers}
|
---|
771 |
|
---|
772 | In recent versions, the distinction between regular integers, which
|
---|
773 | are 32-bit values on most machines, and long integers, which can be of
|
---|
774 | arbitrary size, was becoming an annoyance. For example, on platforms
|
---|
775 | that support files larger than \code{2**32} bytes, the
|
---|
776 | \method{tell()} method of file objects has to return a long integer.
|
---|
777 | However, there were various bits of Python that expected plain
|
---|
778 | integers and would raise an error if a long integer was provided
|
---|
779 | instead. For example, in Python 1.5, only regular integers
|
---|
780 | could be used as a slice index, and \code{'abc'[1L:]} would raise a
|
---|
781 | \exception{TypeError} exception with the message 'slice index must be
|
---|
782 | int'.
|
---|
783 |
|
---|
784 | Python 2.2 will shift values from short to long integers as required.
|
---|
785 | The 'L' suffix is no longer needed to indicate a long integer literal,
|
---|
786 | as now the compiler will choose the appropriate type. (Using the 'L'
|
---|
787 | suffix will be discouraged in future 2.x versions of Python,
|
---|
788 | triggering a warning in Python 2.4, and will probably be dropped in Python
|
---|
789 | 3.0.) Many operations that used to raise an \exception{OverflowError}
|
---|
790 | will now return a long integer as their result. For example:
|
---|
791 |
|
---|
792 | \begin{verbatim}
|
---|
793 | >>> 1234567890123
|
---|
794 | 1234567890123L
|
---|
795 | >>> 2 ** 64
|
---|
796 | 18446744073709551616L
|
---|
797 | \end{verbatim}
|
---|
798 |
|
---|
799 | In most cases, integers and long integers will now be treated
|
---|
800 | identically. You can still distinguish them with the
|
---|
801 | \function{type()} built-in function, but that's rarely needed.
|
---|
802 |
|
---|
803 | \begin{seealso}
|
---|
804 |
|
---|
805 | \seepep{237}{Unifying Long Integers and Integers}{Written by
|
---|
806 | Moshe Zadka and Guido van Rossum. Implemented mostly by Guido van
|
---|
807 | Rossum.}
|
---|
808 |
|
---|
809 | \end{seealso}
|
---|
810 |
|
---|
811 |
|
---|
812 | %======================================================================
|
---|
813 | \section{PEP 238: Changing the Division Operator}
|
---|
814 |
|
---|
815 | The most controversial change in Python 2.2 heralds the start of an effort
|
---|
816 | to fix an old design flaw that's been in Python from the beginning.
|
---|
817 | Currently Python's division operator, \code{/}, behaves like C's
|
---|
818 | division operator when presented with two integer arguments: it
|
---|
819 | returns an integer result that's truncated down when there would be
|
---|
820 | a fractional part. For example, \code{3/2} is 1, not 1.5, and
|
---|
821 | \code{(-1)/2} is -1, not -0.5. This means that the results of division
|
---|
822 | can vary unexpectedly depending on the type of the two operands, and
|
---|
823 | because Python is dynamically typed, it can be difficult to determine
|
---|
824 | the possible types of the operands.
|
---|
825 |
|
---|
826 | (The controversy is over whether this is \emph{really} a design flaw,
|
---|
827 | and whether it's worth breaking existing code to fix this. It's
|
---|
828 | caused endless discussions on python-dev, and in July 2001 erupted into a
|
---|
829 | storm of acidly sarcastic postings on \newsgroup{comp.lang.python}. I
|
---|
830 | won't argue for either side here and will stick to describing what's
|
---|
831 | implemented in 2.2. Read \pep{238} for a summary of arguments and
|
---|
832 | counter-arguments.)
|
---|
833 |
|
---|
834 | Because this change might break code, it's being introduced very
|
---|
835 | gradually. Python 2.2 begins the transition, but the switch won't be
|
---|
836 | complete until Python 3.0.
|
---|
837 |
|
---|
838 | First, I'll borrow some terminology from \pep{238}. ``True division'' is the
|
---|
839 | division that most non-programmers are familiar with: 3/2 is 1.5, 1/4
|
---|
840 | is 0.25, and so forth. ``Floor division'' is what Python's \code{/}
|
---|
841 | operator currently does when given integer operands; the result is the
|
---|
842 | floor of the value returned by true division. ``Classic division'' is
|
---|
843 | the current mixed behaviour of \code{/}; it returns the result of
|
---|
844 | floor division when the operands are integers, and returns the result
|
---|
845 | of true division when one of the operands is a floating-point number.
|
---|
846 |
|
---|
847 | Here are the changes 2.2 introduces:
|
---|
848 |
|
---|
849 | \begin{itemize}
|
---|
850 |
|
---|
851 | \item A new operator, \code{//}, is the floor division operator.
|
---|
852 | (Yes, we know it looks like \Cpp's comment symbol.) \code{//}
|
---|
853 | \emph{always} performs floor division no matter what the types of
|
---|
854 | its operands are, so \code{1 // 2} is 0 and \code{1.0 // 2.0} is
|
---|
855 | 0.0.
|
---|
856 |
|
---|
857 | \code{//} is always available in Python 2.2; you don't need to enable
|
---|
858 | it using a \code{__future__} statement.
|
---|
859 |
|
---|
860 | \item By including a \code{from __future__ import division} in a
|
---|
861 | module, the \code{/} operator will be changed to return the result of
|
---|
862 | true division, so \code{1/2} is 0.5. Without the \code{__future__}
|
---|
863 | statement, \code{/} still means classic division. The default meaning
|
---|
864 | of \code{/} will not change until Python 3.0.
|
---|
865 |
|
---|
866 | \item Classes can define methods called \method{__truediv__} and
|
---|
867 | \method{__floordiv__} to overload the two division operators. At the
|
---|
868 | C level, there are also slots in the \ctype{PyNumberMethods} structure
|
---|
869 | so extension types can define the two operators.
|
---|
870 |
|
---|
871 | \item Python 2.2 supports some command-line arguments for testing
|
---|
872 | whether code will work with the changed division semantics. Running
|
---|
873 | python with \programopt{-Q warn} will cause a warning to be issued
|
---|
874 | whenever division is applied to two integers. You can use this to
|
---|
875 | find code that's affected by the change and fix it. By default,
|
---|
876 | Python 2.2 will simply perform classic division without a warning; the
|
---|
877 | warning will be turned on by default in Python 2.3.
|
---|
878 |
|
---|
879 | \end{itemize}
|
---|
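The operators and the two overloading hooks from the list above can be
sketched as follows. The code assumes that true division is in effect
for \code{/}: on Python 2.2 that means starting the module with
\code{from __future__ import division}, which is left out here so the
fragment can be pasted anywhere. The \class{Metres} class is purely
hypothetical.

```python
# On Python 2.2 the true-division result below requires
# "from __future__ import division" as the module's first statement.

true_half = 1 / 2             # true division
floor_half = 1 // 2           # floor division on integers
floor_float = 1.0 // 2.0      # floor division on floats stays a float
floor_negative = (-1) // 2    # floors toward negative infinity


class Metres:
    # Hypothetical class, used only to show which hook each operator calls.
    def __init__(self, value):
        self.value = value

    def __truediv__(self, other):
        return ('true', self.value, other)

    def __floordiv__(self, other):
        return ('floor', self.value, other)


by_true = Metres(7) / 2       # dispatches to __truediv__
by_floor = Metres(7) // 2     # dispatches to __floordiv__
```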
880 |
|
---|
881 | \begin{seealso}
|
---|
882 |
|
---|
883 | \seepep{238}{Changing the Division Operator}{Written by Moshe Zadka and
|
---|
884 | Guido van Rossum. Implemented by Guido van Rossum.}
|
---|
885 |
|
---|
886 | \end{seealso}
|
---|
887 |
|
---|
888 |
|
---|
889 | %======================================================================
|
---|
890 | \section{Unicode Changes}
|
---|
891 |
|
---|
892 | Python's Unicode support has been enhanced a bit in 2.2. Unicode
|
---|
893 | strings are usually stored as UCS-2, as 16-bit unsigned integers.
|
---|
894 | Python 2.2 can also be compiled to use UCS-4, 32-bit unsigned
|
---|
895 | integers, as its internal encoding by supplying
|
---|
896 | \longprogramopt{enable-unicode=ucs4} to the configure script.
|
---|
897 | (It's also possible to specify
|
---|
898 | \longprogramopt{disable-unicode} to completely disable Unicode
|
---|
899 | support.)
|
---|
900 |
|
---|
901 | When built to use UCS-4 (a ``wide Python''), the interpreter can
|
---|
902 | natively handle Unicode characters from U+000000 to U+10FFFF, so the
|
---|
903 | range of legal values for the \function{unichr()} function is expanded
|
---|
904 | accordingly. Using an interpreter compiled to use UCS-2 (a ``narrow
|
---|
905 | Python''), values greater than 65535 will still cause
|
---|
906 | \function{unichr()} to raise a \exception{ValueError} exception.
|
---|
907 | This is all described in \pep{261}, ``Support for `wide' Unicode
|
---|
908 | characters''; consult it for further details.
|
---|
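To see where the boundary lies on a given interpreter, something like
the following sketch can be used. It is written for a current
interpreter, where \function{chr()} plays the role of
\function{unichr()}; that substitution is an assumption of the example,
not something Python 2.2 offers.

```python
import sys

# Largest code point this build can store in a single character.
highest = sys.maxunicode

# Assumption: chr() is the modern spelling of unichr().
# One past the largest code point should be rejected.
try:
    chr(highest + 1)
    rejected = False
except ValueError:
    rejected = True
```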
909 |
|
---|
910 | Another change is simpler to explain. Since their introduction,
|
---|
911 | Unicode strings have supported an \method{encode()} method to convert
|
---|
912 | the string to a selected encoding such as UTF-8 or Latin-1. A
|
---|
913 | symmetric \method{decode(\optional{\var{encoding}})} method has been
|
---|
914 | added to 8-bit strings (though not to Unicode strings) in 2.2.
|
---|
915 | \method{decode()} assumes that the string is in the specified encoding
|
---|
916 | and decodes it, returning whatever is returned by the codec.
|
---|
917 |
|
---|
918 | Using this new feature, codecs have been added for tasks not directly
|
---|
919 | related to Unicode. For example, codecs have been added for
|
---|
920 | uu-encoding, MIME's base64 encoding, and compression with the
|
---|
921 | \module{zlib} module:
|
---|
922 |
|
---|
923 | \begin{verbatim}
|
---|
924 | >>> s = """Here is a lengthy piece of redundant, overly verbose,
|
---|
925 | ... and repetitive text.
|
---|
926 | ... """
|
---|
927 | >>> data = s.encode('zlib')
|
---|
928 | >>> data
|
---|
929 | 'x\x9c\r\xc9\xc1\r\x80 \x10\x04\xc0?Ul...'
|
---|
930 | >>> data.decode('zlib')
|
---|
931 | 'Here is a lengthy piece of redundant, overly verbose,\nand repetitive text.\n'
|
---|
932 | >>> print s.encode('uu')
|
---|
933 | begin 666 <data>
|
---|
934 | M2&5R92!I<R!A(&QE;F=T:'D@<&EE8V4@;V8@<F5D=6YD86YT+"!O=F5R;'D@
|
---|
935 | >=F5R8F]S92P*86YD(')E<&5T:71I=F4@=&5X="X*
|
---|
936 |
|
---|
937 | end
|
---|
938 | >>> "sheesh".encode('rot-13')
|
---|
939 | 'furrfu'
|
---|
940 | \end{verbatim}
|
---|
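On current interpreters the same round trips are reached through the
\module{codecs} module rather than through string methods, and the
compression and base64 codecs operate on byte strings. The following is
a sketch under that assumption; the canonical codec names
\code{zlib_codec}, \code{base64_codec} and \code{rot_13} are used in
place of the short aliases shown above.

```python
import codecs

text = (b"Here is a lengthy piece of redundant, overly verbose,\n"
        b"and repetitive text.\n")

# Compress and decompress through the codec machinery.
packed = codecs.encode(text, 'zlib_codec')
unpacked = codecs.decode(packed, 'zlib_codec')

# base64 is another bytes-to-bytes transformation.
b64 = codecs.encode(b"sheesh", 'base64_codec')
plain = codecs.decode(b64, 'base64_codec')

# rot-13 maps text to text.
rot = codecs.encode("sheesh", 'rot_13')
```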
941 |
|
---|
942 | To convert a class instance to Unicode, a \method{__unicode__} method
|
---|
943 | can be defined by a class, analogous to \method{__str__}.
|
---|
944 |
|
---|
945 | \method{encode()}, \method{decode()}, and \method{__unicode__} were
|
---|
946 | implemented by Marc-Andr\'e Lemburg. The changes to support using
|
---|
947 | UCS-4 internally were implemented by Fredrik Lundh and Martin von
|
---|
948 | L\"owis.
|
---|
949 |
|
---|
950 | \begin{seealso}
|
---|
951 |
|
---|
952 | \seepep{261}{Support for `wide' Unicode characters}{Written by
|
---|
953 | Paul Prescod.}
|
---|
954 |
|
---|
955 | \end{seealso}
|
---|
956 |
|
---|
957 |
|
---|
958 | %======================================================================
|
---|
959 | \section{PEP 227: Nested Scopes}
|
---|
960 |
|
---|
961 | In Python 2.1, statically nested scopes were added as an optional
|
---|
962 | feature, to be enabled by a \code{from __future__ import
|
---|
963 | nested_scopes} directive. In 2.2 nested scopes no longer need to be
|
---|
964 | specially enabled, and are now always present. The rest of this section
|
---|
965 | is a copy of the description of nested scopes from my ``What's New in
|
---|
966 | Python 2.1'' document; if you read it when 2.1 came out, you can skip
|
---|
967 | the rest of this section.
|
---|
968 |
|
---|
969 | The largest change introduced in Python 2.1, and made complete in 2.2,
|
---|
970 | is to Python's scoping rules. In Python 2.0, at any given time there
|
---|
971 | are at most three namespaces used to look up variable names: local,
|
---|
972 | module-level, and the built-in namespace. This often surprised people
|
---|
973 | because it didn't match their intuitive expectations. For example, a
|
---|
974 | nested recursive function definition doesn't work:
|
---|
975 |
|
---|
976 | \begin{verbatim}
|
---|
977 | def f():
|
---|
978 |     ...
|
---|
979 |     def g(value):
|
---|
980 |         ...
|
---|
981 |         return g(value-1) + 1
|
---|
982 |     ...
|
---|
983 | \end{verbatim}
|
---|
984 |
|
---|
985 | The function \function{g()} will always raise a \exception{NameError}
|
---|
986 | exception, because the binding of the name \samp{g} isn't in either
|
---|
987 | its local namespace or in the module-level namespace. This isn't much
|
---|
988 | of a problem in practice (how often do you recursively define interior
|
---|
989 | functions like this?), but this also made using the \keyword{lambda}
|
---|
990 | expression clumsier, and this was a problem in practice. In code which
|
---|
991 | uses \keyword{lambda} you can often find local variables being copied
|
---|
992 | by passing them as the default values of arguments.
|
---|
993 |
|
---|
994 | \begin{verbatim}
|
---|
995 | def find(self, name):
|
---|
996 |     "Return list of any entries equal to 'name'"
|
---|
997 |     L = filter(lambda x, name=name: x == name,
|
---|
998 |                self.list_attribute)
|
---|
999 |     return L
|
---|
1000 | \end{verbatim}
|
---|
1001 |
|
---|
1002 | The readability of Python code written in a strongly functional style
|
---|
1003 | suffers greatly as a result.
|
---|
1004 |
|
---|
1005 | The most significant change to Python 2.2 is that static scoping has
|
---|
1006 | been added to the language to fix this problem. As a first effect,
|
---|
1007 | the \code{name=name} default argument is now unnecessary in the above
|
---|
1008 | example. Put simply, when a given variable name is not assigned a
|
---|
1009 | value within a function (by an assignment, or the \keyword{def},
|
---|
1010 | \keyword{class}, or \keyword{import} statements), references to the
|
---|
1011 | variable will be looked up in the local namespace of the enclosing
|
---|
1012 | scope. A more detailed explanation of the rules, and a dissection of
|
---|
1013 | the implementation, can be found in the PEP.
|
---|
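With nested scopes in effect, both of the earlier examples can be
written the obvious way. The sketch below turns \method{find()} into a
plain function taking a hypothetical \code{entries} argument, and gives
\function{g()} a base case so that it terminates; neither detail comes
from the original examples.

```python
def find(entries, name):
    "Return list of any entries equal to 'name'"
    # 'name' is looked up in the enclosing function's namespace,
    # so the name=name default argument is no longer needed.
    return list(filter(lambda x: x == name, entries))


def f():
    def g(value):
        if value == 0:
            return 0
        return g(value - 1) + 1     # 'g' is found in f's namespace
    return g(3)


matches = find(['spam', 'eggs', 'spam'], 'spam')
depth = f()
```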
1014 |
|
---|
1015 | This change may cause some compatibility problems for code where the
|
---|
1016 | same variable name is used both at the module level and as a local
|
---|
1017 | variable within a function that contains further function definitions.
|
---|
1018 | This seems rather unlikely though, since such code would have been
|
---|
1019 | pretty confusing to read in the first place.
|
---|
1020 |
|
---|
1021 | One side effect of the change is that the \code{from \var{module}
|
---|
1022 | import *} and \keyword{exec} statements have been made illegal inside
|
---|
1023 | a function scope under certain conditions. The Python reference
|
---|
1024 | manual has said all along that \code{from \var{module} import *} is
|
---|
1025 | only legal at the top level of a module, but the CPython interpreter
|
---|
1026 | has never enforced this before. As part of the implementation of
|
---|
1027 | nested scopes, the compiler which turns Python source into bytecodes
|
---|
1028 | has to generate different code to access variables in a containing
|
---|
1029 | scope. \code{from \var{module} import *} and \keyword{exec} make it
|
---|
1030 | impossible for the compiler to figure this out, because they add names
|
---|
1031 | to the local namespace that are unknowable at compile time.
|
---|
1032 | Therefore, if a function contains function definitions or
|
---|
1033 | \keyword{lambda} expressions with free variables, the compiler will
|
---|
1034 | flag this by raising a \exception{SyntaxError} exception.
|
---|
1035 |
|
---|
1036 | To make the preceding explanation a bit clearer, here's an example:
|
---|
1037 |
|
---|
1038 | \begin{verbatim}
|
---|
1039 | x = 1
|
---|
1040 | def f():
|
---|
1041 |     # The next line is a syntax error
|
---|
1042 |     exec 'x=2'
|
---|
1043 |     def g():
|
---|
1044 |         return x
|
---|
1045 | \end{verbatim}
|
---|
1046 |
|
---|
1047 | Line 4 containing the \keyword{exec} statement is a syntax error,
|
---|
1048 | since \keyword{exec} would define a new local variable named \samp{x}
|
---|
1049 | whose value should be accessed by \function{g()}.
|
---|
1050 |
|
---|
1051 | This shouldn't be much of a limitation, since \keyword{exec} is rarely
|
---|
1052 | used in most Python code (and when it is used, it's often a sign of a
|
---|
1053 | poor design anyway).
|
---|
1054 |
|
---|
1055 | \begin{seealso}
|
---|
1056 |
|
---|
1057 | \seepep{227}{Statically Nested Scopes}{Written and implemented by
|
---|
1058 | Jeremy Hylton.}
|
---|
1059 |
|
---|
1060 | \end{seealso}
|
---|
1061 |
|
---|
1062 |
|
---|
1063 | %======================================================================
|
---|
1064 | \section{New and Improved Modules}
|
---|
1065 |
|
---|
1066 | \begin{itemize}
|
---|
1067 |
|
---|
1068 | \item The \module{xmlrpclib} module was contributed to the standard
|
---|
1069 | library by Fredrik Lundh, providing support for writing XML-RPC
|
---|
1070 | clients. XML-RPC is a simple remote procedure call protocol built on
|
---|
1071 | top of HTTP and XML. For example, the following snippet retrieves a
|
---|
1072 | list of RSS channels from the O'Reilly Network, and then
|
---|
1073 | lists the recent headlines for one channel:
|
---|
1074 |
|
---|
1075 | \begin{verbatim}
|
---|
1076 | import xmlrpclib
|
---|
1077 | s = xmlrpclib.Server(
|
---|
1078 |         'http://www.oreillynet.com/meerkat/xml-rpc/server.php')
|
---|
1079 | channels = s.meerkat.getChannels()
|
---|
1080 | # channels is a list of dictionaries, like this:
|
---|
1081 | # [{'id': 4, 'title': 'Freshmeat Daily News'},
|
---|
1082 | #  {'id': 190, 'title': '32Bits Online'},
|
---|
1083 | #  {'id': 4549, 'title': '3DGamers'}, ... ]
|
---|
1084 |
|
---|
1085 | # Get the items for one channel
|
---|
1086 | items = s.meerkat.getItems( {'channel': 4} )
|
---|
1087 |
|
---|
1088 | # 'items' is another list of dictionaries, like this:
|
---|
1089 | # [{'link': 'http://freshmeat.net/releases/52719/',
|
---|
1090 | # 'description': 'A utility which converts HTML to XSL FO.',
|
---|
1091 | # 'title': 'html2fo 0.3 (Default)'}, ... ]
|
---|
1092 | \end{verbatim}
|
---|
1093 |
|
---|
1094 | The \module{SimpleXMLRPCServer} module makes it easy to create
|
---|
1095 | straightforward XML-RPC servers. See \url{http://www.xmlrpc.com/} for
|
---|
1096 | more information about XML-RPC.
|
---|
1097 |
|
---|
1098 | \item The new \module{hmac} module implements the HMAC
|
---|
1099 | algorithm described by \rfc{2104}.
|
---|
1100 | (Contributed by Gerhard H\"aring.)
|
---|
1101 |
|
---|
1102 | \item Several functions that originally returned lengthy tuples now
|
---|
1103 | return pseudo-sequences that still behave like tuples but also have
|
---|
1104 | mnemonic attributes such as \member{st_mtime} or \member{tm_year}.
|
---|
1105 | The enhanced functions include \function{stat()},
|
---|
1106 | \function{fstat()}, \function{statvfs()}, and \function{fstatvfs()}
|
---|
1107 | in the \module{os} module, and \function{localtime()},
|
---|
1108 | \function{gmtime()}, and \function{strptime()} in the \module{time}
|
---|
1109 | module.
|
---|
1110 |
|
---|
1111 | For example, to obtain a file's size using the old tuples, you'd end
|
---|
1112 | up writing something like \code{file_size =
|
---|
1113 | os.stat(filename)[stat.ST_SIZE]}, but now this can be written more
|
---|
1114 | clearly as \code{file_size = os.stat(filename).st_size}.
|
---|
1115 |
|
---|
1116 | The original patch for this feature was contributed by Nick Mathewson.
|
---|
1117 |
|
---|
1118 | \item The Python profiler has been extensively reworked and various
|
---|
1119 | errors in its output have been corrected. (Contributed by
|
---|
1120 | Fred~L. Drake, Jr. and Tim Peters.)
|
---|
1121 |
|
---|
1122 | \item The \module{socket} module can be compiled to support IPv6;
|
---|
1123 | specify the \longprogramopt{enable-ipv6} option to Python's configure
|
---|
1124 | script. (Contributed by Jun-ichiro ``itojun'' Hagino.)
|
---|
1125 |
|
---|
1126 | \item Two new format characters were added to the \module{struct}
|
---|
1127 | module for 64-bit integers on platforms that support the C
|
---|
1128 | \ctype{long long} type. \samp{q} is for a signed 64-bit integer,
|
---|
1129 | and \samp{Q} is for an unsigned one. The value is returned in
|
---|
1130 | Python's long integer type. (Contributed by Tim Peters.)
|
---|
1131 |
|
---|
1132 | \item In the interpreter's interactive mode, there's a new built-in
|
---|
1133 | function \function{help()} that uses the \module{pydoc} module
|
---|
1134 | introduced in Python 2.1 to provide interactive help.
|
---|
1135 | \code{help(\var{object})} displays any available help text about
|
---|
1136 | \var{object}. \function{help()} with no argument puts you in an online
|
---|
1137 | help utility, where you can enter the names of functions, classes,
|
---|
1138 | or modules to read their help text.
|
---|
1139 | (Contributed by Guido van Rossum, using Ka-Ping Yee's \module{pydoc} module.)
|
---|
1140 |
|
---|
1141 | \item Various bugfixes and performance improvements have been made
|
---|
1142 | to the SRE engine underlying the \module{re} module. For example,
|
---|
1143 | the \function{re.sub()} and \function{re.split()} functions have
|
---|
1144 | been rewritten in C. Another contributed patch speeds up certain
|
---|
1145 | Unicode character ranges by a factor of two, and a new \method{finditer()}
|
---|
1146 | method returns an iterator over all the non-overlapping matches in
|
---|
1147 | a given string.
|
---|
1148 | (SRE is maintained by
|
---|
1149 | Fredrik Lundh. The BIGCHARSET patch was contributed by Martin von
|
---|
1150 | L\"owis.)
|
---|
1151 |
|
---|
1152 | \item The \module{smtplib} module now supports \rfc{2487}, ``Secure
|
---|
1153 | SMTP over TLS'', so it's now possible to encrypt the SMTP traffic
|
---|
1154 | between a Python program and the mail transport agent being handed a
|
---|
1155 | message. \module{smtplib} also supports SMTP authentication.
|
---|
1156 | (Contributed by Gerhard H\"aring.)
|
---|
1157 |
|
---|
1158 | \item The \module{imaplib} module, maintained by Piers Lauder, has
|
---|
1159 | support for several new extensions: the NAMESPACE extension defined
|
---|
1160 | in \rfc{2342}, SORT, GETACL and SETACL. (Contributed by Anthony
|
---|
1161 | Baxter and Michel Pelletier.)
|
---|
1162 |
|
---|
1163 | \item The \module{rfc822} module's parsing of email addresses is now
|
---|
1164 | compliant with \rfc{2822}, an update to \rfc{822}. (The module's
|
---|
1165 | name is \emph{not} going to be changed to \samp{rfc2822}.) A new
|
---|
1166 | package, \module{email}, has also been added for parsing and
|
---|
1167 | generating e-mail messages. (Contributed by Barry Warsaw, and
|
---|
1168 | arising out of his work on Mailman.)
|
---|
1169 |
|
---|
1170 | \item The \module{difflib} module now contains a new \class{Differ}
|
---|
1171 | class for producing human-readable lists of changes (a ``delta'')
|
---|
1172 | between two sequences of lines of text. There are also two
|
---|
1173 | generator functions, \function{ndiff()} and \function{restore()},
|
---|
1174 | which respectively return a delta from two sequences, or one of the
|
---|
1175 | original sequences from a delta. (Grunt work contributed by David
|
---|
1176 | Goodger, from ndiff.py code by Tim Peters who then did the
|
---|
1177 | generatorization.)
|
---|
1178 |
|
---|
1179 | \item New constants \constant{ascii_letters},
|
---|
1180 | \constant{ascii_lowercase}, and \constant{ascii_uppercase} were
|
---|
1181 | added to the \module{string} module. There were several modules in
|
---|
1182 | the standard library that used \constant{string.letters} to mean the
|
---|
1183 | ranges A-Za-z, but that assumption is incorrect when locales are in
|
---|
1184 | use, because \constant{string.letters} varies depending on the set
|
---|
1185 | of legal characters defined by the current locale. The buggy
|
---|
1186 | modules have all been fixed to use \constant{ascii_letters} instead.
|
---|
1187 | (Reported by an unknown person; fixed by Fred~L. Drake, Jr.)
|
---|
1188 |
|
---|
1189 | \item The \module{mimetypes} module now makes it easier to use
|
---|
1190 | alternative MIME-type databases by the addition of a
|
---|
1191 | \class{MimeTypes} class, which takes a list of filenames to be
|
---|
1192 | parsed. (Contributed by Fred~L. Drake, Jr.)
|
---|
1193 |
|
---|
1194 | \item A \class{Timer} class was added to the \module{threading}
|
---|
1195 | module that allows scheduling an activity to happen at some future
|
---|
1196 | time. (Contributed by Itamar Shtull-Trauring.)
|
---|
1197 |
|
---|
1198 | \end{itemize}
|
---|
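A few of the additions in the list above can be tried out together in a
short sketch. The sample strings, the value packed with \module{struct}
and the use of the current directory for \function{os.stat()} are all
arbitrary choices made for the example.

```python
import os
import time
import struct
import re
import difflib
import string

# Attribute access on the results of os.stat() and time.localtime().
file_size = os.stat(os.curdir).st_size
year = time.localtime().tm_year

# 'q' packs a signed 64-bit integer (needs C 'long long' support).
packed = struct.pack('q', -2**40)
(unpacked,) = struct.unpack('q', packed)

# finditer() yields one match object per non-overlapping match.
words = [m.group() for m in re.finditer(r'[a-z]+', 'spam, eggs and spam')]

# ndiff() produces a delta; restore() recovers either input from it.
old = ['one\n', 'two\n', 'three\n']
new = ['one\n', 'tree\n', 'three\n']
delta = list(difflib.ndiff(old, new))
recovered_old = list(difflib.restore(delta, 1))
recovered_new = list(difflib.restore(delta, 2))

# ascii_letters does not vary with the locale.
letters = string.ascii_letters
```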
1199 |
|
---|
1200 |
|
---|
%======================================================================
\section{Interpreter Changes and Fixes}

Some of the changes only affect people who deal with the Python
interpreter at the C level because they're writing Python extension modules,
embedding the interpreter, or just hacking on the interpreter itself.
If you only write Python code, none of the changes described here will
affect you very much.

\begin{itemize}

\item Profiling and tracing functions can now be implemented in C,
which can operate at much higher speeds than Python-based functions
and should reduce the overhead of profiling and tracing. This
will be of interest to authors of development environments for
Python. Two new C functions were added to Python's API,
\cfunction{PyEval_SetProfile()} and \cfunction{PyEval_SetTrace()}.
The existing \function{sys.setprofile()} and
\function{sys.settrace()} functions still exist, and have simply
been changed to use the new C-level interface. (Contributed by
Fred~L. Drake, Jr.)

\item Another low-level API, primarily of interest to implementors
of Python debuggers and development tools, was added.
\cfunction{PyInterpreterState_Head()} and
\cfunction{PyInterpreterState_Next()} let a caller walk through all
the existing interpreter objects;
\cfunction{PyInterpreterState_ThreadHead()} and
\cfunction{PyThreadState_Next()} allow looping over all the thread
states for a given interpreter. (Contributed by David Beazley.)

\item The C-level interface to the garbage collector has been changed
to make it easier to write extension types that support garbage
collection and to debug misuses of the functions.
Various functions have slightly different semantics, so a bunch of
functions had to be renamed. Extensions that use the old API will
still compile but will \emph{not} participate in garbage collection,
so updating them for 2.2 should be considered fairly high priority.

To upgrade an extension module to the new API, perform the following
steps:

\begin{itemize}

\item Rename \cfunction{Py_TPFLAGS_GC} to \cfunction{Py_TPFLAGS_HAVE_GC}.

\item Use \cfunction{PyObject_GC_New} or \cfunction{PyObject_GC_NewVar} to
allocate objects, and \cfunction{PyObject_GC_Del} to deallocate them.

\item Rename \cfunction{PyObject_GC_Init} to \cfunction{PyObject_GC_Track} and
\cfunction{PyObject_GC_Fini} to \cfunction{PyObject_GC_UnTrack}.

\item Remove \cfunction{PyGC_HEAD_SIZE} from object size calculations.

\item Remove calls to \cfunction{PyObject_AS_GC} and \cfunction{PyObject_FROM_GC}.

\end{itemize}

\item A new \samp{et} format sequence was added to
\cfunction{PyArg_ParseTuple}; \samp{et} takes both a parameter and
an encoding name, and converts the parameter to the given encoding
if the parameter turns out to be a Unicode string, or leaves it
alone if it's an 8-bit string, assuming it to already be in the
desired encoding. This differs from the \samp{es} format character,
which assumes that 8-bit strings are in Python's default ASCII
encoding and converts them to the specified new encoding.
(Contributed by M.-A. Lemburg, and used for the MBCS support on
Windows described in the following section.)

\item A different argument parsing function,
\cfunction{PyArg_UnpackTuple()}, has been added that's simpler and
presumably faster. Instead of specifying a format string, the
caller simply gives the minimum and maximum number of arguments
expected, and a set of pointers to \ctype{PyObject*} variables that
will be filled in with argument values.

\item Two new flags \constant{METH_NOARGS} and \constant{METH_O} are
available in method definition tables to simplify implementation of
methods with no arguments or a single untyped argument. Calling
such methods is more efficient than calling a corresponding method
that uses \constant{METH_VARARGS}.
Also, the old \constant{METH_OLDARGS} style of writing C methods is
now officially deprecated.

\item
Two new wrapper functions, \cfunction{PyOS_snprintf()} and
\cfunction{PyOS_vsnprintf()}, were added to provide
cross-platform implementations for the relatively new
\cfunction{snprintf()} and \cfunction{vsnprintf()} C lib APIs. In
contrast to the standard \cfunction{sprintf()} and
\cfunction{vsprintf()} functions, the Python versions check the
bounds of the buffer to protect against buffer overruns.
(Contributed by M.-A. Lemburg.)

\item The \cfunction{_PyTuple_Resize()} function has lost an unused
parameter, so now it takes 2 parameters instead of 3. The third
argument was never used, and can simply be discarded when porting
code from earlier versions to Python 2.2.

\end{itemize}
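From Python code, a trace function is still installed with
\function{sys.settrace()}, exactly as before; only the machinery
underneath has changed. The following is a minimal sketch of the
Python-level interface, in which \function{tracer()} and
\function{greet()} are hypothetical names invented for the example.
A C-level trace function registered with
\cfunction{PyEval_SetTrace()} would receive the same kinds of events.

\begin{verbatim}
import sys

calls = []

def tracer(frame, event, arg):
    # Record the name of every Python function that gets called.
    if event == "call":
        calls.append(frame.f_code.co_name)
    return None    # no line-by-line tracing inside the called function

def greet():
    return "hello"

sys.settrace(tracer)
try:
    greet()
finally:
    sys.settrace(None)    # always remove the trace function again
\end{verbatim}

After this runs, \code{calls} contains the name \code{'greet'}.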

%======================================================================
\section{Other Changes and Fixes}

As usual, there were a bunch of other improvements and bugfixes
scattered throughout the source tree. A search through the CVS change
logs finds there were 527 patches applied and 683 bugs fixed between
Python 2.1 and 2.2; 2.2.1 applied 139 patches and fixed 143 bugs;
2.2.2 applied 106 patches and fixed 82 bugs. These figures are likely
to be underestimates.

Some of the more notable changes are:

\begin{itemize}

\item The code for the MacOS port for Python, maintained by Jack
Jansen, is now kept in the main Python CVS tree, and many changes
have been made to support MacOS~X.

The most significant change is the ability to build Python as a
framework, enabled by supplying the \longprogramopt{enable-framework}
option to the configure script when compiling Python. According to
Jack Jansen, ``This installs a self-contained Python installation plus
the OS~X framework `glue' into
\file{/Library/Frameworks/Python.framework} (or another location of
choice). For now there is little immediate added benefit to this
(actually, there is the disadvantage that you have to change your PATH
to be able to find Python), but it is the basis for creating a
full-blown Python application, porting the MacPython IDE, possibly
using Python as a standard OSA scripting language and much more.''

Most of the MacPython toolbox modules, which interface to MacOS APIs
such as windowing, QuickTime, scripting, etc. have been ported to OS~X,
but they've been left commented out in \file{setup.py}. People who want
to experiment with these modules can uncomment them manually.

% Jack's original comments:
%The main change is the possibility to build Python as a
%framework. This installs a self-contained Python installation plus the
%OSX framework "glue" into /Library/Frameworks/Python.framework (or
%another location of choice). For now there is little immedeate added
%benefit to this (actually, there is the disadvantage that you have to
%change your PATH to be able to find Python), but it is the basis for
%creating a fullblown Python application, porting the MacPython IDE,
%possibly using Python as a standard OSA scripting language and much
%more. You enable this with "configure --enable-framework".

%The other change is that most MacPython toolbox modules, which
%interface to all the MacOS APIs such as windowing, quicktime,
%scripting, etc. have been ported. Again, most of these are not of
%immedeate use, as they need a full application to be really useful, so
%they have been commented out in setup.py. People wanting to experiment
%can uncomment them. Gestalt and Internet Config modules are enabled by
%default.

\item Keyword arguments passed to builtin functions that don't take them
now cause a \exception{TypeError} exception to be raised, with the
message ``\var{function} takes no keyword arguments''.

\item Weak references, added in Python 2.1 as an extension module,
are now part of the core because they're used in the implementation
of new-style classes. The \exception{ReferenceError} exception has
therefore moved from the \module{weakref} module to become a
built-in exception.

\item A new script, \file{Tools/scripts/cleanfuture.py} by Tim
Peters, automatically removes obsolete \code{__future__} statements
from Python source code.

\item An additional \var{flags} argument has been added to the
built-in function \function{compile()}, so the behaviour of
\code{__future__} statements can now be correctly observed in
simulated shells, such as those presented by IDLE and other
development environments. This is described in \pep{264}.
(Contributed by Michael Hudson.)

\item The new license introduced with Python 1.6 wasn't
GPL-compatible. This is fixed by some minor textual changes to the
2.2 license, so it's now legal to embed Python inside a GPLed
program again. Note that Python itself is not GPLed, but instead is
under a license that's essentially equivalent to the BSD license,
same as it always was. The license changes were also applied to the
Python 2.0.1 and 2.1.1 releases.

\item When presented with a Unicode filename on Windows, Python will
now convert it to an MBCS-encoded string, as used by the Microsoft
file APIs. As MBCS is explicitly used by the file APIs, Python's
choice of ASCII as the default encoding turns out to be an
annoyance. On \UNIX, the locale's character set is used if
\function{locale.nl_langinfo(CODESET)} is available. (Windows
support was contributed by Mark Hammond with assistance from
Marc-Andr\'e Lemburg. \UNIX{} support was added by Martin von L\"owis.)

\item Large file support is now enabled on Windows. (Contributed by
Tim Peters.)

\item The \file{Tools/scripts/ftpmirror.py} script
now parses a \file{.netrc} file, if you have one.
(Contributed by Mike Romberg.)

\item Some features of the object returned by the
\function{xrange()} function are now deprecated, and trigger
warnings when they're accessed; they'll disappear in Python 2.3.
\class{xrange} objects tried to pretend they were full sequence
types by supporting slicing, sequence multiplication, and the
\keyword{in} operator, but these features were rarely used and
therefore buggy. The \method{tolist()} method and the
\member{start}, \member{stop}, and \member{step} attributes are also
being deprecated. At the C level, the fourth argument to the
\cfunction{PyRange_New()} function, \samp{repeat}, has also been
deprecated.

\item There were a bunch of patches to the dictionary
implementation, mostly to fix potential core dumps if a dictionary
contains objects that sneakily change their hash value, or mutate
the dictionary they are contained in. For a while python-dev fell
into a gentle rhythm of Michael Hudson finding a case that dumped
core, Tim Peters fixing the bug, Michael finding another case, and round
and round it went.

\item On Windows, Python can now be compiled with Borland C thanks
to a number of patches contributed by Stephen Hansen, though the
result isn't fully functional yet. (But this \emph{is} progress...)

\item Another Windows enhancement: Wise Solutions generously offered
PythonLabs use of their InstallerMaster 8.1 system. Earlier
PythonLabs Windows installers used Wise 5.0a, which was beginning to
show its age. (Packaged up by Tim Peters.)

\item Files ending in \samp{.pyw} can now be imported on Windows.
\samp{.pyw} is a Windows-only thing, used to indicate that a script
needs to be run using PYTHONW.EXE instead of PYTHON.EXE in order to
prevent a DOS console from popping up to display the output. This
patch makes it possible to import such scripts, in case they're also
usable as modules. (Implemented by David Bolen.)

\item On platforms where Python uses the C \cfunction{dlopen()} function
to load extension modules, it's now possible to set the flags used
by \cfunction{dlopen()} using the \function{sys.getdlopenflags()} and
\function{sys.setdlopenflags()} functions. (Contributed by Bram Stolk.)

\item The \function{pow()} built-in function no longer supports 3
arguments when floating-point numbers are supplied.
\code{pow(\var{x}, \var{y}, \var{z})} returns \code{(x**y) \% z}, but
this is never useful for floating point numbers, and the final
result varies unpredictably depending on the platform. A call such
as \code{pow(2.0, 8.0, 7.0)} will now raise a \exception{TypeError}
exception.

\end{itemize}
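The change to three-argument \function{pow()} can be checked with a
short sketch. The wrapper \function{modpow_or_none()} is a
hypothetical helper written for this example; it simply turns the
\exception{TypeError} into a \code{None} result so that both cases
can be shown side by side.

\begin{verbatim}
def modpow_or_none(x, y, z):
    """Return pow(x, y, z), or None if the arguments are rejected."""
    try:
        return pow(x, y, z)
    except TypeError:
        return None

# Integers are still accepted: (2**8) % 7 is 256 % 7, which is 4.
int_result = modpow_or_none(2, 8, 7)

# Floating-point arguments raise TypeError, so the helper returns None.
float_result = modpow_or_none(2.0, 8.0, 7.0)
\end{verbatim}

Code that really needs a floating-point remainder can compute
\code{(x**y) \% z} explicitly instead.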
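The \var{flags} argument to \function{compile()} can be sketched as
follows, using the \code{division} feature as the example. The
filename \code{'<shell>'} is an arbitrary placeholder, and the flag
value is taken from the \member{compiler_flag} attribute of the
feature object in the \module{__future__} module, as described in
\pep{264}.

\begin{verbatim}
import __future__

source = "1 / 2"

# Compile the same expression without and with the "division" feature,
# the way a simulated shell might after the user has typed
# "from __future__ import division".
plain_code = compile(source, "<shell>", "eval")
future_code = compile(source, "<shell>", "eval",
                      __future__.division.compiler_flag)

future_result = eval(future_code)    # true division: 0.5
\end{verbatim}

Under Python 2.2, evaluating \code{plain_code} performs classic
integer division, while \code{future_code} performs true division;
later Python versions that make true division the default give the
same answer for both.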
%======================================================================
\section{Acknowledgements}

The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Fred Bremmer, Keith Briggs, Andrew Dalke, Fred~L. Drake, Jr.,
Carel Fellinger, David Goodger, Mark Hammond, Stephen Hansen, Michael
Hudson, Jack Jansen, Marc-Andr\'e Lemburg, Martin von L\"owis, Fredrik
Lundh, Michael McLay, Nick Mathewson, Paul Moore, Gustavo Niemeyer,
Don O'Donnell, Joonas Paalasma, Tim Peters, Jens Quade, Tom Reinhardt, Neil
Schemenauer, Guido van Rossum, Greg Ward, Edward Welbourne.

\end{document}