****************************
  What's New in Python 2.3
****************************

:Author: A.M. Kuchling

.. |release| replace:: 1.01

.. $Id: whatsnew23.tex 54631 2007-03-31 11:58:36Z georg.brandl $

This article explains the new features in Python 2.3. Python 2.3 was released
on July 29, 2003.

The main themes for Python 2.3 are polishing some of the features added in 2.2,
adding various small but useful enhancements to the core language, and expanding
the standard library. The new object model introduced in the previous version
has benefited from 18 months of bugfixes and from optimization efforts that have
improved the performance of new-style classes. A few new built-in functions
have been added such as :func:`sum` and :func:`enumerate`. The :keyword:`in`
operator can now be used for substring searches (e.g. ``"ab" in "abc"`` returns
:const:`True`).

Some of the many new library features include Boolean, set, heap, and date/time
data types, the ability to import modules from ZIP-format archives, metadata
support for the long-awaited Python catalog, an updated version of IDLE, and
modules for logging messages, wrapping text, parsing CSV files, processing
command-line options, using BerkeleyDB databases... the list of new and
enhanced modules is lengthy.

This article doesn't attempt to provide a complete specification of the new
features, but instead provides a convenient overview. For full details, you
should refer to the documentation for Python 2.3, such as the Python Library
Reference and the Python Reference Manual. If you want to understand the
complete implementation and design rationale, refer to the PEP for a particular
new feature.

.. ======================================================================


PEP 218: A Standard Set Datatype
================================

The new :mod:`sets` module contains an implementation of a set datatype. The
:class:`Set` class is for mutable sets, sets that can have members added and
removed. The :class:`ImmutableSet` class is for sets that can't be modified,
and instances of :class:`ImmutableSet` can therefore be used as dictionary keys.
Sets are built on top of dictionaries, so the elements within a set must be
hashable.

Here's a simple example::

   >>> import sets
   >>> S = sets.Set([1,2,3])
   >>> S
   Set([1, 2, 3])
   >>> 1 in S
   True
   >>> 0 in S
   False
   >>> S.add(5)
   >>> S.remove(3)
   >>> S
   Set([1, 2, 5])
   >>>

The union and intersection of sets can be computed with the :meth:`union` and
:meth:`intersection` methods; an alternative notation uses the bitwise operators
``|`` (union) and ``&`` (intersection). Mutable sets also have in-place versions
of these methods, :meth:`union_update` and :meth:`intersection_update`. ::

   >>> S1 = sets.Set([1,2,3])
   >>> S2 = sets.Set([4,5,6])
   >>> S1.union(S2)
   Set([1, 2, 3, 4, 5, 6])
   >>> S1 | S2                  # Alternative notation
   Set([1, 2, 3, 4, 5, 6])
   >>> S1.intersection(S2)
   Set([])
   >>> S1 & S2                  # Alternative notation
   Set([])
   >>> S1.union_update(S2)
   >>> S1
   Set([1, 2, 3, 4, 5, 6])
   >>>

It's also possible to take the symmetric difference of two sets. This is the
set of all elements in the union that aren't in the intersection. Another way
of putting it is that the symmetric difference contains all elements that are in
exactly one set. Again, there's an alternative notation (``^``), and an
in-place version with the ungainly name :meth:`symmetric_difference_update`. ::

   >>> S1 = sets.Set([1,2,3,4])
   >>> S2 = sets.Set([3,4,5,6])
   >>> S1.symmetric_difference(S2)
   Set([1, 2, 5, 6])
   >>> S1 ^ S2
   Set([1, 2, 5, 6])
   >>>

There are also :meth:`issubset` and :meth:`issuperset` methods for checking
whether one set is a subset or superset of another::

   >>> S1 = sets.Set([1,2,3])
   >>> S2 = sets.Set([2,3])
   >>> S2.issubset(S1)
   True
   >>> S1.issubset(S2)
   False
   >>> S1.issuperset(S2)
   True
   >>>


.. seealso::

   :pep:`218` - Adding a Built-In Set Object Type
      PEP written by Greg V. Wilson. Implemented by Greg V. Wilson, Alex Martelli, and
      GvR.

.. ======================================================================


.. _section-generators:

PEP 255: Simple Generators
==========================

In Python 2.2, generators were added as an optional feature, to be enabled by a
``from __future__ import generators`` directive. In 2.3 generators no longer
need to be specially enabled, and are now always present; this means that
:keyword:`yield` is now always a keyword. The rest of this section is a copy of
the description of generators from the "What's New in Python 2.2" document; if
you read it back when Python 2.2 came out, you can skip the rest of this
section.

You're doubtless familiar with how function calls work in Python or C. When you
call a function, it gets a private namespace where its local variables are
created. When the function reaches a :keyword:`return` statement, the local
variables are destroyed and the resulting value is returned to the caller. A
later call to the same function will get a fresh new set of local variables.
But, what if the local variables weren't thrown away on exiting a function?
What if you could later resume the function where it left off? This is what
generators provide; they can be thought of as resumable functions.

Here's the simplest example of a generator function::

   def generate_ints(N):
       for i in range(N):
           yield i

A new keyword, :keyword:`yield`, was introduced for generators. Any function
containing a :keyword:`yield` statement is a generator function; this is
detected by Python's bytecode compiler which compiles the function specially as
a result.

When you call a generator function, it doesn't return a single value; instead it
returns a generator object that supports the iterator protocol. On executing
the :keyword:`yield` statement, the generator outputs the value of ``i``,
similar to a :keyword:`return` statement. The big difference between
:keyword:`yield` and a :keyword:`return` statement is that on reaching a
:keyword:`yield` the generator's state of execution is suspended and local
variables are preserved. On the next call to the generator's ``.next()``
method, the function will resume executing immediately after the
:keyword:`yield` statement. (For complicated reasons, the :keyword:`yield`
statement isn't allowed inside the :keyword:`try` block of a :keyword:`try`...\
:keyword:`finally` statement; read :pep:`255` for a full explanation of the
interaction between :keyword:`yield` and exceptions.)

Here's a sample usage of the :func:`generate_ints` generator::

   >>> gen = generate_ints(3)
   >>> gen
   <generator object at 0x8117f90>
   >>> gen.next()
   0
   >>> gen.next()
   1
   >>> gen.next()
   2
   >>> gen.next()
   Traceback (most recent call last):
     File "stdin", line 1, in ?
     File "stdin", line 2, in generate_ints
   StopIteration

You could equally write ``for i in generate_ints(5)``, or ``a,b,c =
generate_ints(3)``.

Inside a generator function, the :keyword:`return` statement can only be used
without a value, and signals the end of the procession of values; afterwards the
generator cannot return any further values. :keyword:`return` with a value, such
as ``return 5``, is a syntax error inside a generator function. The end of the
generator's results can also be indicated by raising :exc:`StopIteration`
manually, or by just letting the flow of execution fall off the bottom of the
function.
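
As a small illustration of the bare :keyword:`return` described above, the
following sketch stops producing values at the first negative number. The
function name ``take_until_negative`` and the sample data are invented for this
example::

   def take_until_negative(values):
       for v in values:
           if v < 0:
               return        # bare return: the generator is finished
           yield v

   # list() keeps asking for values until the generator signals the end
   result = list(take_until_negative([1, 2, -1, 3]))   # [1, 2]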

You could achieve the effect of generators manually by writing your own class
and storing all the local variables of the generator as instance variables. For
example, returning a list of integers could be done by setting ``self.count`` to
0, and having the :meth:`next` method increment ``self.count`` and return it.
However, for a moderately complicated generator, writing a corresponding class
would be much messier. :file:`Lib/test/test_generators.py` contains a number of
more interesting examples. The simplest one implements an in-order traversal of
a tree using generators recursively. ::

   # A recursive generator that generates Tree leaves in in-order.
   def inorder(t):
       if t:
           for x in inorder(t.left):
               yield x
           yield t.label
           for x in inorder(t.right):
               yield x
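
To try this traversal on its own, you need some kind of tree type. The sketch
below assumes a minimal ``Tree`` class whose ``label``, ``left`` and ``right``
attributes match the names used by ``inorder()``; the class is hypothetical and
written for this illustration, not copied from the test suite::

   class Tree:
       # Hypothetical node type: 'label' holds the value, while 'left'
       # and 'right' are child Trees or None.
       def __init__(self, label, left=None, right=None):
           self.label = label
           self.left = left
           self.right = right

   def inorder(t):
       if t:
           for x in inorder(t.left):
               yield x
           yield t.label
           for x in inorder(t.right):
               yield x

   #     2
   #    / \
   #   1   3
   tree = Tree(2, Tree(1), Tree(3))
   labels = list(inorder(tree))    # [1, 2, 3]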

Two other examples in :file:`Lib/test/test_generators.py` produce solutions for
the N-Queens problem (placing *N* queens on an *NxN* chess board so that no
queen threatens another) and the Knight's Tour (a route that takes a knight to
every square of an *NxN* chessboard without visiting any square twice).

The idea of generators comes from other programming languages, especially Icon
(http://www.cs.arizona.edu/icon/), where the idea of generators is central. In
Icon, every expression and function call behaves like a generator. One example
from "An Overview of the Icon Programming Language" at
http://www.cs.arizona.edu/icon/docs/ipd266.htm gives an idea of what this looks
like::

   sentence := "Store it in the neighboring harbor"
   if (i := find("or", sentence)) > 5 then write(i)

In Icon the :func:`find` function returns the indexes at which the substring
"or" is found: 3, 23, 33. In the :keyword:`if` statement, ``i`` is first
assigned a value of 3, but 3 is less than 5, so the comparison fails, and Icon
retries it with the second value of 23. 23 is greater than 5, so the comparison
now succeeds, and the code prints the value 23 to the screen.
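
Python has no implicit retrying of this kind, but a generator can play the role
of Icon's :func:`find`. The following is only a rough Python analogue, not a
translation of Icon's semantics: the helper name ``find`` and its 1-based
positions are choices made for this sketch, and the retry loop has to be written
out by hand::

   def find(sub, s):
       """Yield every 1-based position at which sub occurs in s."""
       start = s.find(sub)
       while start != -1:
           yield start + 1                  # Icon counts positions from 1
           start = s.find(sub, start + 1)

   sentence = "Store it in the neighboring harbor"
   positions = list(find("or", sentence))   # [3, 23, 33]

   # Explicit version of Icon's "retry until the comparison succeeds"
   first_over_5 = None
   for i in find("or", sentence):
       if i > 5:
           first_over_5 = i                 # 23
           break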

Python doesn't go nearly as far as Icon in adopting generators as a central
concept. Generators are considered part of the core Python language, but
learning or using them isn't compulsory; if they don't solve any problems that
you have, feel free to ignore them. One novel feature of Python's interface as
compared to Icon's is that a generator's state is represented as a concrete
object (the iterator) that can be passed around to other functions or stored in
a data structure.


.. seealso::

   :pep:`255` - Simple Generators
      Written by Neil Schemenauer, Tim Peters, Magnus Lie Hetland. Implemented mostly
      by Neil Schemenauer and Tim Peters, with other fixes from the Python Labs crew.

.. ======================================================================


.. _section-encodings:

PEP 263: Source Code Encodings
==============================

Python source files can now be declared as being in different character set
encodings. Encodings are declared by including a specially formatted comment in
the first or second line of the source file. For example, a UTF-8 file can be
declared with::

   #!/usr/bin/env python
   # -*- coding: UTF-8 -*-

Without such an encoding declaration, the default encoding used is 7-bit ASCII.
Executing or importing modules that contain string literals with 8-bit
characters and have no encoding declaration will result in a
:exc:`DeprecationWarning` being signalled by Python 2.3; in 2.4 this will be a
syntax error.

The encoding declaration only affects Unicode string literals, which will be
converted to Unicode using the specified encoding. Note that Python identifiers
are still restricted to ASCII characters, so you can't have variable names that
use characters outside of the usual alphanumerics.


.. seealso::

   :pep:`263` - Defining Python Source Code Encodings
      Written by Marc-André Lemburg and Martin von Löwis; implemented by Suzuki Hisao
      and Martin von Löwis.

.. ======================================================================


PEP 273: Importing Modules from ZIP Archives
============================================

The new :mod:`zipimport` module adds support for importing modules from a
ZIP-format archive. You don't need to import the module explicitly; it will be
automatically imported if a ZIP archive's filename is added to ``sys.path``.
For example::

   amk@nyman:~/src/python$ unzip -l /tmp/example.zip
   Archive:  /tmp/example.zip
     Length     Date   Time    Name
    --------    ----   ----    ----
        8467  11-26-02 22:30   jwzthreading.py
    --------                   -------
        8467                   1 file
   amk@nyman:~/src/python$ ./python
   Python 2.3 (#1, Aug 1 2003, 19:54:32)
   >>> import sys
   >>> sys.path.insert(0, '/tmp/example.zip')  # Add .zip file to front of path
   >>> import jwzthreading
   >>> jwzthreading.__file__
   '/tmp/example.zip/jwzthreading.py'
   >>>

An entry in ``sys.path`` can now be the filename of a ZIP archive. The ZIP
archive can contain any kind of files, but only files named :file:`\*.py`,
:file:`\*.pyc`, or :file:`\*.pyo` can be imported. If an archive only contains
:file:`\*.py` files, Python will not attempt to modify the archive by adding the
corresponding :file:`\*.pyc` file, meaning that if a ZIP archive doesn't contain
:file:`\*.pyc` files, importing may be rather slow.

A path within the archive can also be specified to only import from a
subdirectory; for example, the path :file:`/tmp/example.zip/lib/` would only
import from the :file:`lib/` subdirectory within the archive.
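
If you don't have an archive handy, the same experiment can be scripted. This is
a rough sketch that builds a throwaway archive with the :mod:`zipfile` module
and then imports from it; the archive name ``example.zip`` and the module name
``hello_zip`` are made up for the illustration, and the archive is written to a
temporary directory rather than to :file:`/tmp`::

   import os
   import sys
   import tempfile
   import zipfile

   # Build a small archive containing a single module.
   tmpdir = tempfile.mkdtemp()
   archive = os.path.join(tmpdir, 'example.zip')
   zf = zipfile.ZipFile(archive, 'w')
   zf.writestr('hello_zip.py', 'GREETING = "hello from a ZIP archive"\n')
   zf.close()

   sys.path.insert(0, archive)       # add the .zip file to the front of the path
   import hello_zip                  # found inside the archive

   greeting = hello_zip.GREETING
   module_file = hello_zip.__file__  # a path that points inside example.zip

Because the archive holds only a :file:`\*.py` file, the module is compiled on
import, which is the slower case described above.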


.. seealso::

   :pep:`273` - Import Modules from Zip Archives
      Written by James C. Ahlstrom, who also provided an implementation. Python 2.3
      follows the specification in :pep:`273`, but uses an implementation written by
      Just van Rossum that uses the import hooks described in :pep:`302`. See section
      :ref:`section-pep302` for a description of the new import hooks.

.. ======================================================================


PEP 277: Unicode file name support for Windows NT
=================================================

On Windows NT, 2000, and XP, the system stores file names as Unicode strings.
Traditionally, Python has represented file names as byte strings, which is
inadequate because it renders some file names inaccessible.

Python now allows using arbitrary Unicode strings (within the limitations of the
file system) for all functions that expect file names, most notably the
:func:`open` built-in function. If a Unicode string is passed to
:func:`os.listdir`, Python now returns a list of Unicode strings. A new
function, :func:`os.getcwdu`, returns the current directory as a Unicode string.

Byte strings still work as file names, and on Windows Python will transparently
convert them to Unicode using the ``mbcs`` encoding.

Other systems also allow Unicode strings as file names but convert them to byte
strings before passing them to the system, which can cause a :exc:`UnicodeError`
to be raised. Applications can test whether arbitrary Unicode strings are
supported as file names by checking :attr:`os.path.supports_unicode_filenames`,
a Boolean value.

Under MacOS, :func:`os.listdir` may now return Unicode filenames.


.. seealso::

   :pep:`277` - Unicode file name support for Windows NT
      Written by Neil Hodgson; implemented by Neil Hodgson, Martin von Löwis, and Mark
      Hammond.

.. ======================================================================


.. index::
   single: universal newlines; What's new

PEP 278: Universal Newline Support
==================================

The three major operating systems used today are Microsoft Windows, Apple's
Macintosh OS, and the various Unix derivatives. A minor irritation of
cross-platform work is that these three platforms all use different characters
to mark the ends of lines in text files. Unix uses the linefeed (ASCII character
10), MacOS uses the carriage return (ASCII character 13), and Windows uses a
two-character sequence of a carriage return plus a newline.

Python's file objects can now support end of line conventions other than the
one followed by the platform on which Python is running. Opening a file with
the mode ``'U'`` or ``'rU'`` will open a file for reading in :term:`universal
newlines` mode. All three line ending conventions will be translated to a
``'\n'`` in the strings returned by the various file methods such as
:meth:`read` and :meth:`readline`.

Universal newline support is also used when importing modules and when executing
a file with the :func:`execfile` function. This means that Python modules can
be shared between all three operating systems without needing to convert the
line-endings.

This feature can be disabled when compiling Python by specifying the
:option:`--without-universal-newlines` switch when running Python's
:program:`configure` script.


.. seealso::

   :pep:`278` - Universal Newline Support
      Written and implemented by Jack Jansen.

.. ======================================================================


.. _section-enumerate:

PEP 279: enumerate()
====================

A new built-in function, :func:`enumerate`, will make certain loops a bit
clearer. ``enumerate(thing)``, where *thing* is either an iterator or a
sequence, returns an iterator that will return ``(0, thing[0])``, ``(1,
thing[1])``, ``(2, thing[2])``, and so forth.

A common idiom to change every element of a list looks like this::

   for i in range(len(L)):
       item = L[i]
       # ... compute some result based on item ...
       L[i] = result

This can be rewritten using :func:`enumerate` as::

   for i, item in enumerate(L):
       # ... compute some result based on item ...
       L[i] = result
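
For a version you can run as-is, here's a small sketch with the elided
computation filled in. The list contents and the choice of upper-casing each
item are arbitrary, picked only to make the effect visible::

   words = ['spam', 'eggs', 'ham']
   for i, item in enumerate(words):
       words[i] = item.upper()       # replace each element in place

   # enumerate() itself just pairs each item with its index
   pairs = list(enumerate(['a', 'b', 'c']))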


.. seealso::

   :pep:`279` - The enumerate() built-in function
      Written and implemented by Raymond D. Hettinger.

.. ======================================================================


PEP 282: The logging Package
============================

A standard package for writing logs, :mod:`logging`, has been added to Python
2.3. It provides a powerful and flexible mechanism for generating logging
output which can then be filtered and processed in various ways. A
configuration file written in a standard format can be used to control the
logging behavior of a program. Python includes handlers that will write log
records to standard error or to a file or socket, send them to the system log,
or even e-mail them to a particular address; of course, it's also possible to
write your own handler classes.

The :class:`Logger` class is the primary class. Most application code will deal
with one or more :class:`Logger` objects, each one used by a particular
subsystem of the application. Each :class:`Logger` is identified by a name, and
names are organized into a hierarchy using ``.`` as the component separator.
For example, you might have :class:`Logger` instances named ``server``,
``server.auth`` and ``server.network``. The latter two instances are below
``server`` in the hierarchy. This means that if you turn up the verbosity for
``server`` or direct ``server`` messages to a different handler, the changes
will also apply to records logged to ``server.auth`` and ``server.network``.
There's also a root :class:`Logger` that's the parent of all other loggers.

For simple uses, the :mod:`logging` package contains some convenience functions
that always use the root log::

   import logging

   logging.debug('Debugging information')
   logging.info('Informational message')
   logging.warning('Warning:config file %s not found', 'server.conf')
   logging.error('Error occurred')
   logging.critical('Critical error -- shutting down')

This produces the following output::

   WARNING:root:Warning:config file server.conf not found
   ERROR:root:Error occurred
   CRITICAL:root:Critical error -- shutting down

In the default configuration, informational and debugging messages are
suppressed and the output is sent to standard error. You can enable the display
of informational and debugging messages by calling the :meth:`setLevel` method
on the root logger.
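
As a minimal sketch of that last step, the following lowers the root logger's
threshold to ``DEBUG`` before logging. The variable names are arbitrary, and the
``isEnabledFor`` check is included only to make the effect visible in code::

   import logging

   root = logging.getLogger()             # the root logger
   root.setLevel(logging.DEBUG)           # let DEBUG and INFO records through

   logging.debug('Debugging information')  # no longer suppressed
   debug_enabled = root.isEnabledFor(logging.DEBUG)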

Notice the :func:`warning` call's use of string formatting operators; all of the
functions for logging messages take the arguments ``(msg, arg1, arg2, ...)`` and
log the string resulting from ``msg % (arg1, arg2, ...)``.

There's also an :func:`exception` function that records the most recent
traceback. Any of the other functions will also record the traceback if you
specify a true value for the keyword argument *exc_info*. ::

   def f():
       try:    1/0
       except: logging.exception('Problem recorded')

   f()

This produces the following output::

   ERROR:root:Problem recorded
   Traceback (most recent call last):
     File "t.py", line 6, in f
       1/0
   ZeroDivisionError: integer division or modulo by zero

Slightly more advanced programs will use a logger other than the root logger.
The :func:`getLogger(name)` function is used to get a particular log, creating
it if it doesn't exist yet. :func:`getLogger(None)` returns the root logger. ::

   log = logging.getLogger('server')
    ...
   log.info('Listening on port %i', port)
    ...
   log.critical('Disk full')
    ...

Log records are usually propagated up the hierarchy, so a message logged to
``server.auth`` is also seen by ``server`` and ``root``, but a :class:`Logger`
can prevent this by setting its :attr:`propagate` attribute to :const:`False`.
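
The effect of :attr:`propagate` is easiest to see with a handler that simply
remembers what it receives. The sketch below uses a hypothetical ``ListHandler``
class written for this illustration (it isn't part of the :mod:`logging`
package) and attaches it to the ``server`` logger only::

   import logging

   class ListHandler(logging.Handler):
       """Hypothetical handler that collects messages in a list."""
       def __init__(self):
           logging.Handler.__init__(self)
           self.messages = []

       def emit(self, record):
           self.messages.append(record.getMessage())

   server_handler = ListHandler()
   server = logging.getLogger('server')
   server.setLevel(logging.INFO)
   server.addHandler(server_handler)

   auth = logging.getLogger('server.auth')
   auth.info('login from %s', 'alice')    # propagates up to 'server'

   auth.propagate = False
   auth.info('not passed on to server')   # stops at 'server.auth'

   seen = server_handler.messages         # only the first message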

There are more classes provided by the :mod:`logging` package that can be
customized. When a :class:`Logger` instance is told to log a message, it
creates a :class:`LogRecord` instance that is sent to any number of different
:class:`Handler` instances. Loggers and handlers can also have an attached list
of filters, and each filter can cause the :class:`LogRecord` to be ignored or
can modify the record before passing it along. When they're finally output,
:class:`LogRecord` instances are converted to text by a :class:`Formatter`
class. All of these classes can be replaced by your own specially-written
classes.

With all of these features the :mod:`logging` package should provide enough
flexibility for even the most complicated applications. This is only an
incomplete overview of its features, so please see the package's reference
documentation for all of the details. Reading :pep:`282` will also be helpful.


.. seealso::

   :pep:`282` - A Logging System
      Written by Vinay Sajip and Trent Mick; implemented by Vinay Sajip.

.. ======================================================================


.. _section-bool:

PEP 285: A Boolean Type
=======================

A Boolean type was added to Python 2.3. Two new constants were added to the
:mod:`__builtin__` module, :const:`True` and :const:`False`. (:const:`True` and
:const:`False` constants were added to the built-ins in Python 2.2.1, but the
2.2.1 versions are simply set to integer values of 1 and 0 and aren't a
different type.)

The type object for this new type is named :class:`bool`; the constructor for it
takes any Python value and converts it to :const:`True` or :const:`False`. ::

   >>> bool(1)
   True
   >>> bool(0)
   False
   >>> bool([])
   False
   >>> bool( (1,) )
   True

Most of the standard library modules and built-in functions have been changed to
return Booleans. ::

   >>> obj = []
   >>> hasattr(obj, 'append')
   True
   >>> isinstance(obj, list)
   True
   >>> isinstance(obj, tuple)
   False

Python's Booleans were added with the primary goal of making code clearer. For
example, if you're reading a function and encounter the statement ``return 1``,
you might wonder whether the ``1`` represents a Boolean truth value, an index,
or a coefficient that multiplies some other quantity. If the statement is
``return True``, however, the meaning of the return value is quite clear.

Python's Booleans were *not* added for the sake of strict type-checking. A very
strict language such as Pascal would also prevent you from performing arithmetic
with Booleans, and would require that the expression in an :keyword:`if`
statement always evaluate to a Boolean result. Python is not this strict and
never will be, as :pep:`285` explicitly says. This means you can still use any
expression in an :keyword:`if` statement, even ones that evaluate to a list or
tuple or some random object. The Boolean type is a subclass of the :class:`int`
class so that arithmetic using a Boolean still works. ::

   >>> True + 1
   2
   >>> False + 1
   1
   >>> False * 75
   0
   >>> True * 75
   75

To sum up :const:`True` and :const:`False` in a sentence: they're alternative
ways to spell the integer values 1 and 0, with the single difference that
:func:`str` and :func:`repr` return the strings ``'True'`` and ``'False'``
instead of ``'1'`` and ``'0'``.
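
That summary can be checked directly. The assertions below are a small sketch of
the claims made in this section, using nothing beyond the built-ins already
mentioned::

   assert isinstance(True, int)            # a Boolean is also an int
   assert issubclass(bool, int)
   assert True == 1 and False == 0         # same values as the integers
   assert str(True) == 'True'              # but a different string form
   assert repr(False) == 'False'

   total = sum([True, False, True])        # Booleans add up like 1 and 0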


.. seealso::

   :pep:`285` - Adding a bool type
      Written and implemented by GvR.

.. ======================================================================


PEP 293: Codec Error Handling Callbacks
=======================================

When encoding a Unicode string into a byte string, unencodable characters may be
encountered. So far, Python has allowed specifying the error processing as
either "strict" (raising :exc:`UnicodeError`), "ignore" (skipping the
character), or "replace" (using a question mark in the output string), with
"strict" being the default behavior. It may be desirable to specify alternative
processing of such errors, such as inserting an XML character reference or HTML
entity reference into the converted string.

Python now has a flexible framework to add different processing strategies. New
error handlers can be added with :func:`codecs.register_error`, and codecs then
can access the error handler with :func:`codecs.lookup_error`. An equivalent C
API has been added for codecs written in C. The error handler gets the necessary
state information such as the string being converted, the position in the string
where the error was detected, and the target encoding. The handler can then
either raise an exception or return a replacement string.

Two additional error handlers have been implemented using this framework:
"backslashreplace" uses Python backslash quoting to represent unencodable
characters and "xmlcharrefreplace" emits XML character references.
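
Here's a short sketch of both the built-in handlers and a custom one. It's
written for a current Python 3 interpreter, where every string is Unicode and
encoding produces a bytes object; under 2.3 the literal would need a ``u``
prefix. The handler name ``'underscore'`` and the function
``underscore_errors`` are invented for this example::

   import codecs

   text = 'caf\xe9 \u20ac5'     # contains two non-ASCII characters

   xml = text.encode('ascii', 'xmlcharrefreplace')     # b'caf&#233; &#8364;5'
   quoted = text.encode('ascii', 'backslashreplace')   # b'caf\\xe9 \\u20ac5'

   def underscore_errors(exc):
       # The exception carries the state: exc.object is the string being
       # converted, exc.start and exc.end delimit the offending slice.
       if not isinstance(exc, UnicodeEncodeError):
           raise exc
       # Return the replacement text and the position at which to resume.
       return ('_' * (exc.end - exc.start), exc.end)

   codecs.register_error('underscore', underscore_errors)
   custom = text.encode('ascii', 'underscore')         # b'caf_ _5'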


.. seealso::

   :pep:`293` - Codec Error Handling Callbacks
      Written and implemented by Walter Dörwald.

.. ======================================================================


.. _section-pep301:

PEP 301: Package Index and Metadata for Distutils
=================================================

Support for the long-requested Python catalog makes its first appearance in 2.3.

The heart of the catalog is the new Distutils :command:`register` command.
Running ``python setup.py register`` will collect the metadata describing a
package, such as its name, version, maintainer, description, etc., and send it
to a central catalog server. The resulting catalog is available from
http://www.python.org/pypi.

To make the catalog a bit more useful, a new optional *classifiers* keyword
argument has been added to the Distutils :func:`setup` function. A list of
`Trove <http://catb.org/~esr/trove/>`_-style strings can be supplied to help
classify the software.

Here's an example :file:`setup.py` with classifiers, written to be compatible
with older versions of the Distutils::

   from distutils import core
   kw = {'name': "Quixote",
         'version': "0.5.1",
         'description': "A highly Pythonic Web application framework",
         # ...
         }

   # Only pass 'classifiers' if this version of the Distutils accepts it
   if (hasattr(core, 'setup_keywords') and
       'classifiers' in core.setup_keywords):
       kw['classifiers'] = \
           ['Topic :: Internet :: WWW/HTTP :: Dynamic Content',
            'Environment :: No Input/Output (Daemon)',
            'Intended Audience :: Developers']

   core.setup(**kw)

The full list of classifiers can be obtained by running ``python setup.py
register --list-classifiers``.


.. seealso::

   :pep:`301` - Package Index and Metadata for Distutils
      Written and implemented by Richard Jones.

.. ======================================================================


.. _section-pep302:

PEP 302: New Import Hooks
=========================

While it's been possible to write custom import hooks ever since the
:mod:`ihooks` module was introduced in Python 1.3, no one has ever been really
happy with it because writing new import hooks is difficult and messy. There
have been various proposed alternatives such as the :mod:`imputil` and :mod:`iu`
modules, but none of them has ever gained much acceptance, and none of them were
easily usable from C code.

:pep:`302` borrows ideas from its predecessors, especially from Gordon
McMillan's :mod:`iu` module. Three new items are added to the :mod:`sys`
module:

* ``sys.path_hooks`` is a list of callable objects; most often they'll be
  classes. Each callable takes a string containing a path and either returns an
  importer object that will handle imports from this path or raises an
  :exc:`ImportError` exception if it can't handle this path.

* ``sys.path_importer_cache`` caches importer objects for each path, so
  ``sys.path_hooks`` will only need to be traversed once for each path.

* ``sys.meta_path`` is a list of importer objects that will be traversed before
  ``sys.path`` is checked. This list is initially empty, but user code can add
  objects to it. Additional built-in and frozen modules can be imported by an
  object added to this list.

Importer objects must have a single method, :meth:`find_module(fullname,
path=None)`. *fullname* will be a module or package name, e.g. ``string`` or
``distutils.core``. :meth:`find_module` must return a loader object that has a
single method, :meth:`load_module(fullname)`, that creates and returns the
corresponding module object.

Pseudo-code for Python's new import logic, therefore, looks something like this
(simplified a bit; see :pep:`302` for the full details)::

   # First ask the importer objects on sys.meta_path
   for mp in sys.meta_path:
       loader = mp.find_module(fullname)
       if loader is not None:
           <module> = loader.load_module(fullname)

   # Then try each sys.path entry with each path hook
   for path in sys.path:
       for hook in sys.path_hooks:
           try:
               importer = hook(path)
           except ImportError:
               # ImportError, so try the other path hooks
               pass
           else:
               loader = importer.find_module(fullname)
               if loader is not None:
                   <module> = loader.load_module(fullname)

   # Not found!
   raise ImportError


.. seealso::

   :pep:`302` - New Import Hooks
      Written by Just van Rossum and Paul Moore. Implemented by Just van Rossum.

.. ======================================================================
762
763
764.. _section-pep305:
765
766PEP 305: Comma-separated Files
767==============================
768
769Comma-separated files are a format frequently used for exporting data from
770databases and spreadsheets. Python 2.3 adds a parser for comma-separated files.
771
772Comma-separated format is deceptively simple at first glance::
773
774 Costs,150,200,3.95
775
776Read a line and call ``line.split(',')``: what could be simpler? But toss in
777string data that can contain commas, and things get more complicated::
778
779 "Costs",150,200,3.95,"Includes taxes, shipping, and sundry items"
780
A big ugly regular expression can parse this, but using the new :mod:`csv`
module is much simpler::

   import csv

   # Binary mode, as the csv module expects for file objects
   infile = open('datafile', 'rb')
   reader = csv.reader(infile)
   for line in reader:
       # Each line is a list of the field values, as strings
       print line
   infile.close()
790
791The :func:`reader` function takes a number of different options. The field
792separator isn't limited to the comma and can be changed to any character, and so
793can the quoting and line-ending characters.
794
795Different dialects of comma-separated files can be defined and registered;
796currently there are two dialects, both used by Microsoft Excel. A separate
797:class:`csv.writer` class will generate comma-separated files from a succession
798of tuples or lists, quoting strings that contain the delimiter.
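
As a rough illustration of the options mentioned above, the sketch below writes
one row with :class:`csv.writer` using a semicolon as the delimiter and reads
it back with a matching :func:`reader`. It assumes a current Python 3
interpreter, where the :mod:`csv` module works on text streams. An in-memory
``io.StringIO`` buffer stands in for a real file, and the sample row is made up
for the example.

```python
import csv
import io

# Write one row; the last field contains the delimiter, so it gets quoted.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=';', lineterminator='\n')
writer.writerow(['Costs', 150, 200, 3.95, 'Includes taxes; shipping'])
text = buffer.getvalue()

# Read the same text back; every field comes back as a string.
rows = list(csv.reader(io.StringIO(text), delimiter=';'))
```

Note that the reader does not convert ``150`` back to an integer; converting
field values is left to the caller.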
799
800
801.. seealso::
802
803 :pep:`305` - CSV File API
804 Written and implemented by Kevin Altis, Dave Cole, Andrew McNamara, Skip
805 Montanaro, Cliff Wells.
806
807.. ======================================================================
808
809
810.. _section-pep307:
811
812PEP 307: Pickle Enhancements
813============================
814
815The :mod:`pickle` and :mod:`cPickle` modules received some attention during the
8162.3 development cycle. In 2.2, new-style classes could be pickled without
817difficulty, but they weren't pickled very compactly; :pep:`307` quotes a trivial
818example where a new-style class results in a pickled string three times longer
819than that for a classic class.
820
821The solution was to invent a new pickle protocol. The :func:`pickle.dumps`
822function has supported a text-or-binary flag for a long time. In 2.3, this
823flag is redefined from a Boolean to an integer: 0 is the old text-mode pickle
824format, 1 is the old binary format, and now 2 is a new 2.3-specific format. A
825new constant, :const:`pickle.HIGHEST_PROTOCOL`, can be used to select the
826fanciest protocol available.
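
A short sketch of selecting a protocol by number follows. The ``data``
dictionary is an invented example, and the code is written to run on later
Python versions as well, where :func:`pickle.dumps` returns a byte string.

```python
import pickle

data = {'name': 'example', 'values': [1, 2, 3]}

text_pickle = pickle.dumps(data, 0)                   # old text-mode format
proto2 = pickle.dumps(data, 2)                        # format added in 2.3
newest = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)  # whatever is newest

# Unpickling detects the protocol on its own, so loads() needs no flag.
restored = pickle.loads(proto2)
```

A protocol 2 pickle begins with a marker that records the protocol number,
which is how the unpickler tells the formats apart.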
827
828Unpickling is no longer considered a safe operation. 2.2's :mod:`pickle`
829provided hooks for trying to prevent unsafe classes from being unpickled
830(specifically, a :attr:`__safe_for_unpickling__` attribute), but none of this
831code was ever audited and therefore it's all been ripped out in 2.3. You should
832not unpickle untrusted data in any version of Python.
833
834To reduce the pickling overhead for new-style classes, a new interface for
835customizing pickling was added using three special methods:
836:meth:`__getstate__`, :meth:`__setstate__`, and :meth:`__getnewargs__`. Consult
837:pep:`307` for the full semantics of these methods.
838
839As a way to compress pickles yet further, it's now possible to use integer codes
840instead of long strings to identify pickled classes. The Python Software
841Foundation will maintain a list of standardized codes; there's also a range of
842codes for private use. Currently no codes have been specified.
843
844
845.. seealso::
846
847 :pep:`307` - Extensions to the pickle protocol
848 Written and implemented by Guido van Rossum and Tim Peters.
849
850.. ======================================================================
851
852
853.. _section-slices:
854
855Extended Slices
856===============
857
858Ever since Python 1.4, the slicing syntax has supported an optional third "step"
859or "stride" argument. For example, these are all legal Python syntax:
860``L[1:10:2]``, ``L[:-1:1]``, ``L[::-1]``. This was added to Python at the
861request of the developers of Numerical Python, which uses the third argument
862extensively. However, Python's built-in list, tuple, and string sequence types
863have never supported this feature, raising a :exc:`TypeError` if you tried it.
864Michael Hudson contributed a patch to fix this shortcoming.
865
866For example, you can now easily extract the elements of a list that have even
867indexes::
868
869 >>> L = range(10)
870 >>> L[::2]
871 [0, 2, 4, 6, 8]
872
873Negative values also work to make a copy of the same list in reverse order::
874
875 >>> L[::-1]
876 [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
877
878This also works for tuples, arrays, and strings::
879
880 >>> s='abcd'
881 >>> s[::2]
882 'ac'
883 >>> s[::-1]
884 'dcba'
885
886If you have a mutable sequence such as a list or an array you can assign to or
887delete an extended slice, but there are some differences between assignment to
888extended and regular slices. Assignment to a regular slice can be used to
889change the length of the sequence::
890
891 >>> a = range(3)
892 >>> a
893 [0, 1, 2]
894 >>> a[1:3] = [4, 5, 6]
895 >>> a
896 [0, 4, 5, 6]
897
898Extended slices aren't this flexible. When assigning to an extended slice, the
899list on the right hand side of the statement must contain the same number of
900items as the slice it is replacing::
901
902 >>> a = range(4)
903 >>> a
904 [0, 1, 2, 3]
905 >>> a[::2]
906 [0, 2]
907 >>> a[::2] = [0, -1]
908 >>> a
909 [0, 1, -1, 3]
910 >>> a[::2] = [0,1,2]
911 Traceback (most recent call last):
912 File "<stdin>", line 1, in ?
913 ValueError: attempt to assign sequence of size 3 to extended slice of size 2
914
915Deletion is more straightforward::
916
917 >>> a = range(4)
918 >>> a
919 [0, 1, 2, 3]
920 >>> a[::2]
921 [0, 2]
922 >>> del a[::2]
923 >>> a
924 [1, 3]
925
926One can also now pass slice objects to the :meth:`__getitem__` methods of the
927built-in sequences::
928
929 >>> range(10).__getitem__(slice(0, 5, 2))
930 [0, 2, 4]
931
932Or use slice objects directly in subscripts::
933
934 >>> range(10)[slice(0, 5, 2)]
935 [0, 2, 4]
936
937To simplify implementing sequences that support extended slicing, slice objects
938now have a method :meth:`indices(length)` which, given the length of a sequence,
939returns a ``(start, stop, step)`` tuple that can be passed directly to
940:func:`range`. :meth:`indices` handles omitted and out-of-bounds indices in a
941manner consistent with regular slices (and this innocuous phrase hides a welter
942of confusing details!). The method is intended to be used like this::
943
   class FakeSeq:
       ...
       def calc_item(self, i):
           ...
       def __getitem__(self, item):
           if isinstance(item, slice):
               # indices() clips the slice to the sequence's length
               indices = item.indices(len(self))
               return FakeSeq([self.calc_item(i) for i in range(*indices)])
           else:
               # A plain integer index
               return self.calc_item(item)
954
955From this example you can also see that the built-in :class:`slice` object is
956now the type object for the slice type, and is no longer a function. This is
957consistent with Python 2.2, where :class:`int`, :class:`str`, etc., underwent
958the same change.
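
To make the ``FakeSeq`` outline concrete, here is a small runnable sketch under
stated assumptions: ``Squares`` is a hypothetical class invented for this
example, whose item *i* is simply ``i * i``, and it returns a plain list for
slices rather than another ``Squares`` instance.

```python
class Squares(object):
    """Hypothetical lazy sequence: item i is i squared."""

    def __init__(self, length):
        self.length = length

    def __len__(self):
        return self.length

    def calc_item(self, i):
        # Support negative indexes the way built-in sequences do.
        if i < 0:
            i += self.length
        if not 0 <= i < self.length:
            raise IndexError(i)
        return i * i

    def __getitem__(self, item):
        if isinstance(item, slice):
            # indices() fills in omitted values and clips to the length.
            start, stop, step = item.indices(len(self))
            return [self.calc_item(i) for i in range(start, stop, step)]
        return self.calc_item(item)


squares = Squares(10)
evens = squares[::2]             # [0, 4, 16, 36, 64]
tail_reversed = squares[:6:-1]   # [81, 64, 49]
clipped = slice(2, 100).indices(10)   # (2, 10, 1)
```

The last line shows the clipping on its own: a stop value of 100 is reduced to
the sequence length, and the omitted step becomes 1.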
959
960.. ======================================================================
961
962
963Other Language Changes
964======================
965
966Here are all of the changes that Python 2.3 makes to the core Python language.
967
968* The :keyword:`yield` statement is now always a keyword, as described in
969 section :ref:`section-generators` of this document.
970
971* A new built-in function :func:`enumerate` was added, as described in section
972 :ref:`section-enumerate` of this document.
973
* Two new constants, :const:`True` and :const:`False`, were added along with the
  built-in :class:`bool` type, as described in section :ref:`section-bool` of this
  document.
977
978* The :func:`int` type constructor will now return a long integer instead of
979 raising an :exc:`OverflowError` when a string or floating-point number is too
980 large to fit into an integer. This can lead to the paradoxical result that
981 ``isinstance(int(expression), int)`` is false, but that seems unlikely to cause
982 problems in practice.
983
984* Built-in types now support the extended slicing syntax, as described in
985 section :ref:`section-slices` of this document.
986
987* A new built-in function, :func:`sum(iterable, start=0)`, adds up the numeric
988 items in the iterable object and returns their sum. :func:`sum` only accepts
989 numbers, meaning that you can't use it to concatenate a bunch of strings.
990 (Contributed by Alex Martelli.)
991
992* ``list.insert(pos, value)`` used to insert *value* at the front of the list
993 when *pos* was negative. The behaviour has now been changed to be consistent
994 with slice indexing, so when *pos* is -1 the value will be inserted before the
995 last element, and so forth.
996
997* ``list.index(value)``, which searches for *value* within the list and returns
998 its index, now takes optional *start* and *stop* arguments to limit the search
999 to only part of the list.
1000
1001* Dictionaries have a new method, :meth:`pop(key[, *default*])`, that returns
1002 the value corresponding to *key* and removes that key/value pair from the
1003 dictionary. If the requested key isn't present in the dictionary, *default* is
1004 returned if it's specified and :exc:`KeyError` raised if it isn't. ::
1005
1006 >>> d = {1:2}
1007 >>> d
1008 {1: 2}
1009 >>> d.pop(4)
1010 Traceback (most recent call last):
1011 File "stdin", line 1, in ?
1012 KeyError: 4
1013 >>> d.pop(1)
1014 2
1015 >>> d.pop(1)
1016 Traceback (most recent call last):
1017 File "stdin", line 1, in ?
1018 KeyError: 'pop(): dictionary is empty'
1019 >>> d
1020 {}
1021 >>>
1022
1023 There's also a new class method, :meth:`dict.fromkeys(iterable, value)`, that
1024 creates a dictionary with keys taken from the supplied iterator *iterable* and
1025 all values set to *value*, defaulting to ``None``.
1026
1027 (Patches contributed by Raymond Hettinger.)
1028
1029 Also, the :func:`dict` constructor now accepts keyword arguments to simplify
1030 creating small dictionaries::
1031
1032 >>> dict(red=1, blue=2, green=3, black=4)
1033 {'blue': 2, 'black': 4, 'green': 3, 'red': 1}
1034
1035 (Contributed by Just van Rossum.)
1036
1037* The :keyword:`assert` statement no longer checks the ``__debug__`` flag, so
1038 you can no longer disable assertions by assigning to ``__debug__``. Running
1039 Python with the :option:`-O` switch will still generate code that doesn't
1040 execute any assertions.
1041
1042* Most type objects are now callable, so you can use them to create new objects
1043 such as functions, classes, and modules. (This means that the :mod:`new` module
1044 can be deprecated in a future Python version, because you can now use the type
1045 objects available in the :mod:`types` module.) For example, you can create a new
1046 module object with the following code:
1047
1048 ::
1049
1050 >>> import types
1051 >>> m = types.ModuleType('abc','docstring')
1052 >>> m
1053 <module 'abc' (built-in)>
1054 >>> m.__doc__
1055 'docstring'
1056
* A new warning, :exc:`PendingDeprecationWarning`, was added to indicate features
  which are in the process of being deprecated. The warning will *not* be printed
  by default. To check for use of features that will be deprecated in the future,
  supply :option:`-Walways::PendingDeprecationWarning::` on the command line or
  use :func:`warnings.filterwarnings`.
1062
1063* The process of deprecating string-based exceptions, as in ``raise "Error
1064 occurred"``, has begun. Raising a string will now trigger
1065 :exc:`PendingDeprecationWarning`.
1066
* Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`.
  In a future version of Python, ``None`` may finally become a keyword.
1069
1070* The :meth:`xreadlines` method of file objects, introduced in Python 2.1, is no
1071 longer necessary because files now behave as their own iterator.
1072 :meth:`xreadlines` was originally introduced as a faster way to loop over all
1073 the lines in a file, but now you can simply write ``for line in file_obj``.
1074 File objects also have a new read-only :attr:`encoding` attribute that gives the
1075 encoding used by the file; Unicode strings written to the file will be
1076 automatically converted to bytes using the given encoding.
1077
1078* The method resolution order used by new-style classes has changed, though
1079 you'll only notice the difference if you have a really complicated inheritance
1080 hierarchy. Classic classes are unaffected by this change. Python 2.2
1081 originally used a topological sort of a class's ancestors, but 2.3 now uses the
1082 C3 algorithm as described in the paper `"A Monotonic Superclass Linearization
1083 for Dylan" <http://www.webcom.com/haahr/dylan/linearization-oopsla96.html>`_. To
1084 understand the motivation for this change, read Michele Simionato's article
1085 `"Python 2.3 Method Resolution Order" <http://www.python.org/2.3/mro.html>`_, or
1086 read the thread on python-dev starting with the message at
1087 http://mail.python.org/pipermail/python-dev/2002-October/029035.html. Samuele
1088 Pedroni first pointed out the problem and also implemented the fix by coding the
1089 C3 algorithm.
1090
1091* Python runs multithreaded programs by switching between threads after
1092 executing N bytecodes. The default value for N has been increased from 10 to
1093 100 bytecodes, speeding up single-threaded applications by reducing the
1094 switching overhead. Some multithreaded applications may suffer slower response
1095 time, but that's easily fixed by setting the limit back to a lower number using
1096 :func:`sys.setcheckinterval(N)`. The limit can be retrieved with the new
1097 :func:`sys.getcheckinterval` function.
1098
1099* One minor but far-reaching change is that the names of extension types defined
1100 by the modules included with Python now contain the module and a ``'.'`` in
1101 front of the type name. For example, in Python 2.2, if you created a socket and
1102 printed its :attr:`__class__`, you'd get this output::
1103
1104 >>> s = socket.socket()
1105 >>> s.__class__
1106 <type 'socket'>
1107
1108 In 2.3, you get this::
1109
1110 >>> s.__class__
1111 <type '_socket.socket'>
1112
1113* One of the noted incompatibilities between old- and new-style classes has been
1114 removed: you can now assign to the :attr:`__name__` and :attr:`__bases__`
1115 attributes of new-style classes. There are some restrictions on what can be
1116 assigned to :attr:`__bases__` along the lines of those relating to assigning to
1117 an instance's :attr:`__class__` attribute.
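
Several of the list and dictionary changes in the list above can be exercised
together. The following is only an illustrative sketch; the variable names and
sample values are invented for the example.

```python
# sum() takes an optional start value that is added to the total.
total = sum([1, 2, 3], 10)

# A negative position now counts from the end, as in slicing.
items = ['a', 'b', 'c']
items.insert(-1, 'x')            # inserted before the last element

# list.index() can limit the search with start and stop arguments.
letters = ['p', 'q', 'p', 'q']
second_p = letters.index('p', 1)     # search begins at index 1
limited = letters.index('q', 2, 4)   # search only indexes 2 and 3

# dict.pop() returns the default instead of raising KeyError.
d = {1: 2}
missing = d.pop(4, 'default')
value = d.pop(1)

# dict.fromkeys() and keyword arguments to dict().
flags = dict.fromkeys(['red', 'blue'], 0)
colours = dict(red=1, blue=2)
```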
1118
1119.. ======================================================================
1120
1121
1122String Changes
1123--------------
1124
1125* The :keyword:`in` operator now works differently for strings. Previously, when
1126 evaluating ``X in Y`` where *X* and *Y* are strings, *X* could only be a single
1127 character. That's now changed; *X* can be a string of any length, and ``X in Y``
1128 will return :const:`True` if *X* is a substring of *Y*. If *X* is the empty
1129 string, the result is always :const:`True`. ::
1130
1131 >>> 'ab' in 'abcd'
1132 True
1133 >>> 'ad' in 'abcd'
1134 False
1135 >>> '' in 'abcd'
1136 True
1137
1138 Note that this doesn't tell you where the substring starts; if you need that
1139 information, use the :meth:`find` string method.
1140
1141* The :meth:`strip`, :meth:`lstrip`, and :meth:`rstrip` string methods now have
1142 an optional argument for specifying the characters to strip. The default is
1143 still to remove all whitespace characters::
1144
1145 >>> ' abc '.strip()
1146 'abc'
1147 >>> '><><abc<><><>'.strip('<>')
1148 'abc'
1149 >>> '><><abc<><><>\n'.strip('<>')
1150 'abc<><><>\n'
1151 >>> u'\u4000\u4001abc\u4000'.strip(u'\u4000')
1152 u'\u4001abc'
1153 >>>
1154
1155 (Suggested by Simon Brunning and implemented by Walter Dörwald.)
1156
1157* The :meth:`startswith` and :meth:`endswith` string methods now accept negative
1158 numbers for the *start* and *end* parameters.
1159
1160* Another new string method is :meth:`zfill`, originally a function in the
1161 :mod:`string` module. :meth:`zfill` pads a numeric string with zeros on the
1162 left until it's the specified width. Note that the ``%`` operator is still more
1163 flexible and powerful than :meth:`zfill`. ::
1164
1165 >>> '45'.zfill(4)
1166 '0045'
1167 >>> '12345'.zfill(4)
1168 '12345'
1169 >>> 'goofy'.zfill(6)
1170 '0goofy'
1171
1172 (Contributed by Walter Dörwald.)
1173
1174* A new type object, :class:`basestring`, has been added. Both 8-bit strings and
1175 Unicode strings inherit from this type, so ``isinstance(obj, basestring)`` will
1176 return :const:`True` for either kind of string. It's a completely abstract
1177 type, so you can't create :class:`basestring` instances.
1178
1179* Interned strings are no longer immortal and will now be garbage-collected in
1180 the usual way when the only reference to them is from the internal dictionary of
1181 interned strings. (Implemented by Oren Tirosh.)
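
The string changes listed above combine naturally in everyday code. A brief
sketch, using made-up sample strings:

```python
# Substring test with `in`; find() reports where the match starts.
found = 'ab' in 'abcd'
position = 'abcd'.find('cd')

# strip() and lstrip() with an explicit set of characters to remove.
stripped = '><><abc<><><>'.strip('<>')
left_only = 'xxabcxx'.lstrip('x')

# zfill() pads on the left with zeros up to the given width.
padded = '45'.zfill(4)

# A negative start counts from the end of the string.
suffix_match = 'filename.txt'.endswith('.txt', -4)
```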
1182
1183.. ======================================================================
1184
1185
1186Optimizations
1187-------------
1188
1189* The creation of new-style class instances has been made much faster; they're
1190 now faster than classic classes!
1191
1192* The :meth:`sort` method of list objects has been extensively rewritten by Tim
1193 Peters, and the implementation is significantly faster.
1194
1195* Multiplication of large long integers is now much faster thanks to an
1196 implementation of Karatsuba multiplication, an algorithm that scales better than
1197 the O(n\*n) required for the grade-school multiplication algorithm. (Original
1198 patch by Christopher A. Craig, and significantly reworked by Tim Peters.)
1199
1200* The ``SET_LINENO`` opcode is now gone. This may provide a small speed
1201 increase, depending on your compiler's idiosyncrasies. See section
1202 :ref:`section-other` for a longer explanation. (Removed by Michael Hudson.)
1203
1204* :func:`xrange` objects now have their own iterator, making ``for i in
1205 xrange(n)`` slightly faster than ``for i in range(n)``. (Patch by Raymond
1206 Hettinger.)
1207
1208* A number of small rearrangements have been made in various hotspots to improve
1209 performance, such as inlining a function or removing some code. (Implemented
1210 mostly by GvR, but lots of people have contributed single changes.)
1211
1212The net result of the 2.3 optimizations is that Python 2.3 runs the pystone
1213benchmark around 25% faster than Python 2.2.
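
The Karatsuba multiplication mentioned in the list above replaces the four
half-size multiplications of the grade-school method with three. The following
is a minimal sketch of the idea on Python integers, not the C implementation
used by the interpreter. The function name ``karatsuba`` and the ``cutoff``
parameter are assumptions made for this illustration, and ``bit_length()`` is
only available in later Python versions.

```python
def karatsuba(x, y, cutoff=64):
    """Multiply non-negative integers x and y using Karatsuba's method."""
    # Small operands: fall back to the built-in multiplication.
    if x.bit_length() <= cutoff or y.bit_length() <= cutoff:
        return x * y

    # Split both numbers into high and low halves at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    low = karatsuba(x_lo, y_lo, cutoff)
    high = karatsuba(x_hi, y_hi, cutoff)
    # One extra multiplication recovers both cross terms at once.
    cross = karatsuba(x_lo + x_hi, y_lo + y_hi, cutoff) - low - high

    return (high << (2 * half)) + (cross << half) + low


a = 3 ** 500
b = 7 ** 400
product = karatsuba(a, b)
```

Because each level does three recursive multiplications on half-size numbers,
the work grows roughly as n to the power 1.58 rather than n squared.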
1214
1215.. ======================================================================
1216
1217
1218New, Improved, and Deprecated Modules
1219=====================================
1220
1221As usual, Python's standard library received a number of enhancements and bug
1222fixes. Here's a partial list of the most notable changes, sorted alphabetically
1223by module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
1224complete list of changes, or look through the CVS logs for all the details.
1225
1226* The :mod:`array` module now supports arrays of Unicode characters using the
1227 ``'u'`` format character. Arrays also now support using the ``+=`` assignment
1228 operator to add another array's contents, and the ``*=`` assignment operator to
1229 repeat an array. (Contributed by Jason Orendorff.)
1230
1231* The :mod:`bsddb` module has been replaced by version 4.1.6 of the `PyBSDDB
1232 <http://pybsddb.sourceforge.net>`_ package, providing a more complete interface
1233 to the transactional features of the BerkeleyDB library.
1234
1235 The old version of the module has been renamed to :mod:`bsddb185` and is no
1236 longer built automatically; you'll have to edit :file:`Modules/Setup` to enable
1237 it. Note that the new :mod:`bsddb` package is intended to be compatible with
1238 the old module, so be sure to file bugs if you discover any incompatibilities.
1239 When upgrading to Python 2.3, if the new interpreter is compiled with a new
1240 version of the underlying BerkeleyDB library, you will almost certainly have to
1241 convert your database files to the new version. You can do this fairly easily
1242 with the new scripts :file:`db2pickle.py` and :file:`pickle2db.py` which you
1243 will find in the distribution's :file:`Tools/scripts` directory. If you've
1244 already been using the PyBSDDB package and importing it as :mod:`bsddb3`, you
1245 will have to change your ``import`` statements to import it as :mod:`bsddb`.
1246
1247* The new :mod:`bz2` module is an interface to the bz2 data compression library.
1248 bz2-compressed data is usually smaller than corresponding :mod:`zlib`\
1249 -compressed data. (Contributed by Gustavo Niemeyer.)
1250
1251* A set of standard date/time types has been added in the new :mod:`datetime`
1252 module. See the following section for more details.
1253
1254* The Distutils :class:`Extension` class now supports an extra constructor
1255 argument named *depends* for listing additional source files that an extension
1256 depends on. This lets Distutils recompile the module if any of the dependency
1257 files are modified. For example, if :file:`sampmodule.c` includes the header
1258 file :file:`sample.h`, you would create the :class:`Extension` object like
1259 this::
1260
     ext = Extension("samp",
                     sources=["sampmodule.c"],
                     depends=["sample.h"])
1264
1265 Modifying :file:`sample.h` would then cause the module to be recompiled.
1266 (Contributed by Jeremy Hylton.)
1267
1268* Other minor changes to Distutils: it now checks for the :envvar:`CC`,
1269 :envvar:`CFLAGS`, :envvar:`CPP`, :envvar:`LDFLAGS`, and :envvar:`CPPFLAGS`
1270 environment variables, using them to override the settings in Python's
1271 configuration (contributed by Robert Weber).
1272
* Previously the :mod:`doctest` module would only search the docstrings of
  public methods and functions for test cases, but it now examines private
  ones as well. The :func:`DocTestSuite` function creates a
  :class:`unittest.TestSuite` object from a set of :mod:`doctest` tests.
1277
1278* The new :func:`gc.get_referents(object)` function returns a list of all the
1279 objects referenced by *object*.
1280
1281* The :mod:`getopt` module gained a new function, :func:`gnu_getopt`, that
1282 supports the same arguments as the existing :func:`getopt` function but uses
1283 GNU-style scanning mode. The existing :func:`getopt` stops processing options as
1284 soon as a non-option argument is encountered, but in GNU-style mode processing
1285 continues, meaning that options and arguments can be mixed. For example::
1286
1287 >>> getopt.getopt(['-f', 'filename', 'output', '-v'], 'f:v')
1288 ([('-f', 'filename')], ['output', '-v'])
1289 >>> getopt.gnu_getopt(['-f', 'filename', 'output', '-v'], 'f:v')
1290 ([('-f', 'filename'), ('-v', '')], ['output'])
1291
  (Contributed by Peter Åstrand.)
1294
1295* The :mod:`grp`, :mod:`pwd`, and :mod:`resource` modules now return enhanced
1296 tuples::
1297
1298 >>> import grp
1299 >>> g = grp.getgrnam('amk')
1300 >>> g.gr_name, g.gr_gid
1301 ('amk', 500)
1302
1303* The :mod:`gzip` module can now handle files exceeding 2 GiB.
1304
1305* The new :mod:`heapq` module contains an implementation of a heap queue
1306 algorithm. A heap is an array-like data structure that keeps items in a
1307 partially sorted order such that, for every index *k*, ``heap[k] <=
1308 heap[2*k+1]`` and ``heap[k] <= heap[2*k+2]``. This makes it quick to remove the
1309 smallest item, and inserting a new item while maintaining the heap property is
1310 O(lg n). (See http://www.nist.gov/dads/HTML/priorityque.html for more
1311 information about the priority queue data structure.)
1312
1313 The :mod:`heapq` module provides :func:`heappush` and :func:`heappop` functions
1314 for adding and removing items while maintaining the heap property on top of some
1315 other mutable Python sequence type. Here's an example that uses a Python list::
1316
     >>> import heapq
     >>> heap = []
     >>> for item in [3, 7, 5, 11, 1]:
     ...     heapq.heappush(heap, item)
     ...
     >>> heap
     [1, 3, 5, 11, 7]
     >>> heapq.heappop(heap)
     1
     >>> heapq.heappop(heap)
     3
     >>> heap
     [5, 7, 11]
1330
1331 (Contributed by Kevin O'Connor.)
1332
1333* The IDLE integrated development environment has been updated using the code
1334 from the IDLEfork project (http://idlefork.sf.net). The most notable feature is
1335 that the code being developed is now executed in a subprocess, meaning that
1336 there's no longer any need for manual ``reload()`` operations. IDLE's core code
1337 has been incorporated into the standard library as the :mod:`idlelib` package.
1338
1339* The :mod:`imaplib` module now supports IMAP over SSL. (Contributed by Piers
1340 Lauder and Tino Lange.)
1341
* The :mod:`itertools` module contains a number of useful functions for use with
1343 iterators, inspired by various functions provided by the ML and Haskell
1344 languages. For example, ``itertools.ifilter(predicate, iterator)`` returns all
1345 elements in the iterator for which the function :func:`predicate` returns
1346 :const:`True`, and ``itertools.repeat(obj, N)`` returns ``obj`` *N* times.
1347 There are a number of other functions in the module; see the package's reference
1348 documentation for details.
1349 (Contributed by Raymond Hettinger.)
1350
1351* Two new functions in the :mod:`math` module, :func:`degrees(rads)` and
1352 :func:`radians(degs)`, convert between radians and degrees. Other functions in
1353 the :mod:`math` module such as :func:`math.sin` and :func:`math.cos` have always
1354 required input values measured in radians. Also, an optional *base* argument
1355 was added to :func:`math.log` to make it easier to compute logarithms for bases
1356 other than ``e`` and ``10``. (Contributed by Raymond Hettinger.)
1357
* Several new POSIX functions (:func:`getpgid`, :func:`killpg`, :func:`lchown`,
  :func:`getloadavg`, :func:`major`, :func:`makedev`, :func:`minor`, and
  :func:`mknod`) were added to the :mod:`posix` module that underlies the
  :mod:`os` module. (Contributed by Gustavo Niemeyer, Geert Jansen, and Denis S.
  Otkidach.)
1363
1364* In the :mod:`os` module, the :func:`\*stat` family of functions can now report
1365 fractions of a second in a timestamp. Such time stamps are represented as
1366 floats, similar to the value returned by :func:`time.time`.
1367
  During testing, it was found that some applications will break if time stamps
  are floats. For compatibility, when using the tuple interface of the
  :class:`stat_result`, time stamps will be represented as integers. When using
  named fields (a feature first introduced in Python 2.2), time stamps are still
  represented as integers, unless :func:`os.stat_float_times` is invoked to enable
  float return values::
1374
1375 >>> os.stat("/tmp").st_mtime
1376 1034791200
1377 >>> os.stat_float_times(True)
1378 >>> os.stat("/tmp").st_mtime
1379 1034791200.6335014
1380
1381 In Python 2.4, the default will change to always returning floats.
1382
1383 Application developers should enable this feature only if all their libraries
1384 work properly when confronted with floating point time stamps, or if they use
1385 the tuple API. If used, the feature should be activated on an application level
1386 instead of trying to enable it on a per-use basis.
1387
1388* The :mod:`optparse` module contains a new parser for command-line arguments
1389 that can convert option values to a particular Python type and will
1390 automatically generate a usage message. See the following section for more
1391 details.
1392
1393* The old and never-documented :mod:`linuxaudiodev` module has been deprecated,
1394 and a new version named :mod:`ossaudiodev` has been added. The module was
1395 renamed because the OSS sound drivers can be used on platforms other than Linux,
1396 and the interface has also been tidied and brought up to date in various ways.
1397 (Contributed by Greg Ward and Nicholas FitzRoy-Dale.)
1398
1399* The new :mod:`platform` module contains a number of functions that try to
1400 determine various properties of the platform you're running on. There are
1401 functions for getting the architecture, CPU type, the Windows OS version, and
1402 even the Linux distribution version. (Contributed by Marc-André Lemburg.)
1403
1404* The parser objects provided by the :mod:`pyexpat` module can now optionally
1405 buffer character data, resulting in fewer calls to your character data handler
1406 and therefore faster performance. Setting the parser object's
1407 :attr:`buffer_text` attribute to :const:`True` will enable buffering.
1408
1409* The :func:`sample(population, k)` function was added to the :mod:`random`
1410 module. *population* is a sequence or :class:`xrange` object containing the
1411 elements of a population, and :func:`sample` chooses *k* elements from the
1412 population without replacing chosen elements. *k* can be any value up to
1413 ``len(population)``. For example::
1414
1415 >>> days = ['Mo', 'Tu', 'We', 'Th', 'Fr', 'St', 'Sn']
1416 >>> random.sample(days, 3) # Choose 3 elements
1417 ['St', 'Sn', 'Th']
1418 >>> random.sample(days, 7) # Choose 7 elements
1419 ['Tu', 'Th', 'Mo', 'We', 'St', 'Fr', 'Sn']
1420 >>> random.sample(days, 7) # Choose 7 again
1421 ['We', 'Mo', 'Sn', 'Fr', 'Tu', 'St', 'Th']
1422 >>> random.sample(days, 8) # Can't choose eight
1423 Traceback (most recent call last):
1424 File "<stdin>", line 1, in ?
1425 File "random.py", line 414, in sample
1426 raise ValueError, "sample larger than population"
1427 ValueError: sample larger than population
1428 >>> random.sample(xrange(1,10000,2), 10) # Choose ten odd nos. under 10000
1429 [3407, 3805, 1505, 7023, 2401, 2267, 9733, 3151, 8083, 9195]
1430
1431 The :mod:`random` module now uses a new algorithm, the Mersenne Twister,
1432 implemented in C. It's faster and more extensively studied than the previous
1433 algorithm.
1434
1435 (All changes contributed by Raymond Hettinger.)
1436
1437* The :mod:`readline` module also gained a number of new functions:
1438 :func:`get_history_item`, :func:`get_current_history_length`, and
1439 :func:`redisplay`.
1440
1441* The :mod:`rexec` and :mod:`Bastion` modules have been declared dead, and
1442 attempts to import them will fail with a :exc:`RuntimeError`. New-style classes
1443 provide new ways to break out of the restricted execution environment provided
1444 by :mod:`rexec`, and no one has interest in fixing them or time to do so. If
1445 you have applications using :mod:`rexec`, rewrite them to use something else.
1446
1447 (Sticking with Python 2.2 or 2.1 will not make your applications any safer
1448 because there are known bugs in the :mod:`rexec` module in those versions. To
1449 repeat: if you're using :mod:`rexec`, stop using it immediately.)
1450
1451* The :mod:`rotor` module has been deprecated because the algorithm it uses for
1452 encryption is not believed to be secure. If you need encryption, use one of the
1453 several AES Python modules that are available separately.
1454
1455* The :mod:`shutil` module gained a :func:`move(src, dest)` function that
1456 recursively moves a file or directory to a new location.
1457
* Support for more advanced POSIX signal handling was added to the :mod:`signal`
  module but then removed again as it proved impossible to make it work reliably
  across platforms.
1461
* The :mod:`socket` module now supports timeouts. You can call the
  :meth:`settimeout(t)` method on a socket object to set a timeout of *t* seconds.
  Subsequent socket operations that take longer than *t* seconds to complete will
  abort and raise a :exc:`socket.timeout` exception.

  The original timeout implementation was by Tim O'Malley. Michael Gilfix
  integrated it into the Python :mod:`socket` module and shepherded it through a
  lengthy review. After the code was checked in, Guido van Rossum rewrote parts
  of it. (This is a good example of a collaborative development process in
  action.)

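  The following sketch shows a timeout firing. It assumes a listening socket on
  the loopback interface that never sends any data; the half-second timeout is
  an arbitrary choice for the illustration::

     import socket

     # A listener that lets the connection complete but never replies.
     server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
     server.bind(('127.0.0.1', 0))       # port 0: let the OS pick a free port
     server.listen(1)

     client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
     client.settimeout(0.5)              # applies to the operations below
     client.connect(server.getsockname())

     timed_out = False
     try:
         client.recv(1024)               # no data ever arrives
     except socket.timeout:
         timed_out = True

     client.close()
     server.close()
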
* On Windows, the :mod:`socket` module now ships with Secure Sockets Layer
  (SSL) support.

* The value of the C :const:`PYTHON_API_VERSION` macro is now exposed at the
  Python level as ``sys.api_version``. The current exception can be cleared by
  calling the new :func:`sys.exc_clear` function.

* The new :mod:`tarfile` module allows reading from and writing to
  :program:`tar`\ -format archive files. (Contributed by Lars Gustäbel.)

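  As a rough sketch of both directions, the snippet below writes a one-member
  archive and reads its member list back. The file names and the scratch
  directory are invented for the example::

     import os
     import shutil
     import tarfile
     import tempfile

     workdir = tempfile.mkdtemp()
     source = os.path.join(workdir, 'hello.txt')
     f = open(source, 'w')
     f.write('hello\n')
     f.close()

     # Write an uncompressed archive containing the file.
     archive = os.path.join(workdir, 'example.tar')
     tar = tarfile.open(archive, 'w')
     tar.add(source, arcname='hello.txt')
     tar.close()

     # Reopen the archive and list its members.
     tar = tarfile.open(archive, 'r')
     names = tar.getnames()
     tar.close()

     shutil.rmtree(workdir)
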
* The new :mod:`textwrap` module contains functions for wrapping strings
  containing paragraphs of text. The :func:`wrap(text, width)` function takes a
  string and returns a list containing the text split into lines of no more than
  the chosen width. The :func:`fill(text, width)` function returns a single
  string, reformatted to fit into lines no longer than the chosen width. (As you
  can guess, :func:`fill` is built on top of :func:`wrap`.) For example::

     >>> import textwrap
     >>> paragraph = "Not a whit, we defy augury: ... more text ..."
     >>> textwrap.wrap(paragraph, 60)
     ["Not a whit, we defy augury: there's a special providence in",
      "the fall of a sparrow. If it be now, 'tis not to come; if it",
      ...]
     >>> print textwrap.fill(paragraph, 35)
     Not a whit, we defy augury: there's
     a special providence in the fall of
     a sparrow. If it be now, 'tis not
     to come; if it be not to come, it
     will be now; if it be not now, yet
     it will come: the readiness is all.
     >>>

  The module also contains a :class:`TextWrapper` class that actually implements
  the text wrapping strategy. Both the :class:`TextWrapper` class and the
  :func:`wrap` and :func:`fill` functions support a number of additional keyword
  arguments for fine-tuning the formatting; consult the module's documentation
  for details. (Contributed by Greg Ward.)

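  As one illustration of those keyword arguments, this sketch uses
  :class:`TextWrapper` to format a bulleted item with a hanging indent. The
  sample sentence and the width of 30 are arbitrary::

     import textwrap

     wrapper = textwrap.TextWrapper(width=30,
                                    initial_indent='* ',
                                    subsequent_indent='  ')
     text = ('The quick brown fox jumps over the lazy dog '
             'and keeps running.')

     # The first line starts with the bullet; later lines are indented
     # to line up under the text, and no line exceeds the width.
     lines = wrapper.wrap(text)
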
* The :mod:`thread` and :mod:`threading` modules now have companion modules,
  :mod:`dummy_thread` and :mod:`dummy_threading`, that provide a do-nothing
  implementation of the :mod:`thread` module's interface for platforms where
  threads are not supported. The intention is to simplify thread-aware modules
  (ones that *don't* rely on threads to run) by putting the following code at the
  top::

     try:
         import threading as _threading
     except ImportError:
         import dummy_threading as _threading

  In this example, :mod:`_threading` is used as the module name to make it clear
  that the module being used is not necessarily the actual :mod:`threading`
  module. Code can call functions and use classes in :mod:`_threading` whether or
  not threads are supported, avoiding an :keyword:`if` statement and making the
  code slightly clearer. This module will not magically make multithreaded code
  run without threads; code that waits for another thread to return or to do
  something will simply hang forever.

* The :mod:`time` module's :func:`strptime` function has long been an annoyance
  because it uses the platform C library's :func:`strptime` implementation, and
  different platforms sometimes have odd bugs. Brett Cannon contributed a
  portable implementation that's written in pure Python and should behave
  identically on all platforms.

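  A short sketch of a call; the date string and the format are arbitrary
  examples::

     import time

     # Parse an ISO-style date into a time tuple.
     parsed = time.strptime('2003-07-29', '%Y-%m-%d')

     # The first three fields are the year, month and day.
     year, month, day = parsed[:3]
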
* The new :mod:`timeit` module helps measure how long snippets of Python code
  take to execute. The :file:`timeit.py` file can be run directly from the
  command line, or the module's :class:`Timer` class can be imported and used
  directly. Here's a short example that figures out whether it's faster to
  convert an 8-bit string to Unicode by appending an empty Unicode string to it or
  by using the :func:`unicode` function::

     import timeit

     timer1 = timeit.Timer('unicode("abc")')
     timer2 = timeit.Timer('"abc" + u""')

     # Run three trials
     print timer1.repeat(repeat=3, number=100000)
     print timer2.repeat(repeat=3, number=100000)

     # On my laptop this outputs:
     # [0.36831796169281006, 0.37441694736480713, 0.35304892063140869]
     # [0.17574405670166016, 0.18193507194519043, 0.17565798759460449]

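  :class:`Timer` also accepts a setup statement as its second argument, which
  keeps preparation work out of the measurement. A small sketch; the statement
  being timed and the repetition count are arbitrary::

     import timeit

     # The second argument is setup code; it is not part of the timed statement.
     timer = timeit.Timer('"-".join(words)', 'words = ["a"] * 100')

     # Total time, in seconds, for 1000 executions of the statement.
     elapsed = timer.timeit(number=1000)
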
* The :mod:`Tix` module has received various bug fixes and updates for the
  current version of the Tix package.

* The :mod:`Tkinter` module now works with a thread-enabled version of Tcl.
  Tcl's threading model requires that widgets only be accessed from the thread in
  which they're created; accesses from another thread can cause Tcl to panic. For
  certain Tcl interfaces, :mod:`Tkinter` will now automatically avoid this when a
  widget is accessed from a different thread by marshalling a command, passing it
  to the correct thread, and waiting for the results. Other interfaces can't be
  handled automatically but :mod:`Tkinter` will now raise an exception on such an
  access so that you can at least find out about the problem. See
  http://mail.python.org/pipermail/python-dev/2002-December/031107.html for a more
  detailed explanation of this change. (Implemented by Martin von Löwis.)

* Calling Tcl methods through :mod:`_tkinter` no longer returns only strings.
  Instead, if Tcl returns other objects those objects are converted to their
  Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
  object if no Python equivalent exists. This behavior can be controlled through
  the :meth:`wantobjects` method of :class:`tkapp` objects.

  When using :mod:`_tkinter` through the :mod:`Tkinter` module (as most Tkinter
  applications will), this feature is always activated. It should not cause
  compatibility problems, since Tkinter would always convert string results to
  Python types where possible.

  If any incompatibilities are found, the old behavior can be restored by setting
  the :attr:`wantobjects` variable in the :mod:`Tkinter` module to false before
  creating the first :class:`tkapp` object. ::

     import Tkinter
     Tkinter.wantobjects = 0

  Any breakage caused by this change should be reported as a bug.

* The :mod:`UserDict` module has a new :class:`DictMixin` class which defines
  all dictionary methods for classes that already have a minimum mapping
  interface. This greatly simplifies writing classes that need to be
  substitutable for dictionaries, such as the classes in the :mod:`shelve`
  module.

  Adding the mix-in as a superclass provides the full dictionary interface
  whenever the class defines :meth:`__getitem__`, :meth:`__setitem__`,
  :meth:`__delitem__`, and :meth:`keys`. For example::

     >>> import UserDict
     >>> class SeqDict(UserDict.DictMixin):
     ...     """Dictionary lookalike implemented with lists."""
     ...     def __init__(self):
     ...         self.keylist = []
     ...         self.valuelist = []
     ...     def __getitem__(self, key):
     ...         try:
     ...             i = self.keylist.index(key)
     ...         except ValueError:
     ...             raise KeyError
     ...         return self.valuelist[i]
     ...     def __setitem__(self, key, value):
     ...         try:
     ...             i = self.keylist.index(key)
     ...             self.valuelist[i] = value
     ...         except ValueError:
     ...             self.keylist.append(key)
     ...             self.valuelist.append(value)
     ...     def __delitem__(self, key):
     ...         try:
     ...             i = self.keylist.index(key)
     ...         except ValueError:
     ...             raise KeyError
     ...         self.keylist.pop(i)
     ...         self.valuelist.pop(i)
     ...     def keys(self):
     ...         return list(self.keylist)
     ...
     >>> s = SeqDict()
     >>> dir(s)      # See that other dictionary methods are implemented
     ['__cmp__', '__contains__', '__delitem__', '__doc__', '__getitem__',
      '__init__', '__iter__', '__len__', '__module__', '__repr__',
      '__setitem__', 'clear', 'get', 'has_key', 'items', 'iteritems',
      'iterkeys', 'itervalues', 'keylist', 'keys', 'pop', 'popitem',
      'setdefault', 'update', 'valuelist', 'values']

  (Contributed by Raymond Hettinger.)

* The DOM implementation in :mod:`xml.dom.minidom` can now generate XML output
  in a particular encoding by providing an optional encoding argument to the
  :meth:`toxml` and :meth:`toprettyxml` methods of DOM nodes.

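  A brief sketch of the encoding argument; the one-element document is made up
  for the example::

     from xml.dom import minidom

     doc = minidom.parseString('<greeting>hello</greeting>')

     # With an encoding argument, the XML declaration names that encoding
     # and the output is encoded accordingly.
     encoded = doc.toxml('utf-8')
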
* The :mod:`xmlrpclib` module now supports an XML-RPC extension for handling nil
  data values such as Python's ``None``. Nil values are always supported on
  unmarshalling an XML-RPC response. To generate requests containing ``None``,
  you must supply a true value for the *allow_none* parameter when creating a
  :class:`Marshaller` instance.

* The new :mod:`DocXMLRPCServer` module allows writing self-documenting XML-RPC
  servers. Run it in demo mode (as a program) to see it in action. Pointing the
  Web browser to the RPC server produces pydoc-style documentation; pointing
  xmlrpclib to the server allows invoking the actual methods. (Contributed by
  Brian Quinlan.)

* Support for internationalized domain names (RFCs 3454, 3490, 3491, and 3492)
  has been added. The "idna" encoding can be used to convert between a Unicode
  domain name and the ASCII-compatible encoding (ACE) of that name. ::

     >>> u"www.Alliancefrançaise.nu".encode("idna")
     'www.xn--alliancefranaise-npb.nu'

  The :mod:`socket` module has also been extended to transparently convert
  Unicode hostnames to the ACE version before passing them to the C library.
  Modules that deal with hostnames (such as :mod:`httplib` and :mod:`ftplib`)
  also support Unicode host names; :mod:`httplib` also sends HTTP ``Host``
  headers using the ACE version of the domain name. :mod:`urllib` supports
  Unicode URLs with non-ASCII host names as long as the ``path`` part of the URL
  is ASCII only.

  To implement this change, the :mod:`stringprep` module, the ``mkstringprep``
  tool and the ``punycode`` encoding have been added.

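  The conversion also works in the other direction. In this small sketch the
  non-ASCII character is written as an escape so that the snippet does not
  depend on the source file's encoding::

     name = u'www.alliancefran\xe7aise.nu'

     # Unicode name -> ASCII-compatible encoding
     ace = name.encode('idna')

     # ASCII-compatible encoding -> Unicode name
     roundtrip = ace.decode('idna')
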
.. ======================================================================


Date/Time Type
--------------

Date and time types suitable for expressing timestamps were added as the
:mod:`datetime` module. The types don't support different calendars or many
fancy features, and just stick to the basics of representing time.

The three primary types are: :class:`date`, representing a day, month, and year;
:class:`time`, consisting of hour, minute, and second; and :class:`datetime`,
which contains all the attributes of both :class:`date` and :class:`time`.
There's also a :class:`timedelta` class representing differences between two
points in time, and time zone logic is implemented by classes inheriting from
the abstract :class:`tzinfo` class.

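A minimal sketch of a :class:`tzinfo` subclass. The ``FixedOffset`` class is a
made-up example, not something the module provides, and it ignores daylight
saving time entirely::

   import datetime

   class FixedOffset(datetime.tzinfo):
       """Time zone with a constant offset from UTC, given in minutes."""

       def __init__(self, minutes, name):
           self._offset = datetime.timedelta(minutes=minutes)
           self._name = name

       def utcoffset(self, dt):
           return self._offset

       def tzname(self, dt):
           return self._name

       def dst(self, dt):
           # This sketch never applies a daylight saving adjustment.
           return datetime.timedelta(0)

   eastern = FixedOffset(-300, 'EST')
   noon = datetime.datetime(2003, 7, 29, 12, 0, tzinfo=eastern)

   # The offset appears in the ISO 8601 representation.
   stamp = noon.isoformat()
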
You can create instances of :class:`date` and :class:`time` by either supplying
keyword arguments to the appropriate constructor, e.g.
``datetime.date(year=1972, month=10, day=15)``, or by using one of a number of
class methods. For example, the :meth:`date.today` class method returns the
current local date.

Once created, instances of the date/time classes are all immutable. There are a
number of methods for producing formatted strings from objects::

   >>> import datetime
   >>> now = datetime.datetime.now()
   >>> now.isoformat()
   '2002-12-30T21:27:03.994956'
   >>> now.ctime()  # Only available on date, datetime
   'Mon Dec 30 21:27:03 2002'
   >>> now.strftime('%Y %d %b')
   '2002 30 Dec'

The :meth:`replace` method allows modifying one or more fields of a
:class:`date` or :class:`datetime` instance, returning a new instance::

   >>> d = datetime.datetime.now()
   >>> d
   datetime.datetime(2002, 12, 30, 22, 15, 38, 827738)
   >>> d.replace(year=2001, hour = 12)
   datetime.datetime(2001, 12, 30, 12, 15, 38, 827738)
   >>>

Instances can be compared, hashed, and converted to strings (the result is the
same as that of :meth:`isoformat`). :class:`date` and :class:`datetime`
instances can be subtracted from each other, and added to :class:`timedelta`
instances. The largest missing feature is that there's no standard library
support for parsing strings and getting back a :class:`date` or
:class:`datetime`.

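The arithmetic described above can be sketched as follows; the dates are
arbitrary examples::

   import datetime

   release = datetime.date(2003, 7, 29)

   # Adding a timedelta to a date gives another date.
   later = release + datetime.timedelta(days=30)

   # Subtracting two dates gives a timedelta.
   gap = datetime.date(2004, 1, 1) - release

   # Instances compare chronologically.
   in_order = release < later
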
For more information, refer to the module's reference documentation.
(Contributed by Tim Peters.)

.. ======================================================================


The optparse Module
-------------------

The :mod:`getopt` module provides simple parsing of command-line arguments. The
new :mod:`optparse` module (originally named Optik) provides more elaborate
command-line parsing that follows the Unix conventions, automatically creates
the output for :option:`--help`, and can perform different actions for different
options.

You start by creating an instance of :class:`OptionParser` and telling it what
your program's options are. ::

   import sys
   from optparse import OptionParser

   op = OptionParser()
   op.add_option('-i', '--input',
                 action='store', type='string', dest='input',
                 help='set input filename')
   op.add_option('-l', '--length',
                 action='store', type='int', dest='length',
                 help='set maximum length of output')

Parsing a command line is then done by calling the :meth:`parse_args` method. ::

   options, args = op.parse_args(sys.argv[1:])
   print options
   print args

This returns an object containing all of the option values, and a list of
strings containing the remaining arguments.

Invoking the script with the various arguments now works as you'd expect it to.
Note that the length argument is automatically converted to an integer. ::

   $ ./python opt.py -i data arg1
   <Values at 0x400cad4c: {'input': 'data', 'length': None}>
   ['arg1']
   $ ./python opt.py --input=data --length=4
   <Values at 0x400cad2c: {'input': 'data', 'length': 4}>
   []
   $

The help message is automatically generated for you::

   $ ./python opt.py --help
   usage: opt.py [options]

   options:
     -h, --help            show this help message and exit
     -iINPUT, --input=INPUT
                           set input filename
     -lLENGTH, --length=LENGTH
                           set maximum length of output
   $

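Options can also carry default values and trigger actions other than
``'store'``. The sketch below is self-contained because it parses an explicit
argument list instead of ``sys.argv``; the ``--verbose`` flag and the default
length of 80 are invented for the illustration::

   from optparse import OptionParser

   parser = OptionParser()
   parser.add_option('-l', '--length',
                     action='store', type='int', dest='length', default=80,
                     help='set maximum length of output')
   # The 'store_true' action records only whether the flag was present.
   parser.add_option('-v', '--verbose',
                     action='store_true', dest='verbose', default=False,
                     help='print extra messages')

   # Pass the arguments explicitly rather than reading sys.argv.
   options, args = parser.parse_args(['-v', '--length=4', 'arg1'])

   # With no arguments at all, the defaults apply.
   defaults, no_args = parser.parse_args([])
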
See the module's documentation for more details.

Optik was written by Greg Ward, with suggestions from the readers of the Getopt
SIG.

.. ======================================================================


.. _section-pymalloc:

Pymalloc: A Specialized Object Allocator
========================================

Pymalloc, a specialized object allocator written by Vladimir Marangozov, was a
feature added to Python 2.1. Pymalloc is intended to be faster than the system
:c:func:`malloc` and to have less memory overhead for allocation patterns typical
of Python programs. The allocator uses C's :c:func:`malloc` function to get large
pools of memory and then fulfills smaller memory requests from these pools.

In 2.1 and 2.2, pymalloc was an experimental feature and wasn't enabled by
default; you had to explicitly enable it when compiling Python by providing the
:option:`--with-pymalloc` option to the :program:`configure` script. In 2.3,
pymalloc has had further enhancements and is now enabled by default; you'll have
to supply :option:`--without-pymalloc` to disable it.

This change is transparent to code written in Python; however, pymalloc may
expose bugs in C extensions. Authors of C extension modules should test their
code with pymalloc enabled, because some incorrect code may cause core dumps at
runtime.

There's one particularly common error that causes problems. There are a number
of memory allocation functions in Python's C API that have previously just been
aliases for the C library's :c:func:`malloc` and :c:func:`free`, meaning that if
you accidentally called mismatched functions the error wouldn't be noticeable.
When the object allocator is enabled, these functions aren't aliases of
:c:func:`malloc` and :c:func:`free` any more, and calling the wrong function to
free memory may get you a core dump. For example, if memory was allocated using
:c:func:`PyObject_Malloc`, it has to be freed using :c:func:`PyObject_Free`, not
:c:func:`free`. A few modules included with Python fell afoul of this and had to
be fixed; doubtless there are more third-party modules that will have the same
problem.

As part of this change, the confusing multiple interfaces for allocating memory
have been consolidated down into two API families. Memory allocated with one
family must not be manipulated with functions from the other family. There is
one family for allocating chunks of memory and another family of functions
specifically for allocating Python objects.

* To allocate and free an undistinguished chunk of memory use the "raw memory"
  family: :c:func:`PyMem_Malloc`, :c:func:`PyMem_Realloc`, and :c:func:`PyMem_Free`.

* The "object memory" family is the interface to the pymalloc facility described
  above and is biased towards a large number of "small" allocations:
  :c:func:`PyObject_Malloc`, :c:func:`PyObject_Realloc`, and :c:func:`PyObject_Free`.

* To allocate and free Python objects, use the "object" family
  :c:func:`PyObject_New`, :c:func:`PyObject_NewVar`, and :c:func:`PyObject_Del`.

Thanks to lots of work by Tim Peters, pymalloc in 2.3 also provides debugging
features to catch memory overwrites and doubled frees in both extension modules
and in the interpreter itself. To enable this support, compile a debugging
version of the Python interpreter by running :program:`configure` with
:option:`--with-pydebug`.

To aid extension writers, a header file :file:`Misc/pymemcompat.h` is
distributed with the source to Python 2.3 that allows Python extensions to use
the 2.3 interfaces to memory allocation while compiling against any version of
Python since 1.5.2. You would copy the file from Python's source distribution
and bundle it with the source of your extension.


.. seealso::

   http://svn.python.org/view/python/trunk/Objects/obmalloc.c
      For the full details of the pymalloc implementation, see the comments at
      the top of the file :file:`Objects/obmalloc.c` in the Python source code.
      The above link points to the file within the python.org SVN browser.

.. ======================================================================


Build and C API Changes
=======================

Changes to Python's build process and to the C API include:

* The cycle detection implementation used by the garbage collection has proven
  to be stable, so it's now been made mandatory. You can no longer compile Python
  without it, and the :option:`--with-cycle-gc` switch to :program:`configure` has
  been removed.

* Python can now optionally be built as a shared library
  (:file:`libpython2.3.so`) by supplying :option:`--enable-shared` when running
  Python's :program:`configure` script. (Contributed by Ondrej Palkovsky.)

* The :c:macro:`DL_EXPORT` and :c:macro:`DL_IMPORT` macros are now deprecated.
  Initialization functions for Python extension modules should now be declared
  using the new macro :c:macro:`PyMODINIT_FUNC`, while the Python core will
  generally use the :c:macro:`PyAPI_FUNC` and :c:macro:`PyAPI_DATA` macros.

* The interpreter can be compiled without any docstrings for the built-in
  functions and modules by supplying :option:`--without-doc-strings` to the
  :program:`configure` script. This makes the Python executable about 10% smaller,
  but will also mean that you can't get help for Python's built-ins. (Contributed
  by Gustavo Niemeyer.)

* The :c:func:`PyArg_NoArgs` macro is now deprecated, and code that uses it
  should be changed. For Python 2.2 and later, the method definition table can
  specify the :const:`METH_NOARGS` flag, signalling that there are no arguments,
  and the argument checking can then be removed. If compatibility with pre-2.2
  versions of Python is important, the code could use ``PyArg_ParseTuple(args,
  "")`` instead, but this will be slower than using :const:`METH_NOARGS`.

* :c:func:`PyArg_ParseTuple` accepts new format characters for various sizes of
  unsigned integers: ``B`` for :c:type:`unsigned char`, ``H`` for :c:type:`unsigned
  short int`, ``I`` for :c:type:`unsigned int`, and ``K`` for :c:type:`unsigned
  long long`.

* A new function, :c:func:`PyObject_DelItemString(mapping, char \*key)` was added
  as shorthand for ``PyObject_DelItem(mapping, PyString_FromString(key))``.

* File objects now manage their internal string buffer differently, increasing
  it exponentially when needed. This results in the benchmark tests in
  :file:`Lib/test/test_bufio.py` speeding up considerably (from 57 seconds to 1.7
  seconds, according to one measurement).

* It's now possible to define class and static methods for a C extension type by
  setting either the :const:`METH_CLASS` or :const:`METH_STATIC` flags in a
  method's :c:type:`PyMethodDef` structure.

* Python now includes a copy of the Expat XML parser's source code, removing any
  dependence on a system version or local installation of Expat.

* If you dynamically allocate type objects in your extension, you should be
  aware of a change in the rules relating to the :attr:`__module__` and
  :attr:`__name__` attributes. In summary, you will want to ensure the type's
  dictionary contains a ``'__module__'`` key; making the module name the part of
  the type name leading up to the final period will no longer have the desired
  effect. For more detail, read the API reference documentation or the source.

.. ======================================================================


Port-Specific Changes
---------------------

Support for a port to IBM's OS/2 using the EMX runtime environment was merged
into the main Python source tree. EMX is a POSIX emulation layer over the OS/2
system APIs. The Python port for EMX tries to support all the POSIX-like
capability exposed by the EMX runtime, and mostly succeeds; :func:`fork` and
:func:`fcntl` are restricted by the limitations of the underlying emulation
layer. The standard OS/2 port, which uses IBM's Visual Age compiler, also
gained support for case-sensitive import semantics as part of the integration of
the EMX port into CVS. (Contributed by Andrew MacIntyre.)

On MacOS, most toolbox modules have been weaklinked to improve backward
compatibility. This means that modules will no longer fail to load if a single
routine is missing on the current OS version. Instead calling the missing
routine will raise an exception. (Contributed by Jack Jansen.)

The RPM spec files, found in the :file:`Misc/RPM/` directory in the Python
source distribution, were updated for 2.3. (Contributed by Sean Reifschneider.)

Other new platforms now supported by Python include AtheOS
(http://www.atheos.cx/), GNU/Hurd, and OpenVMS.

.. ======================================================================


.. _section-other:

Other Changes and Fixes
=======================

As usual, there were a bunch of other improvements and bugfixes scattered
throughout the source tree. A search through the CVS change logs finds there
were 523 patches applied and 514 bugs fixed between Python 2.2 and 2.3. Both
figures are likely to be underestimates.

Some of the more notable changes are:

* If the :envvar:`PYTHONINSPECT` environment variable is set, the Python
  interpreter will enter the interactive prompt after running a Python program, as
  if Python had been invoked with the :option:`-i` option. The environment
  variable can be set before running the Python interpreter, or it can be set by
  the Python program as part of its execution.

* The :file:`regrtest.py` script now provides a way to allow "all resources
  except *foo*." A resource name passed to the :option:`-u` option can now be
  prefixed with a hyphen (``'-'``) to mean "remove this resource." For example,
  the option '``-uall,-bsddb``' could be used to enable the use of all resources
  except ``bsddb``.

* The tools used to build the documentation now work under Cygwin as well as
  Unix.

* The ``SET_LINENO`` opcode has been removed. Back in the mists of time, this
  opcode was needed to produce line numbers in tracebacks and support trace
  functions (for, e.g., :mod:`pdb`). Since Python 1.5, the line numbers in
  tracebacks have been computed using a different mechanism that works with
  "python -O". For Python 2.3 Michael Hudson implemented a similar scheme to
  determine when to call the trace function, removing the need for ``SET_LINENO``
  entirely.

  It would be difficult to detect any resulting difference from Python code, apart
  from a slight speed up when Python is run without :option:`-O`.

  C extensions that access the :attr:`f_lineno` field of frame objects should
  instead call ``PyCode_Addr2Line(f->f_code, f->f_lasti)``. This will have the
  added effect of making the code work as desired under "python -O" in earlier
  versions of Python.

  A nifty new feature is that trace functions can now assign to the
  :attr:`f_lineno` attribute of frame objects, changing the line that will be
  executed next. A ``jump`` command has been added to the :mod:`pdb` debugger
  taking advantage of this new feature. (Implemented by Richie Hindle.)

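  A rough sketch of such a jump, in which a trace function skips one statement
  of a function. The ``demo`` and ``tracer`` functions are made up for the
  illustration, and the line offsets are counted from the ``def`` line of
  ``demo``::

     import sys

     def demo():
         result = []
         result.append('first')
         result.append('second')    # the trace function jumps over this line
         result.append('third')
         return result

     def tracer(frame, event, arg):
         code = frame.f_code
         if code.co_name != 'demo':
             return None            # don't trace other functions
         # A 'line' event is reported just before the line is executed.
         if event == 'line' and frame.f_lineno == code.co_firstlineno + 3:
             frame.f_lineno = code.co_firstlineno + 4
         return tracer

     sys.settrace(tracer)
     try:
         outcome = demo()
     finally:
         sys.settrace(None)
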
.. ======================================================================


Porting to Python 2.3
=====================

This section lists previously described changes that may require changes to your
code:

* :keyword:`yield` is now always a keyword; if it's used as a variable name in
  your code, a different name must be chosen.

* For strings *X* and *Y*, ``X in Y`` now works if *X* is more than one
  character long.

* The :func:`int` type constructor will now return a long integer instead of
  raising an :exc:`OverflowError` when a string or floating-point number is too
  large to fit into an integer.

* If you have Unicode strings that contain 8-bit characters, you must declare
  the file's encoding (UTF-8, Latin-1, or whatever) by adding a comment to the top
  of the file. See section :ref:`section-encodings` for more information.

* Calling Tcl methods through :mod:`_tkinter` no longer returns only strings.
  Instead, if Tcl returns other objects those objects are converted to their
  Python equivalent, if one exists, or wrapped with a :class:`_tkinter.Tcl_Obj`
  object if no Python equivalent exists.

* Large octal and hex literals such as ``0xffffffff`` now trigger a
  :exc:`FutureWarning`. Currently they're stored as 32-bit numbers and result in a
  negative value, but in Python 2.4 they'll become positive long integers.

  There are a few ways to fix this warning. If you really need a positive number,
  just add an ``L`` to the end of the literal. If you're trying to get a 32-bit
  integer with low bits set and have previously used an expression such as ``~(1
  << 31)``, it's probably clearest to start with all bits set and clear the
  desired upper bits. For example, to clear just the top bit (bit 31), you could
  write ``0xffffffffL &~(1L<<31)``.

* You can no longer disable assertions by assigning to ``__debug__``.

* The Distutils :func:`setup` function has gained various new keyword arguments
  such as *depends*. Old versions of the Distutils will abort if passed unknown
  keywords. A solution is to check for the presence of the new
  :func:`get_distutil_options` function in your :file:`setup.py` and only use the
  new keywords with a version of the Distutils that supports them::

     from distutils import core

     # Extension expects a list of source file names.
     kw = {'sources': ['foo.c'], ...}
     # Only pass the new keyword to a Distutils that understands it.
     if hasattr(core, 'get_distutil_options'):
         kw['depends'] = ['foo.h']
     ext = core.Extension(**kw)

* Using ``None`` as a variable name will now result in a :exc:`SyntaxWarning`
  warning.

* Names of extension types defined by the modules included with Python now
  contain the module and a ``'.'`` in front of the type name.

.. ======================================================================


.. _23acks:

Acknowledgements
================

The author would like to thank the following people for offering suggestions,
corrections and assistance with various drafts of this article: Jeff Bauer,
Simon Brunning, Brett Cannon, Michael Chermside, Andrew Dalke, Scott David
Daniels, Fred L. Drake, Jr., David Fraser, Kelly Gerber, Raymond Hettinger,
Michael Hudson, Chris Lambert, Detlef Lannert, Martin von Löwis, Andrew
MacIntyre, Lalo Martins, Chad Netzer, Gustavo Niemeyer, Neal Norwitz, Hans
Nowak, Chris Reedy, Francesco Ricciardi, Vinay Sajip, Neil Schemenauer, Roman
Suzi, Jason Tishler, Just van Rossum.