| 1 | ****************************
|
---|
| 2 | What's New in Python 2.0
|
---|
| 3 | ****************************
|
---|
| 4 |
|
---|
| 5 | :Author: A.M. Kuchling and Moshe Zadka
|
---|
| 6 |
|
---|
| 7 | .. |release| replace:: 1.02
|
---|
| 8 |
|
---|
| 9 | .. $Id: whatsnew20.tex 50964 2006-07-30 03:03:43Z fred.drake $
|
---|
| 10 |
|
---|
| 11 |
|
---|
| 12 | Introduction
|
---|
| 13 | ============
|
---|
| 14 |
|
---|
| 15 | A new release of Python, version 2.0, was released on October 16, 2000. This
|
---|
| 16 | article covers the exciting new features in 2.0, highlights some other useful
|
---|
| 17 | changes, and points out a few incompatible changes that may require rewriting
|
---|
| 18 | code.
|
---|
| 19 |
|
---|
| 20 | Python's development never completely stops between releases, and a steady flow
|
---|
| 21 | of bug fixes and improvements is always being submitted. A host of minor fixes,
|
---|
| 22 | a few optimizations, additional docstrings, and better error messages went into
|
---|
| 23 | 2.0; to list them all would be impossible, but they're certainly significant.
|
---|
| 24 | Consult the publicly-available CVS logs if you want to see the full list. This
|
---|
| 25 | progress is due to the five developers working for PythonLabs now being
|
---|
| 26 | paid to spend their days fixing bugs, and also due to the improved communication
|
---|
| 27 | resulting from moving to SourceForge.
|
---|
| 28 |
|
---|
| 29 | .. ======================================================================
|
---|
| 30 |
|
---|
| 31 |
|
---|
| 32 | What About Python 1.6?
|
---|
| 33 | ======================
|
---|
| 34 |
|
---|
| 35 | Python 1.6 can be thought of as the Contractual Obligations Python release.
|
---|
| 36 | After the core development team left CNRI in May 2000, CNRI requested that a 1.6
|
---|
| 37 | release be created, containing all the work on Python that had been performed at
|
---|
| 38 | CNRI. Python 1.6 therefore represents the state of the CVS tree as of May 2000,
|
---|
| 39 | with the most significant new feature being Unicode support. Development
|
---|
| 40 | continued after May, of course, so the 1.6 tree received a few fixes to ensure
|
---|
| 41 | that it's forward-compatible with Python 2.0. 1.6 is therefore part of Python's
|
---|
| 42 | evolution, and not a side branch.
|
---|
| 43 |
|
---|
| 44 | So, should you take much interest in Python 1.6? Probably not. The 1.6final
|
---|
| 45 | and 2.0beta1 releases were made on the same day (September 5, 2000), the plan
|
---|
| 46 | being to finalize Python 2.0 within a month or so. If you have applications to
|
---|
| 47 | maintain, there seems little point in breaking things by moving to 1.6, fixing
|
---|
| 48 | them, and then having another round of breakage within a month by moving to 2.0;
|
---|
| 49 | you're better off just going straight to 2.0. Most of the really interesting
|
---|
| 50 | features described in this document are only in 2.0, because a lot of work was
|
---|
| 51 | done between May and September.
|
---|
| 52 |
|
---|
| 53 | .. ======================================================================
|
---|
| 54 |
|
---|
| 55 |
|
---|
| 56 | New Development Process
|
---|
| 57 | =======================
|
---|
| 58 |
|
---|
| 59 | The most important change in Python 2.0 may not be to the code at all, but to
|
---|
| 60 | how Python is developed: in May 2000 the Python developers began using the tools
|
---|
| 61 | made available by SourceForge for storing source code, tracking bug reports,
|
---|
| 62 | and managing the queue of patch submissions. To report bugs or submit patches
|
---|
| 63 | for Python 2.0, use the bug tracking and patch manager tools available from
|
---|
| 64 | Python's project page, located at http://sourceforge.net/projects/python/.
|
---|
| 65 |
|
---|
| 66 | The most important of the services now hosted at SourceForge is the Python CVS
|
---|
| 67 | tree, the version-controlled repository containing the source code for Python.
|
---|
| 68 | Previously, there were roughly 7 people who had write access to the CVS
|
---|
| 69 | tree, and all patches had to be inspected and checked in by one of the people on
|
---|
| 70 | this short list. Obviously, this wasn't very scalable. By moving the CVS tree
|
---|
| 71 | to SourceForge, it became possible to grant write access to more people; as of
|
---|
| 72 | September 2000 there were 27 people able to check in changes, a fourfold
|
---|
| 73 | increase. This makes possible large-scale changes that wouldn't be attempted if
|
---|
| 74 | they had to be filtered through the small group of core developers. For
|
---|
| 75 | example, one day Peter Schneider-Kamp took it into his head to drop K&R C
|
---|
| 76 | compatibility and convert the C source for Python to ANSI C. After getting
|
---|
| 77 | approval on the python-dev mailing list, he launched into a flurry of checkins
|
---|
| 78 | that lasted about a week, other developers joined in to help, and the job was
|
---|
| 79 | done. If there were only 5 people with write access, probably that task would
|
---|
| 80 | have been viewed as "nice, but not worth the time and effort needed" and it
|
---|
| 81 | would never have gotten done.
|
---|
| 82 |
|
---|
| 83 | The shift to using SourceForge's services has resulted in a remarkable increase
|
---|
| 84 | in the speed of development. Patches now get submitted, commented on, revised
|
---|
| 85 | by people other than the original submitter, and bounced back and forth between
|
---|
| 86 | people until the patch is deemed worth checking in. Bugs are tracked in one
|
---|
| 87 | central location and can be assigned to a specific person for fixing, and we can
|
---|
| 88 | count the number of open bugs to measure progress. This didn't come without a
|
---|
| 89 | cost: developers now have more e-mail to deal with, more mailing lists to
|
---|
| 90 | follow, and special tools had to be written for the new environment. For
|
---|
| 91 | example, SourceForge sends default patch and bug notification e-mail messages
|
---|
| 92 | that are completely unhelpful, so Ka-Ping Yee wrote an HTML screen-scraper that
|
---|
| 93 | sends more useful messages.
|
---|
| 94 |
|
---|
| 95 | The ease of adding code caused a few initial growing pains, such as code being
|
---|
| 96 | checked in before it was ready or without getting clear agreement from the
|
---|
| 97 | developer group. The approval process that has emerged is somewhat similar to
|
---|
| 98 | that used by the Apache group. Developers can vote +1, +0, -0, or -1 on a patch;
|
---|
| 99 | +1 and -1 denote acceptance or rejection, while +0 and -0 mean the developer is
|
---|
| 100 | mostly indifferent to the change, though with a slight positive or negative
|
---|
| 101 | slant. The most significant change from the Apache model is that the voting is
|
---|
| 102 | essentially advisory, letting Guido van Rossum, who has Benevolent Dictator For
|
---|
| 103 | Life status, know what the general opinion is. He can still ignore the result of
|
---|
| 104 | a vote, and approve or reject a change even if the community disagrees with him.
|
---|
| 105 |
|
---|
| 106 | Producing an actual patch is the last step in adding a new feature, and is
|
---|
| 107 | usually easy compared to the earlier task of coming up with a good design.
|
---|
| 108 | Discussions of new features can often explode into lengthy mailing list threads,
|
---|
| 109 | making the discussion hard to follow, and no one can read every posting to
|
---|
| 110 | python-dev. Therefore, a relatively formal process has been set up to write
|
---|
| 111 | Python Enhancement Proposals (PEPs), modelled on the Internet RFC process. PEPs
|
---|
| 112 | are draft documents that describe a proposed new feature, and are continually
|
---|
| 113 | revised until the community reaches a consensus, either accepting or rejecting
|
---|
| 114 | the proposal. Quoting from the introduction to PEP 1, "PEP Purpose and
|
---|
| 115 | Guidelines":
|
---|
| 116 |
|
---|
| 117 |
|
---|
| 118 | .. epigraph::
|
---|
| 119 |
|
---|
| 120 | PEP stands for Python Enhancement Proposal. A PEP is a design document
|
---|
| 121 | providing information to the Python community, or describing a new feature for
|
---|
| 122 | Python. The PEP should provide a concise technical specification of the feature
|
---|
| 123 | and a rationale for the feature.
|
---|
| 124 |
|
---|
| 125 | We intend PEPs to be the primary mechanisms for proposing new features, for
|
---|
| 126 | collecting community input on an issue, and for documenting the design decisions
|
---|
| 127 | that have gone into Python. The PEP author is responsible for building
|
---|
| 128 | consensus within the community and documenting dissenting opinions.
|
---|
| 129 |
|
---|
| 130 | Read the rest of PEP 1 for the details of the PEP editorial process, style, and
|
---|
| 131 | format. PEPs are kept in the Python CVS tree on SourceForge, though they're not
|
---|
| 132 | part of the Python 2.0 distribution, and are also available in HTML form from
|
---|
| 133 | http://www.python.org/peps/. As of September 2000, there are 25 PEPs, ranging
|
---|
| 134 | from PEP 201, "Lockstep Iteration", to PEP 225, "Elementwise/Objectwise
|
---|
| 135 | Operators".
|
---|
| 136 |
|
---|
| 137 | .. ======================================================================
|
---|
| 138 |
|
---|
| 139 |
|
---|
| 140 | Unicode
|
---|
| 141 | =======
|
---|
| 142 |
|
---|
| 143 | The largest new feature in Python 2.0 is a new fundamental data type: Unicode
|
---|
| 144 | strings. Unicode uses 16-bit numbers to represent characters instead of the
|
---|
| 145 | 8-bit number used by ASCII, meaning that 65,536 distinct characters can be
|
---|
| 146 | supported.
|
---|
| 147 |
|
---|
| 148 | The final interface for Unicode support was arrived at through countless often-
|
---|
| 149 | stormy discussions on the python-dev mailing list, and mostly implemented by
|
---|
| 150 | Marc-André Lemburg, based on a Unicode string type implementation by Fredrik
|
---|
| 151 | Lundh. A detailed explanation of the interface was written up as :pep:`100`,
|
---|
| 152 | "Python Unicode Integration". This article will simply cover the most
|
---|
| 153 | significant points about the Unicode interfaces.
|
---|
| 154 |
|
---|
| 155 | In Python source code, Unicode strings are written as ``u"string"``. Arbitrary
|
---|
| 156 | Unicode characters can be written using a new escape sequence, ``\uHHHH``, where
|
---|
| 157 | *HHHH* is a 4-digit hexadecimal number from 0000 to FFFF. The existing
|
---|
| 158 | ``\xHHHH`` escape sequence can also be used, and octal escapes can be used for
|
---|
| 159 | characters up to U+01FF, which is represented by ``\777``.
|
---|
| 160 |
|
---|
| 161 | Unicode strings, just like regular strings, are an immutable sequence type.
|
---|
| 162 | They can be indexed and sliced, but not modified in place. Unicode strings have
|
---|
| 163 | an ``encode( [encoding] )`` method that returns an 8-bit string in the desired
|
---|
| 164 | encoding. Encodings are named by strings, such as ``'ascii'``, ``'utf-8'``,
|
---|
| 165 | ``'iso-8859-1'``, or whatever. A codec API is defined for implementing and
|
---|
| 166 | registering new encodings that are then available throughout a Python program.
|
---|
| 167 | If an encoding isn't specified, the default encoding is usually 7-bit ASCII,
|
---|
| 168 | though it can be changed for your Python installation by calling the
|
---|
| 169 | :func:`sys.setdefaultencoding(encoding)` function in a customised version of
|
---|
| 170 | :file:`site.py`.
|
---|
| 171 |
|
---|
| 172 | Combining 8-bit and Unicode strings always coerces to Unicode, using the default
|
---|
| 173 | ASCII encoding; the result of ``'a' + u'bc'`` is ``u'abc'``.
|
---|
| 174 |
|
---|
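As a rough illustration of ``encode()`` and of mixing string types, the sketch below is written so that it runs on a current Python 3 interpreter. That is an assumption of this example, not something Python 2.0 provides: on Python 3, ``encode()`` returns a ``bytes`` object rather than an 8-bit string. The variable names are illustrative only.

```python
# Unicode literals use the u"..." prefix and \uHHHH escapes.
text = u"caf\u00e9"

# encode() converts the Unicode string to the named encoding.
as_utf8 = text.encode("utf-8")          # U+00E9 becomes two bytes
as_latin1 = text.encode("iso-8859-1")   # U+00E9 becomes one byte

# Concatenating a plain literal with a Unicode literal yields Unicode.
combined = 'a' + u'bc'
```

The same four-character string produces different byte sequences depending on the encoding, which is why the encoding name has to be stated explicitly whenever the default ASCII codec cannot represent the text.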
| 175 | New built-in functions have been added, and existing built-ins modified to
|
---|
| 176 | support Unicode:
|
---|
| 177 |
|
---|
| 178 | * ``unichr(ch)`` returns a Unicode string 1 character long, containing the
|
---|
| 179 | character *ch*.
|
---|
| 180 |
|
---|
| 181 | * ``ord(u)``, where *u* is a 1-character regular or Unicode string, returns the
|
---|
| 182 | number of the character as an integer.
|
---|
| 183 |
|
---|
| 184 | * ``unicode(string [, encoding] [, errors] )`` creates a Unicode string
|
---|
| 185 | from an 8-bit string. ``encoding`` is a string naming the encoding to use. The
|
---|
| 186 | ``errors`` parameter specifies the treatment of characters that are invalid for
|
---|
| 187 | the current encoding; passing ``'strict'`` as the value causes an exception to
|
---|
| 188 | be raised on any encoding error, while ``'ignore'`` causes errors to be silently
|
---|
| 189 | ignored and ``'replace'`` uses U+FFFD, the official replacement character, in
|
---|
| 190 | case of any problems.
|
---|
| 191 |
|
---|
| 192 | * The :keyword:`exec` statement, and various built-ins such as ``eval()``,
|
---|
| 193 | ``getattr()``, and ``setattr()`` will accept Unicode strings as well as
|
---|
| 194 | regular strings. (It's possible that the process of fixing this missed some
|
---|
| 195 | built-ins; if you find a built-in function that accepts strings but doesn't
|
---|
| 196 | accept Unicode strings at all, please report it as a bug.)
|
---|
| 197 |
|
---|
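The behaviour of these built-ins can be sketched as follows. This is a hedged example rather than 2.0-era code: ``unichr()`` and ``unicode()`` exist only in Python 2, so the sketch falls back to ``chr()`` on Python 3 and uses the ``decode()`` method of a byte string (which, on later interpreters, accepts the same *encoding* and *errors* arguments) to show the three error-handling modes.

```python
# unichr() exists only in Python 2; Python 3 folds it into chr().
try:
    unichr
except NameError:
    unichr = chr

letter = unichr(0x0660)      # 1-character Unicode string
number = ord(letter)         # back to the integer 0x0660

raw = b"caf\xe9"             # 0xE9 is not a valid ASCII byte

# The three values of the errors parameter described above.
ignored = raw.decode("ascii", "ignore")     # bad byte dropped
replaced = raw.decode("ascii", "replace")   # bad byte becomes U+FFFD
try:
    raw.decode("ascii", "strict")
    strict_failed = False
except UnicodeError:
    strict_failed = True                    # 'strict' raises an exception
```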
| 198 | A new module, :mod:`unicodedata`, provides an interface to Unicode character
|
---|
| 199 | properties. For example, ``unicodedata.category(u'A')`` returns the 2-character
|
---|
| 200 | string 'Lu', the 'L' denoting it's a letter, and 'u' meaning that it's
|
---|
| 201 | uppercase. ``unicodedata.bidirectional(u'\u0660')`` returns 'AN', meaning that
|
---|
| 202 | U+0660 is an Arabic number.
|
---|
| 203 |
|
---|
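A short sketch of the two lookups mentioned above, plus ``unicodedata.decimal()`` as one further property query; the variable names are illustrative.

```python
import unicodedata

# General category: 'L' for letter, 'u' for uppercase.
category = unicodedata.category(u'A')

# Bidirectional class of U+0660: 'AN', an Arabic number.
bidi = unicodedata.bidirectional(u'\u0660')

# Decimal digit value of the same character.
digit_value = unicodedata.decimal(u'\u0660')
```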
| 204 | The :mod:`codecs` module contains functions to look up existing encodings and
|
---|
| 205 | register new ones. Unless you want to implement a new encoding, you'll most
|
---|
| 206 | often use the :func:`codecs.lookup(encoding)` function, which returns a
|
---|
| 207 | 4-element tuple: ``(encode_func, decode_func, stream_reader, stream_writer)``.
|
---|
| 208 |
|
---|
| 209 | * *encode_func* is a function that takes a Unicode string, and returns a 2-tuple
|
---|
| 210 | ``(string, length)``. *string* is an 8-bit string containing a portion (perhaps
|
---|
| 211 | all) of the Unicode string converted into the given encoding, and *length* tells
|
---|
| 212 | you how much of the Unicode string was converted.
|
---|
| 213 |
|
---|
| 214 | * *decode_func* is the opposite of *encode_func*, taking an 8-bit string and
|
---|
| 215 | returning a 2-tuple ``(ustring, length)``, consisting of the resulting Unicode
|
---|
| 216 | string *ustring* and the integer *length* telling how much of the 8-bit string
|
---|
| 217 | was consumed.
|
---|
| 218 |
|
---|
| 219 | * *stream_reader* is a class that supports decoding input from a stream.
|
---|
| 220 | *stream_reader(file_obj)* returns an object that supports the :meth:`read`,
|
---|
| 221 | :meth:`readline`, and :meth:`readlines` methods. These methods will all
|
---|
| 222 | translate from the given encoding and return Unicode strings.
|
---|
| 223 |
|
---|
| 224 | * *stream_writer*, similarly, is a class that supports encoding output to a
|
---|
| 225 | stream. *stream_writer(file_obj)* returns an object that supports the
|
---|
| 226 | :meth:`write` and :meth:`writelines` methods. These methods expect Unicode
|
---|
| 227 | strings, translating them to the given encoding on output.
|
---|
| 228 |
|
---|
| 229 | For example, the following code writes a Unicode string into a file, encoding
|
---|
| 230 | it as UTF-8::
|
---|
| 231 |
|
---|
| 232 | import codecs
|
---|
| 233 |
|
---|
| 234 | unistr = u'\u0660\u2000ab ...'
|
---|
| 235 |
|
---|
| 236 | (UTF8_encode, UTF8_decode,
|
---|
| 237 | UTF8_streamreader, UTF8_streamwriter) = codecs.lookup('UTF-8')
|
---|
| 238 |
|
---|
| 239 | output = UTF8_streamwriter( open( '/tmp/output', 'wb') )
|
---|
| 240 | output.write( unistr )
|
---|
| 241 | output.close()
|
---|
| 242 |
|
---|
| 243 | The following code would then read UTF-8 input from the file::
|
---|
| 244 |
|
---|
| 245 | input = UTF8_streamreader( open( '/tmp/output', 'rb') )
|
---|
| 246 | print repr(input.read())
|
---|
| 247 | input.close()
|
---|
| 248 |
|
---|
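The stream classes are shown above; the two plain functions returned by ``codecs.lookup()`` can be sketched in the same way. The example assumes a current Python 3 interpreter, where the 8-bit side is a ``bytes`` object and the returned value is still unpackable as a 4-element tuple. Variable names are illustrative.

```python
import codecs

encode_func, decode_func, stream_reader, stream_writer = codecs.lookup('utf-8')

# encode_func returns (encoded string, number of characters consumed).
encoded, chars_consumed = encode_func(u'\u0660ab')

# decode_func returns (Unicode string, number of bytes consumed).
decoded, bytes_consumed = decode_func(encoded)
```

The two lengths differ here because U+0660 occupies two bytes in UTF-8: three characters go in, four bytes come out.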
| 249 | Unicode-aware regular expressions are available through the :mod:`re` module,
|
---|
| 250 | which has a new underlying implementation called SRE written by Fredrik Lundh of
|
---|
| 251 | Secret Labs AB.
|
---|
| 252 |
|
---|
| 253 | A ``-U`` command line option was added which causes the Python compiler to
|
---|
| 254 | interpret all string literals as Unicode string literals. This is intended to be
|
---|
| 255 | used in testing and future-proofing your Python code, since some future version
|
---|
| 256 | of Python may drop support for 8-bit strings and provide only Unicode strings.
|
---|
| 257 |
|
---|
| 258 | .. ======================================================================
|
---|
| 259 |
|
---|
| 260 |
|
---|
| 261 | List Comprehensions
|
---|
| 262 | ===================
|
---|
| 263 |
|
---|
| 264 | Lists are a workhorse data type in Python, and many programs manipulate a list
|
---|
| 265 | at some point. Two common operations on lists are to loop over them, and either
|
---|
| 266 | pick out the elements that meet a certain criterion, or apply some function to
|
---|
| 267 | each element. For example, given a list of strings, you might want to pull out
|
---|
| 268 | all the strings containing a given substring, or strip off trailing whitespace
|
---|
| 269 | from each line.
|
---|
| 270 |
|
---|
| 271 | The existing :func:`map` and :func:`filter` functions can be used for this
|
---|
| 272 | purpose, but they require a function as one of their arguments. This is fine if
|
---|
| 273 | there's an existing built-in function that can be passed directly, but if there
|
---|
| 274 | isn't, you have to create a little function to do the required work, and
|
---|
| 275 | Python's scoping rules make the result ugly if the little function needs
|
---|
| 276 | additional information. Take the first example in the previous paragraph,
|
---|
| 277 | finding all the strings in the list containing a given substring. You could
|
---|
| 278 | write the following to do it::
|
---|
| 279 |
|
---|
| 280 | # Given the list L, make a list of all strings
|
---|
| 281 | # containing the substring S.
|
---|
| 282 | sublist = filter( lambda s, substring=S:
|
---|
| 283 | string.find(s, substring) != -1,
|
---|
| 284 | L)
|
---|
| 285 |
|
---|
| 286 | Because of Python's scoping rules, a default argument is used so that the
|
---|
| 287 | anonymous function created by the :keyword:`lambda` statement knows what
|
---|
| 288 | substring is being searched for. List comprehensions make this cleaner::
|
---|
| 289 |
|
---|
| 290 | sublist = [ s for s in L if string.find(s, S) != -1 ]
|
---|
| 291 |
|
---|
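Both operations mentioned at the start of this section, filtering and applying a function to each element, can be sketched with string methods instead of the :mod:`string` module. The sample data and names below are made up for illustration.

```python
lines = ['spam  \n', 'eggs\t', 'spam and eggs']
wanted = 'spam'

# Filtering: keep only the strings containing the substring.
matching = [s for s in lines if s.find(wanted) != -1]

# Mapping: strip trailing whitespace from every string.
stripped = [s.rstrip() for s in lines]
```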
| 292 | List comprehensions have the form::
|
---|
| 293 |
|
---|
| 294 | [ expression for expr1 in sequence1
|
---|
| 295 | for expr2 in sequence2 ...
|
---|
| 296 | for exprN in sequenceN
|
---|
| 297 | if condition ]
|
---|
| 298 |
|
---|
| 299 | The :keyword:`for`...\ :keyword:`in` clauses contain the sequences to be
|
---|
| 300 | iterated over. The sequences do not have to be the same length, because they
|
---|
| 301 | are *not* iterated over in parallel, but from left to right; this is explained
|
---|
| 302 | more clearly in the following paragraphs. The elements of the generated list
|
---|
| 303 | will be the successive values of *expression*. The final :keyword:`if` clause
|
---|
| 304 | is optional; if present, *expression* is only evaluated and added to the result
|
---|
| 305 | if *condition* is true.
|
---|
| 306 |
|
---|
| 307 | To make the semantics very clear, a list comprehension is equivalent to the
|
---|
| 308 | following Python code::
|
---|
| 309 |
|
---|
| 310 | for expr1 in sequence1:
|
---|
| 311 |     for expr2 in sequence2:
|
---|
| 312 |     ...
|
---|
| 313 |         for exprN in sequenceN:
|
---|
| 314 |             if (condition):
|
---|
| 315 |                 # Append the value of
|
---|
| 316 |                 # the expression to the
|
---|
| 317 |                 # resulting list.
|
---|
| 318 |
|
---|
| 319 | This means that when there are multiple :keyword:`for`...\ :keyword:`in`
|
---|
| 320 | clauses and no :keyword:`if` filter, the length of the resulting list will be equal to the product of the lengths of all
|
---|
| 321 | the sequences. If you have two lists of length 3, the output list is 9 elements
|
---|
| 322 | long::
|
---|
| 323 |
|
---|
| 324 | seq1 = 'abc'
|
---|
| 325 | seq2 = (1,2,3)
|
---|
| 326 | >>> [ (x,y) for x in seq1 for y in seq2]
|
---|
| 327 | [('a', 1), ('a', 2), ('a', 3), ('b', 1), ('b', 2), ('b', 3), ('c', 1),
|
---|
| 328 | ('c', 2), ('c', 3)]
|
---|
| 329 |
|
---|
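The equivalence with nested loops can be checked directly. The sketch below adds an arbitrary filter, ``y != 2``, purely for illustration, and builds the same list both ways.

```python
seq1 = 'abc'
seq2 = (1, 2, 3)

# Comprehension with two for clauses and a filter.
pairs = [(x, y) for x in seq1 for y in seq2 if y != 2]

# The same result built with explicit nested loops.
expected = []
for x in seq1:
    for y in seq2:
        if y != 2:
            expected.append((x, y))
```

With the filter in place the result has 6 elements rather than 9, since the condition is tested once for every combination of ``x`` and ``y``.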
| 330 | To avoid introducing an ambiguity into Python's grammar, if *expression* is
|
---|
| 331 | creating a tuple, it must be surrounded with parentheses. The first list
|
---|
| 332 | comprehension below is a syntax error, while the second one is correct::
|
---|
| 333 |
|
---|
| 334 | # Syntax error
|
---|
| 335 | [ x,y for x in seq1 for y in seq2]
|
---|
| 336 | # Correct
|
---|
| 337 | [ (x,y) for x in seq1 for y in seq2]
|
---|
| 338 |
|
---|
| 339 | The idea of list comprehensions originally comes from the functional programming
|
---|
| 340 | language Haskell (http://www.haskell.org). Greg Ewing argued most effectively
|
---|
| 341 | for adding them to Python and wrote the initial list comprehension patch, which
|
---|
| 342 | was then discussed for a seemingly endless time on the python-dev mailing list
|
---|
| 343 | and kept up-to-date by Skip Montanaro.
|
---|
| 344 |
|
---|
| 345 | .. ======================================================================
|
---|
| 346 |
|
---|
| 347 |
|
---|
| 348 | Augmented Assignment
|
---|
| 349 | ====================
|
---|
| 350 |
|
---|
| 351 | Augmented assignment operators, another long-requested feature, have been added
|
---|
| 352 | to Python 2.0. Augmented assignment operators include ``+=``, ``-=``, ``*=``,
|
---|
| 353 | and so forth. For example, the statement ``a += 2`` increments the value of the
|
---|
| 354 | variable ``a`` by 2, equivalent to the slightly lengthier ``a = a + 2``.
|
---|
| 355 |
|
---|
| 356 | The full list of supported assignment operators is ``+=``, ``-=``, ``*=``,
|
---|
| 357 | ``/=``, ``%=``, ``**=``, ``&=``, ``|=``, ``^=``, ``>>=``, and ``<<=``. Python
|
---|
| 358 | classes can override the augmented assignment operators by defining methods
|
---|
| 359 | named :meth:`__iadd__`, :meth:`__isub__`, etc. For example, the following
|
---|
| 360 | :class:`Number` class stores a number and supports using += to create a new
|
---|
| 361 | instance with an incremented value.
|
---|
| 362 |
|
---|
| 363 | .. The empty groups below prevent conversion to guillemets.
|
---|
| 364 |
|
---|
| 365 | ::
|
---|
| 366 |
|
---|
| 367 | class Number:
|
---|
| 368 |     def __init__(self, value):
|
---|
| 369 |         self.value = value
|
---|
| 370 |     def __iadd__(self, increment):
|
---|
| 371 |         return Number( self.value + increment)
|
---|
| 372 |
|
---|
| 373 | n = Number(5)
|
---|
| 374 | n += 3
|
---|
| 375 | print n.value
|
---|
| 376 |
|
---|
| 377 | The :meth:`__iadd__` special method is called with the value of the increment,
|
---|
| 378 | and should return a new instance with an appropriately modified value; this
|
---|
| 379 | return value is bound as the new value of the variable on the left-hand side.
|
---|
| 380 |
|
---|
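The rebinding behaviour can be made visible by keeping a second reference to the original object. The sketch below reuses the ``Number`` class and adds a hypothetical ``Fallback`` class that defines only ``__add__``, to show that ``+=`` falls back to the ordinary addition method when ``__iadd__`` is missing.

```python
class Number:
    def __init__(self, value):
        self.value = value
    def __iadd__(self, increment):
        # Return a new instance; += binds it to the left-hand name.
        return Number(self.value + increment)

class Fallback:
    # Hypothetical class without __iadd__; += uses __add__ instead.
    def __init__(self, value):
        self.value = value
    def __add__(self, other):
        return Fallback(self.value + other)

n = Number(5)
original = n          # second reference to the first instance
n += 3                # n is rebound to the object __iadd__ returned

f = Fallback(1)
f += 2                # evaluated as f = f.__add__(2)
```

After this runs, ``original`` still holds the untouched instance with value 5, while ``n`` refers to a different object with value 8.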
| 381 | Augmented assignment operators were first introduced in the C programming
|
---|
| 382 | language, and most C-derived languages, such as :program:`awk`, C++, Java, Perl,
|
---|
| 383 | and PHP also support them. The augmented assignment patch was implemented by
|
---|
| 384 | Thomas Wouters.
|
---|
| 385 |
|
---|
| 386 | .. ======================================================================
|
---|
| 387 |
|
---|
| 388 |
|
---|
| 389 | String Methods
|
---|
| 390 | ==============
|
---|
| 391 |
|
---|
| 392 | Until now, string-manipulation functionality was in the :mod:`string` module,
|
---|
| 393 | which was usually a front-end for the :mod:`strop` module written in C. The
|
---|
| 394 | addition of Unicode posed a difficulty for the :mod:`strop` module, because the
|
---|
| 395 | functions would all need to be rewritten in order to accept either 8-bit or
|
---|
| 396 | Unicode strings. For functions such as :func:`string.replace`, which takes 3
|
---|
| 397 | string arguments, that means eight possible permutations, and correspondingly
|
---|
| 398 | complicated code.
|
---|
| 399 |
|
---|
| 400 | Instead, Python 2.0 pushes the problem onto the string type, making string
|
---|
| 401 | manipulation functionality available through methods on both 8-bit strings and
|
---|
| 402 | Unicode strings. ::
|
---|
| 403 |
|
---|
| 404 | >>> 'andrew'.capitalize()
|
---|
| 405 | 'Andrew'
|
---|
| 406 | >>> 'hostname'.replace('os', 'linux')
|
---|
| 407 | 'hlinuxtname'
|
---|
| 408 | >>> 'moshe'.find('sh')
|
---|
| 409 | 2
|
---|
| 410 |
|
---|
| 411 | One thing that hasn't changed, a noteworthy April Fools' joke notwithstanding,
|
---|
| 412 | is that Python strings are immutable. Thus, the string methods return new
|
---|
| 413 | strings, and do not modify the string on which they operate.
|
---|
| 414 |
|
---|
| 415 | The old :mod:`string` module is still around for backwards compatibility, but it
|
---|
| 416 | mostly acts as a front-end to the new string methods.
|
---|
| 417 |
|
---|
| 418 | Two methods which have no parallel in pre-2.0 versions, although they did exist
|
---|
| 419 | in JPython for quite some time, are :meth:`startswith` and :meth:`endswith`.
|
---|
| 420 | ``s.startswith(t)`` is equivalent to ``s[:len(t)] == t``, while
|
---|
| 421 | ``s.endswith(t)`` is equivalent to ``s[-len(t):] == t``.
|
---|
| 422 |
|
---|
| 423 | One other method which deserves special mention is :meth:`join`. The
|
---|
| 424 | :meth:`join` method of a string receives one parameter, a sequence of strings,
|
---|
| 425 | and is equivalent to the :func:`string.join` function from the old :mod:`string`
|
---|
| 426 | module, with the arguments reversed. In other words, ``s.join(seq)`` is
|
---|
| 427 | equivalent to the old ``string.join(seq, s)``.
|
---|
| 428 |
|
---|
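The equivalences stated above can be sketched as follows; the sample string and variable names are illustrative.

```python
s = 'whatsnew20.tex'

# startswith() and endswith() compared with the slicing idioms.
starts = s.startswith('whats')
starts_by_slice = s[:len('whats')] == 'whats'
ends = s.endswith('.tex')
ends_by_slice = s[-len('.tex'):] == '.tex'

# join(): the separator is the object, the sequence is the argument.
joined = ', '.join(['a', 'b', 'c'])
```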
| 429 | .. ======================================================================
|
---|
| 430 |
|
---|
| 431 |
|
---|
| 432 | Garbage Collection of Cycles
|
---|
| 433 | ============================
|
---|
| 434 |
|
---|
| 435 | The C implementation of Python uses reference counting to implement garbage
|
---|
| 436 | collection. Every Python object maintains a count of the number of references
|
---|
| 437 | pointing to itself, and adjusts the count as references are created or
|
---|
| 438 | destroyed. Once the reference count reaches zero, the object is no longer
|
---|
| 439 | accessible, since you need to have a reference to an object to access it, and if
|
---|
| 440 | the count is zero, no references exist any longer.
|
---|
| 441 |
|
---|
| 442 | Reference counting has some pleasant properties: it's easy to understand and
|
---|
| 443 | implement, and the resulting implementation is portable, fairly fast, and interacts
|
---|
| 444 | well with other libraries that implement their own memory handling schemes. The
|
---|
| 445 | major problem with reference counting is that it sometimes doesn't realise that
|
---|
| 446 | objects are no longer accessible, resulting in a memory leak. This happens when
|
---|
| 447 | there are cycles of references.
|
---|
| 448 |
|
---|
| 449 | Consider the simplest possible cycle, a class instance which has a reference to
|
---|
| 450 | itself::
|
---|
| 451 |
|
---|
| 452 | instance = SomeClass()
|
---|
| 453 | instance.myself = instance
|
---|
| 454 |
|
---|
| 455 | After the above two lines of code have been executed, the reference count of
|
---|
| 456 | ``instance`` is 2; one reference is from the variable named ``'instance'``, and
|
---|
| 457 | the other is from the ``myself`` attribute of the instance.
|
---|
| 458 |
|
---|
| 459 | If the next line of code is ``del instance``, what happens? The reference count
|
---|
| 460 | of ``instance`` is decreased by 1, so it has a reference count of 1; the
|
---|
| 461 | reference in the ``myself`` attribute still exists. Yet the instance is no
|
---|
| 462 | longer accessible through Python code, and it could be deleted. Several objects
|
---|
| 463 | can participate in a cycle if they have references to each other, causing all of
|
---|
| 464 | the objects to be leaked.
|
---|
| 465 |
|
---|
| 466 | Python 2.0 fixes this problem by periodically executing a cycle detection
|
---|
| 467 | algorithm which looks for inaccessible cycles and deletes the objects involved.
|
---|
| 468 | A new :mod:`gc` module provides functions to perform a garbage collection,
|
---|
| 469 | obtain debugging statistics, and tune the collector's parameters.
|
---|
| 470 |
|
---|
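The self-referencing instance from earlier in this section can be used to watch the collector at work. This is a sketch under two assumptions: it uses the :mod:`weakref` module, which is not part of Python 2.0 and is used here only to observe whether the object is still alive, and it turns automatic collection off temporarily so the result is deterministic.

```python
import gc
import weakref

class SomeClass(object):
    pass

gc.disable()                      # keep automatic collections out of the way
instance = SomeClass()
instance.myself = instance        # the instance now refers to itself
probe = weakref.ref(instance)     # observes the object without owning it

del instance                      # reference count drops to 1, not 0
alive_after_del = probe() is not None

found = gc.collect()              # run the cycle detector explicitly
alive_after_collect = probe() is not None
gc.enable()
```

Reference counting alone leaves the object alive after ``del``; only the explicit ``gc.collect()`` call reclaims it, and its return value reports how many unreachable objects were found.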
| 471 | Running the cycle detection algorithm takes some time, and therefore will result
|
---|
| 472 | in some additional overhead. It is hoped that after we've gotten experience
|
---|
| 473 | with the cycle collection from using 2.0, Python 2.1 will be able to minimize
|
---|
| 474 | the overhead with careful tuning. It's not yet obvious how much performance is
|
---|
| 475 | lost, because benchmarking this is tricky and depends crucially on how often the
|
---|
| 476 | program creates and destroys objects. The detection of cycles can be disabled
|
---|
| 477 | when Python is compiled, if you can't afford even a tiny speed penalty or
|
---|
| 478 | suspect that the cycle collection is buggy, by specifying the
|
---|
| 479 | :option:`--without-cycle-gc` switch when running the :program:`configure`
|
---|
| 480 | script.
|
---|
| 481 |
|
---|
| 482 | Several people tackled this problem and contributed to a solution. An early
|
---|
| 483 | implementation of the cycle detection approach was written by Toby Kelsey. The
|
---|
| 484 | current algorithm was suggested by Eric Tiedemann during a visit to CNRI, and
|
---|
| 485 | Guido van Rossum and Neil Schemenauer wrote two different implementations, which
|
---|
| 486 | were later integrated by Neil. Lots of other people offered suggestions along
|
---|
| 487 | the way; the March 2000 archives of the python-dev mailing list contain most of
|
---|
| 488 | the relevant discussion, especially in the threads titled "Reference cycle
|
---|
| 489 | collection for Python" and "Finalization again".
|
---|
| 490 |
|
---|
| 491 | .. ======================================================================
|
---|
| 492 |
|
---|
| 493 |
|
---|
| 494 | Other Core Changes
|
---|
| 495 | ==================
|
---|
| 496 |
|
---|
| 497 | Various minor changes have been made to Python's syntax and built-in functions.
|
---|
| 498 | None of the changes are very far-reaching, but they're handy conveniences.
|
---|
| 499 |
|
---|
| 500 |
|
---|
| 501 | Minor Language Changes
|
---|
| 502 | ----------------------
|
---|
| 503 |
|
---|
| 504 | A new syntax makes it more convenient to call a given function with a tuple of
|
---|
| 505 | arguments and/or a dictionary of keyword arguments. In Python 1.5 and earlier,
|
---|
| 506 | you'd use the :func:`apply` built-in function: ``apply(f, args, kw)`` calls the
|
---|
| 507 | function :func:`f` with the argument tuple *args* and the keyword arguments in
|
---|
| 508 | the dictionary *kw*. :func:`apply` is the same in 2.0, but thanks to a patch
|
---|
| 509 | from Greg Ewing, ``f(*args, **kw)`` is a shorter and clearer way to achieve the
|
---|
| 510 | same effect. This syntax is symmetrical with the syntax for defining
|
---|
| 511 | functions::
|
---|
| 512 |
|
---|
| 513 | def f(*args, **kw):
|
---|
| 514 |     # args is a tuple of positional args,
|
---|
| 515 |     # kw is a dictionary of keyword args
|
---|
| 516 |     ...
|
---|
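As a minimal sketch of the calling side (the ``describe`` function and its arguments are hypothetical names chosen for illustration, and the snippet is written so that it also runs on a current Python 3 interpreter), a tuple and a dictionary can be unpacked directly in the call:

```python
def describe(*args, **kw):
    # args is a tuple of positional args, kw is a dictionary of keyword args
    return len(args), sorted(kw.keys())

args = (1, 2, 3)
kw = {"sep": "-", "end": ""}

# Unpack the tuple and the dictionary in the call itself,
# rather than writing apply(describe, args, kw)
result = describe(*args, **kw)
```

Here ``result`` holds the number of positional arguments and the sorted keyword names, which is what the equivalent :func:`apply` call would have produced.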
| 517 |
|
---|
| 518 | The :keyword:`print` statement can now have its output directed to a file-like
|
---|
| 519 | object by following the :keyword:`print` with ``>> file``, similar to the
|
---|
| 520 | redirection operator in Unix shells. Previously you'd either have to use the
|
---|
| 521 | :meth:`write` method of the file-like object, which lacks the convenience and
|
---|
| 522 | simplicity of :keyword:`print`, or you could assign a new value to
|
---|
| 523 | ``sys.stdout`` and then restore the old value. For sending output to standard
|
---|
| 524 | error, it's much easier to write this::
|
---|
| 525 |
|
---|
| 526 | print >> sys.stderr, "Warning: action field not supplied"
|
---|
| 527 |
|
---|
| 528 | Modules can now be renamed on importing them, using the syntax ``import module
|
---|
| 529 | as name`` or ``from module import name as othername``. The patch was submitted
|
---|
| 530 | by Thomas Wouters.
|
---|
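A short sketch of both forms, using :mod:`os.path` purely as a convenient standard module (the aliases ``osp`` and ``pjoin`` are arbitrary names picked for this example):

```python
# Bind the module under a shorter name
import os.path as osp

# Bind a single function under a different name
from os.path import join as pjoin

joined = pjoin("spam", "eggs")
same = osp.join("spam", "eggs")
```

Both names refer to the same underlying function, so ``joined`` and ``same`` are equal.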
| 531 |
|
---|
| 532 | A new format style is available when using the ``%`` operator; '%r' will insert
|
---|
| 533 | the :func:`repr` of its argument. This was also added from symmetry
|
---|
| 534 | considerations, this time for symmetry with the existing '%s' format style,
|
---|
| 535 | which inserts the :func:`str` of its argument. For example, ``'%r %s' % ('abc',
|
---|
| 536 | 'abc')`` returns a string containing ``'abc' abc``.
|
---|
| 537 |
|
---|
| 538 | Previously there was no way to implement a class that overrode Python's built-in
|
---|
| 539 | :keyword:`in` operator and implemented a custom version. ``obj in seq`` returns
|
---|
| 540 | true if *obj* is present in the sequence *seq*; Python computes this by simply
|
---|
| 541 | trying every index of the sequence until either *obj* is found or an
|
---|
| 542 | :exc:`IndexError` is encountered. Moshe Zadka contributed a patch which adds a
|
---|
| 543 | :meth:`__contains__` magic method for providing a custom implementation for
|
---|
| 544 | :keyword:`in`. Additionally, new built-in objects written in C can define what
|
---|
| 545 | :keyword:`in` means for them via a new slot in the sequence protocol.
|
---|
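For instance, a class can answer :keyword:`in` tests without storing any elements at all. The following is only an illustrative sketch; ``EvenNumbers`` is a hypothetical class, not something shipped with Python:

```python
class EvenNumbers:
    """Hypothetical container: every even integer counts as a member."""

    def __contains__(self, obj):
        # Called for 'obj in instance'; no indexing loop is needed
        return isinstance(obj, int) and obj % 2 == 0

evens = EvenNumbers()
has_four = 4 in evens   # True
has_five = 5 in evens   # False
```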
| 546 |
|
---|
| 547 | Earlier versions of Python used a recursive algorithm for deleting objects.
|
---|
| 548 | Deeply nested data structures could cause the interpreter to fill up the C stack
|
---|
| 549 | and crash; Christian Tismer rewrote the deletion logic to fix this problem. On
|
---|
| 550 | a related note, comparing recursive objects recursed infinitely and crashed;
|
---|
| 551 | Jeremy Hylton rewrote the code to no longer crash, producing a useful result
|
---|
| 552 | instead. For example, after this code::
|
---|
| 553 |
|
---|
| 554 | a = []
|
---|
| 555 | b = []
|
---|
| 556 | a.append(a)
|
---|
| 557 | b.append(b)
|
---|
| 558 |
|
---|
| 559 | The comparison ``a==b`` returns true, because the two recursive data structures
|
---|
| 560 | are isomorphic. See the thread "trashcan and PR#7" in the April 2000 archives of
|
---|
| 561 | the python-dev mailing list for the discussion leading up to this
|
---|
| 562 | implementation, and some useful relevant links. Note that comparisons can now
|
---|
| 563 | also raise exceptions. In earlier versions of Python, a comparison operation
|
---|
| 564 | such as ``cmp(a,b)`` would always produce an answer, even if a user-defined
|
---|
| 565 | :meth:`__cmp__` method encountered an error, since the resulting exception would
|
---|
| 566 | simply be silently swallowed.
|
---|
| 567 |
|
---|
| 568 | .. Starting URL:
|
---|
| 569 | .. http://www.python.org/pipermail/python-dev/2000-April/004834.html
|
---|
| 570 |
|
---|
| 571 | Work has been done on porting Python to 64-bit Windows on the Itanium processor,
|
---|
| 572 | mostly by Trent Mick of ActiveState. (Confusingly, ``sys.platform`` is still
|
---|
| 573 | ``'win32'`` on Win64 because it seems that for ease of porting, MS Visual C++
|
---|
| 574 | treats code as 32 bit on Itanium.) PythonWin also supports Windows CE; see the
|
---|
| 575 | Python CE page at http://pythonce.sourceforge.net/ for more information.
|
---|
| 576 |
|
---|
| 577 | Another new platform is Darwin/MacOS X; initial support for it is in Python 2.0.
|
---|
| 578 | Dynamic loading works, if you specify "configure --with-dyld --with-suffix=.x".
|
---|
| 579 | Consult the README in the Python source distribution for more instructions.
|
---|
| 580 |
|
---|
| 581 | An attempt has been made to alleviate one of Python's warts, the often-confusing
|
---|
| 582 | :exc:`NameError` exception when code refers to a local variable before the
|
---|
| 583 | variable has been assigned a value. For example, the following code raises an
|
---|
| 584 | exception on the :keyword:`print` statement in both 1.5.2 and 2.0; in 1.5.2 a
|
---|
| 585 | :exc:`NameError` exception is raised, while 2.0 raises a new
|
---|
| 586 | :exc:`UnboundLocalError` exception. :exc:`UnboundLocalError` is a subclass of
|
---|
| 587 | :exc:`NameError`, so any existing code that expects :exc:`NameError` to be
|
---|
| 588 | raised should still work. ::
|
---|
| 589 |
|
---|
| 590 | def f():
|
---|
| 591 |     print "i=",i
|
---|
| 592 |     i = i + 1
|
---|
| 593 | f()
|
---|
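The compatibility claim can be checked with a small sketch. It uses the ``except ... as`` spelling so that it runs on a current Python 3 interpreter, which is an assumption of this example rather than 2.0 syntax; the function body is a variation on the one above:

```python
def f():
    # 'i' is local because it is assigned below, so this read fails
    value = i
    i = value + 1

try:
    f()
    caught = None
except NameError as exc:
    # Handlers written for NameError still catch the new exception
    caught = type(exc).__name__
```

After this runs, ``caught`` is ``'UnboundLocalError'``, even though the handler only names :exc:`NameError`.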
| 594 |
|
---|
| 595 | Two new exceptions, :exc:`TabError` and :exc:`IndentationError`, have been
|
---|
| 596 | introduced. They're both subclasses of :exc:`SyntaxError`, and are raised when
|
---|
| 597 | Python code is found to be improperly indented.
|
---|
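Because both are subclasses of :exc:`SyntaxError`, existing handlers keep working. A rough sketch, assuming a current Python 3 interpreter and a deliberately mis-indented source string invented for the example:

```python
# The body of f() is not indented, so compilation must fail
bad_source = "def f():\nreturn 1\n"

try:
    compile(bad_source, "<example>", "exec")
    error_name = None
except SyntaxError as exc:
    # The more specific subclass is what actually gets raised
    error_name = type(exc).__name__
```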
| 598 |
|
---|
| 599 |
|
---|
| 600 | Changes to Built-in Functions
|
---|
| 601 | -----------------------------
|
---|
| 602 |
|
---|
| 603 | A new built-in, :func:`zip(seq1, seq2, ...)`, has been added. :func:`zip`
|
---|
| 604 | returns a list of tuples where each tuple contains the i-th element from each of
|
---|
| 605 | the argument sequences. The difference between :func:`zip` and ``map(None,
|
---|
| 606 | seq1, seq2)`` is that :func:`map` pads the sequences with ``None`` if the
|
---|
| 607 | sequences aren't all of the same length, while :func:`zip` truncates the
|
---|
| 608 | returned list to the length of the shortest argument sequence.
|
---|
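The difference is easiest to see side by side. This sketch is written for a current Python 3 interpreter, where :func:`zip` returns an iterator (hence the ``list()`` call) and ``map(None, ...)`` no longer exists; ``itertools.zip_longest`` is used here only as a stand-in for the padding behaviour described above:

```python
from itertools import zip_longest

names = ["a", "b", "c"]
nums = [1, 2]

# zip stops at the shortest argument
truncated = list(zip(names, nums))

# Padding with None, as map(None, names, nums) did in 2.0
padded = list(zip_longest(names, nums))
```

``truncated`` has two tuples, while ``padded`` has three, the last one being ``('c', None)``.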
| 609 |
|
---|
| 610 | The :func:`int` and :func:`long` functions now accept an optional "base"
|
---|
| 611 | parameter when the first argument is a string. ``int('123', 10)`` returns 123,
|
---|
| 612 | while ``int('123', 16)`` returns 291. ``int(123, 16)`` raises a
|
---|
| 613 | :exc:`TypeError` exception with the message "can't convert non-string with
|
---|
| 614 | explicit base".
|
---|
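A quick sketch of the three cases just described; the exact wording of the error message may differ between Python versions, so the example only checks the exception type:

```python
decimal_value = int("123", 10)   # string parsed in base 10
hex_value = int("123", 16)       # same digits parsed in base 16

try:
    int(123, 16)                 # a non-string with an explicit base
    rejected = False
except TypeError:
    rejected = True
```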
| 615 |
|
---|
| 616 | A new variable holding more detailed version information has been added to the
|
---|
| 617 | :mod:`sys` module. ``sys.version_info`` is a tuple ``(major, minor, micro,
|
---|
| 618 | level, serial)``. For example, in a hypothetical 2.0.1beta1, ``sys.version_info``
|
---|
| 619 | would be ``(2, 0, 1, 'beta', 1)``. *level* is a string such as ``"alpha"``,
|
---|
| 620 | ``"beta"``, or ``"final"`` for a final release.
|
---|
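One plausible use, sketched here with a made-up variable name, is to test for a minimum version by comparing tuples instead of parsing the ``sys.version`` string:

```python
import sys

# The five fields can be unpacked like any other tuple
major, minor, micro, level, serial = sys.version_info

# Tuple comparison works element by element
has_2_0_features = sys.version_info[:2] >= (2, 0)
```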
| 621 |
|
---|
| 622 | Dictionaries have an odd new method, :meth:`setdefault(key, default)`, which
|
---|
| 623 | behaves similarly to the existing :meth:`get` method. However, if the key is
|
---|
| 624 | missing, :meth:`setdefault` both returns the value of *default* as :meth:`get`
|
---|
| 625 | would do, and also inserts it into the dictionary as the value for *key*. Thus,
|
---|
| 626 | the following lines of code::
|
---|
| 627 |
|
---|
| 628 | if dict.has_key( key ): return dict[key]
|
---|
| 629 | else:
|
---|
| 630 |     dict[key] = []
|
---|
| 631 |     return dict[key]
|
---|
| 632 |
|
---|
| 633 | can be reduced to a single ``return dict.setdefault(key, [])`` statement.
|
---|
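A common use is building a dictionary of lists. The word list below is invented for the example:

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

index = {}
for word in words:
    # Fetch the list for this initial letter, creating it on first use
    index.setdefault(word[0], []).append(word)
```

Each key ends up mapped to the list of words that start with that letter, without any explicit test for a missing key.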
| 634 |
|
---|
| 635 | The interpreter sets a maximum recursion depth in order to catch runaway
|
---|
| 636 | recursion before filling the C stack and causing a core dump or GPF.
|
---|
| 637 | Previously this limit was fixed when you compiled Python, but in 2.0 the maximum
|
---|
| 638 | recursion depth can be read and modified using :func:`sys.getrecursionlimit` and
|
---|
| 639 | :func:`sys.setrecursionlimit`. The default value is 1000, and a rough maximum
|
---|
| 640 | value for a given platform can be found by running a new script,
|
---|
| 641 | :file:`Misc/find_recursionlimit.py`.
|
---|
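A minimal sketch of reading and adjusting the limit; the increment of 500 is an arbitrary value chosen for illustration, not a recommendation:

```python
import sys

old_limit = sys.getrecursionlimit()

# Temporarily allow somewhat deeper recursion
sys.setrecursionlimit(old_limit + 500)
raised_limit = sys.getrecursionlimit()

# Restore the previous setting when done
sys.setrecursionlimit(old_limit)
```

Setting the limit too high for the platform can still crash the interpreter, which is why restoring the old value afterwards is a sensible habit.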
| 642 |
|
---|
| 643 | .. ======================================================================
|
---|
| 644 |
|
---|
| 645 |
|
---|
| 646 | Porting to 2.0
|
---|
| 647 | ==============
|
---|
| 648 |
|
---|
| 649 | New Python releases try hard to be compatible with previous releases, and the
|
---|
| 650 | record has been pretty good. However, some changes are considered useful
|
---|
| 651 | enough, usually because they fix initial design decisions that turned out to be
|
---|
| 652 | actively mistaken, that breaking backward compatibility can't always be avoided.
|
---|
| 653 | This section lists the changes in Python 2.0 that may cause old Python code to
|
---|
| 654 | break.
|
---|
| 655 |
|
---|
| 656 | The change which will probably break the most code is tightening up the
|
---|
| 657 | arguments accepted by some methods. Some methods would take multiple arguments
|
---|
| 658 | and treat them as a tuple, particularly various list methods such as
|
---|
| 659 | :meth:`append` and :meth:`insert`. In earlier versions of Python, if ``L`` is
|
---|
| 660 | a list, ``L.append( 1,2 )`` appends the tuple ``(1,2)`` to the list. In Python
|
---|
| 661 | 2.0 this causes a :exc:`TypeError` exception to be raised, with the message:
|
---|
| 662 | 'append requires exactly 1 argument; 2 given'. The fix is to simply add an
|
---|
| 663 | extra set of parentheses to pass both values as a tuple: ``L.append( (1,2) )``.
|
---|
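The stricter behaviour and the fix can be sketched as follows; the exact error text varies between Python versions, so only the exception type is checked:

```python
L = []

try:
    L.append(1, 2)       # two arguments: rejected since 2.0
    accepted = True
except TypeError:
    accepted = False

L.append((1, 2))         # one argument that happens to be a tuple
```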
| 664 |
|
---|
| 665 | The earlier versions of these methods were more forgiving because they used an
|
---|
| 666 | old function in Python's C interface to parse their arguments; 2.0 modernizes
|
---|
| 667 | them to use :func:`PyArg_ParseTuple`, the current argument parsing function,
|
---|
| 668 | which provides more helpful error messages and treats multi-argument calls as
|
---|
| 669 | errors. If you absolutely must use 2.0 but can't fix your code, you can edit
|
---|
| 670 | :file:`Objects/listobject.c` and define the preprocessor symbol
|
---|
| 671 | ``NO_STRICT_LIST_APPEND`` to preserve the old behaviour; this isn't recommended.
|
---|
| 672 |
|
---|
| 673 | Some of the functions in the :mod:`socket` module are still forgiving in this
|
---|
| 674 | way. For example, :func:`socket.connect( ('hostname', 25) )` is the correct
|
---|
| 675 | form, passing a single tuple of host name and port, but :func:`socket.connect(
|
---|
| 676 | 'hostname', 25 )` also works. :func:`socket.connect_ex` and :func:`socket.bind`
|
---|
| 677 | are similarly easy-going. 2.0alpha1 tightened these functions up, but because
|
---|
| 678 | the documentation actually used the erroneous multiple argument form, many
|
---|
| 679 | people wrote code which would break with the stricter checking. GvR backed out
|
---|
| 680 | the changes in the face of public reaction, so for the :mod:`socket` module, the
|
---|
| 681 | documentation was fixed and the multiple argument form is simply marked as
|
---|
| 682 | deprecated; it *will* be tightened up again in a future Python version.
|
---|
| 683 |
|
---|
| 684 | The ``\x`` escape in string literals now takes exactly 2 hex digits. Previously
|
---|
| 685 | it would consume all the hex digits following the 'x' and take the lowest 8 bits
|
---|
| 686 | of the result, so ``\x123456`` was equivalent to ``\x56``.
|
---|
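Under the new rule, the remaining digits are ordinary characters. A small sketch using the same literal:

```python
s = "\x123456"

# Only '12' belongs to the escape; '3456' is plain text
first = ord(s[0])
rest = s[1:]
```

The string is therefore five characters long: the character with code 0x12 followed by ``3456``.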
| 687 |
|
---|
| 688 | The :exc:`AttributeError` and :exc:`NameError` exceptions have a more friendly
|
---|
| 689 | error message, whose text will be something like ``'Spam' instance has no
|
---|
| 690 | attribute 'eggs'`` or ``name 'eggs' is not defined``. Previously the error
|
---|
| 691 | message was just the missing attribute name ``eggs``, and code written to take
|
---|
| 692 | advantage of this fact will break in 2.0.
|
---|
| 693 |
|
---|
| 694 | Some work has been done to make integers and long integers a bit more
|
---|
| 695 | interchangeable. In 1.5.2, large-file support was added for Solaris, to allow
|
---|
| 696 | reading files larger than 2 GiB; this made the :meth:`tell` method of file
|
---|
| 697 | objects return a long integer instead of a regular integer. Some code would
|
---|
| 698 | subtract two file offsets and attempt to use the result to multiply a sequence
|
---|
| 699 | or slice a string, but this raised a :exc:`TypeError`. In 2.0, long integers
|
---|
| 700 | can be used to multiply or slice a sequence, and it'll behave as you'd
|
---|
| 701 | intuitively expect it to; ``3L * 'abc'`` produces 'abcabcabc', and
|
---|
| 702 | ``(0,1,2,3)[2L:4L]`` produces (2,3). Long integers can also be used in various
|
---|
| 703 | contexts where previously only integers were accepted, such as in the
|
---|
| 704 | :meth:`seek` method of file objects, and in the formats supported by the ``%``
|
---|
| 705 | operator (``%d``, ``%i``, ``%x``, etc.). For example, ``"%d" % 2L**64`` will
|
---|
| 706 | produce the string ``18446744073709551616``.
|
---|
| 707 |
|
---|
| 708 | The subtlest long integer change of all is that the :func:`str` of a long
|
---|
| 709 | integer no longer has a trailing 'L' character, though :func:`repr` still
|
---|
| 710 | includes it. The 'L' annoyed many people who wanted to print long integers that
|
---|
| 711 | looked just like regular integers, since they had to go out of their way to chop
|
---|
| 712 | off the character. This is no longer a problem in 2.0, but code which does
|
---|
| 713 | ``str(longval)[:-1]`` and assumes the 'L' is there, will now lose the final
|
---|
| 714 | digit.
|
---|
| 715 |
|
---|
| 716 | Taking the :func:`repr` of a float now uses a different formatting precision
|
---|
| 717 | than :func:`str`. :func:`repr` uses the ``%.17g`` format string for C's
|
---|
| 718 | :func:`sprintf`, while :func:`str` uses ``%.12g`` as before. The effect is that
|
---|
| 719 | :func:`repr` may occasionally show more decimal places than :func:`str`, for
|
---|
| 720 | certain numbers. For example, the number 8.1 can't be represented exactly in
|
---|
| 721 | binary, so ``repr(8.1)`` is ``'8.0999999999999996'``, while ``str(8.1)`` is
|
---|
| 722 | ``'8.1'``.
|
---|
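The two precisions can be reproduced directly with the ``%`` operator. This sketch applies the format strings by hand, since later Python versions changed how :func:`repr` formats floats:

```python
value = 8.1

# The precision repr() uses in 2.0
seventeen = "%.17g" % value

# The precision str() uses
twelve = "%.12g" % value
```

``seventeen`` exposes the binary rounding error as ``'8.0999999999999996'``, while ``twelve`` rounds it away and gives ``'8.1'``.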
| 723 |
|
---|
| 724 | The ``-X`` command-line option, which turned all standard exceptions into
|
---|
| 725 | strings instead of classes, has been removed; the standard exceptions will now
|
---|
| 726 | always be classes. The :mod:`exceptions` module containing the standard
|
---|
| 727 | exceptions was translated from Python to a built-in C module, written by Barry
|
---|
| 728 | Warsaw and Fredrik Lundh.
|
---|
| 729 |
|
---|
| 730 | .. Commented out for now -- I don't think anyone will care.
|
---|
| 731 | The pattern and match objects provided by SRE are C types, not Python
|
---|
| 732 | class instances as in 1.5. This means you can no longer inherit from
|
---|
| 733 | \class{RegexObject} or \class{MatchObject}, but that shouldn't be much
|
---|
| 734 | of a problem since no one should have been doing that in the first
|
---|
| 735 | place.
|
---|
| 736 | .. ======================================================================
|
---|
| 737 |
|
---|
| 738 |
|
---|
| 739 | Extending/Embedding Changes
|
---|
| 740 | ===========================
|
---|
| 741 |
|
---|
| 742 | Some of the changes are under the covers, and will only be apparent to people
|
---|
| 743 | writing C extension modules or embedding a Python interpreter in a larger
|
---|
| 744 | application. If you aren't dealing with Python's C API, you can safely skip
|
---|
| 745 | this section.
|
---|
| 746 |
|
---|
| 747 | The version number of the Python C API was incremented, so C extensions compiled
|
---|
| 748 | for 1.5.2 must be recompiled in order to work with 2.0. On Windows, it's not
|
---|
| 749 | possible for Python 2.0 to import a third party extension built for Python 1.5.x
|
---|
| 750 | due to how Windows DLLs work, so Python will raise an exception and the import
|
---|
| 751 | will fail.
|
---|
| 752 |
|
---|
| 753 | Users of Jim Fulton's ExtensionClass module will be pleased to find out that
|
---|
| 754 | hooks have been added so that ExtensionClasses are now supported by
|
---|
| 755 | :func:`isinstance` and :func:`issubclass`. This means you no longer have to
|
---|
| 756 | remember to write code such as ``if type(obj) == myExtensionClass``, but can use
|
---|
| 757 | the more natural ``if isinstance(obj, myExtensionClass)``.
|
---|
| 758 |
|
---|
| 759 | The :file:`Python/importdl.c` file, which was a mass of #ifdefs to support
|
---|
| 760 | dynamic loading on many different platforms, was cleaned up and reorganised by
|
---|
| 761 | Greg Stein. :file:`importdl.c` is now quite small, and platform-specific code
|
---|
| 762 | has been moved into a bunch of :file:`Python/dynload_\*.c` files. Another
|
---|
| 763 | cleanup: there were also a number of :file:`my\*.h` files in the Include/
|
---|
| 764 | directory that held various portability hacks; they've been merged into a single
|
---|
| 765 | file, :file:`Include/pyport.h`.
|
---|
| 766 |
|
---|
| 767 | Vladimir Marangozov's long-awaited malloc restructuring was completed, to make
|
---|
| 768 | it easy to have the Python interpreter use a custom allocator instead of C's
|
---|
| 769 | standard :func:`malloc`. For documentation, read the comments in
|
---|
| 770 | :file:`Include/pymem.h` and :file:`Include/objimpl.h`. For the lengthy
|
---|
| 771 | discussions during which the interface was hammered out, see the Web archives of
|
---|
| 772 | the 'patches' and 'python-dev' lists at python.org.
|
---|
| 773 |
|
---|
| 774 | Recent versions of the GUSI development environment for MacOS support POSIX
|
---|
| 775 | threads. Therefore, Python's POSIX threading support now works on the
|
---|
| 776 | Macintosh. Threading support using the user-space GNU ``pth`` library was also
|
---|
| 777 | contributed.
|
---|
| 778 |
|
---|
| 779 | Threading support on Windows was enhanced, too. Windows supports thread locks
|
---|
| 780 | that use kernel objects only in case of contention; in the common case when
|
---|
| 781 | there's no contention, they use simpler functions which are an order of
|
---|
| 782 | magnitude faster. A threaded version of Python 1.5.2 on NT is twice as slow as
|
---|
| 783 | an unthreaded version; with the 2.0 changes, the difference is only 10%. These
|
---|
| 784 | improvements were contributed by Yakov Markovitch.
|
---|
| 785 |
|
---|
| 786 | Python 2.0's source now uses only ANSI C prototypes, so compiling Python now
|
---|
| 787 | requires an ANSI C compiler, and can no longer be done using a compiler that
|
---|
| 788 | only supports K&R C.
|
---|
| 789 |
|
---|
| 790 | Previously the Python virtual machine used 16-bit numbers in its bytecode,
|
---|
| 791 | limiting the size of source files. In particular, this affected the maximum
|
---|
| 792 | size of literal lists and dictionaries in Python source; occasionally people who
|
---|
| 793 | are generating Python code would run into this limit. A patch by Charles G.
|
---|
| 794 | Waldman raises the limit from ``2^16`` to ``2^32``.
|
---|
| 795 |
|
---|
| 796 | Three new convenience functions intended for adding constants to a module's
|
---|
| 797 | dictionary at module initialization time were added: :func:`PyModule_AddObject`,
|
---|
| 798 | :func:`PyModule_AddIntConstant`, and :func:`PyModule_AddStringConstant`. Each
|
---|
| 799 | of these functions takes a module object, a null-terminated C string containing
|
---|
| 800 | the name to be added, and a third argument for the value to be assigned to the
|
---|
| 801 | name. This third argument is, respectively, a Python object, a C long, or a C
|
---|
| 802 | string.
|
---|
| 803 |
|
---|
| 804 | A wrapper API was added for Unix-style signal handlers. :func:`PyOS_getsig` gets
|
---|
| 805 | a signal handler and :func:`PyOS_setsig` will set a new handler.
|
---|
| 806 |
|
---|
| 807 | .. ======================================================================
|
---|
| 808 |
|
---|
| 809 |
|
---|
| 810 | Distutils: Making Modules Easy to Install
|
---|
| 811 | =========================================
|
---|
| 812 |
|
---|
| 813 | Before Python 2.0, installing modules was a tedious affair -- there was no way
|
---|
| 814 | to figure out automatically where Python is installed, or what compiler options
|
---|
| 815 | to use for extension modules. Software authors had to go through an arduous
|
---|
| 816 | ritual of editing Makefiles and configuration files, which only really work on
|
---|
| 817 | Unix and leave Windows and MacOS unsupported. Python users faced wildly
|
---|
| 818 | differing installation instructions which varied between different extension
|
---|
| 819 | packages, which made administering a Python installation something of a chore.
|
---|
| 820 |
|
---|
| 821 | The SIG for distribution utilities, shepherded by Greg Ward, has created the
|
---|
| 822 | Distutils, a system to make package installation much easier. They form the
|
---|
| 823 | :mod:`distutils` package, a new part of Python's standard library. In the best
|
---|
| 824 | case, installing a Python module from source will require the same steps: first
|
---|
| 825 | you simply unpack the tarball or zip archive, and then run "``python
|
---|
| 826 | setup.py install``". The platform will be automatically detected, the compiler
|
---|
| 827 | will be recognized, C extension modules will be compiled, and the distribution
|
---|
| 828 | installed into the proper directory. Optional command-line arguments provide
|
---|
| 829 | more control over the installation process; the distutils package offers many
|
---|
| 830 | places to override defaults -- separating the build from the install, building
|
---|
| 831 | or installing in non-default directories, and more.
|
---|
| 832 |
|
---|
| 833 | In order to use the Distutils, you need to write a :file:`setup.py` script. For
|
---|
| 834 | the simple case, when the software contains only .py files, a minimal
|
---|
| 835 | :file:`setup.py` can be just a few lines long::
|
---|
| 836 |
|
---|
| 837 | from distutils.core import setup
|
---|
| 838 | setup (name = "foo", version = "1.0",
|
---|
| 839 | py_modules = ["module1", "module2"])
|
---|
| 840 |
|
---|
| 841 | The :file:`setup.py` file isn't much more complicated if the software consists
|
---|
| 842 | of a few packages::
|
---|
| 843 |
|
---|
| 844 | from distutils.core import setup
|
---|
| 845 | setup (name = "foo", version = "1.0",
|
---|
| 846 | packages = ["package", "package.subpackage"])
|
---|
| 847 |
|
---|
| 848 | A C extension can be the most complicated case; here's an example taken from
|
---|
| 849 | the PyXML package::
|
---|
| 850 |
|
---|
| 851 | from distutils.core import setup, Extension
|
---|
| 852 |
|
---|
| 853 | expat_extension = Extension('xml.parsers.pyexpat',
|
---|
| 854 | define_macros = [('XML_NS', None)],
|
---|
| 855 | include_dirs = [ 'extensions/expat/xmltok',
|
---|
| 856 | 'extensions/expat/xmlparse' ],
|
---|
| 857 | sources = [ 'extensions/pyexpat.c',
|
---|
| 858 | 'extensions/expat/xmltok/xmltok.c',
|
---|
| 859 | 'extensions/expat/xmltok/xmlrole.c', ]
|
---|
| 860 | )
|
---|
| 861 | setup (name = "PyXML", version = "0.5.4",
|
---|
| 862 | ext_modules =[ expat_extension ] )
|
---|
| 863 |
|
---|
| 864 | The Distutils can also take care of creating source and binary distributions.
|
---|
| 865 | The "sdist" command, run by "``python setup.py sdist``", builds a source
|
---|
| 866 | distribution such as :file:`foo-1.0.tar.gz`. Adding new commands isn't
|
---|
| 867 | difficult; "bdist_rpm" and "bdist_wininst" commands have already been
|
---|
| 868 | contributed to create an RPM distribution and a Windows installer for the
|
---|
| 869 | software, respectively. Commands to create other distribution formats such as
|
---|
| 870 | Debian packages and Solaris :file:`.pkg` files are in various stages of
|
---|
| 871 | development.
|
---|
| 872 |
|
---|
| 873 | All this is documented in a new manual, *Distributing Python Modules*, that
|
---|
| 874 | joins the basic set of Python documentation.
|
---|
| 875 |
|
---|
| 876 | .. ======================================================================
|
---|
| 877 |
|
---|
| 878 |
|
---|
| 879 | XML Modules
|
---|
| 880 | ===========
|
---|
| 881 |
|
---|
| 882 | Python 1.5.2 included a simple XML parser in the form of the :mod:`xmllib`
|
---|
| 883 | module, contributed by Sjoerd Mullender. Since 1.5.2's release, two different
|
---|
| 884 | interfaces for processing XML have become common: SAX2 (version 2 of the Simple
|
---|
| 885 | API for XML) provides an event-driven interface with some similarities to
|
---|
| 886 | :mod:`xmllib`, and the DOM (Document Object Model) provides a tree-based
|
---|
| 887 | interface, transforming an XML document into a tree of nodes that can be
|
---|
| 888 | traversed and modified. Python 2.0 includes a SAX2 interface and a stripped-
|
---|
| 889 | down DOM interface as part of the :mod:`xml` package. Here we will give a brief
|
---|
| 890 | overview of these new interfaces; consult the Python documentation or the source
|
---|
| 891 | code for complete details. The Python XML SIG is also working on improved
|
---|
| 892 | documentation.
|
---|
| 893 |
|
---|
| 894 |
|
---|
| 895 | SAX2 Support
|
---|
| 896 | ------------
|
---|
| 897 |
|
---|
| 898 | SAX defines an event-driven interface for parsing XML. To use SAX, you must
|
---|
| 899 | write a SAX handler class. Handler classes inherit from various classes
|
---|
| 900 | provided by SAX, and override various methods that will then be called by the
|
---|
| 901 | XML parser. For example, the :meth:`startElement` and :meth:`endElement`
|
---|
| 902 | methods are called for every starting and end tag encountered by the parser, the
|
---|
| 903 | :meth:`characters` method is called for every chunk of character data, and so
|
---|
| 904 | forth.
|
---|
| 905 |
|
---|
| 906 | The advantage of the event-driven approach is that the whole document doesn't
|
---|
| 907 | have to be resident in memory at any one time, which matters if you are
|
---|
| 908 | processing really huge documents. However, writing the SAX handler class can
|
---|
| 909 | get very complicated if you're trying to modify the document structure in some
|
---|
| 910 | elaborate way.
|
---|
| 911 |
|
---|
| 912 | For example, this little program defines a handler that prints a message
|
---|
| 913 | for every starting and ending tag, and then parses the file :file:`hamlet.xml`
|
---|
| 914 | using it::
|
---|
| 915 |
|
---|
| 916 | from xml import sax
|
---|
| 917 |
|
---|
| 918 | class SimpleHandler(sax.ContentHandler):
|
---|
| 919 |     def startElement(self, name, attrs):
|
---|
| 920 |         print 'Start of element:', name, attrs.keys()
|
---|
| 921 |
|
---|
| 922 |     def endElement(self, name):
|
---|
| 923 |         print 'End of element:', name
|
---|
| 924 |
|
---|
| 925 | # Create a parser object
|
---|
| 926 | parser = sax.make_parser()
|
---|
| 927 |
|
---|
| 928 | # Tell it what handler to use
|
---|
| 929 | handler = SimpleHandler()
|
---|
| 930 | parser.setContentHandler( handler )
|
---|
| 931 |
|
---|
| 932 | # Parse a file!
|
---|
| 933 | parser.parse( 'hamlet.xml' )
|
---|
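The :meth:`characters` callback works the same way. The following is a rough, self-contained variant that parses an in-memory document instead of :file:`hamlet.xml`; the ``TextCollector`` class and the tiny XML fragment are made up for this sketch, and it is written for a current Python 3 interpreter:

```python
import xml.sax

class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records tag names and character data."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.tags = []
        self.chunks = []

    def startElement(self, name, attrs):
        self.tags.append(name)

    def characters(self, content):
        # May be called several times for one run of text
        self.chunks.append(content)

handler = TextCollector()
xml.sax.parseString(b"<PLAY><PERSONA>HAMLET</PERSONA></PLAY>", handler)
text = "".join(handler.chunks)
```

Because the parser is free to deliver text in several pieces, the chunks are joined at the end rather than assumed to arrive in a single call.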
| 934 |
|
---|
| 935 | For more information, consult the Python documentation, or the XML HOWTO at
|
---|
| 936 | http://pyxml.sourceforge.net/topics/howto/xml-howto.html.
|
---|
| 937 |
|
---|
| 938 |
|
---|
| 939 | DOM Support
|
---|
| 940 | -----------
|
---|
| 941 |
|
---|
| 942 | The Document Object Model is a tree-based representation for an XML document. A
|
---|
| 943 | top-level :class:`Document` instance is the root of the tree, and has a single
|
---|
| 944 | child which is the top-level :class:`Element` instance. This :class:`Element`
|
---|
| 945 | has children nodes representing character data and any sub-elements, which may
|
---|
| 946 | have further children of their own, and so forth. Using the DOM you can
|
---|
| 947 | traverse the resulting tree any way you like, access element and attribute
|
---|
| 948 | values, insert and delete nodes, and convert the tree back into XML.
|
---|
| 949 |
|
---|
| 950 | The DOM is useful for modifying XML documents, because you can create a DOM
|
---|
| 951 | tree, modify it by adding new nodes or rearranging subtrees, and then produce a
|
---|
| 952 | new XML document as output. You can also construct a DOM tree manually and
|
---|
| 953 | convert it to XML, which can be a more flexible way of producing XML output than
|
---|
| 954 | simply writing ``<tag1>``...\ ``</tag1>`` to a file.
|
---|
| 955 |
|
---|
| 956 | The DOM implementation included with Python lives in the :mod:`xml.dom.minidom`
|
---|
| 957 | module. It's a lightweight implementation of the Level 1 DOM with support for
|
---|
| 958 | XML namespaces. The :func:`parse` and :func:`parseString` convenience
|
---|
| 959 | functions are provided for generating a DOM tree::
|
---|
| 960 |
|
---|
| 961 | from xml.dom import minidom
|
---|
| 962 | doc = minidom.parse('hamlet.xml')
|
---|
| 963 |
|
---|
| 964 | ``doc`` is a :class:`Document` instance. :class:`Document`, like all the other
|
---|
| 965 | DOM classes such as :class:`Element` and :class:`Text`, is a subclass of the
|
---|
| 966 | :class:`Node` base class. All the nodes in a DOM tree therefore support certain
|
---|
| 967 | common methods, such as :meth:`toxml` which returns a string containing the XML
|
---|
| 968 | representation of the node and its children. Each class also has special
|
---|
| 969 | methods of its own; for example, :class:`Element` and :class:`Document`
|
---|
| 970 | instances have a method to find all child elements with a given tag name.
|
---|
| 971 | Continuing from the previous 2-line example::
|
---|
| 972 |
|
---|
| 973 | perslist = doc.getElementsByTagName( 'PERSONA' )
|
---|
| 974 | print perslist[0].toxml()
|
---|
| 975 | print perslist[1].toxml()
|
---|
| 976 |
|
---|
| 977 | For the *Hamlet* XML file, the above few lines output::
|
---|
| 978 |
|
---|
| 979 | <PERSONA>CLAUDIUS, king of Denmark. </PERSONA>
|
---|
| 980 | <PERSONA>HAMLET, son to the late, and nephew to the present king.</PERSONA>
|
---|
| 981 |
|
---|
| 982 | The root element of the document is available as ``doc.documentElement``, and
|
---|
| 983 | its children can be easily modified by removing, adding, or moving nodes::
|
---|
| 984 |
|
---|
| 985 | root = doc.documentElement
|
---|
| 986 |
|
---|
| 987 | # Remove the first child
|
---|
| 988 | root.removeChild( root.childNodes[0] )
|
---|
| 989 |
|
---|
| 990 | # Move the new first child to the end
|
---|
| 991 | root.appendChild( root.childNodes[0] )
|
---|
| 992 |
|
---|
| 993 | # Insert the new first child (originally,
|
---|
| 994 | # the third child) before the 20th child.
|
---|
| 995 | root.insertBefore( root.childNodes[0], root.childNodes[20] )
|
---|
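The same operations can be tried without the *Hamlet* file. This sketch builds a tiny document with :func:`parseString`; the three-element XML string is invented for the example and contains no whitespace between tags, so that ``childNodes`` holds only elements:

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<PLAY><PERSONA>CLAUDIUS</PERSONA>"
    "<PERSONA>HAMLET</PERSONA>"
    "<PERSONA>POLONIUS</PERSONA></PLAY>"
)
root = doc.documentElement

# Text of every PERSONA element, in document order
names = [node.firstChild.data for node in doc.getElementsByTagName("PERSONA")]

# Remove the first child (CLAUDIUS)
root.removeChild(root.childNodes[0])

# Move the new first child (HAMLET) to the end
root.appendChild(root.childNodes[0])

result = root.toxml()
```

Appending a node that is already in the tree moves it rather than copying it, so ``result`` lists POLONIUS before HAMLET.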
| 996 |
|
---|
| 997 | Again, I will refer you to the Python documentation for a complete listing of
|
---|
| 998 | the different :class:`Node` classes and their various methods.
|
---|
| 999 |
|
---|
| 1000 |
|
---|
| 1001 | Relationship to PyXML
|
---|
| 1002 | ---------------------
|
---|
| 1003 |
|
---|
| 1004 | The XML Special Interest Group has been working on XML-related Python code for a
|
---|
| 1005 | while. Its code distribution, called PyXML, is available from the SIG's Web
|
---|
| 1006 | pages at http://www.python.org/sigs/xml-sig/. The PyXML distribution also used
|
---|
| 1007 | the package name ``xml``. If you've written programs that used PyXML, you're
|
---|
| 1008 | probably wondering about its compatibility with the 2.0 :mod:`xml` package.
|
---|
| 1009 |
|
---|
| 1010 | The answer is that Python 2.0's :mod:`xml` package isn't compatible with PyXML,
|
---|
| 1011 | but can be made compatible by installing a recent version of PyXML. Many
|
---|
| 1012 | applications can get by with the XML support that is included with Python 2.0,
|
---|
| 1013 | but more complicated applications will require that the full PyXML package
|
---|
| 1014 | be installed. When installed, PyXML versions 0.6.0 or greater will replace the
|
---|
| 1015 | :mod:`xml` package shipped with Python, and will be a strict superset of the
|
---|
| 1016 | standard package, adding a bunch of additional features. Some of the additional
|
---|
| 1017 | features in PyXML include:
|
---|
| 1018 |
|
---|
| 1019 | * 4DOM, a full DOM implementation from FourThought, Inc.
|
---|
| 1020 |
|
---|
| 1021 | * The xmlproc validating parser, written by Lars Marius Garshol.
|
---|
| 1022 |
|
---|
| 1023 | * The :mod:`sgmlop` parser accelerator module, written by Fredrik Lundh.
|
---|
| 1024 |
|
---|
| 1025 | .. ======================================================================
|
---|
| 1026 |
|
---|
| 1027 |
|
---|
| 1028 | Module changes
|
---|
| 1029 | ==============
|
---|
| 1030 |
|
---|
| 1031 | Lots of improvements and bugfixes were made to Python's extensive standard
|
---|
| 1032 | library; some of the affected modules include :mod:`readline`,
|
---|
| 1033 | :mod:`ConfigParser`, :mod:`cgi`, :mod:`calendar`, :mod:`posix`,
|
---|
| 1034 | :mod:`xmllib`, :mod:`aifc`, :mod:`chunk`, :mod:`wave`, :mod:`random`, :mod:`shelve`,
|
---|
| 1035 | and :mod:`nntplib`. Consult the CVS logs for the exact patch-by-patch details.
|
---|
| 1036 |
|
---|
| 1037 | Brian Gallew contributed OpenSSL support for the :mod:`socket` module. OpenSSL
|
---|
| 1038 | is an implementation of the Secure Sockets Layer, which encrypts the data being
|
---|
| 1039 | sent over a socket. When compiling Python, you can edit :file:`Modules/Setup`
|
---|
| 1040 | to include SSL support, which adds an additional function to the :mod:`socket`
|
---|
| 1041 | module: :func:`socket.ssl(socket, keyfile, certfile)`, which takes a socket
|
---|
| 1042 | object and returns an SSL socket. The :mod:`httplib` and :mod:`urllib` modules
|
---|
| 1043 | were also changed to support ``https://`` URLs, though no one has implemented
|
---|
| 1044 | FTP or SMTP over SSL.
|
---|
| 1045 |
|
---|
| 1046 | The :mod:`httplib` module has been rewritten by Greg Stein to support HTTP/1.1.
|
---|
| 1047 | Backward compatibility with the 1.5 version of :mod:`httplib` is provided,
|
---|
| 1048 | though using HTTP/1.1 features such as pipelining will require rewriting code to
|
---|
| 1049 | use a different set of interfaces.
|
---|
| 1050 |
|
---|
| 1051 | The :mod:`Tkinter` module now supports Tcl/Tk version 8.1, 8.2, or 8.3, and
|
---|
| 1052 | support for the older 7.x versions has been dropped. The Tkinter module now
|
---|
| 1053 | supports displaying Unicode strings in Tk widgets. Also, Fredrik Lundh
|
---|
| 1054 | contributed an optimization which makes operations like ``create_line`` and
|
---|
| 1055 | ``create_polygon`` much faster, especially when using lots of coordinates.
|
---|
| 1056 |
|
---|
| 1057 | The :mod:`curses` module has been greatly extended, starting from Oliver
|
---|
| 1058 | Andrich's enhanced version, to provide many additional functions from ncurses
|
---|
| 1059 | and SYSV curses, such as colour, alternative character set support, pads, and
|
---|
| 1060 | mouse support. This means the module is no longer compatible with operating
|
---|
| 1061 | systems that only have BSD curses, but there don't seem to be any currently
|
---|
| 1062 | maintained OSes that fall into this category.
|
---|
| 1063 |
|
---|
| 1064 | As mentioned in the earlier discussion of 2.0's Unicode support, the underlying
|
---|
| 1065 | implementation of the regular expressions provided by the :mod:`re` module has
|
---|
| 1066 | been changed. SRE, a new regular expression engine written by Fredrik Lundh and
|
---|
| 1067 | partially funded by Hewlett Packard, supports matching against both 8-bit
|
---|
| 1068 | strings and Unicode strings.
|
---|
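A minimal sketch of what matching against both kinds of string looks like. The sample strings are invented for illustration, and the code is written so that it also runs on current Python versions, where 8-bit strings are spelled as ``bytes`` literals:

```python
import re

# 8-bit string: both the pattern and the subject are byte strings.
byte_match = re.search(b"k[a-z]+g", b"CLAUDIUS, king of Denmark.")

# Unicode string: with re.UNICODE, \w also matches non-ASCII letters.
text_match = re.search(u"Helsing\\w+", u"Helsing\u00f8r castle", re.UNICODE)

print(byte_match.group())
print(text_match.group() == u"Helsing\u00f8r")
```

The pattern and the string being searched should be of the same kind; mixing a byte pattern with a Unicode subject is rejected by current Python versions.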
| 1069 |
|
---|
| 1070 | .. ======================================================================
|
---|
| 1071 |
|
---|
| 1072 |
|
---|
| 1073 | New modules
|
---|
| 1074 | ===========
|
---|
| 1075 |
|
---|
| 1076 | A number of new modules were added. We'll simply list them with brief
|
---|
| 1077 | descriptions; consult the 2.0 documentation for the details of a particular
|
---|
| 1078 | module.
|
---|
| 1079 |
|
---|
| 1080 | * :mod:`atexit`: For registering functions to be called before the Python
|
---|
| 1081 | interpreter exits. Code that currently sets ``sys.exitfunc`` directly should be
|
---|
| 1082 | changed to use the :mod:`atexit` module instead, importing :mod:`atexit` and
|
---|
| 1083 | calling :func:`atexit.register` with the function to be called on exit.
|
---|
| 1084 | (Contributed by Skip Montanaro.)
|
---|
| 1085 |
|
---|
| 1086 | * :mod:`codecs`, :mod:`encodings`, :mod:`unicodedata`: Added as part of the new
|
---|
| 1087 | Unicode support.
|
---|
| 1088 |
|
---|
| 1089 | * :mod:`filecmp`: Supersedes the old :mod:`cmp`, :mod:`cmpcache` and
|
---|
| 1090 | :mod:`dircmp` modules, which have now become deprecated. (Contributed by Gordon
|
---|
| 1091 | MacMillan and Moshe Zadka.)
|
---|
| 1092 |
|
---|
| 1093 | * :mod:`gettext`: This module provides internationalization (I18N) and
|
---|
| 1094 | localization (L10N) support for Python programs by providing an interface to the
|
---|
| 1095 | GNU gettext message catalog library. (Integrated by Barry Warsaw, from separate
|
---|
| 1096 | contributions by Martin von Löwis, Peter Funk, and James Henstridge.)
|
---|
| 1097 |
|
---|
| 1098 | * :mod:`linuxaudiodev`: Support for the :file:`/dev/audio` device on Linux, a
|
---|
| 1099 | twin to the existing :mod:`sunaudiodev` module. (Contributed by Peter Bosch,
|
---|
| 1100 | with fixes by Jeremy Hylton.)
|
---|
| 1101 |
|
---|
| 1102 | * :mod:`mmap`: An interface to memory-mapped files on both Windows and Unix. A
|
---|
| 1103 | file's contents can be mapped directly into memory, at which point it behaves
|
---|
| 1104 | like a mutable string, so its contents can be read and modified. They can even
|
---|
| 1105 | be passed to functions that expect ordinary strings, such as the :mod:`re`
|
---|
| 1106 | module. (Contributed by Sam Rushing, with some extensions by A.M. Kuchling.)
|
---|
| 1107 |
|
---|
| 1108 | * :mod:`pyexpat`: An interface to the Expat XML parser. (Contributed by Paul
|
---|
| 1109 | Prescod.)
|
---|
| 1110 |
|
---|
| 1111 | * :mod:`robotparser`: Parse a :file:`robots.txt` file, which is used for writing
|
---|
| 1112 | Web spiders that politely avoid certain areas of a Web site. The parser accepts
|
---|
| 1113 | the contents of a :file:`robots.txt` file, builds a set of rules from it, and
|
---|
| 1114 | can then answer questions about the fetchability of a given URL. (Contributed
|
---|
| 1115 | by Skip Montanaro.)
|
---|
| 1116 |
|
---|
| 1117 | * :mod:`tabnanny`: A module/script to check Python source code for ambiguous
|
---|
| 1118 | indentation. (Contributed by Tim Peters.)
|
---|
| 1119 |
|
---|
| 1120 | * :mod:`UserString`: A base class useful for deriving objects that behave like
|
---|
| 1121 | strings.
|
---|
| 1122 |
|
---|
| 1123 | * :mod:`webbrowser`: A module that provides a platform independent way to launch
|
---|
| 1124 | a web browser on a specific URL. For each platform, various browsers are tried
|
---|
| 1125 | in a specific order. The user can alter which browser is launched by setting the
|
---|
| 1126 | *BROWSER* environment variable. (Originally inspired by Eric S. Raymond's patch
|
---|
| 1127 | to :mod:`urllib` which added similar functionality, but the final module comes
|
---|
| 1128 | from code originally implemented by Fred Drake as
|
---|
| 1129 | :file:`Tools/idle/BrowserControl.py`, and adapted for the standard library by
|
---|
| 1130 | Fred.)
|
---|
| 1131 |
|
---|
| 1132 | * :mod:`_winreg`: An interface to the Windows registry. :mod:`_winreg` is an
|
---|
| 1133 | adaptation of functions that have been part of PythonWin since 1995, but has now
|
---|
| 1134 | been added to the core distribution, and enhanced to support Unicode.
|
---|
| 1135 | :mod:`_winreg` was written by Bill Tutt and Mark Hammond.
|
---|
| 1136 |
|
---|
| 1137 | * :mod:`zipfile`: A module for reading and writing ZIP-format archives. These
|
---|
| 1138 | are archives produced by :program:`PKZIP` on DOS/Windows or :program:`zip` on
|
---|
| 1139 | Unix, not to be confused with :program:`gzip`\ -format files (which are
|
---|
| 1140 | supported by the :mod:`gzip` module). (Contributed by James C. Ahlstrom.)
|
---|
| 1141 |
|
---|
| 1142 | * :mod:`imputil`: A module that provides a simpler way for writing customised
|
---|
| 1143 | import hooks, in comparison to the existing :mod:`ihooks` module. (Implemented
|
---|
| 1144 | by Greg Stein, with much discussion on python-dev along the way.)
|
---|
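To give a feel for a few of the modules listed above, the following sketch combines :mod:`atexit`, :mod:`mmap`, :mod:`zipfile` and :mod:`filecmp`. The file names and sample text are hypothetical, and the code is written for a current Python interpreter (it uses ``with`` statements and ``bytes`` literals, which Python 2.0 does not have), so treat it as an illustration rather than 2.0-era code:

```python
import atexit
import filecmp
import mmap
import os
import re
import shutil
import tempfile
import zipfile

# Scratch directory; atexit removes it when the interpreter exits.
workdir = tempfile.mkdtemp()
atexit.register(shutil.rmtree, workdir, True)

# Hypothetical sample file used by the rest of the sketch.
path = os.path.join(workdir, "hamlet.txt")
with open(path, "wb") as f:
    f.write(b"HAMLET, son to the late king.\n")

# mmap: the mapped file can be searched with re and modified in place.
with open(path, "r+b") as f:
    mapped = mmap.mmap(f.fileno(), 0)
    found_at = re.search(b"late", mapped).start()
    mapped[0:6] = b"Hamlet"      # slice assignment must keep the length
    mapped.flush()
    mapped.close()

# zipfile: write the file into a ZIP archive, then extract it again.
archive = os.path.join(workdir, "plays.zip")
with zipfile.ZipFile(archive, "w") as zf:
    zf.write(path, "hamlet.txt")
with zipfile.ZipFile(archive) as zf:
    members = zf.namelist()
    extracted = zf.extract("hamlet.txt", os.path.join(workdir, "out"))

# filecmp: compare the original with the extracted copy, byte for byte.
same = filecmp.cmp(path, extracted, shallow=False)

print(found_at, members, same)
```

Registering the cleanup with :func:`atexit.register` rather than assigning to ``sys.exitfunc`` means other modules can register their own exit functions without overwriting this one.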
| 1145 |
|
---|
| 1146 | .. ======================================================================
|
---|
| 1147 |
|
---|
| 1148 |
|
---|
| 1149 | IDLE Improvements
|
---|
| 1150 | =================
|
---|
| 1151 |
|
---|
| 1152 | IDLE is the official Python cross-platform IDE, written using Tkinter. Python
|
---|
| 1153 | 2.0 includes IDLE 0.6, which adds a number of new features and improvements. A
|
---|
| 1154 | partial list:
|
---|
| 1155 |
|
---|
| 1156 | * UI improvements and optimizations, especially in the area of syntax
|
---|
| 1157 | highlighting and auto-indentation.
|
---|
| 1158 |
|
---|
| 1159 | * The class browser now shows more information, such as the top level functions
|
---|
| 1160 | in a module.
|
---|
| 1161 |
|
---|
| 1162 | * Tab width is now a user settable option. When opening an existing Python file,
|
---|
| 1163 | IDLE automatically detects the indentation conventions, and adapts.
|
---|
| 1164 |
|
---|
| 1165 | * There is now support for calling browsers on various platforms, used to open
|
---|
| 1166 | the Python documentation in a browser.
|
---|
| 1167 |
|
---|
| 1168 | * IDLE now has a command line, which is largely similar to the vanilla Python
|
---|
| 1169 | interpreter.
|
---|
| 1170 |
|
---|
| 1171 | * Call tips were added in many places.
|
---|
| 1172 |
|
---|
| 1173 | * IDLE can now be installed as a package.
|
---|
| 1174 |
|
---|
| 1175 | * In the editor window, there is now a line/column bar at the bottom.
|
---|
| 1176 |
|
---|
| 1177 | * Three new keystroke commands: Check module (Alt-F5), Import module (F5) and
|
---|
| 1178 | Run script (Ctrl-F5).
|
---|
| 1179 |
|
---|
| 1180 | .. ======================================================================
|
---|
| 1181 |
|
---|
| 1182 |
|
---|
| 1183 | Deleted and Deprecated Modules
|
---|
| 1184 | ==============================
|
---|
| 1185 |
|
---|
| 1186 | A few modules have been dropped because they're obsolete, or because there are
|
---|
| 1187 | now better ways to do the same thing. The :mod:`stdwin` module is gone; it was
|
---|
| 1188 | for a platform-independent windowing toolkit that's no longer developed.
|
---|
| 1189 |
|
---|
| 1190 | A number of modules have been moved to the :file:`lib-old` subdirectory:
|
---|
| 1191 | :mod:`cmp`, :mod:`cmpcache`, :mod:`dircmp`, :mod:`dump`, :mod:`find`,
|
---|
| 1192 | :mod:`grep`, :mod:`packmail`, :mod:`poly`, :mod:`util`, :mod:`whatsound`,
|
---|
| 1193 | :mod:`zmod`. If you have code which relies on a module that's been moved to
|
---|
| 1194 | :file:`lib-old`, you can simply add that directory to ``sys.path`` to get it
|
---|
| 1195 | back, but you're encouraged to update any code that uses these modules.
|
---|
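A minimal sketch of adding the directory to ``sys.path``. The location of :file:`lib-old` is an assumption here (a subdirectory next to the standard library's own modules); check where it lives in your installation:

```python
import os
import sys

# Assumed location: a "lib-old" directory alongside the standard library
# modules. Adjust this path for your installation.
lib_old = os.path.join(os.path.dirname(os.__file__), "lib-old")

# Append rather than prepend, so the regular library is searched first.
if lib_old not in sys.path:
    sys.path.append(lib_old)
```

Appending keeps the old modules from shadowing current ones that happen to share a name.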
| 1196 |
|
---|
| 1197 |
|
---|
| 1198 | Acknowledgements
|
---|
| 1199 | ================
|
---|
| 1200 |
|
---|
| 1201 | The authors would like to thank the following people for offering suggestions on
|
---|
| 1202 | various drafts of this article: David Bolen, Mark Hammond, Gregg Hauser, Jeremy
|
---|
| 1203 | Hylton, Fredrik Lundh, Detlef Lannert, Aahz Maruch, Skip Montanaro, Vladimir
|
---|
| 1204 | Marangozov, Tobias Polzin, Guido van Rossum, Neil Schemenauer, and Russ Schmidt.
|
---|
| 1205 |
|
---|