======================
Design and History FAQ
======================

Why does Python use indentation for grouping of statements?
-----------------------------------------------------------

Guido van Rossum believes that using indentation for grouping is extremely
elegant and contributes a lot to the clarity of the average Python program.
Most people learn to love this feature after a while.

Since there are no begin/end brackets there cannot be a disagreement between
grouping perceived by the parser and the human reader. Occasionally C
programmers will encounter a fragment of code like this::

   if (x <= y)
           x++;
           y--;
   z++;

Only the ``x++`` statement is executed if the condition is true, but the
indentation leads you to believe otherwise. Even experienced C programmers will
sometimes stare at it a long time wondering why ``y`` is being decremented even
for ``x > y``.

Because there are no begin/end brackets, Python is much less prone to
coding-style conflicts. In C there are many different ways to place the braces.
If you're used to reading and writing code that uses one style, you will feel at
least slightly uneasy when reading (or being required to write) another style.

Many coding styles place begin/end brackets on a line by themselves. This makes
programs considerably longer and wastes valuable screen space, making it harder
to get a good overview of a program. Ideally, a function should fit on one
screen (say, 20-30 lines). 20 lines of Python can do a lot more work than 20
lines of C. This is not solely due to the lack of begin/end brackets -- the
lack of declarations and the high-level data types are also responsible -- but
the indentation-based syntax certainly helps.


Why am I getting strange results with simple arithmetic operations?
-------------------------------------------------------------------

See the next question.


Why are floating point calculations so inaccurate?
--------------------------------------------------

People are often very surprised by results like this::

   >>> 1.2 - 1.0
   0.19999999999999996

and think it is a bug in Python. It's not. This has nothing to do with Python,
but with how the underlying C platform handles floating point numbers, and
ultimately with the inaccuracies introduced when writing down numbers as a
string of a fixed number of digits.

The internal representation of floating point numbers uses a fixed number of
binary digits to represent a decimal number. Some decimal numbers can't be
represented exactly in binary, resulting in small roundoff errors.

In decimal math, there are many numbers that can't be represented with a fixed
number of decimal digits, e.g. 1/3 = 0.3333333333.......

In base 2, 1/2 = 0.1, 1/4 = 0.01, 1/8 = 0.001, etc. The decimal number .2
equals 2/10 equals 1/5, which in binary is the repeating fraction
0.001100110011001...

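The binary expansion quoted above can be reproduced by repeated doubling. The
following is a minimal sketch; the helper name ``binary_fraction_digits`` is
made up for this illustration, and ``fractions.Fraction`` is used so that the
arithmetic itself stays exact.

```python
from fractions import Fraction


def binary_fraction_digits(value, count):
    """Return the first `count` binary digits after the point, for 0 <= value < 1."""
    digits = []
    for _ in range(count):
        value *= 2              # shift one binary place to the left
        bit = int(value)        # the integer part is the next digit
        digits.append(str(bit))
        value -= bit            # keep only the fractional remainder
    return "".join(digits)


one_fifth = binary_fraction_digits(Fraction(1, 5), 15)   # "001100110011001"
one_eighth = binary_fraction_digits(Fraction(1, 8), 6)   # "001000"
```

The digits for 1/5 never terminate, while those for 1/8 end after three places,
which is why 0.125 can be stored exactly and 0.2 cannot.
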
Floating point numbers are stored in only 32 or 64 bits (a Python float is
normally a 64-bit C double), so the digits are cut off at some point, and the
result of ``1.2 - 1.0`` comes out as 0.19999999999999996 in decimal, not 0.2.

A floating point number's ``repr()`` function prints as many digits as are
necessary to make ``eval(repr(f)) == f`` true for any float f. The ``str()``
function prints fewer digits and this often results in the more sensible number
that was probably intended::

   >>> 1.1 - 0.9
   0.20000000000000007
   >>> print 1.1 - 0.9
   0.2

One of the consequences of this is that it is error-prone to compare the result
of some computation to a float with ``==``. Tiny inaccuracies may mean that
``==`` fails. Instead, you have to check that the difference between the two
numbers is less than a certain threshold::

   epsilon = 0.0000000000001  # Tiny allowed error
   expected_result = 0.4

   if expected_result-epsilon <= computation() <= expected_result+epsilon:
       ...

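The threshold test can be wrapped in a small helper. This is only a sketch: the
name ``nearly_equal`` and the default ``epsilon`` are assumptions chosen for
the example, and an absolute tolerance like this one suits values of moderate
magnitude only.

```python
def nearly_equal(a, b, epsilon=1e-13):
    """Return True if a and b differ by at most epsilon (an absolute tolerance)."""
    return abs(a - b) <= epsilon


total = 0.1 + 0.2

exact_match = (total == 0.3)             # False: the two values differ slightly
close_match = nearly_equal(total, 0.3)   # True: the gap is far below epsilon
```

For very large or very small numbers a tolerance scaled to the operands is
usually the better choice.
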
Please see the chapter on :ref:`floating point arithmetic <tut-fp-issues>` in
the Python tutorial for more information.


Why are Python strings immutable?
---------------------------------

There are several advantages.

One is performance: knowing that a string is immutable means we can allocate
space for it at creation time, and the storage requirements are fixed and
unchanging. This is also one of the reasons for the distinction between tuples
and lists.

Another advantage is that strings in Python are considered as "elemental" as
numbers. No amount of activity will change the value 8 to anything else, and in
Python, no amount of activity will change the string "eight" to anything else.


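A short illustration of what immutability means in practice; the variable names
are made up for the example. String methods hand back new objects, and an
attempt to modify a string in place is rejected.

```python
number_word = "eight"
shouted = number_word.upper()   # builds and returns a new string

try:
    number_word[0] = "E"        # in-place modification is not allowed
    modified_in_place = True
except TypeError:
    modified_in_place = False
```

After this runs, ``number_word`` is still ``"eight"``; only the separate object
bound to ``shouted`` holds the upper-case text.
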
.. _why-self:

Why must 'self' be used explicitly in method definitions and calls?
-------------------------------------------------------------------

The idea was borrowed from Modula-3. It turns out to be very useful, for a
variety of reasons.

First, it's more obvious that you are using a method or instance attribute
instead of a local variable. Reading ``self.x`` or ``self.meth()`` makes it
absolutely clear that an instance variable or method is used even if you don't
know the class definition by heart. In C++, you can sort of tell by the lack of
a local variable declaration (assuming globals are rare or easily recognizable)
-- but in Python, there are no local variable declarations, so you'd have to
look up the class definition to be sure. Some C++ and Java coding standards
call for instance attributes to have an ``m_`` prefix, so this explicitness is
still useful in those languages, too.

Second, it means that no special syntax is necessary if you want to explicitly
reference or call the method from a particular class. In C++, if you want to
use a method from a base class which is overridden in a derived class, you have
to use the ``::`` operator -- in Python you can write
``baseclass.methodname(self, <argument list>)``. This is particularly useful
for :meth:`__init__` methods, and in general in cases where a derived class
method wants to extend the base class method of the same name and thus has to
call the base class method somehow.

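A minimal sketch of the ``baseclass.methodname(self, <argument list>)`` pattern;
the classes ``Base`` and ``Derived`` are hypothetical and exist only for this
example.

```python
class Base(object):
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "Base(%s)" % self.name


class Derived(Base):
    def __init__(self, name, extra):
        Base.__init__(self, name)      # explicit call to the base class method
        self.extra = extra

    def describe(self):
        # Extend, rather than replace, the inherited behaviour.
        return Base.describe(self) + " + " + self.extra


description = Derived("spam", "eggs").describe()   # "Base(spam) + eggs"
```

Because ``self`` is an ordinary argument, calling the base class version needs
no syntax beyond a normal function call.
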
Finally, for instance variables it solves a syntactic problem with assignment:
since local variables in Python are (by definition!) those variables to which a
value is assigned in a function body (and that aren't explicitly declared
global), there has to be some way to tell the interpreter that an assignment was
meant to assign to an instance variable instead of to a local variable, and it
should preferably be syntactic (for efficiency reasons). C++ does this through
declarations, but Python doesn't have declarations and it would be a pity to
have to introduce them just for this purpose. Using the explicit ``self.var``
solves this nicely. Similarly, for using instance variables, having to write
``self.var`` means that references to unqualified names inside a method don't
have to search the instance's namespace. To put it another way, local
variables and instance variables live in two different namespaces, and you need
to tell Python which namespace to use.


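The two namespaces can be seen side by side in a small sketch. The class
``Counter`` and its method names are invented for this illustration.

```python
class Counter(object):
    def __init__(self):
        self.count = 0

    def broken_increment(self):
        count = self.count + 1        # binds a *local* name; discarded on return

    def increment(self):
        self.count = self.count + 1   # rebinds the instance attribute


counter = Counter()
counter.broken_increment()
after_broken = counter.count    # still 0
counter.increment()
after_fixed = counter.count     # now 1
```

The unqualified assignment creates a local variable, so only the version that
spells out ``self.count`` changes the instance.
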
Why can't I use an assignment in an expression?
-----------------------------------------------

Many people used to C or Perl complain that they want to use this C idiom:

.. code-block:: c

   while (line = readline(f)) {
       // do something with line
   }

where in Python you're forced to write this::

   while True:
       line = f.readline()
       if not line:
           break
       ...  # do something with line

The reason for not allowing assignment in Python expressions is a common,
hard-to-find bug in those other languages, caused by this construct:

.. code-block:: c

   if (x = 0) {
       // error handling
   }
   else {
       // code that only works for nonzero x
   }

The error is a simple typo: ``x = 0``, which assigns 0 to the variable ``x``,
was written while the comparison ``x == 0`` is certainly what was intended.

Many alternatives have been proposed. Most are hacks that save some typing but
use arbitrary or cryptic syntax or keywords, and fail the simple criterion for
language change proposals: it should intuitively suggest the proper meaning to a
human reader who has not yet been introduced to the construct.

An interesting phenomenon is that most experienced Python programmers recognize
the ``while True`` idiom and don't seem to be missing the assignment in
expression construct much; it's only newcomers who express a strong desire to
add this to the language.

There's an alternative way of spelling this that seems attractive but is
generally less robust than the "while True" solution::

   line = f.readline()
   while line:
       ...  # do something with line...
       line = f.readline()

The problem with this is that if you change your mind about exactly how you get
the next line (e.g. you want to change it into ``sys.stdin.readline()``) you
have to remember to change two places in your program -- the second occurrence
is hidden at the bottom of the loop.

The best approach is to use iterators, making it possible to loop through
objects using the ``for`` statement. For example, in the current version of
Python file objects support the iterator protocol, so you can now write simply::

   for line in f:
       ...  # do something with line...



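To check that the ``while True`` idiom and the ``for`` loop visit the same
lines, here is a small self-contained sketch. It assumes an in-memory
``io.StringIO`` object as a stand-in for a real file, and the sample text and
function names are made up.

```python
import io

SAMPLE_TEXT = u"first\nsecond\nthird\n"   # made-up file contents


def read_with_while(f):
    lines = []
    while True:
        line = f.readline()
        if not line:            # an empty string signals end of file
            break
        lines.append(line)
    return lines


def read_with_for(f):
    lines = []
    for line in f:              # the file object acts as its own iterator
        lines.append(line)
    return lines


while_lines = read_with_while(io.StringIO(SAMPLE_TEXT))
for_lines = read_with_for(io.StringIO(SAMPLE_TEXT))
```

Both functions collect the same three lines; the ``for`` version simply has no
place where the read call could get out of sync.
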
Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?
----------------------------------------------------------------------------------------------------------------

The major reason is history. Functions were used for those operations that were
generic for a group of types and which were intended to work even for objects
that didn't have methods at all (e.g. tuples). It is also convenient to have a
function that can readily be applied to an amorphous collection of objects when
you use the functional features of Python (``map()``, ``zip()`` et al).

In fact, implementing ``len()``, ``max()`` and ``min()`` as built-in functions
is actually less code than implementing them as methods for each type. One can
quibble about individual cases but it's a part of Python, and it's too late to
make such fundamental changes now. The functions have to remain to avoid massive
code breakage.

.. XXX talk about protocols?

.. note::

   For string operations, Python has moved from external functions (the
   ``string`` module) to methods. However, ``len()`` is still a function.


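A brief sketch of why a generic function is convenient with the functional
tools: ``len`` can be passed around like any other object and applied to a
mixed bag of containers. The list ``containers`` is made up for the example.

```python
containers = ["spam", (1, 2, 3), [4, 5], {"a": 1}]

# len is an ordinary object, so it can be handed to map() unchanged.
sizes = list(map(len, containers))      # [4, 3, 2, 1]
longest = max(containers, key=len)      # "spam"
```
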
Why is join() a string method instead of a list or tuple method?
----------------------------------------------------------------

Strings became much more like other standard types starting in Python 1.6, when
methods were added which give the same functionality that has always been
available using the functions of the string module. Most of these new methods
have been widely accepted, but the one which appears to make some programmers
feel uncomfortable is::

   ", ".join(['1', '2', '4', '8', '16'])

which gives the result::

   "1, 2, 4, 8, 16"

There are two common arguments against this usage.

The first runs along the lines of: "It looks really ugly using a method of a
string literal (string constant)", to which the answer is that it might, but a
string literal is just a fixed value. If the methods are to be allowed on names
bound to strings there is no logical reason to make them unavailable on
literals.

The second objection is typically cast as: "I am really telling a sequence to
join its members together with a string constant". Sadly, you aren't. For some
reason there seems to be much less difficulty with having :meth:`~str.split` as
a string method, since in that case it is easy to see that ::

   "1, 2, 4, 8, 16".split(", ")

is an instruction to a string literal to return the substrings delimited by the
given separator (or, by default, arbitrary runs of white space). In this case a
Unicode string returns a list of Unicode strings, an ASCII string returns a list
of ASCII strings, and everyone is happy.

:meth:`~str.join` is a string method because in using it you are telling the
separator string to iterate over a sequence of strings and insert itself between
adjacent elements. This method can be used with any argument which obeys the
rules for sequence objects, including any new classes you might define yourself.

Because this is a string method it can work for Unicode strings as well as plain
ASCII strings. If ``join()`` were a method of the sequence types then the
sequence types would have to decide which type of string to return depending on
the type of the separator.

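As a sketch of that generality, the same method call accepts a list, a tuple, a
generator expression and a user-defined sequence. The class ``Countdown`` is
hypothetical and implements only ``__len__`` and ``__getitem__``.

```python
class Countdown(object):
    """A hypothetical user-defined sequence of strings: '3', '2', '1'."""

    def __len__(self):
        return 3

    def __getitem__(self, index):
        if not 0 <= index < 3:
            raise IndexError(index)
        return str(3 - index)


from_list = ", ".join(["1", "2", "4"])               # "1, 2, 4"
from_tuple = "-".join(("a", "b", "c"))               # "a-b-c"
from_generator = "".join(str(n) for n in range(4))   # "0123"
from_custom = ", ".join(Countdown())                 # "3, 2, 1"
```

None of these container types needs to know anything about joining; the
separator string does all the work.
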
.. XXX remove next paragraph eventually

If none of these arguments persuade you, then for the moment you can continue to
use the ``join()`` function from the string module, which allows you to write ::

   string.join(['1', '2', '4', '8', '16'], ", ")


How fast are exceptions?
------------------------

A try/except block is extremely efficient if no exceptions are raised. Actually
catching an exception is expensive. In versions of Python prior to 2.0 it was
common to use this idiom::

   try:
       value = mydict[key]
   except KeyError:
       mydict[key] = getvalue(key)
       value = mydict[key]

This only made sense when you expected the dict to have the key almost all the
time. If that wasn't the case, you coded it like this::

   if key in mydict:
       value = mydict[key]
   else:
       value = mydict[key] = getvalue(key)

.. note::

   In Python 2.0 and higher, you can code this as ``value =
   mydict.setdefault(key, getvalue(key))``. Be aware that ``getvalue(key)`` is
   then evaluated on every call, even when the key is already present.


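The cost difference between ``setdefault()`` and the membership test can be
made visible by counting calls. In this sketch ``getvalue`` is a hypothetical
stand-in for an expensive computation, and the ``calls`` list exists only to
record how often it runs.

```python
calls = []


def getvalue(key):
    """Hypothetical stand-in for an expensive computation; records each call."""
    calls.append(key)
    return key.upper()


mydict = {"a": "A"}

# The key is present, yet getvalue("a") still runs before setdefault() is called.
value = mydict.setdefault("a", getvalue("a"))
calls_with_setdefault = len(calls)      # 1

# The membership test only calls getvalue() when the key is really missing.
if "a" in mydict:
    value = mydict["a"]
else:
    value = mydict["a"] = getvalue("a")
calls_with_membership_test = len(calls) - calls_with_setdefault   # 0
```

When computing the default is cheap the one-line form is fine; when it is
expensive, the explicit test avoids the wasted work.
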
Why isn't there a switch or case statement in Python?
-----------------------------------------------------

You can do this easily enough with a sequence of ``if... elif... elif... else``.
There have been some proposals for switch statement syntax, but there is no
consensus (yet) on whether and how to do range tests. See :pep:`275` for
complete details and the current status.

For cases where you need to choose from a very large number of possibilities,
you can create a dictionary mapping case values to functions to call. For
example::

   def function_1(...):
       ...

   functions = {'a': function_1,
                'b': function_2,
                'c': self.method_1, ...}

   func = functions[value]
   func()

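A runnable variant of the dictionary approach might look like the sketch below.
The command names and handler functions are invented for the example, and using
``dict.get()`` to supply a default case is one possible choice, not the only
one.

```python
def start():
    return "starting"


def stop():
    return "stopping"


def unknown():
    return "unknown command"


# Case values map to the callables that handle them.
handlers = {"start": start, "stop": stop}


def dispatch(command):
    # dict.get() supplies a fallback, playing the role of a default case.
    handler = handlers.get(command, unknown)
    return handler()
```

For instance, ``dispatch("stop")`` returns ``"stopping"``, while an unlisted
value such as ``dispatch("pause")`` falls through to ``unknown()``.
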
For calling methods on objects, you can simplify yet further by using the
:func:`getattr` built-in to retrieve methods with a particular name::

   def visit_a(self, ...):
       ...
   ...

   def dispatch(self, value):
       method_name = 'visit_' + str(value)
       method = getattr(self, method_name)
       method()

It's suggested that you use a prefix for the method names, such as ``visit_`` in
this example. Without such a prefix, if values are coming from an untrusted
source, an attacker would be able to call any method on your object.


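The following sketch fills in the elided parts with made-up methods to show the
prefix doing its job. The class ``Visitor``, the ``secret`` method and the
choice to raise ``ValueError`` for unknown values are all assumptions of this
example.

```python
class Visitor(object):
    def visit_a(self):
        return "handled a"

    def visit_b(self):
        return "handled b"

    def secret(self):
        return "not reachable through dispatch()"

    def dispatch(self, value):
        method_name = "visit_" + str(value)
        # The third argument to getattr() is returned when no such attribute exists.
        method = getattr(self, method_name, None)
        if method is None:
            raise ValueError("no handler for %r" % (value,))
        return method()
```

Here ``dispatch("secret")`` looks up ``visit_secret``, which does not exist, so
the unprefixed ``secret()`` method cannot be reached this way.
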
Can't you emulate threads in the interpreter instead of relying on an OS-specific thread implementation?
--------------------------------------------------------------------------------------------------------

Answer 1: Unfortunately, the interpreter pushes at least one C stack frame for
each Python stack frame. Also, extensions can call back into Python at almost
random moments. Therefore, a complete threads implementation requires thread
support for C.

Answer 2: Fortunately, there is `Stackless Python <http://www.stackless.com>`_,
which has a completely redesigned interpreter loop that avoids the C stack.


Why can't lambda expressions contain statements?
------------------------------------------------

Python lambda expressions cannot contain statements because Python's syntactic
framework can't handle statements nested inside expressions. However, in
Python, this is not a serious problem. Unlike lambda forms in other languages,
where they add functionality, Python lambdas are only a shorthand notation if
you're too lazy to define a function.

Functions are already first class objects in Python, and can be declared in a
local scope. Therefore the only advantage of using a lambda instead of a
locally-defined function is that you don't need to invent a name for the
function -- but that's just a local variable to which the function object (which
is exactly the same type of object that a lambda expression yields) is assigned!


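That a lambda and a ``def`` yield the same kind of object is easy to check. The
names ``add_one`` and ``add_one_lambda`` are made up for this sketch.

```python
def add_one(x):
    return x + 1


add_one_lambda = lambda x: x + 1

same_type = type(add_one) is type(add_one_lambda)   # True
lambda_name = add_one_lambda.__name__               # "<lambda>"
```

The only visible difference is the name recorded on the function object.
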
Can Python be compiled to machine code, C or some other language?
-----------------------------------------------------------------

Not easily. Python's high level data types, dynamic typing of objects and
run-time invocation of the interpreter (using :func:`eval` or :keyword:`exec`)
together mean that a "compiled" Python program would probably consist mostly of
calls into the Python run-time system, even for seemingly simple operations like
``x+1``.

Several projects described in the Python newsgroup or at past `Python
conferences <http://python.org/community/workshops/>`_ have shown that this
approach is feasible, although the speedups reached so far are only modest
(e.g. 2x). Jython uses the same strategy for compiling to Java bytecode. (Jim
Hugunin has demonstrated that in combination with whole-program analysis,
speedups of 1000x are feasible for small demo programs. See the proceedings
from the `1997 Python conference
<http://python.org/workshops/1997-10/proceedings/>`_ for more information.)

Internally, Python source code is always translated into a bytecode
representation, and this bytecode is then executed by the Python virtual
machine. In order to avoid the overhead of repeatedly parsing and translating
modules that rarely change, this bytecode is written into a file whose name
ends in ".pyc" whenever a module is parsed. When the corresponding .py file is
changed, it is parsed and translated again and the .pyc file is rewritten.

There is no performance difference once the .pyc file has been loaded, as the
bytecode read from the .pyc file is exactly the same as the bytecode created by
direct translation. The only difference is that loading code from a .pyc file
is faster than parsing and translating a .py file, so the presence of
precompiled .pyc files improves the start-up time of Python scripts. If
desired, the Lib/compileall.py module can be used to create valid .pyc files for
a given set of modules.

Note that the main script executed by Python, even if its filename ends in .py,
is not compiled to a .pyc file. It is compiled to bytecode, but the bytecode is
not saved to a file. Usually main scripts are quite short, so this doesn't cost
much speed.

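The translation step can be observed directly with the built-in ``compile()``
function. This is a small sketch; the filename ``"<example>"`` is an arbitrary
label, and the exact bytes produced vary between Python versions.

```python
source = "x + 1"

# compile() turns source text into a code object holding bytecode.
code = compile(source, "<example>", "eval")

bytecode = code.co_code          # the raw bytecode as a byte string
names_used = code.co_names       # ('x',)
result = eval(code, {"x": 41})   # 42
```

The code object can be evaluated any number of times without parsing the source
again, which is the same saving a .pyc file provides across runs.
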
.. XXX check which of these projects are still alive

There are also several programs which make it easier to intermingle Python and C
code in various ways to increase performance. See, for example, `Psyco
<http://psyco.sourceforge.net/>`_, `Pyrex
<http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_, `PyInline
<http://pyinline.sourceforge.net/>`_, `Py2Cmod
<http://sourceforge.net/projects/py2cmod/>`_, and `Weave
<http://www.scipy.org/Weave>`_.


How does Python manage memory?
------------------------------

The details of Python memory management depend on the implementation. The
standard C implementation of Python uses reference counting to detect
inaccessible objects, and another mechanism to collect reference cycles,
periodically executing a cycle detection algorithm which looks for inaccessible
cycles and deletes the objects involved. The :mod:`gc` module provides functions
to perform a garbage collection, obtain debugging statistics, and tune the
collector's parameters.

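The interplay of reference counting and cycle detection can be watched with a
weak reference. The sketch below assumes the standard C implementation; the
class ``Node`` is hypothetical, and automatic collection is switched off
briefly so that the explicit ``gc.collect()`` call is the only collection that
runs.

```python
import gc
import weakref


class Node(object):
    """Hypothetical class used only to build a reference cycle."""


gc.disable()                 # keep automatic collection from interfering
node = Node()
node.myself = node           # the object now refers to itself
probe = weakref.ref(node)    # observes the object without keeping it alive

del node
alive_after_del = probe() is not None      # the cycle keeps the refcount above zero

unreachable = gc.collect()                 # run the cycle detector explicitly
alive_after_collect = probe() is not None
gc.enable()
```

Reference counting alone never frees the object, because it still refers to
itself; the cycle detector is what reclaims it.
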
Jython relies on the Java runtime so the JVM's garbage collector is used. This
difference can cause some subtle porting problems if your Python code depends on
the behavior of the reference counting implementation.

.. XXX relevant for Python 2.6?

Sometimes objects get stuck in tracebacks temporarily and hence are not
deallocated when you might expect. Clear the tracebacks with::

   import sys
   sys.exc_clear()
   sys.exc_traceback = sys.last_traceback = None

Tracebacks are used for reporting errors, implementing debuggers and related
things. They contain a portion of the program state extracted during the
handling of an exception (usually the most recent exception).

In the absence of circularities and tracebacks, Python programs do not need to
manage memory explicitly.

Why doesn't Python use a more traditional garbage collection scheme? For one
thing, this is not a C standard feature and hence it's not portable. (Yes, we
know about the Boehm GC library. It has bits of assembler code for *most*
common platforms, not for all of them, and although it is mostly transparent, it
isn't completely transparent; patches are required to get Python to work with
it.)

Traditional GC also becomes a problem when Python is embedded into other
applications. While in a standalone Python it's fine to replace the standard
malloc() and free() with versions provided by the GC library, an application
embedding Python may want to have its *own* substitute for malloc() and free(),
and may not want Python's. Right now, Python works with anything that
implements malloc() and free() properly.

In Jython, the following code (which is fine in CPython) will probably run out
of file descriptors long before it runs out of memory::

   for file in very_long_list_of_files:
       f = open(file)
       c = f.read(1)

Using the current reference counting and destructor scheme, each new assignment
to f closes the previous file. Using GC, this is not guaranteed. If you want
to write code that will work with any Python implementation, you should
explicitly close the file or use the :keyword:`with` statement; this will work
regardless of GC::

   for file in very_long_list_of_files:
       with open(file) as f:
           c = f.read(1)


Why isn't all memory freed when Python exits?
---------------------------------------------

Objects referenced from the global namespaces of Python modules are not always
deallocated when Python exits. This may happen if there are circular
references. There are also certain bits of memory that are allocated by the C
library that are impossible to free (e.g. a tool like Purify will complain about
these). Python is, however, aggressive about cleaning up memory on exit and
does try to destroy every single object.

If you want to force Python to delete certain things on deallocation use the
:mod:`atexit` module to run a function that will force those deletions.


Why are there separate tuple and list data types?
-------------------------------------------------

Lists and tuples, while similar in many respects, are generally used in
fundamentally different ways. Tuples can be thought of as being similar to
Pascal records or C structs; they're small collections of related data,
possibly of different types, that are operated on as a group. For example, a
Cartesian coordinate is appropriately represented as a tuple of two or three
numbers.

Lists, on the other hand, are more like arrays in other languages. They tend to
hold a varying number of objects all of which have the same type and which are
operated on one-by-one. For example, ``os.listdir('.')`` returns a list of
strings representing the files in the current directory. Functions which
operate on this output would generally not break if you added another file or
two to the directory.

Tuples are immutable, meaning that once a tuple has been created, you can't
replace any of its elements with a new value. Lists are mutable, meaning that
you can always change a list's elements. Only immutable elements can be used as
dictionary keys, and hence only tuples and not lists can be used as keys.


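A short sketch of the key restriction, using made-up coordinate data: a tuple is
accepted as a dictionary key, while a list with the same contents is rejected.

```python
# A tuple of coordinates works as a dictionary key because it is immutable.
label_at = {(0, 0): "origin", (1, 2): "point A"}
found = label_at[(1, 2)]            # "point A"

try:
    label_at[[3, 4]] = "point B"    # a list is mutable, hence unhashable
    list_accepted = True
except TypeError:
    list_accepted = False
```
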
| 540 | How are lists implemented?
|
---|
| 541 | --------------------------
|
---|
| 542 |
|
---|
| 543 | Python's lists are really variable-length arrays, not Lisp-style linked lists.
|
---|
| 544 | The implementation uses a contiguous array of references to other objects, and
|
---|
| 545 | keeps a pointer to this array and the array's length in a list head structure.
|
---|
| 546 |
|
---|
| 547 | This makes indexing a list ``a[i]`` an operation whose cost is independent of
|
---|
| 548 | the size of the list or the value of the index.
|
---|
| 549 |
|
---|
| 550 | When items are appended or inserted, the array of references is resized. Some
|
---|
| 551 | cleverness is applied to improve the performance of appending items repeatedly;
|
---|
| 552 | when the array must be grown, some extra space is allocated so the next few
|
---|
| 553 | times don't require an actual resize.
|
---|
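The over-allocation can be observed indirectly, at least on CPython, by watching how rarely the size reported by ``sys.getsizeof()`` changes while appending. This is only a sketch: the exact sizes and the growth pattern are implementation details and may differ between versions and interpreters.

```python
import sys

items = []
last_size = sys.getsizeof(items)
resizes = 0

for i in range(1000):
    items.append(i)
    size = sys.getsizeof(items)
    if size != last_size:
        # The underlying array was reallocated with some spare room.
        resizes += 1
        last_size = size
```

After the loop, ``resizes`` is far smaller than the number of appends, because most appends reuse space that was allocated earlier.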
| 554 |
|
---|
| 555 |
|
---|
| 556 | How are dictionaries implemented?
|
---|
| 557 | ---------------------------------
|
---|
| 558 |
|
---|
| 559 | Python's dictionaries are implemented as resizable hash tables. Compared to
|
---|
| 560 | B-trees, this gives better performance for lookup (the most common operation by
|
---|
| 561 | far) under most circumstances, and the implementation is simpler.
|
---|
| 562 |
|
---|
| 563 | Dictionaries work by computing a hash code for each key stored in the dictionary
|
---|
| 564 | using the :func:`hash` built-in function. The hash code varies widely depending
|
---|
| 565 | on the key; for example, "Python" hashes to -539294296 while "python", a string
|
---|
| 566 | that differs by a single bit, hashes to 1142331976. The hash code is then used
|
---|
| 567 | to calculate a location in an internal array where the value will be stored.
|
---|
| 568 | Assuming that you're storing keys that all have different hash values, this
|
---|
| 569 | means that dictionaries take constant time -- O(1), in computer science notation
|
---|
| 570 | -- to retrieve a key. It also means that no sorted order of the keys is
|
---|
| 571 | maintained, and traversing the array as ``.keys()`` and ``.items()`` do will
|
---|
| 572 | output the dictionary's content in some arbitrary jumbled order.
|
---|
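The idea of turning a hash code into a location can be sketched with a toy table. This is not the real implementation: the table size, the names ``buckets``, ``store`` and ``fetch``, and the use of small lists to hold colliding keys are all assumptions made for illustration, and CPython handles collisions and resizing quite differently.

```python
TABLE_SIZE = 8  # arbitrary size chosen for this sketch
buckets = [[] for _ in range(TABLE_SIZE)]

def store(key, value):
    """Place the pair in the bucket selected by the key's hash code."""
    bucket = buckets[hash(key) % TABLE_SIZE]
    for entry in bucket:
        if entry[0] == key:      # key already present: overwrite its value
            entry[1] = value
            return
    bucket.append([key, value])

def fetch(key):
    """Look only in the one bucket the hash code points at."""
    for stored_key, value in buckets[hash(key) % TABLE_SIZE]:
        if stored_key == key:
            return value
    raise KeyError(key)

store("Python", 1)
store("python", 2)
store("Python", 3)   # same key again, so the value is replaced
```

A lookup never scans the whole table; it inspects only the bucket chosen by the hash code, which is why retrieval time does not grow with the number of stored keys as long as the keys spread out evenly.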
| 573 |
|
---|
| 574 |
|
---|
| 575 | Why must dictionary keys be immutable?
|
---|
| 576 | --------------------------------------
|
---|
| 577 |
|
---|
| 578 | The hash table implementation of dictionaries uses a hash value calculated from
|
---|
| 579 | the key value to find the key. If the key were a mutable object, its value
|
---|
| 580 | could change, and thus its hash could also change. But since whoever changes
|
---|
| 581 | the key object can't tell that it was being used as a dictionary key, it can't
|
---|
| 582 | move the entry around in the dictionary. Then, when you try to look up the same
|
---|
| 583 | object in the dictionary it won't be found because its hash value is different.
|
---|
| 584 | If you tried to look up the old value it wouldn't be found either, because the
|
---|
| 585 | value of the object found in that hash bin would be different.
|
---|
| 586 |
|
---|
| 587 | If you want a dictionary indexed with a list, simply convert the list to a tuple
|
---|
| 588 | first; the function ``tuple(L)`` creates a tuple with the same entries as the
|
---|
| 589 | list ``L``. Tuples are immutable and can therefore be used as dictionary keys.
|
---|
| 590 |
|
---|
| 591 | Some unacceptable solutions that have been proposed:
|
---|
| 592 |
|
---|
| 593 | - Hash lists by their address (object ID). This doesn't work because if you
|
---|
| 594 | construct a new list with the same value it won't be found; e.g.::
|
---|
| 595 |
|
---|
[391] | 596 | mydict = {[1, 2]: '12'}
|
---|
| 597 | print mydict[[1, 2]]
|
---|
[2] | 598 |
|
---|
[391] | 599 | would raise a KeyError exception because the id of the ``[1, 2]`` used in the
|
---|
[2] | 600 | second line differs from that in the first line. In other words, dictionary
|
---|
| 601 | keys should be compared using ``==``, not using :keyword:`is`.
|
---|
| 602 |
|
---|
| 603 | - Make a copy when using a list as a key. This doesn't work because the list,
|
---|
| 604 | being a mutable object, could contain a reference to itself, and then the
|
---|
| 605 | copying code would run into an infinite loop.
|
---|
| 606 |
|
---|
| 607 | - Allow lists as keys but tell the user not to modify them. This would allow a
|
---|
| 608 | class of hard-to-track bugs in programs when you forget, or modify a list by
|
---|
| 609 | accident. It also invalidates an important invariant of dictionaries: every
|
---|
| 610 | value in ``d.keys()`` is usable as a key of the dictionary.
|
---|
| 611 |
|
---|
| 612 | - Mark lists as read-only once they are used as a dictionary key. The problem
|
---|
| 613 | is that it's not just the top-level object that could change its value; you
|
---|
| 614 | could use a tuple containing a list as a key. Entering anything as a key into
|
---|
| 615 | a dictionary would require marking all objects reachable from there as
|
---|
| 616 | read-only -- and again, self-referential objects could cause an infinite loop.
|
---|
| 617 |
|
---|
| 618 | There is a trick to get around this if you need to, but use it at your own risk:
|
---|
| 619 | You can wrap a mutable structure inside a class instance which has both a
|
---|
[391] | 620 | :meth:`__eq__` and a :meth:`__hash__` method. You must then make sure that the
|
---|
[2] | 621 | hash value for all such wrapper objects that reside in a dictionary (or other
|
---|
| 622 | hash-based structure) remains fixed while the object is in the dictionary (or
|
---|
| 623 | other structure). ::
|
---|
| 624 |
|
---|
| 625 |    class ListWrapper:
|
---|
| 626 |        def __init__(self, the_list):
|
---|
| 627 |            self.the_list = the_list
|
---|
[391] | 628 |        def __eq__(self, other):
|
---|
[2] | 629 |            return self.the_list == other.the_list
|
---|
| 630 |        def __hash__(self):
|
---|
| 631 |            l = self.the_list
|
---|
| 632 |            result = 98767 - len(l)*555
|
---|
[391] | 633 |            for i, el in enumerate(l):
|
---|
[2] | 634 |                try:
|
---|
[391] | 635 |                    result = result + (hash(el) % 9999999) * 1001 + i
|
---|
| 636 |                except Exception:
|
---|
[2] | 637 |                    result = (result % 7777777) + i * 333
|
---|
| 638 |            return result
|
---|
| 639 |
|
---|
| 640 | Note that the hash computation is complicated by the possibility that some
|
---|
| 641 | members of the list may be unhashable and also by the possibility of arithmetic
|
---|
| 642 | overflow.
|
---|
| 643 |
|
---|
[391] | 644 | Furthermore it must always be the case that if ``o1 == o2`` (i.e. ``o1.__eq__(o2)
|
---|
| 645 | is True``) then ``hash(o1) == hash(o2)`` (i.e. ``o1.__hash__() == o2.__hash__()``),
|
---|
[2] | 646 | regardless of whether the object is in a dictionary or not. If you fail to meet
|
---|
| 647 | these restrictions, dictionaries and other hash-based structures will misbehave.
|
---|
| 648 |
|
---|
| 649 | In the case of ListWrapper, whenever the wrapper object is in a dictionary the
|
---|
| 650 | wrapped list must not change to avoid anomalies. Don't do this unless you are
|
---|
| 651 | prepared to think hard about the requirements and the consequences of not
|
---|
| 652 | meeting them correctly. Consider yourself warned.
|
---|
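The kind of anomaly being warned about can be reproduced with a cut-down wrapper. In this sketch the hash is simplified to ``hash(tuple(...))``, which assumes every element is hashable; that simplification is ours and is not the hash function shown above.

```python
class ListWrapper(object):
    def __init__(self, the_list):
        self.the_list = the_list

    def __eq__(self, other):
        return self.the_list == other.the_list

    def __hash__(self):
        # Simplified for illustration: assumes all elements are hashable.
        return hash(tuple(self.the_list))

key = ListWrapper([1, 2])
d = {key: "payload"}
found_before = ListWrapper([1, 2]) in d       # equal wrapper is found

# Break the rule: change the wrapped list while it is used as a key.
key.the_list.append(3)

found_new_value = ListWrapper([1, 2, 3]) in d  # hash no longer matches the stored one
found_old_value = ListWrapper([1, 2]) in d     # hash matches, but == now fails
```

After the mutation the entry is still stored in ``d``, yet neither an equal wrapper for the new value nor one for the old value can reach it.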
| 653 |
|
---|
| 654 |
|
---|
| 655 | Why doesn't list.sort() return the sorted list?
|
---|
| 656 | -----------------------------------------------
|
---|
| 657 |
|
---|
| 658 | In situations where performance matters, making a copy of the list just to sort
|
---|
| 659 | it would be wasteful. Therefore, :meth:`list.sort` sorts the list in place. In
|
---|
| 660 | order to remind you of that fact, it does not return the sorted list. This way,
|
---|
| 661 | you won't be fooled into accidentally overwriting a list when you need a sorted
|
---|
| 662 | copy but also need to keep the unsorted version around.
|
---|
| 663 |
|
---|
| 664 | In Python 2.4 a new built-in function -- :func:`sorted` -- has been added.
|
---|
| 665 | This function creates a new list from a provided iterable, sorts it and returns
|
---|
| 666 | it. For example, here's how to iterate over the keys of a dictionary in sorted
|
---|
| 667 | order::
|
---|
| 668 |
|
---|
[391] | 669 | for key in sorted(mydict):
|
---|
| 670 | ... # do whatever with mydict[key]...
|
---|
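The difference between the two can be seen side by side. The ``scores`` dictionary below is invented sample data:

```python
scores = {"carol": 3, "alice": 1, "bob": 2}

names = list(scores)
returned = names.sort()          # sorts names in place and returns None

ordered_copy = sorted(scores)    # builds a new list; scores is untouched
```

Here ``returned`` is ``None``, while ``names`` and ``ordered_copy`` both hold the keys in alphabetical order.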
[2] | 671 |
|
---|
| 672 |
|
---|
| 673 | How do you specify and enforce an interface spec in Python?
|
---|
| 674 | -----------------------------------------------------------
|
---|
| 675 |
|
---|
| 676 | An interface specification for a module as provided by languages such as C++ and
|
---|
| 677 | Java describes the prototypes for the methods and functions of the module. Many
|
---|
| 678 | feel that compile-time enforcement of interface specifications helps in the
|
---|
| 679 | construction of large programs.
|
---|
| 680 |
|
---|
| 681 | Python 2.6 adds an :mod:`abc` module that lets you define Abstract Base Classes
|
---|
| 682 | (ABCs). You can then use :func:`isinstance` and :func:`issubclass` to check
|
---|
| 683 | whether an instance or a class implements a particular ABC. The
|
---|
[391] | 684 | :mod:`collections` module defines a set of useful ABCs such as
|
---|
| 685 | :class:`~collections.Iterable`, :class:`~collections.Container`, and
|
---|
| 686 | :class:`~collections.MutableMapping`.
|
---|
[2] | 687 |
|
---|
| 688 | For Python, many of the advantages of interface specifications can be obtained
|
---|
| 689 | by an appropriate test discipline for components. There is also a tool,
|
---|
| 690 | PyChecker, which can be used to find problems due to subclassing.
|
---|
| 691 |
|
---|
| 692 | A good test suite for a module can both provide a regression test and serve as a
|
---|
| 693 | module interface specification and a set of examples. Many Python modules can
|
---|
| 694 | be run as a script to provide a simple "self test." Even modules which use
|
---|
| 695 | complex external interfaces can often be tested in isolation using trivial
|
---|
| 696 | "stub" emulations of the external interface. The :mod:`doctest` and
|
---|
| 697 | :mod:`unittest` modules or third-party test frameworks can be used to construct
|
---|
| 698 | exhaustive test suites that exercise every line of code in a module.
|
---|
| 699 |
|
---|
| 700 | An appropriate testing discipline can help build large complex applications in
|
---|
| 701 | Python as well as having interface specifications would. In fact, it can be
|
---|
| 702 | better because an interface specification cannot test certain properties of a
|
---|
| 703 | program. For example, the :meth:`append` method is expected to add new elements
|
---|
| 704 | to the end of some internal list; an interface specification cannot test that
|
---|
| 705 | your :meth:`append` implementation will actually do this correctly, but it's
|
---|
| 706 | trivial to check this property in a test suite.
|
---|
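Such a check might look like the following :mod:`unittest` sketch. The ``History`` class and its ``items()`` method are hypothetical, standing in for whatever component is under test:

```python
import unittest

class History(object):
    """Hypothetical container that keeps items in an internal list."""
    def __init__(self):
        self._items = []

    def append(self, item):
        self._items.append(item)

    def items(self):
        return list(self._items)

class AppendTest(unittest.TestCase):
    def test_append_adds_to_end(self):
        h = History()
        h.append("first")
        h.append("second")
        # The behavioural property an interface declaration cannot express:
        self.assertEqual(h.items(), ["first", "second"])

# Run the test case programmatically and keep the result object.
suite = unittest.TestLoader().loadTestsFromTestCase(AppendTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

An ``append`` that inserted at the front would still satisfy any signature-level interface, but it would fail this test.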
| 707 |
|
---|
| 708 | Writing test suites is very helpful, and you might want to design your code with
|
---|
| 709 | an eye to making it easily tested. One increasingly popular technique,
|
---|
| 710 | test-driven development, calls for writing parts of the test suite first,
|
---|
| 711 | before you write any of the actual code. Of course Python allows you to be
|
---|
| 712 | sloppy and not write test cases at all.
|
---|
| 713 |
|
---|
| 714 |
|
---|
| 715 | Why are default values shared between objects?
|
---|
| 716 | ----------------------------------------------
|
---|
| 717 |
|
---|
| 718 | This type of bug commonly bites neophyte programmers. Consider this function::
|
---|
| 719 |
|
---|
[391] | 720 |    def foo(mydict={}):  # Danger: shared reference to one dict for all calls
|
---|
[2] | 721 |        ... compute something ...
|
---|
[391] | 722 |        mydict[key] = value
|
---|
| 723 |        return mydict
|
---|
[2] | 724 |
|
---|
[391] | 725 | The first time you call this function, ``mydict`` contains a single item. The
|
---|
| 726 | second time, ``mydict`` contains two items because when ``foo()`` begins
|
---|
| 727 | executing, ``mydict`` starts out with an item already in it.
|
---|
[2] | 728 |
|
---|
| 729 | It is often expected that a function call creates new objects for default
|
---|
| 730 | values. This is not what happens. Default values are created exactly once, when
|
---|
| 731 | the function is defined. If that object is changed, like the dictionary in this
|
---|
| 732 | example, subsequent calls to the function will refer to this changed object.
|
---|
| 733 |
|
---|
| 734 | By definition, immutable objects such as numbers, strings, tuples, and ``None``,
|
---|
| 735 | are safe from change. Changes to mutable objects such as dictionaries, lists,
|
---|
| 736 | and class instances can lead to confusion.
|
---|
| 737 |
|
---|
| 738 | Because of this feature, it is good programming practice to not use mutable
|
---|
| 739 | objects as default values. Instead, use ``None`` as the default value and
|
---|
| 740 | inside the function, check if the parameter is ``None`` and create a new
|
---|
| 741 | list/dictionary/whatever if it is. For example, don't write::
|
---|
| 742 |
|
---|
[391] | 743 | def foo(mydict={}):
|
---|
[2] | 744 | ...
|
---|
| 745 |
|
---|
| 746 | but::
|
---|
| 747 |
|
---|
[391] | 748 |    def foo(mydict=None):
|
---|
| 749 |        if mydict is None:
|
---|
| 750 |            mydict = {}  # create a new dict for local namespace
|
---|
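The two styles can be compared directly in a runnable form. The function names ``add_shared`` and ``add_fresh`` are made up for this sketch:

```python
def add_shared(key, value, mydict={}):
    # The default dict is created once, when the function is defined.
    mydict[key] = value
    return mydict

def add_fresh(key, value, mydict=None):
    if mydict is None:
        mydict = {}   # a new dict on every call that omits the argument
    mydict[key] = value
    return mydict

shared_first = add_shared("a", 1)
shared_second = add_shared("b", 2)   # same dict object as shared_first

fresh_first = add_fresh("a", 1)
fresh_second = add_fresh("b", 2)     # independent of fresh_first
```

Both calls to ``add_shared`` return the very same dictionary, which by then holds two items, whereas each call to ``add_fresh`` returns its own one-item dictionary.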
[2] | 751 |
|
---|
| 752 | This feature can be useful. When you have a function that's time-consuming to
|
---|
| 753 | compute, a common technique is to cache the parameters and the resulting value
|
---|
| 754 | of each call to the function, and return the cached value if the same value is
|
---|
| 755 | requested again. This is called "memoizing", and can be implemented like this::
|
---|
| 756 |
|
---|
| 757 |    # Callers will never provide a third parameter for this function.
|
---|
[391] | 758 |    def expensive(arg1, arg2, _cache={}):
|
---|
| 759 |        if (arg1, arg2) in _cache:
|
---|
[2] | 760 |            return _cache[(arg1, arg2)]
|
---|
| 761 |
|
---|
| 762 |        # Calculate the value
|
---|
| 763 |        result = ... expensive computation ...
|
---|
| 764 |        _cache[(arg1, arg2)] = result           # Store result in the cache
|
---|
| 765 |        return result
|
---|
| 766 |
|
---|
| 767 | You could use a global variable containing a dictionary instead of the default
|
---|
| 768 | value; it's a matter of taste.
|
---|
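A complete, runnable variant of the memoizing pattern might look like this. The function ``slow_square`` and the ``calls`` list, which records how often the real computation runs, are stand-ins invented for the sketch:

```python
calls = []   # records each time the "expensive" part actually runs

def slow_square(n, _cache={}):
    if n in _cache:
        return _cache[n]

    calls.append(n)          # stand-in for a time-consuming computation
    result = n * n
    _cache[n] = result       # store the result for later calls
    return result

first = slow_square(12)
second = slow_square(12)     # served from the cache
```

The second call returns the same value without repeating the computation, so ``calls`` contains a single entry.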
| 769 |
|
---|
| 770 |
|
---|
| 771 | Why is there no goto?
|
---|
| 772 | ---------------------
|
---|
| 773 |
|
---|
| 774 | You can use exceptions to provide a "structured goto" that even works across
|
---|
| 775 | function calls. Many feel that exceptions can conveniently emulate all
|
---|
| 776 | reasonable uses of the "go" or "goto" constructs of C, Fortran, and other
|
---|
| 777 | languages. For example::
|
---|
| 778 |
|
---|
[391] | 779 |    class label(Exception): pass  # declare a label
|
---|
[2] | 780 |
|
---|
| 781 |    try:
|
---|
| 782 |        ...
|
---|
[391] | 783 |        if condition: raise label()  # goto label
|
---|
[2] | 784 |        ...
|
---|
[391] | 785 |    except label:  # where to goto
|
---|
[2] | 786 |        pass
|
---|
| 787 |    ...
|
---|
| 788 |
|
---|
| 789 | This doesn't allow you to jump into the middle of a loop, but that's usually
|
---|
| 790 | considered an abuse of goto anyway. Use sparingly.
|
---|
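One common use is leaving several nested loops at once. The ``Found`` exception and the ``grid`` data below are invented for this sketch:

```python
class Found(Exception):
    """Used only as a jump target, not to signal an error."""
    pass

grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
position = None

try:
    for row_index, row in enumerate(grid):
        for col_index, value in enumerate(row):
            if value == 5:
                position = (row_index, col_index)
                raise Found()        # jump out of both loops
except Found:
    pass                             # execution continues here
```

After the jump, ``position`` holds the row and column where the value was found.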
| 791 |
|
---|
| 792 |
|
---|
| 793 | Why can't raw strings (r-strings) end with a backslash?
|
---|
| 794 | -------------------------------------------------------
|
---|
| 795 |
|
---|
| 796 | More precisely, they can't end with an odd number of backslashes: the unpaired
|
---|
| 797 | backslash at the end escapes the closing quote character, leaving an
|
---|
| 798 | unterminated string.
|
---|
| 799 |
|
---|
| 800 | Raw strings were designed to ease creating input for processors (chiefly regular
|
---|
| 801 | expression engines) that want to do their own backslash escape processing. Such
|
---|
| 802 | processors consider an unmatched trailing backslash to be an error anyway, so
|
---|
| 803 | raw strings disallow that. In return, they allow you to pass on the string
|
---|
| 804 | quote character by escaping it with a backslash. These rules work well when
|
---|
| 805 | r-strings are used for their intended purpose.
|
---|
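These rules can be checked interactively. In the sketch below, ``compile()`` is used only so that the invalid literal can be tested without stopping the example; the variable names are made up:

```python
# An escaped quote is allowed, and the backslash stays in the string.
pattern = r"\""
# An even number of trailing backslashes is accepted; both are kept.
doubled = r"C:\dir\\"

# An odd number of trailing backslashes leaves the literal unterminated.
try:
    compile('r"abc\\"', "<example>", "eval")   # source text is: r"abc\"
    odd_rejected = False
except SyntaxError:
    odd_rejected = True
```

``pattern`` is two characters long (a backslash followed by a double quote), ``doubled`` ends in two backslashes, and the last literal is refused with a :exc:`SyntaxError`.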
| 806 |
|
---|
| 807 | If you're trying to build Windows pathnames, note that all Windows system calls
|
---|
| 808 | accept forward slashes too::
|
---|
| 809 |
|
---|
[391] | 810 | f = open("/mydir/file.txt") # works fine!
|
---|
[2] | 811 |
|
---|
| 812 | If you're trying to build a pathname for a DOS command, try e.g. one of ::
|
---|
| 813 |
|
---|
| 814 | dir = r"\this\is\my\dos\dir" "\\"
|
---|
| 815 | dir = r"\this\is\my\dos\dir\ "[:-1]
|
---|
| 816 | dir = "\\this\\is\\my\\dos\\dir\\"
|
---|
| 817 |
|
---|
| 818 |
|
---|
| 819 | Why doesn't Python have a "with" statement for attribute assignments?
|
---|
| 820 | ---------------------------------------------------------------------
|
---|
| 821 |
|
---|
| 822 | Python has a 'with' statement that wraps the execution of a block, calling code
|
---|
| 823 | on the entrance and exit from the block. Some languages have a construct that
|
---|
| 824 | looks like this::
|
---|
| 825 |
|
---|
| 826 | with obj:
|
---|
[391] | 827 | a = 1 # equivalent to obj.a = 1
|
---|
[2] | 828 | total = total + 1 # obj.total = obj.total + 1
|
---|
| 829 |
|
---|
| 830 | In Python, such a construct would be ambiguous.
|
---|
| 831 |
|
---|
| 832 | Other languages, such as Object Pascal, Delphi, and C++, use static types, so
|
---|
| 833 | it's possible to know, in an unambiguous way, what member is being assigned
|
---|
| 834 | to. This is the main point of static typing -- the compiler *always* knows the
|
---|
| 835 | scope of every variable at compile time.
|
---|
| 836 |
|
---|
| 837 | Python uses dynamic types. It is impossible to know in advance which attribute
|
---|
| 838 | will be referenced at runtime. Member attributes may be added or removed from
|
---|
| 839 | objects on the fly. This makes it impossible to know, from a simple reading,
|
---|
| 840 | what attribute is being referenced: a local one, a global one, or a member
|
---|
| 841 | attribute?
|
---|
| 842 |
|
---|
| 843 | For instance, take the following incomplete snippet::
|
---|
| 844 |
|
---|
| 845 | def foo(a):
|
---|
| 846 | with a:
|
---|
| 847 | print x
|
---|
| 848 |
|
---|
| 849 | The snippet assumes that "a" must have a member attribute called "x". However,
|
---|
| 850 | there is nothing in Python that tells the interpreter this. What should happen
|
---|
| 851 | if "a" is, let us say, an integer? If there is a global variable named "x",
|
---|
| 852 | will it be used inside the with block? As you see, the dynamic nature of Python
|
---|
| 853 | makes such choices much harder.
|
---|
| 854 |
|
---|
| 855 | The primary benefit of "with" and similar language features (reduction of code
|
---|
| 856 | volume) can, however, easily be achieved in Python by assignment. Instead of::
|
---|
| 857 |
|
---|
[391] | 858 | function(args).mydict[index][index].a = 21
|
---|
| 859 | function(args).mydict[index][index].b = 42
|
---|
| 860 | function(args).mydict[index][index].c = 63
|
---|
[2] | 861 |
|
---|
| 862 | write this::
|
---|
| 863 |
|
---|
[391] | 864 | ref = function(args).mydict[index][index]
|
---|
[2] | 865 | ref.a = 21
|
---|
| 866 | ref.b = 42
|
---|
| 867 | ref.c = 63
|
---|
| 868 |
|
---|
| 869 | This also has the side-effect of increasing execution speed because name
|
---|
| 870 | bindings are resolved at run-time in Python, and the second version only needs
|
---|
[391] | 871 | to perform the resolution once.
|
---|
[2] | 872 |
|
---|
| 873 |
|
---|
| 874 | Why are colons required for the if/while/def/class statements?
|
---|
| 875 | --------------------------------------------------------------
|
---|
| 876 |
|
---|
| 877 | The colon is required primarily to enhance readability (one of the results of
|
---|
| 878 | the experimental ABC language). Consider this::
|
---|
| 879 |
|
---|
| 880 | if a == b
|
---|
| 881 | print a
|
---|
| 882 |
|
---|
| 883 | versus ::
|
---|
| 884 |
|
---|
| 885 | if a == b:
|
---|
| 886 | print a
|
---|
| 887 |
|
---|
| 888 | Notice how the second one is slightly easier to read. Notice further how a
|
---|
| 889 | colon sets off the example in this FAQ answer; it's a standard usage in English.
|
---|
| 890 |
|
---|
| 891 | Another minor reason is that the colon makes it easier for editors with syntax
|
---|
| 892 | highlighting; they can look for colons to decide when indentation needs to be
|
---|
| 893 | increased instead of having to do a more elaborate parsing of the program text.
|
---|
| 894 |
|
---|
| 895 |
|
---|
| 896 | Why does Python allow commas at the end of lists and tuples?
|
---|
| 897 | ------------------------------------------------------------
|
---|
| 898 |
|
---|
| 899 | Python lets you add a trailing comma at the end of lists, tuples, and
|
---|
| 900 | dictionaries::
|
---|
| 901 |
|
---|
| 902 | [1, 2, 3,]
|
---|
| 903 | ('a', 'b', 'c',)
|
---|
| 904 | d = {
|
---|
| 905 | "A": [1, 5],
|
---|
| 906 | "B": [6, 7], # last trailing comma is optional but good style
|
---|
| 907 | }
|
---|
| 908 |
|
---|
| 909 |
|
---|
| 910 | There are several reasons to allow this.
|
---|
| 911 |
|
---|
| 912 | When you have a literal value for a list, tuple, or dictionary spread across
|
---|
| 913 | multiple lines, it's easier to add more elements because you don't have to
|
---|
[391] | 914 | remember to add a comma to the previous line. The lines can also be reordered
|
---|
| 915 | without creating a syntax error.
|
---|
[2] | 916 |
|
---|
| 917 | Accidentally omitting the comma can lead to errors that are hard to diagnose.
|
---|
| 918 | For example::
|
---|
| 919 |
|
---|
| 920 | x = [
|
---|
| 921 | "fee",
|
---|
| 922 | "fie"
|
---|
| 923 | "foo",
|
---|
| 924 | "fum"
|
---|
| 925 | ]
|
---|
| 926 |
|
---|
| 927 | This list looks like it has four elements, but it actually contains three:
|
---|
| 928 | "fee", "fiefoo" and "fum". Always adding the comma avoids this source of error.
|
---|
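The effect of the missing comma is easy to confirm, because adjacent string literals are joined into one. The list ``y`` is added here for comparison:

```python
x = [
    "fee",
    "fie"       # missing comma: joined with the next literal
    "foo",
    "fum"
]

y = [
    "fee",
    "fie",
    "foo",
    "fum",      # trailing comma: safe to append or reorder lines
]
```

``x`` ends up with three elements, the second being ``"fiefoo"``, while ``y`` has the intended four.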
| 929 |
|
---|
| 930 | Allowing the trailing comma may also make programmatic code generation easier.
|
---|