************************************
  Idioms and Anti-Idioms in Python
************************************

:Author: Moshe Zadka

This document is placed in the public domain.


.. topic:: Abstract

   This document can be considered a companion to the tutorial. It shows how to use
   Python, and even more importantly, how *not* to use Python.


Language Constructs You Should Not Use
======================================

While Python has relatively few gotchas compared to other languages, it still
has some constructs which are only useful in corner cases, or are plain
dangerous.


from module import \*
---------------------


Inside Function Definitions
^^^^^^^^^^^^^^^^^^^^^^^^^^^

``from module import *`` is *invalid* inside function definitions. While many
versions of Python do not check for the invalidity, that does not make it
valid, any more than having a smart lawyer makes a man innocent. Do not use it
like that, ever. Even in versions where it was accepted, it made function
execution slower, because the compiler could not be certain which names were
local and which were global. In Python 2.1 this construct causes warnings, and
sometimes even errors.


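A quick way to check how a given interpreter treats the construct is to compile
it from a string. The following is a minimal sketch, assuming a Python 3
interpreter, where ``import *`` inside a function body is rejected at compile
time; the names ``source`` and ``rejected`` and the function ``area`` are
illustrative only.

```python
# Sketch, assuming Python 3: try to compile a function that uses "import *".
source = (
    "def area(radius):\n"
    "    from math import *\n"
    "    return pi * radius ** 2\n"
)

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    # Python 3 refuses to compile the function at all.
    rejected = True
```

Compiling from a string keeps the offending code out of the module itself, so
the check can run without breaking the file it lives in.
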
At Module Level
^^^^^^^^^^^^^^^

While it is valid to use ``from module import *`` at module level, it is usually
a bad idea. For one, this loses an important property Python otherwise has ---
you can find where each top-level name is defined with a simple "search"
function in your favourite editor. You also open yourself to trouble in the
future, if some module grows additional functions or classes.

One of the most awful questions asked on the newsgroup is why this code::

   f = open("www")
   f.read()

does not work. Of course, it works just fine (assuming you have a file called
"www"). But it does not work if somewhere in the module, the statement ``from
os import *`` is present. The :mod:`os` module has a function called
:func:`open` which returns an integer. While it is very useful, shadowing a
builtin is one of its least useful properties.

Remember, you can never know for sure what names a module exports, so either
take what you need --- ``from module import name1, name2``, or keep them in the
module and access on a per-need basis --- ``import module; print module.name``.


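The shadowing described above can be observed without damaging your own
namespace. This is a rough sketch, assuming Python 3 (where the builtins live
in the ``builtins`` module); the names ``scratch``, ``shadowed``, ``differs``
and ``collisions`` are made up for the illustration.

```python
import builtins
import os

# Run the star-import in a scratch namespace so this module's names stay intact.
scratch = {}
exec("from os import *", scratch)

shadowed = scratch["open"] is os.open     # the star-import bound os.open
differs = os.open is not builtins.open    # ...which is not the builtin open

# Every builtin name that "from os import *" would silently replace.
collisions = sorted(set(os.__all__) & set(dir(builtins)))
```

Note that the star-import is executed with an explicit dictionary, so the
experiment cannot clobber anything in the surrounding code.
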
When It Is Just Fine
^^^^^^^^^^^^^^^^^^^^

There are situations in which ``from module import *`` is just fine:

* The interactive prompt. For example, ``from math import *`` makes Python an
  amazing scientific calculator.

* When extending a module in C with a module in Python.

* When the module advertises itself as ``from import *`` safe.


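One common way for a module to limit what a star-import brings in is to define
``__all__``. The sketch below assumes Python 3 and builds a throwaway module in
memory; the module name ``shapes`` and its functions are hypothetical and stand
in for a real file on disk.

```python
import sys
import types

# Build a throwaway module in memory; "shapes" and its functions are hypothetical.
shapes = types.ModuleType("shapes")
module_source = (
    "__all__ = ['area']\n"
    "def area(width, height):\n"
    "    return width * height\n"
    "def scale(value, factor):\n"
    "    return value * factor\n"
)
exec(module_source, shapes.__dict__)
sys.modules["shapes"] = shapes  # make it importable by name

# Star-import into a scratch namespace and look at what arrived.
scratch = {}
exec("from shapes import *", scratch)
imported = sorted(name for name in scratch if not name.startswith("__"))
```

Only ``area`` is imported; ``scale`` stays behind because it is not listed in
``__all__``.
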
Unadorned :keyword:`exec`, :func:`execfile` and friends
-------------------------------------------------------

The word "unadorned" refers to the use without an explicit dictionary, in which
case those constructs evaluate code in the *current* environment. This is
dangerous for the same reasons ``from import *`` is dangerous --- it might step
over variables you are counting on and mess up things for the rest of your code.
Simply do not do that.

Bad examples::

   >>> for name in sys.argv[1:]:
   >>>     exec "%s=1" % name
   >>> def func(s, **kw):
   >>>     for var, val in kw.items():
   >>>         exec "s.%s=val" % var  # invalid!
   >>> execfile("handler.py")
   >>> handle()

Good examples::

   >>> d = {}
   >>> for name in sys.argv[1:]:
   >>>     d[name] = 1
   >>> def func(s, **kw):
   >>>     for var, val in kw.items():
   >>>         setattr(s, var, val)
   >>> d = {}
   >>> execfile("handler.py", d, d)
   >>> handle = d['handle']
   >>> handle()


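The same ideas carry over to Python 3, where ``exec`` is a function and
:func:`execfile` no longer exists. The following is a sketch under that
assumption; ``handler_source``, ``Settings`` and ``configure`` are hypothetical
names, and the handler code is given as a string rather than read from a file.

```python
# Hypothetical handler source; in practice this might be read from a file.
handler_source = (
    "def handle():\n"
    "    return 'handled'\n"
)

namespace = {}                     # explicit dictionary for the executed code
exec(handler_source, namespace)    # defines names in `namespace` only
handle = namespace["handle"]
result = handle()


class Settings(object):
    """Plain attribute holder used for the setattr() illustration."""


def configure(target, **options):
    # setattr() replaces the 'exec "s.%s=val"' anti-idiom.
    for name, value in options.items():
        setattr(target, name, value)
    return target


settings = configure(Settings(), debug=True, retries=3)
```

Everything the executed code defines ends up in ``namespace``, so the caller
decides which names, if any, to pull out of it.
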
from module import name1, name2
-------------------------------

This is a "don't" which is much weaker than the previous "don't"s, but it is
still something you should not do without a good reason. It is usually a bad
idea because you suddenly have an object which lives
in two separate namespaces. When the binding in one namespace changes, the
binding in the other will not, so there will be a discrepancy between them. This
happens when, for example, one module is reloaded, or changes the definition of
a function at runtime.

Bad example::

   # foo.py
   a = 1

   # bar.py
   from foo import a
   if something():
       a = 2 # danger: foo.a != a

Good example::

   # foo.py
   a = 1

   # bar.py
   import foo
   if something():
       foo.a = 2


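The discrepancy between the two bindings can be reproduced in a single file.
This sketch assumes Python 3 and uses an in-memory stand-in for ``foo.py``
(built with :func:`types.ModuleType`); the name ``snapshot`` is illustrative.

```python
import sys
import types

# Stand-in for foo.py, built in memory so the sketch is self-contained.
foo = types.ModuleType("foo")
foo.a = 1
sys.modules["foo"] = foo

from foo import a      # binds a second name, "a", to the object 1

foo.a = 2              # rebinding inside the module...
snapshot = (a, foo.a)  # ...leaves the imported name untouched
```

After the rebinding, ``a`` is still ``1`` while ``foo.a`` is ``2``, which is
exactly the mismatch that ``import foo`` avoids.
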
except:
-------

Python has the ``except:`` clause, which catches all exceptions. Since *every*
error in Python raises an exception, using ``except:`` can make many
programming errors look like runtime problems, which hinders the debugging
process.

The following code shows a great example of why this is bad::

   try:
       foo = opne("file") # misspelled "open"
   except:
       sys.exit("could not open file!")

The second line triggers a :exc:`NameError`, which is caught by the except
clause. The program will exit, and the error message the program prints will
make you think the problem is the readability of ``"file"`` when in fact
the real error has nothing to do with ``"file"``.

A better way to write the above is ::

   try:
       foo = opne("file")
   except IOError:
       sys.exit("could not open file")

When this is run, Python will produce a traceback showing the :exc:`NameError`,
and it will be immediately apparent what needs to be fixed.

.. index:: bare except, except; bare

Because ``except:`` catches *all* exceptions, including :exc:`SystemExit`,
:exc:`KeyboardInterrupt`, and :exc:`GeneratorExit` (which is not an error and
should not normally be caught by user code), using a bare ``except:`` is almost
never a good idea. In situations where you need to catch all "normal" errors,
such as in a framework that runs callbacks, you can catch the base class for
all normal exceptions, :exc:`Exception`. Unfortunately in Python 2.x it is
possible for third-party code to raise exceptions that do not inherit from
:exc:`Exception`, so in Python 2.x there are some cases where you may have to
use a bare ``except:`` and manually re-raise the exceptions you don't want
to catch.


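For the callback-framework case mentioned above, catching :exc:`Exception`
might look like the sketch below. It assumes Python 3, where
:exc:`SystemExit`, :exc:`KeyboardInterrupt` and :exc:`GeneratorExit` do not
derive from :exc:`Exception`; ``run_callbacks`` and the three sample callbacks
are hypothetical.

```python
def run_callbacks(callbacks):
    """Call each callback; collect ordinary errors instead of aborting."""
    failures = []
    for callback in callbacks:
        try:
            callback()
        except Exception as err:
            # SystemExit, KeyboardInterrupt and GeneratorExit are not
            # subclasses of Exception in Python 3, so they still propagate.
            failures.append((callback.__name__, err))
    return failures


def ok():
    return None


def broken():
    raise ValueError("bad input")


def quit_now():
    raise SystemExit(1)


failures = run_callbacks([ok, broken])
```

Here ``failures`` records the :exc:`ValueError` from ``broken``, while a call
such as ``run_callbacks([quit_now])`` would still terminate the program.
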
Exceptions
==========

Exceptions are a useful feature of Python. You should learn to raise them
whenever something unexpected occurs, and catch them only where you can do
something about them.

The following is a very popular anti-idiom ::

   def get_status(file):
       if not os.path.exists(file):
           print "file not found"
           sys.exit(1)
       return open(file).readline()

Consider the case where the file gets deleted between the time the call to
:func:`os.path.exists` is made and the time :func:`open` is called. In that
case the last line will raise an :exc:`IOError`. The same thing would happen
if *file* exists but has no read permission. Since testing this on a normal
machine on existent and non-existent files makes it seem bugless, the test
results will seem fine, and the code will get shipped. Later an unhandled
:exc:`IOError` (or perhaps some other :exc:`EnvironmentError`) escapes to the
user, who gets to watch the ugly traceback.

Here is a somewhat better way to do it. ::

   def get_status(file):
       try:
           return open(file).readline()
       except EnvironmentError as err:
           print "Unable to open file: {}".format(err)
           sys.exit(1)

In this version, *either* the file gets opened and the line is read (so it
works even on flaky NFS or SMB connections), or an error message is printed
that provides all the available information on why the open failed, and the
application is aborted.

However, even this version of :func:`get_status` makes too many assumptions ---
that it will only be used in a short-running script, and not, say, in a
long-running server. Sure, the caller could do something like ::

   try:
       status = get_status(log)
   except SystemExit:
       status = None

But there is a better way. You should try to use as few ``except`` clauses in
your code as you can --- the ones you do use will usually be inside calls which
should always succeed, or a catch-all in a main function.

So, an even better version of :func:`get_status` is probably ::

   def get_status(file):
       return open(file).readline()

The caller can deal with the exception if it wants (for example, if it tries
several files in a loop), or just let the exception filter upwards to *its*
caller.

But the last version still has a serious problem --- due to implementation
details in CPython, the file would not be closed when an exception is raised
until the exception handler finishes; and, worse, in other implementations
(e.g., Jython) it might not be closed at all regardless of whether or not
an exception is raised.

The best version of this function uses the ``open()`` call as a context
manager, which will ensure that the file gets closed as soon as the
function returns::

   def get_status(file):
       with open(file) as fp:
           return fp.readline()


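To illustrate the "tries several files in a loop" case, a caller could wrap the
final :func:`get_status` roughly as follows. This is a sketch assuming
Python 3, where :exc:`IOError` and :exc:`EnvironmentError` are aliases of
:exc:`OSError`; ``first_status`` and the temporary file names are hypothetical.

```python
import os
import tempfile


def get_status(path):
    with open(path) as fp:
        return fp.readline()


def first_status(paths):
    """Return the first line of the first readable file, or None."""
    for path in paths:
        try:
            return get_status(path)
        except OSError:
            # Missing or unreadable: try the next candidate.
            continue
    return None


# Small demonstration with one existing and one missing file.
workdir = tempfile.mkdtemp()
present = os.path.join(workdir, "status.txt")
absent = os.path.join(workdir, "missing.txt")
with open(present, "w") as fp:
    fp.write("up\n")

found = first_status([absent, present])
not_found = first_status([absent])
```

The function that reads the file stays free of error handling; only the caller
that knows what to do about a failure catches the exception.
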
Using the Batteries
===================

Every so often, people seem to rewrite things that are already in the Python
library, usually poorly. While the occasional module has a poor interface, it is
usually much better to use the rich standard library and data types that come
with Python than to invent your own.

A useful module very few people know about is :mod:`os.path`. It always has the
correct path arithmetic for your operating system, and will usually be much
better than whatever you come up with yourself.

Compare::

   # ugh!
   return dir+"/"+file
   # better
   return os.path.join(dir, file)

More useful functions in :mod:`os.path`: :func:`basename`, :func:`dirname` and
:func:`splitext`.

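As a brief illustration of these helpers, the sketch below takes a path apart
again after joining it. The path components are made up, and the separator in
``path`` depends on the operating system the code runs on.

```python
import os.path

# Build a path from components instead of concatenating with "/".
path = os.path.join("var", "log", "app.log")

directory = os.path.dirname(path)              # everything before the last component
filename = os.path.basename(path)              # "app.log"
stem, extension = os.path.splitext(filename)   # ("app", ".log")
```
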
There are also many useful built-in functions people seem not to be aware of
for some reason: :func:`min` and :func:`max` can find the minimum/maximum of
any sequence with comparable semantics, for example, yet many people write
their own :func:`max`/:func:`min`. Another highly useful function is
:func:`reduce`, which can be used to repeatedly apply a binary operation to a
sequence, reducing it to a single value. For example, compute a factorial
with a series of multiply operations::

   >>> n = 4
   >>> import operator
   >>> reduce(operator.mul, range(1, n+1))
   24

When it comes to parsing numbers, note that :func:`float`, :func:`int` and
:func:`long` all accept string arguments and will reject ill-formed strings
by raising a :exc:`ValueError`.


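The same batteries are available in Python 3, with small differences:
:func:`reduce` lives in :mod:`functools` and :func:`long` has been merged into
:func:`int`. The following is a sketch under that assumption; ``factorial``,
``parse_int`` and the sample data are hypothetical.

```python
from functools import reduce
import operator


def factorial(n):
    # The initial value 1 makes factorial(0) work on an empty range.
    return reduce(operator.mul, range(1, n + 1), 1)


def parse_int(text, default=None):
    """Let int() do the validation instead of hand-written checks."""
    try:
        return int(text)
    except ValueError:
        return default


words = ["a", "bbb", "cc"]
longest = max(words, key=len)    # max() with a key function
shortest = min(words, key=len)
```
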
Using Backslash to Continue Statements
======================================

Since Python treats a newline as a statement terminator, and since statements
are often longer than fits comfortably on one line, many people do::

   if foo.bar()['first'][0] == baz.quux(1, 2)[5:9] and \
           calculate_number(10, 20) != forbulate(500, 360):
       pass

You should realize that this is dangerous: a stray space after the ``\`` would
make this line wrong, and stray spaces are notoriously hard to see in editors.
In this case, at least it would be a syntax error, but if the code was::

   value = foo.bar()['first'][0]*baz.quux(1, 2)[5:9] \
           + calculate_number(10, 20)*forbulate(500, 360)

then it would just be subtly wrong.

It is usually much better to use the implicit continuation inside parentheses.
This version is bulletproof::

   value = (foo.bar()['first'][0]*baz.quux(1, 2)[5:9]
           + calculate_number(10, 20)*forbulate(500, 360))

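The effect of a stray space after a backslash on a condition like the first
example can be checked with :func:`compile`. This is a minimal sketch assuming
Python 3; the helper ``compiles`` and the two source strings are made up for
the illustration.

```python
def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


# Backslash continuation with a stray space after the backslash.
with_stray_space = "ok = (1 == 1) and \\ \n    (2 == 2)\n"

# The same statement using implicit continuation inside parentheses;
# the trailing space after "and" is harmless here.
with_parentheses = "ok = ((1 == 1) and \n      (2 == 2))\n"

stray_space_ok = compiles(with_stray_space)
parentheses_ok = compiles(with_parentheses)
```

The backslash version fails to compile, while the parenthesized version is
unaffected by trailing whitespace.
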