design.rst

Last change on this file was 391, checked in by dmik, 11 years ago
python: Merge vendor 2.7.6 to trunk.
Property svn:eol-style set to `native`
File size: 39.4 KB

======================
Design and History FAQ
======================

Why does Python use indentation for grouping of statements?
-----------------------------------------------------------

Guido van Rossum believes that using indentation for grouping is extremely
elegant and contributes a lot to the clarity of the average Python program.
Most people learn to love this feature after a while.

Since there are no begin/end brackets there cannot be a disagreement between
grouping perceived by the parser and the human reader. Occasionally C
programmers will encounter a fragment of code like this::

   if (x <= y)
           x++;
           y--;
   z++;

Only the ``x++`` statement is executed if the condition is true, but the
indentation leads you to believe otherwise. Even experienced C programmers will
sometimes stare at it a long time wondering why ``y`` is being decremented even
for ``x > y``.

Because there are no begin/end brackets, Python is much less prone to
coding-style conflicts. In C there are many different ways to place the braces.
If you're used to reading and writing code that uses one style, you will feel at
least slightly uneasy when reading (or being required to write) another style.

Many coding styles place begin/end brackets on a line by themselves. This makes
programs considerably longer and wastes valuable screen space, making it harder
to get a good overview of a program. Ideally, a function should fit on one
screen (say, 20-30 lines). 20 lines of Python can do a lot more work than 20
lines of C. This is not solely due to the lack of begin/end brackets -- the
lack of declarations and the high-level data types are also responsible -- but
the indentation-based syntax certainly helps.


Why am I getting strange results with simple arithmetic operations?
-------------------------------------------------------------------

See the next question.


Why are floating point calculations so inaccurate?
--------------------------------------------------

People are often very surprised by results like this::

   >>> 1.2 - 1.0
   0.19999999999999996

and think it is a bug in Python. It's not. This has nothing to do with Python,
but with how the underlying C platform handles floating point numbers, and
ultimately with the inaccuracies introduced when writing down numbers as a
string of a fixed number of digits.

The internal representation of floating point numbers uses a fixed number of
binary digits to represent a decimal number. Some decimal numbers can't be
represented exactly in binary, resulting in small roundoff errors.

In decimal math, there are many numbers that can't be represented with a fixed
number of decimal digits, e.g. 1/3 = 0.3333333333.......

In base 2, 1/2 = 0.1, 1/4 = 0.01, 1/8 = 0.001, etc. .2 equals 2/10 equals 1/5,
resulting in the binary fractional number 0.001100110011001...

Floating point numbers only have 32 or 64 bits of precision (Python floats are
typically 64-bit C doubles), so the digits are cut off at some point. The
operands are therefore only approximations, and the result of ``1.2 - 1.0`` is
0.19999999999999996 in decimal, not 0.2.
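
As an illustrative sketch (not part of the original FAQ), the :mod:`decimal`
module can be used to display the exact value that a float stores. Passing a
float directly to ``Decimal`` assumes Python 2.7 or later; the variable names
are made up for the example.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, digit for digit.
stored = Decimal(0.2)
print(stored)

# Decimal("0.2") is the true decimal fraction 1/5, built from a string.
exact_fifth = Decimal("0.2")

# The stored double is close to, but not equal to, 1/5.
error = stored - exact_fifth
print(error)
```

The difference is tiny but non-zero, which is why arithmetic on such values
can produce results that are off in the last printed digits.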

A floating point number's ``repr()`` function prints as many digits as are
necessary to make ``eval(repr(f)) == f`` true for any float f. The ``str()``
function prints fewer digits and this often results in the more sensible number
that was probably intended::

   >>> 1.1 - 0.9
   0.20000000000000007
   >>> print 1.1 - 0.9
   0.2

One of the consequences of this is that it is error-prone to compare the result
of some computation to a float with ``==``. Tiny inaccuracies may mean that
``==`` fails. Instead, you have to check that the difference between the two
numbers is less than a certain threshold::

   epsilon = 0.0000000000001   # Tiny allowed error
   expected_result = 0.4

   if expected_result-epsilon <= computation() <= expected_result+epsilon:
       ...

Please see the chapter on :ref:`floating point arithmetic <tut-fp-issues>` in
the Python tutorial for more information.
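
The threshold test can be wrapped in a small helper. The following is a
minimal sketch; ``nearly_equal`` and its default tolerance are assumptions
chosen for illustration, and an absolute tolerance like this one only suits
values of moderate magnitude.

```python
def nearly_equal(a, b, epsilon=1e-13):
    """Return True if a and b differ by at most epsilon (absolute tolerance)."""
    return abs(a - b) <= epsilon


result = 1.1 - 0.9                      # not exactly 0.2
exact_match = (result == 0.2)           # False: exact comparison fails
close_match = nearly_equal(result, 0.2) # True: within the allowed error
```

Python 3.5 and later also provide ``math.isclose()``, which supports relative
as well as absolute tolerances.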


Why are Python strings immutable?
---------------------------------

There are several advantages.

One is performance: knowing that a string is immutable means we can allocate
space for it at creation time, and the storage requirements are fixed and
unchanging. This is also one of the reasons for the distinction between tuples
and lists.

Another advantage is that strings in Python are considered as "elemental" as
numbers. No amount of activity will change the value 8 to anything else, and in
Python, no amount of activity will change the string "eight" to anything else.
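
A short sketch of what immutability means in practice (the variable names are
only for illustration): in-place modification is rejected, and methods that
seem to change a string return a new object instead.

```python
s = "eight"

# Item assignment is rejected: strings do not support in-place modification.
try:
    s[0] = "E"
    modified = True
except TypeError:
    modified = False

# Methods that appear to change a string build and return a new string.
t = s.replace("e", "E", 1)
```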


.. _why-self:

Why must 'self' be used explicitly in method definitions and calls?
-------------------------------------------------------------------

The idea was borrowed from Modula-3. It turns out to be very useful, for a
variety of reasons.

First, it's more obvious that you are using a method or instance attribute
instead of a local variable. Reading ``self.x`` or ``self.meth()`` makes it
absolutely clear that an instance variable or method is used even if you don't
know the class definition by heart. In C++, you can sort of tell by the lack of
a local variable declaration (assuming globals are rare or easily recognizable)
-- but in Python, there are no local variable declarations, so you'd have to
look up the class definition to be sure. Some C++ and Java coding standards
call for instance attributes to have an ``m_`` prefix, so this explicitness is
still useful in those languages, too.

Second, it means that no special syntax is necessary if you want to explicitly
reference or call the method from a particular class. In C++, if you want to
use a method from a base class which is overridden in a derived class, you have
to use the ``::`` operator -- in Python you can write
``baseclass.methodname(self, <argument list>)``. This is particularly useful
for :meth:`__init__` methods, and in general in cases where a derived class
method wants to extend the base class method of the same name and thus has to
call the base class method somehow.
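
The explicit base-class call described above might look like the following
sketch. ``Base`` and ``Derived`` are hypothetical classes invented for the
example.

```python
class Base(object):
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "Base(%s)" % self.name


class Derived(Base):
    def __init__(self, name, extra):
        # Call the overridden base-class method explicitly, passing self.
        Base.__init__(self, name)
        self.extra = extra

    def describe(self):
        # Extend, rather than replace, the base-class behaviour.
        return Base.describe(self) + " + " + self.extra


d = Derived("x", "more")
print(d.describe())
```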

Finally, for instance variables it solves a syntactic problem with assignment:
since local variables in Python are (by definition!) those variables to which a
value is assigned in a function body (and that aren't explicitly declared
global), there has to be some way to tell the interpreter that an assignment was
meant to assign to an instance variable instead of to a local variable, and it
should preferably be syntactic (for efficiency reasons). C++ does this through
declarations, but Python doesn't have declarations and it would be a pity having
to introduce them just for this purpose. Using the explicit ``self.var`` solves
this nicely. Similarly, for using instance variables, having to write
``self.var`` means that references to unqualified names inside a method don't
have to search the instance's directories. To put it another way, local
variables and instance variables live in two different namespaces, and you need
to tell Python which namespace to use.


Why can't I use an assignment in an expression?
-----------------------------------------------

Many people used to C or Perl complain that they want to use this C idiom:

.. code-block:: c

   while (line = readline(f)) {
       // do something with line
   }

where in Python you're forced to write this::

   while True:
       line = f.readline()
       if not line:
           break
       ... # do something with line

The reason for not allowing assignment in Python expressions is a common,
hard-to-find bug in those other languages, caused by this construct:

.. code-block:: c

   if (x = 0) {
       // error handling
   }
   else {
       // code that only works for nonzero x
   }

The error is a simple typo: ``x = 0``, which assigns 0 to the variable ``x``,
was written while the comparison ``x == 0`` is certainly what was intended.

Many alternatives have been proposed. Most are hacks that save some typing but
use arbitrary or cryptic syntax or keywords, and fail the simple criterion for
language change proposals: it should intuitively suggest the proper meaning to a
human reader who has not yet been introduced to the construct.

An interesting phenomenon is that most experienced Python programmers recognize
the ``while True`` idiom and don't seem to be missing the assignment in
expression construct much; it's only newcomers who express a strong desire to
add this to the language.

There's an alternative way of spelling this that seems attractive but is
generally less robust than the "while True" solution::

   line = f.readline()
   while line:
       ... # do something with line...
       line = f.readline()

The problem with this is that if you change your mind about exactly how you get
the next line (e.g. you want to change it into ``sys.stdin.readline()``) you
have to remember to change two places in your program -- the second occurrence
is hidden at the bottom of the loop.

The best approach is to use iterators, making it possible to loop through
objects using the ``for`` statement. For example, in the current version of
Python file objects support the iterator protocol, so you can now write simply::

   for line in f:
       ... # do something with line...
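
A runnable sketch of the iterator form follows. It assumes an in-memory
``io.StringIO`` object as a stand-in for a real file, so that the example does
not depend on anything on disk.

```python
import io

# An in-memory text stream stands in for an open file here.
f = io.StringIO(u"first\nsecond\nthird\n")

lengths = []
for line in f:
    # Each iteration yields one line, including its trailing newline.
    lengths.append(len(line.rstrip("\n")))

print(lengths)
```

The source of lines is named in exactly one place, so switching to another
stream means changing a single expression.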



Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?
----------------------------------------------------------------------------------------------------------------

The major reason is history. Functions were used for those operations that were
generic for a group of types and which were intended to work even for objects
that didn't have methods at all (e.g. tuples). It is also convenient to have a
function that can readily be applied to an amorphous collection of objects when
you use the functional features of Python (``map()``, ``zip()`` et al).
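
As a small illustration of that convenience (the sample data is invented), a
single built-in function can be mapped over objects of unrelated types:

```python
# len() works uniformly on any object that has a length, whatever its type.
items = ["abc", (1, 2), [1, 2, 3, 4], {"k": "v"}]
lengths = list(map(len, items))
print(lengths)

# Built-ins can also be passed to other functions, here as a sort of key.
longest = max(items, key=len)
```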

In fact, implementing ``len()``, ``max()``, ``min()`` as built-in functions is
actually less code than implementing them as methods for each type. One can
quibble about individual cases but it's a part of Python, and it's too late to
make such fundamental changes now. The functions have to remain to avoid massive
code breakage.

.. XXX talk about protocols?

.. note::

   For string operations, Python has moved from external functions (the
   ``string`` module) to methods. However, ``len()`` is still a function.


Why is join() a string method instead of a list or tuple method?
----------------------------------------------------------------

Strings became much more like other standard types starting in Python 1.6, when
methods were added which give the same functionality that has always been
available using the functions of the string module. Most of these new methods
have been widely accepted, but the one which appears to make some programmers
feel uncomfortable is::

   ", ".join(['1', '2', '4', '8', '16'])

which gives the result::

   "1, 2, 4, 8, 16"

There are two common arguments against this usage.

The first runs along the lines of: "It looks really ugly using a method of a
string literal (string constant)", to which the answer is that it might, but a
string literal is just a fixed value. If the methods are to be allowed on names
bound to strings there is no logical reason to make them unavailable on
literals.

The second objection is typically cast as: "I am really telling a sequence to
join its members together with a string constant". Sadly, you aren't. For some
reason there seems to be much less difficulty with having :meth:`~str.split` as
a string method, since in that case it is easy to see that ::

   "1, 2, 4, 8, 16".split(", ")

is an instruction to a string literal to return the substrings delimited by the
given separator (or, by default, arbitrary runs of white space). In this case a
Unicode string returns a list of Unicode strings, an ASCII string returns a list
of ASCII strings, and everyone is happy.

:meth:`~str.join` is a string method because in using it you are telling the
separator string to iterate over a sequence of strings and insert itself between
adjacent elements. This method can be used with any argument which obeys the
rules for sequence objects, including any new classes you might define yourself.
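
The following sketch shows ``join()`` accepting a user-defined class. The
``Countdown`` class is hypothetical; it only needs to be iterable and to yield
strings.

```python
class Countdown(object):
    """Hypothetical user-defined iterable, used only for illustration."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        n = self.start
        while n > 0:
            yield str(n)   # join() requires the items to be strings
            n -= 1


from_class = ", ".join(Countdown(3))
from_tuple = "-".join(("a", "b", "c"))
```

Because the separator does the work, neither ``Countdown`` nor the tuple type
has to know anything about joining.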

Because this is a string method it can work for Unicode strings as well as plain
ASCII strings. If ``join()`` were a method of the sequence types then the
sequence types would have to decide which type of string to return depending on
the type of the separator.

.. XXX remove next paragraph eventually

If none of these arguments persuade you, then for the moment you can continue to
use the ``join()`` function from the string module, which allows you to write ::

   string.join(['1', '2', '4', '8', '16'], ", ")


How fast are exceptions?
------------------------

A try/except block is extremely efficient if no exceptions are raised. Actually
catching an exception is expensive. In versions of Python prior to 2.0 it was
common to use this idiom::

   try:
       value = mydict[key]
   except KeyError:
       mydict[key] = getvalue(key)
       value = mydict[key]

This only made sense when you expected the dict to have the key almost all the
time. If that wasn't the case, you coded it like this::

   if key in mydict:
       value = mydict[key]
   else:
       value = mydict[key] = getvalue(key)

.. note::

   In Python 2.0 and higher, you can code this as ``value =
   mydict.setdefault(key, getvalue(key))``. Note that ``getvalue(key)`` is
   then evaluated on every call, even when the key is already present, so this
   form is only appropriate when ``getvalue()`` is cheap and has no side
   effects.
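
The difference between ``setdefault()`` and the explicit membership test can
be seen in a small sketch. Here ``getvalue`` is a stand-in for an expensive
computation and ``calls`` is a hypothetical log of its invocations.

```python
calls = []


def getvalue(key):
    # Stand-in for an expensive computation; records every invocation.
    calls.append(key)
    return key.upper()


mydict = {"a": "A"}

# The key exists, yet getvalue() still runs, because arguments are
# evaluated before setdefault() itself is called.
value = mydict.setdefault("a", getvalue("a"))
calls_after_setdefault = list(calls)

# The explicit membership test only calls getvalue() when it is needed.
if "a" in mydict:
    value2 = mydict["a"]
else:
    value2 = mydict["a"] = getvalue("a")
```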


Why isn't there a switch or case statement in Python?
-----------------------------------------------------

You can do this easily enough with a sequence of ``if... elif... elif... else``.
There have been some proposals for switch statement syntax, but there is no
consensus (yet) on whether and how to do range tests. See :pep:`275` for
complete details and the current status.

For cases where you need to choose from a very large number of possibilities,
you can create a dictionary mapping case values to functions to call. For
example::

   def function_1(...):
       ...

   functions = {'a': function_1,
                'b': function_2,
                'c': self.method_1, ...}

   func = functions[value]
   func()
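
A complete, runnable variant of the dictionary approach might look like this.
The handler names and the ``run`` helper are invented for the example; using
``dict.get()`` with a fallback is one possible way to mimic a ``default:``
branch.

```python
def start():
    return "starting"


def stop():
    return "stopping"


def unknown():
    return "unknown command"


# Map each case value to the function that handles it.
handlers = {"start": start, "stop": stop}


def run(command):
    # dict.get() supplies a fallback handler for values with no entry.
    return handlers.get(command, unknown)()
```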

For calling methods on objects, you can simplify yet further by using the
:func:`getattr` built-in to retrieve methods with a particular name::

   def visit_a(self, ...):
       ...
   ...

   def dispatch(self, value):
       method_name = 'visit_' + str(value)
       method = getattr(self, method_name)
       method()

It's suggested that you use a prefix for the method names, such as ``visit_`` in
this example. Without such a prefix, if values are coming from an untrusted
source, an attacker would be able to call any method on your object.
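
The prefix-based dispatch can be sketched as a self-contained class. The
``Visitor`` class and its methods are hypothetical; raising ``ValueError`` for
unknown values is a design choice of this sketch, not something the FAQ
prescribes.

```python
class Visitor(object):
    def visit_a(self):
        return "handled a"

    def visit_b(self):
        return "handled b"

    def secret(self):
        return "not reachable through dispatch()"

    def dispatch(self, value):
        # Only methods whose names start with 'visit_' can be selected.
        method = getattr(self, "visit_" + str(value), None)
        if method is None:
            raise ValueError("no handler for %r" % (value,))
        return method()


v = Visitor()
print(v.dispatch("a"))
```

Because the prefix is always prepended, a value such as ``"secret"`` resolves
to ``visit_secret``, which does not exist, so the unprefixed method stays out
of reach.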


Can't you emulate threads in the interpreter instead of relying on an OS-specific thread implementation?
--------------------------------------------------------------------------------------------------------

Answer 1: Unfortunately, the interpreter pushes at least one C stack frame for
each Python stack frame. Also, extensions can call back into Python at almost
random moments. Therefore, a complete threads implementation requires thread
support for C.

Answer 2: Fortunately, there is `Stackless Python <http://www.stackless.com>`_,
which has a completely redesigned interpreter loop that avoids the C stack.


Why can't lambda expressions contain statements?
------------------------------------------------

Python lambda expressions cannot contain statements because Python's syntactic
framework can't handle statements nested inside expressions. However, in
Python, this is not a serious problem. Unlike lambda forms in other languages,
where they add functionality, Python lambdas are only a shorthand notation if
you're too lazy to define a function.

Functions are already first class objects in Python, and can be declared in a
local scope. Therefore the only advantage of using a lambda instead of a
locally-defined function is that you don't need to invent a name for the
function -- but that's just a local variable to which the function object (which
is exactly the same type of object that a lambda expression yields) is assigned!
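
A brief sketch comparing the two forms (the function names are arbitrary):

```python
def make_adders():
    # A lambda expression and a locally defined function ...
    add_one = lambda x: x + 1

    def add_two(x):
        return x + 2

    return add_one, add_two


add_one, add_two = make_adders()

# ... yield objects of exactly the same type; only the __name__ differs.
same_type = type(add_one) is type(add_two)
names = (add_one.__name__, add_two.__name__)
```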


Can Python be compiled to machine code, C or some other language?
-----------------------------------------------------------------

Not easily. Python's high level data types, dynamic typing of objects and
run-time invocation of the interpreter (using :func:`eval` or :keyword:`exec`)
together mean that a "compiled" Python program would probably consist mostly of
calls into the Python run-time system, even for seemingly simple operations like
``x+1``.

Several projects described in the Python newsgroup or at past `Python
conferences <http://python.org/community/workshops/>`_ have shown that this
approach is feasible, although the speedups reached so far are only modest
(e.g. 2x). Jython uses the same strategy for compiling to Java bytecode. (Jim
Hugunin has demonstrated that in combination with whole-program analysis,
speedups of 1000x are feasible for small demo programs. See the proceedings
from the `1997 Python conference
<http://python.org/workshops/1997-10/proceedings/>`_ for more information.)

Internally, Python source code is always translated into a bytecode
representation, and this bytecode is then executed by the Python virtual
machine. In order to avoid the overhead of repeatedly parsing and translating
modules that rarely change, this byte code is written into a file whose name
ends in ".pyc" whenever a module is parsed. When the corresponding .py file is
changed, it is parsed and translated again and the .pyc file is rewritten.

There is no performance difference once the .pyc file has been loaded, as the
bytecode read from the .pyc file is exactly the same as the bytecode created by
direct translation. The only difference is that loading code from a .pyc file
is faster than parsing and translating a .py file, so the presence of
precompiled .pyc files improves the start-up time of Python scripts. If
desired, the Lib/compileall.py module can be used to create valid .pyc files for
a given set of modules.
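
As a hedged sketch of using :mod:`compileall` programmatically, the following
compiles a throwaway directory. The directory and the module name
``example_module.py`` are made up for the demonstration.

```python
import compileall
import os
import shutil
import tempfile

# Create a temporary directory holding one tiny module.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "example_module.py"), "w") as src:
    src.write("ANSWER = 42\n")

# Byte-compile every .py file under workdir; a true result means success.
ok = compileall.compile_dir(workdir, quiet=1)

# Python 2 writes the .pyc next to the source; Python 3 uses __pycache__.
compiled = [name
            for _root, _dirs, files in os.walk(workdir)
            for name in files
            if name.endswith(".pyc")]

shutil.rmtree(workdir)   # remove the temporary files again
```

The same module can also be run from the command line as
``python -m compileall <directory>``.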

Note that the main script executed by Python, even if its filename ends in .py,
is not compiled to a .pyc file. It is compiled to bytecode, but the bytecode is
not saved to a file. Usually main scripts are quite short, so this doesn't cost
much speed.

.. XXX check which of these projects are still alive

There are also several programs which make it easier to intermingle Python and C
code in various ways to increase performance. See, for example, `Psyco
<http://psyco.sourceforge.net/>`_, `Pyrex
<http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_, `PyInline
<http://pyinline.sourceforge.net/>`_, `Py2Cmod
<http://sourceforge.net/projects/py2cmod/>`_, and `Weave
<http://www.scipy.org/Weave>`_.


How does Python manage memory?
------------------------------

The details of Python memory management depend on the implementation. The
standard C implementation of Python uses reference counting to detect
inaccessible objects, and another mechanism to collect reference cycles,
periodically executing a cycle detection algorithm which looks for inaccessible
cycles and deletes the objects involved. The :mod:`gc` module provides functions
to perform a garbage collection, obtain debugging statistics, and tune the
collector's parameters.
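
The interplay of reference counting and the cycle collector can be observed
with a small experiment. This is a sketch that assumes the standard C
implementation; ``Node`` is a hypothetical class, and a weak reference is used
only to observe whether the object is still alive.

```python
import gc
import weakref


class Node(object):
    pass


gc.collect()            # start from a clean slate
gc.disable()            # keep automatic collection out of the demonstration

node = Node()
node.myself = node      # reference cycle: the object refers to itself
probe = weakref.ref(node)

del node                # the refcount never reaches zero because of the cycle
alive_before = probe() is not None

found = gc.collect()    # the cycle detector finds and frees the object
alive_after = probe() is not None

gc.enable()
```

Without the self-reference, ``del node`` alone would have freed the object
immediately through reference counting.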

Jython relies on the Java runtime so the JVM's garbage collector is used. This
difference can cause some subtle porting problems if your Python code depends on
the behavior of the reference counting implementation.

.. XXX relevant for Python 2.6?

Sometimes objects get stuck in tracebacks temporarily and hence are not
deallocated when you might expect. Clear the tracebacks with::

   import sys
   sys.exc_clear()
   sys.exc_traceback = sys.last_traceback = None

Tracebacks are used for reporting errors, implementing debuggers and related
things. They contain a portion of the program state extracted during the
handling of an exception (usually the most recent exception).

In the absence of circularities and tracebacks, Python programs do not need to
manage memory explicitly.

Why doesn't Python use a more traditional garbage collection scheme? For one
thing, this is not a C standard feature and hence it's not portable. (Yes, we
know about the Boehm GC library. It has bits of assembler code for most
common platforms, not for all of them, and although it is mostly transparent, it
isn't completely transparent; patches are required to get Python to work with
it.)

Traditional GC also becomes a problem when Python is embedded into other
applications. While in a standalone Python it's fine to replace the standard
malloc() and free() with versions provided by the GC library, an application
embedding Python may want to have its own substitute for malloc() and free(),
and may not want Python's. Right now, Python works with anything that
implements malloc() and free() properly.

In Jython, the following code (which is fine in CPython) will probably run out
of file descriptors long before it runs out of memory::

   for file in very_long_list_of_files:
       f = open(file)
       c = f.read(1)

Using the current reference counting and destructor scheme, each new assignment
to f closes the previous file. Using GC, this is not guaranteed. If you want
to write code that will work with any Python implementation, you should
explicitly close the file or use the :keyword:`with` statement; this will work
regardless of GC::

   for file in very_long_list_of_files:
       with open(file) as f:
           c = f.read(1)


Why isn't all memory freed when Python exits?
---------------------------------------------

Objects referenced from the global namespaces of Python modules are not always
deallocated when Python exits. This may happen if there are circular
references. There are also certain bits of memory that are allocated by the C
library that are impossible to free (e.g. a tool like Purify will complain about
these). Python is, however, aggressive about cleaning up memory on exit and
does try to destroy every single object.

If you want to force Python to delete certain things on deallocation use the
:mod:`atexit` module to run a function that will force those deletions.
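
A minimal sketch of registering such a cleanup function follows. The
``open_resources`` list and ``release_resources`` function are hypothetical
stand-ins for whatever state a real program would need to tear down.

```python
import atexit

open_resources = ["cache", "log"]   # hypothetical module-level state


def release_resources():
    # Runs at normal interpreter exit; empties the list in place.
    del open_resources[:]


# register() returns the function it was given (Python 2.6 and later).
registered = atexit.register(release_resources)
```

Registered functions are not run when the process is killed by a signal or
exits through ``os._exit()``.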


Why are there separate tuple and list data types?
-------------------------------------------------

Lists and tuples, while similar in many respects, are generally used in
fundamentally different ways. Tuples can be thought of as being similar to
Pascal records or C structs; they're small collections of related data which may
be of different types which are operated on as a group. For example, a
Cartesian coordinate is appropriately represented as a tuple of two or three
numbers.

Lists, on the other hand, are more like arrays in other languages. They tend to
hold a varying number of objects all of which have the same type and which are
operated on one-by-one. For example, ``os.listdir('.')`` returns a list of
strings representing the files in the current directory. Functions which
operate on this output would generally not break if you added another file or
two to the directory.

Tuples are immutable, meaning that once a tuple has been created, you can't
replace any of its elements with a new value. Lists are mutable, meaning that
you can always change a list's elements. Only immutable elements can be used as
dictionary keys, and hence only tuples and not lists can be used as keys.
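
The difference shows up as soon as both types are tried as dictionary keys.
The coordinate data below is invented for illustration.

```python
# Tuples are hashable (when their elements are), so they work as keys.
labels = {(0, 0): "origin", (1, 0): "unit x"}

# A list is mutable and unhashable, so using it as a key fails.
try:
    labels[[0, 1]] = "unit y"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# Converting the list to a tuple first makes the lookup possible.
point = [1, 0]
found = labels[tuple(point)]
```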


How are lists implemented?
--------------------------

Python's lists are really variable-length arrays, not Lisp-style linked lists.
The implementation uses a contiguous array of references to other objects, and
keeps a pointer to this array and the array's length in a list head structure.

This makes indexing a list ``a[i]`` an operation whose cost is independent of
the size of the list or the value of the index.

When items are appended or inserted, the array of references is resized. Some
cleverness is applied to improve the performance of appending items repeatedly;
when the array must be grown, some extra space is allocated so the next few
times don't require an actual resize.
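
The over-allocation can be made visible with ``sys.getsizeof()`` (available
from Python 2.6). This sketch assumes the standard C implementation; the exact
sizes and growth pattern are implementation details and may differ between
versions.

```python
import sys

sizes = []
items = []
for i in range(64):
    items.append(i)
    # getsizeof() reports the list object's own storage, spare slots included.
    sizes.append(sys.getsizeof(items))

# The size changes far fewer times than there are appends, because each
# resize reserves room for several future items.
resize_count = len(set(sizes))
print(resize_count)
```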


How are dictionaries implemented?
---------------------------------

Python's dictionaries are implemented as resizable hash tables. Compared to
B-trees, this gives better performance for lookup (the most common operation by
far) under most circumstances, and the implementation is simpler.

Dictionaries work by computing a hash code for each key stored in the dictionary
using the :func:`hash` built-in function. The hash code varies widely depending
on the key; for example, "Python" hashes to -539294296 while "python", a string
that differs by a single bit, hashes to 1142331976. The hash code is then used
to calculate a location in an internal array where the value will be stored.
Assuming that you're storing keys that all have different hash values, this
means that dictionaries take constant time -- O(1), in computer science notation
-- to retrieve a key. It also means that no sorted order of the keys is
maintained, and traversing the array as the ``.keys()`` and ``.items()`` do will
output the dictionary's content in some arbitrary jumbled order.
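
To make the idea concrete, here is a deliberately simplified toy hash table.
It is a teaching sketch only: ``ToyTable`` is a hypothetical class with a fixed
size, linear probing, and no deletion or resizing, and it does not reproduce
the probing scheme or memory layout of the real dictionary implementation.

```python
class ToyTable(object):
    """Fixed-size open-addressing table; a teaching sketch only."""

    def __init__(self, size=8):
        self.slots = [None] * size

    def _probe(self, key):
        # Start at the slot chosen by the hash and step forward on collisions.
        size = len(self.slots)
        index = hash(key) % size
        for _ in range(size):
            entry = self.slots[index]
            if entry is None or entry[0] == key:
                return index
            index = (index + 1) % size
        raise RuntimeError("table is full")

    def put(self, key, value):
        self.slots[self._probe(key)] = (key, value)

    def get(self, key):
        entry = self.slots[self._probe(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]


table = ToyTable()
table.put("Python", 1)
table.put("python", 2)
```

A lookup goes straight to the slot computed from the hash, so its cost does
not depend on how many other keys are stored, as long as collisions are rare.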


Why must dictionary keys be immutable?
--------------------------------------

The hash table implementation of dictionaries uses a hash value calculated from
the key value to find the key. If the key were a mutable object, its value
could change, and thus its hash could also change. But since whoever changes
the key object can't tell that it was being used as a dictionary key, it can't
move the entry around in the dictionary. Then, when you try to look up the same
object in the dictionary it won't be found because its hash value is different.
If you tried to look up the old value it wouldn't be found either, because the
value of the object found in that hash bin would be different.

If you want a dictionary indexed with a list, simply convert the list to a tuple
first; the function ``tuple(L)`` creates a tuple with the same entries as the
list ``L``. Tuples are immutable and can therefore be used as dictionary keys.

Some unacceptable solutions that have been proposed:

- Hash lists by their address (object ID). This doesn't work because if you
  construct a new list with the same value it won't be found; e.g.::

     mydict = {[1, 2]: '12'}
     print mydict[[1, 2]]

  would raise a KeyError exception because the id of the ``[1, 2]`` used in the
  second line differs from that in the first line. In other words, dictionary
  keys should be compared using ``==``, not using :keyword:`is`.

- Make a copy when using a list as a key. This doesn't work because the list,
  being a mutable object, could contain a reference to itself, and then the
  copying code would run into an infinite loop.

- Allow lists as keys but tell the user not to modify them. This would allow a
  class of hard-to-track bugs in programs when you forgot or modified a list by
  accident. It also invalidates an important invariant of dictionaries: every
  value in ``d.keys()`` is usable as a key of the dictionary.

- Mark lists as read-only once they are used as a dictionary key. The problem
  is that it's not just the top-level object that could change its value; you
  could use a tuple containing a list as a key. Entering anything as a key into
  a dictionary would require marking all objects reachable from there as
  read-only -- and again, self-referential objects could cause an infinite loop.

There is a trick to get around this if you need to, but use it at your own risk:
You can wrap a mutable structure inside a class instance which has both a
:meth:`__eq__` and a :meth:`__hash__` method. You must then make sure that the
hash value of every such wrapper object that resides in a dictionary (or other
hash based structure) remains fixed while the object is in the dictionary (or
other structure). ::

   class ListWrapper:
       def __init__(self, the_list):
           self.the_list = the_list
       def __eq__(self, other):
           return self.the_list == other.the_list
       def __hash__(self):
           l = self.the_list
           result = 98767 - len(l)*555
           for i, el in enumerate(l):
               try:
                   result = result + (hash(el) % 9999999) * 1001 + i
               except Exception:
                   result = (result % 7777777) + i * 333
           return result

Note that the hash computation is complicated by the possibility that some
members of the list may be unhashable and also by the possibility of arithmetic
overflow.

Furthermore it must always be the case that if ``o1 == o2`` (i.e.
``o1.__eq__(o2) is True``) then ``hash(o1) == hash(o2)`` (i.e.
``o1.__hash__() == o2.__hash__()``), regardless of whether the object is in a
dictionary or not. If you fail to meet these restrictions dictionaries and
other hash based structures will misbehave.

In the case of ListWrapper, whenever the wrapper object is in a dictionary the
wrapped list must not change to avoid anomalies. Don't do this unless you are
prepared to think hard about the requirements and the consequences of not
meeting them correctly. Consider yourself warned.
	653
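The anomaly can be seen in a short sketch.  It uses a simplified, hypothetical
``TupleHashWrapper`` (not the ``ListWrapper`` shown above) that hashes its list
by converting it to a tuple, so it assumes every element is hashable::

   class TupleHashWrapper(object):
       """Hypothetical wrapper; assumes all list elements are hashable."""
       def __init__(self, the_list):
           self.the_list = the_list
       def __eq__(self, other):
           return self.the_list == other.the_list
       def __hash__(self):
           return hash(tuple(self.the_list))

   data = [1, 2]
   table = {TupleHashWrapper(data): "value"}
   found_before = TupleHashWrapper([1, 2]) in table            # True

   data.append(3)      # mutate the wrapped list while it is a key

   # The hash matches the stored one, but the equality test now fails.
   found_old_contents = TupleHashWrapper([1, 2]) in table      # False
   # The equality test would succeed, but the stored hash is stale.
   found_new_contents = TupleHashWrapper([1, 2, 3]) in table   # False
   entries = len(table)                                        # 1

The entry is still stored, yet neither the old nor the new contents can be used
to look it up.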
	654
	655	Why doesn't list.sort() return the sorted list?
	656	-----------------------------------------------
	657
	658	In situations where performance matters, making a copy of the list just to sort
	659	it would be wasteful. Therefore, :meth:`list.sort` sorts the list in place. In
	660	order to remind you of that fact, it does not return the sorted list. This way,
	661	you won't be fooled into accidentally overwriting a list when you need a sorted
	662	copy but also need to keep the unsorted version around.
	663
	664	In Python 2.4 a new built-in function -- :func:`sorted` -- has been added.
	665	This function creates a new list from a provided iterable, sorts it and returns
	666	it. For example, here's how to iterate over the keys of a dictionary in sorted
	667	order::
	668
   for key in sorted(mydict):
       ... # do whatever with mydict[key]...
[2]	671
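The difference between the two can be checked with a short sketch (the list
``numbers`` is only an illustration)::

   numbers = [3, 1, 2]

   ordered = sorted(numbers)   # new list; the original is left untouched
   unsorted_copy = list(numbers)
   # ordered == [1, 2, 3] and numbers == [3, 1, 2]

   result = numbers.sort()     # sorts in place and returns None
   # result is None and numbers == [1, 2, 3]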
	672
	673	How do you specify and enforce an interface spec in Python?
	674	-----------------------------------------------------------
	675
	676	An interface specification for a module as provided by languages such as C++ and
	677	Java describes the prototypes for the methods and functions of the module. Many
	678	feel that compile-time enforcement of interface specifications helps in the
	679	construction of large programs.
	680
	681	Python 2.6 adds an :mod:`abc` module that lets you define Abstract Base Classes
	682	(ABCs). You can then use :func:`isinstance` and :func:`issubclass` to check
	683	whether an instance or a class implements a particular ABC. The
[391]	684	:mod:`collections` module defines a set of useful ABCs such as
	685	:class:`~collections.Iterable`, :class:`~collections.Container`, and
	686	:class:`~collections.MutableMapping`.
[2]	687
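As a minimal sketch of such checks, the hypothetical ``Countdown`` class below
defines only :meth:`__iter__`.  The import fallback is there because Python 3.3
and later provide these ABCs in ``collections.abc``::

   try:
       from collections.abc import Iterable, Container, MutableMapping
   except ImportError:          # Python 2.6 and 2.7
       from collections import Iterable, Container, MutableMapping

   class Countdown(object):
       """Hypothetical class that only defines __iter__."""
       def __init__(self, start):
           self.start = start
       def __iter__(self):
           return iter(range(self.start, 0, -1))

   is_iterable = isinstance(Countdown(3), Iterable)     # True: __iter__ is defined
   is_container = issubclass(Countdown, Container)      # False: no __contains__
   dict_is_mapping = isinstance({}, MutableMapping)     # True
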
	688	For Python, many of the advantages of interface specifications can be obtained
	689	by an appropriate test discipline for components. There is also a tool,
	690	PyChecker, which can be used to find problems due to subclassing.
	691
	692	A good test suite for a module can both provide a regression test and serve as a
	693	module interface specification and a set of examples. Many Python modules can
	694	be run as a script to provide a simple "self test." Even modules which use
	695	complex external interfaces can often be tested in isolation using trivial
	696	"stub" emulations of the external interface. The :mod:`doctest` and
	697	:mod:`unittest` modules or third-party test frameworks can be used to construct
	698	exhaustive test suites that exercise every line of code in a module.
	699
	700	An appropriate testing discipline can help build large complex applications in
	701	Python as well as having interface specifications would. In fact, it can be
	702	better because an interface specification cannot test certain properties of a
	703	program. For example, the :meth:`append` method is expected to add new elements
	704	to the end of some internal list; an interface specification cannot test that
	705	your :meth:`append` implementation will actually do this correctly, but it's
	706	trivial to check this property in a test suite.
	707
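A test of the :meth:`append` property mentioned above might look like the
following sketch.  The ``Stack`` class and the test name are hypothetical and
stand in for whatever component is under test::

   import unittest

   class Stack(object):
       """Hypothetical container whose append() should add to the end."""
       def __init__(self):
           self._items = []
       def append(self, item):
           self._items.append(item)
       def items(self):
           return list(self._items)

   class StackAppendTest(unittest.TestCase):
       def test_append_adds_to_end(self):
           s = Stack()
           s.append("a")
           s.append("b")
           self.assertEqual(s.items(), ["a", "b"])

   # Run the test case programmatically instead of via unittest.main().
   suite = unittest.TestLoader().loadTestsFromTestCase(StackAppendTest)
   outcome = unittest.TextTestRunner(verbosity=0).run(suite)
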
Writing test suites is very helpful, and you might want to design your code with
an eye to making it easily tested.  One increasingly popular technique,
test-driven development, calls for writing parts of the test suite first,
before you write any of the actual code.  Of course Python allows you to be
sloppy and not write test cases at all.
	713
	714
	715	Why are default values shared between objects?
	716	----------------------------------------------
	717
	718	This type of bug commonly bites neophyte programmers. Consider this function::
	719
   def foo(mydict={}):  # Danger: shared reference to one dict for all calls
       ... compute something ...
       mydict[key] = value
       return mydict
[2]	724
[391]	725	The first time you call this function, ``mydict`` contains a single item. The
	726	second time, ``mydict`` contains two items because when ``foo()`` begins
	727	executing, ``mydict`` starts out with an item already in it.
[2]	728
	729	It is often expected that a function call creates new objects for default
	730	values. This is not what happens. Default values are created exactly once, when
	731	the function is defined. If that object is changed, like the dictionary in this
	732	example, subsequent calls to the function will refer to this changed object.
	733
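The sharing can be observed directly.  In this sketch the function name
``remember`` is hypothetical; ``__defaults__`` is the attribute where a function
object keeps its default values::

   def remember(key, value, mydict={}):   # mutable default, created once
       mydict[key] = value
       return mydict

   first = remember("a", 1)
   second = remember("b", 2)

   same_object = first is second                 # True: one dict for all calls
   contents = sorted(second.items())             # [('a', 1), ('b', 2)]
   stored = remember.__defaults__[0] is first    # True: it lives on the function
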
By definition, immutable objects such as numbers, strings, tuples, and ``None``
are safe from change.  Changes to mutable objects such as dictionaries, lists,
and class instances can lead to confusion.
	737
	738	Because of this feature, it is good programming practice to not use mutable
	739	objects as default values. Instead, use ``None`` as the default value and
	740	inside the function, check if the parameter is ``None`` and create a new
	741	list/dictionary/whatever if it is. For example, don't write::
	742
   def foo(mydict={}):
       ...

but::

   def foo(mydict=None):
       if mydict is None:
           mydict = {}  # create a new dict for local namespace
[2]	751
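A runnable sketch of the ``None`` idiom, using a hypothetical function
``remember``, shows that each call now gets its own dictionary unless the
caller passes one::

   def remember(key, value, mydict=None):
       if mydict is None:
           mydict = {}          # a fresh dict whenever mydict is omitted
       mydict[key] = value
       return mydict

   first = remember("a", 1)     # {'a': 1}
   second = remember("b", 2)    # {'b': 2}, a distinct object

   shared = {}
   remember("x", 1, shared)
   remember("y", 2, shared)     # callers can still supply their own dict
   # shared == {'x': 1, 'y': 2}
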
	752	This feature can be useful. When you have a function that's time-consuming to
	753	compute, a common technique is to cache the parameters and the resulting value
	754	of each call to the function, and return the cached value if the same value is
	755	requested again. This is called "memoizing", and can be implemented like this::
	756
   # Callers will never provide a third parameter for this function.
   def expensive(arg1, arg2, _cache={}):
       if (arg1, arg2) in _cache:
           return _cache[(arg1, arg2)]

       # Calculate the value
       result = ... expensive computation ...
       _cache[(arg1, arg2)] = result           # Store result in the cache
       return result
	766
	767	You could use a global variable containing a dictionary instead of the default
	768	value; it's a matter of taste.
	769
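As a concrete sketch of the same pattern, the hypothetical function below
records each real computation in a list named ``calls`` so that cache hits can
be told apart from misses::

   calls = []                               # one entry per real computation

   def slow_square_sum(a, b, _cache={}):    # hypothetical "expensive" function
       if (a, b) in _cache:
           return _cache[(a, b)]
       calls.append((a, b))                 # stands in for the slow work
       result = a * a + b * b
       _cache[(a, b)] = result
       return result

   r1 = slow_square_sum(3, 4)    # computed: 25
   r2 = slow_square_sum(3, 4)    # served from the cache: 25
   r3 = slow_square_sum(1, 2)    # computed: 5
   # calls == [(3, 4), (1, 2)]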
	770
	771	Why is there no goto?
	772	---------------------
	773
	774	You can use exceptions to provide a "structured goto" that even works across
	775	function calls. Many feel that exceptions can conveniently emulate all
	776	reasonable uses of the "go" or "goto" constructs of C, Fortran, and other
	777	languages. For example::
	778
   class label(Exception): pass  # declare a label

   try:
       ...
       if condition: raise label()  # goto label
       ...
   except label:  # where to goto
       pass
   ...
	788
	789	This doesn't allow you to jump into the middle of a loop, but that's usually
	790	considered an abuse of goto anyway. Use sparingly.
	791
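One common use of this idea is leaving several nested loops at once.  In the
sketch below the exception class ``Found`` and the sample ``grid`` are
hypothetical::

   class Found(Exception):      # plays the role of the label
       pass

   grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
   position = None
   try:
       for row_index, row in enumerate(grid):
           for col_index, value in enumerate(row):
               if value == 5:
                   position = (row_index, col_index)
                   raise Found()        # jump to the except clause
   except Found:
       pass                             # execution resumes here
   # position == (1, 1)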
	792
	793	Why can't raw strings (r-strings) end with a backslash?
	794	-------------------------------------------------------
	795
	796	More precisely, they can't end with an odd number of backslashes: the unpaired
	797	backslash at the end escapes the closing quote character, leaving an
	798	unterminated string.
	799
	800	Raw strings were designed to ease creating input for processors (chiefly regular
	801	expression engines) that want to do their own backslash escape processing. Such
	802	processors consider an unmatched trailing backslash to be an error anyway, so
	803	raw strings disallow that. In return, they allow you to pass on the string
	804	quote character by escaping it with a backslash. These rules work well when
	805	r-strings are used for their intended purpose.
	806
	807	If you're trying to build Windows pathnames, note that all Windows system calls
	808	accept forward slashes too::
	809
   f = open("/mydir/file.txt")  # works fine!
[2]	811
	812	If you're trying to build a pathname for a DOS command, try e.g. one of ::
	813
   dir = r"\this\is\my\dos\dir" "\\"
   dir = r"\this\is\my\dos\dir\ "[:-1]
   dir = "\\this\\is\\my\\dos\\dir\\"
	817
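A quick check, with the variable names chosen only for illustration, confirms
that the three spellings produce the same string and that a backslash before a
quote stays in a raw string::

   a = r"\this\is\my\dos\dir" "\\"      # raw string + ordinary string
   b = r"\this\is\my\dos\dir\ "[:-1]    # trailing space sliced off
   c = "\\this\\is\\my\\dos\\dir\\"     # doubled backslashes
   all_equal = (a == b == c)            # True

   quoted = r"\""                       # two characters: a backslash and a quote
   # len(quoted) == 2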
	818
	819	Why doesn't Python have a "with" statement for attribute assignments?
	820	---------------------------------------------------------------------
	821
Python has a 'with' statement that wraps the execution of a block, calling code
on the entrance and exit from the block.  Some languages have a construct that
looks like this::

   with obj:
       a = 1               # equivalent to obj.a = 1
       total = total + 1   # obj.total = obj.total + 1
	829
	830	In Python, such a construct would be ambiguous.
	831
	832	Other languages, such as Object Pascal, Delphi, and C++, use static types, so
	833	it's possible to know, in an unambiguous way, what member is being assigned
	834	to. This is the main point of static typing -- the compiler always knows the
	835	scope of every variable at compile time.
	836
	837	Python uses dynamic types. It is impossible to know in advance which attribute
	838	will be referenced at runtime. Member attributes may be added or removed from
	839	objects on the fly. This makes it impossible to know, from a simple reading,
	840	what attribute is being referenced: a local one, a global one, or a member
	841	attribute?
	842
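The point about attributes appearing and disappearing on the fly can be
illustrated with a small sketch; the class ``Record`` is hypothetical::

   class Record(object):        # an empty class
       pass

   r = Record()
   before = hasattr(r, "x")     # False: no such attribute yet
   r.x = 10                     # attribute added at run time
   during = hasattr(r, "x")     # True
   del r.x                      # and removed again
   after = hasattr(r, "x")      # False
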
	843	For instance, take the following incomplete snippet::
	844
   def foo(a):
       with a:
           print x
	848
	849	The snippet assumes that "a" must have a member attribute called "x". However,
	850	there is nothing in Python that tells the interpreter this. What should happen
	851	if "a" is, let us say, an integer? If there is a global variable named "x",
	852	will it be used inside the with block? As you see, the dynamic nature of Python
	853	makes such choices much harder.
	854
	855	The primary benefit of "with" and similar language features (reduction of code
	856	volume) can, however, easily be achieved in Python by assignment. Instead of::
	857
   function(args).mydict[index][index].a = 21
   function(args).mydict[index][index].b = 42
   function(args).mydict[index][index].c = 63

write this::

   ref = function(args).mydict[index][index]
   ref.a = 21
   ref.b = 42
   ref.c = 63
	868
	869	This also has the side-effect of increasing execution speed because name
	870	bindings are resolved at run-time in Python, and the second version only needs
[391]	871	to perform the resolution once.
[2]	872
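A self-contained version of the same idea, using a hypothetical ``Settings``
class and ``config`` dictionary, shows that the short name refers to the very
object reached by the long expression::

   class Settings(object):      # hypothetical object to hang attributes on
       pass

   config = {"servers": [[Settings(), Settings()]]}

   ref = config["servers"][0][1]    # resolve the long expression once
   ref.a = 21
   ref.b = 42
   ref.c = 63

   same = config["servers"][0][1] is ref    # True
   # config["servers"][0][1].b == 42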
	873
	874	Why are colons required for the if/while/def/class statements?
	875	--------------------------------------------------------------
	876
	877	The colon is required primarily to enhance readability (one of the results of
	878	the experimental ABC language). Consider this::
	879
   if a == b
       print a

versus ::

   if a == b:
       print a
	887
	888	Notice how the second one is slightly easier to read. Notice further how a
	889	colon sets off the example in this FAQ answer; it's a standard usage in English.
	890
	891	Another minor reason is that the colon makes it easier for editors with syntax
	892	highlighting; they can look for colons to decide when indentation needs to be
	893	increased instead of having to do a more elaborate parsing of the program text.
	894
	895
	896	Why does Python allow commas at the end of lists and tuples?
	897	------------------------------------------------------------
	898
	899	Python lets you add a trailing comma at the end of lists, tuples, and
	900	dictionaries::
	901
   [1, 2, 3,]
   ('a', 'b', 'c',)
   d = {
       "A": [1, 5],
       "B": [6, 7],  # last trailing comma is optional but good style
   }
	908
	909
	910	There are several reasons to allow this.
	911
	912	When you have a literal value for a list, tuple, or dictionary spread across
	913	multiple lines, it's easier to add more elements because you don't have to
[391]	914	remember to add a comma to the previous line. The lines can also be reordered
	915	without creating a syntax error.
[2]	916
	917	Accidentally omitting the comma can lead to errors that are hard to diagnose.
	918	For example::
	919
   x = [
     "fee",
     "fie"
     "foo",
     "fum"
   ]
	926
	927	This list looks like it has four elements, but it actually contains three:
	928	"fee", "fiefoo" and "fum". Always adding the comma avoids this source of error.
	929
	930	Allowing the trailing comma may also make programmatic code generation easier.
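
Both points can be checked with a short sketch.  The generator at the end is a
hypothetical illustration: because every element is written with a trailing
comma, the last one needs no special case::

   x = [
       "fee",
       "fie"      # missing comma: joined with the next literal
       "foo",
       "fum"
   ]
   # len(x) == 3 and x[1] == "fiefoo"

   names = ["fee", "fie", "foo", "fum"]
   lines = ["x = ["] + ["    %r," % name for name in names] + ["]"]
   source = "\n".join(lines)

   namespace = {}
   exec(source, namespace)      # the generated text is valid Python
   # namespace["x"] == names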
