tut.tex

Last change on this file was 3225, checked in by bird, 18 years ago
Python 2.5
File size: 206.7 KB

1	\documentclass{manual}
2	\usepackage[T1]{fontenc}
3	\usepackage{textcomp}
4
5	% Things to do:
6	% Should really move the Python startup file info to an appendix
7
8	\title{Python Tutorial}
9
10	\input{boilerplate}
11
12	\makeindex
13
14	\begin{document}
15
16	\maketitle
17
18	\ifhtml
19	\chapter*{Front Matter\label{front}}
20	\fi
21
22	\input{copyright}
23
24	\begin{abstract}
25
26	\noindent
27	Python is an easy to learn, powerful programming language. It has
28	efficient high-level data structures and a simple but effective
29	approach to object-oriented programming. Python's elegant syntax and
30	dynamic typing, together with its interpreted nature, make it an ideal
31	language for scripting and rapid application development in many areas
32	on most platforms.
33
34	The Python interpreter and the extensive standard library are freely
35	available in source or binary form for all major platforms from the
36	Python Web site, \url{http://www.python.org/}, and may be freely
37	distributed. The same site also contains distributions of and
38	pointers to many free third party Python modules, programs and tools,
39	and additional documentation.
40
41	The Python interpreter is easily extended with new functions and data
42	types implemented in C or \Cpp{} (or other languages callable from C).
43	Python is also suitable as an extension language for customizable
44	applications.
45
46	This tutorial introduces the reader informally to the basic concepts
47	and features of the Python language and system. It helps to have a
48	Python interpreter handy for hands-on experience, but all examples are
49	self-contained, so the tutorial can be read off-line as well.
50
51	For a description of standard objects and modules, see the
52	\citetitle[../lib/lib.html]{Python Library Reference} document. The
53	\citetitle[../ref/ref.html]{Python Reference Manual} gives a more
54	formal definition of the language. To write extensions in C or
55	\Cpp, read \citetitle[../ext/ext.html]{Extending and Embedding the
56	Python Interpreter} and \citetitle[../api/api.html]{Python/C API
57	Reference}. There are also several books covering Python in depth.
58
59	This tutorial does not attempt to be comprehensive and cover every
60	single feature, or even every commonly used feature. Instead, it
61	introduces many of Python's most noteworthy features, and will give
62	you a good idea of the language's flavor and style. After reading it,
63	you will be able to read and write Python modules and programs, and
64	you will be ready to learn more about the various Python library
65	modules described in the \citetitle[../lib/lib.html]{Python Library
66	Reference}.
67
68	\end{abstract}
69
70	\tableofcontents
71
72
73	\chapter{Whetting Your Appetite \label{intro}}
74
75	If you do much work on computers, eventually you find that there's
76	some task you'd like to automate. For example, you may wish to
77	perform a search-and-replace over a large number of text files, or
78	rename and rearrange a bunch of photo files in a complicated way.
79	Perhaps you'd like to write a small custom database, or a specialized
80	GUI application, or a simple game.
81
82	If you're a professional software developer, you may have to work with
83	several C/\Cpp/Java libraries but find the usual
84	write/compile/test/re-compile cycle is too slow. Perhaps you're
85	writing a test suite for such a library and find writing the testing
86	code a tedious task. Or maybe you've written a program that could use
87	an extension language, and you don't want to design and implement a
88	whole new language for your application.
89
90	Python is just the language for you.
91
92	You could write a {\UNIX} shell script or Windows batch files for some
93	of these tasks, but shell scripts are best at moving around files and
94	changing text data, not well-suited for GUI applications or games.
95	You could write a C/{\Cpp}/Java program, but it can take a lot of
96	development time to get even a first-draft program. Python is simpler
97	to use, available on Windows, MacOS X, and {\UNIX} operating systems,
98	and will help you get the job done more quickly.
99
100	Python is simple to use, but it is a real programming language,
101	offering much more structure and support for large programs than shell
102	scripts or batch files can offer. On the other hand, Python also
103	offers much more error checking than C, and, being a
104	\emph{very-high-level language}, it has high-level data types built
105	in, such as flexible arrays and dictionaries. Because of its more
106	general data types Python is applicable to a much larger problem
107	domain than Awk or even Perl, yet many things are at
108	least as easy in Python as in those languages.
109
110	Python allows you to split your program into modules that can be
111	reused in other Python programs. It comes with a large collection of
112	standard modules that you can use as the basis of your programs --- or
113	as examples to start learning to program in Python. Some of these
114	modules provide things like file I/O, system calls,
115	sockets, and even interfaces to graphical user interface toolkits like Tk.
116
117	Python is an interpreted language, which can save you considerable time
118	during program development because no compilation and linking is
119	necessary. The interpreter can be used interactively, which makes it
120	easy to experiment with features of the language, to write throw-away
121	programs, or to test functions during bottom-up program development.
122	It is also a handy desk calculator.
123
124	Python enables programs to be written compactly and readably. Programs
125	written in Python are typically much shorter than equivalent C,
126	\Cpp{}, or Java programs, for several reasons:
127	\begin{itemize}
128	\item
129	the high-level data types allow you to express complex operations in a
130	single statement;
131	\item
132	statement grouping is done by indentation instead of beginning and ending
133	brackets;
134	\item
135	no variable or argument declarations are necessary.
136	\end{itemize}
137
138	Python is \emph{extensible}: if you know how to program in C it is easy
139	to add a new built-in function or module to the interpreter, either to
140	perform critical operations at maximum speed, or to link Python
141	programs to libraries that may only be available in binary form (such
142	as a vendor-specific graphics library). Once you are really hooked,
143	you can link the Python interpreter into an application written in C
144	and use it as an extension or command language for that application.
145
146	By the way, the language is named after the BBC show ``Monty Python's
147	Flying Circus'' and has nothing to do with nasty reptiles. Making
148	references to Monty Python skits in documentation is not only allowed,
149	it is encouraged!
150
151	%\section{Where From Here \label{where}}
152
153	Now that you are all excited about Python, you'll want to examine it
154	in some more detail. Since the best way to learn a language is
155	to use it, the tutorial invites you to play with the Python interpreter
156	as you read.
157
158	In the next chapter, the mechanics of using the interpreter are
159	explained. This is rather mundane information, but essential for
160	trying out the examples shown later.
161
162	The rest of the tutorial introduces various features of the Python
163	language and system through examples, beginning with simple
164	expressions, statements and data types, through functions and modules,
165	and finally touching upon advanced concepts like exceptions
166	and user-defined classes.
167
168	\chapter{Using the Python Interpreter \label{using}}
169
170	\section{Invoking the Interpreter \label{invoking}}
171
172	The Python interpreter is usually installed as
173	\file{/usr/local/bin/python} on those machines where it is available;
174	putting \file{/usr/local/bin} in your \UNIX{} shell's search path
175	makes it possible to start it by typing the command
176
177	\begin{verbatim}
178	python
179	\end{verbatim}
180
181	to the shell. Since the choice of the directory where the interpreter
182	lives is an installation option, other places are possible; check with
183	your local Python guru or system administrator. (E.g.,
184	\file{/usr/local/python} is a popular alternative location.)
185
186	On Windows machines, the Python installation is usually placed in
187	\file{C:\e Python24}, though you can change this when you're running
188	the installer. To add this directory to your path,
189	you can type the following command into the command prompt in a DOS box:
190
191	\begin{verbatim}
192	set path=%path%;C:\python24
193	\end{verbatim}
194
195
196	Typing an end-of-file character (\kbd{Control-D} on \UNIX,
197	\kbd{Control-Z} on Windows) at the primary prompt causes the
198	interpreter to exit with a zero exit status. If that doesn't work,
199	you can exit the interpreter by typing the following commands:
200	\samp{import sys; sys.exit()}.
201
202	The interpreter's line-editing features usually aren't very
203	sophisticated. On \UNIX, whoever installed the interpreter may have
204	enabled support for the GNU readline library, which adds more
205	elaborate interactive editing and history features. Perhaps the
206	quickest check to see whether command line editing is supported is
207	typing Control-P to the first Python prompt you get. If it beeps, you
208	have command line editing; see Appendix \ref{interacting} for an
209	introduction to the keys. If nothing appears to happen, or if
210	\code{\^P} is echoed, command line editing isn't available; you'll
211	only be able to use backspace to remove characters from the current
212	line.
213
214	The interpreter operates somewhat like the \UNIX{} shell: when called
215	with standard input connected to a tty device, it reads and executes
216	commands interactively; when called with a file name argument or with
217	a file as standard input, it reads and executes a \emph{script} from
218	that file.
219
220	A second way of starting the interpreter is
221	\samp{\program{python} \programopt{-c} \var{command} [arg] ...}, which
222	executes the statement(s) in \var{command}, analogous to the shell's
223	\programopt{-c} option. Since Python statements often contain spaces
224	or other characters that are special to the shell, it is best to quote
225	\var{command} in its entirety with double quotes.
226
227	Some Python modules are also useful as scripts. These can be invoked using
228	\samp{\program{python} \programopt{-m} \var{module} [arg] ...}, which
229	executes the source file for \var{module} as if you had spelled out its
230	full name on the command line.
231
232	Note that there is a difference between \samp{python file} and
233	\samp{python <file}. In the latter case, input requests from the
234	program, such as calls to \function{input()} and \function{raw_input()}, are
235	satisfied from \emph{file}. Since this file has already been read
236	until the end by the parser before the program starts executing, the
237	program will encounter end-of-file immediately. In the former case
238	(which is usually what you want) they are satisfied from whatever file
239	or device is connected to standard input of the Python interpreter.
240
241	When a script file is used, it is sometimes useful to be able to run
242	the script and enter interactive mode afterwards. This can be done by
243	passing \programopt{-i} before the script. (This does not work if the
244	script is read from standard input, for the same reason as explained
245	in the previous paragraph.)
246
247	\subsection{Argument Passing \label{argPassing}}
248
249	When known to the interpreter, the script name and additional
250	arguments thereafter are passed to the script in the variable
251	\code{sys.argv}, which is a list of strings. Its length is at least
252	one; when no script and no arguments are given, \code{sys.argv[0]} is
253	an empty string. When the script name is given as \code{'-'} (meaning
254	standard input), \code{sys.argv[0]} is set to \code{'-'}. When
255	\programopt{-c} \var{command} is used, \code{sys.argv[0]} is set to
256	\code{'-c'}. When \programopt{-m} \var{module} is used, \code{sys.argv[0]}
257	is set to the full name of the located module. Options found after
258	\programopt{-c} \var{command} or \programopt{-m} \var{module} are not consumed
259	by the Python interpreter's option processing but left in \code{sys.argv} for
260	the command or module to handle.
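
As a rough illustration of the \programopt{-c} case described above, the
following sketch starts a child interpreter and collects the
\code{sys.argv} that the child reports. The helper name
\code{argv_seen_by} is made up for this example, and the code assumes an
interpreter recent enough to provide \code{subprocess.check_output()}
(2.7 or later).

```python
import subprocess
import sys


def argv_seen_by(extra_args):
    """Run a child interpreter with -c and return the sys.argv it observed."""
    # The child prints one argv entry per line so the parent can split them.
    child_code = "import sys; print('\\n'.join(sys.argv))"
    output = subprocess.check_output(
        [sys.executable, "-c", child_code] + list(extra_args),
        universal_newlines=True,
    )
    return output.splitlines()


# '-v' comes after the command, so it is left in sys.argv for the command
# to handle instead of being consumed as an interpreter option.
seen = argv_seen_by(["-v", "spam"])
```

Under those assumptions, \code{seen} should start with \code{'-c'},
followed by the two extra arguments unchanged.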
261
262	\subsection{Interactive Mode \label{interactive}}
263
264	When commands are read from a tty, the interpreter is said to be in
265	\emph{interactive mode}. In this mode it prompts for the next command
266	with the \emph{primary prompt}, usually three greater-than signs
267	(\samp{>>>~}); for continuation lines it prompts with the
268	\emph{secondary prompt}, by default three dots (\samp{...~}).
269	The interpreter prints a welcome message stating its version number
270	and a copyright notice before printing the first prompt:
271
272	\begin{verbatim}
273	python
274	Python 1.5.2b2 (#1, Feb 28 1999, 00:02:06) [GCC 2.8.1] on sunos5
275	Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
276	>>>
277	\end{verbatim}
278
279	Continuation lines are needed when entering a multi-line construct.
280	As an example, take a look at this \keyword{if} statement:
281
\begin{verbatim}
>>> the_world_is_flat = 1
>>> if the_world_is_flat:
...     print "Be careful not to fall off!"
...
Be careful not to fall off!
\end{verbatim}
289
290
291	\section{The Interpreter and Its Environment \label{interp}}
292
293	\subsection{Error Handling \label{error}}
294
When an error occurs, the interpreter prints an error
message and a stack trace. In interactive mode, it then returns to
the primary prompt; when input came from a file, it exits with a
nonzero exit status after printing
the stack trace. (Exceptions handled by an \keyword{except} clause in a
\keyword{try} statement are not errors in this context.) Some errors are
unconditionally fatal and cause an exit with a nonzero exit status; this
applies to internal inconsistencies and some cases of running out of
memory. All error messages are written to the standard error stream;
normal output from executed commands is written to standard
output.
306
307	Typing the interrupt character (usually Control-C or DEL) to the
308	primary or secondary prompt cancels the input and returns to the
309	primary prompt.\footnote{
310	A problem with the GNU Readline package may prevent this.
311	}
312	Typing an interrupt while a command is executing raises the
313	\exception{KeyboardInterrupt} exception, which may be handled by a
314	\keyword{try} statement.
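
A minimal sketch of handling the interrupt in a \keyword{try} statement
follows. The function names are hypothetical, and the interrupt is
simulated by raising the exception by hand, since an example cannot type
Control-C for you.

```python
def run_until_interrupted(task):
    """Call task() and report whether it was cut short by an interrupt."""
    try:
        task()
    except KeyboardInterrupt:
        # Reached when the user types the interrupt character while
        # task() is running (or, as below, when it is raised by hand).
        return "interrupted"
    return "finished"


def simulated_interrupt():
    # Stand-in for a long-running computation interrupted by the user.
    raise KeyboardInterrupt


outcome = run_until_interrupted(simulated_interrupt)
```

A task that returns normally makes the helper report \code{'finished'}
instead.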
315
316	\subsection{Executable Python Scripts \label{scripts}}
317
318	On BSD'ish \UNIX{} systems, Python scripts can be made directly
319	executable, like shell scripts, by putting the line
320
321	\begin{verbatim}
322	#! /usr/bin/env python
323	\end{verbatim}
324
325	(assuming that the interpreter is on the user's \envvar{PATH}) at the
326	beginning of the script and giving the file an executable mode. The
327	\samp{\#!} must be the first two characters of the file. On some
328	platforms, this first line must end with a \UNIX-style line ending
329	(\character{\e n}), not a Mac OS (\character{\e r}) or Windows
330	(\character{\e r\e n}) line ending. Note that
331	the hash, or pound, character, \character{\#}, is used to start a
332	comment in Python.
333
334	The script can be given an executable mode, or permission, using the
335	\program{chmod} command:
336
337	\begin{verbatim}
338	$ chmod +x myscript.py
339	\end{verbatim} % $ <-- bow to font-lock
340
341
342	\subsection{Source Code Encoding}
343
It is possible to use encodings other than \ASCII{} in Python source
files. The best way to do it is to put one more special comment line
right after the \code{\#!} line to define the source file encoding:

\begin{alltt}
# -*- coding: \var{encoding} -*-
\end{alltt}
351
352	With that declaration, all characters in the source file will be treated as
353	having the encoding \var{encoding}, and it will be
354	possible to directly write Unicode string literals in the selected
355	encoding. The list of possible encodings can be found in the
356	\citetitle[../lib/lib.html]{Python Library Reference}, in the section
357	on \ulink{\module{codecs}}{../lib/module-codecs.html}.
358
359	For example, to write Unicode literals including the Euro currency
360	symbol, the ISO-8859-15 encoding can be used, with the Euro symbol
361	having the ordinal value 164. This script will print the value 8364
362	(the Unicode codepoint corresponding to the Euro symbol) and then
363	exit:
364
\begin{alltt}
# -*- coding: iso-8859-15 -*-

currency = u"\texteuro"
print ord(currency)
\end{alltt}
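
The relationship between the byte value 164 and the code point 8364 can
also be checked without a coding declaration, by decoding the byte
explicitly. This is only a sketch; the variable names are illustrative,
and the \code{b"..."} prefix assumes Python 2.6 or later.

```python
# Byte 164 (0xA4) is the Euro sign in ISO-8859-15.
euro_byte = b"\xa4"
currency = euro_byte.decode("iso-8859-15")
codepoint = ord(currency)

# In ISO-8859-1 (Latin-1) the same byte decodes to the generic
# currency sign, whose code point is still 164.
latin1_codepoint = ord(euro_byte.decode("iso-8859-1"))
```

The same byte therefore means different characters depending on the
declared encoding, which is why the declaration matters.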
371
372	If your editor supports saving files as \code{UTF-8} with a UTF-8
373	\emph{byte order mark} (aka BOM), you can use that instead of an
374	encoding declaration. IDLE supports this capability if
375	\code{Options/General/Default Source Encoding/UTF-8} is set. Notice
376	that this signature is not understood in older Python releases (2.2
377	and earlier), and also not understood by the operating system for
378	script files with \code{\#!} lines (only used on \UNIX{} systems).
379
380	By using UTF-8 (either through the signature or an encoding
381	declaration), characters of most languages in the world can be used
382	simultaneously in string literals and comments. Using non-\ASCII{}
383	characters in identifiers is not supported. To display all these
384	characters properly, your editor must recognize that the file is
385	UTF-8, and it must use a font that supports all the characters in the
386	file.
387
388	\subsection{The Interactive Startup File \label{startup}}
389
390	% XXX This should probably be dumped in an appendix, since most people
391	% don't use Python interactively in non-trivial ways.
392
393	When you use Python interactively, it is frequently handy to have some
394	standard commands executed every time the interpreter is started. You
395	can do this by setting an environment variable named
396	\envvar{PYTHONSTARTUP} to the name of a file containing your start-up
397	commands. This is similar to the \file{.profile} feature of the
398	\UNIX{} shells.
399
400	This file is only read in interactive sessions, not when Python reads
401	commands from a script, and not when \file{/dev/tty} is given as the
402	explicit source of commands (which otherwise behaves like an
403	interactive session). It is executed in the same namespace where
404	interactive commands are executed, so that objects that it defines or
405	imports can be used without qualification in the interactive session.
406	You can also change the prompts \code{sys.ps1} and \code{sys.ps2} in
407	this file.
408
409	If you want to read an additional start-up file from the current
410	directory, you can program this in the global start-up file using code
411	like \samp{if os.path.isfile('.pythonrc.py'):
412	execfile('.pythonrc.py')}. If you want to use the startup file in a
413	script, you must do this explicitly in the script:
414
\begin{verbatim}
import os
filename = os.environ.get('PYTHONSTARTUP')
if filename and os.path.isfile(filename):
    execfile(filename)
\end{verbatim}
421
422
423	\chapter{An Informal Introduction to Python \label{informal}}
424
425	In the following examples, input and output are distinguished by the
426	presence or absence of prompts (\samp{>>>~} and \samp{...~}): to repeat
427	the example, you must type everything after the prompt, when the
428	prompt appears; lines that do not begin with a prompt are output from
429	the interpreter. %
430	%\footnote{
431	% I'd prefer to use different fonts to distinguish input
432	% from output, but the amount of LaTeX hacking that would require
433	% is currently beyond my ability.
434	%}
435	Note that a secondary prompt on a line by itself in an example means
436	you must type a blank line; this is used to end a multi-line command.
437
438	Many of the examples in this manual, even those entered at the
439	interactive prompt, include comments. Comments in Python start with
440	the hash character, \character{\#}, and extend to the end of the
441	physical line. A comment may appear at the start of a line or
442	following whitespace or code, but not within a string literal. A hash
443	character within a string literal is just a hash character.
444
445	Some examples:
446
447	\begin{verbatim}
448	# this is the first comment
449	SPAM = 1 # and this is the second comment
450	# ... and now a third!
451	STRING = "# This is not a comment."
452	\end{verbatim}
453
454
455	\section{Using Python as a Calculator \label{calculator}}
456
457	Let's try some simple Python commands. Start the interpreter and wait
458	for the primary prompt, \samp{>>>~}. (It shouldn't take long.)
459
460	\subsection{Numbers \label{numbers}}
461
462	The interpreter acts as a simple calculator: you can type an
463	expression at it and it will write the value. Expression syntax is
464	straightforward: the operators \code{+}, \code{-}, \code{*} and
465	\code{/} work just like in most other languages (for example, Pascal
466	or C); parentheses can be used for grouping. For example:
467
468	\begin{verbatim}
469	>>> 2+2
470	4
471	>>> # This is a comment
472	... 2+2
473	4
474	>>> 2+2 # and a comment on the same line as code
475	4
476	>>> (50-5*6)/4
477	5
478	>>> # Integer division returns the floor:
479	... 7/3
480	2
481	>>> 7/-3
482	-3
483	\end{verbatim}
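
The floor behaviour shown above can also be requested explicitly with
the \code{//} operator and inspected with \function{divmod()}. Both
behave the same way for integers in later Python versions, where
\code{/} no longer floors. A small sketch, with illustrative variable
names:

```python
# Floor division rounds toward negative infinity, not toward zero.
positive = 7 // 3
negative = 7 // -3

# divmod() returns the floored quotient together with the remainder,
# so that a == b * quotient + remainder holds.
quotient, remainder = divmod(7, -3)
```

Here the remainder takes the sign of the divisor, which is what keeps
the identity in the comment true for negative operands.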
484
485	The equal sign (\character{=}) is used to assign a value to a variable.
486	Afterwards, no result is displayed before the next interactive prompt:
487
488	\begin{verbatim}
489	>>> width = 20
490	>>> height = 5*9
491	>>> width * height
492	900
493	\end{verbatim}
494
495	A value can be assigned to several variables simultaneously:
496
497	\begin{verbatim}
498	>>> x = y = z = 0 # Zero x, y and z
499	>>> x
500	0
501	>>> y
502	0
503	>>> z
504	0
505	\end{verbatim}
506
507	There is full support for floating point; operators with mixed type
508	operands convert the integer operand to floating point:
509
510	\begin{verbatim}
511	>>> 3 * 3.75 / 1.5
512	7.5
513	>>> 7.0 / 2
514	3.5
515	\end{verbatim}
516
517	Complex numbers are also supported; imaginary numbers are written with
518	a suffix of \samp{j} or \samp{J}. Complex numbers with a nonzero
519	real component are written as \samp{(\var{real}+\var{imag}j)}, or can
520	be created with the \samp{complex(\var{real}, \var{imag})} function.
521
522	\begin{verbatim}
523	>>> 1j * 1J
524	(-1+0j)
525	>>> 1j * complex(0,1)
526	(-1+0j)
527	>>> 3+1j*3
528	(3+3j)
529	>>> (3+1j)*3
530	(9+3j)
531	>>> (1+2j)/(1+1j)
532	(1.5+0.5j)
533	\end{verbatim}
534
535	Complex numbers are always represented as two floating point numbers,
536	the real and imaginary part. To extract these parts from a complex
537	number \var{z}, use \code{\var{z}.real} and \code{\var{z}.imag}.
538
539	\begin{verbatim}
540	>>> a=1.5+0.5j
541	>>> a.real
542	1.5
543	>>> a.imag
544	0.5
545	\end{verbatim}
546
547	The conversion functions to floating point and integer
548	(\function{float()}, \function{int()} and \function{long()}) don't
549	work for complex numbers --- there is no one correct way to convert a
550	complex number to a real number. Use \code{abs(\var{z})} to get its
551	magnitude (as a float) or \code{z.real} to get its real part.
552
\begin{verbatim}
>>> a=3.0+4.0j
>>> float(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: can't convert complex to float; use abs(z)
>>> a.real
3.0
>>> a.imag
4.0
>>> abs(a)  # sqrt(a.real**2 + a.imag**2)
5.0
>>>
\end{verbatim}
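
The comment in the session above can be spelled out as a short check
that \code{abs()} agrees with the magnitude computed by hand. This is a
sketch using \code{math.sqrt()}; the variable names are illustrative.

```python
import math

z = 3.0 + 4.0j

# Magnitude as returned by the built-in function.
magnitude = abs(z)

# The same value computed from the real and imaginary parts.
by_hand = math.sqrt(z.real ** 2 + z.imag ** 2)

# float() refuses complex arguments, as described above.
try:
    float(z)
    conversion_failed = False
except TypeError:
    conversion_failed = True
```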
567
568	In interactive mode, the last printed expression is assigned to the
569	variable \code{_}. This means that when you are using Python as a
570	desk calculator, it is somewhat easier to continue calculations, for
571	example:
572
573	\begin{verbatim}
574	>>> tax = 12.5 / 100
575	>>> price = 100.50
576	>>> price * tax
577	12.5625
578	>>> price + _
579	113.0625
580	>>> round(_, 2)
581	113.06
582	>>>
583	\end{verbatim}
584
585	This variable should be treated as read-only by the user. Don't
586	explicitly assign a value to it --- you would create an independent
587	local variable with the same name masking the built-in variable with
588	its magic behavior.
589
590	\subsection{Strings \label{strings}}
591
592	Besides numbers, Python can also manipulate strings, which can be
593	expressed in several ways. They can be enclosed in single quotes or
594	double quotes:
595
596	\begin{verbatim}
597	>>> 'spam eggs'
598	'spam eggs'
599	>>> 'doesn\'t'
600	"doesn't"
601	>>> "doesn't"
602	"doesn't"
603	>>> '"Yes," he said.'
604	'"Yes," he said.'
605	>>> "\"Yes,\" he said."
606	'"Yes," he said.'
607	>>> '"Isn\'t," she said.'
608	'"Isn\'t," she said.'
609	\end{verbatim}
610
611	String literals can span multiple lines in several ways. Continuation
612	lines can be used, with a backslash as the last character on the line
613	indicating that the next line is a logical continuation of the line:
614
\begin{verbatim}
hello = "This is a rather long string containing\n\
several lines of text just as you would do in C.\n\
    Note that whitespace at the beginning of the line is\
 significant."

print hello
\end{verbatim}
623
624	Note that newlines still need to be embedded in the string using
625	\code{\e n}; the newline following the trailing backslash is
626	discarded. This example would print the following:
627
\begin{verbatim}
This is a rather long string containing
several lines of text just as you would do in C.
    Note that whitespace at the beginning of the line is significant.
\end{verbatim}
633
634	If we make the string literal a ``raw'' string, however, the
635	\code{\e n} sequences are not converted to newlines, but the backslash
636	at the end of the line, and the newline character in the source, are
637	both included in the string as data. Thus, the example:
638
639	\begin{verbatim}
640	hello = r"This is a rather long string containing\n\
641	several lines of text much as you would do in C."
642
643	print hello
644	\end{verbatim}
645
646	would print:
647
648	\begin{verbatim}
649	This is a rather long string containing\n\
650	several lines of text much as you would do in C.
651	\end{verbatim}
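
One way to see the difference between the two kinds of literal is to
compare their lengths. The following sketch uses short strings with
illustrative names.

```python
cooked = "a\nb"   # \n is converted to a single newline character
raw = r"a\nb"     # the backslash and the n are kept as two characters

cooked_length = len(cooked)
raw_length = len(raw)

# The raw literal is the same string as one written with an
# escaped backslash.
same_as_escaped = (raw == "a\\nb")
```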
652
653	Or, strings can be surrounded in a pair of matching triple-quotes:
654	\code{"""} or \code{'\code{'}'}. End of lines do not need to be escaped
655	when using triple-quotes, but they will be included in the string.
656
\begin{verbatim}
print """
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
"""
\end{verbatim}
664
665	produces the following output:
666
\begin{verbatim}
Usage: thingy [OPTIONS]
     -h                        Display this usage message
     -H hostname               Hostname to connect to
\end{verbatim}
672
673	The interpreter prints the result of string operations in the same way
674	as they are typed for input: inside quotes, and with quotes and other
675	funny characters escaped by backslashes, to show the precise
676	value. The string is enclosed in double quotes if the string contains
677	a single quote and no double quotes, else it's enclosed in single
678	quotes. (The \keyword{print} statement, described later, can be used
679	to write strings without quotes or escapes.)
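
The quoting rule can be observed directly with the built-in
\function{repr()} function, which produces the form the interactive
prompt displays. A small sketch, with an illustrative list of sample
strings:

```python
samples = [
    "spam eggs",            # no quotes inside: shown in single quotes
    "doesn't",              # only a single quote: shown in double quotes
    '"Yes," he said.',      # only double quotes: shown in single quotes
    '"Isn\'t," she said.',  # both kinds: single quotes, with \' escaped
]

shown = [repr(s) for s in samples]
```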
680
681	Strings can be concatenated (glued together) with the
682	\code{+} operator, and repeated with \code{*}:
683
684	\begin{verbatim}
685	>>> word = 'Help' + 'A'
686	>>> word
687	'HelpA'
688	>>> '<' + word*5 + '>'
689	'<HelpAHelpAHelpAHelpAHelpA>'
690	\end{verbatim}
691
692	Two string literals next to each other are automatically concatenated;
693	the first line above could also have been written \samp{word = 'Help'
694	'A'}; this only works with two literals, not with arbitrary string
695	expressions:
696
\begin{verbatim}
>>> 'str' 'ing'                   #  <-  This is ok
'string'
>>> 'str'.strip() + 'ing'   #  <-  This is ok
'string'
>>> 'str'.strip() 'ing'     #  <-  This is invalid
  File "<stdin>", line 1, in ?
    'str'.strip() 'ing'
                      ^
SyntaxError: invalid syntax
\end{verbatim}
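
A common use of adjacent literals is breaking a long string across
source lines inside parentheses. The sketch below shows this next to the
explicit \code{+} that an expression needs; the variable names are
illustrative.

```python
# Adjacent literals are joined with nothing inserted between them,
# so each piece must carry its own trailing space.
message = ("Two string literals next to each other "
           "are automatically concatenated, "
           "even across lines inside parentheses.")

# An arbitrary string expression needs an explicit + operator.
prefix = "str"
joined = prefix.strip() + "ing"
```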
708
709	Strings can be subscripted (indexed); like in C, the first character
710	of a string has subscript (index) 0. There is no separate character
711	type; a character is simply a string of size one. Like in Icon,
712	substrings can be specified with the \emph{slice notation}: two indices
713	separated by a colon.
714
715	\begin{verbatim}
716	>>> word[4]
717	'A'
718	>>> word[0:2]
719	'He'
720	>>> word[2:4]
721	'lp'
722	\end{verbatim}
723
724	Slice indices have useful defaults; an omitted first index defaults to
725	zero, an omitted second index defaults to the size of the string being
726	sliced.
727
728	\begin{verbatim}
729	>>> word[:2] # The first two characters
730	'He'
731	>>> word[2:] # Everything except the first two characters
732	'lpA'
733	\end{verbatim}
734
735	Unlike a C string, Python strings cannot be changed. Assigning to an
736	indexed position in the string results in an error:
737
\begin{verbatim}
>>> word[0] = 'x'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support item assignment
>>> word[:1] = 'Splat'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object doesn't support slice assignment
\end{verbatim}
748
749	However, creating a new string with the combined content is easy and
750	efficient:
751
752	\begin{verbatim}
753	>>> 'x' + word[1:]
754	'xelpA'
755	>>> 'Splat' + word[4]
756	'SplatA'
757	\end{verbatim}
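
The same idea can be wrapped in a small helper that returns a modified
copy and leaves the original string untouched. The function name
\code{replace_char} is hypothetical; this is a sketch, not a standard
library facility.

```python
def replace_char(text, index, replacement):
    """Return a copy of text with the character at index replaced."""
    if not 0 <= index < len(text):
        raise IndexError("string index out of range")
    # Slices on either side of the position, glued around the new text.
    return text[:index] + replacement + text[index + 1:]


word = "HelpA"
changed = replace_char(word, 0, "x")
```

After the call, \code{word} still holds its original value, because a
new string object was built.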
758
759	Here's a useful invariant of slice operations:
760	\code{s[:i] + s[i:]} equals \code{s}.
761
762	\begin{verbatim}
763	>>> word[:2] + word[2:]
764	'HelpA'
765	>>> word[:3] + word[3:]
766	'HelpA'
767	\end{verbatim}
768
769	Degenerate slice indices are handled gracefully: an index that is too
770	large is replaced by the string size, an upper bound smaller than the
771	lower bound returns an empty string.
772
773	\begin{verbatim}
774	>>> word[1:100]
775	'elpA'
776	>>> word[10:]
777	''
778	>>> word[2:1]
779	''
780	\end{verbatim}
781
782	Indices may be negative numbers, to start counting from the right.
783	For example:
784
785	\begin{verbatim}
786	>>> word[-1] # The last character
787	'A'
788	>>> word[-2] # The last-but-one character
789	'p'
790	>>> word[-2:] # The last two characters
791	'pA'
792	>>> word[:-2] # Everything except the last two characters
793	'Hel'
794	\end{verbatim}
795
796	But note that -0 is really the same as 0, so it does not count from
797	the right!
798
799	\begin{verbatim}
800	>>> word[-0] # (since -0 equals 0)
801	'H'
802	\end{verbatim}
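
This matters when the count is computed at run time: \code{text[-n:]}
returns the whole string, not an empty one, when \code{n} happens to be
zero. A minimal sketch of a helper that guards against this follows; the
name \code{last_n} is made up for the example and \code{n} is assumed to
be non-negative.

```python
def last_n(text, n):
    """Return the last n characters of text, for n >= 0."""
    if n == 0:
        # text[-0:] is text[0:], i.e. the whole string, so handle
        # the zero case separately.
        return ""
    return text[-n:]


word = "HelpA"
tail = last_n(word, 2)
nothing = last_n(word, 0)
whole = word[-0:]
```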
803
804	Out-of-range negative slice indices are truncated, but don't try this
805	for single-element (non-slice) indices:
806
\begin{verbatim}
>>> word[-100:]
'HelpA'
>>> word[-10]    # error
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: string index out of range
\end{verbatim}
815
816	The best way to remember how slices work is to think of the indices as
817	pointing \emph{between} characters, with the left edge of the first
818	character numbered 0. Then the right edge of the last character of a
819	string of \var{n} characters has index \var{n}, for example:
820
\begin{verbatim}
 +---+---+---+---+---+
 | H | e | l | p | A |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1
\end{verbatim}
828
829	The first row of numbers gives the position of the indices 0...5 in
830	the string; the second row gives the corresponding negative indices.
831	The slice from \var{i} to \var{j} consists of all characters between
832	the edges labeled \var{i} and \var{j}, respectively.
833
834	For non-negative indices, the length of a slice is the difference of
835	the indices, if both are within bounds. For example, the length of
836	\code{word[1:3]} is 2.
837
838	The built-in function \function{len()} returns the length of a string:
839
840	\begin{verbatim}
841	>>> s = 'supercalifragilisticexpialidocious'
842	>>> len(s)
843	34
844	\end{verbatim}
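
The slice rules described above can be checked mechanically. The
following is a minimal sketch, assuming \code{word} holds the value
\code{'HelpA'} shown in the examples; the other variable names are
chosen here purely for illustration, and the results are kept in
variables rather than printed.

```python
word = 'HelpA'
n = len(word)

# s[:i] + s[i:] equals s, even when i is out of range or negative.
invariant_holds = (word[:2] + word[2:] == word and
                   word[:100] + word[100:] == word and
                   word[:-2] + word[-2:] == word)

# For in-range, non-negative indices the slice length is j - i.
slice_length = len(word[1:3])

# A negative index -k names the same character as index n - k.
same_character = (word[-2] == word[n - 2])
```

Each comparison should come out true, and \code{slice_length} should
equal the difference of the two indices.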
845
846
847	\begin{seealso}
848	\seetitle[../lib/typesseq.html]{Sequence Types}%
849	{Strings, and the Unicode strings described in the next
850	section, are examples of \emph{sequence types}, and
851	support the common operations supported by such types.}
852	\seetitle[../lib/string-methods.html]{String Methods}%
853	{Both strings and Unicode strings support a large number of
854	methods for basic transformations and searching.}
855	\seetitle[../lib/typesseq-strings.html]{String Formatting Operations}%
856	{The formatting operations invoked when strings and Unicode
857	strings are the left operand of the \code{\%} operator are
858	described in more detail here.}
859	\end{seealso}
860
861
862	\subsection{Unicode Strings \label{unicodeStrings}}
863	\sectionauthor{Marc-Andre Lemburg}{mal@lemburg.com}
864
865	Starting with Python 2.0 a new data type for storing text data is
866	available to the programmer: the Unicode object. It can be used to
867	store and manipulate Unicode data (see \url{http://www.unicode.org/})
868	and integrates well with the existing string objects, providing
869	auto-conversions where necessary.
870
871	Unicode has the advantage of providing one ordinal for every character
872	in every script used in modern and ancient texts. Previously, there
873	were only 256 possible ordinals for script characters. Texts were
874	typically bound to a code page which mapped the ordinals to script
characters. This led to a great deal of confusion, especially with respect
876	to internationalization (usually written as \samp{i18n} ---
877	\character{i} + 18 characters + \character{n}) of software. Unicode
878	solves these problems by defining one code page for all scripts.
879
880	Creating Unicode strings in Python is just as simple as creating
881	normal strings:
882
883	\begin{verbatim}
884	>>> u'Hello World !'
885	u'Hello World !'
886	\end{verbatim}
887
888	The small \character{u} in front of the quote indicates that a
889	Unicode string is supposed to be created. If you want to include
890	special characters in the string, you can do so by using the Python
891	\emph{Unicode-Escape} encoding. The following example shows how:
892
893	\begin{verbatim}
894	>>> u'Hello\u0020World !'
895	u'Hello World !'
896	\end{verbatim}
897
The escape sequence \code{\e u0020} tells Python to insert the Unicode
character with the ordinal value 0x0020 (the space character) at the
given position.
901
902	Other characters are interpreted by using their respective ordinal
903	values directly as Unicode ordinals. If you have literal strings
904	in the standard Latin-1 encoding that is used in many Western countries,
905	you will find it convenient that the lower 256 characters
906	of Unicode are the same as the 256 characters of Latin-1.
907
908	For experts, there is also a raw mode just like the one for normal
909	strings. You have to prefix the opening quote with 'ur' to have
910	Python use the \emph{Raw-Unicode-Escape} encoding. It will only apply
the above \code{\e uXXXX} conversion if there is an odd number of
backslashes in front of the small 'u'.
913
914	\begin{verbatim}
915	>>> ur'Hello\u0020World !'
916	u'Hello World !'
917	>>> ur'Hello\\u0020World !'
918	u'Hello\\\\u0020World !'
919	\end{verbatim}
920
921	The raw mode is most useful when you have to enter lots of
922	backslashes, as can be necessary in regular expressions.
923
924	Apart from these standard encodings, Python provides a whole set of
925	other ways of creating Unicode strings on the basis of a known
926	encoding.
927
928	The built-in function \function{unicode()}\bifuncindex{unicode} provides
929	access to all registered Unicode codecs (COders and DECoders). Some of
930	the more well known encodings which these codecs can convert are
931	\emph{Latin-1}, \emph{ASCII}, \emph{UTF-8}, and \emph{UTF-16}.
932	The latter two are variable-length encodings that store each Unicode
933	character in one or more bytes. The default encoding is
934	normally set to \ASCII, which passes through characters in the range
935	0 to 127 and rejects any other characters with an error.
936	When a Unicode string is printed, written to a file, or converted
937	with \function{str()}, conversion takes place using this default encoding.
938
939	\begin{verbatim}
940	>>> u"abc"
941	u'abc'
942	>>> str(u"abc")
943	'abc'
944	>>> u"äöü"
945	u'\xe4\xf6\xfc'
946	>>> str(u"äöü")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
950	\end{verbatim}
951
952	To convert a Unicode string into an 8-bit string using a specific
953	encoding, Unicode objects provide an \function{encode()} method
954	that takes one argument, the name of the encoding. Lowercase names
955	for encodings are preferred.
956
957	\begin{verbatim}
958	>>> u"äöü".encode('utf-8')
959	'\xc3\xa4\xc3\xb6\xc3\xbc'
960	\end{verbatim}
961
962	If you have data in a specific encoding and want to produce a
963	corresponding Unicode string from it, you can use the
964	\function{unicode()} function with the encoding name as the second
965	argument.
966
967	\begin{verbatim}
968	>>> unicode('\xc3\xa4\xc3\xb6\xc3\xbc', 'utf-8')
969	u'\xe4\xf6\xfc'
970	\end{verbatim}
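
Encoding and decoding are inverse operations as long as the same codec
is used in both directions. The sketch below illustrates this with the
three characters from the examples above. It relies on the
\method{decode()} method of 8-bit strings, which this section does not
discuss; it is assumed here to behave like calling \function{unicode()}
with the encoding name as second argument. All variable names are
illustrative.

```python
text = u'\xe4\xf6\xfc'                  # the three characters used above

utf8_data = text.encode('utf-8')        # two bytes per character here
latin1_data = text.encode('latin-1')    # one byte per character

# Decoding with the matching codec restores the original Unicode string.
from_utf8 = utf8_data.decode('utf-8')
from_latin1 = latin1_data.decode('latin-1')
```

The two encoded forms differ in length, yet both decode back to the
same Unicode string.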
971
972	\subsection{Lists \label{lists}}
973
974	Python knows a number of \emph{compound} data types, used to group
975	together other values. The most versatile is the \emph{list}, which
976	can be written as a list of comma-separated values (items) between
977	square brackets. List items need not all have the same type.
978
979	\begin{verbatim}
980	>>> a = ['spam', 'eggs', 100, 1234]
981	>>> a
982	['spam', 'eggs', 100, 1234]
983	\end{verbatim}
984
985	Like string indices, list indices start at 0, and lists can be sliced,
986	concatenated and so on:
987
988	\begin{verbatim}
989	>>> a[0]
990	'spam'
991	>>> a[3]
992	1234
993	>>> a[-2]
994	100
995	>>> a[1:-1]
996	['eggs', 100]
997	>>> a[:2] + ['bacon', 2*2]
998	['spam', 'eggs', 'bacon', 4]
999	>>> 3*a[:3] + ['Boo!']
1000	['spam', 'eggs', 100, 'spam', 'eggs', 100, 'spam', 'eggs', 100, 'Boo!']
1001	\end{verbatim}
1002
1003	Unlike strings, which are \emph{immutable}, it is possible to change
1004	individual elements of a list:
1005
1006	\begin{verbatim}
1007	>>> a
1008	['spam', 'eggs', 100, 1234]
1009	>>> a[2] = a[2] + 23
1010	>>> a
1011	['spam', 'eggs', 123, 1234]
1012	\end{verbatim}
1013
1014	Assignment to slices is also possible, and this can even change the size
1015	of the list or clear it entirely:
1016
1017	\begin{verbatim}
1018	>>> # Replace some items:
1019	... a[0:2] = [1, 12]
1020	>>> a
1021	[1, 12, 123, 1234]
1022	>>> # Remove some:
1023	... a[0:2] = []
1024	>>> a
1025	[123, 1234]
1026	>>> # Insert some:
1027	... a[1:1] = ['bletch', 'xyzzy']
1028	>>> a
1029	[123, 'bletch', 'xyzzy', 1234]
1030	>>> # Insert (a copy of) itself at the beginning
1031	>>> a[:0] = a
1032	>>> a
1033	[123, 'bletch', 'xyzzy', 1234, 123, 'bletch', 'xyzzy', 1234]
1034	>>> # Clear the list: replace all items with an empty list
1035	>>> a[:] = []
1036	>>> a
1037	[]
1038	\end{verbatim}
1039
1040	The built-in function \function{len()} also applies to lists:
1041
1042	\begin{verbatim}
>>> a = ['a', 'b', 'c', 'd']
>>> len(a)
4
1045	\end{verbatim}
1046
1047	It is possible to nest lists (create lists containing other lists),
1048	for example:
1049
1050	\begin{verbatim}
1051	>>> q = [2, 3]
1052	>>> p = [1, q, 4]
1053	>>> len(p)
1054	3
1055	>>> p[1]
1056	[2, 3]
1057	>>> p[1][0]
1058	2
1059	>>> p[1].append('xtra') # See section 5.1
1060	>>> p
1061	[1, [2, 3, 'xtra'], 4]
1062	>>> q
1063	[2, 3, 'xtra']
1064	\end{verbatim}
1065
1066	Note that in the last example, \code{p[1]} and \code{q} really refer to
1067	the same object! We'll come back to \emph{object semantics} later.
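
The shared reference can be made visible with a short sketch. It uses
the \keyword{is} operator, which compares object identity and is only
introduced later in this tutorial, and a full slice to obtain an
independent copy; the name \code{q_copy} is hypothetical.

```python
q = [2, 3]
p = [1, q, 4]

same_object = p[1] is q     # both expressions refer to one list object
q_copy = q[:]               # a full slice builds a new, separate list

p[1].append('xtra')         # changes the shared list in place
```

After the call to \method{append()}, the change should be visible
through both \code{p[1]} and \code{q}, while \code{q_copy} keeps its
two original items.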
1068
1069	\section{First Steps Towards Programming \label{firstSteps}}
1070
1071	Of course, we can use Python for more complicated tasks than adding
1072	two and two together. For instance, we can write an initial
1073	sub-sequence of the \emph{Fibonacci} series as follows:
1074
1075	\begin{verbatim}
>>> # Fibonacci series:
... # the sum of two elements defines the next
... a, b = 0, 1
>>> while b < 10:
...     print b
...     a, b = b, a+b
...
1
1
2
3
5
8
1089	\end{verbatim}
1090
1091	This example introduces several new features.
1092
1093	\begin{itemize}
1094
1095	\item
1096	The first line contains a \emph{multiple assignment}: the variables
1097	\code{a} and \code{b} simultaneously get the new values 0 and 1. On the
1098	last line this is used again, demonstrating that the expressions on
1099	the right-hand side are all evaluated first before any of the
1100	assignments take place. The right-hand side expressions are evaluated
1101	from the left to the right.
1102
1103	\item
1104	The \keyword{while} loop executes as long as the condition (here:
1105	\code{b < 10}) remains true. In Python, like in C, any non-zero
1106	integer value is true; zero is false. The condition may also be a
1107	string or list value, in fact any sequence; anything with a non-zero
1108	length is true, empty sequences are false. The test used in the
1109	example is a simple comparison. The standard comparison operators are
1110	written the same as in C: \code{<} (less than), \code{>} (greater than),
1111	\code{==} (equal to), \code{<=} (less than or equal to),
1112	\code{>=} (greater than or equal to) and \code{!=} (not equal to).
1113
1114	\item
1115	The \emph{body} of the loop is \emph{indented}: indentation is Python's
1116	way of grouping statements. Python does not (yet!) provide an
1117	intelligent input line editing facility, so you have to type a tab or
1118	space(s) for each indented line. In practice you will prepare more
1119	complicated input for Python with a text editor; most text editors have
1120	an auto-indent facility. When a compound statement is entered
1121	interactively, it must be followed by a blank line to indicate
1122	completion (since the parser cannot guess when you have typed the last
1123	line). Note that each line within a basic block must be indented by
1124	the same amount.
1125
1126	\item
1127	The \keyword{print} statement writes the value of the expression(s) it is
1128	given. It differs from just writing the expression you want to write
1129	(as we did earlier in the calculator examples) in the way it handles
1130	multiple expressions and strings. Strings are printed without quotes,
1131	and a space is inserted between items, so you can format things nicely,
1132	like this:
1133
1134	\begin{verbatim}
1135	>>> i = 256*256
1136	>>> print 'The value of i is', i
1137	The value of i is 65536
1138	\end{verbatim}
1139
1140	A trailing comma avoids the newline after the output:
1141
1142	\begin{verbatim}
>>> a, b = 0, 1
>>> while b < 1000:
...     print b,
...     a, b = b, a+b
...
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
1149	\end{verbatim}
1150
1151	Note that the interpreter inserts a newline before it prints the next
1152	prompt if the last line was not completed.
1153
1154	\end{itemize}
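
The points about multiple assignment and truth values can be
illustrated with a small sketch. The variables \code{x} and \code{y}
and the use of the built-in \function{bool()} are additions made here
for illustration and do not appear in the example above.

```python
# Multiple assignment: both right-hand sides use the old values.
a, b = 0, 1
a, b = b, a + b             # a becomes 1, b becomes 0 + 1

# Two separate assignments behave differently.
x, y = 0, 1
x = y                       # x is now 1
y = x + y                   # uses the new x, so y becomes 2

# The truth value of a sequence depends only on its length.
empty_is_true = bool('')
nonempty_is_true = bool([0])
```

Note that the list \code{[0]} counts as true because it is not empty,
even though its only item is zero.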
1155
1156
1157	\chapter{More Control Flow Tools \label{moreControl}}
1158
1159	Besides the \keyword{while} statement just introduced, Python knows
1160	the usual control flow statements known from other languages, with
1161	some twists.
1162
1163	\section{\keyword{if} Statements \label{if}}
1164
1165	Perhaps the most well-known statement type is the
1166	\keyword{if} statement. For example:
1167
1168	\begin{verbatim}
>>> x = int(raw_input("Please enter an integer: "))
>>> if x < 0:
...     x = 0
...     print 'Negative changed to zero'
... elif x == 0:
...     print 'Zero'
... elif x == 1:
...     print 'Single'
... else:
...     print 'More'
...
1180	\end{verbatim}
1181
1182	There can be zero or more \keyword{elif} parts, and the
1183	\keyword{else} part is optional. The keyword `\keyword{elif}' is
1184	short for `else if', and is useful to avoid excessive indentation. An
1185	\keyword{if} \ldots\ \keyword{elif} \ldots\ \keyword{elif} \ldots\ sequence
1186	% Weird spacings happen here if the wrapping of the source text
1187	% gets changed in the wrong way.
1188	is a substitute for the \keyword{switch} or
1189	\keyword{case} statements found in other languages.
1190
1191
1192	\section{\keyword{for} Statements \label{for}}
1193
1194	The \keyword{for}\stindex{for} statement in Python differs a bit from
1195	what you may be used to in C or Pascal. Rather than always
1196	iterating over an arithmetic progression of numbers (like in Pascal),
1197	or giving the user the ability to define both the iteration step and
halting condition (as in C), Python's
1199	\keyword{for}\stindex{for} statement iterates over the items of any
1200	sequence (a list or a string), in the order that they appear in
1201	the sequence. For example (no pun intended):
1202	% One suggestion was to give a real C example here, but that may only
1203	% serve to confuse non-C programmers.
1204
1205	\begin{verbatim}
>>> # Measure some strings:
... a = ['cat', 'window', 'defenestrate']
>>> for x in a:
...     print x, len(x)
...
cat 3
window 6
defenestrate 12
1214	\end{verbatim}
1215
1216	It is not safe to modify the sequence being iterated over in the loop
1217	(this can only happen for mutable sequence types, such as lists). If
1218	you need to modify the list you are iterating over (for example, to
1219	duplicate selected items) you must iterate over a copy. The slice
1220	notation makes this particularly convenient:
1221
1222	\begin{verbatim}
>>> for x in a[:]: # make a slice copy of the entire list
...     if len(x) > 6: a.insert(0, x)
...
>>> a
['defenestrate', 'cat', 'window', 'defenestrate']
1228	\end{verbatim}
1229
1230
1231	\section{The \function{range()} Function \label{range}}
1232
1233	If you do need to iterate over a sequence of numbers, the built-in
1234	function \function{range()} comes in handy. It generates lists
1235	containing arithmetic progressions:
1236
1237	\begin{verbatim}
1238	>>> range(10)
1239	[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
1240	\end{verbatim}
1241
1242	The given end point is never part of the generated list;
1243	\code{range(10)} generates a list of 10 values, the legal
1244	indices for items of a sequence of length 10. It is possible to let
1245	the range start at another number, or to specify a different increment
1246	(even negative; sometimes this is called the `step'):
1247
1248	\begin{verbatim}
1249	>>> range(5, 10)
1250	[5, 6, 7, 8, 9]
1251	>>> range(0, 10, 3)
1252	[0, 3, 6, 9]
1253	>>> range(-10, -100, -30)
1254	[-10, -40, -70]
1255	\end{verbatim}
1256
1257	To iterate over the indices of a sequence, combine
1258	\function{range()} and \function{len()} as follows:
1259
1260	\begin{verbatim}
>>> a = ['Mary', 'had', 'a', 'little', 'lamb']
>>> for i in range(len(a)):
...     print i, a[i]
...
0 Mary
1 had
2 a
3 little
4 lamb
1270	\end{verbatim}
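
A negative step combines naturally with \function{len()} when the
indices are needed in reverse order. The following sketch, with the
illustrative name \code{backwards}, walks the same list from its last
index down to 0 and collects the items instead of printing them.

```python
a = ['Mary', 'had', 'a', 'little', 'lamb']

backwards = []
# Start at the last index, stop before -1, step by -1.
for i in range(len(a) - 1, -1, -1):
    backwards.append(a[i])
```

The end point \code{-1} is needed because the end point itself is never
generated, so the last value produced is 0.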
1271
1272
1273	\section{\keyword{break} and \keyword{continue} Statements, and
1274	\keyword{else} Clauses on Loops
1275	\label{break}}
1276
1277	The \keyword{break} statement, like in C, breaks out of the smallest
1278	enclosing \keyword{for} or \keyword{while} loop.
1279
1280	The \keyword{continue} statement, also borrowed from C, continues
1281	with the next iteration of the loop.
1282
1283	Loop statements may have an \code{else} clause; it is executed when
1284	the loop terminates through exhaustion of the list (with
1285	\keyword{for}) or when the condition becomes false (with
1286	\keyword{while}), but not when the loop is terminated by a
1287	\keyword{break} statement. This is exemplified by the following loop,
1288	which searches for prime numbers:
1289
1290	\begin{verbatim}
>>> for n in range(2, 10):
...     for x in range(2, n):
...         if n % x == 0:
...             print n, 'equals', x, '*', n/x
...             break
...     else:
...         # loop fell through without finding a factor
...         print n, 'is a prime number'
...
2 is a prime number
3 is a prime number
4 equals 2 * 2
5 is a prime number
6 equals 2 * 3
7 is a prime number
8 equals 2 * 4
9 equals 3 * 3
1308	\end{verbatim}
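
The same search can collect its results in a list, which makes the role
of the \keyword{else} clause easy to verify. The sketch below also
shows \keyword{continue}, for which the text above gives no example.
The names \code{primes} and \code{odd_numbers} are illustrative.

```python
primes = []
for n in range(2, 20):
    for x in range(2, n):
        if n % x == 0:
            break               # a factor was found; the else is skipped
    else:
        primes.append(n)        # the inner loop ran to exhaustion

odd_numbers = []
for n in range(10):
    if n % 2 == 0:
        continue                # go straight to the next value of n
    odd_numbers.append(n)
```

Note that the \keyword{else} clause belongs to the inner \keyword{for}
loop, not to the \keyword{if} statement.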
1309
1310
1311	\section{\keyword{pass} Statements \label{pass}}
1312
1313	The \keyword{pass} statement does nothing.
1314	It can be used when a statement is required syntactically but the
1315	program requires no action.
1316	For example:
1317
1318	\begin{verbatim}
>>> while True:
...     pass # Busy-wait for keyboard interrupt
...
1322	\end{verbatim}
1323
1324
1325	\section{Defining Functions \label{functions}}
1326
1327	We can create a function that writes the Fibonacci series to an
1328	arbitrary boundary:
1329
1330	\begin{verbatim}
>>> def fib(n):    # write Fibonacci series up to n
...     """Print a Fibonacci series up to n."""
...     a, b = 0, 1
...     while b < n:
...         print b,
...         a, b = b, a+b
...
>>> # Now call the function we just defined:
... fib(2000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
1341	\end{verbatim}
1342
1343	The keyword \keyword{def} introduces a function \emph{definition}. It
1344	must be followed by the function name and the parenthesized list of
1345	formal parameters. The statements that form the body of the function
1346	start at the next line, and must be indented. The first statement of
1347	the function body can optionally be a string literal; this string
1348	literal is the function's \index{documentation strings}documentation
1349	string, or \dfn{docstring}.\index{docstrings}\index{strings, documentation}
1350
1351	There are tools which use docstrings to automatically produce online
1352	or printed documentation, or to let the user interactively browse
1353	through code; it's good practice to include docstrings in code that
1354	you write, so try to make a habit of it.
1355
1356	The \emph{execution} of a function introduces a new symbol table used
1357	for the local variables of the function. More precisely, all variable
1358	assignments in a function store the value in the local symbol table;
1359	whereas variable references first look in the local symbol table, then
1360	in the global symbol table, and then in the table of built-in names.
1361	Thus, global variables cannot be directly assigned a value within a
1362	function (unless named in a \keyword{global} statement), although
1363	they may be referenced.
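
The lookup rules can be tried out with a module-level variable. In this
sketch the name \code{counter} and the three helper functions are
hypothetical; the code is assumed to run at the top level of a script
or module, so that \code{counter} lives in the global symbol table.

```python
counter = 0

def set_local():
    counter = 100           # creates a local name; the global is untouched

def read_global():
    return counter          # the reference falls through to the global table

def bump_global():
    global counter
    counter = counter + 1   # assigns to the global name

set_local()
after_local = read_global()
bump_global()
after_global = read_global()
```

Only the function containing the \keyword{global} statement should
change the value seen by \code{read_global()}.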
1364
1365	The actual parameters (arguments) to a function call are introduced in
1366	the local symbol table of the called function when it is called; thus,
1367	arguments are passed using \emph{call by value} (where the
1368	\emph{value} is always an object \emph{reference}, not the value of
1369	the object).\footnote{
1370	Actually, \emph{call by object reference} would be a better
1371	description, since if a mutable object is passed, the caller
1372	will see any changes the callee makes to it (items
1373	inserted into a list).
1374	} When a function calls another function, a new local symbol table is
1375	created for that call.
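
The difference between changing a passed object and rebinding the
parameter name can be seen in a short sketch. The function names
\code{add_item} and \code{rebind} are made up for this illustration.

```python
def add_item(items):
    items.append('added')       # mutates the object the caller passed in

def rebind(items):
    items = ['replaced']        # rebinds only the local name

data = ['original']
add_item(data)
rebind(data)
```

The caller should see the appended item, but not the list assigned
inside \code{rebind()}, since that assignment only affects the local
symbol table of the call.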
1376
1377	A function definition introduces the function name in the current
1378	symbol table. The value of the function name
1379	has a type that is recognized by the interpreter as a user-defined
1380	function. This value can be assigned to another name which can then
1381	also be used as a function. This serves as a general renaming
1382	mechanism:
1383
1384	\begin{verbatim}
1385	>>> fib
1386	<function fib at 10042ed0>
1387	>>> f = fib
1388	>>> f(100)
1389	1 1 2 3 5 8 13 21 34 55 89
1390	\end{verbatim}
1391
1392	You might object that \code{fib} is not a function but a procedure. In
1393	Python, like in C, procedures are just functions that don't return a
1394	value. In fact, technically speaking, procedures do return a value,
1395	albeit a rather boring one. This value is called \code{None} (it's a
1396	built-in name). Writing the value \code{None} is normally suppressed by
1397	the interpreter if it would be the only value written. You can see it
1398	if you really want to:
1399
1400	\begin{verbatim}
1401	>>> print fib(0)
1402	None
1403	\end{verbatim}
1404
1405	It is simple to write a function that returns a list of the numbers of
1406	the Fibonacci series, instead of printing it:
1407
1408	\begin{verbatim}
>>> def fib2(n): # return Fibonacci series up to n
...     """Return a list containing the Fibonacci series up to n."""
...     result = []
...     a, b = 0, 1
...     while b < n:
...         result.append(b)    # see below
...         a, b = b, a+b
...     return result
...
>>> f100 = fib2(100)    # call it
>>> f100                # write the result
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
1421	\end{verbatim}
1422
1423	This example, as usual, demonstrates some new Python features:
1424
1425	\begin{itemize}
1426
1427	\item
1428	The \keyword{return} statement returns with a value from a function.
1429	\keyword{return} without an expression argument returns \code{None}.
1430	Falling off the end of a procedure also returns \code{None}.
1431
1432	\item
1433	The statement \code{result.append(b)} calls a \emph{method} of the list
1434	object \code{result}. A method is a function that `belongs' to an
1435	object and is named \code{obj.methodname}, where \code{obj} is some
1436	object (this may be an expression), and \code{methodname} is the name
1437	of a method that is defined by the object's type. Different types
1438	define different methods. Methods of different types may have the
1439	same name without causing ambiguity. (It is possible to define your
1440	own object types and methods, using \emph{classes}, as discussed later
1441	in this tutorial.)
1442	The method \method{append()} shown in the example is defined for
1443	list objects; it adds a new element at the end of the list. In this
1444	example it is equivalent to \samp{result = result + [b]}, but more
1445	efficient.
1446
1447	\end{itemize}
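
Both points can be checked with a few lines. In this sketch the
function names and the variable \code{alias} are hypothetical. The
second half shows one way in which \method{append()} differs from
\samp{result = result + [b]}: the method changes the existing list,
whereas the concatenation builds a new one.

```python
def bare_return():
    return                  # return without an expression

def falls_off_end():
    pass                    # no return statement at all

both_none = bare_return() is None and falls_off_end() is None

result = [1, 1]
alias = result
result.append(2)            # modifies the list in place; alias sees it
result = result + [3]       # builds a new list; alias keeps the old one
```

In \code{fib2()} no second name refers to the list, so there the two
forms give the same outcome.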
1448
1449	\section{More on Defining Functions \label{defining}}
1450
1451	It is also possible to define functions with a variable number of
1452	arguments. There are three forms, which can be combined.
1453
1454	\subsection{Default Argument Values \label{defaultArgs}}
1455
1456	The most useful form is to specify a default value for one or more
arguments. This creates a function that can be called with fewer
arguments than it is defined to accept. For example:
1459
1460	\begin{verbatim}
def ask_ok(prompt, retries=4, complaint='Yes or no, please!'):
    while True:
        ok = raw_input(prompt)
        if ok in ('y', 'ye', 'yes'): return True
        if ok in ('n', 'no', 'nop', 'nope'): return False
        retries = retries - 1
        if retries < 0: raise IOError, 'refusenik user'
        print complaint
1469	\end{verbatim}
1470
1471	This function can be called either like this:
1472	\code{ask_ok('Do you really want to quit?')} or like this:
1473	\code{ask_ok('OK to overwrite the file?', 2)}.
1474
1475	This example also introduces the \keyword{in} keyword. This tests
1476	whether or not a sequence contains a certain value.
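
A few membership tests, shown outside of any function, may make this
clearer. The variable names are illustrative, and the negated form
\code{not in} is an addition that the example above does not use.

```python
answer = 'ye'

is_yes = answer in ('y', 'ye', 'yes')           # membership in a tuple
is_no = answer in ('n', 'no', 'nop', 'nope')
has_three = 3 in [1, 2, 3]                      # membership in a list
not_listed = 'nah' not in ('y', 'ye', 'yes')    # the negated test
```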
1477
1478	The default values are evaluated at the point of function definition
1479	in the \emph{defining} scope, so that
1480
1481	\begin{verbatim}
i = 5

def f(arg=i):
    print arg

i = 6
f()
1489	\end{verbatim}
1490
1491	will print \code{5}.
1492
1493	\strong{Important warning:} The default value is evaluated only once.
1494	This makes a difference when the default is a mutable object such as a
1495	list, dictionary, or instances of most classes. For example, the
1496	following function accumulates the arguments passed to it on
1497	subsequent calls:
1498
1499	\begin{verbatim}
def f(a, L=[]):
    L.append(a)
    return L

print f(1)
print f(2)
print f(3)
1507	\end{verbatim}
1508
1509	This will print
1510
1511	\begin{verbatim}
1512	[1]
1513	[1, 2]
1514	[1, 2, 3]
1515	\end{verbatim}
1516
1517	If you don't want the default to be shared between subsequent calls,
1518	you can write the function like this instead:
1519
1520	\begin{verbatim}
def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L
1526	\end{verbatim}
1527
1528	\subsection{Keyword Arguments \label{keywordArgs}}
1529
1530	Functions can also be called using
1531	keyword arguments of the form \samp{\var{keyword} = \var{value}}. For
1532	instance, the following function:
1533
1534	\begin{verbatim}
def parrot(voltage, state='a stiff', action='voom', type='Norwegian Blue'):
    print "-- This parrot wouldn't", action,
    print "if you put", voltage, "volts through it."
    print "-- Lovely plumage, the", type
    print "-- It's", state, "!"
1540	\end{verbatim}
1541
1542	could be called in any of the following ways:
1543
1544	\begin{verbatim}
1545	parrot(1000)
1546	parrot(action = 'VOOOOOM', voltage = 1000000)
1547	parrot('a thousand', state = 'pushing up the daisies')
1548	parrot('a million', 'bereft of life', 'jump')
1549	\end{verbatim}
1550
1551	but the following calls would all be invalid:
1552
1553	\begin{verbatim}
1554	parrot() # required argument missing
1555	parrot(voltage=5.0, 'dead') # non-keyword argument following keyword
1556	parrot(110, voltage=220) # duplicate value for argument
1557	parrot(actor='John Cleese') # unknown keyword
1558	\end{verbatim}
1559
1560	In general, an argument list must have any positional arguments
1561	followed by any keyword arguments, where the keywords must be chosen
1562	from the formal parameter names. It's not important whether a formal
1563	parameter has a default value or not. No argument may receive a
1564	value more than once --- formal parameter names corresponding to
1565	positional arguments cannot be used as keywords in the same calls.
1566	Here's an example that fails due to this restriction:
1567
1568	\begin{verbatim}
>>> def function(a):
...     pass
...
>>> function(0, a=0)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: function() got multiple values for keyword argument 'a'
1576	\end{verbatim}
1577
1578	When a final formal parameter of the form \code{**\var{name}} is
1579	present, it receives a \ulink{dictionary}{../lib/typesmapping.html}
1580	containing all keyword arguments except for those corresponding to
1581	a formal parameter. This may be
1582	combined with a formal parameter of the form
1583	\code{*\var{name}} (described in the next subsection) which receives a
1584	tuple containing the positional arguments beyond the formal parameter
list. (\code{*\var{name}} must occur before \code{**\var{name}}.)
1586	For example, if we define a function like this:
1587
1588	\begin{verbatim}
def cheeseshop(kind, *arguments, **keywords):
    print "-- Do you have any", kind, '?'
    print "-- I'm sorry, we're all out of", kind
    for arg in arguments: print arg
    print '-'*40
    keys = keywords.keys()
    keys.sort()
    for kw in keys: print kw, ':', keywords[kw]
1597	\end{verbatim}
1598
1599	It could be called like this:
1600
1601	\begin{verbatim}
cheeseshop('Limburger', "It's very runny, sir.",
           "It's really very, VERY runny, sir.",
           client='John Cleese',
           shopkeeper='Michael Palin',
           sketch='Cheese Shop Sketch')
1607	\end{verbatim}
1608
1609	and of course it would print:
1610
1611	\begin{verbatim}
1612	-- Do you have any Limburger ?
1613	-- I'm sorry, we're all out of Limburger
1614	It's very runny, sir.
1615	It's really very, VERY runny, sir.
1616	----------------------------------------
1617	client : John Cleese
1618	shopkeeper : Michael Palin
1619	sketch : Cheese Shop Sketch
1620	\end{verbatim}
1621
1622	Note that the \method{sort()} method of the list of keyword argument
1623	names is called before printing the contents of the \code{keywords}
1624	dictionary; if this is not done, the order in which the arguments are
1625	printed is undefined.
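
To inspect what the two special parameters receive, a function can
simply return them. The following is a minimal sketch; the function
\code{collect} is hypothetical and returns its values instead of
printing them. The call to \function{list()} is a precaution added
here so that \method{sort()} is always applied to a real list.

```python
def collect(kind, *arguments, **keywords):
    names = list(keywords.keys())
    names.sort()                     # fix the order before using it
    return kind, arguments, names

kind, extras, names = collect('Limburger',
                              "It's very runny, sir.",
                              client='John Cleese',
                              shopkeeper='Michael Palin')
```

Here \code{extras} should be a tuple holding the one surplus positional
argument, and \code{names} the sorted keyword names.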
1626
1627
1628	\subsection{Arbitrary Argument Lists \label{arbitraryArgs}}
1629
1630	Finally, the least frequently used option is to specify that a
1631	function can be called with an arbitrary number of arguments. These
1632	arguments will be wrapped up in a tuple. Before the variable number
1633	of arguments, zero or more normal arguments may occur.
1634
1635	\begin{verbatim}
def fprintf(file, format, *args):
    file.write(format % args)
1638	\end{verbatim}
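
A variant that needs no file object may be easier to experiment with.
In this sketch the functions \code{total} and \code{format_message}
are hypothetical; the second mirrors \code{fprintf()} but returns the
formatted string instead of writing it.

```python
def total(first, *rest):
    result = first
    for value in rest:          # rest is a tuple, possibly empty
        result = result + value
    return result

def format_message(format, *args):
    return format % args        # args supplies the values for format

one = total(1)
six = total(1, 2, 3)
message = format_message('%s has %d items', 'list', 3)
```

Because \code{args} is already a tuple, it can be used directly as the
right operand of the \code{\%} operator.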
1639
1640
1641	\subsection{Unpacking Argument Lists \label{unpacking-arguments}}
1642
1643	The reverse situation occurs when the arguments are already in a list
1644	or tuple but need to be unpacked for a function call requiring separate
1645	positional arguments. For instance, the built-in \function{range()}
1646	function expects separate \var{start} and \var{stop} arguments. If they
1647	are not available separately, write the function call with the
1648	\code{*}-operator to unpack the arguments out of a list or tuple:
1649
1650	\begin{verbatim}
1651	>>> range(3, 6) # normal call with separate arguments
1652	[3, 4, 5]
1653	>>> args = [3, 6]
1654	>>> range(*args) # call with arguments unpacked from a list
1655	[3, 4, 5]
1656	\end{verbatim}
1657
1658	In the same fashion, dictionaries can deliver keyword arguments with the
1659	\code{**}-operator:
1660
1661	\begin{verbatim}
>>> def parrot(voltage, state='a stiff', action='voom'):
...     print "-- This parrot wouldn't", action,
...     print "if you put", voltage, "volts through it.",
...     print "E's", state, "!"
...
>>> d = {"voltage": "four million", "state": "bleedin' demised", "action": "VOOM"}
>>> parrot(**d)
-- This parrot wouldn't VOOM if you put four million volts through it. E's bleedin' demised !
1670	\end{verbatim}
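
Both operators may appear in the same call. The sketch below assumes a
hypothetical function \code{describe} that returns a string; the
positional argument comes from a list and one keyword argument from a
dictionary, while the remaining parameter keeps its default value.

```python
def describe(voltage, state='a stiff', action='voom'):
    return '%s volts, %s, %s' % (voltage, state, action)

args = ['four million']
options = {'state': "bleedin' demised"}

sentence = describe(*args, **options)
```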
1671
1672
1673	\subsection{Lambda Forms \label{lambda}}
1674
1675	By popular demand, a few features commonly found in functional
1676	programming languages like Lisp have been added to Python. With the
1677	\keyword{lambda} keyword, small anonymous functions can be created.
1678	Here's a function that returns the sum of its two arguments:
1679	\samp{lambda a, b: a+b}. Lambda forms can be used wherever function
1680	objects are required. They are syntactically restricted to a single
1681	expression. Semantically, they are just syntactic sugar for a normal
1682	function definition. Like nested function definitions, lambda forms
1683	can reference variables from the containing scope:
1684
1685	\begin{verbatim}
>>> def make_incrementor(n):
...     return lambda x: x + n
...
>>> f = make_incrementor(42)
>>> f(0)
42
>>> f(1)
43
1694	\end{verbatim}
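
A lambda form and the corresponding \keyword{def} statement can be
compared side by side. The sketch also passes a lambda form as the
\code{key} argument of the list method \method{sort()}, assuming
Python 2.4 or later, where that argument is available. All names are
illustrative.

```python
def add_with_def(a, b):
    return a + b

add_with_lambda = lambda a, b: a + b

pairs = [(1, 'one'), (2, 'two'), (3, 'three'), (4, 'four')]
pairs.sort(key=lambda pair: pair[1])    # order by the second item
```

The two addition functions should give the same results, and the list
should end up ordered alphabetically by the number names.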
1695
1696
1697	\subsection{Documentation Strings \label{docstrings}}
1698
1699	There are emerging conventions about the content and formatting of
1700	documentation strings.
1701	\index{docstrings}\index{documentation strings}
1702	\index{strings, documentation}
1703
1704	The first line should always be a short, concise summary of the
1705	object's purpose. For brevity, it should not explicitly state the
1706	object's name or type, since these are available by other means
1707	(except if the name happens to be a verb describing a function's
1708	operation). This line should begin with a capital letter and end with
1709	a period.
1710
1711	If there are more lines in the documentation string, the second line
1712	should be blank, visually separating the summary from the rest of the
1713	description. The following lines should be one or more paragraphs
1714	describing the object's calling conventions, its side effects, etc.
1715
The Python parser does not strip indentation from multi-line string
literals, so tools that process documentation have to strip
indentation if desired. This is done using the following convention.
The first non-blank line \emph{after} the first line of the string
determines the amount of indentation for the entire documentation
string. (We can't use the first line since it is generally adjacent
to the string's opening quotes, so its indentation is not apparent in
the string literal.) Whitespace ``equivalent'' to this indentation is
then stripped from the start of all lines of the string. Lines that
are indented less should not occur, but if they do, all their
leading whitespace should be stripped. Equivalence of whitespace
should be tested after expansion of tabs (to 8 spaces, normally).
1728
1729	Here is an example of a multi-line docstring:
1730
\begin{verbatim}
>>> def my_function():
...     """Do nothing, but document it.
...
...     No, really, it doesn't do anything.
...     """
...     pass
...
>>> print my_function.__doc__
Do nothing, but document it.

    No, really, it doesn't do anything.

\end{verbatim}
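
The stripping convention described above can be sketched as a short
function. This is only an illustration of the stated rules, not the code
used by any particular documentation tool; the name
\code{trim_docstring} and the sample string \code{raw} are invented here,
and dropping trailing whitespace is an extra choice made by the sketch.

```python
def trim_docstring(docstring):
    """Strip indentation from a docstring using the convention above."""
    if not docstring:
        return ''
    # Expand tabs to 8 spaces before comparing whitespace.
    lines = docstring.expandtabs(8).splitlines()
    # The first non-blank line after the first line sets the indentation.
    indent = None
    for line in lines[1:]:
        if line.strip():
            indent = len(line) - len(line.lstrip())
            break
    trimmed = [lines[0].strip()]
    for line in lines[1:]:
        leading = len(line) - len(line.lstrip())
        if indent is None or leading < indent:
            # Lines indented less lose all their leading whitespace.
            trimmed.append(line.strip())
        else:
            # Remove the common indentation (and, as an extra choice of
            # this sketch, any trailing whitespace).
            trimmed.append(line[indent:].rstrip())
    return '\n'.join(trimmed)


# A hypothetical docstring with deeper and shallower lines.
raw = ("Summary line.\n\n    Body line.\n      Indented more.\n"
       "  Indented less.\n    ")
result = trim_docstring(raw)
```

In \code{result} the body lines start at the left margin, the more deeply
indented line keeps its two extra spaces, and the line that was indented
less has lost all of its leading whitespace.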
1745
1746
1747
1748	\chapter{Data Structures \label{structures}}
1749
1750	This chapter describes some things you've learned about already in
1751	more detail, and adds some new things as well.
1752
1753
1754	\section{More on Lists \label{moreLists}}
1755
1756	The list data type has some more methods. Here are all of the methods
1757	of list objects:
1758
1759	\begin{methoddesc}[list]{append}{x}
1760	Add an item to the end of the list;
1761	equivalent to \code{a[len(a):] = [\var{x}]}.
1762	\end{methoddesc}
1763
1764	\begin{methoddesc}[list]{extend}{L}
1765	Extend the list by appending all the items in the given list;
1766	equivalent to \code{a[len(a):] = \var{L}}.
1767	\end{methoddesc}
1768
1769	\begin{methoddesc}[list]{insert}{i, x}
1770	Insert an item at a given position. The first argument is the index
1771	of the element before which to insert, so \code{a.insert(0, \var{x})}
1772	inserts at the front of the list, and \code{a.insert(len(a), \var{x})}
1773	is equivalent to \code{a.append(\var{x})}.
1774	\end{methoddesc}
1775
1776	\begin{methoddesc}[list]{remove}{x}
1777	Remove the first item from the list whose value is \var{x}.
1778	It is an error if there is no such item.
1779	\end{methoddesc}
1780
1781	\begin{methoddesc}[list]{pop}{\optional{i}}
1782	Remove the item at the given position in the list, and return it. If
1783	no index is specified, \code{a.pop()} removes and returns the last item
1784	in the list. (The square brackets
1785	around the \var{i} in the method signature denote that the parameter
1786	is optional, not that you should type square brackets at that
1787	position. You will see this notation frequently in the
1788	\citetitle[../lib/lib.html]{Python Library Reference}.)
1789	\end{methoddesc}
1790
1791	\begin{methoddesc}[list]{index}{x}
1792	Return the index in the list of the first item whose value is \var{x}.
1793	It is an error if there is no such item.
1794	\end{methoddesc}
1795
1796	\begin{methoddesc}[list]{count}{x}
1797	Return the number of times \var{x} appears in the list.
1798	\end{methoddesc}
1799
1800	\begin{methoddesc}[list]{sort}{}
1801	Sort the items of the list, in place.
1802	\end{methoddesc}
1803
1804	\begin{methoddesc}[list]{reverse}{}
1805	Reverse the elements of the list, in place.
1806	\end{methoddesc}
1807
1808	An example that uses most of the list methods:
1809
1810	\begin{verbatim}
1811	>>> a = [66.25, 333, 333, 1, 1234.5]
1812	>>> print a.count(333), a.count(66.25), a.count('x')
1813	2 1 0
1814	>>> a.insert(2, -1)
1815	>>> a.append(333)
1816	>>> a
1817	[66.25, 333, -1, 333, 1, 1234.5, 333]
1818	>>> a.index(333)
1819	1
1820	>>> a.remove(333)
1821	>>> a
1822	[66.25, -1, 333, 1, 1234.5, 333]
1823	>>> a.reverse()
1824	>>> a
1825	[333, 1234.5, 1, 333, -1, 66.25]
1826	>>> a.sort()
1827	>>> a
1828	[-1, 1, 66.25, 333, 333, 1234.5]
1829	\end{verbatim}
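
Two points from the method descriptions are easy to check for yourself:
the slice assignment that \method{append()} is equivalent to, and the
error raised by \method{remove()} and \method{index()} when no item
matches (a \exception{ValueError}). A minimal sketch; the variable names
are chosen only for this illustration:

```python
a = [66.25, 333, 333, 1, 1234.5]
b = list(a)             # an independent copy of the same items

# append(x) and the slice assignment have the same effect.
a.append(7)
b[len(b):] = [7]

# remove() raises ValueError when there is no such item ...
try:
    a.remove('x')
    removed = True
except ValueError:
    removed = False

# ... and so does index().
try:
    position = a.index('x')
except ValueError:
    position = None
```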
1830
1831
1832	\subsection{Using Lists as Stacks \label{lists-as-stacks}}
1833	\sectionauthor{Ka-Ping Yee}{ping@lfw.org}
1834
1835	The list methods make it very easy to use a list as a stack, where the
1836	last element added is the first element retrieved (``last-in,
1837	first-out''). To add an item to the top of the stack, use
1838	\method{append()}. To retrieve an item from the top of the stack, use
1839	\method{pop()} without an explicit index. For example:
1840
1841	\begin{verbatim}
1842	>>> stack = [3, 4, 5]
1843	>>> stack.append(6)
1844	>>> stack.append(7)
1845	>>> stack
1846	[3, 4, 5, 6, 7]
1847	>>> stack.pop()
1848	7
1849	>>> stack
1850	[3, 4, 5, 6]
1851	>>> stack.pop()
1852	6
1853	>>> stack.pop()
1854	5
1855	>>> stack
1856	[3, 4]
1857	\end{verbatim}
1858
1859
1860	\subsection{Using Lists as Queues \label{lists-as-queues}}
1861	\sectionauthor{Ka-Ping Yee}{ping@lfw.org}
1862
1863	You can also use a list conveniently as a queue, where the first
1864	element added is the first element retrieved (``first-in,
1865	first-out''). To add an item to the back of the queue, use
1866	\method{append()}. To retrieve an item from the front of the queue,
1867	use \method{pop()} with \code{0} as the index. For example:
1868
1869	\begin{verbatim}
1870	>>> queue = ["Eric", "John", "Michael"]
1871	>>> queue.append("Terry") # Terry arrives
1872	>>> queue.append("Graham") # Graham arrives
1873	>>> queue.pop(0)
1874	'Eric'
1875	>>> queue.pop(0)
1876	'John'
1877	>>> queue
1878	['Michael', 'Terry', 'Graham']
1879	\end{verbatim}
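
Removing from the front with \code{pop(0)} has to shift every remaining
item, so it can become slow for long queues. As an alternative that goes
beyond the list-based approach shown above, the \class{deque} type in the
\module{collections} module provides \method{popleft()} for the same
purpose. A sketch of the same queue:

```python
from collections import deque

queue = deque(["Eric", "John", "Michael"])
queue.append("Terry")       # Terry arrives
queue.append("Graham")      # Graham arrives
first = queue.popleft()     # the first to arrive now leaves
second = queue.popleft()    # the second to arrive now leaves
remaining = list(queue)     # ['Michael', 'Terry', 'Graham']
```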
1880
1881
1882	\subsection{Functional Programming Tools \label{functional}}
1883
1884	There are three built-in functions that are very useful when used with
1885	lists: \function{filter()}, \function{map()}, and \function{reduce()}.
1886
1887	\samp{filter(\var{function}, \var{sequence})} returns a sequence
1888	consisting of those items from the
1889	sequence for which \code{\var{function}(\var{item})} is true.
1890	If \var{sequence} is a \class{string} or \class{tuple}, the result will
1891	be of the same type; otherwise, it is always a \class{list}.
1892	For example, to compute some primes:
1893
1894	\begin{verbatim}
1895	>>> def f(x): return x % 2 != 0 and x % 3 != 0
1896	...
1897	>>> filter(f, range(2, 25))
1898	[5, 7, 11, 13, 17, 19, 23]
1899	\end{verbatim}
1900
1901	\samp{map(\var{function}, \var{sequence})} calls
1902	\code{\var{function}(\var{item})} for each of the sequence's items and
1903	returns a list of the return values. For example, to compute some
1904	cubes:
1905
\begin{verbatim}
>>> def cube(x): return x*x*x
...
>>> map(cube, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]
\end{verbatim}
1912
1913	More than one sequence may be passed; the function must then have as
1914	many arguments as there are sequences and is called with the
1915	corresponding item from each sequence (or \code{None} if some sequence
1916	is shorter than another). For example:
1917
1918	\begin{verbatim}
1919	>>> seq = range(8)
1920	>>> def add(x, y): return x+y
1921	...
1922	>>> map(add, seq, seq)
1923	[0, 2, 4, 6, 8, 10, 12, 14]
1924	\end{verbatim}
1925
1926	\samp{reduce(\var{function}, \var{sequence})} returns a single value
1927	constructed by calling the binary function \var{function} on the first two
1928	items of the sequence, then on the result and the next item, and so
1929	on. For example, to compute the sum of the numbers 1 through 10:
1930
1931	\begin{verbatim}
1932	>>> def add(x,y): return x+y
1933	...
1934	>>> reduce(add, range(1, 11))
1935	55
1936	\end{verbatim}
1937
1938	If there's only one item in the sequence, its value is returned; if
1939	the sequence is empty, an exception is raised.
1940
1941	A third argument can be passed to indicate the starting value. In this
1942	case the starting value is returned for an empty sequence, and the
1943	function is first applied to the starting value and the first sequence
1944	item, then to the result and the next item, and so on. For example,
1945
\begin{verbatim}
>>> def sum(seq):
...     def add(x,y): return x+y
...     return reduce(add, seq, 0)
...
>>> sum(range(1, 11))
55
>>> sum([])
0
\end{verbatim}
1956
1957	Don't use this example's definition of \function{sum()}: since summing
1958	numbers is such a common need, a built-in function
1959	\code{sum(\var{sequence})} is already provided, and works exactly like
1960	this.
1961	\versionadded{2.3}
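
The behaviour described for \function{reduce()}, including the optional
starting value and the error for an empty sequence, can be spelled out in
plain Python. This is a sketch of the rules as stated above, not the
built-in's actual implementation; \code{reduce_sketch} and
\code{_missing} are names invented for the illustration.

```python
_missing = object()   # sentinel meaning "no starting value was given"

def reduce_sketch(function, sequence, initial=_missing):
    """Pure-Python sketch of the behaviour described for reduce()."""
    items = list(sequence)
    if initial is _missing:
        if not items:
            # An empty sequence without a starting value is an error.
            raise TypeError("empty sequence with no starting value")
        result, items = items[0], items[1:]
    else:
        result = initial
    # Combine the running result with each remaining item in turn.
    for item in items:
        result = function(result, item)
    return result


def add(x, y):
    return x + y

total = reduce_sketch(add, range(1, 11))    # 55
single = reduce_sketch(add, [7])            # the only item is returned
started = reduce_sketch(add, [], 0)         # the starting value is returned

try:
    reduce_sketch(add, [])
    raised = False
except TypeError:
    raised = True
```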
1962
1963	\subsection{List Comprehensions}
1964
List comprehensions provide a concise way to create lists without resorting
to use of \function{map()}, \function{filter()} and/or \keyword{lambda}.
The resulting list definition is often clearer than lists built
using those constructs. Each list comprehension consists of an expression
followed by a \keyword{for} clause, then zero or more \keyword{for} or
\keyword{if} clauses. The result is a list produced by evaluating
the expression in the context of the \keyword{for} and \keyword{if} clauses
which follow it. If the expression would evaluate to a tuple, it must be
parenthesized.
1974
1975	\begin{verbatim}
1976	>>> freshfruit = [' banana', ' loganberry ', 'passion fruit ']
1977	>>> [weapon.strip() for weapon in freshfruit]
1978	['banana', 'loganberry', 'passion fruit']
1979	>>> vec = [2, 4, 6]
1980	>>> [3*x for x in vec]
1981	[6, 12, 18]
1982	>>> [3*x for x in vec if x > 3]
1983	[12, 18]
1984	>>> [3*x for x in vec if x < 2]
1985	[]
1986	>>> [[x,x**2] for x in vec]
1987	[[2, 4], [4, 16], [6, 36]]
>>> [x, x**2 for x in vec] # error - parens required for tuples
  File "<stdin>", line 1, in ?
    [x, x**2 for x in vec]
               ^
SyntaxError: invalid syntax
1993	>>> [(x, x**2) for x in vec]
1994	[(2, 4), (4, 16), (6, 36)]
1995	>>> vec1 = [2, 4, 6]
1996	>>> vec2 = [4, 3, -9]
1997	>>> [x*y for x in vec1 for y in vec2]
1998	[8, 6, -18, 16, 12, -36, 24, 18, -54]
1999	>>> [x+y for x in vec1 for y in vec2]
2000	[6, 5, -7, 8, 7, -5, 10, 9, -3]
2001	>>> [vec1[i]*vec2[i] for i in range(len(vec1))]
2002	[8, 12, -54]
2003	\end{verbatim}
2004
2005	List comprehensions are much more flexible than \function{map()} and can be
2006	applied to complex expressions and nested functions:
2007
2008	\begin{verbatim}
2009	>>> [str(round(355/113.0, i)) for i in range(1,6)]
2010	['3.1', '3.14', '3.142', '3.1416', '3.14159']
2011	\end{verbatim}
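
To see how a list comprehension relates to the constructs it replaces,
the following sketch writes two of the comprehensions from the examples
above in their longer forms. The extra \code{list()} call is an
assumption made so that the sketch also works where \function{map()}
returns something other than a list; the variable names are illustrative.

```python
vec = [2, 4, 6]

# A comprehension with a condition ...
tripled = [3*x for x in vec if x > 3]

# ... and the same result spelled with map(), filter() and lambda.
tripled_functional = list(map(lambda x: 3*x,
                              filter(lambda x: x > 3, vec)))

# Two for clauses behave like nested loops, leftmost clause outermost.
vec1 = [2, 4, 6]
vec2 = [4, 3, -9]
products = [x*y for x in vec1 for y in vec2]

products_loop = []
for x in vec1:
    for y in vec2:
        products_loop.append(x*y)
```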
2012
2013
2014	\section{The \keyword{del} statement \label{del}}
2015
There is a way to remove an item from a list given its index instead
of its value: the \keyword{del} statement. This differs from the
\method{pop()} method, which returns a value. The \keyword{del}
statement can also be used to remove slices from a list or clear the
entire list (which we did earlier by assignment of an empty list to
the slice). For example:
2022
2023	\begin{verbatim}
2024	>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
2025	>>> del a[0]
2026	>>> a
2027	[1, 66.25, 333, 333, 1234.5]
2028	>>> del a[2:4]
2029	>>> a
2030	[1, 66.25, 1234.5]
2031	>>> del a[:]
2032	>>> a
2033	[]
2034	\end{verbatim}
2035
2036	\keyword{del} can also be used to delete entire variables:
2037
2038	\begin{verbatim}
2039	>>> del a
2040	\end{verbatim}
2041
2042	Referencing the name \code{a} hereafter is an error (at least until
2043	another value is assigned to it). We'll find other uses for
2044	\keyword{del} later.
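
The difference between \method{pop()} and \keyword{del}, and the error
that follows deleting a variable (a \exception{NameError}), can be
checked with a short sketch. The names \code{popped}, \code{remaining}
and \code{still_defined} exist only for this illustration.

```python
a = [-1, 1, 66.25, 333]

# pop() removes an item and hands it back; del just removes it.
popped = a.pop(0)
del a[0]
remaining = list(a)         # [66.25, 333]

# After deleting the variable itself, the name is no longer bound.
del a
try:
    a
    still_defined = True
except NameError:
    still_defined = False
```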
2045
2046
2047	\section{Tuples and Sequences \label{tuples}}
2048
2049	We saw that lists and strings have many common properties, such as
2050	indexing and slicing operations. They are two examples of
2051	\ulink{\emph{sequence} data types}{../lib/typesseq.html}. Since
2052	Python is an evolving language, other sequence data types may be
2053	added. There is also another standard sequence data type: the
2054	\emph{tuple}.
2055
2056	A tuple consists of a number of values separated by commas, for
2057	instance:
2058
2059	\begin{verbatim}
2060	>>> t = 12345, 54321, 'hello!'
2061	>>> t[0]
2062	12345
2063	>>> t
2064	(12345, 54321, 'hello!')
2065	>>> # Tuples may be nested:
2066	... u = t, (1, 2, 3, 4, 5)
2067	>>> u
2068	((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
2069	\end{verbatim}
2070
2071	As you see, on output tuples are always enclosed in parentheses, so
2072	that nested tuples are interpreted correctly; they may be input with
2073	or without surrounding parentheses, although often parentheses are
2074	necessary anyway (if the tuple is part of a larger expression).
2075
2076	Tuples have many uses. For example: (x, y) coordinate pairs, employee
2077	records from a database, etc. Tuples, like strings, are immutable: it
2078	is not possible to assign to the individual items of a tuple (you can
2079	simulate much of the same effect with slicing and concatenation,
2080	though). It is also possible to create tuples which contain mutable
2081	objects, such as lists.
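
The three claims in this paragraph can be tried out directly. The sketch
below reuses the tuple from the earlier example; the replacement value
and the names \code{t2} and \code{v} are made up for illustration.

```python
t = (12345, 54321, 'hello!')

# Assigning to an item of a tuple is refused with a TypeError.
try:
    t[0] = 88888
    assigned = True
except TypeError:
    assigned = False

# Slicing and concatenation build a new tuple with the item replaced.
t2 = (88888,) + t[1:]

# A tuple may hold mutable objects; the list inside can still change.
v = ([1, 2, 3], [3, 2, 1])
v[0].append(4)
```

Note that \code{t} itself is unchanged; \code{t2} is a separate tuple.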
2082
2083	A special problem is the construction of tuples containing 0 or 1
2084	items: the syntax has some extra quirks to accommodate these. Empty
2085	tuples are constructed by an empty pair of parentheses; a tuple with
2086	one item is constructed by following a value with a comma
2087	(it is not sufficient to enclose a single value in parentheses).
2088	Ugly, but effective. For example:
2089
2090	\begin{verbatim}
2091	>>> empty = ()
2092	>>> singleton = 'hello', # <-- note trailing comma
2093	>>> len(empty)
2094	0
2095	>>> len(singleton)
2096	1
2097	>>> singleton
2098	('hello',)
2099	\end{verbatim}
2100
2101	The statement \code{t = 12345, 54321, 'hello!'} is an example of
2102	\emph{tuple packing}: the values \code{12345}, \code{54321} and
2103	\code{'hello!'} are packed together in a tuple. The reverse operation
2104	is also possible:
2105
2106	\begin{verbatim}
2107	>>> x, y, z = t
2108	\end{verbatim}
2109
2110	This is called, appropriately enough, \emph{sequence unpacking}.
2111	Sequence unpacking requires the list of variables on the left to
2112	have the same number of elements as the length of the sequence. Note
2113	that multiple assignment is really just a combination of tuple packing
2114	and sequence unpacking!
2115
2116	There is a small bit of asymmetry here: packing multiple values
2117	always creates a tuple, and unpacking works for any sequence.
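
A short sketch of packing and unpacking, including the asymmetry just
mentioned and the error raised when the number of names does not match
(a \exception{ValueError}). The variable names are illustrative only.

```python
t = 12345, 54321, 'hello!'   # tuple packing
x, y, z = t                  # sequence unpacking

# Unpacking works for any sequence, for example a list or a string.
a, b = [1, 2]
c, d, e = 'abc'

# Multiple assignment packs the right-hand side and unpacks it again,
# which is why two names can be swapped in one statement.
a, b = b, a

# The number of names must match the length of the sequence.
try:
    p, q = t
    unpacked = True
except ValueError:
    unpacked = False
```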
2118
2119	% XXX Add a bit on the difference between tuples and lists.
2120
2121
2122	\section{Sets \label{sets}}
2123
2124	Python also includes a data type for \emph{sets}. A set is an unordered
2125	collection with no duplicate elements. Basic uses include membership
2126	testing and eliminating duplicate entries. Set objects also support
2127	mathematical operations like union, intersection, difference, and
2128	symmetric difference.
2129
2130	Here is a brief demonstration:
2131
2132	\begin{verbatim}
2133	>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
2134	>>> fruit = set(basket) # create a set without duplicates
2135	>>> fruit
2136	set(['orange', 'pear', 'apple', 'banana'])
2137	>>> 'orange' in fruit # fast membership testing
2138	True
2139	>>> 'crabgrass' in fruit
2140	False
2141
2142	>>> # Demonstrate set operations on unique letters from two words
2143	...
2144	>>> a = set('abracadabra')
2145	>>> b = set('alacazam')
2146	>>> a # unique letters in a
2147	set(['a', 'r', 'b', 'c', 'd'])
2148	>>> a - b # letters in a but not in b
2149	set(['r', 'd', 'b'])
>>> a | b # letters in either a or b
2151	set(['a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'])
2152	>>> a & b # letters in both a and b
2153	set(['a', 'c'])
2154	>>> a ^ b # letters in a or b but not both
2155	set(['r', 'd', 'b', 'm', 'z', 'l'])
2156	\end{verbatim}
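
Each of the set operators used above also has a method spelling. The
following sketch compares the two forms and uses \function{sorted()} to
get a predictable order out of an unordered set; the result names are
chosen for this illustration.

```python
a = set('abracadabra')
b = set('alacazam')

# Each operator has a method that gives the same result.
same_union = (a | b) == a.union(b)
same_intersection = (a & b) == a.intersection(b)
same_difference = (a - b) == a.difference(b)
same_symmetric = (a ^ b) == a.symmetric_difference(b)

# Sets are unordered, so sort the elements when a stable order is wanted.
common = sorted(a & b)        # ['a', 'c']
only_in_a = sorted(a - b)     # ['b', 'd', 'r']
```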
2157
2158
2159	\section{Dictionaries \label{dictionaries}}
2160
2161	Another useful data type built into Python is the
2162	\ulink{\emph{dictionary}}{../lib/typesmapping.html}.
2163	Dictionaries are sometimes found in other languages as ``associative
2164	memories'' or ``associative arrays''. Unlike sequences, which are
2165	indexed by a range of numbers, dictionaries are indexed by \emph{keys},
2166	which can be any immutable type; strings and numbers can always be
2167	keys. Tuples can be used as keys if they contain only strings,
2168	numbers, or tuples; if a tuple contains any mutable object either
2169	directly or indirectly, it cannot be used as a key. You can't use
2170	lists as keys, since lists can be modified in place using
2171	index assignments, slice assignments, or methods like
2172	\method{append()} and \method{extend()}.
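
The rules for keys can be verified with a small sketch; an unsuitable key
is rejected with a \exception{TypeError}. The dictionary \code{d} and its
contents are invented for the illustration.

```python
d = {}
d['name'] = 1          # a string key
d[42] = 2              # a number key
d[(1, 'a')] = 3        # a tuple containing only strings and numbers

# A list cannot be used as a key.
try:
    d[[1, 2]] = 4
    list_ok = True
except TypeError:
    list_ok = False

# Neither can a tuple that contains a list, even indirectly.
try:
    d[(1, ([2], 3))] = 5
    nested_ok = True
except TypeError:
    nested_ok = False
```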
2173
2174	It is best to think of a dictionary as an unordered set of
2175	\emph{key: value} pairs, with the requirement that the keys are unique
2176	(within one dictionary).
2177	A pair of braces creates an empty dictionary: \code{\{\}}.
2178	Placing a comma-separated list of key:value pairs within the
2179	braces adds initial key:value pairs to the dictionary; this is also the
2180	way dictionaries are written on output.
2181
2182	The main operations on a dictionary are storing a value with some key
2183	and extracting the value given the key. It is also possible to delete
2184	a key:value pair
2185	with \code{del}.
2186	If you store using a key that is already in use, the old value
2187	associated with that key is forgotten. It is an error to extract a
2188	value using a non-existent key.
2189
The \method{keys()} method of a dictionary object returns a list of all
the keys used in the dictionary, in arbitrary order (if you want it
sorted, just apply the \method{sort()} method to the list of keys). To
check whether a single key is in the dictionary, use either the dictionary's
\method{has_key()} method or the \keyword{in} keyword.
2195
2196	Here is a small example using a dictionary:
2197
2198	\begin{verbatim}
2199	>>> tel = {'jack': 4098, 'sape': 4139}
2200	>>> tel['guido'] = 4127
2201	>>> tel
2202	{'sape': 4139, 'guido': 4127, 'jack': 4098}
2203	>>> tel['jack']
2204	4098
2205	>>> del tel['sape']
2206	>>> tel['irv'] = 4127
2207	>>> tel
2208	{'guido': 4127, 'irv': 4127, 'jack': 4098}
2209	>>> tel.keys()
2210	['guido', 'irv', 'jack']
2211	>>> tel.has_key('guido')
2212	True
2213	>>> 'guido' in tel
2214	True
2215	\end{verbatim}
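
Two of the rules stated earlier are not exercised by the example above:
storing under a key that is already in use, and extracting with a key
that does not exist (which raises a \exception{KeyError}). A minimal
sketch, with the changed number chosen arbitrarily:

```python
tel = {'jack': 4098, 'sape': 4139}

# Storing with a key already in use replaces the old value.
tel['jack'] = 4099

# Extracting a value with a non-existent key is an error.
try:
    number = tel['guido']
except KeyError:
    number = None

# Testing with "in" first avoids the error.
if 'guido' not in tel:
    tel['guido'] = 4127

# The keys come back in arbitrary order; sorted() gives a sorted list.
names = sorted(tel.keys())    # ['guido', 'jack', 'sape']
```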
2216
2217	The \function{dict()} constructor builds dictionaries directly from
2218	lists of key-value pairs stored as tuples. When the pairs form a
2219	pattern, list comprehensions can compactly specify the key-value list.
2220
2221	\begin{verbatim}
2222	>>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
2223	{'sape': 4139, 'jack': 4098, 'guido': 4127}
2224	>>> dict([(x, x**2) for x in (2, 4, 6)]) # use a list comprehension
2225	{2: 4, 4: 16, 6: 36}
2226	\end{verbatim}
2227
Later in the tutorial, we will learn about generator expressions,
which are even better suited to the task of supplying key-value pairs to
the \function{dict()} constructor.
2231
2232	When the keys are simple strings, it is sometimes easier to specify
2233	pairs using keyword arguments:
2234
2235	\begin{verbatim}
2236	>>> dict(sape=4139, guido=4127, jack=4098)
2237	{'sape': 4139, 'jack': 4098, 'guido': 4127}
2238	\end{verbatim}
2239
2240
2241	\section{Looping Techniques \label{loopidioms}}
2242
2243	When looping through dictionaries, the key and corresponding value can
2244	be retrieved at the same time using the \method{iteritems()} method.
2245
2246	\begin{verbatim}
2247	>>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
>>> for k, v in knights.iteritems():
...     print k, v
2250	...
2251	gallahad the pure
2252	robin the brave
2253	\end{verbatim}
2254
2255	When looping through a sequence, the position index and corresponding
2256	value can be retrieved at the same time using the
2257	\function{enumerate()} function.
2258
2259	\begin{verbatim}
>>> for i, v in enumerate(['tic', 'tac', 'toe']):
...     print i, v
2262	...
2263	0 tic
2264	1 tac
2265	2 toe
2266	\end{verbatim}
2267
2268	To loop over two or more sequences at the same time, the entries
2269	can be paired with the \function{zip()} function.
2270
2271	\begin{verbatim}
2272	>>> questions = ['name', 'quest', 'favorite color']
2273	>>> answers = ['lancelot', 'the holy grail', 'blue']
>>> for q, a in zip(questions, answers):
...     print 'What is your %s? It is %s.' % (q, a)
2276	...
2277	What is your name? It is lancelot.
2278	What is your quest? It is the holy grail.
2279	What is your favorite color? It is blue.
2280	\end{verbatim}
2281
2282	To loop over a sequence in reverse, first specify the sequence
2283	in a forward direction and then call the \function{reversed()}
2284	function.
2285
2286	\begin{verbatim}
>>> for i in reversed(xrange(1,10,2)):
...     print i
2289	...
2290	9
2291	7
2292	5
2293	3
2294	1
2295	\end{verbatim}
2296
2297	To loop over a sequence in sorted order, use the \function{sorted()}
2298	function which returns a new sorted list while leaving the source
2299	unaltered.
2300
2301	\begin{verbatim}
2302	>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> for f in sorted(set(basket)):
...     print f
2305	...
2306	apple
2307	banana
2308	orange
2309	pear
2310	\end{verbatim}
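
The phrase ``leaving the source unaltered'' marks the difference between
\function{sorted()} and the list \method{sort()} method described
earlier. A sketch of the contrast, with names chosen for illustration:

```python
basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']

# sorted() returns a new list and leaves the source unaltered ...
ordered = sorted(basket)

# ... whereas the sort() method reorders the list in place and
# returns None.
in_place = list(basket)
returned = in_place.sort()
```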
2311
2312	\section{More on Conditions \label{conditions}}
2313
2314	The conditions used in \code{while} and \code{if} statements can
2315	contain any operators, not just comparisons.
2316
2317	The comparison operators \code{in} and \code{not in} check whether a value
2318	occurs (does not occur) in a sequence. The operators \code{is} and
2319	\code{is not} compare whether two objects are really the same object; this
2320	only matters for mutable objects like lists. All comparison operators
2321	have the same priority, which is lower than that of all numerical
2322	operators.
2323
2324	Comparisons can be chained. For example, \code{a < b == c} tests
2325	whether \code{a} is less than \code{b} and moreover \code{b} equals
2326	\code{c}.
2327
2328	Comparisons may be combined using the Boolean operators \code{and} and
2329	\code{or}, and the outcome of a comparison (or of any other Boolean
2330	expression) may be negated with \code{not}. These have lower
2331	priorities than comparison operators; between them, \code{not} has
2332	the highest priority and \code{or} the lowest, so that
2333	\code{A and not B or C} is equivalent to \code{(A and (not B)) or C}.
2334	As always, parentheses can be used to express the desired composition.
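
Chaining and operator priority are easiest to trust after trying them.
In this sketch the values bound to the names are arbitrary, and
\code{other_grouping} is included only to show that a different
placement of parentheses gives a different answer.

```python
a, b, c = 1, 2, 2

# A chained comparison ...
chained = a < b == c
# ... means the same as joining the two comparisons with "and".
spelled_out = (a < b) and (b == c)

# not binds tighter than and, which binds tighter than or.
A, B, C = True, False, True
implicit = A and not B or C
explicit = (A and (not B)) or C

# Parentheses placed differently change the outcome.
other_grouping = A and not (B or C)
```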
2335
2336	The Boolean operators \code{and} and \code{or} are so-called
2337	\emph{short-circuit} operators: their arguments are evaluated from
2338	left to right, and evaluation stops as soon as the outcome is
2339	determined. For example, if \code{A} and \code{C} are true but
2340	\code{B} is false, \code{A and B and C} does not evaluate the
2341	expression \code{C}. When used as a general value and not as a
2342	Boolean, the return value of a short-circuit operator is the last
2343	evaluated argument.
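
Short-circuit evaluation can be made visible by recording which operands
are actually evaluated. The helper \code{check()} and the list
\code{calls} are invented for this sketch.

```python
calls = []

def check(name, value):
    """Record that an operand was evaluated, then return its value."""
    calls.append(name)
    return value

# B is false, so C is never evaluated.
outcome = check('A', True) and check('B', False) and check('C', True)

# The result of and/or is the last operand evaluated, which need not
# be True or False.
fallback = None or [] or 'default'
stopped = 'spam' and 0 and 'eggs'
```

After this runs, \code{calls} holds only \code{'A'} and \code{'B'},
\code{fallback} is \code{'default'} and \code{stopped} is \code{0}.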
2344
2345	It is possible to assign the result of a comparison or other Boolean
2346	expression to a variable. For example,
2347
2348	\begin{verbatim}
2349	>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
2350	>>> non_null = string1 or string2 or string3
2351	>>> non_null
2352	'Trondheim'
2353	\end{verbatim}
2354
2355	Note that in Python, unlike C, assignment cannot occur inside expressions.
2356	C programmers may grumble about this, but it avoids a common class of
2357	problems encountered in C programs: typing \code{=} in an expression when
2358	\code{==} was intended.
2359
2360
2361	\section{Comparing Sequences and Other Types \label{comparing}}
2362
2363	Sequence objects may be compared to other objects with the same
2364	sequence type. The comparison uses \emph{lexicographical} ordering:
2365	first the first two items are compared, and if they differ this
2366	determines the outcome of the comparison; if they are equal, the next
2367	two items are compared, and so on, until either sequence is exhausted.
2368	If two items to be compared are themselves sequences of the same type,
2369	the lexicographical comparison is carried out recursively. If all
2370	items of two sequences compare equal, the sequences are considered
2371	equal. If one sequence is an initial sub-sequence of the other, the
2372	shorter sequence is the smaller (lesser) one. Lexicographical
2373	ordering for strings uses the \ASCII{} ordering for individual
2374	characters. Some examples of comparisons between sequences of the
2375	same type:
2376
2377	\begin{verbatim}
2378	(1, 2, 3) < (1, 2, 4)
2379	[1, 2, 3] < [1, 2, 4]
2380	'ABC' < 'C' < 'Pascal' < 'Python'
2381	(1, 2, 3, 4) < (1, 2, 4)
2382	(1, 2) < (1, 2, -1)
2383	(1, 2, 3) == (1.0, 2.0, 3.0)
2384	(1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4)
2385	\end{verbatim}
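
The lexicographical rule can be written out as a few lines of Python.
This is a sketch of the rule as described, not of how the interpreter
implements comparison; \code{lexicographic_less} is a name made up here.
Nested sequences are handled by the ordinary \code{<} operator applied to
the items.

```python
def lexicographic_less(s, t):
    """Return True if sequence s sorts before sequence t."""
    for x, y in zip(s, t):
        if x != y:
            # The first pair of differing items decides the outcome.
            return x < y
    # All compared items are equal: an initial sub-sequence is smaller.
    return len(s) < len(t)
```

For every pair in the examples above, the function agrees with the
built-in comparison; for instance
\code{lexicographic_less((1, 2), (1, 2, -1))} is true because the first
tuple is an initial sub-sequence of the second.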
2386
2387	Note that comparing objects of different types is legal. The outcome
2388	is deterministic but arbitrary: the types are ordered by their name.
2389	Thus, a list is always smaller than a string, a string is always
2390	smaller than a tuple, etc. \footnote{
2391	The rules for comparing objects of different types should
2392	not be relied upon; they may change in a future version of
2393	the language.
2394	} Mixed numeric types are compared according to their numeric value, so
2395	0 equals 0.0, etc.
2396
2397
2398	\chapter{Modules \label{modules}}
2399
2400	If you quit from the Python interpreter and enter it again, the
2401	definitions you have made (functions and variables) are lost.
2402	Therefore, if you want to write a somewhat longer program, you are
2403	better off using a text editor to prepare the input for the interpreter
2404	and running it with that file as input instead. This is known as creating a
2405	\emph{script}. As your program gets longer, you may want to split it
2406	into several files for easier maintenance. You may also want to use a
2407	handy function that you've written in several programs without copying
2408	its definition into each program.
2409
2410	To support this, Python has a way to put definitions in a file and use
2411	them in a script or in an interactive instance of the interpreter.
2412	Such a file is called a \emph{module}; definitions from a module can be
2413	\emph{imported} into other modules or into the \emph{main} module (the
2414	collection of variables that you have access to in a script
2415	executed at the top level
2416	and in calculator mode).
2417
2418	A module is a file containing Python definitions and statements. The
2419	file name is the module name with the suffix \file{.py} appended. Within
2420	a module, the module's name (as a string) is available as the value of
2421	the global variable \code{__name__}. For instance, use your favorite text
2422	editor to create a file called \file{fibo.py} in the current directory
2423	with the following contents:
2424
\begin{verbatim}
# Fibonacci numbers module

def fib(n):    # write Fibonacci series up to n
    a, b = 0, 1
    while b < n:
        print b,
        a, b = b, a+b

def fib2(n):   # return Fibonacci series up to n
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a+b
    return result
\end{verbatim}
2442
2443	Now enter the Python interpreter and import this module with the
2444	following command:
2445
2446	\begin{verbatim}
2447	>>> import fibo
2448	\end{verbatim}
2449
2450	This does not enter the names of the functions defined in \code{fibo}
2451	directly in the current symbol table; it only enters the module name
2452	\code{fibo} there.
2453	Using the module name you can access the functions:
2454
2455	\begin{verbatim}
2456	>>> fibo.fib(1000)
2457	1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987
2458	>>> fibo.fib2(100)
2459	[1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
2460	>>> fibo.__name__
2461	'fibo'
2462	\end{verbatim}
2463
2464	If you intend to use a function often you can assign it to a local name:
2465
2466	\begin{verbatim}
2467	>>> fib = fibo.fib
2468	>>> fib(500)
2469	1 1 2 3 5 8 13 21 34 55 89 144 233 377
2470	\end{verbatim}
2471
2472
2473	\section{More on Modules \label{moreModules}}
2474
2475	A module can contain executable statements as well as function
2476	definitions.
2477	These statements are intended to initialize the module.
2478	They are executed only the
2479	\emph{first} time the module is imported somewhere.\footnote{
2480	In fact function definitions are also `statements' that are
2481	`executed'; the execution enters the function name in the
2482	module's global symbol table.
2483	}
2484
2485	Each module has its own private symbol table, which is used as the
2486	global symbol table by all functions defined in the module.
2487	Thus, the author of a module can use global variables in the module
2488	without worrying about accidental clashes with a user's global
2489	variables.
2490	On the other hand, if you know what you are doing you can touch a
2491	module's global variables with the same notation used to refer to its
2492	functions,
2493	\code{modname.itemname}.
2494
2495	Modules can import other modules. It is customary but not required to
2496	place all \keyword{import} statements at the beginning of a module (or
2497	script, for that matter). The imported module names are placed in the
2498	importing module's global symbol table.
2499
2500	There is a variant of the \keyword{import} statement that imports
2501	names from a module directly into the importing module's symbol
2502	table. For example:
2503
2504	\begin{verbatim}
2505	>>> from fibo import fib, fib2
2506	>>> fib(500)
2507	1 1 2 3 5 8 13 21 34 55 89 144 233 377
2508	\end{verbatim}
2509
2510	This does not introduce the module name from which the imports are taken
2511	in the local symbol table (so in the example, \code{fibo} is not
2512	defined).
2513
2514	There is even a variant to import all names that a module defines:
2515
2516	\begin{verbatim}
2517	>>> from fibo import *
2518	>>> fib(500)
2519	1 1 2 3 5 8 13 21 34 55 89 144 233 377
2520	\end{verbatim}
2521
2522	This imports all names except those beginning with an underscore
2523	(\code{_}).
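
The underscore rule can be observed without touching your own namespace.
The sketch below writes a throwaway module to a temporary directory and
imports its names into a scratch dictionary. The module name
\code{tut_star_demo} and its contents are invented, and the use of
\code{sys.path} anticipates the next section.

```python
import os
import sys
import tempfile

# Write a small throwaway module; its name is made up for this sketch.
module_dir = tempfile.mkdtemp()
source = open(os.path.join(module_dir, 'tut_star_demo.py'), 'w')
source.write("public = 1\n"
             "_private = 2\n")
source.close()
sys.path.insert(0, module_dir)

# Import all names the module defines into a scratch namespace.
namespace = {}
exec("from tut_star_demo import *", namespace)

has_public = 'public' in namespace       # imported
has_private = '_private' in namespace    # skipped: begins with an underscore

# The plain import form still gives access to the underscore name.
import tut_star_demo
private_value = tut_star_demo._private
```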
2524
2525
2526	\subsection{The Module Search Path \label{searchPath}}
2527
2528	\indexiii{module}{search}{path}
2529	When a module named \module{spam} is imported, the interpreter searches
2530	for a file named \file{spam.py} in the current directory,
2531	and then in the list of directories specified by
2532	the environment variable \envvar{PYTHONPATH}. This has the same syntax as
2533	the shell variable \envvar{PATH}, that is, a list of
2534	directory names. When \envvar{PYTHONPATH} is not set, or when the file
2535	is not found there, the search continues in an installation-dependent
2536	default path; on \UNIX, this is usually \file{.:/usr/local/lib/python}.
2537
2538	Actually, modules are searched in the list of directories given by the
2539	variable \code{sys.path} which is initialized from the directory
2540	containing the input script (or the current directory),
2541	\envvar{PYTHONPATH} and the installation-dependent default. This allows
2542	Python programs that know what they're doing to modify or replace the
2543	module search path. Note that because the directory containing the
2544	script being run is on the search path, it is important that the
2545	script not have the same name as a standard module, or Python will
2546	attempt to load the script as a module when that module is imported.
2547	This will generally be an error. See section~\ref{standardModules},
2548	``Standard Modules,'' for more information.
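
As a sketch of a program modifying the search path, the following
creates a module in a temporary directory, shows that it cannot be
imported while that directory is missing from \code{sys.path}, and then
imports it after appending the directory. The module name
\code{tut_path_demo} and its contents are made up for the illustration.

```python
import os
import sys
import tempfile

# Create a throwaway directory holding a tiny module.
module_dir = tempfile.mkdtemp()
source = open(os.path.join(module_dir, 'tut_path_demo.py'), 'w')
source.write("answer = 42\n")
source.close()

# The new directory is not on sys.path yet, so the import fails.
try:
    import tut_path_demo
    found_before = True
except ImportError:
    found_before = False

# After the directory is added to sys.path the same import succeeds.
sys.path.append(module_dir)
import tut_path_demo
found_after = tut_path_demo.answer == 42
```

Appending, rather than inserting at the front, keeps the new directory
from shadowing modules found earlier on the path.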
2549
2550
2551	\subsection{``Compiled'' Python files}
2552
2553	As an important speed-up of the start-up time for short programs that
2554	use a lot of standard modules, if a file called \file{spam.pyc} exists
2555	in the directory where \file{spam.py} is found, this is assumed to
2556	contain an already-``byte-compiled'' version of the module \module{spam}.
2557	The modification time of the version of \file{spam.py} used to create
2558	\file{spam.pyc} is recorded in \file{spam.pyc}, and the
2559	\file{.pyc} file is ignored if these don't match.
2560
2561	Normally, you don't need to do anything to create the
2562	\file{spam.pyc} file. Whenever \file{spam.py} is successfully
2563	compiled, an attempt is made to write the compiled version to
2564	\file{spam.pyc}. It is not an error if this attempt fails; if for any
2565	reason the file is not written completely, the resulting
2566	\file{spam.pyc} file will be recognized as invalid and thus ignored
2567	later. The contents of the \file{spam.pyc} file are platform
2568	independent, so a Python module directory can be shared by machines of
2569	different architectures.
2570
2571	Some tips for experts:
2572
2573	\begin{itemize}
2574
2575	\item
2576	When the Python interpreter is invoked with the \programopt{-O} flag,
2577	optimized code is generated and stored in \file{.pyo} files. The
2578	optimizer currently doesn't help much; it only removes
2579	\keyword{assert} statements. When \programopt{-O} is used, \emph{all}
2580	bytecode is optimized; \code{.pyc} files are ignored and \code{.py}
2581	files are compiled to optimized bytecode.
2582
2583	\item
2584	Passing two \programopt{-O} flags to the Python interpreter
2585	(\programopt{-OO}) will cause the bytecode compiler to perform
2586	optimizations that could in some rare cases result in malfunctioning
2587	programs. Currently only \code{__doc__} strings are removed from the
2588	bytecode, resulting in more compact \file{.pyo} files. Since some
2589	programs may rely on having these available, you should only use this
2590	option if you know what you're doing.
2591
2592	\item
2593	A program doesn't run any faster when it is read from a \file{.pyc} or
2594	\file{.pyo} file than when it is read from a \file{.py} file; the only
2595	thing that's faster about \file{.pyc} or \file{.pyo} files is the
2596	speed with which they are loaded.
2597
2598	\item
2599	When a script is run by giving its name on the command line, the
2600	bytecode for the script is never written to a \file{.pyc} or
2601	\file{.pyo} file. Thus, the startup time of a script may be reduced
2602	by moving most of its code to a module and having a small bootstrap
2603	script that imports that module. It is also possible to name a
2604	\file{.pyc} or \file{.pyo} file directly on the command line.
2605
2606	\item
2607	It is possible to have a file called \file{spam.pyc} (or
2608	\file{spam.pyo} when \programopt{-O} is used) without a file
2609	\file{spam.py} for the same module. This can be used to distribute a
2610	library of Python code in a form that is moderately hard to reverse
2611	engineer.
2612
2613	\item
2614	The module \ulink{\module{compileall}}{../lib/module-compileall.html}%
2615	{} \refstmodindex{compileall} can create \file{.pyc} files (or
2616	\file{.pyo} files when \programopt{-O} is used) for all modules in a
2617	directory.
2618
2619	\end{itemize}
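
A single file can also be byte-compiled by hand. The sketch below uses
the standard module \module{py_compile}, which is not mentioned above;
the temporary directory and the contents of \file{spam.py} are
assumptions made for illustration.

```python
import os
import py_compile
import tempfile

# Write a tiny throwaway module; its contents are hypothetical.
workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, 'spam.py')
source_file = open(source_path, 'w')
source_file.write('def greet():\n    return "hello"\n')
source_file.close()

# cfile names the output file explicitly; doraise=True turns a
# compilation problem into an exception instead of a printed message.
compiled_path = os.path.join(workdir, 'spam.pyc')
py_compile.compile(source_path, cfile=compiled_path, doraise=True)

compiled_exists = os.path.exists(compiled_path)
```

Normally none of this is needed, since importing a module writes the
compiled file automatically when possible.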
2620
2621
2622	\section{Standard Modules \label{standardModules}}
2623
2624	Python comes with a library of standard modules, described in a separate
2625	document, the \citetitle[../lib/lib.html]{Python Library Reference}
2626	(``Library Reference'' hereafter). Some modules are built into the
2627	interpreter; these provide access to operations that are not part of
2628	the core of the language but are nevertheless built in, either for
2629	efficiency or to provide access to operating system primitives such as
2630	system calls. The set of such modules is a configuration option which
also depends on the underlying platform. For example,
2632	the \module{amoeba} module is only provided on systems that somehow
2633	support Amoeba primitives. One particular module deserves some
2634	attention: \ulink{\module{sys}}{../lib/module-sys.html}%
2635	\refstmodindex{sys}, which is built into every
2636	Python interpreter. The variables \code{sys.ps1} and
2637	\code{sys.ps2} define the strings used as primary and secondary
2638	prompts:
2639
2640	\begin{verbatim}
2641	>>> import sys
2642	>>> sys.ps1
2643	'>>> '
2644	>>> sys.ps2
2645	'... '
2646	>>> sys.ps1 = 'C> '
2647	C> print 'Yuck!'
2648	Yuck!
2649	C>
2650
2651	\end{verbatim}
2652
2653	These two variables are only defined if the interpreter is in
2654	interactive mode.
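
A program can therefore test for interactive mode by checking whether
\code{sys.ps1} exists. The following is a small sketch; the variable
names are chosen for illustration only.

```python
import sys

# sys.ps1 is defined only when the interpreter runs interactively.
interactive = hasattr(sys, 'ps1')

# Names of the modules compiled into this particular interpreter;
# the exact contents depend on the platform and the build.
built_in = sys.builtin_module_names
sys_is_built_in = 'sys' in built_in
```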
2655
2656	The variable \code{sys.path} is a list of strings that determines the
2657	interpreter's search path for modules. It is initialized to a default
2658	path taken from the environment variable \envvar{PYTHONPATH}, or from
2659	a built-in default if \envvar{PYTHONPATH} is not set. You can modify
2660	it using standard list operations:
2661
2662	\begin{verbatim}
2663	>>> import sys
2664	>>> sys.path.append('/ufs/guido/lib/python')
2665	\end{verbatim}
2666
2667	\section{The \function{dir()} Function \label{dir}}
2668
2669	The built-in function \function{dir()} is used to find out which names
2670	a module defines. It returns a sorted list of strings:
2671
2672	\begin{verbatim}
2673	>>> import fibo, sys
2674	>>> dir(fibo)
2675	['__name__', 'fib', 'fib2']
2676	>>> dir(sys)
2677	['__displayhook__', '__doc__', '__excepthook__', '__name__', '__stderr__',
2678	'__stdin__', '__stdout__', '_getframe', 'api_version', 'argv',
2679	'builtin_module_names', 'byteorder', 'callstats', 'copyright',
2680	'displayhook', 'exc_clear', 'exc_info', 'exc_type', 'excepthook',
2681	'exec_prefix', 'executable', 'exit', 'getdefaultencoding', 'getdlopenflags',
2682	'getrecursionlimit', 'getrefcount', 'hexversion', 'maxint', 'maxunicode',
2683	'meta_path', 'modules', 'path', 'path_hooks', 'path_importer_cache',
2684	'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval', 'setdlopenflags',
2685	'setprofile', 'setrecursionlimit', 'settrace', 'stderr', 'stdin', 'stdout',
2686	'version', 'version_info', 'warnoptions']
2687	\end{verbatim}
2688
2689	Without arguments, \function{dir()} lists the names you have defined
2690	currently:
2691
2692	\begin{verbatim}
2693	>>> a = [1, 2, 3, 4, 5]
2694	>>> import fibo
2695	>>> fib = fibo.fib
2696	>>> dir()
2697	['__builtins__', '__doc__', '__file__', '__name__', 'a', 'fib', 'fibo', 'sys']
2698	\end{verbatim}
2699
2700	Note that it lists all types of names: variables, modules, functions, etc.
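
Since \function{dir()} returns an ordinary list of strings, it can be
filtered like any other list. As a sketch, the following keeps only the
names of \module{sys} that do not begin with an underscore; the
variable name \code{public_names} is an assumption.

```python
import sys

# Keep the names that do not start with an underscore; the list
# returned by dir() is already sorted, and filtering preserves order.
public_names = [name for name in dir(sys) if not name.startswith('_')]

has_path = 'path' in public_names
```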
2701
2702	\function{dir()} does not list the names of built-in functions and
2703	variables. If you want a list of those, they are defined in the
2704	standard module \module{__builtin__}\refbimodindex{__builtin__}:
2705
2706	\begin{verbatim}
2707	>>> import __builtin__
2708	>>> dir(__builtin__)
2709	['ArithmeticError', 'AssertionError', 'AttributeError', 'DeprecationWarning',
2710	'EOFError', 'Ellipsis', 'EnvironmentError', 'Exception', 'False',
2711	'FloatingPointError', 'FutureWarning', 'IOError', 'ImportError',
2712	'IndentationError', 'IndexError', 'KeyError', 'KeyboardInterrupt',
2713	'LookupError', 'MemoryError', 'NameError', 'None', 'NotImplemented',
2714	'NotImplementedError', 'OSError', 'OverflowError',
2715	'PendingDeprecationWarning', 'ReferenceError', 'RuntimeError',
2716	'RuntimeWarning', 'StandardError', 'StopIteration', 'SyntaxError',
2717	'SyntaxWarning', 'SystemError', 'SystemExit', 'TabError', 'True',
2718	'TypeError', 'UnboundLocalError', 'UnicodeDecodeError',
2719	'UnicodeEncodeError', 'UnicodeError', 'UnicodeTranslateError',
2720	'UserWarning', 'ValueError', 'Warning', 'WindowsError',
2721	'ZeroDivisionError', '_', '__debug__', '__doc__', '__import__',
2722	'__name__', 'abs', 'apply', 'basestring', 'bool', 'buffer',
2723	'callable', 'chr', 'classmethod', 'cmp', 'coerce', 'compile',
2724	'complex', 'copyright', 'credits', 'delattr', 'dict', 'dir', 'divmod',
2725	'enumerate', 'eval', 'execfile', 'exit', 'file', 'filter', 'float',
2726	'frozenset', 'getattr', 'globals', 'hasattr', 'hash', 'help', 'hex',
2727	'id', 'input', 'int', 'intern', 'isinstance', 'issubclass', 'iter',
2728	'len', 'license', 'list', 'locals', 'long', 'map', 'max', 'min',
2729	'object', 'oct', 'open', 'ord', 'pow', 'property', 'quit', 'range',
2730	'raw_input', 'reduce', 'reload', 'repr', 'reversed', 'round', 'set',
2731	'setattr', 'slice', 'sorted', 'staticmethod', 'str', 'sum', 'super',
2732	'tuple', 'type', 'unichr', 'unicode', 'vars', 'xrange', 'zip']
2733	\end{verbatim}
2734
2735
2736	\section{Packages \label{packages}}
2737
2738	Packages are a way of structuring Python's module namespace
2739	by using ``dotted module names''. For example, the module name
2740	\module{A.B} designates a submodule named \samp{B} in a package named
2741	\samp{A}. Just like the use of modules saves the authors of different
2742	modules from having to worry about each other's global variable names,
2743	the use of dotted module names saves the authors of multi-module
2744	packages like NumPy or the Python Imaging Library from having to worry
2745	about each other's module names.
2746
2747	Suppose you want to design a collection of modules (a ``package'') for
2748	the uniform handling of sound files and sound data. There are many
2749	different sound file formats (usually recognized by their extension,
2750	for example: \file{.wav}, \file{.aiff}, \file{.au}), so you may need
2751	to create and maintain a growing collection of modules for the
2752	conversion between the various file formats. There are also many
2753	different operations you might want to perform on sound data (such as
2754	mixing, adding echo, applying an equalizer function, creating an
2755	artificial stereo effect), so in addition you will be writing a
2756	never-ending stream of modules to perform these operations. Here's a
2757	possible structure for your package (expressed in terms of a
2758	hierarchical filesystem):
2759
2760	\begin{verbatim}
Sound/                          Top-level package
      __init__.py               Initialize the sound package
      Formats/                  Subpackage for file format conversions
              __init__.py
              wavread.py
              wavwrite.py
              aiffread.py
              aiffwrite.py
              auread.py
              auwrite.py
              ...
      Effects/                  Subpackage for sound effects
              __init__.py
              echo.py
              surround.py
              reverse.py
              ...
      Filters/                  Subpackage for filters
              __init__.py
              equalizer.py
              vocoder.py
              karaoke.py
              ...
2784	\end{verbatim}
2785
2786	When importing the package, Python searches through the directories
2787	on \code{sys.path} looking for the package subdirectory.
2788
2789	The \file{__init__.py} files are required to make Python treat the
2790	directories as containing packages; this is done to prevent
2791	directories with a common name, such as \samp{string}, from
2792	unintentionally hiding valid modules that occur later on the module
2793	search path. In the simplest case, \file{__init__.py} can just be an
2794	empty file, but it can also execute initialization code for the
2795	package or set the \code{__all__} variable, described later.
2796
2797	Users of the package can import individual modules from the
2798	package, for example:
2799
2800	\begin{verbatim}
2801	import Sound.Effects.echo
2802	\end{verbatim}
2803
2804	This loads the submodule \module{Sound.Effects.echo}. It must be referenced
2805	with its full name.
2806
2807	\begin{verbatim}
2808	Sound.Effects.echo.echofilter(input, output, delay=0.7, atten=4)
2809	\end{verbatim}
2810
2811	An alternative way of importing the submodule is:
2812
2813	\begin{verbatim}
2814	from Sound.Effects import echo
2815	\end{verbatim}
2816
2817	This also loads the submodule \module{echo}, and makes it available without
2818	its package prefix, so it can be used as follows:
2819
2820	\begin{verbatim}
2821	echo.echofilter(input, output, delay=0.7, atten=4)
2822	\end{verbatim}
2823
2824	Yet another variation is to import the desired function or variable directly:
2825
2826	\begin{verbatim}
2827	from Sound.Effects.echo import echofilter
2828	\end{verbatim}
2829
2830	Again, this loads the submodule \module{echo}, but this makes its function
2831	\function{echofilter()} directly available:
2832
2833	\begin{verbatim}
2834	echofilter(input, output, delay=0.7, atten=4)
2835	\end{verbatim}
2836
2837	Note that when using \code{from \var{package} import \var{item}}, the
2838	item can be either a submodule (or subpackage) of the package, or some
2839	other name defined in the package, like a function, class or
2840	variable. The \code{import} statement first tests whether the item is
2841	defined in the package; if not, it assumes it is a module and attempts
2842	to load it. If it fails to find it, an
2843	\exception{ImportError} exception is raised.
2844
By contrast, when using syntax like \code{import
2846	\var{item.subitem.subsubitem}}, each item except for the last must be
2847	a package; the last item can be a module or a package but can't be a
2848	class or function or variable defined in the previous item.
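
The three import forms can be tried out on a throwaway copy of part of
the \module{Sound} package. This is only a sketch: the helper
\code{write_file}, the temporary directory and the body of
\function{echofilter()} are assumptions, not part of any real package.

```python
import os
import sys
import tempfile


def write_file(path, text=''):
    """Create a small text file, making its directory if needed."""
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        os.makedirs(directory)
    handle = open(path, 'w')
    handle.write(text)
    handle.close()


# Build Sound/Effects/echo.py in a temporary directory.
root = tempfile.mkdtemp()
write_file(os.path.join(root, 'Sound', '__init__.py'))
write_file(os.path.join(root, 'Sound', 'Effects', '__init__.py'))
write_file(os.path.join(root, 'Sound', 'Effects', 'echo.py'),
           'def echofilter(data, delay=0.7, atten=4):\n'
           '    return (data, delay, atten)\n')
sys.path.insert(0, root)

import Sound.Effects.echo                   # full dotted name required
from Sound.Effects import echo              # submodule without prefix
from Sound.Effects.echo import echofilter   # function made available

same_module = Sound.Effects.echo is echo
result = echofilter('samples', delay=0.5)

# The last item of a dotted import must be a module or a package,
# so naming a function there fails.
try:
    import Sound.Effects.echo.echofilter
    import_failed = False
except ImportError:
    import_failed = True
```

All three forms load the same module object; they differ only in which
name is bound in the importing namespace.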
2849
2850	\subsection{Importing * From a Package \label{pkg-import-star}}
2851	%The \code{__all__} Attribute
2852
2853	\ttindex{__all__}
2854	Now what happens when the user writes \code{from Sound.Effects import
2855	*}? Ideally, one would hope that this somehow goes out to the
2856	filesystem, finds which submodules are present in the package, and
2857	imports them all. Unfortunately, this operation does not work very
2858	well on Mac and Windows platforms, where the filesystem does not
2859	always have accurate information about the case of a filename! On
2860	these platforms, there is no guaranteed way to know whether a file
2861	\file{ECHO.PY} should be imported as a module \module{echo},
2862	\module{Echo} or \module{ECHO}. (For example, Windows 95 has the
2863	annoying practice of showing all file names with a capitalized first
2864	letter.) The DOS 8+3 filename restriction adds another interesting
2865	problem for long module names.
2866
2867	The only solution is for the package author to provide an explicit
2868	index of the package. The import statement uses the following
2869	convention: if a package's \file{__init__.py} code defines a list
2870	named \code{__all__}, it is taken to be the list of module names that
2871	should be imported when \code{from \var{package} import *} is
2872	encountered. It is up to the package author to keep this list
2873	up-to-date when a new version of the package is released. Package
2874	authors may also decide not to support it, if they don't see a use for
2875	importing * from their package. For example, the file
\file{Sound/Effects/__init__.py} could contain the following code:
2877
2878	\begin{verbatim}
2879	__all__ = ["echo", "surround", "reverse"]
2880	\end{verbatim}
2881
2882	This would mean that \code{from Sound.Effects import *} would
import the three named submodules of the \module{Sound.Effects} package.
2884
2885	If \code{__all__} is not defined, the statement \code{from Sound.Effects
2886	import *} does \emph{not} import all submodules from the package
2887	\module{Sound.Effects} into the current namespace; it only ensures that the
2888	package \module{Sound.Effects} has been imported (possibly running any
2889	initialization code in \file{__init__.py}) and then imports whatever names are
2890	defined in the package. This includes any names defined (and
2891	submodules explicitly loaded) by \file{__init__.py}. It also includes any
2892	submodules of the package that were explicitly loaded by previous
2893	import statements. Consider this code:
2894
2895	\begin{verbatim}
2896	import Sound.Effects.echo
2897	import Sound.Effects.surround
2898	from Sound.Effects import *
2899	\end{verbatim}
2900
2901	In this example, the echo and surround modules are imported in the
2902	current namespace because they are defined in the
2903	\module{Sound.Effects} package when the \code{from...import} statement
2904	is executed. (This also works when \code{__all__} is defined.)
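
The effect of \code{__all__} can be observed with a small experiment.
The sketch below builds a hypothetical package named
\module{SoundDemo} in a temporary directory, lists only two of its
three submodules in \code{__all__}, and runs the import inside a
separate dictionary so that the imported names are easy to inspect.
The package name, the helper \code{write_file} and the file contents
are all assumptions.

```python
import os
import sys
import tempfile


def write_file(path, text=''):
    """Create a small text file, making its directory if needed."""
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        os.makedirs(directory)
    handle = open(path, 'w')
    handle.write(text)
    handle.close()


root = tempfile.mkdtemp()
effects = os.path.join(root, 'SoundDemo', 'Effects')
write_file(os.path.join(root, 'SoundDemo', '__init__.py'))
# Only two of the three submodules are listed in __all__.
write_file(os.path.join(effects, '__init__.py'),
           '__all__ = ["echo", "surround"]\n')
for submodule in ('echo', 'surround', 'reverse'):
    write_file(os.path.join(effects, submodule + '.py'),
               'NAME = %r\n' % submodule)
sys.path.insert(0, root)

# Run the import in its own namespace dictionary.
namespace = {}
exec("from SoundDemo.Effects import *", namespace)
imported = sorted([name for name in namespace
                   if not name.startswith('__')])
```

After this runs, \code{imported} holds \code{'echo'} and
\code{'surround'} but not \code{'reverse'}, because the latter is
missing from \code{__all__}.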
2905
2906	Note that in general the practice of importing \code{*} from a module or
2907	package is frowned upon, since it often causes poorly readable code.
2908	However, it is okay to use it to save typing in interactive sessions,
2909	and certain modules are designed to export only names that follow
2910	certain patterns.
2911
2912	Remember, there is nothing wrong with using \code{from Package
2913	import specific_submodule}! In fact, this is the
2914	recommended notation unless the importing module needs to use
2915	submodules with the same name from different packages.
2916
2917
2918	\subsection{Intra-package References}
2919
2920	The submodules often need to refer to each other. For example, the
2921	\module{surround} module might use the \module{echo} module. In fact,
2922	such references are so common that the \keyword{import} statement
2923	first looks in the containing package before looking in the standard
2924	module search path. Thus, the \module{surround} module can simply use
2925	\code{import echo} or \code{from echo import echofilter}. If the
2926	imported module is not found in the current package (the package of
2927	which the current module is a submodule), the \keyword{import}
2928	statement looks for a top-level module with the given name.
2929
2930	When packages are structured into subpackages (as with the
2931	\module{Sound} package in the example), there's no shortcut to refer
to submodules of sibling packages; the full name of the subpackage
2933	must be used. For example, if the module
2934	\module{Sound.Filters.vocoder} needs to use the \module{echo} module
2935	in the \module{Sound.Effects} package, it can use \code{from
2936	Sound.Effects import echo}.
2937
2938	Starting with Python 2.5, in addition to the implicit relative imports
2939	described above, you can write explicit relative imports with the
2940	\code{from module import name} form of import statement. These explicit
2941	relative imports use leading dots to indicate the current and parent
2942	packages involved in the relative import. From the \module{surround}
2943	module for example, you might use:
2944
2945	\begin{verbatim}
2946	from . import echo
2947	from .. import Formats
2948	from ..Filters import equalizer
2949	\end{verbatim}
2950
2951	Note that both explicit and implicit relative imports are based on the
2952	name of the current module. Since the name of the main module is always
2953	\code{"__main__"}, modules intended for use as the main module of a
2954	Python application should always use absolute imports.
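
Explicit relative imports can be tried out with a small throwaway
package. In this sketch the package name \module{RelDemo}, the helper
\code{write_file} and the function bodies are hypothetical; only the
import statements inside \file{surround.py} follow the forms shown
above.

```python
import os
import sys
import tempfile


def write_file(path, text=''):
    """Create a small text file, making its directory if needed."""
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        os.makedirs(directory)
    handle = open(path, 'w')
    handle.write(text)
    handle.close()


root = tempfile.mkdtemp()
pkg = os.path.join(root, 'RelDemo')
write_file(os.path.join(pkg, '__init__.py'))
write_file(os.path.join(pkg, 'Effects', '__init__.py'))
write_file(os.path.join(pkg, 'Filters', '__init__.py'))
write_file(os.path.join(pkg, 'Effects', 'echo.py'),
           'def echofilter(data):\n'
           '    return "echo(%s)" % data\n')
write_file(os.path.join(pkg, 'Filters', 'equalizer.py'),
           'def equalize(data):\n'
           '    return "eq(%s)" % data\n')
# surround.py reaches its sibling module and a sibling subpackage
# with leading dots instead of spelling out the top-level name.
write_file(os.path.join(pkg, 'Effects', 'surround.py'),
           'from . import echo\n'
           'from ..Filters import equalizer\n'
           'def process(data):\n'
           '    return equalizer.equalize(echo.echofilter(data))\n')
sys.path.insert(0, root)

from RelDemo.Effects import surround
result = surround.process('x')
```

Note that \file{surround.py} is imported as part of the package here;
running it directly as a script would make its name \code{"__main__"}
and the relative imports would fail.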
2955
2956	\subsection{Packages in Multiple Directories}
2957
2958	Packages support one more special attribute, \member{__path__}. This
2959	is initialized to be a list containing the name of the directory
2960	holding the package's \file{__init__.py} before the code in that file
2961	is executed. This variable can be modified; doing so affects future
2962	searches for modules and subpackages contained in the package.
2963
2964	While this feature is not often needed, it can be used to extend the
2965	set of modules found in a package.
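
As a sketch of how \member{__path__} can extend a package, the
following creates a hypothetical package \module{PathDemo} and a
separate directory holding an extra module, then appends that directory
to the package's \member{__path__}. All names and file contents are
assumptions made for illustration.

```python
import os
import sys
import tempfile


def write_file(path, text=''):
    """Create a small text file, making its directory if needed."""
    directory = os.path.dirname(path)
    if not os.path.isdir(directory):
        os.makedirs(directory)
    handle = open(path, 'w')
    handle.write(text)
    handle.close()


root = tempfile.mkdtemp()
extra_dir = os.path.join(root, 'extra_modules')
write_file(os.path.join(root, 'PathDemo', '__init__.py'))
# plugin.py lives outside the package directory.
write_file(os.path.join(extra_dir, 'plugin.py'),
           'LOADED_FROM = "extra"\n')
sys.path.insert(0, root)

import PathDemo

# Initially the list holds only the package's own directory.
initial_path = list(PathDemo.__path__)

# Later submodule searches also look in the added directory.
PathDemo.__path__.append(extra_dir)
from PathDemo import plugin
```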
2966
2967
2968
2969	\chapter{Input and Output \label{io}}
2970
2971	There are several ways to present the output of a program; data can be
2972	printed in a human-readable form, or written to a file for future use.
2973	This chapter will discuss some of the possibilities.
2974
2975
2976	\section{Fancier Output Formatting \label{formatting}}
2977
2978	So far we've encountered two ways of writing values: \emph{expression
2979	statements} and the \keyword{print} statement. (A third way is using
2980	the \method{write()} method of file objects; the standard output file
2981	can be referenced as \code{sys.stdout}. See the Library Reference for
2982	more information on this.)
2983
2984	Often you'll want more control over the formatting of your output than
2985	simply printing space-separated values. There are two ways to format
2986	your output; the first way is to do all the string handling yourself;
2987	using string slicing and concatenation operations you can create any
2988	layout you can imagine. The standard module
2989	\module{string}\refstmodindex{string} contains some useful operations
2990	for padding strings to a given column width; these will be discussed
2991	shortly. The second way is to use the \code{\%} operator with a
2992	string as the left argument. The \code{\%} operator interprets the
2993	left argument much like a \cfunction{sprintf()}-style format
2994	string to be applied to the right argument, and returns the string
2995	resulting from this formatting operation.
2996
2997	One question remains, of course: how do you convert values to strings?
2998	Luckily, Python has ways to convert any value to a string: pass it to
2999	the \function{repr()} or \function{str()} functions. Reverse quotes
3000	(\code{``}) are equivalent to \function{repr()}, but they are no
3001	longer used in modern Python code and will likely not be in future
3002	versions of the language.
3003
3004	The \function{str()} function is meant to return representations of
3005	values which are fairly human-readable, while \function{repr()} is
3006	meant to generate representations which can be read by the interpreter
(or will force a \exception{SyntaxError} if there is no equivalent
3008	syntax). For objects which don't have a particular representation for
3009	human consumption, \function{str()} will return the same value as
3010	\function{repr()}. Many values, such as numbers or structures like
3011	lists and dictionaries, have the same representation using either
3012	function. Strings and floating point numbers, in particular, have two
3013	distinct representations.
3014
3015	Some examples:
3016
3017	\begin{verbatim}
3018	>>> s = 'Hello, world.'
3019	>>> str(s)
3020	'Hello, world.'
3021	>>> repr(s)
3022	"'Hello, world.'"
3023	>>> str(0.1)
3024	'0.1'
3025	>>> repr(0.1)
3026	'0.10000000000000001'
3027	>>> x = 10 * 3.25
3028	>>> y = 200 * 200
3029	>>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
3030	>>> print s
3031	The value of x is 32.5, and y is 40000...
3032	>>> # The repr() of a string adds string quotes and backslashes:
3033	... hello = 'hello, world\n'
3034	>>> hellos = repr(hello)
3035	>>> print hellos
3036	'hello, world\n'
3037	>>> # The argument to repr() may be any Python object:
3038	... repr((x, y, ('spam', 'eggs')))
3039	"(32.5, 40000, ('spam', 'eggs'))"
3040	>>> # reverse quotes are convenient in interactive sessions:
3041	... `x, y, ('spam', 'eggs')`
3042	"(32.5, 40000, ('spam', 'eggs'))"
3043	\end{verbatim}
3044
3045	Here are two ways to write a table of squares and cubes:
3046
3047	\begin{verbatim}
>>> for x in range(1, 11):
...     print repr(x).rjust(2), repr(x*x).rjust(3),
...     # Note trailing comma on previous line
...     print repr(x*x*x).rjust(4)
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
>>> for x in range(1,11):
...     print '%2d %3d %4d' % (x, x*x, x*x*x)
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
3076	\end{verbatim}
3077
3078	(Note that one space between each column was added by the way
3079	\keyword{print} works: it always adds spaces between its arguments.)
3080
3081	This example demonstrates the \method{rjust()} method of string objects,
3082	which right-justifies a string in a field of a given width by padding
3083	it with spaces on the left. There are similar methods
3084	\method{ljust()} and \method{center()}. These
3085	methods do not write anything, they just return a new string. If
3086	the input string is too long, they don't truncate it, but return it
unchanged; this will mess up your column layout but that's usually
3088	better than the alternative, which would be lying about a value. (If
3089	you really want truncation you can always add a slice operation, as in
3090	\samp{x.ljust(n)[:n]}.)
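
The three justification methods and the truncation idiom can be
compared side by side. This is a small sketch with arbitrarily chosen
strings and widths.

```python
word = 'spam'

left = word.ljust(8)        # padded on the right
centered = word.center(8)   # padded on both sides
right = word.rjust(8)       # padded on the left

# A string longer than the field is returned unchanged...
too_long = 'spam and eggs'.rjust(8)

# ...unless it is cut down explicitly with a slice.
truncated = 'spam and eggs'.ljust(8)[:8]
```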
3091
3092	There is another method, \method{zfill()}, which pads a
numeric string on the left with zeros. It understands plus and
3094	minus signs:
3095
3096	\begin{verbatim}
3097	>>> '12'.zfill(5)
3098	'00012'
3099	>>> '-3.14'.zfill(7)
3100	'-003.14'
3101	>>> '3.14159265359'.zfill(5)
3102	'3.14159265359'
3103	\end{verbatim}
3104
3105	Using the \code{\%} operator looks like this:
3106
3107	\begin{verbatim}
3108	>>> import math
3109	>>> print 'The value of PI is approximately %5.3f.' % math.pi
3110	The value of PI is approximately 3.142.
3111	\end{verbatim}
3112
3113	If there is more than one format in the string, you need to pass a
3114	tuple as right operand, as in this example:
3115
3116	\begin{verbatim}
3117	>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
3118	>>> for name, phone in table.items():
...     print '%-10s ==> %10d' % (name, phone)
...
Jack       ==>       4098
Dcab       ==>       7678
Sjoerd     ==>       4127
3124	\end{verbatim}
3125
3126	Most formats work exactly as in C and require that you pass the proper
3127	type; however, if you don't you get an exception, not a core dump.
3128	The \code{\%s} format is more relaxed: if the corresponding argument is
3129	not a string object, it is converted to string using the
3130	\function{str()} built-in function. Using \code{*} to pass the width
3131	or precision in as a separate (integer) argument is supported. The
3132	C formats \code{\%n} and \code{\%p} are not supported.
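
The points made in the previous paragraph can be illustrated briefly.
The values below are arbitrary; the sketch shows \code{*} taking the
width or precision from the argument tuple, the relaxed behaviour of
\code{\%s}, and the exception raised for a mismatched type.

```python
# The width (6) and the precision (2) are taken from the tuple.
width_example = '%*d' % (6, 42)
precision_example = '%.*f' % (2, 3.14159)

# %s converts any argument with str().
relaxed = '%s and %s' % (42, [1, 2])

# A wrong argument type raises an exception rather than crashing.
try:
    '%d' % 'not a number'
    mismatch_raised = False
except TypeError:
    mismatch_raised = True
```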
3133
3134	If you have a really long format string that you don't want to split
3135	up, it would be nice if you could reference the variables to be
3136	formatted by name instead of by position. This can be done by using
the form \code{\%(name)format}, as shown here:
3138
3139	\begin{verbatim}
3140	>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
3141	>>> print 'Jack: %(Jack)d; Sjoerd: %(Sjoerd)d; Dcab: %(Dcab)d' % table
3142	Jack: 4098; Sjoerd: 4127; Dcab: 8637678
3143	\end{verbatim}
3144
This is particularly useful in combination with the built-in
3146	\function{vars()} function, which returns a dictionary containing all
3147	local variables.
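
For instance, inside a function \function{vars()} makes the arguments
available to the format string by name. The function
\function{phone_entry()} below is a hypothetical helper written for
this sketch.

```python
def phone_entry(name, phone):
    # Called without arguments, vars() returns the local variables
    # as a dictionary, here the two parameters.
    return '%(name)-10s ==> %(phone)10d' % vars()


entry = phone_entry('Jack', 4098)
```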
3148
3149	\section{Reading and Writing Files \label{files}}
3150
3151	% Opening files
3152	\function{open()}\bifuncindex{open} returns a file
3153	object\obindex{file}, and is most commonly used with two arguments:
3154	\samp{open(\var{filename}, \var{mode})}.
3155
3156	\begin{verbatim}
3157	>>> f=open('/tmp/workfile', 'w')
3158	>>> print f
3159	<open file '/tmp/workfile', mode 'w' at 80a0960>
3160	\end{verbatim}
3161
3162	The first argument is a string containing the filename. The second
3163	argument is another string containing a few characters describing the
3164	way in which the file will be used. \var{mode} can be \code{'r'} when
3165	the file will only be read, \code{'w'} for only writing (an existing
3166	file with the same name will be erased), and \code{'a'} opens the file
3167	for appending; any data written to the file is automatically added to
3168	the end. \code{'r+'} opens the file for both reading and writing.
3169	The \var{mode} argument is optional; \code{'r'} will be assumed if
3170	it's omitted.
3171
3172	On Windows and the Macintosh, \code{'b'} appended to the
3173	mode opens the file in binary mode, so there are also modes like
3174	\code{'rb'}, \code{'wb'}, and \code{'r+b'}. Windows makes a
3175	distinction between text and binary files; the end-of-line characters
3176	in text files are automatically altered slightly when data is read or
3177	written. This behind-the-scenes modification to file data is fine for
3178	\ASCII{} text files, but it'll corrupt binary data like that in \file{JPEG} or
3179	\file{EXE} files. Be very careful to use binary mode when reading and
3180	writing such files.
3181
3182	\subsection{Methods of File Objects \label{fileMethods}}
3183
3184	The rest of the examples in this section will assume that a file
3185	object called \code{f} has already been created.
3186
3187	To read a file's contents, call \code{f.read(\var{size})}, which reads
3188	some quantity of data and returns it as a string. \var{size} is an
3189	optional numeric argument. When \var{size} is omitted or negative,
3190	the entire contents of the file will be read and returned; it's your
3191	problem if the file is twice as large as your machine's memory.
3192	Otherwise, at most \var{size} bytes are read and returned. If the end
3193	of the file has been reached, \code{f.read()} will return an empty
string (\code{""}).

3195	\begin{verbatim}
3196	>>> f.read()
3197	'This is the entire file.\n'
3198	>>> f.read()
3199	''
3200	\end{verbatim}
3201
3202	\code{f.readline()} reads a single line from the file; a newline
3203	character (\code{\e n}) is left at the end of the string, and is only
3204	omitted on the last line of the file if the file doesn't end in a
3205	newline. This makes the return value unambiguous; if
3206	\code{f.readline()} returns an empty string, the end of the file has
3207	been reached, while a blank line is represented by \code{'\e n'}, a
3208	string containing only a single newline.
3209
3210	\begin{verbatim}
3211	>>> f.readline()
3212	'This is the first line of the file.\n'
3213	>>> f.readline()
3214	'Second line of the file\n'
3215	>>> f.readline()
3216	''
3217	\end{verbatim}
3218
3219	\code{f.readlines()} returns a list containing all the lines of data
3220	in the file. If given an optional parameter \var{sizehint}, it reads
3221	that many bytes from the file and enough more to complete a line, and
3222	returns the lines from that. This is often used to allow efficient
3223	reading of a large file by lines, but without having to load the
3224	entire file in memory. Only complete lines will be returned.
3225
3226	\begin{verbatim}
3227	>>> f.readlines()
3228	['This is the first line of the file.\n', 'Second line of the file\n']
3229	\end{verbatim}
3230
3231	An alternate approach to reading lines is to loop over the file object.
3232	This is memory efficient, fast, and leads to simpler code:
3233
3234	\begin{verbatim}
3235	>>> for line in f:
        print line,
3237
3238	This is the first line of the file.
3239	Second line of the file
3240	\end{verbatim}
3241
3242	The alternative approach is simpler but does not provide as fine-grained
3243	control. Since the two approaches manage line buffering differently,
3244	they should not be mixed.
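
The reading methods can be compared on a small file. The sketch below
creates a temporary file whose contents are chosen to show a blank line
and a last line without a trailing newline; the file name and contents
are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('first\nsecond\n\nlast')
f.close()

# read(size) returns at most size characters; read() returns the rest,
# and a further read() at the end of the file returns ''.
f = open(path)
chunk = f.read(5)
rest = f.read()
at_end = f.read()
f.close()

# readline() returns '' only at the end of the file; the blank line
# comes back as '\n'.
lines = []
f = open(path)
while True:
    line = f.readline()
    if line == '':
        break
    lines.append(line)
f.close()

# Looping over the file object yields the same lines.
f = open(path)
looped = [line for line in f]
f.close()
```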
3245
3246	\code{f.write(\var{string})} writes the contents of \var{string} to
3247	the file, returning \code{None}.
3248
3249	\begin{verbatim}
3250	>>> f.write('This is a test\n')
3251	\end{verbatim}
3252
3253	To write something other than a string, it needs to be converted to a
3254	string first:
3255
3256	\begin{verbatim}
3257	>>> value = ('the answer', 42)
3258	>>> s = str(value)
3259	>>> f.write(s)
3260	\end{verbatim}
3261
3262	\code{f.tell()} returns an integer giving the file object's current
3263	position in the file, measured in bytes from the beginning of the
3264	file. To change the file object's position, use
3265	\samp{f.seek(\var{offset}, \var{from_what})}. The position is
3266	computed from adding \var{offset} to a reference point; the reference
3267	point is selected by the \var{from_what} argument. A
3268	\var{from_what} value of 0 measures from the beginning of the file, 1
3269	uses the current file position, and 2 uses the end of the file as the
3270	reference point. \var{from_what} can be omitted and defaults to 0,
3271	using the beginning of the file as the reference point.
3272
3273	\begin{verbatim}
3274	>>> f = open('/tmp/workfile', 'r+')
3275	>>> f.write('0123456789abcdef')
3276	>>> f.seek(5) # Go to the 6th byte in the file
3277	>>> f.read(1)
3278	'5'
3279	>>> f.seek(-3, 2) # Go to the 3rd byte before the end
3280	>>> f.read(1)
3281	'd'
3282	\end{verbatim}
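
The three reference points can be checked with \code{f.tell()}. In this
sketch the file is opened in binary mode so that positions correspond
exactly to bytes; the file name and contents are assumptions.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'workfile')
f = open(path, 'w')
f.write('0123456789abcdef')
f.close()

f = open(path, 'rb')
f.seek(5)           # from_what omitted: measured from the beginning
from_start = f.tell()
f.seek(2, 1)        # 1: relative to the current position
from_current = f.tell()
f.seek(-3, 2)       # 2: relative to the end of the file
from_end = f.tell()
f.seek(0, 2)        # the end itself, which gives the file size
size = f.tell()
f.close()
```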
3283
3284	When you're done with a file, call \code{f.close()} to close it and
3285	free up any system resources taken up by the open file. After calling
3286	\code{f.close()}, attempts to use the file object will automatically fail.
3287
3288	\begin{verbatim}
3289	>>> f.close()
3290	>>> f.read()
3291	Traceback (most recent call last):
  File "<stdin>", line 1, in ?
3293	ValueError: I/O operation on closed file
3294	\end{verbatim}
3295
File objects have some additional methods, such as
\method{isatty()} and \method{truncate()}, which are less frequently
used; consult the Library Reference for a complete guide to file
objects.
3300
3301	\subsection{The \module{pickle} Module \label{pickle}}
3302	\refstmodindex{pickle}
3303
3304	Strings can easily be written to and read from a file. Numbers take a
3305	bit more effort, since the \method{read()} method only returns
3306	strings, which will have to be passed to a function like
3307	\function{int()}, which takes a string like \code{'123'} and
3308	returns its numeric value 123. However, when you want to save more
3309	complex data types like lists, dictionaries, or class instances,
3310	things get a lot more complicated.
3311
3312	Rather than have users be constantly writing and debugging code to
3313	save complicated data types, Python provides a standard module called
3314	\ulink{\module{pickle}}{../lib/module-pickle.html}. This is an
3315	amazing module that can take almost
3316	any Python object (even some forms of Python code!), and convert it to
3317	a string representation; this process is called \dfn{pickling}.
3318	Reconstructing the object from the string representation is called
\dfn{unpickling}.  Between pickling and unpickling, the string
representing the object may have been stored in a file or other data
store, or sent over a network connection to some distant machine.
3322
3323	If you have an object \code{x}, and a file object \code{f} that's been
3324	opened for writing, the simplest way to pickle the object takes only
3325	one line of code:
3326
3327	\begin{verbatim}
3328	pickle.dump(x, f)
3329	\end{verbatim}
3330
3331	To unpickle the object again, if \code{f} is a file object which has
3332	been opened for reading:
3333
3334	\begin{verbatim}
3335	x = pickle.load(f)
3336	\end{verbatim}
3337
3338	(There are other variants of this, used when pickling many objects or
3339	when you don't want to write the pickled data to a file; consult the
3340	complete documentation for
3341	\ulink{\module{pickle}}{../lib/module-pickle.html} in the
3342	\citetitle[../lib/]{Python Library Reference}.)
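
A complete round trip might look like the sketch below.  The sample
dictionary and the scratch file are made up for illustration, and the
file is opened in binary mode, which is the safe choice for pickled
data.  The last two lines use \function{pickle.dumps()} and
\function{pickle.loads()}, the variants that work on an in-memory
string instead of a file.

\begin{verbatim}
import os
import pickle
import tempfile

# Sample data invented for this illustration.
data = {'name': 'spam', 'values': [1, 2, 3], 'pair': ('the answer', 42)}

fd, path = tempfile.mkstemp()
os.close(fd)

# Pickle the object to a file ...
f = open(path, 'wb')
pickle.dump(data, f)
f.close()

# ... and unpickle it again, as another program might do later.
f = open(path, 'rb')
restored = pickle.load(f)
f.close()
os.remove(path)

# The same round trip without a file.
as_string = pickle.dumps(data)
copy = pickle.loads(as_string)
\end{verbatim}

The unpickled objects are equal to the original but are new objects,
not references to it.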
3343
3344	\ulink{\module{pickle}}{../lib/module-pickle.html} is the standard way
3345	to make Python objects which can be stored and reused by other
3346	programs or by a future invocation of the same program; the technical
3347	term for this is a \dfn{persistent} object. Because
3348	\ulink{\module{pickle}}{../lib/module-pickle.html} is so widely used,
3349	many authors who write Python extensions take care to ensure that new
3350	data types such as matrices can be properly pickled and unpickled.
3351
3352
3353
3354	\chapter{Errors and Exceptions \label{errors}}
3355
3356	Until now error messages haven't been more than mentioned, but if you
3357	have tried out the examples you have probably seen some. There are
3358	(at least) two distinguishable kinds of errors:
3359	\emph{syntax errors} and \emph{exceptions}.
3360
3361	\section{Syntax Errors \label{syntaxErrors}}
3362
3363	Syntax errors, also known as parsing errors, are perhaps the most common
3364	kind of complaint you get while you are still learning Python:
3365
3366	\begin{verbatim}
>>> while True print 'Hello world'
  File "<stdin>", line 1, in ?
    while True print 'Hello world'
                   ^
SyntaxError: invalid syntax
3372	\end{verbatim}
3373
3374	The parser repeats the offending line and displays a little `arrow'
3375	pointing at the earliest point in the line where the error was
3376	detected. The error is caused by (or at least detected at) the token
3377	\emph{preceding} the arrow: in the example, the error is detected at
3378	the keyword \keyword{print}, since a colon (\character{:}) is missing
3379	before it. File name and line number are printed so you know where to
3380	look in case the input came from a script.
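
The file name and line number are also available to a program.  As a
rough sketch, the built-in \function{compile()} function can be asked
to parse the faulty line; the file name \code{'<example>'} is an
arbitrary label chosen here.

\begin{verbatim}
import sys

source = "while True print 'Hello world'"    # the colon is missing

try:
    compile(source, '<example>', 'exec')     # parse only, nothing is run
    detected = False
except SyntaxError:
    detected = True
    error = sys.exc_info()[1]                # the SyntaxError instance
    location = (error.filename, error.lineno)
\end{verbatim}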
3381
3382	\section{Exceptions \label{exceptions}}
3383
3384	Even if a statement or expression is syntactically correct, it may
3385	cause an error when an attempt is made to execute it.
3386	Errors detected during execution are called \emph{exceptions} and are
3387	not unconditionally fatal: you will soon learn how to handle them in
3388	Python programs. Most exceptions are not handled by programs,
3389	however, and result in error messages as shown here:
3390
3391	\begin{verbatim}
>>> 10 * (1/0)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ZeroDivisionError: integer division or modulo by zero
>>> 4 + spam*3
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'spam' is not defined
>>> '2' + 2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: cannot concatenate 'str' and 'int' objects
3404	\end{verbatim}
3405
3406	The last line of the error message indicates what happened.
3407	Exceptions come in different types, and the type is printed as part of
3408	the message: the types in the example are
3409	\exception{ZeroDivisionError}, \exception{NameError} and
3410	\exception{TypeError}.
3411	The string printed as the exception type is the name of the built-in
3412	exception that occurred. This is true for all built-in
3413	exceptions, but need not be true for user-defined exceptions (although
3414	it is a useful convention).
3415	Standard exception names are built-in identifiers (not reserved
3416	keywords).
3417
3418	The rest of the line provides detail based on the type of exception
3419	and what caused it.
3420
3421	The preceding part of the error message shows the context where the
3422	exception happened, in the form of a stack traceback.
3423	In general it contains a stack traceback listing source lines; however,
3424	it will not display lines read from standard input.
3425
3426	The \citetitle[../lib/module-exceptions.html]{Python Library
3427	Reference} lists the built-in exceptions and their meanings.
3428
3429
3430	\section{Handling Exceptions \label{handling}}
3431
3432	It is possible to write programs that handle selected exceptions.
3433	Look at the following example, which asks the user for input until a
3434	valid integer has been entered, but allows the user to interrupt the
3435	program (using \kbd{Control-C} or whatever the operating system
3436	supports); note that a user-generated interruption is signalled by
3437	raising the \exception{KeyboardInterrupt} exception.
3438
3439	\begin{verbatim}
>>> while True:
...     try:
...         x = int(raw_input("Please enter a number: "))
...         break
...     except ValueError:
...         print "Oops! That was no valid number. Try again..."
...
3447	\end{verbatim}
3448
3449	The \keyword{try} statement works as follows.
3450
3451	\begin{itemize}
3452	\item
3453	First, the \emph{try clause} (the statement(s) between the
3454	\keyword{try} and \keyword{except} keywords) is executed.
3455
3456	\item
3457	If no exception occurs, the \emph{except\ clause} is skipped and
3458	execution of the \keyword{try} statement is finished.
3459
3460	\item
3461	If an exception occurs during execution of the try clause, the rest of
3462	the clause is skipped. Then if its type matches the exception named
3463	after the \keyword{except} keyword, the except clause is executed, and
3464	then execution continues after the \keyword{try} statement.
3465
3466	\item
3467	If an exception occurs which does not match the exception named in the
3468	except clause, it is passed on to outer \keyword{try} statements; if
3469	no handler is found, it is an \emph{unhandled exception} and execution
3470	stops with a message as shown above.
3471
3472	\end{itemize}
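
The steps listed above can be traced with a small function.  This is a
sketch only; \function{to_int()} is a made-up helper, not a library
function.

\begin{verbatim}
def to_int(text):
    try:
        value = int(text)    # the try clause
    except ValueError:
        return None          # runs only if int() raised ValueError
    return value             # reached when no exception occurred

ok = to_int('42')            # no exception: the except clause is skipped
bad = to_int('spam')         # ValueError: the except clause runs

# A TypeError does not match the except clause, so it is passed on
# to the outer try statement.
try:
    to_int(None)
    propagated = False
except TypeError:
    propagated = True
\end{verbatim}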
3473
3474	A \keyword{try} statement may have more than one except clause, to
3475	specify handlers for different exceptions. At most one handler will
3476	be executed. Handlers only handle exceptions that occur in the
3477	corresponding try clause, not in other handlers of the same
3478	\keyword{try} statement. An except clause may name multiple exceptions
3479	as a parenthesized tuple, for example:
3480
3481	\begin{verbatim}
... except (RuntimeError, TypeError, NameError):
...     pass
3484	\end{verbatim}
3485
3486	The last except clause may omit the exception name(s), to serve as a
3487	wildcard. Use this with extreme caution, since it is easy to mask a
3488	real programming error in this way! It can also be used to print an
3489	error message and then re-raise the exception (allowing a caller to
3490	handle the exception as well):
3491
3492	\begin{verbatim}
import sys

try:
    f = open('myfile.txt')
    s = f.readline()
    i = int(s.strip())
except IOError, (errno, strerror):
    print "I/O error(%s): %s" % (errno, strerror)
except ValueError:
    print "Could not convert data to an integer."
except:
    print "Unexpected error:", sys.exc_info()[0]
    raise
3506	\end{verbatim}
3507
3508	The \keyword{try} \ldots\ \keyword{except} statement has an optional
3509	\emph{else clause}, which, when present, must follow all except
3510	clauses. It is useful for code that must be executed if the try
3511	clause does not raise an exception. For example:
3512
3513	\begin{verbatim}
for arg in sys.argv[1:]:
    try:
        f = open(arg, 'r')
    except IOError:
        print 'cannot open', arg
    else:
        print arg, 'has', len(f.readlines()), 'lines'
        f.close()
3522	\end{verbatim}
3523
3524	The use of the \keyword{else} clause is better than adding additional
3525	code to the \keyword{try} clause because it avoids accidentally
3526	catching an exception that wasn't raised by the code being protected
3527	by the \keyword{try} \ldots\ \keyword{except} statement.
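
The difference shows up when the follow-on code can raise the same
exception type as the protected code.  In the sketch below,
\function{lookup()} and \function{broken()} are hypothetical helpers:
only the dictionary access is protected, so a \exception{KeyError}
raised by the transformation is not mistaken for a missing key.

\begin{verbatim}
def lookup(table, key, transform):
    try:
        value = table[key]           # only this line is protected
    except KeyError:
        return 'missing'
    else:
        return transform(value)      # a KeyError raised here is not caught

def broken(value):
    raise KeyError('raised by the transformation')

found = lookup({'a': 1}, 'a', str)
missing = lookup({'a': 1}, 'b', str)

try:
    lookup({'a': 1}, 'a', broken)
    escaped = False
except KeyError:
    escaped = True                   # the error reached the caller
\end{verbatim}

Had the call to \code{transform(value)} been placed inside the try
clause, the last call would have returned \code{'missing'} and hidden
the real problem.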
3528
3529
3530	When an exception occurs, it may have an associated value, also known as
3531	the exception's \emph{argument}.
3532	The presence and type of the argument depend on the exception type.
3533
3534	The except clause may specify a variable after the exception name (or tuple).
3535	The variable is bound to an exception instance with the arguments stored
3536	in \code{instance.args}. For convenience, the exception instance
3537	defines \method{__getitem__} and \method{__str__} so the arguments can
3538	be accessed or printed directly without having to reference \code{.args}.
3539
3540	But use of \code{.args} is discouraged. Instead, the preferred use is to pass
3541	a single argument to an exception (which can be a tuple if multiple arguments
are needed) and have it bound to the \code{message} attribute.  One may also
3543	instantiate an exception first before raising it and add any attributes to it
3544	as desired.
3545
3546	\begin{verbatim}
>>> try:
...     raise Exception('spam', 'eggs')
... except Exception, inst:
...     print type(inst)     # the exception instance
...     print inst.args      # arguments stored in .args
...     print inst           # __str__ allows args to be printed directly
...     x, y = inst          # __getitem__ allows args to be unpacked directly
...     print 'x =', x
...     print 'y =', y
...
<type 'exceptions.Exception'>
('spam', 'eggs')
('spam', 'eggs')
x = spam
y = eggs
3562	\end{verbatim}
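
Instantiating the exception before raising it, as mentioned above,
might look like the following sketch.  The attribute name
\code{source} and its value are invented for this illustration.

\begin{verbatim}
error = ValueError('spam', 'eggs')   # create the instance first
error.source = 'menu.txt'            # hypothetical extra attribute

try:
    raise error
except ValueError:
    arguments = error.args           # the arguments given above
    text = str(error)                # string form of the arguments
    source = error.source            # the extra attribute is still there
\end{verbatim}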
3563
3564	If an exception has an argument, it is printed as the last part
3565	(`detail') of the message for unhandled exceptions.
3566
3567	Exception handlers don't just handle exceptions if they occur
3568	immediately in the try clause, but also if they occur inside functions
3569	that are called (even indirectly) in the try clause.
3570	For example:
3571
3572	\begin{verbatim}
>>> def this_fails():
...     x = 1/0
...
>>> try:
...     this_fails()
... except ZeroDivisionError, detail:
...     print 'Handling run-time error:', detail
...
Handling run-time error: integer division or modulo by zero
3582	\end{verbatim}
3583
3584
3585	\section{Raising Exceptions \label{raising}}
3586
3587	The \keyword{raise} statement allows the programmer to force a
3588	specified exception to occur.
3589	For example:
3590
3591	\begin{verbatim}
3592	>>> raise NameError, 'HiThere'
3593	Traceback (most recent call last):
  File "<stdin>", line 1, in ?
3595	NameError: HiThere
3596	\end{verbatim}
3597
3598	The first argument to \keyword{raise} names the exception to be
3599	raised. The optional second argument specifies the exception's
3600	argument. Alternatively, the above could be written as
3601	\code{raise NameError('HiThere')}. Either form works fine, but there
3602	seems to be a growing stylistic preference for the latter.
3603
3604	If you need to determine whether an exception was raised but don't
3605	intend to handle it, a simpler form of the \keyword{raise} statement
3606	allows you to re-raise the exception:
3607
3608	\begin{verbatim}
>>> try:
...     raise NameError, 'HiThere'
... except NameError:
...     print 'An exception flew by!'
...     raise
...
An exception flew by!
Traceback (most recent call last):
  File "<stdin>", line 2, in ?
NameError: HiThere
3619	\end{verbatim}
3620
3621
3622	\section{User-defined Exceptions \label{userExceptions}}
3623
3624	Programs may name their own exceptions by creating a new exception
3625	class. Exceptions should typically be derived from the
3626	\exception{Exception} class, either directly or indirectly. For
3627	example:
3628
3629	\begin{verbatim}
>>> class MyError(Exception):
...     def __init__(self, value):
...         self.value = value
...     def __str__(self):
...         return repr(self.value)
...
>>> try:
...     raise MyError(2*2)
... except MyError, e:
...     print 'My exception occurred, value:', e.value
...
My exception occurred, value: 4
>>> raise MyError, 'oops!'
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
__main__.MyError: 'oops!'
3646	\end{verbatim}
3647
3648	In this example, the default \method{__init__} of \class{Exception}
3649	has been overridden. The new behavior simply creates the \var{value}
3650	attribute. This replaces the default behavior of creating the
3651	\var{args} attribute.
3652
3653	Exception classes can be defined which do anything any other class can
3654	do, but are usually kept simple, often only offering a number of
3655	attributes that allow information about the error to be extracted by
3656	handlers for the exception. When creating a module that can raise
3657	several distinct errors, a common practice is to create a base class
3658	for exceptions defined by that module, and subclass that to create
3659	specific exception classes for different error conditions:
3660
3661	\begin{verbatim}
class Error(Exception):
    """Base class for exceptions in this module."""
    pass

class InputError(Error):
    """Exception raised for errors in the input.

    Attributes:
        expression -- input expression in which the error occurred
        message -- explanation of the error
    """

    def __init__(self, expression, message):
        self.expression = expression
        self.message = message

class TransitionError(Error):
    """Raised when an operation attempts a state transition that's not
    allowed.

    Attributes:
        previous -- state at beginning of transition
        next -- attempted new state
        message -- explanation of why the specific transition is not allowed
    """

    def __init__(self, previous, next, message):
        self.previous = previous
        self.next = next
        self.message = message
3692	\end{verbatim}
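
One benefit of the common base class is that a caller can handle every
error of the module with a single except clause.  The sketch below
repeats the classes in shortened form so that it is self-contained;
\function{describe()} and the three small functions it is applied to
are hypothetical.

\begin{verbatim}
import sys

class Error(Exception):
    """Base class for exceptions in this module."""
    pass

class InputError(Error):
    def __init__(self, expression, message):
        self.expression = expression
        self.message = message

class TransitionError(Error):
    def __init__(self, previous, next, message):
        self.previous = previous
        self.next = next
        self.message = message

def describe(action):
    """Call action() and report any error derived from Error."""
    try:
        action()
    except Error:                      # matches every subclass as well
        err = sys.exc_info()[1]        # the exception instance
        return '%s: %s' % (err.__class__.__name__, err.message)
    return 'ok'

def bad_input():
    raise InputError('1 +', 'incomplete expression')

def bad_transition():
    raise TransitionError('closed', 'reading', 'file is closed')

def fine():
    pass

results = [describe(bad_input), describe(bad_transition), describe(fine)]
\end{verbatim}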
3693
3694	Most exceptions are defined with names that end in ``Error,'' similar
3695	to the naming of the standard exceptions.
3696
3697	Many standard modules define their own exceptions to report errors
3698	that may occur in functions they define. More information on classes
3699	is presented in chapter \ref{classes}, ``Classes.''
3700
3701
3702	\section{Defining Clean-up Actions \label{cleanup}}
3703
3704	The \keyword{try} statement has another optional clause which is
3705	intended to define clean-up actions that must be executed under all
3706	circumstances. For example:
3707
3708	\begin{verbatim}
>>> try:
...     raise KeyboardInterrupt
... finally:
...     print 'Goodbye, world!'
...
Goodbye, world!
Traceback (most recent call last):
  File "<stdin>", line 2, in ?
KeyboardInterrupt
3718	\end{verbatim}
3719
3720	A \emph{finally clause} is always executed before leaving the
3721	\keyword{try} statement, whether an exception has occurred or not.
3722	When an exception has occurred in the \keyword{try} clause and has not
been handled by an \keyword{except} clause (or it has occurred in an
3724	\keyword{except} or \keyword{else} clause), it is re-raised after the
3725	\keyword{finally} clause has been executed. The \keyword{finally} clause
3726	is also executed ``on the way out'' when any other clause of the
3727	\keyword{try} statement is left via a \keyword{break}, \keyword{continue}
3728	or \keyword{return} statement. A more complicated example:
3729
3730	\begin{verbatim}
>>> def divide(x, y):
...     try:
...         result = x / y
...     except ZeroDivisionError:
...         print "division by zero!"
...     else:
...         print "result is", result
...     finally:
...         print "executing finally clause"
...
>>> divide(2, 1)
result is 2
executing finally clause
>>> divide(2, 0)
division by zero!
executing finally clause
>>> divide("2", "1")
executing finally clause
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 3, in divide
TypeError: unsupported operand type(s) for /: 'str' and 'str'
3753	\end{verbatim}
3754
As you can see, the \keyword{finally} clause is executed in any
event.  The \exception{TypeError} raised by dividing two strings
is not handled by the \keyword{except} clause and is therefore
re-raised after the \keyword{finally} clause has been executed.
3759
3760	In real world applications, the \keyword{finally} clause is useful
3761	for releasing external resources (such as files or network connections),
3762	regardless of whether the use of the resource was successful.
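
For a file, that pattern might look like the sketch below.  The
scratch file, the function \function{write_then_fail()} and the
simulated failure are all made up for illustration.

\begin{verbatim}
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)

closed = []    # records whether the file was closed in the finally clause

def write_then_fail(path):
    f = open(path, 'w')
    try:
        f.write('partial data\n')
        raise RuntimeError('simulated failure')
    finally:
        f.close()                  # runs although the write step failed
        closed.append(f.closed)

try:
    write_then_fail(path)
    failed = False
except RuntimeError:
    failed = True                  # re-raised after the finally clause ran

os.remove(path)
\end{verbatim}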
3763
3764
3765	\section{Predefined Clean-up Actions \label{cleanup-with}}
3766
3767	Some objects define standard clean-up actions to be undertaken when
3768	the object is no longer needed, regardless of whether or not the
3769	operation using the object succeeded or failed.
3770	Look at the following example, which tries to open a file and print
3771	its contents to the screen.
3772
3773	\begin{verbatim}
for line in open("myfile.txt"):
    print line
3776	\end{verbatim}
3777
3778	The problem with this code is that it leaves the file open for an
3779	indeterminate amount of time after the code has finished executing.
3780	This is not an issue in simple scripts, but can be a problem for
3781	larger applications. The \keyword{with} statement allows
3782	objects like files to be used in a way that ensures they are
3783	always cleaned up promptly and correctly.
3784
3785	\begin{verbatim}
with open("myfile.txt") as f:
    for line in f:
        print line
3789	\end{verbatim}
3790
3791	After the statement is executed, the file \var{f} is always closed,
3792	even if a problem was encountered while processing the lines. Other
3793	objects which provide predefined clean-up actions will indicate
3794	this in their documentation.
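
The claim can be checked through the file object's \member{closed}
attribute.  The sketch below uses a made-up scratch file and a
simulated failure.  It assumes the \keyword{with} statement is
enabled; on Python 2.5 this requires the line
\code{from __future__ import with_statement} at the top of the module.

\begin{verbatim}
# On Python 2.5, "from __future__ import with_statement" must be the
# first statement of the module for this to compile.
import os
import tempfile

fd, path = tempfile.mkstemp()
os.close(fd)
f = open(path, 'w')
f.write('first\nsecond\n')
f.close()

# Normal case: the file is closed when the block ends.
with open(path) as f:
    lines = [line.rstrip('\n') for line in f]
closed_normally = f.closed

# Error case: the file is closed before the exception propagates.
try:
    with open(path) as f:
        raise RuntimeError('simulated failure')
except RuntimeError:
    closed_after_error = f.closed

os.remove(path)
\end{verbatim}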
3795
3796
3797	\chapter{Classes \label{classes}}
3798
3799	Python's class mechanism adds classes to the language with a minimum
3800	of new syntax and semantics. It is a mixture of the class mechanisms
3801	found in \Cpp{} and Modula-3. As is true for modules, classes in Python
3802	do not put an absolute barrier between definition and user, but rather
3803	rely on the politeness of the user not to ``break into the
3804	definition.'' The most important features of classes are retained
3805	with full power, however: the class inheritance mechanism allows
3806	multiple base classes, a derived class can override any methods of its
3807	base class or classes, and a method can call the method of a base class with the
3808	same name. Objects can contain an arbitrary amount of private data.
3809
3810	In \Cpp{} terminology, all class members (including the data members) are
3811	\emph{public}, and all member functions are \emph{virtual}. There are
3812	no special constructors or destructors. As in Modula-3, there are no
3813	shorthands for referencing the object's members from its methods: the
3814	method function is declared with an explicit first argument
3815	representing the object, which is provided implicitly by the call. As
3816	in Smalltalk, classes themselves are objects, albeit in the wider
3817	sense of the word: in Python, all data types are objects. This
3818	provides semantics for importing and renaming. Unlike
3819	\Cpp{} and Modula-3, built-in types can be used as base classes for
3820	extension by the user. Also, like in \Cpp{} but unlike in Modula-3, most
3821	built-in operators with special syntax (arithmetic operators,
3822	subscripting etc.) can be redefined for class instances.
3823
3824	\section{A Word About Terminology \label{terminology}}
3825
3826	Lacking universally accepted terminology to talk about classes, I will
3827	make occasional use of Smalltalk and \Cpp{} terms. (I would use Modula-3
3828	terms, since its object-oriented semantics are closer to those of
3829	Python than \Cpp, but I expect that few readers have heard of it.)
3830
3831	Objects have individuality, and multiple names (in multiple scopes)
3832	can be bound to the same object. This is known as aliasing in other
3833	languages. This is usually not appreciated on a first glance at
3834	Python, and can be safely ignored when dealing with immutable basic
3835	types (numbers, strings, tuples). However, aliasing has an
3836	(intended!) effect on the semantics of Python code involving mutable
3837	objects such as lists, dictionaries, and most types representing
3838	entities outside the program (files, windows, etc.). This is usually
3839	used to the benefit of the program, since aliases behave like pointers
3840	in some respects. For example, passing an object is cheap since only
3841	a pointer is passed by the implementation; and if a function modifies
3842	an object passed as an argument, the caller will see the change --- this
3843	eliminates the need for two different argument passing mechanisms as in
3844	Pascal.
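
A short sketch of aliasing with a mutable object follows; the names
are arbitrary.

\begin{verbatim}
a = [1, 2, 3]
b = a                  # b is a second name for the same list, not a copy
b.append(4)            # the change is visible through both names

def add_item(items, item):
    items.append(item) # modifies the caller's list in place

add_item(a, 5)
same_object = a is b
\end{verbatim}

After these statements both \code{a} and \code{b} refer to the list
\code{[1, 2, 3, 4, 5]}.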
3845
3846
\section{Python Scopes and Namespaces \label{scopes}}
3848
3849	Before introducing classes, I first have to tell you something about
3850	Python's scope rules. Class definitions play some neat tricks with
3851	namespaces, and you need to know how scopes and namespaces work to
3852	fully understand what's going on. Incidentally, knowledge about this
3853	subject is useful for any advanced Python programmer.
3854
3855	Let's begin with some definitions.
3856
3857	A \emph{namespace} is a mapping from names to objects. Most
3858	namespaces are currently implemented as Python dictionaries, but
3859	that's normally not noticeable in any way (except for performance),
3860	and it may change in the future. Examples of namespaces are: the set
3861	of built-in names (functions such as \function{abs()}, and built-in
3862	exception names); the global names in a module; and the local names in
a function invocation.  In a sense the set of attributes of an object
also forms a namespace.  The important thing to know about namespaces
3865	is that there is absolutely no relation between names in different
3866	namespaces; for instance, two different modules may both define a
3867	function ``maximize'' without confusion --- users of the modules must
3868	prefix it with the module name.
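
The standard library offers a ready example: the \module{math} and
\module{cmath} modules each define a function named
\function{sqrt()}.  The sketch below shows that the module prefix
selects which one is meant.

\begin{verbatim}
import cmath
import math

real_root = math.sqrt(4)         # math's sqrt works on real numbers
complex_root = cmath.sqrt(-1)    # cmath's sqrt works on complex numbers

different = math.sqrt is not cmath.sqrt   # two unrelated functions
\end{verbatim}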
3869
3870	By the way, I use the word \emph{attribute} for any name following a
3871	dot --- for example, in the expression \code{z.real}, \code{real} is
3872	an attribute of the object \code{z}. Strictly speaking, references to
3873	names in modules are attribute references: in the expression
3874	\code{modname.funcname}, \code{modname} is a module object and
3875	\code{funcname} is an attribute of it. In this case there happens to
3876	be a straightforward mapping between the module's attributes and the
3877	global names defined in the module: they share the same namespace!
3878	\footnote{
3879	Except for one thing. Module objects have a secret read-only
3880	attribute called \member{__dict__} which returns the dictionary
3881	used to implement the module's namespace; the name
3882	\member{__dict__} is an attribute but not a global name.
3883	Obviously, using this violates the abstraction of namespace
3884	implementation, and should be restricted to things like
3885	post-mortem debuggers.
3886	}
3887
3888	Attributes may be read-only or writable. In the latter case,
3889	assignment to attributes is possible. Module attributes are writable:
3890	you can write \samp{modname.the_answer = 42}. Writable attributes may
3891	also be deleted with the \keyword{del} statement. For example,
3892	\samp{del modname.the_answer} will remove the attribute
3893	\member{the_answer} from the object named by \code{modname}.
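
As a sketch, the same operations can be tried on an empty module
object created with \class{types.ModuleType}, which stands in here for
an imported module named \code{modname}.

\begin{verbatim}
import types

modname = types.ModuleType('modname')       # stand-in for an imported module

modname.the_answer = 42                     # module attributes are writable
stored = modname.__dict__['the_answer']     # the name is in the namespace

del modname.the_answer                      # writable attributes can be deleted
remaining = hasattr(modname, 'the_answer')
\end{verbatim}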
3894
Namespaces are created at different moments and have different
3896	lifetimes. The namespace containing the built-in names is created
3897	when the Python interpreter starts up, and is never deleted. The
3898	global namespace for a module is created when the module definition
3899	is read in; normally, module namespaces also last until the
3900	interpreter quits. The statements executed by the top-level
3901	invocation of the interpreter, either read from a script file or
3902	interactively, are considered part of a module called
3903	\module{__main__}, so they have their own global namespace. (The
3904	built-in names actually also live in a module; this is called
3905	\module{__builtin__}.)
3906
3907	The local namespace for a function is created when the function is
3908	called, and deleted when the function returns or raises an exception
3909	that is not handled within the function. (Actually, forgetting would
3910	be a better way to describe what actually happens.) Of course,
3911	recursive invocations each have their own local namespace.
3912
3913	A \emph{scope} is a textual region of a Python program where a
3914	namespace is directly accessible. ``Directly accessible'' here means
3915	that an unqualified reference to a name attempts to find the name in
3916	the namespace.
3917
3918	Although scopes are determined statically, they are used dynamically.
At any time during execution, there are at least three nested scopes whose
namespaces are directly accessible: the innermost scope, which is searched
first, contains the local names; the scopes of any enclosing functions
are searched next, starting with the nearest enclosing scope; the middle
scope, searched after those, contains the current module's global names;
and the outermost scope, searched last, is the namespace containing
built-in names.
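
The search order can be observed with a few small functions.  This is
a sketch with arbitrary names; each function returns a value found in
a different scope.

\begin{verbatim}
name = 'global'

def uses_local():
    name = 'local'
    return name            # found in the innermost (local) scope

def outer():
    name = 'enclosing'
    def inner():
        return name        # not local: found in the enclosing function
    return inner()

def uses_global():
    return name            # not local, no enclosing function: module global

def uses_builtin():
    return len('abc')      # len is found last, among the built-in names

found = [uses_local(), outer(), uses_global(), uses_builtin()]
\end{verbatim}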
3926
3927	If a name is declared global, then all references and assignments go
3928	directly to the middle scope containing the module's global names.
3929	Otherwise, all variables found outside of the innermost scope are read-only
3930	(an attempt to write to such a variable will simply create a \emph{new}
3931	local variable in the innermost scope, leaving the identically named
3932	outer variable unchanged).
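
The effect of the \keyword{global} statement can be seen in the
following sketch, which is meant to be run as a module or script; the
names are arbitrary.

\begin{verbatim}
counter = 0

def assign_local():
    counter = 10               # creates a new local name
    return counter

def assign_global():
    global counter             # assignments now go to the module's namespace
    counter = counter + 1
    return counter

local_result = assign_local()
after_local = counter          # the global name is unchanged

assign_global()
after_global = counter         # the global name was rebound
\end{verbatim}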
3933
3934	Usually, the local scope references the local names of the (textually)
3935	current function. Outside functions, the local scope references
3936	the same namespace as the global scope: the module's namespace.
3937	Class definitions place yet another namespace in the local scope.
3938
3939	It is important to realize that scopes are determined textually: the
3940	global scope of a function defined in a module is that module's
3941	namespace, no matter from where or by what alias the function is
3942	called. On the other hand, the actual search for names is done
3943	dynamically, at run time --- however, the language definition is
3944	evolving towards static name resolution, at ``compile'' time, so don't
3945	rely on dynamic name resolution! (In fact, local variables are
3946	already determined statically.)
3947
3948	A special quirk of Python is that assignments always go into the
3949	innermost scope. Assignments do not copy data --- they just
3950	bind names to objects. The same is true for deletions: the statement
3951	\samp{del x} removes the binding of \code{x} from the namespace
3952	referenced by the local scope. In fact, all operations that introduce
3953	new names use the local scope: in particular, import statements and
3954	function definitions bind the module or function name in the local
3955	scope. (The \keyword{global} statement can be used to indicate that
3956	particular variables live in the global scope.)
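
The built-in function \function{locals()} makes these bindings
visible.  In the sketch below, \function{local_names()} is a made-up
function that binds names by import, by function definition and by
assignment, and then removes one binding again.

\begin{verbatim}
def local_names():
    import math            # binds the name "math" in the local scope
    def helper():          # binds the name "helper" in the local scope
        pass
    value = 1              # binds the name "value" ...
    del value              # ... and this removes the binding again
    return sorted(locals().keys())

names = local_names()
\end{verbatim}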
3957
3958
3959	\section{A First Look at Classes \label{firstClasses}}
3960
3961	Classes introduce a little bit of new syntax, three new object types,
3962	and some new semantics.
3963
3964
3965	\subsection{Class Definition Syntax \label{classDefinition}}
3966
3967	The simplest form of class definition looks like this:
3968
3969	\begin{verbatim}
class ClassName:
    <statement-1>
    .
    .
    .
    <statement-N>
3976	\end{verbatim}
3977
Class definitions, like function definitions
(\keyword{def} statements), must be executed before they have any
3980	effect. (You could conceivably place a class definition in a branch
3981	of an \keyword{if} statement, or inside a function.)
3982
3983	In practice, the statements inside a class definition will usually be
3984	function definitions, but other statements are allowed, and sometimes
3985	useful --- we'll come back to this later. The function definitions
3986	inside a class normally have a peculiar form of argument list,
3987	dictated by the calling conventions for methods --- again, this is
3988	explained later.
3989
3990	When a class definition is entered, a new namespace is created, and
3991	used as the local scope --- thus, all assignments to local variables
3992	go into this new namespace. In particular, function definitions bind
3993	the name of the new function here.
3994
3995	When a class definition is left normally (via the end), a \emph{class
3996	object} is created. This is basically a wrapper around the contents
3997	of the namespace created by the class definition; we'll learn more
3998	about class objects in the next section. The original local scope
3999	(the one in effect just before the class definition was entered) is
4000	reinstated, and the class object is bound here to the class name given
4001	in the class definition header (\class{ClassName} in the example).
4002
4003
4004	\subsection{Class Objects \label{classObjects}}
4005
4006	Class objects support two kinds of operations: attribute references
4007	and instantiation.
4008
4009	\emph{Attribute references} use the standard syntax used for all
4010	attribute references in Python: \code{obj.name}. Valid attribute
4011	names are all the names that were in the class's namespace when the
4012	class object was created. So, if the class definition looked like
4013	this:
4014
4015	\begin{verbatim}
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'
4021	\end{verbatim}
4022
4023	then \code{MyClass.i} and \code{MyClass.f} are valid attribute
4024	references, returning an integer and a function object, respectively.
4025	Class attributes can also be assigned to, so you can change the value
4026	of \code{MyClass.i} by assignment. \member{__doc__} is also a valid
4027	attribute, returning the docstring belonging to the class: \code{"A
4028	simple example class"}.
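
As a minimal sketch of the attribute references just described, the
following repeats the class above and then reads and rebinds its
attributes. The replacement value \code{54321} and the variable names
\code{original_i} and \code{docstring} are arbitrary choices made for
illustration:

\begin{verbatim}
class MyClass:
    "A simple example class"
    i = 12345
    def f(self):
        return 'hello world'

# Reading class attributes
original_i = MyClass.i         # the integer 12345
docstring = MyClass.__doc__    # the class docstring

# Class attributes can be rebound by assignment
MyClass.i = 54321
\end{verbatim}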
4029
4030	Class \emph{instantiation} uses function notation. Just pretend that
4031	the class object is a parameterless function that returns a new
4032	instance of the class. For example (assuming the above class):
4033
4034	\begin{verbatim}
4035	x = MyClass()
4036	\end{verbatim}
4037
4038	creates a new \emph{instance} of the class and assigns this object to
4039	the local variable \code{x}.
4040
The instantiation operation (``calling'' a class object) creates an
empty object. Many classes like to create instances that are
customized to a specific initial state.
4044	Therefore a class may define a special method named
4045	\method{__init__()}, like this:
4046
\begin{verbatim}
    def __init__(self):
        self.data = []
\end{verbatim}
4051
4052	When a class defines an \method{__init__()} method, class
4053	instantiation automatically invokes \method{__init__()} for the
4054	newly-created class instance. So in this example, a new, initialized
4055	instance can be obtained by:
4056
4057	\begin{verbatim}
4058	x = MyClass()
4059	\end{verbatim}
4060
4061	Of course, the \method{__init__()} method may have arguments for
4062	greater flexibility. In that case, arguments given to the class
4063	instantiation operator are passed on to \method{__init__()}. For
4064	example,
4065
\begin{verbatim}
>>> class Complex:
...     def __init__(self, realpart, imagpart):
...         self.r = realpart
...         self.i = imagpart
...
>>> x = Complex(3.0, -4.5)
>>> x.r, x.i
(3.0, -4.5)
\end{verbatim}
4076
4077
4078	\subsection{Instance Objects \label{instanceObjects}}
4079
4080	Now what can we do with instance objects? The only operations
4081	understood by instance objects are attribute references. There are
4082	two kinds of valid attribute names, data attributes and methods.
4083
\emph{Data attributes} correspond to
4085	``instance variables'' in Smalltalk, and to ``data members'' in
4086	\Cpp. Data attributes need not be declared; like local variables,
4087	they spring into existence when they are first assigned to. For
4088	example, if \code{x} is the instance of \class{MyClass} created above,
4089	the following piece of code will print the value \code{16}, without
4090	leaving a trace:
4091
\begin{verbatim}
x.counter = 1
while x.counter < 10:
    x.counter = x.counter * 2
print x.counter
del x.counter
\end{verbatim}
4099
4100	The other kind of instance attribute reference is a \emph{method}.
4101	A method is a function that ``belongs to'' an
4102	object. (In Python, the term method is not unique to class instances:
4103	other object types can have methods as well. For example, list objects have
4104	methods called append, insert, remove, sort, and so on. However,
4105	in the following discussion, we'll use the term method exclusively to mean
4106	methods of class instance objects, unless explicitly stated otherwise.)
4107
4108	Valid method names of an instance object depend on its class. By
4109	definition, all attributes of a class that are function
4110	objects define corresponding methods of its instances. So in our
4111	example, \code{x.f} is a valid method reference, since
4112	\code{MyClass.f} is a function, but \code{x.i} is not, since
4113	\code{MyClass.i} is not. But \code{x.f} is not the same thing as
4114	\code{MyClass.f} --- it is a \obindex{method}\emph{method object}, not
4115	a function object.
4116
4117
4118	\subsection{Method Objects \label{methodObjects}}
4119
4120	Usually, a method is called right after it is bound:
4121
4122	\begin{verbatim}
4123	x.f()
4124	\end{verbatim}
4125
4126	In the \class{MyClass} example, this will return the string \code{'hello world'}.
4127	However, it is not necessary to call a method right away:
4128	\code{x.f} is a method object, and can be stored away and called at a
4129	later time. For example:
4130
\begin{verbatim}
xf = x.f
while True:
    print xf()
\end{verbatim}
4136
4137	will continue to print \samp{hello world} until the end of time.
4138
4139	What exactly happens when a method is called? You may have noticed
4140	that \code{x.f()} was called without an argument above, even though
4141	the function definition for \method{f} specified an argument. What
4142	happened to the argument? Surely Python raises an exception when a
4143	function that requires an argument is called without any --- even if
4144	the argument isn't actually used...
4145
4146	Actually, you may have guessed the answer: the special thing about
4147	methods is that the object is passed as the first argument of the
4148	function. In our example, the call \code{x.f()} is exactly equivalent
4149	to \code{MyClass.f(x)}. In general, calling a method with a list of
4150	\var{n} arguments is equivalent to calling the corresponding function
4151	with an argument list that is created by inserting the method's object
4152	before the first argument.
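
The equivalence can be checked with a method that takes an argument.
This is only a sketch; the class \class{Greeter} and its method
\method{greet()} are hypothetical names invented for the illustration:

\begin{verbatim}
class Greeter:
    def __init__(self, greeting):
        self.greeting = greeting
    def greet(self, name):
        return self.greeting + ', ' + name

g = Greeter('hello')

# Method call with one argument
via_method = g.greet('world')

# The same function, with the instance inserted before the first argument
via_function = Greeter.greet(g, 'world')
\end{verbatim}

Both calls produce the same string.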
4153
4154	If you still don't understand how methods work, a look at the
4155	implementation can perhaps clarify matters. When an instance
4156	attribute is referenced that isn't a data attribute, its class is
4157	searched. If the name denotes a valid class attribute that is a
4158	function object, a method object is created by packing (pointers to)
4159	the instance object and the function object just found together in an
4160	abstract object: this is the method object. When the method object is
4161	called with an argument list, it is unpacked again, a new argument
4162	list is constructed from the instance object and the original argument
4163	list, and the function object is called with this new argument list.
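
The packing and unpacking can be imitated in plain Python. The sketch
below is not how the interpreter is implemented; \class{BoundCall},
\class{Thing} and \function{describe()} are hypothetical names, and the
class merely mimics what a method object does when it is called:

\begin{verbatim}
class BoundCall:
    "Hypothetical stand-in for a method object: an instance plus a function"
    def __init__(self, instance, function):
        self.instance = instance
        self.function = function
    def __call__(self, *args):
        # New argument list: the instance first, then the original arguments
        return self.function(self.instance, *args)

def describe(self, suffix):
    # A plain function whose first parameter plays the role of self
    return self.name + suffix

class Thing:
    pass

t = Thing()
t.name = 'spam'
m = BoundCall(t, describe)    # pack the instance and the function together
result = m('!')               # equivalent to describe(t, '!')
\end{verbatim}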
4164
4165
4166	\section{Random Remarks \label{remarks}}
4167
4168	% [These should perhaps be placed more carefully...]
4169
4170
4171	Data attributes override method attributes with the same name; to
4172	avoid accidental name conflicts, which may cause hard-to-find bugs in
4173	large programs, it is wise to use some kind of convention that
4174	minimizes the chance of conflicts. Possible conventions include
4175	capitalizing method names, prefixing data attribute names with a small
4176	unique string (perhaps just an underscore), or using verbs for methods
4177	and nouns for data attributes.
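
A short sketch of such a conflict, using a hypothetical class
\class{Account} whose method and data attribute share the name
\code{balance}:

\begin{verbatim}
class Account:
    def balance(self):
        return 100

acct = Account()
before = acct.balance()    # the method is found through the class

acct.balance = 250         # a data attribute with the same name
after = acct.balance       # now the integer; the method is hidden here

other = Account().balance()    # other instances are unaffected
\end{verbatim}

After the assignment, \code{acct.balance()} would fail because an
integer is not callable, which is exactly the kind of bug a naming
convention helps to avoid.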
4178
4179
4180	Data attributes may be referenced by methods as well as by ordinary
4181	users (``clients'') of an object. In other words, classes are not
4182	usable to implement pure abstract data types. In fact, nothing in
4183	Python makes it possible to enforce data hiding --- it is all based
4184	upon convention. (On the other hand, the Python implementation,
4185	written in C, can completely hide implementation details and control
4186	access to an object if necessary; this can be used by extensions to
4187	Python written in C.)
4188
4189
4190	Clients should use data attributes with care --- clients may mess up
4191	invariants maintained by the methods by stamping on their data
4192	attributes. Note that clients may add data attributes of their own to
4193	an instance object without affecting the validity of the methods, as
4194	long as name conflicts are avoided --- again, a naming convention can
4195	save a lot of headaches here.
4196
4197
4198	There is no shorthand for referencing data attributes (or other
4199	methods!) from within methods. I find that this actually increases
4200	the readability of methods: there is no chance of confusing local
4201	variables and instance variables when glancing through a method.
4202
4203
4204	Often, the first argument of a method is called
4205	\code{self}. This is nothing more than a convention: the name
4206	\code{self} has absolutely no special meaning to Python. (Note,
4207	however, that by not following the convention your code may be less
4208	readable to other Python programmers, and it is also conceivable that
4209	a \emph{class browser} program might be written that relies upon such a
4210	convention.)
4211
4212
4213	Any function object that is a class attribute defines a method for
4214	instances of that class. It is not necessary that the function
4215	definition is textually enclosed in the class definition: assigning a
4216	function object to a local variable in the class is also ok. For
4217	example:
4218
\begin{verbatim}
# Function defined outside the class
def f1(self, x, y):
    return min(x, x+y)

class C:
    f = f1
    def g(self):
        return 'hello world'
    h = g
\end{verbatim}
4230
4231	Now \code{f}, \code{g} and \code{h} are all attributes of class
4232	\class{C} that refer to function objects, and consequently they are all
4233	methods of instances of \class{C} --- \code{h} being exactly equivalent
4234	to \code{g}. Note that this practice usually only serves to confuse
4235	the reader of a program.
4236
4237
4238	Methods may call other methods by using method attributes of the
4239	\code{self} argument:
4240
\begin{verbatim}
class Bag:
    def __init__(self):
        self.data = []
    def add(self, x):
        self.data.append(x)
    def addtwice(self, x):
        self.add(x)
        self.add(x)
\end{verbatim}
4251
4252	Methods may reference global names in the same way as ordinary
4253	functions. The global scope associated with a method is the module
4254	containing the class definition. (The class itself is never used as a
4255	global scope!) While one rarely encounters a good reason for using
4256	global data in a method, there are many legitimate uses of the global
4257	scope: for one thing, functions and modules imported into the global
4258	scope can be used by methods, as well as functions and classes defined
4259	in it. Usually, the class containing the method is itself defined in
4260	this global scope, and in the next section we'll find some good
4261	reasons why a method would want to reference its own class!
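
As an illustrative sketch of methods using the global scope, the
hypothetical class \class{Point} below calls a module-level function,
uses an imported module, and refers to its own class by its global name:

\begin{verbatim}
import math

def hypotenuse(a, b):
    # An ordinary function defined in the module's global scope
    return math.sqrt(a*a + b*b)

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def distance_from_origin(self):
        # hypotenuse is looked up in the global scope of the module
        return hypotenuse(self.x, self.y)
    def mirrored(self):
        # The class name itself is a global name as well
        return Point(-self.x, -self.y)

p = Point(3, 4)
distance = p.distance_from_origin()
q = p.mirrored()
\end{verbatim}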
4262
4263
4264	\section{Inheritance \label{inheritance}}
4265
4266	Of course, a language feature would not be worthy of the name ``class''
4267	without supporting inheritance. The syntax for a derived class
4268	definition looks like this:
4269
\begin{verbatim}
class DerivedClassName(BaseClassName):
    <statement-1>
    .
    .
    .
    <statement-N>
\end{verbatim}
4278
4279	The name \class{BaseClassName} must be defined in a scope containing
4280	the derived class definition. In place of a base class name, other
4281	arbitrary expressions are also allowed. This can be useful, for
4282	example, when the base class is defined in another module:
4283
4284	\begin{verbatim}
4285	class DerivedClassName(modname.BaseClassName):
4286	\end{verbatim}
4287
4288	Execution of a derived class definition proceeds the same as for a
4289	base class. When the class object is constructed, the base class is
4290	remembered. This is used for resolving attribute references: if a
4291	requested attribute is not found in the class, the search proceeds to look in the
4292	base class. This rule is applied recursively if the base class itself
4293	is derived from some other class.
4294
4295	There's nothing special about instantiation of derived classes:
4296	\code{DerivedClassName()} creates a new instance of the class. Method
4297	references are resolved as follows: the corresponding class attribute
is searched, descending the chain of base classes if necessary,
4299	and the method reference is valid if this yields a function object.
4300
4301	Derived classes may override methods of their base classes. Because
4302	methods have no special privileges when calling other methods of the
4303	same object, a method of a base class that calls another method
4304	defined in the same base class may end up calling a method of
4305	a derived class that overrides it. (For \Cpp{} programmers: all methods
4306	in Python are effectively \keyword{virtual}.)
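
A small sketch of this behaviour, with hypothetical classes
\class{Base} and \class{Derived}: the method \method{describe()} is
defined only in the base class, yet it ends up calling the overriding
\method{name()} when invoked on a derived instance.

\begin{verbatim}
class Base:
    def name(self):
        return 'base'
    def describe(self):
        # name() is looked up on the instance, so an override is used
        return 'I am ' + self.name()

class Derived(Base):
    def name(self):
        return 'derived'

base_text = Base().describe()
derived_text = Derived().describe()
\end{verbatim}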
4307
4308	An overriding method in a derived class may in fact want to extend
4309	rather than simply replace the base class method of the same name.
4310	There is a simple way to call the base class method directly: just
4311	call \samp{BaseClassName.methodname(self, arguments)}. This is
4312	occasionally useful to clients as well. (Note that this only works if
4313	the base class is defined or imported directly in the global scope.)
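
For instance, an overriding method might reuse the result of the base
class method and then add to it. The classes \class{Logger} and
\class{LoudLogger} in this sketch are hypothetical:

\begin{verbatim}
class Logger:
    def format(self, message):
        return 'LOG: ' + message

class LoudLogger(Logger):
    def format(self, message):
        # Extend rather than replace: call the base class method directly
        base_text = Logger.format(self, message)
        return base_text.upper()

text = LoudLogger().format('disk full')
\end{verbatim}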
4314
4315
4316	\subsection{Multiple Inheritance \label{multiple}}
4317
4318	Python supports a limited form of multiple inheritance as well. A
4319	class definition with multiple base classes looks like this:
4320
\begin{verbatim}
class DerivedClassName(Base1, Base2, Base3):
    <statement-1>
    .
    .
    .
    <statement-N>
\end{verbatim}
4329
4330	The only rule necessary to explain the semantics is the resolution
4331	rule used for class attribute references. This is depth-first,
4332	left-to-right. Thus, if an attribute is not found in
4333	\class{DerivedClassName}, it is searched in \class{Base1}, then
4334	(recursively) in the base classes of \class{Base1}, and only if it is
4335	not found there, it is searched in \class{Base2}, and so on.
4336
4337	(To some people breadth first --- searching \class{Base2} and
4338	\class{Base3} before the base classes of \class{Base1} --- looks more
4339	natural. However, this would require you to know whether a particular
4340	attribute of \class{Base1} is actually defined in \class{Base1} or in
4341	one of its base classes before you can figure out the consequences of
4342	a name conflict with an attribute of \class{Base2}. The depth-first
rule makes no distinction between direct and inherited attributes of
4344	\class{Base1}.)
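
The resolution rule can be observed with a small hierarchy. The class
\class{Root1} and the method names are hypothetical, and the classes
deliberately share no common base class, so the sketch sidesteps the
problem discussed in the next paragraph:

\begin{verbatim}
class Root1:
    def who(self):
        return 'Root1'

class Base1(Root1):
    pass

class Base2:
    def who(self):
        return 'Base2'
    def only_in_base2(self):
        return 'Base2 only'

class Derived(Base1, Base2):
    pass

d = Derived()

# Base1 has no who(), so its base class Root1 is searched before Base2
found = d.who()

# Not found in Base1 or Root1, so Base2 supplies it
fallback = d.only_in_base2()
\end{verbatim}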
4345
4346	It is clear that indiscriminate use of multiple inheritance is a
4347	maintenance nightmare, given the reliance in Python on conventions to
4348	avoid accidental name conflicts. A well-known problem with multiple
4349	inheritance is a class derived from two classes that happen to have a
4350	common base class. While it is easy enough to figure out what happens
4351	in this case (the instance will have a single copy of ``instance
4352	variables'' or data attributes used by the common base class), it is
4353	not clear that these semantics are in any way useful.
4354
4355	%% XXX Add rules for new-style MRO?
4356
4357	\section{Private Variables \label{private}}
4358
4359	There is limited support for class-private
4360	identifiers. Any identifier of the form \code{__spam} (at least two
4361	leading underscores, at most one trailing underscore) is textually
4362	replaced with \code{_classname__spam}, where \code{classname} is the
4363	current class name with leading underscore(s) stripped. This mangling
4364	is done without regard to the syntactic position of the identifier, so
it can be used to define class-private instance and class variables,
methods, and variables stored in globals, and even to store instance
variables private to this class on instances of \emph{other} classes. Truncation
4368	may occur when the mangled name would be longer than 255 characters.
4369	Outside classes, or when the class name consists of only underscores,
4370	no mangling occurs.
4371
4372	Name mangling is intended to give classes an easy way to define
4373	``private'' instance variables and methods, without having to worry
4374	about instance variables defined by derived classes, or mucking with
4375	instance variables by code outside the class. Note that the mangling
4376	rules are designed mostly to avoid accidents; it still is possible for
4377	a determined soul to access or modify a variable that is considered
4378	private. This can even be useful in special circumstances, such as in
4379	the debugger, and that's one reason why this loophole is not closed.
4380	(Buglet: derivation of a class with the same name as the base class
4381	makes use of private variables of the base class possible.)
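
The effect of the mangling can be seen by inspecting an instance. In
this sketch the classes \class{Counter} and \class{LabelledCounter} are
hypothetical; both use the identifier \code{__count}, yet the two
attributes do not collide because each is mangled with the name of the
class in which it appears:

\begin{verbatim}
class Counter:
    def __init__(self):
        self.__count = 0          # stored as _Counter__count
    def increment(self):
        self.__count = self.__count + 1
        return self.__count

class LabelledCounter(Counter):
    def __init__(self, label):
        Counter.__init__(self)
        self.__count = label      # stored as _LabelledCounter__count

c = LabelledCounter('widgets')
c.increment()

# The instance really holds two differently named attributes
mangled_names = sorted(vars(c).keys())

# A determined soul can still reach the "private" value
peeked = c._Counter__count
\end{verbatim}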
4382
4383	Notice that code passed to \code{exec}, \code{eval()} or
4384	\code{execfile()} does not consider the classname of the invoking
4385	class to be the current class; this is similar to the effect of the
4386	\code{global} statement, the effect of which is likewise restricted to
4387	code that is byte-compiled together. The same restriction applies to
4388	\code{getattr()}, \code{setattr()} and \code{delattr()}, as well as
4389	when referencing \code{__dict__} directly.
4390
4391
4392	\section{Odds and Ends \label{odds}}
4393
4394	Sometimes it is useful to have a data type similar to the Pascal
4395	``record'' or C ``struct'', bundling together a few named data
4396	items. An empty class definition will do nicely:
4397
\begin{verbatim}
class Employee:
    pass

john = Employee() # Create an empty employee record

# Fill the fields of the record
john.name = 'John Doe'
john.dept = 'computer lab'
john.salary = 1000
\end{verbatim}
4409
4410	A piece of Python code that expects a particular abstract data type
4411	can often be passed a class that emulates the methods of that data
4412	type instead. For instance, if you have a function that formats some
4413	data from a file object, you can define a class with methods
4414	\method{read()} and \method{readline()} that get the data from a string
4415	buffer instead, and pass it as an argument.% (Unfortunately, this
4416	%technique has its limitations: a class can't define operations that
4417	%are accessed by special syntax such as sequence subscripting or
4418	%arithmetic operators, and assigning such a ``pseudo-file'' to
4419	%\code{sys.stdin} will not cause the interpreter to read further input
4420	%from it.)
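
A minimal sketch of this technique follows. The class
\class{StringReader} and the function
\function{first_word_of_each_line()} are hypothetical names; the
function only assumes that its argument has a \method{readline()}
method, so it would work equally well with a real file object:

\begin{verbatim}
class StringReader:
    "Hypothetical file-like class that reads from a string buffer"
    def __init__(self, text):
        self.text = text
        self.pos = 0
    def read(self):
        # Return everything that has not been consumed yet
        rest = self.text[self.pos:]
        self.pos = len(self.text)
        return rest
    def readline(self):
        # Return the next line including its newline, or '' at the end
        end = self.text.find('\n', self.pos)
        if end == -1:
            end = len(self.text)
        else:
            end = end + 1
        line = self.text[self.pos:end]
        self.pos = end
        return line

def first_word_of_each_line(f):
    # Works with any object that offers readline()
    words = []
    line = f.readline()
    while line:
        parts = line.split()
        if parts:
            words.append(parts[0])
        line = f.readline()
    return words

buffer = StringReader('spam eggs\nham\n\nbacon bits')
result = first_word_of_each_line(buffer)
\end{verbatim}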
4421
4422
Instance method objects have attributes, too: \code{m.im_self} is the
instance object to which the method \method{m} is bound, and
\code{m.im_func} is the function object corresponding to the method.
4426
4427
4428	\section{Exceptions Are Classes Too\label{exceptionClasses}}
4429
4430	User-defined exceptions are identified by classes as well. Using this
4431	mechanism it is possible to create extensible hierarchies of exceptions.
4432
4433	There are two new valid (semantic) forms for the raise statement:
4434
4435	\begin{verbatim}
4436	raise Class, instance
4437
4438	raise instance
4439	\end{verbatim}
4440
4441	In the first form, \code{instance} must be an instance of
4442	\class{Class} or of a class derived from it. The second form is a
4443	shorthand for:
4444
4445	\begin{verbatim}
4446	raise instance.__class__, instance
4447	\end{verbatim}
4448
4449	A class in an except clause is compatible with an exception if it is the same
4450	class or a base class thereof (but not the other way around --- an
4451	except clause listing a derived class is not compatible with a base
4452	class). For example, the following code will print B, C, D in that
4453	order:
4454
\begin{verbatim}
class B:
    pass
class C(B):
    pass
class D(C):
    pass

for c in [B, C, D]:
    try:
        raise c()
    except D:
        print "D"
    except C:
        print "C"
    except B:
        print "B"
\end{verbatim}
4473
4474	Note that if the except clauses were reversed (with
4475	\samp{except B} first), it would have printed B, B, B --- the first
4476	matching except clause is triggered.
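
The reversed ordering can be tried out as follows. This is a sketch
under two assumptions that differ from the example above: the classes
derive from the built-in \exception{Exception}, and the matching is
wrapped in a hypothetical helper function \function{first_match()} that
returns the letter instead of printing it:

\begin{verbatim}
class B(Exception):
    pass
class C(B):
    pass
class D(C):
    pass

def first_match(exc_class):
    # except clauses listed with the base class first
    try:
        raise exc_class()
    except B:
        return 'B'
    except C:
        return 'C'    # never reached: B already matches
    except D:
        return 'D'    # never reached: B already matches

matches = [first_match(c) for c in [B, C, D]]
\end{verbatim}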
4477
4478	When an error message is printed for an unhandled exception, the
4479	exception's class name is printed, then a colon and a space, and
4480	finally the instance converted to a string using the built-in function
4481	\function{str()}.
4482
4483
4484	\section{Iterators\label{iterators}}
4485
4486	By now you have probably noticed that most container objects can be looped
4487	over using a \keyword{for} statement:
4488
\begin{verbatim}
for element in [1, 2, 3]:
    print element
for element in (1, 2, 3):
    print element
for key in {'one':1, 'two':2}:
    print key
for char in "123":
    print char
for line in open("myfile.txt"):
    print line
\end{verbatim}
4501
4502	This style of access is clear, concise, and convenient. The use of iterators
4503	pervades and unifies Python. Behind the scenes, the \keyword{for}
4504	statement calls \function{iter()} on the container object. The
4505	function returns an iterator object that defines the method
4506	\method{next()} which accesses elements in the container one at a
4507	time. When there are no more elements, \method{next()} raises a
4508	\exception{StopIteration} exception which tells the \keyword{for} loop
4509	to terminate. This example shows how it all works:
4510
\begin{verbatim}
>>> s = 'abc'
>>> it = iter(s)
>>> it
<iterator object at 0x00A1DB50>
>>> it.next()
'a'
>>> it.next()
'b'
>>> it.next()
'c'
>>> it.next()

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
    it.next()
StopIteration
\end{verbatim}
4529
4530	Having seen the mechanics behind the iterator protocol, it is easy to add
4531	iterator behavior to your classes. Define a \method{__iter__()} method
4532	which returns an object with a \method{next()} method. If the class defines
4533	\method{next()}, then \method{__iter__()} can just return \code{self}:
4534
\begin{verbatim}
class Reverse:
    "Iterator for looping over a sequence backwards"
    def __init__(self, data):
        self.data = data
        self.index = len(data)
    def __iter__(self):
        return self
    def next(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.data[self.index]

>>> for char in Reverse('spam'):
...     print char
...
m
a
p
s
\end{verbatim}
4557
4558
4559	\section{Generators\label{generators}}
4560
4561	Generators are a simple and powerful tool for creating iterators. They are
4562	written like regular functions but use the \keyword{yield} statement whenever
4563	they want to return data. Each time \method{next()} is called, the
generator resumes where it left off (it remembers all the data values and
4565	which statement was last executed). An example shows that generators can
4566	be trivially easy to create:
4567
\begin{verbatim}
def reverse(data):
    for index in range(len(data)-1, -1, -1):
        yield data[index]

>>> for char in reverse('golf'):
...     print char
...
f
l
o
g
\end{verbatim}
4581
4582	Anything that can be done with generators can also be done with class based
4583	iterators as described in the previous section. What makes generators so
4584	compact is that the \method{__iter__()} and \method{next()} methods are
4585	created automatically.
4586
Another key feature is that the local variables and execution state
are automatically saved between calls. This makes the function easier to write
and much clearer than an approach using instance variables like
\code{self.index} and \code{self.data}.
4591
4592	In addition to automatic method creation and saving program state, when
4593	generators terminate, they automatically raise \exception{StopIteration}.
4594	In combination, these features make it easy to create iterators with no
4595	more effort than writing a regular function.
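
The saved state is easiest to see in a generator that accumulates a
value. In this sketch the hypothetical generator
\function{running_total()} keeps its local variable \code{total} alive
from one \keyword{yield} to the next, and the built-in \function{list()}
simply collects values until the generator terminates:

\begin{verbatim}
def running_total(values):
    total = 0                  # local state survives between yields
    for v in values:
        total = total + v
        yield total

totals = list(running_total([1, 2, 3, 4]))
\end{verbatim}

No explicit \exception{StopIteration} is raised anywhere in the
function; falling off the end is enough.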
4596
4597	\section{Generator Expressions\label{genexps}}
4598
4599	Some simple generators can be coded succinctly as expressions using a syntax
4600	similar to list comprehensions but with parentheses instead of brackets. These
4601	expressions are designed for situations where the generator is used right
4602	away by an enclosing function. Generator expressions are more compact but
4603	less versatile than full generator definitions and tend to be more memory
4604	friendly than equivalent list comprehensions.
4605
4606	Examples:
4607
4608	\begin{verbatim}
4609	>>> sum(i*i for i in range(10)) # sum of squares
4610	285
4611
4612	>>> xvec = [10, 20, 30]
4613	>>> yvec = [7, 5, 3]
4614	>>> sum(x*y for x,y in zip(xvec, yvec)) # dot product
4615	260
4616
4617	>>> from math import pi, sin
4618	>>> sine_table = dict((x, sin(x*pi/180)) for x in range(0, 91))
4619
4620	>>> unique_words = set(word for line in page for word in line.split())
4621
4622	>>> valedictorian = max((student.gpa, student.name) for student in graduates)
4623
4624	>>> data = 'golf'
4625	>>> list(data[i] for i in range(len(data)-1,-1,-1))
4626	['f', 'l', 'o', 'g']
4627
4628	\end{verbatim}
4629
4630
4631
4632	\chapter{Brief Tour of the Standard Library \label{briefTour}}
4633
4634
4635	\section{Operating System Interface\label{os-interface}}
4636
4637	The \ulink{\module{os}}{../lib/module-os.html}
4638	module provides dozens of functions for interacting with the
4639	operating system:
4640
4641	\begin{verbatim}
4642	>>> import os
4643	>>> os.system('time 0:02')
4644	0
4645	>>> os.getcwd() # Return the current working directory
4646	'C:\\Python24'
4647	>>> os.chdir('/server/accesslogs')
4648	\end{verbatim}
4649
4650	Be sure to use the \samp{import os} style instead of
4651	\samp{from os import *}. This will keep \function{os.open()} from
4652	shadowing the builtin \function{open()} function which operates much
4653	differently.
4654
4655	\bifuncindex{help}
4656	The builtin \function{dir()} and \function{help()} functions are useful
4657	as interactive aids for working with large modules like \module{os}:
4658
4659	\begin{verbatim}
4660	>>> import os
4661	>>> dir(os)
4662	<returns a list of all module functions>
4663	>>> help(os)
4664	<returns an extensive manual page created from the module's docstrings>
4665	\end{verbatim}
4666
4667	For daily file and directory management tasks, the
4668	\ulink{\module{shutil}}{../lib/module-shutil.html}
4669	module provides a higher level interface that is easier to use:
4670
4671	\begin{verbatim}
4672	>>> import shutil
4673	>>> shutil.copyfile('data.db', 'archive.db')
4674	>>> shutil.move('/build/executables', 'installdir')
4675	\end{verbatim}
4676
4677
4678	\section{File Wildcards\label{file-wildcards}}
4679
4680	The \ulink{\module{glob}}{../lib/module-glob.html}
4681	module provides a function for making file lists from directory
4682	wildcard searches:
4683
4684	\begin{verbatim}
4685	>>> import glob
4686	>>> glob.glob('*.py')
4687	['primes.py', 'random.py', 'quote.py']
4688	\end{verbatim}
4689
4690
4691	\section{Command Line Arguments\label{command-line-arguments}}
4692
4693	Common utility scripts often need to process command line arguments.
4694	These arguments are stored in the
4695	\ulink{\module{sys}}{../lib/module-sys.html}\ module's \var{argv}
4696	attribute as a list. For instance the following output results from
4697	running \samp{python demo.py one two three} at the command line:
4698
4699	\begin{verbatim}
4700	>>> import sys
4701	>>> print sys.argv
4702	['demo.py', 'one', 'two', 'three']
4703	\end{verbatim}
4704
4705	The \ulink{\module{getopt}}{../lib/module-getopt.html}
4706	module processes \var{sys.argv} using the conventions of the \UNIX{}
4707	\function{getopt()} function. More powerful and flexible command line
4708	processing is provided by the
4709	\ulink{\module{optparse}}{../lib/module-optparse.html} module.
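
A short sketch of option processing with \module{getopt} follows. The
helper function \function{parse()}, the option letters and the sample
argument list are assumptions made for the illustration; a real script
would typically pass \code{sys.argv[1:]}:

\begin{verbatim}
import getopt

def parse(argv):
    # 'o:' means -o takes a value; 'v' is a plain flag
    opts, args = getopt.getopt(argv, 'vo:', ['verbose', 'output='])
    verbose = False
    output = None
    for name, value in opts:
        if name in ('-v', '--verbose'):
            verbose = True
        elif name in ('-o', '--output'):
            output = value
    # args holds whatever remains after the options
    return verbose, output, args

result = parse(['-v', '--output=out.txt', 'one', 'two'])
\end{verbatim}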
4710
4711
4712	\section{Error Output Redirection and Program Termination\label{stderr}}
4713
4714	The \ulink{\module{sys}}{../lib/module-sys.html}
4715	module also has attributes for \var{stdin}, \var{stdout}, and
4716	\var{stderr}. The latter is useful for emitting warnings and error
4717	messages to make them visible even when \var{stdout} has been redirected:
4718
4719	\begin{verbatim}
4720	>>> sys.stderr.write('Warning, log file not found starting a new one\n')
4721	Warning, log file not found starting a new one
4722	\end{verbatim}
4723
4724	The most direct way to terminate a script is to use \samp{sys.exit()}.
4725
4726
4727	\section{String Pattern Matching\label{string-pattern-matching}}
4728
4729	The \ulink{\module{re}}{../lib/module-re.html}
4730	module provides regular expression tools for advanced string processing.
4731	For complex matching and manipulation, regular expressions offer succinct,
4732	optimized solutions:
4733
4734	\begin{verbatim}
4735	>>> import re
4736	>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
4737	['foot', 'fell', 'fastest']
4738	>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
4739	'cat in the hat'
4740	\end{verbatim}
4741
4742	When only simple capabilities are needed, string methods are preferred
4743	because they are easier to read and debug:
4744
4745	\begin{verbatim}
4746	>>> 'tea for too'.replace('too', 'two')
4747	'tea for two'
4748	\end{verbatim}
4749
4750	\section{Mathematics\label{mathematics}}
4751
4752	The \ulink{\module{math}}{../lib/module-math.html} module gives
4753	access to the underlying C library functions for floating point math:
4754
4755	\begin{verbatim}
4756	>>> import math
4757	>>> math.cos(math.pi / 4.0)
4758	0.70710678118654757
4759	>>> math.log(1024, 2)
4760	10.0
4761	\end{verbatim}
4762
4763	The \ulink{\module{random}}{../lib/module-random.html}
4764	module provides tools for making random selections:
4765
4766	\begin{verbatim}
4767	>>> import random
4768	>>> random.choice(['apple', 'pear', 'banana'])
4769	'apple'
4770	>>> random.sample(xrange(100), 10) # sampling without replacement
4771	[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
4772	>>> random.random() # random float
4773	0.17970987693706186
4774	>>> random.randrange(6) # random integer chosen from range(6)
4775	4
4776	\end{verbatim}
4777
4778
4779	\section{Internet Access\label{internet-access}}
4780
4781	There are a number of modules for accessing the internet and processing
4782	internet protocols. Two of the simplest are
4783	\ulink{\module{urllib2}}{../lib/module-urllib2.html}
for retrieving data from URLs and
4785	\ulink{\module{smtplib}}{../lib/module-smtplib.html}
4786	for sending mail:
4787
\begin{verbatim}
>>> import urllib2
>>> for line in urllib2.urlopen('http://tycho.usno.navy.mil/cgi-bin/timer.pl'):
...     if 'EST' in line or 'EDT' in line:      # look for Eastern Time
...         print line

<BR>Nov. 25, 09:43:32 PM EST

>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('soothsayer@example.org', 'jcaesar@example.org',
"""To: jcaesar@example.org
From: soothsayer@example.org

Beware the Ides of March.
""")
>>> server.quit()
\end{verbatim}
4806
4807
4808	\section{Dates and Times\label{dates-and-times}}
4809
4810	The \ulink{\module{datetime}}{../lib/module-datetime.html} module
4811	supplies classes for manipulating dates and times in both simple
4812	and complex ways. While date and time arithmetic is supported, the
4813	focus of the implementation is on efficient member extraction for
4814	output formatting and manipulation. The module also supports objects
4815	that are timezone aware.
4816
4817	\begin{verbatim}
4818	# dates are easily constructed and formatted
4819	>>> from datetime import date
4820	>>> now = date.today()
4821	>>> now
4822	datetime.date(2003, 12, 2)
4823	>>> now.strftime("%m-%d-%y. %d %b %Y is a %A on the %d day of %B.")
4824	'12-02-03. 02 Dec 2003 is a Tuesday on the 02 day of December.'
4825
4826	# dates support calendar arithmetic
4827	>>> birthday = date(1964, 7, 31)
4828	>>> age = now - birthday
4829	>>> age.days
4830	14368
4831	\end{verbatim}
4832
4833
4834	\section{Data Compression\label{data-compression}}
4835
4836	Common data archiving and compression formats are directly supported
4837	by modules including:
4838	\ulink{\module{zlib}}{../lib/module-zlib.html},
4839	\ulink{\module{gzip}}{../lib/module-gzip.html},
4840	\ulink{\module{bz2}}{../lib/module-bz2.html},
4841	\ulink{\module{zipfile}}{../lib/module-zipfile.html}, and
4842	\ulink{\module{tarfile}}{../lib/module-tarfile.html}.
4843
4844	\begin{verbatim}
4845	>>> import zlib
4846	>>> s = 'witch which has which witches wrist watch'
4847	>>> len(s)
4848	41
4849	>>> t = zlib.compress(s)
4850	>>> len(t)
4851	37
4852	>>> zlib.decompress(t)
4853	'witch which has which witches wrist watch'
4854	>>> zlib.crc32(s)
4855	226805979
4856	\end{verbatim}
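
The other compression modules follow the same pattern. As a rough sketch,
the loop below runs one payload through both \module{zlib} and
\module{bz2} and checks that each round trip is lossless. The payload is
simply the sample string repeated (an assumption made so that there is
enough redundancy to compress), and the code is written for a current
Python 3 interpreter, where \function{compress()} expects a byte string.

```python
import bz2
import zlib

# A byte string with plenty of repetition, so both codecs have something to find.
payload = b'witch which has which witches wrist watch ' * 50

for codec in (zlib, bz2):
    packed = codec.compress(payload)
    assert codec.decompress(packed) == payload   # lossless round trip
    print('%s: %d -> %d bytes' % (codec.__name__, len(payload), len(packed)))
```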
4857
4858
4859	\section{Performance Measurement\label{performance-measurement}}
4860
4861	Some Python users develop a deep interest in knowing the relative
4862	performance of different approaches to the same problem.
4863	Python provides a measurement tool that answers those questions
4864	immediately.
4865
4866	For example, it may be tempting to use the tuple packing and unpacking
4867	feature instead of the traditional approach to swapping arguments.
4868	The \ulink{\module{timeit}}{../lib/module-timeit.html} module
4869	quickly demonstrates a modest performance advantage:
4870
4871	\begin{verbatim}
4872	>>> from timeit import Timer
4873	>>> Timer('t=a; a=b; b=t', 'a=1; b=2').timeit()
4874	0.57535828626024577
4875	>>> Timer('a,b = b,a', 'a=1; b=2').timeit()
4876	0.54962537085770791
4877	\end{verbatim}
4878
4879	In contrast to \module{timeit}'s fine level of granularity, the
4880	\ulink{\module{profile}}{../lib/module-profile.html} and \module{pstats}
4881	modules provide tools for identifying time critical sections in larger blocks
4882	of code.
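
A minimal sketch of that workflow is shown here. The functions
\code{slow_sum} and \code{workload} are hypothetical stand-ins for
application code, and the report is collected in an in-memory stream
instead of being printed directly. The sketch assumes a current Python 3
interpreter.

```python
import io
import profile
import pstats

def slow_sum(n):
    """Deliberately naive helper (hypothetical) that gives the profiler work."""
    total = 0
    for i in range(n):
        total += i * i
    return total

def workload():
    return [slow_sum(2000) for _ in range(20)]

profiler = profile.Profile()
result = profiler.runcall(workload)          # run workload() under the profiler

report = io.StringIO()                       # collect the report in memory
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats('cumulative').print_stats(10)
print(report.getvalue())
```

Sorting by cumulative time puts the functions that account for most of the
run at the top of the report, which is usually the first thing to look at
in a larger program.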
4883
4884
4885	\section{Quality Control\label{quality-control}}
4886
4887	One approach for developing high quality software is to write tests for
4888	each function as it is developed and to run those tests frequently during
4889	the development process.
4890
4891	The \ulink{\module{doctest}}{../lib/module-doctest.html} module provides
4892	a tool for scanning a module and validating tests embedded in a program's
4893	docstrings. Test construction is as simple as cutting-and-pasting a
4894	typical call along with its results into the docstring. This improves
4895	the documentation by providing the user with an example and it allows the
4896	doctest module to make sure the code remains true to the documentation:
4897
4898	\begin{verbatim}
def average(values):
    """Computes the arithmetic mean of a list of numbers.

    >>> print average([20, 30, 70])
    40.0
    """
    return sum(values, 0.0) / len(values)
4906
4907	import doctest
4908	doctest.testmod() # automatically validate the embedded tests
4909	\end{verbatim}
4910
4911	The \ulink{\module{unittest}}{../lib/module-unittest.html} module is not
4912	as effortless as the \module{doctest} module, but it allows a more
4913	comprehensive set of tests to be maintained in a separate file:
4914
4915	\begin{verbatim}
4916	import unittest
4917
class TestStatisticalFunctions(unittest.TestCase):

    def test_average(self):
        self.assertEqual(average([20, 30, 70]), 40.0)
        self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
        self.assertRaises(ZeroDivisionError, average, [])
        self.assertRaises(TypeError, average, 20, 30, 70)

unittest.main() # Calling from the command line invokes all tests
4927	\end{verbatim}
4928
4929	\section{Batteries Included\label{batteries-included}}
4930
4931	Python has a ``batteries included'' philosophy. This is best seen
4932	through the sophisticated and robust capabilities of its larger
4933	packages. For example:
4934
4935	\begin{itemize}
4936	\item The \ulink{\module{xmlrpclib}}{../lib/module-xmlrpclib.html} and
4937	\ulink{\module{SimpleXMLRPCServer}}{../lib/module-SimpleXMLRPCServer.html}
4938	modules make implementing remote procedure calls into an almost trivial task.
Despite the modules' names, no direct knowledge or handling of XML is needed.
4940	\item The \ulink{\module{email}}{../lib/module-email.html} package is a library
4941	for managing email messages, including MIME and other RFC 2822-based message
documents. Unlike \module{smtplib} and \module{poplib}, which actually send
and receive messages, the email package has a complete toolset for building
4944	or decoding complex message structures (including attachments) and for
4945	implementing internet encoding and header protocols.
4946	\item The \ulink{\module{xml.dom}}{../lib/module-xml.dom.html} and
4947	\ulink{\module{xml.sax}}{../lib/module-xml.sax.html} packages provide robust
4948	support for parsing this popular data interchange format. Likewise, the
4949	\ulink{\module{csv}}{../lib/module-csv.html} module supports direct reads and
4950	writes in a common database format. Together, these modules and packages
greatly simplify data interchange between Python applications and other
4952	tools.
4953	\item Internationalization is supported by a number of modules including
4954	\ulink{\module{gettext}}{../lib/module-gettext.html},
4955	\ulink{\module{locale}}{../lib/module-locale.html}, and the
4956	\ulink{\module{codecs}}{../lib/module-codecs.html} package.
4957	\end{itemize}
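
As one small illustration of the list above, the following sketch writes a
few rows with the \module{csv} module and reads them back. The rows are
made-up sample data and an in-memory buffer stands in for a file on disk.
The code assumes a current Python 3 interpreter, where \module{csv} works
with text streams.

```python
import csv
import io

# Made-up sample rows; the first row acts as a header.
rows = [['name', 'language', 'score'],
        ['Ada', 'python', '500'],
        ['Grace', 'lua', '400']]

buffer = io.StringIO(newline='')       # in-memory stand-in for a file on disk
csv.writer(buffer).writerows(rows)

buffer.seek(0)
recovered = list(csv.reader(buffer))   # every field comes back as a string
print(recovered == rows)
```

The reader returns every field as a string, so numeric columns have to be
converted by the application.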
4958
4959	\chapter{Brief Tour of the Standard Library -- Part II\label{briefTourTwo}}
4960
4961	This second tour covers more advanced modules that support professional
4962	programming needs. These modules rarely occur in small scripts.
4963
4964
4965	\section{Output Formatting\label{output-formatting}}
4966
4967	The \ulink{\module{repr}}{../lib/module-repr.html} module provides a
4968	version of \function{repr()} customized for abbreviated displays of large
4969	or deeply nested containers:
4970
4971	\begin{verbatim}
4972	>>> import repr
4973	>>> repr.repr(set('supercalifragilisticexpialidocious'))
4974	"set(['a', 'c', 'd', 'e', 'f', 'g', ...])"
4975	\end{verbatim}
4976
4977	The \ulink{\module{pprint}}{../lib/module-pprint.html} module offers
4978	more sophisticated control over printing both built-in and user defined
4979	objects in a way that is readable by the interpreter. When the result
4980	is longer than one line, the ``pretty printer'' adds line breaks and
4981	indentation to more clearly reveal data structure:
4982
4983	\begin{verbatim}
4984	>>> import pprint
4985	>>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
4986	... 'yellow'], 'blue']]]
4987	...
4988	>>> pprint.pprint(t, width=30)
[[[['black', 'cyan'],
   'white',
   ['green', 'red']],
  [['magenta', 'yellow'],
   'blue']]]
4994	\end{verbatim}
4995
4996	The \ulink{\module{textwrap}}{../lib/module-textwrap.html} module
4997	formats paragraphs of text to fit a given screen width:
4998
4999	\begin{verbatim}
5000	>>> import textwrap
5001	>>> doc = """The wrap() method is just like fill() except that it returns
5002	... a list of strings instead of one big string with newlines to separate
5003	... the wrapped lines."""
5004	...
5005	>>> print textwrap.fill(doc, width=40)
5006	The wrap() method is just like fill()
5007	except that it returns a list of strings
5008	instead of one big string with newlines
5009	to separate the wrapped lines.
5010	\end{verbatim}
5011
5012	The \ulink{\module{locale}}{../lib/module-locale.html} module accesses
5013	a database of culture specific data formats. The grouping attribute
5014	of locale's format function provides a direct way of formatting numbers
5015	with group separators:
5016
5017	\begin{verbatim}
5018	>>> import locale
5019	>>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
5020	'English_United States.1252'
5021	>>> conv = locale.localeconv() # get a mapping of conventions
5022	>>> x = 1234567.8
5023	>>> locale.format("%d", x, grouping=True)
5024	'1,234,567'
5025	>>> locale.format("%s%.*f", (conv['currency_symbol'],
5026	... conv['frac_digits'], x), grouping=True)
5027	'$1,234,567.80'
5028	\end{verbatim}
5029
5030
5031	\section{Templating\label{templating}}
5032
5033	The \ulink{\module{string}}{../lib/module-string.html} module includes a
5034	versatile \class{Template} class with a simplified syntax suitable for
5035	editing by end-users. This allows users to customize their applications
5036	without having to alter the application.
5037
5038	The format uses placeholder names formed by \samp{\$} with valid Python
5039	identifiers (alphanumeric characters and underscores). Surrounding the
5040	placeholder with braces allows it to be followed by more alphanumeric letters
5041	with no intervening spaces. Writing \samp{\$\$} creates a single escaped
5042	\samp{\$}:
5043
5044	\begin{verbatim}
5045	>>> from string import Template
5046	>>> t = Template('${village}folk send $$10 to $cause.')
5047	>>> t.substitute(village='Nottingham', cause='the ditch fund')
5048	'Nottinghamfolk send $10 to the ditch fund.'
5049	\end{verbatim}
5050
5051	The \method{substitute} method raises a \exception{KeyError} when a
5052	placeholder is not supplied in a dictionary or a keyword argument. For
5053	mail-merge style applications, user supplied data may be incomplete and the
5054	\method{safe_substitute} method may be more appropriate --- it will leave
5055	placeholders unchanged if data is missing:
5056
5057	\begin{verbatim}
5058	>>> t = Template('Return the $item to $owner.')
5059	>>> d = dict(item='unladen swallow')
5060	>>> t.substitute(d)
5061	Traceback (most recent call last):
5062	. . .
5063	KeyError: 'owner'
5064	>>> t.safe_substitute(d)
5065	'Return the unladen swallow to $owner.'
5066	\end{verbatim}
5067
5068	Template subclasses can specify a custom delimiter. For example, a batch
5069	renaming utility for a photo browser may elect to use percent signs for
5070	placeholders such as the current date, image sequence number, or file format:
5071
5072	\begin{verbatim}
5073	>>> import time, os.path
5074	>>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
>>> class BatchRename(Template):
...     delimiter = '%'
5077	>>> fmt = raw_input('Enter rename style (%d-date %n-seqnum %f-format): ')
5078	Enter rename style (%d-date %n-seqnum %f-format): Ashley_%n%f
5079
5080	>>> t = BatchRename(fmt)
5081	>>> date = time.strftime('%d%b%y')
>>> for i, filename in enumerate(photofiles):
...     base, ext = os.path.splitext(filename)
...     newname = t.substitute(d=date, n=i, f=ext)
...     print '%s --> %s' % (filename, newname)
5086
5087	img_1074.jpg --> Ashley_0.jpg
5088	img_1076.jpg --> Ashley_1.jpg
5089	img_1077.jpg --> Ashley_2.jpg
5090	\end{verbatim}
5091
5092	Another application for templating is separating program logic from the
5093	details of multiple output formats. This makes it possible to substitute
5094	custom templates for XML files, plain text reports, and HTML web reports.
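
A minimal sketch of that separation follows. The record, the three
templates, and the \code{render} helper are all invented for illustration;
only \class{Template} and its \method{substitute} method come from the
library.

```python
from string import Template

# Hypothetical report data and templates, invented for illustration.
record = {'title': 'Disk usage', 'value': '81%'}

formats = {
    'text': Template('$title: $value'),
    'html': Template('<tr><td>$title</td><td>$value</td></tr>'),
    'xml': Template('<metric name="$title">$value</metric>'),
}

def render(kind, data):
    """Program logic stays the same; only the chosen template differs."""
    return formats[kind].substitute(data)

for kind in sorted(formats):
    print(render(kind, record))
```

Adding another output format then means adding one more template, with no
change to the code that gathers the data.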
5095
5096
5097	\section{Working with Binary Data Record Layouts\label{binary-formats}}
5098
5099	The \ulink{\module{struct}}{../lib/module-struct.html} module provides
5100	\function{pack()} and \function{unpack()} functions for working with
5101	variable length binary record formats. The following example shows how
5102	to loop through header information in a ZIP file (with pack codes
5103	\code{"H"} and \code{"L"} representing two and four byte unsigned
5104	numbers respectively):
5105
5106	\begin{verbatim}
import struct

data = open('myfile.zip', 'rb').read()
start = 0
for i in range(3):                      # show the first 3 file headers
    start += 14                         # skip to the CRC-32 field
    # '<' selects little-endian byte order and standard sizes, so that
    # "L" is four bytes and "H" is two bytes on every platform
    fields = struct.unpack('<LLLHH', data[start:start+16])
    crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

    start += 16
    filename = data[start:start+filenamesize]
    start += filenamesize
    extra = data[start:start+extra_size]
    print filename, hex(crc32), comp_size, uncomp_size

    start += extra_size + comp_size     # skip to the next header
5123	\end{verbatim}
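
To see the pack codes in isolation, the following sketch packs and unpacks
a single record. The layout (a two byte identifier followed by a four byte
length) and the values are hypothetical. The \code{"<"} prefix requests
little-endian byte order with standard sizes and no padding, so the result
does not depend on the platform.

```python
import struct

# Hypothetical record layout: a 2-byte id followed by a 4-byte length,
# little-endian with standard sizes and no padding ('<').
layout = '<HL'

packed = struct.pack(layout, 513, 70000)
print(struct.calcsize(layout))      # size of one record in bytes

record_id, length = struct.unpack(layout, packed)
print((record_id, length))          # unpack() always returns a tuple
```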
5124
5125
5126	\section{Multi-threading\label{multi-threading}}
5127
5128	Threading is a technique for decoupling tasks which are not sequentially
5129	dependent. Threads can be used to improve the responsiveness of
5130	applications that accept user input while other tasks run in the
5131	background. A related use case is running I/O in parallel with
5132	computations in another thread.
5133
The following code shows how the high level
\ulink{\module{threading}}{../lib/module-threading.html} module can run
tasks in the background while the main program continues to run:
5137
5138	\begin{verbatim}
5139	import threading, zipfile
5140
class AsyncZip(threading.Thread):
    def __init__(self, infile, outfile):
        threading.Thread.__init__(self)
        self.infile = infile
        self.outfile = outfile
    def run(self):
        f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
        f.write(self.infile)
        f.close()
        print 'Finished background zip of: ', self.infile
5151
5152	background = AsyncZip('mydata.txt', 'myarchive.zip')
5153	background.start()
5154	print 'The main program continues to run in foreground.'
5155
5156	background.join() # Wait for the background task to finish
5157	print 'Main program waited until background was done.'
5158	\end{verbatim}
5159
5160	The principal challenge of multi-threaded applications is coordinating
5161	threads that share data or other resources. To that end, the threading
5162	module provides a number of synchronization primitives including locks,
5163	events, condition variables, and semaphores.
5164
5165	While those tools are powerful, minor design errors can result in
5166	problems that are difficult to reproduce. So, the preferred approach
5167	to task coordination is to concentrate all access to a resource
5168	in a single thread and then use the
5169	\ulink{\module{Queue}}{../lib/module-Queue.html} module to feed that
5170	thread with requests from other threads. Applications using
5171	\class{Queue} objects for inter-thread communication and coordination
5172	are easier to design, more readable, and more reliable.
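
The pattern can be sketched as follows. A single worker thread owns the
shared \code{results} list, and other threads communicate with it only
through the queue. The squaring task and the use of \code{None} as a
shutdown sentinel are assumptions made for this illustration. The import
is written so that the sketch runs both with the \module{Queue} module
described here and with its Python 3 name, \module{queue}.

```python
import threading

try:
    import queue                 # Python 3 name
except ImportError:
    import Queue as queue        # name used by the Python this tutorial targets

requests = queue.Queue()
results = []                     # touched only by the worker thread

def worker():
    """Own the shared resource; serve requests until the sentinel arrives."""
    while True:
        item = requests.get()
        if item is None:         # sentinel: no more work
            break
        results.append(item * item)

thread = threading.Thread(target=worker)
thread.start()

for n in range(5):               # any thread may enqueue requests
    requests.put(n)
requests.put(None)
thread.join()                    # wait for the worker to drain the queue
print(results)
```

Because only the worker touches \code{results}, no explicit lock is
needed; the queue does the synchronization.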
5173
5174
5175	\section{Logging\label{logging}}
5176
5177	The \ulink{\module{logging}}{../lib/module-logging.html} module offers
5178	a full featured and flexible logging system. At its simplest, log
5179	messages are sent to a file or to \code{sys.stderr}:
5180
5181	\begin{verbatim}
5182	import logging
5183	logging.debug('Debugging information')
5184	logging.info('Informational message')
5185	logging.warning('Warning:config file %s not found', 'server.conf')
5186	logging.error('Error occurred')
5187	logging.critical('Critical error -- shutting down')
5188	\end{verbatim}
5189
5190	This produces the following output:
5191
5192	\begin{verbatim}
5193	WARNING:root:Warning:config file server.conf not found
5194	ERROR:root:Error occurred
5195	CRITICAL:root:Critical error -- shutting down
5196	\end{verbatim}
5197
5198	By default, informational and debugging messages are suppressed and the
5199	output is sent to standard error. Other output options include routing
5200	messages through email, datagrams, sockets, or to an HTTP Server. New
5201	filters can select different routing based on message priority:
5202	\constant{DEBUG}, \constant{INFO}, \constant{WARNING}, \constant{ERROR},
5203	and \constant{CRITICAL}.
5204
5205	The logging system can be configured directly from Python or can be
5206	loaded from a user editable configuration file for customized logging
5207	without altering the application.
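
As a minimal sketch of configuring logging directly from Python, the code
below attaches a handler to a named logger and lowers its threshold so
that debugging messages are shown as well. The logger name
\code{tutorial.demo} is hypothetical, and an in-memory stream stands in
for \code{sys.stderr} or a log file. The sketch assumes a current Python 3
interpreter.

```python
import io
import logging

stream = io.StringIO()                         # stands in for sys.stderr or a file
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter('%(levelname)s:%(name)s:%(message)s'))

log = logging.getLogger('tutorial.demo')       # hypothetical logger name
log.setLevel(logging.DEBUG)                    # let DEBUG and INFO through
log.addHandler(handler)
log.propagate = False                          # keep the root logger out of it

log.debug('Debugging information')
log.warning('config file %s not found', 'server.conf')
print(stream.getvalue())
```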
5208
5209
5210	\section{Weak References\label{weak-references}}
5211
5212	Python does automatic memory management (reference counting for most
5213	objects and garbage collection to eliminate cycles). The memory is
5214	freed shortly after the last reference to it has been eliminated.
5215
5216	This approach works fine for most applications but occasionally there
5217	is a need to track objects only as long as they are being used by
5218	something else. Unfortunately, just tracking them creates a reference
5219	that makes them permanent. The
5220	\ulink{\module{weakref}}{../lib/module-weakref.html} module provides
5221	tools for tracking objects without creating a reference. When the
5222	object is no longer needed, it is automatically removed from a weakref
5223	table and a callback is triggered for weakref objects. Typical
5224	applications include caching objects that are expensive to create:
5225
5226	\begin{verbatim}
5227	>>> import weakref, gc
>>> class A:
...     def __init__(self, value):
...         self.value = value
...     def __repr__(self):
...         return str(self.value)
...
5234	>>> a = A(10) # create a reference
5235	>>> d = weakref.WeakValueDictionary()
5236	>>> d['primary'] = a # does not create a reference
5237	>>> d['primary'] # fetch the object if it is still alive
5238	10
5239	>>> del a # remove the one reference
5240	>>> gc.collect() # run garbage collection right away
5241	0
5242	>>> d['primary'] # entry was automatically removed
Traceback (most recent call last):
  File "<pyshell#108>", line 1, in -toplevel-
    d['primary']                # entry was automatically removed
  File "C:/PY24/lib/weakref.py", line 46, in __getitem__
    o = self.data[key]()
KeyError: 'primary'
5249	\end{verbatim}
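
The callback mentioned above can be sketched with \function{weakref.ref()}.
The class \class{Expensive} and the \code{on_collect} function are
hypothetical names used only for this illustration. The immediate cleanup
after \code{del} relies on CPython's reference counting; the explicit
\function{gc.collect()} call is a precaution for other implementations.

```python
import gc
import weakref

class Expensive(object):
    """Hypothetical stand-in for an object that is costly to build."""
    def __init__(self, value):
        self.value = value

events = []

def on_collect(ref):
    # Called with the (now dead) weak reference once the object is gone.
    events.append('collected')

obj = Expensive(10)
ref = weakref.ref(obj, on_collect)   # does not keep obj alive
print(ref().value)                   # dereference while the object exists

del obj                              # drop the only strong reference
gc.collect()                         # not needed on CPython, harmless elsewhere
print((ref() is None, events))
```

Calling the weak reference returns the object while it is alive and
\code{None} afterwards, so a cache built this way must be prepared to
rebuild the object.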
5250
5251	\section{Tools for Working with Lists\label{list-tools}}
5252
5253	Many data structure needs can be met with the built-in list type.
5254	However, sometimes there is a need for alternative implementations
5255	with different performance trade-offs.
5256
The \ulink{\module{array}}{../lib/module-array.html} module provides an
\class{array()} object that is like a list that stores only homogeneous
data and stores it more compactly. The following example shows an array
of numbers stored as two byte unsigned binary numbers (typecode
\code{"H"}) rather than the usual 16 bytes per entry for regular lists
of Python int objects:
5263
5264	\begin{verbatim}
5265	>>> from array import array
5266	>>> a = array('H', [4000, 10, 700, 22222])
5267	>>> sum(a)
5268	26932
5269	>>> a[1:3]
5270	array('H', [10, 700])
5271	\end{verbatim}
5272
5273	The \ulink{\module{collections}}{../lib/module-collections.html} module
5274	provides a \class{deque()} object that is like a list with faster
5275	appends and pops from the left side but slower lookups in the middle.
5276	These objects are well suited for implementing queues and breadth first
5277	tree searches:
5278
5279	\begin{verbatim}
5280	>>> from collections import deque
5281	>>> d = deque(["task1", "task2", "task3"])
5282	>>> d.append("task4")
5283	>>> print "Handling", d.popleft()
5284	Handling task1
5285
5286	unsearched = deque([starting_node])
def breadth_first_search(unsearched):
    node = unsearched.popleft()
    for m in gen_moves(node):
        if is_goal(m):
            return m
        unsearched.append(m)
5293	\end{verbatim}
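
The search fragment above depends on names supplied by the application. A
self-contained variant might look like the following sketch, in which the
small \code{graph} dictionary is hypothetical and takes the place of
\code{gen_moves()}, a comparison with \code{goal} takes the place of
\code{is_goal()}, and a \code{seen} set is added so that no node is
visited twice.

```python
from collections import deque

# Hypothetical graph: each node maps to the nodes reachable in one move.
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D', 'E'],
         'D': ['F'], 'E': [], 'F': []}

def breadth_first_search(start, goal):
    """Return the goal if it is reachable from start, else None."""
    unsearched = deque([start])
    seen = set([start])
    while unsearched:
        node = unsearched.popleft()      # oldest entry first: FIFO order
        if node == goal:
            return node
        for m in graph[node]:
            if m not in seen:            # avoid revisiting nodes
                seen.add(m)
                unsearched.append(m)
    return None

print(breadth_first_search('A', 'F'))
print(breadth_first_search('E', 'A'))
```

Taking nodes from the left and adding new ones on the right is what makes
the traversal breadth first, and both operations are fast on a
\class{deque}.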
5294
5295	In addition to alternative list implementations, the library also offers
5296	other tools such as the \ulink{\module{bisect}}{../lib/module-bisect.html}
5297	module with functions for manipulating sorted lists:
5298
5299	\begin{verbatim}
5300	>>> import bisect
5301	>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
5302	>>> bisect.insort(scores, (300, 'ruby'))
5303	>>> scores
5304	[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]
5305	\end{verbatim}
5306
5307	The \ulink{\module{heapq}}{../lib/module-heapq.html} module provides
5308	functions for implementing heaps based on regular lists. The lowest
5309	valued entry is always kept at position zero. This is useful for
5310	applications which repeatedly access the smallest element but do not
5311	want to run a full list sort:
5312
5313	\begin{verbatim}
5314	>>> from heapq import heapify, heappop, heappush
5315	>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
5316	>>> heapify(data) # rearrange the list into heap order
5317	>>> heappush(data, -5) # add a new entry
5318	>>> [heappop(data) for i in range(3)] # fetch the three smallest entries
5319	[-5, 0, 1]
5320	\end{verbatim}
5321
5322
5323	\section{Decimal Floating Point Arithmetic\label{decimal-fp}}
5324
5325	The \ulink{\module{decimal}}{../lib/module-decimal.html} module offers a
5326	\class{Decimal} datatype for decimal floating point arithmetic. Compared to
5327	the built-in \class{float} implementation of binary floating point, the new
5328	class is especially helpful for financial applications and other uses which
5329	require exact decimal representation, control over precision, control over
5330	rounding to meet legal or regulatory requirements, tracking of significant
5331	decimal places, or for applications where the user expects the results to
5332	match calculations done by hand.
5333
5334	For example, calculating a 5\%{} tax on a 70 cent phone charge gives
5335	different results in decimal floating point and binary floating point.
5336	The difference becomes significant if the results are rounded to the
5337	nearest cent:
5338
5339	\begin{verbatim}
5340	>>> from decimal import *
5341	>>> Decimal('0.70') * Decimal('1.05')
5342	Decimal("0.7350")
5343	>>> .70 * 1.05
5344	0.73499999999999999
5345	\end{verbatim}
5346
5347	The \class{Decimal} result keeps a trailing zero, automatically inferring four
5348	place significance from multiplicands with two place significance. Decimal reproduces
5349	mathematics as done by hand and avoids issues that can arise when binary
5350	floating point cannot exactly represent decimal quantities.
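
The effect of rounding to the nearest cent can be sketched as follows. The
choice of \constant{ROUND_HALF_UP} is an assumption made for this
illustration; an application would select whichever rounding rule its
regulations require.

```python
from decimal import Decimal, ROUND_HALF_UP

cent = Decimal('0.01')

exact = Decimal('0.70') * Decimal('1.05')                  # exactly 0.7350
decimal_total = exact.quantize(cent, rounding=ROUND_HALF_UP)
float_total = round(0.70 * 1.05, 2)                        # product is just below 0.735

print(decimal_total)
print(float_total)
```

The decimal result rounds up to 0.74, while the binary floating point
product lies slightly below 0.735 and therefore rounds down to 0.73.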
5351
5352	Exact representation enables the \class{Decimal} class to perform
5353	modulo calculations and equality tests that are unsuitable for binary
5354	floating point:
5355
5356	\begin{verbatim}
5357	>>> Decimal('1.00') % Decimal('.10')
5358	Decimal("0.00")
5359	>>> 1.00 % 0.10
5360	0.09999999999999995
5361
5362	>>> sum([Decimal('0.1')]*10) == Decimal('1.0')
5363	True
5364	>>> sum([0.1]*10) == 1.0
5365	False
5366	\end{verbatim}
5367
5368	The \module{decimal} module provides arithmetic with as much precision as
5369	needed:
5370
5371	\begin{verbatim}
5372	>>> getcontext().prec = 36
5373	>>> Decimal(1) / Decimal(7)
5374	Decimal("0.142857142857142857142857142857142857")
5375	\end{verbatim}
5376
5377
5378
5379	\chapter{What Now? \label{whatNow}}
5380
5381	Reading this tutorial has probably reinforced your interest in using
5382	Python --- you should be eager to apply Python to solving your
5383	real-world problems. Where should you go to learn more?
5384
5385	This tutorial is part of Python's documentation set.
5386	Some other documents in the set are:
5387
5388	\begin{itemize}
5389
5390	\item \citetitle[../lib/lib.html]{Python Library Reference}:
5391
5392	You should browse through this manual, which gives complete (though
5393	terse) reference material about types, functions, and the modules in
5394	the standard library. The standard Python distribution includes a
5395	\emph{lot} of additional code. There are modules to read \UNIX{}
5396	mailboxes, retrieve documents via HTTP, generate random numbers, parse
5397	command-line options, write CGI programs, compress data, and many other tasks.
5398	Skimming through the Library Reference will give you an idea of
5399	what's available.
5400
5401	\item \citetitle[../inst/inst.html]{Installing Python Modules}
5402	explains how to install external modules written by other Python
5403	users.
5404
5405	\item \citetitle[../ref/ref.html]{Language Reference}: A detailed
5406	explanation of Python's syntax and semantics. It's heavy reading,
5407	but is useful as a complete guide to the language itself.
5408
5409	\end{itemize}
5410
5411	More Python resources:
5412
5413	\begin{itemize}
5414
5415	\item \url{http://www.python.org}: The major Python Web site. It contains
5416	code, documentation, and pointers to Python-related pages around the
5417	Web. This Web site is mirrored in various places around the
5418	world, such as Europe, Japan, and Australia; a mirror may be faster
5419	than the main site, depending on your geographical location.
5420
5421	\item \url{http://docs.python.org}: Fast access to Python's
5422	documentation.
5423
5424	\item \url{http://cheeseshop.python.org}:
5425	The Python Package Index, nicknamed the Cheese Shop,
5426	is an index of user-created Python modules that are available for
5427	download. Once you begin releasing code, you can register it
5428	here so that others can find it.
5429
5430	\item \url{http://aspn.activestate.com/ASPN/Python/Cookbook/}: The
5431	Python Cookbook is a sizable collection of code examples, larger
5432	modules, and useful scripts. Particularly notable contributions are
collected in a book also titled \citetitle{Python Cookbook} (O'Reilly
\& Associates, ISBN 0-596-00797-3).
5435
5436	\end{itemize}
5437
5438
5439	For Python-related questions and problem reports, you can post to the
5440	newsgroup \newsgroup{comp.lang.python}, or send them to the mailing
5441	list at \email{python-list@python.org}. The newsgroup and mailing list
5442	are gatewayed, so messages posted to one will automatically be
5443	forwarded to the other. There are around 120 postings a day (with peaks
5444	up to several hundred),
5445	% Postings figure based on average of last six months activity as
5446	% reported by www.egroups.com; Jan. 2000 - June 2000: 21272 msgs / 182
5447	% days = 116.9 msgs / day and steadily increasing.
5448	asking (and answering) questions, suggesting new features, and
announcing new modules. Before posting, be sure to check the list of
\ulink{Frequently Asked Questions}{http://www.python.org/doc/faq/} (also
called the FAQ), or look for it in the \file{Misc/} directory of the
Python source distribution. The FAQ answers many of the questions that
come up again and again, and may already contain the solution to your
problem. Mailing list archives are available at
\url{http://mail.python.org/pipermail/}.
5455
5456
5457	\appendix
5458
5459	\chapter{Interactive Input Editing and History Substitution\label{interacting}}
5460
5461	Some versions of the Python interpreter support editing of the current
5462	input line and history substitution, similar to facilities found in
5463	the Korn shell and the GNU Bash shell. This is implemented using the
5464	\emph{GNU Readline} library, which supports Emacs-style and vi-style
5465	editing. This library has its own documentation which I won't
5466	duplicate here; however, the basics are easily explained. The
5467	interactive editing and history described here are optionally
5468	available in the \UNIX{} and Cygwin versions of the interpreter.
5469
5470	This chapter does \emph{not} document the editing facilities of Mark
5471	Hammond's PythonWin package or the Tk-based environment, IDLE,
5472	distributed with Python. The command line history recall which
5473	operates within DOS boxes on NT and some other DOS and Windows flavors
5474	is yet another beast.
5475
5476	\section{Line Editing \label{lineEditing}}
5477
5478	If supported, input line editing is active whenever the interpreter
5479	prints a primary or secondary prompt. The current line can be edited
5480	using the conventional Emacs control characters. The most important
5481	of these are: \kbd{C-A} (Control-A) moves the cursor to the beginning
5482	of the line, \kbd{C-E} to the end, \kbd{C-B} moves it one position to
5483	the left, \kbd{C-F} to the right. Backspace erases the character to
5484	the left of the cursor, \kbd{C-D} the character to its right.
5485	\kbd{C-K} kills (erases) the rest of the line to the right of the
5486	cursor, \kbd{C-Y} yanks back the last killed string.
5487	\kbd{C-underscore} undoes the last change you made; it can be repeated
5488	for cumulative effect.
5489
5490	\section{History Substitution \label{history}}
5491
5492	History substitution works as follows. All non-empty input lines
5493	issued are saved in a history buffer, and when a new prompt is given
5494	you are positioned on a new line at the bottom of this buffer.
5495	\kbd{C-P} moves one line up (back) in the history buffer,
5496	\kbd{C-N} moves one down. Any line in the history buffer can be
5497	edited; an asterisk appears in front of the prompt to mark a line as
5498	modified. Pressing the \kbd{Return} key passes the current line to
5499	the interpreter. \kbd{C-R} starts an incremental reverse search;
5500	\kbd{C-S} starts a forward search.
5501
5502	\section{Key Bindings \label{keyBindings}}
5503
5504	The key bindings and some other parameters of the Readline library can
5505	be customized by placing commands in an initialization file called
5506	\file{\~{}/.inputrc}. Key bindings have the form
5507
5508	\begin{verbatim}
5509	key-name: function-name
5510	\end{verbatim}
5511
5512	or
5513
5514	\begin{verbatim}
5515	"string": function-name
5516	\end{verbatim}
5517
5518	and options can be set with
5519
5520	\begin{verbatim}
5521	set option-name value
5522	\end{verbatim}
5523
5524	For example:
5525
5526	\begin{verbatim}
5527	# I prefer vi-style editing:
5528	set editing-mode vi
5529
5530	# Edit using a single line:
5531	set horizontal-scroll-mode On
5532
5533	# Rebind some keys:
5534	Meta-h: backward-kill-word
5535	"\C-u": universal-argument
5536	"\C-x\C-r": re-read-init-file
5537	\end{verbatim}
5538
5539	Note that the default binding for \kbd{Tab} in Python is to insert a
5540	\kbd{Tab} character instead of Readline's default filename completion
5541	function. If you insist, you can override this by putting
5542
5543	\begin{verbatim}
5544	Tab: complete
5545	\end{verbatim}
5546
5547	in your \file{\~{}/.inputrc}. (Of course, this makes it harder to
5548	type indented continuation lines if you're accustomed to using
5549	\kbd{Tab} for that purpose.)
5550
5551	Automatic completion of variable and module names is optionally
5552	available. To enable it in the interpreter's interactive mode, add
5553	the following to your startup file:\footnote{
5554	Python will execute the contents of a file identified by the
5555	\envvar{PYTHONSTARTUP} environment variable when you start an
5556	interactive interpreter.}
5557	\refstmodindex{rlcompleter}\refbimodindex{readline}
5558
5559	\begin{verbatim}
5560	import rlcompleter, readline
5561	readline.parse_and_bind('tab: complete')
5562	\end{verbatim}
5563
5564	This binds the \kbd{Tab} key to the completion function, so hitting
5565	the \kbd{Tab} key twice suggests completions; it looks at Python
5566	statement names, the current local variables, and the available module
5567	names. For dotted expressions such as \code{string.a}, it will
5568	evaluate the expression up to the final \character{.} and then
5569	suggest completions from the attributes of the resulting object. Note
5570	that this may execute application-defined code if an object with a
5571	\method{__getattr__()} method is part of the expression.
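
The completion logic can also be exercised without pressing any keys,
which helps show what it evaluates. In this sketch the \code{namespace}
dictionary and the \code{completions} helper are hypothetical; only
\class{rlcompleter.Completer} and its \method{complete()} method come from
the library. The exact text of each suggestion (for example, whether an
opening parenthesis is appended to callables) varies between Python
versions, so the sketch does not depend on it.

```python
import rlcompleter

# Hypothetical namespace, standing in for the interactive session's variables.
namespace = {'spam_value': 42, 'text': 'abc'}
completer = rlcompleter.Completer(namespace)

def completions(prefix):
    """Collect every candidate the completer offers for prefix."""
    found = []
    state = 0
    while True:
        match = completer.complete(prefix, state)
        if match is None:        # no more candidates
            break
        found.append(match)
        state += 1
    return found

print(completions('spam'))       # plain name: looked up in the namespace
print(completions('text.up'))    # dotted name: 'text' is evaluated first
```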
5572
5573	A more capable startup file might look like this example. Note that
5574	this deletes the names it creates once they are no longer needed; this
5575	is done since the startup file is executed in the same namespace as
5576	the interactive commands, and removing the names avoids creating side
5577	effects in the interactive environment. You may find it convenient
5578	to keep some of the imported modules, such as
5579	\ulink{\module{os}}{../lib/module-os.html}, which turn
5580	out to be needed in most sessions with the interpreter.
5581
5582	\begin{verbatim}
5583	# Add auto-completion and a stored history file of commands to your Python
5584	# interactive interpreter. Requires Python 2.0+, readline. Autocomplete is
5585	# bound to the Esc key by default (you can change it - see readline docs).
5586	#
5587	# Store the file in ~/.pystartup, and set an environment variable to point
5588	# to it: "export PYTHONSTARTUP=/max/home/itamar/.pystartup" in bash.
5589	#
5590	# Note that PYTHONSTARTUP does not expand "~", so you have to put in the
5591	# full path to your home directory.
5592
5593	import atexit
5594	import os
5595	import readline
5596	import rlcompleter
5597
5598	historyPath = os.path.expanduser("~/.pyhistory")
5599
def save_history(historyPath=historyPath):
    import readline
    readline.write_history_file(historyPath)

if os.path.exists(historyPath):
    readline.read_history_file(historyPath)
5606
5607	atexit.register(save_history)
5608	del os, atexit, readline, rlcompleter, save_history, historyPath
5609	\end{verbatim}
5610
5611
5612	\section{Commentary \label{commentary}}
5613
This facility is an enormous step forward compared to earlier versions
of the interpreter; however, some wishes remain. It would be nice if
the proper indentation were suggested on continuation lines (the
parser knows whether an indent token is required next). The completion
mechanism might use the interpreter's symbol table. A command to
check (or even suggest) matching parentheses, quotes, etc., would also
be useful.
5621
5622
5623	\chapter{Floating Point Arithmetic: Issues and Limitations\label{fp-issues}}
5624	\sectionauthor{Tim Peters}{tim_one@users.sourceforge.net}
5625
5626	Floating-point numbers are represented in computer hardware as
5627	base 2 (binary) fractions. For example, the decimal fraction
5628
5629	\begin{verbatim}
5630	0.125
5631	\end{verbatim}
5632
5633	has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction
5634
5635	\begin{verbatim}
5636	0.001
5637	\end{verbatim}
5638
5639	has value 0/2 + 0/4 + 1/8. These two fractions have identical values,
5640	the only real difference being that the first is written in base 10
5641	fractional notation, and the second in base 2.
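
The equality of the two fractions can be checked with integer
arithmetic alone. In the sketch below, \function{as_ratio()} is a
helper invented for this illustration (it is not a library function);
it turns a string of fractional digits into a numerator and a
denominator:

\begin{verbatim}
def as_ratio(digits, base):
    """Return (numerator, denominator) for the fraction 0.<digits> in base."""
    return int(digits, base), base ** len(digits)

dec_num, dec_den = as_ratio("125", 10)   # 125 / 1000
bin_num, bin_den = as_ratio("001", 2)    # 1 / 8

# Cross-multiplying compares the two fractions exactly, using integers only.
print(dec_num * bin_den == bin_num * dec_den)   # True
\end{verbatim}

Because only integers are involved, the comparison itself is free of
the rounding issues discussed in this chapter.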
5642
5643	Unfortunately, most decimal fractions cannot be represented exactly as
5644	binary fractions. A consequence is that, in general, the decimal
5645	floating-point numbers you enter are only approximated by the binary
5646	floating-point numbers actually stored in the machine.
5647
5648	The problem is easier to understand at first in base 10. Consider the
5649	fraction 1/3. You can approximate that as a base 10 fraction:
5650
5651	\begin{verbatim}
5652	0.3
5653	\end{verbatim}
5654
5655	or, better,
5656
5657	\begin{verbatim}
5658	0.33
5659	\end{verbatim}
5660
5661	or, better,
5662
5663	\begin{verbatim}
5664	0.333
5665	\end{verbatim}
5666
5667	and so on. No matter how many digits you're willing to write down, the
5668	result will never be exactly 1/3, but will be an increasingly better
5669	approximation of 1/3.
5670
5671	In the same way, no matter how many base 2 digits you're willing to
5672	use, the decimal value 0.1 cannot be represented exactly as a base 2
5673	fraction. In base 2, 1/10 is the infinitely repeating fraction
5674
5675	\begin{verbatim}
5676	0.0001100110011001100110011001100110011001100110011...
5677	\end{verbatim}
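
The repeating pattern above can be reproduced by long division in
base 2. The following is a minimal sketch; \function{binary_digits()}
is a hypothetical helper name, and it assumes a fraction between 0
and 1:

\begin{verbatim}
def binary_digits(numerator, denominator, count):
    """Return the first count binary digits of numerator/denominator."""
    digits = []
    remainder = numerator
    for _ in range(count):
        # Doubling the remainder shifts the next binary digit into the
        # integer part of the quotient.
        remainder *= 2
        digit, remainder = divmod(remainder, denominator)
        digits.append(str(digit))
    return "0." + "".join(digits)

print(binary_digits(1, 10, 20))   # 0.00011001100110011001
\end{verbatim}

The remainder cycles through 2, 4, 8 and 6 without ever reaching
zero, which is why the digits never terminate.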
5678
5679	Stop at any finite number of bits, and you get an approximation. This
5680	is why you see things like:
5681
5682	\begin{verbatim}
5683	>>> 0.1
5684	0.10000000000000001
5685	\end{verbatim}
5686
5687	On most machines today, that is what you'll see if you enter 0.1 at
5688	a Python prompt. You may not, though, because the number of bits
5689	used by the hardware to store floating-point values can vary across
5690	machines, and Python only prints a decimal approximation to the true
5691	decimal value of the binary approximation stored by the machine. On
5692	most machines, if Python were to print the true decimal value of
5693	the binary approximation stored for 0.1, it would have to display
5694
5695	\begin{verbatim}
5696	>>> 0.1
5697	0.1000000000000000055511151231257827021181583404541015625
5698	\end{verbatim}
5699
5700	instead! The Python prompt uses the builtin
5701	\function{repr()} function to obtain a string version of everything it
5702	displays. For floats, \code{repr(\var{float})} rounds the true
5703	decimal value to 17 significant digits, giving
5704
5705	\begin{verbatim}
5706	0.10000000000000001
5707	\end{verbatim}
5708
5709	\code{repr(\var{float})} produces 17 significant digits because it
5710	turns out that's enough (on most machines) so that
5711	\code{eval(repr(\var{x})) == \var{x}} exactly for all finite floats
5712	\var{x}, but rounding to 16 digits is not enough to make that true.
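
The difference between 16 and 17 digits can be tried out with the
\code{\%} format operator. This sketch assumes IEEE-754 doubles and a
C library that converts correctly in both directions. It formats
explicitly with \code{\%.17g} and \code{\%.16g} rather than calling
\function{repr()}, because later Python versions changed how
\function{repr()} chooses its digits:

\begin{verbatim}
x = 0.1 + 0.2

seventeen = "%.17g" % x   # the precision the text attributes to repr()
sixteen = "%.16g" % x     # one digit fewer

print(seventeen)               # 0.30000000000000004
print(float(seventeen) == x)   # True: 17 digits recover x exactly
print(sixteen)                 # 0.3
print(float(sixteen) == x)     # False: 16 digits land on a different float
\end{verbatim}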
5713
5714	Note that this is in the very nature of binary floating-point: this is
5715	not a bug in Python, and it is not a bug in your code either. You'll
5716	see the same kind of thing in all languages that support your
5717	hardware's floating-point arithmetic (although some languages may
5718	not \emph{display} the difference by default, or in all output modes).
5719
5720	Python's builtin \function{str()} function produces only 12
5721	significant digits, and you may wish to use that instead. It's
5722	unusual for \code{eval(str(\var{x}))} to reproduce \var{x}, but the
5723	output may be more pleasant to look at:
5724
5725	\begin{verbatim}
5726	>>> print str(0.1)
5727	0.1
5728	\end{verbatim}
5729
5730	It's important to realize that this is, in a real sense, an illusion:
5731	the value in the machine is not exactly 1/10, you're simply rounding
5732	the \emph{display} of the true machine value.
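
One way to look behind the rounded display is the
\module{decimal} module. The sketch below assumes a Python version
newer than the one this tutorial describes (2.7, or 3.2 and later),
where \class{Decimal} accepts a float argument and converts it without
rounding, and it assumes IEEE-754 doubles:

\begin{verbatim}
from decimal import Decimal

# Converting a float to Decimal is exact: no rounding takes place.
exact = Decimal(0.1)

print(exact)
# 0.1000000000000000055511151231257827021181583404541015625
print(exact == Decimal("0.1"))   # False: the stored value is not 1/10
\end{verbatim}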
5733
5734	Other surprises follow from this one. For example, after seeing
5735
5736	\begin{verbatim}
5737	>>> 0.1
5738	0.10000000000000001
5739	\end{verbatim}
5740
5741	you may be tempted to use the \function{round()} function to chop it
5742	back to the single digit you expect. But that makes no difference:
5743
5744	\begin{verbatim}
5745	>>> round(0.1, 1)
5746	0.10000000000000001
5747	\end{verbatim}
5748
The problem is that the binary floating-point value stored for ``0.1''
was already the best possible binary approximation to 1/10, so trying
to round it again can't make it better: it was already as good as it
gets.
5753
5754	Another consequence is that since 0.1 is not exactly 1/10,
5755	summing ten values of 0.1 may not yield exactly 1.0, either:
5756
\begin{verbatim}
>>> sum = 0.0
>>> for i in range(10):
...     sum += 0.1
...
>>> sum
0.99999999999999989
\end{verbatim}
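
For contrast, here is a sketch of the same loop done twice, once with
floats and once with the standard \module{decimal} module. The
\module{decimal} version is a substitute technique shown only for
comparison, and the variable names are illustrative. The float result
assumes IEEE-754 doubles:

\begin{verbatim}
from decimal import Decimal

float_total = 0.0
decimal_total = Decimal("0")
for i in range(10):
    float_total += 0.1               # each addition may round in binary
    decimal_total += Decimal("0.1")  # 0.1 is exact in base 10

print(float_total == 1.0)              # False
print(decimal_total == Decimal("1"))   # True
\end{verbatim}

Decimal arithmetic is exact here only because 1/10 happens to be a
finite decimal fraction; a value such as 1/3 is still approximated in
either base.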
5765
Binary floating-point arithmetic holds many surprises like this. The
problem with ``0.1'' is explained in precise detail below, in the
``Representation Error'' section. See
\citetitle[http://www.lahey.com/float.htm]{The Perils of Floating
Point} for a more complete account of other common surprises.

As that article says near the end, ``there are no easy answers.'' Still,
5773	don't be unduly wary of floating-point! The errors in Python float
5774	operations are inherited from the floating-point hardware, and on most
5775	machines are on the order of no more than 1 part in 2**53 per
5776	operation. That's more than adequate for most tasks, but you do need
5777	to keep in mind that it's not decimal arithmetic, and that every float
5778	operation can suffer a new rounding error.
5779
5780	While pathological cases do exist, for most casual use of
5781	floating-point arithmetic you'll see the result you expect in the end
5782	if you simply round the display of your final results to the number of
5783	decimal digits you expect. \function{str()} usually suffices, and for
5784	finer control see the discussion of Python's \code{\%} format
5785	operator: the \code{\%g}, \code{\%f} and \code{\%e} format codes
5786	supply flexible and easy ways to round float results for display.
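
As a brief illustration of those three format codes, the sketch below
rounds one float for display in three ways. The chosen precisions are
arbitrary examples, and the outputs assume IEEE-754 doubles:

\begin{verbatim}
value = 0.1 + 0.2          # stored as a float slightly above 0.3

print("%.2f" % value)      # 0.30       fixed point, 2 digits after the point
print("%.3e" % value)      # 3.000e-01  exponent form, 3 digits after the point
print("%.12g" % value)     # 0.3        general form, 12 significant digits
\end{verbatim}

In each case only the displayed string is rounded; \code{value} itself
is unchanged.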
5787
5788
5789	\section{Representation Error
5790	\label{fp-error}}
5791
5792	This section explains the ``0.1'' example in detail, and shows how
5793	you can perform an exact analysis of cases like this yourself. Basic
5794	familiarity with binary floating-point representation is assumed.
5795
5796	\dfn{Representation error} refers to the fact that some (most, actually)
5797	decimal fractions cannot be represented exactly as binary (base 2)
5798	fractions. This is the chief reason why Python (or Perl, C, \Cpp,
5799	Java, Fortran, and many others) often won't display the exact decimal
5800	number you expect:
5801
5802	\begin{verbatim}
5803	>>> 0.1
5804	0.10000000000000001
5805	\end{verbatim}
5806
Why is that? 1/10 is not exactly representable as a binary fraction.
Almost all machines today (November 2000) use IEEE-754 floating point
arithmetic, and almost all platforms map Python floats to IEEE-754
``double precision''. 754 doubles contain 53 bits of precision, so on
input the computer strives to convert 0.1 to the closest fraction it can
of the form \var{J}/2**\var{N} where \var{J} is an integer containing
exactly 53 bits. Rewriting
5814
5815	\begin{verbatim}
5816	1 / 10 ~= J / (2**N)
5817	\end{verbatim}
5818
5819	as
5820
5821	\begin{verbatim}
5822	J ~= 2**N / 10
5823	\end{verbatim}
5824
5825	and recalling that \var{J} has exactly 53 bits (is \code{>= 2**52} but
5826	\code{< 2**53}), the best value for \var{N} is 56:
5827
5828	\begin{verbatim}
5829	>>> 2**52
5830	4503599627370496L
5831	>>> 2**53
5832	9007199254740992L
5833	>>> 2**56/10
5834	7205759403792793L
5835	\end{verbatim}
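
The choice of \var{N} can also be confirmed by brute force. This
sketch tries a range of exponents (the bounds 40 and 70 are arbitrary)
and keeps those for which the truncated quotient has exactly 53 bits.
It uses \code{//} so that the division stays an integer division in
later Python versions as well:

\begin{verbatim}
LOW, HIGH = 2 ** 52, 2 ** 53

# Keep the exponents whose quotient satisfies 2**52 <= J < 2**53.
candidates = [n for n in range(40, 70) if LOW <= 2 ** n // 10 < HIGH]
print(candidates)   # [56]
\end{verbatim}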
5836
5837	That is, 56 is the only value for \var{N} that leaves \var{J} with
5838	exactly 53 bits. The best possible value for \var{J} is then that
5839	quotient rounded:
5840
5841	\begin{verbatim}
5842	>>> q, r = divmod(2**56, 10)
5843	>>> r
5844	6L
5845	\end{verbatim}
5846
5847	Since the remainder is more than half of 10, the best approximation is
5848	obtained by rounding up:
5849
5850	\begin{verbatim}
5851	>>> q+1
5852	7205759403792794L
5853	\end{verbatim}
5854
Therefore the best possible approximation to 1/10 in 754 double
precision is that value of \var{J} over 2**56, or
5857
5858	\begin{verbatim}
5859	7205759403792794 / 72057594037927936
5860	\end{verbatim}
5861
5862	Note that since we rounded up, this is actually a little bit larger than
5863	1/10; if we had not rounded up, the quotient would have been a little
5864	bit smaller than 1/10. But in no case can it be \emph{exactly} 1/10!
5865
5866	So the computer never ``sees'' 1/10: what it sees is the exact
5867	fraction given above, the best 754 double approximation it can get:
5868
5869	\begin{verbatim}
5870	>>> .1 * 2**56
5871	7205759403792794.0
5872	\end{verbatim}
5873
5874	If we multiply that fraction by 10**30, we can see the (truncated)
5875	value of its 30 most significant decimal digits:
5876
\begin{verbatim}
>>> 7205759403792794 * 10**30 / 2**56
100000000000000005551115123125L
\end{verbatim}
5881
5882	meaning that the exact number stored in the computer is approximately
5883	equal to the decimal value 0.100000000000000005551115123125. Rounding
5884	that to 17 significant digits gives the 0.10000000000000001 that Python
5885	displays (well, will display on any 754-conforming platform that does
5886	best-possible input and output conversions in its C library --- yours may
5887	not!).
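
The steps of this section can be collected into one small function, so
that the same analysis can be repeated for other fractions. This is
only a sketch under stated assumptions:
\function{nearest_binary_fraction()} is a name made up here, the
fraction must lie strictly between 0 and 1, and the exponent limits of
real doubles are ignored:

\begin{verbatim}
def nearest_binary_fraction(num, den):
    """Return (J, N) so that J / 2**N is the 53-bit fraction nearest num/den.

    Assumes 0 < num/den < 1 and ignores the exponent range of real doubles.
    """
    n = 0
    # Grow N until the truncated quotient has exactly 53 bits.
    while (num * 2 ** n) // den < 2 ** 52:
        n += 1
    q, r = divmod(num * 2 ** n, den)
    # Round to nearest; an exact tie goes to the even quotient.
    if 2 * r > den or (2 * r == den and q % 2 == 1):
        q += 1
    return q, n

j, n = nearest_binary_fraction(1, 10)
print(j)                     # 7205759403792794
print(n)                     # 56
print(j / 2.0 ** n == 0.1)   # True on IEEE-754 hardware
\end{verbatim}

The last line checks the result against the float that the interpreter
itself stores for \code{0.1}.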
5888
5889	\chapter{History and License}
5890	\input{license}
5891
5892	\input{glossary}
5893
5894	\input{tut.ind}
5895
5896	\end{document}
