\chapter{Introduction \label{intro}}


The Application Programmer's Interface to Python gives C and
\Cpp{} programmers access to the Python interpreter at a variety of
levels. The API is equally usable from \Cpp, but for brevity it is
generally referred to as the Python/C API. There are two
fundamentally different reasons for using the Python/C API. The first
reason is to write \emph{extension modules} for specific purposes;
these are C modules that extend the Python interpreter. This is
probably the most common use. The second reason is to use Python as a
component in a larger application; this technique is generally
referred to as \dfn{embedding} Python in an application.

Writing an extension module is a relatively well-understood process,
where a ``cookbook'' approach works well. There are several tools
that automate the process to some extent. While people have embedded
Python in other applications since its early existence, the process of
embedding Python is less straightforward than writing an extension.

Many API functions are useful independent of whether you're embedding
or extending Python; moreover, most applications that embed Python
will need to provide a custom extension as well, so it's probably a
good idea to become familiar with writing an extension before
attempting to embed Python in a real application.


\section{Include Files \label{includes}}

All function, type and macro definitions needed to use the Python/C
API are included in your code by the following line:

\begin{verbatim}
#include "Python.h"
\end{verbatim}

This implies inclusion of the following standard headers:
\code{<stdio.h>}, \code{<string.h>}, \code{<errno.h>},
\code{<limits.h>}, and \code{<stdlib.h>} (if available).

\begin{notice}[warning]
  Since Python may define some pre-processor definitions which affect
  the standard headers on some systems, you \emph{must} include
  \file{Python.h} before any standard headers are included.
\end{notice}

All user-visible names defined by \file{Python.h} (except those defined by
the included standard headers) have one of the prefixes \samp{Py} or
\samp{_Py}. Names beginning with \samp{_Py} are for internal use by
the Python implementation and should not be used by extension writers.
Structure member names do not have a reserved prefix.

\strong{Important:} user code should never define names that begin
with \samp{Py} or \samp{_Py}. This confuses the reader, and
jeopardizes the portability of the user code to future Python
versions, which may define additional names beginning with one of
these prefixes.

The header files are typically installed with Python. On \UNIX, these
are located in the directories
\file{\envvar{prefix}/include/python\var{version}/} and
\file{\envvar{exec_prefix}/include/python\var{version}/}, where
\envvar{prefix} and \envvar{exec_prefix} are defined by the
corresponding parameters to Python's \program{configure} script and
\var{version} is \code{sys.version[:3]}. On Windows, the headers are
installed in \file{\envvar{prefix}/include}, where \envvar{prefix} is
the installation directory specified to the installer.

To include the headers, place both directories (if different) on your
compiler's search path for includes. Do \emph{not} place the parent
directories on the search path and then use
\samp{\#include <python\shortversion/Python.h>}; this will break on
multi-platform builds since the platform independent headers under
\envvar{prefix} include the platform specific headers from
\envvar{exec_prefix}.

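
As a rough sketch of how the two include directories can be found for
a given installation, the following Python snippet asks the standard
\module{sysconfig} module for them. It assumes an interpreter recent
enough to ship \module{sysconfig}, which this chapter does not
otherwise rely on, and the helper name \function{include_dirs()} is
made up for illustration.

```python
import sysconfig


def include_dirs():
    """Return the distinct header directories for this interpreter.

    'include' is the platform-independent directory (under prefix) and
    'platinclude' is the platform-specific one (under exec_prefix).
    """
    paths = sysconfig.get_paths()
    dirs = []
    for key in ("include", "platinclude"):
        directory = paths[key]
        if directory not in dirs:  # the two are often the same directory
            dirs.append(directory)
    return dirs


if __name__ == "__main__":
    # Print one -I option per directory, ready for a compiler command line.
    print(" ".join("-I" + d for d in include_dirs()))
```

Passing each returned directory to the compiler with its own
\samp{-I} option keeps \samp{\#include "Python.h"} working without
putting the parent directories on the search path.
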

\Cpp{} users should note that though the API is defined entirely using
C, the header files do properly declare the entry points to be
\code{extern "C"}, so there is no need to do anything special to use
the API from \Cpp.


\section{Objects, Types and Reference Counts \label{objects}}

Most Python/C API functions have one or more arguments as well as a
return value of type \ctype{PyObject*}. This type is a pointer
to an opaque data type representing an arbitrary Python
object. Since all Python object types are treated the same way by the
Python language in most situations (e.g., assignments, scope rules,
and argument passing), it is only fitting that they should be
represented by a single C type. Almost all Python objects live on the
heap: you never declare an automatic or static variable of type
\ctype{PyObject}; only pointer variables of type \ctype{PyObject*} can
be declared. The sole exception is the type objects\obindex{type};
since these must never be deallocated, they are typically static
\ctype{PyTypeObject} objects.

All Python objects (even Python integers) have a \dfn{type} and a
\dfn{reference count}. An object's type determines what kind of object
it is (e.g., an integer, a list, or a user-defined function; there are
many more as explained in the \citetitle[../ref/ref.html]{Python
Reference Manual}). For each of the well-known types there is a macro
to check whether an object is of that type; for instance,
\samp{PyList_Check(\var{a})} is true if (and only if) the object
pointed to by \var{a} is a Python list.


\subsection{Reference Counts \label{refcounts}}

The reference count is important because today's computers have a
finite (and often severely limited) memory size; it counts how many
different places there are that have a reference to an object. Such a
place could be another object, or a global (or static) C variable, or
a local variable in some C function. When an object's reference count
becomes zero, the object is deallocated. If it contains references to
other objects, their reference counts are decremented. Those other
objects may be deallocated in turn, if this decrement makes their
reference count become zero, and so on. (There's an obvious problem
with objects that reference each other here; for now, the solution is
``don't do that.'')

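
The counting described above can be observed from Python itself. The
sketch below is only an illustration, not part of the C API: it
assumes a CPython interpreter that provides
\function{sys.getrefcount()}, and the function name
\function{refcount_deltas()} is invented here. Absolute counts vary
between interpreter versions, so only the differences are reported.

```python
import sys


def refcount_deltas():
    """Report how an object's reference count moves as references come and go."""
    obj = object()
    base = sys.getrefcount(obj)      # absolute value is interpreter-specific

    holder = [obj]                   # the list is one more place referring to obj
    after_store = sys.getrefcount(obj) - base

    holder.clear()                   # the list gives its reference back
    after_release = sys.getrefcount(obj) - base

    return after_store, after_release


if __name__ == "__main__":
    print(refcount_deltas())
```

Storing the object in a container raises the count by one, and
emptying the container brings it back to where it started.
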

Reference counts are always manipulated explicitly. The normal way is
to use the macro \cfunction{Py_INCREF()}\ttindex{Py_INCREF()} to
increment an object's reference count by one, and
\cfunction{Py_DECREF()}\ttindex{Py_DECREF()} to decrement it by
one. The \cfunction{Py_DECREF()} macro is considerably more complex
than the incref one, since it must check whether the reference count
becomes zero and then cause the object's deallocator to be called.
The deallocator is a function pointer contained in the object's type
structure. The type-specific deallocator takes care of decrementing
the reference counts for other objects contained in the object if this
is a compound object type, such as a list, as well as performing any
additional finalization that's needed. There's no chance that the
reference count can overflow; at least as many bits are used to hold
the reference count as there are distinct memory locations in virtual
memory (assuming \code{sizeof(long) >= sizeof(char*)}). Thus, the
reference count increment is a simple operation.

It is not necessary to increment an object's reference count for every
local variable that contains a pointer to an object. In theory, the
object's reference count goes up by one when the variable is made to
point to it and it goes down by one when the variable goes out of
scope. However, these two cancel each other out, so at the end the
reference count hasn't changed. The only real reason to use the
reference count is to prevent the object from being deallocated as
long as our variable is pointing to it. If we know that there is at
least one other reference to the object that lives at least as long as
our variable, there is no need to increment the reference count
temporarily. An important situation where this arises is in objects
that are passed as arguments to C functions in an extension module
that are called from Python; the call mechanism guarantees to hold a
reference to every argument for the duration of the call.

However, a common pitfall is to extract an object from a list and
hold on to it for a while without incrementing its reference count.
Some other operation might conceivably remove the object from the
list, decrementing its reference count and possibly deallocating it.
The real danger is that innocent-looking operations may invoke
arbitrary Python code which could do this; there is a code path which
allows control to flow back to the user from a \cfunction{Py_DECREF()},
so almost any operation is potentially dangerous.

A safe approach is to always use the generic operations (functions
whose name begins with \samp{PyObject_}, \samp{PyNumber_},
\samp{PySequence_} or \samp{PyMapping_}). These operations always
increment the reference count of the object they return. This leaves
the caller with the responsibility to call
\cfunction{Py_DECREF()} when they are done with the result; this soon
becomes second nature.


\subsubsection{Reference Count Details \label{refcountDetails}}

The reference count behavior of functions in the Python/C API is best
explained in terms of \emph{ownership of references}. Ownership
pertains to references, never to objects (objects are not owned: they
are always shared). ``Owning a reference'' means being responsible for
calling \cfunction{Py_DECREF()} on it when the reference is no longer
needed. Ownership can also be transferred, meaning that the code that
receives ownership of the reference then becomes responsible for
eventually decref'ing it by calling \cfunction{Py_DECREF()} or
\cfunction{Py_XDECREF()} when it's no longer needed---or passing on
this responsibility (usually to its caller).
When a function passes ownership of a reference on to its caller, the
caller is said to receive a \emph{new} reference. When no ownership
is transferred, the caller is said to \emph{borrow} the reference.
Nothing needs to be done for a borrowed reference.

Conversely, when a calling function passes in a reference to an
object, there are two possibilities: the function \emph{steals} a
reference to the object, or it does not. \emph{Stealing a reference}
means that when you pass a reference to a function, that function
assumes that it now owns that reference, and you are not responsible
for it any longer.

Few functions steal references; the two notable exceptions are
\cfunction{PyList_SetItem()}\ttindex{PyList_SetItem()} and
\cfunction{PyTuple_SetItem()}\ttindex{PyTuple_SetItem()}, which
steal a reference to the item (but not to the tuple or list into which
the item is put!). These functions were designed to steal a reference
because of a common idiom for populating a tuple or list with newly
created objects; for example, the code to create the tuple \code{(1,
2, "three")} could look like this (forgetting about error handling for
the moment; a better way to code this is shown below):

\begin{verbatim}
PyObject *t;

t = PyTuple_New(3);
PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
PyTuple_SetItem(t, 2, PyString_FromString("three"));
\end{verbatim}

Here, \cfunction{PyInt_FromLong()} returns a new reference which is
immediately stolen by \cfunction{PyTuple_SetItem()}. When you want to
keep using an object although the reference to it will be stolen,
use \cfunction{Py_INCREF()} to grab another reference before calling the
reference-stealing function.

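
The ``grab another reference first'' rule can be tried out without
compiling anything. The following sketch is an assumption-laden
illustration rather than the usual way to call the API: it drives the
running interpreter's C API from Python through
\code{ctypes.pythonapi}, uses the exported function
\cfunction{Py_IncRef()} in place of the \cfunction{Py_INCREF()} macro,
and the helper name \function{store_and_keep()} is hypothetical.

```python
import ctypes
import sys

api = ctypes.pythonapi  # the C API of the interpreter running this script

api.Py_IncRef.argtypes = [ctypes.py_object]
api.Py_IncRef.restype = None
api.PyList_SetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t,
                               ctypes.py_object]
api.PyList_SetItem.restype = ctypes.c_int


def store_and_keep(target, index, item):
    """Put item into the list target at index while keeping our own reference."""
    # PyList_SetItem() steals one reference to item, so create the reference
    # that it is going to take over before making the call.
    api.Py_IncRef(item)
    # ctypes.pythonapi checks the error indicator after each call, so a
    # failure (for example a bad index) surfaces as a Python exception.
    api.PyList_SetItem(target, index, item)


if __name__ == "__main__":
    item = object()
    target = [None]
    before = sys.getrefcount(item)
    store_and_keep(target, 0, item)
    # The list now owns exactly one additional reference to item.
    print(target[0] is item, sys.getrefcount(item) - before)
```

Without the call to \cfunction{Py_IncRef()}, the list would take over
a reference that the caller still believes it owns, and the object
could be deallocated while still in use.
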

Incidentally, \cfunction{PyTuple_SetItem()} is the \emph{only} way to
set tuple items; \cfunction{PySequence_SetItem()} and
\cfunction{PyObject_SetItem()} refuse to do this since tuples are an
immutable data type. You should only use
\cfunction{PyTuple_SetItem()} for tuples that you are creating
yourself.

Equivalent code for populating a list can be written using
\cfunction{PyList_New()} and \cfunction{PyList_SetItem()}.

However, in practice, you will rarely use these ways of
creating and populating a tuple or list. There's a generic function,
\cfunction{Py_BuildValue()}, that can create most common objects from
C values, directed by a \dfn{format string}. For example, the
above two blocks of code could be replaced by the following (which
also takes care of the error checking):

\begin{verbatim}
PyObject *tuple, *list;

tuple = Py_BuildValue("(iis)", 1, 2, "three");
list = Py_BuildValue("[iis]", 1, 2, "three");
\end{verbatim}

It is much more common to use \cfunction{PyObject_SetItem()} and
friends with items whose references you are only borrowing, like
arguments that were passed in to the function you are writing. In
that case, their behaviour regarding reference counts is much saner,
since you don't have to increment a reference count so you can give a
reference away (``have it be stolen''). For example, this function
sets all items of a list (actually, any mutable sequence) to a given
item:

\begin{verbatim}
int
set_all(PyObject *target, PyObject *item)
{
    int i, n;

    n = PyObject_Length(target);
    if (n < 0)
        return -1;
    for (i = 0; i < n; i++) {
        PyObject *index = PyInt_FromLong(i);
        if (!index)
            return -1;
        if (PyObject_SetItem(target, index, item) < 0) {
            Py_DECREF(index);  /* Don't leak the index on failure */
            return -1;
        }
        Py_DECREF(index);
    }
    return 0;
}
\end{verbatim}
\ttindex{set_all()}

The situation is slightly different for function return values.
While passing a reference to most functions does not change your
ownership responsibilities for that reference, many functions that
return a reference to an object give you ownership of the reference.
The reason is simple: in many cases, the returned object is created
on the fly, and the reference you get is the only reference to the
object. Therefore, the generic functions that return object
references, like \cfunction{PyObject_GetItem()} and
\cfunction{PySequence_GetItem()}, always return a new reference (the
caller becomes the owner of the reference).

It is important to realize that whether you own a reference returned
by a function depends on which function you call only --- \emph{the
plumage} (the type of the object passed as an
argument to the function) \emph{doesn't enter into it!} Thus, if you
extract an item from a list using \cfunction{PyList_GetItem()}, you
don't own the reference --- but if you obtain the same item from the
same list using \cfunction{PySequence_GetItem()} (which happens to
take exactly the same arguments), you do own a reference to the
returned object.

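
The difference between the two functions shows up directly in the
reference count. The sketch below is an illustration under several
assumptions: it calls the running interpreter's C API through
\code{ctypes.pythonapi}, declares the results as raw pointers so that
\module{ctypes} does not adjust any reference counts on its own, uses
the exported function \cfunction{Py_DecRef()} in place of the
\cfunction{Py_DECREF()} macro, and the helper name
\function{compare_getitem()} is invented for this example.

```python
import ctypes
import sys

api = ctypes.pythonapi  # the C API of the interpreter running this script

# Results are declared as raw pointers (plain integers in Python) so that
# ctypes neither takes over nor adds a reference behind our back.
api.PyList_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PyList_GetItem.restype = ctypes.c_void_p
api.PySequence_GetItem.argtypes = [ctypes.py_object, ctypes.c_ssize_t]
api.PySequence_GetItem.restype = ctypes.c_void_p
api.Py_DecRef.argtypes = [ctypes.c_void_p]
api.Py_DecRef.restype = None


def compare_getitem(seq, index):
    """Fetch seq[index] from a list both ways and report the count changes."""
    item = seq[index]
    base = sys.getrefcount(item)

    borrowed = api.PyList_GetItem(seq, index)      # borrowed reference
    borrowed_delta = sys.getrefcount(item) - base

    owned = api.PySequence_GetItem(seq, index)     # new reference, we own it
    owned_delta = sys.getrefcount(item) - base

    api.Py_DecRef(owned)                           # give the owned one back
    released_delta = sys.getrefcount(item) - base

    same_object = borrowed == owned == id(item)    # CPython: id() is the address
    return same_object, borrowed_delta, owned_delta, released_delta


if __name__ == "__main__":
    print(compare_getitem([object()], 0))
```

Both calls return a pointer to the same object, but only
\cfunction{PySequence_GetItem()} leaves the caller with a reference
that has to be released.
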

Here is an example of how you could write a function that computes the
sum of the items in a list of integers; once using
\cfunction{PyList_GetItem()}\ttindex{PyList_GetItem()}, and once using
\cfunction{PySequence_GetItem()}\ttindex{PySequence_GetItem()}.

\begin{verbatim}
long
sum_list(PyObject *list)
{
    int i, n;
    long total = 0;
    PyObject *item;

    n = PyList_Size(list);
    if (n < 0)
        return -1; /* Not a list */
    for (i = 0; i < n; i++) {
        item = PyList_GetItem(list, i); /* Can't fail */
        if (!PyInt_Check(item)) continue; /* Skip non-integers */
        total += PyInt_AsLong(item);
    }
    return total;
}
\end{verbatim}
\ttindex{sum_list()}

\begin{verbatim}
long
sum_sequence(PyObject *sequence)
{
    int i, n;
    long total = 0;
    PyObject *item;
    n = PySequence_Length(sequence);
    if (n < 0)
        return -1; /* Has no length */
    for (i = 0; i < n; i++) {
        item = PySequence_GetItem(sequence, i);
        if (item == NULL)
            return -1; /* Not a sequence, or other failure */
        if (PyInt_Check(item))
            total += PyInt_AsLong(item);
        Py_DECREF(item); /* Discard reference ownership */
    }
    return total;
}
\end{verbatim}
\ttindex{sum_sequence()}


\subsection{Types \label{types}}

There are few other data types that play a significant role in
the Python/C API; most are simple C types such as \ctype{int},
\ctype{long}, \ctype{double} and \ctype{char*}. A few structure types
are used to describe static tables used to list the functions exported
by a module or the data attributes of a new object type, and another
is used to describe the value of a complex number. These will
be discussed together with the functions that use them.


\section{Exceptions \label{exceptions}}

The Python programmer only needs to deal with exceptions if specific
error handling is required; unhandled exceptions are automatically
propagated to the caller, then to the caller's caller, and so on, until
they reach the top-level interpreter, where they are reported to the
user accompanied by a stack traceback.

For C programmers, however, error checking always has to be explicit.
All functions in the Python/C API can raise exceptions, unless an
explicit claim is made otherwise in a function's documentation. In
general, when a function encounters an error, it sets an exception,
discards any object references that it owns, and returns an
error indicator --- usually \NULL{} or \code{-1}. A few functions
return a Boolean true/false result, with false indicating an error.
Very few functions return no explicit error indicator or have an
ambiguous return value, and require explicit testing for errors with
\cfunction{PyErr_Occurred()}\ttindex{PyErr_Occurred()}.

Exception state is maintained in per-thread storage (this is
equivalent to using global storage in an unthreaded application). A
thread can be in one of two states: an exception has occurred, or not.
The function \cfunction{PyErr_Occurred()} can be used to check for
this: it returns a borrowed reference to the exception type object
when an exception has occurred, and \NULL{} otherwise. There are a
number of functions to set the exception state:
\cfunction{PyErr_SetString()}\ttindex{PyErr_SetString()} is the most
common (though not the most general) function to set the exception
state, and \cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} clears the
exception state.

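
To see these functions act on the per-thread exception state, the
following sketch calls them from Python through
\code{ctypes.pythonapi}. This is an illustration under stated
assumptions, not how extension code is normally written: it relies on
\code{ctypes.pythonapi} checking the error indicator after each call
and raising the pending exception, and the helper names
\function{raise_from_c_api()} and \function{error_indicator_set()}
are hypothetical.

```python
import ctypes

api = ctypes.pythonapi  # the C API of the interpreter running this script

api.PyErr_SetString.argtypes = [ctypes.py_object, ctypes.c_char_p]
api.PyErr_SetString.restype = None
api.PyErr_Occurred.argtypes = []
api.PyErr_Occurred.restype = ctypes.c_void_p  # NULL is mapped to None


def raise_from_c_api(exc_type, message):
    """Set the exception state with PyErr_SetString().

    ctypes.pythonapi inspects the error indicator once the call returns,
    so the exception that was set is raised in the Python caller.
    """
    api.PyErr_SetString(exc_type, message.encode("utf-8"))


def error_indicator_set():
    """Return True if the C-level error indicator is set for this thread."""
    return api.PyErr_Occurred() is not None


if __name__ == "__main__":
    try:
        raise_from_c_api(ValueError, "set from the C API")
    except ValueError as exc:
        print("caught:", exc)
    # Once the exception has been handled, the indicator is clear again.
    print("indicator set:", error_indicator_set())
```

In C, the caller of a function that sets the exception state would see
an error return value and would either pass the error on or clear it
with \cfunction{PyErr_Clear()}.
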

The full exception state consists of three objects (all of which can
be \NULL): the exception type, the corresponding exception
value, and the traceback. These have the same meanings as the Python
\withsubitem{(in module sys)}{
  \ttindex{exc_type}\ttindex{exc_value}\ttindex{exc_traceback}}
objects \code{sys.exc_type}, \code{sys.exc_value}, and
\code{sys.exc_traceback}; however, they are not the same: the Python
objects represent the last exception being handled by a Python
\keyword{try} \ldots\ \keyword{except} statement, while the C level
exception state only exists while an exception is being passed on
between C functions until it reaches the Python bytecode interpreter's
main loop, which takes care of transferring it to \code{sys.exc_type}
and friends.

Note that starting with Python 1.5, the preferred, thread-safe way to
access the exception state from Python code is to call the function
\withsubitem{(in module sys)}{\ttindex{exc_info()}}
\function{sys.exc_info()}, which returns the per-thread exception state
for Python code. Also, the semantics of both ways to access the
exception state have changed so that a function which catches an
exception will save and restore its thread's exception state so as to
preserve the exception state of its caller. This prevents common bugs
in exception handling code caused by an innocent-looking function
overwriting the exception being handled; it also reduces the often
unwanted lifetime extension for objects that are referenced by the
stack frames in the traceback.

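
The save-and-restore behaviour can be checked from Python with a few
lines. This is a minimal sketch; the function names
\function{swallow_key_error()} and
\function{current_exception_after_helper()} are invented for the
example.

```python
import sys


def swallow_key_error():
    """Catch an unrelated exception internally, as a helper function might."""
    try:
        {}["missing"]
    except KeyError:
        pass


def current_exception_after_helper():
    """Return what sys.exc_info() reports in a handler after calling the helper."""
    try:
        raise ValueError("outer")
    except ValueError:
        swallow_key_error()
        exc_type, exc_value, _traceback = sys.exc_info()
        return exc_type, str(exc_value)


if __name__ == "__main__":
    print(current_exception_after_helper())
```

The handler still sees its own \exception{ValueError}: the
\exception{KeyError} caught inside the helper did not overwrite the
caller's exception state.
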

As a general principle, a function that calls another function to
perform some task should check whether the called function raised an
exception, and if so, pass the exception state on to its caller. It
should discard any object references that it owns, and return an
error indicator, but it should \emph{not} set another exception ---
that would overwrite the exception that was just raised, and lose
important information about the exact cause of the error.

A simple example of detecting exceptions and passing them on is shown
in the \cfunction{sum_sequence()}\ttindex{sum_sequence()} example
above. It so happens that this example doesn't need to clean up any
owned references when it detects an error. The following example
function shows some error cleanup. First, to remind you why you like
Python, we show the equivalent Python code:

\begin{verbatim}
def incr_item(dict, key):
    try:
        item = dict[key]
    except KeyError:
        item = 0
    dict[key] = item + 1
\end{verbatim}
\ttindex{incr_item()}

Here is the corresponding C code, in all its glory:

\begin{verbatim}
int
incr_item(PyObject *dict, PyObject *key)
{
    /* Objects all initialized to NULL for Py_XDECREF */
    PyObject *item = NULL, *const_one = NULL, *incremented_item = NULL;
    int rv = -1; /* Return value initialized to -1 (failure) */

    item = PyObject_GetItem(dict, key);
    if (item == NULL) {
        /* Handle KeyError only: */
        if (!PyErr_ExceptionMatches(PyExc_KeyError))
            goto error;

        /* Clear the error and use zero: */
        PyErr_Clear();
        item = PyInt_FromLong(0L);
        if (item == NULL)
            goto error;
    }
    const_one = PyInt_FromLong(1L);
    if (const_one == NULL)
        goto error;

    incremented_item = PyNumber_Add(item, const_one);
    if (incremented_item == NULL)
        goto error;

    if (PyObject_SetItem(dict, key, incremented_item) < 0)
        goto error;
    rv = 0; /* Success */
    /* Continue with cleanup code */

 error:
    /* Cleanup code, shared by success and failure path */

    /* Use Py_XDECREF() to ignore NULL references */
    Py_XDECREF(item);
    Py_XDECREF(const_one);
    Py_XDECREF(incremented_item);

    return rv; /* -1 for error, 0 for success */
}
\end{verbatim}
\ttindex{incr_item()}

This example represents an endorsed use of the \keyword{goto} statement
in C! It illustrates the use of
\cfunction{PyErr_ExceptionMatches()}\ttindex{PyErr_ExceptionMatches()} and
\cfunction{PyErr_Clear()}\ttindex{PyErr_Clear()} to
handle specific exceptions, and the use of
\cfunction{Py_XDECREF()}\ttindex{Py_XDECREF()} to
dispose of owned references that may be \NULL{} (note the
\character{X} in the name; \cfunction{Py_DECREF()} would crash when
confronted with a \NULL{} reference). It is important that the
variables used to hold owned references are initialized to \NULL{} for
this to work; likewise, the proposed return value is initialized to
\code{-1} (failure) and only set to success after the final call made
is successful.


\section{Embedding Python \label{embedding}}

The one important task that only embedders (as opposed to extension
writers) of the Python interpreter have to worry about is the
initialization, and possibly the finalization, of the Python
interpreter. Most functionality of the interpreter can only be used
after the interpreter has been initialized.

The basic initialization function is
\cfunction{Py_Initialize()}\ttindex{Py_Initialize()}.
This initializes the table of loaded modules, and creates the
fundamental modules \module{__builtin__}\refbimodindex{__builtin__},
\module{__main__}\refbimodindex{__main__}, \module{sys}\refbimodindex{sys},
and \module{exceptions}.\refbimodindex{exceptions} It also initializes
the module search path (\code{sys.path}).%
\indexiii{module}{search}{path}
\withsubitem{(in module sys)}{\ttindex{path}}

\cfunction{Py_Initialize()} does not set the ``script argument list''
(\code{sys.argv}). If this variable is needed by Python code that
will be executed later, it must be set explicitly with a call to
\code{PySys_SetArgv(\var{argc},
\var{argv})}\ttindex{PySys_SetArgv()} subsequent to the call to
\cfunction{Py_Initialize()}.

On most systems (in particular, on \UNIX{} and Windows, although the
details are slightly different),
\cfunction{Py_Initialize()} calculates the module search path based
upon its best guess for the location of the standard Python
interpreter executable, assuming that the Python library is found in a
fixed location relative to the Python interpreter executable. In
particular, it looks for a directory named
\file{lib/python\shortversion} relative to the parent directory where
the executable named \file{python} is found on the shell command
search path (the environment variable \envvar{PATH}).

For instance, if the Python executable is found in
\file{/usr/local/bin/python}, it will assume that the libraries are in
\file{/usr/local/lib/python\shortversion}. (In fact, this particular path
is also the ``fallback'' location, used when no executable file named
\file{python} is found along \envvar{PATH}.) The user can override
this behavior by setting the environment variable \envvar{PYTHONHOME},
or insert additional directories in front of the standard path by
setting \envvar{PYTHONPATH}.

The embedding application can steer the search by calling
\code{Py_SetProgramName(\var{file})}\ttindex{Py_SetProgramName()}
\emph{before} calling
\cfunction{Py_Initialize()}. Note that \envvar{PYTHONHOME} still
overrides this and \envvar{PYTHONPATH} is still inserted in front of
the standard path. An application that requires total control has to
provide its own implementation of
\cfunction{Py_GetPath()}\ttindex{Py_GetPath()},
\cfunction{Py_GetPrefix()}\ttindex{Py_GetPrefix()},
\cfunction{Py_GetExecPrefix()}\ttindex{Py_GetExecPrefix()}, and
\cfunction{Py_GetProgramFullPath()}\ttindex{Py_GetProgramFullPath()} (all
defined in \file{Modules/getpath.c}).

Sometimes, it is desirable to ``uninitialize'' Python. For instance,
the application may want to start over (make another call to
\cfunction{Py_Initialize()}) or the application is simply done with its
use of Python and wants to free memory allocated by Python. This
can be accomplished by calling \cfunction{Py_Finalize()}. The function
\cfunction{Py_IsInitialized()}\ttindex{Py_IsInitialized()} returns
true if Python is currently in the initialized state. More
information about these functions is given in a later chapter.
Note that \cfunction{Py_Finalize()} does \emph{not} free all memory
allocated by the Python interpreter; for example, memory allocated by
extension modules currently cannot be released.


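
The outcome of initialization can be inspected from inside a running
interpreter. The sketch below is only an illustration: it assumes the
C API is reachable through \code{ctypes.pythonapi}, and the helper
name \function{interpreter_state()} is made up. It reports whether
\cfunction{Py_IsInitialized()} returns true, together with the prefix
values and the module search path that initialization computed.

```python
import ctypes
import sys

api = ctypes.pythonapi  # the C API of the interpreter running this script
api.Py_IsInitialized.argtypes = []
api.Py_IsInitialized.restype = ctypes.c_int


def interpreter_state():
    """Collect what initialization has set up for the running interpreter."""
    return {
        "initialized": bool(api.Py_IsInitialized()),
        "prefix": sys.prefix,
        "exec_prefix": sys.exec_prefix,
        "search_path": list(sys.path),
    }


if __name__ == "__main__":
    state = interpreter_state()
    print("initialized:", state["initialized"])
    print("prefix:", state["prefix"])
    print("exec_prefix:", state["exec_prefix"])
    for entry in state["search_path"]:
        print("path entry:", entry)
```

Running this with and without \envvar{PYTHONPATH} set is a quick way
to see the extra directories being inserted in front of the standard
path.
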

\section{Debugging Builds \label{debugging}}

Python can be built with several macros to enable extra checks of the
interpreter and extension modules. These checks tend to add a large
amount of overhead to the runtime so they are not enabled by default.

A full list of the various types of debugging builds is in the file
\file{Misc/SpecialBuilds.txt} in the Python source distribution.
Builds are available that support tracing of reference counts,
debugging the memory allocator, or low-level profiling of the main
interpreter loop. Only the most frequently used builds will be
described in the remainder of this section.

Compiling the interpreter with the \csimplemacro{Py_DEBUG} macro
defined produces what is generally meant by ``a debug build'' of Python.
\csimplemacro{Py_DEBUG} is enabled in the \UNIX{} build by adding
\longprogramopt{with-pydebug} to the \file{configure} command. It is also
implied by the presence of the not-Python-specific
\csimplemacro{_DEBUG} macro. When \csimplemacro{Py_DEBUG} is enabled
in the \UNIX{} build, compiler optimization is disabled.

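
Whether a given interpreter is such a build can be checked at run
time. The following sketch rests on assumptions that go beyond this
section: that the \module{sysconfig} module is available, that debug
builds expose \function{sys.gettotalrefcount()}, and that builds with
reference tracing expose \function{sys.getobjects()}. The helper name
\function{build_flags()} is hypothetical.

```python
import sys
import sysconfig


def build_flags():
    """Report which debugging features this interpreter appears to have."""
    return {
        # Recorded by the configure script; may be unset on some platforms.
        "Py_DEBUG": bool(sysconfig.get_config_var("Py_DEBUG")),
        # Present when total reference counts are being tracked.
        "has_gettotalrefcount": hasattr(sys, "gettotalrefcount"),
        # Present when the list of live objects is being maintained.
        "has_getobjects": hasattr(sys, "getobjects"),
    }


if __name__ == "__main__":
    for name, value in sorted(build_flags().items()):
        print(name, value)
```

On an ordinary release build all three values are expected to be
false.
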

In addition to the reference count debugging described below, the
following extra checks are performed:

\begin{itemize}
  \item Extra checks are added to the object allocator.
  \item Extra checks are added to the parser and compiler.
  \item Downcasts from wide types to narrow types are checked for
        loss of information.
  \item A number of assertions are added to the dictionary and set
        implementations. In addition, the set object acquires a
        \method{test_c_api} method.
  \item Sanity checks of the input arguments are added to frame
        creation.
  \item The storage for long ints is initialized with a known
        invalid pattern to catch references to uninitialized
        digits.
  \item Low-level tracing and extra exception checking are added
        to the runtime virtual machine.
  \item Extra checks are added to the memory arena implementation.
  \item Extra debugging is added to the thread module.
\end{itemize}

There may be additional checks not mentioned here.

Defining \csimplemacro{Py_TRACE_REFS} enables reference tracing. When
defined, a circular doubly linked list of active objects is maintained
by adding two extra fields to every \ctype{PyObject}. Total
allocations are tracked as well. Upon exit, all existing references
are printed. (In interactive mode this happens after every statement
run by the interpreter.) Reference tracing is implied by
\csimplemacro{Py_DEBUG}.

Please refer to \file{Misc/SpecialBuilds.txt} in the Python source
distribution for more detailed information.