\chapter{The Python Profilers \label{profile}}

\sectionauthor{James Roskind}{}

Copyright \copyright{} 1994, by InfoSeek Corporation, all rights reserved.
\index{InfoSeek Corporation}

Written by James Roskind.\footnote{
Updated and converted to \LaTeX\ by Guido van Rossum.
Further updated by Armin Rigo to integrate the documentation for the new
\module{cProfile} module of Python 2.5.}

Permission to use, copy, modify, and distribute this Python software
and its associated documentation for any purpose (subject to the
restriction in the following sentence) without fee is hereby granted,
provided that the above copyright notice appears in all copies, and
that both that copyright notice and this permission notice appear in
supporting documentation, and that the name of InfoSeek not be used in
advertising or publicity pertaining to distribution of the software
without specific, written prior permission. This permission is
explicitly restricted to the copying and modification of the software
to remain in Python, compiled Python, or other languages (such as C)
wherein the modified or derived code is exclusively imported into a
Python module.

INFOSEEK CORPORATION DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS
SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS. IN NO EVENT SHALL INFOSEEK CORPORATION BE LIABLE FOR ANY
SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER
RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF
CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN
CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


The profiler was written after only programming in Python for 3 weeks.
As a result, it is probably clumsy code, but I don't know for sure yet
'cause I'm a beginner :-). I did work hard to make the code run fast,
so that profiling would be a reasonable thing to do. I tried not to
repeat code fragments, but I'm sure I did some stuff in really awkward
ways at times. Please send suggestions for improvements to:
\email{jar@netscape.com}. I won't promise \emph{any} support. ...but
I'd appreciate the feedback.


\section{Introduction to the profilers}
\nodename{Profiler Introduction}

A \dfn{profiler} is a program that describes the run time performance
of a program, providing a variety of statistics. This documentation
describes the profiler functionality provided in the modules
\module{profile}, \module{cProfile} and \module{pstats}. This profiler
provides \dfn{deterministic profiling} of any Python program. It also
provides a series of report generation tools to allow users to rapidly
examine the results of a profile operation.
\index{deterministic profiling}
\index{profiling, deterministic}

The Python standard library provides three different profilers:

\begin{enumerate}
\item \module{profile}, a pure Python module, described in the sequel.
Copyright \copyright{} 1994, by InfoSeek Corporation.
\versionchanged[also reports the time spent in calls to built-in
functions and methods]{2.4}

\item \module{cProfile}, a module written in C, with a reasonable
overhead that makes it suitable for profiling long-running programs.
Based on \module{lsprof}, contributed by Brett Rosen and Ted Czotter.
\versionadded{2.5}

\item \module{hotshot}, a C module focusing on minimizing the overhead
while profiling, at the expense of long data post-processing times.
\versionchanged[the results should be more meaningful than in the
past: the timing core contained a critical bug]{2.5}
\end{enumerate}

The \module{profile} and \module{cProfile} modules export the same
interface, so they are mostly interchangeable; \module{cProfile} has a
much lower overhead but has not yet been as well tested and might not be
available on all systems. \module{cProfile} is really a compatibility
layer on top of the internal \module{_lsprof} module. The
\module{hotshot} module is reserved for specialized uses.

%\section{How Is This Profiler Different From The Old Profiler?}
%\nodename{Profiler Changes}
%
%(This section is of historical importance only; the old profiler
%discussed here was last seen in Python 1.1.)
%
%The big changes from old profiling module are that you get more
%information, and you pay less CPU time. It's not a trade-off, it's a
%trade-up.
%
%To be specific:
%
%\begin{description}
%
%\item[Bugs removed:]
%Local stack frame is no longer molested, execution time is now charged
%to correct functions.
%
%\item[Accuracy increased:]
%Profiler execution time is no longer charged to user's code,
%calibration for platform is supported, file reads are not done \emph{by}
%profiler \emph{during} profiling (and charged to user's code!).
%
%\item[Speed increased:]
%Overhead CPU cost was reduced by more than a factor of two (perhaps a
%factor of five), lightweight profiler module is all that must be
%loaded, and the report generating module (\module{pstats}) is not needed
%during profiling.
%
%\item[Recursive functions support:]
%Cumulative times in recursive functions are correctly calculated;
%recursive entries are counted.
%
%\item[Large growth in report generating UI:]
%Distinct profiles runs can be added together forming a comprehensive
%report; functions that import statistics take arbitrary lists of
%files; sorting criteria is now based on keywords (instead of 4 integer
%options); reports shows what functions were profiled as well as what
%profile file was referenced; output format has been improved.
%
%\end{description}


\section{Instant User's Manual \label{profile-instant}}

This section is provided for users who ``don't want to read the
manual.'' It provides a very brief overview, and allows a user to
rapidly perform profiling on an existing application.

To profile an application with a main entry point of \function{foo()},
you would add the following to your module:

\begin{verbatim}
import cProfile
cProfile.run('foo()')
\end{verbatim}

(Use \module{profile} instead of \module{cProfile} if the latter is not
available on your system.)

The above action would cause \function{foo()} to be run, and a series of
informative lines (the profile) to be printed. The above approach is
most useful when working with the interpreter. If you would like to
save the results of a profile into a file for later examination, you
can supply a file name as the second argument to the \function{run()}
function:

\begin{verbatim}
import cProfile
cProfile.run('foo()', 'fooprof')
\end{verbatim}

The file \file{cProfile.py} can also be invoked as
a script to profile another script. For example:

\begin{verbatim}
python -m cProfile myscript.py
\end{verbatim}

\file{cProfile.py} accepts two optional arguments on the command line:

\begin{verbatim}
cProfile.py [-o output_file] [-s sort_order]
\end{verbatim}

\programopt{-s} only applies to standard output (that is, when
\programopt{-o} is not supplied). Look in the \class{Stats}
documentation for valid sort values.

When you wish to review the profile, you should use the methods in the
\module{pstats} module. Typically you would load the statistics data as
follows:

\begin{verbatim}
import pstats
p = pstats.Stats('fooprof')
\end{verbatim}

The class \class{Stats} (the above code just created an instance of
this class) has a variety of methods for manipulating and printing the
data that was just read into \code{p}. When you ran
\function{cProfile.run()} above, what was printed was the result of three
method calls:

\begin{verbatim}
p.strip_dirs().sort_stats(-1).print_stats()
\end{verbatim}

The first method removed the extraneous path from all the module
names. The second method sorted all the entries according to the
standard module/line/name string that is printed.
%(this is to comply with the semantics of the old profiler).
The third method printed out
all the statistics. You might try the following sort calls:

\begin{verbatim}
p.sort_stats('name')
p.print_stats()
\end{verbatim}

The first call will actually sort the list by function name, and the
second call will print out the statistics. The following are some
interesting calls to experiment with:

\begin{verbatim}
p.sort_stats('cumulative').print_stats(10)
\end{verbatim}

This sorts the profile by cumulative time in a function, and then only
prints the ten most significant lines. If you want to understand what
algorithms are taking time, the above line is what you would use.

If you were looking to see what functions were looping a lot, and
taking a lot of time, you would do:

\begin{verbatim}
p.sort_stats('time').print_stats(10)
\end{verbatim}

to sort according to time spent within each function, and then print
the statistics for the top ten functions.

You might also try:

\begin{verbatim}
p.sort_stats('file').print_stats('__init__')
\end{verbatim}

This will sort all the statistics by file name, and then print out
statistics for only the class init methods (since they are spelled
with \code{__init__} in them). As one final example, you could try:

\begin{verbatim}
p.sort_stats('time', 'cum').print_stats(.5, 'init')
\end{verbatim}

This line sorts statistics with a primary key of time, and a secondary
key of cumulative time, and then prints out some of the statistics.
To be specific, the list is first culled down to 50\% (that is, \samp{.5})
of its original size, then only lines containing \code{init} are
retained, and that sub-sub-list is printed.

If you wondered what functions called the above functions, you could
now (\code{p} is still sorted according to the last criterion) do:

\begin{verbatim}
p.print_callers(.5, 'init')
\end{verbatim}

and you would get a list of callers for each of the listed functions.
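
The report calls in this section can also be tried non-interactively.
The following is a minimal sketch, written for Python 3 rather than the
Python 2.5 this chapter describes. The \function{work()},
\function{profile_to_file()} and \function{top_by_cumulative()} helpers are
hypothetical names used only for illustration, and \method{runcall()} is a
method of \class{cProfile.Profile} that this section does not otherwise
cover. The sketch writes a profile to a temporary file, loads it with
\class{Stats}, and captures the report in a string instead of printing it
directly.

```python
import cProfile
import io
import os
import pstats
import tempfile


def work(n=20000):
    """Hypothetical workload: sum of squares below n."""
    return sum(i * i for i in range(n))


def profile_to_file(func, path):
    """Run func under cProfile and dump the raw statistics to path."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    profiler.dump_stats(path)


def top_by_cumulative(path, limit=10):
    """Return the report text for the `limit` largest cumulative times."""
    buffer = io.StringIO()
    stats = pstats.Stats(path, stream=buffer)
    stats.strip_dirs().sort_stats('cumulative').print_stats(limit)
    return buffer.getvalue()


fd, prof_path = tempfile.mkstemp(suffix='.prof')
os.close(fd)
try:
    profile_to_file(work, prof_path)
    report = top_by_cumulative(prof_path)
finally:
    os.remove(prof_path)

print(report)
```

Sending the report to a string through the \code{stream} argument makes
it possible to search or store the text, which is awkward when the report
goes straight to standard output.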

If you want more functionality, you're going to have to read the
manual, or guess what the following functions do:

\begin{verbatim}
p.print_callees()
p.add('fooprof')
\end{verbatim}

Invoked as a script, the \module{pstats} module is a statistics
browser for reading and examining profile dumps. It has a simple
line-oriented interface (implemented using \refmodule{cmd}) and
interactive help.

\section{What Is Deterministic Profiling?}
\nodename{Deterministic Profiling}

\dfn{Deterministic profiling} is meant to reflect the fact that all
\emph{function call}, \emph{function return}, and \emph{exception} events
are monitored, and precise timings are made for the intervals between
these events (during which time the user's code is executing). In
contrast, \dfn{statistical profiling} (which is not done by this
module) randomly samples the effective instruction pointer, and
deduces where time is being spent. The latter technique traditionally
involves less overhead (as the code does not need to be instrumented),
but provides only relative indications of where time is being spent.

In Python, since there is an interpreter active during execution, the
presence of instrumented code is not required to do deterministic
profiling. Python automatically provides a \dfn{hook} (optional
callback) for each event. In addition, the interpreted nature of
Python tends to add so much overhead to execution that deterministic
profiling tends to add only a small processing overhead in typical
applications. The result is that deterministic profiling is not that
expensive, yet provides extensive run time statistics about the
execution of a Python program.
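
The hook mentioned above is available to Python code as
\function{sys.setprofile()}. As a rough sketch of what ``deterministic''
means in practice (Python 3 syntax; the \function{count_events()} helper
and the \function{fib()} workload are hypothetical and are not part of the
profiler modules), the following installs a callback that merely counts
the events it receives. A real profiler records a timestamp at each of
these events instead of a count.

```python
import sys
from collections import Counter


def fib(n):
    """Deliberately naive recursive workload."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def count_events(func, *args):
    """Call func(*args) with a profile hook installed; return event counts."""
    events = Counter()

    def hook(frame, event, arg):
        # Invoked by the interpreter for call and return events,
        # including 'c_call'/'c_return' for built-in functions.
        events[event] += 1

    sys.setprofile(hook)
    try:
        func(*args)
    finally:
        sys.setprofile(None)   # always remove the hook again
    return events


events = count_events(fib, 5)
print(dict(events))
```

Every Python-level call made by \code{fib(5)} reaches the hook; nothing
is sampled or estimated, which is the property that distinguishes this
approach from statistical profiling.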

Call count statistics can be used to identify bugs in code (surprising
counts), and to identify possible inline-expansion points (high call
counts). Internal time statistics can be used to identify ``hot
loops'' that should be carefully optimized. Cumulative time
statistics should be used to identify high level errors in the
selection of algorithms. Note that the unusual handling of cumulative
times in this profiler allows statistics for recursive implementations
of algorithms to be directly compared to iterative implementations.


\section{Reference Manual -- \module{profile} and \module{cProfile}}

\declaremodule{standard}{profile}
\declaremodule{standard}{cProfile}
\modulesynopsis{Python profiler}



The primary entry point for the profiler is the global function
\function{profile.run()} (resp. \function{cProfile.run()}).
It is typically used to create any profile
information. The reports are formatted and printed using methods of
the class \class{pstats.Stats}. The following is a description of all
of these standard entry points and functions. For a more in-depth
view of some of the code, consider reading the later section on
Profiler Extensions, which includes discussion of how to derive
``better'' profilers from the classes presented, or reading the source
code for these modules.

\begin{funcdesc}{run}{command\optional{, filename}}

This function takes a single argument that can be passed to the
\keyword{exec} statement, and an optional file name. In all cases this
routine attempts to \keyword{exec} its first argument, and gather profiling
statistics from the execution. If no file name is present, then this
function automatically prints a simple profiling report, sorted by the
standard name string (file/line/function-name) that is presented in
each line. The following is a typical output from such a call:

\begin{verbatim}
      2706 function calls (2004 primitive calls) in 4.504 CPU seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     2    0.006    0.003    0.953    0.477 pobject.py:75(save_objects)
  43/3    0.533    0.012    0.749    0.250 pobject.py:99(evaluate)
 ...
\end{verbatim}

The first line indicates that 2706 calls were
monitored. Of those calls, 2004 were \dfn{primitive}. We define
\dfn{primitive} to mean that the call was not induced via recursion.
The next line: \code{Ordered by:\ standard name}, indicates that
the text string in the far right column was used to sort the output.
The column headings include:

\begin{description}

\item[ncalls ]
for the number of calls,

\item[tottime ]
for the total time spent in the given function (and excluding time
made in calls to sub-functions),

\item[percall ]
is the quotient of \code{tottime} divided by \code{ncalls}

\item[cumtime ]
is the total time spent in this and all subfunctions (from invocation
till exit). This figure is accurate \emph{even} for recursive
functions.

\item[percall ]
is the quotient of \code{cumtime} divided by primitive calls

\item[filename:lineno(function) ]
provides the respective data of each function

\end{description}

When there are two numbers in the first column (for example,
\samp{43/3}), then the latter is the number of primitive calls, and
the former is the actual number of calls. Note that when the function
does not recurse, these two values are the same, and only the single
figure is printed.

\end{funcdesc}
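
The two-number \code{ncalls} notation described for \function{run()} can
be checked on a small recursive function. The sketch below is written
for Python 3 and rests on two assumptions that the text above does not
state: that \class{pstats.Stats} accepts a \class{cProfile.Profile}
instance in place of a file name, and that its \code{stats} dictionary
maps \code{(file, line, name)} keys to tuples beginning with the primitive
and total call counts. That dictionary is an implementation detail rather
than a documented interface, and the \function{call_counts()} helper is
hypothetical.

```python
import cProfile
import pstats


def fib(n):
    """Naive recursion, so most calls are recursive ones."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def call_counts(func, *args):
    """Profile func(*args); return (total calls, primitive calls) for func."""
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    for (filename, lineno, name), entry in stats.stats.items():
        if name == func.__name__:
            # Assumed layout: (primitive calls, total calls, ...).
            primitive, total = entry[0], entry[1]
            return total, primitive
    raise LookupError(func.__name__)


total, primitive = call_counts(fib, 5)
print('%d/%d' % (total, primitive))
```

The pair is printed here in the same total/primitive order that
\method{print_stats()} uses in the \code{ncalls} column.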

\begin{funcdesc}{runctx}{command, globals, locals\optional{, filename}}
This function is similar to \function{run()}, with added
arguments to supply the globals and locals dictionaries for the
\var{command} string.
\end{funcdesc}
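
As a hedged illustration of how the two dictionaries passed to
\function{runctx()} are used, the following Python 3 sketch runs a command
string against explicitly supplied namespaces. The \function{work()}
function and the names \code{global_ns} and \code{local_ns} are
hypothetical; \code{total_calls} is an attribute of the \class{Stats}
implementation that is read here only to show that data was collected.

```python
import cProfile
import os
import pstats
import tempfile


def work(n):
    """Hypothetical workload."""
    return sum(i * i for i in range(n))


global_ns = {'work': work}   # names the command may look up
local_ns = {'n': 1000}       # assignments made by the command land here

fd, path = tempfile.mkstemp(suffix='.prof')
os.close(fd)
try:
    # Passing a file name keeps runctx() from printing a report.
    cProfile.runctx('result = work(n)', global_ns, local_ns, path)
    stats = pstats.Stats(path)
finally:
    os.remove(path)

print(local_ns['result'], stats.total_calls)
```

Supplying the namespaces explicitly avoids depending on which module
happens to be running as the main program, which is what
\function{run()} relies on.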

Analysis of the profiler data is done using the \class{Stats} class.

\note{The \class{Stats} class is defined in the \module{pstats} module.}

% now switch modules....
% (This \stmodindex use may be hard to change ;-( )
\stmodindex{pstats}

\begin{classdesc}{Stats}{filename\optional{, stream=sys.stdout\optional{, \moreargs}}}
This class constructor creates an instance of a ``statistics object''
from a \var{filename} (or set of filenames). \class{Stats} objects are
manipulated by methods, in order to print useful reports. You may specify
an alternate output stream by giving the keyword argument, \code{stream}.

The file selected by the above constructor must have been created by the
corresponding version of \module{profile} or \module{cProfile}. To be
specific, there is \emph{no} file compatibility guaranteed with future
versions of this profiler, and there is no compatibility with files produced
by other profilers.
%(such as the old system profiler).

If several files are provided, all the statistics for identical
functions will be coalesced, so that an overall view of several
processes can be considered in a single report. If additional files
need to be combined with data in an existing \class{Stats} object, the
\method{add()} method can be used.

\versionchanged[The \var{stream} parameter was added]{2.5}
\end{classdesc}


\subsection{The \class{Stats} Class \label{profile-stats}}

\class{Stats} objects have the following methods:

\begin{methoddesc}[Stats]{strip_dirs}{}
This method for the \class{Stats} class removes all leading path
information from file names. It is very useful in reducing the size
of the printout to fit within (close to) 80 columns. This method
modifies the object, and the stripped information is lost. After
performing a strip operation, the object is considered to have its
entries in a ``random'' order, as it was just after object
initialization and loading. If \method{strip_dirs()} causes two
function names to be indistinguishable (they are on the same
line of the same filename, and have the same function name), then the
statistics for these two entries are accumulated into a single entry.
\end{methoddesc}


\begin{methoddesc}[Stats]{add}{filename\optional{, \moreargs}}
This method of the \class{Stats} class accumulates additional
profiling information into the current profiling object. Its
arguments should refer to filenames created by the corresponding
version of \function{profile.run()} or \function{cProfile.run()}.
Statistics for identically named
(that is, same file, line, and name) functions are automatically
accumulated into single function statistics.
\end{methoddesc}
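
The two ways of combining profile files described above, passing several
file names to the constructor or calling \method{add()} afterwards, can be
compared with a short Python 3 sketch. The \function{dump_profile()}
helper is hypothetical, \method{runcall()} belongs to
\class{cProfile.Profile} rather than to anything documented in this
section, and \code{total_calls} is an attribute of the \class{Stats}
implementation used here only to make the accumulation visible.

```python
import cProfile
import os
import pstats
import tempfile


def dump_profile(func, *args):
    """Profile func(*args) and return the path of the dump file."""
    fd, path = tempfile.mkstemp(suffix='.prof')
    os.close(fd)
    profiler = cProfile.Profile()
    profiler.runcall(func, *args)
    profiler.dump_stats(path)
    return path


first = dump_profile(sorted, [3, 1, 2])
second = dump_profile(sum, range(100))
try:
    separate = [pstats.Stats(first), pstats.Stats(second)]
    combined = pstats.Stats(first, second)     # several files at once
    added = pstats.Stats(first).add(second)    # same data via add()
finally:
    os.remove(first)
    os.remove(second)

print(combined.total_calls, added.total_calls)
```

Both routes should report the same totals, since each accumulates the
per-function statistics of the second file into those of the first.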

\begin{methoddesc}[Stats]{dump_stats}{filename}
Save the data loaded into the \class{Stats} object to a file named
\var{filename}. The file is created if it does not exist, and is
overwritten if it already exists. This is equivalent to the method of
the same name on the \class{profile.Profile} and
\class{cProfile.Profile} classes.
\versionadded{2.3}
\end{methoddesc}

\begin{methoddesc}[Stats]{sort_stats}{key\optional{, \moreargs}}
This method modifies the \class{Stats} object by sorting it according
to the supplied criteria. The argument is typically a string
identifying the basis of a sort (example: \code{'time'} or
\code{'name'}).

When more than one key is provided, then additional keys are used as
secondary criteria when there is equality in all keys selected
before them. For example, \code{sort_stats('name', 'file')} will sort
all the entries according to their function name, and resolve all ties
(identical function names) by sorting by file name.

Abbreviations can be used for any key names, as long as the
abbreviation is unambiguous. The following are the keys currently
defined:

\begin{tableii}{l|l}{code}{Valid Arg}{Meaning}
\lineii{'calls'}{call count}
\lineii{'cumulative'}{cumulative time}
\lineii{'file'}{file name}
\lineii{'module'}{file name}
\lineii{'pcalls'}{primitive call count}
\lineii{'line'}{line number}
\lineii{'name'}{function name}
\lineii{'nfl'}{name/file/line}
\lineii{'stdname'}{standard name}
\lineii{'time'}{internal time}
\end{tableii}

Note that all sorts on statistics are in descending order (placing
most time consuming items first), whereas name, file, and line number
sorts are in ascending order (alphabetical). The subtle
distinction between \code{'nfl'} and \code{'stdname'} is that the
standard name is a sort of the name as printed, which means that the
embedded line numbers get compared in an odd way. For example, lines
3, 20, and 40 would (if the file names were the same) appear in the
string order 20, 3 and 40. In contrast, \code{'nfl'} does a numeric
compare of the line numbers. In fact, \code{sort_stats('nfl')} is the
same as \code{sort_stats('name', 'file', 'line')}.

%For compatibility with the old profiler,
For backward-compatibility reasons, the numeric arguments
\code{-1}, \code{0}, \code{1}, and \code{2} are permitted. They are
interpreted as \code{'stdname'}, \code{'calls'}, \code{'time'}, and
\code{'cumulative'} respectively. If this old style format (numeric)
is used, only one sort key (the numeric key) will be used, and
additional arguments will be silently ignored.
\end{methoddesc}
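
The difference between \code{'stdname'} and \code{'nfl'} described for
\method{sort_stats()} comes down to comparing line numbers as text versus
as integers. The following is a plain-Python model of that difference,
not a call into \module{pstats}; the \function{std_name()} helper and the
sample entries are hypothetical.

```python
def std_name(filename, lineno, function):
    """Build a name in the printed filename:lineno(function) form."""
    return '%s:%d(%s)' % (filename, lineno, function)


entries = [('pobject.py', 40, 'f'),
           ('pobject.py', 3, 'f'),
           ('pobject.py', 20, 'f')]

# 'stdname'-like ordering: compare the printed strings.
by_string = sorted(entries, key=lambda e: std_name(*e))

# 'nfl'-like ordering: compare name, then file, then numeric line.
by_nfl = sorted(entries, key=lambda e: (e[2], e[0], e[1]))

print([e[1] for e in by_string])
print([e[1] for e in by_nfl])
```

In the string comparison the character \code{2} sorts before \code{3},
so line 20 comes ahead of line 3, which is the odd ordering mentioned
above.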


\begin{methoddesc}[Stats]{reverse_order}{}
This method for the \class{Stats} class reverses the ordering of the basic
list within the object. %This method is provided primarily for
%compatibility with the old profiler.
Note that by default ascending vs descending order is properly selected
based on the sort key of choice.
\end{methoddesc}

\begin{methoddesc}[Stats]{print_stats}{\optional{restriction, \moreargs}}
This method for the \class{Stats} class prints out a report as described
in the \function{profile.run()} definition.

The order of the printing is based on the last \method{sort_stats()}
operation done on the object (subject to caveats in \method{add()} and
\method{strip_dirs()}).

The arguments provided (if any) can be used to limit the list down to
the significant entries. Initially, the list is taken to be the
complete set of profiled functions. Each restriction is either an
integer (to select a count of lines), or a decimal fraction between
0.0 and 1.0 inclusive (to select a percentage of lines), or a regular
expression (to pattern match the standard name that is printed; as of
Python 1.5b1, this uses the Perl-style regular expression syntax
defined by the \refmodule{re} module). If several restrictions are
provided, then they are applied sequentially. For example:

\begin{verbatim}
print_stats(.1, 'foo:')
\end{verbatim}

would first limit the printing to the first 10\% of the list, and then
only print functions that were part of filename \file{.*foo:}. In
contrast, the command:

\begin{verbatim}
print_stats('foo:', .1)
\end{verbatim}

would limit the list to all functions having file names \file{.*foo:},
and then proceed to only print the first 10\% of them.
\end{methoddesc}
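
Because restrictions are applied one after another, their order changes
the result. The following is a simplified model of that behaviour,
written as a sketch and not taken from the \module{pstats} source. The
\function{restrict()} function and the sample names are hypothetical, and
rounding the fraction to the nearest whole entry is an assumption of the
sketch.

```python
import re


def restrict(names, *restrictions):
    """Apply print_stats()-style restrictions from left to right."""
    selected = list(names)
    for item in restrictions:
        if isinstance(item, str):
            # Regular expression matched against the printed name.
            selected = [n for n in selected if re.search(item, n)]
        elif isinstance(item, float):
            # Fraction of the entries that are still selected.
            selected = selected[:int(len(selected) * item + 0.5)]
        else:
            # Integer: a plain count of entries.
            selected = selected[:item]
    return selected


names = ['foo:1(a)', 'bar:2(b)', 'bar:3(c)', 'bar:4(d)',
         'foo:5(e)', 'foo:6(f)', 'bar:7(g)', 'foo:8(h)']

print(restrict(names, .5, 'foo:'))   # halve first, then match
print(restrict(names, 'foo:', .5))   # match first, then halve
```

Halving first discards most of the \code{foo:} entries before the pattern
is ever applied, so the two calls select different lists.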


\begin{methoddesc}[Stats]{print_callers}{\optional{restriction, \moreargs}}
This method for the \class{Stats} class prints a list of all functions
that called each function in the profiled database. The ordering is
identical to that provided by \method{print_stats()}, and the definition
of the restricting argument is also identical. Each caller is reported on
its own line. The format differs slightly depending on the profiler that
produced the stats:

\begin{itemize}
\item With \module{profile}, a number is shown in parentheses after each
caller to show how many times this specific call was made. For
convenience, a second non-parenthesized number repeats the cumulative
time spent in the function at the right.

\item With \module{cProfile}, each caller is preceded by three numbers:
the number of times this specific call was made, and the total and
cumulative times spent in the current function while it was invoked by
this specific caller.
\end{itemize}
\end{methoddesc}

\begin{methoddesc}[Stats]{print_callees}{\optional{restriction, \moreargs}}
This method for the \class{Stats} class prints a list of all functions
that were called by the indicated function. Aside from this reversal
of direction of calls (that is, called vs was called by), the arguments
and ordering are identical to the \method{print_callers()} method.
\end{methoddesc}
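
As a small illustration of \method{print_callers()}, the following Python
3 sketch profiles a function that calls another and captures the caller
listing in a string. The \function{inner()}, \function{outer()} and
\function{callers_report()} names are hypothetical, and passing a
\class{cProfile.Profile} instance to \class{Stats} is an assumption about
current versions of \module{pstats} rather than something stated above.

```python
import cProfile
import io
import pstats


def inner():
    return sum(range(100))


def outer():
    total = 0
    for _ in range(3):
        total += inner()
    return total


def callers_report(func, pattern):
    """Profile func() and return the print_callers() text for pattern."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.strip_dirs().sort_stats('time').print_callers(pattern)
    return buffer.getvalue()


report = callers_report(outer, 'inner')
print(report)
```

The restriction \code{'inner'} limits the listing to functions whose
printed name matches that pattern, and \function{outer()} is then shown as
the caller.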
575 |
|
---|
576 |
|
---|
577 | \section{Limitations \label{profile-limits}}
|
---|
578 |
|
---|
579 | One limitation has to do with accuracy of timing information.
|
---|
580 | There is a fundamental problem with deterministic profilers involving
|
---|
581 | accuracy. The most obvious restriction is that the underlying ``clock''
|
---|
582 | is only ticking at a rate (typically) of about .001 seconds. Hence no
|
---|
583 | measurements will be more accurate than the underlying clock. If
|
---|
584 | enough measurements are taken, then the ``error'' will tend to average
|
---|
585 | out. Unfortunately, removing this first error induces a second source
|
---|
586 | of error.
|
---|
587 |
|
---|
588 | The second problem is that it ``takes a while'' from when an event is
|
---|
589 | dispatched until the profiler's call to get the time actually
|
---|
590 | \emph{gets} the state of the clock. Similarly, there is a certain lag
|
---|
591 | when exiting the profiler event handler from the time that the clock's
|
---|
592 | value was obtained (and then squirreled away), until the user's code
|
---|
593 | is once again executing. As a result, functions that are called many
|
---|
594 | times, or call many functions, will typically accumulate this error.
|
---|
595 | The error that accumulates in this fashion is typically less than the
|
---|
596 | accuracy of the clock (less than one clock tick), but it
|
---|
597 | \emph{can} accumulate and become very significant.
|
---|
598 |
|
---|
599 | The problem is more important with \module{profile} than with the
|
---|
600 | lower-overhead \module{cProfile}. For this reason, \module{profile}
|
---|
601 | provides a means of calibrating itself for a given platform so that
|
---|
602 | this error can be probabilistically (on the average) removed.
|
---|
603 | After the profiler is calibrated, it will be more accurate (in a least
|
---|
604 | squares sense), but it will sometimes produce negative numbers (when
|
---|
605 | call counts are exceptionally low, and the gods of probability work
|
---|
606 | against you :-).) Do \emph{not} be alarmed by negative numbers in
|
---|
607 | the profile. They should \emph{only} appear if you have calibrated
|
---|
608 | your profiler, and the results are actually better than without
|
---|
609 | calibration.
|
---|
610 |
|
---|
611 |
|
---|
612 | \section{Calibration \label{profile-calibration}}
|
---|
613 |
|
---|
614 | The profiler of the \module{profile} module subtracts a constant from each
|
---|
615 | event handling time to compensate for the overhead of calling the time
|
---|
616 | function, and socking away the results. By default, the constant is 0.
|
---|
617 | The following procedure can
|
---|
618 | be used to obtain a better constant for a given platform (see discussion
|
---|
619 | in section~\ref{profile-limits} above).
|
---|
620 |
|
---|
621 | \begin{verbatim}
|
---|
622 | import profile
|
---|
623 | pr = profile.Profile()
|
---|
624 | for i in range(5):
|
---|
625 |     print(pr.calibrate(10000))
|
---|
626 | \end{verbatim}
|
---|
627 |
|
---|
628 | The method executes the number of Python calls given by the argument,
|
---|
629 | directly and again under the profiler, measuring the time for both.
|
---|
630 | It then computes the hidden overhead per profiler event, and returns
|
---|
631 | that as a float. For example, on an 800 MHz Pentium running
|
---|
632 | Windows 2000, and using Python's \function{time.clock()} as the timer,
|
---|
633 | the magical number is about 12.5e-6.
|
---|
634 |
|
---|
635 | The object of this exercise is to get a fairly consistent result.
|
---|
636 | If your computer is \emph{very} fast, or your timer function has poor
|
---|
637 | resolution, you might have to pass 100000, or even 1000000, to get
|
---|
638 | consistent results.
|
---|
639 |
|
---|
640 | When you have a consistent answer,
|
---|
641 | there are three ways you can use it:\footnote{Prior to Python 2.2, it
|
---|
642 | was necessary to edit the profiler source code to embed the bias as
|
---|
643 | a literal number. You still can, but that method is no longer
|
---|
644 | described, because it is no longer needed.}
|
---|
645 |
|
---|
646 | \begin{verbatim}
|
---|
647 | import profile
|
---|
648 |
|
---|
649 | # 1. Apply computed bias to all Profile instances created hereafter.
|
---|
650 | profile.Profile.bias = your_computed_bias
|
---|
651 |
|
---|
652 | # 2. Apply computed bias to a specific Profile instance.
|
---|
653 | pr = profile.Profile()
|
---|
654 | pr.bias = your_computed_bias
|
---|
655 |
|
---|
656 | # 3. Specify computed bias in instance constructor.
|
---|
657 | pr = profile.Profile(bias=your_computed_bias)
|
---|
658 | \end{verbatim}
|
---|
659 |
|
---|
660 | If you have a choice, you are better off choosing a smaller constant, and
|
---|
661 | then your results will ``less often'' show up as negative in profile
|
---|
662 | statistics.
|
---|
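The calibration steps above can be combined into one small helper. The following is a sketch rather than a recommended recipe: \function{pick_bias()} is a hypothetical name, Python 3 syntax is used, and taking the minimum of the samples is just one way to follow the advice to prefer a smaller constant.

```python
import profile


def pick_bias(runs=5, calls=10000):
    """Calibrate several times; return the smallest result and all samples."""
    pr = profile.Profile()
    samples = [pr.calibrate(calls) for _ in range(runs)]
    return min(samples), samples


bias, samples = pick_bias()
spread = max(samples) - min(samples)
print("samples:", ["%.3g" % s for s in samples])
print("spread: %.3g, chosen bias: %.3g" % (spread, bias))

# Apply the chosen constant to one specific instance only.
pr = profile.Profile(bias=bias)
```

If the spread is large compared with the samples themselves, the results are not yet consistent and a larger \code{calls} value is probably needed.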
663 |
|
---|
664 |
|
---|
665 | \section{Extensions --- Deriving Better Profilers}
|
---|
666 | \nodename{Profiler Extensions}
|
---|
667 |
|
---|
668 | The \class{Profile} classes of both modules, \module{profile} and
|
---|
669 | \module{cProfile}, were written so that
|
---|
670 | derived classes could be developed to extend the profiler. The details
|
---|
671 | are not described here, as doing this successfully requires an expert
|
---|
672 | understanding of how the \class{Profile} class works internally. Study
|
---|
673 | the source code of the module carefully if you want to
|
---|
674 | pursue this.
|
---|
675 |
|
---|
676 | If all you want to do is change how current time is determined (for
|
---|
677 | example, to force use of wall-clock time or elapsed process time),
|
---|
678 | pass the timing function you want to the \class{Profile} class
|
---|
679 | constructor:
|
---|
680 |
|
---|
681 | \begin{verbatim}
|
---|
682 | pr = profile.Profile(your_time_func)
|
---|
683 | \end{verbatim}
|
---|
684 |
|
---|
685 | The resulting profiler will then call \function{your_time_func()}.
|
---|
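For instance, a timer from the \module{time} module can be passed directly. This sketch assumes Python 3, where \function{time.perf_counter()} and \function{time.process_time()} are available; \function{work()} is a hypothetical workload.

```python
import profile
import pstats
import time


def work():
    # Hypothetical workload to profile.
    return sum(i * i for i in range(1000))


# time.perf_counter() returns a single float of elapsed time.
# time.process_time() could be passed instead to count CPU time only.
pr = profile.Profile(time.perf_counter)
pr.runcall(work)

stats = pstats.Stats(pr)
print("calls recorded:", stats.total_calls)
```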
686 |
|
---|
687 | \begin{description}
|
---|
688 | \item[\class{profile.Profile}]
|
---|
689 | \function{your_time_func()} should return a single number, or a list of
|
---|
690 | numbers whose sum is the current time (like what \function{os.times()}
|
---|
691 | returns). If the function returns a single time number, or the list of
|
---|
692 | returned numbers has length 2, then you will get an especially fast
|
---|
693 | version of the dispatch routine.
|
---|
694 |
|
---|
695 | Be warned that you should calibrate the profiler class for the
|
---|
696 | timer function that you choose. For most machines, a timer that
|
---|
697 | returns a lone integer value will provide the best results in terms of
|
---|
698 | low overhead during profiling. (\function{os.times()} is
|
---|
699 | \emph{pretty} bad, as it returns a tuple of floating point values). If
|
---|
700 | you want to substitute a better timer in the cleanest fashion,
|
---|
701 | derive a class and hardwire a replacement dispatch method that best
|
---|
702 | handles your timer call, along with the appropriate calibration
|
---|
703 | constant.
|
---|
704 |
|
---|
705 | \item[\class{cProfile.Profile}]
|
---|
706 | \function{your_time_func()} should return a single number. If it returns
|
---|
707 | plain integers, you can also invoke the class constructor with a second
|
---|
708 | argument specifying the real duration of one unit of time. For example,
|
---|
709 | if \function{your_integer_time_func()} returns times measured in thousandths
|
---|
710 | of a second, you would construct the \class{Profile} instance as follows:
|
---|
711 |
|
---|
712 | \begin{verbatim}
|
---|
713 | pr = cProfile.Profile(your_integer_time_func, 0.001)
|
---|
714 | \end{verbatim}
|
---|
715 |
|
---|
716 | As the \class{cProfile.Profile} class cannot be calibrated, custom
|
---|
717 | timer functions should be used with care and should be as fast as
|
---|
718 | possible. For the best results with a custom timer, it might be
|
---|
719 | necessary to hard-code it in the C source of the internal
|
---|
720 | \module{_lsprof} module.
|
---|
721 |
|
---|
722 | \end{description}
|
---|
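As a concrete case of an integer timer for \class{cProfile.Profile}, the sketch below uses \function{time.perf_counter_ns()}, which is assumed to be available (Python 3.7 or later) and returns an integer number of nanoseconds, so the unit passed as the second argument is \code{1e-9}. The \function{work()} function is a hypothetical workload.

```python
import cProfile
import pstats
import time


def work():
    # Hypothetical workload to profile.
    return sum(i * i for i in range(1000))


# The timer returns integer nanoseconds, so one unit of time
# lasts 1e-9 seconds; the profiler scales its results accordingly.
pr = cProfile.Profile(time.perf_counter_ns, 1e-9)
pr.runcall(work)

stats = pstats.Stats(pr)
print("total time: %.6f seconds" % stats.total_tt)
```

With the unit supplied, the reported times should come out in seconds, just as they do with the default timer.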