                README for GPROF

This is the GNU profiler. It is distributed with other "binary
utilities" which should be in ../binutils. See ../binutils/README for
more general notes, including where to send bug reports.

This file documents the changes and new features available with this
version of GNU gprof.

* New Features

 o Long options

 o Supports generalized file format, without breaking backward compatibility:
   new file format supports basic-block execution counts and non-realtime
   histograms (see below)

 o Supports profiling at the line level: flat profiles, call-graph profiles,
   and execution-counts can all be displayed at a level that identifies
   individual lines rather than just functions

 o Test-coverage support (similar to Sun tcov program): source files
   can be annotated with the number of times a function was invoked
   or with the number of times each basic-block in a function was
   executed

 o Generalized histograms: not just execution-time, but arbitrary
   histograms are supported (for example, performance counter based
   profiles)

 o Powerful mechanism to select data to be included/excluded from
   analysis and/or output

 o Support for DEC OSF/1 v3.0

 o Full cross-platform profiling support: gprof uses BFD to support
   arbitrary, non-native object file formats and non-native byte-orders
   (this feature has not been tested yet)

 o In the call-graph function index, static function names are now
   printed together with the filename in which the function was defined
   (requires bfd_find_nearest_line() support and symbolic debugging
   information to be present in the executable file)

 o Major overhaul of source code (compiles cleanly with -Wall, etc.)

* Supported Platforms

The current version is known to work on:

 o DEC OSF/1 v3.0
        All features supported.

 o SunOS 4.1.x
        All features supported.

 o Solaris 2.3
        Line-level profiling unsupported because bfd_find_nearest_line()
        is not fully implemented for Elf binaries.

 o HP-UX 9.01
        Line-level profiling unsupported because bfd_find_nearest_line()
        is not fully implemented for SOM binaries.

* Detailed Description

** User Interface Changes

The command-line interface is backwards compatible with earlier
versions of GNU gprof and Berkeley gprof. The only exception is
the option to delete arcs from the call graph. The old syntax
was:

        -k fromname toname

while the new syntax is:

        -k fromname/toname

This change was necessary to be compatible with long-option parsing.
Also, "fromname" and "toname" can now be arbitrary symspecs rather
than just function names (see below for an explanation of symspecs).
For example, option "-k gprof.c/" suppresses all arcs due to calls out
of file "gprof.c".

*** Sym Specs

It is often necessary to apply gprof only to specific parts of a
program. GNU gprof has a simple but powerful mechanism to achieve
this. So-called symspecs provide the foundation for this
mechanism. A symspec selects the parts of a profiled program to which
an operation should be applied. The syntax of a symspec is
simple:

          filename_containing_a_dot
        | funcname_not_containing_a_dot
        | linenumber
        | ( [ any_filename ] `:' ( any_funcname | linenumber ) )

Here are some examples:

        main.c          Selects everything in file "main.c"---the
                        dot in the string tells gprof to interpret
                        the string as a filename, rather than as
                        a function name. To select a file whose
                        name does not contain a dot, a trailing colon
                        should be specified. For example, "odd:" is
                        interpreted as the file named "odd".

        main            Selects all functions named "main". Notice
                        that there may be multiple instances of the
                        same function name because some of the
                        definitions may be local (i.e., static).
                        Unless a function name is unique in a program,
                        you must use the colon notation explained
                        below to specify a function from a specific
                        source file. Sometimes, function names contain
                        dots. In such cases, it is necessary to
                        add a leading colon to the name. For example,
                        ":.mul" selects function ".mul".

        main.c:main     Selects function "main" in file "main.c".

        main.c:134      Selects line 134 in file "main.c".
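
The symspec grammar above can be illustrated with a short sketch in
Python. This is not gprof's own parser (gprof is written in C); the
names SymSpec and parse_symspec are hypothetical, and splitting on the
last colon is an assumption made for this illustration.

```python
from typing import NamedTuple, Optional


class SymSpec(NamedTuple):
    """Illustrative parse result; not gprof's actual sym_id type."""
    filename: Optional[str]
    funcname: Optional[str]
    line: Optional[int]


def parse_symspec(spec: str) -> SymSpec:
    """Split a symspec string following the grammar in this README."""
    if ":" in spec:
        # Colon form: [ any_filename ] ':' ( any_funcname | linenumber )
        # Assumption: the last colon separates the two parts.
        filename, _, rest = spec.rpartition(":")
        if rest.isdecimal():
            return SymSpec(filename or None, None, int(rest))
        return SymSpec(filename or None, rest or None, None)
    if spec.isdecimal():
        # A bare number is a line number.
        return SymSpec(None, None, int(spec))
    if "." in spec:
        # A dot marks the string as a filename.
        return SymSpec(spec, None, None)
    # Anything else is a function name.
    return SymSpec(None, spec, None)
```

With this sketch, parse_symspec("odd:") yields a filename-only result
and parse_symspec(":.mul") yields a function-only result, matching the
trailing-colon and leading-colon rules described in the examples.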

IMPLEMENTATION NOTE: The source code uses the type sym_id for symspecs.
At some point, this probably ought to be changed to "sym_spec" to make
reading the code easier.

*** Long options

GNU gprof now supports long options. The following is a list of all
supported options. Options that are listed without description
operate in the same manner as the corresponding option in older
versions of gprof.

Short Form:     Long Form:
-----------     ----------
-l              --line
                        Request profiling at the line-level rather
                        than just at the function level. Source
                        lines are identified by symbols of the form:

                                func (file:line)

                        where "func" is the function name, "file" is the
                        file name and "line" is the line-number that
                        corresponds to the line.

                        To work properly, the binary must contain symbolic
                        debugging information. This means that the source
                        files have to be compiled with option "-g" specified.
                        Functions for which there is no symbolic debugging
                        information available are treated as if "--line"
                        had not been specified. However, the line number
                        printed with such symbols is usually incorrect
                        and should be ignored.

-a              --no-static
-A[symspec]     --annotated-source[=symspec]
                        Request output in the form of annotated source
                        files. If "symspec" is specified, print output only
                        for symbols selected by "symspec". If the option
                        is specified multiple times, annotated output is
                        generated for the union of all symspecs.

                        Examples:

                          -A            Prints annotated source for all
                                        source files.
                          -Agprof.c     Prints annotated source for file
                                        gprof.c.
                          -Afoobar      Prints annotated source for files
                                        containing a function named "foobar".
                                        The entire file will be printed, but
                                        only the function itself will be
                                        annotated with profile data.

-J[symspec]     --no-annotated-source[=symspec]
                        Suppress annotated source output. If specified
                        without argument, annotated output is suppressed
                        completely. With an argument, annotated output
                        is suppressed only for the symbols selected by
                        "symspec". If the option is specified multiple
                        times, annotated output is suppressed for the
                        union of all symspecs. This option has lower
                        precedence than --annotated-source.

-p[symspec]     --flat-profile[=symspec]
                        Request output in the form of a flat profile
                        (unless any other output-style option is specified,
                        this option is turned on by default). If
                        "symspec" is specified, include only symbols
                        selected by "symspec" in flat profile. If the
                        option is specified multiple times, the flat
                        profile includes symbols selected by the union
                        of all symspecs.

-P[symspec]     --no-flat-profile[=symspec]
                        Suppress output in the flat profile. If given
                        without an argument, the flat profile is suppressed
                        completely. If "symspec" is specified, suppress
                        the selected symbols in the flat profile. If the
                        option is specified multiple times, the union of
                        the selected symbols is suppressed. This option
                        has lower precedence than --flat-profile.

-q[symspec]     --graph[=symspec]
                        Request output in the form of a call-graph
                        (unless any other output-style option is specified,
                        this option is turned on by default). If "symspec"
                        is specified, include only symbols selected by
                        "symspec" in the call-graph. If the option is
                        specified multiple times, the call-graph includes
                        symbols selected by the union of all symspecs.

-Q[symspec]     --no-graph[=symspec]
                        Suppress output in the call-graph. If given without
                        an argument, the call-graph is suppressed completely.
                        With a "symspec", suppress the selected symbols
                        from the call-graph. If the option is specified
                        multiple times, the union of the selected symbols
                        is suppressed. This option has lower precedence
                        than --graph.

-C[symspec]     --exec-counts[=symspec]
                        Request output in the form of execution counts.
                        If "symspec" is present, include only symbols
                        selected by "symspec" in the execution count
                        listing. If the option is specified multiple
                        times, the execution count listing includes
                        symbols selected by the union of all symspecs.

-Z[symspec]     --no-exec-counts[=symspec]
                        Suppress output in the execution count listing.
                        If given without an argument, the listing is
                        suppressed completely. With a "symspec", suppress
                        the selected symbols from the execution count
                        listing. If the option is specified multiple times,
                        the union of the selected symbols is suppressed.
                        This option has lower precedence than --exec-counts.

-i              --file-info
                        Print information about the profile files that
                        are read. The information consists of the
                        number and types of records present in the
                        profile file. Currently, a profile file can
                        contain any number and any combination of histogram,
                        call-graph, or basic-block count records.

-s              --sum

-x              --all-lines
                        This option affects annotated source output only.
                        By default, only the lines at the beginning of
                        a basic-block are annotated. If this option is
                        specified, every line in a basic-block is annotated
                        by repeating the annotation for the first line.
                        This option is identical to tcov's "-a".

-I dirs         --directory-path=dirs
                        This option affects annotated source output only.
                        Specifies the list of directories to be searched
                        for source files. The argument "dirs" is a
                        colon-separated list of directories. By default,
                        gprof searches for source files relative to the
                        current working directory only.

-z              --display-unused-functions

-m num          --min-count=num
                        This option affects annotated source and execution
                        count output only. Symbols that are executed
                        fewer than "num" times are suppressed. For annotated
                        source output, suppressed symbols are marked
                        by five hash-marks (#####). In an execution count
                        output, suppressed symbols do not appear at all.

-L              --print-path
                        Normally, source filenames are printed with the path
                        component suppressed. With this option, gprof
                        can be forced to print the full pathname of
                        source filenames. The full pathname is determined
                        from symbolic debugging information in the image file
                        and is relative to the directory in which the compiler
                        was invoked.

-y              --separate-files
                        This option affects annotated source output only.
                        Normally, gprof prints annotated source files
                        to standard-output. If this option is specified,
                        annotated source for a file named "path/filename"
                        is generated in the file "filename-ann". That is,
                        annotated output is always generated in
                        gprof's current working directory. Care has to
                        be taken if a program consists of files that have
                        identical filenames, but distinct paths.

-c              --static-call-graph

-t num          --table-length=num
                        This option affects annotated source output only.
                        After annotating a source file, gprof generates
                        an execution count summary consisting of a table
                        of lines with the top execution counts. By
                        default, this table is ten entries long.
                        This option can be used to change the table length
                        or, by specifying an argument value of 0, it can be
                        suppressed completely.

-n symspec      --time=symspec
                        Only symbols selected by "symspec" are considered
                        in total and percentage time computations.
                        However, this option does not affect percentage time
                        computation for the flat profile.
                        If the option is specified multiple times, the union
                        of all selected symbols is used in time computations.

-N symspec      --no-time=symspec
                        Exclude the symbols selected by "symspec" from
                        total and percentage time computations.
                        However, this option does not affect percentage time
                        computation for the flat profile.
                        This option is ignored if any --time options are
                        specified.

-w num          --width=num
                        Sets the output line width. Currently, this option
                        affects the printing of the call-graph function index
                        only.

-e              <no long form---for backwards compatibility only>
-E              <no long form---for backwards compatibility only>
-f              <no long form---for backwards compatibility only>
-F              <no long form---for backwards compatibility only>
-k              <no long form---for backwards compatibility only>
-b              --brief
-dnum           --debug[=num]

-h              --help
                        Prints a usage message.

-O name         --file-format=name
                        Selects the format of the profile data files.
                        Recognized formats are "auto", "bsd", "magic",
                        and "prof". The last one is not yet supported.
                        Format "auto" attempts to detect the file format
                        automatically (this is the default behavior).
                        It attempts to read the profile data files as
                        "magic" files and if this fails, falls back to
                        the "bsd" format. "bsd" forces gprof to read
                        the data files in the BSD format. "magic" forces
                        gprof to read the data files in the "magic" format.

-T              --traditional
-v              --version
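
The include/exclude option pairs in the list above (-A/-J, -p/-P,
-q/-Q, -C/-Z) all state that the suppressing option has lower
precedence than the requesting one. The Python sketch below shows one
plausible reading of that rule; it is an illustration, not gprof's
code. The function name is_selected is hypothetical, and symspecs are
simplified to plain sets of symbol names.

```python
from typing import Set


def is_selected(symbol: str, include: Set[str], exclude: Set[str]) -> bool:
    """Decide whether a symbol appears in one output style.

    include: union of all symbols named by the requesting option.
    exclude: union of all symbols named by the suppressing option.
    """
    if symbol in include:
        # Explicit inclusion wins over any exclusion.
        return True
    if include:
        # An include list was given and this symbol is not on it.
        return False
    # No include list: everything is shown unless explicitly excluded.
    return symbol not in exclude
```

Under this reading, a symbol named by both -p and -P still appears in
the flat profile, because the exclusion is consulted only when the
symbol was not explicitly included.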

** File Format Changes

The old BSD-derived format used for profile data does not contain a
magic cookie that allows one to check whether a data file really is a
gprof file. Furthermore, it does not provide a version number, thus
rendering changes to the file format almost impossible. GNU gprof
uses a new file format that provides these features. For backward
compatibility, GNU gprof continues to support the old BSD-derived
format, but not all features are supported with it. For example,
basic-block execution counts cannot be accommodated by the old file
format.

The new file format is defined in header file gmon_out.h. It
consists of a header containing the magic cookie and a version number,
as well as some spare bytes available for future extensions. All data
in a profile data file is in the native format of the host on which
the profile was collected. GNU gprof adapts automatically to the
byte-order in use.
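
As a rough illustration of how a reader could tell the two formats
apart, the Python sketch below checks for a cookie and then tries both
byte-orders for the version field. The concrete values here (a 4-byte
cookie "gmon", a 4-byte version equal to 1, and 12 spare bytes) are
assumptions of this sketch and should be checked against gmon_out.h.
Guessing the byte-order from the version field is also only an
illustrative technique, not necessarily how gprof itself does it.

```python
import struct

GMON_MAGIC = b"gmon"       # assumed cookie value
GMON_VERSION = 1           # assumed version number
HEADER_SIZE = 4 + 4 + 12   # assumed: cookie, version, spare bytes


def detect_format(data: bytes):
    """Return ("magic", byte_order) or ("bsd", None) for a profile file.

    byte_order is "<" (little-endian) or ">" (big-endian), in the
    notation of Python's struct module.
    """
    if len(data) < HEADER_SIZE or data[:4] != GMON_MAGIC:
        # No cookie: fall back to the old BSD-derived format.
        return ("bsd", None)
    for order in ("<", ">"):
        (version,) = struct.unpack(order + "I", data[4:8])
        if version == GMON_VERSION:
            return ("magic", order)
    raise ValueError("cookie present but version not recognized")
```

The fallback branch mirrors the behavior described for the "auto"
setting of --file-format: try the "magic" format first, then assume
"bsd".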

In the new file format, the header is followed by a sequence of
records. Currently, there are three different record types: histogram
records, call-graph arc records, and basic-block execution count
records. Each file can contain any number of each record type. When
reading a file, GNU gprof will ensure records of the same type are
compatible with each other and compute the union of all records. For
example, for basic-block execution counts, the union is simply the sum
of all execution counts for each basic-block.
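
The union rule for basic-block execution counts can be sketched as
follows in Python. The function name merge_bb_counts and the
representation of a record as a list of (address, count) pairs are
hypothetical choices for this illustration only.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def merge_bb_counts(
    records: Iterable[Iterable[Tuple[int, int]]]
) -> Dict[int, int]:
    """Sum execution counts per basic-block address across records."""
    total: Counter = Counter()
    for record in records:
        for address, count in record:
            # The same basic-block may appear in several records.
            total[address] += count
    return dict(total)
```

For instance, two records that both mention address 0x1000 with counts
3 and 2 contribute a combined count of 5 for that basic-block.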

*** Histogram Records

Histogram records consist of a header that is followed by an array of
bins. The header contains the text-segment range that the histogram
spans, the size of the histogram in bytes (unlike in the old BSD
format, this does not include the size of the header), the rate of the
profiling clock, and the physical dimension that the bin counts
represent after being scaled by the profiling clock rate. The
physical dimension is specified in two parts: a long name of up to 15
characters and a single-character abbreviation. For example, a
histogram representing real-time would specify the long name as
"seconds" and the abbreviation as "s". This feature is useful for
architectures that support performance monitor hardware (which,
fortunately, is becoming increasingly common). For example, under DEC
OSF/1, the "uprofile" command can be used to produce a histogram of,
say, instruction cache misses. In this case, the dimension in the
histogram header could be set to "i-cache misses" and the abbreviation
could be set to "1" (because it is simply a count, not a physical
dimension). Also, the profiling rate would have to be set to 1 in
this case.

Histogram bins are 16-bit numbers and each bin represents an equal
amount of text-space. For example, if the text-segment is one
thousand bytes long and if there are ten bins in the histogram, each
bin represents one hundred bytes.
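
The bin arithmetic above can be written out as a small Python sketch.
The function names bin_index and bin_value and their parameters are
hypothetical. Dividing a bin count by the profiling rate is this
sketch's interpretation of "scaled by the profiling clock rate"; it
gives seconds for a clock rate in ticks per second, and leaves plain
counts unchanged when the rate is 1.

```python
def bin_index(address: int, low_pc: int, high_pc: int, num_bins: int) -> int:
    """Map a text-segment address to the histogram bin that covers it."""
    if not low_pc <= address < high_pc:
        raise ValueError("address outside the histogram's range")
    # Each bin covers (high_pc - low_pc) / num_bins bytes of text-space.
    return (address - low_pc) * num_bins // (high_pc - low_pc)


def bin_value(count: int, rate: int) -> float:
    """Convert a raw bin count into the histogram's physical dimension."""
    return count / rate
```

With a text-segment of one thousand bytes starting at address 0 and
ten bins, addresses 0 through 99 fall into the first bin and address
100 falls into the second.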

*** Call-Graph Records

Call-graph records have a format that is identical to the one used in
the BSD-derived file format. It consists of an arc in the call graph
and a count indicating the number of times the arc was traversed
during program execution. Arcs are specified by a pair of addresses:
the first must be within the caller's function and the second must be
within the callee's function. When performing profiling at the
function level, these addresses can point anywhere within the
respective function. However, when profiling at the line-level, it is
better if the addresses are as close to the call-site/entry-point as
possible. This will ensure that the line-level call-graph is able to
identify exactly which line of source code performed calls to a
function.

*** Basic-Block Execution Count Records

Basic-block execution count records consist of a header followed by a
sequence of address/count pairs. The header simply specifies the
length of the sequence. In an address/count pair, the address
identifies a basic-block and the count specifies the number of times
that basic-block was executed. Any address within the basic-block can
be used.

IMPLEMENTATION NOTE: gcc -a can be used to instrument a program to
record basic-block execution counts. However, the __bb_exit_func()
that is currently present in libgcc2.c does not generate a gmon.out
file in a suitable format. This should be fixed for future releases
of gcc. In the meantime, contact davidm@cs.arizona.edu for a version
of __bb_exit_func() that is appropriate.