lnotab_notes.txt

Last change on this file was 391, checked in by dmik, 11 years ago
python: Merge vendor 2.7.6 to trunk.

All about co_lnotab, the line number table.

Code objects store a field named co_lnotab. This is an array of unsigned bytes
disguised as a Python string. It is used to map bytecode offsets to source code
line #s for tracebacks and to identify line number boundaries for line tracing.

The array is conceptually a compressed list of
    (bytecode offset increment, line number increment)
pairs. The details are important and delicate, best illustrated by example:

    byte code offset    source code line number
          0                    1
          6                    2
         50                    7
        350                  307
        361                  308

Instead of storing these numbers literally, we compress the list by storing only
the increments from one row to the next. Conceptually, the stored list might
look like:

    0, 1,  6, 1,  44, 5,  300, 300,  11, 1
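The increments can be recomputed from the table with a few lines of Python.
This is only an illustrative sketch (written for Python 3); the name
to_increments is made up here and does not appear in CPython.

```python
# (bytecode offset, source line) rows from the example table above.
rows = [(0, 1), (6, 2), (50, 7), (350, 307), (361, 308)]


def to_increments(rows):
    """Return the flat list of (offset increment, line increment) values."""
    flat = []
    prev_addr, prev_line = 0, 0  # both columns are taken to start at 0
    for addr, line in rows:
        flat.extend((addr - prev_addr, line - prev_line))
        prev_addr, prev_line = addr, line
    return flat


increments = to_increments(rows)
print(increments)  # [0, 1, 6, 1, 44, 5, 300, 300, 11, 1]

# The values that do not fit in an unsigned byte:
too_big = [v for v in increments if v > 255]
print(too_big)  # [300, 300]
```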

The above doesn't really work, but it's a start. Note that an unsigned byte
can't hold negative values, or values larger than 255, and the above example
contains two such values. So we make two tweaks:

 (a) there's a deep assumption that byte code offsets and their corresponding
     line #s both increase monotonically, and
 (b) if at least one column jumps by more than 255 from one row to the next,
     more than one pair is written to the table.

In case (b), there's no way to know from looking at the table later how many
were written. That's the delicate part. A user of co_lnotab desiring to find
the source line number corresponding to a bytecode address A should do
something like this:

    lineno = addr = 0
    for addr_incr, line_incr in co_lnotab:
        addr += addr_incr
        if addr > A:
            return lineno
        lineno += line_incr

(In C, this is implemented by PyCode_Addr2Line().) In order for this to work,
when the addr field increments by more than 255, the line # increment in each
pair generated must be 0 until the remaining addr increment is < 256. So, in
the example above, assemble_lnotab in compile.c should not (as was actually done
until 2.2) expand 300, 300 to
    255, 255, 45, 45,
but to
    255, 0, 45, 255, 0, 45.
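To see why the order of the split matters, here is a small Python 3 sketch of
an encoder that follows the rule just described, together with the lookup loop
from above. The names encode_lnotab and addr2line are hypothetical, the table
is a plain bytes object rather than a real co_lnotab (in 2.7 that field is a
str, so ord() would be needed), and lineno starts at 0 as in the pseudo-code.
It is not a copy of the C code in compile.c.

```python
rows = [(0, 1), (6, 2), (50, 7), (350, 307), (361, 308)]


def encode_lnotab(rows):
    """Encode (offset, line) rows as unsigned bytes, splitting big jumps."""
    table = []
    prev_addr = prev_line = 0
    for addr, line in rows:
        d_addr, d_line = addr - prev_addr, line - prev_line
        if d_addr < 0 or d_line < 0:
            raise ValueError("offsets and lines must not decrease")
        while d_addr > 255:      # spill the address increment first,
            table += [255, 0]    # with a line increment of 0
            d_addr -= 255
        while d_line > 255:      # then spill the line increment
            table += [d_addr, 255]
            d_line -= 255
            d_addr = 0
        table += [d_addr, d_line]
        prev_addr, prev_line = addr, line
    return bytes(table)


def addr2line(lnotab, target):
    """Return the line for bytecode offset `target` (the loop shown above)."""
    lineno = addr = 0
    for addr_incr, line_incr in zip(lnotab[0::2], lnotab[1::2]):
        addr += addr_incr
        if addr > target:
            return lineno
        lineno += line_incr
    return lineno


table = encode_lnotab(rows)
print(list(table))
# [0, 1, 6, 1, 44, 5, 255, 0, 45, 255, 0, 45, 11, 1]

# The expansion used until 2.2, written out by hand for comparison.
old_style = bytes([0, 1, 6, 1, 44, 5, 255, 255, 45, 45, 11, 1])

print(addr2line(table, 320))      # 7
print(addr2line(old_style, 320))  # 262
```

Offset 320 still belongs to line 7, since line 307 only starts at offset 350.
With the old expansion the lookup adds 255 to the line number as soon as it
passes offset 305, which is how it ends up at 262.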

The above is sufficient to reconstruct line numbers for tracebacks, but not for
line tracing. Tracing is handled by PyCode_CheckLineNumber() in codeobject.c
and maybe_call_line_trace() in ceval.c.

* Tracing *

To a first approximation, we want to call the tracing function when the line
number of the current instruction changes. Re-computing the current line for
every instruction is a little slow, though, so each time we compute the line
number we save the bytecode indices where it's valid:

    *instr_lb <= frame->f_lasti < *instr_ub

is true so long as execution does not change lines. That is, *instr_lb holds
the first bytecode index of the current line, and *instr_ub holds the first
bytecode index of the next line. As long as the above expression is true,
maybe_call_line_trace() does not need to call PyCode_CheckLineNumber(). Note
that the same line may appear multiple times in the lnotab, either because the
bytecode jumped more than 255 indices between line number changes or because
the compiler inserted the same line twice. Even in that case, *instr_ub holds
the first index of the next line.
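A rough Python 3 sketch of how such bounds could be derived from the table.
The function name check_line_number, its (line, lower, upper) return shape and
the first_line parameter are assumptions made for illustration; the C routine
differs in detail. The point to notice is that entries with a line increment
of 0 never move either bound.

```python
import sys

# The example table from earlier in these notes, as unsigned bytes.
table = bytes([0, 1, 6, 1, 44, 5, 255, 0, 45, 255, 0, 45, 11, 1])


def check_line_number(lnotab, lasti, first_line=0):
    """Return (line, lower, upper) with lower <= lasti < upper."""
    pairs = list(zip(lnotab[0::2], lnotab[1::2]))
    addr, line = 0, first_line
    lower = 0
    i = 0
    # Walk forward while the next entry still starts at or before lasti.
    while i < len(pairs) and addr + pairs[i][0] <= lasti:
        addr_incr, line_incr = pairs[i]
        addr += addr_incr
        if line_incr:  # only a real line change moves the lower bound
            lower = addr
        line += line_incr
        i += 1
    # The upper bound is the next entry whose line increment is nonzero.
    upper = sys.maxsize  # assumed sentinel when no later line exists
    while i < len(pairs):
        addr_incr, line_incr = pairs[i]
        addr += addr_incr
        i += 1
        if line_incr:
            upper = addr
            break
    return line, lower, upper


print(check_line_number(table, 100))  # (7, 50, 350)
print(check_line_number(table, 320))  # (7, 50, 350)
print(check_line_number(table, 350))  # (307, 350, 361)
```

Offsets 100 and 320 sit on either side of the (255, 0) filler entry, yet both
report the same window for line 7.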

However, we don't always want to call the line trace function when the above
test fails.

Consider this code:

1: def f(a):
2:     while a:
3:         print 1,
4:         break
5:     else:
6:         print 2,

which compiles to this:

  2           0 SETUP_LOOP              19 (to 22)
        >>    3 LOAD_FAST                0 (a)
              6 POP_JUMP_IF_FALSE       17

  3           9 LOAD_CONST               1 (1)
             12 PRINT_ITEM

  4          13 BREAK_LOOP
             14 JUMP_ABSOLUTE            3
        >>   17 POP_BLOCK

  6          18 LOAD_CONST               2 (2)
             21 PRINT_ITEM
        >>   22 LOAD_CONST               0 (None)
             25 RETURN_VALUE

If 'a' is false, execution will jump to the POP_BLOCK instruction at offset 17
and the co_lnotab will claim that execution has moved to line 4, which is wrong.
In this case, we could instead associate the POP_BLOCK with line 5, but that
would break jumps around loops without else clauses.
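The mismatch is easy to reproduce by hand. The sketch below (Python 3) uses
the line starts read off the disassembly above rather than a real co_lnotab;
LINE_STARTS and line_for_offset are names invented for this illustration.

```python
# (bytecode offset, line) pairs read off the left column of the listing.
LINE_STARTS = [(0, 2), (9, 3), (13, 4), (18, 6)]


def line_for_offset(offset):
    """Return the line whose range of offsets contains `offset`."""
    current = None
    for start, line in LINE_STARTS:
        if start > offset:
            break
        current = line
    return current


print(line_for_offset(17))  # 4
print(line_for_offset(18))  # 6
```

Offset 17 (the POP_BLOCK) falls in the range that started at offset 13, so a
plain table lookup reports line 4 for it.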

We fix this by only calling the line trace function for a forward jump if the
co_lnotab indicates we have jumped to the start of a line, i.e. if the current
instruction offset matches the offset given for the start of a line by the
co_lnotab. For backward jumps, however, we always call the line trace function,
which lets a debugger stop on every evaluation of a loop guard (which usually
won't be the first opcode in a line).
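The rule can be tried out on the example function with a short simulation.
This is a sketch under stated assumptions, not the code in ceval.c: the
helper names are invented, the offsets come from the listing above, and the
executed paths were written out by hand from that listing.

```python
# Offsets that start a line, and the line each one starts.
LINE_STARTS = {0: 2, 9: 3, 13: 4, 18: 6}


def line_of(offset):
    """Line covering `offset`: the nearest line start at or before it."""
    return LINE_STARTS[max(s for s in LINE_STARTS if s <= offset)]


def line_events(path):
    """Return the lines reported for a sequence of executed offsets."""
    events = []
    prev = -1  # no instruction executed yet
    for lasti in path:
        at_line_start = lasti in LINE_STARTS
        backward_jump = lasti < prev
        if at_line_start or backward_jump:
            events.append(line_of(lasti))
        prev = lasti
    return events


# 'a' is false: POP_JUMP_IF_FALSE at 6 jumps forward to POP_BLOCK at 17.
print(line_events([0, 3, 6, 17, 18, 21, 22, 25]))  # [2, 6]

# 'a' is true: BREAK_LOOP at 13 jumps forward to offset 22.
print(line_events([0, 3, 6, 9, 12, 13, 22, 25]))   # [2, 3, 4]

# Synthetic path (unreachable in f) showing a backward jump from 14 to 3.
print(line_events([13, 14, 3]))                    # [4, 2]
```

In the first run no event is reported for offset 17, so the bogus "line 4"
never reaches the trace function. In the last run the jump back to offset 3
produces an event for line 2 even though 3 is not the start of a line.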

Why do we set f_lineno when tracing, and only just before calling the trace
function? Well, consider the code above when 'a' is true. If stepping through
this with 'n' in pdb, you would stop at line 1 with a "call" type event, then
line events on lines 2, 3, and 4, then a "return" type event -- but because the
code for the return actually falls in the range of the "line 6" opcodes, you
would be shown line 6 during this event. This is a change from the behaviour in
2.2 and before, and I've found it confusing in practice. By setting and using
f_lineno when tracing, one can report a line number different from that
suggested by f_lasti on this one occasion where it's desirable.

source: python/trunk/Objects/lnotab_notes.txt
