source: trunk/gcc/boehm-gc/doc/tree.html@ 2447

Last change on this file since 2447 was 2, checked in by bird, 23 years ago

Initial revision

  • Property cvs2svn:cvs-rev set to 1.1
  • Property svn:eol-style set to native
  • Property svn:executable set to *
File size: 10.5 KB
Line 
1<HTML>
2<HEAD>
3 <TITLE> Two-Level Tree Structure for Fast Pointer Lookup</TITLE>
4 <AUTHOR> Hans-J. Boehm, Silicon Graphics</author>
5</HEAD>
6<BODY>
7<H1>Two-Level Tree Structure for Fast Pointer Lookup</h1>
8<P>
9The conservative garbage collector described
10<A HREF="gc.html">here</a> uses a 2-level tree
11data structure to aid in fast pointer identification.
12This data structure is described in a bit more detail here, since
13<OL>
14<LI> Variations of the data structure are more generally useful.
15<LI> It appears to be hard to understand by reading the code.
16<LI> Some other collectors appear to use inferior data structures to
17solve the same problem.
18<LI> It is central to fast collector operation.
19</ol>
20A candidate pointer is divided into three sections, the <I>high</i>,
21<I>middle</i>, and <I>low</i> bits. The exact division between these
22three groups of bits is dependent on the detailed collector configuration.
23<P>
24The high and middle bits are used to look up an entry in the table described
25here. The resulting table entry consists of either a block descriptor
26(<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>)
27identifying the layout of objects in the block, or an indication that this
28address range corresponds to the middle of a large block, together with a
29hint for locating the actual block descriptor. Such a hint consist
30of a displacement that can be subtracted from the middle bits of the candidate
31pointer without leaving the object.
32<P>
33In either case, the block descriptor (<TT>struct hblkhdr</tt>)
34refers to a table of object starting addresses (the <TT>hb_map</tt> field).
35The starting address table is indexed by the low bits if the candidate pointer.
36The resulting entry contains a displacement to the beginning of the object,
37or an indication that this cannot be a valid object pointer.
38(If all interior pointer are recognized, pointers into large objects
39are handled specially, as appropriate.)
40
41<H2>The Tree</h2>
42<P>
43The rest of this discussion focuses on the two level data structure
44used to map the high and middle bits to the block descriptor.
45<P>
46The high bits are used as an index into the <TT>GC_top_index</tt> (really
47<TT>GC_arrays._top_index</tt>) array. Each entry points to a
48<TT>bottom_index</tt> data structure. This structure in turn consists
49mostly of an array <TT>index</tt> indexed by the middle bits of
50the candidate pointer. The <TT>index</tt> array contains the actual
51<TT>hdr</tt> pointers.
52<P>
53Thus a pointer lookup consists primarily of a handful of memory references,
54and can be quite fast:
55<OL>
56<LI> The appropriate <TT>bottom_index</tt> pointer is looked up in
57<TT>GC_top_index</tt>, based on the high bits of the candidate pointer.
58<LI> The appropriate <TT>hdr</tt> pointer is looked up in the
59<TT>bottom_index</tt> structure, based on the middle bits.
60<LI> The block layout map pointer is retrieved from the <TT>hdr</tt>
61structure. (This memory reference is necessary since we try to share
62block layout maps.)
63<LI> The displacement to the beginning of the object is retrieved from the
64above map.
65</ol>
66<P>
67In order to conserve space, not all <TT>GC_top_index</tt> entries in fact
68point to distinct <TT>bottom_index</tt> structures. If no address with
69the corresponding high bits is part of the heap, then the entry points
70to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting
71only of NULL <TT>hdr</tt> pointers.
72<P>
73<TT>Bottom_index</tt> structures contain slightly more information than
74just <TT>hdr</tt> pointers. The <TT>asc_link</tt> field is used to link
75all <TT>bottom_index</tt> structures in ascending order for fast traversal.
76This list is pointed to be <TT>GC_all_bottom_indices</tt>.
77It is maintained with the aid of <TT>key</tt> field that contains the
78high bits corresponding to the <TT>bottom_index</tt>.
79
80<H2>64 bit addresses</h2>
81<P>
82In the case of 64 bit addresses, this picture is complicated slightly
83by the fact that one of the index structures would have to be huge to
84cover the entire address space with a two level tree. We deal with this
85by turning <TT>GC_top_index</tt> into a chained hash table, instead of
86a simple array. This adds a <TT>hash_link</tt> field to the
87<TT>bottom_index</tt> structure.
88<P>
89The "hash function" consists of dropping the high bits. This is cheap to
90compute, and guarantees that there will be no collisions if the heap
91is contiguous and not excessively large.
92
93<H2>A picture</h2>
94<P>
95The following is an ASCII diagram of the data structure.
96This was contributed by Dave Barrett several years ago.
97<PRE>
98
99 Data Structure used by GC_base in gc3.7:
100 21-Apr-94
101
102
103
104
105 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13]
106 +------------------+----------------+------------------+------------------+
107 p:| | TL_HASH(hi) | | HBLKDISPL(p) |
108 +------------------+----------------+------------------+------------------+
109 \-----------------------HBLKPTR(p)-------------------/
110 \------------hi-------------------/
111 \______ ________/ \________ _______/ \________ _______/
112 V V V
113 | | |
114 GC_top_index[] | | |
115 --- +--------------+ | | |
116 ^ | | | | |
117 | | | | | |
118 TOP +--------------+<--+ | |
119 _SZ +-<| [] | * | |
120(items)| +--------------+ if 0 < bi< HBLKSIZE | |
121 | | | | then large object | |
122 | | | | starts at the bi'th | |
123 v | | | HBLK before p. | i |
124 --- | +--------------+ | (word- |
125 v | aligned) |
126 bi= |GET_BI(p){->hash_link}->key==hi | |
127 v | |
128 | (bottom_index) \ scratch_alloc'd | |
129 | ( struct bi ) / by get_index() | |
130 --- +->+--------------+ | |
131 ^ | | | |
132 ^ | | | |
133 BOTTOM | | ha=GET_HDR_ADDR(p) | |
134_SZ(items)+--------------+<----------------------+ +-------+
135 | +--<| index[] | |
136 | | +--------------+ GC_obj_map: v
137 | | | | from / +-+-+-----+-+-+-+-+ ---
138 v | | | GC_add < 0| | | | | | | | ^
139 --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ |
140 | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ
141 | +--------------+ +-->| | | j | | | | | +1
142 | | key | | +-+-+-----+-+-+-+-+ |
143 | +--------------+ | +-+-+-----+-+-+-+-+ |
144 | | hash_link | | | | | | | | | | v
145 | +--------------+ | +-+-+-----+-+-+-+-+ ---
146 | | |<--MAX_OFFSET--->|
147 | | (bytes)
148HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->|
149 | \ from | =HBLKSIZE/WORDSZ
150 | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha)
151 +-->+----------------------+ | (8/16 bits each)
152GET_HDR(p)| word hb_sz (words) | |
153 +----------------------+ |
154 | struct hblk *hb_next | |
155 +----------------------+ |
156 |mark_proc hb_mark_proc| |
157 +----------------------+ |
158 | char * hb_map |>-------------+
159 +----------------------+
160 | ushort hb_obj_kind |
161 +----------------------+
162 | hb_last_reclaimed |
163 --- +----------------------+
164 ^ | |
165 MARK_BITS| hb_marks[] | *if hdr is free, hb_sz + DISCARD_WORDS
166_SZ(words)| | is the size of a heap chunk (struct hblk)
167 v | | of at least MININCR*HBLKSIZE bytes (below),
168 --- +----------------------+ otherwise, size of each object in chunk.
169
170Dynamic data structures above are interleaved throughout the heap in blocks of
171size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be
172freed; free lists are used (e.g. alloc_hdr). HBLK's below are collected.
173
174 (struct hblk)
175 --- +----------------------+ < HBLKSIZE --- --- DISCARD_
176 ^ |garbage[DISCARD_WORDS]| aligned ^ ^ HDR_BYTES WORDS
177 | | | | v (bytes) (words)
178 | +-----hb_body----------+ < WORDSZ | --- ---
179 | | | aligned | ^ ^
180 | | Object 0 | | hb_sz |
181 | | | i |(word- (words)|
182 | | | (bytes)|aligned) v |
183 | + - - - - - - - - - - -+ --- | --- |
184 | | | ^ | ^ |
185 n * | | j (words) | hb_sz BODY_SZ
186 HBLKSIZE | Object 1 | v v | (words)
187 (bytes) | |--------------- v MAX_OFFSET
188 | + - - - - - - - - - - -+ --- (bytes)
189 | | | !All_INTERIOR_PTRS ^ |
190 | | | sets j only for hb_sz |
191 | | Object N | valid object offsets. | |
192 v | | All objects WORDSZ v v
193 --- +----------------------+ aligned. --- ---
194
195DISCARD_WORDS is normally zero. Indeed the collector has not been tested
196with another value in ages.
197</pre>
198</body>
Note: See TracBrowser for help on using the repository browser.