| 1 | .\" Copyright (c) 1990, 1993
|
|---|
| 2 | .\" The Regents of the University of California. All rights reserved.
|
|---|
| 3 | .\"
|
|---|
| 4 | .\" Redistribution and use in source and binary forms, with or without
|
|---|
| 5 | .\" modification, are permitted provided that the following conditions
|
|---|
| 6 | .\" are met:
|
|---|
| 7 | .\" 1. Redistributions of source code must retain the above copyright
|
|---|
| 8 | .\" notice, this list of conditions and the following disclaimer.
|
|---|
| 9 | .\" 2. Redistributions in binary form must reproduce the above copyright
|
|---|
| 10 | .\" notice, this list of conditions and the following disclaimer in the
|
|---|
| 11 | .\" documentation and/or other materials provided with the distribution.
|
|---|
| 12 | .\" 3. All advertising materials mentioning features or use of this software
|
|---|
| 13 | .\" must display the following acknowledgement:
|
|---|
| 14 | .\" This product includes software developed by the University of
|
|---|
| 15 | .\" California, Berkeley and its contributors.
|
|---|
| 16 | .\" 4. Neither the name of the University nor the names of its contributors
|
|---|
| 17 | .\" may be used to endorse or promote products derived from this software
|
|---|
| 18 | .\" without specific prior written permission.
|
|---|
| 19 | .\"
|
|---|
| 20 | .\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
|
|---|
| 21 | .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
|
|---|
| 22 | .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
|
|---|
| 23 | .\" ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
|
|---|
| 24 | .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
|
|---|
| 25 | .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
|
|---|
| 26 | .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
|
|---|
| 27 | .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
|
|---|
| 28 | .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
|
|---|
| 29 | .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
|
|---|
| 30 | .\" SUCH DAMAGE.
|
|---|
| 31 | .\"
|
|---|
| 32 | .\" @(#)btree.3 8.4 (Berkeley) 8/18/94
|
|---|
| 33 | .\" $FreeBSD: src/lib/libc/db/man/btree.3,v 1.6 2002/12/19 09:40:21 ru Exp $
|
|---|
| 34 | .\"
|
|---|
| 35 | .Dd August 18, 1994
|
|---|
| 36 | .Dt BTREE 3
|
|---|
| 37 | .Os
|
|---|
| 38 | .Sh NAME
|
|---|
| 39 | .Nm btree
|
|---|
| 40 | .Nd "btree database access method"
|
|---|
| 41 | .Sh SYNOPSIS
|
|---|
| 42 | .In sys/types.h
|
|---|
| 43 | .In db.h
|
|---|
| 44 | .Sh DESCRIPTION
|
|---|
| 45 | The routine
|
|---|
| 46 | .Fn dbopen
|
|---|
| 47 | is the library interface to database files.
|
|---|
| 48 | One of the supported file formats is
|
|---|
| 49 | .Nm
|
|---|
| 50 | files.
|
|---|
| 51 | The general description of the database access methods is in
|
|---|
| 52 | .Xr dbopen 3 ;
|
|---|
| 53 | this manual page describes only the
|
|---|
| 54 | .Nm
|
|---|
| 55 | specific information.
|
|---|
| 56 | .Pp
|
|---|
| 57 | The
|
|---|
| 58 | .Nm
|
|---|
| 59 | data structure is a sorted, balanced tree structure storing
|
|---|
| 60 | associated key/data pairs.
|
|---|
| 61 | .Pp
|
|---|
| 62 | The
|
|---|
| 63 | .Nm
|
|---|
| 64 | access method specific data structure provided to
|
|---|
| 65 | .Fn dbopen
|
|---|
| 66 | is defined in the
|
|---|
| 67 | .Aq Pa db.h
|
|---|
| 68 | include file as follows:
|
|---|
| 69 | .Bd -literal
|
|---|
| 70 | typedef struct {
|
|---|
| 71 | u_long flags;
|
|---|
| 72 | u_int cachesize;
|
|---|
| 73 | int maxkeypage;
|
|---|
| 74 | int minkeypage;
|
|---|
| 75 | u_int psize;
|
|---|
| 76 | int (*compare)(const DBT *key1, const DBT *key2);
|
|---|
| 77 | size_t (*prefix)(const DBT *key1, const DBT *key2);
|
|---|
| 78 | int lorder;
|
|---|
| 79 | } BTREEINFO;
|
|---|
| 80 | .Ed
|
|---|
| 81 | .Pp
|
|---|
| 82 | The elements of this structure are as follows:
|
|---|
| 83 | .Bl -tag -width indent
|
|---|
| 84 | .It Va flags
|
|---|
| 85 | The flag value is specified by
|
|---|
| 86 | .Em or Ns 'ing
|
|---|
| 87 | any of the following values:
|
|---|
| 88 | .Bl -tag -width indent
|
|---|
| 89 | .It Dv R_DUP
|
|---|
| 90 | Permit duplicate keys in the tree, i.e. permit insertion if the key to be
|
|---|
| 91 | inserted already exists in the tree.
|
|---|
| 92 | The default behavior, as described in
|
|---|
| 93 | .Xr dbopen 3 ,
|
|---|
| 94 | is to overwrite a matching key when inserting a new key or to fail if
|
|---|
| 95 | the
|
|---|
| 96 | .Dv R_NOOVERWRITE
|
|---|
| 97 | flag is specified.
|
|---|
| 98 | The
|
|---|
| 99 | .Dv R_DUP
|
|---|
| 100 | flag is overridden by the
|
|---|
| 101 | .Dv R_NOOVERWRITE
|
|---|
| 102 | flag, and if the
|
|---|
| 103 | .Dv R_NOOVERWRITE
|
|---|
| 104 | flag is specified, attempts to insert duplicate keys into
|
|---|
| 105 | the tree will fail.
|
|---|
| 106 | .Pp
|
|---|
| 107 | If the database contains duplicate keys, the order of retrieval of
|
|---|
| 108 | key/data pairs is undefined if the
|
|---|
| 109 | .Va get
|
|---|
| 110 | routine is used; however,
|
|---|
| 111 | .Va seq
|
|---|
| 112 | routine calls with the
|
|---|
| 113 | .Dv R_CURSOR
|
|---|
| 114 | flag set will always return the logical
|
|---|
| 115 | .Dq first
|
|---|
| 116 | of any group of duplicate keys.
|
|---|
| 117 | .El
|
|---|
| 118 | .It Va cachesize
|
|---|
| 119 | A suggested maximum size (in bytes) of the memory cache.
|
|---|
| 120 | This value is
|
|---|
| 121 | .Em only
|
|---|
| 122 | advisory, and the access method will allocate more memory rather than fail.
|
|---|
| 123 | Since every search examines the root page of the tree, caching the most
|
|---|
| 124 | recently used pages substantially improves access time.
|
|---|
| 125 | In addition, physical writes are delayed as long as possible, so a moderate
|
|---|
| 126 | cache can reduce the number of I/O operations significantly.
|
|---|
| 127 | Obviously, using a cache increases (but only increases) the likelihood of
|
|---|
| 128 | corruption or lost data if the system crashes while a tree is being modified.
|
|---|
| 129 | If
|
|---|
| 130 | .Va cachesize
|
|---|
| 131 | is 0 (no size is specified), a default cache is used.
|
|---|
| 132 | .It Va maxkeypage
|
|---|
| 133 | The maximum number of keys which will be stored on any single page.
|
|---|
| 134 | Not currently implemented.
|
|---|
| 135 | .\" The maximum number of keys which will be stored on any single page.
|
|---|
| 136 | .\" Because of the way the
|
|---|
| 137 | .\" .Nm
|
|---|
| 138 | .\" data structure works,
|
|---|
| 139 | .\" .Va maxkeypage
|
|---|
| 140 | .\" must always be greater than or equal to 2.
|
|---|
| 141 | .\" If
|
|---|
| 142 | .\" .Va maxkeypage
|
|---|
| 143 | .\" is 0 (no maximum number of keys is specified) the page fill factor is
|
|---|
| 144 | .\" made as large as possible (which is almost invariably what is wanted).
|
|---|
| 145 | .It Va minkeypage
|
|---|
| 146 | The minimum number of keys which will be stored on any single page.
|
|---|
| 147 | This value is used to determine which keys will be stored on overflow
|
|---|
| 148 | pages, i.e. if a key or data item is longer than the pagesize divided
|
|---|
| 149 | by the minkeypage value, it will be stored on overflow pages instead
|
|---|
| 150 | of in the page itself.
|
|---|
| 151 | If
|
|---|
| 152 | .Va minkeypage
|
|---|
| 153 | is 0 (no minimum number of keys is specified), a value of 2 is used.
|
|---|
| 154 | .It Va psize
|
|---|
| 155 | Page size is the size (in bytes) of the pages used for nodes in the tree.
|
|---|
| 156 | The minimum page size is 512 bytes and the maximum page size is 64K.
|
|---|
| 157 | If
|
|---|
| 158 | .Va psize
|
|---|
| 159 | is 0 (no page size is specified), a page size is chosen based on the
|
|---|
| 160 | underlying file system I/O block size.
|
|---|
| 161 | .It Va compare
|
|---|
| 162 | Compare is the key comparison function.
|
|---|
| 163 | It must return an integer less than, equal to, or greater than zero if the
|
|---|
| 164 | first key argument is considered to be respectively less than, equal to,
|
|---|
| 165 | or greater than the second key argument.
|
|---|
| 166 | The same comparison function must be used on a given tree every time it
|
|---|
| 167 | is opened.
|
|---|
| 168 | If
|
|---|
| 169 | .Va compare
|
|---|
| 170 | is
|
|---|
| 171 | .Dv NULL
|
|---|
| 172 | (no comparison function is specified), the keys are compared
|
|---|
| 173 | lexically, with shorter keys considered less than longer keys.
|
|---|
| 174 | .It Va prefix
|
|---|
| 175 | The
|
|---|
| 176 | .Va prefix
|
|---|
| 177 | element
|
|---|
| 178 | is the prefix comparison function.
|
|---|
| 179 | If specified, this routine must return the number of bytes of the second key
|
|---|
| 180 | argument which are necessary to determine that it is greater than the first
|
|---|
| 181 | key argument.
|
|---|
| 182 | If the keys are equal, the key length should be returned.
|
|---|
| 183 | The usefulness of this routine is very data dependent, but for some
|
|---|
| 184 | data sets it can produce significantly reduced tree sizes and search times.
|
|---|
| 185 | If
|
|---|
| 186 | .Va prefix
|
|---|
| 187 | is
|
|---|
| 188 | .Dv NULL
|
|---|
| 189 | (no prefix function is specified),
|
|---|
| 190 | .Em and
|
|---|
| 191 | no comparison function is specified, a default lexical comparison routine
|
|---|
| 192 | is used.
|
|---|
| 193 | If
|
|---|
| 194 | .Va prefix
|
|---|
| 195 | is
|
|---|
| 196 | .Dv NULL
|
|---|
| 197 | and a comparison routine is specified, no prefix comparison is
|
|---|
| 198 | done.
|
|---|
| 199 | .It Va lorder
|
|---|
| 200 | The byte order for integers in the stored database metadata.
|
|---|
| 201 | The number should represent the order as an integer; for example,
|
|---|
| 202 | big endian order would be the number 4321.
|
|---|
| 203 | If
|
|---|
| 204 | .Va lorder
|
|---|
| 205 | is 0 (no order is specified), the current host order is used.
|
|---|
| 206 | .El
|
|---|
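The contracts of the compare and prefix callbacks described above can be illustrated with a short sketch. It is not the library's implementation: the DBT type below is a local stand-in with the same two fields (data and size) so that the example compiles without db.h, and the names sketch_compare and sketch_prefix are hypothetical. The comparison follows one reading of the default behavior (bytewise comparison, with a key that is a proper prefix of another sorting first), and the prefix routine assumes its first argument sorts no later than its second.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in for the DBT type from <db.h>: a pointer and a byte count. */
typedef struct {
	void *data;
	size_t size;
} DBT;

/*
 * Bytewise comparison over the common length; if that part is equal,
 * the shorter key is considered less than the longer one.
 */
int
sketch_compare(const DBT *key1, const DBT *key2)
{
	size_t len = key1->size < key2->size ? key1->size : key2->size;
	int r = len ? memcmp(key1->data, key2->data, len) : 0;

	if (r != 0)
		return r;
	if (key1->size == key2->size)
		return 0;
	return key1->size < key2->size ? -1 : 1;
}

/*
 * Number of leading bytes of key2 needed to show that key2 is greater
 * than key1. Assumes key1 <= key2 under sketch_compare().
 */
size_t
sketch_prefix(const DBT *key1, const DBT *key2)
{
	const unsigned char *p1 = key1->data;
	const unsigned char *p2 = key2->data;
	size_t len = key1->size < key2->size ? key1->size : key2->size;
	size_t i;

	for (i = 0; i < len; i++)
		if (p1[i] != p2[i])
			return i + 1;	/* the first differing byte decides */

	/*
	 * Equal keys: return the key length. Otherwise key1 is a proper
	 * prefix of key2, so one byte beyond key1 is enough.
	 */
	return key1->size == key2->size ? key1->size : key1->size + 1;
}
```

With real code, functions of this shape would be assigned to the compare and prefix members of a BTREEINFO structure before it is passed to dbopen; the same comparison function would then have to be supplied every time the tree is opened.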
| 207 | .Pp
|
|---|
| 208 | If the file already exists (and the
|
|---|
| 209 | .Dv O_TRUNC
|
|---|
| 210 | flag is not specified), the
|
|---|
| 211 | values specified for the
|
|---|
| 212 | .Va flags , lorder
|
|---|
| 213 | and
|
|---|
| 214 | .Va psize
|
|---|
| 215 | arguments
|
|---|
| 216 | are ignored
|
|---|
| 217 | in favor of the values used when the tree was created.
|
|---|
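The documented limits and defaults for psize, minkeypage and lorder can be summarized in a few helper functions. This is a sketch of the rules as stated in this page only; the function and macro names are hypothetical, the real library may apply additional checks, and clamping minkeypage values below 2 to the default is an assumption of the sketch. The value 1234 for little endian order follows the same convention as the 4321 used for big endian order.

```c
#include <stddef.h>

#define SKETCH_MIN_PSIZE 512u           /* documented minimum page size */
#define SKETCH_MAX_PSIZE (64u * 1024u)  /* documented maximum page size (64K) */

/* Return 1 if psize is 0 (choose a default) or within the documented range. */
int
sketch_psize_ok(unsigned int psize)
{
	return psize == 0 ||
	    (psize >= SKETCH_MIN_PSIZE && psize <= SKETCH_MAX_PSIZE);
}

/*
 * Size above which a key or data item would go to overflow pages:
 * the page size divided by minkeypage, where 0 means the default of 2.
 * Values below 2 are clamped to 2 (an assumption of this sketch).
 */
size_t
sketch_overflow_limit(unsigned int psize, int minkeypage)
{
	if (minkeypage < 2)
		minkeypage = 2;
	return (size_t)psize / (size_t)minkeypage;
}

/* Host byte order as an lorder-style integer: 1234 or 4321. */
int
sketch_host_lorder(void)
{
	const unsigned int probe = 1;

	/* On a little endian host the low-order byte is stored first. */
	return *(const unsigned char *)&probe == 1 ? 1234 : 4321;
}
```

For example, under these rules a 4096-byte page with the default minkeypage gives a limit of 2048 bytes, so larger items would be placed on overflow pages.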
| 218 | .Pp
|
|---|
| 219 | Forward sequential scans of a tree are from the least key to the greatest.
|
|---|
| 220 | .Pp
|
|---|
| 221 | Space freed up by deleting key/data pairs from the tree is never reclaimed,
|
|---|
| 222 | although it is normally made available for reuse.
|
|---|
| 223 | This means that the
|
|---|
| 224 | .Nm
|
|---|
| 225 | storage structure is grow-only.
|
|---|
| 226 | The only solutions are to avoid excessive deletions, or to create a fresh
|
|---|
| 227 | tree periodically from a scan of an existing one.
|
|---|
| 228 | .Pp
|
|---|
| 229 | Searches, insertions, and deletions in a
|
|---|
| 230 | .Nm
|
|---|
| 231 | will all complete in
|
|---|
| 232 | O(log_base N) time, where base is the average fill factor.
|
|---|
| 233 | Often, inserting ordered data into
|
|---|
| 234 | .Nm Ns s
|
|---|
| 235 | results in a low fill factor.
|
|---|
| 236 | This implementation has been modified to make ordered insertion the best
|
|---|
| 237 | case, resulting in a much better than normal page fill factor.
|
|---|
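To give a feel for the logarithmic bound, the following sketch estimates how many tree levels are needed for a given number of keys. It assumes that the base can be read as the average number of keys held per page; the function name sketch_levels is hypothetical and the result is a rough estimate, not a statement about the on-disk layout.

```c
#include <limits.h>

/*
 * Smallest number of levels h such that base^h >= nkeys, that is, the
 * ceiling of the base-"base" logarithm of nkeys. "base" is taken to be
 * the average number of keys per page.
 */
unsigned int
sketch_levels(unsigned long nkeys, unsigned long base)
{
	unsigned long reach = 1;	/* keys addressable with h levels */
	unsigned int h = 0;

	if (base < 2)
		return 0;		/* the logarithm is not meaningful */
	while (reach < nkeys) {
		h++;
		if (reach > ULONG_MAX / base)
			break;		/* base^h overflows, so it exceeds nkeys */
		reach *= base;
	}
	return h;
}
```

Under these assumptions, one million keys at about 100 keys per page need three levels, whereas a base of 2 would need 20, which is why a good page fill factor matters for search time.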
| 238 | .Sh ERRORS
|
|---|
| 239 | The
|
|---|
| 240 | .Nm
|
|---|
| 241 | access method routines may fail and set
|
|---|
| 242 | .Va errno
|
|---|
| 243 | for any of the errors specified for the library routine
|
|---|
| 244 | .Xr dbopen 3 .
|
|---|
| 245 | .Sh SEE ALSO
|
|---|
| 246 | .Xr dbopen 3 ,
|
|---|
| 247 | .Xr hash 3 ,
|
|---|
| 248 | .Xr mpool 3 ,
|
|---|
| 249 | .Xr recno 3
|
|---|
| 250 | .Rs
|
|---|
| 251 | .%T "The Ubiquitous B-tree"
|
|---|
| 252 | .%A Douglas Comer
|
|---|
| 253 | .%J "ACM Comput. Surv. 11"
|
|---|
| 254 | .%N 2
|
|---|
| 255 | .%D June 1979
|
|---|
| 256 | .%P 121-138
|
|---|
| 257 | .Re
|
|---|
| 258 | .Rs
|
|---|
| 259 | .%A Bayer
|
|---|
| 260 | .%A Unterauer
|
|---|
| 261 | .%T "Prefix B-trees"
|
|---|
| 262 | .%J "ACM Transactions on Database Systems"
|
|---|
| 263 | .%N 1
|
|---|
| 264 | .%V Vol. 2
|
|---|
| 265 | .%D March 1977
|
|---|
| 266 | .%P 11-26
|
|---|
| 267 | .Re
|
|---|
| 268 | .Rs
|
|---|
| 269 | .%B "The Art of Computer Programming Vol. 3: Sorting and Searching"
|
|---|
| 270 | .%A D. E. Knuth
|
|---|
| 271 | .%D 1973
|
|---|
| 272 | .%P 471-480
|
|---|
| 273 | .Re
|
|---|
| 274 | .Sh BUGS
|
|---|
| 275 | Only big and little endian byte orders are supported.
|
|---|