tdb - a trivial database system
tridge@linuxcare.com December 1999
==================================

This is a simple database API. It was inspired by the realisation that
in Samba we have several ad-hoc bits of code that essentially
implement small databases for sharing structures between parts of
Samba. As I was about to add another I realised that a generic
database module was called for to replace all the ad-hoc bits.

I based the interface on gdbm. I couldn't use gdbm as we need to be
able to have multiple writers to the databases at one time.

Compilation
-----------

add HAVE_MMAP=1 to use mmap instead of read/write
add NOLOCK=1 to disable locking code

Testing
-------

Compile tdbtest.c and link with gdbm for testing. tdbtest will perform
identical operations via tdb and gdbm, then make sure the result is the
same.

Also included is tdbtool, which allows simple database manipulation
on the command line.

tdbtest and tdbtool are not built as part of Samba, but are included
for completeness.

Interface
---------

The interface is very similar to gdbm except for the following:

- different open interface. The tdb_open call is more similar to a
  traditional open()
- no tdbm_reorganise() function
- no tdbm_sync() function. No operations are cached in the library anyway
- added a tdb_traverse() function for traversing the whole database
- added transaction support

A general rule for using tdb is that the caller frees any returned
TDB_DATA structures. Just call free(p.dptr) to free a TDB_DATA
return value called p. This is the same as gdbm.

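The free rule above can be sketched as a small helper. This is only an
illustration: the TDB_DATA typedef below is a local stand-in (assumed
to mirror the dptr/dsize pair declared in tdb.h) so the snippet
compiles on its own, and release_tdb_data() is a hypothetical name,
not part of tdb.

```c
#include <stdlib.h>

/* Local stand-in so the sketch compiles without tdb.h. The real
 * definition comes from tdb.h; the dsize field is an assumption here,
 * only dptr is named in the text above. */
typedef struct {
    unsigned char *dptr;
    size_t dsize;
} TDB_DATA;

/* Hypothetical helper: free a TDB_DATA returned by tdb and reset it,
 * so that a second call is harmless. free(NULL) is a no-op, so an
 * error return (NULL dptr) needs no special case. */
void release_tdb_data(TDB_DATA *p)
{
    free(p->dptr);
    p->dptr = NULL;
    p->dsize = 0;
}
```

With the real library the value passed in would typically be the
result of a call such as tdb_fetch().
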
Here is a full list of tdb functions with brief descriptions.


----------------------------------------------------------------------
TDB_CONTEXT *tdb_open(char *name, int hash_size, int tdb_flags,
                      int open_flags, mode_t mode)

open the database, creating it if necessary

The open_flags and mode are passed straight to the open call on the database
file. A flags value of O_WRONLY is invalid.

The hash size is advisory, use zero for a default value.

returns NULL on error

possible tdb_flags are:
  TDB_CLEAR_IF_FIRST - clear database if we are the only one with it open
  TDB_INTERNAL - don't use a file, instead store the data in
                 memory. The filename is ignored in this case.
  TDB_NOLOCK - don't do any locking
  TDB_NOMMAP - don't use mmap
  TDB_NOSYNC - don't synchronise transactions to disk
  TDB_SEQNUM - maintain a sequence number
  TDB_VOLATILE - activate the per-hashchain freelist, default 5
  TDB_ALLOW_NESTING - allow transactions to nest
  TDB_DISALLOW_NESTING - do not allow transactions to nest

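As a sketch of the O_WRONLY restriction above, a caller can validate
open_flags before handing them to tdb_open. The helper name
open_flags_ok() is hypothetical and not part of tdb; it only inspects
the access mode bits using the standard O_ACCMODE mask from fcntl.h.

```c
#include <fcntl.h>

/* Hypothetical pre-check, not part of tdb: returns 1 if open_flags
 * uses an access mode the text above allows, 0 for O_WRONLY. */
int open_flags_ok(int open_flags)
{
    /* Mask out everything except the access mode (O_RDONLY,
     * O_WRONLY or O_RDWR) before comparing. */
    return (open_flags & O_ACCMODE) != O_WRONLY;
}
```

For example, O_RDWR | O_CREAT passes the check, while
O_WRONLY | O_CREAT does not.
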
----------------------------------------------------------------------
TDB_CONTEXT *tdb_open_ex(char *name, int hash_size, int tdb_flags,
                         int open_flags, mode_t mode,
                         const struct tdb_logging_context *log_ctx,
                         tdb_hash_func hash_fn)

This is like tdb_open(), but allows you to pass an initial logging
context and a hash function. Be careful when passing a hash function -
all users of the database must use the same hash function or you will
get data corruption.


----------------------------------------------------------------------
char *tdb_error(TDB_CONTEXT *tdb);

return an error string for the last tdb error

----------------------------------------------------------------------
int tdb_close(TDB_CONTEXT *tdb);

close a database

----------------------------------------------------------------------
TDB_DATA tdb_fetch(TDB_CONTEXT *tdb, TDB_DATA key);

fetch an entry in the database given a key
if the return value has a null dptr then an error occurred

caller must free the resulting data

----------------------------------------------------------------------
int tdb_parse_record(struct tdb_context *tdb, TDB_DATA key,
                     int (*parser)(TDB_DATA key, TDB_DATA data,
                                   void *private_data),
                     void *private_data);

Hand a record to a parser function without allocating it.

This function is meant as a fast tdb_fetch alternative for large records
that are frequently read. The "key" and "data" arguments point directly
into the tdb shared memory; they are not aligned at any boundary.

WARNING: The parser is called while tdb holds a lock on the record. DO NOT
call other tdb routines from within the parser. Also, for good performance
you should make the parser fast to allow parallel operations.

tdb_parse_record returns -1 if the record was not found. If the record was
found, the return value of "parser" is passed up to the caller.

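Because the buffers handed to the parser are unaligned, casting
data.dptr to a wider type is not safe in C; copying the bytes out with
memcpy is. A minimal sketch of such a parser follows. The TDB_DATA
typedef is a local stand-in (assumed to match tdb.h) so the snippet
compiles on its own, and both the record layout (a uint32_t counter at
the start of the record) and the name parse_counter() are made up for
illustration.

```c
#include <stdint.h>
#include <string.h>

/* Local stand-in so the sketch compiles without tdb.h. */
typedef struct {
    unsigned char *dptr;
    size_t dsize;
} TDB_DATA;

/* Parser with the callback shape tdb_parse_record() expects.
 * private_data points at a uint32_t that receives the counter.
 * Returns 0 on success, -1 if the record is too short. It calls no
 * tdb routines and does very little work, as the warning above asks. */
int parse_counter(TDB_DATA key, TDB_DATA data, void *private_data)
{
    uint32_t value;

    (void)key; /* not needed for this record layout */
    if (data.dsize < sizeof(value)) {
        return -1;
    }
    /* memcpy copes with any alignment; *(uint32_t *)data.dptr may not. */
    memcpy(&value, data.dptr, sizeof(value));
    *(uint32_t *)private_data = value;
    return 0;
}
```

Since tdb_parse_record itself returns -1 when the record is missing, a
parser that needs to tell the two cases apart could return a different
error value.
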
----------------------------------------------------------------------
int tdb_exists(TDB_CONTEXT *tdb, TDB_DATA key);

check if an entry in the database exists

note that 1 is returned if the key is found and 0 is returned if not
found. This doesn't match the conventions in the rest of this module,
but is compatible with gdbm.

----------------------------------------------------------------------
int tdb_traverse(TDB_CONTEXT *tdb, int (*fn)(TDB_CONTEXT *tdb,
                 TDB_DATA key, TDB_DATA dbuf, void *state), void *state);

traverse the entire database - calling fn(tdb, key, data, state) on each
element.

return -1 on error or the record count traversed

if fn is NULL then it is not called

a non-zero return value from fn() indicates that the traversal
should stop. Traversal callbacks may not start transactions.

WARNING: The data buffer given to the callback fn does NOT meet the
alignment restrictions malloc gives you.

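The callback contract above (non-zero return stops the traversal) can
be sketched as follows. TDB_CONTEXT and TDB_DATA are local stand-ins
(assumed to match tdb.h) so the snippet compiles on its own; struct
scan_state and scan_fn() are hypothetical names chosen for this
example.

```c
#include <stddef.h>

/* Local stand-ins so the sketch compiles without tdb.h. */
typedef struct tdb_context TDB_CONTEXT;
typedef struct {
    unsigned char *dptr;
    size_t dsize;
} TDB_DATA;

/* Hypothetical state passed through the void *state argument. */
struct scan_state {
    size_t seen;        /* records visited so far */
    size_t limit;       /* stop once this many were visited */
    size_t total_bytes; /* sum of the data sizes visited */
};

/* Callback with the shape tdb_traverse() expects. Returning non-zero
 * asks for the traversal to stop; returning 0 lets it continue. Only
 * dbuf.dsize is read, so the unaligned data buffer is never touched. */
int scan_fn(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, void *state)
{
    struct scan_state *s = state;

    (void)tdb;
    (void)key;
    s->seen++;
    s->total_bytes += dbuf.dsize;
    return s->seen >= s->limit;
}
```

With the real library this would be passed as
tdb_traverse(tdb, scan_fn, &s) with s.limit set beforehand.
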
----------------------------------------------------------------------
int tdb_traverse_read(TDB_CONTEXT *tdb, int (*fn)(TDB_CONTEXT *tdb,
                      TDB_DATA key, TDB_DATA dbuf, void *state), void *state);

traverse the entire database - calling fn(tdb, key, data, state) on
each element, but marking the database read only during the
traversal, so any write operations will fail. This allows tdb to
use read locks, which increases the parallelism possible during the
traversal.

return -1 on error or the record count traversed

if fn is NULL then it is not called

a non-zero return value from fn() indicates that the traversal
should stop. Traversal callbacks may not start transactions.

----------------------------------------------------------------------
TDB_DATA tdb_firstkey(TDB_CONTEXT *tdb);

find the first entry in the database and return its key

the caller must free the returned data

----------------------------------------------------------------------
TDB_DATA tdb_nextkey(TDB_CONTEXT *tdb, TDB_DATA key);

find the next entry in the database, returning its key

the caller must free the returned data

----------------------------------------------------------------------
int tdb_delete(TDB_CONTEXT *tdb, TDB_DATA key);

delete an entry in the database given a key

----------------------------------------------------------------------
int tdb_store(TDB_CONTEXT *tdb, TDB_DATA key, TDB_DATA dbuf, int flag);

store an element in the database, replacing any existing element
with the same key

If flag==TDB_INSERT then don't overwrite an existing entry
If flag==TDB_MODIFY then don't create a new entry

return 0 on success, -1 on failure

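The flag rules for tdb_store can be read as a small decision table.
The sketch below only models that table; it does not call tdb. The
enum store_mode values are hypothetical stand-ins for the flag
argument (the real constants come from tdb.h), and store_allowed() is
a made-up name.

```c
/* Hypothetical stand-ins for the flag argument; the real constants
 * come from tdb.h and their values are not assumed here. */
enum store_mode { MODE_REPLACE, MODE_INSERT, MODE_MODIFY };

/* Returns 1 if a store with the given mode would go ahead, 0 if it
 * would be refused, given whether the key already exists. */
int store_allowed(enum store_mode mode, int key_exists)
{
    switch (mode) {
    case MODE_INSERT:
        return !key_exists;     /* don't overwrite an existing entry */
    case MODE_MODIFY:
        return key_exists != 0; /* don't create a new entry */
    default:
        return 1;               /* plain store: replace or create */
    }
}
```

A refused store corresponds to the -1 failure return described above.
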
----------------------------------------------------------------------
int tdb_writelock(TDB_CONTEXT *tdb);

lock the database. If we already have it locked then don't do anything

----------------------------------------------------------------------
int tdb_writeunlock(TDB_CONTEXT *tdb);

unlock the database

----------------------------------------------------------------------
int tdb_lockchain(TDB_CONTEXT *tdb, TDB_DATA key);

lock one hash chain. This is meant to be used to reduce locking
contention - it cannot guarantee how many records will be locked

----------------------------------------------------------------------
int tdb_unlockchain(TDB_CONTEXT *tdb, TDB_DATA key);

unlock one hash chain

----------------------------------------------------------------------
int tdb_transaction_start(TDB_CONTEXT *tdb);

start a transaction. All operations after the transaction start can
either be committed with tdb_transaction_commit() or cancelled with
tdb_transaction_cancel().

If you call tdb_transaction_start() again on the same tdb context
while a transaction is in progress, then the same transaction
buffer is re-used. The number of tdb_transaction_{commit,cancel}
operations must match the number of successful
tdb_transaction_start() calls.

Note that transactions are by default disk synchronous, and use a
recovery area in the database to automatically recover the database
on the next open if the system crashes during a transaction. You
can disable the synchronous transaction recovery setup using the
TDB_NOSYNC flag, which will greatly speed up operations at the risk
of corrupting your database if the system crashes.

Operations made within a transaction are not visible to other users
of the database until a successful commit.

----------------------------------------------------------------------
int tdb_transaction_cancel(TDB_CONTEXT *tdb);

cancel the current transaction, discarding all write and lock
operations that have been made since the transaction started.


----------------------------------------------------------------------
int tdb_transaction_commit(TDB_CONTEXT *tdb);

commit the current transaction, updating the database and releasing
the transaction locks.

----------------------------------------------------------------------
int tdb_transaction_prepare_commit(TDB_CONTEXT *tdb);

prepare to commit the current transaction, for two-phase commits.
Once prepared for commit, the only allowed calls are
tdb_transaction_commit() or tdb_transaction_cancel(). Preparing
allocates disk space for the pending updates, so a subsequent
commit should succeed (barring any hardware failures).

----------------------------------------------------------------------
int tdb_check(TDB_CONTEXT *tdb,
              int (*check)(TDB_DATA key, TDB_DATA data, void *private_data),
              void *private_data);

check the consistency of the database, calling back the check function
(if non-NULL) with each record. If some consistency check fails, or
the supplied check function returns -1, tdb_check returns -1, otherwise
0. Note that the logging function (if set) will be called with additional
information on the corruption found.
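
A check function for tdb_check might look like the sketch below. The
TDB_DATA typedef is a local stand-in (assumed to match tdb.h) so the
snippet compiles on its own; the rule being enforced (non-empty key,
data at least a minimum size) and the name check_min_size() are
invented for illustration.

```c
#include <stddef.h>

/* Local stand-in so the sketch compiles without tdb.h. */
typedef struct {
    unsigned char *dptr;
    size_t dsize;
} TDB_DATA;

/* Check callback with the shape tdb_check() expects. private_data
 * points at a size_t holding the smallest acceptable data size.
 * Returns 0 if the record looks sane, -1 otherwise. */
int check_min_size(TDB_DATA key, TDB_DATA data, void *private_data)
{
    const size_t *min_size = private_data;

    if (key.dsize == 0 || data.dsize < *min_size) {
        return -1; /* per the text above, tdb_check then returns -1 */
    }
    return 0;
}
```

With the real library this would be passed as
tdb_check(tdb, check_min_size, &min_size).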