NAME
       db_open - database access methods

SYNOPSIS
       #include <db.h>

       int db_open(const char *file, DBTYPE type, u_int32_t flags, int mode,
            DB_ENV *dbenv, DB_INFO *dbinfo, DB **dbpp);

       int DB->close(DB *db, u_int32_t flags);

       int DB->cursor(DB *db, DB_TXN *txnid, DBC **cursorp);

       int DB->del(DB *db, DB_TXN *txnid, DBT *key, u_int32_t flags);

       int DB->fd(DB *db, int *fdp);

       int DB->get(DB *db, DB_TXN *txnid, DBT *key, DBT *data, u_int32_t flags);

       int DB->put(DB *db, DB_TXN *txnid, DBT *key, DBT *data, u_int32_t flags);

       int DB->sync(DB *db, u_int32_t flags);

       int DB->stat(DB *db, void *sp, void *(*db_malloc)(size_t), u_int32_t flags);

DESCRIPTION
       The DB library is a family of groups of functions that provides a modular programming interface to transactions and record-oriented file access. The library includes support for transactions, locking, logging and file page caching, as well as various indexed access methods. Many of the functional groups (e.g., the file page caching functions) are useful independent of the other DB functions, although some functional groups are explicitly based on other functional groups (e.g., transactions and logging). For a general description of the DB package, see db_intro(3).

       This manual page describes the overall structure of the DB library access methods.

       The currently supported file formats are btree, hashed and recno. The btree format is a representation of a sorted, balanced tree structure.
       The hashed format is an extensible, dynamic hashing scheme. The recno format supports fixed or variable length records (optionally retrieved from a flat text file).

       Storage and retrieval for the DB access methods are based on key/data pairs, or DBT structures as they are typedef'd in the include file. See db_dbt(3) for specific information on the structure and capabilities of a DBT.

       The db_open function opens the database represented by file for both reading and writing. Files never intended to be shared or preserved on disk may be created by setting the file parameter to NULL.

       The db_open function copies a pointer to a DB structure (as typedef'd in the include file), into the memory location referenced by dbpp. This structure includes a set of functions to perform various database actions, as described below. The db_open function returns the value of errno on failure and 0 on success.

       Note, while most of the access methods use file as the name of an underlying file on disk, this is not guaranteed. Also, calling db_open is a reasonably expensive operation. (This is based on a model where the DBMS keeps a set of files open for a long time rather than opening and closing them on each query.)

       The type argument is of type DBTYPE (as defined in the include file) and must be set to one of DB_BTREE, DB_HASH, DB_RECNO or DB_UNKNOWN. If type is DB_UNKNOWN, the database must already exist and db_open will then determine if it is of type DB_BTREE, DB_HASH or DB_RECNO.

       The flags and mode arguments specify how files will be opened and/or created when they don't already exist. The flags value is specified by or'ing together one or more of the following values:

       DB_CREATE
            Create any underlying files, as necessary. If the files do not already exist and the DB_CREATE flag is not specified, the call will fail.

       DB_NOMMAP
            Do not map this file (see db_mpool(3) for further information).

       DB_RDONLY
            Open the database for reading only. Any attempt to write the database using the access methods will fail regardless of the actual permissions of any underlying files.

       DB_THREAD
            Cause the DB handle returned by the db_open function to be usable by multiple threads within a single address space, i.e., to be ``free-threaded''.

       DB_TRUNCATE
            ``Truncate'' the database if it exists, i.e., behave as if the database were just created, discarding any previous contents.

       All files created by the access methods are created with mode mode (as described in chmod(2)) and modified by the process' umask value at the time of creation (see umask(2)). The group ownership of created files is based on the system and directory defaults, and is not further specified by DB.

DB_ENV
       The access methods make calls to the other subsystems in the DB library based on the dbenv argument to db_open, which is a pointer to a structure of type DB_ENV (typedef'd in the include file). Applications will normally use the same DB_ENV structure (initialized by db_appinit(3)), as an argument to all of the subsystems in the DB package.

       References to the DB_ENV structure are maintained by DB, so it may not be discarded until the last close function, corresponding to an open function for which it was an argument, has returned. In order to ensure compatibility with future releases of DB, all fields of the DB_ENV structure that are not explicitly set should be initialized to 0 before the first time the structure is used. Do this by declaring the structure external or static, or by calling the C library routine bzero(3) or memset(3).

       The fields of the DB_ENV structure used by db_open are described below. If dbenv is NULL or any of its fields are set to 0, defaults appropriate for the system are used where possible.
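
       The zero-initialization advice above can be illustrated with a minimal sketch. The structure example_env below is a hypothetical stand-in (the real DB_ENV is defined by the DB include file and has different fields); only memset(3) and the C rule that file-scope objects start out zeroed are relied upon.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a configuration structure such as DB_ENV.
 * The field names mirror the text, but this is NOT the real definition. */
struct example_env {
    void *lg_info;
    void *lk_info;
    void *mp_info;
    void *tx_info;
};

/* An object declared at file scope has static storage duration, so the
 * C language initializes all of its members to zero / null pointers. */
struct example_env shared_env;

/* Clear a structure with automatic storage (e.g. a local variable)
 * before its first use; unset fields then read as 0. */
void example_env_clear(struct example_env *env)
{
    assert(env != NULL);
    memset(env, 0, sizeof(*env));
}
```

       A caller would clear a local structure with example_env_clear(&env) and then assign only the fields it actually needs, leaving the rest at 0 so that defaults apply.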

       The following fields in the DB_ENV structure may be initialized before calling db_open:

       DB_LOG *lg_info;
            If modifications to the file being opened should be logged, the lg_info field contains a return value from the function log_open. If lg_info is NULL, no logging is done by the DB access methods.

       DB_LOCKTAB *lk_info;
            If locking is required for the file being opened (as is the case when multiple processes or threads are accessing the same file), the lk_info field contains a return value from the function lock_open. If lk_info is NULL, no locking is done by the DB access methods.

            If both locking and transactions are being performed (i.e., both lk_info and tx_info are non-NULL), the transaction ID will be used as the locker ID. If only locking is being performed, db_open will acquire a locker ID from lock_id(3), and will use it for all locks required for this instance of db_open.

       DB_MPOOL *mp_info;
            If the cache for the file being opened should be maintained in a shared buffer pool, the mp_info field contains a return value from the function memp_open. If mp_info is NULL, a memory pool may still be created by DB, but it will be private to the application and managed by DB.

       DB_TXNMGR *tx_info;
            If the accesses to the file being opened should take place in the context of transactions (providing atomicity and error recovery), the tx_info field contains a return value from the function txn_open (see db_txn(3)). If transactions are specified, the application is responsible for making suitable calls to txn_begin, txn_abort, and txn_commit. If tx_info is NULL, no transaction support is done by the DB access methods.

            When the access methods are used in conjunction with transactions, the application must abort the transaction (using txn_abort) if any of the transaction protected access method calls (i.e., any calls other than open, close and sync) returns a system error (e.g., deadlock, which returns EAGAIN). As described by db_intro(3), a system error is any value greater than 0.

DB_INFO
       The access methods are configured using the DB_INFO data structure argument to db_open. The DB_INFO structure is typedef'd in the include file and has a large number of fields, most specific to a single access method, although a few are shared. The fields that are common to all access methods are listed here; those specific to an individual access method are described below. No reference to the DB_INFO structure is maintained by DB, so it is possible to discard it as soon as the db_open call returns.

       In order to ensure compatibility with future releases of DB, all fields of the DB_INFO structure should be initialized to 0 before the structure is used. Do this by declaring the structure external or static, or by calling the C library function bzero(3) or memset(3).

       If possible, defaults appropriate for the system are used for the DB_INFO fields if dbinfo is NULL or any fields of the DB_INFO structure are set to 0. The following DB_INFO fields may be initialized before calling db_open:

       size_t db_cachesize;
            A suggested maximum size of the memory pool cache, in bytes. If db_cachesize is 0, an appropriate default is used. It is an error to specify both the mp_info field and a non-zero db_cachesize.

            Note, the minimum number of pages in the cache should be no less than 10, and the access methods will fail if an insufficiently large cache is specified.
            In addition, for applications that exhibit strong locality in their data access patterns, increasing the size of the cache can significantly improve application performance.

       int db_lorder;
            The byte order for integers in the stored database metadata. The number should represent the order as an integer, for example, big endian order is the number 4321, and little endian order is the number 1234. If db_lorder is 0, the host order of the machine where the DB library was compiled is used.

            The value of db_lorder is ignored except when databases are being created. If a database already exists, the byte order it uses is determined when the file is read.

            The access methods provide no guarantees about the byte ordering of the application data stored in the database, and applications are responsible for maintaining any necessary ordering.

       size_t db_pagesize;
            The size of the pages used to hold items in the database, in bytes. The minimum page size is 512 bytes and the maximum page size is 64K bytes. If db_pagesize is 0, a page size is selected based on the underlying filesystem I/O block size. The selected size has a lower limit of 512 bytes and an upper limit of 16K bytes.

       void *(*db_malloc)(size_t);
            The flag DB_DBT_MALLOC, when specified in the DBT structure, will cause the DB library to allocate memory which then becomes the responsibility of the calling application. See db_dbt(3) for more information.

            On systems where there may be multiple library versions of malloc (notably Windows NT), specifying the DB_DBT_MALLOC flag will fail because the DB library will allocate memory from a different heap than the application will use to free it. To avoid this problem, the db_malloc field should be set to point to the application's allocation routine.
            If db_malloc is non-NULL, it will be used to allocate the memory returned when the DB_DBT_MALLOC flag is set. The db_malloc function must match the calling conventions of the malloc(3) library routine.

BTREE
       The btree data structure is a sorted, balanced tree structure storing associated key/data pairs. Searches, insertions, and deletions in the btree will all complete in O(log_base N), where N is the number of keys in the tree and base is the average number of keys per page. Often, inserting ordered data into btrees results in pages that are half-full. This implementation has been modified to make ordered (or inverse ordered) insertion the best case, resulting in nearly perfect page space utilization.

       Space freed by deleting key/data pairs from the database is never reclaimed from the filesystem, although it is reused where possible. This means that the btree storage structure is grow-only. If sufficiently many keys are deleted from a tree that shrinking the underlying database file is desirable, this can be accomplished by creating a new tree from a scan of the existing one.

       The following additional fields and flags may be initialized in the DB_INFO structure before calling db_open, when using the btree access method:

       int (*bt_compare)(const DBT *, const DBT *);
            The bt_compare function is the key comparison function. It must return an integer less than, equal to, or greater than zero if the first key argument is considered to be respectively less than, equal to, or greater than the second key argument. The same comparison function must be used on a given tree every time it is opened.

            The data and size fields of the DBT are the only fields that may be used for the purposes of this comparison.

            If bt_compare is NULL, the keys are compared lexically, with shorter keys collating before longer keys.

       u_int32_t bt_minkey;
            The minimum number of keys that will be stored on any single page.
            This value is used to determine which keys will be stored on overflow pages, i.e., if a key or data item is larger than the pagesize divided by the bt_minkey value, it will be stored on overflow pages instead of in the page itself. The bt_minkey value specified must be at least 2; if bt_minkey is 0, a value of 2 is used.

       size_t (*bt_prefix)(const DBT *, const DBT *);
            The bt_prefix function is the prefix comparison function. If specified, this function must return the number of bytes of the second key argument that are necessary to determine that it is greater than the first key argument. If the keys are equal, the key length should be returned.

            The data and size fields of the DBT are the only fields that may be used for the purposes of this comparison.

            This is used to compress the keys stored on the btree internal pages. The usefulness of this is data dependent, but in some data sets can produce significantly reduced tree sizes and search times. If bt_prefix is NULL, and no comparison function is specified, a default lexical comparison function is used. If bt_prefix is NULL and a comparison function is specified, no prefix comparison is done.

       u_int32_t flags;
            The following additional flags may be specified by or'ing together one or more of the following values:

            DB_DUP
                 Permit duplicate keys in the tree, i.e., insertion when the key of the key/data pair being inserted already exists in the tree will be successful. The ordering of duplicates in the tree is determined by the order of insertion, unless the ordering is otherwise specified by use of a cursor (see db_cursor(3) for more information). It is an error to specify both DB_DUP and DB_RECNUM.

            DB_RECNUM
                 Support retrieval from btrees using record numbers. For more information, see the DB_SET_RECNO flag to the DB->get function (below), and the cursor c_get function (in db_cursor(3)).

                 Logical record numbers in btrees are mutable in the face of record insertion or deletion. See the DB_RENUMBER flag in the RECNO section below for further discussion.

                 Maintaining record counts within a btree introduces a serious point of contention, namely the page locations where the record counts are stored. In addition, the entire tree must be locked during both insertions and deletions, effectively single-threading the tree for those operations. Specifying DB_RECNUM can result in serious performance degradation for some applications and data sets.

                 It is an error to specify both DB_DUP and DB_RECNUM.

HASH
       The hash data structure is an extensible, dynamic hashing scheme. Backward compatible interfaces to the functions described in dbm(3), ndbm(3) and hsearch(3) are provided, however these interfaces are not compatible with previous file formats.

       The following additional fields and flags may be initialized in the DB_INFO structure before calling db_open, when using the hash access method:

       u_int32_t h_ffactor;
            The desired density within the hash table. It is an approximation of the number of keys allowed to accumulate in any one bucket, determining when the hash table grows or shrinks. The default value is 0, indicating that the fill factor will be selected dynamically as pages are filled.

       u_int32_t (*h_hash)(const void *, u_int32_t);
            The h_hash field is a user defined hash function; if h_hash is NULL, a default hash function is used. Since no hash function performs equally well on all possible data, the user may find that the built-in hash function performs poorly with a particular data set. User specified hash functions must take a pointer to a byte string and a length as arguments and return a u_int32_t value.

            If a hash function is specified, db_open will attempt to determine if the hash function specified is the same as the one with which the database was created, and will fail if it detects that it is not.
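
            As an illustration of the calling convention described for h_hash, the following sketch implements 32-bit FNV-1a, a well-known general-purpose hash that is not the DB default. The name example_hash is hypothetical, and uint32_t from <stdint.h> is assumed to be interchangeable with the u_int32_t type named in the prototype.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit FNV-1a over a byte string: takes a pointer to the bytes and a
 * length, and returns a 32-bit value, matching the shape required of a
 * user-specified hash function. Illustrative only. */
uint32_t example_hash(const void *bytes, uint32_t length)
{
    const unsigned char *p = (const unsigned char *)bytes;
    uint32_t hash = 2166136261u;   /* FNV offset basis */
    uint32_t i;

    assert(bytes != NULL || length == 0);
    for (i = 0; i < length; i++) {
        hash ^= p[i];              /* mix in the next byte */
        hash *= 16777619u;         /* multiply by the FNV prime (mod 2^32) */
    }
    return hash;
}
```

            Because the same function must be supplied every time the database is opened, a function like this should be treated as part of the on-disk format once a database has been created with it.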

       u_int32_t h_nelem;
            An estimate of the final size of the hash table. If not set or set too low, hash tables will expand gracefully as keys are entered, although a slight performance degradation may be noticed. The default value is 1.

       u_int32_t flags;
            The following additional flags may be specified by or'ing together one or more of the following values:

            DB_DUP
                 Permit duplicate keys in the database, i.e., insertion when the key of the key/data pair being inserted already exists in the database will be successful. The ordering of duplicates in the database is determined by the order of insertion, unless the ordering is otherwise specified by use of a cursor (see db_cursor(3) for more information).

RECNO
       The recno access method provides support for fixed and variable length records, optionally backed by a flat text (byte stream) file. Both fixed and variable length records are accessed by their logical record number.

       It is valid to create a record whose record number is more than one greater than the last record currently in the database. For example, the creation of record number 8, when records 6 and 7 do not yet exist, is not an error. However, any attempt to retrieve such records (e.g., records 6 and 7) will return DB_KEYEMPTY.

       Deleting a record will not, by default, renumber records following the deleted record (see DB_RENUMBER below for more information). Any attempt to retrieve deleted records will return DB_KEYEMPTY.

       The following additional fields and flags may be initialized in the DB_INFO structure before calling db_open, when using the recno access method:

       int re_delim;
            For variable length records, if the re_source file is specified and the DB_DELIMITER flag is set, the delimiting byte used to mark the end of a record in the source file. If the re_source file is specified and the DB_DELIMITER flag is not set, <newline> characters (i.e., ``\n'', 0x0a) are interpreted as end-of-record markers.

       u_int32_t re_len;
            The length of a fixed-length record.

       int re_pad;
            For fixed length records, if the DB_PAD flag is set, the pad character for short records. If the DB_PAD flag is not set, <space> characters (i.e., 0x20) are used for padding.

       char *re_source;
            The purpose of the re_source field is to provide fast access and modification to databases that are normally stored as flat text files.

            If the re_source field is non-NULL, it specifies an underlying flat text database file that is read to initialize a transient record number index. In the case of variable length records, the records are separated by the byte value re_delim. For example, standard UNIX byte stream files can be interpreted as a sequence of variable length records separated by <newline> characters.

            In addition, when cached data would normally be written back to the underlying database file (e.g., the close or sync functions are called), the in-memory copy of the database will be written back to the re_source file.

            By default, the backing source file is read lazily, i.e., records are not read from the file until they are requested by the application. If multiple processes (not threads) are accessing a recno database concurrently and either inserting or deleting records, the backing source file must be read in its entirety before more than a single process accesses the database, and only that process should specify the backing source file as part of the db_open call. See the DB_SNAPSHOT flag below for more information.

            Reading and writing the backing source file specified by re_source cannot be transactionally protected because it involves filesystem operations that are not part of the DB transaction methodology. For this reason, if a temporary database is used to hold the records, i.e., a NULL was specified as the file argument to db_open, it is possible to lose the contents of the re_source file, e.g., if the system crashes at the right instant. If a file is used to hold the database, i.e., a file name was specified as the file argument to db_open, normal database recovery on that file can be used to prevent information loss, although it is still possible that the contents of re_source will be lost if the system crashes.

            The re_source file must already exist (but may be zero-length) when db_open is called.

            For all of the above reasons, the re_source field is generally used to specify databases that are read-only for DB applications, and that are either generated on the fly by software tools, or modified using a different mechanism, e.g., a text editor.

       u_int32_t flags;
            The following additional flags may be specified by or'ing together one or more of the following values:

            DB_DELIMITER
                 The re_delim field is set.

            DB_FIXEDLEN
                 The records are fixed-length, not byte delimited. The structure element re_len specifies the length of the record, and the structure element re_pad is used as the pad character.

                 Any records added to the database that are less than re_len bytes long are automatically padded. Any attempt to insert records into the database that are greater than re_len bytes long will cause the call to fail immediately and return an error.

            DB_PAD
                 The re_pad field is set.
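
            The fixed-length rule described under DB_FIXEDLEN (short records are padded, long records are rejected) can be sketched as follows. The helper example_fixed_record is hypothetical and is not part of the DB library; it only models the rule on a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build a fixed-length record of re_len bytes in out.
 * Data shorter than re_len is padded on the right with re_pad;
 * data longer than re_len is rejected, mirroring the failure
 * described in the text. Returns 0 on success, -1 on error. */
int example_fixed_record(char *out, size_t re_len,
                         const void *data, size_t data_len, int re_pad)
{
    assert(out != NULL);
    assert(data != NULL || data_len == 0);

    if (data_len > re_len)
        return -1;                       /* record too long: fail */
    if (data_len > 0)
        memcpy(out, data, data_len);     /* copy the caller's bytes */
    memset(out + data_len, re_pad, re_len - data_len);  /* pad the rest */
    return 0;
}
```

            With re_len set to 8 and a pad character of 0x20, the three bytes "abc" would be stored as "abc" followed by five spaces.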

            DB_RENUMBER
                 Specifying the DB_RENUMBER flag causes the logical record numbers to be mutable, and change as records are added to and deleted from the database. For example, the deletion of record number 4 causes records numbered 5 and greater to be renumbered downward by 1. If a cursor was positioned to record number 4 before the deletion, it will reference the new record number 4, if any such record exists, after the deletion. If a cursor was positioned after record number 4 before the deletion, it will be shifted downward 1 logical record, continuing to reference the same record as it did before.

                 Using the c_put or put interfaces to create new records will cause the creation of multiple records if the record number is more than one greater than the largest record currently in the database. For example, creating record 28, when record 25 was previously the last record in the database, will create records 26 and 27 as well as 28. Attempts to retrieve records that were created in this manner will result in an error return of DB_KEYEMPTY.

                 If a created record is not at the end of the database, all records following the new record will be automatically renumbered upward by 1. For example, the creation of a new record numbered 8 causes records numbered 8 and greater to be renumbered upward by 1. If a cursor was positioned to record number 8 or greater before the insertion, it will be shifted upward 1 logical record, continuing to reference the same record as it did before.

                 For these reasons, concurrent access to a recno database with the DB_RENUMBER flag specified may be largely meaningless, although it is supported.

            DB_SNAPSHOT
                 This flag specifies that any specified re_source file be read in its entirety when db_open is called. If this flag is not specified, the re_source file may be read lazily.
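
       The renumbering behaviour described for DB_RENUMBER can be modelled with a small in-memory array. This is only a sketch of the numbering rules, not the DB implementation; the type example_recno and every function below are hypothetical, and record numbers are 1-based as in the text.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MAX_RECORDS 16

/* Toy model of a renumbering record list (hypothetical, not DB code). */
struct example_recno {
    int values[EXAMPLE_MAX_RECORDS];
    size_t count;                 /* number of records currently stored */
};

/* Delete record recno; records above it are renumbered downward by 1. */
int example_delete(struct example_recno *db, size_t recno)
{
    size_t i;

    assert(db != NULL);
    if (recno < 1 || recno > db->count)
        return -1;
    for (i = recno; i < db->count; i++)
        db->values[i - 1] = db->values[i];
    db->count--;
    return 0;
}

/* Insert value as record recno; records numbered recno and greater
 * are renumbered upward by 1. */
int example_insert(struct example_recno *db, size_t recno, int value)
{
    size_t i;

    assert(db != NULL);
    if (recno < 1 || recno > db->count + 1 ||
        db->count == EXAMPLE_MAX_RECORDS)
        return -1;
    for (i = db->count; i >= recno; i--)
        db->values[i] = db->values[i - 1];
    db->values[recno - 1] = value;
    db->count++;
    return 0;
}

/* Record number a cursor refers to after record `deleted` is removed:
 * a cursor on the deleted number keeps that number, later ones shift down. */
size_t example_cursor_after_delete(size_t cursor, size_t deleted)
{
    return cursor > deleted ? cursor - 1 : cursor;
}

/* Record number a cursor refers to after an insertion at `inserted`:
 * cursors at or beyond the new record shift up to follow their record. */
size_t example_cursor_after_insert(size_t cursor, size_t inserted)
{
    return cursor >= inserted ? cursor + 1 : cursor;
}
```

       Starting from six records, deleting record 4 leaves five records with the old record 5 now numbered 4; a cursor that was on record 6 then refers to record 5, which holds the same data as before.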

DB OPERATIONS
       The DB structure returned by db_open describes a database type, and includes a set of functions to perform various actions, as described below. Each of these functions takes a pointer to a DB structure, and may take one or more DBT *'s and a flag value as well. The fields of the DB structure are as follows:

       DBTYPE type;
            The type of the underlying access method (and file format). Set to one of DB_BTREE, DB_HASH or DB_RECNO. This field may be used to determine the type of the database after a return from db_open with the type argument set to DB_UNKNOWN.

       int (*close)(DB *db, u_int32_t flags);
            A pointer to a function to flush any cached information to disk, close any open cursors (see db_cursor(3)), free any allocated resources, and close any underlying files. Since key/data pairs are cached in memory, failing to sync the file with the close or sync function may result in inconsistent or lost information.

            The flags parameter must be set to 0 or the following value:

            DB_NOSYNC
                 Do not flush cached information to disk.

            The DB_NOSYNC flag is a dangerous option. It should only be set if the application is doing logging (with transactions) so that the database is recoverable after a system or application crash, or if the database is always generated from scratch after any system or application crash.

            It is important to understand that flushing cached information to disk only minimizes the window of opportunity for corrupted data. While unlikely, it is possible for database corruption to happen if a system or application crash occurs while writing data to the database.
            To ensure that database corruption never occurs, applications must either: use transactions and logging with automatic recovery, use logging and application-specific recovery, or edit a copy of the database, and, once all applications using the database have successfully called close, replace the original database with the updated copy.

            When multiple threads are using the DB handle concurrently, only a single thread may call the DB handle close function.

            The close function returns the value of errno on failure and 0 on success.

       int (*cursor)(DB *db, DB_TXN *txnid, DBC **cursorp);
            A pointer to a function to create a cursor and copy a pointer to it into the memory referenced by cursorp.

            A cursor is a structure used to provide sequential access through a database. This interface and its associated functions replace the functionality provided by the seq function in previous releases of the DB library.

            If the file is being accessed under transaction protection, the txnid parameter is a transaction ID returned from txn_begin, otherwise, NULL. If transaction protection is enabled, cursors must be opened and closed within the context of a transaction, and the txnid parameter specifies the transaction context in which the cursor may be used. See db_cursor(3) for more information.

            The cursor function returns the value of errno on failure and 0 on success.

       int (*del)(DB *db, DB_TXN *txnid, DBT *key, u_int32_t flags);
            A pointer to a function to remove key/data pairs from the database. The key/data pair associated with the specified key is discarded from the database. In the presence of duplicate key values, all records associated with the designated key will be discarded.

            If the file is being accessed under transaction protection, the txnid parameter is a transaction ID returned from txn_begin, otherwise, NULL.

            The flags parameter is currently unused, and must be set to 0.

            The del function returns the value of errno on failure, 0 on success, and DB_NOTFOUND if the specified key did not exist in the file.

       int (*fd)(DB *db, int *fdp);
            A pointer to a function that copies a file descriptor representative of the underlying database into the memory referenced by fdp. A file descriptor referencing the same file will be returned to all processes that call db_open with the same file argument. This file descriptor may be safely used as an argument to the fcntl(2) and flock(2) locking functions. The file descriptor is not necessarily associated with any of the underlying files used by the access method.

            The fd function only supports a coarse-grained form of locking. Applications should use the lock manager where possible.

            The fd function returns the value of errno on failure and 0 on success.

       int (*get)(DB *db, DB_TXN *txnid, DBT *key, DBT *data, u_int32_t flags);
            A pointer to a function that is an interface for keyed retrieval from the database. The address and length of the data associated with the specified key are returned in the structure referenced by data.

            In the presence of duplicate key values, get will return the first data item for the designated key. Duplicates are sorted by insert order except where this order has been overridden by cursor operations. Retrieval of duplicates requires the use of cursor operations. See db_cursor(3) for details.

            If the file is being accessed under transaction protection, the txnid parameter is a transaction ID returned from txn_begin, otherwise, NULL.

            The flags parameter must be set to 0 or the following value:

            DB_SET_RECNO
                 Retrieve the specified numbered key/data pair from a database. Upon return, both the key and data items will have been filled in, not just the data item as is done for all other uses of the get function.

                 The data field of the specified key must be a pointer to a memory location from which a db_recno_t may be read, as described in db_dbt(3). This memory location will be read to determine the record to be retrieved.

                 For DB_SET_RECNO to be specified, the underlying database must be of type btree and it must have been created with the DB_RECNUM flag (see db_open(3)).

            If the database is a recno database and the requested key exists, but was never explicitly created by the application or was later deleted, the get function returns DB_KEYEMPTY. Otherwise, if the requested key isn't in the database, the get function returns DB_NOTFOUND. Otherwise, the get function returns the value of errno on failure and 0 on success.

       int (*put)(DB *db, DB_TXN *txnid, DBT *key, DBT *data, u_int32_t flags);
            A pointer to a function to store key/data pairs in the database. If the database supports duplicates, the put function adds the new data value at the end of the duplicate set.

            If the file is being accessed under transaction protection, the txnid parameter is a transaction ID returned from txn_begin, otherwise, NULL.

            The flags value is specified by or'ing together one or more of the following values:

            DB_APPEND
                 Append the key/data pair to the end of the database. For DB_APPEND to be specified, the underlying database must be of type recno. The record number allocated to the record is returned in the specified key.

            DB_NOOVERWRITE
                 Enter the new key/data pair only if the key does not already appear in the database.

            The default behavior of the put function is to enter the new key/data pair, replacing any previously existing key if duplicates are disallowed, or to add a duplicate entry if duplicates are allowed. Even if the designated database allows duplicates, a call to put with the DB_NOOVERWRITE flag set will fail if the key already exists in the database.
            The put function returns the value of errno on failure, 0 on success, and DB_KEYEXIST if the DB_NOOVERWRITE flag was set and the key already exists in the file.

       int (*sync)(DB *db, u_int32_t flags);
            A pointer to a function to flush any cached information to disk. If the database is in memory only, the sync function has no effect and will always succeed.

            The flags parameter is currently unused, and must be set to 0.

            See the close function description above for a discussion of DB and cached data.

            The sync function returns the value of errno on failure and 0 on success.

       int (*stat)(DB *db, void *sp, void *(*db_malloc)(size_t), u_int32_t flags);
            A pointer to a function to create a statistical structure and copy a pointer to it into user-specified memory locations. Specifically, if sp is non-NULL, a pointer to the statistics for the database is copied into the memory location it references.

            Statistical structures are created in allocated memory. If db_malloc is non-NULL, it is called to allocate the memory; otherwise, the library function malloc(3) is used. The function db_malloc must match the calling conventions of the malloc(3) library routine. Regardless, the caller is responsible for deallocating the returned memory. To deallocate the returned memory, free each returned memory pointer; pointers inside the memory do not need to be individually freed.

            In the presence of multiple threads or processes accessing an active database, the returned information may be out-of-date.
            This function may access all of the pages in the database, and therefore may incur a severe performance penalty and have obvious negative effects on the underlying buffer pool.

            The flags parameter must be set to 0 or the following value:

            DB_RECORDCOUNT
                 Fill in the bt_nrecs field of the statistics structure, but do not collect any other information. This flag makes it reasonable for applications to request a record count from a database without incurring a performance penalty. It is only available for recno databases, or btree databases where the underlying database was created with the DB_RECNUM flag.

            The stat function returns the value of errno on failure and 0 on success.

            In the case of a btree or recno database, the statistics are stored in a structure of type DB_BTREE_STAT (typedef'd in <db.h>). The following fields will be filled in:

            u_int32_t bt_magic;
                 Magic number that identifies the file as a btree file.

            u_int32_t bt_version;
                 The version of the btree file type.

            u_int32_t bt_flags;
                 Permanent database flags, including DB_DUP, DB_FIXEDLEN, DB_RECNUM and DB_RENUMBER.

            u_int32_t bt_minkey;
                 The bt_minkey value specified to db_open(3), if any.

            u_int32_t bt_re_len;
                 The re_len value specified to db_open(3), if any.

            u_int32_t bt_re_pad;
                 The re_pad value specified to db_open(3), if any.

            u_int32_t bt_pagesize;
                 Underlying tree page size.

            u_int32_t bt_levels;
                 Number of levels in the tree.

            u_int32_t bt_nrecs;
                 Number of data items in the tree (since there may be multiple data items per key, this number may not be the same as the number of keys).

            u_int32_t bt_int_pg;
                 Number of tree internal pages.

            u_int32_t bt_leaf_pg;
                 Number of tree leaf pages.

            u_int32_t bt_dup_pg;
                 Number of tree duplicate pages.

            u_int32_t bt_over_pg;
                 Number of tree overflow pages.
            u_int32_t bt_free;
                 Number of pages on the free list.

            u_int32_t bt_freed;
                 Number of pages made available for reuse because they were emptied.

            u_int32_t bt_int_pgfree;
                 Number of bytes free in tree internal pages.

            u_int32_t bt_leaf_pgfree;
                 Number of bytes free in tree leaf pages.

            u_int32_t bt_dup_pgfree;
                 Number of bytes free in tree duplicate pages.

            u_int32_t bt_over_pgfree;
                 Number of bytes free in tree overflow pages.

            u_int32_t bt_pfxsaved;
                 Number of bytes saved by prefix compression.

            u_int32_t bt_split;
                 Total number of tree page splits (includes fast and root splits).

            u_int32_t bt_rootsplit;
                 Number of root page splits.

            u_int32_t bt_fastsplit;
                 Number of fast splits. When sorted keys are added to the database, the DB btree implementation will split left or right to increase the page-fill factor. This number is a measure of how often it was possible to make such a split.

            u_int32_t bt_added;
                 Number of keys added.

            u_int32_t bt_deleted;
                 Number of keys deleted.

            u_int32_t bt_get;
                 Number of keys retrieved. (Note, this value will not reflect any keys retrieved when the database was open for read-only access, as there is no permanent location to store the information in this case.)

            u_int32_t bt_cache_hit;
                 Number of hits in tree fast-insert code. When sorted keys are added to the database, the DB btree implementation will check the last page where an insert occurred before doing a full lookup. This number is a measure of how often the lookup was successful.

            u_int32_t bt_cache_miss;
                 Number of misses in tree fast-insert code. See the description of bt_cache_hit; this number is a measure of how often the lookup failed.
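As an illustration of how the statistics fields might be combined, the following sketch derives two ratios from them: an approximate leaf-page fill fraction and the fast-insert hit fraction. It is a sketch under stated assumptions, not part of the DB library. The structure btree_stat_subset is a hypothetical stand-in that copies only the field names listed above (real code would read them from the DB_BTREE_STAT structure returned by the stat function), uint32_t is used in place of u_int32_t for portability, and the fill calculation ignores any per-page header overhead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the subset of DB_BTREE_STAT fields used here;
 * it is not the structure defined by the DB include file. */
struct btree_stat_subset {
    uint32_t bt_pagesize;    /* underlying tree page size */
    uint32_t bt_leaf_pg;     /* number of tree leaf pages */
    uint32_t bt_leaf_pgfree; /* bytes free in tree leaf pages */
    uint32_t bt_cache_hit;   /* hits in tree fast-insert code */
    uint32_t bt_cache_miss;  /* misses in tree fast-insert code */
};

/* Approximate fraction of leaf-page bytes in use.
 * Returns -1.0 when there are no leaf pages to measure. */
double leaf_fill_fraction(const struct btree_stat_subset *st)
{
    assert(st != NULL);
    double total = (double)st->bt_leaf_pg * (double)st->bt_pagesize;
    if (total == 0.0)
        return -1.0;
    return 1.0 - (double)st->bt_leaf_pgfree / total;
}

/* Fraction of fast-insert lookups that succeeded.
 * Returns -1.0 when no lookups have been recorded. */
double fast_insert_hit_fraction(const struct btree_stat_subset *st)
{
    assert(st != NULL);
    double lookups = (double)st->bt_cache_hit + (double)st->bt_cache_miss;
    if (lookups == 0.0)
        return -1.0;
    return (double)st->bt_cache_hit / lookups;
}
```

A low hit fraction would suggest that keys are not arriving in sorted order, since the fast-insert check only pays off for sorted insertions.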
ENVIRONMENT VARIABLES
       The following environment variables affect the execution of db_open:

       DB_HOME
            If the dbenv argument to db_open was initialized using db_appinit, the environment variable DB_HOME may be used as the path of the database home for the interpretation of the file argument to db_open, as described in db_appinit(3). Specifically, db_open is affected by the configuration string value of DB_DATA_DIR.

EXAMPLES
       Applications that create short-lived databases that are discarded or recreated when the system fails, and that are unconcerned with concurrent access and loss of data due to catastrophic failure, may wish to use the db_open functionality without other parts of the DB library. Such applications will only be concerned with the DB access methods. The DB access methods will use the memory pool subsystem, but the application is unlikely to be aware of this. See the files example/ex_access.c and example/ex_btrec.c in the DB source distribution for C language code examples of how such applications might use the DB library.

ERRORS
       The db_open function may fail and return errno for any of the errors specified for the following DB and library functions: DB->sync(3), calloc(3), close(2), fcntl(2), fflush(3), lock_get(3), lock_id(3), lock_put(3), lock_vec(3), log_put(3), log_register(3), log_unregister(3), malloc(3), memcpy(3), memmove(3), memp_close(3), memp_fclose(3), memp_fget(3), memp_fopen(3), memp_fput(3), memp_fset(3), memp_fsync(3), memp_open(3), memp_register(3), memset(3), mmap(2), munmap(2), open(2), read(2), realloc(3), sigfillset(3), sigprocmask(2), stat(2), strcpy(3), strdup(3), strerror(3), strlen(3), time(3), and unlink(2).

       In addition, the db_open function may fail and return errno for the following conditions:

       [EAGAIN]
            A lock was unavailable.
       [EINVAL]
            An invalid flag value or parameter was specified (e.g., unknown database type, page size, hash function, recno pad byte, byte order) or a flag value or parameter that is incompatible with the current file specification.

            The DB_THREAD flag was specified and spinlocks are not implemented for this architecture.

            There is a mismatch between the version number of file and the software.

            A re_source file was specified with either the DB_THREAD flag or a non-NULL tx_info field in the DB_ENV argument to db_open.

       [ENOENT]
            A non-existent re_source file was specified.

       [EPERM]
            Database corruption was detected. All subsequent database calls (other than DB->close) will return EPERM.

       The DB->close function may fail and return errno for any of the errors specified for the following DB and library functions: DB->sync(3), calloc(3), close(2), fflush(3), lock_get(3), lock_put(3), lock_vec(3), log_put(3), log_unregister(3), malloc(3), memcpy(3), memmove(3), memp_close(3), memp_fclose(3), memp_fget(3), memp_fput(3), memp_fset(3), memp_fsync(3), memset(3), munmap(2), realloc(3), and strerror(3).

       The DB->cursor function may fail and return errno for any of the errors specified for the following DB and library functions: calloc(3).

       In addition, the DB->cursor function may fail and return errno for the following conditions:

       [EINVAL]
            An invalid flag value or parameter was specified.

       [EPERM]
            Database corruption was detected. All subsequent database calls (other than DB->close) will return EPERM.

       The DB->del function may fail and return errno for any of the errors specified for the following DB and library functions: calloc(3), fcntl(2), fflush(3), lock_get(3), lock_id(3), lock_put(3), lock_vec(3), log_put(3), malloc(3), memcmp(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3), memp_fset(3), memset(3), realloc(3), and strerror(3).
       In addition, the DB->del function may fail and return errno for the following conditions:

       [EAGAIN]
            A lock was unavailable.

       [EINVAL]
            An invalid flag value or parameter was specified.

       [EPERM]
            Database corruption was detected. All subsequent database calls (other than DB->close) will return EPERM.

       In addition, the DB->fd function may fail and return errno for the following conditions:

       [ENOENT]
            The DB->fd function was called for an in-memory database, or no underlying file has yet been created.

       [EPERM]
            Database corruption was detected. All subsequent database calls (other than DB->close) will return EPERM.

       The DB->get function may fail and return errno for any of the errors specified for the following DB and library functions: DBcursor->c_get(3), calloc(3), fcntl(2), fflush(3), lock_get(3), lock_id(3), lock_put(3), lock_vec(3), log_put(3), malloc(3), memcmp(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3), memp_fset(3), memset(3), realloc(3), and strerror(3).

       In addition, the DB->get function may fail and return errno for the following conditions:

       [EAGAIN]
            A lock was unavailable.

       [EINVAL]
            An invalid flag value or parameter was specified.

            The DB_THREAD flag was specified to the db_open(3) function and neither the DB_DBT_MALLOC nor the DB_DBT_USERMEM flag was set in the DBT.

            A record number of 0 was specified.

       [EPERM]
            Database corruption was detected. All subsequent database calls (other than DB->close) will return EPERM.

       The DB->put function may fail and return errno for any of the errors specified for the following DB and library functions: calloc(3), fcntl(2), fflush(3), lock_get(3), lock_id(3), lock_put(3), lock_vec(3), log_put(3), malloc(3), memcmp(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3), memp_fset(3), memset(3), realloc(3), and strerror(3).
       In addition, the DB->put function may fail and return errno for the following conditions:

       [EACCES]
            An attempt was made to modify a read-only database.

       [EAGAIN]
            A lock was unavailable.

       [EINVAL]
            An invalid flag value or parameter was specified.

            A record number of 0 was specified.

            An attempt was made to add a record to a fixed-length database that was too large to fit.

            An attempt was made to do a partial put.

       [EPERM]
            Database corruption was detected. All subsequent database calls (other than DB->close) will return EPERM.

       [ENOSPC]
            A btree exceeded the maximum btree depth (255).

       The DB->stat function may fail and return errno for any of the errors specified for the following DB and library functions: calloc(3), fcntl(2), fflush(3), lock_get(3), lock_id(3), lock_put(3), lock_vec(3), malloc(3), memcpy(3), memp_fget(3), memp_fput(3), and memset(3).

       The DB->sync function may fail and return errno for any of the errors specified for the following DB and library functions: DB->get(3), DB->sync(3), calloc(3), close(2), fcntl(2), fflush(3), lock_get(3), lock_id(3), lock_put(3), lock_vec(3), log_put(3), malloc(3), memcpy(3), memmove(3), memp_fget(3), memp_fput(3), memp_fset(3), memp_fsync(3), memset(3), munmap(2), open(2), realloc(3), strerror(3), unlink(2), and write(2).

       In addition, the DB->sync function may fail and return errno for the following conditions:

       [EINVAL]
            An invalid flag value or parameter was specified.

       [EPERM]
            Database corruption was detected. All subsequent database calls (other than DB->close) will return EPERM.

SEE ALSO
       The Ubiquitous B-tree, Douglas Comer, ACM Comput. Surv. 11, 2 (June 1979), 121-138.

       Prefix B-trees, Bayer and Unterauer, ACM Transactions on Database Systems, Vol. 2, 1 (March 1977), 11-26.

       The Art of Computer Programming Vol. 3: Sorting and Searching, D.E. Knuth, 1968, pp 471-480.
       Dynamic Hash Tables, Per-Ake Larson, Communications of the ACM, April 1988.

       A New Hash Package for UNIX, Margo Seltzer, USENIX Proceedings, Winter 1991.

       Document Processing in a Relational Database System, Michael Stonebraker, Heidi Stettner, Joseph Kalash, Antonin Guttman, Nadene Lynn, Memorandum No. UCB/ERL M82/32, May 1982.

       db_archive(1), db_checkpoint(1), db_deadlock(1), db_dump(1), db_load(1), db_recover(1), db_stat(1), db_intro(3), db_appinit(3), db_cursor(3), db_dbm(3), db_internal(3), db_lock(3), db_log(3), db_mpool(3), db_open(3), db_thread(3), db_txn(3)