(cl.info)Hash Tables


Next: Structures Prev: Lists Up: Top

Hash Tables
***********

A "hash table" is a data structure that maps "keys" onto "values."
Keys and values can be arbitrary Lisp data objects.  Hash tables have
the property that the time to search for a given key is roughly
constant; simpler data structures like association lists take time
proportional to the number of entries in the list.

 - Function: make-hash-table &key :test :size
     This function creates and returns a hash-table object whose
     function for comparing elements is `:test' (`eql' by default), and
     which is allocated to fit about `:size' elements.  The `:size'
     argument is purely advisory; the table will stretch automatically
     if you store more elements in it.  If `:size' is omitted, a
     reasonable default is used.

     Common Lisp allows only `eq', `eql', `equal', and `equalp' as
     legal values for the `:test' argument.  In this package, any
     reasonable predicate function will work, though if you use
     something else you should check the details of the hashing
     function described below to make sure it is suitable for your
     predicate.

     Some versions of Emacs (like Lucid Emacs 19) include a built-in
     hash table type; in these versions, `make-hash-table' with a test
     of `eq' will use these built-in hash tables.  In all other cases,
     it will return a hash-table object which takes the form of a list
     with an identifying "tag" symbol at the front.  All of the hash
     table functions in this package can operate on both types of hash
     table; normally you will never know which type is being used.

     This function accepts the additional Common Lisp keywords
     `:rehash-size' and `:rehash-threshold', but it ignores their
     values.

 - Function: gethash KEY TABLE &optional DEFAULT
     This function looks up KEY in TABLE.  If KEY exists in the table,
     in the sense that it matches any of the existing keys according to
     the table's test function, then the associated value is returned.
     Otherwise, DEFAULT (or `nil') is returned.

     To store new data in the hash table, use `setf' on a call to
     `gethash'.  If KEY already exists in the table, the corresponding
     value is changed to the stored value.  If KEY does not already
     exist, a new entry is added to the table and the table is
     reallocated to a larger size if necessary.  The DEFAULT argument
     is allowed but ignored in this case.  The situation is exactly
     analogous to that of `get*'; Note: Property Lists..

 - Function: remhash KEY TABLE
     This function removes the entry for KEY from TABLE.  If an entry
     was removed, it returns `t'.  If KEY does not appear in the table,
     it does nothing and returns `nil'.

 - Function: clrhash TABLE
     This function removes all the entries from TABLE, leaving an empty
     hash table.

 - Function: maphash FUNCTION TABLE
     This function calls FUNCTION for each entry in TABLE.  It passes
     two arguments to FUNCTION, the key and the value of the given
     entry.  The return value of FUNCTION is ignored; MAPHASH itself
     returns `nil'.  Note: Loop Facility, for an alternate way of
     iterating over hash tables.

 - Function: hash-table-count TABLE
     This function returns the number of entries in TABLE.  *Warning:*
     The current implementation of Lucid Emacs 19 hash-tables does not
     decrement the stored `count' when `remhash' removes an entry.
     Therefore, the return value of this function is not dependable if
     you have used `remhash' on the table and the table's test is `eq'.
     A slower, but reliable, way to count the entries is `(loop for x
     being the hash-keys of TABLE count t)'.

 - Function: hash-table-p OBJECT
     This function returns `t' if OBJECT is a hash table, `nil'
     otherwise.  It recognizes both types of hash tables (both Lucid
     Emacs built-in tables and tables implemented with special lists.)

   Sometimes when dealing with hash tables it is useful to know the
exact "hash function" that is used.  This package implements hash
tables using Emacs Lisp "obarrays," which are the same data structure
that Emacs Lisp uses to keep track of symbols.  Each hash table
includes an embedded obarray.  Key values given to `gethash' are
converted by various means into strings, which are then looked up in
the obarray using `intern' and `intern-soft'.  The symbol, or "bucket,"
corresponding to a given key string includes as its `symbol-value' an
association list of all key-value pairs which hash to that string.
Depending on the test function, it is possible for many entries to hash
to the same bucket.  For example, if the test is `eql', then the symbol
`foo' and two separately built strings `"foo"' will create three
entries in the same bucket.  Search time is linear within buckets, so
hash tables will be most effective if you arrange not to store too many
things that hash the same.

   The following algorithm is used to convert Lisp objects to hash
strings:

   * Strings are used directly as hash strings.  (However, if the test
     function is `equalp', strings are `downcase'd first.)

   * Symbols are hashed according to their `symbol-name'.

   * Integers are hashed into one of 16 buckets depending on their value
     modulo 16.  Floating-point numbers are truncated to integers and
     hashed modulo 16.

   * Cons cells are hashed according to their `car's; nonempty vectors
     are hashed according to their first element.

   * All other types of objects hash into a single bucket named `"*"'.

Thus, for example, searching among many buffer objects in a hash table
will devolve to a (still fairly fast) linear-time search through a
single bucket, whereas searching for different symbols will be very
fast since each symbol will, in general, hash into its own bucket.

   The size of the obarray in a hash table is automatically adjusted as
the number of elements increases.

   As a special case, `make-hash-table' with a `:size' argument of 0 or
1 will create a hash-table object that uses a single association list
rather than an obarray of many lists.  For very small tables this
structure will be more efficient since lookup does not require
converting the key to a string or looking it up in an obarray.
However, such tables are guaranteed to take time proportional to their
size to do a search.


automatically generated by info2www