Victus Spiritus

home

Diametrically opposed to demolition derbies, Hash Functions

04 Apr 2011

With little doubt each and every one of you arose this fine spring morning just as excited about hashing as myself. Like all interesting algorithms, the proud tradition of identifying effective hashing functions requires a mathematician's razor mind and an artist's subtle touch. If you know of anyone who has either of these attributes don't hesitate to hold an image of them in your mind as you scan the rest of this article.

Quick Hash Function

Within the hallowed pages of the Algorithm Design Manual, Steve Skiena* describes a straightforward Hashing function between strings S of length |S|, with alphabet A, and a length m hash table.

Table Collisions

The worst hashing function will generate the same index for all keys. Even for the best functions the locations that result from the hash are not guaranteed to avoid collisions. Collisions are common even for uniform distributions, see the Birthday Problem for details.

Recommended solutions to collisions include doubly linked lists, or open addressing where the former is more popular but also consumes more memory for pointers.

For hash buckets with doubly linked lists when writing we append the new value to the list, when reading scan through the list for the desires key.

Open addressing writes to the next available memory location at a hashed index. Reading requires scanning through adjacent memory locations, and deletions require reinserting any broken runs of nodes (ouch).

I haven't read mention of it but busy hash tables could have trees implemented at each bucket. If a hash table becomes congested (which data structure), you're likely better off using a tree implementation. When n is small you're better off with a list.

References:

Notes:
*= I'm reminded of a conversation between Professor Skiena and a student over a decade ago. I took an algorithm overview course a few years after I graduated when taking additional grad courses was a good excuse to slip out of the office for a couple of hours. Now I recommend reading and working with other peers on the cheap.

Myself and two colleagues who also took the course were waiting outside Steve's office to discuss our final project. A foreign student was ahead of us in a meeting, and we overhead Steve going into detail as he described the Thanksgiving Holiday ritual: "A huge turkey...stuffing...yams..." From the conversation, I could imagine the student salivating over the description of the bountiful meal, and we were inclined to believe Steve may extend an invite to the student to introduce him to the holiday. We were surprised as the meeting wrapped up with Steve mentioning "it's really quite a fun event, you should.. you should get one of your friends to invite you over". Steve's generous with his dynamic programming knowledge and algorithmic mastery, but bring your own Turkey dinner ;).