Hash vs list

Michael Torrie torriem at chem.byu.edu
Fri Apr 21 20:56:23 MDT 2006


On Fri, 2006-04-21 at 18:20 -0600, Jeff Schroeder wrote:
> I'm curious whether searching for a single value in a hash is generally 
> (always?) faster than searching for it in a list.  The programming I'm 
> doing is in PHP, but I think it's a question with a general answer.
> 
> For example, assume I have a list of a million e-mail addresses [1] and 
> I want to very quickly check whether a certain address is in that list.  
> If I had a one-dimensional array, I could use the PHP function 
> in_array() to determine whether I'm in the list:
> 
> $address_list = array("guy at place.com", "god at heaven.org", ...);
> if(in_array("jeff at zingstudios.net", $address_list)) ...
> 
> On the other hand, I could construct a hash with the same data:
> 
> $address_list = array("guy at place.com" => 1, "god at heaven.org" => 1, ...);
> if($address_list{"jeff at zingstudios.net"}) ...
> 
> Since hashes are designed to provide fast lookups of any given key, it 
> seems to me that the second approach would yield results more 
> efficiently.  But I never took a data structures class, so I may be 
> missing something.

A hash's lookup efficiency varies according to how good the hashing
buckets algorithm is.  The more collisions there are, the worse it gets.
However worse-case scenario, a lookup in a hash is O(n), the same as the
plain array.  Best case, though, is O(log n).  Of course under the worst
circumstances, if the hash requires looking in every bucket, it will
probably be slower than a plain array, as there is overhead to be taken
into account.  But usually if you're trying to find just one key by
name, a hash is always fastest.

Michael


> 
> Thoughts?
> 
> Thanks,
> Jeff
> 
> [1] No, I'm not a spammer. :)  It's just an example.
> /*
> PLUG: http://plug.org, #utah on irc.freenode.net
> Unsubscribe: http://plug.org/mailman/options/plug
> Don't fear the penguin.
> */



More information about the PLUG mailing list