24 Million entries and I need to what?
nicholas4 at gmail.com
Sat Dec 28 09:21:22 MST 2013
I've done similar testing (looking for SHA-1 collisions) with
millions of strings and their hashes. I didn't actually expect to
find any collisions, but I wanted to try it anyway. In the process I
realized that Ruby's Hash wouldn't work for this project because of
memory limitations (if I recall correctly) and I had to find another
When it was clear there were no collisions I tested my generated
hashes for some simple patterns, e.g. which char (0,1,2...d,e,f) is
most likely to appear in position 1 of the hash, position 2 of the
hash, etc. And I did find that some chars were more likely to occur
for a given position.
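
A positional tally like the one described above could be sketched as
follows. This is only a minimal illustration, not the code I actually
ran: the method name position_counts and the "input-N" strings are
made up for the example, and it keeps everything in memory, so it only
suits modest input sizes.

```ruby
require 'digest'

# Tally how often each hex character appears at each position of a
# SHA-1 hex digest. Returns an array of 40 hashes (one per position),
# each mapping a hex character to the number of times it was seen.
def position_counts(strings)
  counts = Array.new(40) { Hash.new(0) }
  strings.each do |s|
    Digest::SHA1.hexdigest(s).each_char.with_index do |ch, pos|
      counts[pos][ch] += 1
    end
  end
  counts
end

# Hypothetical input: the strings "input-0" .. "input-9999".
counts = position_counts((0...10_000).map { |i| "input-#{i}" })

# Report the most frequent character in the first position.
most_common, n = counts[0].max_by { |_, c| c }
puts "position 1: '#{most_common}' appears #{n} times"
```

Some character will always come out on top in each position; whether
its lead means anything is a separate question.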
I brushed off my stats knowledge to determine if the results were
statistically significant. Turns out that none of my findings were.
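
One common way to check this kind of result is a chi-square
goodness-of-fit test against a uniform distribution over the 16 hex
characters. The sketch below assumes that test; it is not necessarily
the one I used. The names chi_square_uniform and CRITICAL_05_DF15 are
made up for the example, and the sample counts are invented purely to
show the call.

```ruby
HEX_CHARS = ('0'..'9').to_a + ('a'..'f').to_a

# Chi-square critical value for 15 degrees of freedom at the 0.05 level.
CRITICAL_05_DF15 = 24.996

# Chi-square statistic for one digest position, comparing the observed
# counts per hex character with a uniform expectation (total / 16).
# `observed` maps hex character => count; missing characters count as 0.
def chi_square_uniform(observed)
  total = HEX_CHARS.sum { |ch| observed.fetch(ch, 0) }
  expected = total / 16.0
  HEX_CHARS.sum { |ch| (observed.fetch(ch, 0) - expected)**2 / expected }
end

# Invented counts for illustration only: 1600 samples, with 'a' seen
# slightly more often than '0' and every other character seen 100 times.
sample = HEX_CHARS.to_h { |ch| [ch, 100] }
sample['a'] = 110
sample['0'] = 90

stat = chi_square_uniform(sample)
verdict = stat > CRITICAL_05_DF15 ? "significant" : "not significant"
puts format("chi-square = %.3f (%s at the 0.05 level)", stat, verdict)
```

With 40 positions tested separately, a couple of them would be
expected to cross the 0.05 threshold by chance alone, so a single
"significant" position proves little without a correction for
multiple comparisons.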
Though I didn't discover anything new I'm glad I did it. This
exercise helped me become a little bit better. It reminds me of being
in school and writing a chess program. The purpose of the exercise
isn't to discover anything groundbreaking but rather it's to improve
my own skill set.
I have enjoyed the discussion in this thread and the variety of ways
to approach the problem.