# 24 Million entries and I need to what?

Levi Pearson levipearson at gmail.com
Fri Dec 27 15:43:00 MST 2013

```On Fri, Dec 27, 2013 at 1:59 AM, S. Dale Morrey <sdalemorrey at gmail.com> wrote:
> So here's the problem...
>
> I'm exploring the strength of the SHA256 algorithm.
> Specifically I'm looking for the possibility of a hash collision.
>

to your question, you need to rethink what you're doing entirely.

Hash collisions are always possible when the set of outputs is smaller
than the set of inputs. You don't need to run any programs to know
this. Furthermore, the algorithm is fully deterministic, so the
probability of collisions is fixed for any fixed input parameters. But
considering the fact that the algorithm was designed to make hash
collisions as unlikely as possible, you should realize that the
collision probability is going to be vanishingly tiny.

And even if, by some cosmically unlikely chance, your 24 million
random strings happen to have a pair of distinct strings that hash to
the same value--you still have not really explored the strength of
SHA256.  The strength is a probabilistic thing, and finding a
collision through sheer luck does not affect the probability at all.
The strength of a hash function is determined by the evenness of its
distribution of outputs for given inputs.  A perfect hash maps its
inputs to the full range of possible outputs completely evenly, with
no output more likely than any other.  For a hash function with 256
bits of output, the number of output buckets is 2^256, which is an
astronomically huge number.  I'll leave it to you to calculate the
probability of generating a collision with 24 million samples given
the assumption of a perfectly even distribution, but I think you'll
find it's not worth trying.

If you are unfamiliar with the math behind the test you're trying to
run, you should take a look at Wikipedia's article on the Birthday