24 Million entries and I need to what?
levipearson at gmail.com
Fri Dec 27 19:54:38 MST 2013
On Fri, Dec 27, 2013 at 5:57 PM, Joshua Marsh <joshua at themarshians.com> wrote:
> On Fri, Dec 27, 2013 at 3:43 PM, Levi Pearson <levipearson at gmail.com> wrote:
>> On Fri, Dec 27, 2013 at 1:59 AM, S. Dale Morrey <sdalemorrey at gmail.com>
>> > So here's the problem...
>> > I'm exploring the strength of the SHA256 algorithm.
>> > Specifically I'm looking for the possibility of a hash collision.
>> Man, talk about a lot of unhelpful answers. You don't need an answer
>> to your question, you need to rethink what you're doing entirely.
> I actually found the answers really helpful. His experimenting will likely
> uncover the obvious, but he'll have learned new tools and algorithms in the
In what way did you find them to be helpful? Some of them were
interesting, and it may be fun to think about them, but none of them
were at all relevant to the stated goal of "exploring the strength of
SHA256" or even the stated subgoal of "looking for the possibility of
a hash collision". Is helping someone further along the wrong path
really helpful? No, the helpful thing to do is to point the person
onto the right path, not brainstorm about the fastest way to do
something completely useless for the intended purpose.
If you think that hashing 24 million random strings with SHA256 and
checking for matches is a useful or interesting exploration of the
SHA256 algorithm, then I would submit that you are not likely to
uncover the "obvious" on your own by heading down that line of
experimentation. In fact, it's not really obvious that it's a bad
experiment at all, at least not without stopping to think about the
sizes of numbers involved and knowing the direct way to calculate the
resulting problem size.
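To put rough numbers on that, here is a minimal sketch of the direct calculation. It assumes SHA256 output behaves like a uniformly random 256-bit value (an idealization, not a proven property), and it uses the standard birthday upper bound n(n-1)/2 / 2^bits. The function name is mine, purely for illustration.

```python
import math

def log2_collision_bound(n: int, bits: int) -> float:
    """log2 of the birthday upper bound on P(any collision) among n
    uniformly random outputs of `bits` bits: n*(n-1)/2 / 2**bits."""
    pairs = n * (n - 1) // 2          # number of distinct pairs of hashes
    return math.log2(pairs) - bits    # log2(pairs / 2**bits)

n = 24_000_000                        # the 24 million entries from the thread
log2_p = log2_collision_bound(n, 256)
print(f"collision probability is at most about 2^{log2_p:.1f}")
```

Under those assumptions the bound comes out around 2^-208, so a run that reports "no matches" tells you essentially nothing you couldn't have computed in a few seconds.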
If you followed some of the advice given, you'd spend countless hours
implementing a crazy low-level multi-threaded program with custom
optimized data storage, then spend countless more hours debugging said
program, and then quite a long time waiting for it to finish, and
then... what? So the program finishes and says "no matches". What
does that mean? At this point, not a whole lot. Even if you generate
a bunch of similar data sets and get the same result, does it tell you
anything meaningful about SHA256? How would you *know* if it was an
Even if you followed the advice to throw the data into some SQL
database or scripting language run-time and it happened to yield
reasonable performance and reasonable space usage, you're still
"looking for the possibility of a hash collision" in precisely the
wrong way. Just what result from doing this is supposed to tell you
that you're doing the wrong thing? If you think it's the right thing,
the next step would be to just do *more* of it, or do it with some
minor variations, and that's going to waste more time and resources
without offering any insight. How do you know you're not *this* close
to finding a collision if you have no idea how big the problem really
This is not a programming problem, an algorithmic problem, or a data
structure problem. It's a math problem. You should think about it
mathematically. If you have a hard time thinking mathematically about
it, then you should either give up on cryptography or spend some more
time with your mathematical thinking fundamentals.
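As one illustration of thinking about it mathematically, the sketch below estimates how many hashes a brute-force search would need for roughly even odds of a collision, again assuming SHA256 acts like a uniformly random 256-bit function. The hash rate is a made-up round number chosen only to show scale, not a measurement of any real hardware.

```python
import math

BITS = 256
HASHES_PER_SECOND = 1e12   # hypothetical rate, an assumption for illustration
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Birthday approximation: P(collision) ~ 1 - exp(-n^2 / 2^(BITS+1)),
# so P = 1/2 when n ~ sqrt(2 * ln 2 * 2^BITS).
n_half = math.sqrt(2 * math.log(2) * 2.0**BITS)

years = n_half / HASHES_PER_SECOND / SECONDS_PER_YEAR

print(f"hashes needed for ~50% odds: about 2^{math.log2(n_half):.1f}")
print(f"years at the assumed rate:   about {years:.1e}")
```

This ignores the cost of storing and comparing the hashes, which only makes the picture worse. The point is that the size of the problem falls out of a couple of lines of arithmetic before any search program gets written.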
Let me reiterate, lest I seem discouraging towards people who are
earnestly trying to feel their way around a very interesting problem
they don't quite know the right questions to ask about yet. There was
nothing wrong with asking the original question; it's very common to
not even know the right way to explore unfamiliar terrain, and it's an
inescapable part of being a novice at something you're trying to learn
on your own. Many of the answers were given with good intentions and
probably a similar unfamiliarity with the surrounding problem domain.
But the fact remains that the answers were *unhelpful* and I'm sure
some of the answerers would have realized this with a bit of