Building Correct Software (was Re: JOB: LAMP Artisan)

Levi Pearson levipearson at gmail.com
Wed Feb 26 11:13:15 MST 2014


On Wed, Feb 26, 2014 at 2:29 AM, Dan Egli <ddavidegli at gmail.com> wrote:
> On February 24, 2014, Levi Pearson wrote:
>> This is a reasonable thing to do, and certainly better than nothing,
>> but this sort of testing never offers any sort of proof of absence of
>> bugs.
>
> Nothing does, really. I've seen programs that are in use for years suddenly
> get a flurry of bug reports because someone tried something no one else
> tried and it caused problems, so they posted it and everyone did it. You
> can never be 100% sure of bug-free status. But you can get reasonably sure.
>

Well, there are always things that can go wrong, but there are
techniques for formally proving that a program corresponds to its
specification.  This is more trouble than it's worth for a program
that lives on top of a huge pile of unverified code and is built with
an unverified compiler, but it can absolutely be used today in smaller
embedded systems, or even in standard environments where we limit our
expectations to guarantees about only a subset of our specification.

A couple of notable recent examples:

1. The seL4 verified microkernel:  http://ssrg.nicta.com.au/projects/seL4/
Another version of the L4 kernel (OKL4) that has not received *quite*
the same level of formal verification is currently deployed on over a
billion mobile devices:
http://www.ok-labs.com/why-ok-labs/why-choose-ok-labs

2. The CompCert C compiler, which is proven free of miscompilation
errors: http://compcert.inria.fr/compcert-C.html

And there are tools for static analysis and developing both formal
specifications and formal proofs that programs follow the
specifications.  Here's a framework for static analysis and proofs
about C programs that's not too hard to get started playing with:
http://frama-c.com/what_is.html
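To give a flavor of the approach, here's a small C function annotated
in ACSL, the specification language that Frama-C's proof plugins work
with.  The function and its contract are my own illustration, not
taken from any real project; the ACSL lives entirely in comments, so
the file is also just ordinary compilable C:

```c
/* Contract: max_element returns a value that is >= every element of
   the array, and that actually occurs in the array.  Frama-C can
   attempt to prove the annotations below hold for all inputs. */

/*@ requires n > 0;
  @ requires \valid_read(a + (0 .. n-1));
  @ assigns \nothing;
  @ ensures \forall integer i; 0 <= i < n ==> \result >= a[i];
  @ ensures \exists integer i; 0 <= i < n && \result == a[i];
  @*/
int max_element(const int *a, int n)
{
    int best = a[0];
    /*@ loop invariant 1 <= i <= n;
      @ loop invariant \forall integer j; 0 <= j < i ==> best >= a[j];
      @ loop invariant \exists integer j; 0 <= j < i && best == a[j];
      @ loop assigns i, best;
      @ loop variant n - i;
      @*/
    for (int i = 1; i < n; i++)
        if (a[i] > best)
            best = a[i];
    return best;
}
```

The key idea is that the loop invariants let the tool reason about
*every* possible execution, rather than the particular executions a
test suite happens to exercise.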

For the domain of web programming, there are specialized programming
languages that take advantage of recent advances in type theory, such
as Ur/Web: http://www.impredicative.com/ur/

Lest you claim that such a language is horribly impractical and not at
all usable in the real world, I present a counterexample: a
commercial hosted web app with the front end written in Ur/Web:
https://bazqux.com/

So, these things are entering the realm of feasibility and even
practicality in some cases. Doing formal proofs in a system like Coq
(http://coq.inria.fr/) can actually be kind of fun.  There are a
couple of good textbooks on using it to study programming language
semantics as well as to build certified programs
(http://www.cis.upenn.edu/~bcpierce/sf/)
(http://adam.chlipala.net/cpdt/).  Coq was used for both CompCert and
Ur/Web.
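To give a taste of what that kind of machine-checked proof looks like,
here's a tiny example, written in Lean syntax (close in spirit to
Coq's); the list type and lemma are my own toy illustration, not drawn
from those textbooks:

```lean
-- A hand-rolled list type, an append function on it, and a proof
-- (checked by the proof assistant, not by testing) that append
-- adds the lengths of its arguments -- for *all* inputs.
inductive MyList (α : Type) where
  | nil  : MyList α
  | cons : α → MyList α → MyList α

def append {α} : MyList α → MyList α → MyList α
  | .nil,       ys => ys
  | .cons x xs, ys => .cons x (append xs ys)

def len {α} : MyList α → Nat
  | .nil       => 0
  | .cons _ xs => 1 + len xs

theorem len_append {α} (xs ys : MyList α) :
    len (append xs ys) = len xs + len ys := by
  induction xs with
  | nil => simp [append, len]
  | cons x xs ih => simp [append, len, ih, Nat.add_assoc]
```

CompCert's proof of semantic preservation is this same activity
scaled up enormously: instead of a lemma about list lengths, it's a
theorem relating the behavior of generated assembly to the behavior
of the source C program.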

>
>> And the thing about bugs is that different ones show up at
>> different usage scales.
>
> I've seen that, too. That's one of the reasons I frequently use Apache
> Bench and/or Siege. I can flood my page with requests, and have a log file
> that is only enabled during testing that logs what output would have been
> sent to the browser (since the output is actually discarded with ab and
> siege). After sending in, say, 25k requests, I can spend a few hours
> looking over the log files, and when I see a bug in the output or when PHP
> reports an error either in its own log files or in Apache's error log
> file, then I know there's something to fix.
>

That's important, of course, but not at all the same thing as scaling
usage by *real users*.  Even if they are doing entirely legitimate
things, they will eventually trigger bugs you'd never be able to
ferret out through testing alone.  A solid foundation of tests is just
the bare minimum for developing applications that might possibly be
acceptable to customers at scale, if they're particularly forgiving.

This will turn up bugs in your hardware, too.  Manufacturing defects
at the board level start to be a common problem beyond a certain
scale, and then at another order of magnitude you will start to turn
up defects in silicon design and manufacturing.  CPU manufacturers
have started embracing formal methods for verifying important and
tricky parts of their chips, such as cache coherency algorithms; as
they gain more expertise at this and verification tools get better,
hopefully more of the hardware will be verified in this manner.
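The cache-coherency work mentioned above is typically done with model
checkers, which exhaustively enumerate every reachable state of a
protocol and check that a safety property holds in all of them.  As a
toy sketch of the idea (a hypothetical hand-written checker for
Peterson's two-process mutual-exclusion algorithm, standing in for a
real coherency protocol and a real tool):

```c
#include <string.h>

/* Each process: pc 0 = set own flag, 1 = set turn, 2 = wait at the
   gate, 3 = in the critical section. */
typedef struct {
    int pc[2];
    int flag[2];
    int turn;
} State;

/* Pack a state into a small integer so we can mark it as visited. */
static int encode(const State *s) {
    return (((s->pc[0] * 4 + s->pc[1]) * 2 + s->flag[0]) * 2
            + s->flag[1]) * 2 + s->turn;
}

static int seen[4 * 4 * 2 * 2 * 2];
static int violation;

/* Advance process i by one atomic step; returns 1 if it could move. */
static int step(State *s, int i) {
    int j = 1 - i;
    switch (s->pc[i]) {
    case 0: s->flag[i] = 1; s->pc[i] = 1; return 1;
    case 1: s->turn = j;    s->pc[i] = 2; return 1;
    case 2: /* busy-wait: may enter only when the guard is false */
        if (!(s->flag[j] && s->turn == j)) { s->pc[i] = 3; return 1; }
        return 0;
    case 3: /* leave the critical section and loop around */
        s->flag[i] = 0; s->pc[i] = 0; return 1;
    }
    return 0;
}

/* Depth-first exploration of every interleaving of the two processes. */
static void explore(State s) {
    int id = encode(&s);
    if (seen[id]) return;
    seen[id] = 1;
    if (s.pc[0] == 3 && s.pc[1] == 3)  /* both in the critical section? */
        violation = 1;
    for (int i = 0; i < 2; i++) {
        State next = s;
        if (step(&next, i))
            explore(next);
    }
}

/* Returns 0 iff mutual exclusion holds in every reachable state. */
int check_peterson(void) {
    State init = { {0, 0}, {0, 0}, 0 };
    memset(seen, 0, sizeof seen);
    violation = 0;
    explore(init);
    return violation;
}
```

Real model checkers face state spaces astronomically larger than this
one's 128 packed states, which is why so much of the research effort
goes into symbolic representations and state-space reduction.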

>> In truth, there are enough components out of your
>> control that no system that you program will ever be completely secure
>> or bug-free, but there are certainly approaches that can be taken that
>> will be far more effective than simply asking a few friends to hammer
>> on it, or even a few hundred friends.
>
> Again, this is where siege and/or ab come in. I've siege'd my pages to look
> at them. Then I'll have friends try all sorts of things, deliberately
> trying to break things. I'll fix what they report, then re-siege or re-ab,
> and try again. Is it guaranteed to become bug free? No. Of course not. But
> it gets it as close to bug-free as I can reasonably make it before entering
> production. Once something enters production you're almost guaranteed to
> find more bugs, either because there's a bug in a component or in your code
> that no one thought to trigger until now, or simply because you never put
> your system under a heavy enough load to expose a bug that only appears at
> certain load points.

Well, that's certainly accepted wisdom, and it's mostly true for the
tools in common use, but as I mentioned at the beginning of this post
there are tools available that can help you do *much* better at
removing bugs (or never introducing them in the first place) before
your application meets usage at scale.  They are currently in use at
the fringes of practical software development, but the sorry state of
software engineering at present is untenable, and eventually the use
of formal systems for the correct construction and verification of
software will become more widespread.

        --Levi


More information about the PLUG mailing list