database design
Dan Hanks
hanksdc at plug.org
Thu Jun 9 15:28:59 MDT 2005
On Thu, 9 Jun 2005, Steve Dibb wrote:
> I'm working on creating a new table in our database, and I'm wondering --
> why bother with the possibility of NULL values in a field? It seems like it
> always adds more work then its worth.
>
> What's the benefit of adding the option of IS NULL when you usually have to
> check to see if its also blank (field = '' OR field IS NULL)?
Others have already covered the NULL != relational aspect pretty well, but
here's another way to look at it. The relational model is based on set
theory and predicate logic. If you've taken (what used to be called) CS
235 at BYU (or was it 236...), then you're familiar with predicate logic.
If you look at each table (relation) in your database, you can see each
one as representing a particular 'predicate' in the area you're trying to
model. For example a table:
Employee
--------
id
first name
last name
represents the predicate:
Employee [ID] is known by [First Name] [Last Name]
In predicate theory, which is based on 2-valued-logic (2VL), each
predicate can evaluate to either true or false. Each row (tuple) in your
table (relation) contains values that you can "plug-in" to your predicate
to provide true assertions about the enterprise you're trying to model.
E.g.,
Employee 2 is known by Jim Jones
Employee 3 is known as Bill Brown
etc.
If you take the logical AND of each of the rows (tuples) in your table
(relation) you then have a (possibly very long) assertion of truth about
your enterprise. These represent the "axioms" of information from which
you can extrapolate further information about your enterprise using
relational operators.
I wont go into detail here, but a query (and if you've taken CS 236, you
remember the proofs you did with predicate logic) is nothing more than a
proof using the axioms metioned above to forumlate new predicates
evaluating true about your enterprise. If you have designed your database
well, and have stuck with 2VL (true and false) then you can exercise a
large degree of trust in the results of your queries, because the results
are mathematically provable.
Now when you introduce NULL into the picture, all of a sudden your
predicates are no longer 2VL assertions, since now you have true, false,
and "maybe". You've now moved into 3VL which is much more complex. And pretty
much your ease to mathematically prove (as you can with 2VL) the validity
of your query results just went out the window.
Designing databases without using nulls is challenging, but I have found,
in my experience, is well worth the effort.
For further reading, I'd recommend O'Reilly's new book by C. J. Date
called Database in Depth. For anyone who works with databases, a good
grasp of predicate logic and set theory will do wonders for your
understanding of what "relational databases" are all about. Yes it's
"theory", but it's extremely practical theory.
FWIW,
-- Dan
More information about the PLUG
mailing list