database design

Dan Hanks hanksdc at plug.org
Thu Jun 9 15:28:59 MDT 2005


On Thu, 9 Jun 2005, Steve Dibb wrote:

> I'm working on creating a new table in our database, and I'm wondering -- 
> why bother with the possibility of NULL values in a field?  It seems like it 
> always adds more work then its worth.
>
> What's the benefit of adding the option of IS NULL when you usually have to 
> check to see if its also blank (field = '' OR field IS NULL)?

Others have already covered the NULL != relational aspect pretty well, but 
here's another way to look at it. The relational model is based on set 
theory and predicate logic. If you've taken (what used to be called) CS 
235 at BYU (or was it 236...), then you're familiar with predicate logic.

If you look at each table (relation) in your database, you can see each 
one as representing a particular 'predicate' in the area you're trying to 
model. For example a table:

Employee
--------
id
first name
last name

represents the predicate:

Employee [ID] is known by [First Name] [Last Name]

In predicate theory, which is based on 2-valued-logic (2VL), each 
predicate can evaluate to either true or false. Each row (tuple) in your 
table (relation) contains values that you can "plug-in" to your predicate 
to provide true assertions about the enterprise you're trying to model.

E.g.,

Employee 2 is known by Jim Jones
Employee 3 is known as Bill Brown
etc.

If you take the logical AND of each of the rows (tuples) in your table 
(relation) you then have a (possibly very long) assertion of truth about 
your enterprise. These represent the "axioms" of information from which 
you can extrapolate further information about your enterprise using 
relational operators.

I wont go into detail here, but a query (and if you've taken CS 236, you 
remember the proofs you did with predicate logic) is nothing more than a 
proof using the axioms metioned above to forumlate new predicates 
evaluating true about your enterprise. If you have designed your database 
well, and have stuck with 2VL (true and false) then you can exercise a 
large degree of trust in the results of your queries, because the results 
are mathematically provable.

Now when you introduce NULL into the picture, all of a sudden your 
predicates are no longer 2VL assertions, since now you have true, false, 
and "maybe". You've now moved into 3VL which is much more complex. And pretty
much your ease to mathematically prove (as you can with 2VL) the validity 
of your query results just went out the window.

Designing databases without using nulls is challenging, but I have found, 
in my experience, is well worth the effort.

For further reading, I'd recommend O'Reilly's new book by C. J. Date 
called Database in Depth. For anyone who works with databases, a good 
grasp of predicate logic and set theory will do wonders for your 
understanding of what "relational databases" are all about. Yes it's 
"theory", but it's extremely practical theory.

FWIW,

-- Dan





More information about the PLUG mailing list