database design

Stephen Smith scsmith1451 at totacc.com
Fri Jun 10 08:55:58 MDT 2005


Their are very important business critical reasons for incorporating 
NULL as valid datum in a database.  Take for instance an insurace 
company.  An operator is inputting an application.  As entry progresses, 
it is discovered that near the end of the application, information is 
missing or wrong.  Lets look at the business implications of each 
alternative.

1.  All fields must be NOT NULL - Since the required datum is not 
available the operator can put everything on hold to contact the agent 
to get the required datum.  In the mean time, no other work can be 
accomplished until the datum is acquired.  Hold time on personnel is 
money, driving up the cost of doing business.  Alternatively, the 
operator can cancel the input, call the agent, and set aside the 
application until the agent calls back.  When the datum is acquired, the 
application can be input.  The time to input the application just 
doubled...more costs. The third is to force the field to be a value that 
is NOT NULL but not valid either.  This leads to the retension of 
non-truth in the system which must be validated at some point before the 
data can be used upstream in the business process.  If a the value is 
poorly chosen, then the upstream process may not detect the erroneous 
datum and process it anyway.  Now even more costly to correct from a 
business standpoint.

2. Fields may be NULL.  - In the above example, the operator can save 
the partial application and resume where they left off when the datum is 
obtained,  minimizing input costs and allowing the business to be more 
competitive.  When looking at the data, the data tells the truth, i.e. 
the datum is not known, not some arbitray or cleverly manipulate value.  
Down stream processing can be stopped easily because there is no value 
to work against, thus preventing ambiguous results that must be reversed 
at additional expense to the company. 

It has been stated by those that study this stuff, that if the cost of 
correcting an error is 1 if it is detected at the time it occurs, it 
will cost 10 if it is not detected until the end of the process.  If the 
error gets to the end user, the cost of correction can approach 100.  It 
is in business's competitive interest to prohibit mistakes from getting 
to the end user.  Using poorly chosen values for fields that must be NOT 
NULL often lead to very costly errors to business.

Here is another example that I've seen.  In an insurace company, the 
application input process required that all fields be NOT NULL.  One of 
the fields that was populated was the policy #, which was generated from 
a sequence.  The application could not be input because of missing or 
incorrect datum and the input was cancelled.  The issued policy# was 
just lost  without any trace in the system as to why.  You are now 
called on by an auditor to explain why there are holes in the policy# 
sequence of issued policies.  Guess what, the question cannot be 
adaquately defended and serious financial consequenses can follow.

Last example.  There are business rules that are in effect for a period 
of time such as interest rates.  To represent the valid time period the 
rules is in effect a start date and an end date are associated with the 
rule.  If I choose to make the end date NOT NULL what date do I use when 
creating the record?  If I put the date out to the limits of time for 
the system, then the rule does not represent truth because it ends at 
the date specified.  If I use an epoc date, then the end date preceeds 
the start date which is also not true.  The only valid end date is NULL, 
the rule is in effect until terminated.

In over 20 years of database design and application development, I have 
found that it is far better to leave fields that have unknown values 
NULL rather than populate them with invalid data and suffer the 
consequenses later, especially for auditing purposes.  Only NULL tells 
the truth in the data...the datum is unknown.  When bogus data is used 
to populate required fields, it will inevitably end up on a report that 
someone will misinterpret often to near catestrophic results for business.



More information about the PLUG mailing list