What's wrong with AWS & other cloud tech?

Daniel Fussell dfussell at byu.edu
Fri Nov 18 12:26:33 MST 2011


On 11/18/2011 08:54 AM, S. Dale Morrey wrote:
> For the record the client has 20 servers located in a single
> datacenter, and it was during the design of their business continuity
> plan that they realized they suffer from the potential for a
> catastrophic single point of failure.  Client is a healthcare records
> management & billing company, so HIPAA & PCI are both significant
> concerns. But they do have strong encryption on the data and there are
> pretty tight controls on who exactly gets access to what data.
>
I know what you mean by that gut-twisting feeling.  While I don't know 
much about the various cloud providers' designs and whatnot, I have 
been in a business where the expensive core applications were on Unisys 
mainframes sitting off in some hosted data center several hundred miles 
away.  You might think, "Gee, getting the data out of the Wasatch Front 
earthquake zone is a great idea."  Until I tell you the data center was 
in LA.

After Hurricane Katrina, our entire industry dropped all of the 
nitty-gritty topics they usually whine about and started screaming 
about business continuity.  We worked with the app host to "ensure" 
that if LA was wiped out, the data would be available from one of their 
other regional data centers.  We ended up in Florida for our second 
site.  It's a long way from earthquake hazards, and we thought that 
would be all right.  
That is, until I was on vacation in Florida making the geek's pilgrimage 
to Kennedy Space Center, saw the data center as we approached Kennedy, 
and suddenly realized something.  Not only was it in Hurricane Alley, it 
had been hit before by the same hurricane that wreaked havoc on NASA's 
vehicle assembly building.

Yes, the odds of a hurricane and an earthquake hitting at the same time 
are rather small-ish, but to make a long story short, disaster struck 
from the business side, and the business literally disappeared 
overnight.  Though it was the largest and oldest business in its area, 
it's now a small paragraph in some small-town history book.

The sad part is, there was a long-running project that would have 
identified the killer problem and might have helped the company dodge 
yet another bullet in its long history.  But it was repeatedly held up 
by objections from the app host, the bureaucracy in both companies, and 
the lack of design transparency.  This saving project should have been 
done in a week, tops, but was taking several years, all because it was 
cheaper to host the data and apps with a "trusted" partner.  A lot of 
people got hurt in the disaster aftermath, many of whom were my friends.

So that's my horror story, and why I get that gut-twisting feeling 
whenever someone proposes moving critical data off site to a service 
provider as if it were a fantastically new idea.  Maybe today's cloud 
isn't your grandfather's app host.  But do consider that the further 
the data gets out of your immediate control, the more possible points 
of failure you add.  Sometimes redundancy reduces that risk; sometimes 
it makes the risk worse while making it look better.  Think of growing 
RAID rebuild times, or cascading cluster failures if you are into HA 
clusters.
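
On the RAID point, here's a rough back-of-the-envelope sketch in 
Python.  Every number in it (MTBF, drive count, rebuild rate) is a 
made-up assumption for illustration, not a measurement; the shape of 
the result is the point, not the figures:

    import math

    # All of these figures are illustrative assumptions.
    MTBF_HOURS = 1000000.0      # assumed per-drive mean time between failures
    DRIVES = 8                  # drives in the array
    REBUILD_MB_PER_SEC = 50.0   # assumed sustained rebuild rate

    def rebuild_hours(disk_tb):
        # time to re-read one whole disk at the assumed rate
        return disk_tb * 1000000.0 / REBUILD_MB_PER_SEC / 3600.0

    def second_failure_odds(disk_tb):
        # chance that any surviving drive fails during the rebuild
        # window, using a simple exponential failure model
        survivors = DRIVES - 1
        return 1.0 - math.exp(-survivors * rebuild_hours(disk_tb)
                              / MTBF_HOURS)

    for tb in (1, 4, 10, 20):
        print("%2d TB disks: ~%5.1f h rebuild, second-failure odds ~%.2f%%"
              % (tb, rebuild_hours(tb), 100.0 * second_failure_odds(tb)))

The takeaway is just that doubling the disk size doubles the window in 
which the array is one failure away from data loss; the "redundant" 
design quietly gets riskier as the drives grow.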

We weren't regulated by HIPAA, but just off the top of my head you are 
probably going to need something like SAS-70 reports from the provider, 
updated regularly.  Vendor/risk management complexity will increase, as 
will the cost of the labor involved.  I understand there are heavy 
penalties for not being able to retrieve medical records; something like 
$50,000 per lost x-ray (though I got that number from hearsay a few 
years ago).  What is the cost to the business if something simple (like 
backhoe-demarc issues) prevents you from getting to your highly 
redundant, multi-homed, replicated cloud host?  What's the cost if some 
government entity requests your data from the app host without your 
knowledge?  Will they require an official warrant, or just turn it over 
without due process because it's not their data to worry about?  What if 
the provider relationship doesn't work out?  Can you get your data back, 
and ensure they have no residual copies?

My rule of thumb is "don't outsource your core business".  It makes 
sense for a doctor to have a medical records company manage his 
archives; his business is diagnosis and treatment, not records 
management.  It doesn't make sense for a records management company to 
outsource storage and infrastructure.  At that point, you're just a 
compliance management company.  You might save 80% of infrastructure 
cost, but you're doing so by getting rid of 80% of your core business.

Now if that 80% savings is from moving your DR site from expensive 
rented racks and servers in a data center, or some kind of SunGard-ish 
contracted mobile data center drop, to a mass-market, cross-site 
replicated, virtual data center, then you might have something.  I'm 
assuming, of course, you won't later be tempted to run the whole thing 
from the cloud all the time.

But at that point you are comparing the cost of DR using aging, idling 
equipment (and/or idle DR insurance contracts) against an elastic-ish 
pay-as-you-go model, with only the (encrypted) storage replication 
hosts running all the time; a rough comparison is sketched below.  You 
would still need to do your periodic DR tests to ensure the VMs can 
come up and take over the load.
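
To make that concrete, here's a trivial cost sketch.  Every figure in 
it is a placeholder assumption (rack rental, storage price, fleet size, 
test hours), not a quote from any provider; the structure of the 
comparison is what matters:

    # Back-of-the-envelope DR cost comparison; all figures are
    # placeholder assumptions, not real quotes.
    RENTED_DR_RACKS_PER_MONTH = 8000   # idle racks, servers, remote hands
    CLOUD_STORAGE_TB = 10              # replicated (encrypted) data set
    STORAGE_PER_TB_MONTH = 100         # always-on replication target
    REPLICATION_HOST_PER_MONTH = 300   # small always-on receiver VM
    VM_FLEET_PER_HOUR = 40             # full fleet, only when spun up
    DR_TEST_HOURS_PER_YEAR = 48        # periodic failover tests

    rented_per_year = RENTED_DR_RACKS_PER_MONTH * 12
    cloud_idle_per_year = (CLOUD_STORAGE_TB * STORAGE_PER_TB_MONTH
                           + REPLICATION_HOST_PER_MONTH) * 12
    cloud_test_per_year = VM_FLEET_PER_HOUR * DR_TEST_HOURS_PER_YEAR

    print("Rented DR racks: $%d/yr, idle or not" % rented_per_year)
    print("Cloud DR:        $%d/yr (replication + %d h of test failovers)"
          % (cloud_idle_per_year + cloud_test_per_year,
             DR_TEST_HOURS_PER_YEAR))

The gap closes fast, of course, if you give in to the temptation above 
and start running the whole fleet in the cloud all the time.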

Finally, keep in mind that today's cloud is only slightly different from 
yesterday's black box.

My apologies to the hosting-industry folks for any unintended offense.  
My comments are coming from my once-bitten-twice-shy experience, not 
from a disdain for the general industry.

Grazie,
;-Daniel

