Thursday, 17 October 2013

Weaknesses in Traditional Data Platforms

Everyone understands that Hadoop brings high-performance commercial computing to organizations using relatively low-cost commodity storage. What is accelerating the move to Hadoop is the weakness of traditional relational and data warehouse platforms in meeting today's business needs. Some key weaknesses of traditional platforms include:


  • Early binding to schemas (the schema has to be designed before any data can be loaded) greatly increases the latency between receiving new data sources and deriving business value from this data. Hadoop, by contrast, binds the schema late, when the data is read.
  • The high cost and complexity of SAN storage. This cost forces organizations to aggregate or remove a lot of data that contains high business value. Important details and information are getting thrown out or hidden in aggregated data.
  • The complexity of working with semi-structured and unstructured data.
  • The cost, complexity and ramifications of maintaining database administrators, storage and networking teams in traditional platforms. Traditional environments require many silos of expertise and software, and these have dramatic effects on agility and cost. It's gotten to the point that vendors are now delivering extremely expensive engineered systems to deal with the complexity of these silos. These engineered systems require even more specialized expertise to maintain and make customers ever more dependent on the vendors. What's funny is that you hear the old phrase "one throat to choke," but it's the customer who's choking on the cost. With Hadoop's self-healing and fault tolerance, a small team can manage thousands of servers. A single Hadoop administrator can manage 1,000 to 3,000 nodes, all on relatively inexpensive commodity hardware.
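
To make the schema point in the first bullet concrete, here is a minimal Python sketch. It is only an illustration and not how any particular warehouse or Hadoop tool works: the sample records, the column names and both helper functions are made up for this example. It contrasts a loader whose columns are fixed before the data arrives with a reader that keeps the raw records and picks fields only at query time.

```python
import json

# Hypothetical raw events; the second record carries a new field ("device")
# that did not exist when the fixed schema below was designed.
RAW_EVENTS = [
    '{"user": "alice", "amount": 30}',
    '{"user": "bob", "amount": 12, "device": "mobile"}',
]

# Schema fixed before load: only these columns survive ingestion.
FIXED_COLUMNS = ("user", "amount")


def load_with_fixed_schema(lines):
    """Keep only the predeclared columns; any other field is dropped at load time."""
    rows = []
    for line in lines:
        record = json.loads(line)
        rows.append(tuple(record.get(col) for col in FIXED_COLUMNS))
    return rows


def read_with_late_schema(lines, fields):
    """Leave the raw lines untouched and apply the field list only when reading."""
    return [tuple(json.loads(line).get(f) for f in fields) for line in lines]


warehouse_rows = load_with_fixed_schema(RAW_EVENTS)
late_rows = read_with_late_schema(RAW_EVENTS, ("user", "device"))

print(warehouse_rows)  # the "device" field is gone
print(late_rows)       # the "device" field is still available
```

In the fixed-schema path, using the new field means changing the schema and reloading the data. In the late-schema path, the field is already sitting in the raw records and only the query changes.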

While the above highlights the need for Hadoop, it's also important to understand that traditional relational databases and data warehouses keep their existing role and are still needed. A relational database provides a completely different function than a Hadoop cluster. Also, a company is not going to throw out all its existing data warehouses or the expertise and reporting it has built around them. Hadoop today is usually used to add new capabilities to an enterprise data environment, not to replace existing platforms.

The old line that "no one ever gets fired for buying IBM" is a thing of the past with Hadoop. An entire organization may go under if its competition is effectively using big data and it is not. Hadoop is the most disruptive technology since the dot-com days.
