Analytics, big data and enterprise data management that make the promise of Hadoop a reality

Data Lakes

Live From Strata + Hadoop World: Dry Lakes, Salt Lakes, Data Lakes

Jeffrey Abbott

Water, water everywhere and nothing to drink. Today I traveled from Boston to San Jose, CA. With stunningly clear weather and a window seat, I watched the landscape change from the frozen blanket of white covering the entire Northeast and Great Lakes, to the dry and rugged Rockies that are oddly snow-free, to the nearly empty reservoirs of California, whose bleached sidewalls reveal our failure to balance supply and demand for natural resources. The picture here shows Utah's Wasatch range, home to Snowbird and Alta, which usually get among the most snow of any U.S. ski areas (it looks more like May than February right now). This year, you'll find far more snow in New England. This trip brings me to the biggest gathering of big data practitioners of the year, and although I see empty reservoirs, I see lots of data lakes.

In fact, judging by the top big data vendors, the notion of a data lake seems to have moved past the skepticism, rejection, and second-guessing that plague all new tech concepts. Vendors, customers, and industry experts have found common ground around the idea that the data lake can relieve the challenges of the data warehouse. The big question is where the data lake fits with the data warehouse. Is it a teammate, a leader, a follower, or a full-on replacement?

The data lake, although it suffers from a bad name, leverages new technologies and approaches to accommodate both structured and unstructured data from a range of sources without the need to categorize, classify, or label that data when it's captured. In other words, because technologies such as Hadoop enable data to be ingested with high efficiency, we can now store it without already knowing how we'll use it.
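For readers who think in code, here is a minimal sketch of that "store now, decide later" idea, often called schema-on-read. It is only an illustration under stated assumptions: the in-memory list `raw_store`, the helpers `ingest` and `read_with_schema`, and the sample records are all hypothetical stand-ins, not Hadoop APIs or any vendor's product.

```python
import json

# Hypothetical stand-in for the lake: raw payloads are kept exactly as they
# arrive, with no schema, category, or label applied at capture time.
raw_store = []


def ingest(record: str) -> None:
    """Store the raw payload untouched (no parsing, no validation)."""
    raw_store.append(record)


def read_with_schema(fields):
    """Apply a schema only at read time: project the requested fields
    from whichever stored records happen to parse as JSON objects."""
    rows = []
    for raw in raw_store:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            # Unstructured text stays in the store; this reader just skips it.
            continue
        if isinstance(doc, dict):
            rows.append({f: doc.get(f) for f in fields})
    return rows


# Capture mixed structured and unstructured data without deciding its use.
ingest('{"user": "a17", "action": "login", "ms": 120}')
ingest("free-form support note: customer asked about pricing")
ingest('{"user": "b02", "action": "purchase", "amount": 49.5}')

# Later, one analysis decides it only cares about users and actions.
rows = read_with_schema(["user", "action"])
```

Nothing is lost at capture: the free-form note is still sitting in `raw_store`, ready for a different reader (say, a text-mining job) that nobody had planned when the data arrived.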

Although many vendors are rushing to position their capabilities to build you a data lake, many of them are missing the primary reason why their customers are slow to adopt. The challenge is that the promised value of a data lake falls into two distinct categories. The first is easy. It's the cost savings side. It's the efficiency derived from a better way to store massive amounts of both structured and unstructured data. And although that matters, it's… well… boring. The second is what actually gets business leaders interested: new products, services, markets, customers, business models, partnerships, revenue streams, etc. And those are exactly the right types of use cases for big data analytics and data lakes.

But before business leaders sign off on major investments, they need numbers: metrics, KPIs, ROI, time-to-value, opportunity cost, economies of scale, etc. And for big data, they need to understand the analytics use cases that will produce insight that advances their strategic initiatives. They need this before committing to a major shift in how they “afford” IT, in hopes of turning it from a cost center into a revenue center.

From Day 1 at the 2015 Strata Conference in San Jose, it's apparent that the data lake has moved from an experiment that runs alongside a data warehouse into a better approach to ingesting and storing data that has untapped value. The critical first step is to determine where and how to apply the analytics capabilities. Many studies show that identifying use cases is the biggest obstacle to big data adoption. EMC has addressed this with a Big Data Vision Workshop. This infographic explains the process.

More Stories By Jeffrey Abbott

Jeff is part of EMC’s Global Services division, helping customers identify and take advantage of opportunities in Big Data.

Prior to EMC, Jeff helped build and promote a cloud-based ecosystem for CA Technologies that combined an online community, cloud development platform, and e-commerce site for cloud services. Jeff also spent several years within CA’s Thought Leadership group, creating and promoting top-level messaging and social-media programs around major disruptive trends in IT. Prior to this, Jeff spent 3 years at EMC, marketing IT management software products. Jeff’s marketing career also includes time at Citrix, as well as numerous marketing firms – one of which he founded with 2 former colleagues in 1999. Jeff lives in Sudbury, MA, with his wife, 2 boys, and dog. Jeff enjoys skiing, backpacking, photography, and classic cars.