What Is the Difference Between a Data Lake and a Data Warehouse?
By Dave Kellermanns
The data warehouse and data lake are two different types of data storage
repository. The data warehouse integrates data from different sources and
suits business reporting. The data lake stores raw structured and
unstructured data in whatever form the data source provides. It does not
require prior knowledge of the analyses you think you want to perform.
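As a rough illustration of the distinction above, here is a minimal Python sketch. Everything in it is hypothetical and not taken from any particular product: the sample records, the `lake` dictionary, the `to_row` helper, and the `sales` table are assumptions made up for the example. The "lake" keeps each record exactly as its source delivered it, while the "warehouse" only accepts records that fit a schema decided up front.

```python
import json
import sqlite3

# Hypothetical raw events, each in whatever form its source happens to emit.
raw_events = [
    '{"user": "ann", "amount": "19.99", "ts": "2016-01-05"}',  # JSON from a web app
    'bob,5.00,2016-01-06',                                     # CSV line from a legacy system
    '{"user": "cy", "note": "free-form text, no amount"}',     # fits no report yet
]

# "Lake": a flat key -> raw payload store. Nothing is parsed or rejected.
lake = {f"event-{i}": payload for i, payload in enumerate(raw_events)}


def to_row(payload):
    """Return (user, amount, ts), or None if the record does not fit the schema."""
    if payload.startswith("{"):
        rec = json.loads(payload)
        if not {"user", "amount", "ts"} <= rec.keys():
            return None
        return rec["user"], float(rec["amount"]), rec["ts"]
    user, amount, ts = payload.split(",")
    return user, float(amount), ts


# "Warehouse": integrate the sources into one fixed schema at load time.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (user TEXT, amount REAL, ts TEXT)")
rows = [row for row in map(to_row, lake.values()) if row is not None]
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

# A typical business-reporting query against the integrated data.
total = warehouse.execute("SELECT SUM(amount) FROM sales").fetchone()[0]
```

In this sketch the third record never reaches the warehouse because it does not fit the reporting schema, but it is still sitting in the lake if a later analysis turns out to need it.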
What is a Data Lake?
A data lake is a storage repository that holds a vast amount of raw data in
its native format until it is needed. While a hierarchical data warehouse
stores data in files or folders, a data lake uses a flat architecture to
store data.
What is a Data Warehouse?
A core component of business intelligence, the data warehouse is a central
repository of integrated data from one or more disparate sources, and it's
used ...
2016 will be the year of the data lake. But I expect that much of the 2016
data lake effort will be focused on activities and projects that save the company
more money. That is okay from a foundation perspective, but IT and Business
will both miss the bigger opportunity to leverage the data lake (and its
associated analytics) to make the company more money.
This blog examines an approach that allows organizations to quickly achieve
some “save me more money” cost benefits from their data lake without
losing sight of the bigger “make me more money” payoff – by coupling
the data lake ...
Data Lake and Data Refinery – Gartner Controversy!
There has been much discussion of the new phrase “data lake”. Gartner
wrote a report on the ‘Data Lake’ fallacy, warning readers to be careful about
the ‘data lake’ or ‘data swamp’. Then Andrew Oliver opened his InfoWorld
piece with these words: “For $200, Gartner tells you ‘data
lakes’ are bad and advises you to try real hard, plan far in advance, and
get governance correct”. Wow, what an insight!
During my days at IBM and Oracle, Gartner wanted to get time on my calendar
to talk about database futures. Then afterwards, I realized that...
The Data Lake Has Landed
I'm in hi-tech marketing. I live in a sea of buzzwords, business jargon, and
acronyms (most of which are actually abbreviations, but I've learned to let
that one slide). They spread faster than a virus in a daycare center. I hear
people on conference calls saying things like "Dave, let's double click on
that thought and explore it further." Seriously? Do what? Do I sound like
that? Or, I'll read marketing materials that say things such as "...Our full
stack enterprise-grade cloud solution for business acceleration speeds
time-to-value, increase margins, ...
It's not hard to find technology trade press commentary on the subject of Big
Data.
Variously defined (in non-technical terms) as the cluttered old shoebox of
all data - and again (in more technical terms) as that amount of data that
does not comfortably fit into a standard relational database for storage,
processing and analytics within the normal constraints of processing, memory
and data transport technologies - we can say that Big Data is an oft
mentioned and sometimes misunderstood subject.
Three key Big Data control factors
Good advice for CIOs faced with this new planet of...
Data Lake Phenomenon Among Enterprises
Over the past few years, there has been an explosion in the volume of data.
To tackle this big data explosion, there has been a rise in the number of
successful Hadoop projects in enterprises. The large volumes of data,
the emergence of Hadoop technology, and the need to store all siloed data in
one place have prompted a phenomenon among enterprises called the data lake.
Is the Data Lake an effective catchment for all of the enterprise data?
Yes and no. Data lakes are good for housing current, interrelated data, but
they don’t address th...