Analytics, big data and enterprise data management that make the promise of Hadoop a reality

Data Lakes


Top Stories

What Is the Difference Between a Data Lake and a Data Warehouse? By Dave Kellermanns

The data warehouse and data lake are two different types of data storage repository. The data warehouse integrates data from different sources and suits business reporting. The data lake stores raw structured and unstructured data in whatever form the data source provides. It does not require prior knowledge of the analyses you think you want to perform.

What is a Data Lake? A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed. While a hierarchical data warehouse stores data in files or folders, a data lake uses a flat architecture to store data.

What is a Data Warehouse? A core component of business intelligence, the data warehouse is a central repository of integrated data from one or more disparate sources, and it's used ... (more)

Data Lake: Save Me More Money vs. Make Me More Money By @Schmarzo | @BigDataExpo #BigData

2016 will be the year of the data lake. But I expect that much of 2016 data lake efforts will be focused on activities and projects that save the company more money. That is okay from a foundation perspective, but IT and Business will both miss the bigger opportunity to leverage the data lake (and its associated analytics) to make the company more money. This blog examines an approach that allows organizations to quickly achieve some “save me more money” cost benefits from their data lake without losing sight of the bigger “make me more money” payoff – by coupling the data lake ... (more)

Data Lake and Data Refinery | @ThingsExpo #BigData #IoT #M2M #API #InternetOfThings

Data Lake and Data Refinery – Gartner Controversy!

Much discussion has been going on about the new phrase “Data Lake”. Gartner wrote a report on the ‘Data Lake’ fallacy, urging caution about the ‘data lake’ or ‘data swamp’. Then Andrew Oliver opened a piece in InfoWorld with these words: “For $200, Gartner tells you ‘data lakes’ are bad and advises you to try real hard, plan far in advance, and get governance correct”. Wow, what an insight! During my days at IBM and Oracle, Gartner wanted to get time on my calendar to talk about database futures. Afterwards, I realized that... (more)

The Data Lake Has Landed | @ThingsExpo #BigData #DevOps #IoT #M2M #API

I'm in hi-tech marketing. I live in a sea of buzzwords, business jargon, and acronyms (most of which are actually abbreviations, but I've learned to let that one slide). They spread faster than a virus in a daycare center. I hear people on conference calls saying things like "Dave, let's double click on that thought and explore it further." Seriously? Do what? Do I sound like that? Or, I'll read marketing materials that say things such as "...Our full stack enterprise-grade cloud solution for business acceleration speeds time-to-value, increase margins, ... (more)

Data Lake Phenomenon | @ThingsExpo #IoT #M2M #BigData #Microservices

Data Lake Phenomenon Among Enterprises

Over the past few years, there has been an explosion in the volume of data. To tackle this big data explosion, there has been a rise in the number of successful Hadoop projects in enterprises. The large volumes of data, the emergence of Hadoop technology, and the need to store all siloed data in one place have prompted a phenomenon among enterprises called the Data Lake. Is the Data Lake an effective catchment for all of the enterprise data? Yes and no. Data lakes are good for housing current, inter-related data, but they don’t address th... (more)

CIOs Must Beware Big Data Blindness By @ABridgwater | @BigDataExpo #BigData

It's not hard to find technology trade press commentary on the subject of Big Data. It is variously defined (in non-technical terms) as the cluttered old shoebox of all data and (in more technical terms) as that amount of data that does not comfortably fit into a standard relational database for storage, processing and analytics within the normal constraints of processing, memory and data transport technologies. We can say that Big Data is an oft-mentioned and sometimes misunderstood subject.

Three key Big Data control factors

Good advice for CIOs faced with this new planet of... (more)