Big Data: What Is the Data Lake?


When it comes to Big Data, people often talk about the “Data Lake”. But what is it?

Historically, we lived in “Data Ponds”. With the data pond architecture, each department within a company has its own data storage, often in different formats and technologies. HR, for instance, uses different technologies than the marketing department. The reasons for that vary, but it is mostly because the applications these departments use are too different.

With data ponds, we ended up with many different storage technologies and formats side by side: SQL, NoSQL, XML, unstructured data and many more.

The major difference in a data lake, which is the newer approach, is that all data is now seen as one thing, regardless of where it is stored, which department owns it and so on. All data within a company is the company’s entire knowledge. With technologies such as Hadoop, we have the possibility to use all available data. The Hadoop ecosystem offers many data integration and governance tools for working with different data types.

With the Data Lake, all existing data ponds are brought together in one place, which forms the data lake. The company or organisation gets a much better view of what data is available, and it also gains more comprehensive insight.
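To make the idea a bit more concrete, here is a minimal sketch in plain Python. It is only a toy illustration and not how Hadoop does it: the three "ponds" (HR as CSV, marketing as JSON, sales as a SQL table), their contents and all function names are made up for this example. It shows how records from different formats can land in one place and then be queried together.

```python
import csv
import io
import json
import sqlite3

# Hypothetical "data ponds": each department keeps its data in its own format.
HR_CSV = "employee_id,name\n1,Alice\n2,Bob\n"
MARKETING_JSON = '[{"campaign": "spring", "employee_id": 1}]'


def make_sales_db():
    """Create an in-memory SQL database that stands in for the sales pond."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER, employee_id INTEGER, amount REAL)"
    )
    conn.execute("INSERT INTO orders VALUES (100, 1, 250.0)")
    return conn


def read_hr_pond(text):
    """Read the HR pond (CSV) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def read_marketing_pond(text):
    """Read the marketing pond (JSON) into a list of dicts."""
    return json.loads(text)


def read_sales_pond(conn):
    """Read the sales pond (SQL table) into a list of dicts."""
    conn.row_factory = sqlite3.Row
    rows = conn.execute("SELECT order_id, employee_id, amount FROM orders")
    return [dict(row) for row in rows]


def build_lake(ponds):
    """Put all ponds in one place, tagging each record with its origin."""
    lake = []
    for source, records in ponds.items():
        for record in records:
            lake.append({"source": source, "data": record})
    return lake


lake = build_lake({
    "hr": read_hr_pond(HR_CSV),
    "marketing": read_marketing_pond(MARKETING_JSON),
    "sales": read_sales_pond(make_sales_db()),
})

# One query across all former ponds: everything known about employee 1.
# CSV delivers strings while JSON and SQL deliver integers, so compare as text.
about_employee_1 = [
    entry for entry in lake
    if str(entry["data"].get("employee_id")) == "1"
]
```

The records keep their original shape and are only tagged with their source. This is a deliberate simplification: the point is that the data becomes visible and queryable in one place, not that it is forced into a single schema first.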

Header image by Dave Bloggs, licensed under Creative Commons.

Published by

Mario Meir-Huber

I work as a Big Data Architect for Microsoft. In this role, I support my customers in applying Big Data technologies, mainly Hadoop and Spark, to their use cases. I also teach this topic at various universities and frequently speak at conferences. In 2010 I wrote a book about Cloud Computing, which is often used at German and Austrian universities. In my home country (Austria) I am part of several organisations on Big Data.
