Big Data is definitely a very complex “thing”. Why do I call it “a thing”? Because it is not a technology in itself. Hadoop is a technology, and Lucene is a technology, but Big Data is more of a concept: it is nothing you can touch. Have you ever tried installing Big Data on your machine? Or said “I need this Big Data software”? When you talk about software or a technology, you talk about a very concrete product or open source tool.
Implementing the concept of Big Data is rather complicated. There are several major dimensions you have to be aware of:
- Legal dimension: What does data protection legislation require? What legal impacts do you need to know about, and what kind of data are you allowed to collect, store, or process?
- Social dimension: What social impacts will you generate with your application? How will your users react to that?
- Business dimension: What business model do you want to build with your Big Data platform? How can your Big Data platform support your business? What kind of pricing do you want to set?
- Technology dimension: How can you achieve your targets? What technology would you use to get there? What scalable software can you use?
- Application dimension: What industry solutions are available for your needs? How can you enable decision support based on data for your company?
If you want to address all of these questions, you need a team that is capable of answering them. In the next posts I will talk about the Big Data technology stack and what it takes to be a data scientist.
Header Image copyright: Michael Coghlan. Distributed under the Creative Commons license 2.0 by Creative Commons Australia Pool.