Big Data involves many different technologies, and each of them requires different knowledge. I’ve described the necessary knowledge in an earlier post.
In this post, I want to outline all necessary technologies in the Big Data stack. The following image shows them:
The layers are:
- Management: This layer addresses how to store data on hardware or in the cloud and which resources need to be scheduled. It essentially covers the knowledge involved in datacenter design and/or cloud computing for Big Data.
- Platforms: This layer is all about Big Data technologies such as Hadoop and how to use them.
- Analytics: This layer is about the mathematical and statistical techniques necessary for Big Data. It is about asking the questions you need to answer.
- Utilisation: The last and most abstract layer is about the visualization of Big Data. This work is mainly done by visual artists using presentation software.
Each layer needs different knowledge as well as different hardware and software. As described earlier, it is simply not possible to have a single piece of software that “fits it all”. Instead, you need to build a team that has knowledge in all of these areas.
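As a rough illustration of that last point, the four layers can be written down as a small data structure and used to check which layers a team does not yet cover. This is only a sketch: the names `Layer`, `BIG_DATA_STACK` and `missing_layers` are made up for this example, and the one-line descriptions are condensed from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    concern: str


# Listed in the same order as in the post, from least to most abstract.
BIG_DATA_STACK = (
    Layer("Management", "storage, resource scheduling, datacenter or cloud design"),
    Layer("Platforms", "Big Data technologies such as Hadoop"),
    Layer("Analytics", "mathematical and statistical techniques"),
    Layer("Utilisation", "visualization of Big Data"),
)


def missing_layers(team_skills):
    """Return the names of the layers that nobody on the team covers."""
    covered = set(team_skills)
    return [layer.name for layer in BIG_DATA_STACK if layer.name not in covered]


# Hypothetical team that only knows the two middle layers.
print(missing_layers({"Platforms", "Analytics"}))
# → ['Management', 'Utilisation']
```

A team is complete in this simple sense once `missing_layers` returns an empty list.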