Working with Big Data is considered to be the job you simply have to go for. Some call it sexy, some call it the best job of the future. But what exactly is a Data Scientist? Is it someone you can simply hire from university, or is it more complicated? It is definitely the latter.
When we think about a Data Scientist, we often say that the perfect Data Scientist is a kind of hybrid between a Statistician and a Computer Scientist. I think this needs to be redefined, since much more knowledge is necessary. A Data Scientist should also be good at analysing business cases and at talking to line executives to understand the problem and model an ideal solution. Furthermore, extensive knowledge of current (international) law is necessary. In a recent study we did, we defined 5 major challenges:
Each of the 5 topics covers the following:
- Big Data Business Developer: This person needs to know which questions to ask and how to cooperate with line of business (LOB) decision makers, and must have the social skills to work with all of them.
- Big Data Technologist: In case your company isn’t using the cloud for Big Data Analytics, you also need infrastructure expertise. This person must know a lot about system infrastructure, distributed systems, datacenter design and operating systems. Furthermore, it is also important to know how to run your software. Hadoop doesn’t install itself, and some maintenance is necessary.
- Big Data Analyst: This is the fun part; here it is all about writing your queries, running Hadoop jobs, doing fancy MapReduce queries and so on! However, this person should know what to analyse and how to implement the algorithms for that analysis. It is also about machine learning and more advanced topics.
- Big Data Developer: Here it is more about writing extensions, add-ons and similar tooling. It is also about distributed programming, which is not easy in itself.
- Big Data Artist: Got the hardware/datacenter right? Know what to analyse? Wrote the algorithms? What about presenting the results to your management? Exactly! This is also necessary, and you simply shouldn’t forget about it. The best data is worth nothing if nobody is interested in it because of poor presentation, so you need to know how to present your data.
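To make the "MapReduce queries" mentioned under the Big Data Analyst a bit more concrete, here is a minimal sketch of the map, shuffle and reduce steps as a word count in plain Python. It does not use Hadoop at all; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample input are made up for illustration, and a real job would run these steps distributed across many machines.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of text."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Group all emitted values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases one after another on a list of lines (hypothetical helper)."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Illustrative input only, not real data.
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
```

The point of the split is that the map and reduce functions only ever see one line or one key at a time, which is what lets a framework spread the work over a cluster.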
As you can see, it is very hard to become a Data Scientist. Things are not as easy as they might seem. The Data Scientist should be a nerd in each of these fields, so the person should be some kind of “super nerd”. This might be the superhero of the future.
Most likely, you won’t find one person who is good at all of these fields. Therefore, it is necessary to build an effective team.
Header Image Copyright: Chase Elliott Clark