Why Big Data projects are challenging – and why I love it


During my professional career, I have managed several IT projects, mainly in the distributed systems environment. Initially, these were cloud projects, which were rather easy. I worked with IT departments in different domains and industries, and we all shared the same “vocabulary”. When talking with IT staff, it is clear that everyone uses the same terms to describe “things”. No special explanation is needed.

I soon realized that Big Data projects are VERY different from that. In the last month, I wrote several posts on Big Data challenges and on the requirements for data scientists and similar roles. What I keep coming across when managing Big Data projects is the different approach one has to take to manage these kinds of projects successfully.

Let me start by explaining what I do. First of all, I don’t code, implement or create any kind of infrastructure. I work with senior (IT) staff to talk about ideas that will eventually be transformed into Big Data projects (either directly or indirectly). My task is to work with them on what Big Data can achieve for their organization and/or problem. I do not discuss what their Hadoop solution will look like; I work on use-cases and on the challenges and opportunities for their problems, independent of any concrete technology. Plus, I am not focused on any specific industry or domain.

However, all strategic Big Data projects follow a concrete pattern. The most challenging part is to understand the problem. In the last month, I had some challenges in different industries; whenever I run this kind of project, it is mainly about cooperating with the domain experts. They often have no idea about the possibilities of Big Data, and they don’t have to. I, in contrast, have no idea about the domain itself. This is challenging on the one hand, but very helpful on the other. The more experience a person gains within a specific domain, the more that person thinks and acts within the methodology of that domain. Domain experts often don’t see the solution because they work on the assumption “I’ve seen this before, so this has to be very similar”. The same applies to me as a Big Data expert. All the workshops I ran were mainly about combining the concrete domain with the possibilities of Big Data.

I had a number of interesting projects lately. One of them was in the geriatric care domain. We worked on how data can make the lives of elderly people better and what type of data is needed. It was very interesting to work with domain experts and see what challenges they actually face. An almost funny discussion arose around Open Data: we looked at several data sources provided by the state and I said, “Sorry, but we can’t use these data sources. They aren’t big and they are about the locations of toilets within our capital city”. Their opinion was different, however, because the location of toilets is very important for them. Data doesn’t always need to be big; it needs to be valuable. Another project was in the utilities domain, where the goal was to improve the supply chain by optimizing it with data. Another project, for a company providing devices, was about improving the reliability of those devices by analyzing large amounts of log data. When one of their devices has an outage, a service technician has to travel to the city of the outage. This takes several days to a week. I worked on reducing this time and brought in a data scientist. For the three major error codes, we were able to reduce the time a device stands still to just a few hours by finding patterns weeks before the outage occurs. However, there is still much work to be done in that area. Further projects were in the utilities sector and in the government sector.

All of these projects had a common iteration phase but were very different, and each had its own challenges. The key success factor for me was how to deal with people. It was very important to work with people from different domains and with different mindsets, which improved my knowledge and broadened my horizon as well. That’s challenging on the one hand but very exciting on the other.

Published by

Mario Meir-Huber

I work as a Big Data Architect for Microsoft. In this role, I support my customers in applying Big Data technologies - mainly Hadoop/Spark - to their use-cases. I also teach this topic at various universities and frequently speak at conferences. In 2010 I wrote a book about Cloud Computing, which is often used at German & Austrian universities. In my home country (Austria) I am part of several organisations on Big Data.

2 thoughts on “Why Big Data projects are challenging – and why I love it”

  1. Great post! I think one of the biggest challenges is that a lot of organizations do not know where to start. And when they finally do make the investment and take the plunge, they hope to boil the ocean. Big data success is most often the result of building up your maturity by starting small to address known problems. In doing so, it’s also important for organizations to celebrate the small wins. Failures are bound to happen, but when you are constantly adding to your abilities, the failures can prove to be valuable lessons for the next win.

    Peter Fretty, IDG blogger working on behalf of SAS

    1. Thanks, Peter! We did a study with some partners analyzing Big Data Projects in terms of their management – and found that additional skills are necessary. I will hopefully soon have time to write about that 🙂
