My Big Data predictions for 2016

With 2016 around the corner, the question is what the year will bring for Big Data. Here are my top predictions for the year to come:

  • Growth of relational databases will slow down, as more companies evaluate Hadoop as an alternative to a classic RDBMS
  • The Hadoop stack will get more complicated as more and more projects are added. It will almost take a team to understand what each of these projects does
  • Spark will lead the market for data processing. It will change the entire ecosystem again
  • Cloud vendors will add more and more capabilities to their solutions to deal with the increasing demand for workloads in the cloud
  • We will see a dramatic increase in successful Hadoop use cases, as the first projects come to a successful end

What do you think about my predictions? Do you agree or disagree?

Big Data and Hadoop E-Books at reduced price

Two Big Data and Hadoop E-Books are available at a special promotional price. The reduced price is valid for one week only, so make sure to order soon! The offer expires on the 21st of December, and the books are available on the Kindle store. The two E-Books are:

  • Big Data (Introduction); $0.99 instead of $5: Get it here
  • Hadoop (Introduction); $0.99 instead of $5: Get it here

Have fun reading them!

My Cloud predictions for 2016

2016 is around the corner, and the question is what the next year might bring. Here are my top 5 predictions for 2016:

  • The Cloud war will intensify. Amazon and Azure will lead the space, followed (at quite some distance) by IBM. Google and Oracle will stay far behind the leading 2+1 Cloud providers. Both Microsoft and Amazon will see significant growth, with Microsoft’s growth being higher, meaning that Microsoft will continue to catch up with Amazon
  • More PaaS solutions will arrive. All major vendors will provide PaaS solutions on their platforms for different use cases (e.g. Internet of Things). These solutions will become more industry-specific (e.g. a solution specific to manufacturing workflows, …)
  • Vendors currently not using the cloud will see their income decline as more and more companies move to the cloud
  • Leading providers will more often outsource Cloud data centers to local companies in order to deal with local legislation
  • Big Data in the Cloud will grow significantly in 2016, as more companies move workloads for this kind of application to the Cloud

What do you think? What are your predictions?

Big Data Europe Meetup in Vienna, 15th of December

On the 15th of December, a Big Data Meetup will take place in Vienna, with leading people from Fraunhofer, RapidMiner, Teradata and others.

About the Meetup:

The growing digitization and networking process within our society has a large influence on all aspects of everyday life. Large amounts of data are being produced permanently, and when these are analyzed and interlinked they have the potential to create new knowledge and intelligent solutions for economy and society. Big Data can make important contributions to the technical progress in our societal key sectors and help shape business. What is needed are innovative technologies, strategies and competencies for the beneficial use of Big Data to address societal needs.

Climate, Energy, Food, Health, Transport, Security, and Social Sciences are the most important societal challenges tackled by the European Union within the new research and innovation framework program “Horizon 2020”. In every one of these fields, the processing, analysis and integration of large amounts of data plays a growing role, for example in the analysis of medical data, the decentralized supply of renewable energy or the optimization of traffic flow in large cities.

Big Data Europe (BDE) will undertake the foundational work for enabling European companies to build innovative multilingual products and services based on semantically interoperable, large-scale, multilingual data assets and knowledge, available under a variety of licenses and business models.

On 14-15 December 2015 the whole BDE team is meeting in Vienna for a project plenary. Around 35 experts in the field will therefore take part in the Big Data Europe MeetUp on 15 December 2015 at the Impact Hub Vienna to discuss challenges, requirements and proven solutions for big data management together with the audience.

16:00 – 16:10, Welcome & the BDE MeetUp, Vienna
Martin Kaltenböck (SWC)
16:10 – 16:30, The Big Data Europe Project
Sören Auer (Fraunhofer IAIS, BDE Project Lead)
16:30 – 16:45, Big Data Management Models (e.g. RACE)
Mario Meir-Huber (Big Data Lead CEE, Teradata, Vienna – Austria)
16:45 – 17:00, Selected Big Data Projects in Budapest & above
Zoltan C Toth (Senior Big Data Engineer, RapidMiner Inc., Budapest – Hungary)
17:00 – 17:30, Open Discussion with the Panel on Big Data Requirements, Challenges and Solutions
17:30 – 19:00, Networking & Drinks
Remark: 19:00/30 end of event…

Register here or here.

Conference announcement – Data Natives in Berlin

I am happy to announce that there is a partnership between the Data Natives conference and Cloudvane. Once again, one lucky person can get a free ticket to this conference. The conference takes place from 19th to 20th November in Berlin.

What’s necessary for you to get the ticket:

  • Share the blog post (Twitter, LinkedIn, Facebook) and send the proof of that to me via mail
  • Write a review (ideally with some pictures)

Data Natives focuses on three key areas of innovation: Big Data, IoT and FinTech. The intersection of these product categories is home to the most exciting technology innovation happening today. Whether it’s for individual consumers or multi-billion dollar industries, the opportunity is immense. Come and learn more from leading scientists, founders, analysts, investors and economists from Google, SAP, Rocket Internet, Gartner and Forrester, among others. Expect two days full of interesting talks, with knowledge shared by 50+ speakers and a chance to engage with a data-driven community of more than 500 people.

More information on 

Thursday, November 19, 8:30AM to Friday, November 20 7:00PM  

NHow Hotel Berlin

Stralauer Allee 3

10245 Berlin


What everyone is doing wrong about Big Data

Over the last months I have seen many Big Data “initiatives” in companies. And guess what? Most of them either failed completely or simply didn’t deliver the expected results. A recent Gartner study even mentioned that only 20% of Hadoop projects are put “live”. But why do these projects fail? What is everyone doing wrong?

Whenever customers come to me, they have “heard” what Big Data can help them with. So they have looked at 1-3 use cases and now want to put them into production. However, this is where the problem starts: they are not aware that Big Data also needs a strategic approach. To get this right, it is necessary to understand the industry (e.g. TelCo, Banking, …) and the associated opportunities. To achieve that, a Big Data roadmap has to be built. This is normally done in a couple of workshops with the business. The roadmap then outlines which projects are done in which priority and how to measure results. For this purpose, we have a Business Value Framework for different industries, in which possible projects are defined.

The other thing I often see is that customers come and say: “So now we have built a data lake. What should we do with it? We simply can’t find value in our data.” This is a totally wrong approach. We often talk about the data lake, but it is not as easy as IT marketing tells us: whenever you build a data lake, you first have to think about what you want to do with it. How should you know what you might find if you don’t really know what you are looking for? Ever tried searching for “something”? If you have no strategy, the data lake is worth nothing and you will find nothing. Building a data lake for Big Data without a plan is like buying bricks for a house without knowing where you are going to build that house or what it should finally look like. A data lake does make sense, and it is necessary for providing great analytics and running projects on top of it, but you need to know what you want to build on top of it.

Big Data and IT Business alignment


To sum up, what Big Data needs is a clear strategy and vision. If you fail to put these in place, you will end up like many others: disappointed by promises that didn’t turn out to be true.


How to kill your Big Data initiative

Everyone is doing Big Data these days. If you don’t work on Big Data projects within your company, you are simply not up to date and don’t know how things work. Big Data solves all of your problems, really!

Well, in reality this is different. It doesn’t solve all your problems. It actually creates more problems than you think! Most of the companies I recently saw working on Big Data projects failed. They started a Big Data project and successfully wasted thousands of dollars on it. But what exactly went wrong?

First of all, Big Data is often equated with Hadoop. We live with the misperception that Hadoop alone can solve all Big Data problems. This simply isn’t true. Hadoop can do many things, but real data science is often not done with the core of Hadoop. Ever talked to someone doing the analytics (e.g. someone good at math or statistics)? They are not OK with writing Java MapReduce jobs or Pig/Hive scripts. They want to work with other tools that are far more interactive.

The other thing is that most Big Data initiatives are handled wrongly. Most of them simply don’t include someone who is good at analytics. One simply doesn’t find this type of person in an IT team; the person has to be found somewhere else. Failing to include someone with these skills often leads to finding “nothing” in the data, because IT staff are good at writing queries but not at doing complex analytics. These skills are not actually taught in IT classes; acquiring them requires a totally different field of study.

For many IT departments, Hadoop is the solution to everything. However, projects often stop at implementing Hadoop. Most Hadoop implementations never leave the pilot phase. This is often because IT departments see Hadoop as a fun thing to play with, but getting it into production requires a different approach. There are actually more solutions out there that can be used when delivering a Big Data project.

A key to ruining your Big Data project is not involving the line of business (LoB). The IT department often doesn’t know what questions to ask. So how can they know the answer and then try to find the question? The LoB sees it differently: they see an answer and know which question it belongs to.

The key to killing your Big Data initiative is exactly one thing: go with the hype. Implement Hadoop and don’t think about what you actually want to achieve with it. Forget the use case, just go and play with the fancy technology. NOT!

As long as companies stick to that, I am sure I will have enough work to do. I have “inherited” several failed projects and turned them into successes. So, please continue.