My Big Data predictions for 2016

With 2016 around the corner, the question is what the year will bring for Big Data. Here are my top predictions for the year to come:

  • Growth of relational databases will slow down, as more companies evaluate Hadoop as an alternative to the classic RDBMS
  • The Hadoop stack will get more complicated as more and more projects are added. It will almost take a team to understand what each of these projects does
  • Spark will lead the market for handling data. It will change the entire ecosystem again.
  • Cloud vendors will add more and more capabilities to their solutions to deal with the increasing demand for workloads in the cloud
  • We will see a dramatic increase in successful Hadoop use-cases, as the first projects come to a successful end

What do you think about my predictions? Do you agree or disagree?

My Cloud predictions for 2016

2016 is around the corner and the question is what the next year might bring. Here are my top 5 predictions that could become relevant for 2016:

  • The Cloud war will intensify. Amazon and Azure will lead the space, followed (at quite some distance) by IBM. Google and Oracle will stay far behind the leading 2+1 Cloud providers. Both Microsoft and Amazon will see significant growth, with Microsoft’s growth being higher, meaning that Microsoft will continue to catch up with Amazon
  • More PaaS Solutions will arrive. All major vendors will provide PaaS solutions on their platform for different use-cases (e.g. Internet of Things). These Solutions will become more industry-specific (e.g. a Solution specific for manufacturing workflows, …)
  • Vendors currently not using the cloud will see declines in their income, as more and more companies move to the cloud
  • Cloud Data Centers will more often be outsourced by the leading providers to local companies, in order to satisfy local legislation
  • Big Data in the Cloud will grow significantly in 2016 as more companies move workloads for these kinds of applications to the Cloud

What do you think? What are your predictions?


Despite all those collaboration and cloud services, a lot of us have found that working together has not become much easier since they were introduced. Today every organization uses its own infrastructure, either self-hosted or as an online service, so the borders have only moved; they have not become transparent where needed. The walls between collaborating organisations are as strong as ever.

SPHARES is here to change this.

“We are allowing sharing like Dropbox, but between different systems, even when hosted on your own systems.” - Dietmar Gombotz, CEO of SPHARES

SPHARES is a small start-up team of 5 from Vienna with the mission to make working-life and collaboration much easier by providing a tool that allows you to integrate different work environments without having to actually change tools.

It works as a service integrator between different systems in the background. The sync engine lets you transparently share data with colleagues, whether they use a different system or the same one as you.

As integration types, it currently supports one-way and two-way synchronization between heterogeneous systems.
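The difference between the two integration types can be illustrated with a minimal sketch. This is not how SPHARES is implemented; the `Entry` type, the two in-memory "systems" and the last-writer-wins rule are assumptions made purely for illustration, and deletions are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One shared item: its content and a last-modified timestamp (hypothetical)."""
    content: str
    modified: float


def sync_one_way(source: dict, target: dict) -> None:
    """Copy every item from source to target; target changes never flow back."""
    for name, entry in source.items():
        target[name] = entry  # overwrites the target copy, even if it is newer


def sync_two_way(a: dict, b: dict) -> None:
    """Give both sides the newest version of every item (last writer wins)."""
    for name in set(a) | set(b):
        left, right = a.get(name), b.get(name)
        if right is None or (left is not None and left.modified > right.modified):
            b[name] = left
        elif left is None or right.modified > left.modified:
            a[name] = right


# Two hypothetical systems holding partly different data.
system_a = {"offer.txt": Entry("draft", 100.0)}
system_b = {"offer.txt": Entry("final", 200.0), "notes.txt": Entry("todo", 50.0)}

sync_two_way(system_a, system_b)
# Both systems now hold the newer "final" offer and the notes file.
```

With one-way synchronization only the target changes; with two-way synchronization both sides converge on the same state.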

“Our goal is to make sharing between organisations as easy as sitting beside each other in the same office, even at the same desk.” - Hannes Schmied, BizDev SPHARES

Overview SPHARES

SPHARES either runs on your own server or is hosted online for you on a dedicated virtual machine. It lets you integrate your partners directly via your own server, where you control the environment. Even if you use a virtual machine from us, we will not have access to the users’ data, and neither will you. Communication is secured with double encryption.

Current use-cases SPHARES focuses on

  • Marketing Agencies for collaborator integration
  • Tax Advisors in the digital agency
  • Unique System Integration for integrating bigger solutions
  • Technology Providing for Platforms

SPHARES provides the system either under a service agreement for a monthly fee, which includes all costs for license, updates and support handled via a web interface, or as a technology license for a one-time fee plus maintenance.

If you are interested, simply drop the team a line and they will get back to you ASAP.

Impact of self-driving cars and Smart Logistics on Cloud and Big Data

Self-driving cars are gaining more and more momentum. In 2014, Tesla introduced the “Autopilot” feature for its Model S, which allows autonomous driving. The technology for self-driving cars has been around for years, though; other factors explain why it is still not here. It is mainly a legal question and not a technical one.

However, autonomous systems will arrive within a few years, and they will have a positive impact on cloud computing and big data. Some of the use-cases were already described with smart cities in an earlier post, but there are several others. One positive effect of self-driving cars is improved safety: sensors need milliseconds to react to threats, whereas humans need a second. This leaves more time for a better reaction. Autonomous systems can also communicate with other cars and warn them in advance. This is called “Vehicle to Vehicle communication”. Communication also takes place with infrastructure (which is called “Vehicle to Infrastructure communication”). A street, for instance, can warn the car that there are problems ahead, e.g. that the condition of the street itself is getting worse.

The car IT itself doesn’t need the cloud and big data – but services around that will heavily use cloud and big data services.

Self-driving cars also bring a side effect: Smart Logistics. Smart Logistics means fully automated logistics vehicles that operate without a driver and deliver goods to a destination. This can start in China with a truck that brings a container to a ship. The ship is also fully automated and operates independently. It sails to New York, where the goods are picked up by another self-driving truck. The truck brings the container to a distribution center, where robots unload the container and drones deliver the goods to the customers. All of that is handled by cloud and big data systems that often operate in real time.

Impact of Industry 4.0 and Smart Production on Cloud and Big Data

According to various sources, we are in the middle of the so-called 4th industrial revolution. This revolution is driven by a very high degree of automation and by IT systems. Until recently, IT mainly played a supporting role in industry, but with new technologies that role will change dramatically: IT will lead the industry. Industry 4.0 (or Industrie 4.0) is mainly led by Germany, which is betting heavily on the topic. Germany’s industrial output is high, and in order to maintain its global position, German industry has to, and will, change dramatically.

Let’s first look at the past industrial revolutions:

  • The first industrial revolution took place in the 18th century, when the mechanical loom was introduced.
  • The second industrial revolution took place in the early 20th century, when assembly lines were introduced
  • In the 1970s and 1980s, the 3rd industrial revolution took place. Machines could now perform repeatable tasks and robots were first introduced

The 4th industrial revolution is now led by the IT industry. It is not only about supporting the assembly lines but about replacing them. Customers can define their own product and make it truly individual. Designers can offer templates in online stores, and the product then knows how it will be produced: it selects the factory in which it will be produced and tells the machines how it should be handled.

Everything in this process is fully automated. It starts with ordering something online. The transportation process is automated as well: autonomous systems deliver individual parts to the factories, which goes well beyond traditional just-in-time delivery. This is also a democratization of design: just as individuals can now publish their books as e-books without a publisher, designers can offer their designs online on new platforms. This creates new opportunities for designers as well as customers.

As with Smart Homes and Smart Cities, this not only produces a lot of data; it also requires sophisticated back-end systems in the cloud that take care of these complex processes. Business processes need to be adjusted to the new challenges and are more complex than ever. A single system can’t handle this; it needs a complex system running in the cloud.

Guest Blog: Sphares, a tool to unify collaboration in the Cloud

By Dietmar Gombotz, CEO and Founder of Sphares

With the introduction and growth of different Cloud and Software-as-a-Service offerings, a rapid transition driven by the blending of professional and personal space has taken shape. Users all over the world are not only using modern, flexible new products like Dropbox at home; they also want the same usability and “ease of use” in the corporate world. This of course conflicts with internal policy and external compliance requirements, especially when data is shared through any tool.

I will focus mainly on the aspect of sharing data (usually in the form of files, but it could be other data objects such as calendar information or CRM data).

Many organizations have not yet formulated a consistent and universal strategy for handling this aspect of their daily work. We assume an organizational structure where data sharing with clients/partners/suppliers is a regular process, which will surely be the case in more than 80% of all businesses nowadays.

There are different strategies to handle this:

No Product Policy
Probably the best-known policy is to disallow modern tools and stick with internal infrastructure or in-house built tools.

Pro: data storage is 100% transparent, no need for further clarification

Con: unrealistic expectation, especially in fields with a lot of data sharing; email will be used to transfer data to partners anyway, so the data will end up distributed across multiple places and stages


One-Product Policy

The most widely used proactive policy is to define one solution (e.g. “we use Google Drive”), either with a business account or installed on your own hardware (ownCloud, …)

Pro: data storage can be defined, employees have access to a working solution, clarifications are not needed

Con: partners need accounts on this system and have to make an extra effort to integrate it into their processes


Product-As-You-Need Policy

Often seen at small shops. They use whatever their partners are using and get accounts when their partners propose a solution. They usually have a preferred product, but will switch whenever the client wants to use something else.

Pro: no adjustment needed on the partner’s side

Con: dozens of accounts, often shared with private accounts and without central control; data will be copied into internal systems, just like with email


Usage of Aggregation Services

The organization uses the “Product-As-You-Need” view combined with aggregation tools like JoliCloud or CloudKafe

Pro: no adjustment needed on the partner’s side, one view of the data on the company’s side

Con: data still sits in dozens of systems and on private accounts (no central control), integration into processes is not possible as the data stays on the different systems


Usage of Rule-Engines

There are a couple of rule engines, like IFTTT (If This Then That) or Zapier, that can help you connect different tools and trigger actions, much like the filter rules you know from e-mail inboxes. In combination with a preferred tool, this can be a valid way to get data pre-processed and put into your system

Pro: Rudimentary integration with different systems, employees stay within their system

Con: usually one-way actions, so updates do not get back to your partners; usually set up per user, so there is no central control
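The “if this then that” idea behind such rule engines can be sketched in a few lines. This is a generic illustration, not the API of IFTTT or Zapier; the `Rule` type, the event fields and the `inbox` list are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Event = Dict[str, str]  # hypothetical event shape: type, source, name


@dataclass
class Rule:
    """'If this then that': run `action` when `trigger` matches an event."""
    trigger: Callable[[Event], bool]
    action: Callable[[Event], None]


def dispatch(event: Event, rules: List[Rule]) -> int:
    """Run the action of every matching rule; return how many rules fired."""
    fired = 0
    for rule in rules:
        if rule.trigger(event):
            rule.action(event)
            fired += 1
    return fired


# Hypothetical rule: note every file a partner uploads in an internal inbox.
inbox: List[str] = []
rules = [
    Rule(
        trigger=lambda e: e["type"] == "file_uploaded" and e["source"] == "partner",
        action=lambda e: inbox.append(e["name"]),
    )
]

dispatch({"type": "file_uploaded", "source": "partner", "name": "contract.pdf"}, rules)
dispatch({"type": "file_deleted", "source": "partner", "name": "old.pdf"}, rules)
# Only the upload matched, so inbox now contains just "contract.pdf".
```

The sketch also shows the limitation named above: the action only pushes data in one direction, so nothing flows back to the partner’s system.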

Service Integration

Service Integration allows the sharing of data via an intermediate layer. There are solutions that synchronize data (SPHARES), thereby ensuring data consistency. Additionally, there are services that connect to multiple cloud storage providers to retrieve data (Zoho CRM)

Pro: data is integrated into processes, everybody stays within the system they use

Con: additional cost for the integration service


Impact of Smart Cities and Smart Homes on Cloud and Big Data

Cities and homes are getting smarter and smarter. People living in these cities actually demand more services from the local government. It should not be necessary to go to the city administration for standard tasks; these tasks can be done online. A key driver for smart cities is e-government, but there is much more to it than just e-government (which, in fact, has been around for years).

Cities need to get smarter, and this happens on several levels. The city can automatically adapt to new developments such as heavier traffic in a certain area. If more people want to go to a specific area (maybe because there is an event), public and private transport will automatically adapt. As for private transport, cars often drive “automatically” in a smart city; there is no driver (this will be described later on). This opens up some interesting opportunities: cars tell the city where they want to go. The city has an overview of all desired destinations and can adapt in real time to challenges that might arise. If a destination is in high demand, the city can warn individual cars that there might be a traffic jam and prioritize cars or select alternative routes so that no car ends up in the jam.

Pricing could also differ: if you want to get somewhere fast, you might have to pay a little more. A very similar system can be found in Singapore, where you pay for using streets based on traffic and time. This can significantly reduce private transport, make the city “cleaner” and cause inhabitants less stress. Some people might even choose public transport instead. Furthermore, private traffic could become public: companies might offer their cars to individuals, just like taxis but without drivers.

Of course, this needs a lot of technology in the background. Real-Time systems have to be available and complex calculations have to be done. Smart Cities need Big Data and Cloud Computing in order to provide all of these things.

A similar story can be seen with Smart Homes. More and more home automation is underway. Google’s Nest and Apple’s HomeKit are big bets by these companies on this emerging market. Future homes are highly connected and optimized. When the home “is not in use” (e.g. the children are at school and the parents at work), the home stops heating or just keeps the temperature at a low level. Before they come back home, the house starts to heat up again to reach the required temperature (or vice versa: the home cools down for those living in warmer regions). The home itself can be opened simply with a smartphone, and devices within the house are connected as well. There are sensors for elderly people that help prevent danger, and advanced surveillance systems protect the home from unwanted visitors.
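The heating behaviour described above boils down to a small decision rule. The following is a minimal sketch under stated assumptions, not how Nest or HomeKit work: the temperatures, the 45-minute lead time and the function name are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

COMFORT_C = 21.0                 # assumed target while someone is home
SETBACK_C = 16.0                 # assumed reduced level while the home is empty
PREHEAT = timedelta(minutes=45)  # assumed lead time needed to warm up again


def target_temperature(now: datetime, occupied: bool,
                       expected_return: Optional[datetime] = None) -> float:
    """Pick the heating target from occupancy and the expected return time."""
    if occupied:
        return COMFORT_C
    # Start heating early enough to reach the comfort level on arrival.
    if expected_return is not None and now >= expected_return - PREHEAT:
        return COMFORT_C
    return SETBACK_C


back_home = datetime(2016, 1, 11, 17, 0)
morning = target_temperature(datetime(2016, 1, 11, 10, 0), False, back_home)
late_afternoon = target_temperature(datetime(2016, 1, 11, 16, 30), False, back_home)
# morning is the setback level; late_afternoon is already the comfort level.
```

A real system would learn the return time from sensors or phone locations rather than take it as a fixed input.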

As with smart cities, this also requires a lot of back-end technology that is delivered via the cloud and uses big data technologies.