Partitioning is another important factor for Big Data applications. It is one of the factors of the CAP theorem (see 1.6.1) and is also important for scaling applications. Partitioning describes the ability to distribute a database over different servers. In Big Data applications, it is often not possible to store everything on one server (Josuttis, 2011).
The factors for partitioning illustrated in the Figure: Partitioning are described by Rys (2011). Functional partitioning corresponds to the service-oriented architecture (SOA) approach (Josuttis, 2011). With SOA, different functions are provided by their own services. In a Web shop such as Amazon, many different services are involved: some services handle the order workflow, others handle the search, and so on. If there is high load on a specific service such as the shopping cart, new instances of that service can be added on demand. This reduces the risk of an outage that would lead to losing money.
A service-oriented architecture alone does not solve all partitioning problems; the data itself also has to be partitioned. With data partitioning, the data is distributed over different servers, which can also be distributed geographically. A partition key identifies the partition to which a data item belongs. Because the data volume is large and single nodes may fail, partitioned data should also be replicated and stored redundantly across the network so that the system can deal with node failures.
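The role of the partition key and of replication can be illustrated with a minimal sketch. It assumes simple hash-based partitioning, which is only one of several possible schemes and is not prescribed by Rys (2011). The function names, the node names and the replication factor are hypothetical and chosen purely for illustration.

```python
import hashlib
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a partition key to a partition index using a stable hash.

    SHA-256 is used instead of Python's built-in hash() so that the
    mapping stays the same across processes and restarts.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def replica_nodes(key: str, nodes: List[str], replication_factor: int = 2) -> List[str]:
    """Return the nodes that should hold a copy of the item with this key.

    The first node is the primary owner; the following nodes in the list
    hold redundant copies so that a single node failure loses no data.
    """
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and len(nodes)")
    primary = partition_for(key, len(nodes))
    return [nodes[(primary + i) % len(nodes)] for i in range(replication_factor)]


if __name__ == "__main__":
    # Hypothetical node names, possibly located in different regions.
    nodes = ["node-eu-1", "node-eu-2", "node-us-1", "node-us-2"]
    for key in ["customer-17", "customer-42", "customer-99"]:
        print(key, "->", replica_nodes(key, nodes))
```

With this simple modulo scheme, adding or removing a node changes the owner of most keys. Schemes such as consistent hashing reduce that data movement, at the cost of a more involved implementation.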