Apache Accumulo is another NoSQL database in the Hadoop stack. It is based on Google’s Bigtable design and is a sorted, distributed key/value store.
Key/value stores do not fundamentally operate on rows. Querying them by row is possible, but it often comes with a performance trade-off. Accumulo makes it possible to query large rows that typically would not fit into memory.
Accumulo is also built for high availability, scalability and fault tolerance. Of the ACID properties, Accumulo supports isolation: data that is inserted after a query was sent does not show up in the results of that query.
Accumulo has a plugin-based architecture and provides a comprehensive API. It supports MapReduce jobs as well as bulk and batch operations.
The following figure outlines the structure of a key/value pair in Accumulo. The key consists of the row ID, a column specifier and a timestamp. The column specifier in turn contains the column family, the column qualifier and the visibility.
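As a rough illustration of this key layout, the sketch below models a sorted key/value table in plain Java (16 or later, because it uses a record). It does not use the Accumulo API: CellKey, ORDER, sampleTable and scanRow are hypothetical names, and the sample data is made up. Keys are ordered by row, column family, column qualifier and visibility, with the newest timestamp first, which is the order Accumulo is generally described as using. Accumulo compares raw bytes, whereas this model compares Java strings.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

class KeyOrderSketch {

    // Simplified stand-in for an Accumulo key (not the real Key class).
    record CellKey(String row, String family, String qualifier,
                   String visibility, long timestamp) {}

    // Ascending by row, family, qualifier and visibility; newest timestamp first.
    static final Comparator<CellKey> ORDER = Comparator
            .comparing(CellKey::row)
            .thenComparing(CellKey::family)
            .thenComparing(CellKey::qualifier)
            .thenComparing(CellKey::visibility)
            .thenComparing((a, b) -> Long.compare(b.timestamp(), a.timestamp()));

    // Builds a small sorted table from made-up sample data.
    static TreeMap<CellKey, String> sampleTable() {
        TreeMap<CellKey, String> table = new TreeMap<>(ORDER);
        table.put(new CellKey("user2", "profile", "name", "public", 100L), "Bob");
        table.put(new CellKey("user1", "profile", "name", "public", 100L), "Alice");
        table.put(new CellKey("user1", "profile", "name", "public", 200L), "Alice B.");
        table.put(new CellKey("user1", "profile", "email", "private", 150L), "alice@example.org");
        return table;
    }

    // Returns all values of one row as a range scan over the sorted keys.
    static List<String> scanRow(TreeMap<CellKey, String> table, String row) {
        // Smallest possible key of this row, and of the next possible row.
        CellKey from = new CellKey(row, "", "", "", Long.MAX_VALUE);
        CellKey to = new CellKey(row + "\0", "", "", "", Long.MAX_VALUE);
        return new ArrayList<>(table.subMap(from, true, to, false).values());
    }

    public static void main(String[] args) {
        // Entries print in key order, regardless of insertion order.
        sampleTable().forEach((key, value) -> System.out.println(key + " -> " + value));
        System.out.println(scanRow(sampleTable(), "user1"));
    }
}
```

Because the table is kept sorted, all cells of a row sit next to each other, so reading one row is a contiguous range scan rather than a search across the whole table.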
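The next sample shows what Accumulo client code looks like in Java. It builds a Mutation that writes a text value to a single cell. To store the value in the database, the mutation is then passed to a BatchWriter for the target table.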
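import java.nio.charset.StandardCharsets;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.ColumnVisibility;
import org.apache.hadoop.io.Text;

// The key parts: row ID, column family, column qualifier, visibility, timestamp.
Text uid = new Text("columid");
Text family = new Text("columnFamily");
Text qualifier = new Text("columnQualifier");
ColumnVisibility visibility = new ColumnVisibility("public");
long timestamp = System.currentTimeMillis();
Value value = new Value("Here is my text".getBytes(StandardCharsets.UTF_8));
// A Mutation collects the changes for one row; here it holds a single cell.
Mutation mutation = new Mutation(uid);
mutation.put(family, qualifier, visibility, timestamp, value);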