Hadoop Tutorial – Apache Accumulo

Apache Accumulo is another NoSQL database in the Hadoop stack. Accumulo is based on Google’s Bigtable design and is a sorted, distributed key/value store.

Key/value stores do not fundamentally operate on rows, but it is still possible to query them, often at the cost of a performance trade-off. Accumulo allows us to query large rows that typically wouldn’t fit into memory.
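To make the idea of querying a sorted key/value store a bit more concrete, here is a minimal sketch in plain Java. It does not use the Accumulo API; the class SortedStoreSketch, its methods and the example keys are hypothetical and only illustrate how keeping keys sorted turns a query into a contiguous range scan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a sorted key/value store (not the Accumulo API).
class SortedStoreSketch {
    // TreeMap keeps its entries ordered by key at all times.
    private final TreeMap<String, String> entries = new TreeMap<>();

    void put(String key, String value) {
        entries.put(key, value);
    }

    // Returns the values of all keys in [startInclusive, endExclusive), in key order.
    List<String> scan(String startInclusive, String endExclusive) {
        return new ArrayList<>(entries.subMap(startInclusive, endExclusive).values());
    }

    public static void main(String[] args) {
        SortedStoreSketch store = new SortedStoreSketch();
        store.put("user42:name", "Ada");
        store.put("user07:name", "Bob");
        store.put("user42:email", "ada@example.org");

        // ';' is the character directly after ':', so this range covers
        // exactly the keys that start with "user42:".
        System.out.println(store.scan("user42:", "user42;"));
    }
}
```

Because the entries are sorted, everything that belongs to "user42" sits next to each other and can be read in one pass, without touching the other keys.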

Accumulo is also built for high availability, scalability and fault tolerance. Of the ACID properties, Accumulo supports “Isolation”. This basically means that data inserted after a query was sent does not show up in the results of that query.

Accumulo is built with a plug-in-based architecture and provides a comprehensive API. With Accumulo, it is possible to execute MapReduce jobs as well as bulk and batch operations.

The following figure outlines how a key/value pair is structured in Accumulo. The key consists of the row ID, a column specifier and a timestamp. The column specifier in turn consists of the column family, the column qualifier and the visibility.

[Figure: Apache Accumulo]
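The key layout described above can also be sketched in plain Java. The class KeySketch below is a hypothetical model, not Accumulo's own Key class. It assumes the ordering Accumulo uses for its keys: row ID, column family, column qualifier and visibility ascending, and the timestamp descending so that the newest version of a cell comes first. Accumulo compares raw bytes, while this sketch simply compares strings, which behaves the same for plain ASCII values.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Hypothetical model of the key layout (not Accumulo's Key class).
final class KeySketch {
    final String row;
    final String family;
    final String qualifier;
    final String visibility;
    final long timestamp;

    KeySketch(String row, String family, String qualifier, String visibility, long timestamp) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.visibility = visibility;
        this.timestamp = timestamp;
    }

    // Row, family, qualifier and visibility ascending; timestamp descending.
    static final Comparator<KeySketch> ORDER = Comparator
            .comparing((KeySketch k) -> k.row)
            .thenComparing(k -> k.family)
            .thenComparing(k -> k.qualifier)
            .thenComparing(k -> k.visibility)
            .thenComparing(Comparator.comparingLong((KeySketch k) -> k.timestamp).reversed());

    public static void main(String[] args) {
        TreeMap<KeySketch, String> table = new TreeMap<>(ORDER);
        table.put(new KeySketch("row1", "columnFamily", "columnQualifier", "public", 100L), "old text");
        table.put(new KeySketch("row1", "columnFamily", "columnQualifier", "public", 200L), "new text");
        table.put(new KeySketch("row0", "columnFamily", "columnQualifier", "public", 50L), "other row");

        // Values come back in key order: by row first, newest timestamp first within a cell.
        System.out.println(table.values());
    }
}
```

Two entries that differ only in their timestamp are treated as two versions of the same cell, and the sort order puts the most recent one in front.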

The next sample shows what Accumulo code looks like. It builds a mutation that holds a piece of text for a single cell; to actually store it, the mutation still has to be passed to a writer for the target table.

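import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.ColumnVisibility;
import org.apache.hadoop.io.Text;

// Row ID and column specifier (family, qualifier, visibility) of the cell
Text uid = new Text("columid");
Text family = new Text("columnFamily");
Text qualifier = new Text("columnQualifier");
ColumnVisibility visibility = new ColumnVisibility("public");

// Timestamp of this version of the cell
long timestamp = System.currentTimeMillis();

// The value is stored as raw bytes
Value value = new Value("Here is my text".getBytes());

// A mutation collects the changes for one row
Mutation mutation = new Mutation(uid);
mutation.put(family, qualifier, visibility, timestamp, value);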



Published by

Mario Meir-Huber

I work as a Big Data Architect for Microsoft. In this role, I support my customers in applying Big Data technologies, mainly Hadoop and Spark, to their use cases. I also teach this topic at various universities and frequently speak at conferences. In 2010 I wrote a book about Cloud Computing, which is often used at German and Austrian universities. In my home country (Austria) I am part of several organisations on Big Data.
