
Count of rows inserted into a table in Accumulo


I have inserted some rows into a table in Accumulo. Some of the rows are newly created and some are updates to existing rows.

How can I find, in Java, the number of rows that were inserted into or updated in an Accumulo table?

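import org.apache.accumulo.core.data.Mutation
import org.apache.accumulo.core.security.ColumnVisibility

// bw is an already-open BatchWriter; count is a counter declared earlier.
def obj = jsonObject["obj"]
for (entry in obj) {
    String a = entry["a"];  // row ID
    String b = entry["b"];  // column family
    String c = entry["c"];  // column qualifier
    String d = entry["d"];  // visibility expression
    String e = entry["e"];  // value

    ColumnVisibility cv = new ColumnVisibility(d);
    Mutation m = new Mutation(a);
    m.put(b, c, cv, e);
    bw.addMutation(m);
    count++;  // incremented for every mutation, whether the row is new or not
}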

This is what I am currently doing, and count is treated as the number of entries written to the table. But if only some of the entries create new rows and the rest update existing rows, count cannot be taken as the number of new rows inserted into the table.


Solution

  • As of Accumulo 1.6.x (the latest release as of this post), there is no public API for getting the count of either rows or individual entries in a table. Maintaining exact counts as a built-in feature would add overhead and would be very difficult to implement, because server-side iterators can change these counts during compactions.

    So the best Accumulo provides is an estimate of the number of entries in a table (entries only, not rows).

    If row counting is needed, it must be implemented at the application layer. The user mailing list may be able to help with this.
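One way to do the counting at the application layer is to classify each row ID as new or existing before its mutation is handed to the BatchWriter. The following is only a minimal sketch in plain Java, not an Accumulo feature: the class `RowWriteTally` and its `rowExists` predicate are hypothetical names. In a real application the predicate would presumably be backed by a lookup against the table (for example a Scanner restricted to that single row), which costs one extra read per distinct row and is not safe against concurrent writers.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.Predicate;

/**
 * Hypothetical helper: tallies how many distinct rows in a batch are new
 * and how many already existed. It does not talk to Accumulo itself.
 */
final class RowWriteTally {
    // Assumption: answers "is this row already in the table?".
    // In real code this would wrap a lookup against the table.
    private final Predicate<String> rowExists;
    private final Set<String> seenInBatch = new HashSet<>();
    private int inserted;
    private int updated;

    RowWriteTally(Predicate<String> rowExists) {
        this.rowExists = rowExists;
    }

    /** Call once per mutation, before adding it to the BatchWriter. */
    void record(String rowId) {
        if (!seenInBatch.add(rowId)) {
            return; // row already classified earlier in this batch
        }
        if (rowExists.test(rowId)) {
            updated++;
        } else {
            inserted++;
        }
    }

    int inserted() { return inserted; }

    int updated() { return updated; }

    public static void main(String[] args) {
        // Stand-in for the table contents before the batch is written.
        Set<String> existing = new HashSet<>(Arrays.asList("row1", "row2"));
        RowWriteTally tally = new RowWriteTally(existing::contains);

        for (String row : Arrays.asList("row1", "row3", "row1", "row4")) {
            tally.record(row);
        }
        // prints "inserted=2 updated=1"
        System.out.println("inserted=" + tally.inserted() + " updated=" + tally.updated());
    }
}
```

In the loop from the question, `tally.record(a)` would be called next to `bw.addMutation(m)`, and `tally.inserted()` would then replace `count` as the number of new rows. The counts are only as reliable as the existence check behind the predicate.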