I tried to insert 100 values into a column test_col in column family test_cf under the row key test-123. The insert appears to succeed: all 100 values are sent to Bigtable without error. However, the number of values stored in the test_col column of test_cf is less than 100, and which values survive seems random. The code I wrote is below.
from datetime import datetime

rows = []
values = ['123-123124-325324', '543-123-45324-123123', '292-123124-54324-234', '292-213123-123123123-3213']
# ... 100 values in the list
row_key = 'test-123'.encode()
row = table.direct_row(row_key)
for val in values:
    row.set_cell("test_cf",
                 "test_col".encode('utf-8'),
                 val,
                 datetime.utcnow())
    rows.append(row)
rtn = table.mutate_rows(rows)
for i, status in enumerate(rtn):
    if status.code != 0:
        print('ERROR')
And the weird thing is that the response status is always 0 for all 100 mutations.
This code is not doing what you intend. Each set_cell call writes to the same row ("test-123"), the same column family ("test_cf"), and the same column qualifier ("test_col"). The value differs each time, but the timestamp attached to each value is the current time, which can be identical across multiple set_cell calls. Because a single cell in Bigtable is indexed by the (row, family, column, timestamp) tuple, this code can overwrite data it wrote earlier in the loop.

So it is entirely possible that the first three set_cell calls look like this:
row: "test-123"
family: "test_cf"
column: "test_col"
value: "123-123124-325324"
timestamp: t0

row: "test-123"
family: "test_cf"
column: "test_col"
value: "543-123-45324-123123"
timestamp: t0

row: "test-123"
family: "test_cf"
column: "test_col"
value: "292-123124-54324-234"
timestamp: t1
In this case the second entry overwrites the first, because its (row, family, column, timestamp) tuple is identical to the first's.
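You can see the effect with a plain-Python sketch (no Bigtable connection needed) that models cells as a dict keyed by the (row, family, column, timestamp) tuple; the timestamps and values here are illustrative:

```python
# Model Bigtable's cell storage: one value per (row, family, column, timestamp).
cells = {}

t0 = "2024-01-01T00:00:00.000"  # two writes happen to share this timestamp
t1 = "2024-01-01T00:00:00.001"

writes = [
    ("test-123", "test_cf", "test_col", t0, "123-123124-325324"),
    ("test-123", "test_cf", "test_col", t0, "543-123-45324-123123"),  # same key: overwrites the first
    ("test-123", "test_cf", "test_col", t1, "292-123124-54324-234"),
]

for row, family, column, ts, value in writes:
    cells[(row, family, column, ts)] = value

print(len(cells))  # 2 cells survive from 3 writes
```

Every write reports success, yet only two distinct cells remain, which matches the behavior you observed.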
The status code is 0 for successful mutations, so that part is working as expected: each write did succeed, it just replaced an earlier one.
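If you want all 100 values to survive as separate cells, one approach (a sketch, not the only option) is to give every write a distinct timestamp, for example by offsetting each one by its index; another is to use a distinct column qualifier per value (e.g. test_col plus a suffix). Note that Bigtable stores timestamps with millisecond granularity, so the offsets below use whole milliseconds. The `table` object is assumed from your existing code:

```python
from datetime import datetime, timedelta

values = ['123-123124-325324', '543-123-45324-123123']  # ... your 100 values

# Offset each cell's timestamp by its index so the
# (row, family, column, timestamp) tuples no longer collide.
base = datetime.utcnow()
timestamps = [base + timedelta(milliseconds=i) for i in range(len(values))]
assert len(set(timestamps)) == len(values)  # every timestamp is distinct

# Then, in your loop (requires your existing `table` object):
# row = table.direct_row('test-123'.encode())
# for val, ts in zip(values, timestamps):
#     row.set_cell("test_cf", "test_col".encode('utf-8'), val, ts)
```

With distinct timestamps, each value lands in its own cell and a read of the column returns all versions (up to the family's garbage-collection policy, which can limit how many versions are kept).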