Multiple rows on Google Bigtable


I am having an issue when adding a new row to Bigtable: instead of one row, it adds three. Ideally I want only one copy.

This is the code I am using:

    use Google\Cloud\Bigtable\BigtableClient;
    use Google\Cloud\Bigtable\DataUtil;
    use Google\Cloud\Bigtable\Mutations;

    $bigtable = new BigtableClient();

    $table = $bigtable->table('claster', 'configuration');
    $column_family_id = 'campaign';
    $column_id = 'dsakjhasdkjhasdkj';

    $mutations = (new Mutations())->upsert($column_family_id, "hahahaha", "campaign123");

    $v = $table->mutateRow("campaign1854", $mutations);

    printf('Successfully wrote row.' . PHP_EOL);
    echo '<pre>';
    print_r($v);
    echo '</pre>';

What I get in return is this:

Array
(
    [campaign] => Array
        (
            [hahahaha] => Array
                (
                    [0] => Array
                        (
                            [value] => campaign123
                            [labels] => 
                            [timeStamp] => 1586350256130000
                        )

                    [1] => Array
                        (
                            [value] => campaign123
                            [labels] => 
                            [timeStamp] => 1586350254707000
                        )

                    [2] => Array
                        (
                            [value] => campaign123
                            [labels] => 
                            [timeStamp] => 1586350253750000
                        )

                )

        )

)

In addition, every time I try to read the key, it adds another copy of the value. Here is the code I use to read:

    $bigtable = new BigtableClient();
    $table = $bigtable->table('claster', 'configuration');
    $data = $table->readRow('campaign1854');

    echo '<pre>';
    print_r($data);
    echo '</pre>';

I am getting this response, with an additional copy:

Array
(
    [campaign] => Array
        (
            [hahahaha] => Array
                (
                    [0] => Array
                        (
                            [value] => campaign123
                            [labels] => 
                            [timeStamp] => 1586350256130000
                        )

                    [1] => Array
                        (
                            [value] => campaign123
                            [labels] => 
                            [timeStamp] => 1586350254707000
                        )

                    [2] => Array
                        (
                            [value] => campaign123
                            [labels] => 
                            [timeStamp] => 1586350253750000
                        )

                    [3] => Array
                        (
                            [value] => campaign123
                            [labels] => 
                            [timeStamp] => 1586350252676000
                        )

                )

        )

)

Solution

  • Every row in Bigtable is made up of cells, and each cell holds a set of timestamped values. We call those values cell versions.

    Each time you run that script, it writes another value at the current timestamp, which is why your cell ends up with multiple versions. A single run of the code you provided writes only one version; running it several times adds one version per run.

    I'm not sure why reading would cause more versions to be written. Reads do not write anything, so most likely your write code is also being run by accident whenever your read code runs.
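
    The versioning behaviour described above can be sketched with a toy in-memory model. This is plain PHP, not the Bigtable client: writeCell and applyMaxVersions are hypothetical helpers that only imitate how a cell collects one version per distinct timestamp and how a maxversions=1 policy would trim it.

```php
<?php
// Toy in-memory model of a single Bigtable cell (NOT the client library).
// A cell is modelled as timestamp (microseconds) => value;
// every distinct timestamp is one cell version.

/** Write a value at a timestamp; a version with the same timestamp is replaced. */
function writeCell(array $cell, string $value, int $timestampMicros): array
{
    $cell[$timestampMicros] = $value;
    krsort($cell); // newest version first, as in the output shown in the question
    return $cell;
}

/** Hypothetical stand-in for a maxversions=N garbage-collection policy. */
function applyMaxVersions(array $cell, int $maxVersions): array
{
    return array_slice($cell, 0, $maxVersions, true);
}

$cell = [];

// Three runs of the write script, each at a different "current" time.
foreach ([1586350253750000, 1586350254707000, 1586350256130000] as $ts) {
    $cell = writeCell($cell, 'campaign123', $ts);
}
echo count($cell), " versions after three runs\n";

// Writing again at an existing timestamp replaces that version.
$cell = writeCell($cell, 'campaign123', 1586350256130000);
echo count($cell), " versions after a repeat write at the same timestamp\n";

// Keep only the newest version, as a maxversions=1 policy eventually would.
$cell = applyMaxVersions($cell, 1);
echo count($cell), " version after maxversions=1\n";
```

    In this model the same value written at three different times yields three versions, which matches what the question's output shows.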

    There are two ways to deal with this:

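    1. Allow each cell to keep only one version through garbage collection. Garbage collection runs in the background, so older versions may still show up in reads for a while after the policy is set. You can use the cbt tool to set a maximum of one version per cell on a column family, like so (createfamily is only needed if the family does not exist yet):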

      cbt createfamily your-table cf2
      cbt setgcpolicy your-table cf2 maxversions=1
      
    2. When you are reading from Bigtable, you can apply a filter to only read the latest version of your cell like so:

      use Google\Cloud\Bigtable\Filter;

      // Return at most one cell (the most recent) per column.
      $filter = Filter::limit()->cellsPerColumn(1);
      $rows = $table->readRows([
          'filter' => $filter
      ]);
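
    If a row has already been fetched without a filter, the same trimming can be approximated on the client. The sketch below is plain PHP and assumes the array shape from the print_r output in the question (family => qualifier => list of versions). latestVersionsOnly is a hypothetical helper, not part of the Bigtable client, and the server-side filter is usually the better choice because it avoids transferring the extra versions.

```php
<?php
/**
 * Keep only the newest version of every column in a row array shaped like
 * the print_r output in the question: family => qualifier => list of versions.
 * Hypothetical helper, not part of the Bigtable client.
 */
function latestVersionsOnly(array $row): array
{
    $result = [];
    foreach ($row as $family => $columns) {
        foreach ($columns as $qualifier => $versions) {
            if ($versions === []) {
                continue; // nothing to keep for an empty column
            }
            // Sort newest first by timeStamp instead of trusting input order.
            usort($versions, fn (array $a, array $b) => $b['timeStamp'] <=> $a['timeStamp']);
            $result[$family][$qualifier] = [$versions[0]];
        }
    }
    return $result;
}

// Sample row copied from the question's output.
$row = [
    'campaign' => [
        'hahahaha' => [
            ['value' => 'campaign123', 'labels' => '', 'timeStamp' => 1586350256130000],
            ['value' => 'campaign123', 'labels' => '', 'timeStamp' => 1586350254707000],
            ['value' => 'campaign123', 'labels' => '', 'timeStamp' => 1586350253750000],
        ],
    ],
];

$latest = latestVersionsOnly($row);
print_r($latest);
```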