performance database-design singlestore

How can this data skew be fixed in SingleStore?


We are using SingleStore and have a columnstore table in which we have noticed data skew. As a result, we are experiencing performance problems.

+----------+----------+------------+-------------+
| avg_rows | row_skew | avg_memory | memory_skew |
+----------+----------+------------+-------------+
|  3748574 |  780.300 |          0 |        NULL |
+----------+----------+------------+-------------+

How can this data skew be fixed? The table has 24 columns overall, with 3 shard keys and 7 unique keys. We are seeing data skew on a few other tables as well, but this table has the highest skew.


Solution

  • Data skew in SingleStore can be fixed by changing the shard key. A shard key can be defined on a single column or on multiple columns. The fewer rows that share the same shard key value, i.e. the higher the cardinality of the shard key, the more evenly the data is distributed across the partitions, which avoids data skew in your database.

    More information in the SingleStore documentation: https://docs.singlestore.com/managed-service/en/create-a-database/physical-database-schema-design/procedures-for-physical-database-schema-design/optimizing-table-data-structures.html#choosing-a-shard-key-654461

    For example, a shard key on an auto-increment column or a unique column ensures high cardinality, because every row has a distinct shard key value. In that case you should see little to no data skew.
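
To illustrate why the number of distinct shard key values matters, here is a minimal Python sketch that simulates hash-based sharding. It is not SingleStore code: the MD5-based hash, the partition count of 8, the row count, and the helper names (`partition_for`, `rows_per_partition`, `row_skew`) are all assumptions made for illustration, and the skew metric (standard deviation divided by mean, as a percentage) is only assumed to resemble the `row_skew` figure in the question.

```python
import hashlib
import statistics

NUM_PARTITIONS = 8    # hypothetical partition count, not taken from the question
TOTAL_ROWS = 100_000  # hypothetical table size


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a shard key value to a partition using a stable hash.

    MD5 is only a stand-in here; it is not the hash SingleStore uses.
    """
    digest = hashlib.md5(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def rows_per_partition(keys, num_partitions=NUM_PARTITIONS):
    """Count how many rows land on each partition."""
    counts = [0] * num_partitions
    for key in keys:
        counts[partition_for(key, num_partitions)] += 1
    return counts


def row_skew(counts):
    """Population standard deviation over mean, as a percentage (assumed metric)."""
    return statistics.pstdev(counts) / statistics.mean(counts) * 100


# Shard key with only 3 distinct values: all rows pile onto at most 3 partitions.
low_card_keys = [i % 3 for i in range(TOTAL_ROWS)]

# Shard key that is unique per row (like an auto-increment id).
unique_keys = list(range(TOTAL_ROWS))

low_counts = rows_per_partition(low_card_keys)
unique_counts = rows_per_partition(unique_keys)

print("few distinct values:", low_counts, f"skew={row_skew(low_counts):.1f}%")
print("unique values:      ", unique_counts, f"skew={row_skew(unique_counts):.1f}%")
```

With only three distinct key values, most partitions stay empty and the skew figure is large, while the unique key spreads rows almost evenly. As far as I know, SingleStore does not let you alter the shard key of an existing table in place, so applying a new shard key usually means creating a new table with the desired shard key and copying the rows over with `INSERT ... SELECT`.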