
How to set Vora Table partition size?


I have set the 'partitionSize' option to several different values, but I get the same number of partitions regardless of the value. According to the documentation, the value should correspond to the HDFS block size. Is there something I am missing?

HDFS block size 64M

CREATE TABLE TABLE_TEST (DEFINITION_INFO) USING com.sap.spark.vora OPTIONS ( tablename "TABLE_TEST", partitionSize "64", paths "/load_from_here/combined.csv", eagerLoad "true" )

The CSV file is about 680M.
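To make the expectation concrete, here is a rough sketch of the arithmetic, assuming 'partitionSize' were interpreted in megabytes and produced one partition per chunk of that size. Both points are assumptions made for illustration, not confirmed Vora behavior, and the helper name `expected_partitions` is hypothetical.

```python
import math


def expected_partitions(file_size_mb: float, partition_size_mb: float) -> int:
    """Number of chunks needed to cover a file, one partition per chunk.

    Assumption: partition size is given in megabytes and each partition
    holds at most that much data. This is not taken from Vora itself.
    """
    if partition_size_mb <= 0:
        raise ValueError("partition size must be positive")
    return math.ceil(file_size_mb / partition_size_mb)


# Values from the question: a CSV of about 680M and a 64M block size.
print(expected_partitions(680, 64))  # 11
```

Under that reading, changing 'partitionSize' should change the partition count, which is not what is observed.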


Solution

  • The name of the parameter is a bit misleading. It does not control how a table is partitioned; it only influences load performance when data is loaded into the table. That is why the number of partitions stays the same whatever value you set. In newer versions the parameter might be renamed to avoid this confusion.