Search code examples
vertica

How many ROS containers would be created if I am loading a table with 100 rows in a 3 node cluster


I have a 3 node cluster. There is 1 database and 1 table. I have not created a projection. If I load 100 rows in the table using copy command then:

  1. How many projections would be created? I suspect only 1 super projection, am I correct?
  2. If I am using segmentation then would that distribute data evenly (~33 rows) per node? Does that mean I have now 3 Read Optimised Storage (ROS) one per node and the projection has 3 ROSes?
  3. If I use KSafety value as 1 then a copy of each ROS (buddy) would be stored in another node? So DO I have 6 ROSes now, each containing 33 rows?

Solution

  • Well, let's play the scenario ... You will see that you get a projection and its identical buddy projection ... And you can query the catalogue to count the rows and identify the projections ..

    -- load a file with 100 random generated rows into table example;
    -- generate the rows from within Vertica, and export to file
    -- then create a new table and see what the projections look like
    CREATE TABLE rows100 AS
      SELECT
        (ARRAY['Ann','Lucy','Mary','Bob','Matt'])[RANDOMINT(5)] AS fname,
        (ARRAY['Lee','Ross','Smith','Davis'])[RANDOMINT(4)] AS lname,
        '2001-01-01'::DATE + RANDOMINT(365*10) AS hdate,
        (10000 + RANDOM()*9000)::NUMERIC(7,2) AS salary
      FROM ( 
        SELECT tm FROM ( 
            SELECT now() + INTERVAL '  1 second'  AS t UNION ALL
            SELECT now() + INTERVAL '100 seconds' AS t   -- Creates 100 rows
        ) x TIMESERIES tm AS '1 second' OVER(ORDER BY t)
       ) y
    ;
    -- set field separator to vertical bar (the default, actually...)
    \pset fieldsep '|'
    -- toggle to tuples only .. no column names and no row count
    \tuples_only
    -- spool to example.bsv - in bar-separated-value format
    \o example.bsv
    SELECT * FROM rows100;
    -- spool to file off - closes output file
    \o
    
    -- create a table without bothering with projections matching the test data
    DROP TABLE IF EXISTS example;
    CREATE TABLE example LIKE rows100;
    
    -- load the new table ...
    COPY example FROM LOCAL 'example.bsv';
    
    -- check the nodes ..
    SELECT node_name FROM nodes;
    -- out    node_name    
    -- out ----------------
    -- out  v_sbx_node0001
    -- out  v_sbx_node0002
    -- out  v_sbx_node0003
    
    
    SELECT
      node_name
    , projection_schema
    , anchor_table_name
    , projection_name
    , row_count
    FROM v_monitor.projection_storage
    WHERE anchor_table_name='example'
    ORDER BY projection_name, node_name
    ;
    -- out    node_name    | projection_schema | anchor_table_name | projection_name | row_count 
    -- out ----------------+-------------------+-------------------+-----------------+-----------
    -- out  v_sbx_node0001 | public            | example           | example_b0      |        38
    -- out  v_sbx_node0002 | public            | example           | example_b0      |        32
    -- out  v_sbx_node0003 | public            | example           | example_b0      |        30
    -- out  v_sbx_node0001 | public            | example           | example_b1      |        30
    -- out  v_sbx_node0002 | public            | example           | example_b1      |        38
    -- out  v_sbx_node0003 | public            | example           | example_b1      |        32