apache-drill

How to generate a psv file from apache drill


My current way of creating a pipe-separated value (PSV) file is to first create a view with a query like

Create view ABC as select column 1 || '|' || column 2 || '|' || ..

and then use the sqlline !record command to capture the output of select * from ABC.

This takes a lot of development time and is error-prone, as the files I need to generate have hundreds of columns.

Is there a simpler way of going about this?


Solution

  • In your storage plugin, create a custom format. Here is the documentation: https://drill.apache.org/docs/plugin-configuration-basics/

    "formats": {
      "psv": {
        "type": "text",
        "extensions": [
          "tbl"
        ],
        "delimiter": "|"
      }
    }
    

    Alter your session to set the default output format:

    alter session set `store.format`='psv';
    

    Use CTAS (CREATE TABLE AS) to write the data in the format specified above:

    create table `users.vgunnu`.`vt_del_test` as select * from dfs.root.`/tmp/test_parquet` limit 3;
    

    More info on the CTAS command: http://drill.apache.org/docs/create-table-as-ctas-command/
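
    Putting the steps above together, a session might look like the sketch below. It assumes that dfs.tmp is a writable workspace in your dfs storage plugin and that the psv format is defined there as shown above; the table name psv_out is hypothetical, so substitute your own workspace, table, and source path.

```sql
-- Assumption: the "psv" format above is configured in the dfs storage plugin
-- and dfs.tmp is a writable workspace.
ALTER SESSION SET `store.format` = 'psv';

-- psv_out is a hypothetical table name. CTAS creates a directory with this
-- name under the workspace location and writes pipe-delimited files into it.
CREATE TABLE dfs.tmp.`psv_out` AS
SELECT * FROM dfs.root.`/tmp/test_parquet`;

-- Optional: restore Drill's default output format for later CTAS statements.
ALTER SESSION SET `store.format` = 'parquet';
```

    Because the query is a plain select *, no per-column concatenation is needed, which is the point of this approach when the source has hundreds of columns.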