google-app-engine bulkloader

Upload data to datastore from a gzipped CSV file?


I have a very large gzipped CSV file (around 500 GB) that I need to import into the datastore using the bulk loader tool. Is that possible without unzipping it first? If so, how do I have to configure my bulkloader.yaml file?

transformers:

- kind: Client
  connector: csv
  connector_options:
    encoding: zip?

Solution

  • What about using a named pipe?

    mkfifo --mode=0666 /tmp/namedPipe
    gzip --stdout -d file.gz > /tmp/namedPipe
    

    Then, in another terminal (or in the same one, if you detached the gzip command with &), run:

    appcfg.py upload_data --config_file=bulkloader.yaml --filename=/tmp/namedPipe --kind=YOUR_DATA_KIND 
    

    Example taken from http://en.wikipedia.org/wiki/Named_pipe
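
The two steps above can also be wrapped in one script so that the pipe is created in a private temporary directory and removed afterwards. This is only a rough sketch: `stream_gz_via_fifo` is a hypothetical helper name, not part of the App Engine SDK, and it assumes the consumer reads the file once from start to end, since a pipe cannot be rewound. Whether the bulk loader behaves that way for your configuration is something you would need to verify.

```shell
#!/bin/sh
# Sketch: decompress a .gz file into a named pipe and pass the pipe's path
# to a consumer command as its last argument.
# stream_gz_via_fifo is a hypothetical helper, not an SDK command.
stream_gz_via_fifo() {
    gz_file=$1
    shift

    # Private temp directory so the pipe is not world-accessible in /tmp.
    fifo_dir=$(mktemp -d) || return 1
    fifo="$fifo_dir/namedPipe"
    mkfifo -m 0600 "$fifo" || { rm -rf "$fifo_dir"; return 1; }

    # Writer: blocks until the consumer opens the pipe for reading.
    gzip -dc "$gz_file" > "$fifo" &
    writer_pid=$!

    # Consumer: the remaining arguments, with the pipe path appended.
    "$@" "$fifo"
    rc=$?

    # If the consumer never opened the pipe, the writer is still blocked;
    # stop it so the script does not hang, then clean up.
    kill "$writer_pid" 2>/dev/null
    wait "$writer_pid" 2>/dev/null
    rm -rf "$fifo_dir"
    return "$rc"
}
```

To use it with the command from the answer, one option is a small wrapper function (also hypothetical) that puts the pipe path into `--filename`:

    upload() { appcfg.py upload_data --config_file=bulkloader.yaml --filename="$1" --kind=YOUR_DATA_KIND; }
    stream_gz_via_fifo file.gz upload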