I have a very large gzipped CSV file (around 500 GB) that I need to import into the datastore using the bulk load tool. Is it possible to do this without unzipping it first? If so, how do I have to configure my bulkloader.yaml file? Something like this?
transformers:
- kind: Client
  connector: csv
  connector_options:
    encoding: zip?
What about using a named pipe? First create the pipe and start decompressing into it:
mkfifo --mode=0666 /tmp/namedPipe
gzip --stdout -d file.gz > /tmp/namedPipe
The gzip command will block until a reader opens the pipe. Then, in another terminal (or in the same one, if you detached the gzip command with &), run:
appcfg.py upload_data --config_file=bulkloader.yaml --filename=/tmp/namedPipe --kind=YOUR_DATA_KIND
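
If you prefer to run everything from one terminal, the same steps as a single sequence (a sketch; file.gz and YOUR_DATA_KIND are placeholders, as above):

# create the pipe, decompress into it in the background,
# then point the bulkloader at the pipe
mkfifo --mode=0666 /tmp/namedPipe
gzip --stdout -d file.gz > /tmp/namedPipe &
appcfg.py upload_data --config_file=bulkloader.yaml --filename=/tmp/namedPipe --kind=YOUR_DATA_KIND
# remove the pipe once the upload has finished
rm /tmp/namedPipe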
Example taken from http://en.wikipedia.org/wiki/Named_pipe
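
As for bulkloader.yaml itself: with this approach the bulkloader just sees an ordinary, uncompressed CSV stream coming out of the pipe, so no compression-related option should be needed. As far as I know, the csv connector's encoding option is for character encodings such as utf-8, not for compression formats. A minimal transformer section might then look like this (a sketch; MyKind, the from_header column handling, and the property names are assumptions):

transformers:
- kind: MyKind
  connector: csv
  connector_options:
    encoding: utf-8
    columns: from_header
  property_map:
    - property: name
      external_name: name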