google-cloud-data-fusion cdap

Using compressed files with Data Fusion


Is there a way to use compressed files with Cloud Data Fusion? I used Google Cloud Storage as the source and placed a gzip file in the expected location.

In the Wrangler transform, I don't see a preview. When I try to select the file using select Data, the compressed file is not highlighted. The same steps work fine with an uncompressed file.

Should I apply some transform before I wrangle? Is there a way to read a compressed file directly and preview the data? In Dataprep, the transform identifies files based on their extension; in Data Fusion, however, there seems to be no such option.

I was using the Basic edition of Data Fusion. Would the Enterprise edition help?


Solution

  • Wrangler expects files to be uncompressed and does not yet support reading compressed files. I have opened an enhancement request for this: https://issues.cask.co/browse/CDAP-16140

    Thanks, Sree
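
Since Wrangler needs uncompressed input, one possible workaround (not part of the answer above) is to decompress the gzip file before placing it in the bucket. The following is a minimal sketch using only the Python standard library. It assumes the file is available locally, and `decompress_gzip` is a hypothetical helper name; uploading the result to Cloud Storage is left out.

```python
import gzip
import shutil
from pathlib import Path
from typing import Optional


def decompress_gzip(src: str, dst: Optional[str] = None) -> Path:
    """Decompress a gzip file so Wrangler can read it as a plain file.

    Hypothetical helper: if dst is omitted, a trailing ".gz" is dropped
    from the source name (data.csv.gz becomes data.csv).
    """
    src_path = Path(src)
    if dst is not None:
        dst_path = Path(dst)
    elif src_path.suffix == ".gz":
        dst_path = src_path.with_suffix("")
    else:
        # No ".gz" suffix to strip, so avoid overwriting the source file.
        dst_path = src_path.with_name(src_path.name + ".out")

    # Stream the data in chunks so large files are not loaded into memory.
    with gzip.open(src_path, "rb") as fin, open(dst_path, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dst_path
```

For example, `decompress_gzip("data.csv.gz")` would write `data.csv` next to the original; that uncompressed file can then be uploaded to the bucket and selected in Wrangler as usual.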