I have a csv file that is tared and zipped. So I have test.tar.gz.
I would like, through text file input, read csv file.
I try this tar:gz:file://C:/test/test.tar.gz!/test.tar!
use wildcard like ".*\.csv"
.
But it sometime can't read success.
It throws Exception
org.apache.commons.vfs.FileNotFolderException:
Could not list the contents of
"tar:gz:file:///C:/test/test.tar.gz!/test.tar!/"
because it is not a folder.
I use windows8.1, pdi 5.2
Where it might be wrong?
For a compressed file csv reading, "Text File Input" step in Pentaho Kettle only supports the first files inside the compressed folder(either in Zip/GZip file). Check the Pentaho Wiki in the compression section.
Now for your issue, try removing the wildcard entry since only the first file inside the zip/gzip file will be read. (as explained above)
I have placed a sample code containing both reading zip and gzip files. Check it here.
Hope it helps :)