I receive a daily dump of files from a data provider. Occasionally some of the files are empty (20 bytes). Is there a way to automatically skip these files instead of processing them?
I have tried:
USING Extractors.Csv(skipFirstNRows:1, silent:true);
But I still get a vertex failure, which I believe is caused by the empty files.
We recently added FILE.LENGTH() as a file property that is exposed as a computed virtual column, so you can use it to filter out files by size.
For example, the following should operate only on files that are larger than 20 bytes:
@data =
    EXTRACT
        // ... columns to extract
        , file_sz = FILE.LENGTH()
    FROM "/mydata/{*}"
    USING Extractors.Csv();

@res =
    SELECT *
    FROM @data
    WHERE file_sz > 20;