kettle, pentaho-data-integration, pdi

PDI - Collecting Files From FTP Older Than N Days


I have a job that collects data from FTP using the "Get a file with FTP" entry, and I want it to collect only yesterday's files, files older than N days, or files selected by a specific date.

How can I do that? Is it possible at all?

As far as I know, "Get a file with FTP" only copies files directly from the FTP server to the destination folder. So I can't read any field and assign it to a JavaScript variable to build a condition.

My requirement is to move only yesterday's (or otherwise date-selected) files from FTP into the location I need, not all of them. There are about 30K-40K files of various sizes, so copying everything would take a long time. Below is a picture of what I have designed.

[Image: What I have created]


Solution

  • By using the 'Get File Names' step in a transformation, you can access your FTP files (via VFS) and their attributes, namely the 'lastmodifiedtime'.

    With this information you can filter by date and keep only the files that are older than N days, or apply any other condition you require. Once you have the filtered list, you can download or move just those files, or perform any other file-related action on them.
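The date condition itself is small. As a rough sketch of the logic only (plain Python, not PDI code; the function names `older_than` and `from_yesterday`, the file names, and the timestamps are all made up for illustration), filtering a listing of (filename, lastmodifiedtime) pairs could look like this:

```python
from datetime import datetime, timedelta


def older_than(files, n_days, now):
    """Keep names of files last modified at least n_days before `now`."""
    cutoff = now - timedelta(days=n_days)
    return [name for name, modified in files if modified <= cutoff]


def from_yesterday(files, now):
    """Keep names of files last modified on the calendar day before `now`."""
    yesterday = now.date() - timedelta(days=1)
    return [name for name, modified in files if modified.date() == yesterday]


# Hypothetical listing, standing in for the rows a file-listing step would produce.
listing = [
    ("report_a.csv", datetime(2024, 1, 10, 8, 0)),
    ("report_b.csv", datetime(2024, 1, 14, 23, 30)),
    ("report_c.csv", datetime(2024, 1, 15, 6, 0)),
]
now = datetime(2024, 1, 15, 12, 0)

print(older_than(listing, 1, now))   # ['report_a.csv']
print(from_yesterday(listing, now))  # ['report_b.csv']
```

Note that "older than 1 day" and "yesterday's files" are different conditions: the first compares against a rolling 24-hour cutoff, the second against the calendar date, so it is worth deciding which one the job actually needs before building the filter.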