
What are some good Perl modules for flow-based programming on files?



Basically I am working on taking data files, splitting them into columns, removing some rows based on column values, removing unnecessary columns, comparing them to a baseline (noting where changes have occurred), and saving the data as a CSV with the comments as metadata.

Sample file is:

001SMSL22009032020090321024936
002XXXXX20090320102436               010000337 00051     
002XXXXX20090320103525               010000333 00090     
002XXXXX20090320103525               010000333 00090     
002XXXXX20090320103525               010000333 00090     
002XXXXX20090320103525               010000333 00090     
002XXXXX20090320103525               020000333 00090     
009000000009000000000271422122

It will compare row by row with another file (the baseline), and differing rows will be highlighted (I am using Tk::DiffText).
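The fixed-width records above can be pulled apart with core Perl's `unpack`. The field widths below are guesses from the sample (3-char record type, 5-char source code, 8-char date, 6-char time, then the remaining columns as one trailing field), so treat the template as an assumption about the layout, not a specification:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record layout guessed from the sample file: each "A"
# count is a field width; "A" also strips trailing whitespace.
sub parse_record {
    my ($line) = @_;
    my ($type, $source, $date, $time, $rest) =
        unpack 'A3 A5 A8 A6 A*', $line;
    return { type => $type, source => $source,
             date => $date, time => $time, rest => $rest };
}

my $rec = parse_record(
    '002XXXXX20090320102436               010000337 00051');
print "$rec->{type} $rec->{date} $rec->{time}\n";
```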

Here is the pipeline, where each [bracketed step] is a pipe:

file -> [split] -> [remove production] -> [sort] -> [compare] -> {user jumps in and writes comments, edits file as needed} -> [save csv] -> [save comments]
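Even without a dedicated module, the stages above can be sketched in core Perl as code refs threaded through a driver; each stage takes an array ref and returns a new one. The stage bodies here are illustrative assumptions (the filter stands in for the real "remove production" rule by keeping only type-002 detail rows, and the sort key is the time field):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Thread the data through each stage in order.
sub run_pipeline {
    my ($data, @stages) = @_;
    $data = $_->($data) for @stages;
    return $data;
}

my @lines = (
    '001SMSL22009032020090321024936',
    '002XXXXX20090320103525               010000333 00090',
    '002XXXXX20090320102436               010000337 00051',
    '009000000009000000000271422122',
);

# [split]: break each fixed-width line into fields (widths assumed).
my $split  = sub { [ map { [ unpack 'A3 A5 A8 A6 A*', $_ ] } @{ $_[0] } ] };
# Stand-in filter: keep only type-002 detail rows (drop header/trailer).
my $filter = sub { [ grep { $_->[0] eq '002' } @{ $_[0] } ] };
# [sort]: order rows by the time field (index 3).
my $sort   = sub { [ sort { $a->[3] cmp $b->[3] } @{ $_[0] } ] };

my $rows = run_pipeline(\@lines, $split, $filter, $sort);
print scalar(@$rows), " rows, first time $rows->[0][3]\n";
```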

The real question is: which Perl module helps to model and build a pipeline flow like this? After more research I found flow-based programming: http://en.wikipedia.org/wiki/Flow-based_programming.


Solution

  • This is what I was looking for:

    Text::Pipe

    Text::Pipe::Stackable

    Thank you for helping me clarify my ideas!
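For reference, a minimal sketch of how these modules chain, based on the Text::Pipe distribution's documented interface (the `new` constructor, the `filter` method, and the `Trim`/`Uppercase` pipe segments shipped with it); check the current CPAN documentation before relying on the details:

```perl
use strict;
use warnings;
use Text::Pipe;             # CPAN, not core
use Text::Pipe::Stackable;

# Build two pipe segments and stack them; each stage's filter()
# output feeds the next stage in the stack.
my $trim  = Text::Pipe->new('Trim');
my $upper = Text::Pipe->new('Uppercase');
my $stack = Text::Pipe::Stackable->new($trim, $upper);

print $stack->filter('  some text  '), "\n";
```

A custom stage for something like the "remove production" step would be written as its own pipe segment and pushed onto the same stack.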