Mule DataWeave Fixed Width File with header and footer

I am working on a project where we receive a fixed-width flat file, but the first and last lines contain information that does not fit the fixed-width pattern of the other lines. Is there a way to parse all of this correctly with DataWeave and, if possible, put the header and footer into variables so that only the contents remain in the payload?

Example File

HDMTFSBEUP00000220170209130400           MT                                                     HD07
DT01870977         FSFSS   F3749261            CR00469002017020820170225                        0000
DT01870978         FSFSS   F3749262            CR00062002017020820170125                        0000
TRMTFSBEUP00000220170209130400  000000020000002000000000000043330000000000000                   0000

I know that for CSV you can skip a line, but I don't see an equivalent option for fixed width. The header and footer always start with the same two letters (HD and TR in the example), so maybe DataWeave can filter them out?

Solution

Please refer to the DataWeave Flat File Schemas documentation, which includes several examples for processing different types of data.

In this case, I simplified your example data and applied a custom schema as follows:

Example data:

HDMTFSBEUP00000220170209130400           
DT01870977         
DT01870978         
TRMTFSBEUP00000220170209130400

Schema/Flat File Definition:

form: FLATFILE
structures:
- id: 'test'
  name: test
  tagStart: 0
  tagLength: 2
  data:
  - { idRef: 'header' }
  - { idRef: 'data', count: '>1' }
  - { idRef: 'footer' }
segments:
- id: 'header'
  name: header
  tag: 'HD'
  values:
  - { name: 'header', type: String, length: 39 }
- id: 'data'
  name: data
  tag: 'DT'
  values:
  - { name: 'code', type: String, length: 17 }
- id: 'footer'
  name: footer
  tag: 'TR'
  values:
  - { name: 'footer', type: String, length: 30 }

The schema validates the example data and identifies each line by its tag, which is the first two characters (tagStart: 0, tagLength: 2). The output is grouped accordingly:

{
  "header": {},
  "data": [{}, {}],
  "footer": {}
}

Since the expected result is only the data, just select it: payload.data.
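
The question also asks to keep the header and footer. One option is to select each group by its key, in the same way as payload.data. The following is a minimal sketch, assuming DataWeave 1.0 (Mule 3) and assuming the parsed payload has the header, data, and footer keys shown in the output above. The records key is a name chosen here for illustration only.

```
%dw 1.0
%output application/java
---
// Assumption: payload is the result of reading the file with the
// flat file schema above, so it has the keys header, data and footer.
{
  // Parsed HD line
  header: payload.header,
  // List of parsed DT lines ("records" is an illustrative name)
  records: payload.data,
  // Parsed TR line
  footer: payload.footer
}
```

In Mule 3 the Transform Message component can have more than one output target, so instead of returning a single object, payload.header and payload.footer could each be written to a flow variable while payload.data becomes the new payload.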