I want to covert a flat data file to a new file based on below requirement:
1) Change tab delimited to pile delimited "|".
2) Remove any leading and trailing SPACE on each "column".
3) Some columns are NULL, I want to keep the null. e.g., A||B. (2nd column is null.)
Example:
Original file (test.dat) has a row of tab delimited data , 7 columns in total, 2 columns after "NY" are NULL:
A New York NY Meal - Seafood Grocery Department
Please be aware there are leading/trailing space on some fields:
(" A ", "Meal - Seafood ", " Grocery Department ")
This is the final version I want in the new file:
A|New York|NY|||Meal - Seafood|Grocery Department
Can any one write a sample code or shell script that I can use in Linux to ouput a new file?
Thanks!
The posix character class [[:space:]]
and you can use *
to match zero or more spaces. The literal \t
matches a tab. Like,
$ sed "s/^[[:space:]]*//" test.dat | sed "s/[[:space:]]*\t[[:space:]]*/|/g" \
| sed "s/[[:space:]]*$//"
A|New York|NY|Meal - Seafood|Grocery Department
$ cat test.dat
A New York NY Meal - Seafood Grocery Department