
using cut on a line having multiple instances of the same delimiter - unix


I am trying to write a generic script which can have different file name inputs.

This is just a small part of my bash script.

For example, let's say folder 444-55 has 2 files:

qq.filter.vcf
ee.filter.vcf

I want my output to be -

qq
ee

I tried this and it worked -

ls /data2/delivery/Stack_overflow/1111_2222_3333_23/secondary/444-55/*.filter.vcf | sort | cut -f1 -d "." | xargs -n 1 basename

But let's say I have a folder whose path itself contains a dot, like this -

/data2/delivery/Stack_overflow/de.1111_2222_3333_23/secondary/444-55/*.filter.vcf

My script's output would then be

de
de

How can I make it generic?

Thank you so much for your help.


Solution

  • Something like this in a script will "cut" it:

    for i in /data2/delivery/Stack_overflow/1111_2222_3333_23/secondary/444-55/*.filter.vcf
    do
       basename "$i" | cut -f1 -d.
    done | sort
    

    advantages:

    • it does not parse the output of ls, which is fragile (file names containing spaces or newlines break it)
    • it runs cut after basename, so cut only ever sees the file name and dots in the directory path cannot affect the result
    • it sorts last, so the output is guaranteed to be sorted by the prefix
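
If spawning basename and cut for every file is a concern, the same idea can probably be written with shell parameter expansion alone. This is only a sketch, not part of the original answer: the function name list_prefixes is made up for illustration, and it assumes the prefix you want is everything before the first dot in the file name.

```shell
#!/usr/bin/env bash
# list_prefixes is a hypothetical helper name used for illustration only.
# It prints the part of each *.filter.vcf file name before the first dot.
list_prefixes() {
    local dir="$1" f name
    for f in "$dir"/*.filter.vcf; do
        [ -e "$f" ] || continue        # skip the unexpanded pattern when nothing matches
        name=${f##*/}                  # strip the directory part, like basename
        printf '%s\n' "${name%%.*}"    # drop everything from the first dot onward
    done | sort
}
```

It would be called as list_prefixes /data2/delivery/Stack_overflow/de.1111_2222_3333_23/secondary/444-55. Because the directory part is stripped before the dot is handled, the de. in the path should not leak into the output.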