Tags: regex, linux, perl, rename, fastq

Batch rename *fastq.gz files using regular expression


I'm trying to get a regex to work with rename. I've tried the approaches from similar answered questions here, but couldn't get the results I wanted.

The files are named like this:

SR1_S90_L001_R1_001.fastq.gz 
SR1_S90_L001_R2_001.fastq.gz
Rinc_S96_L001_R1_001.fastq.gz 
Rinc_S96_L001_R2_001.fastq.gz

And I would like to retain only the information prior to the first underscore and the _R1_ or _R2_ tags, like this:

SR1_R1_.fastq.gz 
SR1_R2_.fastq.gz
Rinc_R1_.fastq.gz 
Rinc_R2_.fastq.gz

Solution

  • rename 's{^([^._]+)_[^.]*(_R[12]_)[^.]*}{$1$2}' *
    

    The idea is to match and capture the first part of the name (one or more characters that are neither . nor _), followed by _ and zero or more non-. characters, then _R1_ or _R2_ (captured as well), then zero or more non-. characters again.

    Together this matches everything in the filename before the first . and replaces it with the two captured substrings ($1 and $2), i.e. everything before the first _ plus the R1/R2 tag. The extension .fastq.gz is never part of the match, so it is left unchanged.
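
Before renaming anything, it can help to preview the mapping. The following is only a sketch, not part of the original answer: preview_name is a hypothetical helper that applies the same pattern through sed -E (with \1\2 in place of $1$2) and prints the resulting name without touching any file. sed's regex engine is not Perl's, but for filenames like the ones in the question the two should produce the same result.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration): print the name the
# substitution would produce for one filename, without renaming anything.
preview_name() {
    # Same pattern as the rename expression, written in sed -E syntax.
    printf '%s\n' "$1" | sed -E 's/^([^._]+)_[^.]*(_R[12]_)[^.]*/\1\2/'
}

# Show "old -> new" for every matching file in the current directory.
for f in *_R[12]_*.fastq.gz; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    printf '%s -> %s\n' "$f" "$(preview_name "$f")"
done
```

If the installed rename is the Perl one, its -n option should give a similar dry run directly, and passing *.fastq.gz instead of * limits the command to the intended files.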