ruby regex macos terminal batch-rename

Mac OS X Rename files in Batch (2 patterns)


I would like to match either the \d\dQ\d pattern or the \dQ\d\d pattern and prepend a reformatted version of it to the file name. Using the OS X Terminal or Ruby, what would be the best way to do this? It is a little more complicated than usual because a file name can contain pattern 1 or pattern 2.

In the first file, pattern "3Q17" for example is a date that means the 3rd quarter of year 2017. I need to add 2017Q3 in front of the file name. However, I have another pattern "05Q1" which means the 1st quarter of 2005, so I would need to add 2005Q1.

Files:

ABC-EDFGH-JLG-Sample-3Q17-fIS.pdf 
                     ^^^^
2Q13 ABC MF PM Example_fIS.pdf
^^^^
03Q1_FIS.pdf 
^^^^
05Q1 ABC_IS.pdf
^^^^

Files renamed:

2017Q3_ABC-EDFGH-JLG-Sample-3Q17-fIS.pdf
2013Q2_2Q13 ABC MF PM Example_fIS.pdf
2003Q1_03Q1_FIS.pdf
2005Q1_05Q1 ABC_IS.pdf

This question helps with a single pattern, but not with two patterns:

mac os x terminal batch rename

My ruby code works fine:

Dir.glob("*.pdf").each do |orig_name|
    new_name = orig_name.gsub(/^.*?(?:(\d\d)Q(\d)|(\d)Q(\d\d))/, '20\1\4Q\2\3_\0')
    File.rename(orig_name, new_name)
end

However, my rename command in the terminal (rename installed with brew install rename) produces the wrong results.

rename -n -e 's/^.*?(?:(\d\d)Q(\d)|(\d)Q(\d\d))/'20\1\4Q\2\3_\0'/' *.pdf
'03Q1_FIS.pdf' would be renamed to '2014Q23_0_FIS.pdf'
'03Q1_vIS.pdf' would be renamed to '2014Q23_0_vIS.pdf'
'05Q1 ABC_IS.pdf' would be renamed to '2014Q23_0 ABC_IS.pdf'
'2Q13 ABC MF PM Example_fIS.pdf' would be renamed to '2014Q23_0 ABC MF PM Example_fIS.pdf'

Solution

  • Update

    rename uses Perl syntax for regular expressions. Perl expects every back-reference used in the replacement to refer to a group that actually captured something; otherwise that warning is displayed. Perl regular expressions offer a workaround for this: a branch reset construct (?|(...)|(...)|...) combined with a positive lookbehind:

    ^.*?(?|(\d\d)Q(\d)|Q(\d\d)(?<=(\d)...))
    

    Replace with 20$1Q$2_$& (in Perl, $& holds the entire match, which is the role \0 plays in the Ruby replacement string).

    Regex breakdown:

    • ^ Match start of input string
    • .*? Match everything (lazily) up to...
    • (?| Start of a branch reset construct
      • (\d\d) Capture two digits in group 1
      • Q Match a Q
      • (\d) Capture a digit in group 2
      • | Or
      • Q Match a Q
      • (\d\d) Capture two digits in group 1
      • (?<=(\d)...) Look back four characters and capture the digit preceding Q in group 2
    • ) End of construct

    This way both capturing groups are set whichever branch matches: the first group always holds the two-digit year and the second the one-digit quarter, so we don't need to deal with four different capturing groups.
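
For comparison, the same "one year group, one quarter group" idea can be sketched in Ruby without branch reset, by taking whichever numbered group took part in the match. The names QUARTER_RE and quarter_prefix are hypothetical and introduced only for illustration; this is a minimal sketch, not code from the answer.

```ruby
# Hypothetical sketch: build the "YYYYQn_" prefix from either pattern.
# Groups 1/2 belong to \d\dQ\d, groups 3/4 to \dQ\d\d.
QUARTER_RE = /(?:(\d\d)Q(\d)|(\d)Q(\d\d))/

def quarter_prefix(name)
  m = QUARTER_RE.match(name)
  return nil unless m            # no quarter tag in this name

  year    = m[1] || m[4]         # the two-digit year from whichever branch matched
  quarter = m[2] || m[3]         # the one-digit quarter from the same branch
  "20#{year}Q#{quarter}_"
end

puts quarter_prefix("2Q13 ABC MF PM Example_fIS.pdf")  # 2013Q2_
puts quarter_prefix("03Q1_FIS.pdf")                    # 2003Q1_
```

Groups from the branch that did not match are nil in the MatchData, which is why `||` is enough to pick the right value.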


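    You could match both patterns in a single alternation, putting the pattern that starts with more digits first, and then use one replacement string that back-references all four capturing groups in order. In Ruby, the groups from the branch that did not match expand to empty strings, so only the matching pair contributes. Here is the regex: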

    ^.*?(?:(\d\d)Q(\d)|(\d)Q(\d\d))
    

    and replacement string:

    20\1\4Q\2\3_\0
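
To see why interleaving \1\4 and \2\3 is safe in Ruby, the following sketch substitutes a bracketed version of each group so the empty ones become visible. GROUPS_RE and show_groups are hypothetical names used only for this illustration.

```ruby
# Same regex as above; the bracketed replacement is only for inspection.
GROUPS_RE = /^.*?(?:(\d\d)Q(\d)|(\d)Q(\d\d))/

# Each bracket pair shows what one numbered group contributes.
def show_groups(name)
  name.sub(GROUPS_RE, '[\1][\2][\3][\4]')
end

puts show_groups("03Q1")  # [03][1][][]
puts show_groups("3Q17")  # [][][3][17]
```

Only one pair of brackets is ever filled, so 20\1\4Q\2\3 always yields the year followed by the quarter.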
    

    Ruby:

    str.gsub(/^.*?(?:(\d\d)Q(\d)|(\d)Q(\d\d))/, '20\1\4Q\2\3_\0')
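
Before renaming anything, it may help to run the substitution over the sample names from the question as a dry run. The sketch below assumes the four file names listed in the question; RENAME_RE, REPLACEMENT and planned_name are hypothetical names, and sub is used in place of gsub because at most one substitution per name is wanted.

```ruby
RENAME_RE   = /^.*?(?:(\d\d)Q(\d)|(\d)Q(\d\d))/
REPLACEMENT = '20\1\4Q\2\3_\0'   # single quotes keep the backslashes for sub

# Hypothetical helper: returns the new name, or nil when nothing matched.
def planned_name(orig_name)
  new_name = orig_name.sub(RENAME_RE, REPLACEMENT)
  new_name == orig_name ? nil : new_name
end

samples = [
  "ABC-EDFGH-JLG-Sample-3Q17-fIS.pdf",
  "2Q13 ABC MF PM Example_fIS.pdf",
  "03Q1_FIS.pdf",
  "05Q1 ABC_IS.pdf"
]

# Dry run: print the planned renames without touching the file system.
samples.each do |orig_name|
  new_name = planned_name(orig_name)
  puts "#{orig_name} -> #{new_name || '(unchanged)'}"
end
```

Once the printed names look right, the puts line could be swapped for File.rename(orig_name, new_name), skipping entries where planned_name returns nil.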