I'm using a third-party file renaming program, written in Delphi and with Pascal Script support: http://www.den4b.com/?x=products&product=renamer
The application allows regular expressions to be used for renaming files. This means that if what I need cannot be accomplished with a single RegEx, I can chain several expressions, or use Pascal Script code, to transform the filename step by step until it is properly formatted.
I need to format song filenames like the ones below. In these filenames the "...featuring artist" part is at the right of the string; I need to match it and move it to the left part of the string.
To make this simple to understand, we could imagine the filename tokenized like this:
[0]ARTIST [1]DASH [2]TRACK [3]FEAT_ARTIST [4]POSSIBLE_ADDITIONAL_INFO_INSIDE:()[]{}
Then what I need to do with a RegEx is reformat the filename so that the tokens end up in this order:
[0]ARTIST [3]FEAT_ARTIST [1]DASH [2]TRACK [4]POSSIBLE_ADDITIONAL_INFO_INSIDE:()[]{}
I currently do that using this RegEx:
\A([^-]*?)\s*-\s*(.*?)\s*([\(\[])?((ft[\.\s]|feat[\.\s]|featuring[\.\s])[^\(\)\{\}\[\]]*)([\)\]])?(.+)?\Z
Replacing with:
$1 $4 - $2$7
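For readers who want to try the pattern outside ReNamer, here is a minimal Python sketch. It assumes the pattern reads as below once its escapes and `*` quantifiers are written out, and that Python's `re` module behaves like ReNamer's engine for this pattern, which may not hold in every detail. The sample filenames and the `reorder` helper are hypothetical.

```python
import re

# The asker's pattern with escapes written out (an assumption about the
# intended text). ReNamer's "$n" references become "\n" in Python.
PATTERN = re.compile(
    r"\A([^-]*?)\s*-\s*(.*?)\s*([\(\[])?"
    r"((ft[\.\s]|feat[\.\s]|featuring[\.\s])[^\(\)\{\}\[\]]*)"
    r"([\)\]])?(.+)?\Z"
)
REPLACEMENT = r"\1 \4 - \2\7"


def reorder(name: str) -> str:
    """Move the 'feat' part next to the artist; return name unchanged if no match."""
    return PATTERN.sub(REPLACEMENT, name)


# Hypothetical filenames, not taken from the question.
print(reorder("Artist - Track feat. Someone"))  # Artist feat. Someone - Track
print(reorder("Artist - Track"))                # Artist - Track (no feat part)
```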
The problem begins here, because the [0]ARTIST and [2]TRACK tokens could contain dashes, as for example in this filename:
Then, correct me if I'm wrong, but I think it's simply impossible to solve this in general, because a machine can't predict where one token ends and the next begins (what is a name and what isn't) when the number of dashes in the filename is unknown.
For that reason, instead of chasing a naive perfection that could produce bad filenames because of the dashes inside, I prefer an exclusion approach: limiting the dashes that the expression is allowed to match in the filename.
Taking the RegEx shown above as a starting point to extend/improve, how could I exclude filenames in which the [0]ARTIST or [2]TRACK token contains dashes?
...Or in other words, how can I tell my RegEx to avoid modifying a filename when it contains more than one dash before the "...featuring artist" part (not after)?
Basically, the RegEx should determine whether [1]DASH is found more than once before [3]FEAT_ARTIST; if so, that filename is excluded (not modified).
I know how to limit the occurrences of a RegEx group, more or less like this: ([\-]){1} to match only one dash occurrence, but I'm not sure how to implement it in the expression I'm using.
Just some random examples...
One dash only before the [3]FEAT_ARTIST, so we can know where to separate the [0]ARTIST and [2]TRACK tokens.
One dash only before the [3]FEAT_ARTIST, so we can know where to separate the [0]ARTIST and [2]TRACK tokens. With [4]POSSIBLE_ADDITIONAL_INFO_INSIDE:()[]{}.
One dash only before the [3]FEAT_ARTIST, so we can know where to separate the [0]ARTIST and [2]TRACK tokens. With [4]POSSIBLE_ADDITIONAL_INFO_INSIDE:()[]{} which also contains dashes.
One dash only between the [0]ARTIST and [2]TRACK tokens, but the filename doesn't have a [3]FEAT_ARTIST, so we don't touch it.
One dash only between the [0]ARTIST and [2]TRACK tokens, but the [3]FEAT_ARTIST is before the [1]DASH, so we don't touch it.
[0]ARTIST has dashes, so we can't know where to separate the [0]ARTIST and [2]TRACK tokens; the RegEx should exclude this filename and leave it unmodified.
[2]TRACK has dashes, so we can't know where to separate the [0]ARTIST and [2]TRACK tokens; the RegEx should exclude this filename and leave it unmodified.
Both the [0]ARTIST and [2]TRACK tokens have dashes, so we can't know where to separate them; the RegEx should exclude this filename and leave it unmodified.
Both the [0]ARTIST and [2]TRACK tokens have dashes and [3]FEAT_ARTIST doesn't exist either, so again there is nothing to do here.
I hope this helps to understand what I need.
Try with:
^(.+)\s+-\s+(.+?)\s+[fF](t|eat(uring)?)?\.?([^([\])\n]+)(.+)?$
and use replace with: $1 Feat.$5 - $2$6
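As a quick way to try this outside ReNamer, the pattern and replacement can be exercised in Python. This is only a sketch: it assumes Python's `re` module treats this pattern the same way ReNamer does, and the `move_feat` helper and sample filenames are made up for illustration.

```python
import re

# Pattern and replacement from this answer; "$n" becomes "\n" in Python.
PATTERN = re.compile(
    r"^(.+)\s+-\s+(.+?)\s+[fF](t|eat(uring)?)?\.?([^([\])\n]+)(.+)?$"
)
REPLACEMENT = r"\1 Feat.\5 - \2\6"


def move_feat(name: str) -> str:
    """Apply the rename; names that do not match are returned unchanged."""
    return PATTERN.sub(REPLACEMENT, name)


# Hypothetical filenames.
print(move_feat("Artist - Track feat. Someone"))  # Artist Feat. Someone - Track
print(move_feat("Artist - Track ft. Someone"))    # Artist Feat. Someone - Track
print(move_feat("Artist - Track"))                # Artist - Track
```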
I tried it with ReNamer and Regex101, and it also works if there is a " - " (space + dash + space) in the artist name, like "artist - name", BUT it will fail if such a fragment occurs in the title part.
The ^(.+)\s+-\s+ part uses a greedy quantifier .+ before a space-dash-space sequence, which is treated as the delimiter between the artist name and the track title. Being greedy, it matches as much as it can, up to the last " - " that still lets the rest of the pattern match. Because of that, it "ignores" dashes with spaces in the artist name, but it causes an invalid match if such an element occurs in the track title. So:
Artist - name - track title feat. someone - it will be matched and modified properly,
Artist name - track - title feat. someone - it will fail, as the text will be split on the last dash.
Instead of (ft[.\s]|feat[.\s]|featuring[.\s]) I used [fF](t|eat(uring)?)?\.? which matches similar text (it is slightly more permissive: a capital F and a bare f also match), but should work faster, since it should restrain backtracking a little bit.
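The greedy split described above can be checked with a small Python sketch that prints what the first two groups capture for the two example names. It assumes Python's `re` behaves like ReNamer's engine here; the `split_groups` helper is hypothetical.

```python
import re

# Same pattern as in this answer.
PATTERN = re.compile(
    r"^(.+)\s+-\s+(.+?)\s+[fF](t|eat(uring)?)?\.?([^([\])\n]+)(.+)?$"
)


def split_groups(name: str):
    """Return (artist, title) as captured by groups 1 and 2, or None if no match."""
    m = PATTERN.match(name)
    return m.group(1, 2) if m else None


# Dash inside the artist name: the greedy .+ keeps it in group 1.
print(split_groups("Artist - name - track title feat. someone"))
# ('Artist - name', 'track title')

# Dash inside the track title: group 1 swallows part of the title.
print(split_groups("Artist name - track - title feat. someone"))
# ('Artist name - track', 'title')
```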
In my demo there is " +" (a space followed by +) instead of \s+ (as above), because \s+ would also match across line breaks in the multi-line demonstration and show invalid results; in single-line cases, like in your problem, it should work fine.
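If the strict exclusion from the question is preferred (skip any name with more than one dash before the feat part), one possible variant is to forbid dashes inside the artist and title groups altogether. The following Python sketch is not the pattern above and has not been tried in ReNamer; the `STRICT` pattern, the `move_feat_strict` helper, the case-insensitive flag and the sample names are all assumptions made for illustration. It does not handle a feat part wrapped in brackets.

```python
import re

# [^-]+? cannot cross a dash, so a second dash before the feat part makes
# the whole match fail. Dashes after the feat part are still allowed.
STRICT = re.compile(
    r"^([^-]+?)\s+-\s+([^-]+?)\s+"
    r"((?:ft|feat|featuring)[.\s][^(){}\[\]]*?)\s*"
    r"([(\[{].*)?$",
    re.IGNORECASE,
)


def move_feat_strict(name: str) -> str:
    m = STRICT.match(name)
    if m is None:
        return name  # excluded: leave the filename untouched
    artist, track, feat, extra = m.groups()
    result = f"{artist} {feat} - {track}"
    return f"{result} {extra}" if extra else result


# Hypothetical filenames.
print(move_feat_strict("Artist - Track feat. Someone (Club - Mix)"))
# Artist feat. Someone - Track (Club - Mix)
print(move_feat_strict("Art-ist - Track feat. Someone"))  # unchanged
print(move_feat_strict("Artist - Tra-ck feat. Someone"))  # unchanged
```

The trade-off is that legitimately hyphenated names are skipped as well, which matches the "exclude rather than guess" approach described in the question.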