FYI, this question originated from this sed answer.
Given a five-column CSV line with all five columns empty, i.e. a line that contains only ,,,, I thought the following vim-ex command should insert hello in all five positions:
:s/\v(^|,)\ze(,|$)/\1hello/g
However it does not, as the output is
hello,,hello,hello,hello
The first hello is inserted because ^\ze, matches at the beginning of the line. However, it seems that this , is then consumed by the command. Is this the case? If so, why?
I'm not sure of the answer, but I can share a hunch. I think this boils down to entirely zero-width matches (e.g. /^\ze, ) having to advance some internal match index by one character, even though the match technically hasn't consumed anything. That way the substitution can move on to the next match; otherwise it would just keep matching at the same position (if that makes sense).
Your example seems to be evidence of that. A more illustrative example is the following (with the input changed to better show what was matched).
Given the following command:
:s/\v(^|.)\ze(.|$)/<0\11\22>/g
Running it against an input line of abcd will output:
<01a2>a<0b1c2><0c1d2><0d12>
Note how the a appears twice: once inside the replacement <01a2> (where it was captured by the second group, after \ze), and once more as plain unmatched text, as shown by the a in <01a2>a<0b1c2>. Because that a is skipped over, the ab pair is never matched/replaced.
The only explanation I can think of is this idea of a match cursor or match index that has to move past the first character, a, once the zero-width pattern /^\ze. has matched there.
In other words:
Input: abcd
Command: s/\v(^|.)\ze(.|$)/<0\11\22>/g
======================================
Match/Replace 1:
abcd => <01a2>abcd
^ ^
Matches /^\ze.
Will move cursor by 1 after the zero-width /^\ze. match (or else it would be stuck there)
----------------
Match/Replace 2:
<01a2>abcd => <01a2>a<0b1c2>cd
^ ^
Matches /.\ze.
Consumes the '.' (in this case 'b'). Not entirely zero-width.
... and so on ...
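The trace above can be checked with a small simulation. This is only a sketch of the hunch, not Vim's actual code: the function name vim_like_sub is made up for illustration, and the skip rule (an empty match found again where the previous match ended is not replaced; the search restarts one character later) is an assumption chosen because it is consistent with both outputs shown here. Python's re module has no \ze, so a lookahead (?=...) stands in for it, and \g<1> is used in the replacement because Python would read \11 as group 11.

```python
import re

def vim_like_sub(pattern, repl, line):
    """Hypothetical model of :s/pattern/repl/g on a single line."""
    rx = re.compile(pattern)
    out = []
    copy_from = 0    # start of the text not yet copied to the output
    search_from = 0  # column where the next search starts
    prev_end = -1    # column where the previously replaced match ended
    while search_from <= len(line):
        m = rx.search(line, search_from)
        if m is None:
            break
        if m.start() == m.end() == prev_end:
            # Assumed rule: the same empty match was found again, so
            # step the search one character forward instead of replacing.
            search_from += 1
            continue
        out.append(line[copy_from:m.start()])  # unmatched text is copied as-is
        out.append(m.expand(repl))
        copy_from = search_from = prev_end = m.end()
    out.append(line[copy_from:])
    return "".join(out)

# The question's command, with (?=...) standing in for \ze
print(vim_like_sub(r"(^|,)(?=(,|$))", r"\g<1>hello", ",,,,"))
# The answer's command on the input abcd
print(vim_like_sub(r"(^|.)(?=(.|$))", r"<0\g<1>1\g<2>2>", "abcd"))
```

Under that assumption the two calls print hello,,hello,hello,hello and <01a2>a<0b1c2><0c1d2><0d12>, the same as the outputs reported above. After the empty match at column 0, the search restarts at column 1, where ^ can no longer match, so the first character is copied through untouched.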