Linux or Vim: How to find and replace matched string, except for the last character?


Scenario

Say I have some text file:

savingstable
savings_proc.table

I want to change it to:

savings.table
savings_proc.table

My Solution

I separated this into two commands:
%s/savings/savings\./g
After this the file reads:

savings.table
savings._proc.table

So I follow the command with:
%s/savings\._/savings_/g

Question

Splitting the problem into two edits doesn't always work. Is there a way to do all of this in one step?

A one step solution would be to match all cases of savings[A-Za-z] and replace everything but the last character with savings\.

Generally, is there a way to replace one's matched strings excluding certain characters in the matched string? In this case we would want to exclude the last character.


Solution

  • In this specific scenario, %s/savings\zs\ze[^_]/./ would work, and it also gives you a chance to run :h \zs and learn something new. But unless you explain a more general use case, there's not much else we can help you with.

    A one step solution would be to match all cases of savings[A-Za-z] and replace everything but the last character with savings\.

    (Actually this would be savings., as you don't need to escape the . in the replacement string.)

    Well, you could just capture the last character and put it back in the substitution, using the command %s/savings\([A-Za-z]\)/savings.\1/. But then why not also capture the savings part, as in %s/\(savings\)\([A-Za-z]\)/\1.\2/? At this point, though, I would go back to making intelligent use of \zs and \ze.
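
    As a rough illustration outside Vim, the same capture-and-put-back idea can be sketched with Python's re module. The sample text mirrors the question; Python writes groups as ( ) rather than \( \), and the variable names are only for illustration.

```python
import re

# Sample text from the question.
text = "savingstable\nsavings_proc.table"

# Capture the letter after "savings" and put it back via \1.
one_group = re.sub(r"savings([A-Za-z])", r"savings.\1", text)

# Capture both parts and rebuild the match as \1.\2.
two_groups = re.sub(r"(savings)([A-Za-z])", r"\1.\2", text)

print(one_group)
print(two_groups)
```

    Both calls leave savings_proc.table alone, because _ is not in [A-Za-z].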

    Generally, is there a way to replace one's matched strings excluding certain characters in the matched string? In this case we would want to exclude the last character.

    "Excluding certain characters" is not possible in general, because a substitution always replaces one string with another. The replacement is a single string (it can have literal parts, such as bla, references to captured groups, such as \1, \2, ..., and other stuff, but it all still adds up to a single string), and the match it replaces can't be anything other than a single string either. In other words, if the substitution command starts with s/ABC/replacement/, there's no way to "decorate" the ABC or write the replacement such that A and C are replaced but B is left unchanged; if you want to keep the B, you have to put it back manually or via backreferences, e.g. s/A\(B\)C/x\1y/.

    On the other hand, you can exclude leading and trailing parts of the search string, exactly via the \zs and \ze that I mentioned at the beginning. These two behave like special cases of positive lookbehind and positive lookahead respectively, which Vim implements via \@<= and \@=. For instance, %s/savings\zs\ze[^_]/./ is equivalent to the less readable %s/\%(savings\)\@<=[^_]\@=/./, where [^_]\@= matches a non-_ without "consuming" it, just like \ze[^_] ends the match right before a non-_; similarly, \%(savings\)\@<= matches right after savings (which has to be grouped, but there's no need to remember the group, so I used \%( and \) instead of \( and \)).
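
    For comparison, here is a minimal sketch of the same zero-width match in Python, which has no \zs or \ze but does have lookarounds, written (?<=...) and (?=...). The explicit \n in the character class is an adjustment on my part: unlike in Vim, a negated class in Python also matches a newline.

```python
import re

# Sample text from the question.
text = "savingstable\nsavings_proc.table"

# (?<=savings) : the position must be preceded by "savings"
#                (plays the role of \%(savings\)\@<= in Vim).
# (?=[^_\n])   : the next character must be something other than "_"
#                (plays the role of [^_]\@= in Vim); "\n" is excluded
#                explicitly because Python's [^_] would match it.
pattern = re.compile(r"(?<=savings)(?=[^_\n])")

# The match itself is empty, so "." is inserted and nothing is removed.
result = pattern.sub(".", text)
print(result)
```

    Since neither lookaround consumes any text, the replacement only has to supply the new character.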

    Notice that there are also negative lookbehind \@<! and negative lookahead \@!. All four are collectively called lookarounds, and they allow you to put some very complex logic into regular expressions.
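
    A small sketch of the negative variants, again in Python, where they are spelled (?!...) and (?<!...). The third sample line, my_savings, is a hypothetical addition to show how a negative lookahead differs from a positive one at the end of the input.

```python
import re

# The first two lines come from the question; "my_savings" is a
# hypothetical extra line added for illustration.
lines = ["savingstable", "savings_proc.table", "my_savings"]

# Negative lookahead: "savings" must NOT be followed by "_".
# Unlike (?=[^_]), this also succeeds when nothing follows at all.
negative_ahead = [re.sub(r"savings(?!_)", "savings.", s) for s in lines]

# Negative lookbehind: "savings" must NOT be preceded by "_".
negative_behind = [re.sub(r"(?<!_)savings", "SAVINGS", s) for s in lines]

print(negative_ahead)
print(negative_behind)
```

    Note that the negative lookahead also appends a dot to my_savings, so it is not a drop-in replacement for the positive form used earlier.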