
vim: substitute specific character, but only after nth occurrence


I need to do this exercise on regexes and text manipulation in vim.

So I have this file listing the top-scoring soccer players in history, with 50 entries that look like this:

1 Cristiano Ronaldo Portugal 88 121 0.73 03 Manchester United Real Madrid

The whitespace between the fields consists of tabs (\t).

Each field corresponds to a different category, etc. The last field contains one or more clubs the player has played for, so the number of clubs is not fixed.

The question: replace all tabs with a ';', except in the last field, where the clubs need to be separated by a ','.

So I thought: I just replace all of them with a comma, and then I replace the first 7 commas with a semicolon. But how do you do that? Everything - from regex to vim commands - is allowed.

The first part is easy: :2,$s/\t/,/g But the second part, I can't seem to figure out.

Any help would be greatly appreciated.

Thanks, Zeno


Solution

  • This answer is similar to @Amadan's, but it makes use of the ability to provide an expression as the replace string to actually do the difficult bit of changing the first set of tabs to semicolons:

    %s/\v(.{-}\t){7}/\=substitute(submatch('0'), '\t', ';', 'g')/|%s/\t/,/g
    

    Broken down, this is a set of three substitutions. The first two are combined by means of a sub-replace-expression:

    %s/\v(.{-}\t){7}/\=substitute(submatch('0'), '\t', ';', 'g')/
    

    What this does is find exactly seven occurrences ({7}) of any run of characters followed by a tab, matched in a non-greedy way ((.{-}\t)). Because the command has no g flag, only the first such match on each line is affected, i.e. everything up to and including the seventh tab. Then we replace this entire match (submatch(0)) with the result of the substitute expression (\=substitute(...)). The substitute expression is simple by comparison, as it just converts all tabs to semicolons.

    The last substitute just changes any other tabs on the line to commas.
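
    As a rough sketch of how the two steps play out, the commands can also be run one at a time on the sample line from the question. This assumes that exactly seven tabs precede the club list (written as <Tab> in the comments below) and borrows the 2,$ range from the question to skip the first line; both are assumptions about the file, so adjust them if yours differs.

    ```vim
    " Sample line (assumed layout, fields separated by tabs):
    "   1<Tab>Cristiano Ronaldo<Tab>Portugal<Tab>88<Tab>121<Tab>0.73<Tab>03<Tab>Manchester United<Tab>Real Madrid

    " Step 1: in the shortest leading match that contains seven tabs,
    " turn every tab into ';'
    :2,$s/\v(.{-}\t){7}/\=substitute(submatch(0), '\t', ';', 'g')/
    "   1;Cristiano Ronaldo;Portugal;88;121;0.73;03;Manchester United<Tab>Real Madrid

    " Step 2: any tab still left sits inside the club list, so it becomes ','
    :2,$s/\t/,/g
    "   1;Cristiano Ronaldo;Portugal;88;121;0.73;03;Manchester United,Real Madrid
    ```

    Running the steps separately makes it easy to undo just one of them with u if the field count turns out to be different.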

    See :help sub-replace-expression
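
    If the regex feels opaque, roughly the same result can be sketched with split() and join() instead. This is an alternative, not the approach described above: JoinFields is a hypothetical helper name, and the field count of 7 is taken from the question. Unlike the regex version, this sketch joins everything with ';' when a line has no club field at all.

    ```vim
    " Hypothetical helper (not part of the original answer): rebuild a line
    " so the first n fields are joined with ';' and the remaining ones with ','.
    function! JoinFields(line, n) abort
      let n = a:n
      " Third argument 1 keeps empty fields, so consecutive tabs are preserved
      let fields = split(a:line, "\t", 1)
      if len(fields) <= n
        " No club field present: only ';' separators are needed
        return join(fields, ';')
      endif
      return join(fields[0 : n - 1], ';') . ';' . join(fields[n :], ',')
    endfunction

    " Apply it to every line from line 2 onwards (range assumed from the question)
    :2,$s/.*/\=JoinFields(submatch(0), 7)/
    ```

    The design choice here is to trade the single one-liner for a named function, which is easier to adapt if the number of leading fields changes.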