vim

Delete all content before 1st occurrence of a pattern


I have a file as follows:

<newline>
Something from
the beginning of a file
until
my target PATTERN then
the content continue 
until the appearance 
of other PATTERN in another lines

I want to delete everything from the beginning of the file up to just before the first PATTERN (i.e. from "<newline>Something" through "my target "). The cursor is on the first character (the empty first line, which holds only a newline). I tried:

  1. :/\_.\{-}PATTERN/d <- line 2 was deleted (i.e. "Something from")
  2. :1,/\_.\{-}PATTERN/d <- lines 1 and 2 were deleted
  3. :1,/PATTERN/s/\_.\{-}PATTERN/ <- everything up to and including the 2nd PATTERN was deleted. (I assumed the find-and-replace only applies to the range from line 1 to the 1st PATTERN)
  4. :1,/PATTERN/s/\_.\{-}PATTERN/PATTERN <- same as point 3, but with "PATTERNPATTERN" inserted at the beginning, i.e. the replacement occurred twice
  5. :1,/PATTERN/-1s/\_.\{-}PATTERN/ <- This is the solution!

I don't understand why these five commands behave the way they do. None of the first four does what I expected.


Solution

  • Let's go through them one by one:

    1. :/\_.\{-}PATTERN/d deletes a single line: the one where the next match starts. Because "\_.\{-}" can match across line breaks, the match starts on the line right after the cursor (line 2), not on the line containing "PATTERN", so only line 2 is deleted.
    2. :1,/\_.\{-}PATTERN/d deletes all lines from line 1 to the line where that same match starts, i.e. lines 1 and 2.
    3. :1,/PATTERN/s/\_.\{-}PATTERN/ deletes everything up to and including the second "PATTERN".
    4. :1,/PATTERN/s/\_.\{-}PATTERN/PATTERN replaces everything up to and including the second "PATTERN" with "PATTERNPATTERN". Why? The range includes the line that holds the first "PATTERN". After the first match ends on that line, :s carries on with the rest of it and matches a second time, up to the second "PATTERN", so the replacement is inserted twice.
    5. :1,/PATTERN/-1s/\_.\{-}PATTERN/ deletes everything up to and including the first occurrence of "PATTERN".

    In general, the first two can never do what you want: :d always deletes whole lines, so you need :s to replace the unwanted text with nothing instead.
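
    As a small illustration of why :d falls short, here is a sketch against the sample buffer from the question (assuming the first "PATTERN" is on line 5 and the cursor is on line 1). Even with the range trimmed correctly, :d can only remove whole lines:

    ```vim
    " Deletes lines 1-4 in full. The text "my target " in front of PATTERN
    " on the following line is left alone, because :d is linewise.
    :1,/PATTERN/-1d
    ```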

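    For 3, the problem comes down to the fact that the range only selects the lines on which :s looks for a match; it does not limit how far a match may extend. Since "\_." also matches line breaks, the first match runs from line 1 to the first "PATTERN". That line is still inside the range, so :s continues with the rest of it and matches again, up to the next "PATTERN" if there is one. With the -1 offset in 5, the line holding the first "PATTERN" lies outside the range, so the substitution stops after the first match.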
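
    The two ranges can be compared side by side on the sample buffer from the question. This is only a sketch of what I would expect given the behaviour reported there (first "PATTERN" on line 5, cursor on line 1); undo with u between tries:

    ```vim
    " Range is lines 1-5: the line holding the first PATTERN is inside the
    " range, so :s goes on with the rest of that line and matches again.
    :1,/PATTERN/s/\_.\{-}PATTERN//

    " Range is lines 1-4: the match still runs into line 5, but line 5 is
    " outside the range, so only one substitution is made.
    :1,/PATTERN/-1s/\_.\{-}PATTERN//
    ```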

    For 4, the mechanism is the same as in 3: two substitutions are made, and each one puts a "PATTERN" back, hence "PATTERNPATTERN".
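
    Note that command 5 removes the first "PATTERN" as well, while the question asks to stop right before it. A possible variant, not taken from the question, ends the match in front of "PATTERN" with \ze. The second command is an alternative I would expect to give the same result here by deleting with a search motion from the top of the file. Treat both as sketches for the sample buffer rather than a general recipe:

    ```vim
    " \ze marks the end of the match, so PATTERN must follow but is not
    " part of the text that gets replaced.
    :1,/PATTERN/-1s/\_.\{-}\zePATTERN//

    " Alternative: go to line 1, column 1, then delete up to (not including)
    " the next match of PATTERN.
    :execute "normal! gg0d/PATTERN\<CR>"
    ```

    The search motion in the second command looks for a match after the cursor position, so it assumes "PATTERN" does not begin at the very first character of the file.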