
How to grep for the same string on two consecutive lines


I want to find out whether a text contains similar lines that directly follow each other. For example, I want to find any lines containing "cccc" that come one right after the other.

aaaaaaaa
bbbbaaaa
ccccxxxx
ddddaaaa
eeeeaaaa
ccccxxxx   <---
ccccyyyy   <---
ddddaaaa
eeeeaaaa

So only the two adjacent cccc**** lines should be printed. I tried something like:

 grep "cccc" -A1 file.txt

but that returned all the "cccc*" lines.

It is a simple problem, I know. Another example: search for adjacent lines starting with "Finland":

Iceland
Germany
FinlandsIsNiceButNoMatch
France
FinlandWillMatchTHisTime    <---
FinlandWillAlsoMatch        <---
Hungary

Solution

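  • This will match two consecutive lines if they both begin with the same 3 letters (any letters, not only one particular string). It needs GNU grep with PCRE support (-P). The -z option makes grep read the whole file as one record so that \n can match across lines, (?m)^ anchors the match at the start of a line, and -o prints only the matched pair. Because -z also ends each printed match with a NUL byte instead of a newline, piping the output through tr '\0' '\n' makes several matches easier to read:

    grep -Pzo '(?m)^([a-zA-Z]{3}).*\n\1.*' file.txt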
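
When the prefix is known in advance, as with "cccc" or "Finland" in the question, a fixed-prefix check may be easier to read. The following is a minimal sketch that uses awk instead of grep; the function name consecutive_prefix is made up for illustration, and it assumes "match" means that the line starts with the given text.

```shell
# consecutive_prefix PREFIX [FILE...]
# Hypothetical helper: print every run of two or more adjacent lines
# that all start with PREFIX. Reads stdin when no FILE is given.
# Note: awk -v interprets backslash escapes in PREFIX.
consecutive_prefix() {
    prefix=$1
    shift
    awk -v p="$prefix" '
    {
        cur = (index($0, p) == 1)    # 1 if this line starts with the prefix
        if (cur && prevmatch) {
            if (!shown) print prev   # first line of the run, printed once
            print                    # current line of the run
            shown = 1
        } else {
            shown = 0                # run ended or has not started yet
        }
        prev = $0
        prevmatch = cur
    }' "$@"
}
```

For the second example above (without the arrow annotations), consecutive_prefix Finland file.txt would print only FinlandWillMatchTHisTime and FinlandWillAlsoMatch, because FinlandsIsNiceButNoMatch has no matching neighbor.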