I want to find out whether a text contains similar lines that directly follow each other. For example, I want to find any lines that contain "cccc" and come one right after the other.
aaaaaaaa
bbbbaaaa
ccccxxxx
ddddaaaa
eeeeaaaa
ccccxxxx <---
ccccyyyy <---
ddddaaaa
eeeeaaaa
So only the two adjacent cccc**** lines should be printed. I tried something like:
grep "cccc" -A1 file.txt
but that printed every "cccc*" line, not just the adjacent pair.
A simple problem, I know. Another example: search for consecutive lines that start with "Finland":
Iceland
Germany
FinlandsIsNiceButNoMatch
France
FinlandWillMatchTHisTime <---
FinlandWillAlsoMatch <---
Hungary
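This matches two consecutive lines that begin with the same three letters (it needs GNU grep built with -P support). -z makes grep read the whole file as one record so that \n can be matched, and (?m)^ anchors the three letters to the start of a line; without it the first group could also match in the middle of a line. Because -z ends each match with a NUL byte instead of a newline, tr converts it back:
grep -Pzo "(?m)^([a-zA-Z]{3}).*\n\1.*" file.txt | tr '\0' '\n'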
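If you only care about one known prefix such as "cccc" or "Finland", rather than any shared three letters, an awk version may be easier to read. This is just a minimal sketch: the function name adjacent_matches is made up for illustration, and it assumes the pattern is an awk regular expression without backslashes (awk -v interprets escape sequences in the value).

```shell
# adjacent_matches is a hypothetical helper name, not a standard tool.
# Usage: adjacent_matches REGEX FILE
# Prints every line matching REGEX that is directly next to another matching line.
adjacent_matches() {
  awk -v pat="$1" '
    $0 ~ pat {
      if (run == 1) print prev   # second line of a run: emit the held-back first line
      if (run >= 1) print        # second and later lines of a run: emit the current line
      prev = $0                  # remember this line in case the next one matches too
      run++
      next
    }
    { run = 0 }                  # a non-matching line ends the run
  ' "$2"
}
```

Calling adjacent_matches '^cccc' file.txt on the first example should print only the ccccxxxx and ccccyyyy lines marked with arrows, and adjacent_matches '^Finland' file.txt should print the two marked Finland lines. Unlike the grep pattern, runs of three or more matching lines are printed in full.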