How to search and copy surrounding text?


I have a large textfile structured like this:


---

NAME: Some Name

Random
number
of lines with info.

---

NAME: Another Name

Random
number
of lines
with different info.

---

…

When searching for "different", I want to copy everything from "NAME: Another Name" to the next "---" into a new file so that I can edit it.

There may be multiple occurrences of a search term. I would like to have all of the matching blocks in one file.

grep different file.txt > edit.txt puts lines containing "different" in a new file.

But I don't know how to do the same with everything between the two "---" lines.

Is grep the right tool for the job?


Solution

  • grep is the wrong tool for this job. Using GNU awk, which supports a multi-character RS and sets RT to the text that matched it:

    awk -v RS='(^|\n)---(\n|$)' -v ORS= '/different/{print $0 RT}' file
    

    For example given this input:

    $ cat file
    ---
    
    NAME: Some Name
    
    Random
    number
    of lines with info.
    
    ---
    
    NAME: Another Name
    
    Random
    different number ---
    of lines
    
    NAME: just part of some random text
    
    with different info.
    
    ---
    
    NAME: Yet another Name
    
    Random
    number
    of lines
    with different info.
    
    ---
    

    the above command produces the expected output:

    $ awk -v RS='(^|\n)---(\n|$)' -v ORS= '/different/{print $0 RT}' file
    
    NAME: Another Name
    
    Random
    different number ---
    of lines
    
    NAME: just part of some random text
    
    with different info.
    
    ---
    
    NAME: Yet another Name
    
    Random
    number
    of lines
    with different info.
    
    ---
    

    while the awk command from the OP's answer outputs:

    $ awk 'BEGIN { RS = "---" } /different/' file
    
    
    NAME: Another Name
    
    Random
    different number
    
    of lines
    
    NAME: just part of some random text
    
    with different info.
    
    
    
    
    NAME: Yet another Name
    
    Random
    number
    of lines
    with different info.
    

    with the --- from the middle of the second record replaced by a newline and all of the separator --- lines gone, so the NAME: ... line from the middle of the second record can't be distinguished from the start of a new record, etc.
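
If GNU awk is not available, the same idea can be sketched with POSIX awk features only, by reading the file line by line and buffering each block. This is a minimal sketch and not part of the original answer: the function name print_matching_blocks is hypothetical, it assumes a separator is a line consisting of exactly ---, and it matches the search term as a literal substring rather than as a regexp.

```shell
# Print every block that contains a search term, followed by its closing
# separator line. Uses no multi-char RS and no RT, so any POSIX awk should do.
# print_matching_blocks is a hypothetical helper name chosen for this sketch.
#   $1 = search term (literal string; backslash escapes are interpreted by -v)
#   $2 = input file
print_matching_blocks() {
    awk -v term="$1" '
        # Print the buffered block plus the given separator if the term was seen,
        # then reset the buffer for the next block.
        function emit(sep) {
            if (found) printf "%s%s", block, sep
            block = ""
            found = 0
        }
        # A line that is exactly --- closes the current block.
        $0 == "---" { emit($0 "\n"); next }
        # Any other line, including one that merely contains ---, is block content.
        {
            block = block $0 "\n"
            if (index($0, term)) found = 1
        }
        # Handle a final block that has no closing separator line.
        END { emit("") }
    ' "$2"
}
```

Assuming the file names from the question, it could be called as print_matching_blocks different file.txt > edit.txt. Because only whole --- lines are treated as separators, a --- in the middle of a line stays inside its block.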