
Replace the nth occurrence of 'foo' with the nth consecutive block of lines from another file


I would like to replace the nth occurrence of foo in the 0.txt file with the nth consecutive block of lines from the 1.txt file, where every block has the same fixed size (in this case, 2 lines). The files below are a minimal working example (MWE).

source file is 0.txt:

The sun has its own light
foo
The moon reflects the sunlight
foo
The planet Earth receives both sunlight and moonlight
foo

target file is 1.txt:

source-text1
[(('f1','b1'), ('g1','h1'))]
source-text-2
[(('f2','b2'), ('g2','h2'))]
source-text-3
[(('f3','b3'), ('g3','h3'))]

Applying the replacement (in pseudocode, 'command_method' 0.txt 1.txt > 2.txt) should write the desired output, shown below, to a third file, 2.txt:

expected output is 2.txt:

The sun has its own light
source-text1
[(('f1','b1'), ('g1','h1'))]
The moon reflects the sunlight
source-text-2
[(('f2','b2'), ('g2','h2'))]
The planet Earth receives both sunlight and moonlight
source-text-3
[(('f3','b3'), ('g3','h3'))]

I tried:

awk 'NR==FNR {a[NR]=$0; next} /foo/{gsub("foo", a[int(k++/2)%3 + 2])} 1' 1.txt 0.txt > 2.txt

but this gives me 2.txt:

The sun has its own light
[(('f1','b1'), ('g1','h1'))]
The moon reflects the sunlight
[(('f1','b1'), ('g1','h1'))]
The planet Earth receives both sunlight and moonlight
source-text-2

I have run out of ideas. I'm looking for a solution that works with a block of lines of any size.


Solution

  • Assumptions:

    • foo only occurs on a line by itself
    • if foo occurs more times than there are replacement sets, the extra occurrences of foo are left unchanged
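Both assumptions can be checked before running the replacement. The following is a minimal sketch, not part of the original answer: `check_sets` is a hypothetical helper name, and it assumes the replacement file ends with a newline (`wc -l` counts newline characters).

```shell
# Hypothetical helper: check_sets SETSIZE PATTERN REPLACEMENT_FILE INPUT_FILE
# Reports how many lines consist solely of PATTERN and how many complete
# replacement sets of SETSIZE lines are available.
check_sets() {
    hits=$(grep -c -x -F -- "$2" "$4")   # -x: whole-line match, -F: literal string
    lines=$(wc -l < "$3")                # number of lines of replacement text
    sets=$(( lines / $1 ))               # only complete sets are counted
    echo "occurrences=$hits sets=$sets"
}
```

For example, `check_sets 2 foo 1.txt 0.txt` on the setup files should report four occurrences and three sets, so the last foo would stay as it is.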

    Setup:

    $ cat 0.txt
    The sun has its own light
    foo
    The moon reflects the sunlight
    foo
    The planet Earth receives both sunlight and moonlight
    foo
    The following line should NOT be replaced
    foo
    
    $ cat 1.txt
    source-text1
    [(('f1','b1'), ('g1','h1'))]
    source-text-2
    [(('f2','b2'), ('g2','h2'))]
    source-text-3
    [(('f3','b3'), ('g3','h3'))]
    

    One awk idea:

    awk -v setsize=2 -v ptn="foo" '                        # setsize == number of lines from first file that define a replacement set
                                                           # ptn == string to be replaced
    
    FNR==NR { replace[++rcount]= $0                        # start replacement string
              for (i=1;i<setsize;i++) {                    # append to replacement string until we have read "setsize" lines into the replacement string
                  getline
                  replace[rcount]=replace[rcount] RS $0
              }
              next
            }
    $0~ptn  { if (++pcount in replace)                     # if we have a replacement string then ...
                  $0=replace[pcount]                       # replace the current line
            }
    1                                                      # print the current line
    
    ' 1.txt 0.txt
    

    This generates:

    The sun has its own light
    source-text1
    [(('f1','b1'), ('g1','h1'))]
    The moon reflects the sunlight
    source-text-2
    [(('f2','b2'), ('g2','h2'))]
    The planet Earth receives both sunlight and moonlight
    source-text-3
    [(('f3','b3'), ('g3','h3'))]
    The following line should NOT be replaced
    foo
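As a variation on the idea above, the replacement sets can be built without `getline` by computing which set each line belongs to from `FNR`. This is a sketch under a few assumptions, not a drop-in equivalent: `replace_sets` is a hypothetical wrapper name, the replacement file is assumed to be non-empty (otherwise `FNR == NR` would also match the input file), the pattern is compared as an exact string (`$0 == ptn`) rather than as a regex, and a trailing incomplete set is kept as a shorter set.

```shell
# Hypothetical wrapper: replace_sets SETSIZE PATTERN REPLACEMENT_FILE INPUT_FILE
replace_sets() {
    awk -v setsize="$1" -v ptn="$2" '
    # First file: append each line to the set it belongs to
    # (lines 1..setsize go to set 1, the next setsize lines to set 2, ...)
    FNR == NR {
        idx = int((FNR - 1) / setsize) + 1
        if (idx in replace)
            replace[idx] = replace[idx] RS $0
        else
            replace[idx] = $0
        next
    }
    # Second file: a line that is exactly ptn becomes the next unused set,
    # if one is left; otherwise the line is kept unchanged
    $0 == ptn {
        if ((pcount + 1) in replace)
            $0 = replace[++pcount]
    }
    1   # print the (possibly replaced) line
    ' "$3" "$4"
}
```

With the setup files, `replace_sets 2 foo 1.txt 0.txt > 2.txt` should produce the same output as above. Changing the first argument changes the block size, which covers the "any size" requirement from the question.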