
How to find a chunk of string that starts with $PATTERN and ends with $LINEBREAK


I am trying to remove a chunk of text from hundreds of files. The number of lines to remove differs from file to file, but every chunk starts with the same string. I want to delete the entire chunk, from the line matching $PATTERN (## Table) through ALL of the text up to the first line break (that is, the first blank line). I am pretty "ok" with sed but am not getting anywhere with Googling.

text.txt contents:

Random text.

## Table of Contents
*   [Design System](#Design-System)
*   [Multisite Designs](#Multisite)
    *   [Recommended Changes Per Site](#mcetoc_123)
    *   [Recommended Consistent Elements Across Sites](#mcetoc_456)
*   [Templates v. Pages](#Templates-v-Pages)
*   [Layouts](#mcetoc_dfghj)
*   [Features](#Features)
    *   [Something](#mcetoc_khjvfsdfsd)
    *   [Feature](#mcetoc_dfwyduegwu)
    *   [See More Logic](#mcetoc_fdsfsdugfs)
*   [Advertising, and Print](#Advertising-and-Print)
*   [Images](#Images)
    *   [Featured Image)](#mcetoc_fghjk) 
*   [Videos](#Videos)
    *   [Video Art](#mcetoc_4567890)
*   [Author and Search Pages](#Author-Tag-and-Search-Pages)
*   [Accessibility](#Accessibility)

Then here is some text.

Here is more text!

## Prerequisites

text.txt desired outcome:

Random text.


Then here is some text.

Here is more text!

## Prerequisites

I don't care about cleaning up extra line breaks, just deleting the chunk starting with ## Table and ending with the first line break.


I had no idea where to even begin, so I posted here. The first answer I got was the following command, which resulted in:

sed -e '/^\s*[*#]/d'

Random text.

    *   [Recommended Changes Per Site](#mcetoc_123)
    *   [Recommended Consistent Elements Across Sites](#mcetoc_456)
    *   [Something](#mcetoc_khjvfsdfsd)
    *   [Feature](#mcetoc_dfwyduegwu)
    *   [See More Logic](#mcetoc_fdsfsdugfs)
    *   [Featured Image)](#mcetoc_fghjk)
    *   [Video Art](#mcetoc_4567890)

Then here is some text.

Here is more text!

Both awk '/^## Table/ {skip=1} NF==0 {skip=0} !skip' and sed -i '' '/## Table/,/^$/d' worked, and resulted in:

Random text.


Then here is some text.

Here is more text!

## Prerequisites
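
For readers comparing the two commands, here is a minimal sketch of the awk one-liner spread over several lines with comments, run next to the sed range on a shortened copy of the sample. The variable names (sample, awk_out, sed_out) are made up for illustration. One difference worth knowing: the awk version clears its skip flag on the blank line and then prints that line, while the sed range deletes the closing blank line together with the chunk.

```shell
# Shortened stand-in for text.txt (hypothetical sample data).
sample=$(printf '%s\n' 'Random text.' '' '## Table of Contents' '*   [Images](#Images)' '' 'Then here is some text.')

awk_out=$(printf '%s\n' "$sample" | awk '
    /^## Table/ { skip = 1 }   # start skipping at the heading line
    NF == 0     { skip = 0 }   # a blank (or whitespace-only) line ends the skip
    !skip                      # print every line while not skipping
')

# Same range in sed: delete from the heading through the first empty line.
sed_out=$(printf '%s\n' "$sample" | sed '/^## Table/,/^$/d')

printf 'awk:\n%s\n\nsed:\n%s\n' "$awk_out" "$sed_out"
```

Since extra blank lines do not matter here, either result is acceptable; the only visible difference is one blank line after "Random text." versus two.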

Solution

  • Since the patterns are mutually exclusive, you can use an address range:

    sed '/## Table/,/^$/d' infile >outfile
    

    Or, if your definition of "linebreak" includes whitespace:

    sed '/## Table/,/^[[:space:]]*$/d' infile >outfile
    

    Many versions of sed support non-standard pseudo-in-place editing with a -i option.
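
Because -i is non-standard and its syntax differs between implementations (GNU sed accepts a bare -i, while the BSD sed shipped with macOS needs an explicit suffix argument, as in the -i '' used in the question), a portable way to handle many files is to write to a temporary file and move it back. The following is only a sketch: the scratch directory, the text.md file name, and the *.md glob are assumptions for the demo and should be replaced with the real file set.

```shell
# Hypothetical scratch directory with one sample file, so the loop can be tried safely.
workdir=$(mktemp -d)
printf '%s\n' 'Random text.' '' '## Table of Contents' '*   [Images](#Images)' \
    '    *   [Video Art](#mcetoc_4567890)' '' 'Then here is some text.' '' \
    '## Prerequisites' > "$workdir/text.md"

for f in "$workdir"/*.md; do
    # Delete from the "## Table" line through the first blank or
    # whitespace-only line, then replace the original only if sed succeeded.
    sed '/^## Table/,/^[[:space:]]*$/d' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done

cat "$workdir/text.md"
```

Note that if a file contains the ## Table line but no blank line after it, the range never closes and sed deletes through the end of that file, so it is worth running this on copies first.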