
grep or sed to delete 1 line after a match (but not the match)?


I'm trying to write a command that edits an XML file to remove the first line of each pair of consecutive <display-name> lines.

In other words, my file looks like...

<?xml version="1.0" encoding="UTF-8"?>
<tv guide2go="guide2go" source-info-name="Schedules Direct" source-info-url="http://schedulesdirect.org">
  <channel id="guide2go.68589.schedulesdirect.org">
    <display-name>COMCUKH</display-name>
    <display-name>Comedy Central UK HD</display-name>
    <icon src="https://schedulesdirect-api20141201-logos.s3.dualstack.us-east-1.amazonaws.com/stationLogos/s68589_dark_360w_270h.png" height="270" width="360"></icon>
  </channel>
  <channel id="guide2go.79036.schedulesdirect.org">
    <display-name>INVDIP1</display-name>
    <display-name>ID +1</display-name>
    <icon src="https://schedulesdirect-api20141201-logos.s3.dualstack.us-east-1.amazonaws.com/stationLogos/s79036_dark_360w_270h.png" height="270" width="360"></icon>
  </channel>

and I want it to look like...

<?xml version="1.0" encoding="UTF-8"?>
<tv guide2go="guide2go" source-info-name="Schedules Direct" source-info-url="http://schedulesdirect.org">
  <channel id="guide2go.68589.schedulesdirect.org">
    <display-name>Comedy Central UK HD</display-name>
    <icon src="https://schedulesdirect-api20141201-logos.s3.dualstack.us-east-1.amazonaws.com/stationLogos/s68589_dark_360w_270h.png" height="270" width="360"></icon>
  </channel>
  <channel id="guide2go.79036.schedulesdirect.org">
    <display-name>ID +1</display-name>
    <icon src="https://schedulesdirect-api20141201-logos.s3.dualstack.us-east-1.amazonaws.com/stationLogos/s79036_dark_360w_270h.png" height="270" width="360"></icon>
  </channel>

I don't mind if it's the same file, or a new file, I just want it with only the second occurrences of <display-name> from each pair. TIA.

I've tried a few grep and sed commands (and a lot of Google searching for anything similar). I know how to use grep -v to remove every line that matches, but I haven't managed to get anything working across lines.


Solution

  • Tested with both GNU and BSD (e.g. macOS) tools

    sed '/display-name/{N;s/.*\n//;}' file
    

    When sed reaches a line containing "display-name", N appends the next line to the pattern space, and s/.*\n// then deletes everything up to and including the newline, i.e. the first line of the pair. You can make the pattern more specific (for example, <display-name>) if it hits false positives.

    Note that this doesn't check that the next line also matches, so a lone display-name line will be deleted. If there are more than two in a row, only the even-numbered ones will be printed (second, fourth, etc.).
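
    If the lone-line case matters, one possible variant is to delete the first line only when the appended line is also a <display-name>. This is a sketch rather than part of the tested answer above: the file name channels.xml and its trimmed-down contents are made up for illustration, and $!N is used so that N is skipped on the last line of input.

```shell
# Hypothetical sample file; channels.xml and its contents are assumptions for illustration.
cat > channels.xml <<'EOF'
  <channel id="a">
    <display-name>COMCUKH</display-name>
    <display-name>Comedy Central UK HD</display-name>
    <icon src="a.png"></icon>
  </channel>
  <channel id="b">
    <display-name>Only One</display-name>
    <icon src="b.png"></icon>
  </channel>
EOF

# On a match, append the next line ($!N does nothing on the last input line),
# then drop the first line only if the appended line is also a <display-name>.
result=$(sed '/<display-name>/{$!N;/\n.*<display-name>/s/.*\n//;}' channels.xml)
printf '%s\n' "$result"
```

    With this variant, the first channel keeps only "Comedy Central UK HD", while the single "Only One" line in the second channel is left alone. To save the result, redirect the output to a new file, e.g. sed '...' channels.xml > channels.new.xml.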