
Python append a string after an occurrence of a pattern


How do I append to a string after an occurrence of a pattern? I know that strings are immutable, but is there a way to do it?

E.g., input:

condor  t   airline airline
eight   n   0   flightnumber
nine    n   0   flightnumber
five    n   0   flightnumber
hallo   t   0   sentence

expected output:

<s> <callsign> <airline> condor </airline> 
<flightnumber> eight nine five </flightnumber> 
</callsign> hallo </s>

Program:

import csv

# Each row is wrapped on its own, so consecutive flightnumber rows
# are not merged into a single tag.
with open('input.txt', 'r', newline='') as f:
    reader = csv.reader(f, delimiter='\t')
    for row in reader:
        if 'airline' in row:
            print('<callsign> <airline>' + row[0] + '</airline></callsign>')
        if 'sentence' in row:
            print('<s>' + row[0] + '</s>')
        if 'flightnumber' in row:
            print('<flightnumber>' + row[0] + '</flightnumber>')

Produces:

<callsign> <airline>condor</airline></callsign>
<flightnumber>eight</flightnumber>
<flightnumber>nine</flightnumber>
<flightnumber>five</flightnumber>
<s>hallo</s>

Is there a way to turn this into the expected output shown above?


Solution

  • You create a new string in which the pattern is replaced by itself followed by the text you want to add, and then rebind the original name to the new string.

    However, your example suggests that you need more than simple replacements: you need to gather the lines tagged flightnumber so that you can combine their contents into one tag.

    I think you'll need to provide more details on what rules you want to follow to get a more detailed answer.
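
Both ideas from the answer can be sketched roughly as follows. This is only one possible reading of the rules, since the question does not spell them out. The assumptions are: the tag is the last column of each row, consecutive rows with the same tag are merged, airline and flightnumber groups are wrapped in a callsign element, sentence words are emitted bare, and the whole result is wrapped in an s element. The names append_after, rows_to_markup and CALLSIGN_TAGS are hypothetical helpers introduced for illustration.

```python
import re
from itertools import groupby

# Assumption: these tags belong inside a <callsign> element.
CALLSIGN_TAGS = {"airline", "flightnumber"}


def append_after(text, pattern, suffix):
    """Return a NEW string with suffix inserted after every regex match.

    Strings are immutable, so the caller rebinds the name to the result.
    """
    # A function replacement avoids backslash handling in suffix.
    return re.sub(pattern, lambda m: m.group(0) + suffix, text)


def rows_to_markup(rows):
    """Merge consecutive rows sharing a tag and wrap them in tags."""
    parts = []
    in_callsign = False
    # groupby only merges ADJACENT rows with the same key (last column).
    for tag, group in groupby(rows, key=lambda row: row[-1]):
        words = " ".join(row[0] for row in group)
        if tag in CALLSIGN_TAGS:
            if not in_callsign:
                parts.append("<callsign>")
                in_callsign = True
            parts.append("<{0}> {1} </{0}>".format(tag, words))
        else:
            # Assumed rule: any other tag closes an open callsign
            # and contributes its words without a wrapper.
            if in_callsign:
                parts.append("</callsign>")
                in_callsign = False
            parts.append(words)
    if in_callsign:
        parts.append("</callsign>")
    return "<s> " + " ".join(parts) + " </s>"


if __name__ == "__main__":
    sample = [
        ["condor", "t", "airline", "airline"],
        ["eight", "n", "0", "flightnumber"],
        ["nine", "n", "0", "flightnumber"],
        ["five", "n", "0", "flightnumber"],
        ["hallo", "t", "0", "sentence"],
    ]
    print(rows_to_markup(sample))
    print(append_after("condor eight", r"condor", " </airline>"))
```

For the sample rows from the question, rows_to_markup should give the expected output on a single line rather than three. The rows could equally come from csv.reader(f, delimiter='\t') as in the question's program, skipping any empty rows first.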