
Add delimiter in txt file python


I'm trying to transfer data from a text file to CSV. My text file contains many rows separated by \n.

a: data
b: data2
$e = data3
number
a: data4 and so on

I need one column for the a and b rows, one for the rows starting with $, and one for the data after the = or : sign.

Can someone help me with a starting point?:)


Solution

  • import pandas as pd

    txt = """a: data
    b: data2
    $e = data3
    number
    a: data4"""
    txt_lines = txt.split("\n")
    df_dict = dict()
    for line in txt_lines:
        if ":" in line or "$" in line:
            # column name: first token with the "$" and ":" markers stripped
            column = line.split()[0].replace("$", "").replace(":", "")
            # value: last whitespace-separated token on the line
            value = line.split()[-1]
        else:
            # lines without ":" or "$" go into a "number" column
            column = "number"
            value = line
        # append to the column's list, creating the list on first use
        df_dict.setdefault(column, []).append(value)
    # wrap each list in a Series so shorter columns are padded with NaN
    df = pd.DataFrame({k: pd.Series(v) for k, v in df_dict.items()})
    

    The resulting df looks like this:

           a      b      e  number
    0   data  data2  data3  number
    1  data4    NaN    NaN     NaN
    

    People will be able to help you better if you read How to create a Minimal, Reproducible Example before asking a question.
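The question can also be read as asking for three fixed columns: one for the a/b labels, one for the $ names, and one for the data after the : or = sign. The following is a minimal sketch of that reading using only the standard library. The column names key, variable and value, the helper parse_line and the pattern LINE_RE are assumptions made for illustration and are not part of the original question or answer.

```python
import csv
import io
import re

# Sample input copied from the question.
txt = """a: data
b: data2
$e = data3
number
a: data4"""

# Assumed line shape: an optional "$", a name, then ":" or "=", then the value.
LINE_RE = re.compile(r"^(\$?)(\w+)\s*[:=]\s*(.*)$")


def parse_line(line):
    """Return a (key, variable, value) tuple for one input line."""
    match = LINE_RE.match(line.strip())
    if match is None:
        # No ":" or "=" separator: keep the whole line as the value.
        return ("", "", line.strip())
    dollar, name, value = match.groups()
    if dollar:
        # "$name = value" rows fill the variable column.
        return ("", name, value)
    # "name: value" rows fill the key column.
    return (name, "", value)


# Skip blank lines and parse the rest.
rows = [parse_line(line) for line in txt.split("\n") if line.strip()]

# Write to an in-memory buffer; a real file opened with newline="" works the same way.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=",", lineterminator="\n")
writer.writerow(["key", "variable", "value"])
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a different delimiter argument to csv.writer (for example ";" or "\t") changes the separator in the output. Lines that do not fit the assumed shape, such as the bare number line, end up with only the value column filled.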