python pandas csv text extract

Text file lines to csv columns


I'm trying to transfer data from a text file to a CSV file. My text file contains many lines separated by \n.

My text file is like:

1  CONTINUE

A:data

B:data

C:data

D:data

 Something A

$Param     = data

$Param2    = data

2 CONTINUE 

and so on; every block has the same structure.

I need the output to be a csv like this:

Number | Var_A | Var_B | Var_C | Var_D | Something | Parameter
1      | data  | data  | data  | data  | A         | Param
1      | data  | data  | data  | data  | A         | Param2

Hope I was clear enough:) Any ideas how to begin?:)


Solution

  • It is hard to tell from the question what the real problem is. I wrote some code based on my reading of it; please tell me if it does not match what you need.

    import pandas as pd

    # Assumption: txt holds the whole text file as one string,
    # e.g. read beforehand with open(...).read().
    txt_lines = txt.split("\n")
    df_dict = dict()
    for line in txt_lines:
        if not line.strip():
            continue  # skip empty and whitespace-only lines
        if ":" in line:
            # "A:data" -> column "Var_A", value "data"
            column = "Var_" + line.split(':')[0].strip()
            value = line.split(':')[-1].strip()
        elif "$" in line:
            # "$Param = data" -> column "Parameter", value "Param"
            column = "Parameter"
            value = line.split()[0].split('$')[-1]
        elif line.split()[0].isdigit():
            # "1  CONTINUE" -> column "Number", value "1"
            column = "Number"
            value = line.split()[0]
        else:
            # "Something A" -> column "Something", value "A"
            column, value = line.split()[:2]
        df_dict.setdefault(column, []).append(value)
    # Columns differ in length, so wrap each list in a Series and pad with ''
    df = pd.DataFrame({k: pd.Series(v) for k, v in df_dict.items()}).fillna('')
    #df.to_csv("result.csv", index=False, sep="|")
    

    df looks like this:

      Number Var_A Var_B Var_C Var_D Something Parameter
    0      1  data  data  data  data         A     Param
    1      2                                      Param2
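
The frame above fills each column independently, so a Parameter value ends up beside whichever Number shares its row index. The output asked for in the question instead repeats a block's fields once per $ line. Below is a minimal sketch of one way to get that shape. It assumes each block starts with a line whose first token is a number, and that a block's $ lines come after its other fields (the sample suggests this, but it is a guess). The names parse_records and to_csv_text are made up for illustration, and the sketch writes the delimited text with the standard-library csv module rather than pandas.

```python
import csv
import io


def parse_records(text):
    """Return one dict per "$..." line, carrying the fields of its block."""
    rows = []
    record = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        first = line.split()[0]
        if line.startswith("$"):
            # "$Param = data" -> emit a row for this parameter name
            name = line[1:].split("=", 1)[0].strip()
            rows.append({**record, "Parameter": name})
        elif ":" in line:
            # "A:data" -> field "Var_A" of the current block
            key, value = line.split(":", 1)
            record["Var_" + key.strip()] = value.strip()
        elif first.isdigit():
            # "1  CONTINUE" -> start a new block
            record = {"Number": first}
        else:
            # "Something A" -> field "Something" with value "A"
            key, _, value = line.partition(" ")
            record[key] = value.strip()
    return rows


def to_csv_text(rows, sep="|"):
    """Serialise the rows as delimited text, columns in first-seen order."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, delimiter=sep, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample input copied from the question (blank lines omitted)
sample = """1  CONTINUE
A:data
B:data
C:data
D:data
 Something A
$Param     = data
$Param2    = data
2 CONTINUE
"""

rows = parse_records(sample)
print(to_csv_text(rows))
```

For the sample this prints a header line followed by two rows, both with Number 1: one for Param and one for Param2. Block 2 produces no row because the sample shows no $ lines for it. If a DataFrame is preferred, pd.DataFrame(rows) accepts the same list of dicts.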