python pandas txt

Create txt file from Pandas Dataframe


I would like to save my dataframe in a format that matches an existing txt file. I have a model trained on that txt file, and I now want to predict on new data, which needs to match the same format.

The target txt file looks like this (first 3 rows):

2 qid:0 0:0.4967141530112327 1:-0.1382643011711847 2:0.6476885381006925 3:1.523029856408025 4:-0.234153374723336 
1 qid:2 0:1.465648768921554 1:-0.2257763004865357 2:0.06752820468792384 3:-1.424748186213457 4:-0.5443827245251827 
2 qid:0 0:0.7384665799954104 1:0.1713682811899705 2:-0.1156482823882405 3:-0.3011036955892888 4:-1.478521990367427 

The first column is just an arbitrary integer (here the 2 and the 1). The qid is always joined by a colon to an integer. Each of the remaining columns is an integer index followed by a colon and a float.
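For reference, the format described above can be read back with a few lines of plain Python, which is handy for checking that a generated file really matches it. This is only a sketch: `parse_line` is a hypothetical helper name (not part of pandas or any library), and it assumes every token after the qid has the form index:value.

```python
def parse_line(line):
    """Split one line of the target file into (label, qid, features)."""
    tokens = line.split()  # split() also ignores the trailing space
    label = int(tokens[0])
    # The second token looks like "qid:0"; keep only the integer part
    qid = int(tokens[1].split(":", 1)[1])
    features = {}
    for token in tokens[2:]:
        index, value = token.split(":", 1)
        features[int(index)] = float(value)
    return label, qid, features


# First row of the target file shown above
sample = ("2 qid:0 0:0.4967141530112327 1:-0.1382643011711847 "
          "2:0.6476885381006925 3:1.523029856408025 4:-0.234153374723336 ")
label, qid, features = parse_line(sample)
```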

My dataframe looks like this:

import pandas as pd

data = {'label': [2, 3, 2],
        'qid': ['qid:0', 'qid:1', 'qid:0'],
        '0': [0.4967, 0.4967, 0.4967],
        '1': [0.4967, 0.4967, 0.4967],
        '2': [0.4967, 0.4967, 0.4967],
        '3': [0.4967, 0.4967, 0.4967],
        '4': [0.4967, 0.4967, 0.4967]}

df = pd.DataFrame(data)

Solution

  • Try this and let us know if it works for your case:

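    import pandas as pd

    data = pd.read_csv('output_list.txt', sep=" ", header=None)  # file has no header row

    # Placeholder names: the list needs one entry per column in the file
    data.columns = ["a", "b", "c", "etc."]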
    


    Updated code. It is very messy; if this solves your problem, it can be updated to handle large amounts of data using NumPy array methods:

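    # Prefix every feature value with its column name, e.g. 0.4967 -> "0:0.4967"
    for col in list(data.keys()):
        if col in ("label", "qid"):
            continue  # leave the label and qid columns unchanged
        data[col] = [f"{col}:{value}" for value in data[col]]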
    

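The snippets above prefix the feature values but stop short of writing the txt file. A minimal sketch of the remaining step follows, assuming the dataframe from the question (columns `label`, `qid`, and feature columns named `'0'` to `'4'`). The helper names `to_ranking_lines` and `save_ranking_file` and the output path are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Dataframe from the question
data = {'label': [2, 3, 2],
        'qid': ['qid:0', 'qid:1', 'qid:0'],
        '0': [0.4967, 0.4967, 0.4967],
        '1': [0.4967, 0.4967, 0.4967],
        '2': [0.4967, 0.4967, 0.4967],
        '3': [0.4967, 0.4967, 0.4967],
        '4': [0.4967, 0.4967, 0.4967]}
df = pd.DataFrame(data)


def to_ranking_lines(df):
    """Build one '<label> <qid> <col>:<value> ...' string per row."""
    feature_cols = [c for c in df.columns if c not in ("label", "qid")]
    lines = []
    for rec in df.to_dict("records"):
        features = " ".join(f"{col}:{rec[col]}" for col in feature_cols)
        lines.append(f"{rec['label']} {rec['qid']} {features}")
    return lines


def save_ranking_file(df, path):
    """Write the formatted rows to a text file, one row per line."""
    with open(path, "w") as handle:
        handle.write("\n".join(to_ranking_lines(df)) + "\n")


lines = to_ranking_lines(df)
# save_ranking_file(df, "output.txt")  # hypothetical output path
```

Note that the sample lines in the question end with a trailing space, which this sketch does not reproduce. Whether that matters depends on how the trained model reads its input.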