Search code examples
pythonpandasdataframefiletext-files

Is there a function to write certain values of a dataframe to a .txt file in Python?


I have a dataframe as follows:

Index A B C D E F
1     0 0 C 0 E 0 
2     A 0 0 0 0 F
3     0 0 0 0 E 0
4     0 0 C D 0 0 
5     A B 0 0 0 0

Basically I would like to write the dataframe to a txt file, such that every row consists of the index and the subsequent column name only, excluding the zeroes.

For example:

txt file

1 C E 
2 A F 
3 E 
4 C D 
5 A B

The dataset is quite big, about 1k rows, 16k columns. Is there any way I can do this using a function in Pandas?


Solution

  • Take a matrix vector multiplication between the boolean matrix generated by "is this entry "0" or not" and the columns of the dataframe, and write it to a text file with to_csv (thanks to @Andreas' answer!):

    df.ne("0").dot(df.columns + " ").str.rstrip().to_csv("text_file.txt")
    

    where we right strip the spaces at the end due to the added " " to the last entries.

    If you don't want the name Index appearing in the text file, you can chain a rename_axis(index=None) to get rid of it i.e.,

    df.ne("0").dot(df.columns + " ").str.rstrip().rename_axis(index=None)
    

    and then to_csv as above.