python, pandas, txt

Convert a pandas DataFrame to txt files in Google Colab


I have a dataset called preprocessed_sample, stored in the following format:

preprocessed_sample.ftr.zstd 

and I am opening it with the following code:

import pandas as pd

df = pd.read_feather(filepath)

The output looks something like this:

index   text
0   0   i really dont come across how i actually am an...
1   1   music has become the only way i am staying san...
2   2   adults are contradicting
3   3   exo are breathing 553 miles away from me. they...
4   4   im missing people that i met when i was hospit...

Finally, I would like to save this dataset to a file called 'examples' that contains all these texts in txt format.

Update: @Tsingis I would like each of the above lines to be saved as its own txt file. For example, the first line 'i really dont come across how i actually am an...' will be a file named 'line1.txt', and in the same way all the other lines will be txt files in a folder called 'examples'.


Solution

  • You can use the following code:

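    import pathlib
    
    # Create the output folder if it does not already exist
    data_dir = pathlib.Path('./examples')
    data_dir.mkdir(exist_ok=True)
    
    # Write each text to its own file, numbered from 1 (line1.txt, line2.txt, ...)
    for i, text in enumerate(df['text'], 1):
        with open(data_dir / f'line{i}.txt', 'w', encoding='utf-8') as fp:
            fp.write(text)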
    

    Output:

    examples/
    ├── line1.txt
    ├── line2.txt
    ├── line3.txt
    ├── line4.txt
    └── line5.txt
    
    1 directory, 5 files
    

    line1.txt:

    i really dont come across how i actually am an...
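
As a possible variation (not part of the original answer), the same loop can be wrapped in a small helper that also skips missing values, since writing a `None`/`NaN` entry as text would otherwise fail. The function name `write_texts`, the `examples_demo` folder, and the sample DataFrame below are assumptions made for illustration; the real `df` would come from `pd.read_feather(filepath)` as in the question.

```python
from pathlib import Path

import pandas as pd


def write_texts(df, out_dir="examples", column="text"):
    """Write each non-missing value of `column` to out_dir/line<N>.txt."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    # dropna() skips missing rows; files are numbered consecutively from 1
    for i, text in enumerate(df[column].dropna(), 1):
        target = out_path / f"line{i}.txt"
        target.write_text(str(text), encoding="utf-8")
        written.append(target)
    return written


# Stand-in data for illustration only (hypothetical, not the asker's dataset)
sample = pd.DataFrame(
    {"text": ["adults are contradicting", None, "second example text"]}
)
paths = write_texts(sample, out_dir="examples_demo")
print([p.name for p in paths])
```

Because missing rows are dropped before numbering, the file numbers here follow the order of the remaining texts and may no longer match the original row positions. If that mapping matters, iterating over `df[column].items()` and using the index in the file name would be one alternative.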