Tags: python, pandas, numpy, glob

Arranging files side by side in a text file


I have many .txt files in a directory, named TU_1.ST01.XXX.TXT_, TU_1.ST02.XXX.TXT_, TU_1.ST03.XXX.TXT_, ..., TU_1.ST1000.XXX.TXT_. I want to arrange all of these text files side by side and save the result in a single file, equivalent to what the paste command does in shell scripting.

Can anybody help me do this?

I tried this script:

import numpy as np
import os
import glob

for file in glob.glob("*.TXT_"):
    print(file)
    # here I want to arrange the files side by side

Solution

  • Example input files, shown side by side with paste:

    $ paste *.txt
    one    four    six
    two    five    seven
    thee           eight   
                   nine
    

    Using pandas is one option.

    import pandas as pd
    from pathlib import Path
    
    >>> pd.DataFrame(f.read_text().splitlines() for f in Path().glob('*.txt'))
          0      1      2     3
    0   one    two   thee  None
    1  four   five   None  None
    2   six  seven  eight  nine
    

    You can then use .transpose() (or its shorthand .T) so that each file becomes its own column.

    >>> df = pd.DataFrame(f.read_text().splitlines() for f in Path().glob('*.txt'))
    >>> df.T
          0     1      2
    0   one  four    six
    1   two  five  seven
    2  thee  None  eight
    3  None  None   nine
    

    You can then use .to_csv() with sep='\t' if you want to emulate the tab-separated output of paste.

    >>> print(df.T.to_csv(index=None, header=None, sep='\t'), end='')
    one    four    six
    two    five    seven
    thee           eight   
                   nine
    

    You could also implement it using itertools.zip_longest
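
    A minimal sketch of the itertools.zip_longest approach, using only the standard library. The names paste_lines and station_number, and the output name combined.tsv, are hypothetical and not part of the original answer. The sort key assumes file names follow the TU_1.STnn.XXX.TXT_ pattern from the question, so that ST02 sorts before ST1000; it may need adjusting for other naming schemes.

```python
import re
from itertools import zip_longest
from pathlib import Path


def station_number(path):
    """Hypothetical sort key: the integer after 'ST' in names like TU_1.ST02.XXX.TXT_."""
    match = re.search(r"\.ST(\d+)\.", Path(path).name)
    # Names without a station number sort before all numbered ones.
    return int(match.group(1)) if match else -1


def paste_lines(paths, sep="\t"):
    """Join the lines of the given files side by side, one output row per line number."""
    columns = [Path(p).read_text().splitlines() for p in paths]
    # zip_longest keeps going until the longest file is exhausted and
    # fills the gaps left by shorter files with an empty string.
    return [sep.join(row) for row in zip_longest(*columns, fillvalue="")]


# Example usage (the glob pattern and output name are assumptions):
#   files = sorted(Path().glob("*.TXT_"), key=station_number)
#   Path("combined.tsv").write_text("\n".join(paste_lines(files)) + "\n")
```

    Sorting explicitly matters here because glob does not promise any particular order, and a plain alphabetical sort would place ST1000 before ST11. Note that this sketch reads every file fully into memory, which should be fine for small text files but may not suit very large ones.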