python, text-files, glob

Combining one-column text files into one multi-column file, separated by tabs


I have several one-column text files that I want to place side by side, so that each file becomes one column of the output. The only documentation I could find covers appending files to the end of other files, and that's where I'm stuck.

import glob
MA_files = glob.glob("MA*continuous*2")

with open("MA_continuous_results.csv", "wb") as outfile:
    for eachFile in MA_files:
        with open(eachFile, "rb") as infile:
            eachFile=eachFile.split("maskave_")
            # eachFile[-1] = 15_2 etc
            outfile.write('%s\n' % eachFile[-1])
            for line in infile:
                #split by [ because text after that, files give unnecessary info
                line=line.split('[')
                #only write info before [
                outfile.writelines('%s\n' % line[0])

The output looks like this:

15_2
-0.0383935 
-0.0559652 
-0.0168811 
-0.0996374 
-0.151921 
-0.131327 
-0.0509602 
-0.109181 
-0.0238667 
-0.00646939 
-0.106631 
-0.0380114 
-0.0219288 
-0.0135917 
-0.0627647 
-0.0605226 
-0.0534139 
-0.0134063 
21_2
-0.097086 
-0.210296 
0.0639971 
0.00949209 
-0.227474 
0.0180759 
-0.135376 
-0.212909 
-0.00786295 
-0.00922367 
-0.0749584 
-0.0584701 
-0.019548 
-0.0984993 
-0.00848889 
-0.164244 
-0.0121499 
0.0100612 

But I want it to look like this (columns separated by a tab or comma):

15_2        21_2
-0.0383935  -0.097086
-0.0559652  -0.210296
-0.0168811  0.0639971
-0.0996374  0.00949209
-0.151921   -0.227474
-0.131327   0.0180759
-0.0509602  -0.135376
-0.109181   -0.212909
-0.0238667  -0.00786295
-0.00646939 -0.00922367
-0.106631   -0.0749584
-0.0380114  -0.0584701
-0.0219288  -0.019548
-0.0135917  -0.0984993
-0.0627647  -0.00848889
-0.0605226  -0.164244
-0.0534139  -0.0121499
-0.0134063  0.0100612

How can I do that?

Example filenames:

MA_continuous_hybrid_maskave_19_2
MA_continuous_hybrid_maskave_18_2

Example content:

-0.182682 [344 voxels]
-0.0631301 [344 voxels]
-0.0101798 [344 voxels]
-0.121342 [344 voxels]
-0.547331 [344 voxels]
-0.0582418 [344 voxels]
-0.284454 [344 voxels]
-0.262656 [344 voxels]
-0.123836 [344 voxels]
-0.0371469 [344 voxels]
-0.265201 [344 voxels]
-0.147427 [344 voxels]
-0.34516 [344 voxels]
-0.0431832 [344 voxels]
-0.0171557 [344 voxels]
-0.14525 [344 voxels]
-0.0864529 [344 voxels]
0.0881003 [344 voxels]

Solution

  • I would rewrite your code as follows:

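    import glob
    from itertools import izip  # Python 2 only

    def extract_meaningful_info(line):
        # keep only the text before '[' and drop surrounding whitespace
        return line.split('[')[0].strip()

    # sorted() makes the column order deterministic
    MA_files = sorted(glob.glob("MA*continuous*2"))

    with open("MA_continuous_results.csv", "wb") as outfile:
        # header: the part of each filename after "maskave_", e.g. 15_2
        outfile.write("\t".join(f.split("maskave_")[-1] for f in MA_files) + '\n')
        # izip yields the n-th line of every file together
        for fields in izip(*(open(f) for f in MA_files)):
            fields = [extract_meaningful_info(f) for f in fields]
            outfile.write('\t'.join(fields) + '\n')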
    

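    (This code is for Python 2. In Python 3, use the built-in zip instead of itertools.izip and open the output file in text mode, "w".)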
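For readers on Python 3, the same idea might look roughly like the sketch below. It is an illustration under a few assumptions, not the answerer's code: the function names `extract_value` and `combine_columns` are made up for this example, the column label is assumed to be whatever follows "maskave_" in the filename (as in the question), and `contextlib.ExitStack` is used so that the input files get closed.

```python
from contextlib import ExitStack


def extract_value(line):
    """Return the text before the first '[', without surrounding whitespace."""
    return line.split("[")[0].strip()


def combine_columns(paths, out_path, sep="\t"):
    """Write the input files side by side, one column per file (hypothetical helper)."""
    # Assumed labelling rule: the part of the filename after "maskave_", e.g. "15_2".
    labels = [p.split("maskave_")[-1] for p in paths]
    with ExitStack() as stack, open(out_path, "w") as outfile:
        # ExitStack closes every input file when the with-block ends.
        infiles = [stack.enter_context(open(p)) for p in paths]
        outfile.write(sep.join(labels) + "\n")
        # zip yields the n-th line of every file together and stops
        # as soon as the shortest file runs out of lines.
        for lines in zip(*infiles):
            outfile.write(sep.join(extract_value(line) for line in lines) + "\n")
```

With the files from the question, a call such as `combine_columns(sorted(glob.glob("MA*continuous*2")), "MA_continuous_results.csv")` (after `import glob`) should produce the tab-separated layout asked for. If the files can differ in length, `itertools.zip_longest` with a `fillvalue` of `""` could replace `zip` so that no rows are dropped.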

    You might want to read about: