Tags: python, arrays, numpy, concatenation, genfromtxt

Concatenate many lists from text files


I have a few thousand lists in text files. For example:

text1.txt:

1,2,3,4

text2.txt:

a,b,c,d

I want to open all of my text files and put them all into one list, like so:

[1,2,3,4,a,b,c,d]

However, all of the loops I have tried either give me a list of arrays like:

[array([1,2,3,4]),array([a,b,c,d])]

Or they just give me back [1,2,3,4].

This is the most recent code I tried:

file_list=glob.glob(file_dir+'/*.txt')
data=[]
for file_path in file_list:
    data.append(np.concatenate([np.genfromtxt(file_path, delimiter=',')]))

Which just puts the first list into data. Without concatenate, it puts the two lists into data as a list of two separate arrays.


Solution

  • Collect the arrays in a list, data, then call np.concatenate once to join the list of arrays into a single array:

    import glob
    import numpy as np

    data=[]
    for file_path in glob.glob(file_dir+'/*.txt'):
        data.append(np.genfromtxt(file_path, delimiter=','))
    # Join all per-file arrays in one call, after the loop
    result = np.concatenate(data)
    

    Since np.genfromtxt accepts a generator as its first argument, you can also avoid creating many small, separate arrays by passing it a generator expression that yields every line from every text file:

    import glob
    import itertools as IT
    import numpy as np

    # Chain the lines of every file into one iterable for genfromtxt
    lines = IT.chain.from_iterable(open(file_path, 'r')
                                   for file_path in glob.glob('*.txt'))
    result = np.genfromtxt(lines, delimiter=',', dtype=None)
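To compare the two approaches above side by side, here is a minimal, self-contained sketch. It rests on several assumptions that are not in the original post: the files are purely numeric, each file holds a single row of the same length, and the sample files live in a temporary directory created just for the demo. The helper iter_lines is hypothetical (it is not part of NumPy); it stands in for the chained generator and closes each file once it has been read. With the default float dtype, non-numeric fields such as a,b,c,d would come back as nan, so mixed data would need a different dtype.

```python
import glob
import os
import tempfile

import numpy as np


def iter_lines(paths):
    """Yield every line of every file, closing each file once it is read."""
    for path in paths:
        with open(path, "r") as f:
            yield from f


# Hypothetical sample data: two single-row, comma-separated files.
file_dir = tempfile.mkdtemp()
samples = {"text1.txt": "1,2,3,4\n", "text2.txt": "5,6,7,8\n"}
for name, content in samples.items():
    with open(os.path.join(file_dir, name), "w") as f:
        f.write(content)

# Sort the paths so the output order is deterministic (glob order is not).
paths = sorted(glob.glob(os.path.join(file_dir, "*.txt")))

# Approach 1: one small array per file, joined by a single concatenate call.
per_file = [np.genfromtxt(path, delimiter=",") for path in paths]
joined = np.concatenate(per_file)

# Approach 2: feed all lines to a single genfromtxt call, then flatten.
table = np.genfromtxt(iter_lines(paths), delimiter=",")
flat = table.ravel()

print(joined.tolist())
print(table.shape)
print(np.array_equal(joined, flat))
```

In this setup the single genfromtxt call returns a two-dimensional array with one row per file, so ravel() is what turns it into the flat result the question asks for. If the files had rows of different lengths, the first approach would probably be the safer choice, since genfromtxt expects a consistent column count within one call.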