python pandas dataframe for-loop python-itertools

Python: Replace two nested for loops with a faster way to sum pairs of elements


I have a list of 5 elements (in practice it could be 50,000). I want to sum every pair of elements from that list, including each element paired with itself, and create a DataFrame from the results. This is the code I have so far:

import pandas as pd

x = list(range(1, 6))  # 5 elements: 1..5
t = []
for i in x:
    for j in x:
        # one row per ordered pair: (first, second, sum)
        t.append((i, j, i + j))

df = pd.DataFrame(t)

The code above produces the correct result, but it takes a very long time when the list has many more elements. I'm looking for the fastest way to do the same thing.


Solution

  • Option 1: all pairs can be obtained with a cross join, DataFrame.merge(..., how='cross'), without explicit loops (how='cross' requires pandas 1.2 or later)

    import numpy as np
    import pandas as pd

    x = np.arange(1, 5+1)
    df = pd.DataFrame(x, columns=['x']).merge(pd.Series(x, name='y'), how='cross')
    df['sum'] = df.x.add(df.y)
    print(df)
    
        x  y  sum
    0   1  1    2
    1   1  2    3
    2   1  3    4
    3   1  4    5
    4   1  5    6
    5   2  1    3
    6   2  2    4
    ...
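
As a quick sanity check, the cross join can be compared against the nested loops from the question. This is only a sketch: the column names x, y and sum follow the snippet above, and loop_df and cross_df are names chosen here for illustration.

```python
import numpy as np
import pandas as pd

x = np.arange(1, 6)

# Reference result: the nested loops from the question.
loop_rows = [(i, j, i + j) for i in x.tolist() for j in x.tolist()]
loop_df = pd.DataFrame(loop_rows, columns=["x", "y", "sum"])

# Cross join: every row of the left frame paired with every row of the right.
# how="cross" requires pandas 1.2 or later.
cross_df = pd.DataFrame({"x": x}).merge(pd.DataFrame({"y": x}), how="cross")
cross_df["sum"] = cross_df["x"] + cross_df["y"]

# Compare values only, so integer dtype differences do not matter.
same = bool((loop_df.to_numpy() == cross_df.to_numpy()).all())
print(same)
```

If the two frames match, the merge is a drop-in replacement for the loops, with rows in the same order.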
    

    Option 2: with itertools.product()

    import itertools
    import pandas as pd

    num = 5
    # product() yields every ordered pair; the columns default to 0 and 1
    df = pd.DataFrame(itertools.product(range(1,num+1),range(1,num+1)))
    df['sum'] = df[0].add(df[1])
    print(df)
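
A further possibility, not part of the answer above, is to skip the join and build the three columns directly with NumPy. The sketch below assumes the input is a flat sequence of numbers; pairwise_sums is a hypothetical helper name, and the approach is untimed here, so any speed advantage over the two options above is something to measure on your own data.

```python
import numpy as np
import pandas as pd


def pairwise_sums(values):
    """Return a DataFrame with every ordered pair (x, y) and x + y."""
    arr = np.asarray(values)
    n = arr.size
    left = np.repeat(arr, n)   # 1,1,1,1,1,2,2,2,2,2,...
    right = np.tile(arr, n)    # 1,2,3,4,5,1,2,3,4,5,...
    return pd.DataFrame({"x": left, "y": right, "sum": left + right})


df = pairwise_sums(range(1, 6))
print(df.head())
print(len(df))
```

Whichever option is used, the result has n² rows. For 50,000 elements that is 2.5 billion rows, roughly 60 GB for three int64 columns, so memory is likely to become the limit before the loop overhead does.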