python python-3.x pandas numpy pandas-profiling

How to create a dataframe with multiple lists/arrays in python


I have many lists, each consisting of 1-D data, like below:

list1 = [1,2,3,4...]
list2 = ['a','b','c'...] 

Now I need to create a dataframe like below:

df = [[1,'a'],[2,'b'],[3,'c']]

I need this dataframe so that I can profile each column using pandas_profiling. Please suggest.

I have tried

list1+list2

but it's giving data like below:

list3=[1,2,3,4...'a','b'...]

I also tried numpy hstack, but that does not work either:

import pandas as pd
import pandas_profiling
import numpy as np

list3 = np.hstack([[list1],[list2]])

array([[1,2,3,4,'a','b','c'..]],dtype='<U5')

Solution

  • You can use the zip function described in the answer from this question to create your nested list.

    Note that in Python 3, zip returns a lazy iterator rather than a list, and passing it straight to pd.DataFrame could lead to an error in some pandas versions, so wrap it in list() first.

    The solution would be:

    import pandas as pd
    
    list1 = [1,2,3]
    list2 = ['a','b','c']
    df = pd.DataFrame(list(zip(list1,list2)), columns=['list1', 'list2'])
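As an alternative to zip, a dict of lists may also work here. The sketch below is not part of the original answer; it assumes the same list1 and list2 as above, and the names df_from_dict and df_from_zip are only illustrative.

```python
import pandas as pd

list1 = [1, 2, 3]
list2 = ['a', 'b', 'c']

# Each dict key becomes a column name; each list becomes that column's values.
df_from_dict = pd.DataFrame({'list1': list1, 'list2': list2})

# The zip-based construction from the answer, for comparison.
df_from_zip = pd.DataFrame(list(zip(list1, list2)), columns=['list1', 'list2'])

# Check that both approaches produce the same frame, then show its rows.
print(df_from_dict.equals(df_from_zip))
print(df_from_dict.values.tolist())
```

One difference worth knowing: zip silently stops at the shortest list, while the dict form raises a ValueError if the lists have different lengths, which can make mismatched inputs easier to spot.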