
Using a generator comprehension to create data frames


Correct me if I'm wrong, but a generator comprehension (generator expression) produces a generator object. If that is the case, why are df1, df2, df3 and df4 actual DataFrames and not generator objects?

In  [1]: import numpy as np
import pandas as pd
nrows, ncols = 10, 10
rng = np.random.RandomState(42)
# Unpack a generator expression that yields four DataFrames
df1, df2, df3, df4 = (pd.DataFrame(rng.rand(nrows, ncols)) for i in range(4))

In  [2]: df1
Out [2]:
          0         1         2  ...         7         8         9
0  0.103124  0.902553  0.505252  ...  0.010838  0.905382  0.091287
1  0.319314  0.950062  0.950607  ...  0.328665  0.672518  0.752375
2  0.791579  0.789618  0.091206  ...  0.887704  0.350915  0.117067
3  0.142992  0.761511  0.618218  ...  0.821860  0.706242  0.081349
4  0.084838  0.986640  0.374271  ...  0.753378  0.376260  0.083501
5  0.777147  0.558404  0.424222  ...  0.468661  0.056303  0.118818
6  0.117526  0.649210  0.746045  ...  0.868599  0.223596  0.963223
7  0.012154  0.969879  0.043160  ...  0.553854  0.969303  0.523098
8  0.629399  0.695749  0.454541  ...  0.280963  0.950411  0.890264
9  0.455657  0.620133  0.277381  ...  0.077735  0.974395  0.986211

Solution

  • Unpacking a generator object (by doing a, b, c, etc. = (genexp)) iterates over it, so it generates its items and one item is bound to each of the variables it unpacks to. In your case, each variable (df1, df2, df3 and df4) gets one DataFrame from the generator object because you unpacked the generator into those variables. The generator object itself is never assigned to any of them.
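
The behaviour described above can be checked without pandas. The following is a minimal sketch, assuming plain integers as stand-ins for the DataFrames; the names gen, values, leftover and mismatch_error are made up for illustration and do not appear in the question.

```python
import types

# Hypothetical stand-in for the DataFrame-producing generator expression.
gen = (i * i for i in range(4))

# Before unpacking, the expression really is a generator object.
is_generator = isinstance(gen, types.GeneratorType)

# Unpacking iterates over the generator and binds one item to each name.
a, b, c, d = gen
values = (a, b, c, d)

# The generator has been consumed by the unpacking, so nothing is left.
leftover = list(gen)

# The number of targets must match the number of items produced;
# otherwise the assignment raises ValueError.
mismatch_error = None
try:
    x, y = (i for i in range(4))
except ValueError as exc:
    mismatch_error = exc

print(is_generator, values, leftover, type(mismatch_error).__name__)
```

If the generator object itself is what you want, assign the expression to a single name without unpacking (gen = (...)) and pull items from it later with next(gen) or a for loop.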