Tags: python, pandas, dataframe, factors

How to create a factorial data frame in pandas?


How can I create a pandas data frame using all possible combinations of factors?

factor1 = ['a','b']
factor2 = ['x','y','z']
factor3 = [1, 2]
val = 0

This is what I'm aiming for:

   factor1 factor2  factor3  val
      a       x        1      0
      a       y        1      0
      a       z        1      0
      a       x        2      0
      a       y        2      0
      a       z        2      0   
      b       x        1      0
      b       y        1      0
      b       z        1      0
      b       x        2      0
      b       y        2      0
      b       z        2      0

With such a small number of factors this could be done manually, but as the number increases it would be practical to have a slightly more automated way to construct it.


Solution

  • This is what list comprehensions are for.

    factor1 = ['a','b']
    factor2 = ['x','y','z']
    factor3 = [1, 2]
    val = 0
    
    combs = [ (f1, f2, f3, val)
        for f1 in factor1
        for f2 in factor2
        for f3 in factor3 ]
    # [ ('a', 'x', 1, 0),
    #   ('a', 'x', 2, 0),
    #   ('a', 'y', 1, 0),
    #   ('a', 'y', 2, 0),
    #   ... etc
    

    Replace (f1, f2, f3, val) with whatever you want to use to print the table, or print the table directly from the list of tuples.

    Mathematically, this is known as the Cartesian product.
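
Since the question asks for a pandas data frame and mentions that the number of factors may grow, here is a minimal sketch of the same Cartesian product built with `itertools.product` from the standard library, which avoids writing one `for` clause per factor. The `factors` dict and the final sort are assumptions added for illustration (they are not part of the answer above); the sort only reproduces the row order shown in the question.

```python
from itertools import product

import pandas as pd

# Hypothetical container: one entry per factor, so adding a factor
# means adding a key rather than another nested loop.
factors = {
    "factor1": ["a", "b"],
    "factor2": ["x", "y", "z"],
    "factor3": [1, 2],
}
val = 0

# product() yields every combination; the last iterable varies fastest.
rows = list(product(*factors.values()))

# One column per factor, plus a constant column for val.
df = pd.DataFrame(rows, columns=list(factors.keys()))
df["val"] = val

# Optional: reorder rows to match the layout in the question
# (factor2 varies fastest, then factor3, then factor1).
df = df.sort_values(["factor1", "factor3", "factor2"]).reset_index(drop=True)

print(df)
```

Without the `sort_values` step the rows come out in the same order as the list comprehension above, with the last factor changing fastest.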