Tags: python, pyspark, apache-spark-mllib

Can we use a for loop with ParamGridBuilder in PySpark?


The code below adds parameters to a ParamGridBuilder in PySpark without using any loops.

from pyspark.ml.tuning import ParamGridBuilder
paramGrid = ParamGridBuilder() \
        .addGrid(lr.regParam, [0.1, 0.01]) \
        .addGrid(lr.fitIntercept, [False, True]) \
        .addGrid(lr.elasticNetParam, [0.0, 0.5, 1.0]) \
        .build()

I have a dictionary like this:

dict = {lr.regParam: [0.1, 0.01], lr.fitIntercept: [False, True], lr.elasticNetParam: [0.0, 0.5, 1.0]}

Can we use a loop to build the ParamGridBuilder, and will it work?

for k,v in dict.items():
    paramGrid = ParamGridBuilder().addGrid(k,v).build()

Solution

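  • The loop in the question creates a new ParamGridBuilder on every iteration, so only the last parameter ends up in the grid. Keep a single builder instead; for example, you can use the reduce function: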

    from functools import reduce
    paramGrid  = reduce(
        lambda a,b: a.addGrid(*b),
        dict.items(),
        ParamGridBuilder(),
    ).build()
    

    Or with a for loop:

    paramGrid = ParamGridBuilder()
    for k,v in dict.items():
        paramGrid = paramGrid.addGrid(k,v)
    paramGrid = paramGrid.build()
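
To see why the loop from the question falls short while the two versions above work, here is a minimal sketch that runs without Spark. `MiniGridBuilder` is a hypothetical stand-in written for this illustration (it is not part of PySpark), and it assumes that `build()` returns one dict per combination of the added values. Plain strings stand in for `lr.regParam` and the other params, and the dictionary is named `params` to avoid shadowing the built-in `dict`.

```python
from functools import reduce
from itertools import product


class MiniGridBuilder:
    """Hypothetical stand-in for ParamGridBuilder (not part of PySpark)."""

    def __init__(self):
        self._grid = {}

    def addGrid(self, param, values):
        # Record the candidate values and return self so calls can be chained.
        self._grid[param] = list(values)
        return self

    def build(self):
        # One dict per combination: the Cartesian product of all value lists.
        keys = list(self._grid)
        return [dict(zip(keys, combo)) for combo in product(*self._grid.values())]


# Plain strings stand in for lr.regParam, lr.fitIntercept, lr.elasticNetParam.
params = {
    "regParam": [0.1, 0.01],
    "fitIntercept": [False, True],
    "elasticNetParam": [0.0, 0.5, 1.0],
}

# Loop from the question: a fresh builder each iteration,
# so only the last key survives.
for k, v in params.items():
    broken = MiniGridBuilder().addGrid(k, v).build()

# Fixed loop: one builder, extended on every iteration, built once at the end.
builder = MiniGridBuilder()
for k, v in params.items():
    builder = builder.addGrid(k, v)
fixed = builder.build()

# reduce variant, equivalent to the fixed loop.
reduced = reduce(
    lambda b, kv: b.addGrid(*kv), params.items(), MiniGridBuilder()
).build()

print(len(broken), len(fixed), len(reduced))  # prints: 3 12 12
```

The broken loop yields only the 3 values of the last parameter, while the single-builder versions yield all 2 × 2 × 3 = 12 combinations.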