python pandas parallel-processing pandarallel

pandarallel and lambda function


I'm struggling with the pandarallel library.

Here is what I'm doing:

def ponerfecha(row):
    import datetime
    a = datetime.datetime(2023, 9, 10, row['HORA'], row['MINUTO'])
    return a

CargaT['FECHATRX'] = CargaT.parallel_apply(lambda row: ponerfecha(row), axis=1)

It is not working. I'm getting the following error: NameError: name 'ponerfecha' is not defined

Example:

data = {'HORA': [10, 12, 15],
        'MINUTO': [30, 45, 0]}
CargaT = pd.DataFrame(data)

Expected output:

   HORA  MINUTO            FECHATRX
0    10      30 2023-09-10 10:30:00
1    12      45 2023-09-10 12:45:00
2    15       0 2023-09-10 15:00:00

Any idea what I'm doing wrong? The non-parallel version (plain apply) works perfectly.


Solution

  • A complete example that's working for me:

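    import pandas as pd
    from pandarallel import pandarallel

    # Must run before parallel_apply is used: it adds the parallel_* methods to pandas objects
    pandarallel.initialize(progress_bar=True)

    def ponerfecha(row):
        import datetime
        # Build a datetime on the fixed date 2023-09-10 from the row's hour and minute
        a = datetime.datetime(2023, 9, 10, row['HORA'], row['MINUTO'])
        return a

    # Sample data from the question
    data = {'HORA': [10, 12, 15],
            'MINUTO': [30, 45, 0]}
    CargaT = pd.DataFrame(data)

    # Apply ponerfecha to every row in parallel
    CargaT['FECHATRX'] = CargaT.parallel_apply(lambda row: ponerfecha(row), axis=1)
    print(CargaT)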
    

    Prints:

    INFO: Pandarallel will run on 8 workers.
    INFO: Pandarallel will use Memory file system to transfer data between the main process and workers.
     100.00% :::::::::::::::::::::::::::::::::::::::: |        1 /        1 |
     100.00% :::::::::::::::::::::::::::::::::::::::: |        1 /        1 |
     100.00% :::::::::::::::::::::::::::::::::::::::: |        1 /        1 |
       HORA  MINUTO            FECHATRX
    0    10      30 2023-09-10 10:30:00
    1    12      45 2023-09-10 12:45:00
    2    15       0 2023-09-10 15:00:00
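
As a side note, if the only goal is to build the FECHATRX column, a row-wise function (parallel or not) may not be needed at all. pd.to_datetime can assemble datetimes from a DataFrame whose columns are named year, month, day, hour and minute. The sketch below assumes the date is always the fixed 2023-09-10 from the question; the helper name `parts` is just an illustrative choice, not something from the original code.

```python
import pandas as pd

# Same sample data as in the question.
CargaT = pd.DataFrame({"HORA": [10, 12, 15], "MINUTO": [30, 45, 0]})

# Assumption: every row uses the fixed date 2023-09-10, as in the question.
# The scalar values are broadcast to the length of the HORA/MINUTO columns.
parts = pd.DataFrame({
    "year": 2023,
    "month": 9,
    "day": 10,
    "hour": CargaT["HORA"],
    "minute": CargaT["MINUTO"],
})

# to_datetime combines the named columns into one datetime per row.
CargaT["FECHATRX"] = pd.to_datetime(parts)
print(CargaT)
```

Because this is a single vectorized call, it avoids sending a Python function to worker processes, so the NameError cannot occur here. It does not explain the error itself, and it only applies when the per-row logic is as simple as in this example.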