python, pandas, tqdm

Is it possible to use tqdm for pandas merge operation?


I found examples of the tqdm progress bar being used with groupby and other pandas operations, but I couldn't find anything for merge or join.

Is it possible to use tqdm with a pandas merge?


Solution

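  • tqdm integrates with pandas by adding progress_apply and related methods to DataFrames, Series, and groupby objects. It does not add a merge-specific method, so one workaround for two large dataframes is to chain a dummy progress_apply onto the merged result. Note that the bar then tracks the apply step, which starts only after the merge has finished: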

    import numpy as np
    import pandas as pd
    from tqdm import tqdm

    # Two frames of 4000 rows each; every key appears 1000 times per frame
    df1 = pd.DataFrame({'lkey': 1000 * ['a', 'b', 'c', 'd'],
                        'lvalue': np.random.randint(0, int(1e8), 4000)})
    df2 = pd.DataFrame({'rkey': 1000 * ['a', 'b', 'c', 'd'],
                        'rvalue': np.random.randint(0, int(1e8), 4000)})

    # Register the progress_* methods (such as progress_apply) on pandas objects
    tqdm.pandas()
    # The merge runs to completion first; the bar then tracks the identity
    # progress_apply over the columns of the merged result
    merged = df1.merge(df2, left_on='lkey', right_on='rkey').progress_apply(lambda x: x)
    

    More details are available on this thread: Progress indicator during pandas operations (python)
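
If the goal is a bar that advances while the merge itself is running, one possible approach is to split the left dataframe into row chunks, merge each chunk inside a tqdm loop, and concatenate the pieces. The sketch below is only an illustration under some assumptions: merge_in_chunks and its chunk_size parameter are hypothetical names, not part of pandas or tqdm, and the chunked result matches a single merge only for how='inner' (the default) or how='left', since other join types would repeat unmatched right-hand rows for every chunk. Row order may also differ from a single merge.

```python
import numpy as np
import pandas as pd
from tqdm import tqdm


def merge_in_chunks(left, right, chunk_size=500, **merge_kwargs):
    """Merge `left` with `right` one row chunk at a time, with a tqdm bar.

    Hypothetical helper (not a pandas or tqdm API). Intended only for
    how='inner' or how='left'; other join types are not handled.
    """
    # Nothing to chunk: fall back to a plain merge so concat never sees [].
    if len(left) == 0:
        return left.merge(right, **merge_kwargs)

    # Row offsets at which each chunk of `left` starts.
    starts = range(0, len(left), chunk_size)
    pieces = [
        left.iloc[start:start + chunk_size].merge(right, **merge_kwargs)
        for start in tqdm(starts, desc="merging")
    ]
    # Stitch the per-chunk results back into one dataframe.
    return pd.concat(pieces, ignore_index=True)


# Smaller frames than in the answer above, to keep the example quick.
rng = np.random.default_rng(0)
df1 = pd.DataFrame({'lkey': 250 * ['a', 'b', 'c', 'd'],
                    'lvalue': rng.integers(0, int(1e8), 1000)})
df2 = pd.DataFrame({'rkey': 250 * ['a', 'b', 'c', 'd'],
                    'rvalue': rng.integers(0, int(1e8), 1000)})

merged = merge_in_chunks(df1, df2, chunk_size=100,
                         left_on='lkey', right_on='rkey')
```

The bar here advances once per chunk, so a smaller chunk_size gives finer-grained progress at the cost of more merge calls and some extra overhead.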