
FastAI V2: TabularLearner export.pkl from learn.export() is Very Large


I'm not sure whether this is intended, but the export.pkl produced by learn.export() is about 471 MB, which is prohibitive for deployment in some applications.

The model itself, as saved by SaveModelCallback, is only 131 KB, and I only need the Learner in order to apply the same transforms/processing (Normalization, FillMissing, Categorify) at inference time.

Is there a reason this is so large? I've also confirmed that the Learner is not holding on to a batch:

learn.xb = (None, )

learn.yb = (None, )
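
One way to find out which part of an object dominates a pickle file such as export.pkl is to pickle each attribute separately and compare the sizes. The following is a minimal sketch using only the standard library. The helper name `pickled_sizes` and the stand-in class `FakeLearner` are hypothetical and exist only for illustration; with fastai you would presumably pass `learn` (or `learn.dls`) instead, which is an assumption and not something the original post tested.

```python
import pickle


def pickled_sizes(obj):
    """Return {attribute name: pickled size in bytes}, largest first."""
    sizes = {}
    for name, value in vars(obj).items():
        try:
            sizes[name] = len(pickle.dumps(value))
        except Exception:
            # Some attributes (lambdas, open handles, ...) cannot be pickled; skip them.
            continue
    return dict(sorted(sizes.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical stand-in object, for demonstration only.
class FakeLearner:
    def __init__(self):
        self.model = [0.0] * 100          # small payload
        self.data = list(range(100_000))  # large payload
        self.xb = (None,)                 # already cleared


report = pickled_sizes(FakeLearner())
for name, size in report.items():
    print(f"{name:>6}: {size:,} bytes")
```

Whichever attribute appears first in the report is the one worth inspecting or stripping before exporting.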


Solution

  • Original Post: https://forums.fast.ai/t/tabularlearner-export-pkl-from-learn-export-is-very-large/81251/2

    This requires the wwf package (pip install wwf); see https://walkwithfastai.com/tab.export

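    import torch
    from fastai.tabular.all import *  # provides TabularLearner
    from wwf.tab.export import *      # provides to.export() and load_pandas()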

    1. We manually save the model held by the Learner:

    torch.save(learn.model, f'{model_dir}/2_{REF}_LEARNER_MODEL.pt')

    2. We export the Tabular object as well:

    to.export(f'{model_dir}/3_{REF}_TABULAR_OBJECT.pkl')

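    3. We load the Tabular object and process the new data with the saved transforms:

    to_new = load_pandas(f'{model_dir}/3_{REF}_TABULAR_OBJECT.pkl')
    to_new = to_new.train.new(df[:20])
    to_new.process()
    # dls_new is used in the next step but is never defined in the original post;
    # building it from the processed object is the assumed missing step
    dls_new = to_new.dataloaders()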
    
    4. We load the model and wrap it in a new Learner:

    model_2 = torch.load(f'{model_dir}/2_{REF}_LEARNER_MODEL.pt')
    learn_new = TabularLearner(dls_new, model_2)  # pass the loaded model_2, not the undefined name `model`
    
    5. We run inference:

    row, clas, probs = learn_new.predict(df.iloc[0])
    row.show()
    probs
    

    The savings are substantial:

    • Model: 135 KB
    • Tabular Object: 6 KB

    vs.

    • learn.export(): 417 MB
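
To check the savings on your own files, a small standard-library helper can report on-disk sizes side by side. This is only a sketch: the function names `human_size` and `size_report` are hypothetical, and the demo writes throwaway files rather than real model artifacts. In practice you would pass the paths of the .pt, .pkl and export.pkl files produced above.

```python
import os
import tempfile


def human_size(num_bytes):
    """Format a byte count using binary units (1 KB = 1024 B)."""
    for unit in ("B", "KB", "MB", "GB"):
        if num_bytes < 1024 or unit == "GB":
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024


def size_report(paths):
    """Map each file name to its human-readable size on disk."""
    return {os.path.basename(p): human_size(os.path.getsize(p)) for p in paths}


# Demo with throwaway files standing in for the saved artifacts.
with tempfile.TemporaryDirectory() as tmp:
    small = os.path.join(tmp, "small.bin")
    large = os.path.join(tmp, "large.bin")
    with open(small, "wb") as f:
        f.write(b"\0" * 2048)               # 2 KB
    with open(large, "wb") as f:
        f.write(b"\0" * (3 * 1024 * 1024))  # 3 MB
    demo = size_report([small, large])

print(demo)
```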