python, machine-learning, pytorch, pytorch-lightning, fb-hydra

How does Hydra `_partial_` interact with seeding


In the configuration management library Hydra, it is possible to only partially instantiate classes defined in the configuration by using the _partial_ keyword. The documentation explains that this results in a functools.partial. I wonder how this interacts with seeding. E.g. with

My reasoning is that if I use the _partial_ keyword while specifying all parameters for __init__, I would essentially obtain a factory that could be called after setting the seed, which would let me do multiple runs. But this assumes that _partial_ does not already bake the seed in. To my understanding, it should not. Is that correct?


Solution

  • Until you call hydra.utils.instantiate, Hydra does not run any third-party code. So you can set your seeds before each use of instantiate or, if the result is a partial, before each call to the partial.

    Here is a complete toy example, based on Hydra's doc overview. It creates a partial that instantiates an optimizer, and a model that takes a callable optim_partial as an argument.

    # config.yaml
    model:
      _target_: __main__.MyModel
      optim_partial:
        _partial_: true
        _target_: __main__.MyOptimizer
        algo: SGD
      lr: 0.01
    
    from functools import partial
    from typing import Callable
    import random
    from pprint import pprint
    
    import hydra
    from omegaconf import DictConfig, OmegaConf
    
    
    class MyModel:
        def __init__(self, lr, optim_partial: Callable[..., "MyOptimizer"]):
            self.optim_partial = optim_partial
            self.optim1 = self.optim_partial()
            self.optim2 = self.optim_partial()
    
    
    class MyOptimizer:
        def __init__(self, algo):
            print(algo, random.randint(0, 10000))
    
    
    @hydra.main(config_name="config", config_path="./", version_base=None)
    def main(cfg: DictConfig):
        # Check out the config
        pprint(OmegaConf.to_container(cfg, resolve=False))
        print("type of cfg.model.optim_partial", type(cfg.model.optim_partial))
        
        # Create the functools.partial
        optim_partial: partial[MyOptimizer] = hydra.utils.instantiate(cfg.model.optim_partial)
        # Set the seed before you call the partial
        random.seed(42)
        optimizer1: MyOptimizer = optim_partial()
        optimizer2: MyOptimizer = optim_partial()
        random.seed(42)
        optimizer1b: MyOptimizer = optim_partial()
        optimizer2b: MyOptimizer = optim_partial()
    
        # model is not a partial; use seed before creation
        random.seed(42)
        model: MyModel = hydra.utils.instantiate(cfg.model)
    
    
    if __name__ == "__main__":
        main()
    
    # Output
    {'model': {'_target_': '__main__.MyModel',
               'lr': 0.01,
               'optim_partial': {'_partial_': True,
                                 '_target_': '__main__.MyOptimizer',
                                 'algo': 'SGD'}}}
    type of cfg.model.optim_partial <class 'omegaconf.dictconfig.DictConfig'>
    SGD 1824
    SGD 409
    SGD 1824
    SGD 409
    SGD 1824
    SGD 409
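
The same behaviour can be checked without Hydra at all, since the object you get back is a plain functools.partial. The following is a minimal sketch, assuming a stand-in MyOptimizer that stores its random draw, and a hypothetical helper run_with_seed that is not part of Hydra or of the example above. It suggests that the partial only stores the target and its arguments, so the RNG state is read when the partial is called, not when it is created.

```python
import random
from functools import partial


class MyOptimizer:
    """Stand-in for the toy class above; it keeps its random draw."""

    def __init__(self, algo):
        self.algo = algo
        # The global RNG is consulted here, i.e. at call time of the partial.
        self.draw = random.randint(0, 10000)


def run_with_seed(factory, seed):
    """Hypothetical helper: seed the global RNG, then build one object."""
    random.seed(seed)
    return factory().draw


# Roughly what instantiate returns for a config with _partial_: true
optim_partial = partial(MyOptimizer, algo="SGD")

# Seeding before creating the partial has no lasting effect on later calls,
# because each run below reseeds right before calling the factory.
first = run_with_seed(optim_partial, 42)
again = run_with_seed(optim_partial, 42)

# One factory, several seeded runs
draws_by_seed = {seed: run_with_seed(optim_partial, seed) for seed in (0, 1, 2)}

print(first == again)  # True: same seed before the call, same draw
print(sorted(draws_by_seed))
```

Reusing one factory across several seeds, as in draws_by_seed, is the multiple-runs pattern described in the question. Note that this only covers Python's random module; other libraries keep their own RNG state and need to be seeded separately before the call.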