Tags: deep-learning, nlp, tensorflow2.0, huggingface-transformers

Train Hugging Face AutoModel defined using AutoConfig


I have defined the configuration for a model in transformers and then used it to initialise the classifier as follows:

from transformers import AutoConfig, AutoModel

config = AutoConfig.from_pretrained('bert-base-uncased')
classifier = AutoModel.from_config(config)

I have checked the list of attributes and methods available on this object:

>>> dir(classifier)
['add_memory_hooks',
 'add_module',
 'adjust_logits_during_generation',
 'apply',
 'base_model',
 'base_model_prefix',
 'beam_sample',
 'beam_search',
 'bfloat16',
 'buffers',
 'children',
 'config',
 'config_class',
 'cpu',
 'cuda',
 'device',
 'double',
 'dtype',
 'dummy_inputs',
 'dump_patches',
 'embeddings',
 'encoder',
 'estimate_tokens',
 'eval',
 'extra_repr',
 'float',
 'floating_point_ops',
 'forward',
 'from_pretrained',
 'generate',
 'get_buffer',
 'get_extended_attention_mask',
 'get_head_mask',
 'get_input_embeddings',
 'get_output_embeddings',
 'get_parameter',
 'get_position_embeddings',
 'get_submodule',
 'gradient_checkpointing_disable',
 'gradient_checkpointing_enable',
 'greedy_search',
 'group_beam_search',
 'half',
 'init_weights',
 'invert_attention_mask',
 'is_parallelizable',
 'load_state_dict',
 'load_tf_weights',
 'modules',
 'name_or_path',
 'named_buffers',
 'named_children',
 'named_modules',
 'named_parameters',
 'num_parameters',
 'parameters',
 'pooler',
 'prepare_inputs_for_generation',
 'prune_heads',
 'push_to_hub',
 'register_backward_hook',
 'register_buffer',
 'register_forward_hook',
 'register_forward_pre_hook',
 'register_full_backward_hook',
 'register_parameter',
 'requires_grad_',
 'reset_memory_hooks_state',
 'resize_position_embeddings',
 'resize_token_embeddings',
 'retrieve_modules_from_names',
 'sample',
 'save_pretrained',
 'set_input_embeddings',
 'share_memory',
 'state_dict',
 'supports_gradient_checkpointing',
 'tie_weights',
 'to',
 'to_empty',
 'train',
 'training',
 'type',
 'xpu',
 'zero_grad']

Of these, only the train method seemed relevant. However, its docstring reads:

>>> print(classifier.train.__doc__)
Sets the module in training mode.

        This has any effect only on certain modules. See documentations of
        particular modules for details of their behaviors in training/evaluation
        mode, if they are affected, e.g. :class:`Dropout`, :class:`BatchNorm`,
        etc.

        Args:
            mode (bool): whether to set training mode (``True``) or evaluation
                         mode (``False``). Default: ``True``.

        Returns:
            Module: self
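
The docstring describes a mode switch, not a training routine. A minimal sketch of that effect, assuming PyTorch is installed and using a standalone nn.Dropout layer as a stand-in for the full classifier (the layer, the input tensor, and the variable names are illustrative, not part of the original code):

```python
import torch
from torch import nn

torch.manual_seed(0)

layer = nn.Dropout(p=0.5)   # stand-in for a module whose behavior depends on the mode
x = torch.ones(1000)

layer.train()               # training mode: dropout is active
y_train = layer(x)          # some entries are zeroed, the rest are scaled by 1 / (1 - p)

layer.eval()                # evaluation mode: dropout passes inputs through unchanged
y_eval = layer(x)

print(layer.training)              # False after eval()
print(int((y_train == 0).sum()))   # number of entries dropped in training mode
print(torch.equal(y_eval, x))      # True: no change in evaluation mode
```

Neither call updates any weights. Training still needs a loss, an optimizer, and a loop or a higher-level API that provides them.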

How do I train this classifier on a custom dataset (preferably with transformers or TensorFlow)?


Solution

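  • The AutoModel class used above builds a PyTorch module, which has no Keras compile or fit methods. To train with TensorFlow, TFAutoModel is needed instead: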

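    import tensorflow as tf
    from transformers import AutoConfig, TFAutoModel

    # Load only the architecture settings; from_config does not load pretrained weights.
    config = AutoConfig.from_pretrained('bert-base-uncased')
    model = TFAutoModel.from_config(config)

    # Attach a loss, optimizer, and metric so the Keras training API can be used.
    model.compile(
        loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
        optimizer=tf.keras.optimizers.RMSprop(),
        metrics=["accuracy"],
    )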

    Then call model.fit to train on the custom dataset and model.predict to run inference.
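
For readers who would rather stay with the PyTorch classes from the question, the same idea can be sketched as a manual training loop. This is a sketch under stated assumptions, not a definitive recipe: it assumes torch is installed, swaps in AutoModelForSequenceClassification (which adds a classification head and returns a loss when labels are passed), builds a deliberately tiny BertConfig so it runs without downloading anything, and trains on random placeholder tensors rather than a real dataset. All sizes and hyperparameters below are illustrative.

```python
import torch
from transformers import AutoModelForSequenceClassification, BertConfig

torch.manual_seed(0)

# Tiny, illustrative configuration (assumption: not the bert-base-uncased settings).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    max_position_embeddings=32,
    num_labels=2,
)

# from_config builds the architecture with randomly initialised weights.
model = AutoModelForSequenceClassification.from_config(config)

# Placeholder batch: 8 sequences of 16 token ids with binary labels.
input_ids = torch.randint(0, config.vocab_size, (8, 16))
attention_mask = torch.ones_like(input_ids)
labels = torch.randint(0, 2, (8,))

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

model.train()  # enables dropout; the weight updates happen in the loop below
losses = []
for step in range(5):
    optimizer.zero_grad()
    outputs = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
    outputs.loss.backward()   # cross-entropy loss computed by the model head
    optimizer.step()
    losses.append(outputs.loss.item())

model.eval()  # disables dropout for inference
with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits

print(losses)
print(logits.shape)
```

With a real dataset, the random tensors would be replaced by tokenizer output batched through a DataLoader, and from_pretrained would normally replace from_config so that training starts from pretrained weights.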