Tags: python, tensorflow, keras, tensorflow2.0, huggingface-transformers

HuggingFace TFRobertaModel detailed summary


from transformers import RobertaTokenizer, TFRobertaModel
import tensorflow as tf

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = TFRobertaModel.from_pretrained("roberta-base")

I want a detailed layer summary of this HuggingFace TFRobertaModel so that I can visualize shapes and layers and customize them if needed. However, model.summary() just shows everything as a single layer. I tried digging into its different attributes, but was not able to get a detailed layer summary. Is it possible to do so?

Model: "tf_roberta_model_2"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
roberta (TFRobertaMainLayer) multiple                  124645632 
=================================================================
Total params: 124,645,632
Trainable params: 124,645,632
Non-trainable params: 0
_________________________________________________________________

Also, there is a related question on the HuggingFace forum which hasn't been answered yet.


Solution

  • Not exactly a model summary, but you can print the layers like this:

    from transformers import RobertaTokenizer, TFRobertaModel
    import tensorflow as tf
    
    tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
    model = TFRobertaModel.from_pretrained("roberta-base")
    
    def print_layers(layer):
      # tf.Module.submodules already yields every nested submodule
      # transitively, so a single loop is enough -- recursing over it
      # would print each submodule many times.
      for s in layer.submodules:
        print(s)

    main_layer = model.layers[0]  # TFRobertaMainLayer
    print_layers(main_layer)
    

    You could also use s.weights to get the weights of each layer.