Tags: nlp, pytorch, torchtext, torchscript

Using torchtext vocab with torchscript


I'm trying to use the torchtext vocab layer together with TorchScript, but I'm getting some errors and was wondering if someone here has made it work.

My current model is:

import torch
import torchtext as text
from collections import Counter
from typing import List

class VocabText(torch.nn.Module):

    def __init__(self):
        super(VocabText, self).__init__()
        self.embedding = torch.nn.Embedding(10, 128)
        vocab = ['This', 'is', 'a', 'test']
        counter = Counter(vocab)
        self.lookup = text.vocab.vocab(counter)
        self.tensor = torch.Tensor  # alias for the tensor constructor

    def forward(self, x: List[str]):
        x_mapped = self.lookup(x)               # tokens -> List[int]
        x_mapped = self.tensor(x_mapped).int()  # List[int] -> tensor
        x_mapped = self.embedding(x_mapped)     # indices -> embeddings

        return x_mapped

This works when I do a forward pass of the model like this:

model = VocabText()
example_str = ["is"]
model(example_str)

But when I try to compile it with TorchScript, it fails:

model_scripted = torch.jit.script(model)   

model_scripted.save('model_scripted.pt')

With the following error:

RuntimeError: 
Unknown builtin op: aten::Tensor.
Here are some suggestions: 

This happens at the point in forward where I map the result of the lookup layer to a tensor.

I think this is due to typing: the vocab layer expects strings as input, but the embedding layer expects tensors, so I'm doing a cast in the middle of forward.
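
For example, running the lookup layer on its own (a minimal standalone sketch, assuming the same toy vocab as in the model above) shows that it returns a plain Python List[int], which has to be converted to a tensor before the embedding layer will accept it:

import torch
import torchtext as text
from collections import Counter

lookup = text.vocab.vocab(Counter(['This', 'is', 'a', 'test']))
indices = lookup(['is', 'a'])   # returns a plain List[int], e.g. [1, 2]
print(type(indices))            # <class 'list'>

embedding = torch.nn.Embedding(10, 128)
# embedding(indices)            # would fail: Embedding expects a tensor of indices
out = embedding(torch.tensor(indices))  # works once converted to a tensor
print(out.shape)                # torch.Size([2, 128])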

I have a working notebook in Colab that reproduces this issue, if anybody wants to take a look: https://colab.research.google.com/drive/14nZF5X8rQrZET_7iA1N2MUV3XSzozpeI?usp=sharing


Solution

  • Turns out I had to change the function used to build the tensor, as explained at https://discuss.pytorch.org/t/unknown-builtin-op-aten-tensor/62389 (see the sketch below).
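
For reference, here is a minimal sketch of the fixed model based on that thread. The key change is to build the tensor with torch.tensor (a scriptable factory function) rather than the torch.Tensor constructor, which TorchScript does not recognize as a builtin op:

import torch
import torchtext as text
from collections import Counter
from typing import List

class VocabText(torch.nn.Module):

    def __init__(self):
        super(VocabText, self).__init__()
        self.embedding = torch.nn.Embedding(10, 128)
        counter = Counter(['This', 'is', 'a', 'test'])
        self.lookup = text.vocab.vocab(counter)

    def forward(self, x: List[str]):
        x_mapped = self.lookup(x)                            # tokens -> List[int]
        x_mapped = torch.tensor(x_mapped, dtype=torch.long)  # scriptable tensor construction
        x_mapped = self.embedding(x_mapped)                  # indices -> embeddings
        return x_mapped

model = VocabText()
model_scripted = torch.jit.script(model)   # scripts without the aten::Tensor error
model_scripted.save('model_scripted.pt')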