I'm trying to extract vector-representations of text using BERT in the transformers libray, and have stumbled on the following part of the documentation for the "BERTModel" class:
Can anybody explain this in more detail? A forward-pass makes intuitive sense to me (am trying to get final hidden states after all), and I can't find any additional information on what "pre and post processing" means in this context.
Thanks up front!
I think this is just general advice concerning working with PyTorch Module
's. The transformers
modules are nn.Module
s, and they require a forward
method. However, one should not call model.forward()
manually but instead call model()
. The reason is that PyTorch does some stuff under the hood when just calling the Module. You can find that in the source code.
def __call__(self, *input, **kwargs):
for hook in self._forward_pre_hooks.values():
result = hook(self, input)
if result is not None:
if not isinstance(result, tuple):
result = (result,)
input = result
if torch._C._get_tracing_state():
result = self._slow_forward(*input, **kwargs)
else:
result = self.forward(*input, **kwargs)
for hook in self._forward_hooks.values():
hook_result = hook(self, input, result)
if hook_result is not None:
result = hook_result
if len(self._backward_hooks) > 0:
var = result
while not isinstance(var, torch.Tensor):
if isinstance(var, dict):
var = next((v for v in var.values() if isinstance(v, torch.Tensor)))
else:
var = var[0]
grad_fn = var.grad_fn
if grad_fn is not None:
for hook in self._backward_hooks.values():
wrapper = functools.partial(hook, self)
functools.update_wrapper(wrapper, hook)
grad_fn.register_hook(wrapper)
return result
You'll see that forward
is called when necessary.