python deep-learning pytorch neural-network

Is it possible to combine these three tensors into one large tensor as model input, without using a dict?


I have an ANN, and right now I need a dict that combines these three tensors as its input, for example:

import torch

model = MyNetwork("mypath")
dummy_input = {}
dummy_input["input_ids"] = torch.randint(1, 512, (1, 345))
dummy_input["attention_mask"] = torch.randint(1, 512, (1, 345))
dummy_input["bbox"] = torch.randint(1, 512, (1, 345, 4))
torch_out = model(**dummy_input)  # the dict has to be unpacked into keyword arguments

So I was wondering whether it is possible to combine the above-mentioned tensors into one tensor. For example, instead of:

input_ids = torch.randint(1, 512,(1,345))
attention_mask = torch.randint(1, 512,(1,345))
bbox = torch.randint(1, 512,(1,345,4))
torch_out = model(input_ids=input_ids, attention_mask=attention_mask, bbox=bbox)  # can a single tensor replace these three?
print(torch_out)

By the way, my model's forward function looks like this; only these three tensors are required:

def forward(
    self,
    input_ids=None,
    bbox=None,
    attention_mask=None,
    token_type_ids=None,
    valid_span=None,
    position_ids=None,
    head_mask=None,
    inputs_embeds=None,
    encoder_hidden_states=None,
    encoder_attention_mask=None,
    past_key_values=None,
    use_cache=None,
    output_attentions=None,
    output_hidden_states=None,
    return_dict=None,
    images=None,
):

Solution

  • If you want to combine these three tensors into a single one, concatenation should work, since their shapes differ only in the trailing dimension:

    • input_ids is shaped (b, f)
    • attention_mask is shaped (b, f)
    • bbox is shaped (b, f, 4)

    If you unsqueeze the two 2D tensors and concatenate along the last dimension, you get a tensor of shape (b, f, 1+1+4):

    >>> x = torch.cat([input_ids.unsqueeze(-1), attention_mask.unsqueeze(-1), bbox], dim=-1)
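The model's forward still expects three separate keyword arguments, so the packed tensor has to be split again before the call. Below is a minimal sketch of one way to do that, not a definitive implementation. The names pack_inputs, unpack_inputs and PackedInputWrapper are hypothetical helpers that do not appear in the question, and the sketch assumes the three tensors share a dtype, as the integer tensors here do.

```python
import torch
from torch import nn


def pack_inputs(input_ids, attention_mask, bbox):
    """Concatenate the three tensors into one of shape (b, f, 6)."""
    return torch.cat(
        [input_ids.unsqueeze(-1), attention_mask.unsqueeze(-1), bbox], dim=-1
    )


def unpack_inputs(x):
    """Split a packed (b, f, 6) tensor back into its three parts."""
    input_ids = x[..., 0]       # (b, f)
    attention_mask = x[..., 1]  # (b, f)
    bbox = x[..., 2:]           # (b, f, 4)
    return input_ids, attention_mask, bbox


class PackedInputWrapper(nn.Module):
    """Hypothetical wrapper: accepts one packed tensor and forwards
    the three unpacked tensors to the wrapped model as keywords."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        input_ids, attention_mask, bbox = unpack_inputs(x)
        return self.model(
            input_ids=input_ids, attention_mask=attention_mask, bbox=bbox
        )


input_ids = torch.randint(1, 512, (1, 345))
attention_mask = torch.ones(1, 345, dtype=torch.long)
bbox = torch.randint(1, 512, (1, 345, 4))

x = pack_inputs(input_ids, attention_mask, bbox)
print(x.shape)  # torch.Size([1, 345, 6])
```

With this in place, PackedInputWrapper(model)(x) would call the original model with the same three keyword arguments as before, so the model itself does not need to change.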