keras, deep-learning, pytorch

What would be the equivalent of keras.layers.Masking in pytorch?


I have time-series sequences that I had to bring to a fixed length by padding them with zeros. In Keras, keras.layers.Masking let me ignore those padded zeros in later computations. How can this be done in PyTorch?

Does PyTorch require me to pad the sequences because it can't handle varying lengths, and if so, what is the equivalent of the Keras Masking layer? Or, if PyTorch can handle sequences of varying lengths, how is that done?


Solution

  • You can use the PackedSequence class as an equivalent to Keras masking. Related utilities are in torch.nn.utils.rnn.

    Here is an example of packing variable-length sequence inputs for an RNN:

    import torch
    import torch.nn as nn
    
    batch_size = 3
    max_length = 3
    input_size = 1   # one feature per time step
    hidden_size = 2
    n_layers = 1
    
    # container: (batch, time, features), zero-padded
    batch_in = torch.zeros((batch_size, max_length, input_size))
    
    # data: each sequence is (time, features); trailing zeros are padding
    vec_1 = torch.FloatTensor([[1], [2], [3]])
    vec_2 = torch.FloatTensor([[1], [2], [0]])
    vec_3 = torch.FloatTensor([[1], [0], [0]])
    
    batch_in[0] = vec_1
    batch_in[1] = vec_2
    batch_in[2] = vec_3
    
    seq_lengths = [3, 2, 1]  # true (unpadded) length of each sequence, in decreasing order
    
    # pack it
    pack = torch.nn.utils.rnn.pack_padded_sequence(batch_in, seq_lengths, batch_first=True)
    
    >>> pack
    PackedSequence(data=tensor([[1.],
            [1.],
            [1.],
            [2.],
            [2.],
            [3.]]), batch_sizes=tensor([3, 2, 1]), sorted_indices=None, unsorted_indices=None)
    
    
    # initialize
    rnn = nn.RNN(input_size, hidden_size, n_layers, batch_first=True)
    h0 = torch.randn(n_layers, batch_size, hidden_size)
    
    # forward: the RNN only processes the real (unpadded) time steps
    out, _ = rnn(pack, h0)
    
    # unpack back to a zero-padded (batch, time, hidden) tensor
    unpacked, unpacked_len = torch.nn.utils.rnn.pad_packed_sequence(out, batch_first=True)
    
    # the values depend on the random weights and h0; padded time steps come back as zeros
    >>> unpacked.shape
    torch.Size([3, 3, 2])
    >>> unpacked_len
    tensor([3, 2, 1])
    
    

    For more detail, you may find this article useful (jump to the section titled "How the PackedSequence object works"): link
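As a complement to the packing example, here is a minimal sketch of the full round trip for sequences that are not pre-sorted by length, followed by an explicit boolean mask for any later computation that should skip padding. The toy data, the layer sizes, and the names `sequences`, `mask`, and `masked_mean` are assumptions chosen for illustration; `pad_sequence` and the `enforce_sorted=False` argument are available in recent PyTorch releases.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence, pad_packed_sequence

torch.manual_seed(0)

# Three toy sequences of different lengths, one feature per time step.
# They are deliberately NOT sorted by length.
sequences = [
    torch.tensor([[1.0], [2.0]]),
    torch.tensor([[1.0], [2.0], [3.0]]),
    torch.tensor([[1.0]]),
]
lengths = torch.tensor([len(s) for s in sequences])  # true lengths: 2, 3, 1

# Zero-pad to a common length: (batch, max_length, features) = (3, 3, 1).
padded = pad_sequence(sequences, batch_first=True)

# enforce_sorted=False lets PyTorch sort by length internally and
# restore the original batch order afterwards.
packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)

rnn = nn.RNN(input_size=1, hidden_size=2, batch_first=True)
packed_out, h_n = rnn(packed)

# Back to a zero-padded tensor in the original batch order: (3, 3, 2).
out, out_lengths = pad_packed_sequence(packed_out, batch_first=True)

# Boolean mask: True on real time steps, False on padding.
mask = torch.arange(out.size(1)).unsqueeze(0) < lengths.unsqueeze(1)  # (3, 3)

# Example of a masked computation: mean over real time steps only.
masked_mean = (out * mask.unsqueeze(-1)).sum(dim=1) / lengths.unsqueeze(1)
```

Packing covers the recurrent layers; the explicit mask is one way to get Keras-like behaviour in later steps such as pooling or a loss, where PyTorch does not propagate a mask automatically.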