Tags: machine-learning, pytorch, torch, autoencoder

Pytorch error mat1 and mat2 shapes cannot be multiplied in Autoencoder to compress images


The size of my input image is 3x120x120, and I flatten it with the code below, but I receive this error:

mat1 and mat2 shapes cannot be multiplied (720x120 and 43200x512)

I have to use an autoencoder to compress my images by a factor of 360 (so I go from a 3x120x120 input to 120 values in the encoder).

My code:

class AE(torch.nn.Module):
    def __init__(self):
        super().__init__()

        self.encoder = torch.nn.Sequential(
            torch.nn.Linear(3*120*120, 512),
            torch.nn.ReLU(),
            torch.nn.Linear(512, 256),
            torch.nn.ReLU(),
            torch.nn.Linear(256, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 120)
        )

        self.decoder = torch.nn.Sequential(
            torch.nn.Linear(120, 128),
            torch.nn.ReLU(),
            torch.nn.Linear(128, 256),
            torch.nn.ReLU(),
            torch.nn.Linear(256, 512),
            torch.nn.ReLU(),
            torch.nn.Linear(512, 3*120*120),
            torch.nn.Sigmoid()
        )

    def forward(self, x):
        encoded = self.encoder(x)
        decoded = self.decoder(encoded)
        return decoded
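To see where the reported shapes come from, here is a minimal check (assuming a batch of two 3x120x120 images, which matches the 720 in the error): `nn.Linear` multiplies only along the last dimension, so an unflattened `(2, 3, 120, 120)` tensor is treated as a `(2*3*120) x 120` matrix, while `Linear(3*120*120, 512)` expects 43200 features.

```python
import torch

batch = torch.rand(2, 3, 120, 120)   # two RGB 120x120 images
# nn.Linear consumes only the last dimension; all leading dimensions
# are collapsed, so this tensor behaves like a (2*3*120) x 120 matrix.
rows = batch.shape[0] * batch.shape[1] * batch.shape[2]
print(rows, batch.shape[-1])         # the "mat1" shape in the error

# Flattening each image into one vector gives the shape Linear expects.
flat = batch.flatten(start_dim=1)    # (2, 3*120*120) = (2, 43200)
print(flat.shape)
```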

Solution

  • I think you missed a basic characteristic of nn.Linear: its arguments are the input and output feature sizes, and it multiplies only along the last dimension of its input, expecting a 2D tensor of shape (batch, features). A batched image tensor of shape (batch, 3, 120, 120) is therefore treated as a (batch*3*120) x 120 matrix, which is exactly the 720x120 mat1 in your error. You should first flatten each image into a single 3*120*120 = 43200-feature vector so it matches Linear(3*120*120, 512).

    class AE(torch.nn.Module):
        def __init__(self):
            super().__init__()

            self.encoder = torch.nn.Sequential(
                # each flattened image has 3*120*120 = 43200 features
                torch.nn.Linear(3*120*120, 512),
                torch.nn.ReLU(),
                torch.nn.Linear(512, 256),
                torch.nn.ReLU(),
                torch.nn.Linear(256, 128),
                torch.nn.ReLU(),
                torch.nn.Linear(128, 120)
            )

            self.decoder = torch.nn.Sequential(
                torch.nn.Linear(120, 128),
                torch.nn.ReLU(),
                torch.nn.Linear(128, 256),
                torch.nn.ReLU(),
                torch.nn.Linear(256, 512),
                torch.nn.ReLU(),
                torch.nn.Linear(512, 3*120*120),
                torch.nn.Sigmoid()
            )

        def forward(self, x):
            # flatten (batch, 3, 120, 120) -> (batch, 43200) so the
            # features sit in the last dimension, as nn.Linear expects
            x = x.flatten(start_dim=1)
            encoded = self.encoder(x)
            decoded = self.decoder(encoded)
            # restore the original image shape for the reconstruction
            return decoded.reshape(-1, 3, 120, 120)
    

    I recommend you carefully study the structure and dimensional changes of common neural network architectures. It will help you a lot, for sure.
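The flattening fix can be verified end to end with a self-contained sketch (this re-defines a minimal autoencoder with the same first/last layer sizes; the intermediate widths and variable names here are illustrative, not the exact model above):

```python
import torch

# Minimal sketch: 3*120*120 = 43200 features compressed to 120,
# i.e. the factor of 43200 / 120 = 360 asked for in the question.
encoder = torch.nn.Sequential(
    torch.nn.Linear(3 * 120 * 120, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 120),
)
decoder = torch.nn.Sequential(
    torch.nn.Linear(120, 512), torch.nn.ReLU(),
    torch.nn.Linear(512, 3 * 120 * 120), torch.nn.Sigmoid(),
)

images = torch.rand(4, 3, 120, 120)    # a batch of 4 RGB images
flat = images.flatten(start_dim=1)     # (4, 43200)
code = encoder(flat)                   # (4, 120): the compressed code
recon = decoder(code).reshape(-1, 3, 120, 120)  # back to image shape
print(code.shape, recon.shape)
```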