I am trying to read an image file and classify the image. My model is a resnet18 that I trained previously, and I am planning to use a separate .py script to classify a list of images. This is my network:
PATH = './net.pth'
model_ft = models.resnet18(pretrained=True)
num_ftrs = model_ft.fc.in_features
model_ft.fc = nn.Linear(num_ftrs, 16)
model_ft.load_state_dict(torch.load(PATH))
model_ft.eval()
And I am trying to read images this way:
imsize = 256
loader = transforms.Compose([transforms.Scale(imsize), transforms.ToTensor()])
def image_loader(image_name):
    # load image, returns cuda tensor
    image = Image.open(image_name)
    image = loader(image).float()
    image = Variable(image, requires_grad=True)
    image = image.unsqueeze(0)
    return image.cuda()
image = image_loader("dataset/test/9673.png")
model_ft(image)
I am getting this error:
"Given groups=1, weight of size [64, 3, 7, 7], expected input[1, 4, 676, 256] to have 3 channels, but got 4 channels instead"
I was advised to remove the unsqueeze call for resnet18, but after doing that I got the following error:
"Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [4, 676, 256] instead"
I do not quite understand the problem I am dealing with. How should I read my test set? Afterwards I will need to write the class IDs and the file names into a .txt file.
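You are using a PNG image which has 4 channels (most likely RGBA, i.e. RGB plus an alpha channel), while your network expects 3 channels. Convert the image to RGB and you should be fine. Keep the unsqueeze(0) call: the second error shows that the model needs a 4-dimensional input, so the batch dimension has to stay. In your image_loader simply do: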
image = Image.open(image_name).convert('RGB')