Tags: python, pytorch, neural-network

50% training and test accuracy in a deep NN model for binary classification of 2D points


I have a deep neural network with 2 hidden layers. I'm trying to use it to classify 2D data points as either 0 or 1. The data I've generated looks like this:

[Scatter plot of the generated data: two concentric rings of points, one class per ring]

What I'm trying to do is similar to this TensorFlow classification example.

import torch
import torch.optim
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from torch import nn
from sklearn.datasets import make_circles

def accuracy_fn(y_true, y_pred):
    correct = torch.eq(y_true, y_pred).sum().item() # torch.eq() calculates where two tensors are equal
    acc = (correct / len(y_pred)) * 100
    return acc


n_samples = 1000

X, y = make_circles(n_samples, noise=0.03, random_state=42)

X = torch.from_numpy(X).type(torch.float)
y = torch.from_numpy(y).type(torch.float).unsqueeze(1)

X_train, X_test, y_train, y_test = train_test_split(X,
                                                    y,
                                                    test_size=0.2, # 20% test, 80% train
                                                    random_state=42)

model = nn.Sequential(
    nn.Linear(2, 8),
    nn.ReLU(),
    nn.Linear(8, 8),
    nn.ReLU(),
    nn.Linear(8, 1),
)

print(model.state_dict())

criterion = nn.BCEWithLogitsLoss()

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

Loss = []
epochs = 500

for epoch in range(epochs):
    y_logit = model(X_train)
    loss = criterion(y_logit, y_train)
    y_pred = torch.round(torch.sigmoid(y_logit))
    acc = accuracy_fn(y_true=y_train,
                      y_pred=y_pred)

    if epoch % 100 == 0:
        Loss.append(loss.item())

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.inference_mode():
        # 1. Forward pass
        test_logits = model(X_test)
        test_pred = torch.round(torch.sigmoid(test_logits))  # logits -> prediction probabilities -> prediction labels
        # 2. Calculate loss and accuracy
        test_loss = criterion(test_logits, y_test)
        test_acc = accuracy_fn(y_true=y_test,
                               y_pred=test_pred)

    # Print out what's happening
    if epoch % 100 == 0:
        print(
            f"Epoch: {epoch} | Loss: {loss:.5f}, Accuracy: {acc:.2f}% | Test Loss: {test_loss:.5f}, Test Accuracy: {test_acc:.2f}%")

I'm not too sure what I'm doing wrong here. I've tried all sorts of learning rates and epoch counts, but my final model typically ends up with around 50% accuracy, and the loss never goes down. Sometimes it will hit the high 50s or even low 60s in accuracy if I rerun the program enough times. I'm loosely referencing this link to get a sense of the results I should be getting; the author gets accuracy around the mid-to-high 70s, and his loss also shrinks significantly.

I did not copy his code exactly, but I also looked at other similar examples and my code doesn't seem to be too different.


Solution

for epoch in range(epochs):
    ################################################################
    optimizer.zero_grad()  # clear stale gradients before the forward pass
    ################################################################
    y_logit = model(X_train)
    loss = criterion(y_logit, y_train)
    y_pred = torch.round(torch.sigmoid(y_logit))
    acc = accuracy_fn(y_true=y_train,
                      y_pred=y_pred)

    if epoch % 100 == 0:
        Loss.append(loss.item())

    loss.backward()
    optimizer.step()

    model.eval()
    with torch.inference_mode():
        # 1. Forward pass
        test_logits = model(X_test)
        test_pred = torch.round(torch.sigmoid(test_logits))  # logits -> prediction probabilities -> prediction labels
        # 2. Calculate loss and accuracy
        test_loss = criterion(test_logits, y_test)
        test_acc = accuracy_fn(y_true=y_test,
                               y_pred=test_pred)

    # Print out what's happening
    if epoch % 100 == 0:
        print(
            f"Epoch: {epoch} | Loss: {loss:.5f}, Accuracy: {acc:.2f}% | Test Loss: {test_loss:.5f}, Test Accuracy: {test_acc:.2f}%")
You have to put optimizer.zero_grad() at the top of the loop. When it is called at the bottom of the training loop, the gradients are not cleared before loss.backward() computes the gradients for the current batch. PyTorch accumulates gradients in each parameter's .grad attribute rather than overwriting them, so gradients pile up over multiple batches, and optimizer.step() then applies incorrect and inconsistent updates to the model parameters.
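
To see the accumulation directly, here is a minimal standalone sketch (a toy tensor, separate from the model above) showing how PyTorch adds into .grad until it is cleared:

import torch

w = torch.tensor([1.0], requires_grad=True)

loss = (w * 2).sum()   # d(loss)/dw = 2
loss.backward()
print(w.grad)          # tensor([2.])

# Without zeroing first, a second backward ADDS to the existing gradient.
loss = (w * 2).sum()
loss.backward()
print(w.grad)          # tensor([4.])  -- accumulated, not replaced

w.grad.zero_()         # or optimizer.zero_grad() when using an optimizer
loss = (w * 2).sum()
loss.backward()
print(w.grad)          # tensor([2.])  -- a clean gradient again

This is why the zeroing has to happen before each backward pass: otherwise every optimizer.step() acts on a mixture of the current gradient and leftovers from earlier iterations.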