
Trouble with binary classification using HingeEmbeddingLoss() function


I have 2 datasets which make up the (x,y) coordinates of 2 sine curves and their respective outputs. The sine curves are concentric.

The bigger sine curve has an output label of 1 and the smaller sine curve an output label of -1.

I have to train a model which will take in a new (x,y) coordinate and output either -1 or +1, depending on which curve it might be a part of.

Now I have prepared the data like so-

#generate data set - sine wave 1 and sine wave 2
from numpy import random, sin, pi, hstack, vstack, ones

def wave_Lower_training_data(n = 300):
  #generate random input points for wave 1
  X1 = random.uniform(0,2*pi,n) #generates 'n' radian values b/w 0 to 2pi
  #wave output
  X2 = sin(X1) #lower curve
  X1 = X1.reshape(n, 1)
  X2 = X2.reshape(n,1)
  X = hstack((X1,X2))
  y = -ones((n,1)) #label -1 for the lower curve

  return X,y

def wave_Higher_training_data(n = 300):
  #generate random input points for wave 2
  X1 = random.uniform(0,2*pi,n) #generates 'n' radian values b/w 0 to 2pi
  #wave output
  X2 = 2*sin(X1) #higher curve
  X1 = X1.reshape(n, 1)
  X2 = X2.reshape(n,1)
  X = hstack((X1,X2))
  y = ones((n,1)) #label +1 for the higher curve

  return X,y

X1, y1 = wave_Lower_training_data()
X2, y2 = wave_Higher_training_data()

# print(X1)

# Combine the training data
X_combined = vstack((X1, X2))
y_combined = vstack((y1, y2))

# print("Combined X shape:", X_combined)
# print("Combined y shape:", y_combined)

And this is how I've attempted to model it-

import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Define model
        self.fc1 = nn.Linear(2, 64)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(64, 32)
        self.fc3 = nn.Linear(32, 16)
        self.fc4 = nn.Linear(16,1)
        self.tanh = nn.Tanh()

    def forward(self, x):
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.relu(self.fc3(x))
        x = self.tanh(self.fc4(x))
        return x

# Convert training data into tensors usable by PyTorch
X_tensor = torch.tensor(X_combined, dtype=torch.float32)
y_tensor = torch.tensor(y_combined, dtype=torch.int64)
print(y_tensor)

# Splitting of training data into 80-20
X_train, X_val, y_train, y_val = train_test_split(X_tensor, y_tensor, test_size=0.2, random_state=786)

model = Model()
criterion = nn.HingeEmbeddingLoss()
optimizer = optim.Adam(model.parameters(), lr=0.0001)

# Training loop
n_epochs = 10000
for epoch in range(0, n_epochs):
    optimizer.zero_grad()
    output = model(X_train)
    loss = criterion(output.squeeze(), y_train.long())
    loss.backward()
    optimizer.step()

    if epoch % 100 == 0:
        print(f"Epoch {epoch+1}/{n_epochs}, Loss: {loss.item()}")

with torch.no_grad():
    y_pred = model(X_val)
    y_pred_class = torch.sign(y_pred)
    accuracy = (y_pred_class.squeeze().long() == y_val.long()).sum().item() / len(y_val)
    print(f"Validation Accuracy: {accuracy}")

As I inspect the loss every 100 epochs, it is clear that after a point it stagnates.

Then when I get to the prediction part-

#predictions
def get_user_input():
    # prompt the user for the two feature values and return them as a list
    feature1 = float(input("Enter the first feature value: "))
    feature2 = float(input("Enter the second feature value: "))
    return [feature1, feature2]

# Get user input
X_new = get_user_input()
X_ip = torch.tensor(X_new, dtype=torch.float32)
y_pred = model(X_ip)

print(y_pred)


from matplotlib import pyplot

pyplot.figure(figsize=(10, 6))
pyplot.scatter(X1[:, 0], X1[:, 1], label='Wave 1', color='blue', s=10)
pyplot.scatter(X2[:, 0], X2[:, 1], label='Wave 2', color='red', s=10)
user_x = X_new[0]  # Example x-coordinate for user input
user_y = X_new[1]  # Example y-coordinate for user input
pyplot.scatter(user_x, user_y, color='green', label='User Input Point')

pyplot.xlabel('Angle (radians)')
pyplot.ylabel('Amplitude')
pyplot.title('Sine Waves with Amplitudes 1 and 2')
pyplot.legend()
pyplot.show()


This is the output I get-

Enter the first feature value: 1.9
Enter the second feature value: 2
tensor([-1.], grad_fn=<TanhBackward0>)

From the graph it is very clear that the output should be [+1], as the point lies on the upper curve. But I still get [-1].

I always get [-1] no matter what input I give.

Any help?

EDIT: I've tried the same thing using BCELoss() and it seems to work. But I need the output mapped to {-1,1} and not {0,1}.


Solution

  • Have you read the documentation for HingeEmbeddingLoss?

    Measures the loss given an input tensor 𝑥 and a labels tensor 𝑦 (containing 1 or -1). This is usually used for measuring whether two inputs are similar or dissimilar, e.g. using the L1 pairwise distance as 𝑥, and is typically used for learning nonlinear embeddings or semi-supervised learning.

    The x in HingeEmbeddingLoss is supposed to be distances between paired embeddings. It doesn't apply at all to your prediction problem.

    You are actually telling the model to do the opposite of what you want. For HingeEmbeddingLoss, the per-element loss is x when y = 1 and max(0, margin - x) when y = -1. Meaning, when y = 1, you are telling the model to produce a small value (-1), and vice versa when y = -1.

    You also have errors from silent broadcasting. In loss = criterion(output.squeeze(), y_train.long()), output.squeeze() has shape (n,) while y_train has shape (n, 1), so the per-element loss is computed over an (n, n) grid rather than (n, 1). The same error is present in your accuracy calculation. Both issues are demonstrated in the first sketch at the end of this answer.

    If you remove the broadcasting errors and compute accuracy as the opposite sign (((-y_pred.sign()).long() == y_val).float().mean()), the training setup somewhat works.

    Overall, the correct way to approach this problem is to use binary classification (which, as you mention, works). If you need the output to be on [-1, 1], you can just rescale the sigmoid outputs from [0, 1] to [-1, 1] after prediction, or threshold them at 0.5 and map to {-1, +1}; a minimal sketch of this follows below.
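
First, a quick numeric check of the two points above, written as a standalone sketch; the tensor values are made up purely for illustration and are not taken from the question's data:

import torch
import torch.nn as nn

criterion = nn.HingeEmbeddingLoss()              # default margin = 1.0

# Per-element loss is x when y == 1 and max(0, margin - x) when y == -1,
# so a "correct" output of +0.9 for a +1 label is penalised with loss 0.9,
# while an output of -0.9 for a -1 label costs max(0, 1 - (-0.9)) = 1.9.
out = torch.tensor([0.9, -0.9])
target = torch.tensor([1.0, -1.0])
print(criterion(out, target))                    # tensor(1.4000) = mean(0.9, 1.9)

# Silent broadcasting: a (n,) input against an (n, 1) target expands to (n, n).
n = 4
output = torch.randn(n)                          # shape (n,), like output.squeeze()
y = torch.ones(n, 1)                             # shape (n, 1), like y_train
per_element = nn.HingeEmbeddingLoss(reduction='none')(output, y)
print(per_element.shape)                         # torch.Size([4, 4])
print((output.sign().long() == y.long()).shape)  # torch.Size([4, 4]) in the accuracy check too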
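
And here is a minimal sketch of the binary-classification route, assuming the same X_combined / y_combined arrays built in the question; the architecture, learning rate, and epoch count are illustrative rather than tuned:

import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split

class Classifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, 64), nn.ReLU(),
            nn.Linear(64, 32), nn.ReLU(),
            nn.Linear(32, 1),                    # raw logit, no output activation
        )

    def forward(self, x):
        return self.net(x)

X_tensor = torch.tensor(X_combined, dtype=torch.float32)
y01 = torch.tensor((y_combined + 1) / 2, dtype=torch.float32)   # {-1, +1} -> {0, 1}

X_train, X_val, y_train, y_val = train_test_split(
    X_tensor, y01, test_size=0.2, random_state=786)

model = Classifier()
criterion = nn.BCEWithLogitsLoss()               # sigmoid + BCE in one step
optimizer = optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(2000):
    optimizer.zero_grad()
    loss = criterion(model(X_train), y_train)    # both tensors have shape (n, 1)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    probs = torch.sigmoid(model(X_val))
    preds = (probs >= 0.5).float() * 2 - 1       # map probabilities back to {-1, +1}
    labels = y_val * 2 - 1                       # {0, 1} -> {-1, +1}
    print(f"Validation accuracy: {(preds == labels).float().mean().item():.3f}")

BCEWithLogitsLoss is used here instead of BCELoss purely because it folds the sigmoid into the loss for numerical stability; BCELoss on sigmoid outputs, as in your edit, works just as well.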