Tags: python, deep-learning, pytorch, neural-network

Why doesn't my football prediction NN in PyTorch work?


I've tried to create a model that predicts football scores.

I used this dataset: https://www.kaggle.com/datasets/martj42/international-football-results-from-1872-to-2017

Then, I gave each team name a unique integer.

Here's the code for the NN:

import csv
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import pandas as pd

data = []
with open('/Volumes/Drive 2/Football nn/output.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        data.append(row)

# team names
home_team = [str(row[1]) for row in data]
away_team = [str(row[2]) for row in data]
#team scores
old_home_score = [str(row[3]) for row in data]
old_away_score = [str(row[4]) for row in data]

home_score = []
away_score = []
print(home_team[44762])
print(away_team[44762])
home_team.pop(0)
away_team.pop(0)

old_home_score.pop(0)
old_away_score.pop(0)

for item in (old_home_score):
    iteam = int(item)
    iteam /=10
    home_score.append(iteam)

for item in (old_away_score):
    iteam = int(item)
    iteam /=10
    away_score.append(iteam)

print(away_score[44761])
print(home_score[44761])
home_team = [eval(i) for i in home_team]
away_team = [eval(i) for i in away_team]
class SoccerNN(nn.Module):
    def __init__(self):
        super(SoccerNN, self).__init__()
        self.fc1 = nn.Linear(2, 15)  
        self.fc2 = nn.Linear(15, 20)
        self.fc3 = nn.Linear(20, 10)
        self.fc4 = nn.Linear(10, 2) 

    def forward(self, x):
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        x = self.fc4(x)
        x = torch.sigmoid(x)
        return x


# Convert the columns to PyTorch tensors
input_data = torch.tensor(np.column_stack((home_team, away_team)), dtype=torch.int16)
output_data = torch.tensor(np.column_stack((home_score, away_score)), dtype=torch.int16)


model = SoccerNN()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
epochs = 50  # Adjust the number of epochs as needed
batch_size = 10  # Adjust the batch size as needed
for epoch in range(epochs):
    for batch_start in range(0, len(input_data), batch_size):
        batch_end = batch_start + batch_size
        batch_input = input_data[batch_start:batch_end]
        batch_output = output_data[batch_start:batch_end]
        # Convert batch_input and batch_output to the same dtype as the model's parameters
        batch_input = batch_input.to(model.fc1.weight.dtype)
        batch_output = batch_output.to(model.fc2.weight.dtype)

        # Zero the gradients
        optimizer.zero_grad()

        # Forward pass
        predictions = model(batch_input)

        # Calculate loss
        loss = criterion(predictions, batch_output)

        # Backpropagation
        loss.backward()

        # Update weights
        optimizer.step()

    
        print(f"Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}")

# After training, you can use the model for predictions
home_team = [16]
away_team = [37]
new_data = torch.tensor(np.column_stack((home_team, away_team)), dtype=torch.int16)
new_data = torch.tensor(new_data, dtype=torch.float32)  # Use the appropriate data type

# Put the model in evaluation mode
model.eval()

# Make predictions
with torch.no_grad():
    predictions = model(new_data)

# Convert predictions to numpy array
predictions_np = predictions.numpy()

# Print or use the predictions as needed
print(predictions_np)

I gave the network two ints as input (each int represents a team) and the network is supposed to output the predicted score of the game...

When I train the model it says that the loss is 0 but the predictions are completely wrong.

Why doesn't it work? My guess is that it got stuck in a local minimum and can't get out, but I'm probably completely wrong, as I'm very new to this whole thing. How should I change the model so that it works?

Tried changing the batch size, num of layers, epochs...

Still doesn't work...


Solution

  • Firstly, when I try to run your code I get an error at the home_team = [eval(i) for i in home_team] line, with which I suspect you wanted to map teams to integers, namely:

    NameError: name 'Scotland' is not defined
    

    I will first explain how to do this mapping properly, even though plain integers are not a good way to feed classes (teams) into a NN, as I explain afterwards.

    Map teams to integers

    You can use the following code:

    string_to_int = {}
    next_int = 0  # Initialize the integer counter
    # Iterate through the list of strings
    for string in home_team + away_team:
        # Check if the string is already in the dictionary
        if string not in string_to_int:
            string_to_int[string] = next_int
            next_int += 1  # Increment the integer counter
    
    home_team = [string_to_int[s] for s in home_team]
    away_team = [string_to_int[s] for s in away_team]
    

    With this I was able to run your code, and get to a loss of about 0.014.
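
A hedged side observation, not part of the explanation above: the question builds output_data with dtype=torch.int16 after dividing the scores by 10. Casting floats to an integer dtype truncates toward zero, so every score below 10 becomes a target of 0, which may be part of why the reported loss is so close to zero while the predictions look wrong. A minimal sketch with made-up scores (the values are hypothetical, not taken from the dataset):

```python
import numpy as np
import torch

# Hypothetical scores, scaled by 1/10 as in the question.
home_score = [s / 10 for s in [0, 1, 3, 5]]
away_score = [s / 10 for s in [2, 0, 1, 4]]
stacked = np.column_stack((home_score, away_score))

# As in the question: the integer dtype truncates every value below 1.0 to 0.
as_int16 = torch.tensor(stacked, dtype=torch.int16)

# Keeping a floating-point dtype preserves the scaled scores.
as_float = torch.tensor(stacked, dtype=torch.float32)

print(as_int16)
print(as_float)
```

If this applies to your data, creating the targets with dtype=torch.float32 from the start avoids the problem.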

    How to improve the model performance

    • Mapping classes to integers only makes sense when the classes have some kind of relationship, e.g. an ordering. In this case they don't, but by providing them to the NN as an int you are in effect telling the NN that e.g. Uruguay (5) is close to Austria (6) and far from Surrey (314). The NN then first has to learn that this relationship doesn't mean anything, which makes its job harder.

      • A better option would be to use one-hot encoding or a learnt encoding (see this tutorial).
      • With one-hot encoding you will have a wider network, as the input dimension becomes the number of unique classes (here ~300).
    • Add nonlinear functions (e.g. nn.ReLU() or nn.Sigmoid()) between your fc layers. A composition of linear layers is itself linear, so at the moment your whole network is equivalent to a single linear fc layer followed by a sigmoid.

    • You are dividing scores by 10 to make sure they fit into the range of your network's output, which is (0, 1) for a sigmoid. It would be more sensible to change the final nonlinearity (at the moment a sigmoid) to something whose range matches the range of the target data, i.e. [0, infinity), e.g. a ReLU function.
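
The three suggestions above can be combined in a small sketch. It is only an illustration under stated assumptions, not a tuned model: the class name SoccerEmbeddingNN, the embedding size of 8, the hidden width of 32 and num_teams = 300 are hypothetical choices. nn.Embedding stands in for the learnt encoding, ReLU sits between the linear layers, and a final ReLU keeps the predictions in [0, infinity) so that raw goal counts could be used as targets without dividing by 10. The last lines show the one-hot alternative for comparison.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoccerEmbeddingNN(nn.Module):
    """Hypothetical variant of SoccerNN with a learnt team encoding."""

    def __init__(self, num_teams, embed_dim=8, hidden=32):
        super().__init__()
        # One learnt vector per team replaces the raw integer id.
        self.embed = nn.Embedding(num_teams, embed_dim)
        self.fc1 = nn.Linear(2 * embed_dim, hidden)
        self.fc2 = nn.Linear(hidden, hidden)
        self.fc3 = nn.Linear(hidden, 2)

    def forward(self, team_ids):
        # team_ids: integer tensor of shape (batch, 2) = (home id, away id)
        x = self.embed(team_ids).flatten(start_dim=1)
        x = F.relu(self.fc1(x))  # nonlinearity between the linear layers
        x = F.relu(self.fc2(x))
        return F.relu(self.fc3(x))  # (home goals, away goals), never negative


num_teams = 300  # assumed; use len(string_to_int) with the real data
model = SoccerEmbeddingNN(num_teams)

team_ids = torch.tensor([[16, 37], [5, 6]], dtype=torch.long)
predictions = model(team_ids)

# One-hot alternative: each id becomes a vector of length num_teams.
one_hot = F.one_hot(team_ids, num_classes=num_teams).float()

print(predictions.shape)  # torch.Size([2, 2])
print(one_hot.shape)      # torch.Size([2, 2, 300])
```

Note that the team ids are passed as torch.long here rather than as floats, and that with nn.MSELoss the targets would need to be float tensors.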