
Approximating determinant with keras


I am training a Keras dense model to approximate the determinant of 2x2 matrices. I am using 30 hidden layers with 100 nodes each and 10^6 matrices (with integer entries in the interval [0, 100)). After predicting on the test set (34% of the total), I compute the square root of the MSE and usually get a value no greater than 100. I think this is quite a high error (although I am not sure what would count as a good error in this case), but apart from increasing the number of samples, I am not sure how to improve it (10^6 already seems like a large number). I hope someone can provide some advice. Here is the code:

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense


### Select number of samples, matrix size and range of entries in matrices
nb_samples = 1000000
matrix_size = 2
entries_range = 100

### Generate random matrices and determinants
matrices = []
determinants = []
for i in range(nb_samples):
    matrix = np.random.randint(entries_range, size = (matrix_size,matrix_size))
    matrices.append(matrix.reshape(matrix_size**2,))
    determinants.append(np.array(np.linalg.det(matrix)).reshape(1,))
matrices = np.array(matrices)
determinants = np.array(determinants)


### Split the data 
matrices_train, matrices_test, determinants_train, determinants_test = train_test_split(matrices,determinants,train_size = 0.66) 

### Select number of layers and neurons
nb_layers = 30
nb_neurons = 100

### Create dense neural network: one hidden layer on the input plus nb_layers more, each with nb_neurons neurons (nb_layers + 1 hidden layers in total)
model = Sequential()
model.add(Dense(nb_neurons, input_dim = matrix_size**2, activation='relu'))
for i in range(nb_layers):
    model.add(Dense(nb_neurons, activation='relu'))
model.add(Dense(1))
model.compile(loss='mse', optimizer='adam')
model.fit(matrices_train, determinants_train, epochs = 10, batch_size = 100, verbose = 0)

#_ , test_acc = model.evaluate(matrices_test,determinants_test)
#print(test_acc) 

### Make a prediction on the test set
determinants_pred = model.predict(matrices_test)

print('''
RMSE: {}
Number of layers: {}
Number of neurons: {}
Number of samples: {}
'''.format(np.sqrt(mean_squared_error(determinants_test,determinants_pred)),nb_layers,nb_neurons,nb_samples))

Here is an output:

  • RMSE: 20.429616387932295
  • Number of layers: 32
  • Number of neurons: 32
  • Number of samples: 1000000

Note: I decided to go for 30 layers and 100 nodes in each by trial and error (the MSE seemed the lowest around these values).


Solution

  • I think your network is far too large for the size of the problem (input dimension 4, output dimension 1), and 10 epochs is not nearly enough training.

    We can also cheat a bit here: the determinant can be written as a linear combination of squares of linear combinations of the inputs, so we can use a custom x*x activation function. For example, with 1 hidden layer of 10 neurons, that custom activation, epochs = 1000 and nb_samples = 10000, I get:

    RMSE: 0.04413008355924881
    Number of layers: 1
    Number of neurons: 10
    Number of samples: 10000
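
To see why a squared activation fits this problem, note that ad - bc = ((a+d)^2 - (a-d)^2 - (b+c)^2 + (b-c)^2) / 4, so four hidden neurons with an x*x activation are enough in principle. The sketch below is not a trained Keras model. It is a plain NumPy forward pass with hand-picked weights (the names `W1`, `W2` and `square_net` are made up for illustration, and all biases are assumed to be zero) that checks the identity against `np.linalg.det`.

```python
import numpy as np

# Hidden-layer weights. Each row belongs to one input of the flattened
# matrix [a, b, c, d] (row-major, so det = a*d - b*c); each column is one
# hidden neuron computing a linear combination of the inputs.
W1 = np.array([
    [1.0,  1.0, 0.0,  0.0],   # a
    [0.0,  0.0, 1.0,  1.0],   # b
    [0.0,  0.0, 1.0, -1.0],   # c
    [1.0, -1.0, 0.0,  0.0],   # d
])  # columns compute: a+d, a-d, b+c, b-c

# Output-layer weights combining the four squared terms.
W2 = np.array([0.25, -0.25, -0.25, 0.25])


def square_net(x):
    """Forward pass: linear layer, x*x activation, linear output (no biases)."""
    hidden = (x @ W1) ** 2
    return hidden @ W2


# Compare against NumPy's determinant on random integer matrices in [0, 100).
rng = np.random.default_rng(0)
mats = rng.integers(0, 100, size=(1000, 2, 2))
flat = mats.reshape(-1, 4).astype(float)
max_abs_error = np.max(np.abs(square_net(flat) - np.linalg.det(mats)))
print("max absolute error:", max_abs_error)
```

A trained network will not necessarily land on these exact weights, but the existence of an exact solution with only four neurons is a plausible reason why 10 neurons and one hidden layer are sufficient here.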
    
    

    Here is your code in full with my small modifications:

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import mean_squared_error
    from tensorflow.keras import Sequential
    from tensorflow.keras.layers import Dense
    
    
    ### Select number of samples, matrix size and range of entries in matrices
    nb_samples = 10000  # was 1000000
    matrix_size = 2
    entries_range = 100
    
    ### Generate random matrices and determinants
    matrices = []
    determinants = []
    for i in range(nb_samples):
        matrix = np.random.randint(entries_range, size = (matrix_size,matrix_size))
        matrices.append(matrix.reshape(matrix_size**2,))
        determinants.append(np.array(np.linalg.det(matrix)).reshape(1,))
    matrices = np.array(matrices)
    determinants = np.array(determinants)
    
    
    ### Split the data 
    matrices_train, matrices_test, determinants_train, determinants_test = train_test_split(matrices,determinants,train_size = 0.66) 
    
    ### Select number of layers and neurons
    nb_layers = 1  # was 30
    nb_neurons = 10  # was 100
    
    ### Create dense neural network with nb_layers hidden layers having nb_neurons neurons each
    model = Sequential()
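    # Single hidden layer using the custom x*x activation instead of relu
    model.add(Dense(nb_neurons, input_dim = matrix_size**2, activation=lambda x:x*x))
    # The extra relu hidden layers from the question are disabled
    #for i in range(nb_layers):
    #    model.add(Dense(nb_neurons, activation='relu'))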
    model.add(Dense(1))
    model.compile(loss='mse', optimizer='adam')
    model.fit(matrices_train, determinants_train, epochs = 1000, batch_size = 100, verbose = 1)
    
    #_ , test_acc = model.evaluate(matrices_test,determinants_test)
    #print(test_acc) 
    
    ### Make a prediction on the test set
    determinants_pred = model.predict(matrices_test)
    
    print('''
    RMSE: {}
    Number of layers: {}
    Number of neurons: {}
    Number of samples: {}
    '''.format(np.sqrt(mean_squared_error(determinants_test,determinants_pred)),nb_layers,nb_neurons,nb_samples))
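
On the question of what counts as a good error: one rough yardstick is the RMSE of a model that always predicts the mean determinant, which equals the standard deviation of the targets. The sketch below assumes the same setup as above (2x2 matrices with integer entries in [0, 100)) and generates the data in one vectorized step instead of a Python loop. The names `rng` and `baseline_rmse` are introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
nb_samples = 10000
entries_range = 100

# Draw all matrices at once: shape (nb_samples, 2, 2), entries in [0, entries_range).
matrices = rng.integers(0, entries_range, size=(nb_samples, 2, 2))

# Closed-form 2x2 determinant a*d - b*c, computed exactly in integer arithmetic.
determinants = (matrices[:, 0, 0] * matrices[:, 1, 1]
                - matrices[:, 0, 1] * matrices[:, 1, 0])

# RMSE of a constant predictor that always outputs the mean determinant.
baseline_rmse = np.sqrt(np.mean((determinants - determinants.mean()) ** 2))
print("baseline RMSE:", baseline_rmse)
```

By my calculation this baseline comes out at around 3000 for this setup, so an RMSE of 20 is already small relative to the spread of the targets, and an RMSE of 0.04 is far smaller still.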