To make my issue reproducible, I generated the following .csv file from the iris flower data set (10 arbitrary rows, all columns standardized) and built a minimal neural network model, which predicts petal width from sepal length, sepal width and petal length, by modifying an MNIST example that I found on the internet. Scroll down to see my question!
import pandas as pd
import numpy as np
import tensorflow as tf
import scipy.stats

# Import iris data
data = pd.read_csv("iris.csv")
input = data[["Sepal.Length", "Sepal.Width", "Petal.Length"]]
target = data[["Petal.Width"]]

# Parameters
learning_rate = 0.001
training_epochs = 6000

# Network Parameters
n_hidden_1 = 5  # 1st layer number of features
n_hidden_2 = 5  # 2nd layer number of features
n_input = 3     # data input
n_output = 1    # data output

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_output])

# Create model
def multilayer_network(x, weights, biases):
    # Hidden layer with TanH activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.tanh(layer_1)
    # Hidden layer with TanH activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.tanh(layer_2)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_output]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_output]))
}

# Construct model
pred = multilayer_network(x, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.square(pred - y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    # Variables must be initialized before the first training step
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        # Run optimization op (backprop) and cost op (to get loss value)
        _, c = sess.run([optimizer, cost], feed_dict={x: input, y: target})
        # Display logs per epoch step
        if epoch % 1000 == 0:
            print("Epoch: %04d cost= %.9f" % (epoch + 1, c))
    print("Optimization Finished!")
Here is an example training session result:
$ python
Epoch: 0001 cost= 3.000185966
Epoch: 1001 cost= 0.031734336
Epoch: 2001 cost= 0.000614795
Epoch: 3001 cost= 0.000008422
Epoch: 4001 cost= 0.000000057
Epoch: 5001 cost= 0.000000000
Optimization Finished!
My idea was to replace the mean squared error with the Spearman distance, which I recently learnt about, as my objective function. Following its definition, I wrote a function that returns the ranking of a vector:
import scipy.stats

def rank(vector):
    return scipy.stats.rankdata(vector, method="min")
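To make the tie handling concrete, here is a small dependency-free sketch of the same ranking rule that method="min" applies: each value's rank is one plus the number of strictly smaller values, so tied values share the lowest rank of their group. The helper name min_rank and the sample values are hypothetical and only serve as an illustration; the function above still relies on scipy.stats.rankdata.

```python
def min_rank(values):
    """Rank values so that ties share the lowest rank of their group.

    Hypothetical helper for illustration only: it mirrors the "min" tie
    rule (rank = 1 + number of strictly smaller values) without SciPy.
    """
    return [1 + sum(other < v for other in values) for v in values]


# Made-up sample: the two 0.3 entries are tied and both receive rank 2,
# and rank 3 is skipped.
sample = [0.3, -1.2, 0.3, 2.0]
ranks = min_rank(sample)
print(ranks)  # [2, 1, 2, 4]
```

Note that the ranks are integers, which is why the cost tensor has to cast them back to floats.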
Using TensorFlow's tf.py_func, I defined my cost tensor as follows.
pred = tf.to_float(tf.py_func(rank, [pred], [tf.int64])[0])
y = tf.to_float(tf.py_func(rank, [y], [tf.int64])[0])
cost = tf.reduce_mean(tf.square(y-pred))
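For reference, the cost tensor above can be written out as a formula. The definition the question relies on is not reproduced here, so this is a sketch that assumes the common form of the Spearman distance as a sum of squared rank differences; the symbols $r(\cdot)$ for the rank vector, $\hat{y}$ for the prediction and $n$ for the number of samples are introduced only for illustration.

```latex
d_S(y, \hat{y}) = \sum_{i=1}^{n} \bigl( r(y)_i - r(\hat{y})_i \bigr)^2,
\qquad
\text{cost} = \frac{1}{n} \sum_{i=1}^{n} \bigl( r(y)_i - r(\hat{y})_i \bigr)^2
            = \frac{d_S(y, \hat{y})}{n}
```

Under that assumption, the code computes the Spearman distance divided by the number of samples, which does not change where its minimum lies.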
However, this gave me the error:
ValueError: No gradients provided for any variable: ((None, <tensorflow.python.ops.variables.Variable object at 0x7f67ffe4ee90>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed3c4990>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed357310>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed357190>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed380350>), (None, <tensorflow.python.ops.variables.Variable object at 0x7f66ed3801d0>))
I do not understand what the underlying problem is. Any direction you could provide me would be greatly appreciated!
Your error comes from the fact that tf.py_func has no gradient defined, so TensorFlow cannot backpropagate through it and every variable ends up with a None gradient.
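Anyway, as @user20160 said in the comments, no useful gradient even exists for the rank operation: it is piecewise constant, so its derivative is zero almost everywhere and undefined at the points where two values swap order. This is therefore not a loss on which you can train your algorithm directly.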
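The piecewise-constant behaviour can be checked numerically without TensorFlow. The sketch below is only an illustration under stated assumptions: min_rank, rank_loss and finite_difference are hypothetical helpers, the numbers are made up, and the ranking rule mimics the "min" tie handling used in the question. It estimates the gradient of the rank-based loss with central finite differences and then moves one prediction far enough to change the ordering.

```python
def min_rank(values):
    """Rank = 1 + number of strictly smaller values (ties share the lowest rank)."""
    return [1 + sum(other < v for other in values) for v in values]


def rank_loss(pred, target):
    """Mean squared difference between the ranks of pred and the ranks of target."""
    rp, rt = min_rank(pred), min_rank(target)
    return sum((a - b) ** 2 for a, b in zip(rp, rt)) / float(len(pred))


def finite_difference(pred, target, index, eps):
    """Central finite-difference estimate of d(loss) / d(pred[index])."""
    up, down = list(pred), list(pred)
    up[index] += eps
    down[index] -= eps
    return (rank_loss(up, target) - rank_loss(down, target)) / (2 * eps)


# Made-up values, chosen so that no two predictions are closer than 0.3.
target = [0.1, 0.4, 0.9, 1.5]
pred = [0.8, 0.2, 1.1, 0.5]

# A small nudge never changes the ordering, so every estimate is exactly zero.
grads = [finite_difference(pred, target, i, 1e-3) for i in range(len(pred))]
print(rank_loss(pred, target))  # 2.5
print(grads)                    # [0.0, 0.0, 0.0, 0.0]

# Moving pred[0] below 0.5 swaps two ranks, and the loss jumps in one step.
moved = [0.4, 0.2, 1.1, 0.5]
print(rank_loss(moved, target))  # 1.0
```

The loss only changes in jumps when two predictions cross, so even with a gradient registered for the ranking op, a gradient-based optimizer such as Adam would receive zero gradients almost everywhere.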