I'm trying to train an extremely simple neural network with Lasagne: one dense layer with one output and no nonlinearity (so it's simply a linear regression). Here's my code:
#!/usr/bin/env python
import numpy as np
import theano
import theano.tensor as T
import lasagne
import time


def build_mlp(input_var=None):
    l_in = lasagne.layers.InputLayer(shape=(None, 36), input_var=input_var)
    l_out = lasagne.layers.DenseLayer(
        l_in,
        num_units=1)
    return l_out


if __name__ == '__main__':
    start_time = time.time()

    input_var = T.matrix('inputs')
    target_var = T.fvector('targets')

    network = build_mlp(input_var)
    prediction = lasagne.layers.get_output(network)[:, 0]
    loss = lasagne.objectives.aggregate(
        lasagne.objectives.squared_error(prediction, target_var), mode="sum")
    params = lasagne.layers.get_all_params(network, trainable=True)
    updates = lasagne.updates.nesterov_momentum(loss, params,
                                                learning_rate=0.01, momentum=0.01)
    train_fn = theano.function([input_var, target_var], loss, updates=updates,
                               allow_input_downcast=True)

    features = [-0.7275278, -1.2492378, -1.1284761, -1.5771232, -1.6482532, 0.57888401,
                -0.66000223, 0.89886779, -0.61547941, 1.2937579, -0.74761862, -1.4564357,
                1.4365945, -3.2745962, 1.3266684, -3.6136472, 1.5396905, -0.60452163,
                1.1510054, -1.0534937, 1.0851847, -0.096269868, 0.15175876, -2.0422907,
                1.6125549, -1.0562884, 2.9321988, -1.3044566, 2.5821636, -1.2787727,
                2.0813208, -0.87762129, 1.493879, -0.60782474, 0.77946049, 0.0]

    print("Network built in " + str(time.time() - start_time) + " sec")

    it_number = 1000
    start_time = time.time()
    for i in xrange(it_number):
        val = lasagne.layers.get_output(network, features).eval()[0][0]
    print("1K outputs: " + str(time.time() - start_time) + " sec")

    p = params[0].eval()
    start_time = time.time()
    for i in xrange(it_number):
        n = np.dot(features, p)
    print("1K dot products: " + str(time.time() - start_time) + " sec")

    print(val)
    print(n)
I'm not training the network here yet; I'm just doing 1K evaluations with the initial random weights to see how long it takes to get 1K actual predictions from the network. Compared to 1K dot products, it's a terrible slowdown!
Network built in 8.86999106407 sec
1K outputs: 53.0574831963 sec
1K dot products: 0.00349998474121 sec
0.0
[-3.37383742]
So my question is: why does it take so much time to evaluate such a simple network?
Also, I'm confused about the predicted value. If the dot product is less than zero, the network outputs 0; otherwise the two values are identical:
Network built in 8.96299982071 sec
1K outputs: 54.2732210159 sec
1K dot products: 0.00287079811096 sec
1.10120121082
[ 1.10120121]
Am I missing something about how DenseLayer works?
Thanks to Jan Schlueter on the lasagne-users mailing list (https://groups.google.com/forum/#!forum/lasagne-users), there is an answer to this.
Here I have not only done 1K passes through the network, but compiled 1K different functions and called each of them once. Instead of calling eval() on 1K different expressions (they are different because each one embeds a different numpy array as a constant), I should have compiled a single prediction function (similar to train_fn, but returning the prediction instead of returning the loss and performing updates) and called that one function 1K times in a loop.
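A minimal sketch of that fix, reusing the input_var and prediction variables from the code above:

# Compile once: a function that returns the prediction instead of the loss
predict_fn = theano.function([input_var], prediction, allow_input_downcast=True)

start_time = time.time()
for i in xrange(it_number):
    val = predict_fn([features])[0]  # pass features as a 1x36 batch
print("1K outputs: " + str(time.time() - start_time) + " sec")

The expensive part (compiling the Theano graph) now happens once, and the loop only runs the already-compiled function.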
The question about the DenseLayer is also solved:
The DenseLayer includes a nonlinearity, which defaults to the rectifier. The rectifier sets all outputs smaller than zero to zero.
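So to get the plain linear regression I intended, the nonlinearity has to be overridden when building the layer (lasagne.nonlinearities.linear is the identity function):

l_out = lasagne.layers.DenseLayer(
    l_in,
    num_units=1,
    nonlinearity=lasagne.nonlinearities.linear)  # identity instead of the default rectify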
It seems that Lasagne questions are much more likely to be answered on Google Groups than on Stack Overflow; according to Jan, the developers are more focused on the mailing list.