
How to estimate Inference time from average forward pass time in caffe?


I use this command to benchmark my ConvNet in caffe:

./build/tools/caffe time -model models/own_xx/deploy.prototxt -weights examples/RSR_50k_all_1k_db/snapshot_iter_10000.caffemodel -gpu=0

It runs fine and generates output which ends with:

I0426 16:08:19.345427 15441 caffe.cpp:377] Average Forward pass: 13.5549 ms.
I0426 16:08:19.345484 15441 caffe.cpp:379] Average Backward pass: 10.7661 ms.
I0426 16:08:19.345527 15441 caffe.cpp:381] Average Forward-Backward: 25.2922 ms.
I0426 16:08:19.345579 15441 caffe.cpp:383] Total Time: 1264.61 ms.
I0426 16:08:19.345628 15441 caffe.cpp:384] *** Benchmark ends ***

In some tutorials I have seen the author simply infer the classification time from the average forward pass time. However, I cannot find any formula or material explaining how to do this. Is there actually some link between the two entities? Which other factors, e.g. the number of iterations and the batch size, are involved? My goal is to accurately predict the classification time of my ConvNet on the GPU.

UPDATE: To not appear completely ignorant, I will add that I have a basic idea that the forward pass is the time taken for an input to propagate to its corresponding output, so it could also be called the inference time. However, what I am interested in is whether that holds irrespective of batch size and iterations. I tried, but during benchmarking caffe does not offer any 'batch' option.


Solution

  • The average forward pass time is the time it takes to propagate one batch of inputs from the input ("data") layer through to the output layer. The batch size specified in your models/own_xx/deploy.prototxt file determines how many images are processed per forward pass.

    For instance, if I run the default command that comes with Caffe:

    build/tools/caffe time --model=models/bvlc_alexnet/deploy.prototxt --gpu=0
    

    I get the following output:

    ...
    I0426 13:07:32.701490 30417 layer_factory.hpp:77] Creating layer data
    I0426 13:07:32.701513 30417 net.cpp:91] Creating Layer data
    I0426 13:07:32.701529 30417 net.cpp:399] data -> data
    I0426 13:07:32.709048 30417 net.cpp:141] Setting up data
    I0426 13:07:32.709079 30417 net.cpp:148] Top shape: 10 3 227 227 (1545870)
    I0426 13:07:32.709084 30417 net.cpp:156] Memory required for data: 6183480
    ...
    I0426 13:07:34.390281 30417 caffe.cpp:377] Average Forward pass: 16.7818 ms.
    I0426 13:07:34.390290 30417 caffe.cpp:379] Average Backward pass: 12.923 ms.
    I0426 13:07:34.390296 30417 caffe.cpp:381] Average Forward-Backward: 29.7969 ms.
    

    The following line:

    I0426 13:07:32.709079 30417 net.cpp:148] Top shape: 10 3 227 227 (1545870)
    

    is super important. It says that your input layer is 10x3x227x227-dimensional. In this case, the batch size is 10 images, each of size 3x227x227 (the 3 refers to the RGB channels of an image).
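    (As a sanity check on the log: 10 x 3 x 227 x 227 = 1,545,870 values, and at 4 bytes per single-precision float that gives 6,183,480 bytes, matching the "Memory required for data: 6183480" line above.)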

    So effectively, the forward pass took 16.7818 ms / 10 images = 1.67818 ms per image, i.e. the inference time per image. As for the number of iterations: the -iterations flag (50 by default) only controls how many batches the timing is averaged over, not the per-batch time. You can see this in your own log, where Total Time 1264.61 ms = 50 x 25.2922 ms (the average forward-backward time).
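    If you want to verify this figure outside of the caffe time tool, you can time the forward pass directly from pycaffe. A minimal sketch, reusing the model and weight paths from the question as placeholders (substitute your own):

    import time
    import caffe

    caffe.set_mode_gpu()
    caffe.set_device(0)

    # Example paths -- point these at your own deploy.prototxt / .caffemodel
    net = caffe.Net('models/own_xx/deploy.prototxt',
                    'examples/RSR_50k_all_1k_db/snapshot_iter_10000.caffemodel',
                    caffe.TEST)

    batch_size = net.blobs['data'].data.shape[0]  # first dim of the input blob

    net.forward()  # warm-up pass so one-time GPU setup is not counted

    n_iters = 50   # same role as caffe time's -iterations flag (default 50)
    start = time.time()
    for _ in range(n_iters):
        net.forward()
    elapsed_ms = (time.time() - start) * 1000.0

    print('Average forward pass: %.4f ms/batch' % (elapsed_ms / n_iters))
    print('Inference time:       %.4f ms/image' % (elapsed_ms / n_iters / batch_size))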

    Changing the batch size

    If you want to change the batch size, look at your .prototxt file. The models/bvlc_alexnet/deploy.prototxt file that comes with Caffe looks like the following:

    name: "AlexNet"
    layer {
      name: "data"
      type: "Input"
      top: "data"
      input_param { shape: { dim: 10 dim: 3 dim: 227 dim: 227 } }
    }
    layer { ...
    

    Simply change dim: 10 to some other value (say 100, to specify a batch size of 100 images per forward pass); alternatively, see the pycaffe sketch below.
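
    If you are using the pycaffe interface, you can also change the batch size at run time without editing the .prototxt at all. A minimal sketch, again with the AlexNet deploy file as an example path:

    import caffe

    # Example model path -- substitute your own deploy.prototxt
    net = caffe.Net('models/bvlc_alexnet/deploy.prototxt', caffe.TEST)

    # Resize the input blob to a batch of 100 images, then propagate
    # the new shape through the rest of the network
    net.blobs['data'].reshape(100, 3, 227, 227)
    net.reshape()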