Search code examples
caffehardware-acceleration

Resume Training Caffe using Different GPU?


Please forgive what may be an insane question.

My setup: two machines, one with a GTX1080, one with an RX Vega 64. Retraining/fine tuning a bvlc_googlenet model on the GTX1080.

If I build Caffe for the Vega 64, then can I take a snapshot from the GTX1080 machine and restart training on the Vega 64? Would this work in the sense that the training would continue in a normal manner?

What if I moved the GTX1080 snapshot to a Volta V100 in AWS? Would this work?

I know Caffe will to some degree abstract the hardware, but I don't know how well it can do that. I need the GTX 1080 for something else...

Thanks in advance!


Solution

  • To my knowledge, this should work without a problem. Weight files and training snapshots are just blobs of numbers, that you should be able to resume on other hardware (e.g. CPU/GPU), a different machine with a different operating system, or between 32 and 64 bit processes.