Tags: python, python-multiprocessing, chainer, chainercv

MultiprocessIterator throws error when changing batch_size


I want to train a Faster R-CNN with ChainerCV. As a first test, I mostly copied the provided example and changed only the lines that load the dataset so that it uses my custom dataset. I verified that my dataset works with all the operations described in this tutorial.

If I run the script without further changes, everything works perfectly, but if I change the batch_size, I get an error. I tried increasing shared_mem from 100 MB to 1000 MB, but the error did not go away.

Error when setting the batch_size=2:

Exception in main training loop: all the input array dimensions except for the concatenation axis must match exactly
Traceback (most recent call last):
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/training/trainer.py", line 315, in run
    update()
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/training/updaters/standard_updater.py", line 165, in update
    self.update_core()
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/training/updaters/standard_updater.py", line 171, in update_core
    in_arrays = self.converter(batch, self.device)
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/dataset/convert.py", line 134, in concat_examples
    [example[i] for example in batch], padding[i])))
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/dataset/convert.py", line 164, in _concat_arrays
    return xp.concatenate([array[None] for array in arrays])
Will finalize trainer extensions and updater before reraising the exception.
Traceback (most recent call last):
  File "/home/cv/ChainerCV/faster_rcnn/train.py", line 131, in <module>
    main()
  File "/home/cv/ChainerCV/faster_rcnn/train.py", line 126, in main
    trainer.run()
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/training/trainer.py", line 329, in run
    six.reraise(*sys.exc_info())
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/six.py", line 693, in reraise
    raise value
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/training/trainer.py", line 315, in run
    update()
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/training/updaters/standard_updater.py", line 165, in update
    self.update_core()
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/training/updaters/standard_updater.py", line 171, in update_core
    in_arrays = self.converter(batch, self.device)
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/dataset/convert.py", line 134, in concat_examples
    [example[i] for example in batch], padding[i])))
  File "/home/cv/anaconda3/envs/chainer/lib/python3.6/site-packages/chainer/dataset/convert.py", line 164, in _concat_arrays
    return xp.concatenate([array[None] for array in arrays])
ValueError: all the input array dimensions except for the concatenation axis must match exactly

System info:

__Hardware Information__
Machine                                       : x86_64
CPU Name                                      : skylake
Number of accessible CPU cores                : 8

__OS Information__
Platform                                      : Linux-4.15.0-45-generic-x86_64-with-debian-stretch-sid
Release                                       : 4.15.0-45-generic
System Name                                   : Linux
Version                                       : #48~16.04.1-Ubuntu SMP Tue Jan 29 18:03:48 UTC 2019
OS specific info                              : debianstretch/sid
glibc info                                    : glibc 2.10

__CUDA Information__
Found 1 CUDA devices
id 0     b'GeForce GTX 1080'                              [SUPPORTED]
                      compute capability: 6.1
                           pci device id: 0
                              pci bus id: 1
Summary:
    1/1 devices are supported
CUDA driver version                           : 10000

__Conda Information__
conda_build_version                           : 3.17.6
conda_env_version                             : 4.6.3
platform                                      : linux-64
python_version                                : 3.7.1.final.0

EDIT: The error also occurs when running the unmodified example with batch_size=2.


Solution

  • While trying to fix the error, I ran into another one:

    ValueError: Currently only batch size 1 is supported.
    

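    This message indicates that only batch_size=1 is supported for now, so waiting for support for larger batches seems to be the only solution.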
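
For context, the original ValueError can be reproduced outside of Chainer. The traceback ends in `xp.concatenate([array[None] for array in arrays])`, which only works when every example in the batch has the same shape. The following is a minimal sketch using plain NumPy. The helper name `stack_examples` and the array shapes are made up for illustration, and it is only an assumption that the custom dataset yields differently sized examples (detection datasets usually do, since image sizes and the number of boxes per image vary).

```python
import numpy as np


def stack_examples(arrays):
    # Hypothetical helper mirroring the stacking step from the traceback:
    # give each array a leading batch axis, then concatenate along it.
    return np.concatenate([array[None] for array in arrays])


# Two made-up CHW "images" whose widths differ (5 vs. 6).
img_a = np.zeros((3, 4, 5), dtype=np.float32)
img_b = np.zeros((3, 4, 6), dtype=np.float32)

# A batch of one example always stacks, whatever its shape.
batch_one = stack_examples([img_a])
print(batch_one.shape)

# A batch of two differently shaped examples cannot be stacked.
try:
    stack_examples([img_a, img_b])
    mixed_ok = True
except ValueError as exc:
    mixed_ok = False
    print("ValueError:", exc)
```

With batch_size=1 the stacking step never sees two differently shaped arrays, which would explain why the unchanged script runs without problems.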