I'm designing a neural network (PyTorch) that accomplishes two different but entangled tasks. One is very difficult, one is very easy. Training two separate models (two trainings, two sets of parameters, ...) cracks both problems independently, but when the tasks are combined into one model, the hard task fails no matter what I do. I would like to understand why, and whether there is a way around it.
Example: let's say that, given a picture, I would like to:
Neither of these problems is impossible on its own, yet if I ask the NN to crack them simultaneously, only the easy one succeeds. Why?
My guesses are the following:
I know I could just split the tasks, but that's not what is required of me at the moment.
What can I do?
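For reference, here is roughly the kind of combined set-up I mean: a shared trunk with one head per task, and the two losses simply summed. This is a minimal sketch with placeholder sizes, not my actual model:

```python
import torch
import torch.nn as nn

class TwoTaskNet(nn.Module):
    """Shared trunk with one head per task (all sizes are placeholders)."""
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.easy_head = nn.Linear(16, 2)   # e.g. a binary attribute (easy)
        self.hard_head = nn.Linear(16, 10)  # e.g. a fine-grained label (hard)

    def forward(self, x):
        features = self.trunk(x)            # both heads see the same features
        return self.easy_head(features), self.hard_head(features)

model = TwoTaskNet()
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 32, 32)               # dummy batch of images
y_easy = torch.randint(0, 2, (8,))
y_hard = torch.randint(0, 10, (8,))

out_easy, out_hard = model(x)
# The two losses are simply summed, and this is where the hard task loses out:
loss = criterion(out_easy, y_easy) + criterion(out_hard, y_hard)
loss.backward()
```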
Without knowing your task, network, and data, it's difficult to give advice on your problem specifically. Generally speaking, though, the interaction between different tasks can vary: sometimes they complement each other and boost learning, sometimes they don't. If you're interested, three subfields of Machine Learning that investigate this are Transfer Learning (transferring knowledge from one domain to another), Curriculum Learning (transferring knowledge from one task to another in sequence), and Multitask Learning (learning multiple tasks simultaneously).
Is the input data the same for both tasks? Does the network need to learn them in sequence or simultaneously?
Your guesses may be right: the network might get stuck in a local minimum of the easy task and then be unable to learn the difficult one. That would suggest the tasks are quite different and don't have overlapping solutions. If that's not the case, and they should share parts of their solutions, the learning set-up could be tuned: a slower learning rate, a different ratio between the two loss functions, or a different way of combining the tasks (in sequence, simultaneously, or semi-simultaneously).
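For example, the loss ratio is a cheap first knob to turn. Here is a minimal sketch, reusing the two-head `model` and `criterion` from your question and assuming a hypothetical `loader` that yields `(image, easy_label, hard_label)` batches; the weight of 10.0 is purely illustrative and would need tuning on your data:

```python
import torch

# Up-weight the hard task so its gradient isn't drowned out by the easy one.
# 10.0 is purely illustrative; in practice you would sweep this value.
hard_weight = 10.0

# A slower learning rate is another of the knobs mentioned above.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# `loader` is assumed to yield (image, easy_label, hard_label) batches.
for x, y_easy, y_hard in loader:
    optimizer.zero_grad()
    out_easy, out_hard = model(x)
    loss = criterion(out_easy, y_easy) + hard_weight * criterion(out_hard, y_hard)
    loss.backward()
    optimizer.step()
```

If a sequential set-up is acceptable, another variant of the same idea is to pretrain on the hard task alone and only add the easy-task loss afterwards.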