Tags: python, pytorch, computer-vision, transfer-learning, resnet

Can I train my pretrained model with a totally different architecture?


I have trained a pretrained ResNet18 model on my custom dataset in PyTorch, and I wondered whether I could transfer that model file to train another model with a different architecture, e.g. ResNet50. I know I have to save my model accordingly (explained well in another post here), but this is a question I had never thought about before.
I was planning to use more advanced models like Vision Transformers (ViT), but I couldn't figure out whether I had to start from an already pretrained ViT or whether I could just take my previous model file and use it as the pretrained model to train a ViT.

Example Scenario: ResNet18 --> ResNet50 --> Inception v3 --> ViT

My best guess is that it's not possible because the number of weights, neurons and the layer structures differ, but I would love to hear it if I'm missing a crucial point here.
Thanks!


Solution

  • Between models that differ only in the number of layers (ResNet-18 and ResNet-50), it has been done to initialize some layers of the larger model from the weights of the smaller model's layers. Conversely, you can truncate a larger model by taking a subset of regularly spaced layers and use them to initialize a smaller model. In both cases, you need to retrain everything afterwards if you hope to achieve semi-decent performance (see the first sketch below this answer).

    The whole point of using architectures that differ vastly (vision transformers vs. CNNs) is to learn different features from the inputs and unlock new levels of semantic understanding. Recent models like BEiT also use new self-supervised training schemes that have nothing to do with the classic ImageNet pretraining. Reusing trained weights from another model would defeat that purpose.

    Having said that, if you want to use a ViT, why not start from the pretrained weights available on HuggingFace and fine-tune it on the data you used to train your ResNet50?
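
    For the first point, here is a minimal sketch of what partial weight transfer from a trained ResNet-18 into a fresh ResNet-50 could look like in PyTorch. The checkpoint path and the assumption that you saved a plain state_dict for an unmodified torchvision ResNet-18 are placeholders; adapt them to how you actually saved your model.

    ```python
    import torch
    from torchvision import models

    # Copy only the tensors whose names AND shapes match between the trained
    # ResNet-18 and a fresh ResNet-50. Since ResNet-18 uses BasicBlocks and
    # ResNet-50 uses Bottlenecks, only a few tensors (the stem conv/bn, and the
    # classifier if the class counts are identical) will actually line up.
    small = models.resnet18()
    small.load_state_dict(torch.load("resnet18_custom.pth"))  # assumed: saved as a state_dict

    large = models.resnet50()
    large_state = large.state_dict()

    transferred = {
        name: tensor
        for name, tensor in small.state_dict().items()
        if name in large_state and large_state[name].shape == tensor.shape
    }
    large_state.update(transferred)
    large.load_state_dict(large_state)

    print(f"copied {len(transferred)} of {len(large_state)} tensors")
    # Everything that was not copied is still randomly initialized, so the whole
    # ResNet-50 needs to be (re)trained on your dataset afterwards.
    ```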
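
    And for the last point, a sketch of starting from a pretrained ViT checkpoint on HuggingFace instead of your ResNet weights. The checkpoint name and num_labels are example values; set them to whatever fits your dataset.

    ```python
    from transformers import ViTForImageClassification, ViTImageProcessor

    checkpoint = "google/vit-base-patch16-224-in21k"  # ImageNet-21k pretrained backbone
    processor = ViTImageProcessor.from_pretrained(checkpoint)

    # The backbone keeps its pretrained weights; a new classification head sized
    # for your label count is created and randomly initialized.
    model = ViTForImageClassification.from_pretrained(
        checkpoint,
        num_labels=10,  # placeholder: number of classes in your custom dataset
    )

    # From here, fine-tune on the same custom dataset you used for the ResNets,
    # e.g. with the HuggingFace Trainer or a plain PyTorch training loop.
    ```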