jupyter-notebook, machine-translation, bert-language-model, sequence-to-sequence

How can I use BERT for machine translation?


I have a big problem. For my bachelor thesis I have to build a machine translation model with BERT, but I am not getting anywhere right now. Do you know of any documentation that could help me here? I have read some papers in this direction, but maybe there is a documentation or tutorial that would help more.

Specifically, for my bachelor thesis I have to translate a summary of a text into a title. I hope someone can help me.


Solution

  • BERT is not a machine translation model; BERT is designed to provide a contextual sentence representation that should be useful for various NLP tasks. Although there are ways to incorporate BERT into machine translation (https://openreview.net/forum?id=Hyl7ygStwB), it is not an easy problem, and there are doubts about whether it really pays off.

    From your question, it seems that you are not really doing machine translation, but automatic summarization. Like machine translation, it can be approached with sequence-to-sequence models, but we do not call it translation in NLP. For sequence-to-sequence modeling, there are dedicated pre-trained models, such as BART or MASS. These should be much more useful than BERT.
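    As a concrete starting point, a minimal sketch of this approach using BART through the Hugging Face `transformers` library might look as follows. The checkpoint name and the generation length budget are my own illustrative choices, not something prescribed by the task; fine-tuning on your own (summary, title) pairs would normally work better than an off-the-shelf checkpoint.

```python
# Hedged sketch: title generation treated as "extreme summarization" with a
# pre-trained seq2seq model from the Hugging Face `transformers` library.
# The checkpoint name and length limits below are illustrative assumptions.

def build_title_generator(model_name: str = "facebook/bart-large-cnn"):
    """Load a summarization pipeline (requires `pip install transformers`).

    The import is done lazily because loading the model downloads weights
    on first use.
    """
    from transformers import pipeline
    return pipeline("summarization", model=model_name)


def summary_to_title(generator, summary: str) -> str:
    """Compress an input summary into a short, title-like string."""
    # max_length / min_length are measured in tokens, not words.
    outputs = generator(summary, max_length=16, min_length=4, do_sample=False)
    return outputs[0]["summary_text"].strip()
```

    Calling `summary_to_title(build_title_generator(), some_summary)` then yields a short title-like string (the first call downloads the model weights).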


    Update in September 2022: There are multilingual BERT-like models; the most famous are multilingual BERT and XLM-RoBERTa. When fine-tuned carefully, they can be used as a universal encoder for machine translation and enable so-called zero-shot machine translation: the model is trained to translate from several source languages into English, but in the end, it can translate from all languages covered by the multilingual BERT-like model. The method is called SixT.
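    To illustrate the idea of reusing a multilingual BERT-like model as a translation encoder, a rough sketch using `transformers`' `EncoderDecoderModel` is below. This shows only the architectural wiring, not the careful multi-stage fine-tuning that SixT actually performs, and the checkpoint names are my assumptions:

```python
# Hedged sketch: wiring XLM-RoBERTa in as the encoder of a seq2seq model.
# This only builds the architecture; the cross-attention weights connecting
# encoder and decoder are freshly initialised and must still be trained on
# parallel data. SixT itself uses a more careful fine-tuning schedule.

def build_multilingual_translator(encoder_name: str = "xlm-roberta-base",
                                  decoder_name: str = "xlm-roberta-base"):
    """Combine a pre-trained multilingual encoder with a decoder.

    Lazy import, since instantiating the model downloads weights.
    """
    from transformers import EncoderDecoderModel
    return EncoderDecoderModel.from_encoder_decoder_pretrained(
        encoder_name, decoder_name)
```

    Because the encoder is multilingual, a model trained this way on several-languages-to-English data can, in principle, accept input in any language the encoder covers, which is what makes the zero-shot setting possible.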