python tensorflow bert-language-model tpu

BERT pre-training from scratch with TensorFlow 2.x


I previously used the run_pretraining.py script (https://github.com/google-research/bert/blob/master/run_pretraining.py) with TensorFlow 1.15.5, running on a Google Cloud TPU. Is it possible to pre-train BERT from scratch on a TPU with TensorFlow 2.x, and is there a Python script for doing so?


Solution

  • Yes, you can use the NLP library from the TF2 Model Garden.

    The instructions for creating the training data and running pre-training are here: nlp/docs/train.md#pre-train-a-bert-from-scratch.

    You can also follow the "BERT Fine Tuning with Cloud TPU" tutorial, adapting it to run the pre-training script instead of the fine-tuning one.
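
To give a rough idea of what the "creating training data" step does, here is a minimal sketch of BERT-style masked-LM masking in plain Python. It is only an illustration, not the Model Garden code: the function name `mask_tokens`, its parameters, and the toy vocabulary are all hypothetical. The 15% masking rate and the 80/10/10 split ([MASK] / random token / unchanged) follow the BERT paper; the real data scripts also handle tokenization, next-sentence pairs, and writing TFRecord files, none of which is shown here.

```python
import random

MASK_TOKEN = "[MASK]"
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def mask_tokens(tokens, vocab, masked_lm_prob=0.15, max_predictions=20, rng=None):
    """Hypothetical helper: apply BERT-style masking to one token sequence.

    Returns (masked_tokens, positions, labels), where labels[i] is the
    original token at positions[i].
    """
    rng = rng or random.Random()

    # Special tokens are never selected for prediction.
    candidates = [i for i, tok in enumerate(tokens) if tok not in SPECIAL_TOKENS]
    rng.shuffle(candidates)

    # Pick roughly masked_lm_prob of the sequence, at least 1, capped at max_predictions.
    num_to_predict = min(max_predictions, max(1, int(round(len(tokens) * masked_lm_prob))))
    chosen = sorted(candidates[:num_to_predict])

    masked = list(tokens)
    positions, labels = [], []
    for i in chosen:
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN          # 80%: replace with [MASK]
        elif roll < 0.9:
            masked[i] = rng.choice(vocab)   # 10%: replace with a random vocab token
        # remaining 10%: leave the original token in place
        positions.append(i)
        labels.append(tokens[i])
    return masked, positions, labels


if __name__ == "__main__":
    toy_vocab = ["the", "cat", "sat", "on", "mat", "dog"]  # toy vocabulary (assumption)
    sentence = ["[CLS]", "the", "cat", "sat", "on", "the", "mat", "[SEP]"]
    print(mask_tokens(sentence, toy_vocab, rng=random.Random(0)))
```

The model is then trained to predict `labels` at `positions` from the masked sequence. For actual pre-training, use the data-creation script described in the linked train.md rather than a hand-rolled version like this.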