I am following the instructions to fine-tune BERT with sentence-transformers as described here.
Here is my code:
from sentence_transformers import SentenceTransformer, SentencesDataset, InputExample, losses, evaluation
from torch.utils.data import DataLoader
# load model
embedder = SentenceTransformer('bert-large-nli-mean-tokens')
print("embedder loaded...")
# define your train dataset, the dataloader, and the train loss
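# x_sample["input"] is assumed to already hold InputExample pairs with float similarity labels
# (a short sketch of how such pairs are built follows this code)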
train_dataset = SentencesDataset(x_sample["input"].tolist(), embedder)
train_dataloader = DataLoader(train_dataset, shuffle=False, batch_size=16)
train_loss = losses.CosineSimilarityLoss(embedder)
sentences1 = ['This list contains the first column', 'With your sentences', 'You want your model to evaluate on']
sentences2 = ['sentences2 contains the other column', 'The evaluator matches sentences1[i] with sentences2[i]', 'It computes the cosine similarity and compares it to scores[i]']
scores = [0.3, 0.6, 0.2]
evaluator = evaluation.EmbeddingSimilarityEvaluator(sentences1, sentences2, scores)
# tune the model
embedder.fit(train_objectives=[(train_dataloader, train_loss)],
             epochs=1,
             warmup_steps=100,
             evaluator=evaluator,
             evaluation_steps=1)
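For context, CosineSimilarityLoss expects the dataset to be built from InputExample pairs with a float similarity label, which is what x_sample["input"] is assumed to contain. A minimal sketch with made-up sentences and scores:

from sentence_transformers import InputExample, SentencesDataset

# each training item is a pair of sentences plus a target cosine similarity
train_examples = [
    InputExample(texts=['A man is eating food.', 'A man is eating a meal.'], label=0.9),
    InputExample(texts=['A man is eating food.', 'A plane is taking off.'], label=0.1),
]
train_dataset = SentencesDataset(train_examples, embedder)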
At 4% the training stops and the program exits with no warnings or errors. There is no output.
I have no idea how to troubleshoot - any help would be great.
Edit: Changed the title from "fails" to "stops/quits" because I don't know if it's actually failing.
Here is what I see on my terminal: Epoch: 0%| Killedtion: 0%|
The word "Killed" overlaps the word iteration... memory problem perhaps? FYI: I am running it from the terminal of vscode with wsl on ubuntu vm in windows
Found a related issue on GitHub: https://github.com/ElderResearch/gpu_docker/issues/38
My solution was to set the batch size and number of workers to one, but it is very slow:
train_dataloader = DataLoader(train_dataset, shuffle=False, batch_size=1, num_workers=1)
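Before dropping all the way to a batch size of one, there are gentler ways to cut memory that usually train much faster. This is only a sketch of the knobs I would try; the model name, sequence length, and batch size below are guesses, not tested values:

from sentence_transformers import SentenceTransformer, SentencesDataset, InputExample, losses
from torch.utils.data import DataLoader

# a base-size model needs far less RAM than bert-large-nli-mean-tokens
embedder = SentenceTransformer('bert-base-nli-mean-tokens')
# shorter sequences reduce the memory used per example
embedder.max_seq_length = 128

# placeholder training pairs; build these from your own data
train_examples = [InputExample(texts=['first sentence', 'second sentence'], label=0.8)]
train_dataset = SentencesDataset(train_examples, embedder)
# a moderate batch size instead of 1 may still fit in memory
train_dataloader = DataLoader(train_dataset, shuffle=False, batch_size=4, num_workers=1)
train_loss = losses.CosineSimilarityLoss(embedder)

If the large model is a hard requirement, lowering max_seq_length and the batch size alone may be enough to avoid the kill.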