I am currently trying to fine-tune the Wav2Vec2
model from https://huggingface.co/dima806/bird_sounds_classification, but my RAM usage is exceeding the free-tier limit on Google Colab.
The following is my code:
from transformers import TrainingArguments, Trainer, Wav2Vec2ForSequenceClassification
# Load model with ignore_mismatched_sizes=True so the classification head
# can be re-initialised for the new label set
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    "dima806/bird_sounds_classification",
    num_labels=len(label2id),
    ignore_mismatched_sizes=True,
)
# Set up training with gradient accumulation
batch_size = 1 # Reduce batch size to manage memory
accumulation_steps = 4 # Accumulate gradients over 4 steps
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    gradient_accumulation_steps=accumulation_steps,  # Gradient accumulation
    num_train_epochs=3,
    weight_decay=0.01,
    fp16=True,  # Enable mixed precision training
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    tokenizer=feature_extractor,
)
# Train the model
trainer.train()
What could be causing the RAM usage to go past 12.7 GB when my dataset only contains 20 items? How can I address this issue?
The sound inputs were too long. After resampling the audio and splitting it into shorter chunks, the problem was resolved.
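For reference, here is a minimal sketch of what that chunking step could look like. It assumes the training data is a Hugging Face datasets object with an "audio" column already resampled to 16 kHz and a "label" column; the 10-second chunk length, the "speech" column name, and the chunk_audio/extract_features helpers are illustrative assumptions, not part of the original code.

import numpy as np

TARGET_SR = 16_000      # Wav2Vec2 expects 16 kHz input
CHUNK_SECONDS = 10      # illustrative chunk length; tune to your memory budget
CHUNK_SAMPLES = TARGET_SR * CHUNK_SECONDS

def chunk_audio(batch):
    """Split each waveform into fixed-length chunks, duplicating its label."""
    chunks, labels = [], []
    for audio, label in zip(batch["audio"], batch["label"]):
        samples = np.asarray(audio["array"], dtype=np.float32)
        for start in range(0, len(samples), CHUNK_SAMPLES):
            chunk = samples[start:start + CHUNK_SAMPLES]
            if len(chunk) >= TARGET_SR:  # skip fragments shorter than 1 second
                chunks.append(chunk)
                labels.append(label)
    return {"speech": chunks, "label": labels}

# batched=True lets the mapped function return more rows than it received,
# so one long recording becomes several shorter training examples
train_dataset = train_dataset.map(
    chunk_audio, batched=True, remove_columns=train_dataset.column_names
)

def extract_features(batch):
    """Run the feature extractor over the chunked waveforms."""
    inputs = feature_extractor(batch["speech"], sampling_rate=TARGET_SR)
    batch["input_values"] = inputs["input_values"]
    return batch

train_dataset = train_dataset.map(extract_features, batched=True)

The same mapping can be applied to val_dataset. Once the inputs are bounded in length, per-batch memory stays roughly constant instead of growing with the longest recording.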