speech-recognition, huggingface-transformers, transformer-model

Trained model shows good evaluation results but produces garbage after loading


I'm trying to fine-tune a model from the facebook/wav2vec2-base-960h pretrained wav2vec2 checkpoint. These are my training_args:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=save_dir,
    group_by_length=True,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    gradient_accumulation_steps=2,
    evaluation_strategy="steps",
    num_train_epochs=0.5,
    fp16=True,
    save_steps=10,
    eval_steps=10,
    logging_steps=10,
    learning_rate=1e-4,
    warmup_steps=500,
    save_total_limit=2,
)
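
The data_collator used below isn't shown; for wav2vec2 CTC fine-tuning it is typically a padding collator along these lines (a sketch, assuming each dataset row carries input_values and labels):

from dataclasses import dataclass
from typing import Dict, List, Union

import torch
from transformers import Wav2Vec2Processor

@dataclass
class DataCollatorCTCWithPadding:
    # Pads audio inputs and labels separately to the longest item in the batch.
    processor: Wav2Vec2Processor
    padding: Union[bool, str] = True

    def __call__(self, features: List[Dict[str, Union[List[int], torch.Tensor]]]) -> Dict[str, torch.Tensor]:
        input_features = [{"input_values": f["input_values"]} for f in features]
        label_features = [{"input_ids": f["labels"]} for f in features]

        batch = self.processor.pad(input_features, padding=self.padding, return_tensors="pt")
        # On older transformers versions, wrap this call in processor.as_target_processor() instead.
        labels_batch = self.processor.pad(labels=label_features, padding=self.padding, return_tensors="pt")

        # Padding positions are set to -100 so the CTC loss ignores them.
        batch["labels"] = labels_batch["input_ids"].masked_fill(labels_batch.attention_mask.ne(1), -100)
        return batch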

and this is my Trainer:

from transformers import Trainer

trainer = Trainer(
    model=model,
    data_collator=data_collator,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=_common_voice_train,
    eval_dataset=_common_voice_test,
    tokenizer=processor.feature_extractor,
)
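
Likewise, compute_metrics isn't shown; in this setup it is usually the standard word-error-rate metric, roughly like this sketch (assuming the Hugging Face evaluate library and jiwer are installed):

import numpy as np
import evaluate

wer_metric = evaluate.load("wer")

def compute_metrics(pred):
    # Greedy-decode the logits into token ids.
    pred_ids = np.argmax(pred.predictions, axis=-1)

    # -100 marks padding in the labels; replace it so the tokenizer can decode.
    pred.label_ids[pred.label_ids == -100] = processor.tokenizer.pad_token_id

    pred_str = processor.batch_decode(pred_ids)
    label_str = processor.batch_decode(pred.label_ids, group_tokens=False)

    return {"wer": wer_metric.compute(predictions=pred_str, references=label_str)}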

Now that training is finished, trainer.evaluate() shows me good results like this:

reference: "شما امروز صبوری بفرمایین ثبت شده تا امروز با شما هماهنگی انجام بشه"
predicted: "شما امروز سبوری بفرمای سبز شده تا امروز با شما همهمنگی انجام باشه"

but when I try to load and use the model, I get this:

رچسصجپ هدثج یو تو یتنپ هر وغسهروغج سچ ثزتسه شتذس صمرجچو

I load my model like this:

import torch
import librosa
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

sample_rate = 16_000

model = Wav2Vec2ForCTC.from_pretrained("/content/drive/MyDrive/model")
processor = Wav2Vec2Processor.from_pretrained("/content/drive/MyDrive/model")

# load and resample the audio to the 16 kHz the model expects
audio_input, sample_rate = librosa.load("/content60_L4.wav", sr=sample_rate)

input_values = processor(audio_input, sampling_rate=sample_rate, return_tensors="pt").input_values
with torch.no_grad():
    logits = model(input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
transcription = processor.decode(predicted_ids[0])
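
One quick sanity check that can rule out a model/processor mismatch (a sketch; garbled-but-valid-script output often means the loaded vocabulary doesn't match the model's output layer):

# These two numbers should be equal; if not, the checkpoint and the
# processor were saved from different places.
print(len(processor.tokenizer), model.config.vocab_size)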

I can't find my mistake.


Solution

  • It's a little odd, but I ran into the same issue and solved it in a roundabout way: use the same directory for inference that you used for training.

    After packaging the model and running it with the plain Python interpreter instead of through Conda, I never saw the bug again.

    I don't know the reason for this; maybe someone can pin down the cause more precisely.
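
    A minimal sketch of that advice, assuming save_dir is the output_dir from the TrainingArguments above: save both the model and the processor into that one directory, then load both from it.

    from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

    trainer.save_model(save_dir)         # fine-tuned weights + config
    processor.save_pretrained(save_dir)  # tokenizer vocab + feature-extractor config

    model = Wav2Vec2ForCTC.from_pretrained(save_dir)
    processor = Wav2Vec2Processor.from_pretrained(save_dir)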