python speech-recognition speech-to-text phonetics

Transcribe non-English audio to phonetically similar English words


Essentially, if I have an audio file where someone is speaking, is there something I can use to match the audio to similar-sounding English words?

For example, if a Spanish speaker said:

Hola, me llamo Bob y me gusta ir a la biblioteca.

The program would output something similar to:

old ahh may yam oh bob, E may goo star ear ala bib lee oh tech ah

As you can see from my very bad example, it doesn't need to be close to perfect; it just needs to be phonetically similar. I would prefer something that works with Python, but at this point anything will do.


Solution

  • Very old reply, but I managed to do this with the Vosk transcription library
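
The answer above does not say how Vosk was used, so the following is only a sketch of one plausible approach: decode the non-English audio with an English Vosk model, so the recognizer is forced to pick the closest-sounding English words. The function names, the chunk size, and the model directory (`vosk-model-small-en-us-0.15`, assumed to be downloaded and unpacked beforehand) are illustrative assumptions, and the input is assumed to be a 16-bit mono PCM WAV file.

```python
import json
import sys
import wave


def join_results(result_jsons):
    """Join the "text" field of each Vosk JSON result string into one line."""
    texts = (json.loads(r).get("text", "") for r in result_jsons)
    return " ".join(t for t in texts if t)


def transcribe_as_english(wav_path, model_dir="vosk-model-small-en-us-0.15"):
    """Decode a WAV file with an English model, whatever language is spoken.

    model_dir is an assumed path to an unpacked English Vosk model.
    """
    # Imported here so join_results stays usable without Vosk installed.
    from vosk import KaldiRecognizer, Model

    results = []
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono PCM WAV file")
        rec = KaldiRecognizer(Model(model_dir), wf.getframerate())
        while True:
            data = wf.readframes(4000)  # chunk size is an arbitrary choice
            if not data:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if rec.AcceptWaveform(data):
                results.append(rec.Result())
        # Flush whatever is still buffered in the recognizer.
        results.append(rec.FinalResult())
    return join_results(results)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(transcribe_as_english(sys.argv[1]))
```

Run it as `python script.py recording.wav`. Because the English model can only emit words from its own vocabulary, Spanish speech should come out as English words that sound roughly similar. How close the match is will depend on the model and the recording.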