algorithm, open-source, speech-recognition

Speech retrieval by speech query


My main problem is the following: given a set of reference speech files (each a list of features extracted from a spoken phrase) and a query speech input, I need to find the reference that best matches the query. The point is not to search through all of them, but to prune out as many as possible. Can someone point me to an efficient algorithm that tackles this problem, or to any open-source code that handles such things? Thank you.


Solution

  • I'm assuming that the text spoken in the matching reference file is identical to the text in the query file. A common method is simply to compare each reference file to the query file. Typically you would use the Dynamic Time Warping (DTW) algorithm. The Wikipedia article has links to several implementations, and it isn't too hard to implement yourself. The basic idea is to align the two files in time, score how well they line up, and pick the reference that lines up best with the query.

    I know you said you didn't want to compare against every example, though. In that case, my first thought is to cluster the reference files. Offline, you could compare the reference files against each other and group similar ones together. At query time, you compare the query to only one example from each cluster. Based on the result, you then compare against all files in the closest cluster or clusters.

    That is just one idea; I'm sure there are others.
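To make the two ideas above concrete, here is a minimal sketch in plain Python, not a tuned implementation. It assumes each file is already a list of feature vectors (lists of floats). The names `dtw_distance`, `build_clusters`, `search`, `threshold`, and `n_probe` are made up for this illustration, and the greedy "leader" clustering is just one simple way to do the offline grouping. Since DTW does not satisfy the triangle inequality, the pruned search is approximate and can miss the true best match.

```python
import math

def frame_distance(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

def dtw_distance(a, b):
    """Classic DTW cost between two sequences of feature vectors."""
    inf = float("inf")
    # prev[j] is the best cost of aligning the frames seen so far in a with b[:j]
    prev = [0.0] + [inf] * len(b)
    for frame_a in a:
        curr = [inf] * (len(b) + 1)
        for j, frame_b in enumerate(b, start=1):
            cost = frame_distance(frame_a, frame_b)
            # Extend the cheapest of: repeat b's frame, repeat a's frame, or advance both
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[len(b)]

def build_clusters(references, threshold):
    """Offline step (assumed greedy scheme): the first member of each cluster
    acts as its representative; a file joins the first cluster whose
    representative is within `threshold`, otherwise it starts a new cluster."""
    clusters = []  # list of (representative_index, [member_indices])
    for idx, ref in enumerate(references):
        for rep, members in clusters:
            if dtw_distance(references[rep], ref) <= threshold:
                members.append(idx)
                break
        else:
            clusters.append((idx, [idx]))
    return clusters

def search(query, references, clusters, n_probe=1):
    """Query step: rank clusters by their representative, then compare the
    query against every member of the `n_probe` closest clusters."""
    ranked = sorted(clusters, key=lambda c: dtw_distance(query, references[c[0]]))
    candidates = [i for _, members in ranked[:n_probe] for i in members]
    return min(candidates, key=lambda i: dtw_distance(query, references[i]))

# Toy data: 1-dimensional "features", two obvious groups.
refs = [
    [[0.0], [1.0], [2.0]],
    [[0.0], [1.0], [1.0], [2.0]],   # time-stretched copy of refs[0]
    [[10.0], [11.0], [12.0]],
    [[10.0], [12.0], [14.0]],
]
clusters = build_clusters(refs, threshold=5.0)
query = [[10.0], [12.0], [12.0], [14.0]]  # time-stretched copy of refs[3]
best = search(query, refs, clusters)
print(best)
```

With this toy data the query is compared against two representatives and then two cluster members, instead of all four references. On real data the saving depends on how well the references cluster, and raising `n_probe` trades speed for a lower chance of missing the best match.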