Create timestamps for audiobook subtitles

I want to add timestamps to the sentences of a book so that they match the corresponding audiobook, ideally in several languages.

Here's an example:
Pride and Prejudice
text from Project Gutenberg
audio from LibriVox

My idea was to find a speech recognition tool that puts timestamps on sentences (step 1), and then map the messy transcription back to the original text using Levenshtein distances (step 2).
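To make step 2 concrete, here is a minimal Python sketch of the mapping (my own script was in R and is not shown here). It assumes each recognized segment roughly corresponds to one sentence and that segments arrive in reading order. The names `match_segments`, `normalize`, and `window` are made up for illustration.

```python
from typing import List, Tuple


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, keeping only two rows."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalize(text: str) -> str:
    """Lowercase and drop punctuation so only the words are compared."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def match_segments(segments: List[Tuple[float, str]],
                   sentences: List[str],
                   window: int = 3) -> List[Tuple[float, str]]:
    """Attach each segment's timestamp to the closest upcoming sentence.

    Assumption: one segment is roughly one sentence, so we only look at
    the next `window` sentences and never move backwards in the text.
    """
    results = []
    cursor = 0
    for start, heard in segments:
        candidates = range(cursor, min(cursor + window, len(sentences)))
        if not candidates:
            break
        heard_norm = normalize(heard)

        def score(k: int) -> float:
            target = normalize(sentences[k])
            # Length-normalized distance, so long sentences are not penalized.
            return levenshtein(heard_norm, target) / max(len(target), 1)

        best = min(candidates, key=score)
        results.append((start, sentences[best]))
        cursor = best + 1
    return results
```

For example, calling `match_segments([(0.0, "it is a truth universaly acknowledge")], book_sentences)` would attach the timestamp `0.0` to whichever of the first three sentences in `book_sentences` is closest to the recognized text.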

The website https://speechlogger.appspot.com/ offers a solution to step 1, but it limits the number of characters it outputs. I could theoretically use web automation to get the job done, by starting a new recording every minute or so, but that's a really dirty hack.

I scripted step 2 in R and tested it on a sample I got from speechlogger. It works okay-ish, but it could be greatly improved if the program knew the text in advance, like when you read a known passage aloud to train speech recognition software. By transcribing first, I'm not using all the information I have.

So my questions are: what alternative ways are there to timestamp audio files, and is there a way to make my process smarter by letting the recognition engine know what it's supposed to recognize?

Solution

There are several good software packages built for exactly this, with varying levels of accuracy:

Gentle - Kaldi-based aligner, works as a service.

Older implementations:

Aligner Demo in Sphinx4 - CMUSphinx toolkit in Java

SAIL align - HTK-based aligner, a sizeable collection of Perl scripts.
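Aligners like these generally work at the word level, so to get sentence timestamps you still need to group the word timings. Below is a rough Python sketch of that grouping. The input shape is an assumption, not any particular tool's format: each word record is a dict with a character `offset` into the original text plus `start` and `end` times in seconds (`None` when the word could not be aligned). Check your aligner's actual output and adapt the field names.

```python
import re
from typing import Dict, List, Optional, Tuple


def sentence_spans(text: str) -> List[Tuple[int, int, str]]:
    """Split text into (start_char, end_char, sentence) tuples.

    Naive split on ., ! and ?; abbreviations such as "Mr." will be cut
    too early, so a real sentence tokenizer would be better here.
    """
    spans = []
    for match in re.finditer(r"[^.!?]+[.!?]*", text):
        raw = match.group()
        sentence = raw.strip()
        if sentence:
            leading = len(raw) - len(raw.lstrip())
            spans.append((match.start() + leading, match.end(), sentence))
    return spans


def sentence_timestamps(text: str,
                        words: List[Dict]) -> List[Dict[str, Optional[object]]]:
    """Derive one start/end time per sentence from word-level timings."""
    out = []
    for s_start, s_end, sentence in sentence_spans(text):
        # Keep only the words of this sentence that were actually aligned.
        timed = [w for w in words
                 if s_start <= w["offset"] < s_end
                 and w.get("start") is not None]
        if timed:
            out.append({"sentence": sentence,
                        "start": min(w["start"] for w in timed),
                        "end": max(w["end"] for w in timed)})
        else:
            # No word aligned: leave the sentence untimed.
            out.append({"sentence": sentence, "start": None, "end": None})
    return out
```

Sentences where no word could be aligned come back with `None` times; you can interpolate those from their neighbours afterwards.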