amazon-web-services amazon-transcribe

Extract all AWS Transcribe results using boto3


I have a couple hundred transcription results in Amazon Transcribe and I would like to get all the transcribed text and store it in one file. Is there any way to do this without clicking on each result and copying and pasting the text?


Solution

  • You can do this via the AWS APIs.

    For example, if you are using Python, you can use the boto3 SDK:

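    • list_transcription_jobs() returns a page of job summaries, each of which includes the TranscriptionJobName. With a couple hundred jobs, keep calling it with the returned NextToken until no token comes back.
    • For each job, call get_transcription_job(), which returns the TranscriptFileUri, the location where the transcript is stored.
    • If the transcript is in your own S3 bucket, use the S3 client's get_object() to download the file. If Transcribe stored it in its service-managed bucket instead, the TranscriptFileUri is a temporary URI that you download over HTTPS.
    • Your program then needs to extract the text from each downloaded JSON file and combine it into one file.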
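
The steps above can be sketched roughly as follows. This is a minimal sketch, not a tested production script. The function names, the output file layout (a "=== job name ===" header before each transcript), and the way the URI is parsed are my own assumptions. In particular, it assumes that a URI with a query string is a temporary pre-signed URL that can be fetched directly, and that any other URI has the path-style form https://s3.<region>.amazonaws.com/<bucket>/<key>.

```python
import json
from urllib.parse import unquote, urlparse
from urllib.request import urlopen


def iter_completed_job_names(transcribe):
    """Yield the name of every COMPLETED job, following NextToken pagination."""
    kwargs = {"Status": "COMPLETED", "MaxResults": 100}
    while True:
        page = transcribe.list_transcription_jobs(**kwargs)
        for summary in page["TranscriptionJobSummaries"]:
            yield summary["TranscriptionJobName"]
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token


def fetch_transcript_json(uri, s3):
    """Download and parse the transcript that TranscriptFileUri points to."""
    parsed = urlparse(uri)
    if parsed.query:
        # Assumption: a query string means a temporary pre-signed URL
        # (service-managed bucket), so a plain HTTPS GET is enough.
        with urlopen(uri) as response:
            return json.loads(response.read())
    # Assumption: path-style URL, i.e. /<bucket>/<key> in your own bucket.
    bucket, key = parsed.path.lstrip("/").split("/", 1)
    obj = s3.get_object(Bucket=bucket, Key=unquote(key))
    return json.loads(obj["Body"].read())


def transcript_text(transcript_json):
    """Join the plain-text transcript(s) from a Transcribe result document."""
    parts = transcript_json["results"]["transcripts"]
    return " ".join(part["transcript"] for part in parts)


def combine_transcripts(transcribe, s3, out_path):
    """Write every completed job's text to out_path; return the job count."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for name in iter_completed_job_names(transcribe):
            job = transcribe.get_transcription_job(TranscriptionJobName=name)
            uri = job["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]
            text = transcript_text(fetch_transcript_json(uri, s3))
            # Hypothetical layout: a header line with the job name, then the text.
            out.write(f"=== {name} ===\n{text}\n\n")
            count += 1
    return count


# Example use (needs AWS credentials and `pip install boto3`):
#   import boto3
#   combine_transcripts(boto3.client("transcribe"), boto3.client("s3"),
#                       "all_transcripts.txt")
```

The clients are passed in as arguments rather than created inside the functions, so you can point them at a specific region or profile, or swap in stubs while testing. If your bucket URIs use a different form than the one assumed here, adjust fetch_transcript_json() accordingly.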

    See how you go with that. If you run into any specific difficulties, post a new Question with the code and an explanation of the problem.