lua, machine-translation, opennmt

Change source input from file to string in th translate.lua


I am new to Lua and I am wondering whether I can pass a string directly to the -src option of translate.lua, instead of the path to a file that contains the string. I searched a lot before posting but could not find anything similar. My main problem is that machine translation is getting slower because I have to open and read files. Thank you in advance!

For example, instead of -src /TestFolder/TestFolder/TestFolder/TestFolder/TestFolder/TestFolder/TestFolder/test.txt, I would like to pass the string directly.


Solution

  • No, it is not possible, because the value of -src must be an existing file:

    th translate.lua -model $model -src "What is going on?"
    translate.lua: invalid argument for option -src: the file must exist
    

    To work around the problem, you can set up a REST or ZeroMQ server to translate text "on the fly".

    You may also write a simple Bash script that accepts several arguments and translates any string, given the paths to ONMT, the model, and (if used) the BPE model:

    #!/bin/bash
    # USAGE: bash translate.sh <TEXT> <ONMT_PATH> <MODEL_FILE_NAME> <BPE_FILE_NAME>

    # Temporary file that holds the input string (printf is safe for text such as "-n")
    file="$2/tmp"
    printf '%s\n' "$1" > "${file}"
    echo "Translating '$1' using ONMT from '$2' using model '$3' and BPE model '$4'"
    cd "$2" || exit 1
    # Tokenize the input, translate it, then detokenize the translation
    th ./tools/tokenize.lua OPTIONS -bpe_model "$4" < "${file}" > "${file}.tok" 2>/dev/null
    th ./translate.lua -model "$3" -src "${file}.tok" -output "${file}.tok.tgt" -gpuid 1 1> /dev/null
    th ./tools/detokenize.lua OPTIONS < "${file}.tok.tgt" > "${file}.tok.tgt.detok" 2>/dev/null
    cat "${file}.tok.tgt.detok"
    # Clean up the intermediate files
    rm -f "${file}" "${file}.tok" "${file}.tok.tgt" "${file}.tok.tgt.detok"
    

    Replace the OPTIONS inside the script with the appropriate options for your (de)tokenization.

    Call it like this:

    bash translate.sh "What is going on?" /OpenNMT /models/m_epoch13_3.33.t7 /models/model.bpe
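
If several strings need translating, a variation on the same idea is to write them all to one temporary file and call translate.lua once, since every separate call has to load the model again. The following is only a sketch under some assumptions: the helper names write_batch and translate_batch are made up for illustration, mktemp is used for the temporary file, the input is assumed to be already tokenized the way the model expects, and -gpuid is left out so the default device is used.

```shell
#!/bin/bash
# Sketch only: helper names are hypothetical, not part of OpenNMT.

# Write each argument on its own line to a fresh temporary file
# and print the path of that file.
write_batch() {
    local batch
    batch="$(mktemp)" || return 1
    printf '%s\n' "$@" > "$batch"
    echo "$batch"
}

# Translate all sentences in a single translate.lua run.
# $1 = ONMT directory, $2 = model file, remaining arguments = sentences.
# Assumes the sentences are already tokenized for the model.
translate_batch() {
    local onmt="$1" model="$2"
    shift 2
    local src out
    src="$(write_batch "$@")" || return 1
    out="${src}.tgt"
    # Run in a subshell so the caller's working directory is unchanged.
    if ! (cd "$onmt" && th ./translate.lua -model "$model" -src "$src" -output "$out" 1> /dev/null); then
        rm -f "$src" "$out"
        return 1
    fi
    cat "$out"
    rm -f "$src" "$out"
}
```

After sourcing these functions, a call could look like translate_batch /OpenNMT /models/m_epoch13_3.33.t7 "What is going on ?" "How are you ?", which should print one translated line per input sentence. Using mktemp also avoids two concurrent runs overwriting the same tmp file.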