
Bash: replace specific text with its translation


I have a huge file in which I want to replace all the text between '=' and the end of the line with its translation. Here is an example:

input:

screen.LIGHT_COLOR=Lighting Color 
screen.LIGHT_M=Light (Morning)
screen.AMBIENT_M=Ambient (Morning)

output:

screen.LIGHT_COLOR=Цвет Освещения
screen.LIGHT_M=Свет (Утро)
screen.AMBIENT_M=Эмбиент (Утро)

All I have managed to do so far is extract and translate the targeted text.

while IFS= read -r line
do
        # quote "$line" so whitespace and glob characters survive intact
        printf '%s\n' "$line" | cut -d= -f2- | trans -b en:ru
done < file.txt

output:
    Цвет Освещения
    Свет (Утро)
    Эмбиент (Утро)

*trans is short for translate-shell. It is slow, but does the job. -b for brief translation; en:ru means English to Russian.

If you have any suggestions or solutions I'll be glad to hear them, thanks!

Edit, in case someone needs it:

After discovering translate-shell's limitations I ended up going with @TaylorG.'s suggestion. It seems that translate-shell allows around 110 requests per some period of time. Processing each line separately results in 1300 requests, which breaks the script.

Long story short, it is faster to pack all the data into a single request. It's possible to reduce the processing time from a couple of minutes to mere seconds. Sorry for the messy code, it's my third day with:

cut -s -d = -f 1 en_US.lang > option_en.txt
# -f 2- keeps the whole value even if it contains another '='
cut -s -d = -f 2- en_US.lang > value_en.txt

# merge lines
sed ':a; N; $!ba; s/\n/ :: /g' value_en.txt > value_en_block.txt

trans -b en:ru -i value_en_block.txt -o value_ru_block.txt

sed 's/ :: /\n/g' value_ru_block.txt > value_ru.txt

paste -d = option_en.txt value_ru.txt > ru_RU.lang

# remove temporary files
rm option_en.txt value_en.txt value_en_block.txt value_ru.txt value_ru_block.txt
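
One caveat with the single-request approach: it only works if the translator returns every " :: " separator unchanged, otherwise the keys and values drift out of alignment. A minimal sketch of a guard, assuming you run it before the paste step and before the rm. check_lines is a hypothetical helper name of my own, not part of translate-shell:

```shell
#!/usr/bin/env bash
# check_lines is a hypothetical helper (not part of translate-shell).
# It succeeds only when both files contain the same number of lines.
check_lines() {
    local keys=$1 values=$2
    local n_keys n_values
    # $(( ... )) strips the padding some wc implementations print
    n_keys=$(( $(wc -l < "$keys") ))
    n_values=$(( $(wc -l < "$values") ))
    if [ "$n_keys" -ne "$n_values" ]; then
        echo "line count mismatch: $n_keys keys, $n_values values" >&2
        return 1
    fi
}
```

It could then gate the merge, e.g. `check_lines option_en.txt value_ru.txt && paste -d = option_en.txt value_ru.txt > ru_RU.lang`, so a mangled separator stops the script instead of silently producing a shifted file.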

Thanks to Taylor G., Armali and every commenter.


Solution

  • Running a pipeline inside a large loop is expensive, because every iteration starts new processes. You can try the following instead.

    cut -s -d = -f 1 file.txt > name.txt
    cut -s -d = -f 2- file.txt | trans -b en:ru > translate.txt
    paste -d = name.txt translate.txt
    

    It should be much faster than your current script. I'm not sure how your trans command is implemented. If it doesn't already accept batch input on stdin, it needs to be updated to do so, e.g. with a while loop.

    trans() {
        while IFS= read -r line; do
            # translate "$line" here and print the result;
            # this placeholder just prints the line unchanged
            printf '%s\n' "$line"
        done
    }
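
To try the cut/paste pipeline without network access or rate limits, here is a rough sketch that swaps in a stand-in translator. Both fake_trans and translate_values are hypothetical names used only for illustration. fake_trans just prefixes each line with "[ru] " instead of translating it:

```shell
#!/usr/bin/env bash
# fake_trans is a hypothetical stand-in for `trans -b en:ru`.
# It reads all of stdin in one go and marks each line with "[ru] ".
fake_trans() {
    sed 's/^/[ru] /'
}

# translate_values is a hypothetical wrapper around the three steps above.
# $1 is the input file; the merged result goes to standard output.
translate_values() {
    local input=$1 tmpdir
    tmpdir=$(mktemp -d)
    # keys: everything before the first '='
    cut -s -d = -f 1 "$input" > "$tmpdir/name.txt"
    # values: everything after the first '=', translated as one batch
    cut -s -d = -f 2- "$input" | fake_trans > "$tmpdir/translate.txt"
    # glue keys and translated values back together line by line
    paste -d = "$tmpdir/name.txt" "$tmpdir/translate.txt"
    rm -r "$tmpdir"
}
```

Calling `translate_values file.txt > out.txt` shows whether keys and values line up. Once that looks right, replace fake_trans with the real `trans -b en:ru` call.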