python, google-translation-api, google-translator-toolkit

Is it possible to use googletrans to translate only what's between marked areas?


I'm trying to develop something that automatically translates the contents of a file into different languages, but I only want the text between marked areas to be translated. My question: can I make .read only read what is between, say, ""?

In my file I have a list of sentences, words, or even single letters. I put "" (speech marks) around each sentence or word I want to translate.

Below is my txt file:

Sentence 1 - "I like Bannannas and I will eat them all day."
Sentence 2 - How is your day going?
Sentence 3 - "Will there be any sort of fun today or just raining?"
Sentence 4 - Can the sun come out to play!!!

I want to translate only the sentences that are wrapped in "".

My current code:

import re
import googletrans
from googletrans import Translator

file_translator = Translator()

tFile = open('demo.txt', 'r', encoding="utf-8")

if tFile.mode == 'r':
    content = tFile.read()
    print(content)

result = file_translator.translate(content, dest='fr')

with open('output.txt', 'w') as outFile:
    outFile.write(result.text)

Solution

  • First we have to find the right sentences in the file, so we use re to find the quoted text. Then we translate that text using googletrans, replace each found sentence with its translation, and finally write the whole text to the output file.

    Here's the code that does all of that:

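    import re
    from googletrans import Translator

    file_translator = Translator()

    # Read the whole input file as UTF-8
    with open("demo.txt", "r", encoding="utf-8") as in_file:
        content = in_file.read()

    # Capture the text between each pair of double quotes
    pattern = re.compile(r'"([^"]*)"')
    founds = pattern.findall(content)

    # Translate each quoted sentence into French
    translated = []
    for found in founds:
        translated.append(file_translator.translate(found, dest='fr').text)

    # Swap each quoted original (quotes included) for its translation
    for original, translation in zip(founds, translated):
        content = content.replace(f'"{original}"', translation)

    # Write the result as UTF-8 so accented characters are preserved
    with open('output.txt', 'w', encoding="utf-8") as out_file:
        out_file.write(content)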
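
As a possible variation on the find-then-replace steps above, the same thing can be done in a single pass with re.sub and a replacement function, which avoids running str.replace over text that has already been translated. This is only a minimal sketch: translate_quoted, keep_quotes and the translate callable are hypothetical names that are not part of the original answer, and str.upper stands in for the real translator so the example runs without a network connection.

```python
import re
from typing import Callable

# Same pattern as in the answer: capture whatever sits between two double quotes
QUOTED = re.compile(r'"([^"]*)"')


def translate_quoted(text: str,
                     translate: Callable[[str], str],
                     keep_quotes: bool = False) -> str:
    """Return text with every "quoted" span passed through translate().

    translate is any function str -> str (hypothetical parameter); text
    outside the quotes is left untouched.
    """
    def _swap(match):
        translated = translate(match.group(1))
        # Optionally put the speech marks back around the translated text
        return f'"{translated}"' if keep_quotes else translated

    return QUOTED.sub(_swap, text)


if __name__ == "__main__":
    sample = (
        'Sentence 1 - "I like Bannannas and I will eat them all day."\n'
        'Sentence 2 - How is your day going?\n'
    )
    # str.upper is a stand-in for a real translator function
    print(translate_quoted(sample, str.upper))
```

With googletrans, the stand-in could presumably be swapped for something like lambda s: file_translator.translate(s, dest='fr').text, assuming the same synchronous translate call that the answer uses.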