Response from Microsoft Translate API in Python


I've recently been trying to create some software that will record speech, convert the speech to text, and translate that text into another language. So far, I have accomplished the first two objectives, but I have been really struggling with translation.

I have been trying to use the Microsoft Translator API, and have followed all of the instructions in setting my environment up. I set up a Microsoft Azure Marketplace account, set up a project, enabled the API, and I have been able to use a simple bash command to get my access token:

curl --data "" 'https://api.cognitive.microsoft.com/sts/v1.0/issueToken?Subscription-Key=mySubscriptionKey'

I have written a small Python script using the requests and argparse libraries that sends the request:

request = {
    'appid': ('Bearer ' + token),
    'text' : txt,
    'from' : 'en',
    'to' : 'fr'
}

response = requests.get('https://api.microsofttranslator.com/v2/http.svc', params = request)

Everything seems to go smoothly, and I get a 200 response (which I gather means success), yet when I try to look at the text in the response, hundreds of lines of obscure HTML are printed out. After looking through a few hundred of the lines (which were, for the most part, listing the dozens of languages I chose NOT to translate my text into) I couldn't find any actual translated text. All of the examples that Microsoft has on their GitHub use the outdated DataMarket website, which Microsoft is in the process of discontinuing, as the authorization link. Moreover, I couldn't find any examples of the API actually being used; they were all just authorization examples. Using the token with their 'Try it Out' example gives me the correct result (though as an XML file?), so this is definitely a Python problem.

So, has anyone used this service before and mind shedding some light on how to interpret or unwrap this response?

Thank you!


Solution

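  • I tried to reproduce your issue but could not. My sample code works fine; I wrote it by following the documentation for the Authentication Token API for Microsoft Cognitive Services Translator API and for the Text Translation API /Translate method. Note that it calls the /Translate method, whereas the request in the question goes to the bare http.svc service URL, which may be why that request returned an HTML page instead of a translation.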

    For reference, here is my sample code in Python, with its output below.

    import requests
    
    # Subscription key, copied from the Keys tab on the Azure portal
    key = "xxxxxxxxxxxxxxxxxxxxxxx"
    
    # Getting the access token
    # Alternative: pass the key as a query parameter instead of a header
    # url4authentication = 'https://api.cognitive.microsoft.com/sts/v1.0/issueToken?Subscription-Key=%s' % key
    # resp4authentication = requests.post(url4authentication)
    
    url4authentication = 'https://api.cognitive.microsoft.com/sts/v1.0/issueToken'
    headers4authentication = {'Ocp-Apim-Subscription-Key': key}
    resp4authentication = requests.post(url4authentication, headers=headers4authentication)
    token = resp4authentication.text
    
    # For calling Translate API
    #text = "happy time"
    text = """
    Everything seems to go smoothly, and I get a 200 response (which I gather means success), yet when I try to look at the text in the response, hundreds of lines of obscure html are printed out. After looking through a few hundred of the lines (which were, for the most part, listing the dozens of languages I chose NOT to translate my text into) I couldn't find any actually translated text. All of the examples that Microsoft has on their github use the outdated DataMarket website that Microsoft is in the process of discontinuing as the authorization link. Moreover, I couldn't find any examples of the API actually being used - they were all just authorization examples. Using the token with their 'Try it Out' example gives me the correct result (though as an xml file?), so this is definitely a python problem.
    
    So, has anyone used this service before and mind shedding some light on how to interpret or unwrap this response?
    
    Thank you!
    """
    # Source and target languages ('from' is a Python keyword, hence 'come')
    come = "en"
    to = "fr"
    # Alternative: build the query string by hand
    # url4translate = 'https://api.microsofttranslator.com/v2/http.svc/Translate?appid=Bearer %s&text=%s&from=%s&to=%s' % (token, text, come, to)
    # headers4translate = {'Accept': 'application/xml'}
    # resp4translate = requests.get(url4translate, headers=headers4translate)
    url4translate = 'https://api.microsofttranslator.com/v2/http.svc/Translate'
    params = {'appid': 'Bearer '+token, 'text': text, 'from': come, 'to': to}
    headers4translate = {'Accept': 'application/xml'}
    resp4translate = requests.get(url4translate, params=params, headers=headers4translate)
    print(resp4translate.text)
    

    Output:

    <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">
    Tout semble aller en douceur, et je reçois une réponse 200 (qui je suppose signifie succès), mais lorsque j’essaie de regarder le texte dans la réponse, des centaines de lignes html obscur sont imprimés. Après avoir regardé à travers quelques centaines des lignes (qui étaient, pour la plupart, répertoriant des dizaines de langues, en que j’ai choisi de ne pas traduire mon texte) je ne pouvais pas trouver n’importe quel texte en fait traduit. Tous les exemples que Microsoft a sur leur github utilisent le site DataMarket dépassé que Microsoft est en train d’interrompre le lien d’autorisation. En outre, je ne pouvais pas trouver des exemples de l’API effectivement utilisés - ils étaient tous exemples juste autorisation. En utilisant le jeton avec leur exemple « Essayer » me donne un résultat correct (même si, comme un fichier xml ?), donc c’est certainement un problème de python.
    
    Ainsi, quiconque a utilisé ce service avant et l’esprit certains éclairant sur la façon d’interpréter ou de dérouler cette réponse ?
    
    Merci !
    </string>
    

    Hope it helps.
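
The response shown above is a single XML <string> element, so the translated text can be unwrapped with Python's standard xml.etree.ElementTree module. The following is a minimal sketch rather than part of the original answer: the helper name extract_translation and the sample string are made up for illustration, and it assumes the response body has the same shape as the output shown above.

```python
import xml.etree.ElementTree as ET


def extract_translation(xml_text):
    """Return the text held by the root <string> element of the response.

    Hypothetical helper: assumes the body is one XML element whose text
    content is the translation, as in the output shown above.
    """
    root = ET.fromstring(xml_text)
    # The translation is the element's own text; the namespace on the
    # tag does not matter because we never look the element up by name.
    return (root.text or "").strip()


# Illustrative sample only, shaped like the output shown above.
sample = (
    '<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">\n'
    "Merci !\n"
    "</string>"
)
print(extract_translation(sample))
```

With the sample code from the answer, this would be called as extract_translation(resp4translate.text) in place of printing the raw response. If the body is not well-formed XML, for example an HTML help page, ET.fromstring will most likely raise ET.ParseError, which is a useful signal that the request did not reach the /Translate method.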