
smtplib - trying to send an email in Hebrew but getting escaped bytes instead


I'm trying to send an email that is written in Hebrew, but the text arrives as escaped bytes instead of readable Hebrew. I searched for the problem but still can't solve it.

import smtplib
import time
from email.mime.text import MIMEText
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys

service = Service(r"C:\Development\chromedriver.exe")  # raw string, so the backslashes are not treated as escapes
driver = webdriver.Chrome(service=service)

SUPERFARM_URL = "https://shop.super-pharm.co.il/"
my_email = "ZZZ"
password = "ZZZ"

driver.get(SUPERFARM_URL)
time.sleep(2)
search_bar = driver.find_element(By.ID, 'search-input')
search_bar.send_keys("סימילאק גולד")
search_bar.send_keys(Keys.ENTER)
time.sleep(2)

The issue is in these lines:

similac_gold_price = int(driver.find_element(By.XPATH, '//*[@id="results-boxes"]/a[3]/div/div[2]').get_attribute(
    "textContent").strip().split(" ")[-1].strip())
similac_gold_price_dec = int(driver.find_element(By.XPATH, '//*[@id="results-boxes"]/a[3]/div/div[2]').get_attribute(
    "textContent").strip().split(" ")[0].strip())
similac_grams = driver.find_element(By.XPATH, '//*[@id="results-boxes"]/a[4]/div/div[3]/div/div/span[1]').get_attribute("textContent")
similac_grams_encode = u' '.join(similac_grams).encode('utf-8').strip()

if similac_gold_price < 70:
    # The "text_encode" and "similac_grams_encode" are in Hebrew.
    text = driver.find_element(By.XPATH, '//*[@id="results-boxes"]/a[3]/div/div[3]/div/h4').get_attribute('textContent')
    text_encode = u' '.join(text).encode('utf-8').strip()
    print(text)
    print(text_encode)
    with smtplib.SMTP("smtp.gmail.com") as connection:
        connection.starttls()
        connection.login(user=my_email, password=password)
        connection.sendmail(
            from_addr=my_email,
            to_addrs="CCC",
            msg=f"Subject:Similac Offer!\n\n{text_encode} with a price of {similac_gold_price}.{similac_gold_price_dec} ({similac_grams_encode})"
        )

I expected plain Hebrew text, but got the encoded bytes instead. This is the email I receive:

b'\xd7\xa1 \xd7\x99 \xd7\x9e \xd7\x99 \xd7\x9c \xd7\x90 \xd7\xa7 \xc2\xa0 \xd7\xa1 \xd7\x99 \xd7\x9e \xd7\x99 \xd7\x9c \xd7\x90 \xd7\xa7 \xd7\x92 \xd7\x95 \xd7\x9c \xd7\x93 \xd7\xa2 \xd7\x9d H M O \xd7\xa9 \xd7\x9c \xd7\x91 2' with a price of 69.90 (b'7 0 0 \xd7\x92 \xd7\xa8 \xd7\x9d')


Solution

  • The third argument to sendmail needs to be a properly formatted MIME message, not free-form text.

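    Something like this (note that the body uses the original str values text and similac_grams, not the encoded copies; formatting a bytes object into an f-string inserts its repr, which is the b'\xd7...' text you received):

    from email.message import EmailMessage
    ...
        message = EmailMessage()
        message["from"] = my_email
        message["to"] = "CCC"
        message["subject"] = "Similac Offer!"
        # Pass plain str values; EmailMessage takes care of the encoding.
        message.set_content(f"{text} with a price of {similac_gold_price}.{similac_gold_price_dec} ({similac_grams})")
    
        with smtplib.SMTP("smtp.gmail.com") as connection:
            connection.starttls()
            connection.login(user=my_email, password=password)
            connection.send_message(message)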
    

    This should convert your Hebrew text into a suitable character set (probably UTF-8 in practice) and create the necessary MIME structure around it to communicate the correct character set to the recipient's email client.

    See also https://docs.python.org/3/library/email.examples.html. You will find a lot of older code on the net which uses MIMEMultipart and other classes from the legacy API; for new code you want the modern EmailMessage API.
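
To check the behaviour without sending anything, here is a minimal offline sketch. The Hebrew strings are hypothetical stand-ins for the scraped values, and the example.com addresses are placeholders, not anything from the question. It shows the two things that produced the odd output (joining a string with spaces, and formatting bytes into an f-string) and then builds the message from plain str values.

```python
from email.message import EmailMessage

# Hypothetical stand-ins for the values scraped with Selenium.
text = "סימילאק גולד"
similac_grams = "700 גרם"

# " ".join(some_string) iterates over the characters,
# so it puts a space between every one of them.
spaced = " ".join("700")  # "7 0 0"

# Formatting a bytes object into an f-string inserts its repr,
# including the leading b' and the \x escapes.
bytes_in_fstring = f"{similac_grams.encode('utf-8')}"

# Keep everything as str and let EmailMessage pick the charset.
message = EmailMessage()
message["From"] = "sender@example.com"      # placeholder address
message["To"] = "recipient@example.com"     # placeholder address
message["Subject"] = "Similac Offer!"
body = f"{text} with a price of 69.90 ({similac_grams})"
message.set_content(body)

print(spaced)
print(bytes_in_fstring)
print(message.get_content_type(), message.get_content_charset())
# The decoded body round-trips to the original Hebrew text.
print(message.get_content().strip() == body)
```

If the last line prints True and the charset is reported as utf-8, the same message object can be handed to connection.send_message(message) as in the answer above.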