Tags: python, email, beautifulsoup, web-crawler, reddit

Python crawler that then sends out results via email


Thanks for helping. I tried to make a little crawler that checks the gif page of reddit, writes down all the gifs and their titles, puts them in a list, and then sends this list via email (to my work colleagues).

So far so good, it works perfectly, BUT the list it sends looks like this, e.g.:

'1. Old man dancing at electronic music festival: http://i.imgur.com/2EtphXY.gifv', '2. Generation text..: http://i.imgur.com/fH6eV2B.gifv', '3. Porcupine climbs up for warmth:

etc...

What do I want? I want the titles and links printed one per row in the email, and I want to add some text above them, like this:

Hello friends welcome to daily gifs

  1. title1: link1
  2. title2: link2
  3. title3: link3

This is my code so far:

import requests
from bs4 import BeautifulSoup
import urllib2
import smtplib
import time
import random
import datetime

opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
url = 'https://www.reddit.com/r/gifs/?count=26&before=t3_3u4mnz'
response = opener.open(url)
page = response.read()
soup = BeautifulSoup(page, "lxml")
list = []

variable = 1
for link in soup.findAll('a', {'class': 'title may-blank '}):
    href = link.get('href')
    name = link.string
    #print str(variable) + ". " + name + " : " + href
    list.append(str(str(variable) + ". " + name + ": " + href))
    variable += 1


GMAIL_USERNAME = "blabla@blabla.com"
GMAIL_PASSWORD  = "xxxxxxxx"
email_subject = "Lunchtime gifs of the day: " + str(time.strftime("%d/%m/%Y"))
recipient = "workfriends@blabla.com"
body_of_email = str(list)[1:-1]
session = smtplib.SMTP('smtp.gmail.com', 587)
session.ehlo()
session.starttls()
session.login(GMAIL_USERNAME, GMAIL_PASSWORD)

headers = "\r\n".join(["from: " + GMAIL_USERNAME,
                       "subject: " + email_subject,
                       "to: " + recipient,
                       "mime-version: 1.0",
                       "content-type: text/html"])

content = headers + "\r\n\r\n" + body_of_email

session.sendmail(GMAIL_USERNAME, recipient, content)

print "Email send!"

Solution

  • An email's content type is determined by the Content-Type header. You've specified that the content type of your email is text/html, so clients that read this email will interpret the body as HTML.

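    So, make the body you're sending look like HTML: join the entries with <br /> tags, or wrap them in an <ol> list with one <li> per entry, instead of using str(list)[1:-1], which only produces the quoted, comma-separated representation of the list. Alternatively, send the email as text/plain and join the entries with \n characters, which will then be interpreted as you would expect.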

    Personally, for emails like this, I prefer the text/plain format.
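
Both options can be sketched roughly as below. This is only a minimal illustration, not the asker's code: it assumes Python 3 (the question uses Python 2), and the helper names build_plain_body, build_html_body and build_message are hypothetical. It relies on the standard library's email.mime.text.MIMEText to set the Content-Type header, and it assumes each entry is already a numbered string such as "1. title1: link1", as produced by the loop in the question.

```python
from email.mime.text import MIMEText
from html import escape


def build_plain_body(entries, greeting):
    """Put the greeting first, then one entry per line (for text/plain)."""
    return greeting + "\n\n" + "\n".join(entries) + "\n"


def build_html_body(entries, greeting):
    """Same layout for text/html: entries are separated by <br /> tags.

    The entries are already numbered, so <br /> is used rather than <ol>,
    which would number them a second time.
    """
    lines = "<br />\n".join(escape(entry) for entry in entries)
    return "<p>" + escape(greeting) + "</p>\n<p>" + lines + "</p>"


def build_message(body, subtype, sender, recipient, subject):
    """Wrap the body in a MIME message; subtype is 'plain' or 'html'."""
    message = MIMEText(body, subtype, "utf-8")
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    return message


# Example data standing in for the scraped list from the question.
entries = ["1. title1: link1", "2. title2: link2", "3. title3: link3"]
greeting = "Hello friends welcome to daily gifs"

plain_message = build_message(
    build_plain_body(entries, greeting),
    "plain",
    "blabla@blabla.com",
    "workfriends@blabla.com",
    "Lunchtime gifs of the day",
)
```

With the session from the question, the message could then be sent with session.sendmail(GMAIL_USERNAME, recipient, plain_message.as_string()). Swapping build_plain_body for build_html_body and "plain" for "html" gives the HTML variant.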