I wrote a mini-app, that scrapes my school's Website then looks for the title of the last post, compare it to the old title, if it's not the same, it then sends me an email. In order for the app to work properly it needs to keep running 24/7 so that the value of the title variable is correct. Here's the code:
import requests
from bs4 import BeautifulSoup
import schedule, time
import sys
import smtplib
#Mailing Info
from_addr = ''
to_addrs = ['']
message = """From: sender
To: receiver
Subject: New Post
A new post has been published
visit the website to view it:
"""
def send_mail(msg):
try:
s = smtplib.SMTP('localhost')
s.login('email',
'password')
s.sendmail(from_addr, to_addrs, msg)
s.quit()
except smtplib.SMTPException as e:
print(e)
#Scraping
URL = ''
title = 'Hello World'
def check():
global title
global message
page = requests.get(URL)
soup = BeautifulSoup(page.content, 'html.parser')
main_section = soup.find('section', id='spacious_featured_posts_widget-2')
first_div = main_section.find('div', class_='tg-one-half')
current_title = first_div.find('h2', class_='entry-title').find('a')['title']
if current_title != title:
send_mail(message)
title = current_title
else:
send_mail("Nothing New")
schedule.every(6).hours.do(check)
while True:
schedule.run_pending()
time.sleep(0.000001)
So my question is How do I keep this code running on host using Cpanel? I know I can use cron jobs to run it every like 2 hours or something, but I don't know how to keep the script itself running, using a terminal doesn't work when I close the page the app gets terminated
So - generally to run programs for an extended period, they would need to be daemonised. Essentially disconnected from your terminal with a double-fork, and a set-sid. Having that said, I've never actually done it myself, since it was usually either (a) the wrong solution, or (b) it's re-inventing the wheel (https://github.com/thesharp/daemonize).
In this case, I think a better course of action would be to invoke the script every 6 hours, rather than have it internally do something every 6 hours. Making your program resilient to a restart is pretty much how most systems are kept reliable, and putting them in a 'cradle' that automatically restarts them.
In your case, I'd suggest saving the title to a file, and reading from and writing to that file when the script is invoked. It would make your script simplier, and more robust, and you'd be using battle-hardened tools for the job.
A couple of years down the line, when your writing code that needs to survive the total machine crashing, and being replaced (within 6 hours, with everything installed) you can use some external form of storage (like a database) instead of a file, to make your system even more resiliant.