
Python scheduler with command-line arguments


I want to automate a script. In past projects I've used the Python scheduler for this, but for this project I'm unsure how to handle it.

The problem is that the script takes login details that are not stored in the code; they are entered on the command line when launching the script.

ex. python scriptname.py [email protected] password

How can I automate this with the Python scheduler? The code in 'scriptname.py' is:

# LinkedBot.py
import argparse, os, time
import urlparse, random
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup

def getPeopleLinks(page):
    links = []
    for link in page.find_all('a'):
        url = link.get('href')
        if url:
            if 'profile/view?id=' in url:
                links.append(url)
    return links

def getJobLinks(page):
    links = []
    for link in page.find_all('a'):
        url = link.get('href')
        if url:       
            if '/jobs' in url:
                links.append(url)
    return links

def getID(url):
    pUrl = urlparse.urlparse(url)
    return urlparse.parse_qs(pUrl.query)['id'][0]


def ViewBot(browser):
    visited = {}
    pList = []
    count = 0
    while True:
        #sleep to make sure everything loads, add random to make us look human.
        time.sleep(random.uniform(3.5,6.9))
        page = BeautifulSoup(browser.page_source)
        people = getPeopleLinks(page)
        if people:
            for person in people:
                ID = getID(person)
                if ID not in visited:
                    pList.append(person)
                    visited[ID] = 1
        if pList: # if there are people to look at, look at them
            person = pList.pop()
            browser.get(person)
            count += 1
        else: #otherwise find people via the job pages
            jobs = getJobLinks(page)
            if jobs:
                job = random.choice(jobs)
                root = 'http://www.linkedin.com'
                roots = 'https://www.linkedin.com'
                # prepend the site root only when the link is relative
                if root not in job and roots not in job:
                    job = 'https://www.linkedin.com'+job
                browser.get(job)
            else:
                print "I'm Lost Exiting"
                break

        #Output (Make option for this)           
        print "[+] "+browser.title+" Visited! \n("\
            +str(count)+"/"+str(len(pList))+") Visited/Queue)"


def Main():
    parser = argparse.ArgumentParser()
    parser.add_argument("email", help="linkedin email")
    parser.add_argument("password", help="linkedin password")
    args = parser.parse_args()

    browser = webdriver.Firefox()

    browser.get("https://linkedin.com/uas/login")


    emailElement = browser.find_element_by_id("session_key-login")
    emailElement.send_keys(args.email)
    passElement = browser.find_element_by_id("session_password-login")
    passElement.send_keys(args.password)
    passElement.submit()

I'm running this on OS X.


Solution

  • I can see at least two ways to trigger your script automatically. You mention that you start it like this:

    python scriptname.py [email protected] password
    

    This means you start it from a shell. Since you want it to run on a schedule, a crontab entry is a good fit (see https://kvz.io/blog/2007/07/29/schedule-tasks-on-linux-using-crontab/ for an example). cron is also available on OS X.

    If you really want to use the Python scheduler, you can launch the script with the subprocess module.

    In the file that uses the Python scheduler:

    import subprocess
    
    subprocess.call("python scriptname.py [email protected] password", shell=True)
    

    See also the related question: What is the best way to call a Python script from another Python script?
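
The subprocess approach above can be combined with a scheduler along the following lines. This is only a sketch: the question does not say which "python scheduler" is meant, so it assumes the standard-library sched module, and the helper names build_command, run_job and run_repeatedly are made up for illustration. The command is passed as a list of arguments instead of a single string with shell=True, which avoids quoting problems if the password contains special characters.

```python
import sched
import subprocess
import sys
import time


def build_command(script, email, password):
    """Build the argument list that mirrors `python script email password`."""
    # sys.executable is the interpreter running this scheduler file;
    # assumption: the target script should run under the same interpreter.
    return [sys.executable, script, email, password]


def run_job(script, email, password):
    """Launch the script once and return its exit code."""
    return subprocess.call(build_command(script, email, password))


def run_repeatedly(job, interval_seconds, runs):
    """Call job() `runs` times, `interval_seconds` apart, and collect the results."""
    scheduler = sched.scheduler(time.time, time.sleep)
    results = []
    for i in range(runs):
        # Queue every run up front, each at its own offset from now.
        scheduler.enter(i * interval_seconds, 1,
                        lambda: results.append(job()), ())
    scheduler.run()  # blocks until all queued runs have finished
    return results
```

For instance, `run_repeatedly(lambda: run_job("scriptname.py", email, password), 24 * 60 * 60, 7)` would start the script once a day for a week, with `email` and `password` holding the same login details you currently type on the command line.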