I'm trying to get a project I'm working on to wait on the results of its Scrapy crawls. I'm pretty new to Python, but I'm learning quickly and have liked it so far. Here's my rudimentary function to refresh my crawls:
import os

def refreshCrawls():
    os.system('rm JSON/*.json')
    os.system('scrapy crawl TeamGameResults -o JSON/TeamGameResults.json --nolog')
    # I do this same call for 4 other crawls also
This function gets called in a for loop in my 'main function' while I'm parsing args:
import sys
from pprint import pprint

for i in xrange(1, len(sys.argv)):
    arg = sys.argv[i]
    if arg == '-r':
        pprint('Refreshing Data...')
        refreshCrawls()
This all works and does update the JSON files; however, the rest of my application does not wait on it as I foolishly expected it to. This wasn't really a problem until I moved the app over to a Pi, and now the poor little guy can't refresh fast enough. Any suggestions on how to resolve this?
My quick and dirty answer is to split it into a separate automated script and run it an hour or so before my automated 'main function', or to use a sleep timer, but I'd rather go about this properly if there's some low-hanging fruit that can solve it for me. I do like being able to pass the refresh arg on the command line.
Instead of using os.system, use subprocess:
from subprocess import Popen
import shlex
import os

def refreshCrawls():
    os.system('rm JSON/*.json')
    cmd = shlex.split('scrapy crawl TeamGameResults -o JSON/TeamGameResults.json --nolog')
    p = Popen(cmd)
    # do this same call for the 4 other crawls as well
    p.wait()  # blocks until the crawl finishes
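Popen starts the process without blocking, and p.wait() then blocks until the child process has finished, so control doesn't return to your main function until the crawl is done. Your argument-parsing loop stays the same: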
for i in xrange(1, len(sys.argv)):
    arg = sys.argv[i]
    if arg == '-r':
        pprint('Refreshing Data...')
        refreshCrawls()
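Since you run 4 other crawls as well, you can go a step further: start all of them with Popen first and then wait on each one, so the crawls run in parallel and the function only returns once every crawl has finished. A minimal sketch, assuming the spider names other than TeamGameResults are placeholders for your real ones:

from subprocess import Popen
import shlex
import os

# All spider names except TeamGameResults are hypothetical placeholders
SPIDERS = ['TeamGameResults', 'Spider2', 'Spider3', 'Spider4', 'Spider5']

def refreshCrawls():
    os.system('rm JSON/*.json')
    procs = []
    for spider in SPIDERS:
        cmd = shlex.split('scrapy crawl %s -o JSON/%s.json --nolog' % (spider, spider))
        procs.append(Popen(cmd))  # start each crawl without blocking
    for p in procs:
        p.wait()  # block until every crawl has finished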