Essentially, I want to run this script every hour and have each run pull only the data from the previous hour, and so on for every hour of the day. How can I do this? The only option I have found is a filter, but I have read that the API pulls a sample first and then filters on the hour within that sample, so it would not return all the data for that hour.
def get_report(analytics):
    return analytics.reports().batchGet(
        body={
            'reportRequests': [
                {
                    'viewId': VIEW_ID,
                    # Relative dates must be 'today', 'yesterday', or 'NdaysAgo'.
                    'dateRanges': [{'startDate': 'yesterday', 'endDate': 'today'}],
                    'metrics': [
                        {'expression': 'ga:uniquepageviews'},
                        {'expression': 'ga:timeonpage'},
                        {'expression': 'ga:entrances'},
                        {'expression': 'ga:exits'},
                        {'expression': 'ga:pageviews'},
                    ],
                    'dimensions': [
                        {'name': 'ga:dimension97'},
                        {'name': 'ga:dimension69'},
                        {'name': 'ga:dateHourMinute'},
                    ],
                }
            ]
        }
    ).execute()
Python has a sched module in its standard library. You can save the code below into a file and then execute it.
There are several ways to keep the script running: a terminal window, a tmux session, a background process, etc.
I used to use cron a lot but have switched to the sched module, which I find easier to troubleshoot.
Save this code into a file.
Make the file executable: chmod 755 myfile.py
Then run the script: ./myfile.py
#!/usr/bin/env python
import sched
import time
from datetime import datetime, timedelta

# Create a scheduler instance.
scheduler = sched.scheduler(timefunc=time.time)


def reschedule(interval: dict = None):
    """Define how often the action function will run.

    Pass a dict interval {'hours': 1} to make it run every hour.
    """
    interval = {'minutes': 1} if interval is None else interval
    # Get the current time and remove the seconds and microseconds.
    now = datetime.now().replace(second=0, microsecond=0)
    # Add the time interval to now.
    target = now + timedelta(**interval)
    # Schedule the task.
    scheduler.enterabs(target.timestamp(), priority=0, action=get_report)


def get_report(analytics=None):
    # Replace the print call with the code that executes the Google API call.
    print(time.ctime())
    reschedule()  # Reschedule so it runs again.


if __name__ == "__main__":
    reschedule()  # Start.
    try:
        scheduler.run(blocking=True)
    except KeyboardInterrupt:
        print('Stopped.')
OUTPUT:
Tue Oct 29 22:35:00 2019
Tue Oct 29 22:36:00 2019
Stopped.
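To keep only the previous hour from each run, one option is to filter the returned rows on the ga:dateHourMinute dimension yourself. Below is a minimal sketch of that idea, not a tested integration. The helper names previous_hour_window and rows_in_window are made up for illustration, and I am assuming rows is the 'rows' list of a Reporting API v4 report (dicts with a 'dimensions' list of strings) and that ga:dateHourMinute values look like YYYYMMDDHHMM. Note that this only trims what the API returned; it does not change whether the report itself was sampled.

```python
from datetime import datetime, timedelta


def previous_hour_window(now=None):
    """Return (start, end) of the last complete clock hour before `now`."""
    now = datetime.now() if now is None else now
    end = now.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=1)
    return start, end


def rows_in_window(rows, start, end, dim_index=2):
    """Keep rows whose ga:dateHourMinute value falls in [start, end).

    Assumption: each row is a dict with a 'dimensions' list of strings.
    `dim_index` is the position of ga:dateHourMinute in that list
    (2 for the request in the question).
    """
    lo = start.strftime('%Y%m%d%H%M')
    hi = end.strftime('%Y%m%d%H%M')
    # The values are fixed-width digit strings, so comparing them as
    # strings orders them chronologically.
    return [r for r in rows if lo <= r['dimensions'][dim_index] < hi]
```

Inside get_report you would call previous_hour_window() and pass the result, together with the report's rows, to rows_in_window. Because the request covers yesterday through today, the run just after midnight still sees the 23:00 hour.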