This may be a strange question but here goes. I have a script that reads several sources (RSS) and then compiles a list of articles and sends an e-mail.
I use the pubDate tag
<pubDate>Thu, 27 Apr 2006</pubDate>
and then select everything that was published yesterday, using -1 day in PHP.
I use UTC and my question is when should I run the script to make sure that I get everything that was in fact published. Is it me that is confused or is there a perfect time not to miss anything?
For instance, if I run the script at 08:00 UTC, there may be locations where data has not been published yet. Anything published there an hour later will still carry the same date, but it will not be retrieved when I run the script the next day.
Thanks for any input on schedules etc.
In practice, time zone offsets range from UTC-12:00 to UTC+14:00. Since each time zone has its own concept of a day, if you want to cover the entire world you'll have to wait until after 12:00 PM (noon) UTC before running your script. That is the moment the day ends in UTC-12:00, the last zone to finish it.
In other words, to cover any concept of May 1st, you'll have to wait until Noon UTC on May 2nd.
You might also want to allow a few minutes for clock discrepancies. 12:05 PM UTC would work well.
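The arithmetic behind the noon figure can be checked with a short sketch. It is written in Python for illustration, and the same logic should carry over to PHP's date functions. The function name `earliest_safe_run` and the five-minute default margin are assumptions made for this example, not part of the original script.

```python
from datetime import date, datetime, time, timedelta, timezone

# UTC-12:00 is the last offset in which any given calendar day ends.
LAST_OFFSET = timezone(timedelta(hours=-12))

def earliest_safe_run(day, margin=timedelta(minutes=5)):
    """Return the UTC instant after which `day` is over in every time zone.

    `margin` is a hypothetical allowance for clock discrepancies.
    """
    # Midnight at the start of the following day, as seen in UTC-12:00.
    day_end_local = datetime.combine(day + timedelta(days=1), time.min,
                                     tzinfo=LAST_OFFSET)
    # Convert that instant to UTC and add the safety margin.
    return day_end_local.astimezone(timezone.utc) + margin

print(earliest_safe_run(date(2006, 5, 1)).isoformat())
# prints 2006-05-02T12:05:00+00:00
```

A daily job scheduled at that time would then select items dated two days before the run date's evening, i.e. the day that has just ended everywhere.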
HOWEVER - in many cases, you don't want to process the entire world at once. If you can separate the data by its time zone, you may instead want to run a series of separate smaller batches, each one shortly after midnight in its own time zone.
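As a rough sketch of the per-zone approach, again in Python: the feed names and their offsets below are hypothetical placeholders, and fixed UTC offsets are assumed for simplicity. Real feeds would more likely need named time zones so that daylight saving changes are handled.

```python
from datetime import date, datetime, time, timedelta, timezone

# Hypothetical feeds mapped to fixed UTC offsets in hours (assumption:
# no daylight saving; a real list would use named time zones).
FEED_OFFSETS = {"feed_a": 9, "feed_b": 0, "feed_c": -5}

def batch_run_times(day, margin=timedelta(minutes=5)):
    """Map each feed to the UTC time at which `day` has ended locally."""
    runs = {}
    for feed, hours in FEED_OFFSETS.items():
        tz = timezone(timedelta(hours=hours))
        # Local midnight at the start of the following day.
        local_midnight = datetime.combine(day + timedelta(days=1), time.min,
                                          tzinfo=tz)
        runs[feed] = local_midnight.astimezone(timezone.utc) + margin
    return runs

# List the batches in the order they would run.
for feed, run_at in sorted(batch_run_times(date(2006, 5, 1)).items(),
                           key=lambda item: item[1]):
    print(feed, run_at.isoformat())
```

Each batch only has to wait for its own zone's midnight, so the easternmost feeds can be processed many hours before the noon UTC cutoff that a single worldwide run would require.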