I have a job that runs calculations nightly and stores results in sqlite files google storage. The scheduled task always succeeds when I manually run. It always fails when cron (Google Cloud Scheduler) runs it. The logs don't give a meaningful error message. 502 error code. Starts a new instance which should not matter. I set up the cron.yaml to retry 2 times. But all executions fail.
My cron.yaml
- description: "PAPER GO calc from alpaca to /tmp/entries.sqlite"
target: default
url: /calcentriesindb
schedule: every sun,mon,tue,wed,thu 20:00
timezone: America/New_York
retry_parameters:
job_retry_limit: 2
min_backoff_seconds: 30
The top log entry is the manual execution. The following 3 are the cron attempts.
Complete log entry from 502 error
{
"protoPayload": {
"@type": "type.googleapis.com/google.appengine.logging.v1.RequestLog",
"appId": "s~wc2022-356423",
"versionId": "20220730t143529",
"requestId": "62eb0c5d00ff0bb76d8eabcbfc0001737e7763323032322d3335363432330001323032323037333074313433353239000100",
"ip": "0.1.0.2",
"startTime": "2022-08-04T00:01:33.767986Z",
"endTime": "2022-08-04T00:01:34.565759Z",
"latency": "0.797773s",
"method": "GET",
"resource": "/calcentriesindb",
"httpVersion": "HTTP/1.1",
"status": 502,
"responseSize": "288",
"userAgent": "AppEngine-Google; (+http://code.google.com/appengine)",
"urlMapEntry": "auto",
"host": "default.wc2022-356423.uc.r.appspot.com",
"cost": 3.2186e-8,
"taskQueueName": "__cron",
"taskName": "05697141537686660811",
"wasLoadingRequest": true,
"instanceIndex": -1,
"finished": true,
"instanceId": "00c61b117c3c5fa64d13a841d4e8db781b8d5f07b01c3875537cbb754a073373efdb153e58e5d70a725960c452ea04b253df39c0c286628e0dec6d",
"line": [
{
"time": "2022-08-04T00:01:34.565621Z",
"severity": "INFO",
"logMessage": "This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application."
}
],
"appEngineRelease": "1.9.71",
"traceId": "115ab8ca6be26417bc63bb90824dbd71",
"first": true,
"traceSampled": true,
"spanId": "1960824541831512114"
},
"insertId": "62eb0c5e0008a2c1355ad083",
"httpRequest": {
"status": 502
},
"resource": {
"type": "gae_app",
"labels": {
"project_id": "wc2022-356423",
"zone": "us16",
"module_id": "default",
"version_id": "20220730t143529"
}
},
"timestamp": "2022-08-04T00:01:33.767986Z",
"severity": "INFO",
"labels": {
"clone_id": "00c61b117c3c5fa64d13a841d4e8db781b8d5f07b01c3875537cbb754a073373efdb153e58e5d70a725960c452ea04b253df39c0c286628e0dec6d"
},
"logName": "projects/wc2022-356423/logs/appengine.googleapis.com%2Frequest_log",
"operation": {
"id": "62eb0c5d00ff0bb76d8eabcbfc0001737e7763323032322d3335363432330001323032323037333074313433353239000100",
"producer": "appengine.googleapis.com/request_id",
"first": true,
"last": true
},
"trace": "projects/wc2022-356423/traces/115ab8ca6be26417bc63bb90824dbd71",
"receiveTimestamp": "2022-08-04T00:01:34.572846740Z",
"spanId": "1960824541831512114",
"traceSampled": true
}
The critical function getting called from a handler:
func CalcEntriesInDb() (interface{}, error) {
symbols, err := getTradableSymbols()
if err != nil {
log.Fatal("getTradableSymbols failed.", err)
}
log.Println(len(symbols), "symbols to scan.")
createPath(c.EntryDbPath) // forces a specific path to exist including creation of folders.
runList := GetRunList(c.RefDbPath)
////loadHistory because of weekends and holidays lets gather more than 13 days to get enough trail for minlow
rowcount, err := loadHistory(20, symbols, c.EntryDbPath)
if err != nil {
log.Fatal(err)
}
log.Println(`Running `, len(runList), `sql scripts.`)
runSqlScripts(runList, c.EntryDbPath)
err = gcsUp(c.EntryDbPath)
sendEmail("Entries Calculated", PrintTableHTML(c.EntryDbPath, "entry"), c.EntryDbPath)
return rowcount, err
}
Here's the excerpt from my indexHandler
func indexHandler(w http.ResponseWriter, r *http.Request) {
spew.Dump(r.URL.Query())
params := r.URL.Query()
var apikey string
log.Println("X-Appengine-Cron", r.Header["X-Appengine-Cron"])
//log.Println(params["marina"])
if len(params["marina"]) == 1 {
apikey = params["marina"][0]
}
if len(r.Header["X-Appengine-Cron"]) > 0 || apikey == c.Apikey {
log.Println("Key or Param passed.")
} else {
http.NotFound(w, r)
return
}
//spew.Dump(r.URL.Path)
switch r.URL.Path {
case "/calcentriesindb":
resp, err := CalcEntriesInDb()
if err != nil {
_, err = fmt.Fprint(w, err)
} else {
_, err = spew.Fprint(w, resp)
}
I finally got gae cron working.
I created a cron entry that includes the apikey. This is a workaround, not a fix.
Instead of scheduling /calcentriesindb
I scheduled /calcentriesindb?marina=apikey
. I wrote my code to not need the api key when executing from app engine cron. Oh well.
Something is definitely still broken with how app engine handles schedule requests.
Also if you didn't see in other comments--scheduling the request from a completely different server using curl worked flawlessly.