Search code examples
google-app-enginecrongoogle-cloud-scheduler

Cron always fails and manual execution always succeeds. How?


I have a job that runs calculations nightly and stores results in sqlite files google storage. The scheduled task always succeeds when I manually run. It always fails when cron (Google Cloud Scheduler) runs it. The logs don't give a meaningful error message. 502 error code. Starts a new instance which should not matter. I set up the cron.yaml to retry 2 times. But all executions fail.

My cron.yaml

- description: "PAPER GO calc from alpaca to /tmp/entries.sqlite"
  target: default
  url: /calcentriesindb
  schedule: every sun,mon,tue,wed,thu  20:00
  timezone: America/New_York
  retry_parameters:
    job_retry_limit: 2
    min_backoff_seconds: 30

The top log entry is the manual execution. The following 3 are the cron attempts.enter image description here

Complete log entry from 502 error

{
  "protoPayload": {
    "@type": "type.googleapis.com/google.appengine.logging.v1.RequestLog",
    "appId": "s~wc2022-356423",
    "versionId": "20220730t143529",
    "requestId": "62eb0c5d00ff0bb76d8eabcbfc0001737e7763323032322d3335363432330001323032323037333074313433353239000100",
    "ip": "0.1.0.2",
    "startTime": "2022-08-04T00:01:33.767986Z",
    "endTime": "2022-08-04T00:01:34.565759Z",
    "latency": "0.797773s",
    "method": "GET",
    "resource": "/calcentriesindb",
    "httpVersion": "HTTP/1.1",
    "status": 502,
    "responseSize": "288",
    "userAgent": "AppEngine-Google; (+http://code.google.com/appengine)",
    "urlMapEntry": "auto",
    "host": "default.wc2022-356423.uc.r.appspot.com",
    "cost": 3.2186e-8,
    "taskQueueName": "__cron",
    "taskName": "05697141537686660811",
    "wasLoadingRequest": true,
    "instanceIndex": -1,
    "finished": true,
    "instanceId": "00c61b117c3c5fa64d13a841d4e8db781b8d5f07b01c3875537cbb754a073373efdb153e58e5d70a725960c452ea04b253df39c0c286628e0dec6d",
    "line": [
      {
        "time": "2022-08-04T00:01:34.565621Z",
        "severity": "INFO",
        "logMessage": "This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application."
      }
    ],
    "appEngineRelease": "1.9.71",
    "traceId": "115ab8ca6be26417bc63bb90824dbd71",
    "first": true,
    "traceSampled": true,
    "spanId": "1960824541831512114"
  },
  "insertId": "62eb0c5e0008a2c1355ad083",
  "httpRequest": {
    "status": 502
  },
  "resource": {
    "type": "gae_app",
    "labels": {
      "project_id": "wc2022-356423",
      "zone": "us16",
      "module_id": "default",
      "version_id": "20220730t143529"
    }
  },
  "timestamp": "2022-08-04T00:01:33.767986Z",
  "severity": "INFO",
  "labels": {
    "clone_id": "00c61b117c3c5fa64d13a841d4e8db781b8d5f07b01c3875537cbb754a073373efdb153e58e5d70a725960c452ea04b253df39c0c286628e0dec6d"
  },
  "logName": "projects/wc2022-356423/logs/appengine.googleapis.com%2Frequest_log",
  "operation": {
    "id": "62eb0c5d00ff0bb76d8eabcbfc0001737e7763323032322d3335363432330001323032323037333074313433353239000100",
    "producer": "appengine.googleapis.com/request_id",
    "first": true,
    "last": true
  },
  "trace": "projects/wc2022-356423/traces/115ab8ca6be26417bc63bb90824dbd71",
  "receiveTimestamp": "2022-08-04T00:01:34.572846740Z",
  "spanId": "1960824541831512114",
  "traceSampled": true
}

The critical function getting called from a handler:

func CalcEntriesInDb() (interface{}, error) {
    symbols, err := getTradableSymbols()
    if err != nil {
        log.Fatal("getTradableSymbols failed.", err)
    }

    log.Println(len(symbols), "symbols to scan.")
    createPath(c.EntryDbPath) // forces a specific path to exist including creation of folders.
    runList := GetRunList(c.RefDbPath)
    ////loadHistory because of weekends and holidays lets gather more than 13 days to get enough trail for minlow
    rowcount, err := loadHistory(20, symbols, c.EntryDbPath)
    if err != nil {
        log.Fatal(err)
    }
    log.Println(`Running `, len(runList), `sql scripts.`)
    runSqlScripts(runList, c.EntryDbPath)

    err = gcsUp(c.EntryDbPath)
    sendEmail("Entries Calculated", PrintTableHTML(c.EntryDbPath, "entry"), c.EntryDbPath)

    return rowcount, err
}

Here's the excerpt from my indexHandler

func indexHandler(w http.ResponseWriter, r *http.Request) {
    spew.Dump(r.URL.Query())
    params := r.URL.Query()
    var apikey string
    log.Println("X-Appengine-Cron", r.Header["X-Appengine-Cron"])
    //log.Println(params["marina"])
    if len(params["marina"]) == 1 {
        apikey = params["marina"][0]
    }

    if len(r.Header["X-Appengine-Cron"]) > 0 || apikey == c.Apikey {
        log.Println("Key or Param passed.")
    } else {
        http.NotFound(w, r)
        return
    }
    //spew.Dump(r.URL.Path)
    switch r.URL.Path {
    case "/calcentriesindb":
        resp, err := CalcEntriesInDb()
        if err != nil {
            _, err = fmt.Fprint(w, err)
        } else {
            _, err = spew.Fprint(w, resp)
        }

Solution

  • I finally got gae cron working. I created a cron entry that includes the apikey. This is a workaround, not a fix. Instead of scheduling /calcentriesindb I scheduled /calcentriesindb?marina=apikey. I wrote my code to not need the api key when executing from app engine cron. Oh well. Something is definitely still broken with how app engine handles schedule requests. Also if you didn't see in other comments--scheduling the request from a completely different server using curl worked flawlessly.