apify

How can I save Apify actor output to S3 or a Google Cloud Storage bucket?


I have created an Apify actor. It saves its output in the Apify cloud, but I want to save the output to my own S3 account or a Google Cloud Storage bucket. Thanks for any help with this.


Solution

  • You can create a webhook endpoint (for example, a serverless function on GCP or AWS) and set its URL in the Apify task's webhooks.

    The incoming webhook body will look like this:

    {
      "userId": {{userId}},
      "createdAt": {{createdAt}},
      "eventType": {{eventType}},
      "eventData": {{eventData}},
      "resource": {{resource}}
    }
    

    where resource looks like the following when the run has succeeded:

    {
      "id": {{id}},
      "actId": {{actId}},
      "userId": {{userId}},
      "startedAt": "2020-03-29T04:12:07.434Z",
      "finishedAt": "2020-03-29T04:12:13.415Z",
      "status": "SUCCEEDED",
      "meta": {
        "origin": "DEVELOPMENT",
        "userAgent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36"
      },
      "stats": {
        ...
        ...
        ...
      },
      "options": {
        ...
        ...
        ...
      },
      "buildId": {{buildId}},
      "exitCode": 0,
      "defaultKeyValueStoreId": {{defaultKeyValueStoreId}},
      "defaultDatasetId": {{defaultDatasetId}},
      "defaultRequestQueueId": {{defaultRequestQueueId}},
      "buildNumber": "0.0.9",
      "containerUrl": {{containerUrl}}
    }
    

    Your function can then download the dataset programmatically from https://api.apify.com/v2/datasets/{{defaultDatasetId}}/items and save it to your S3 or Google Cloud Storage bucket.

    I hope this helps!
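
The flow described above can be sketched in Python roughly as follows. This is only an illustration under some assumptions, not a definitive implementation: the function names, the `apify/<datasetId>.json` object key, and the bucket name are hypothetical, the dataset is assumed to be readable from the URL without extra authentication, and the S3 upload assumes the third-party boto3 package with AWS credentials already configured. For a Google Cloud Storage bucket you would swap in a different upload function.

```python
import json
import urllib.request

# URL pattern taken from the answer above.
APIFY_ITEMS_URL = "https://api.apify.com/v2/datasets/{dataset_id}/items"


def dataset_items_url(webhook_body: dict) -> str:
    """Build the dataset items URL from the webhook payload."""
    resource = webhook_body["resource"]
    if resource.get("status") != "SUCCEEDED":
        raise ValueError(f"run did not succeed: {resource.get('status')}")
    return APIFY_ITEMS_URL.format(dataset_id=resource["defaultDatasetId"])


def fetch_bytes(url: str) -> bytes:
    """Download the dataset items as raw bytes."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def upload_to_s3(bucket: str, key: str, data: bytes) -> None:
    """Upload to S3 with boto3 (assumed installed: pip install boto3)."""
    import boto3  # imported lazily so the other helpers work without it

    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=data)


def handle_webhook(body: dict, bucket: str,
                   fetch=fetch_bytes, upload=upload_to_s3) -> str:
    """Copy the finished run's dataset into the bucket; return the object key."""
    url = dataset_items_url(body)
    # Hypothetical key layout; choose whatever naming suits your bucket.
    key = f"apify/{body['resource']['defaultDatasetId']}.json"
    upload(bucket, key, fetch(url))
    return key


if __name__ == "__main__":
    # Example payload with a made-up dataset ID; fetch and upload are
    # replaced with stand-ins so nothing touches the network here.
    example = json.loads(
        '{"resource": {"status": "SUCCEEDED", "defaultDatasetId": "abc123"}}'
    )
    print(handle_webhook(example, "my-bucket",
                         fetch=lambda url: b"[]",
                         upload=lambda bucket, key, data: None))
```

In a real serverless function you would parse the request body as JSON and call `handle_webhook(body, "your-bucket-name")` with the default fetch and upload functions.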