python-3.x google-cloud-platform google-cloud-data-fusion

Is there a way to trigger a Cloud Data Fusion pipeline from a local machine via Python?


I am building a desktop app, currently written in Python, that needs to trigger a Cloud Data Fusion pipeline. Can anyone suggest a way to start the pipeline with a few lines of Python code, without using the Google Cloud Data Fusion UI?


Solution

  • You can use either the PycURL library, which exposes cURL's client-side HTTP methods, or Requests to call the CDAP REST API from Python code.

    The example below shows Python code that uses PycURL to send the HTTP POST request that starts a batch pipeline. For reference, I used the same environment variables as in the documentation link mentioned by @Edwin Elia:

    Set up environment variables:

    export AUTH_TOKEN=$(gcloud auth print-access-token)

    export CDAP_ENDPOINT=$(gcloud beta data-fusion instances describe \
    --location=<region> \
    --format="value(apiEndpoint)" \
    ${INSTANCE_ID})/v3/namespaces/namespace-id/apps/pipeline-name/workflows/DataPipelineWorkflow/start
    

    Python code snippet:

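    import os

    import pycurl

    # Full start URL and access token, both exported in the shell step above.
    CDAP_ENDPOINT = os.environ['CDAP_ENDPOINT']
    AUTH_TOKEN = os.environ['AUTH_TOKEN']

    c = pycurl.Curl()
    c.setopt(pycurl.URL, CDAP_ENDPOINT)
    c.setopt(pycurl.HTTPHEADER, ['Authorization: Bearer %s' % AUTH_TOKEN])
    c.setopt(pycurl.POST, 1)
    # Send an explicit empty body so libcurl does not wait for input on stdin.
    c.setopt(pycurl.POSTFIELDS, '')
    c.perform()
    # Report the HTTP status so a failed start does not go unnoticed.
    print('HTTP status:', c.getinfo(pycurl.RESPONSE_CODE))
    c.close()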
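
If installing PycURL is inconvenient for a desktop app, the same POST can be sketched with the standard library's urllib.request (used here in place of Requests so nothing extra needs installing). This is a minimal sketch, not a definitive client. The function names build_start_request and start_pipeline and their parameters are hypothetical. It assumes api_endpoint is the bare apiEndpoint value returned by gcloud, without the /v3/... suffix, and that token is the output of gcloud auth print-access-token.

```python
import urllib.parse
import urllib.request


def build_start_request(api_endpoint, namespace, pipeline, token):
    """Build (but do not send) the POST request that starts a batch pipeline.

    Hypothetical helper: api_endpoint is assumed to be the bare apiEndpoint
    value, without the /v3/... path appended.
    """
    url = "%s/v3/namespaces/%s/apps/%s/workflows/DataPipelineWorkflow/start" % (
        api_endpoint.rstrip("/"),
        urllib.parse.quote(namespace, safe=""),
        urllib.parse.quote(pipeline, safe=""),
    )
    return urllib.request.Request(
        url,
        data=b"",  # empty body; runtime arguments are not passed in this sketch
        headers={"Authorization": "Bearer %s" % token},
        method="POST",
    )


def start_pipeline(api_endpoint, namespace, pipeline, token, timeout=30):
    """Send the start request and return the HTTP status code."""
    request = build_start_request(api_endpoint, namespace, pipeline, token)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A call would look like start_pipeline(endpoint, "namespace-id", "pipeline-name", token), with endpoint and token read from wherever the app stores them. Keeping the request construction separate from the network call makes the URL and headers easy to inspect before anything is sent.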