apache-kafka, pipeline, erp

How to build a pipeline that collects data from ERPs like Odoo and stores it in an external database using Kafka


We are working on a project to collect data from ERPs and store it in our own database. We studied many big data technologies and decided to use Apache Kafka for the task, since Kafka ingests data in real time.

The issue is that, even after researching, we still don't know how to go about it. We were able to create a pipeline that collects data from a file.txt, but we are stuck when it comes to ERPs and using their APIs.

Can someone guide us? Or can anyone recommend a course that we could buy or watch to help us? Thanks


Solution

  • Mostly for the record, since I guess you've already found a solution: one path worth exploring is Kafka Connect. Moving data between external systems and Kafka is the reason that API was created, after all.

    I would try writing custom connectors that extract data from the ERPs you need and feed it into the Kafka cluster:

    • either directly from the ERP's database(s), if such access can be granted
    • or by invoking the REST services/endpoints the ERP might expose
    • or, if the ERP already publishes events that expose state changes, by consuming those events
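
The second option (polling an endpoint the ERP exposes) can be sketched roughly as follows. This is not the Kafka Connect API itself: real connectors are Java classes built on Kafka Connect's `SourceConnector` and `SourceTask`. The Python below only illustrates the poll-and-remember-offset logic such a connector would need. The names `poll_once`, `fetch` and `send`, the topic `erp.partners`, and the `id` / `write_date` fields are assumptions made for illustration; `fetch` stands in for whatever API call your ERP offers, and `send` for a Kafka producer's send method.

```python
import json
from typing import Callable, Dict, Iterable

# Hypothetical shapes, chosen for this sketch only:
#   fetch(since) -> records modified after the timestamp `since`
#   send(topic, key, value) -> hands one message to a Kafka producer
Record = Dict[str, object]
Fetch = Callable[[str], Iterable[Record]]
Send = Callable[[str, bytes, bytes], None]


def poll_once(fetch: Fetch, send: Send, topic: str, since: str) -> str:
    """Forward every record changed after `since` and return the new offset.

    Assumes each record has an "id" and a "write_date" string in
    "YYYY-MM-DD HH:MM:SS" format, so timestamps can be compared as strings.
    """
    newest = since
    for record in fetch(since):
        key = str(record["id"]).encode("utf-8")
        value = json.dumps(record, sort_keys=True).encode("utf-8")
        send(topic, key, value)
        newest = max(newest, str(record["write_date"]))
    return newest


if __name__ == "__main__":
    # Stand-ins so the sketch runs without an ERP or a Kafka broker.
    demo_rows = [
        {"id": 1, "name": "Alice", "write_date": "2024-01-01 10:00:00"},
        {"id": 2, "name": "Bob", "write_date": "2024-01-02 09:30:00"},
    ]

    def demo_fetch(since: str) -> Iterable[Record]:
        return [r for r in demo_rows if str(r["write_date"]) > since]

    sent = []
    offset = poll_once(
        demo_fetch,
        lambda topic, key, value: sent.append((topic, key, value)),
        "erp.partners",
        "1970-01-01 00:00:00",
    )
    print(f"forwarded {len(sent)} records, next poll starts after {offset}")
```

The returned offset has to be persisted between polls so that a restart does not resend everything. In an actual Kafka Connect source task this bookkeeping is handled by the framework: the task's `poll()` method returns `SourceRecord` objects that carry a source offset, and Connect stores it for you.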