Search code examples
mysqldatabaseamazon-web-servicesdata-warehouseamazon-rds

Application hosted on AWS to access multiple external DB's


This question is not purely related to application development, but quite from an architectural point of view. Let's say an application has to be hosted on AWS, which will require communication from multiple external 3rd party held databases (not mandatory to be on AWS, could be their own data center or a colocation) - maybe one client would be using MySQL, another one MSSQL, or Oracle etc.

So, the data held in these external databases could be changing frequently and all the time the application should retrieve the updated data. One way I could think of is to use another RDBMS DB (from our side) specifically to store all these data from external DB's and to keep our DB in sync all the time. But I am not sure if this is a good way to achieve this.

Or do I need to use any data warehousing tool like Redshift to store all the data synced in real time and then use DW data in our application? According to the architects here, what would be a good solution for achieving this, mainly with multiple external DB connectivity and syncing?


Solution

  • For your usecase, I would recommend to create an REST API which will work as the backend service for Mobile App.

    There are few reasons, you shouldn't use your mobile app to directly connect to backend databases.

    Security Risk

    • Requires to make Databases publicly accessible which is a high security risk.
    • Another security risk is there in storing Database access credentials inside Mobile App.

    Performance

    • Directly communicating with the Database can cause performance issues with respect to data transfer where if you have a backend service, which could optimally send and receive batches of data after combining from different sources based on queries.
    • Also can get the benefit of server side caching by having an API or a backend service.

    By having a Universal View of Your Data through your API, you can connect to multiple databases and retrieve data accordingly, also handling authentication and authorization within your API.

    Following diagram illustrates an architecture using Serverless technology stack(API Gateway & Lambda) in AWS to build an API which could connect to multiple backend databases.

    Serverless Architecture Solution

    Few points to highlight

    • You can decide on having a single Lambda or Multiple Lambda functions based on the complexity of your business logic.
    • Recommend to place all the Lambda's in a VPC and then have a VPN connection to your remote data centers so that the database access don't need to be exposed publicly.