Search code examples
sqlpython-3.xdata-analysis

Analyse Data within multiple tables between two SQL Databases using Python


I have been given two SQL Databases which are of same schema type but differ by table names or columns names within those tables.

I am required to analyse those data within the tables and suggest , which are the similar(based on data within the table) tables between those two databases.

Can anybody suggest me to how to move forward with this problem statement. I am new to Data Analytics using Python. It will be really helpful if someone can help me with this.

As an example : lets Consider two databases A and B . . . A contains a table called "person_details" and B contains a table called "details"

I am looking for answers where i can recomend by analyzing the data within those two tables that they can be joined.


Solution

  • Consider using sqlite package of python for connecting to db and querying in it. https://docs.python.org/2/library/sqlite3.html

    Via query you can check the schema of tables. Or if you want to analyse data using pandas then consider using SQLAlchemy for creating connection to DB. http://docs.sqlalchemy.org/en/latest/core/engines.html#supported-databases

    After that read tables as pandas DataFrame using this https://pandas.pydata.org/pandas-docs/version/0.21/generated/pandas.read_sql_table.html

        from sqlalchemy import create_engine
        import pandas as pd
    
        engine_A = create_engine('sqlite:///sqlalchemy_A.db')
        engine_B = create_engine('sqlite:///sqlalchemy_B.db')
    
        data_A = pd.read_sql_table("person_details",engine_A)
        data_B = pd.read_sql_table("details",engine_B)
    

    This should work.