
How does SQLDatabase Chain work internally? (Langchain)


Langchain Doc

I want to understand the underlying implementation. I know it uses NLP, but how does it determine whether a requested item is a table or a column? Maybe it uses spaCy, customised a bit to understand database terms.

What does it store in memory? Obviously it is not storing the whole database. From this answer, I learned that it stores the DDL of the database.
But a huge database will probably have a large DDL. Won't that create an issue?


Solution

  • This is the implementation of SQLDatabaseChain:

     https://github.com/langchain-ai/langchain/blob/master/libs/experimental/langchain_experimental/sql/base.py
    

    Regarding your questions:

    What does it store in memory? Obviously it is not storing the whole database.

    Answer: Correct, SQLDatabaseChain does not store the entire database; it works from the database's metadata.

    From this answer, I learned that it stores the DDL of the database. But a huge database will probably have a large DDL. Won't that create an issue?

    Answer: The metadata mostly consists of table names, column names, and primary and foreign keys. Taken together, this information is very small compared to the full DDL.
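
To make the "works from metadata" idea concrete, here is a minimal sketch of collecting only table names, column names, and primary and foreign keys, then placing that summary in a text prompt. It is not LangChain's actual code: it uses Python's built-in sqlite3 module, and the helper names describe_schema and build_prompt, the include_tables parameter of the sketch, and the prompt wording are all hypothetical. It assumes the schema summary is what gets handed to a language model, so no row data is ever read.

```python
import sqlite3


def describe_schema(conn, include_tables=None):
    """Summarise tables, columns, and keys as text (hypothetical helper)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    tables = [row[0] for row in rows]
    if include_tables is not None:
        # Restricting the table list keeps the summary small for large schemas.
        tables = [t for t in tables if t in include_tables]

    lines = []
    for table in tables:
        cols = []
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        for _, name, col_type, _, _, pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            cols.append(f"{name} {col_type}" + (" PK" if pk else ""))
        lines.append(f"{table}({', '.join(cols)})")
        # PRAGMA foreign_key_list rows:
        # (id, seq, table, from, to, on_update, on_delete, match)
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            lines.append(f"  FK {table}.{fk[3]} -> {fk[2]}.{fk[4]}")
    return "\n".join(lines)


def build_prompt(schema_text, question):
    """Combine the schema summary and the question (assumed prompt wording)."""
    return (
        "Given the schema below, write a SQL query that answers the question.\n\n"
        f"Schema:\n{schema_text}\n\nQuestion: {question}\nSQL:"
    )


# Small in-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total REAL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (1, 1, 9.5);
    """
)

schema = describe_schema(conn)
prompt = build_prompt(
    describe_schema(conn, include_tables={"orders"}),
    "What is the total of all orders?",
)
print(schema)
print(prompt)
```

The summary grows with the number of tables and columns, not with the number of rows, which is why passing include_tables (or otherwise narrowing the table list) is the natural lever in this sketch when a schema is very large.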