azure, pyspark, databricks, azure-databricks

Searching for a column name across a schema - Databricks


I've found a lot of answers to this question; however, none of them seem to work for me.

I'm looking to do one of two things:

  1. Search through a schema and find a column name that is like my input, or
  2. Search through a schema and find a column that has a value that matches my input.

I am using Azure Databricks and can use either SQL or Python (pyspark) for this.

What I'm currently using is:

%python
databaseName = "DB1"
desiredColumn = "Product_Id"
database = spark.sql(f"show tables in {databaseName} ").collect()
tablenames = []
for row in database:
  cols = spark.table(row.tableName).columns
  if desiredColumn in cols:
    tablenames.append(row.tableName)

However, this keeps returning an error: Table or view not found: "One of my DB Tables". The table named in the error is always the first table of the database, alphabetically.

If I change the database name to a different DB, I get the same error, again with the first table of that DB alphabetically.

Hopefully someone has some idea of what might be going wrong here or might be able to point me in the right direction.

Thanks,

N


Solution

  • "I've found a lot of answers for this question, however none of them seem to work for me for whatever reason." OK, but tell us what you tried; otherwise we may just suggest those same solutions again...

    I think it's because you have to re-specify the database name inside the 'for' loop: spark.table(row.tableName) otherwise resolves the bare table name against the current (default) database, not DB1. Like this:

    databaseName = "DB1"
    desiredColumn = "Product_Id"
    database = spark.sql(f"show tables in {databaseName} ").collect()
    tablenames = []
    
    for row in database:
        # Qualify the table name with the database so spark.table can resolve it
        tableName = f"{databaseName}.{row.tableName}"
        cols = spark.table(tableName).columns
        if desiredColumn in cols:
            tablenames.append(row.tableName)
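As a side note, the same scan can be done through the catalog API instead of a `show tables` query, which also makes it easy to support wildcard matching on the column name (the asker's "column name that is like my input" case). This is just a sketch; `find_tables_with_column` and the `"*product_id*"` pattern are illustrative names, not part of the original code:

```python
from fnmatch import fnmatchcase

def column_matches(col_name, pattern):
    # Case-insensitive wildcard match, e.g. pattern "*product*" matches "Product_Id"
    return fnmatchcase(col_name.lower(), pattern.lower())

def find_tables_with_column(spark, database, pattern):
    # Hypothetical helper: walk every table in `database` via the catalog API
    # and collect the tables that have at least one matching column name.
    matches = []
    for table in spark.catalog.listTables(database):
        cols = spark.catalog.listColumns(table.name, database)
        if any(column_matches(c.name, pattern) for c in cols):
            matches.append(table.name)
    return matches

# Usage (in a Databricks notebook):
# find_tables_with_column(spark, "DB1", "*product_id*")
```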
    
    

    Try it and let us know.
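For the second option in the question (finding tables where a column contains a given value), a similar loop works: check each table's columns first, and only query the tables that actually have the column. This is a sketch under the same assumptions; `find_tables_with_value` is a hypothetical helper, and the small escaping function is there so the value can be embedded safely in the filter string:

```python
def sql_string_literal(value):
    # Escape single quotes so the value is safe inside a SQL string literal
    return "'" + str(value).replace("'", "''") + "'"

def find_tables_with_value(spark, database, column, value):
    # Scan every table in `database`; only query tables that have `column`,
    # and stop at the first matching row per table via limit(1).
    hits = []
    for table in spark.catalog.listTables(database):
        full_name = f"{database}.{table.name}"
        df = spark.table(full_name)
        if column in df.columns:
            match = df.filter(f"`{column}` = {sql_string_literal(value)}").limit(1)
            if match.count() > 0:
                hits.append(table.name)
    return hits

# Usage (in a Databricks notebook):
# find_tables_with_value(spark, "DB1", "Product_Id", "ABC-123")
```

Note that this reads data from every table that has the column, so it can be slow on a large schema.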