AttributeError: 'list' object has no attribute 'filter'


I want to run a repair job (MSCK REPAIR TABLE) in Azure Databricks, but I want to exclude 4 tables. What am I doing wrong?

database = "az_shffs"
tables = spark.catalog.listTables(database)

tables = tables.filter("tableName != 'exampletable1'").filter("tableName != 'exampletable2'").filter("tableName != 'exampletable3'").filter("tableName != 'exampletable4'")

for table in tables:
    spark.sql(f"MSCK REPAIR TABLE {database}.{table.name}")

I get the following error message:

AttributeError: 'list' object has no attribute 'filter'

Solution

  • spark.catalog.listTables(database) returns a Python list, not a DataFrame, and a list has no filter method, so calling tables.filter(...) raises the AttributeError. If you still want to use filter, get the table names as a DataFrame first and filter that.
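
If converting to a DataFrame is not needed, the list can also be filtered directly in Python. The following is a minimal sketch, assuming the entries returned by listTables expose a name attribute (as the question's own loop does); the helper names tables_to_repair and repair_tables are hypothetical and not part of any Spark API.

```python
def tables_to_repair(tables, excluded):
    """Return the names of catalog tables that are not in `excluded`."""
    excluded = set(excluded)
    return [t.name for t in tables if t.name not in excluded]


def repair_tables(spark, database, excluded):
    """Run MSCK REPAIR TABLE on every table in `database` except `excluded`."""
    # listTables returns a plain Python list, so a list comprehension
    # (not DataFrame.filter) is the way to drop entries from it.
    tables = spark.catalog.listTables(database)
    for name in tables_to_repair(tables, excluded):
        spark.sql(f"MSCK REPAIR TABLE {database}.{name}")
```

In a notebook this would be called as repair_tables(spark, "az_shffs", ["exampletable1", "exampletable2", "exampletable3", "exampletable4"]).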

    The following command returns the table list as a DataFrame, which you can then filter:

    df = spark.sql("show tables in demo")
    display(df)
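
As a rough sketch of how the DataFrame route could exclude the four tables, the filter can be written as a single SQL expression on the tableName column of the SHOW TABLES result. The helper names exclusion_filter and repair_tables_via_dataframe are hypothetical, and the sketch assumes the excluded names are plain identifiers without quote characters.

```python
def exclusion_filter(excluded, column="tableName"):
    """Build a SQL filter string such as: tableName NOT IN ('a', 'b')."""
    # Assumes table names contain no single quotes, so no escaping is done.
    quoted = ", ".join(f"'{name}'" for name in excluded)
    return f"{column} NOT IN ({quoted})"


def repair_tables_via_dataframe(spark, database, excluded):
    """Filter the SHOW TABLES DataFrame, then repair each remaining table."""
    df = spark.sql(f"SHOW TABLES IN {database}")
    # An empty exclusion list would produce an invalid "NOT IN ()", so skip it.
    remaining = df.filter(exclusion_filter(excluded)) if excluded else df
    for row in remaining.collect():
        spark.sql(f"MSCK REPAIR TABLE {database}.{row.tableName}")
```

A single NOT IN expression replaces the four chained filter calls from the question, which keeps the exclusion list in one place.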
    

    To run the MSCK REPAIR TABLE command in a for loop, you can use the code below.

    for i in tables.collect():