Search code examples
palantir-foundryfoundry-code-repositories

How Can I install Microsoft Access Driver in the Code Repository?


I am trying to read a file with the extension .mdb and make it into a Foundry Dataset. As far as I checked, the .mdb file requires the Microsoft Access Driver, but the Code Repository does not have that driver, so it cannot be read. (Or the method of setting the absolute path of the file in the Dataset may be wrong.) I don't know how to install the Microsoft Access Driver in the Code Repository, and I couldn't find a manual.

Seeing the following error:

pyodbc.Error: Can't open lib 'Microsoft Access Driver (+.mdb, +.accdb)': file not found (0) (SQLDriverConnect)


Solution

  • Another approach for a 1 time dump of the data is to parse the file outside of Foundry, and upload the CSVs for each of the files. Here is an example:

        from access_parser import AccessParser
        import csv
        import pandas as pd
        
        
        # .mdb or .accdb file
        db = AccessParser("/path/to/.accdb")
        
        
        # Print DB tables
        print(db.catalog)
        
        
        # Tables are stored as defaultdict(list) -- table[column][row_index]
        
        TABLES_TO_PARSE = [
            "TABLE1",
            "TABLE2",
            "TABLEN"
        ]
        
        def convert_accdb_to_csv(table_name):
            table = db.parse_table(table_name)
            df = pd.DataFrame(table)
            csv_name = table_name + ".csv"
            df.to_csv(csv_name)
            print(table_name + " is done.")
        ​
        
        for t in TABLES_TO_PARSE:
            convert_accdb_to_csv(t)
    

    You can then upload each of the csvs. You can also adapt the code above to run in Foundry by accessing the raw file (.accdb or .mdb)