I am trying to read a file with the extension .mdb and make it into a Foundry Dataset. As far as I checked, the .mdb file requires the Microsoft Access Driver, but the Code Repository does not have that driver, so it cannot be read. (Or the method of setting the absolute path of the file in the Dataset may be wrong.) I don't know how to install the Microsoft Access Driver in the Code Repository, and I couldn't find a manual.
Seeing the following error:
pyodbc.Error: Can't open lib 'Microsoft Access Driver (+.mdb, +.accdb)': file not found (0) (SQLDriverConnect)
Another approach for a 1 time dump of the data is to parse the file outside of Foundry, and upload the CSVs for each of the files. Here is an example:
from access_parser import AccessParser
import csv
import pandas as pd
# .mdb or .accdb file
db = AccessParser("/path/to/.accdb")
# Print DB tables
print(db.catalog)
# Tables are stored as defaultdict(list) -- table[column][row_index]
TABLES_TO_PARSE = [
"TABLE1",
"TABLE2",
"TABLEN"
]
def convert_accdb_to_csv(table_name):
table = db.parse_table(table_name)
df = pd.DataFrame(table)
csv_name = table_name + ".csv"
df.to_csv(csv_name)
print(table_name + " is done.")
for t in TABLES_TO_PARSE:
convert_accdb_to_csv(t)
You can then upload each of the csvs. You can also adapt the code above to run in Foundry by accessing the raw file (.accdb
or .mdb
)