python tabula

Convert data frame values to a dictionary


I have code that extracts table data from a PDF, and I want to convert the resulting data frame to a dictionary. If I set the output format to JSON in tabula, the result also includes the coordinates, which I don't need; I only want the data present in the table. Once the data frame is converted to a dictionary, I can carry on with further processing.

from tabula import read_pdf
from tabulate import tabulate
import pandas as pd

# With multiple_tables=True, read_pdf returns a list of DataFrames (one per table).
df = read_pdf(
    "http://www.uncledavesenterprise.com/file/health/Food%20Calories%20List.pdf",
    multiple_tables=True, pages="3",
    pandas_options={"header": None}, guess=False,
)
print(df)

Solution

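  • You can use df.to_dict() to convert a DataFrame to a dictionary. Note that read_pdf with multiple_tables=True returns a list of DataFrames, so call the method on an individual table, for example df[0].to_dict(). You can call it with no arguments, or pass arguments such as orient to control the shape of the result; these are documented in the DataFrame.to_dict reference for pandas v1.0.5 and v0.23.4. You can also use df.to_json() to convert the DataFrame to JSON; its arguments are documented in the DataFrame.to_json reference for the same versions.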
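
As a rough illustration of the options mentioned in this answer, here is a minimal sketch. It does not read the PDF; the small `table` DataFrame and its values are made up and only stand in for one table returned by tabula, with integer column labels like those produced by `pandas_options={'header': None}`.

```python
import json

import pandas as pd

# Hypothetical stand-in for one table from read_pdf (values are invented).
table = pd.DataFrame([["item A", "10"], ["item B", "20"]])

# Default orient="dict": {column label -> {row index -> value}}
by_column = table.to_dict()

# orient="records": one dictionary per row, often the handiest for processing
by_row = table.to_dict(orient="records")

# orient="list": {column label -> list of column values}
as_lists = table.to_dict(orient="list")

# to_json returns a JSON string with only the cell data, no coordinates;
# integer column labels become string keys in the JSON output.
as_json = table.to_json(orient="records")
parsed = json.loads(as_json)

print(by_column)
print(by_row)
print(as_lists)
print(as_json)
```

Which orient fits best depends on the later processing: "records" is convenient for looping over rows, while the default layout is convenient for looking up values by column.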