I have a PDF document with 388 pages and one table per page. I am trying to convert them to Excel or to multiple DataFrames, but I am having some difficulties. I have tried the PyPDF2 and tabula libraries, but extraction stops after only one page. The data looks like this:
So far, the best results I have got are with:
import tabula
import pandas as pd
df = pd.DataFrame()
df = tabula.read_pdf("FSA.pdf", multiple_tables=True)
tabula.convert_into("FSA.pdf", "fsa_report.csv", output_format="csv", multiple_tables=True)
print(df)
But it stops after completing page 1. Any help?
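file = "FSA.pdf"
df = tabula.read_pdf(file, lattice=True, pages="all", multiple_tables=True)
tabula.convert_into(file, "fsa_report.csv", output_format="csv", pages="all", multiple_tables=True)
Use these lines. You need to specify the pages argument: tabula reads only page 1 by default, so pass pages="all" (or a range such as "1-388") to process every page.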
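If you would rather end up with a single DataFrame than a list of per-page tables, something like the following might work. It is only a sketch: the helper names `read_all_tables` and `combine_tables` and the `table_no` column are made up for illustration, and `lattice=True` assumes the tables are drawn with ruling lines, which may not hold for your PDF.

```python
import pandas as pd


def read_all_tables(pdf_path):
    """Return a list of DataFrames, one per table tabula finds in the PDF."""
    # Imported here so combine_tables() can be used without tabula/Java installed.
    import tabula

    # pages="all" overrides tabula-py's default of reading only page 1.
    # lattice=True is an assumption: it suits tables with visible cell borders.
    return tabula.read_pdf(pdf_path, pages="all", lattice=True, multiple_tables=True)


def combine_tables(tables):
    """Stack the tables into one DataFrame, tagging each row with its table number."""
    tagged = [
        table.assign(table_no=number)  # hypothetical column name, for illustration
        for number, table in enumerate(tables, start=1)
    ]
    return pd.concat(tagged, ignore_index=True)


# Example usage (commented out because it needs the PDF, tabula-py and Java):
# tables = read_all_tables("FSA.pdf")
# combine_tables(tables).to_csv("fsa_report.csv", index=False)
```

With one table per page, `table_no` should roughly match the page number, but that depends on tabula detecting exactly one table on every page. To get Excel instead of CSV, `to_excel("fsa_report.xlsx", index=False)` should do it, provided an Excel writer such as openpyxl is installed.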