This is the text I extracted from a cropped image containing a table:
S No PART CODE PART DESCRIPTION
HSN
QTY RATE(Rs)
VALUE DISCOUNT SGST SGST%
CGST CGST%
AMOUNT(Rs)
CHAIN LUBE &
CLEANER KIT-
34039900
0.16
1,406.78 213.5648
11.52
19.22
19.22
9
252.00
1
3600008
S00ML.
141715
BULB 12V-2VW(BA9S)
85392940
4
10.17
10.17
0
0.92
0.92
9
12.01
2)
(PARKING)
20.14
18
264.01
TOTAL
223.73
11.52
20.14
18
0.01
ROUND OFF
TOTAL
264
I want to convert this into a pandas DataFrame. How should I do it?
df = pytesseract.image_to_data('1.jpg', lang='eng', output_type='data.frame')
display(df)
You will need to specify output_type='data.frame' so that image_to_data returns a pandas DataFrame.
from PIL import Image
import pytesseract
# output_type='data.frame' requires pandas to be installed
df = pytesseract.image_to_data(Image.open('your_image.jpeg'), lang='eng', output_type='data.frame')
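The DataFrame returned by image_to_data has one row per detected word, not one row per table row, so some post-processing is usually still needed. As a rough sketch (not a full table parser), the words can be grouped back into text lines using the page_num, block_num, par_num, line_num and word_num columns that Tesseract's TSV output provides. The function name words_to_lines and the min_conf parameter are made up for this illustration, and splitting each line into the invoice's columns would still need extra logic based on the left coordinates.

```python
import pandas as pd


def words_to_lines(ocr_df: pd.DataFrame, min_conf: float = 0) -> pd.DataFrame:
    """Collapse word-level OCR rows into one row per detected text line.

    Hypothetical helper: expects the columns produced by
    pytesseract.image_to_data(..., output_type='data.frame').
    """
    # Layout rows (page, block, paragraph, line) carry no text; drop them.
    words = ocr_df.dropna(subset=["text"]).copy()
    # Numeric-looking tokens may have been parsed as numbers; force strings.
    words["text"] = words["text"].astype(str).str.strip()
    # Keep non-empty words whose confidence is at least min_conf.
    words = words[(words["text"] != "") & (words["conf"].astype(float) >= min_conf)]

    keys = ["page_num", "block_num", "par_num", "line_num"]
    lines = (
        words.sort_values(keys + ["word_num"])
        .groupby(keys, as_index=False)
        .agg(left=("left", "min"), top=("top", "min"), text=("text", " ".join))
    )
    # Order lines top to bottom, then left to right, as they appear on the page.
    return lines.sort_values(["page_num", "top", "left"]).reset_index(drop=True)
```

With the df from the snippet above, words_to_lines(df) gives one row per line with its position and joined text. Raising min_conf (for example to 50) drops low-confidence words, at the risk of losing small numbers that Tesseract read correctly but scored poorly.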