I have the following page:
http://greyhoundbet.racingpost.com/#card/race_id=1632746&r_date=2018-08-17&tab=form
It contains a series of information organized in "tables". I need to "extract" that information (rows and columns) to manipulate the info later.
Knowing that I'm a newbie, I tried to do it with bs4 in Python, but I wasn't successful. What would you recommend?
1) Should I use a programming language that would allow me to read the text from the page (which one should I use? What should I look for?) and then manipulate it?
2) Can I get the text manually (Ctrl+C) and send it to Python somehow?
How would you get the info from the page in the easiest way, so that I can work with the data later?
Thank you all and I'm sorry if this is a dumb question. I've been struggling with that for the past week.
Regards, P.
EDIT: I was thinking of using an object-oriented approach to separate every greyhound and study each number. Maybe it's better to do it in C#?
For the manual route (option 2): create a text file and paste the copied text into it. Then read the file with Python:
    with open('page_text.txt') as f:
        lines = f.readlines()
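Once the lines are read, they still have to be split into columns. Here is a minimal sketch of that step, assuming the pasted text keeps one table row per line with tab-separated columns (this depends on the browser and the page, so check your file first). The helper name parse_rows and the sample rows are made up for illustration.

```python
def parse_rows(lines, delimiter="\t"):
    """Split each non-empty line into a list of stripped column values."""
    rows = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between pasted blocks
        rows.append([cell.strip() for cell in line.rstrip("\n").split(delimiter)])
    return rows


# Made-up sample standing in for the output of f.readlines()
sample = ["Trap\tName\tTime\n", "\n", "1\tExample Dog\t29.50\n"]
rows = parse_rows(sample)
print(rows)
```

If the columns turn out to be separated by something else (several spaces, for instance), pass a different delimiter or split with a regular expression instead.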
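You cannot scrape this page with bs4 alone, because the content is loaded dynamically after the initial page load and is not in the HTML that a plain request returns. You need a tool that can load dynamic web pages, such as a headless browser driven by Selenium.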
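A rough sketch of the headless-browser route follows. It assumes Selenium and a matching Chrome driver are installed, and that the rendered page really uses table/tr/td elements; the question puts "tables" in quotes, so the selector may need adjusting after inspecting the page. The names fetch_rendered_html, TableExtractor and the wait_css parameter are made up for this example. The parsing part uses only the standard library; bs4 could parse the rendered HTML just as well.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of every <table> as a list of rows.

    Nested tables are not handled; this is only a sketch.
    """

    def __init__(self):
        super().__init__()
        self.tables = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the text pieces and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def fetch_rendered_html(url, wait_css="table", timeout=10):
    """Load url in headless Chrome and return the HTML after scripts have run.

    wait_css is an assumed selector for an element that only exists once
    the dynamic content has loaded.
    """
    # Imported here so the parser above can be used without Selenium installed
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = Options()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, wait_css))
        )
        return driver.page_source
    finally:
        driver.quit()


# Possible usage (needs a browser, so it is left commented out):
# html = fetch_rendered_html(
#     "http://greyhoundbet.racingpost.com/#card/race_id=1632746&r_date=2018-08-17&tab=form"
# )
# parser = TableExtractor()
# parser.feed(html)
# print(parser.tables)
```

Each entry of parser.tables is then a list of rows, and each row a list of cell strings, which is easy to turn into one object per greyhound later.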