I'm new into python. Doing some basic webscraping of football scores and got stuck.
I have webscraped using betsapi.com, got Premierleague teams, then in single code got from each team > all games, then in new one got from single game a stats. It will still need to get looped for all matches by all teams, but starting from the bottom:
I got this:
0 1 2
0 Brighton NaN Man City
1 1 Goals 1
2 5 Corners 1
3 3 Corners (Half) 0
4 2 Yellow Card 1
5 0 Yellow/Red Card 0
6 0 Red Card 0
7 6 Throw-ins 11
8 2 Offsides 0
9 11 Free kicks 18
10 7 Goal kicks 5
What I want to achieve is that all Rows >0 will be transposed to Row=0
So the results would look like
0 HomeTeam AwayTeam HGAG...
0 Brighton Man City 1 1 5 1 3 0 2 1 0 0 0 0 6 11
So the format is HomeTeam Away Team then each stats respectively for Home/Away
I have tried transposing with melt, var, but not solved it myself as I'm really beginner.
Header I would have probably to rename manually since there are 29 rows total (pasted only 10 to make it short) so I will get x2 columns from that in theory.
df = df.set_index(['HomeTeam','AwayTeam','HG','AG',....]) should work.
But How to transpose each 2 columns from row N into row 0 one after another? EDIT:
Thanks Timeless your code works. However looks my code wasn't proper for dataframe as it was showing 'axis = 1' not existing.
I have simplified code to:
import requests
liga_url = "https://betsapi.com/r/6557350/Brighton-vs-Man-City"
data = requests.get(liga_url)
from bs4 import BeautifulSoup
import pandas as pd
matches = pd.read_html(data.text)
matches = matches[0] # have chosen to use only first part since in all other df info is repeated or not important, I just miss referee name which is available when you click the link, but somehow read_html not getting it
matches
but with this code i'm probably missing sth in terms of str/int as getting such errors.
Maybe you can inspect further THanks,
You can try :
out = (
pd.DataFrame(df.T.drop("1").to_numpy().reshape(1, -1, order="F"))
# .set_axis(["HomeTeam","AwayTeam", "HG", "AG", ...], axis=1) # to complete
)
Output :
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Brighton | Man City | 1 | 1 | 5 | 1 | 3 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 6 | 11 | 2 | 0 | 11 | 18 | 7 | 5 |
Input used :
df = pd.DataFrame({
'0': ['Brighton', '1', '5', '3', '2', '0', '0', '6', '2', '11', '7'],
'1': [None, 'Goals', 'Corners', 'Corners (Half)', 'Yellow Card',
'Yellow/Red Card', 'Red Card', 'Throw-ins', 'Offsides',
'Free kicks', 'Goal kicks'],
'2': ['Man City', '1', '1', '0', '1', '0', '0', '11', '0', '18', '5']}
)