I want to read the txt file from here with Dota 2 mmrs for different players. It has the form as below:
1) "103757918"
2) "1"
3) "107361667"
4) "1"
5) "108464725"
6) "1"
7) "110818765"
8) "1"
9) "111436016"
10) "1"
11) "113518306"
12) "1"
13) "118896321"
14) "1"
15) "119780733"
16) "1"
17) "120360801"
18) "1"
19) "120870684"
20) "1"
21) "122616345"
22) "1"
23) "124393917"
24) "1"
25) "124487030"
With the account_id (e.g 103757918) followed by the mmr of the player (e.g 1). How can I read this in a pandas dataframe with two columns = account_id, mmr?
I don't need the index numbers.
Read the data how you normally would, slice the data every other row and concat. After you can rename the columns to whatever you want.
Slicing the data this way assumes that the first value is always the account_id followed by the mmr. Notice how row 25 is missing the mmr in your sample data and is therefore null.
df = pd.read_csv(data.txt, sep='\s+', header=None)
pd.concat([df[1][::2].reset_index(drop=True),
df[1][1::2].reset_index(drop=True)], axis=1)
Here is a working example based on your sample data
s = '''1) "103757918"
2) "1"
3) "107361667"
4) "1"
5) "108464725"
6) "1"
7) "110818765"
8) "1"
9) "111436016"
10) "1"
11) "113518306"
12) "1"
13) "118896321"
14) "1"
15) "119780733"
16) "1"
17) "120360801"
18) "1"
19) "120870684"
20) "1"
21) "122616345"
22) "1"
23) "124393917"
24) "1"
25) "124487030"'''
from io import StringIO
df = pd.read_csv(StringIO(s),sep='\s+', header=None)
data = pd.concat([df[1][::2].reset_index(drop=True),
df[1][1::2].reset_index(drop=True)], axis=1)
data.columns = ['account_id', 'mmr']
account_id mmr
0 103757918 1.0
1 107361667 1.0
2 108464725 1.0
3 110818765 1.0
4 111436016 1.0
5 113518306 1.0
6 118896321 1.0
7 119780733 1.0
8 120360801 1.0
9 120870684 1.0
10 122616345 1.0
11 124393917 1.0
12 124487030 NaN