I have a multiindexed pandas df and need to find the row number of a specific index. the Index is epoch time (Timestamp column) so several rows with the same index exist, I only need to find one of them.
df looks like this:
Temp1 Temp2 Uptime CPULoadAvg1 CPULoadAvg5 CPULoadAvg15 GPUV1 GPUV2 GPUV3 GPUV4 ts
Board Timestamp
src 1689884603 30.6 34.0 0 0.08 0.02 0.01 0 0 0 100 2023-07-20 20:22:24
1689884604 30.8 30.0 0 0.08 0.02 0.01 0 0 0 100 2023-07-20 20:22:24
1689884605 30.7 31.0 0 0.15 0.03 0.01 0 0 0 100 2023-07-20 20:22:24
1689884606 30.7 30.0 0 0.15 0.03 0.01 0 0 0 100 2023-07-20 20:22:24
1689884607 30.5 30.0 0 0.15 0.03 0.01 0 0 0 100 2023-07-20 20:22:24
... ... ... ... ... ... ... ... ... ... ... ...
coms 1690214970 47.1 53.0 2:59 2.12 2.06 2.06 689 641 876 64 2023-07-24 16:08:32
lr2 1690214970 34.3 49.0 2:59 0.31 0.13 0.04 0 0 0 100 2023-07-24 16:08:32
pp 1690214970 38.5 40.0 NaN NaN NaN NaN 0 0 0 100 2023-07-24 16:08:32
srs 1690214970 43.0 49.0 NaN NaN NaN NaN NaN NaN NaN NaN 2023-07-24 16:08:32
vel 1690214970 37.4 34.0 NaN NaN NaN NaN NaN NaN NaN NaN 2023-07-24 16:08:32
EDIT: So my task is to grab the last x minutes of the syslogs to graph the temperatures and other things. I filtered the syslog and made a df of the data. Ask the user for the number of minutes, and then have this code:
how_far_back_to_go_sec = (int(how_far_back_to_go_mins) * 60)
first_row_timestamp = int(df.index.levels[2][0])
last_row_timestamp = int(df.index.levels[2][-1])
timestamp_to_go_to = last_row_timestamp - how_far_back_to_go_sec # Get desired Timestamp to start graph at
if timestamp_to_go_to < first_row_timestamp:
print("Error: The logs to not go that far back. (",timestamp_to_go_to," seconds)")
exit(1) # Exit the code with an error code of 1
# try 1
df_xs = df.xs(timestamp_to_go_to, level='Timestamp')
print(df_xs)
# try 2
df_loc = df.loc[(slice(None), timestamp_to_go_to),:]
print(df_loc)
This returns a KeyError though. I have checked the df and timestamp_to_go_to is a valid index
added print(df.to_dict()) output:
'39', 'src', '1690216116'): Timestamp('2023-07-24 16:27:44'), ('39', 'srs', '1690216116'): Timestamp('2023-07-24 16:27:44'), ('39', 'vel', '1690216116'): Timestamp('2023-07-24 16:27:44'), ('39', 'coms', '1690216117'): Timestamp('2023-07-24 16:27:44'), ('39', 'loc', '1690216117'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr1', '1690216117'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr2', '1690216117'): Timestamp('2023-07-24 16:27:44'), ('39', 'pp', '1690216117'): Timestamp('2023-07-24 16:27:44'), ('39', 'src', '1690216117'): Timestamp('2023-07-24 16:27:44'), ('39', 'srs', '1690216117'): Timestamp('2023-07-24 16:27:44'), ('39', 'vel', '1690216117'): Timestamp('2023-07-24 16:27:44'), ('39', 'coms', '1690216118'): Timestamp('2023-07-24 16:27:44'), ('39', 'loc', '1690216118'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr1', '1690216118'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr2', '1690216118'): Timestamp('2023-07-24 16:27:44'), ('39', 'pp', '1690216118'): Timestamp('2023-07-24 16:27:44'), ('39', 'src', '1690216118'): Timestamp('2023-07-24 16:27:44'), ('39', 'srs', '1690216118'): Timestamp('2023-07-24 16:27:44'), ('39', 'vel', '1690216118'): Timestamp('2023-07-24 16:27:44'), ('39', 'coms', '1690216119'): Timestamp('2023-07-24 16:27:44'), ('39', 'loc', '1690216119'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr1', '1690216119'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr2', '1690216119'): Timestamp('2023-07-24 16:27:44'), ('39', 'pp', '1690216119'): Timestamp('2023-07-24 16:27:44'), ('39', 'src', '1690216119'): Timestamp('2023-07-24 16:27:44'), ('39', 'srs', '1690216119'): Timestamp('2023-07-24 16:27:44'), ('39', 'vel', '1690216119'): Timestamp('2023-07-24 16:27:44'), ('39', 'coms', '1690216120'): Timestamp('2023-07-24 16:27:44'), ('39', 'loc', '1690216120'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr1', '1690216120'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr2', '1690216120'): Timestamp('2023-07-24 16:27:44'), ('39', 'pp', '1690216120'): Timestamp('2023-07-24 16:27:44'), ('39', 'src', '1690216120'): Timestamp('2023-07-24 16:27:44'), ('39', 'srs', '1690216120'): Timestamp('2023-07-24 16:27:44'), ('39', 'vel', '1690216120'): Timestamp('2023-07-24 16:27:44'), ('39', 'coms', '1690216121'): Timestamp('2023-07-24 16:27:44'), ('39', 'loc', '1690216121'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr1', '1690216121'): Timestamp('2023-07-24 16:27:44'), ('39', 'lr2', '1690216121'): Timestamp('2023-07-24 16:27:44'), ('39', 'pp', '1690216121'): Timestamp('2023-07-24 16:27:44'), ('39', 'src', '1690216121'): Timestamp('2023-07-24 16:27:44'), ('39', 'srs', '1690216121'): Timestamp('2023-07-24 16:27:44'), ('39', 'vel', '1690216121'): Timestamp('2023-07-24 16:27:44'), ('39', 'coms', '1690216122'): Timestamp('2023-07-24 16:27:44'), ('39', 'loc', '1690216122'): Timestamp('2023-07-24 16:27:44'),
for that run the key was '1690216119' and still gave the keyError.
graphing code:
for idn, adfs in adfs.groupby(['Board'], level=0):
fig.add_trace(go.Scatter(x=adfs.index.get_level_values('Timestamp').values,
y=adfs['Temp1'],
mode='lines+markers', # Display both lines and markers
name=adfs.index.get_level_values('Board').values[0],
line=dict(width=1)))
fig.update_layout(xaxis=dict(title="Time", tickformat='%H:%M:%S', type='date'),
yaxis=dict(title='Temperature (deg C)'),
title=figure_title)
# Show the plot
fig.show()
So I ended up doing a df.reset_index() to get rid of the multi indexing. After that I looped through each row and searched for the epoch time value I wanted (none of the build in searches seemed to work). Once found I appended the rest of the rows to a new df
# single index way to cut down the df
rows_to_append = []
time_found = False
for row in adfs.itertuples(index=False):
if int(row[1]) == timestamp_to_go_to:
time_found = True
if time_found:
rows_to_append.append(row)
if not time_found:
print("Log time not found, try going back to another point")
exit(1) # Exit cause value not found
# After the loop, reset the index of the new DataFrame
adfs = pd.DataFrame(rows_to_append)
adfs.reset_index(drop=True, inplace=True)
print(adfs)
Its not pretty, but it works.
Graphed it with:
for idn, adfs in adfs.groupby(['Board']): # single index
fig.add_trace(go.Scatter(
x=adfs['Timestamp'], # single index
y=adfs['Temp1'],
mode='lines+markers', # Display both lines and markers
name=adfs['Board'].values[0], # single index
line=dict(width=1)))
fig.update_layout(xaxis=dict(title="Time", tickformat='%H:%M:%S', tickmode='auto', nticks=20),
yaxis=dict(title='Temperature (deg C)'),
title=figure_title)
# Show the plot
fig.show()