I have got step function execution history in JSON format
[{
"timestamp": "2022-07-18T13:03:03.346000+00:00",
"type": "ExecutionFailed",
"id": 3,
"previousEventId": 2,
"executionFailedEventDetails": {
"error": "States.Runtime",
"cause": "An error occurred while executing the state 'Workflow Choice state' (entered at the event id #2). Invalid path '$.contributor_id': The choice state's condition path references an invalid value."
}
}, {
"timestamp": "2022-07-18T13:03:03.306000+00:00",
"type": "ChoiceStateEntered",
"id": 2,
"previousEventId": 0,
"stateEnteredEventDetails": {
"name": "Workflow Choice state",
"input": "{\n \"Comment\": \"Insert your JSON here\"\n}",
"inputDetails": {
"truncated": false
}
}
}, {
"timestamp": "2022-07-18T13:03:03.252000+00:00",
"type": "ExecutionStarted",
"id": 1,
"previousEventId": 0,
"executionStartedEventDetails": {
"input": "{\n \"Comment\": \"Insert your JSON here\"\n}",
"inputDetails": {
"truncated": false
},
"roleArn": "arn:aws:iam::asdfg:role/step-all"
}
}]
We want to create a view like below
The issue is i am not able to create executionFailedEventDetails ,stateEnteredEventDetails,executionStartedEventDetails as new row . It comes in first row only .
Step column is the name in the stateEnteredEventDetails
This is what i am doing
import json
import pandas as pd
from tabulate import tabulate
raw = r"""[{
"timestamp": "2022-07-18T13:03:03.346000+00:00",
"type": "ExecutionFailed",
"id": 3,
"previousEventId": 2,
"executionFailedEventDetails": {
"error": "States.Runtime",
"cause": "An error occurred while executing the state 'Workflow Choice state' (entered at the event id #2). Invalid path '$.contributor_id': The choice state's condition path references an invalid value."
}
}, {
"timestamp": "2022-07-18T13:03:03.306000+00:00",
"type": "ChoiceStateEntered",
"id": 2,
"previousEventId": 0,
"stateEnteredEventDetails": {
"name": "Workflow Choice state",
"input": "{\n \"Comment\": \"Insert your JSON here\"\n}",
"inputDetails": {
"truncated": false
}
}
}, {
"timestamp": "2022-07-18T13:03:03.252000+00:00",
"type": "ExecutionStarted",
"id": 1,
"previousEventId": 0,
"executionStartedEventDetails": {
"input": "{\n \"Comment\": \"Insert your JSON here\"\n}",
"inputDetails": {
"truncated": false
},
"roleArn": "arn:aws:iam::asdfg:role/step-all"
}
}]"""
data = json.loads(raw, strict=False)
data = pd.json_normalize(data)
# print(data.to_csv(), index=False)
print(tabulate(data, headers='keys', tablefmt='psql'))
data.to_csv('file.csv',encoding='utf-8', index=False)
and the output is
+----+----------------------------------+--------------------+------+-------------------+-------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+----------------------------------------+---------------------------------------------------+----------------------------------------+-------------------------------------------------------+----------------------------------------+
| | timestamp | type | id | previousEventId | executionFailedEventDetails.error | executionFailedEventDetails.cause | stateEnteredEventDetails.name | stateEnteredEventDetails.input | stateEnteredEventDetails.inputDetails.truncated | executionStartedEventDetails.input | executionStartedEventDetails.inputDetails.truncated | executionStartedEventDetails.roleArn |
|----+----------------------------------+--------------------+------+-------------------+-------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+----------------------------------------+---------------------------------------------------+----------------------------------------+-------------------------------------------------------+----------------------------------------|
| 0 | 2022-07-18T13:03:03.346000+00:00 | ExecutionFailed | 3 | 2 | States.Runtime | An error occurred while executing the state 'Workflow Choice state' (entered at the event id #2). Invalid path '$.contributor_id': The choice state's condition path references an invalid value. | nan | nan | nan | nan | nan | nan |
| 1 | 2022-07-18T13:03:03.306000+00:00 | ChoiceStateEntered | 2 | 0 | nan | nan | Workflow Choice state | { | 0 | nan | nan | nan |
| | | | | | | | | "Comment": "Insert your JSON here" | | | | |
| | | | | | | | | } | | | | |
| 2 | 2022-07-18T13:03:03.252000+00:00 | ExecutionStarted | 1 | 0 | nan | nan | nan | nan | nan | { | 0 | arn:aws:iam::asdfg:role/step-all |
| | | | | | | | | | | "Comment": "Insert your JSON here" | | |
| | | | | | | | | | | } | | |
+----+----------------------------------+--------------------+------+-------------------+-------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------------------------------+----------------------------------------+---------------------------------------------------+----------------------------------------+-------------------------------------------------------+----------------------------------------+
The Details column 5th column is dynamic like all columns currently i have given example of only 3 event but it can go up to any number .
Final Expected Output
Given file.json
:
[{
"timestamp": "2022-07-18T13:03:03.346000+00:00",
"type": "ExecutionFailed",
"id": 3,
"previousEventId": 2,
"executionFailedEventDetails": {
"error": "States.Runtime",
"cause": "An error occurred while executing the state 'Workflow Choice state' (entered at the event id #2). Invalid path '$.contributor_id': The choice state's condition path references an invalid value."
}
}, {
"timestamp": "2022-07-18T13:03:03.306000+00:00",
"type": "ChoiceStateEntered",
"id": 2,
"previousEventId": 0,
"stateEnteredEventDetails": {
"name": "Workflow Choice state",
"input": "{\n \"Comment\": \"Insert your JSON here\"\n}",
"inputDetails": {
"truncated": false
}
}
}, {
"timestamp": "2022-07-18T13:03:03.252000+00:00",
"type": "ExecutionStarted",
"id": 1,
"previousEventId": 0,
"executionStartedEventDetails": {
"input": "{\n \"Comment\": \"Insert your JSON here\"\n}",
"inputDetails": {
"truncated": false
},
"roleArn": "arn:aws:iam::asdfg:role/step-all"
}
}]
Doing
import pandas as pd
df = pd.read_json('file.json')
df = df.melt(['timestamp', 'type', 'id', 'previousEventId'], var_name='step', value_name='details').dropna()
print(df.to_markdown(index=False))
Output (Markdown was just the easiest for me to show all):
| timestamp | type | id | previousEventId | step | details |
|:---------------------------------|:-------------------|-----:|------------------:|:-----------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 2022-07-18 13:03:03.346000+00:00 | ExecutionFailed | 3 | 2 | executionFailedEventDetails | {'error': 'States.Runtime', 'cause': "An error occurred while executing the state 'Workflow Choice state' (entered at the event id #2). Invalid path '$.contributor_id': The choice state's condition path references an invalid value."} |
| 2022-07-18 13:03:03.306000+00:00 | ChoiceStateEntered | 2 | 0 | stateEnteredEventDetails | {'name': 'Workflow Choice state', 'input': '{\n "Comment": "Insert your JSON here"\n}', 'inputDetails': {'truncated': False}} |
| 2022-07-18 13:03:03.252000+00:00 | ExecutionStarted | 1 | 0 | executionStartedEventDetails | {'input': '{\n "Comment": "Insert your JSON here"\n}', 'inputDetails': {'truncated': False}, 'roleArn': 'arn:aws:iam::asdfg:role/step-all'} |