I am new to machine learning in Python. I cannot figure out how to reshape my dataset so that it can be used as input to an LSTM model that predicts the risk of developing a disease in the future. To be more specific, my dataset looks like this:

PatientID | MeasurementDate | Parameter1 | Parameter2 | HasDisease |
---|---|---|---|---|
1 | 1/1/2021 | 106 | 1 | 0 |
1 | 1/2/2021 | 105,9 | 1 | 0 |
1 | 1/3/2021 | 107 | 1 | 1 |
2 | 2/1/2021 | 100 | 0 | 1 |
2 | 2/2/2021 | 100,5 | 0 | 1 |
2 | 2/3/2021 | 104 | 0 | 1 |
3 | 3/1/2021 | 97 | 1 | 0 |
3 | 3/2/2021 | 97 | 1 | 1 |
3 | 3/3/2021 | 97 | 0 | 1 |
4 | 4/1/2021 | 99 | 0 | 0 |
4 | 4/2/2021 | 109 | 1 | 0 |
4 | 4/3/2021 | 110 | 0 | 0 |
How should I reshape this so that it can be used as input to an LSTM, in the form (batch_size, time_steps, sequence_len)? I would like my LSTM model to return something like "PatientID: 1, Risk score: 80%", meaning that patient 1 has an 80% probability of having the disease after 3 months. Thank you.
I managed to get the expected result by following this video, https://www.youtube.com/watch?v=CcGf_Uo7NMw&ab_channel=KnowledgeCenter, and adapting it to my dataset. The dimensions map as follows: batch size is the number of patients, timesteps is the number of follow-ups per patient, and sequence_len is the number of attributes used as inputs. The result is a 3D array.
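
For anyone landing here, this is a rough sketch of that mapping using pandas and NumPy. It is not the code from the video. It assumes that every patient has the same number of follow-ups, that the dates are day/month/year, that the decimal commas in Parameter1 have already been converted to points, and (purely as an illustration) that the label for each patient is HasDisease at the last follow-up. Adjust those assumptions to your own data.

```python
import numpy as np
import pandas as pd

# Rows copied from the table in the question (decimal commas written as points).
rows = [
    (1, "1/1/2021", 106.0, 1, 0),
    (1, "1/2/2021", 105.9, 1, 0),
    (1, "1/3/2021", 107.0, 1, 1),
    (2, "2/1/2021", 100.0, 0, 1),
    (2, "2/2/2021", 100.5, 0, 1),
    (2, "2/3/2021", 104.0, 0, 1),
    (3, "3/1/2021", 97.0, 1, 0),
    (3, "3/2/2021", 97.0, 1, 1),
    (3, "3/3/2021", 97.0, 0, 1),
    (4, "4/1/2021", 99.0, 0, 0),
    (4, "4/2/2021", 109.0, 1, 0),
    (4, "4/3/2021", 110.0, 0, 0),
]
columns = ["PatientID", "MeasurementDate", "Parameter1", "Parameter2", "HasDisease"]
df = pd.DataFrame(rows, columns=columns)

# Assumption: dates are day/month/year. Sort so each patient's rows are
# contiguous and in chronological order.
df["MeasurementDate"] = pd.to_datetime(df["MeasurementDate"], format="%d/%m/%Y")
df = df.sort_values(["PatientID", "MeasurementDate"])

# Assumption: every patient has the same number of follow-ups.
counts = df.groupby("PatientID").size()
assert counts.nunique() == 1, "patients have different numbers of follow-ups"

feature_cols = ["Parameter1", "Parameter2"]
n_patients = len(counts)
n_timesteps = int(counts.iloc[0])
n_features = len(feature_cols)

# (batch_size, time_steps, sequence_len) = (patients, follow-ups, attributes)
X = df[feature_cols].to_numpy(dtype="float32").reshape(n_patients, n_timesteps, n_features)

# Illustrative assumption: one label per patient, taken from the last follow-up.
y = df.groupby("PatientID")["HasDisease"].last().to_numpy()

print(X.shape)  # (4, 3, 2)
print(y.shape)  # (4,)
```

If patients have different numbers of follow-ups, the plain reshape no longer works and the sequences would need padding or truncating to a common length first.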