I have a csv file with many patients' health measurement data. Each patient has a different number of measurements. (Some patients come frequently, some don't.) I am trying to build a next-value prediction model to predict each patient's risk of specific incidents. Since the values are all in time sequence, I've tried to use an LSTM to make predictions. I am also concatenating all the patients' health data into one long column. (Please see attachment)
what I am feeding into the LSTM
My LSTM model then generates results much like a stock-price prediction.
But I wonder if there are better ways. My current method of concatenating all the patients' data into one sequence seems strange. Since the patients have different numbers of measurements, I am not sure whether I can feed them to the LSTM model in parallel. Or maybe I should use a random forest, because each patient's data has its own distribution? Thank you!
Regarding the different lengths of your data, you can use padding and masking to make your sequences equal in length (Description of Padding/Masking with TensorFlow). Predicting sequence-based data with LSTMs is generally a good approach, but I would advise you to look into GRUs instead of LSTMs, and also into Transformer architectures, because by now they have many advantages over LSTMs.
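Here is a minimal sketch of that idea in Keras, not your exact pipeline: it assumes a hypothetical CSV `patients.csv` with columns `patient_id`, `measurement`, and `label`, groups the rows into one variable-length sequence per patient, pads them to a common length, and uses a `Masking` layer so the GRU ignores the padded steps.

```python
# Sketch: pad per-patient sequences and mask the padding for a GRU.
# Column names and file name are assumptions for illustration only.
import numpy as np
import pandas as pd
import tensorflow as tf

df = pd.read_csv("patients.csv")  # assumed columns: patient_id, measurement, label

# One variable-length sequence per patient, plus one label per patient
groups = list(df.groupby("patient_id"))
sequences = [g["measurement"].to_numpy() for _, g in groups]
labels = np.array([g["label"].iloc[-1] for _, g in groups])

# Pad at the end with a value that never occurs in the real measurements (here 0.0)
padded = tf.keras.preprocessing.sequence.pad_sequences(
    sequences, padding="post", dtype="float32", value=0.0
)
padded = padded[..., np.newaxis]  # shape: (n_patients, max_len, 1 feature)

model = tf.keras.Sequential([
    # Masking tells downstream layers to skip timesteps equal to mask_value
    tf.keras.layers.Masking(mask_value=0.0, input_shape=(padded.shape[1], 1)),
    tf.keras.layers.GRU(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # predicted risk of the incident
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(padded, labels, epochs=10, batch_size=16)
```

With this setup each patient stays a separate sample, so you can feed them in parallel batches instead of concatenating everyone into one long series.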