I am using the Eigenjoints skeleton features to perform human action recognition in MATLAB.
I have 320 videos, so the training data is a 320x1 cell array. Each cell contains an Nx2970 double array, where N is the number of frames (it varies because each video has a different number of frames) and 2970 is the number of features extracted per frame (it is constant because I use the same extraction method for all videos).
How can I format the training data into a 2D double matrix to use as input for an SVM? I don't know how to do it because an SVM requires a single double matrix, and what I have is one matrix per video, each of a different size.
Your question is a bit unclear about how you want to go about classifying human motion from your video. You have two options: classify each frame on its own, or classify each video as a whole.
Single Frame Classification
For the first option, the solution to your problem is simple. You concatenate the frames of all videos into one big matrix with one row per frame, and give every frame the label of the video it came from.
Let me give a toy example. I've made X_cell, a cell array holding one video with 2 frames and one video with 3 frames. Your question doesn't specify where your ground-truth labels come from, so I'm going to assume that you have per-video labels stored in a vector video_labels.
X_cell = {[1 1 1; 2 2 2], [3 3 3; 4 4 4; 5 5 5]};
video_labels = [1, 0];
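One simple way to concatenate these is to use a for loop:
X = [];
Y = [];
for ii = 1:length(X_cell)
    % Stack this video's frames below the frames collected so far
    X = [X; X_cell{ii}];
    % Repeat the video's label once per frame, as a column
    Y = [Y; repmat(video_labels(ii), size(X_cell{ii},1), 1)];
end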
There is probably also a more efficient solution. You could think about vectorizing this code if you need to improve speed.
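As a rough sketch of what a vectorized version might look like, the loop can be replaced by vertcat for the frames and repelem for the labels. This assumes every cell has the same number of columns and that repelem is available (R2015a or later); frames_per_video is a name introduced here for illustration only.

```matlab
% Toy data from above: one video with 2 frames, one with 3 frames
X_cell = {[1 1 1; 2 2 2], [3 3 3; 4 4 4; 5 5 5]};
video_labels = [1, 0];

% Stack all frames vertically: one row per frame
X = vertcat(X_cell{:});

% Number of frames in each video (illustrative helper variable)
frames_per_video = cellfun(@(v) size(v, 1), X_cell);

% Repeat each video's label once per frame, giving a column vector
Y = repelem(video_labels(:), frames_per_video(:));
```

For the toy data this gives a 5x3 matrix X and a 5x1 label vector Y, so each row of X lines up with one entry of Y.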
Whole Video Classification
Time series features are a course topic all in themselves. Here the simplest thing you could do is resize all the video clips to the same length using imresize (from the Image Processing Toolbox), then vectorize the resulting matrix. This will create a very long, redundant feature.
num_frames = 10; % The desired video length
length_frame_feature = size(X_cell{1}, 2); % Features per frame (3 in the toy example)
num_videos = length(X_cell);
X = zeros(num_videos, length_frame_feature*num_frames);
for ii = 1:num_videos
    % Resize along the frame axis only; the feature axis keeps its size
    video_feature = imresize(X_cell{ii}, [num_frames, length_frame_feature]);
    X(ii, :) = video_feature(:);
end
Y = video_labels;
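If the Image Processing Toolbox is not available, the same idea can be sketched with interp1 from base MATLAB, which resamples each feature column linearly along the frame axis. This is a swapped-in technique, not the imresize approach itself (imresize defaults to bicubic interpolation), and it assumes every video has at least 2 frames. The names t_old, t_new and V_resampled are introduced here for illustration.

```matlab
% Toy data from above: one video with 2 frames, one with 3 frames
X_cell = {[1 1 1; 2 2 2], [3 3 3; 4 4 4; 5 5 5]};
video_labels = [1, 0];

num_frames = 10;                       % desired common video length
num_features = size(X_cell{1}, 2);     % features per frame
num_videos = numel(X_cell);
X = zeros(num_videos, num_features * num_frames);

for ii = 1:num_videos
    V = X_cell{ii};
    % Normalized time stamps of the original and the resampled frames
    t_old = linspace(0, 1, size(V, 1));
    t_new = linspace(0, 1, num_frames);
    % Linear interpolation of every feature column (assumes >= 2 frames)
    V_resampled = interp1(t_old, V, t_new);
    % Flatten to one row per video
    X(ii, :) = V_resampled(:)';
end

Y = video_labels(:);                   % one label per row of X
```

Each row of X is then a fixed-length descriptor of one video, which is the shape an SVM trainer expects.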
For more sophisticated techniques, take a look at spectrograms.