I am working on an image processing project that is based on the importance of phase-only reconstruction. For more information you can read the answer given by geometrikal at https://dsp.stackexchange.com/questions/16462/how-moving-part-pixel-intensity-values-of-video-frames-becomes-dominant-compared
I want to detect moving objects from a video of road traffic. (Please download the 1.47 MB video by (step 1) clicking on the play button, then (step 2) right-clicking on the video, then (step 3) clicking the "save as" option.)
The algorithm for it is:
The proposed approach
Requirement: An input image sequence I(x, y, n) (where x and y are the image dimensions and n is the frame number in the video) extracted from the video.
Outcome: The segmentation mask of the moving objects for each frame.
1. For each frame in the input video, perform step 2 and append the step 2 result to the resultant array I(x, y, n).
2. Smooth the current frame using a 2D Gaussian filter.
3. Perform a 3D FFT of the whole sequence I(x, y, n) using (Eq.4.1).
4. Calculate the phase spectrum using the real and imaginary parts of the 3D DFT.
5. Calculate the reconstructed sequence Î(x, y, n) using (Eq.4.2).
6. For each frame in the input video, perform steps 7 to 10 to get the segmentation mask for that frame, and append the step 10 result to the resultant segmentation mask array BW(x, y, n).
7. Smooth the reconstructed frame of Î(x, y, n) using an averaging filter.
8. Compute the mean value of the current frame.
9. Convert the current frame into a binary image using the mean value as the threshold.
10. Perform morphological processing, i.e., filling and closing, to obtain the segmented mask of the moving objects for the current frame.
End algorithm.
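For reference, the phase-only reconstruction that the code below actually computes can be written as follows; I am assuming these are the formulas behind (Eq.4.1) and (Eq.4.2), since they match the fftn / angle / ifftn calls in the code:

$$F(u,v,w) = \mathcal{F}_{3D}\{I(x,y,n)\} = |F(u,v,w)|\,e^{j\phi(u,v,w)}$$
$$\hat{I}(x,y,n) = \mathcal{F}_{3D}^{-1}\left\{ e^{j\phi(u,v,w)} \right\}$$

The magnitude of each reconstructed frame of Î(x, y, n) is then taken before the smoothing in step 7.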
With the above algorithm I could find all the moving objects in the video.
But the problem is that the segmented masks of the vehicles do not have the proper shape I am expecting.
So can anybody help me get the expected shape?
- What changes should I make in the algorithm?
or
- What changes should I make in the MATLAB code?
tic
clc;
clear all;
close all;
%read video file
video = VideoReader('D:\dvd\Matlab code\test videos\5.mp4');
T = video.NumberOfFrames;    %number of frames
frameHeight = video.Height;  %frame height
frameWidth = video.Width;    %frame width
get(video);                  %display the properties of the video object
i=1;
for t=300:15:550   %select frames between 300 and 550 with an interval of 15 from the video
    frame_x(:,:,:,i) = read(video, t);
    frame_y = frame_x(:,:,:,i);
    %figure,
    %imshow(frame_y), title(['test frames :' num2str(i)]);
    frame_z = rgb2gray(frame_y);        %convert each colour frame into grayscale
    frame_m(:,:,:,i) = frame_y;         %store the colour frames in the frame_m array
    %Perform Gaussian filtering with a 4x4 binomial approximation of a Gaussian kernel
    h1 = (1/8)*(1/8)*[1 3 3 1]'*[1 3 3 1];
    smoothed = conv2(double(frame_z), h1, 'same');   %conv2 expects floating-point input
    g1 = uint8(smoothed);
    Filtered_Image_Array(:,:,i) = g1;   %store the filtered images in an array
    i = i+1;
end
%Apply 3-D Fourier Transform on video sequences
f_transform=fftn(Filtered_Image_Array);
%Compute phase spectrum array from f_transform
phase_spectrum_array =exp(1j*angle(f_transform));
%Apply 3-D Inverse Fourier Transform on phase spectrum array and
%reconstruct the frames
reconstructed_frame_array=(ifftn(phase_spectrum_array));
k=i;
i=1;
for t=1:k-1
    %Smooth the reconstructed frame of Î(x, y, n) using an averaging filter
    Reconstructed_frame_magnitude = abs(reconstructed_frame_array(:,:,t));
    H = fspecial('disk',4);
    circular_avg(:,:,t) = imfilter(Reconstructed_frame_magnitude,H);
    %Convert the current frame into a binary image using a scaled mean value as the threshold
    mean_value = mean2(circular_avg(:,:,t));
    binary_frame = im2bw(circular_avg(:,:,t), 1.6*mean_value);
    %Perform morphological operations
    se = strel('square',3);
    morphological_closing = imclose(binary_frame,se);
    morphological_closing = imclearborder(morphological_closing);   %clear noise present at the borders of the frames
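    %Step 10 of the algorithm also calls for hole filling, which is missing here;
    %a possible addition (assuming filling of interior holes is what was meant) would be:
    %morphological_closing = imfill(morphological_closing,'holes');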
    %Superimpose the segmented mask on its respective frame to highlight the
    %moving objects
    moving_object_frame = frame_m(:,:,:,i);
    moving_object_frame(morphological_closing) = 255;   %logical indexing reaches only the first (red) channel, so masked pixels are tinted red
    figure,
    imshow(moving_object_frame,[]), title(['Moving objects in Frame :' num2str(i)]);
    i = i+1;
end
toc
I don't understand the details of the algorithm (by the way, your code would be more readable with more meaningful names than f1, f2, f7, mean1, mean2, etc.), but it seems that your problem is inherent to the technique used.
By using the phase of the FFT, you are working on each pixel independently, without any kind of contour awareness. What you could do, though, is tune the threshold a bit (fixed at a multiple of the mean here) and see how the result responds.
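For instance, a quick sketch of what I mean, reusing the circular_avg frames from your code (the multiplier values, the clamping, and the Otsu alternative are only illustrative choices, not part of your method):

frame = circular_avg(:,:,1);
multipliers = [1.0 1.3 1.6 2.0];                 %candidate scalings of the mean, to be tuned
figure;
for m = 1:numel(multipliers)
    level = min(multipliers(m)*mean2(frame), 1); %im2bw requires a level in [0,1]
    mask = im2bw(frame, level);
    subplot(2,2,m), imshow(mask), title(['threshold = ' num2str(multipliers(m)) ' x mean']);
end
%An alternative is Otsu's method, which picks the threshold from the histogram
normalized = mat2gray(frame);
otsu_mask = im2bw(normalized, graythresh(normalized));

You can then pick whichever multiplier (or Otsu's threshold) gives masks closest to the vehicle outlines.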
Another option would be to post-process the current results by trying to recognize your expected shape in the images (see, for example, expectation-maximization algorithms).
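As a simpler, concrete form of such post-processing, you could filter the connected components of each mask by size and shape before the overlay; a rough sketch (the area and aspect-ratio limits are arbitrary placeholders to tune on your video, not values from your setup):

%Keep only blobs whose area and bounding-box aspect ratio look vehicle-like
stats = regionprops(morphological_closing, 'Area', 'BoundingBox', 'PixelIdxList');
cleaned_mask = false(size(morphological_closing));
for r = 1:numel(stats)
    bb = stats(r).BoundingBox;              %[x y width height]
    aspect_ratio = bb(3) / bb(4);
    if stats(r).Area > 150 && aspect_ratio > 0.3 && aspect_ratio < 4
        cleaned_mask(stats(r).PixelIdxList) = true;
    end
end
cleaned_mask = imfill(cleaned_mask, 'holes');   %fill holes inside the kept blobs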
What are your constraints?