I would like your suggestions on a problem I have. I have 3D point data with an intensity field (x, y, z, I) representing a 3D scene, and I want to convert this 3D data into an image (a 2D matrix of intensity values I).
I plan to do a perspective projection of the 3D points using the pinhole camera model (Wikipedia):

    x' = f*x/z
    y' = f*y/z
What value should I select for f? How does the image size depend on it? (Say I need an image of size 500x500; what value of f would be suitable?)
Since coordinates in a 2D image are integers, how should I quantize the x' and y' values and write in the corresponding intensity value? For example, with f = 10 I might get two points such as:
    (x,   y,   z, I  )      (x',   y',  I  )
    (3,   1,   2, 128)  ->  (15,   5,   128)
    (3.1, 1.1, 2, 150)  ->  (15.5, 5.5, 150)
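For concreteness, here is a minimal sketch of that projection in Python/NumPy (my own variable names, just to make the arithmetic explicit):

    import numpy as np

    # Sketch of the projection above with f = 10.
    points = np.array([[3.0, 1.0, 2.0, 128.0],
                       [3.1, 1.1, 2.0, 150.0]])  # columns: x, y, z, I
    f = 10.0

    x_proj = f * points[:, 0] / points[:, 2]  # x' = f*x/z -> [15.0, 15.5]
    y_proj = f * points[:, 1] / points[:, 2]  # y' = f*y/z -> [ 5.0,  5.5]
    print(np.column_stack([x_proj, y_proj, points[:, 3]]))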
For these two points, should I just round off the x' and y' values and write the intensity at that pixel, or should I average the intensities that land on non-integer coordinates?
Will the resulting image clearly depict the scene in 2D (like a photo taken with a camera)?
I would be grateful for your ideas. Thanks.
Whether you use the average intensity or nearest neighbour or other kinds of interpolation depends on your application. OpenGL for instance, when it does this operation, gives you the option of choosing (see GL_TEXTURE_MIN_FILTER and GL_TEXTURE_MAG_FILTER).
I suggest you try different approaches and see what they look like; the difference between linear and nearest neighbour interpolation is one line of code. More information about your intended application would probably be helpful.
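As a rough illustration (a minimal NumPy sketch, not OpenGL, and the function and variable names are my own), nearest-neighbour versus a simple bilinear splat of the projected points might look like this:

    import numpy as np

    def splat_points(xp, yp, intensity, shape, bilinear=False):
        """Write projected (possibly non-integer) points into a 2D image.

        xp, yp    : projected coordinates (floats), same length as intensity
        shape     : (rows, cols) of the output image
        bilinear  : False -> round to the nearest pixel;
                    True  -> spread each intensity over its 4 neighbouring
                             pixels by area weight.
        """
        img = np.zeros(shape)
        weight = np.zeros(shape)

        if not bilinear:
            r = np.clip(np.round(yp).astype(int), 0, shape[0] - 1)
            c = np.clip(np.round(xp).astype(int), 0, shape[1] - 1)
            np.add.at(img, (r, c), intensity)
            np.add.at(weight, (r, c), 1.0)
        else:
            r0, c0 = np.floor(yp).astype(int), np.floor(xp).astype(int)
            fr, fc = yp - r0, xp - c0
            for dr, dc, w in [(0, 0, (1 - fr) * (1 - fc)),
                              (0, 1, (1 - fr) * fc),
                              (1, 0, fr * (1 - fc)),
                              (1, 1, fr * fc)]:
                rr = np.clip(r0 + dr, 0, shape[0] - 1)
                cc = np.clip(c0 + dc, 0, shape[1] - 1)
                np.add.at(img, (rr, cc), w * intensity)
                np.add.at(weight, (rr, cc), w)

        # Average where several points landed on the same pixel.
        filled = weight > 0
        img[filled] /= weight[filled]
        return img

Switching between the two modes is essentially that one weighting step, which is the point made above.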
Algorithmically, the simplest approach to the projection is not necessarily the most computationally efficient one. The code is much easier if, instead of projecting the points, you start from each 2D pixel location, find the 3D points that project near it, and interpolate (even if only nearest-neighbour interpolation) to get the intensity. This prevents gaps in the image, and you no longer have to worry about interpolating both for magnification and for the spaces between pixels.
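For example, a minimal sketch of that pixel-driven approach (assuming SciPy is available and all z > 0; the names are my own) could project the points once and then look up the nearest projected point for every pixel:

    import numpy as np
    from scipy.spatial import cKDTree

    def render_by_pixel(points, intensity, f, shape, cx=None, cy=None):
        """For each output pixel, take the intensity of the nearest projected point.

        points    : (N, 3) array of x, y, z (assumed z > 0)
        intensity : (N,) array of intensities
        f         : focal length in pixels
        shape     : (rows, cols) of the output image
        """
        rows, cols = shape
        cx = (cols - 1) / 2.0 if cx is None else cx
        cy = (rows - 1) / 2.0 if cy is None else cy

        # Project all points once.
        xp = f * points[:, 0] / points[:, 2] + cx
        yp = f * points[:, 1] / points[:, 2] + cy

        # Nearest projected point for every pixel centre (nearest-neighbour only;
        # averaging the k nearest points would be one easy refinement).
        tree = cKDTree(np.column_stack([xp, yp]))
        px, py = np.meshgrid(np.arange(cols), np.arange(rows))
        _, idx = tree.query(np.column_stack([px.ravel(), py.ravel()]))
        return intensity[idx].reshape(shape)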
How to project the data again depends on what you are trying to achieve, so more information about the application would be useful. For example: are you trying to fit all of the points into the image, or to fill the image? Is there some property of the cloud that makes it likely to squeeze into a square when projected? If the data was collected by an imaging array then it should be easy to project (and much of the above machinery is unnecessary, since the original array coordinates should be easy to recover). Otherwise there will likely be points that don't appear in the image, or parts of the image with no corresponding points.
If I make some assumptions then I can solve the projection equation for the limits. If we assume a 640 x 480 image and that the centre of projection is at the centre of the image then we have:
x'=f*x/z + 320
(Note that this misuses the focal length, as is commonly done, to map directly onto pixels; in the true model f maps onto the scale of the image plane, with a separate conversion from there into pixels.)
Let greatestx:x be the largest x value in the point array and greatestx:z the corresponding z value for that point. Then:

    639.5 = f*greatestx:x/greatestx:z + 320
So,
f = 319.5*greatestx:z / greatestx:x
If you do the same with the smallest x value, the largest y value, and the smallest y value:

    f = -319.5*smallestx:z / smallestx:x
    f =  239.5*greatesty:z / greatesty:y
    f = -239.5*smallesty:z / smallesty:y
Now, if we choose the smallest of these f values, we guarantee that the whole point cloud fits into the image (but there may be gaps). If we choose the largest, we guarantee the image is filled (but some points may fall outside it).
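Put together, a small sketch of that calculation (Python/NumPy, my own variable names, using the 640 x 480 numbers above and assuming all z > 0 with the cloud straddling the image centre) might look like this:

    import numpy as np

    def candidate_focal_lengths(points):
        """Candidate f values from the extreme x and y points, as derived above.

        points : (N, 3) array of x, y, z with z > 0.
        Returns (f_fit, f_fill): the smallest candidate keeps every point inside
        a 640 x 480 image; the largest tends to fill the image but may push some
        points outside it.
        """
        x, y, z = points[:, 0], points[:, 1], points[:, 2]

        candidates = []
        i = np.argmax(x); candidates.append( 319.5 * z[i] / x[i])  # largest x
        i = np.argmin(x); candidates.append(-319.5 * z[i] / x[i])  # smallest x
        i = np.argmax(y); candidates.append( 239.5 * z[i] / y[i])  # largest y
        i = np.argmin(y); candidates.append(-239.5 * z[i] / y[i])  # smallest y

        return min(candidates), max(candidates)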