machine-learning neural-network computer-vision artificial-intelligence conv-neural-network

A convolutional neural network eventually flattens the 2D matrix into a vector, so what's the point?

This may be a dumb question, but in a convolutional neural network we eventually flatten the 2D matrix into a single column vector so that we can feed it to a feed-forward neural network. At that point, don't we lose the spatial information about the pixels? Any guidance would be appreciated. Thank you.

Solution

No, you don't lose spatial information when you transition from convolutional layers to dense layers. Think about the simple case of a single 2x2 filter being used on a 2x3 grayscale image with no padding and a stride of 1. This will produce a 1x2 result.

Now let's say there are two classes of image. One always looks like this:

1 0 0
1 0 0

And the other always looks like this:

0 1 0
0 1 0

One filter that might be learned to distinguish these two images could look like this:

.5 0
.5 0

This filter simply averages the values in the left column of each 2x2 window, and would produce [1 0] for the first class and [0 1] for the second class. This purely spatial information could easily be used for classification by a dense layer with softmax activation.
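To check the arithmetic, here is a minimal NumPy sketch of this example. The helper names `conv2d_valid` and `softmax` and the dense-layer weights `dense_w` are assumptions made up for illustration; a trained network would learn its own weights.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding.

    This is cross-correlation, which is what CNN layers call convolution.
    """
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

class_a = np.array([[1, 0, 0],
                    [1, 0, 0]], dtype=float)
class_b = np.array([[0, 1, 0],
                    [0, 1, 0]], dtype=float)
kernel = np.array([[0.5, 0.0],
                   [0.5, 0.0]])

# Hypothetical hand-picked dense-layer weights, not learned ones.
dense_w = 5.0 * np.eye(2)

# Convolve, then flatten the 1x2 feature map into a length-2 vector.
features_a = conv2d_valid(class_a, kernel).flatten()
features_b = conv2d_valid(class_b, kernel).flatten()

probs_a = softmax(dense_w @ features_a)
probs_b = softmax(dense_w @ features_b)

print(features_a, probs_a.argmax())  # [1. 0.] 0
print(features_b, probs_b.argmax())  # [0. 1.] 1
```

The flattened vectors differ only in which position holds the 1, and that position is exactly the spatial location of the bright column, so the dense layer separates the two classes using nothing but spatial information.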

In fact, for any 2xN image this filter already produces a one-dimensional output, a 1x(N-1) row in which each entry corresponds to a position along the width. So it should be clear that simply going from 2D arrays to 1D vectors does not necessarily lose spatial information. It depends on how those 1D vectors were generated.
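As a further illustration, flattening is only a fixed reordering: each index of the flattened vector always corresponds to the same (row, column) position, so a dense layer can learn a separate weight for each location. The sketch below assumes a hypothetical 3x4 feature map and NumPy's default row-major order.

```python
import numpy as np

feature_map = np.arange(12).reshape(3, 4)  # hypothetical 3x4 feature map
flat = feature_map.flatten()               # row-major order by default

# Every flat index maps back to exactly one (row, col) position.
for i in range(flat.size):
    r, c = np.unravel_index(i, feature_map.shape)
    assert flat[i] == feature_map[r, c]

# The original 2D layout can be recovered exactly from the flat vector.
restored = flat.reshape(feature_map.shape)
assert np.array_equal(restored, feature_map)
```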