Since the elements of the matrix are bounded then I thought to use a variation of counting sort and then the running time maybe could be O(n^2), assuming that the size of the matrix is n^2.
Assuming that the result should be a sorted one dimensional array of size n^2 .
Can I get a hint ?
You already have the answer in your tags... Counting sort will beat anything else in such a small range as [0, 127]
.