Search code examples
rdata-sciencemdsmulti-dimensional-scaling

What is the difference between metric and non-metric MDS for a beginner?


I am fairly new to data science and would like to know in simple words (like teaching your grandmother) what the difference between metric and non-metric Multidimensional scaling is.

I have been googling for 2 days and watching different videos and wasn't able to quite understand some of the terms people are using to describe the difference, maybe I am lacking some basic knowledge but I don't know in which area so if you have an idea of what I should have a firm understanding of before tackling this subject, I would appreciate the advice. Here is what I know:

Multidimensional scaling is a way of reducing dimensions to be able to visualize or represent data in a more friendly manner. I know that there are several ways for MDS like metric and non metric, PCA and FA (maybe FA is a part of PCA, I'm not sure).

The example I am trying to apply this on is a set of data showing different cities and attributes related to these cities. For example, on a score from 1-7 (1 lowest - 7 highest), this is the score of each city and the corresponding attribute.

          **Clean**      **Friendly**     **Expensive**     **Beautiful**          

Berlin----------- 4 --------------------- 2-----------------------5------------------------6

Geneva---------6 --------------------- 3-----------------------7------------------------7

Paris------------ 3 --------------------- 4-----------------------6------------------------7

Barcelona----- 2 --------------------- 6-----------------------3------------------------4

How do I know if I should be using metric or non-metric MDS. Are there general rules of thumb or simple logic that I can use to decide without going deep into the technical process.

Thank you


Solution

  • Well, I might not be able to give you a specific answer but a simple answer would be that metric MDS already has the input matrix in the form of distances (i.e. actual distances between cities) and therefore the distances have meaning in the input matrix and create a map of actual physical locations from those distances.

    In non-metric MDS, the distances are just a representation of the rankings (i.e. high as in 7 or low as in 1) and they do not have any meaning on their own but they are needed to create the map using euclidean geometry and the map then just shows the similarity in rankings represented by distances between coordinates on the map.