I have a dataset with these features: Duration-connect, IP, Duration-LogIn.
Duration-connect and Duration-LogIn are continuous variables, but IP is a categorical variable which contains the IP address of the computer.
I would like to create a correlation matrix for these features, but I am not sure that cor() will work with the IP feature, since it is not continuous.
Any ideas on how to handle this?
Thank you
It won't work; just try
> cor(iris)
Error in cor(iris) : 'x' must be numeric
You could transform your IP addresses to numeric data (e.g. use the integer codes associated with the factor levels as numeric values), but the problem is that there is not much sense in computing a (Pearson) correlation on IP addresses. For example, what is the mean of a set of IP addresses? (I.e. it is an unordered set without a distance metric.)
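To illustrate why the factor-code approach is not meaningful, here is a minimal sketch. The data frame `df` and all of its values are made up for illustration, and the column names are the ones from the question with the hyphen replaced by a dot. The same IP column is coded under two equally arbitrary level orders, and the resulting correlation changes with the ordering.

```r
# Hypothetical toy data; values are invented purely for illustration.
df <- data.frame(
  Duration.connect = c(12.5, 30.1, 8.4, 45.0, 22.3, 15.7),
  IP               = c("10.0.0.7", "192.168.1.20", "10.0.0.7",
                       "172.16.0.3", "192.168.1.20", "172.16.0.3"),
  Duration.LogIn   = c(3.2, 7.9, 2.1, 11.4, 5.6, 4.0)
)

# Default factor levels are sorted as strings, so the integer codes
# only reflect that sort order, not anything about the addresses.
ip_codes_a <- as.integer(factor(df$IP))

# Same data, with a different (equally arbitrary) level order.
ip_codes_b <- as.integer(
  factor(df$IP, levels = c("192.168.1.20", "10.0.0.7", "172.16.0.3"))
)

# cor() runs on both, but the result depends on the chosen coding.
cor_a <- cor(df$Duration.connect, ip_codes_a)
cor_b <- cor(df$Duration.connect, ip_codes_b)
print(c(coding_a = cor_a, coding_b = cor_b))
```

Since nothing in the data justifies one level order over the other, neither number says anything about the relationship between connection duration and IP address.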
Depending on what you want to do, I would either leave the IP addresses out of the correlation computation (and maybe group the IP addresses into a hierarchy of sets along some logic and compare across those groups), or cluster the continuous variables and see what the clusters imply about the IP addresses. Again, it depends on your goal, but I think that this is not purely a problem of R mechanics.
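The two options can be sketched roughly as follows. This is only an illustration under assumptions: the data frame `df` and its values are made up, the column names are adapted from the question, and `kmeans()` with two centers is just one possible clustering choice, not a recommendation for your data.

```r
# Hypothetical toy data; values are invented purely for illustration.
df <- data.frame(
  Duration.connect = c(12.5, 30.1, 8.4, 45.0, 22.3, 15.7),
  IP               = c("10.0.0.7", "192.168.1.20", "10.0.0.7",
                       "172.16.0.3", "192.168.1.20", "172.16.0.3"),
  Duration.LogIn   = c(3.2, 7.9, 2.1, 11.4, 5.6, 4.0)
)

# Option 1: correlation matrix on the numeric columns only.
num <- df[sapply(df, is.numeric)]
cm  <- cor(num)
print(cm)

# Treat IP as a grouping variable instead: per-IP means of the durations.
by_ip <- aggregate(cbind(Duration.connect, Duration.LogIn) ~ IP,
                   data = df, FUN = mean)
print(by_ip)

# Option 2: cluster the (scaled) continuous variables, then
# cross-tabulate cluster membership against the IP addresses.
set.seed(1)                       # kmeans starts from random centers
km <- kmeans(scale(num), centers = 2)
print(table(IP = df$IP, cluster = km$cluster))
```

If a cluster is dominated by one IP address (or one group of addresses), that is a hint that the durations differ by IP, which is probably closer to what you want to know than a correlation coefficient.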