I have a data set with 12 variables, each taking values 1 to 4, that should be treated as ordinal. If I don't specify their type, they are treated as interval-scaled:
> attributes(gower_dist)
$class
[1] "dissimilarity" "dist"
$Size
[1] 5845
$Metric
[1] "mixed"
$Types
[1] "I" "I" "I" "I" "I" "I" "I" "I" "I" "I" "I" "I"
But if I add 'type=list(ordratio=1:12)', the type becomes 'T', and I'm not sure what that stands for. If it doesn't stand for ordinal, how do I tell daisy that I am inputting ordinal data?
> attributes(gower_dist)
$class
[1] "dissimilarity" "dist"
$Size
[1] 5845
$Metric
[1] "mixed"
$Types
[1] "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T" "T"
Short answer:
If you specified the columns as ordratio and the resulting type is "T", that's the expected behaviour.
Long answer:
I took a look inside the daisy function. There are 6 possible values for the Types attribute:
typeCodes <- c("A", "S", "N", "O", "I", "T")
I cycled through the function in debug mode a couple of times with different parameters. The mapping appears to be as follows for this attribute:
If you specify type = list(asymm=<whichever columns in the dataset>): "A"
If you specify type = list(symm=<whichever columns in the dataset>): "S"
If you specify type = list(ordratio=<whichever columns in the dataset>): "T"
If you don't specify type, or you specify type = list(logratio=<whichever columns in the dataset>), and your dataset's columns are:
    factors: "N"
    ordered: "O"
    numeric / integers: "I"
(Not sure why logratio doesn't get its own type, but that's probably going off topic here...)
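To check the mapping yourself, and to get "O" if that is what you want, here is a minimal sketch. It assumes the cluster package is installed and uses a small made-up data frame (df, v1 to v3 are hypothetical stand-ins for your 12 columns). Going by the mapping above, converting the columns to ordered factors should be the way to have daisy report them as ordinal.

```r
library(cluster)

# Hypothetical toy data: 3 integer columns with values 1 to 4
set.seed(1)
df <- data.frame(
  v1 = sample(1:4, 20, replace = TRUE),
  v2 = sample(1:4, 20, replace = TRUE),
  v3 = sample(1:4, 20, replace = TRUE)
)

# No type given: integer columns are treated as interval-scaled ("I" in the question)
types_default <- attr(daisy(df, metric = "gower"), "Types")

# Columns declared as ordratio ("T" in the question)
types_ordratio <- attr(
  daisy(df, metric = "gower", type = list(ordratio = 1:3)),
  "Types"
)

# Columns converted to ordered factors before calling daisy
df_ord <- as.data.frame(lapply(df, factor, levels = 1:4, ordered = TRUE))
types_ordered <- attr(daisy(df_ord, metric = "gower"), "Types")

print(types_default)
print(types_ordratio)
print(types_ordered)
```

Comparing the three printed vectors against the mapping is a quick way to confirm how your installed version of cluster behaves before running it on the full data set.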