unique() finds the unique values of a vector.
If I have a data frame:
test_data <- data.frame(
  x = c(rep(1.00050239485720394857, 4), 1.00050239485720394854,
        rep(2.0002230948570293845, 5), rep(3.0005903847502398475, 5)),
  y = c(rep(4.00423409872345, 5), rep(2.034532039485722, 5),
        rep(1.1234152304957, 5)))
sapply(test_data,unique)
R returns:
x y
[1,] 1.000502 4.004234
[2,] 2.000223 2.034532
[3,] 3.000590 1.123415
As expected.
But say I fit an lm() or aov() object and then try to find unique fitted values():
set.seed(123)
y = rf(100,50,3,3)
x1 <- factor(c(rep("blue",25),
rep("green",25),
rep("orange",25),
rep("purple",25)))
bsFit <- aov(y ~ x1)
unique(bsFit$fitted.values)
R returns:
[1] 2.709076 2.709076 2.709076 2.709076 2.709076 2.709076
[7] 2.709076 4.060080 4.060080 4.060080 4.060080 3.314801
[13] 3.314801 3.314801 3.314801 1.960280 1.960280 1.960280
[19] 1.960280 1.960280
There are clearly duplicates here.
As others have said (@Tim-Biegeleisen especially), the printed output is rounded to a limited number of significant digits. This is done by R's own print method (7 significant digits by default, controlled by options(digits)), not by RStudio specifically. So the "duplicates" are not duplicates once enough digits are shown: the values differ only in the last few digits because of floating-point rounding in the fit.
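A small self-contained illustration of the same effect, independent of the model above. The vector x is a made-up example, not taken from the question; it only shows that unique() compares the stored doubles exactly, while printing rounds them.

```r
# Two doubles that differ only far beyond the default 7 significant digits
x <- c(0.1 + 0.2, 0.3)

print(x)                        # both elements are displayed identically by default
length(unique(x))               # unique() keeps both, because the stored values differ
x[1] == x[2]                    # exact comparison: FALSE
isTRUE(all.equal(x[1], x[2]))   # comparison within a numerical tolerance: TRUE
print(x, digits = 17)           # printing more digits reveals the difference
```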
We can use format() to show more digits:
format(unique(bsFit$fitted.values), digits = 22)
[1] "2.7090760788376542" "2.7090760788376773" "2.7090760788376604" "2.7090760788376622" "2.7090760788376627"
[6] "2.7090760788376649" "2.7090760788376640" "4.0600797479202155" "4.0600797479202164" "4.0600797479202200"
[11] "4.0600797479202146" "3.3148005388803132" "3.3148005388803128" "3.3148005388803146" "3.3148005388803137"
[16] "1.9602804435309986" "1.9602804435309984" "1.9602804435309982" "1.9602804435309988" "1.9602804435310004"
I experimented with the number of digits and found that 22 is the largest value format() accepts before it throws an error.
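If the goal is one fitted value per group, one possible approach (a sketch, not the only way) is to round away the floating-point noise before calling unique(). The choice of 8 decimal places below is an arbitrary assumption; it only needs to be coarser than the noise (around 1e-14 here) and finer than the real differences between group means. The name fv is introduced just for this example.

```r
# Rebuild the model from the question
set.seed(123)
y  <- rf(100, 50, 3, 3)
x1 <- factor(rep(c("blue", "green", "orange", "purple"), each = 25))
bsFit <- aov(y ~ x1)

fv <- fitted(bsFit)

# Round before deduplicating so values that differ only by
# floating-point noise collapse to a single value per group.
# digits = 8 is an arbitrary tolerance chosen for illustration.
unique_fits <- unique(round(fv, digits = 8))
length(unique_fits)   # one value per level of x1
```

Rounding changes the returned values slightly, so if the exact stored numbers matter, it may be preferable to pick one representative per group instead, for example with tapply(fv, x1, function(v) v[1]).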