The following exchange arose from the need to define a dimension for a vector in R:
M. JORGENSEN (Dept of Stat, U of Waikato, NZ):
"Would it not make sense to have dim(A)=length(A)
for all vectors?"
B.D. RIPLEY (Dept of Applied Statistics, Oxford, UK):
"No. A one-dimensional array and a vector are not the same thing.
There are subtle differences, such as what names() means (see ?names).
That a 1D array and a vector print in the same way does occasionally
lead to confusion, but then you also cannot tell from your printout that A
has type integer and not double.
......
My questions:
(1) I cannot figure out the subtle difference in what names() means, and
(2) I cannot produce a concrete example of the "cannot tell from your printout that A has type integer and not double" issue.
Any help clarifying the Jorgensen-Ripley discussion (with concrete examples in R) would be appreciated.
To address the first question, let's first create a vector and a 1-d array:
(vector <- 1:10)
#> [1] 1 2 3 4 5 6 7 8 9 10
(arr_1d <- array(1:10, dim = 10))
#> [1] 1 2 3 4 5 6 7 8 9 10
If we give the objects some names, we can see the difference that Ripley alludes to by looking at the attributes:
names(vector) <- letters[1:10]
names(arr_1d) <- letters[1:10]
attributes(vector)
#> $names
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
attributes(arr_1d)
#> $dim
#> [1] 10
#>
#> $dimnames
#> $dimnames[[1]]
#> [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
That is, the 1-d array doesn't actually have a names attribute, but rather a dimnames attribute (which is a list, not a vector), the first element of which names() actually accesses.
This is covered in the "Note" section in ?names:
For vectors, the names are one of the attributes with restrictions on the possible values. For pairlists, the names are the tags and converted to and from a character vector.
For a one-dimensional array the names attribute really is dimnames[[1]].
Here we also see the lack of a dim attribute for vectors. (A related SO answer covers the differences between arrays and vectors, too.)
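Both points can be checked directly. The following is a minimal sketch using small throwaway objects (the names v and a are illustrative, not from the thread): names() on the 1-d array returns the first element of dimnames, and only the array carries a dim attribute.

```r
v <- 1:3                    # plain vector
a <- array(1:3, dim = 3)    # 1-d array holding the same values

names(v) <- c("x", "y", "z")
names(a) <- c("x", "y", "z")

# For the 1-d array, names() reads the first element of dimnames
identical(names(a), dimnames(a)[[1]])
#> [1] TRUE

# Only the vector stores an attribute literally called "names"
"names" %in% names(attributes(v))
#> [1] TRUE
"names" %in% names(attributes(a))
#> [1] FALSE

# Only the array has a dim; length() is the same for both
dim(v)
#> NULL
dim(a)
#> [1] 3
length(v) == length(a)
#> [1] TRUE

# is.vector() is FALSE for the array because it has attributes other than names
is.vector(v)
#> [1] TRUE
is.vector(a)
#> [1] FALSE
is.array(a)
#> [1] TRUE
```

So length() already answers "how many elements" for both objects, while dim() distinguishes an array from a plain vector.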
The additional attributes and the way they are stored mean that 1-d arrays always take up a little more memory than their vector equivalents:
# devtools::install_github("r-lib/lobstr")
lobstr::obj_size(vector)
#> 848 B
lobstr::obj_size(arr_1d)
#> 1,056 B
However, that's about the only reason I can think of why one would want to have separate types for vectors and 1-d arrays. I would assume this was really the question that Jorgensen was asking, i.e. why have a separate vector type without the dim attribute at all; and I don't think Ripley really addresses that.
I'd be very interested to hear other rationales for this.
As for point (2), when you create a vector with : from whole-number endpoints, as here, it is an integer vector:
vector <- 1:10
typeof(vector)
#> [1] "integer"
A double with the same values will print the same:
double <- as.numeric(vector)
typeof(double)
#> [1] "double"
double
#> [1] 1 2 3 4 5 6 7 8 9 10
But integers and doubles are not the same thing:
identical(vector, double)
#> [1] FALSE
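Since print() hides the storage type, it helps to know a few ways to reveal it. The sketch below uses illustrative names (ints, dbls) that are not part of the original example; it shows that the values compare equal element-wise even though the objects are not identical, and that str() reports the type.

```r
ints <- 1:3            # `:` with whole-number endpoints gives integers
dbls <- c(1, 2, 3)     # numeric literals without an L suffix are doubles

typeof(ints)
#> [1] "integer"
typeof(dbls)
#> [1] "double"

# Values compare equal element-wise even though the types differ
all(ints == dbls)
#> [1] TRUE
identical(ints, dbls)
#> [1] FALSE

# The L suffix creates integer literals
identical(ints, c(1L, 2L, 3L))
#> [1] TRUE

# str() reports the storage type that print() hides
str(ints)
#>  int [1:3] 1 2 3
str(dbls)
#>  num [1:3] 1 2 3
```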
The differences between integers and doubles in R are subtle; the main one is that integers take up less space in memory:
lobstr::obj_size(vector)
#> 88 B
lobstr::obj_size(double)
#> 168 B
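Memory is not the only place the type shows up. As a small illustrative sketch (the name big is hypothetical), integer arithmetic overflows to NA once a result exceeds the largest representable integer, whereas mixing in a double promotes the result to double:

```r
big <- .Machine$integer.max   # largest value R's integer type can hold
big
#> [1] 2147483647
typeof(big)
#> [1] "integer"

# Integer + integer overflows to NA (R also emits a warning)
suppressWarnings(big + 1L)
#> [1] NA

# Integer + double is computed as a double, so no overflow here
big + 1
#> [1] 2147483648
typeof(big + 1)
#> [1] "double"
```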
See this answer for a more comprehensive overview of the differences between integers and doubles.
Created on 2018-07-09 by the reprex package (v0.2.0.9000).