I do not understand what the n_unique value in skim() output means for list variables:
library(tidyverse)
library(skimr)
skim(starwars)
The following is part of the result, about the three list variables in the dataset:
Now, there are 10 different vehicles in the dataset, so it makes sense that n_unique is 11 (including the null case of Star Wars characters who do not use any vehicle). Characters use anywhere from zero vehicles (min_length) to two different vehicles (max_length) across the movies. There are also 16 starships, and characters use from zero to five different starships, so that all makes sense.
However, there are only seven movies, so I would expect n_unique to be 7, not 24. The lengths themselves look right: a character appears in at least one movie (min_length) and at most all seven (max_length).
There are 7 distinct individual films, but 24 unique elements when you compare the list elements with one another, because each list element is one character's whole set of films.
For example, if the first element is [The Phantom Menace, Revenge of the Sith] and the second element is [The Phantom Menace], then the two elements are different.
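To make the comparison concrete, here is a minimal base-R sketch on toy data. The object names (`first`, `second`, `third`, `films`) are made up for illustration and are not taken from `starwars`; the point is only that `unique()` on a list compares whole elements, so the number of distinct elements can exceed the number of distinct titles.

```r
# Toy list elements mirroring the example above (illustrative data only)
first  <- c("The Phantom Menace", "Revenge of the Sith")
second <- c("The Phantom Menace")
third  <- c("Revenge of the Sith")
films  <- list(first, second, third, second)

# List elements are compared as whole vectors, not title by title
same_element <- identical(first, second)
same_element
#> [1] FALSE

# Distinct list elements: first, second and third are all different
n_elements <- length(unique(films))
n_elements
#> [1] 3

# Distinct individual titles after flattening the list
n_titles <- length(unique(unlist(films)))
n_titles
#> [1] 2
```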
library(tidyverse)
library(skimr)
# Count unique individual films
starwars$films |>
  unlist() |>
  unique() |>
  length()
#> [1] 7
# Count unique list elements
starwars$films |>
  unique() |>
  length()
#> [1] 24
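If you want to see which combinations make up the distinct elements, one option is to collapse each element into a single string and tabulate the results. The sketch below uses toy data, and `combo_key` is a hypothetical helper written for this illustration, not part of skimr or dplyr.

```r
# Toy list column: each element is one character's set of films (illustrative data)
films <- list(
  c("A New Hope", "Return of the Jedi"),
  c("A New Hope"),
  c("A New Hope", "Return of the Jedi"),
  c("Return of the Jedi")
)

# combo_key is a hypothetical helper: it turns one element into one label
combo_key <- function(x) paste(x, collapse = " + ")

# One label per list element
keys <- vapply(films, combo_key, character(1))

# How often each combination occurs
combo_counts <- table(keys)

# Number of distinct combinations
n_combos <- length(combo_counts)
n_combos
#> [1] 3
```

Assuming the titles appear in a consistent order within each element, applying the same steps to `starwars$films` lists each distinct combination with the number of characters that share it.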