I have a dynamically generated, indexed array in bash and want to know whether it is sparse or dense.
An array is sparse iff there are unset indices before the last entry. Otherwise the array is dense.
The check should work in every case, even for empty arrays, very big arrays (exceeding ARG_MAX when expanded), and of course arrays with arbitrary entries (for instance null entries or entries containing *
, \
, spaces, and linebreaks). The latter should be fairly easy, as you probably don't want to expand the values of the array anyways.
Ideally, the check should be efficient and portable.
Here are some basic test cases to check your solution.
Your check can use the hard-coded global variable name a
for compatibility with older bash versions. For bash 4.3 and higher you may want to use local -n isDense_array="$1"
instead, so that you can specify the array to be checked.
isDense() {
# INSERT YOUR CHECK HERE
# if array `a` is dense, return 0 (success)
# if array `a` is sparse, return any of 1-255 (failure)
}
test() {
isDense && result=dense || result=sparse
[[ "$result" = "$expected" ]] ||
echo "Test in line $BASH_LINENO failed: $expected array considered $result"
}
expected=dense
a=(); test
a=(''); test
a=(x x x); test
expected=sparse
a=([1]=x); test
a=([1]=); test
a=([0]=x [2]=x); test
a=([4]=x [5]=x [6]=x); test
a=([0]=x [3]=x [4]=x [13]=x); test
To benchmark your check, you can use
a=($(seq 9999999))
time {
isDense
unset 'a[0]'; isDense
a[0]=1; unset 'a[9999998]'; isDense
a=([0]=x [9999999999]=x); isDense
}
Non-empty, dense arrays have indices from 0
to ${#a[*]}-1
. Due to the pigeonhole principle, the last index of a sparse array must be greater or equal ${#a[@]}
.
To get the last index, we assume that the list of indices ${!a[@]}
is in ascending order. Bash's manual does not specify any order, but (at least for bash 5 and below) the implementation guarantees this order (in the source code file array.c
search for array_keys_to_word_list
).
isDense() {
[[ "${#a[*]}" = 0 || " ${!a[*]}" == *" $((${#a[*]}-1))" ]]
}
For small arrays this works very well. For huge arrays the check is a bit slow because of the ${!a[*]}
. The benchmark from the question took 9.8 seconds.
The approach in this answer only needs the last index. But bash only allows to extract all indices using ${!a[*]}
which is unnecessary slow. Internally, bash knows what the last index is. So if you wanted, you could write a loadable builtin that has access to bash's internal data structures.
Of course this is not a really practical solution. If the performance really did matter that much, you shoudn't use a bash script. Nevertheless, I wrote such a builtin for the fun of it.
The space and time complexity of above builtin is indepent of the size and structure of the array. Checking isdense a
should be as fast as something like b=1
.