I was using MATLAB's fillmissing function to replace missing values in some data using a custom function, and I ran into trouble filling data at the locations that I thought the function should treat as 'endvalues'. It's fairly clear from the documentation that the fillfun method uses a different moving-window definition than the movmean or movmedian methods: they define the window on either side of each missing element, while fillfun defines the window on either side of each whole gap.
That said, consider the following test array, A, which includes several combinations of missing values, with each column treated as a separate case:
>> A = reshape(1:20,5,4);
>> A([1,7,8,9,11:15,20])=NaN
A =
NaN 6 NaN 16
2 NaN NaN 17
3 NaN NaN 18
4 NaN NaN 19
5 10 NaN NaN
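To make explicit which elements I would expect to count as 'endvalues', here is a minimal sketch that marks leading and trailing gaps column by column. It uses only isnan and cumsum; the variable names (isMissing, leading, trailing, isEnd, isInterior) are illustrative, and this is my reading of the term, not necessarily the definition fillmissing uses internally.

```matlab
A = reshape(1:20,5,4);
A([1,7,8,9,11:15,20]) = NaN;

isMissing = isnan(A);

% Leading gap: no valid value at or above this row in the column.
leading = cumsum(~isMissing, 1) == 0;

% Trailing gap: no valid value at or below this row in the column.
trailing = cumsum(~isMissing, 1, 'reverse') == 0;

% End values are missing elements in a leading or trailing gap;
% an all-missing column counts as both.
isEnd = isMissing & (leading | trailing);

% Interior gaps are the remaining missing elements.
isInterior = isMissing & ~isEnd;
```

For this A, isEnd is true at (1,1), in all of column 3, and at (5,4), which are the positions where the 'constant' method with 'endvalues',99 places 99.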
Using the simple fill methods things are straightforward:
>> fillmissing(A, 'constant', 0)
ans =
0 6 0 16
2 0 0 17
3 0 0 18
4 0 0 19
5 10 0 0
>> fillmissing(A, 'constant', 0, 'endvalues',99)
ans =
99 6 99 16
2 0 99 17
3 0 99 18
4 0 99 19
5 10 99 99
Now if I use a simple test function, @(x,y,z) z, which should fill the entire gap with the sample-point positions (the default SamplePoints being [1 2 3 4 5]), the behavior gets odd:
>> fillmissing(A, @(x,y,z) z, 2)
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 5
>> fillmissing(A, @(x,y,z) z, 2, 'endvalues',99)
ans =
99 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 99
It seems that what counts as 'endvalues' is not equivalent between methods. Further, it seems fillmissing arbitrarily leaves end values unfilled (consistent with the above caveats) if the window contains no valid points, whether or not the function is defined to fill those values:
>> fillmissing(A, @(x,y,z) z, [2 0])
ans =
NaN 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 5
>> fillmissing(A, @(x,y,z) z, [0 2])
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 NaN
>> fillmissing(A, @(x,y,z) z, [0 2], 'endvalues', 99)
ans =
99 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 99
>> fillmissing(A, @(x,y,z) z, [0 2], 'endvalues', 'extrap')
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 NaN
Last, the interaction with 'SamplePoints' seems unclear.
>> fillmissing(A, @(x,y,z) z, 2, 'SamplePoints', [1 2 3 4 5])
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 5
>> fillmissing(A, @(x,y,z) z, 2, 'SamplePoints', [1 2 3 4 5]+10)
ans =
11 6 1 16
2 12 2 17
3 13 3 18
4 14 4 19
5 10 5 15
That may be a bug in the way SamplePoints is handled. Can anyone clarify the expected behavior? Am I missing something? If there is clearer documentation somewhere for the custom-function method, I would appreciate any pointers.
(Tested using MATLAB 2021b, if that matters. Update: verified that the above behavior persists in MATLAB 2022b.)
(Edit: see the answer below, with updates showing how the function output changed, tested in MATLAB 2023a.)
It appears that as of MATLAB 2023a, the inconsistencies posted above when using a custom fillfun with fillmissing have been resolved. The same commands now produce the output below.
There are no changes to the simple cases:
>> fillmissing(A, 'constant', 0)
ans =
0 6 0 16
2 0 0 17
3 0 0 18
4 0 0 19
5 10 0 0
>> fillmissing(A, 'constant', 0, 'endvalues',99)
ans =
99 6 99 16
2 0 99 17
3 0 99 18
4 0 99 19
5 10 99 99
The fillfun method now consistently treats an empty column as endvalues, just as in the simple cases above:
>> fillmissing(A, @(x,y,z) z, 2)
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 5
>> fillmissing(A, @(x,y,z) z, 2, 'endvalues',99)
ans =
99 6 99 16
2 2 99 17
3 3 99 18
4 4 99 19
5 10 99 99
End values also appear to be handled consistently when the window is given as a two-element range:
>> fillmissing(A, @(x,y,z) z, [2 0])
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 5
>> fillmissing(A, @(x,y,z) z, [0 2])
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 5
>> fillmissing(A, @(x,y,z) z, [0 2], 'endvalues', 99)
ans =
99 6 99 16
2 2 99 17
3 3 99 18
4 4 99 19
5 10 99 99
>> fillmissing(A, @(x,y,z) z, [0 2], 'endvalues', 'extrap')
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 5
The SamplePoints interaction now also consistently uses the actual sample-point value rather than the positional index:
>> fillmissing(A, @(x,y,z) z, 2, 'SamplePoints', [1 2 3 4 5])
ans =
1 6 1 16
2 2 2 17
3 3 3 18
4 4 4 19
5 10 5 5
>> fillmissing(A, @(x,y,z) z, 2, 'SamplePoints', [1 2 3 4 5]+10)
ans =
11 6 11 16
2 12 12 17
3 13 13 18
4 14 14 19
5 10 15 15
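As a cross-check that does not depend on how fillmissing windows a custom function, the result expected from @(x,y,z) z can be computed directly by writing each missing element's sample point into its position. This is only a sketch for this test case: the names t, T, and B are illustrative, and it assumes the same sample points apply to every column.

```matlab
A = reshape(1:20,5,4);
A([1,7,8,9,11:15,20]) = NaN;

% Sample points, one per row (same values as the last call above).
t = (1:5) + 10;

% Replicate the sample points so T has the same size as A.
T = repmat(t(:), 1, size(A,2));

% Fill every missing element with its own sample point.
B = A;
B(isnan(A)) = T(isnan(A));
```

B matches the last output shown above, so the 2023a result is what I would expect when every missing element, including the end values, is passed to the function.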