Search code examples
.netvb.netexceloptimizationexcel-dna

How to handle full-column ranges in UDF?


I developed an addin for Excel using Excel DNA. I declared an UDF that accepts a range as input, casts the ExcelReference to Range and uses the enumerator from GetEnumerator to collect all cells values in a List for further processing and then writing outputs to another range.

As a test, I tried passing an entire column to the function (A:A) and everything froze because of the enumerator keeping enumerating empty cells after the last one with value.

Is there a faster way to detect a full-column Range and to get all its cells that are not empty?

For now I am using this code but it is incredibly slow in the above-mentioned case.

            Dim ue = inputRange.GetEnumerator    
            Dim L As New List(Of String)

            Do
                If ue.MoveNext Then
                    Dim c As Range = ue.Current
                    Dim V As String = c.FormulaLocal
                    If String.IsNullOrWhiteSpace(V) Then Continue Do
                    L.Add(V)
                Else
                    Exit Do
                End If
            Loop

I am going to use the following workaround, but I'd like to prevent this at the root.

            Dim ue = inputRange.GetEnumerator
            Dim counter as integer=0
            Dim L As New List(Of String)
            Do
                If counter>10 Then Exit Do
                If ue.MoveNext Then
                    Dim c As Range = ue.Current
                    Dim V As String = c.FormulaLocal
                    If String.IsNullOrWhiteSpace(V.Trim) Then
                        counter = counter + 1
                        Continue Do
                    End If
                    L.Add(V)
                Else
                    Exit Do
                End If
            Loop

Solution

  • It would be much faster to get all the values in one go from the ExcelReference, instead of getting a COM Range object.

    Either remove AllowReference=true from the argument (then you'll get the value directly) or get the value from the ExcelReference:

    object value = inputRef.GetValue();
    
    if (value is object[,])
    { 
        object[,] valueArr = (object[,])value;
        int rows = valueArr.GetLength(0);
        int cols = valueArr.GetLength(1);
        for (int i = 0; i < rows; i++)
        {
            for (int j = 0; j < cols; j++)
            {    
                object val = valueArr[i,j];
                // Do more here...
            }
        }
    }
    

    If a cell is empty, the object you get will be of type ExcelEmpty. You can ignore these it you're not interested in empty cells.

    Another approach is to use the C API to get the used range for the sheet, and intersect that with your ExcelReference. One disadvantage is that this requires your function to be marked IsMacroType=true, which (together with AllowReference=true) has the side effect of making your function volatile.

    Code that shows how to do this is here: https://gist.github.com/govert/e66c5462901405dc96aab8e77abef24c