Search code examples
excelvbaunique

Loop through each unique date in array based on criteria


In Sheet 1, I have a flat-file format of data, including dates.

In Sheet 2, in Cell A1, I have a date of 01/01/2019 and in cell A2 a date of 31/12/2019.

In VBA I need to be able to pick out the unique dates from Sheet 1 which are within the date criteria specified in A1/A2 in Sheet 2 and put them into Sheet 3, incrementing one row for each unique date.

Sheet 1 data structure is like follows:

    A            B
1   02/01/2019   Chromium
2   02/02/2019   Chromium
3   02/06/2019   Chromium
4   02/09/2019   Chromium
5   03/03/2020   Chromium
6   05/06/2020   Chromium
7   02/01/2019   Cadmium
8   10/10/2019   Cadmium
9   20/11/2019   Cadmium

In the example dataset above, there are nine rows with dates in and eight unique dates overall, but only six of them fall within the date criteria specified above.

I'll be incorporating this into another loop within the wider VBA code.

I'm wondering if I need to use a dictionary?


Solution

  • Short, easy and fast solution is

    Option Explicit
    
    Sub GetUniqueDatesBetween()
      Dim a, b, al&, i&, j&, lb&, ub&, pv&
      a = Application.Sort(Sheet1.[A1].CurrentRegion.Columns(1).Value2)
      lb = Sheet2.[A1].Value2: ub = Sheet2.[A2].Value2
      j = 0: al = UBound(a, 1): ReDim b(al, 0)
      For i = 1 To al
        If a(i, 1) >= lb Then Exit For
      Next
      pv = a(i, 1) - 1
      For i = i To al
        If a(i, 1) > ub Then Exit For
        If a(i, 1) > pv Then
          b(j, 0) = a(i, 1)
          pv = a(i, 1)
          j = j + 1
        End If
      Next
      Sheet3.[A1].Resize(j, 1) = b
    End Sub
    

    UPD

    If you need to incorporate "this into another loop within the wider VBA code", you can use the next function:

    Public Type ArrL
      Length As Long
      Values As Variant
    End Type
    
    Function LoopThroughOmitDuplicates(rng As Range, lb&, ub&) As ArrL
      Dim a, b, al&, i&, j&, pv&, ob&
      a = rng.Value2: al = UBound(a, 1): ob = al + 1
      b = Application.SortBy(Application.Sequence(al), a)
      For i = 1 To al
        If a(b(i, 1), 1) >= lb Then Exit For
        b(i, 1) = ob
      Next
      j = 0: pv = a(b(i, 1), 1) - 1
      For i = i To al
        If a(b(i, 1), 1) > ub Then Exit For
        If a(b(i, 1), 1) > pv Then
          pv = a(b(i, 1), 1): j = j + 1
        Else
          b(i, 1) = ob
        End If
      Next
      For i = i To al: b(i, 1) = ob: Next
      With LoopThroughOmitDuplicates
        .Length = j: .Values = Application.Sort(b)
      End With
    End Function
    

    The test call:

    Sub test()
      Dim res As ArrL
      res = LoopThroughOmitDuplicates(Sheet1.[A1].CurrentRegion.Columns(1), Sheet2.[A1].Value2, Sheet2.[A2].Value2)
      Sheet3.[A1].Resize(res.Length, 1) = res.Values
    End Sub
    

    enter image description here

    This function produces the array with indexes of unique dates while other entries are replaced by the over bound index. The j variable gives the number of unique entries. You can loop through this array till j.

    This is very efficient algorithm which is faster than usage of a dictionary in most cases.

    Testing

    I conducted three tests on 150000 data table. Execution times (in seconds) are provided in last two columns:

    1. The left column is for the sort-based algorithm (my code).
    2. The right column is for the dictionary-based algorithm (by CDP1802).
    Formula Data sample ll ul Lower bound Upper bound Unique values With sort With dictionary
    (1) 03.03.2020 26.01.1900 07.12.2150 01.01.2000 31.12.2050 14898 0,1 0,71
    (1) 02.06.2019 26.01.1919 07.12.2120 01.01.2019 31.12.2020 580 0,09 0,04
    (2) HAUI26299 01.01.2018 31.12.2022 C W 115354 0,28 40,51

    Formulas:

    (1)=RANDARRAY(150000;1;ll;ul;TRUE)
    (2)=MAKEARRAY(150000;1;LAMBDA(r;c;CONCAT(CHAR(RANDARRAY(4;1;65;90;TRUE)))))
      &RANDARRAY(150000;1;ll;ul;TRUE)