Search code examples
vbaexceloptimizationtext-search

Optimization VBA text search


I created a VBA code for text analysis, but I encountered a problem at run time. I just found on Google an advice on using excel built in function, but it did not improve the run time.

Here is the problem for which I use VBA. I've got a list of ~30k cells containing text (one or two sentences on average) and a list of 1k keywords, all of them having a numeric score. For every of the 30k cells, I want to see which of the keywords the cell contains, and calculate the sum of the scores of the keywords found.

Here is my way of approaching the problem in a nutshell right now :

  • Loop on the 30k text cells

  • Loop on the keywords

  • Check if the keyword is in the text cell, if yes, add the score of the keyword

I also tried using the search built in function :

  • Loop on the keywords

  • Search the keywords on the entire sheet containing the 30k text cells

  • When the keyword is found, add the score on the corresponding cell.

There was no significant change in run time.

Below you can find my code for the first approach:

'Loop on all the 30k text cells
For i = 2 To last_textcell

    'loop on the number of different category of scores, setting intial scores to zero.
    For k = 1 To nb_score - 1
        Score(k) = 0
    Next k

    j = 2

    'loop on the 1k keywords        
    Do While j < last_keywords

            !search if the keyword is in the text cell
            If UCase(Sheets("DATA").Range("V" & i).Value) Like "*" & UCase(Sheets("Keywords").Range("A" & j).Value) & "*" Then

                'if the keyword is found, add the score of the keyword to the previous score
                For l = 1 To nb_score - 1
                    Score(l) = Score(l) + Sheets("Keywords").Range("B" & j).Offset(0, l - 1).Value
                Next l

            End If

            j = j + 1

    Loop

    'paste the score 
    For k = 1 To nb_categ - 1
        Sheets("DATA").Range("CO" & i).Offset(0, k - 1).Value = Score(k)
    Next k


Next i

Do you have any tips on how to improve the performance?

Thank you very much!


Solution

  • I'd suggest two optimizations:

    1. Load both the lists of sentences, and keywords, into memory before running the test. This means that you only request data from the sheet once, rather than for every iteration of the test.

    2. Use the InStr function with vbTextCompare to find an instance of a keyword.

    Here's the sample code - I left stubs for you to re-insert your scoring function code:

    Option Explicit
    
    Sub QuickTest()
    
        Dim wsKeywords As Worksheet
        Dim wsData As Worksheet
        Dim lngLastRow As Long
        Dim varKeywords As Variant
        Dim varData As Variant
        Dim lngSentenceCounter As Long
        Dim lngKeywordCounter As Long
    
        Set wsKeywords = ThisWorkbook.Worksheets("Keywords")
        Set wsData = ThisWorkbook.Worksheets("DATA")
    
        'get list of keywords in memory
        lngLastRow = wsKeywords.Cells(wsKeywords.Rows.Count, "B").End(xlUp).Row
        varKeywords = wsKeywords.Range("B2:B" & lngLastRow).Value
    
        'get data in memory
        lngLastRow = wsData.Cells(wsData.Rows.Count, "V").End(xlUp).Row
        varData = wsData.Range("V2:V" & lngLastRow).Value
    
        'your scoring setup code goes here
        '...
    
        'iterate data
        For lngSentenceCounter = 1 To UBound(varData, 1)
            'iterate keywords
            For lngKeywordCounter = 1 To UBound(varKeywords, 1)
                'test
                If InStr(1, varData(lngSentenceCounter, 1), varKeywords(lngKeywordCounter, 1), vbTextCompare) > 0 Then
                    'you have a hit!
                    'do something with the score
                End If
            Next lngKeywordCounter
        Next lngSentenceCounter
    
        'your scoring output code goes here
        '...
    
    End Sub