I have a spreadsheet that lists all permutations of 5 columns of data into a single column of text (Column X aka 24) and my goal is to extract only actual words from that list into its own column (Column Y aka 25). The first part is not performed with VBA and happens almost instantaneously, but the spell check + extracting the actual words takes over an hour to complete (I've had to stop it it after 10 minutes and not even 10% of the way through). Is there a better way to do this?
My lists start on row 6 (n = 6) and Range("V3") is just the number of permutations (in this case, 83,521).
Sub Permute_and_Extract()
n = 6
Range("X7:X1000000").ClearContents
Range("Y6:Y1000000").ClearContents
Max = Range("V3") + 5
Range("X6").Select
Selection.AutoFill Destination:=Range("X6:X" & Max)
For i = 6 To Max
x = Application.CheckSpelling(Cells(i, 24).Text)
If x = True Then
Cells(n, 25) = Cells(i, 24)
n = n + 1
End If
Next i
End Sub
Following from the comments above:
Sub Permute_and_Extract()
Const RNG As String = "F1:F10000"
Dim wlist As Object, t, c As Range, i As Long, arr, res
Dim rngTest As Range
Set rngTest = ActiveSheet.Range(RNG)
t = Timer
Set wlist = WordsList("C:\Temp\words.txt", 5)
Debug.Print "loaded list", Timer - t
Debug.Print wlist.Count, "words"
'using an array approach...
t = Timer
arr = rngTest.Value
For i = 1 To UBound(arr, 1)
res = wlist.exists(arr(i, 1))
Next i
Debug.Print "Array check", Timer - t
'going cell-by-cell...
t = Timer
For Each c In rngTest.Cells
res = wlist.exists(c.Value)
Next c
Debug.Print "Cell by cell", Timer - t
End Sub
'return a dictionary of words of length `wordLen` from file at `fPath`
Function WordsList(fPath As String, wordLen As Long) As Object
Dim dict As Object, s As String
Set dict = CreateObject("scripting.dictionary")
dict.comparemode = vbTextCompare 'case-insensitive !!!
With CreateObject("scripting.filesystemobject").opentextfile(fPath)
Do While Not .AtEndOfStream
s = .readline()
If Len(s) = wordLen Then dict.Add s, True
Loop
.Close
End With
Set WordsList = dict
End Function
Output:
loaded list 0.359375
8938 words
Array check 0.019
Cell by cell 0.030