Search code examples
excelvbaimportms-wordfiledialog

Problem importing text from a word document into excel if I use the filedialog


I have a vba code that imports some specific text from a Word document to specific cell in Excel.

Sub ExtractTextFromWordToExcel()
'copy text between two specified strings from Word to Excel.

    Dim wordApp As Object
    Dim wordDoc As Object
    Dim excelApp As Excel.Application
    Dim excelWorkbook As Excel.Workbook
    Dim excelSheet As Excel.Worksheet
    Dim startString As String
    Dim endString As String
    Dim extractedText As String
    
Dim i As Long
Dim LRow As Long

ThisWorkbook.Sheets("List2").Select

'find last row
ActiveSheet.UsedRange 'Refresh UsedRange

LRow = ActiveSheet.UsedRange.Rows(ActiveSheet.UsedRange.Rows.Count).Row

    ' Create a Word application object
    Set wordApp = CreateObject("Word.Application")
    
    ' Open the Word document
    Set wordDoc = wordApp.Documents.Open("C:\MyPath\MyFile.docx") ' Replace with your Word file path

For i = 2 To LRow
    
   If ActiveSheet.Range("B" & i) = "" Then
   
    ' Set the start and end strings for text extraction
        startString = ActiveSheet.Range("C" & i)
        endString = ActiveSheet.Range("D" & i)
    
    Else
    ' Set the start and end strings for text extraction
        startString = ActiveSheet.Range("B" & i) & vbTab '& ActiveSheet.Range("C" & i)
        endString = ActiveSheet.Range("D" & i) & vbTab & ActiveSheet.Range("E" & i)
   
   
   End If
    
   
    
    ' Extract the text between the start and end strings
    extractedText = wordDoc.Content
    extractedText = Split(extractedText, startString)(1)
    extractedText = Split(extractedText, endString)(0)
    
    
'    ' Create an Excel application object
'    Set excelApp = Excel.Application
'
'    ' Add a new workbook
'    Set excelWorkbook = excelApp.Workbooks.Add
'
'    ' Set a reference to the first sheet in the workbook
'    Set excelSheet = excelWorkbook.Sheets(1)
    
    ' Paste the extracted text into cell A1 of the Excel sheet
'    excelSheet.Range("D2").Value = extractedText
    ActiveSheet.Range("A" & i).Value = extractedText
    
    
 Next i

     ' Close the Word document and quit the Word application
    wordDoc.Close SaveChanges:=False
    wordApp.Quit
   
    ' Save the Excel workbook
'    excelWorkbook.SaveAs "C:\path\to\save\excel\file.xlsx" ' Replace with your desired Excel file path
    
    ' Close the Excel workbook and quit the Excel application
'    excelWorkbook.Close SaveChanges:=False
'    excelApp.Quit
    
    ' Release the objects from memory
    Set wordDoc = Nothing
    Set wordApp = Nothing
'   Set excelSheet = Nothing
'   Set excelWorkbook = Nothing
'   Set excelApp = Nothing
    
    MsgBox "Text extracted and saved to Excel successfully!"
End Sub

This code works, but I'd like to be able to search for Word document to be opened.

I have replaced this part of code


    ' Create a Word application object
    Set wordApp = CreateObject("Word.Application")
    
    ' Open the Word document
    Set wordDoc = wordApp.Documents.Open("C:\MyPath\MyFile.docx") ' Replace with your Word file path

with following code

    ' Create an instance of Word application
    Set wordApp = CreateObject("Word.Application")
    
    ' Display the file picker dialog
    filePath = Application.GetOpenFilename("Word Documents (*.docx), *.docx")
    
    ' Check if a file was selected
    If filePath <> False Then
        ' Open the selected Word document
        Set wordDoc = wordApp.Documents.Open(filePath)
        
        ' Make Word visible (optional)
        wordApp.Visible = True
    End If

Unfortunately, this code does not work as I expected. It opens the selected Word document, but does not copy the text and reports the error "Run-time error '9', Subscription out of range" for this line in 'Extract the text between the start and end strings part:

extractedText = Split(extractedText, startString)(1)

Any advice on how to fix the error would be appreciated.


Solution

  • This problem is not in the lines you change, those I tried, they are error-free. This problem is in the parsing of the contents of the open document, with Split function to convert out of the array element count is less than 2, so you use 1 to the array index subscript, there will be such an error. With 1, it is to take out the second element of the array, with 0, it is the first, and so on. To avoid this error, you should first check if the element count of the array is larger than 1, otherwise, you should handle it separately. Because I don't see your Excel file content, I don't know what your startString value actually is. (endString is also the same) You set its value with

      startString = ActiveSheet.Range("C" & I)
       ……
      startString = ActiveSheet.Range("B" & i) & vbTab
    

    However, if startString is not found in the .docx content or if startString is a null string (a string of zero length), then the return will be a one-dimensional array of one element,

    If delimiter is a zero-length string, a single-element array containing the entire expression string is returned.

    and the value of which is the entire content of the file. So your code can prefix these two lines with a judgment and a handler like maybe this:

          ' Extract the text between the start and end strings
        extractedText = wordDoc.Content
        
        Dim arr_startString, arr_endString
        
        arr_startString = Split(extractedText, startString)
        If UBound(arr_startString) > 0 Then
            extractedText = arr_startString(1)
            'extractedText = Split(extractedText, startString)(1)
        Else
            Rem What should be done, when the startString ="" or not found in the wordDoc.Content
            
        End If
        arr_endString = Split(extractedText, endString)
        If UBound(arr_endString) >= 0 Then
            extractedText = arr_endString(0)
            'extractedText = Split(extractedText, endString)(0)
        Else
            Rem What should be done, when the endString ="" or not found in the wordDoc.Content
            
        End If