Search code examples
rofficer

Using R to Automate Filename Retrieval in a Microsoft Word Table


I have a large table within a Microsoft Word document.

The majority of rows, but not all, have a single Microsoft Word file attached.

My job is to go into each row and manually type in the file name where an attachment is provided.

Is there any way to automate this task using an R package? For example, for each row that has a file attachment, automatically pull the filename and record it in the field directly to its left?

This is what the table looks like. The files are in the most right column. The column to its left is where I am going to be typing the filenames.

enter image description here

I've tried importing the docx file using the docxtractr package, but it is not reading in the filenames properly. Instead, it is replacing them with \s.

ievs_raw <- read_docx("ievs-raw.docx")

tbls <- docx_extract_all_tbls(real_world)

view(as.data.frame.list(tbls))

Produces the following output with \s where there should be filenames like CAP_ATT_H.11.114.docx etc.:

enter image description here


Solution

  • I wasn't able to figure this out using an R package, but the kind people at the Microsoft Community Forum helped out by providing a super useful Visual Basic Macro. What's great about this is it can accommodate cases where there is more than 1 attachment in a particular row.

    Sub ObjectNames()
        Dim ILS As InlineShape
        Dim nObj As Long
        Dim strName As String
        Dim col As Long
        Dim row As Long
        
        With ActiveDocument.Tables(1)
            col = .Columns.Count
            For row = 1 To .Rows.Count
                strName = ""
                
                # loop through all shapes in this row's last cell
                # (if there are none, the loop does nothing)
                For nObj = 1 To .Cell(row, col).Range.InlineShapes.Count
                    Set ILS = .Cell(row, col).Range.InlineShapes(nObj)
                    If Not ILS.OLEFormat Is Nothing Then
                        # build up a string with as many names as
                        # there are embedded objects, separated by
                        # paragraph marks (vbCr)
                        If nObj > 1 Then strName = strName & vbCr
                        strName = strName & ILS.OLEFormat.IconLabel
                    End If
                Next nObj
                
                If Len(strName) > 0 Then
                    .Cell(row, col - 1).Range.Text = strName
                End If
            Next row
        End With
    End Sub