Search code examples
excelvbamacoscsvapplescript

Creating multiple Excel workbooks from CSV data using Query Tables - why don't the format parameters take effect?


I need to import many (like, over 100) French-language, semicolon delimited CSV files to Excel and I got a little lazy about doing the conversion manually, and I sought a way to automate it. I'm using a MacBook (Apple silicon if it matters) so I can't use the built-in PowerQuery feature for this, unfortunately. I know nothing about VBA, so what was a measure of convenience spiraled out of control. Luckily up to this point I've been served well by my own stubbornness and Google -- I ended up operating off of this solution posted on Reddit. It ended up needing edits to actually function in current-year. Here's what I ended up with:

Sub Select_File_Or_Files_Mac()
    Dim MyPath As String
    Dim MyScript As String
    Dim MyFiles As String
    Dim MySplit As Variant
    Dim N As Long
    Dim Fname As String
    Dim mybook As Workbook

    On Error Resume Next

    MyFiles = AppleScriptTask("excelCSVselect.scpt", "select_files", "")
    On Error GoTo 0

    If MyFiles <> "" Then
        With Application
            .ScreenUpdating = False
            .EnableEvents = False
        End With
    MySplit = Split(MyFiles, Chr(10))
    
        For N = LBound(MySplit) To UBound(MySplit)

            'Get file name only and test if it is open
            Fname = Right(MySplit(N), Len(MySplit(N)))
'            - InStrRev(MySplit(N), _
'             ":", , 1))
            
                On Error Resume Next
                Set mybook = Workbooks.Open(MySplit(N))
                On Error GoTo 0
             Next
             
Worksheets("Sheet1").Activate

With ActiveSheet.QueryTables.Add( _
        Connection:="TEXT;" & Fname, _
        Destination:=Range("A1"))
        .FieldNames = True
        .RowNumbers = False
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SaveData = True
        .AdjustColumnWidth = True
        .TextFilePromptOnRefresh = False
        .TextFilePlatform = 65001
        .TextFileStartRow = 1
        .TextFileParseType = xlDelimited
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileSemicolonDelimiter = True

'        .TextFileColumnDataTypes = Array(1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
End With
           
              End If

End Sub


That excelCSVselect.scpt AppleScript has almost the exact same functionality as the reddit post's MacScript function. Cobbled together from multiple online sources. It allows you to select multiple .csv but converts it to a list with POSIX filepaths i.e. not using colons:

on convertListToString(theList, theDelimiter)
    set AppleScript's text item delimiters to theDelimiter
    set theString to theList as string
    set AppleScript's text item delimiters to ""
    return theString
end convertListToString

on select_files()
    set filelist to {}
    set CSV_files to choose file with prompt "Pick CSV Files to convert" with multiple selections allowed
    
    repeat with a in CSV_files
        set end of filelist to (POSIX path of a as string)
    end repeat
    return convertListToString(filelist, "\n")
end select_files

select_files()

The problem? All of the formatting that's supposed to be handled by the QueryTables.Add function isn't actually happening. I get a new workbook for each CSV but each CSV row is an ugly block. As far as I can tell, .TextFileSemicolonDelimiter = True isn't getting run. The other big important thing is making sure the French special characters are getting displayed correctly. Theoretically, .TextFilePlatform = 65001 should be doing that; it's not. The other unideal thing happening is that I think the program is getting stuck somehow in that last section because after running it I can't get back into the VBA project anymore without quitting the Excel worksheet.

As I've said above, I have little to no idea what I'm doing. But I was looking at the Microsoft documentation and thinking maybe one of those QueryTable dot properties (e.g. .BackgroundQuery) was extraneous and messing with the output. I no longer think that is the case, and by removing too many of those methods I earned myself a mysterious 400 error I could not fix so I just left them be. I think it had something to do with the refresh process. Not that I understand what that means! :')


Solution

  • After (unwisely) spending my whole day solving this problem, I managed to get the field separation working almost 100% the way I wanted it to!

    I didn't realize the linked solution was doing two things at once - that original Reddit poster opened a new workbook with Workbooks.Open when they actually wanted to load the data into a sheet.

    
    Sub Select_File_Or_Files_Mac()
        Dim MyFiles As String
        Dim MySplit As Variant
        Dim N As Long
        Dim Fname As String
        Dim ws As Worksheet
        Dim newData As QueryTable
    
        On Error Resume Next
        MyFiles = AppleScriptTask("excelCSVselect.scpt", "select_files", "")
        On Error GoTo 0
    
        If MyFiles <> "" Then
            With Application
                .ScreenUpdating = False
                .EnableEvents = False
            End With
    
            MySplit = Split(MyFiles, Chr(10))
            For N = LBound(MySplit) To UBound(MySplit)
                'Get file name only
                Fname = Right(MySplit(N), Len(MySplit(N)))
                Set ws = Worksheets.Add
                Set newData = ws.QueryTables.Add( _
                            Connection:="TEXT;" & Fname, _
                            Destination:=ws.Range("A1"))
                
                
        
      
                With newData
                    .BackgroundQuery = False ' Change to False to prevent freezing
                    .Refresh BackgroundQuery:=False
                    .TextFileParseType = xlDelimited
                    .TextFileStartRow = 1
                    .TextFileTextQualifier = xlTextQualifierDoubleQuote
                    .TextFilePlatform = 65001
                    .PreserveFormatting = True
                    .TextFileSemicolonDelimiter = True
                    .FieldNames = True
                    .RowNumbers = False
                    .FillAdjacentFormulas = False
                    .RefreshStyle = xlInsertDeleteCells
                    .SaveData = True
                    .AdjustColumnWidth = True
                    .RefreshPeriod = 0 ' No refresh period
                End With
                DoEvents ' Allow Excel to process events
            Next N
    
            With Application
                .ScreenUpdating = True
                .EnableEvents = True
            End With
        End If
    End Sub
    
    

    However, I still for the life of cannot get the special characters encoded correctly. I read something can happen where Excel just ignores UTF-8 encoding and uses UTF-16LE. I will choose to believe that is happening to cope.

    Whatever the reason may be, since it was ultra important to get the special characters right to read the French text, I just used Python/Pandas to do it instead. Went a lot smoother. (Maybe should have started with that.) ¯\_(ツ)_/¯