Search code examples
sqlexcelvbaadodate-formatting

VBA ADO Date Formatting on Mixed Data Input


we are trying to use ADO to read data from a closed workbook, remove any whitespace and convert any incorrectly keyed dates into a valid format. Once the data has been cleansed, it's uploaded into a custom app.

We are using ADO for speed purposes as we have found using VBA to open/manipulate/close takes too long, meaning we miss our upload target time (we have multiple workbooks we need to apply this to).

The problem we have is converting the dates to a valid format. Dates are entered into the workbook either as dd/mm/yy or dd.mm.yy - we have no control over this, the template was created years ago and we are unable to update it and apply data validation.

Ideas We Have Tried: We have a few ideas, but have not been successful, does anyone know if any of these suggestions could work / suggest alternate ideas?

Check for a "." and apply a Replace(): If InStr(rs.Fields("Date").Value, ".") > 0 Then rs.Fields("Date").Value = Replace(rs.Fields("Date").Value, ".", "/")

This works when the column is read into the record set as type 202: adVarWChar, unfortunatly as the majority of the dates are valid, the data in the record set is set as type 7: adDate, when looping, once we get to an invalid date format (with the dots), we get a debug error:

"you cannot record changes because a value you entered violates the settings defined for this table or list (for example, a value is less than the minimum or greater than the maximum). correct the error and try again"

Convert the whole column data type to 202 adVarWChar: As the above code works for entries when they are formatted as text, we had an idea to see if we could pull the whole column of data in directly as text, we have experimented with Casting and Convert but cannot get it to work - I no longer have the sample code we were trying for that. I recall experimenting adding IMEX=1 to the connection string, but this didn't seem to make any difference.

Apply a Find/Replace query on a whole column: Instead of retrieving the data and looping through it, we had an idea to apply a find and replace query directly on the column, similar to how we are able to trim a whole column. Again, we were unable to find any code/queries which worked.

Create an empty record set and set the column type to String: We had an idea to create a blank/empty record set and manually set the date column to a string type, and then loop through the retrieved data and move them into the new record set. We didn't get very far with this as we weren't too sure how to create a blank RS, then we also thought, how would we write this data back to the worksheet - as I don't think you can write back to a closed workbook.

Here is the code I have at the moment:

Sub DataTesting()

On Error GoTo ErrorHandler

'set the workbook path of the file we want to read from
Dim workbookFileName As String
workbookFileName = "C:\Users\xxx\xxx\myWorkbook.xls"

'create a connection string
Dim connectionString As String
connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" _
    & workbookFileName _
    & ";Extended Properties=""Excel 12.0 Xml;HDR=YES;"";" 'IMEX=1"";"

'open the connection
Dim conn As ADODB.connection
Set conn = New ADODB.connection
conn.connectionString = connectionString
conn.Open

Dim rs As ADODB.Recordset
Set rs = New ADODB.Recordset

'Convert all data in the date column to a valid date (e.g. replace dates with decimals 1.1.21 to 01/01/2021)

'set query to select all data from the date column
Dim query As String
query = "SELECT * FROM [DATA SHEET$B2:B100]"  'col B is the Date column

With rs
    .ActiveConnection = conn
    '.Fields.Append "Date", adVarChar, 20, adFldMayBeNull   'NOT WORKING
    .CursorType = adOpenDynamic
    .LockType = adLockOptimistic
    .Source = query
    .Open

    If Not .BOF And Not .EOF Then
        While (Not .EOF)
            If InStr(rs.Fields("Date").Value, ".") > 0 Then rs.Fields("Date").Value = Replace(rs.Fields("Date").Value, ".", "/")
            .MoveNext
        Wend
    End If
    .Close
End With

conn.Close

GoTo CleanUp

ErrorHandler:
MsgBox Err.Description 'THIS WILL BE WRITTEN TO TXT FILE

CleanUp:
'ensure the record set is equal to nothing and closed
If Not (rs Is Nothing) Then
    If (rs.State And adStateOpen) = adStateOpen Then rs.Close
    Set rs = Nothing
End If

'ensure the connection is equal to nothing and closed
If Not (conn Is Nothing) Then
    If (conn.State And adStateOpen) = adStateOpen Then conn.Close
    Set conn = Nothing
End If

End Sub

UPDATE: I am now able to read the data using the following query:

"SELECT IIF([Date] IS NULL, NULL, CSTR([Date])) AS [Date] FROM [DATA SHEET$B2:B10]"

This will only work if I set IMEX=1, which is only read-only. I am able to loop through each item and print out the value / detect where the dots are, but I cannot then amend them!

As mentioned by @Doug Coats I can move the data into an array, perform the manipulation on the array. But how exactly do I then put that array back into the recordset?

I guess I would need to close the first 'read only' connection, and re-open it as a 'write' connection. Then somehow run an update query - but how do I replace the existing record set values with the values from the array?

Thanks


Solution

  • You could try an update query

        Const SQL = " UPDATE [DATA SHEET$] " & _
                    " SET [Date] = REPLACE([Date],""."",""/"")" & _
                    " WHERE INSTR([Date],""."") > 0 "
    
        Dim n
        conn.Execute SQL, n
        MsgBox n & " records updated"
    
    Sub testdata()
       Dim wb, ws, i
       Set wb = Workbooks.Add
       Set ws = wb.Sheets(1)
       ws.Name = "DATA SHEET"
       ws.Cells(1, 2) = "Date"
       For i = 2 To 10
          If Rnd() > 0.5 Then
             ws.Cells(i, 2) = "27.07.21"
          Else
             ws.Cells(i, 2) = "27/07/21"
          End If
       Next
       wb.SaveAs "c:\temp\so\dates.xls"
       wb.Close
    End Sub