Stack Overflow, I've blown my whole morning on this issue. I'm trying to help out a coworker with a script. He's not a programmer; he copied some code off the internet and asked me to modify it to give the results he wanted. I pored over it, scrapped all the unnecessary parts, and rewrote it so that it does what I want in a way I understand. I should be honest and say that I only deal with VBScript in these contexts, when a coworker has a script that needs fixing. All my VB experience is in VB6.
The purpose of the script is to take a newline-delimited text file, potentially full of duplicate entries, and output it with all duplicates removed.
' Remove duplicate lines from a text file by querying it through the Jet text driver
Set objConnection = CreateObject("ADODB.Connection")
Set objRecordSet = CreateObject("ADODB.Recordset")
strPathToTextFile = "C:\Scripts\"
strFile = "Test.txt"
strOutputFile = "C:\this_is_the_output_changeme.txt"
Dim objFSO, objFile
Set objFSO = CreateObject("Scripting.FileSystemObject")
Set objFile = objFSO.CreateTextFile(strOutputFile)
' DISTINCT drops the duplicate rows
sql = "Select DISTINCT * FROM " & strFile
' The folder acts as the database; each text file in it is a table
objConnection.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & strPathToTextFile & ";" & _
    "Extended Properties=""text;HDR=NO;FMT=Delimited"""
objRecordSet.Open sql, objConnection
' Write the single column of each distinct row on its own line
Do Until objRecordSet.EOF
    objFile.Write(objRecordSet.Fields.Item(0).Value)
    objFile.Write(vbCrLf)
    objRecordSet.MoveNext
Loop
objFile.Close
Seems pretty solid, right? It works fine... depending on the input file. Here's the issue: sometimes it works like a charm, and sometimes it gets confused and reports all non-number entries as a single distinct null.
Here are two example inputs that work just fine. This input:
0
1
1
2
3
4
5
3
5
6
7
8
9
9
9
will output:
0
1
2
3
4
5
6
7
8
9
And this input:
gray
grey
gray
graey
greay
grey
gray
greasy
greay
outputs:
graey
gray
greasy
greay
grey
But many other inputs make this script crash with a Type Mismatch error. If I replace the objFile.Write calls with WScript.Echo, I can see that the recordset is returning nulls.
The simplest input to recreate this error with is:
1
1
a
a
If I echo out this input, I get:
null
1
Basically any combination of letters and numbers produces this error. All letters get returned as a single null, and the numbers come out fine.
This seems like very bizarre behavior to me. It appears that the recordset decides it is only going to receive number values whenever some of the values are numbers, and then throws out all letters as null numbers. As far as I can tell, the error occurs for any input where there are at least half as many number entries as letter entries.
I have been unable to find a way to tell it to return all items as strings. How should I pursue a solution to this issue?
The problem is caused by the driver guessing at the data type of the (one and only) column. Help the driver by putting a schema.ini file in the data source folder.
My schema.ini for this demo:
[numbers.txt]
Format=TabDelimited
ColNameHeader=False
Col1=F1 FLOAT
[texts.txt]
Format=TabDelimited
ColNameHeader=False
Col1=F1 TEXT
[mixed.txt]
Format=TabDelimited
ColNameHeader=False
Col1=F1 TEXT
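For the file in the question, the same kind of entry could also be written by the script itself before the connection is opened. The following is only a minimal sketch: WriteTextSchema is a hypothetical helper name, the folder and file name are taken from the question, and it assumes that overwriting any existing schema.ini in that folder is acceptable.

```vbscript
Option Explicit

' Hypothetical helper: writes a schema.ini that declares the single
' column of sFileName as TEXT, so the driver does not guess the type.
' Assumption: an existing schema.ini in sFolder may be overwritten.
Sub WriteTextSchema(sFolder, sFileName)
    Dim oFS : Set oFS = CreateObject("Scripting.FileSystemObject")
    ' Second argument True = overwrite an existing file
    Dim oTS : Set oTS = oFS.CreateTextFile(oFS.BuildPath(sFolder, "schema.ini"), True)
    oTS.WriteLine "[" & sFileName & "]"
    oTS.WriteLine "Format=TabDelimited"
    oTS.WriteLine "ColNameHeader=False"
    oTS.WriteLine "Col1=F1 TEXT"
    oTS.Close
End Sub

' Folder and file name as used in the question
WriteTextSchema "C:\Scripts", "Test.txt"
```

With such a file in place, the SELECT DISTINCT query from the question should see every line as a string, as the mixed.txt case in the demo suggests.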
Demo code:
Const adClipString = 2
' goFS is assumed to be a global FileSystemObject; it is created here so the snippet runs on its own
Dim goFS : Set goFS = CreateObject("Scripting.FileSystemObject")
Dim oCN : Set oCN = CreateObject("ADODB.Connection")
Dim sTDir : sTDir = goFS.GetAbsolutePathName("..\data")
Dim aTables : aTables = Array("numbers.txt", "texts.txt", "mixed.txt")
oCN.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & sTDir & ";" & _
    "Extended Properties=""text;HDR=NO;FMT=TabDelimited"""
Dim sTable
For Each sTable In aTables
    Dim sFSpec : sFSpec = goFS.BuildPath(sTDir, sTable)
    ' Raw file content, then what the driver reads, then the de-duplicated result
    WScript.Echo " In:", Replace(goFS.OpenTextFile(sFSpec).ReadAll(), vbCrLf, " ")
    WScript.Echo "Seen:", oCN.Execute("SELECT * FROM [" & sTable & "]").GetString(adClipString, , "", " ", "NULL")
    WScript.Echo " Out:", oCN.Execute("SELECT DISTINCT * FROM [" & sTable & "]").GetString(adClipString, , "", " ", "NULL")
Next
oCN.Close
QED output:
Unique00 - unique via ADO Text Driver
=================================================
In: 2,05 2 1 2,5 3 2,05 2
Seen: 2,05 2 1 2,5 3 2,05 2
Out: 1 2 2,05 2,5 3
In: grey gray gray
Seen: grey gray gray
Out: gray grey
In: 1000 grey 10 gray 9 gray 9 1 gray
Seen: 1000 grey 10 gray 9 gray 9 1 gray
Out: 1 10 1000 9 gray grey
=================================================
xpl.vbs: Successfully finished. (0) [0.67188 secs]
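To check which type the driver settled on for a given file, with or without a schema.ini, the ADO type of the first field can be printed. This is a rough diagnostic sketch, not part of the demo above; the folder and file name are the ones from the question and are assumptions about your setup.

```vbscript
Option Explicit

' Assumed location, taken from the question; adjust as needed
Const sFolder = "C:\Scripts\"
Const sFile = "Test.txt"

Dim oCN : Set oCN = CreateObject("ADODB.Connection")
oCN.Open "Provider=Microsoft.Jet.OLEDB.4.0;" & _
    "Data Source=" & sFolder & ";" & _
    "Extended Properties=""text;HDR=NO;FMT=Delimited"""

Dim oRS : Set oRS = oCN.Execute("SELECT * FROM [" & sFile & "]")

' Fields(0).Type is an ADO DataTypeEnum value, for example
' 3 (adInteger) or 5 (adDouble) for numbers, 202 (adVarWChar) for text
WScript.Echo "Type of first column:", oRS.Fields(0).Type

' Print every row, making Null values visible instead of failing on them
Do Until oRS.EOF
    If IsNull(oRS.Fields(0).Value) Then
        WScript.Echo "<NULL>"
    Else
        WScript.Echo oRS.Fields(0).Value
    End If
    oRS.MoveNext
Loop

oRS.Close
oCN.Close
```

Note that the Microsoft.Jet.OLEDB.4.0 provider is 32-bit only, so on 64-bit Windows the script has to be run with the 32-bit host (for example %windir%\SysWOW64\cscript.exe).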