Search code examples
textvbscriptadoadodb

VBScript to correctly/re format a delimited text file?


Can someone help me reformat/correctly format a delimited text file using VBScript?

I have a text file that is ^ delimited as below:

AGREE^NAME^ADD1^ADD2^ADD3^ADD4^PCODE^BAL^ARREARS
00010004000051162^MISS JENNIFER GRAY                      ^123 FAKE STREET           ^                              ^TOWN                     ^COUNTY    ^POSTCODE    ^ 004978.00^ 000188.72

All of the data contains leading and trailing spaces that need to be removed. I only have VBScript available to do this.

I have tried to use ADO GetStrings but it has yielded inconsistent results because of the leading and trailing spaces.

Can anyone offer any suggestions or alternatives?

Thanks


Solution

  • When working with ADO text files, you should start with table definitions in the approriate schema.ini file:

    [agree.txt]
    Format=Delimited(^)
    ColNameHeader=True
    DecimalSymbol=.
    CharacterSet=ANSI
    TextDelimiter=None
    Col1=AGREE   CHAR
    Col2=NAME    CHAR
    Col3=ADD1    CHAR
    Col4=ADD2    CHAR
    Col5=ADD3    CHAR
    Col6=ADD4    CHAR
    Col7=PCODE   CHAR
    Col8=BAL     FLOAT
    Col9=ARREARS FLOAT
    

    Then you can access your data:

      Dim sTDir  : sTDir    = goFS.GetAbsolutePathName( "..\data" )
      Dim sTbl1  : sTbl1    = "agree.txt"
      Dim sFSpec : sFSpec   = goFS.BuildPath(sTDir, sTbl1)
      Dim sTbl2  : sTbl2    = "agree2.txt"
      WScript.Echo "src file with spaces:"
      WScript.Echo goFS.OpenTextFile(sFSpec).ReadAll()
      Dim oTDb   : Set oTDb = CreateObject( "ADODB.Connection" )
      Dim sCS    : sCS      = Join( Array( _
         "Provider=Microsoft.Jet.OLEDB.4.0" _
       , "Data Source=" & sTDir _
       , "Extended Properties=" & Join( Array( _
            "text" _
          ), ";" ) _
      ), ";" )
      oTDB.Open sCS
      WScript.Echo "trimmed automagically:"
      WScript.Echo oTDb.Execute(Replace("SELECT * FROM [@T]", "@T", sTbl1)) _
        .GetString( adClipString, , "|", vbCrLf, "" )
    

    output:

    src file with spaces:
    AGREE^NAME^ADD1^ADD2^ADD3^ADD4^PCODE^BAL^ARREARS
    00010004000051162^MISS JENNIFER GRAY                      ^123 FAKE STREET     ^                              ^TOWN                     ^COUNTY    ^POSTCODE    ^ 004978.00^ 000188.72
    
    trimmed automagically:
    00010004000051162|MISS JENNIFER GRAY|123 FAKE STREET||TOWN|COUNTY|POSTCODE|4978|188,72
    

    As you can see, there isn't any problem with spurious spaces, if you use the right tool.

    To get a clean copy, just add

      sFSpec = goFS.BuildPath(sTDir, sTbl2)
      If goFS.FileExists(sFSpec) Then goFS.DeleteFile sFSpec
      Dim sSQL : sSQL = Replace(Replace( _
          "SELECT * INTO [@T2] FROM [@T1]" _
        , "@T1", sTbl1), "@T2", sTbl2)
      WScript.Echo "Copy statement"
      WScript.Echo sSQL
      oTDb.Execute sSQL
      WScript.Echo "QED: no spurious whilespace"
      WScript.Echo goFS.OpenTextFile(sFSpec).ReadAll()
    

    output:

    Copy statement
    SELECT * INTO [agree2.txt] FROM [agree.txt]
    QED: no spurious whilespace
    "AGREE";"NAME";"ADD1";"ADD2";"ADD3";"ADD4";"PCODE";"BAL";"ARREARS"
    "00010004000051162";"MISS JENNIFER GRAY";"123 FAKE STREET";;"TOWN";"COUNTY";"POSTCODE";4978,00;188,72
    

    The driver will add

    [agree2.txt]
    ColNameHeader=True
    CharacterSet=1252
    Format=Delimited(;)
    Col1=AGREE Char Width 255
    Col2=NAME Char Width 255
    Col3=ADD1 Char Width 255
    Col4=ADD2 Char Width 255
    Col5=ADD3 Char Width 255
    Col6=ADD4 Char Width 255
    Col7=PCODE Char Width 255
    Col8=BAL Float
    Col9=ARREARS Float
    

    (german locale defaults) to the schema.ini; edit this to

    [agree2.txt]
    Format=Delimited(^)
    ColNameHeader=True
    DecimalSymbol=.
    CharacterSet=ANSI
    TextDelimiter=None
    Col1=AGREE   CHAR
    ...
    

    to re-create your original format:

    QED: no spurious whilespace
    AGREE^NAME^ADD1^ADD2^ADD3^ADD4^PCODE^BAL^ARREARS
    00010004000051162^MISS JENNIFER GRAY^123 FAKE STREET^^TOWN^COUNTY^POSTCODE^4978.00^188.72
    

    By using more elaborate SQL statements and/or enhancing the table definitions you can solve more complex tasks in a straighforward way.

    P.S.

    Look here, if

      Dim sSQL : sSQL = Replace(Replace( _
          "SELECT * INTO [@T2] FROM [@T1]" _
        , "@T1", sTbl1), "@T2", sTbl2)
    

    makes you wonder (@T1 and @T2 are not ADO command parameters; the nested Replace() calls apply before ADO sees the then modified statement).