Search code examples
vb.netdatagridviewcsvheader-row

Read Tab Delimited File Into DataGridView


The following code I have reads a tab delimited file into a DataGridView. It works fine, but there are a couple of issues I'm not exactly sure how to address.

Dim query = From line In IO.File.ReadAllLines("C:\Temp\Temp.txt")
Let Data = line.Split(vbTab)
Let field1 = Data(0)
Let field2 = Data(1)
Let field3 = Data(2)
Let field4 = Data(3)

DataGridView1.DataSource = query.ToList
DataGridView1.Columns(0).Visible = False

How do I go about adding fields (columns) based on the number of fields in the header row? The header row currently contains 110 fields, which I'd hate to define in a similar manner to Let field1 = Data(0)

I'd also need to skip the header row and only display the lines after this.

Is there a better way to handle this then what I'm currently doing?


Solution

  • There are several tools to parse this type of file. One is OleDB.

    I cant quite figure out how the (deleted) answer works because, HDR=No; tells the Text Driver the first row does not contain column names. But this is sometimes ignored after it reads the first 8 lines without IMEX.

    However, FMT=Delimited\""" looks like it was copied from a C# answer because VB doesnt use \ to escape chars. It also looks like it is confusing the column delimiter (comma or tab in this case) and text delimiter (usually ")

    If the file is tab delimited, the correct value would be FMT=TabDelimited. I am guessing that the fields are text delimited with quotes (e.g. "France" "Paris" "2.25") and OleDB is chopping the data by quotes rather than tabs to accidentally get the same result.

    The correct ACE string would be:

    Dim connstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='C:\Temp';Extended Properties='TEXT;HDR=Yes;FMT=TabDelimited';"
    

    Using just the connection string will import each filed as string. You can also have OleDB convert the data read to whatever datatype it is meant to be so that you do not have to litter your code with lots of Convert.ToXXXX to convert the String data to whatever.

    This requires using a Schema.INI to define the file. This replaces most of the Extended Properties in the connection string leaving only Extended Properties='TEXT';" (which means use the TEXT Driver). Create a file name Schema.INI in the same folder as the data:

    [Capitals.txt]
    ColNameHeader=True
    CharacterSet=437
    Format=TabDelimited
    TextDelimiter="
    DecimalSymbol=.
    CurrencySymbol=$
    Col1="Country" Text Width 254
    Col2="Capital City" Text Width 254
    Col3="Population" Single
    Col4="Fake" Integer

    One Schema.INI can contain the layout for many files. Each file has its own section titled with the name of the file (e.g. [FooBar.CSV], [Capitals.txt]etc)

    Most of the entries should be self-explanatory, but FORMAT defines the column delimiter (TabDelimited, CSVDelimited or custom Delimited(;)); TextDelimiter is the character is used to enclose column data when it might contain spaces or other special characters. Things like CurrencySymbol lets you allow for a foreign symbol and can be omitted.

    The ColN= listings are where you can rename columns and specify the datatype. This might be tedious to enter for 100+ columns, however it would probably be mostly copy and paste. Once it is done you'd always have it and be able to easily use typed data.

    You do not need to specify the column names/size/type to use a Schema.INI If the file includes column names as the first row (ColNameHeader=True), you can use the Schema simply to specify the various parameters in a clear and readable fashion rather than squeezing them into the connection string.

    OleDB looks for a Schema.INI in the same folder as the import file, and then looks for a section bearing the exact name of the "table" used in the SQL:

    ' form level DT var
    Private capDT As DataTable
    
    ' procedure code to load the file:
    Dim connstr = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='C:\Temp';Extended Properties='TEXT';"
    Dim SQL = "SELECT * FROM Capitals.txt"
    
    capDT = New DataTable
    
    ' USING will close and dispose of resources
    Using cn As New OleDbConnection(connstr),
                cmd As New OleDbCommand(SQL, cn)
    
        cn.Open()
        Using da As New OleDbDataAdapter(cmd)
            da.Fill(capDT)
        End Using
    
    End Using   ' close and dispose
    

    The DataTable is now ready to use. If we iterate the columns, you can see they match the Type specified in the schema:

    ' display data types
    For n As Int32 = 0 To capDT.Columns.Count - 1
         Console.WriteLine("name: {0}, datatype: {1}",
                            capDT.Columns(n).ColumnName,
                            capDT.Columns(n).DataType.ToString)
    Next
    

    Output:

    name: Country, datatype: System.String
    name: Capital City, datatype: System.String
    name: Population, datatype: System.Single
    name: Fake, datatype: System.Int32

    See also: