vb.net, parsing, html-table, html-agility-pack, listviewitem

Parsing HTML tables in different divs at the same time to fill a ListView


I am trying to read data from two HTML tables that are located in different divs, but I cannot read them simultaneously. Parsing starts with the first table and then jumps to the second, whereas I need to read the data from the same row of both tables together.

    Imports HtmlAgilityPack   ' at the top of the file

    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles Me.Load

        Dim web As New HtmlWeb
        Dim docmech As HtmlDocument = web.Load("http://www.eurovent-certification.com/en/Certified_products/Access_by_programme.php?lg=en&rub=04&srub=01&select_prog=AHU&select_partic=664&select_marque=YORK&select_class=MB+%2F+MB+%2F+MECH")
        ' One XPath union over both tables: the matches come back in document order,
        ' so every node of the first table arrives before any node of the second.
        Dim MechNodes As HtmlNodeCollection = docmech.DocumentNode.SelectNodes("/html/body/table/tr/td[2]/table[6]/tr[1]/td[2]/div[2]/table//nobr[a[@class='certificat-pdf']] | /html/body/table/tr/td[2]/table[6]/tr[1]/td[2]/div[3]/table//td[@class='tabGrisClair > normal']")
        Dim ColumnCount As Integer = 1
        Dim TempListItem As New ListViewItem

        ' SelectNodes returns Nothing when no node matches.
        If MechNodes IsNot Nothing Then

            For Each item As HtmlNode In MechNodes
                If item.Name = "nobr" AndAlso item.InnerText <> "" AndAlso item.Attributes.Count = 0 Then
                    ' Model name from the first table: strip line breaks, tabs and &nbsp; entities.
                    Dim Name As String = item.InnerText.Replace(vbLf, "").Replace(vbCr, "").Replace(vbTab, "").Replace("&nbsp;", "")
                    TempListItem = ListView1.Items.Add(Name)

                ElseIf item.GetAttributeValue("class", "") = "tabGrisClair > normal" Then
                    ' Cell from the second table. GetAttributeValue avoids a
                    ' NullReferenceException when the node has no class attribute.
                    ' Note that this still adds a new row instead of a column of the matching row.
                    Dim SubName As String = item.InnerText.Replace(vbLf, "").Replace(vbCr, "").Replace(vbTab, "")
                    TempListItem = ListView1.Items.Add(SubName)
                End If
            Next

        End If

    End Sub

XPath information for the tables:

  • The 1st table is located in the second div. The rows of interest are 4 to 10, and it has 1 column:

/html/body/table/tr/td[2]/table[6]/tbody/tr[1]/td[2]/div[2]/table/tr[4]/td/nobr ---> PU3055 (Target Text)

  • The 2nd table is located in the third div. The rows of interest are also 4 to 10, but it has 14 columns:

/html/body/table/tr/td[2]/table[6]/tr[1]/td[2]/div[3]/table/tbody/tr[4]/td[2] ---> D1(M) (Target Text)

How can I add the data from both tables to the same ListView, keeping the same row order as shown on the web page?

I am getting this result (image: parsing result from the code)

instead of the target result (image: target result from the web page).


Solution

  • The simplest way to do it would be two For Each loops.

    After you add the models (items), loop over them again, or something similar, and add the columns (subitems).

    It takes about 15 lines in total for everything.

    Alternatively, you could create two collections (A and B), then read one element at a time from each (A1, B1, A2, B2, etc.).
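The second idea (two collections read in step) can be sketched roughly as follows. This is not the code from the original screenshots, only a minimal illustration under a few assumptions: the HtmlAgilityPack package and a reference to System.Windows.Forms are available, `BuildItems` and `Clean` are hypothetical helper names, and the inline HTML in `Main` is a tiny stand-in for the real page (the cell value "x" is a placeholder, not real data).

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Windows.Forms
Imports HtmlAgilityPack

Module PairTablesSketch

    ' Hypothetical helper: decode entities such as &nbsp; and trim surrounding whitespace.
    Function Clean(text As String) As String
        Return HtmlEntity.DeEntitize(text).Trim()
    End Function

    ' Hypothetical helper: pairs row i of the model table (A) with row i of the
    ' data table (B). A's text becomes the item, B's cells become its subitems.
    Function BuildItems(modelRows As HtmlNodeCollection,
                        dataRows As HtmlNodeCollection) As List(Of ListViewItem)
        Dim items As New List(Of ListViewItem)
        ' SelectNodes returns Nothing when nothing matched.
        If modelRows Is Nothing OrElse dataRows Is Nothing Then Return items

        Dim rowCount As Integer = Math.Min(modelRows.Count, dataRows.Count)
        For i As Integer = 0 To rowCount - 1
            Dim item As New ListViewItem(Clean(modelRows(i).InnerText))
            Dim cells As HtmlNodeCollection = dataRows(i).SelectNodes("./td")
            If cells IsNot Nothing Then
                For Each cell As HtmlNode In cells
                    item.SubItems.Add(Clean(cell.InnerText))
                Next
            End If
            items.Add(item)
        Next
        Return items
    End Function

    Sub Main()
        ' Stand-in markup (assumption): one row per table instead of the real page.
        Dim doc As New HtmlDocument()
        doc.LoadHtml("<div id='a'><table><tr><td><nobr>PU3055</nobr></td></tr></table></div>" &
                     "<div id='b'><table><tr><td>D1(M)</td><td>x</td></tr></table></div>")

        Dim items As List(Of ListViewItem) =
            BuildItems(doc.DocumentNode.SelectNodes("//div[@id='a']/table/tr"),
                       doc.DocumentNode.SelectNodes("//div[@id='b']/table/tr"))

        For Each item As ListViewItem In items
            ' SubItems(0) is the item itself, so the data cells start at index 1.
            Console.WriteLine(item.Text & ": " & (item.SubItems.Count - 1).ToString() & " data cells")
        Next
    End Sub

End Module
```

In the form from the question, the two row collections would presumably come from two separate queries rather than one union, for example /html/body/table/tr/td[2]/table[6]/tr[1]/td[2]/div[2]/table/tr[position() >= 4 and position() <= 10] for the models and the same path with div[3] for the data, followed by ListView1.Items.AddRange(BuildItems(modelRows, dataRows).ToArray()). The subitems only show up when the ListView is in Details view and has enough columns defined.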