
Scrape data from the web and join fields in C#


I'm trying to make a simple TV channel guide for a school project using C#. I wrote this while following a YouTube tutorial:

        List<string> programasSPTV1 = new List<string>();
        List<string> horasSPTV1 = new List<string>();
        WebClient web = new WebClient();
        String html = web.DownloadString("http://www.tv.sapo.pt/programacao/detalhe/sport-tv-1");
        MatchCollection m1 = Regex.Matches(html, "<a href=\"#\" class=\"pinfo\">\\s*(.+?)\\s*</a>", RegexOptions.Singleline);
        MatchCollection m2 = Regex.Matches(html, "<p>\\s*(.+?)\\s*</p>", RegexOptions.Singleline);

        foreach (Match m in m1)
        {
            string programaSPTV1 = m.Groups[1].Value;
            programasSPTV1.Add(programaSPTV1);
        }
        foreach (Match m in m2)
        {
            string hora_programaSPTV1 = m.Groups[1].Value;
            horasSPTV1.Add(hora_programaSPTV1);
        }

        listBox1.DataSource = programasSPTV1 + horasSPTV1;

The last line is not correct... :(

What I really need is to get the time and program together in the same box...

Something like

17h45 : Benfica-FCPorto

And not 17h45 in a box and Benfica-FCPorto in another... :/

How can I do that?


Solution

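  • Assuming both lists contain the same number of items (Zip stops at the end of the shorter list, so any extra items are silently dropped), the following should give you what you want. The times list goes first so that the result reads "17h45 : Benfica-FCPorto", and Zip needs "using System.Linq;" at the top of the file:

    listBox1.DataSource = horasSPTV1.Zip(programasSPTV1, (hora, programa) => hora + " : " + programa).ToList();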
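
If you want the mismatch case to fail loudly instead of silently dropping items, something like the sketch below might help. The `GuideFormatter` class and its `Combine` method are hypothetical names made up for this example, not part of the original answer or of any library; the only framework call it relies on is `Enumerable.Zip` from System.Linq.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, named for illustration only.
public static class GuideFormatter
{
    // Pairs each time with the program at the same index and
    // formats every pair as "time : program".
    public static List<string> Combine(IList<string> times, IList<string> programs)
    {
        if (times == null) throw new ArgumentNullException("times");
        if (programs == null) throw new ArgumentNullException("programs");

        // Zip stops at the end of the shorter sequence, so check the counts
        // first; a mismatch usually means one of the regexes matched extra
        // or fewer elements than the other.
        if (times.Count != programs.Count)
        {
            throw new ArgumentException(string.Format(
                "Got {0} times but {1} programs.", times.Count, programs.Count));
        }

        return times.Zip(programs, (time, program) => time + " : " + program).ToList();
    }
}
```

In the question's code this would be used as `listBox1.DataSource = GuideFormatter.Combine(horasSPTV1, programasSPTV1);`. Whether throwing is the right reaction is a design choice; for a school project, showing the error is probably more useful than a guide whose times and programs are quietly misaligned.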