
Web scraping tables with shared class names in Flutter


I come from frontend development and am new to Flutter. I want to scrape the tables from this website: https://ticket.ady.az/hereket-cedveli

The tables sit inside tabs and share class names. Two different data tables have the same class name (bxs), so I have to select them using both classes (.bxs and .from_s).

In general, I want to get all the tables on this website (they are visible in the browser inspector), which show the train schedule.

I installed the http and html packages and tried html/parser, but it did not work at all. How can I get all of the tables as separate, structured arrays (or Lists)?

My code

  var url = Uri.parse("https://ticket.ady.az/hereket-cedveli");
  List<BXS> bxsData = [];
  int _isSelected = 41;

  Future<void> getData() async {
    var res = await http.get(url);
    final document = parser.parse(res.body);

    // Rows of the second element that carries the "bxs" class.
    final rows = document
        .getElementsByClassName("bxs")[1]
        .getElementsByTagName("tbody")[0]
        .getElementsByTagName("tr");

    // Add every row inside a single setState call.
    // (forEach returns void, so its result cannot be assigned to a variable.)
    setState(() {
      for (final element in rows) {
        bxsData.add(BXS(
            number: element.children[1].text.toString(),
            Baku: element.children[2].text.toString(),
            Bilajari: element.children[3].text.toString(),
            Khirdalan: element.children[4].text.toString(),
            Sumgayit: element.children[5].text.toString()));
      }
    });
  }

Solution

  • Something like this?

    import 'package:wnetworking/wnetworking.dart';
    import 'package:html/parser.dart' as parser;
    
    class HereketCedveli {
      final _url = 'https://ticket.ady.az/hereket-cedveli';
      /* ---------------------------------------------------------------------------- */
      HereketCedveli();
      /* ---------------------------------------------------------------------------- */
      Future<void> getWebContent() async {
        var result = await HttpReqService.get<String>(_url, jsonResponse: false);
        
        if (result == null) return;
        
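        var _document = parser.parse(result);
        try {
          // Flatten the first child of the matched element to plain text.
          var content = _document.body?.getElementsByClassName('bps from_b tablo__table')[0]
            .children[0]
            .text
            .trim();
          // Rows are separated by long runs of whitespace (10 or more characters).
          var items = content?.split(RegExp(r'\s{10,}'));
          // The first chunk is the header row; split it before each capitalized name.
          var headers = items?.removeAt(0).split(RegExp(r'\s(?=[A-Z])'));
          // Cells within a row are separated by 2 or 3 whitespace characters.
          var data = items?.map((e) => e.split(RegExp(r'\s{2,3}'))).toList();
          
          print(headers);
          data?.forEach(print);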
        } catch (e) {
          print('ERROR: $e');
        }
      }
    }
    
    void main(List<String> args) async {
      await HereketCedveli().getWebContent();
      print('\nJob done!');
    }
    

    Output:

    [Qatarın statusu və nömrəsi, Bakı, Keşlə, Koroğlu, Bakıxanov, Sabunçu, Zabrat 1, Zabrat 2, Məmmədli, Pirşağı, Görədil, Novxanı, Sumqayıt]
    [№ 6601, 07:00, 07:05, 07:09, 07:12, 07:15, 07:20, 07:23, 07:26, 07:34, 07:39, 07:43, 07:50]
    [№ 6603, 08:35, 08:40, 08:44, 08:47, 08:50, 08:55, 08:58, 09:01, 09:09, 09:14, 09:18, 09:25]
    [№ 6605, 09:00, 09:05, 09:09, 09:12, 09:15, 09:20, 09:23, 09:26, 09:34, 09:39, 09:43, 09:50]
    [№ 6607, 09:30, 09:35, 09:39, 09:42, 09:45, 09:50, 09:53, 09:56, 10:04, 10:09, 10:13, 10:20]
    [№ 6609, 10:00, 10:05, 10:09, 10:12, 10:15, 10:20, 10:23, 10:26, 10:34, 10:39, 10:43, 10:50]
    [№ 6611, 13:30, 13:35, 13:39, 13:42, 13:45, 13:50, 13:53, 13:56, 14:04, 14:09, 14:13, 14:20]
    [№ 6613, 14:50, 14:55, 14:59, 15:02, 15:05, 15:10, 15:13, 15:16, 15:24, 15:29, 15:33, 15:40]
    [№ 6615, 15:50, 15:55, 15:59, 16:02, 16:05, 16:10, 16:13, 16:16, 16:24, 16:29, 16:33, 16:40]
    [№ 6617, 16:55, 17:00, 17:04, 17:07, 17:10, 17:15, 17:18, 17:21, 17:29, 17:34, 17:38, 17:45]
    [№ 6619, 17:50, 17:55, 17:59, 18:02, 18:05, 18:10, 18:13, 18:16, 18:24, 18:29, 18:33, 18:40]
    [№ 6621, 18:25, 18:30, 18:34, 18:37, 18:40, 18:45, 18:48, 18:51, 18:59, 19:04, 19:08, 19:15]
    [№ 6623, 18:40, 18:45, 18:49, 18:52, 18:55, 19:00, 19:03, 19:06, 19:14, 19:19, 19:23, 19:30]
    [№ 6625, 19:00, 19:05, 19:09, 19:12, 19:15, 19:20, 19:23, 19:26, 19:34, 19:39, 19:43, 19:50]
    [№ 6627, 19:55, 20:00, 20:04, 20:07, 20:10, 20:15, 20:18, 20:21, 20:29, 20:34, 20:38, 20:45]
    [№ 6629, 20:30, 20:35, 20:39, 20:42, 20:45, 20:50, 20:53, 20:56, 21:04, 21:09, 21:13, 21:20]
    [№ 6631, 21:30, 21:35, 21:39, 21:42, 21:45, 21:50, 21:53, 21:56, 22:04, 22:09, 22:13, 22:20]
    
    Job done!
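
Since the question asks for every table as its own List, here is a minimal sketch of an alternative that walks the DOM with CSS selectors instead of splitting the flattened text with regular expressions. It only needs the html package. The names `extractTables` and `sampleHtml` are made up for illustration, and the sample markup is an assumption about the page structure (in particular the `bxs from_s tablo__table` class combination and the placeholder row in the second table are hypothetical), so adjust the selectors to whatever the inspector shows.

```dart
import 'package:html/parser.dart' as parser;

// Hypothetical stand-in for the downloaded page body. The first table reuses
// two rows from the output above; the second table is placeholder data only.
const sampleHtml = '''
<table class="bps from_b tablo__table">
  <thead><tr><th>Qatarın statusu və nömrəsi</th><th>Bakı</th><th>Sumqayıt</th></tr></thead>
  <tbody>
    <tr><td>№ 6601</td><td>07:00</td><td>07:50</td></tr>
    <tr><td>№ 6603</td><td>08:35</td><td>09:25</td></tr>
  </tbody>
</table>
<table class="bxs from_s tablo__table">
  <tbody>
    <tr><td>№ 0000</td><td>00:00</td><td>00:50</td></tr>
  </tbody>
</table>
''';

/// Returns one entry per element matching [selector]; each entry is a list
/// of rows, and each row is the list of trimmed cell texts (th and td).
List<List<List<String>>> extractTables(String html, String selector) {
  final document = parser.parse(html);
  return document.querySelectorAll(selector).map((table) {
    return table
        .querySelectorAll('tr')
        .map((row) => row.children
            .where((cell) => cell.localName == 'td' || cell.localName == 'th')
            .map((cell) => cell.text.trim())
            .toList())
        .where((cells) => cells.isNotEmpty) // skip rows without cells
        .toList();
  }).toList();
}

void main() {
  // Every table on the page, each as its own list of rows.
  final all = extractTables(sampleHtml, 'table');
  for (final table in all) {
    table.forEach(print);
    print('---');
  }

  // Only the tables that carry both classes (assumed class names).
  final fromS = extractTables(sampleHtml, 'table.bxs.from_s');
  print(fromS.length);
}
```

In the app, the `html` argument would be the body of the HTTP response (for example `res.body` from `http.get`), and the resulting lists could then be mapped to a model class such as `BXS` before calling `setState`. Note that if the site fills the tabs with JavaScript after the page loads, the downloaded HTML will not contain those tables and no selector will find them.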