Tags: wikipedia, wikidata

How to get Contents section data from some Wikipedia page?


I am looking for a dump file (ideally) or an API call that returns the Contents section (the table of contents) of a Wikipedia page, e.g. the Fitbit page.


Any help is really appreciated. Thanks!


Solution

  • You can do this with the MediaWiki API by parsing the page and requesting only its sections. For your example, the query is:

    https://en.wikipedia.org/w/api.php?action=parse&prop=sections&page=Fitbit
    

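    The response lists every section of the page with its name (the "line" field) and heading level (the "level" field). Add &format=json to the query to get plain JSON back instead of the HTML-formatted default: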

    {
        "parse": {
            "sections": [
                {
                    "index": "1",
                    "line": "Products",
                    "level": "2",
                    ...
                },
                {
                    "index": "2",
                    "line": "Fitbit Tracker",
                    "level": "3",
                    ...
                },
                {
                    "index": "3",
                    "line": "Fitbit Ultra",
                    "level": "3",
                    ...
                },
                ...
            ]
        }
    }
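
If you want to rebuild the Contents box from that response in code, here is a minimal Python sketch using only the standard library. The function names (build_sections_url, format_toc, fetch_toc) and the User-Agent string are my own placeholders, not part of the MediaWiki API, and the indentation rule assumes top-level sections come back with level "2", as in the sample above.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_sections_url(page):
    """Build the action=parse URL that asks only for the section list."""
    params = {
        "action": "parse",
        "prop": "sections",
        "page": page,
        "format": "json",  # plain JSON rather than the HTML-formatted default
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def format_toc(response):
    """Turn a decoded API response into indented table-of-contents lines."""
    lines = []
    for section in response["parse"]["sections"]:
        # "level" arrives as a string; assume "2" means a top-level section.
        depth = max(0, int(section["level"]) - 2)
        lines.append("  " * depth + section["line"])
    return lines


def fetch_toc(page):
    """Fetch the section list over the network and format it."""
    request = urllib.request.Request(
        build_sections_url(page),
        # Placeholder User-Agent; replace it with one that identifies your tool.
        headers={"User-Agent": "toc-example/0.1 (example script)"},
    )
    with urllib.request.urlopen(request) as reply:
        return format_toc(json.load(reply))


# Offline check with the three entries shown in the sample response.
sample = {
    "parse": {
        "sections": [
            {"index": "1", "line": "Products", "level": "2"},
            {"index": "2", "line": "Fitbit Tracker", "level": "3"},
            {"index": "3", "line": "Fitbit Ultra", "level": "3"},
        ]
    }
}
toc_lines = format_toc(sample)
```

Calling fetch_toc("Fitbit") performs the live request; the result depends on the current state of the article, so it will not necessarily match the sample entries above.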