PyPDF2: extract table of contents/outlines and their page numbers

I am trying to extract the TOC/outlines and their page numbers from PDFs using Python (PyPDF2). I am aware of reader.outlines, but it does not return the correct page numbers.

Example PDF: https://www.annualreports.com/HostedData/AnnualReportArchive/l/NASDAQ_LOGM_2018.pdf

The output of reader.outlines is:

[{'/Title': '2018 Highlights', '/Page': IndirectObject(5, 0), '/Type': '/Fit'},
{'/Title': 'Letter to Stockholders', '/Page': IndirectObject(6, 0), '/Type': '/Fit'}, 
...
{'/Title': 'Part I', '/Page': IndirectObject(10, 0), '/Type': '/Fit'}, 
[{'/Title': 'Item 1. Business', '/Page': IndirectObject(10, 0), '/Type': '/Fit'}, 
{'/Title': 'Item 1A. Risk Factors', '/Page': IndirectObject(19, 0), '/Type': '/Fit'}
...

For instance, Part I is not expected to begin on page 10. Am I missing something? Does anyone have an alternative?

I've tried PyMuPDF, Tabula, and the getDestinationPageNumber method, with no luck.

Thank you in advance.

Solution

Martin Thoma's answer is exactly what I needed (PyMuPDF). Diblo Dk's answer is an interesting workaround as well (PyPDF2).

Here is Martin Thoma's code, with the deprecated getToC call updated to get_toc for current PyMuPDF releases:

from typing import Dict

import fitz  # pip install pymupdf


def get_bookmarks(filepath: str) -> Dict[int, str]:
    """Map each bookmarked page number (1-based) to a bookmark title."""
    # WARNING! One page can have multiple bookmarks; only the last title per page is kept.
    bookmarks = {}
    with fitz.open(filepath) as doc:
        toc = doc.get_toc()  # [[lvl, title, page], …]
        for level, title, page in toc:
            bookmarks[page] = title
    return bookmarks


print(get_bookmarks("my.pdf"))
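
Because of the warning in the code above, a page that carries several bookmarks ends up with only one title in the dictionary. Below is a minimal sketch of one way around that, assuming the TOC rows have the shape [level, title, page, ...] that PyMuPDF returns. The function name group_bookmarks and the sample rows are my own illustration (the page numbers are made up, not taken from the example PDF), so treat it as a starting point rather than a definitive implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple


def group_bookmarks(toc: Iterable[Sequence]) -> Dict[int, List[Tuple[int, str]]]:
    """Group TOC rows of the form [level, title, page, ...] by page number."""
    grouped: Dict[int, List[Tuple[int, str]]] = defaultdict(list)
    # "*_" ignores any extra trailing items a row may carry.
    for level, title, page, *_ in toc:
        grouped[page].append((level, title))
    return dict(grouped)


# Hypothetical rows for illustration only; real rows would come from the PDF.
sample_toc = [
    [1, "Part I", 12],
    [2, "Item 1. Business", 12],
    [2, "Item 1A. Risk Factors", 21],
]

for page, entries in sorted(group_bookmarks(sample_toc).items()):
    for level, title in entries:
        # Indent by outline level so the hierarchy stays visible.
        print(f"p.{page}: {'  ' * (level - 1)}{title}")
```

With a real file, the rows from doc.get_toc() inside the with fitz.open(filepath) block could be passed to group_bookmarks instead of sample_toc. Keeping the level next to each title also preserves the nesting that a flat page-to-title dictionary loses.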