
Generate a Markdown table of contents from a LaTeX .toc file


I have a LaTeX-generated .toc file containing the table of contents of a large document. I would like to convert that TOC into a (GitHub-flavored) Markdown list, e.g. with pandoc.

For example, I have

\contentsline {chapter}{\numberline {1}Introduction}{1}{chapter.1}
\contentsline {section}{\numberline {1.1}Aim and Motivation}{1}{section.1.1}
\contentsline {section}{\numberline {1.2}State of the art}{1}{section.1.2}
\contentsline {section}{\numberline {1.3}Outline}{1}{section.1.3}
\contentsline {chapter}{\numberline {2}Fundamentals}{2}{chapter.2}
...

in my .toc file, and I would like to get something like this:

1. Introduction
  1.1. Aim and Motivation
  1.2. State of the art
  1.3. Outline
2. Fundamentals

Another alternative would be to extract this information (without the content) out of the tex-file directly. However, I could not get this working and I also think it would be more error-prone.

Any suggestions?


Solution
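
For the .toc file itself, the conversion asked for in the question can be sketched with a short Python script. This is only a minimal sketch, not a pandoc feature: it assumes every entry has the `\contentsline {level}{\numberline {number}Title}{page}{anchor}` shape shown in the question, and the names `ENTRY` and `toc_to_markdown` are made up for illustration. Titles that contain braces (for example `\emph{...}`) and unnumbered entries are not handled.

```python
import re

# Matches entries shaped like the sample in the question, e.g.
# \contentsline {section}{\numberline {1.1}Aim and Motivation}{1}{section.1.1}
# Assumption: the title itself contains no braces.
ENTRY = re.compile(
    r"\\contentsline\s*\{\w+\}"
    r"\{\\numberline\s*\{(?P<number>[^}]*)\}(?P<title>[^}]*)\}"
)


def toc_to_markdown(toc_text, indent="  "):
    """Turn the text of a .toc file into an indented, numbered list."""
    lines = []
    for match in ENTRY.finditer(toc_text):
        number = match["number"]
        # Nesting depth is taken from the number: "1" -> 0, "1.1" -> 1, ...
        depth = number.count(".")
        title = match["title"].strip()
        lines.append(f"{indent * depth}{number}. {title}")
    return "\n".join(lines)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file name): python toc2md.py thesis.toc
    with open(sys.argv[1], encoding="utf-8") as handle:
        print(toc_to_markdown(handle.read()))
```

The depth is derived from the dots in the section number rather than from the level name, so the script does not need a table of chapter/section/subsection names. Note that Markdown renderers do not treat `1.1.` as a list marker, so for a list that actually nests on GitHub the number may need to be replaced by a plain `-` or `1.` marker.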

  • "Another alternative would be to extract this information out of the tex-file directly."

    Pandoc can do that:

    $ pandoc -s --toc input.tex -o output.md
    

    To exclude the document body content, you'll have to use a custom pandoc markdown template:

    $ pandoc -D markdown > mytemplate.md
    

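    Modify mytemplate.md to keep the table-of-contents variable ($toc$ in older pandoc versions; newer default templates use $table-of-contents$) and remove $body$, then pass the template to pandoc with --template mytemplate.md ...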

    If you want to customize it further, I would recommend outputting to HTML (pandoc -t html) instead of Markdown, and then writing a small script that traverses the HTML DOM and does your numbering etc.
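
The HTML route can be sketched with Python's standard-library `html.parser`. This is a rough sketch under assumptions that should be checked against your own output: it assumes the file was produced with something like `pandoc -s --toc input.tex -o output.html`, that pandoc wrapped the TOC in `<nav id="TOC">` with nested `<ul>`/`<li>`/`<a>` elements, and that `--number-sections` was not used (otherwise the numbers pandoc inserts would end up in the titles). The names `TocParser` and `html_toc_to_markdown` are made up for illustration.

```python
from html.parser import HTMLParser


class TocParser(HTMLParser):
    """Collect numbered TOC lines from a <nav id="TOC"> element."""

    def __init__(self):
        super().__init__()
        self.in_toc = False   # True while inside <nav id="TOC">
        self.counters = []    # one running counter per open <ul>
        self.in_link = False  # True while inside an <a> of the TOC
        self.title = []       # text fragments of the current link
        self.lines = []       # finished output lines

    def handle_starttag(self, tag, attrs):
        if tag == "nav" and dict(attrs).get("id") == "TOC":
            self.in_toc = True
        elif not self.in_toc:
            return
        elif tag == "ul":
            self.counters.append(0)   # entering a deeper level
        elif tag == "li":
            self.counters[-1] += 1    # next entry on this level
        elif tag == "a":
            self.in_link = True
            self.title = []

    def handle_endtag(self, tag):
        if not self.in_toc:
            return
        if tag == "nav":
            self.in_toc = False
        elif tag == "ul":
            self.counters.pop()       # leaving this level
        elif tag == "a":
            number = ".".join(str(n) for n in self.counters)
            indent = "  " * (len(self.counters) - 1)
            text = "".join(self.title).strip()
            self.lines.append(f"{indent}{number}. {text}")
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.title.append(data)


def html_toc_to_markdown(html):
    parser = TocParser()
    parser.feed(html)
    return "\n".join(parser.lines)
```

Calling `html_toc_to_markdown(open("output.html", encoding="utf-8").read())` would then return the numbered list as a string; the numbering is rebuilt from the nesting of the `<ul>` elements, so it can be adapted freely.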