I'm using wkhtmltopdf to create some documents that require a TOC; this is done using Hx tags and it works properly (including a modified XSL file). The input HTML is generated by my own code, so I have full control over it.
Now I need to exclude some, but not all, entries at a certain level, as in the sample below:
<h1>First</h1>
<h2>First of first</h2> <- exclude
<h2>Second of first</h2> <- exclude
<h1>Second</h1>
<h2>First of second</h2>
<h2>Second of second</h2>
The documentation explains how to customize the XSL, so I generated the outline for the document and looked at the XML file.
As described in the manual, it contains item elements with four attributes: title, page, link and backLink.
<outline xmlns="http://wkhtmltopdf.org/outline">
<item title="First" page="0" link="__WKANCHOR_0" backLink="__WKANCHOR_1">
<item title="First of first" page="1" link="__WKANCHOR_2" backLink="__WKANCHOR_3"/>
<item title="Second of first" page="1" link="__WKANCHOR_4" backLink="__WKANCHOR_5">
... and so on
I guess there could be two ways to get the desired result:
I could not find a way to achieve the first option, and the outline file does not contain enough information to use in the XSL file.
A couple of notes:
I know I can exclude by title attribute, but the documents may have a lot of such entries, so excluding by title isn't really an option.
Another reason it can't be done by title is that I may need to exclude an entry at one location but include one with the same title elsewhere.
Obviously I cannot exclude by page number, since I cannot possibly know in advance where the pages will break.
It would be nice to keep those entries in the document outline but not in the TOC, so the second way would probably be the proper one.
... so any help is appreciated. Thanks.
After a bit of struggling, here's a solution.
Since I cannot alter the outline file or the way it is created without tapping into wkhtmltopdf's source files, I tried to find a way to tell apart the TOC entries I want from those I don't want, based on the only available properties; I chose the title since it's the easiest to control.
Enter the "zero-width non-joiner", U+200C, which can be written as &zwnj; or &#8204; in HTML.
I added this invisible character as the first character of the heading text for the TOC entries I don't want, then used the following test within the TOC XSL file.
<xsl:if test="not(starts-with(@title, '&#8204;'))">
This simple trick seems to do the job properly.
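
For context, here is a rough sketch of where that test could sit in the item template. It assumes the custom stylesheet started from the one printed by `wkhtmltopdf --dump-default-toc-xsl`, so the `outline` prefix is already bound to `http://wkhtmltopdf.org/outline` on the enclosing `xsl:stylesheet` element; the `li`/`a`/`span` markup is illustrative rather than a copy of the default template. It also assumes headings to hide are emitted as, for example, `<h2>&zwnj;First of first</h2>`.

```xml
<!-- Fragment: goes inside the xsl:stylesheet element of the TOC stylesheet. -->
<xsl:template match="outline:item">
  <!-- Render the entry only if its title does NOT start with U+200C.
       The numeric reference is used because &zwnj; is not predefined in XML. -->
  <xsl:if test="not(starts-with(@title, '&#8204;'))">
    <li>
      <a>
        <xsl:attribute name="href"><xsl:value-of select="@link"/></xsl:attribute>
        <xsl:attribute name="name"><xsl:value-of select="@backLink"/></xsl:attribute>
        <xsl:value-of select="@title"/>
      </a>
      <span><xsl:value-of select="@page"/></span>
      <ul>
        <!-- The comment keeps the ul from collapsing to a self-closing tag. -->
        <xsl:comment>children</xsl:comment>
        <xsl:apply-templates select="outline:item"/>
      </ul>
    </li>
  </xsl:if>
</xsl:template>
```

Because the `xsl:if` wraps the whole `li`, any child entries of a marked heading are dropped from the TOC as well; if the children should still appear, the nested `xsl:apply-templates` would have to move outside the test. The PDF outline is not affected by this stylesheet, so marked headings stay in the bookmarks.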