Comparing List to String in Neo4j, reading from HTML

Here's my problem. Suppose I have an HTML file containing a table like the one below:

<table>
    <tr>
        <td> keyword1 </td>
        <td>
            <p> paragraph 1 </p>
        </td>
    </tr>
    <tr>
        <td> keyword2 </td>
        <td>
            <p> paragraph 2 </p>
            <p> paragraph 3 </p>
        </td>
    </tr>
    <tr>
        <td> keyword3 </td>
        <td>
            <p> paragraph 1 </p>
            <p> paragraph 3 </p>
        </td>
    </tr>
</table>

I use the following call to extract the information from the HTML:

CALL apoc.load.html("file:///input_HTML.html",{kwords:"table tr td:eq(1)",
paragraphs:"table tr td:eq(2)",paragraphsList:"table tr td:eq(2) p"}) YIELD value

What I would like to end up with, for each row of the table, is something like the statement below, but generated dynamically while reading the HTML file:

CREATE (:kwords {name:"keyword1"})-[:APPEARS_IN]->(:paragraph {name:"paragraph1"})

The tricky part is getting the paragraph names. Any hints?

Solution

The :eq() index is zero-based, so the keywords are in td:eq(0) and the paragraphs are in td:eq(1), not td:eq(2).

...
paragraphs:"table tr td:eq(1)",paragraphsList:"table tr td:eq(1) p"
...

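But I am not sure that alone lets you do what you want: the paragraphsList selector returns one flat list of p elements, with no record of which row (and so which keyword) each one came from.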

One approach is to get the keywords in a first pass, then select the paragraphs for each keyword in a second pass:

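// First pass: collect the keyword cell (first td) of every row.
CALL apoc.load.html("file:///input_HTML.html",{kwords: "tr td:eq(0)"}) YIELD value
UNWIND value.kwords AS kw
WITH kw.text AS kw
// Second pass: for each keyword, select the p elements in the second td of the row that contains it.
CALL apoc.load.html("file:///input_HTML.html",{paras: 'tr:contains(' + kw + ') td:eq(1) p'}) YIELD value
UNWIND value.paras AS para
// MERGE rather than CREATE, so a paragraph shared by several keywords becomes a single node.
MERGE (k:kwords {name: kw})
MERGE (p:paragraph {name: para.text})
MERGE (k)-[:APPEARS_IN]->(p)
RETURN *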
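
To check the result, a query along these lines should list each keyword with the paragraphs it is linked to. This is only a sketch: it assumes the kwords and paragraph labels and the APPEARS_IN relationship created by the query above, and the column aliases (keyword, paragraphs) are just illustrative names.

```cypher
// List every keyword together with the names of the paragraphs it appears in.
MATCH (k:kwords)-[:APPEARS_IN]->(p:paragraph)
RETURN k.name AS keyword, collect(p.name) AS paragraphs
ORDER BY keyword
```

For the sample table you would expect one row per keyword, with the same paragraph node shared between keywords where the paragraph text is identical.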