I'm in the process of writing a scraper for the articles on the site https://www.welt.de. I'd also like to include the comments. However, when loading the page, not all comments are loaded automatically. Instead one has to click on a link to load more comments, until at some point, all are loaded.
When you scroll down, there appears a surface "MEHR KOMMENTARE ANZEIGEN" (German for 'show more comments').
This link looks like:
<div href="#" style="text-align: center; height: 44px; cursor: pointer;">
<a style="font-size: 0.6875rem; font-family: ffmark, "Helvetica Neue", Helvetica, Arial, sans-serif; font-weight: 800; color: rgb(0, 57, 91); line-height: 5;">
<span style="font-size: 0.6875rem; font-family: ffmark, "Helvetica Neue", Helvetica, Arial, sans-serif; font-weight: 500; margin-right: 0.625rem; text-align: right; color: rgb(120, 120, 120);">
MEHR KOMMENTARE ANZEIGEN
<span style="width: 14px; height: 8px; margin: 0px 0px 0px 0.625rem; padding-top: 0px; display: inline-block; vertical-align: initial;">
<svg viewBox="0 0 15 9" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g stroke="none" stroke-width="1" fill="none" fill-rule="evenodd">
<g transform="translate(-608.000000, -4318.000000)" fill="#787878">
<polygon transform="translate(615.205882, 4322.852941) rotate(-90.000000) translate(-615.205882, -4322.852941) " points="618.264706 4315.79412 611.205882 4322.85353 618.264706 4329.91176 619.205882 4328.97059 613.088824 4322.85353 619.205882 4316.73529">
</polygon>
</g>
</g>
</svg>
</span>
</span>
</a>
</div>
However, I do not know how to load this link in a script?
I understand that href="#"
is used when a link is handled by javascript and that it is bad style, as it is only used to change the appearance of the mouse, for which there are other methods.
But where is the onClick() method? Kinda dumbfoundead here...
Clicking that show comments twice gives me the following urls
https://api-co.la.welt.de/api/comments?document-id=183878020&created-cursor=2018-11-15T13:52:41.714&sort=NEWEST
https://api-co.la.welt.de/api/comments?document-id=183878020&created-cursor=2018-11-15T12:23:26.896&sort=NEWEST
Which returns the comments. So just use the post id you have and keep fiddling with created-cursor until you get all the comments?
EDIT: Removing the creator-cursor parameter should give you all the comments
https://api-co.la.welt.de/api/comments?document-id=183878020
EDIT 2:
As someone else mentioned, this might not be a good idea without first contacting the owner of the site.