A Docusaurus documentation website, https://slovakia-atmo-plan.marvintest.vito.be/docs/, is rendered in docs-only mode.
The Algolia DocSearch scraper is not scraping root-level pages; instead it logs "Ignored: from start url". This issue only seems to arise when the Docusaurus build is nested under {baseUrl}/docs.
Why is the start URL being ignored? This is my DocSearch config:
{
  "index_name": "atmoplan-documentation",
  "start_urls": ["https://slovakia-atmo-plan.marvintest.vito.be/docs"],
  "sitemap_urls": ["https://slovakia-atmo-plan.marvintest.vito.be/docs/sitemap.xml"],
  "sitemap_alternate_links": true,
  "stop_urls": ["/tests"],
  "selectors": {
    "lvl0": {
      "selector": "(//ul[contains(@class,'menu__list')]//a[contains(@class, 'menu__link menu__link--sublist menu__link--active')]/text() | //nav[contains(@class, 'navbar')]//a[contains(@class, 'navbar__link--active')]/text())[last()]",
      "type": "xpath",
      "global": true,
      "default_value": "Documentation"
    },
    "lvl1": "header h1",
    "lvl2": "article h2",
    "lvl3": "article h3",
    "lvl4": "article h4",
    "lvl5": "article h5, article td:first-child",
    "lvl6": "article h6",
    "text": "article p, article li, article td:last-child"
  },
  "strip_chars": " .,;:#",
  "custom_settings": {
    "separatorsToIndex": "_",
    "attributesForFaceting": ["language", "version", "type", "docusaurus_tag"],
    "attributesToRetrieve": ["hierarchy", "content", "anchor", "url", "url_without_anchor", "type"]
  },
  "conversation_id": ["833762294"],
  "nb_hits": 46250
}
Inside your docusaurus.config.js, you should set the url parameter to the actual address where you will be hosting your docs. Something like:
module.exports = {
  url: 'https://slovakia-atmo-plan.marvintest.vito.be/docs',
  // ... the rest of your config
};
Docusaurus uses this value to generate the sitemap.xml, which Algolia then uses to locate your pages.
REFERENCE: https://docusaurus.io/docs/docusaurus.config.js/#url
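As a variant of the snippet above: the Docusaurus configuration reference describes url as the site's top-level address and baseUrl as the path that follows the host, so a site served under /docs may need both fields set. A minimal sketch, assuming the site is served from /docs/ on that host (the baseUrl value is inferred from the question, not confirmed by the asker):

```javascript
// docusaurus.config.js (sketch only, other fields omitted)
module.exports = {
  // Host part: where the site is actually deployed.
  url: 'https://slovakia-atmo-plan.marvintest.vito.be',
  // Path part: assumed to be /docs/ based on the URLs in the question.
  baseUrl: '/docs/',
  // ... the rest of your config
};
```

After rebuilding, it is worth opening the generated sitemap.xml and checking that its links start with the same address as start_urls in the DocSearch config.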
I noticed something strange in your sitemap.xml. For example, the first link was https://www.vito.be/docs/markdown-page, but the start URL defined for Algolia is https://slovakia-atmo-plan.marvintest.vito.be/docs, so the hosts do not match.
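To spot this kind of mismatch quickly, a small diagnostic script can compare sitemap links against the configured start URL. This is only a sketch of the check described above, not how the DocSearch scraper itself decides what to ignore; findUrlsOutsideStart is a hypothetical helper, and the second sitemap entry is an invented example page.

```javascript
// Diagnostic sketch: list sitemap URLs that do not fall under any start URL.
// Inputs are hard-coded here; in practice they would come from the parsed
// sitemap.xml and the DocSearch config.
function findUrlsOutsideStart(sitemapUrls, startUrls) {
  const starts = startUrls.map((s) => new URL(s));
  return sitemapUrls.filter((raw) => {
    const u = new URL(raw);
    // A URL is "inside" if it shares the origin and path prefix of a start URL.
    return !starts.some(
      (s) => u.origin === s.origin && u.pathname.startsWith(s.pathname)
    );
  });
}

const startUrls = ['https://slovakia-atmo-plan.marvintest.vito.be/docs'];
const sitemapUrls = [
  'https://www.vito.be/docs/markdown-page', // link reported in the sitemap
  'https://slovakia-atmo-plan.marvintest.vito.be/docs/intro', // hypothetical page
];

const mismatched = findUrlsOutsideStart(sitemapUrls, startUrls);
console.log(mismatched);
```

Any URL this prints points at a different host or path than the one the scraper was told to start from, which suggests the url setting in docusaurus.config.js should be checked.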