If I set StormCrawler's ContentParseFilter to be
"pattern": "//DIV[@id=\"site-body\"]",
does that mean it is the ONLY place StormCrawler will look for links to other pages when processing each URL? I am wondering whether, with that setting, it will start ignoring all the URLs in the menus and such.
Thanks! Jim
See the wiki page on ParseFilters.
The ContentFilter restricts the text of a document to the text covered by an XPath expression.
It does not affect the extraction of links at all; it only aims to improve the text that gets indexed.
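
To illustrate the distinction, here is a minimal sketch that uses only the JDK's javax.xml.xpath on a small made-up page. It is not StormCrawler code: the class name, the sample markup and the helper methods are all hypothetical, and the sample uses uppercase tags only so that it matches the pattern from the question. It shows the text being restricted by the XPath while links are collected from the whole document in a separate step.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class ContentVsLinks {
    // Hypothetical page: one link in a menu, one link inside the content DIV.
    static final String PAGE =
        "<HTML><BODY>"
        + "<DIV id=\"menu\"><A href=\"/about\">About</A></DIV>"
        + "<DIV id=\"site-body\"><P>Main article text. <A href=\"/next\">Next</A></P></DIV>"
        + "</BODY></HTML>";

    // Parse the (well-formed) sample markup into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Text step: keep only the text covered by the XPath pattern.
    static String restrictedText(Document doc, String pattern) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Returns the string value of the first node matched by the pattern.
        return xpath.evaluate(pattern, doc);
    }

    // Link step: independent of the text pattern, looks at every anchor in the page.
    static List<String> allLinks(Document doc) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList hrefs = (NodeList) xpath.evaluate("//A/@href", doc, XPathConstants.NODESET);
        List<String> links = new ArrayList<>();
        for (int i = 0; i < hrefs.getLength(); i++) {
            links.add(hrefs.item(i).getNodeValue());
        }
        return links;
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(PAGE);
        System.out.println(restrictedText(doc, "//DIV[@id=\"site-body\"]")); // Main article text. Next
        System.out.println(allLinks(doc)); // [/about, /next]
    }
}
```

The menu link `/about` still shows up in the link list even though the menu text is excluded from the restricted text, which is the behaviour described above.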