We ran into an issue after installing SP2.1 on CQ5.5 that affects reference updates for all pages under a page that has been renamed using the "websites" console of CQ5. The issue is described here:
The hotfix fixes future page renames and updates the references in all other pages, whether the links are authored directly as HTML or through input widgets such as "pathfields".
However, we discovered this bug fairly late, and a lot of page renaming had already been done by then. This left broken links on existing pages where we use the pathfield component in dialog boxes for authors to refer to other pages. I would like to write some custom code using the LinkChecker API under the com.day.cq.rewriter.linkchecker package. I am not able to find any sample code that CQ5 actually uses to perform the "reference updates" on page renames to serve as a starting point.
I need input based on your experience: is the LinkChecker API the best way forward, or is there some other API for checking all the authored links and generating a report on which links / pathfields are broken?
Help appreciated.
I have checked: 1. the external link checker tool, which does report broken links, but only if the link points to an external domain, so it is not useful in our case.
LinkChecker is a Sling rewriter. Rewriters are strictly tied to the request: they operate on the HTML generated by CQ before it is returned to the client. If I understand correctly, you want to look for broken internal links across the whole site, so the LinkChecker won't be very useful here.
Consider using the Groovy console to crawl over /content/your_site, looking for strings starting with /content. Then use resourceResolver to check whether each path found exists. A sample script implementing this algorithm can be found here.
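For readers who cannot reach the linked script, here is a minimal Java sketch of the same idea. It is not the script referred to above. The class name BrokenLinkReport, the method findBrokenLinks, and the path pattern are all hypothetical. The repository is stood in for by a plain map of "property path → string value" and a predicate that answers "does this path exist?", so the logic can be run outside CQ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: reports /content paths in property values that do not resolve. */
final class BrokenLinkReport {

    // Assumed path shape: "/content/" followed by word characters, '-', '.', or '/',
    // ending on a word character or '-' so trailing punctuation is not captured.
    private static final Pattern CONTENT_PATH =
            Pattern.compile("/content/[\\w\\-./]*[\\w\\-]");

    /** One finding: the property that holds the link and the target that is missing. */
    static final class BrokenLink {
        final String propertyPath;
        final String target;

        BrokenLink(String propertyPath, String target) {
            this.propertyPath = propertyPath;
            this.target = target;
        }

        @Override
        public String toString() {
            return propertyPath + " -> " + target;
        }
    }

    /**
     * Scans every property value for /content paths and keeps those for which
     * the supplied existence check returns false.
     */
    static List<BrokenLink> findBrokenLinks(Map<String, String> properties,
                                            Predicate<String> exists) {
        List<BrokenLink> broken = new ArrayList<>();
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            Matcher matcher = CONTENT_PATH.matcher(entry.getValue());
            while (matcher.find()) {
                String target = stripHtmlExtension(matcher.group());
                if (!exists.test(target)) {
                    broken.add(new BrokenLink(entry.getKey(), target));
                }
            }
        }
        return broken;
    }

    // Links authored in rich text usually carry ".html"; pathfield values usually do not.
    private static String stripHtmlExtension(String path) {
        String suffix = ".html";
        return path.endsWith(suffix)
                ? path.substring(0, path.length() - suffix.length())
                : path;
    }
}
```

On a real instance the predicate would presumably be something like path -> resourceResolver.getResource(path) != null, and the map would be filled by walking the nodes under /content/your_site and collecting their string properties. The sketch ignores selectors, anchors and query strings, so paths such as /content/a/b.print.html would need extra handling.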