html seo search-engine

Site crawling/indexing issues caused by link structure?


I'm doing SEO work for a client with many different sites, none of which I built. One of them in particular, to which I'm linking here, appears to be having trouble getting indexed by search engines. I've tried multiple sitemap generator tools, and they also fail to crawl the site: although it consists of only a few pages and external links, the sitemap tools (and, I suspect, search engines) see only the homepage and nothing else.

In Google Webmaster Tools, I'm seeing a couple of crawl errors (404) relating to home/index.html but nothing else. Also, in Google Analytics, over 80% of the traffic is direct, i.e. not search traffic, which seems alarming. The site has been live for about a month and is being promoted by various sources. Even searching Google for the domain name itself doesn't bring up the homepage in the results (!), let alone searching for related keywords.

My ultimate question is whether there appear to be any glaring issues with the code that might prevent proper indexing. I notice that the developer chose to structure the navigation around directories, i.e. linking to "home/index.html", "team/index.html", "about/index.html" and so on, when it seems better to name the HTML files themselves, i.e. "team.html" and "about.html". Could this be part of the problem?

Thanks for any insight here.


Solution

  • You have two major issues here.

    The first issue is that the root http://www.raisetheriver.org/ has a meta refresh that redirects the page to http://www.raisetheriver.org/home/index.html

    Google recommends against using meta refresh; a 301 redirect should be used if you want to redirect a page. However, I recommend against redirecting the root home page to another page at all, as a website's home page is expected to be at the root.

    The second issue is that every page on the site is blocked from being indexed by Google, because each page's source contains the following tag: <meta name="robots" content="noindex">, which instructs search engines not to index the page.

    Correct these issues and Google will be able to index the site, and sitemap generators will be able to crawl it.
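
If you want to check the other pages (or the client's other sites) for the same two problems, something like the following rough sketch might help. It uses only Python's standard `html.parser`. The `MetaAudit` class, the `audit` function and the `sample` markup are names and data made up for illustration; in particular, the sample is a guess at what the homepage's head could look like based on the description above (the `0` delay is an assumption), not a copy of the real source.

```python
from html.parser import HTMLParser


class MetaAudit(HTMLParser):
    """Hypothetical helper: records meta tags that can hinder indexing."""

    def __init__(self):
        super().__init__()
        self.noindex = False        # True once a robots meta tag contains "noindex"
        self.refresh_target = None  # URL named in a meta refresh, if one is found

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values may be None.
        attr = {key: (value or "") for key, value in attrs}
        content = attr.get("content", "")
        if attr.get("name", "").lower() == "robots" and "noindex" in content.lower():
            self.noindex = True
        if attr.get("http-equiv", "").lower() == "refresh":
            # The content usually looks like "0; url=/some/page.html".
            _, _, target = content.partition("=")
            self.refresh_target = target.strip() or None


def audit(html):
    """Return (has_noindex, refresh_target) for one HTML document."""
    parser = MetaAudit()
    parser.feed(html)
    return parser.noindex, parser.refresh_target


# Assumed markup for illustration only, not the site's actual source.
sample = """<html><head>
<meta http-equiv="refresh" content="0; url=http://www.raisetheriver.org/home/index.html">
<meta name="robots" content="noindex">
</head><body></body></html>"""

print(audit(sample))
```

To run it against a live page you would fetch the HTML first (for example with `urllib.request`) and pass the decoded text to `audit`. Note that this only inspects meta tags in the markup; it will not notice an `X-Robots-Tag` HTTP header or a robots.txt rule, so treat a clean result as a hint rather than proof.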