I'm totally confused.
I run the site http://citylightstours.com
It is built on the CodeIgniter platform.
I noticed in Google Search Console that only 1 page of my site is indexed on Google. All other pages had 404 errors and hence google didnt list them.
I therefore thought it was a faulty sitemap so went to https://www.xml-sitemaps.com/ to generate a new one. I put in the root url and to my surprise only blog entries were contained in generated xml sitemap - NONE of the main pages of my site were there!!
I therefore went to another site to check for broken links http://www.brokenlinkcheck.com/ and to my extra surprise, every page on my site had a status of 404 broken link. HOWEVER, clicking on those links displays a valid page. They are therefore not broken links and i can navigate the site fine.
I therefore dont understand why automated robots come with a list of 404s and wont index the site, when all links appear to work!???
Any ideas?
THanks
UPDATE: I tried doing a Fetch and Render from search console too and a valid page that is displayed on browsers gives a Not Found error!
UPDATE 2: After doing site:citylightstours.com in google i notice that the ONLY pages indexed are the blog pages. All other pages have dropped out of the index - any ideas why???
UPDATE 3: One of the comments suggested it may be an issue with the .htaccess so i am posting it here in the hope that someone spots something. Thanks
UPDATE 4: After reading this post enter link description here I think it may be that the server returns a 404 error with the actual page code as the customer 404 human readable message!! As I said, I use codeigniter so it must have something to do with custom 404 page and routing. I dont know how to debug this though or even what to look at. Can anyone help?...THANKS!
<IfModule mod_rewrite.c>
# Development
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond $1 !^(index\.php|images|scripts|styles|vendor|robots\.txt)
RewriteRule ^(.*)$ index.php/$1 [L]
</IfModule>
DirectoryIndex index.php
RewriteEngine on
RewriteCond $1 !^(index\.php|images|css|js|robots\.txt|favicon\.ico)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ ./index.php/$1 [L,QSA]
# ----------------------------------------------------------------------
# Better website experience for IE users
# ----------------------------------------------------------------------
<IfModule mod_setenvif.c>
<IfModule mod_headers.c>
BrowserMatch MSIE ie
Header set X-UA-Compatible "IE=Edge,chrome=1" env=ie
</IfModule>
</IfModule>
<IfModule mod_headers.c>
Header append Vary User-Agent
</IfModule>
# ----------------------------------------------------------------------
# Webfont access
# ----------------------------------------------------------------------
<FilesMatch "\.(ttf|otf|eot|woff|font.css)$">
<IfModule mod_headers.c>
Header set Access-Control-Allow-Origin "*"
</IfModule>
</FilesMatch>
# ----------------------------------------------------------------------
# Proper MIME type for all files
# ----------------------------------------------------------------------
# audio
AddType audio/ogg oga ogg
# video
AddType video/ogg .ogv
AddType video/mp4 .mp4
AddType video/webm .webm
# Proper svg serving. Required for svg webfonts on iPad
# twitter.com/FontSquirrel/status/14855840545
AddType image/svg+xml svg svgz
AddEncoding gzip svgz
# webfonts
AddType application/vnd.ms-fontobject eot
AddType font/truetype ttf
AddType font/opentype otf
AddType application/x-font-woff woff
# assorted types
AddType image/x-icon ico
AddType image/webp webp
AddType text/cache-manifest appcache manifest
AddType text/x-component htc
AddType application/x-chrome-extension crx
AddType application/x-xpinstall xpi
AddType application/octet-stream safariextz
# ----------------------------------------------------------------------
# gzip compression
# ----------------------------------------------------------------------
<IfModule mod_deflate.c>
<IfModule mod_setenvif.c>
<IfModule mod_headers.c>
SetEnvIfNoCase ^(Accept-EncodXng|X-cept-Encoding|X{15}|~{15}|-{15})$ ^((gzip|deflate)\s,?\s(gzip|deflate)?|X{4,13}|~{4,13}|-{4,13})$ HAVE_Accept-Encoding
RequestHeader append Accept-Encoding "gzip,deflate" env=HAVE_Accept-Encoding
</IfModule>
</IfModule>
<FilesMatch "^(?!.*\.ogg$|.*\.ogv$|.*\.mp4$).+" >
# html, txt, css, js, json, xml, htc:
<IfModule filter_module>
FilterDeclare COMPRESS
FilterProvider COMPRESS DEFLATE resp=Content-Type /text/(html|css|javascript|plain|x(ml|-component))/
FilterProvider COMPRESS DEFLATE resp=Content-Type /application/(javascript|json|xml|x-javascript)/
FilterChain COMPRESS
FilterProtocol COMPRESS change=yes;byteranges=no
</IfModule>
</FilesMatch>
# webfonts and svg:
<FilesMatch "\.(ttf|otf|eot|svg)$" >
SetOutputFilter DEFLATE
</FilesMatch>
</IfModule>
# ----------------------------------------------------------------------
# Expires headers (for better cache control)
# ----------------------------------------------------------------------
<IfModule mod_expires.c>
ExpiresActive on
# Perhaps better to whitelist expires rules? Perhaps.
ExpiresDefault "access plus 1 month"
# cache.appcache needs re-requests in FF 3.6 (thx Remy ~Introducing HTML5)
ExpiresByType text/cache-manifest "access plus 0 seconds"
# your document html
ExpiresByType text/html "access plus 0 seconds"
# data
ExpiresByType text/xml "access plus 0 seconds"
ExpiresByType application/xml "access plus 0 seconds"
ExpiresByType application/json "access plus 0 seconds"
# rss feed
ExpiresByType application/rss+xml "access plus 1 hour"
# favicon (cannot be renamed)
ExpiresByType image/x-icon "access plus 1 week"
# media: images, video, audio
ExpiresByType image/gif "access plus 1 month"
ExpiresByType image/png "access plus 1 month"
ExpiresByType image/jpg "access plus 1 month"
ExpiresByType image/jpeg "access plus 1 month"
ExpiresByType video/ogg "access plus 1 month"
ExpiresByType audio/ogg "access plus 1 month"
ExpiresByType video/mp4 "access plus 1 month"
ExpiresByType video/webm "access plus 1 month"
# htc files (css3pie)
ExpiresByType text/x-component "access plus 1 month"
# webfonts
ExpiresByType font/truetype "access plus 1 month"
ExpiresByType font/opentype "access plus 1 month"
ExpiresByType application/x-font-woff "access plus 1 month"
ExpiresByType image/svg+xml "access plus 1 month"
ExpiresByType application/vnd.ms-fontobject "access plus 1 month"
# css and javascript
ExpiresByType text/css "access plus 2 months"
ExpiresByType application/javascript "access plus 2 months"
ExpiresByType text/javascript "access plus 2 months"
<IfModule mod_headers.c>
Header append Cache-Control "public"
</IfModule>
</IfModule>
# ----------------------------------------------------------------------
# ETag removal
# ----------------------------------------------------------------------
FileETag None
# ----------------------------------------------------------------------
# Stop screen flicker in IE on CSS rollovers
# ----------------------------------------------------------------------
# The following directives stop screen flicker in IE on CSS rollovers - in
# combination with the "ExpiresByType" rules for images (see above). If
# needed, un-comment the following rules.
# BrowserMatch "MSIE" brokenvary=1
# BrowserMatch "Mozilla/4.[0-9]{2}" brokenvary=1
# BrowserMatch "Opera" !brokenvary
# SetEnvIf brokenvary 1 force-no-vary
RewriteEngine On
RewriteCond %{HTTP_HOST} !^citylightstours\.com$ [NC]
RewriteRule ^(.*)$ http://citylightstours.com/$1 [R=301,L]
RewriteCond %{HTTP_USER_AGENT} libwww-perl.*
RewriteRule .* ? [F,L]
Solved - the wordpress blog integrated in the site was setting the 404 status for all non wordpress pages i.e. codeigniter pages
index.php of CI had the following code which needed to be commented out
/*
*---------------------------------------------------------------
* WORDPRESS INTEGRATION
*---------------------------------------------------------------
* The ci_site_url function helps to avoid collision between WP & CI.
*/
//header("HTTP/1.0 200 OK");
define('WP_USE_THEMES', false);
require_once './blog/wp-blog-header.php';
add_filter('site_url', 'ci_site_url', 1);
function ci_site_url()
{
include(APPPATH.'/config/config.php');
return $config['base_url'];
}