I have an Apache server running WordPress in the web root (/var/www/html). In my access_log I have been seeing many entries of the form:
98.209.16.114 - - [15/Feb/2013:21:19:51 -0500] "GET http://www.twitter.com HTTP/1.1" 301 - "-" "curl/7.28.1"
98.209.16.114 - - [15/Feb/2013:21:19:51 -0500] "GET http://www.twitter.comhttp/www.twitter.com HTTP/1.1" 301 - "-" "curl/7.28.1"
98.209.16.114 - - [15/Feb/2013:21:19:51 -0500] "GET http://www.twitter.comhttphttp/www.twitter.comhttp/www.twitter.com HTTP/1.1" 301 - "-" "curl/7.28.1"
98.209.16.114 - - [15/Feb/2013:21:19:52 -0500] "GET http://www.twitter.comhttphttphttp/www.twitter.comhttphttp/www.twitter.comhttp/www.twitter.com HTTP/1.1" 301 - "-" "curl/7.28.1"
where www.twitter.com can be replaced with any number of odd domains external to my own.
EDIT: the copied lines include curl because I was testing this phenomenon from my own command line.
The pertinent lines in my httpd.conf file are:
NameVirtualHost *:80
<VirtualHost *:80>
DocumentRoot /var/www/html
ServerName www.mydomain.com
<Directory /var/www/html>
AllowOverride All
</Directory>
</VirtualHost>
<VirtualHost *:80>
ServerName mydomain.com
RewriteEngine On
RewriteRule ^/(.*) http://www.mydomain.com/$1 [L,R=301]
</VirtualHost>
and the .htaccess file in the WordPress directory looks like:
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
This happens tens of times per day. A few questions I have are:
301
or 400
Let me know what other information you need.
The only modules Apache needs to load for WordPress are the following. This list excludes the PHP module, which I'll explain a bit more below:
LoadModule authz_host_module modules/mod_authz_host.so
LoadModule log_config_module modules/mod_log_config.so
LoadModule expires_module modules/mod_expires.so
LoadModule deflate_module modules/mod_deflate.so
LoadModule env_module modules/mod_env.so
LoadModule setenvif_module modules/mod_setenvif.so
LoadModule mime_module modules/mod_mime.so
LoadModule autoindex_module modules/mod_autoindex.so
LoadModule dir_module modules/mod_dir.so
LoadModule alias_module modules/mod_alias.so
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule negotiation_module modules/mod_negotiation.so
LoadModule headers_module modules/mod_headers.so
Depending on whether you run PHP through FastCGI (PHP-FPM) or through mod_php (prefork), you would include one of the following:
#---- For FPM of PHP Enable both below and disable php5_module
LoadModule fastcgi_module modules/mod_fastcgi.so
LoadModule actions_module modules/mod_actions.so
#---- For standard php5 module enable below and disable 2 above
#LoadModule php5_module modules/libphp5.so
The redirects that I'm seeing in your log output look more like a badly configured mod_rewrite rule, or a 301 redirect plugin in WordPress that conflicts with rules in Apache.
Now as a general rule of thumb, when we host name-based (NameVirtualHost) vhosts, we create a default vhost with ServerName default and set its doc root to a directory that contains one HTML file and a rewrite rule that sends all traffic to index.html (the HTML file just outputs "hosted by"). That way you filter out all of the garbage traffic that isn't addressed to your domain, and because the page is static HTML, serving it costs none of the IOPS that PHP would.
That way your vhost will only handle requests for the domain that it hosts. That doesn't fix the 301s that are happening, though. If you can share your mod_rewrite configuration, then perhaps there is something in there that we can help with.
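The default vhost described above could look roughly like the sketch below. The directory /var/www/default and the exact rewrite rule are assumptions for illustration, not taken from a real setup. With NameVirtualHost, Apache sends any request whose Host does not match a ServerName or ServerAlias to the first vhost defined for that address, so this block has to come before the others.

```apache
# Catch-all vhost: must be the FIRST <VirtualHost *:80> in the config.
<VirtualHost *:80>
ServerName default
# Hypothetical directory holding a single static index.html
DocumentRoot /var/www/default
<Directory /var/www/default>
AllowOverride None
Order allow,deny
Allow from all
</Directory>
RewriteEngine On
# Serve the static page for every path that is not already /index.html
RewriteCond %{REQUEST_URI} !^/index\.html$
RewriteRule ^ /index.html [L]
</VirtualHost>
```

Requests for unknown hosts such as www.twitter.com would then land here and get the static page instead of reaching WordPress.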
The biggest problem I've seen in the community with SSL-based traffic is when developers have to integrate with load balancers where SSL is terminated on the load balancer instead of in Apache. In that setup you can no longer use a rewrite condition and rule that checks whether https != on and issues a 301 redirect to a new location: the load balancer terminates the certificate and sends plain HTTP (port 80) traffic to your web host, so Apache will always think the request is unencrypted even though it arrived at a secured vhost.
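To illustrate the point, here is a hedged sketch of the check that stops working and one common workaround. The workaround assumes the load balancer adds an X-Forwarded-Proto header to the requests it forwards; whether yours does, and under what name, depends on the load balancer, so treat this as an assumption to verify rather than a given.

```apache
RewriteEngine On
# The usual check. Behind an SSL-terminating load balancer %{HTTPS} is
# always "off", so this condition would redirect every request forever:
# RewriteCond %{HTTPS} !=on

# Assumption: the load balancer sets "X-Forwarded-Proto: https" on
# requests that reached it over SSL.
RewriteCond %{HTTP:X-Forwarded-Proto} !=https
RewriteRule ^/?(.*)$ https://www.example.com/$1 [R=301,L]
```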
One additional note on your vhost: it's better not to use .htaccess files unless you ABSOLUTELY have to. For WordPress, a vhost with a Directory block should look something like this for optimal performance.
<VirtualHost *:80>
ServerName www.example.com
DocumentRoot /path/to/doc_root
<Directory /path/to/doc_root/>
AllowOverride None
Options +SymLinksIfOwnerMatch +MultiViews -Indexes
Order allow,deny
Allow from all
RewriteEngine On
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^.*$ index.php [L]
</Directory>
</VirtualHost>
The reason I say optimal performance is that, with AllowOverride None, Apache does not have to look for and read a .htaccess file on every single request that comes in. Instead it reads the rules once as part of the vhost config and keeps the configuration in memory, so you save a large amount of IOPS by not using a .htaccess file, and you gain the additional security of not relying on .htaccess files and overrides.
And finally, RewriteBase is irrelevant here and sometimes just causes problems unless you host WordPress under a specific alias like /blog/. In your case, based on your .htaccess file, it appears to be on the base domain, so there is no need to have that directive in there. For reference, see the rewrite rules I have in the vhost config above.
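For contrast, this is roughly what the rules look like in the case where RewriteBase does matter. It is a sketch that assumes WordPress is installed under a /blog/ subdirectory and still configured through .htaccess; the /blog/ path is only an example.

```apache
<IfModule mod_rewrite.c>
RewriteEngine On
# Needed here because the site lives under /blog/, not the web root
RewriteBase /blog/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>
```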