I needed to migrate our aging single-host WordPress server into our production cluster, which uses autoscaled EC2 hosts and Docker. Although I found some articles that helped, they had a lot of gaps and inconsistencies, and I ran into some undocumented issues that I believe will be common in a complex, clustered environment. So I'm leaving this self-answered Q&A:
I'm going to assume you're more "devops" than "blog admin", and are already reasonably familiar with Docker, NFS, ssh, your strategy for running containers on multiple hosts, and how you might set up your reverse proxy to route to a particular group of WordPress backends given a path prefix like /blog.
I won't dive into this too deeply, as this is the one part that is well documented elsewhere. The TL;DR is: find the wp-config.php file to get the database definitions; then you only need to archive the wp-content directory and dump the contents of the database, using the variables you got from the config. (For future import into RDS, I needed to add some command-line options not found in simpler mysqldump examples, or else I got permission errors around GTID issues.)
# tar czvf wp-content.tgz wp-content
# mysqldump -h <bloghost> -u <bloguser> -p \
--column-statistics=0 --no-tablespaces --set-gtid-purged=OFF \
--databases <dbname> | gzip > blog.sql.gz
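If you'd rather not copy the database settings out of wp-config.php by hand, a small helper can pull them out. This is only a sketch: wp_config_value is a name I made up, and it assumes the classic single-line define('NAME', 'value'); form, so it won't understand values computed at runtime (for example the getenv_docker() calls that the Docker image generates).

```shell
#!/bin/sh
# Print the value of one define() constant from a wp-config.php file.
# Assumption: the constant is declared on a single line with a quoted
# literal value, e.g.  define( 'DB_NAME', 'blogdb' );
wp_config_value() {
    config_file=$1
    constant=$2
    # Match: define( 'CONSTANT', 'value' );  and print only the value.
    sed -n "s/^[[:space:]]*define([[:space:]]*['\"]${constant}['\"][[:space:]]*,[[:space:]]*['\"]\(.*\)['\"][[:space:]]*);.*\$/\1/p" \
        "$config_file" | head -n 1
}

# Example (the path is a placeholder for wherever your old site lives):
#   DB_NAME=$(wp_config_value <path-to-old-site>/wp-config.php DB_NAME)
#   DB_USER=$(wp_config_value <path-to-old-site>/wp-config.php DB_USER)
#   DB_HOST=$(wp_config_value <path-to-old-site>/wp-config.php DB_HOST)
```

The resulting values are what go into the <bloghost>, <bloguser>, and <dbname> placeholders in the mysqldump command.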
For hosting, I have a /shared EFS mount that can be used by my Docker containers. You'll want something similar in a clustered environment.
(For the purposes of this answer, I'll use /shared as a shared network file system, /blog as my subdirectory, and various <VAR> placeholders for you to replace with your own values. I'm using Docker/EC2/EFS/RDS, but the instructions should still be relevant to other similar cloud clustered environments.)
I created a /shared/wordpress/blog directory and extracted the tarball into it, yielding /shared/wordpress/blog/wp-content. Note that this is down a level, in a directory that will host the entire subdirectory I want to serve later. This is important!
% sudo -s
# mkdir -p /shared/wordpress/blog
# cd /shared/wordpress/blog
# scp <cluster-user>@<old-host-of-blog>:/var/www/html/blog/blog.sql.gz .
# scp <cluster-user>@<old-host-of-blog>:/var/www/html/blog/wp-content.tgz .
# tar xzvf wp-content.tgz
# zcat blog.sql.gz | mysql -u <bloguser> -p -h <new-mysql-address>
Note: putting authentication info directly into your compose files is a bad practice; use more rigorous secrets management once you have things working!
You'll want to follow any of the recipes for running wordpress:latest using docker-compose or whatever your preferred toolchain is. The way this image works is that it generates a wp-config.php file internally that reads some configuration options from named environment variables, but notably excludes some of the configuration that we'll need in order to fully set things up. Add the obvious environment variables WORDPRESS_DB_HOST, WORDPRESS_DB_USER, and WORDPRESS_DB_PASSWORD (plus WORDPRESS_DB_NAME if your database isn't named wordpress), and then one big yucky override for stuff that doesn't have named environment vars:
WORDPRESS_CONFIG_EXTRA="define('WP_HOME', 'https://www.yoursite.com/blog/'); define('WP_SITEURL', 'https://www.yoursite.com/blog/');"
The above settings are key to serving out of a subdirectory!
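Since WP_HOME and WP_SITEURL have to stay in sync, one option is to build the override string from a single URL. A minimal sketch, assuming a POSIX shell; the function name wordpress_config_extra is mine, not something the image provides:

```shell
#!/bin/sh
# Build the WORDPRESS_CONFIG_EXTRA value for a site served from a subdirectory.
# $1 is the full public URL of the blog, e.g. https://www.yoursite.com/blog/
wordpress_config_extra() {
    site_url=$1
    # Both constants get the same URL so they cannot drift apart.
    printf "define('WP_HOME', '%s'); define('WP_SITEURL', '%s');" \
        "$site_url" "$site_url"
}

# Example:
#   WORDPRESS_CONFIG_EXTRA=$(wordpress_config_extra 'https://www.yoursite.com/blog/')
#   export WORDPRESS_CONFIG_EXTRA
```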
You'll now want to add a mount point for the wp-content directory that you're sharing over NFS. In a docker-compose.yml you'd add something like:
volumes:
- /shared/wordpress/blog/wp-content:/var/www/html/blog/wp-content
When you run this (and assuming that you have your /blog directory routed to your containerized backends) you should now be able to visit your WordPress instance. Sort of. Perhaps you have a redirect loop. Maybe you have some partially broken content. (It depends a bit on what you had set up previously.)
If you ssh to the host and look inside the running container (e.g. via docker exec -t -i <running_container_name> /bin/bash) you'll discover that the /var/www/html directory has been populated with all sorts of stuff. This isn't quite right: we want the /var/www/html/blog directory to hold the site, because the proxy won't ever route to the parent directory!
Unfortunately, there's no runtime-configurable way to make this happen.
We need to make a custom Docker image. It's annoying, but at least the Dockerfile is trivial:
FROM wordpress:latest
WORKDIR /var/www/html/blog
ENTRYPOINT ["docker-entrypoint.sh"]
CMD ["apache2-foreground"]
Changing the WORKDIR will cause the image's entrypoint script to write into our subdirectory instead of the default.
Build the image, update your docker-compose.yml or k8s or whatever to use it, and BOOM you're up and running!
Mostly.
You can't log in to /blog/wp-admin!
Maybe it works when you have only one server, but your production cluster with multiple instances doesn't let you log in.
So here's the thing: the generated wp-config.php file contains a block of unique generated keys/salts that are used for cookie generation. If you have multiple independent WordPress instances running that have each generated their own keys, each will issue login cookies that the others reject!
Fortunately, the config allows you to specify environment variables for all of them!
Here's the snippet from wp-config.php where you can see their names:
define( 'AUTH_KEY', getenv_docker('WORDPRESS_AUTH_KEY', 'xxx') );
define( 'SECURE_AUTH_KEY', getenv_docker('WORDPRESS_SECURE_AUTH_KEY', 'xxx') );
define( 'LOGGED_IN_KEY', getenv_docker('WORDPRESS_LOGGED_IN_KEY', 'xxx') );
define( 'NONCE_KEY', getenv_docker('WORDPRESS_NONCE_KEY', 'xxx') );
define( 'AUTH_SALT', getenv_docker('WORDPRESS_AUTH_SALT', 'xxx') );
define( 'SECURE_AUTH_SALT', getenv_docker('WORDPRESS_SECURE_AUTH_SALT', 'xxx') );
define( 'LOGGED_IN_SALT', getenv_docker('WORDPRESS_LOGGED_IN_SALT', 'xxx') );
define( 'NONCE_SALT', getenv_docker('WORDPRESS_NONCE_SALT', 'xxx') );
These values need to be identical across all running WordPress instances in your cluster, so add them to your docker configuration and restart.
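One way to produce the eight values is to generate them once and hand the same file to every instance. The sketch below is my own: the function name and the idea of an env file are assumptions, and /dev/urandom plus base64 is just one reasonable source of random strings. Treat the output as a secret.

```shell
#!/bin/sh
# Write one random value per WordPress key/salt variable to an env file.
# Run this ONCE, then give every instance the same file.
generate_wordpress_salts() {
    out_file=$1
    : > "$out_file"            # create or truncate the file
    chmod 600 "$out_file"      # keep the secrets private
    for name in AUTH_KEY SECURE_AUTH_KEY LOGGED_IN_KEY NONCE_KEY \
                AUTH_SALT SECURE_AUTH_SALT LOGGED_IN_SALT NONCE_SALT; do
        # 48 random bytes encode to exactly 64 base64 characters.
        value=$(head -c 48 /dev/urandom | base64 | tr -d '\n')
        printf 'WORDPRESS_%s=%s\n' "$name" "$value" >> "$out_file"
    done
}

# Example (hypothetical file name):
#   generate_wordpress_salts ./wordpress-salts.env
```

With docker-compose, a file in this NAME=value format can be loaded through the service's env_file setting; other toolchains will have their own way of injecting it.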
The simple answer most people cite here is to log into the /blog/wp-admin console and go to Settings > Permalinks. Assuming you already had this set up and working, all you should need to do is click "Save" without changing anything, and this will (handwave handwave) "update the .htaccess file".
The problem with this solution is that whatever it does to .htaccess only modifies that file for the single instance that you happened to be connected to for administration! It won't work in a clustered environment.
Let's look at what it actually does. When you first start up, the /var/www/html/blog/.htaccess file contains:
# BEGIN WordPress
RewriteEngine On
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
After you visit the permalinks settings and save, it turns into:
# BEGIN WordPress
# The directives (lines) between "BEGIN WordPress" and "END WordPress" are
# dynamically generated, and should only be modified via WordPress filters.
# Any changes to the directives between these markers will be overwritten.
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
RewriteBase /blog/
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /blog/index.php [L]
</IfModule>
# END WordPress
Never mind the REST API, this is alarming! It seemed to be pre-configured to assume no subdirectory, and none of the prior steps for getting the subdirectory working changed that.
So, we definitely need to ensure this file is configured correctly. The easy answer is to put it outside the container in our shared directory, and mount it. Paste the above file (changing /blog to your own subdirectory if necessary) into /shared/wordpress/blog/.htaccess and add a mount to the docker configuration:
volumes:
- /shared/wordpress/blog/wp-content:/var/www/html/blog/wp-content
- /shared/wordpress/blog/.htaccess:/var/www/html/blog/.htaccess
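If your subdirectory isn't /blog, it's easy to miss one of the two places the prefix appears. As a sketch (write_wordpress_htaccess is a made-up helper, and the rules are simply the post-"Save" file shown earlier with the prefix parameterized), you could generate the shared file instead of pasting it:

```shell
#!/bin/sh
# Write the WordPress rewrite block for a site served under a path prefix.
# $1 is the prefix with leading and trailing slashes (e.g. /blog/),
# $2 is the file to write.
write_wordpress_htaccess() {
    prefix=$1
    out_file=$2
    # Unquoted heredoc: ${prefix} is expanded; \\ and \$ stay literal \ and $.
    cat > "$out_file" <<EOF
# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule .* - [E=HTTP_AUTHORIZATION:%{HTTP:Authorization}]
RewriteBase ${prefix}
RewriteRule ^index\\.php\$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . ${prefix}index.php [L]
</IfModule>
# END WordPress
EOF
}

# Example:
#   write_wordpress_htaccess /blog/ /shared/wordpress/blog/.htaccess
```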
Restart everything, and celebrate your new WordPress cluster!
Since our goal was to share the keys/salts from wp-config.php and to also share the .htaccess file, an alternate approach is to simply mount one directory level up. So, replace the mount points with only:
volumes:
- /shared/wordpress/blog:/var/www/html/blog
This will cause the entire generated WordPress hierarchy to be written to the network share, instead of living inside each container with selected overrides mounted in.
In practice, this is the approach I took, but I may consider migrating to the "many vars + .htaccess" approach documented above.
Note that if multiple instances start up simultaneously sharing the entire tree, and try to bootstrap this hierarchy, they do interfere with each other! Some simply exited and then successfully restarted. To prevent the risk of possible file corruption or other weirdness, you might want to babysit the first startup to ensure only one instance starts and initializes the hierarchy.
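To make the babysitting a little less manual, you could start a single instance and wait until the shared tree looks initialized before scaling out. This is a heuristic sketch, not something the image provides: the function name is hypothetical, and I'm assuming that a non-empty wp-config.php next to index.php is a good enough sign that the first container has finished bootstrapping.

```shell
#!/bin/sh
# Wait until a shared WordPress tree looks initialized, or give up.
# $1 is the shared directory, $2 is the timeout in seconds (default 120).
# Assumption: a non-empty wp-config.php alongside index.php means the
# first container has finished writing the hierarchy.
wait_for_wordpress_tree() {
    tree=$1
    timeout=${2:-120}
    waited=0
    while :; do
        if [ -s "$tree/wp-config.php" ] && [ -e "$tree/index.php" ]; then
            return 0    # looks initialized
        fi
        if [ "$waited" -ge "$timeout" ]; then
            return 1    # timed out
        fi
        sleep 1
        waited=$((waited + 1))
    done
}

# Example: start ONE instance first, then
#   wait_for_wordpress_tree /shared/wordpress/blog 300 && echo "safe to scale out"
```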