Tags: mysql, git, docker, content-management-system, modx

How should I dockerize a CMS, such that MySQL works nice with git?


I want to dockerize a MODX application for development and store it in git (again, for development). There's a solution here, but then all the MySQL data files are binary, and the database also cares about their permissions. I'd like to either:

  • put all of MySQL's data in a single massive binary file, so I don't have to care about permissions and can put it in LFS, or
  • somehow export the database to an SQL file on container shutdown and import it on launch, so I can use diffs (roughly along the lines of the sketch below).
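
For the second option, what I have in mind is something along these lines (a rough sketch only, assuming the official mysql image as the base and its standard MYSQL_ROOT_PASSWORD/MYSQL_DATABASE environment variables; the wrapper script itself and the dump path are placeholders):

    #!/bin/bash
    # Rough sketch: the official mysql image already imports any *.sql file found
    # in /docker-entrypoint-initdb.d/ when it initializes a fresh data directory,
    # so this entrypoint wrapper only adds the "dump on shutdown" half.
    set -e

    dump_and_stop() {
        # `docker stop` sends SIGTERM: write a fresh text dump, then stop mysqld
        mysqldump -uroot -p"$MYSQL_ROOT_PASSWORD" "$MYSQL_DATABASE" \
            > /docker-entrypoint-initdb.d/dump.sql
        mysqladmin -uroot -p"$MYSQL_ROOT_PASSWORD" shutdown
    }
    trap dump_and_stop SIGTERM

    # Run mysqld through the image's own entrypoint, in the background,
    # and wait on it so the trap above can fire.
    docker-entrypoint.sh mysqld &
    MYSQLD_PID="$!"
    wait "$MYSQLD_PID"    # returns early when SIGTERM arrives and the trap runs
    wait "$MYSQLD_PID"    # then wait for mysqld itself to finish shutting down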

Solution

  • So I've actually implemented a partial solution to this problem (I'm still learning to use Docker, but this approach covers everything else).

    I use MODx as my CMS of choice; however, this should in theory work for other CMSes too.

    In my git workflow, I have a pre-commit hook that runs mysqldump to export the database into a series of SQL files, which, when deploying to production, serve as the input to mysql for recreating the entire database.
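
    Stripped to its bare essence, that idea is just a mysqldump into the repository followed by a git add. A minimal sketch (the database name and paths here are placeholders; the real hook further below does considerably more):

    #!/bin/bash
    # Minimal pre-commit sketch: dump the database as plain SQL and stage it.
    # "my_database" and the sql/ path are placeholders for your own setup.
    mysqldump --defaults-file=.git/hooks/mysql-defaults \
        --skip-comments --skip-extended-insert my_database > sql/my_database.sql
    git add sql/my_database.sql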

    Some of the code in the following example is not directly related to the answer. It is also worth noting that I add a final column to each table of the database which separates individual rows of data into the different branches of the git repo (my chosen workflow involves three parallel branches, for local development, staging, and production respectively).
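
    That branch column is the flighter_key you will see in the dumps below; adding it to one of MODx's tables might look roughly like this (the table name and column definition are assumptions, not verbatim from my schema):

    # Hypothetical: add the branch/key column that the dumps below filter on.
    # Rows whose key is NULL are treated as shared baseline data.
    mysql --defaults-file=.git/hooks/mysql-defaults ka_local2019 \
        -e "ALTER TABLE modx_site_content ADD COLUMN flighter_key VARCHAR(64) NULL DEFAULT NULL;"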

    The sample code below is the pre-commit hook from one of my older projects. I no longer use that project, but essentially the same code is still in use (with a few exceptions unrelated to this post). It goes far beyond the question, because it is verbatim from my repo, but perhaps it might spark some inspiration.

    In this example you'll also see references to "lists", which are text files containing the names of the various individual repos plus some settings; these get read into bash arrays (the per-prefix counters further down use an associative array, which requires bash 4.0 or higher). There is also a reference to 'mysql-defaults', a text file containing my database credentials so the script can run without prompting.
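
    For reference, 'mysql-defaults' is just a standard MySQL option file, and each list is a plain newline-separated set of names; a minimal sketch of both (every value below is a placeholder):

    # Create the credentials file used by the mysql/mysqldump aliases in the hook
    # below (placeholder values); the [client] group is read by both tools.
    printf '%s\n' '[client]' 'user=root' 'password=secret' 'host=127.0.0.1' 'port=8889' \
        > .git/hooks/mysql-defaults

    # Each "list" is simply one repo name per line (hypothetical names shown).
    printf '%s\n' 'gallery' 'newsletter' > /Users/cjholowatyj/Dev/modules-list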

    #!/bin/bash
    
    # Set repository and script variables
    REPO_NAME='MODX';REPO_FOLDER='modx';REPO_KEY='modx';REPO_TYPE='MODX';
    
    declare -A REPO_PREFIX_COUNTS=();  # associative array keyed by table prefix (requires bash 4+)
    
    MODULES_STRING=$(cat /Users/cjholowatyj/Dev/modules-list | tr "\n" " ");MODULES_ARRAY=(${MODULES_STRING});# echo ${MODULES_ARRAY[1]};
    PROJECTS_STRING=$(cat /Users/cjholowatyj/Dev/projects-list | tr "\n" " ");PROJECTS_ARRAY=(${PROJECTS_STRING});# echo ${PROJECTS_ARRAY[1]};
    THEMES_STRING=$(cat /Users/cjholowatyj/Dev/themes-list | tr "\n" " ");THEMES_ARRAY=(${THEMES_STRING});# echo ${THEMES_ARRAY[1]};
    
    alias mysql='/Applications/MAMP/Library/bin/mysql --defaults-file=.git/hooks/mysql-defaults';
    alias dump='/Applications/MAMP/Library/bin/mysqldump --defaults-file=.git/hooks/mysql-defaults';
    alias dump-compact='/Applications/MAMP/Library/bin/mysqldump --defaults-file=.git/hooks/mysql-defaults --no-create-info --skip-add-locks --skip-disable-keys --skip-comments --skip-extended-insert --compact';
    shopt -s expand_aliases
    
    # Print status message in terminal console
    /bin/echo "Running ${REPO_NAME} Pre-Commits...";
    
    # Switch to repository directory
    # shellcheck disable=SC2164
    cd "/Users/cjholowatyj/Dev/${REPO_FOLDER}/";
    
    # Fetch database tables dedicated to this repository
    mysql -N information_schema -e "select table_name from tables where table_schema = 'ka_local2019' and table_name like '${REPO_KEY}_%'" | tr '\n' ' ' > sql/${REPO_KEY}_tables.txt;
    tablesExist=$(wc -c "sql/${REPO_KEY}_tables.txt" | awk '{print $1}')
    
    # Reset pack_ sql files
    if [[ -f sql/pack_structure.sql ]]; then rm sql/pack_structure.sql; fi
    if [[ -f sql/pack_data.sql ]]; then rm sql/pack_data.sql; fi
    touch sql/pack_structure.sql
    touch sql/pack_data.sql
    
    dump --add-drop-database --no-create-info --no-data --skip-comments --databases ka_local2019 >> sql/pack_structure.sql
    
    # Process repository tables & data
    if [[ ${tablesExist} -gt 0 ]]; then
      dump --no-data --skip-comments ka_local2019 --tables `cat sql/${REPO_KEY}_tables.txt` >> sql/pack_structure.sql
      dump-compact ka_local2019 --tables `cat sql/${REPO_KEY}_tables.txt` --where="flighter_key IS NULL" >> sql/pack_data.sql
      sed -i "" -E "s/AUTO_INCREMENT=[0-9]+[ ]//g" sql/pack_structure.sql
    fi
    dump-compact ka_local2019 --where="flighter_key='${REPO_KEY}'" >> sql/pack_data.sql
    isLocalHead=$(grep -c cjholowatyj .git/HEAD);
    if [[ ${isLocalHead} = 1 ]]; then
      dump-compact ka_local2019 --where="flighter_key='${REPO_KEY}-local'" >> sql/pack_data.sql
      sed -i "" "s/\.\[${REPO_KEY}-local]//g" sql/pack_data.sql
    fi
    isDevelopHead=$(grep -c develop .git/HEAD);
    if [[ ${isDevelopHead} = 1 ]]; then
      dump-compact ka_local2019 --where="flighter_key='${REPO_KEY}-develop'" >> sql/pack_data.sql
      sed -i "" "s/\.\[${REPO_KEY}-develop]//g" sql/pack_data.sql
      sed -i "" "s/ka_local2019/ka_dev2019/g" sql/pack_structure.sql
    fi
    isReleaseHead=$(grep -c release .git/HEAD);
    if [[ ${isReleaseHead} = 1 ]]; then
      dump-compact ka_local2019 --where="flighter_key='${REPO_KEY}-release'" >> sql/pack_data.sql
      sed -i "" "s/\.\[${REPO_KEY}-release]//g" sql/pack_data.sql
      sed -i "" "s/ka_local2019/ka_rel2019/g" sql/pack_structure.sql
    fi
    
    # Create master structure sql file for this repository (and delete it once again if it is empty)
    awk '/./ { e=0 } /^$/ { e += 1 } e <= 1' < sql/pack_structure.sql > sql/${REPO_KEY}_structure.sql
    structureExists=$(wc -c "sql/${REPO_KEY}_structure.sql" | awk '{print $1}')
    if [[ ${structureExists} -eq 0 ]]; then rm sql/${REPO_KEY}_structure.sql; fi
    
    # Create master sql data file in case the entire database needs to be rebuilt from scratch
    awk '/./ { e=0 } /^$/ { e += 1 } e <= 1' < sql/pack_data.sql > sql/all_${REPO_KEY}_data.sql
    
    # Commit global repository sql files
    git add sql/all_${REPO_KEY}_data.sql
    if [[ ${structureExists} -gt 0 ]]; then git add sql/${REPO_KEY}_structure.sql; fi
    
    # Deleting any existing sql files to recreate them fresh below
    if [[ -f sql/create_modx_data.sql ]]; then rm sql/create_modx_data.sql; fi
    if [[ -f sql/create_flighter_data.sql ]]; then rm sql/create_flighter_data.sql; fi
    for i in "${MODULES_ARRAY[@]}"
    do
        if [[ -f sql/create_${i}_data.sql ]]; then rm sql/create_${i}_data.sql; fi
    done
    if [[ -f sql/create_${REPO_KEY}_data.sql ]]; then rm sql/create_${REPO_KEY}_data.sql; fi
    
    # Parse global repository data and separate out data filed by table prefix
    lastPrefix='';
    lastTable='';
    while IFS= read -r iLine;
    do
        thisLine="${iLine}";
        thisPrefix=$(echo ${thisLine} | grep -oEi '^INSERT INTO `([0-9a-zA-Z]+)_' | cut -d ' ' -f 3 | cut -d '`' -f 2 | cut -d '_' -f 1);
        thisTable=$(echo ${thisLine} | grep -oEi '^INSERT INTO `([0-9a-zA-Z_]+)`' | cut -d ' ' -f 3 | cut -d '`' -f 2);
        if [[ $(echo -n ${thisPrefix} | wc -m) -gt 0 ]]; then
             if [[ -n "${REPO_PREFIX_COUNTS[$thisPrefix]}" ]]; then
                  if [[ ${REPO_PREFIX_COUNTS[$thisPrefix]} -lt 1  ]]; then
                       if [[ -f sql/create_${thisPrefix}_data.sql ]]; then rm sql/create_${thisPrefix}_data.sql; fi
                       touch "sql/create_${thisPrefix}_data.sql";
                  fi
                  REPO_PREFIX_COUNTS[$thisPrefix]=0;
              fi
             REPO_PREFIX_COUNTS[$thisPrefix]+=1;
             echo "${thisLine}" >> sql/create_${thisPrefix}_data.sql;
             if [[ ${thisTable} != ${lastTable} ]]; then
                 if [[ ${thisPrefix} != ${lastPrefix} ]]; then
                     if [[ -f sql/delete_${thisPrefix}_data.sql ]]; then rm sql/delete_${thisPrefix}_data.sql; fi
                     touch "sql/delete_${thisPrefix}_data.sql";
                 fi
                 if [[ $(echo -n ${thisTable} | wc -m) -gt 0 ]]; then
                     echo "DELETE FROM \`${thisTable}\` WHERE \`flighter_key\` LIKE '${REPO_KEY}%';" >> sql/delete_${thisPrefix}_data.sql
                 fi
             fi
             # Add previous prefix sql file to git if lastPrefix isn't ''
             if [[ $(echo -n ${lastPrefix} | wc -m) -gt 0 ]]; then
                 git add "sql/create_${lastPrefix}_data.sql";
                 git add "sql/delete_${lastPrefix}_data.sql";
             fi
        fi
        lastPrefix=${thisPrefix};
        lastTable=${thisTable};
    done < sql/all_${REPO_KEY}_data.sql
    # Add previous prefix sql file to git for the final lastPrefix value
    if [[ $(echo -n ${lastPrefix} | wc -m) -gt 0 ]]; then
        git add "sql/create_${lastPrefix}_data.sql";
        git add "sql/delete_${lastPrefix}_data.sql";
    fi
    
    # Clean up unused files
    rm "sql/${REPO_KEY}_tables.txt";
    rm "sql/pack_data.sql";
    rm "sql/pack_structure.sql";
    
    git add sql/;
    
    

    A few nuances worth noting:

    (1) My code strips out all the AUTO_INCREMENT cursors from each table, because they were creating a lot of unnecessary changes in the SQL files and made commits more complicated than they needed to be.

    (2) My code also strips out the database name itself, because on the production server I will be specifying the database to use, and it is not the same name as the one I use for local development; we don't want the data going to the wrong place.

    (3) This workflow also separates the database structure from the data itself in the files committed to git, which may be a source of confusion if you didn't already pick up on that.

    On the flip side, when deploying the project to a server, I also have code which simply iterates through all of the *.sql files and imports them into my database one at a time. I won't share the exact code for security reasons, but the general gist is: mysql mysql_database < database_file.sql
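
    As a rough sketch of that import step (this is not my actual deploy code; the defaults file, database name, and glob ordering are assumptions), it boils down to importing the structure files first, then the data files:

    #!/bin/bash
    # Hypothetical deploy-side import: load structure first, then delete/create data.
    # The defaults file and database name are placeholders for the production setup.
    shopt -s nullglob   # skip a loop entirely if its glob matches nothing
    import() { mysql --defaults-file=/etc/deploy/mysql-defaults ka_production < "$1"; }

    for f in sql/*_structure.sql;   do import "$f"; done
    for f in sql/delete_*_data.sql; do import "$f"; done
    for f in sql/create_*_data.sql; do import "$f"; done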