Search code examples
mysqlvalidationormreplicationcakephp-3.4

How can I validate fields and criteria at execute override?


We have a c (central) server and several d (district servers), such as d1, d2, d3, d4, d5.

There are some tables that are to be replicated. For the sake of simplicity, let's assume that we have a tblFoo table that exists on d1, d2, d3, d4, d5 and c as well and it has the same structure in all. The rules are simple:

  • if a record exists at a d server, then it exists on the c server and has the exact same values for each fields
  • if a record exists on a d server (for example on d1), then it does not exist on any other d server (d2, d3, d4 nor d5)

The goal is to make sure that if a change is made on tblFoo (insert, update, delete) for a d server, then it should be done promptly on the c server as well. This works nicely for insert (because the id, pkFooID is having the auto_increment property by definition). This also works for update and delete, but we have some worries about those. This is (the simplified version of the) code:

namespace App\ORM;

use Cake\ORM\Query as ORMQuery;
// Some other use statements

class Query extends ORMQuery
{
    //Lots of stuff...

    /**
     * Overrides a method with the same name to handle synchonizations with c
     */
    public function execute()
    {
        //Some tables need replication. If this is such a table, then we need to perform some extra steps. Otherwise we would just call the parent
        //Method
        if (($this->_repository->getIgnoreType() || (!in_array($this->type(), ['select']))) && $this->isReplicate() && ($this->getConnection()->configName() !== 'c')) {
            //Getting the table
            $table = $this->_repository->getTable();
            //Replicating the query
            $replica = clone $this;
            //Setting the connection of the replica to c, because we need to apply the district changes on central
            $replica->setParentConnectionType('d')->setConnection(ConnectionManager::get('c'));
            $replica->setIgnoreType($this->_repository->getIgnoreType());
            //We execute the replica first, because we will need to refer to c IDs and not the other way around
            $replica->execute();
            //If this is an insert, then we need to handle the ids as well
            if (!empty($this->clause('insert'))) {
                //We load the primary key's name to use it later to find the maximum value
                $primaryKey = $this->_repository->getPrimaryKey();
                //We get the highest ID value, which will always be a positive number, because we have already executed the query at the replica
                $firstID = $replica->getConnection()
                                   ->execute("SELECT LAST_INSERT_ID() AS {$primaryKey}")
                                   ->fetchAll('assoc')[0][$primaryKey];

                //We get the columns
                $columns = $this->clause('values')->getColumns();
                //In order to add the primary key
                $columns[] = $primaryKey;
                //And then override the insert clause with this adjusted array
                $this->insert($columns);
                //We get the values
                $values = $this->clause('values')->getValues();
                //And their count
                $count = count($values);
                //There could be multiple rows inserted already into the replica as part of this query, we need to replicate all their IDs, without
                //assuming that there is a single inserted record
                for ($index = 0; $index < $count; $index++) {
                    //We add the proper ID value into all of the records to be inserted
                    $values[$index][$primaryKey] = $firstID + $index;
                }
                //We override the values clause with this adjusted array, which contains PK values as well
                $this->clause('values')->values($values);
            }
        }
        if ($this->isQueryDelete) {
            $this->setIgnoreType(false);
        }
        //We nevertheless execute the query in any case, independently of whether it was a replicate table
        //If it was a replicate table, then we have already made adjustments to the query in the if block
        return parent::execute();
    }

}

The worry is the following: If we execute update or delete statements on d1 whose condition would be met by records at another district server (d2, d3, d4, d5), then we would end up with the update and delete statement being correctly executed on d1, but once the same statements are executed at d1, we may accidentally update/delete records of other districts from the c server.

To remedy this issue, the proposed solution is to validate the statements and throw an exception if one of the following conditions is not met:

  • the condition is either = or IN
  • the field is [pk|fk]*ID, that is, a foreign key or a primary key

Tables not having the replication behavior would perform execute normally, the above restrictions would only be valid for tables having the replication behavior, such as tblFoo from our example.

The Question

How can I validate update/delete queries in my execute override so that only primary keys or foreign keys can be searched for and only with the = or IN operator?


Solution

  • This is how I have solved the issue.

    The models that have the replicate behavior perform a validation as follows

    <?php
    
    namespace App\ORM;
    
    use Cake\ORM\Table as ORMTable;
    
    class Table extends ORMTable
    {
        protected static $replicateTables = [
            'inteacherkeyjoin',
        ];
    
        public function isValidReplicateCondition(array $conditions)
        {
            return count(array_filter($conditions, function ($v, $k) {
                return (bool) preg_match('/^[\s]*[pf]k(' . implode('|', self::$replicateTables) . ')id[\s]*((in|=).*)?$/i', strtolower(($k === intval($k)) ? $v : $k));
            }, ARRAY_FILTER_USE_BOTH)) > 0;
        }
    
        public function validateUpdateDeleteCondition($action, $conditions)
        {
            if ($this->behaviors()->has('Replicate')) {
                if (!is_array($conditions)) {
                    throw new \Exception("When calling {$action} for replicate tables, you need to pass an array");
                } elseif (!$this->isValidReplicateCondition($conditions)) {
                    throw new \Exception("Unsafe condition was passed to the {$action} action, you need to specify primary keys or foreign keys with = or IN operators");
                }
            }
        }
    
        public function query()
        {
            return new Query($this->getConnection(), $this);
        }
    }
    

    For the Query class we have an isReplicate method that triggers the validation we need and the where method has been overriden to make sure that the conditions are properly validated:

        /**
         * True if and only if:
         * - _repository is properly initialized
         * - _repository has the Replicate behavior
         * - The current connection is not c
         */
        protected function isReplicate()
        {
            if (($this->type() !== 'select') && ($this->getConnection()->configName() === 'c') && ($this->getParentConnectionType() !== 'd')) {
                throw new \Exception('Replica tables must always be changed from a district connection');
            }
            if (in_array($this->type(), ['update', 'delete'])) {
                $this->_repository->validateUpdateDeleteCondition($this->type(), $this->conditions);
            }
    
            return ($this->_repository && $this->_repository->behaviors()->has('Replicate'));
        }
    
        public function where($conditions = null, $types = [], $overwrite = false)
        {
            $preparedConditions = is_array($conditions) ? $conditions : [$conditions];
            $this->conditions = array_merge($this->conditions, $preparedConditions);
    
            return parent::where($conditions, $types, $overwrite);
        }