Search code examples
cakephpormunique-constraintcakephp-4.x

belongsToMany: allow and reuse an associated entry with existing UNIQUE title


What I have

My belongsToMany association is similar to the one from CakePHP Cookbook. However, I have set the UNIQUE constraint on tag titles.

(Another difference, which may be irrelevant, is I have added a site_id field next to every tag in the Tags table, and another composite UNIQUE constraint is set on both tag and site_id.)

What doesn't work

Submitting a duplicate tag title results in an error.

When I debug my new Article entity before saving it, I can see that the duplicate tag titles are rejected after a validation attempt.

'tags' => [
    // This submitted tag title already exists in Tags
    (int) 0 => object(App\Model\Entity\Tag) id:1 {
        'site_id' => (int) 2
        '[new]' => true
        '[accessible]' => [
            'site_id' => true,
            'title' => true,
            'created' => true,
            'modified' => true,
            'site' => true,
            'articles' => true,
        ]
        '[dirty]' => [
            'site_id' => true,
        ]
        '[original]' => [
        ]
        '[virtual]' => [
        ]
        '[hasErrors]' => true
        '[errors]' => [
            'title' => [
                'unique' => 'The provided value is invalid', // ← error
            ],
        ]
        '[invalid]' => [
            'title' => 'test',
        ]
        '[repository]' => 'Tags'
    },

    // …

    // This submitted tag title does *not* already exist in Tags
    (int) 3 => object(App\Model\Entity\Tag) id:4 {
        'title' => 'tag'
        'site_id' => (int) 2
        '[new]' => true
        '[accessible]' => [
            'site_id' => true,
            'title' => true,
            'created' => true,
            'modified' => true,
            'site' => true,
            'articles' => true,
        ]
        '[dirty]' => [
            'title' => true, // ← no error
            'site_id' => true,
        ]
        '[original]' => [
        ]
        '[virtual]' => [
        ]
        '[hasErrors]' => false
        '[errors]' => [
        ]
        '[invalid]' => [
        ]
        '[repository]' => 'Tags'
    },
]

How I expected it to work?

The behaviour I'm looking for is if a tag already exists, then take its ID and just link the submitted article entry to that existing ID. So ON DUPLICATE KEY clause, in a way.

Is there a flag that I'm missing that would tell/allow the ORM to do this, or should I maybe start trying some ->epilog() tricks?


Solution

  • There is no such functionality for the ORM saving process, no, and you cannot make use of epilog() with the default ORM save() process, you'd have to actually create the insert query manually then, however you cannot really use entities then, and it wouldn't solve the validation problem, and you'd have to more or less manually apply validation and application rules (you don't want to blindly insert data into insert queries, even tough Query::values() binds data).

    I'd probably suggest to check if a solution that modifies the data before marshalling would be a good fit, that would integrate transparently into process. You could use your unique index columns to look up existing rows, and inject their primary key values into the request data, then the patching/marshalling process will be able to properly look up the existing records and update them accordingly.

    Depending on the specific use case this could be more work than manually constructing insert queries, but it will IMHO integrate nicer. In your specific case it's probably easier, as using manual insert queries would require you to insert data for all the different tables separately, as you cannot make use of the ORM's association saving functionality with manually constructed insert queries.

    To finish things off, here's some untested quick & dirty example code to illustrate the concept:

    // in ArticlesTable
    
    public function beforeMarshal(
        \Cake\Event\EventInterface $event,
        \ArrayAccess $data,
        \ArrayObject $options
    ): void {
        // extract lookup keys from request data
        $keys = collection($data['tags'])
            ->extract(function ($row) {
                return [
                    $row['tag'],
                    $row['site_id'],
                ];
            })
            ->toArray();
    
        // query possibly existing rows based on the extracted lookup keys
        $query = $this->Tags
            ->find()
            ->select(['id', 'tag', 'site_id'])
            ->where(
                new \Cake\Database\Expression\TupleComparison(
                    ['tag', 'site_id'],
                    $keys,
                    ['string', 'integer'],
                    'IN'
                )
            )
            ->disableHydration();
    
        // create a map of lookup keys and primary keys from the queried rows
        $map = $query
            ->all()
            ->combine(
                function ($row) {
                    return $row['tag'] . ';' . $row['site_id'];
                },
                'id'
            )
            ->toArray();
    
        // inject primary keys based on whether lookup keys exist in the map
        $data['tags'] = collection($data['tags'])
            ->map(function ($row) use ($map) {
                $key = $row['tag'] . ';' . $row['site_id'];
                if (isset($map[$key])) {
                    $row['id'] = $map[$key];
                }
    
                return $row;
            })
            ->toArray();
    }
    

    With the primary keys of existing records injected, marshalling, validation, rules and saving should be able to properly distinguish what's to update and what's to insert, ie you should be able to continue using the default ORM saving process just like you're used to.

    See also