php arrays regex preg-match text-parsing

Convert string with no delimiters into associative multidimensional array


I need to parse a string that has no delimiting character to form an associative array.

Here is an example string:

*01the title*35the author*A7other useless infos*AEother useful infos*AEsome delimiters can be there multiple times

Every "key" (which precedes its "value") consists of an asterisk (*) followed by two characters, each an uppercase letter or a digit. I use this regex pattern: /\*[A-Z0-9]{2}/

This is my preg_split() call:

$attributes = preg_split('/\*[A-Z0-9]{2}/', $line);

This isolates each "value", but I also need to capture each "key" to build my desired associative array.

What I get looks like this:

$matches = [
    0 => 'the title',
    1 => 'the author',
    2 => 'other useless infos',
    3 => 'other useful infos',
    4 => 'some delimiters can be there multiple times'
];

My desired output is:

$matches = [
    '*01' => 'the title',
    '*35' => 'the author',
    '*A7' => 'other useless infos',
    '*AE' => [
        'other useful infos',
        'some delimiters can be there multiple times',
    ],
];

Solution

  • Use the PREG_SPLIT_DELIM_CAPTURE flag of preg_split() and wrap the pattern in parentheses, so that the captured delimiters are returned along with the values (see the preg_split documentation).

    So in your case:

    # The -1 is the limit parameter (no limit)
    $attributes = preg_split('/(\*[A-Z0-9]{2})/', $line, -1, PREG_SPLIT_DELIM_CAPTURE);
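
    As a quick illustration of the shape of the result (a sketch only, with $line set to the example string from the question):

    ```php
    <?php
    // Example line copied from the question.
    $line = '*01the title*35the author*A7other useless infos*AEother useful infos*AEsome delimiters can be there multiple times';

    // The parentheses turn the delimiter into a capture group,
    // which PREG_SPLIT_DELIM_CAPTURE then includes in the result.
    $attributes = preg_split('/(\*[A-Z0-9]{2})/', $line, -1, PREG_SPLIT_DELIM_CAPTURE);

    // $attributes starts with: ['', '*01', 'the title', '*35', 'the author', ...]
    print_r(array_slice($attributes, 0, 5));
    ```

    Because the example line begins with a delimiter, the first element is an empty string.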
    

    Element 0 of $attributes now holds everything before the first delimiter (an empty string for the example line), and the remaining elements alternate between a captured delimiter and the value that follows it. You can therefore build your $matches array like this (assuming that you do not want to keep element 0):

    $matches = [];
    for ($i = 1; $i < count($attributes) - 1; $i += 2) {
        // Odd indexes hold the keys; the value sits at the next index
        $matches[$attributes[$i]] = $attributes[$i + 1];
    }
    

    This loop overwrites the stored value whenever a key such as *AE appears more than once. To keep every value, adjust the line inside the for loop so that it checks whether the key already exists and, if so, turns the entry into an array.

    Edit: one way to create the array only when it is needed:

    $matches = [];
    for ($i = 1; $i < count($attributes) - 1; $i += 2) {
        $key = $attributes[$i];
        if (array_key_exists($key, $matches)) {
            // Repeated key: wrap the existing string in an array first
            if (!is_array($matches[$key])) {
                $matches[$key] = [$matches[$key]];
            }
            $matches[$key][] = $attributes[$i + 1];
        } else {
            $matches[$key] = $attributes[$i + 1];
        }
    }
    

    The code can certainly be simplified, especially if you put all values in (possibly single-element) arrays, because every key can then be handled the same way.
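
    A rough sketch of that simplification, assuming the same example line as in the question. The $collapsed step is an optional addition that is not part of the original suggestion; it only restores the exact shape asked for in the question, and its arrow function needs PHP 7.4 or later:

    ```php
    <?php
    // Example line copied from the question.
    $line = '*01the title*35the author*A7other useless infos*AEother useful infos*AEsome delimiters can be there multiple times';
    $attributes = preg_split('/(\*[A-Z0-9]{2})/', $line, -1, PREG_SPLIT_DELIM_CAPTURE);

    $matches = [];
    // Index 0 is the text before the first key; keys sit at odd indexes.
    for ($i = 1; $i < count($attributes) - 1; $i += 2) {
        // Appending with [] creates the inner array on first use,
        // so no existence check is needed.
        $matches[$attributes[$i]][] = $attributes[$i + 1];
    }

    // Optional: unwrap single-element arrays to match the desired output.
    $collapsed = array_map(fn($values) => count($values) === 1 ? $values[0] : $values, $matches);

    print_r($collapsed);
    ```

    Keeping every value in an array means later code can loop over each entry without first testing whether it is a string or an array.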