Search code examples
phpfunctional-programmingstateless

Transform array using a declarative approach


Given the following code:

$flat = [
    [ '10', 'hoho'],
    [ '10', null],
    [ '13', null],
    [ '10', 'ahha']
];

//imperative, procedural approach
$hierarchical = [];
foreach ($flat  as $entry) {
    $id = $entry[0];

    $hierarchical[$id]['id'] = $id;
    $hierarchical[$id]['microtags'] = $hierarchical[$id]['microtags'] ?? [];
    if ($entry[1] != null)
        array_push($hierarchical[$id]['microtags'], $entry[1]);
}

And its result ($hierarchical):

 array (
   10 => 
   array (
     'id' => '10',
     'microtags' => 
     array (
       0 => 'hoho',
       1 => 'ahha',
     ),
   ),
   13 => 
   array (
     'id' => '13',
     'microtags' => 
     array (
     ),
   ),
 )

Is it possible to refactor it to a reasonably efficient declarative/functional approach? Like using array transformation functions (map,reduce,filter,etc)? Also without changing references or altering the same variable. If so, how?


Solution

  • Creating and traversing trees of different shape is best accomplished by using functions. Below, we create functions node_create and node_add_child which encode our intention. Finally, we use array_reduce to complete the transformation. $flat remains untouched; our reducing operation only reads from the input data.

    function node_create ($id, $children = []) {
      return [ "id" => $id, "children" => $children ];
    }
    
    function node_add_child ($node, $child) {
      return node_create ($node['id'], array_merge ($node['children'], [ $child ]));
    }
    
    $flat =
      [ [ '10', 'hoho' ]
      , [ '10', null ]
      , [ '13', null ]
      , [ '10', 'ahha' ]
      ];
    
    $result =
      array_reduce ($flat, function ($acc, $item) {
        list ($id, $value) = $item;
        if (! array_key_exists ($id, $acc))
          $acc [$id] = node_create ($id);
        if (! is_null ($value))
          $acc [$id] = node_add_child ($acc [$id], $value);
        return $acc;
      }, []);
    

    And the result

    print_r ($result);
    // Array
    // (
    //     [10] => Array
    //         (
    //             [id] => 10
    //             [children] => Array
    //                 (
    //                     [0] => hoho
    //                     [1] => ahha
    //                 )
    //         )
    //     [13] => Array
    //         (
    //             [id] => 13
    //             [children] => Array
    //                 (
    //                 )
    //         )
    // )
    

    Above, we use an associative array for $acc which means we have to use PHP's built-in functions for interaction with associative arrays. We can abstract away PHP's ugly, non-functional interfaces for more favourable ones.

    function has ($map, $key) {
      return array_key_exists ($key, $map);
    }
    
    function get ($map, $key) {
      return $map [$key];
    }
    
    function set ($map, $key, $value = null) {
      $map [$key] = $value;
      return $map;
    }
    

    We move the logic for adding null children to node_add_child

    function node_create ($id, $children = []) {
      return [ "id" => $id, "children" => $children ];
    }
    
    function node_add_child ($node, $child = null) {
      if (is_null ($child))
        return $node;
      else
        return node_create ($node['id'], array_merge ($node['children'], [ $child ]));
    }
    

    Now we can see a much more declarative reduce

    function make_tree ($flat = []) {
      return 
        array_reduce ($flat, function ($acc, $item) {
          list ($id, $value) = $item;
          return 
              set ( $acc
                  , $id
                  , has ($acc, $id)
                      ? node_add_child (get ($acc, $id), $value)
                      : node_add_child (node_create ($id), $value)
                  );
        }, []);
    }
    
    print_r (make_tree ($flat));
    // same output as above
    

    Above, we see how has, get, and set can simplify our reduce operation. However, this kind of approach can lead to lots of small, separated functions. Another approach involves inventing your own data type that satisfies your needs. Below, we scrap the separated functions we created above and trade them for a class, MutableMap

    class MutableMap {
      public function __construct ($data = []) {
        $this->data = $data;
      }
      public function has ($key) {
        return array_key_exists ($key, $this->data);
      }
      public function get ($key) {
        return $this->has ($key)
          ? $this->data [$key]
          : null
        ;
      }
      public function set ($key, $value = null) {
        $this->data [$key] = $value;
        return $this;
      }
      public function to_assoc () {
        return $this->data;
      }
    }
    

    Now instead of having to pass $acc, to each function, we swap it out for $map which is an instance of our new type

    function make_tree ($flat = []) {
      return 
        array_reduce ($flat, function ($map, $item) {
          list ($id, $value) = $item;
          return
            $map -> set ( $id
                        , $map -> has ($id)
                            ? node_add_child ($map -> get ($id), $value)
                            : node_add_child (node_create ($id), $value)
                        );
        }, new MutableMap ())
        -> to_assoc ();
    }
    

    Of course you could swap node_create and node_add_child out for a class-based implementation, class Node { ... }. This exercise is left for the reader.

    function make_tree ($flat = []) {
      return 
        array_reduce ($flat, function ($map, $item) {
          list ($id, $value) = $item;
          return
            $map -> set ( $id
                        , $map -> has ($id)
                            ? $map -> get ($id) -> add_child ($value)
                            : (new Node ($id)) -> add_child ($value)
                        );
        }, new MutableMap ())
        -> to_assoc ();
    }