Search code examples
phpassociative-arraydata-processing

How to create php associative array from messy data


Supposing you have a html form with a small amount of backend php. The form has only one field, which returns either 1 or 0. For instance...

enter image description here

And the php script records this in a MySQL database along with the information about the user's OS.

So you might find that lots of Windows8 users like waffles, but no so much Linux Mint users.

"Oh... 87% of windows 8 users said yes... that's something I guess."

But the classes (namely, in this case, the OS type) aren't really known in advance by the developer. As such, the actual MySql table is a messy conglomeration of response (yes or no) along with the OS type; possibly bound to a primary key which is an autoincrementing integer. To make life easier there is a separate attribute for yes and no -so that every time a user says "yes" is recorded separately from when they say "no". The data may look something like:

ID | OS_type | YES | NO

  1. Windows 7 | 1 | NULL.
  2. Windows 8 | NULL | 1
  3. Windows 7 | NULL | 1
  4. Jellybean | 1 | NULL
  5. Windows 8 | NULL | 1
  6. Ubunto | 1 | NULL
  7. Iphone 4 | 1 | NULL
  8. Windows 7| NULL | 1
  9. Jellybean | 1 | NULL

and so on

So the problem comes when trying to automatically get some sort of coherent detail from this dataset. You can't simply make an associative array based on a key-value pair of OS-response because there are many duplicate OS keys. Really an associative array of unique keys has to be created that will not lose the critical data: that in its creation, the count of the number of "yes"es and "no"s will be attached as values to the respective keys. How might one go about making such an associate array?


Solution

  • Assuming you are grabbing the whole dataset from MySQL (i.e. "SELCET * FROM waffles" into $waffleResponse array)

    $resultSet = array();
    foreach($waffleResponse as $r)
    {
        if (!array_key_exists($r['OS_type'], $resultSet))
        {
            $resultSet[$r['OS_type']] = array(
                'YES'=>0,
                'NO'=>0
            );
        }
        $resultSet[$r['OS_type']]['YES'] += $r['YES'];
        $resultSet[$r['OS_type']]['NO']  += $r['NO'];
    }
    var_dump($resultSet);
    

    Would output:

    array(5) {
      ["Windows 7"]=>
      array(2) {
        ["YES"]=>
        int(1)
        ["NO"]=>
        int(2)
      }
      ["Windows 8"]=>
      array(2) {
        ["YES"]=>
        int(0)
        ["NO"]=>
        int(2)
      }
      ["Jellybean"]=>
      array(2) {
        ["YES"]=>
        int(2)
        ["NO"]=>
        int(0)
      }
      ["Ubunto"]=>
      array(2) {
        ["YES"]=>
        int(1)
        ["NO"]=>
        int(0)
      }
      ["Iphone 4"]=>
      array(2) {
        ["YES"]=>
        int(1)
        ["NO"]=>
        int(0)
      }
    }