Search code examples
phppostgresqlloopsgroupingrecords

Grouping similar records with php


I need help writing the logic of the php script which sorts data into a certain format...

Firstly the script needs to loop through each s1 value and ping an endpoint to get the ml values (more like) which are actually referencing other s1 records. this is the easy part! the data is returned like so;

Table 1
s1  |   ml
----------
1   |   -
2   |   3,4    
3   |   2,8,9
4   |   -
5   |   2
6   |   1
7   |   10
8   |   -
9   |   -
10  |   -

Condition 1: As you can see the endpoint returns data for the s1 value telling it other s1 records are similar to other ones, but the direction of ml is not always bidirectional. sometimes, like when s1=6 the ml value is 1 however when s1=1 there isn't a ml value.

Condition 2: Again just to explain the ml records, look above and below where s1=5 (above) and where s1=2 + rec=5 (below), this script needs to realise there is already an s1 record for it's value and that it should be added there.

Condition 3: Note how when s1=2,ml=3 this is stored but when s1=3,ml=2 this is ignored because we have the reverse record.

I basically want to match all the data into 1 sorted 'profile' so it ends up in the below format which i will store in another db table of 'sorted' records.

Table 2
s1  |   rec
----------
2   |   3
2   |   4
2   |   8
2   |   9
2   |   9
2   |   5
6   |   1
7   |   10

This has been racking my brains for days now, I need something thats efficient because in the end it will deal with millions of records and I'm sure there is an easy solution but i just can't figure how to start it.

I tried the following, but I'm stuck and don't know how to go further;

public function getrelated($id='', $t=''){

    if($id != ""){
    $get = Easytest::where('s1','=',$id)->get();

        if(count($get) > 0){
            $ret= array();
            foreach($get as $go){
                    $v = explode(",", $go->s2);

                    foreach ($v as $e) {
                        if($e != $t){
                            $ret[$e] = $this->getrelated($e, $id);
                        }
                    }
                }
                if(count($ret) > 0){
                    return $ret;
                }else{
                    return "";
                }

        }else{
                return $id;
        }
     }else{
        return "";
     }

     }

public function easytest(){
    ob_start();
    $a = array(
                array("s1"=>1,"s2"=>implode(",",array()).""),
                array("s1"=>2,"s2"=>implode(",",array(3,4)).","),
                array("s1"=>3,"s2"=>implode(",",array(2,8,9)).","),
                array("s1"=>4,"s2"=>implode(",",array()).""),
                array("s1"=>5,"s2"=>implode(",",array(2)).","),
                array("s1"=>6,"s2"=>implode(",",array(1)).","),
                array("s1"=>7,"s2"=>implode(",",array(10)).","),
                array("s1"=>8,"s2"=>implode(",",array()).""),
                array("s1"=>9,"s2"=>implode(",",array()).""),
                array("s1"=>10,"s2"=>implode(",",array()).""),
                array("s1"=>11,"s2"=>implode(",",array(12)).","),
                array("s1"=>12,"s2"=>implode(",",array(2)).",")
                );

    //return Easytest::insert($a);

    $records = Easytest::all();

    foreach ($records as $record) {

        $id = $record->s1;
        echo "ROW: ".$id." > ";
        $record->s2 = ltrim($record->s2,",");
        $ml = explode(",",$record->s2);
        if(count($ml) >= 1){
            foreach ($ml as $t) {

                echo "RESULT: ".$t." -".print_r($this->getrelated($t, $id), true);
                echo ",\n";

            }
        }
        echo " <br><br>\n\n";

    }

    return ob_get_clean();

}

Solution

  • Ok, so I eventually solved this... esentially this is the code below; improvements welcome :)

    You need to call the function like so

    related(array('searched'=>array(),'tosearch'=>array(13)));
    

    function:

    public function related($input){
    
    
        $searched = $input['searched'];
    
        $ar = array();
        $bits = array();
    
        if(count($input['tosearch']) != 0){
    
            $get = Easytest::orWhere(function($query) use ($input)
                            {
                                foreach ($input['tosearch'] as $k=>$v) {
                                        $query->orWhere('s2', 'LIKE', '%,'.$v.',%')->orWhere('s1', '=', $v);
                                    }                      
                            })
                ->orderBy('s1', 'ASC')->get();
    
                foreach ($input['tosearch'] as $k=>$v) {
                                    unset($input['tosearch'][$k]);
                                    $input['searched'][$v] = $v;
                    }
    
    
                foreach ($get as $result) {
    
                    $thesebits = explode(",", trim($result->s2,","));
                    foreach ($thesebits as $smallbit) {
                        if($smallbit != ""){
                        $bits[] = $smallbit;
                        }
                    }
    
                    $bits[] = $result->s1;
                    $bits = array_unique($bits);
    
                    foreach ($bits as $k=>$v) {
                            if(($key = array_search($v, $input['searched'])) == false) {
                                    $input['tosearch'][$v] = $v; 
                            }else{
                                unset($input['tosearch'][$v]);
                            }
    
                    }
    
                    $input['tosearch'] = array_unique($input['tosearch']);
    
                }
                return $this->related($input);
        }else{
            return $input;
        }
    
    }