I am trying to parse a large amount of JSON data generated by a remote web service. The output is paginated across 500 URIs, and each URI contains 100 JSON objects. I need to match a property of each JSON object, its DOI (digital object identifier), against a corresponding field fetched from a local database and then update that record.
The issue I am having is controlling my looping constructs to seek out the matching JSON DOI while making sure that all the data has been parsed.
As you can see below, I have tried to use a combination of break and continue statements, but I am not able to 'move' beyond the first URI.
I later introduced a flag variable to help control the loops, but it had no effect.
while($obj = $result->fetch_object()){
    for($i = 1; $i <= $outputs_json['meta']['response']['total-pages']; $i++){
        $url = 'xxxxxxxxxxxxxxx&page%5Bnumber%5D=' . $i . '&page%5Bsize%5D=100';

        // The extra parentheses matter: without them === binds before =, so
        // $outputs ends up holding a boolean instead of the decoded array.
        // Note also that json_decode() returns null on failure, not false.
        if(($outputs = json_decode(file_get_contents($url), true)) === null){
            continue; // skip pages that fail to decode
        }

        try{
            $flag = false;
            $j = 0;
            do{
                $doi = trim($outputs['data'][$j]['attributes']['identifiers']['dois'][0], '"');
                if(utf8_encode($obj->doi) === $doi){
                    $flag = true; // DOI matched: stop scanning this page
                    break;
                }
                $j++;
            }while($j !== 100); // 100 objects per page, indexed 0-99

            if($flag === true){
                break; // stop paging for this record once a match is found
            }
        }catch(Exception $e){
            // ignore and move on to the next page
        }
    }
}
What is the optimal approach that guarantees every JSON object at every URI is parsed, and that CRUD operations are only performed on my database when a fetched record's DOI field matches the DOI property of the incoming JSON data?
I'm not 100% sure I understand every aspect of your question, but to me it would make sense to change the order of execution.
I think it's not a good idea to fetch one record from the local DB and then try to find it across 100 different remote calls. Instead, it's better to base your workflow/loops on the fetched remote data and look up the corresponding records in your local DB.
If you think that approach fits your task, I can of course help you with the code :)
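For example, something along these lines. This is only a rough sketch of the idea, not a drop-in solution: it assumes a mysqli connection in $mysqli and a hypothetical publications table with doi and status columns, so the endpoint, queries, and column names all need to be adapted to your actual schema.

<?php
// Sketch: iterate the remote pages once and look each DOI up locally.
// Assumed names (publications table, doi/status columns, $mysqli connection)
// are placeholders -- replace them with whatever your schema really uses.

$baseUrl    = 'xxxxxxxxxxxxxxx';   // your endpoint (placeholder from the question)
$pageSize   = 100;
$totalPages = 500;                 // or take it from meta.response.total-pages

// Prepare the statements once, outside the loops
$select = $mysqli->prepare('SELECT id FROM publications WHERE doi = ? LIMIT 1');
$update = $mysqli->prepare('UPDATE publications SET status = ? WHERE id = ?');

for ($page = 1; $page <= $totalPages; $page++) {
    $url  = $baseUrl . '&page%5Bnumber%5D=' . $page . '&page%5Bsize%5D=' . $pageSize;
    $body = file_get_contents($url);
    if ($body === false) {
        continue; // network error: skip this page (or log / retry)
    }

    $outputs = json_decode($body, true);
    if (!is_array($outputs) || empty($outputs['data'])) {
        continue; // malformed or empty page
    }

    foreach ($outputs['data'] as $item) {
        $doi = trim($item['attributes']['identifiers']['dois'][0] ?? '');
        if ($doi === '') {
            continue; // object carries no DOI, nothing to match
        }

        // Look the DOI up in the local database
        $select->bind_param('s', $doi);
        $select->execute();
        $row = $select->get_result()->fetch_object(); // get_result() needs mysqlnd

        if ($row) {
            // Matching record found: perform the update
            $status = 'synced';                        // whatever you actually need to write
            $update->bind_param('si', $status, $row->id);
            $update->execute();
        }
    }
}

This way every JSON object is visited exactly once (no nested paging per DB record), and the prepared SELECT/UPDATE pair makes sure a write only happens when a DOI actually matches a local record.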