Let's say I want to load XML files from a remote server only if they are at most 10MB.
Something like:
$xml_file = "http://example.com/largeXML.xml";// size= 500MB
//PRACTICAL EXAMPLE: $xml_file = "http://www.cs.washington.edu/research/xmldatasets/data/pir/psd7003.xml";// size= 683MB
/* GOAL: prevent this large file from being loaded by DOMDocument, without having to load the whole file just to check its size */
$dom = new DOMDocument();
$dom->load($xml_file /* LOAD only IF the file size is <= 10MB, else echo 'File is too large' */);
How can this be achieved? Any ideas, alternatives, or a recommended approach would be highly appreciated.
I checked "PHP: Remote file size without downloading file", but when I try something like
var_dump(
curl_get_file_size(
"http://www.dailymotion.com/rss/user/dialhainaut/"
)
);
I get string 'unknown' (length=7)
When I try get_headers as suggested below, Content-Length is missing from the headers, so this will not work reliably either.
Please advise how to determine the length and avoid passing the file to DOMDocument if it exceeds 10MB.
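To make the missing-header case explicit, here is a minimal sketch of pulling Content-Length out of the array that get_headers($url, true) returns. The function name content_length_from_headers is hypothetical (it is not part of PHP or of the linked answer), and it only reports what the server chose to send, so a null result is expected for feeds like the one above.

```php
<?php
// Hypothetical helper: extract Content-Length from the array returned by
// get_headers($url, true). Returns null when the server did not send it.
function content_length_from_headers(array $headers): ?int
{
    foreach ($headers as $name => $value) {
        // The status line sits under an integer key, so skip non-string keys.
        if (is_string($name) && strcasecmp($name, 'Content-Length') === 0) {
            // After redirects the value is an array; the last entry
            // belongs to the final response.
            $last = is_array($value) ? end($value) : $value;
            return is_numeric($last) ? (int) $last : null;
        }
    }
    return null;
}
```

It could be called as content_length_from_headers(get_headers($xml_file, true) ?: []). A null result means the size has to be established some other way, for example by reading the body with a byte cap.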
OK, finally working. The headers approach was never going to work broadly, because servers are not required to send Content-Length. In this solution we open a file handle and read the XML line by line until the running total reaches the $max_B threshold. If the file is too big, we still pay the overhead of reading it up to the 10MB mark, but it works as expected. If the file is smaller than $max_B, it proceeds to load.
$xml_file = "http://www.dailymotion.com/rss/user/dialhainaut/";
//$xml_file = "http://www.cs.washington.edu/research/xmldatasets/data/pir/psd7003.xml";
$max_B = 10485760; // 10MB
// opening a URL with fopen() requires allow_url_fopen to be enabled
$fh = fopen($xml_file, "r");
if ($fh) {
    $file_string = '';
    $total_B = 0;
    // run through lines of the file, concatenating them into a string;
    // the length argument caps each read, so a file without line breaks
    // cannot be pulled into memory by a single fgets() call
    while (($line = fgets($fh, 8192)) !== false) {
        $total_B += strlen($line);
        if ($total_B >= $max_B) {
            break; // threshold reached, stop reading
        }
        $file_string .= $line;
    }
    fclose($fh);
    if ($total_B < $max_B) {
        echo 'File ok. Total size = '.$total_B.' bytes. Proceeding...';
        //proceed
        $dom = new DOMDocument();
        $dom->loadXML($file_string); //NOTE the method change because we're loading from a string
    } else {
        //reject
        echo 'File too big! Max size = '.$max_B.' bytes.';
    }
} else {
    // fopen() failed: file not found, host unreachable, or URL wrappers disabled
    echo 'Could not open the file!';
}
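A shorter variant of the same idea, as a rough sketch: file_get_contents() accepts a length argument, so asking for one byte more than the limit is enough to tell whether the source is too big. The function name load_xml_capped is hypothetical, and returning null for both "too big" and "unreadable" is a simplification I chose for illustration.

```php
<?php
// Hypothetical helper: reads at most $maxBytes + 1 bytes, so an oversized
// source is detected without pulling the rest of it into memory.
function load_xml_capped(string $source, int $maxBytes): ?DOMDocument
{
    // The fifth argument caps how many bytes file_get_contents() reads.
    $data = @file_get_contents($source, false, null, 0, $maxBytes + 1);
    if ($data === false || strlen($data) > $maxBytes) {
        return null; // unreadable or over the limit
    }
    $dom = new DOMDocument();
    // loadXML() returns false on malformed XML.
    return @$dom->loadXML($data) ? $dom : null;
}
```

Usage would look like $dom = load_xml_capped($xml_file, 10485760); followed by a null check. As with the loop above, an oversized file still costs a download of up to the limit, and remote URLs need allow_url_fopen enabled.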