I can't detect the MIME type of an xlsx Excel file via PHP because it's a zip archive.
The file utility:
file file.xlsx
file.xlsx: Zip archive data, at least v2.0 to extract
PECL fileinfo:
$finfo = finfo_open(FILEINFO_MIME_TYPE);
echo finfo_file($finfo, "file.xlsx");
application/zip
How can I validate it? Unpack it and inspect the structure? But what if it's an archive bomb?
I know this works for zip files, but I'm not too sure about xlsx files. It's worth a try:
To list the files in a zip archive:
$zip = new ZipArchive;
$res = $zip->open('test.zip');
if ($res === TRUE) {
    // Print the metadata of every entry without extracting anything
    for ($i = 0; $i < $zip->numFiles; $i++) {
        print_r($zip->statIndex($i));
    }
    $zip->close();
} else {
    echo 'failed, code:' . $res;
}
This will print all the files like this:
Array
(
    [name] => file.png
    [index] => 2
    [crc] => -485783131
    [size] => 1486337
    [mtime] => 1311209860
    [comp_size] => 1484832
    [comp_method] => 8
)
As you can see, it gives the size and the comp_size for each entry in the archive. If it is an archive bomb, the ratio between these two numbers will be astronomical. You could simply set a limit on the maximum decompressed size, in however many megabytes you like; if an entry exceeds it, skip that file and return an error message to the user, otherwise proceed with the extraction. See the manual for more information.
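The size check described above could be sketched roughly like this. The function name looksLikeSafeXlsx, the 50 MB total limit and the 100:1 ratio limit are all assumptions made for illustration, not recommended values. The sketch also assumes that an xlsx file contains [Content_Types].xml and xl/workbook.xml, which holds for typical Excel-produced files but has not been verified for every producer. Note that the sizes come from the archive's own headers, so this is a first filter rather than a guarantee.

```php
<?php
// Hypothetical helper: the name, limits and required entries are illustrative assumptions.
function looksLikeSafeXlsx(string $path, int $maxTotalBytes = 50 * 1024 * 1024, int $maxRatio = 100): bool
{
    $zip = new ZipArchive;
    if ($zip->open($path) !== true) {
        return false; // not a readable zip archive
    }

    $total = 0;
    $ok = true;
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $stat = $zip->statIndex($i);
        if ($stat === false) {
            $ok = false;
            break;
        }
        $total += $stat['size'];
        // Reject an entry whose declared size is far larger than its compressed size,
        // or an archive whose declared total size exceeds the limit.
        if ($stat['size'] > $maxRatio * max(1, $stat['comp_size']) || $total > $maxTotalBytes) {
            $ok = false;
            break;
        }
    }

    // Assumption: a typical xlsx package contains these two parts.
    if ($ok) {
        $ok = $zip->locateName('[Content_Types].xml') !== false
            && $zip->locateName('xl/workbook.xml') !== false;
    }

    $zip->close();
    return $ok;
}
```

Calling looksLikeSafeXlsx('file.xlsx') before any extraction lets you reject the upload early; only the entry metadata is read, so nothing is decompressed during the check.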