Extracting the text from a PPTX file works perfectly, but if a PPSX file sits on the same URL (same server and permissions), ZipArchive::open() fails with error code 9 (ZipArchive::ER_NOENT). Can someone help determine why a PPSX is treated differently from a PPTX, even though both use the same Open XML standard? How can I extract the text from a PPSX file?
For reference, the MIME type is: application/vnd.openxmlformats-officedocument.presentationml.slideshow
<?php
$fileText = ''; // default, so the template below never reads an undefined variable
if (isset($_POST['processFile']) && isset($_FILES['file']['tmp_name']))
{
    $fileText = ppsx_to_text($_FILES['file']['tmp_name']);
}

// Concatenates the text of every slide found in a PPTX/PPSX package.
function ppsx_to_text($path_to_file)
{
    $zip_handle = new ZipArchive();
    $response = '';
    if (true === $zip_handle->open($path_to_file)) // <-- fails to open / recognize PPSX as zip
    {
        $slide_number = 1; // loop through slide files
        $doc = new DOMDocument();
        while (($xml_index = $zip_handle->locateName('ppt/slides/slide' . $slide_number . '.xml')) !== false)
        {
            $xml_data = $zip_handle->getFromIndex($xml_index);
            // LIBXML_NOENT and LIBXML_XINCLUDE are left out: they would expand
            // external entities and includes in an untrusted upload.
            $doc->loadXML($xml_data, LIBXML_NOERROR | LIBXML_NOWARNING);
            $response .= strip_tags($doc->saveXML());
            $slide_number++;
        }
        $zip_handle->close();
    }
    return $response;
}
?>
<form id="content_form" class="the_form" action="" method="post" enctype="multipart/form-data">
    <label for="file">Choose file to upload</label>
    <input type="file" id="file" name="file">
    <button type="submit" value="processFile" name="processFile">Process</button>
    <div><?php echo htmlspecialchars($fileText ?? ''); ?></div>
</form>
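To narrow down where the open call fails, a small diagnostic along these lines may help. It is only a sketch: `describe_zip_open_failure` is a hypothetical helper name, and it assumes the zip extension is loaded. It checks that the path is a real local file, that the file starts with the usual "PK" ZIP signature, and otherwise reports the code returned by `ZipArchive::open()`.

```php
<?php
// Hypothetical helper (not part of the original code): explains why
// ZipArchive::open() might reject a given path.
function describe_zip_open_failure(string $path): string
{
    // ER_NOENT means "no such file", so check the path first.
    if ($path === '' || !is_file($path)) {
        return 'not a local file';
    }

    // ZIP containers, PPTX and PPSX included, normally start with "PK".
    $signature = file_get_contents($path, false, null, 0, 2);
    if ($signature !== 'PK') {
        return 'missing ZIP signature';
    }

    $zip = new ZipArchive();
    $result = $zip->open($path);
    if ($result === true) {
        $zip->close();
        return 'ok';
    }

    // Names for a few of the error codes ZipArchive::open() can return.
    $names = [
        ZipArchive::ER_NOENT  => 'ER_NOENT (no such file)',
        ZipArchive::ER_OPEN   => 'ER_OPEN (cannot open file)',
        ZipArchive::ER_READ   => 'ER_READ (read error)',
        ZipArchive::ER_NOZIP  => 'ER_NOZIP (not a zip archive)',
        ZipArchive::ER_INCONS => 'ER_INCONS (zip archive inconsistent)',
    ];
    if (is_int($result) && isset($names[$result])) {
        return $names[$result];
    }
    return 'unexpected result: ' . var_export($result, true);
}
```

Calling it with `$_FILES['file']['tmp_name']` right before `ppsx_to_text()` should show whether the PPSX ever reached the server as a readable file, or whether it arrived but is not being recognized as a ZIP container.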
If you do have access to the server (assuming it's Windows), take a look at its MIME type settings.
Your problem may be with how the MIME types are configured and served on request. If it is Linux, look up the platform-specific settings. If you can't control the server, then your only other option is to work from a local copy or an in-memory copy, as you have done.
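The local-copy route could be sketched roughly as follows. `fetch_to_temp_file` is a hypothetical helper name, and the sketch assumes `allow_url_fopen` is enabled whenever the source is a URL. It copies the source into a temporary file so that ZipArchive is handed a plain local path, which is what it expects.

```php
<?php
// Hypothetical helper (not from the original post): copies a URL or path
// to a local temporary file and returns that file's path, or null on failure.
function fetch_to_temp_file(string $source): ?string
{
    $tmp = tempnam(sys_get_temp_dir(), 'ppsx');
    if ($tmp === false) {
        return null; // could not create a temporary file
    }

    // copy() accepts URLs only when allow_url_fopen is on (assumption).
    if (!@copy($source, $tmp)) {
        unlink($tmp);
        return null; // download or copy failed
    }
    return $tmp;
}

// Usage sketch, with the ppsx_to_text() function from the question
// and a placeholder $url:
//
//   $local = fetch_to_temp_file($url);
//   if ($local !== null) {
//       $text = ppsx_to_text($local);
//       unlink($local); // remove the temporary copy when done
//   }
```

Returning the path instead of an open archive keeps cleanup in the caller's hands: the temporary file can be deleted as soon as the text has been extracted.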