Editing uploaded .docx file with php


I would like to apologize in advance as I'm fairly new to web development and php. I'm currently trying to edit an uploaded .docx file using php and then make it available for download.

Since .docx files are zip archives, my plan is to unzip the uploaded .docx, navigate to the "word/" folder, find the XML files I need, edit them with str_replace() or something similar, re-zip the directory so that MS Word can open it, and make it available for download. I would prefer to do this without saving the file to a permanent location, but I don't know if that's possible.

The problem I am having so far is that I cannot figure out how to navigate through the unzipped folder to get to the .xml files to edit. This is the code so far:

<a href="../index.php">home</a>
<?php
$allowedExts = array("docx");
$temp = explode(".", $_FILES["upload"]["name"]);
$extension = end($temp);

if (($_FILES["upload"]["type"] == "application/vnd.openxmlformats-officedocument.wordprocessingml.document")
&& ($_FILES["upload"]["size"] < 20000)
&& in_array($extension, $allowedExts)) {
  if ($_FILES["upload"]["error"] > 0) {
    echo "Error: " . $_FILES["upload"]["error"] . "<br>";
  } else {
    echo "Upload: " . $_FILES["upload"]["name"] . "<br>";
    echo "Type: " . $_FILES["upload"]["type"] . "<br>";
    echo "Size: " . ($_FILES["upload"]["size"] / 1024) . " kB<br>";
    echo "Stored in: " . $_FILES["upload"]["tmp_name"] . "<br>";

    // $document = $_FILES["upload"]["tmp_name"].'/'. $_FILES["upload"]["name"].$extension;
    $document = moveUploadFile($_FILES["upload"]["tmp_name"], $_FILES["upload"]["name"]);
    unzipDocFile($document);

  } 
} else {
  echo "Invalid file <br>";
}


//-------------- functions ------------------//


function unzipDocFile($doc){
    $zip = new ZipArchive();
    // No ZipArchive::CREATE here: the archive must already exist, and a missing file should be an error.
    if ($zip->open($doc) !== TRUE) {
        exit("cannot open <$doc>\n");
    }
    // print_r($zip."<br>");
    // var_dump($zip);
    echo "numFiles: " . $zip->numFiles . "<br>";
    echo "status: " . $zip->status  . "<br>";
    echo "statusSys: " . $zip->statusSys . "<br>";
    echo "filename: " . $zip->filename . "<br>";
    echo "comment: " . $zip->comment . "<br>";
    $zip->close();
}

function moveUploadFile($tmp_name, $name){
    move_uploaded_file($tmp_name, "uploads/" . $name);
    return "uploads/" . $name;
}

function forceDocxDownload($tmp_name){
    header("Content-disposition: attachment; filename=$tmp_name");
    header("Content-type: application/vnd.openxmlformats-officedocument.wordprocessingml.document");
    readfile("$tmp_name");
}

?>

Any help at all would be much appreciated. Thanks in advance!


Solution

  • OK, there is a very simple way to get access to the contents of the zip archive: I did not realise that the contents have to be extracted first using extractTo().

    In the code below, the contents of the .docx file are extracted to uploads/tmp_doc/. From there we can manipulate any .xml files that we need to.

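    if (isset($_FILES['upload'])) {
        $errors = array();
        $zip = new ZipArchive;

        // pathinfo() avoids passing explode()'s result straight to end(),
        // which expects a variable by reference.
        $extension = strtolower(pathinfo($_FILES['upload']['name'], PATHINFO_EXTENSION));
        if ($extension !== 'docx') {
            $errors[] = 'Error: Wrong format';
        }

        // ZipArchive::open() returns true on success and an integer error
        // code on failure, so comparing against false never detects an error.
        if (empty($errors) && $zip->open($_FILES['upload']['tmp_name']) !== true) {
            $errors[] = 'Failed to open file';
        }

        if (empty($errors)) {
            // Unpack the whole archive so the XML parts under word/ can be edited.
            $zip->extractTo('./uploads/tmp_doc');
            $zip->close();
        }
    }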
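
    Extracting to disk is not strictly required. As a minimal sketch, assuming the text to change sits in word/document.xml, ZipArchive can also read and rewrite a single entry in place with getFromName() and addFromString(). The helper name replaceInDocx() and the {{name}} placeholder used in the usage note are made up for illustration and are not part of the original code.

    ```php
    <?php
    // Hypothetical helper (not from the original post): replaces text inside
    // word/document.xml without unpacking the archive to a directory.
    function replaceInDocx(string $docxPath, string $search, string $replace): bool
    {
        $zip = new ZipArchive();

        // open() returns true on success, or an integer error code on failure.
        if ($zip->open($docxPath) !== true) {
            return false;
        }

        // Assumption: the text of interest is in the main document part.
        $xml = $zip->getFromName('word/document.xml');
        if ($xml === false) {
            $zip->close();
            return false;
        }

        // Escape the replacement so characters such as & or < keep the XML well-formed.
        $safe = htmlspecialchars($replace, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        $xml = str_replace($search, $safe, $xml);

        // Adding an entry under an existing name replaces that entry.
        $added = $zip->addFromString('word/document.xml', $xml);

        // The modified archive is only written out when close() is called.
        $closed = $zip->close();

        return $added && $closed;
    }
    ```

    Called on the uploaded temp file, e.g. replaceInDocx($_FILES['upload']['tmp_name'], '{{name}}', 'Alice'), this would probably avoid keeping a permanent copy, and the result could then be sent to the browser with readfile(). Note that Word often splits visible text across several runs in the XML, so a plain str_replace() only matches when the search string happens to be stored contiguously.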