How to fix corrupted docx files programmatically (adding missing bytes)


I'm trying to repair a large batch of corrupted .docx files in Classic ASP.

(The files are missing bytes at the end - as detailed in this question).

When I open a file in Sublime Text (which shows it in a hex view), I can fix the corruption by adding 0000 to the end of the file.

But I am struggling to add those four hex zeros to the end programmatically.

I am attempting to use the cByteArray class, which is used like this:

With oByte
    Call .AddBytes(LoadBytes(sFilePath))
    Call .AddBytes(HOW DO I GET THE BYTE VALUE OF 0000 HERE?)
    lngBytes = .BytesTotal
    ByteArray = .ReturnBytes
End With

Call SaveBytesToBinaryFile(ByteArray, sNewFilePath)

I can't work out how to get the 0000 value into the .AddBytes() method.

How can I do this? I'm a bit out of my depth and not sure if I'm even approaching this the right way.


In my ignorance, here's what I have tried:


Redimming ByteArray and leaving the extra bytes empty (because I think the 0000 represents null values).

This doesn't seem to change the file at all. The new saved file is identical to the old file.

With oByte
    Call .AddBytes(LoadBytes(sFilePath))
    ByteArray = .ReturnBytes
End With

arrayLength = ubound(ByteArray)
redim ByteArray(arrayLength + 2)

Call SaveBytesToBinaryFile(ByteArray, sNewFilePath)

Converting 0000 from hex to bytes and adding it to the corrupted file bytes.

Again, this doesn't seem to change the file at all.

dim k, hexString, str, stream, byteArrToAdd
hexString = "000000"
For k = 1 To Len(hexString) Step 2
 str = str & Chr("&h" & Mid(hexString, k, 2))
response.write "<hr />" & str & "<hr />"
Next

Set stream = CreateObject("ADODB.Stream")
With stream
 .Open
 .Type = 2       ' set type "text"
 .WriteText str  
 .Position = 0
 .Type = 1       ' change type to "binary"
 byteArrToAdd = .Read 
 .Close
End With
set stream = nothing

With oByte
    Call .AddBytes(LoadBytes(sFilePath))
    Call .AddBytes(byteArrToAdd)
    ByteArray = .ReturnBytes
End With

Call SaveBytesToBinaryFile(ByteArray, sNewFilePath)

Getting the final byte of the corrupted file, and assigning it to 2 new elements after redimming ByteArray.

This doesn't seem to change the file at all either!!

With oByte
    Call .AddBytes(LoadBytes(sFilePath))
    ByteArray = .ReturnBytes
End With


arrayLength = ubound(ByteArray)
finalByte = ByteArray(arrayLength)
redim ByteArray(arrayLength + 2)
ByteArray(arrayLength + 1) = finalByte
ByteArray(arrayLength + 2) = finalByte

Call SaveBytesToBinaryFile(ByteArray, sNewFilePath)

Solution

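  • Each pair of hex digits is one byte, so the 0000 shown in the hex view is two zero bytes. You can append them with a binary ADODB.Stream, using a user-defined function that converts a string to a Byte() array, like the following.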

    Function GetBytesOf(str) 'returns bytes of given string
        With CreateObject("Adodb.Stream")
            .Type = 2 'text
            .Charset = "x-ansi" 
            .Open
            .WriteText str
            .Position = 0
            .Type = 1 'binary
            GetBytesOf = .Read 'returns Byte()
            .Close
        End With
    End Function
    
    Dim patch
    patch = GetBytesOf(Chr(0) & Chr(0)) 'two zero bytes, i.e. hex 00 00
    
    With CreateObject("Adodb.Stream")
        .Type = 1 'binary
        .Open
        .LoadFromFile sFilePath
        'move cursor to the end of file
        .Position = .Size
        .Write patch
        .SaveToFile sNewFilePath, 2 '2 for overwrite if exists
        .Close
    End With