.net httpwebrequest docx corruption

What could be causing corruption in .docx files sent with HttpWebRequest?


I am using HttpWebRequest to post a file, along with some additional form data, from an MVC app to a classic ASP site.

If the file is a .docx, it always arrives corrupted. Other file types seem to open fine, but that could be because their formats are more tolerant of damage.

When I opened the original and corrupted files in Sublime Text, I noticed that the corrupted file is missing a block of 0000 at the end. When I manually restore this block, the file opens fine.
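For context, a .docx file is a ZIP archive, and a ZIP archive ends with an "end of central directory" record whose last field is a two-byte comment length. When the archive has no comment, that field is 00 00, so a missing 0000 at the very end of the file is consistent with that record being cut short. The sketch below is one way to check this; the class and method names are illustrative, and it assumes the archive carries no ZIP comment.

```csharp
using System;

public static class ZipTailCheck
{
    // Returns true when the last 22 bytes form a ZIP end-of-central-directory
    // record with an empty archive comment (signature 50 4B 05 06, final two
    // bytes 00 00). Archives that carry a comment are not handled here.
    public static bool EndsWithEmptyCommentEocd(byte[] data)
    {
        const int eocdLength = 22;
        if (data == null || data.Length < eocdLength) return false;

        int start = data.Length - eocdLength;
        bool hasSignature =
            data[start] == 0x50 && data[start + 1] == 0x4B &&
            data[start + 2] == 0x05 && data[start + 3] == 0x06;
        bool commentLengthIsZero =
            data[data.Length - 2] == 0x00 && data[data.Length - 1] == 0x00;

        return hasSignature && commentLengthIsZero;
    }
}
```

Running this against the bytes of the original file and of the received file should show whether the tail record survived the upload.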


Is there something I'm doing incorrectly in my .NET code that is causing this to happen? Or is the problem more esoteric?

The classic ASP code uses Persits AspUpload to receive the file. This is used in numerous other places on the receiving site and has never caused any problems before, so I don't think the error lies there. Plus, it's just a simple call, and I can't see what there is to get wrong!

Set File = Upload.Files("fileField")

I'm at a loss as to how to start debugging this problem further.
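One possible starting point is to compare the sent and received files byte by byte instead of by eye. The helper below is a minimal sketch; `ByteDiff`, `FirstDifference` and `Describe` are illustrative names, not part of the code in this question.

```csharp
using System;

public static class ByteDiff
{
    // Returns the offset of the first differing byte, or -1 when the arrays
    // are identical. If one array is a prefix of the other, the offset is
    // the length of the shorter array.
    public static int FirstDifference(byte[] expected, byte[] actual)
    {
        int common = Math.Min(expected.Length, actual.Length);
        for (int i = 0; i < common; i++)
        {
            if (expected[i] != actual[i]) return i;
        }
        return expected.Length == actual.Length ? -1 : common;
    }

    // Summarises the comparison in one line for logging.
    public static string Describe(byte[] expected, byte[] actual)
    {
        int offset = FirstDifference(expected, actual);
        if (offset < 0) return "identical";
        return "lengths " + expected.Length + " vs " + actual.Length +
               ", first difference at offset " + offset;
    }
}
```

Passing the results of File.ReadAllBytes for the original and the received copy to Describe shows whether the damage is a truncation at the end (the first difference equals the shorter length) or a change somewhere in the middle.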


This is the code I'm using to post the file:

public async Task<string> TestFileSend()
{
    string result;

    var postToUrl = "https://www.mywebsite.com/receive-file.asp";

    Dictionary<string, string> extraData = new Dictionary<string, string>();
    extraData.Add("colour", "red");
    extraData.Add("name", "sandra");

    var filePath = "/path-to-file/file.docx";
    byte[] fileAsByteArray = File.ReadAllBytes(filePath);


    // set up the data to send
    var dataBoundry = "---------------------------9849436581144108930470211272";
    var dataBoundryAsBytes = Encoding.ASCII.GetBytes(Environment.NewLine + "--" + dataBoundry + Environment.NewLine);

    var startOfFileData = "--" + dataBoundry + Environment.NewLine +
        @"Content-Disposition: form-data; name=""fileField""; filename=""file.docx""" + Environment.NewLine;

    startOfFileData += @"Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document" + Environment.NewLine + Environment.NewLine;
    var startOfFileDataAsBytes = Encoding.UTF8.GetBytes(startOfFileData);
    var endOfRequest = "--" + dataBoundry + "--";
    byte[] endOfRequestAsBytes = Encoding.UTF8.GetBytes(endOfRequest);


    // perform request
    HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(postToUrl);
    httpWebRequest.ContentType = "multipart/form-data; boundary=" + dataBoundry;
    httpWebRequest.Method = "POST";
    using (var stream = await httpWebRequest.GetRequestStreamAsync())
    {
        foreach (KeyValuePair<string, string> item in extraData)
        {
            var dataItemBytes = DataItemAsBytes(item.Key, item.Value);
            stream.Write(dataBoundryAsBytes, 0, dataBoundryAsBytes.Length);
            stream.Write(dataItemBytes, 0, dataItemBytes.Length);
        }
        stream.Write(startOfFileDataAsBytes, 0, startOfFileDataAsBytes.Length);
        stream.Write(fileAsByteArray, 0, fileAsByteArray.Length);
        stream.Write(endOfRequestAsBytes, 0, endOfRequestAsBytes.Length);
    }
    try
    {
        using (WebResponse response = httpWebRequest.GetResponse())
        {
            HttpWebResponse httpResponse = (HttpWebResponse)response;
            using (Stream myData = response.GetResponseStream())
            using (var reader = new StreamReader(myData))
            {
                result = reader.ReadToEnd();
            }
        }
    }
    catch (WebException e)
    {
        result = e.Message;
    }

    return result;
}

Problem Solved - This is The Amended, Working Code

Jon was bang on with his answer. I added the line he suggested immediately after writing the file bytes to the request stream, and .docx files now transfer without any problems.

public async Task<string> TestFileSend()
{
    string result;

    var postToUrl = "https://www.mywebsite.com/receive-file.asp";

    Dictionary<string, string> extraData = new Dictionary<string, string>();
    extraData.Add("colour", "red");
    extraData.Add("name", "sandra");

    var filePath = "/path-to-file/file.docx";
    byte[] fileAsByteArray = File.ReadAllBytes(filePath);


    // set up the data to send
    var dataBoundry = "---------------------------9849436581144108930470211272";
    var dataBoundryAsBytes = Encoding.ASCII.GetBytes(Environment.NewLine + "--" + dataBoundry + Environment.NewLine);

    var startOfFileData = "--" + dataBoundry + Environment.NewLine +
        @"Content-Disposition: form-data; name=""fileField""; filename=""file.docx""" + Environment.NewLine;

    startOfFileData += @"Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document" + Environment.NewLine + Environment.NewLine;
    var startOfFileDataAsBytes = Encoding.UTF8.GetBytes(startOfFileData);
    var endOfRequest = "--" + dataBoundry + "--";
    byte[] endOfRequestAsBytes = Encoding.UTF8.GetBytes(endOfRequest);


    // perform request
    HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create(postToUrl);
    httpWebRequest.ContentType = "multipart/form-data; boundary=" + dataBoundry;
    httpWebRequest.Method = "POST";
    using (var stream = await httpWebRequest.GetRequestStreamAsync())
    {
        foreach (KeyValuePair<string, string> item in extraData)
        {
            var dataItemBytes = DataItemAsBytes(item.Key, item.Value);
            stream.Write(dataBoundryAsBytes, 0, dataBoundryAsBytes.Length);
            stream.Write(dataItemBytes, 0, dataItemBytes.Length);
        }
        stream.Write(startOfFileDataAsBytes, 0, startOfFileDataAsBytes.Length);
        stream.Write(fileAsByteArray, 0, fileAsByteArray.Length);
        // *** THIS ADDITIONAL LINE IS THE KEY: two extra bytes (45 is ASCII '-') written after the file data
        stream.Write(new byte[] { 45, 45 }, 0, 2);
        // ***
        stream.Write(endOfRequestAsBytes, 0, endOfRequestAsBytes.Length);
    }
    try
    {
        using (WebResponse response = httpWebRequest.GetResponse())
        {
            HttpWebResponse httpResponse = (HttpWebResponse)response;
            using (Stream myData = response.GetResponseStream())
            using (var reader = new StreamReader(myData))
            {
                result = reader.ReadToEnd();
            }
        }
    }
    catch (WebException e)
    {
        result = e.Message;
    }

    return result;
}

Solution

  • I recently played about with multipart/form-data and noticed that the final multipart boundary has an extra -- on the end. There is an example in this Stack Overflow answer. I think that is where you are losing the two bytes.

    If so, the solution is to add a final write to the request stream of two bytes with the value 45 (ASCII -).

    stream.Write(new byte[]{45, 45}, 0, 2);
    

    I can't be sure, but it looks like a good fit. Hope it helps.
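A note on why this may work: in the multipart format defined by RFC 2046, each delimiter is a CRLF followed by -- and the boundary, and the closing delimiter adds a further -- after the boundary. The CRLF counts as part of the delimiter, not as part of the preceding data. If the receiving parser removes two bytes in front of the closing delimiter without checking them, that would explain both the missing 0000 and why two filler dashes keep the file intact. The sketch below writes a CRLF there instead; `MultipartSketch` and `BuildFilePartBody` are illustrative names, and the form fields from the question are left out for brevity.

```csharp
using System;
using System.IO;
using System.Text;

public static class MultipartSketch
{
    private const string Crlf = "\r\n";

    // Builds a body containing a single file part. The boundary is passed
    // without the leading "--"; those two dashes are added here.
    public static byte[] BuildFilePartBody(
        string boundary, string fieldName, string fileName,
        string contentType, byte[] fileBytes)
    {
        string header =
            "--" + boundary + Crlf +
            "Content-Disposition: form-data; name=\"" + fieldName +
            "\"; filename=\"" + fileName + "\"" + Crlf +
            "Content-Type: " + contentType + Crlf + Crlf;

        // The CRLF in front of the closing delimiter belongs to the
        // delimiter, so a parser that strips it leaves the file bytes intact.
        string closing = Crlf + "--" + boundary + "--" + Crlf;

        using (var body = new MemoryStream())
        {
            byte[] headerBytes = Encoding.ASCII.GetBytes(header);
            byte[] closingBytes = Encoding.ASCII.GetBytes(closing);
            body.Write(headerBytes, 0, headerBytes.Length);
            body.Write(fileBytes, 0, fileBytes.Length);
            body.Write(closingBytes, 0, closingBytes.Length);
            return body.ToArray();
        }
    }
}
```

The sketch uses a literal "\r\n" and not Environment.NewLine, because the multipart format expects CRLF regardless of the platform the code runs on.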