Search code examples
c#parsingstreamreaderhttplistenerhttplistenerrequest

Httplistener and file upload


I am trying to retrieve an uploaded file from my webserver. As the client sends its files through a webform (random files), I need to parse the request to get the file out and to process it further on. Basically, the code goes as:

HttpListenerContext context = listener.GetContext();
HttpListenerRequest request = context.Request;
StreamReader r = new StreamReader(request.InputStream, System.Text.Encoding.Default);
// this is the retrieved file from streamreader
string file = null;

while ((line = r.ReadLine()) != null){
     // i read the stream till i retrieve the filename
     // get the file data out and break the loop 
}
// A byststream is created by converting the string,
Byte[] bytes = request.ContentEncoding.GetBytes(file);
MemoryStream mstream = new MemoryStream(bytes);

// do the rest

As a result, i am able to retrieve textfiles, but for all other files, they are corrupted. Could someone tell me how to parse these HttplistnerRequests properly (or providing a lightweighted alternative)?


Solution

  • I think you are making things harder on yourself than necessary by doing this with an HttpListener rather than using the built in facilities of ASP.Net. But if you must do it this way here is some sample code. Note: 1) I'm assuming you're using enctype="multipart/form-data" on your <form>. 2) This code is designed to be used with a form containing only your <input type="file" /> if you want to post other fields or multiple files you'll have to change the code. 3) This is meant to be a proof of concept/example, it may have bugs, and is not particularly flexible.

    static void Main(string[] args)
    {
        HttpListener listener = new HttpListener();
        listener.Prefixes.Add("http://localhost:8080/ListenerTest/");
        listener.Start();
    
        HttpListenerContext context = listener.GetContext();
    
        SaveFile(context.Request.ContentEncoding, GetBoundary(context.Request.ContentType), context.Request.InputStream);
    
        context.Response.StatusCode = 200;
        context.Response.ContentType = "text/html";
        using (StreamWriter writer = new StreamWriter(context.Response.OutputStream, Encoding.UTF8))
            writer.WriteLine("File Uploaded");
    
        context.Response.Close();
    
        listener.Stop();
    
    }
    
    private static String GetBoundary(String ctype)
    {
        return "--" + ctype.Split(';')[1].Split('=')[1];
    }
    
    private static void SaveFile(Encoding enc, String boundary, Stream input)
    {
        Byte[] boundaryBytes = enc.GetBytes(boundary);
        Int32 boundaryLen = boundaryBytes.Length;
    
        using (FileStream output = new FileStream("data", FileMode.Create, FileAccess.Write))
        {
            Byte[] buffer = new Byte[1024];
            Int32 len = input.Read(buffer, 0, 1024);
            Int32 startPos = -1;
    
            // Find start boundary
            while (true)
            {
                if (len == 0)
                {
                    throw new Exception("Start Boundaray Not Found");
                }
    
                startPos = IndexOf(buffer, len, boundaryBytes);
                if (startPos >= 0)
                {
                    break;
                }
                else
                {
                    Array.Copy(buffer, len - boundaryLen, buffer, 0, boundaryLen);
                    len = input.Read(buffer, boundaryLen, 1024 - boundaryLen);
                }
            }
    
            // Skip four lines (Boundary, Content-Disposition, Content-Type, and a blank)
            for (Int32 i = 0; i < 4; i++)
            {
                while (true)
                {
                    if (len == 0)
                    {
                        throw new Exception("Preamble not Found.");
                    }
    
                    startPos = Array.IndexOf(buffer, enc.GetBytes("\n")[0], startPos);
                    if (startPos >= 0)
                    {
                        startPos++;
                        break;
                    }
                    else
                    {
                        len = input.Read(buffer, 0, 1024);
                    }
                }
            }
    
            Array.Copy(buffer, startPos, buffer, 0, len - startPos);
            len = len - startPos;
    
            while (true)
            {
                Int32 endPos = IndexOf(buffer, len, boundaryBytes);
                if (endPos >= 0)
                {
                    if (endPos > 0) output.Write(buffer, 0, endPos-2);
                    break;
                }
                else if (len <= boundaryLen)
                {
                    throw new Exception("End Boundaray Not Found");
                }
                else
                {
                    output.Write(buffer, 0, len - boundaryLen);
                    Array.Copy(buffer, len - boundaryLen, buffer, 0, boundaryLen);
                    len = input.Read(buffer, boundaryLen, 1024 - boundaryLen) + boundaryLen;
                }
            }
        }
    }
    
    private static Int32 IndexOf(Byte[] buffer, Int32 len, Byte[] boundaryBytes)
    {
        for (Int32 i = 0; i <= len - boundaryBytes.Length; i++)
        {
            Boolean match = true;
            for (Int32 j = 0; j < boundaryBytes.Length && match; j++)
            {
                match = buffer[i + j] == boundaryBytes[j];
            }
    
            if (match)
            {
                return i;
            }
        }
    
        return -1;
    }
    

    To help you better understand what the code above is doing, here is what the body of the HTTP POST looks like:

    Content-Type: multipart/form-data; boundary=----WebKitFormBoundary9lcB0OZVXSqZLbmv
    
    ------WebKitFormBoundary9lcB0OZVXSqZLbmv
    Content-Disposition: form-data; name="my_file"; filename="Test.txt"
    Content-Type: text/plain
    
    Test
    ------WebKitFormBoundary9lcB0OZVXSqZLbmv--
    

    I've left out the irrelevant headers. As you can see, you need to parse the body by scanning through to find the beginning and ending boundary sequences, and drop the sub headers that come before the content of your file. Unfortunately you cannot use StreamReader because of the potential for binary data. Also unfortunate is the fact that there is no per file Content-Length (the Content-Length header for the request specifies the total length of the body including boundaries, sub-headers, and spacing.