I am building a simple proxy server in .NET 8.
The process is as follows:
- The proxy creates an HttpRequestMessage with a StreamContent.
- The incoming request body is transformed and written to the StreamContent.
- The HttpClient is posted with the transformed payload.

The issue I have is that using StreamContent requires me to write ALL of the transformed data to it, and I have to set the Position back to zero before I can POST it. This means (if I understand correctly) that the entire "new request" is held in memory on my proxy server.
If I use the obsolete HttpWebRequest for my new request, I can get the RequestStream and process my incoming message in chunks, which I write directly to the RequestStream. This seems like a much better approach, as it should cause less memory pressure.
Am I missing something here?
Following is the code I have for using StreamContent:
/// <summary>
/// Uses the HttpRequestMessage / HttpResponseMessage to communicate with
/// the proxied API. This requires the entire Request stream to be assembled
/// before it can be sent to the API.
/// </summary>
/// <param name="clientHttpContext">The HttpContext for this request.</param>
/// <returns>An HttpResponseMessage which exposes its stream for processing.</returns>
private async Task<HttpResponseMessage> SendToProxiedAPIWithStreamContent(HttpContext clientHttpContext)
{
    byte[] incomingRequestBuffer = new byte[_settings.ChunkSize];
    HttpResponseMessage? response = null;
    //
    // Create a upstreamRequestStream to write chunks to for sending to the upstream server.
    //
    using (MemoryStream upstreamRequestStream = new MemoryStream(_settings.ChunkSize))
    {
        //
        // Create a StreamContent with the memory upstreamRequestStream as its internal implementation.
        //
        StreamContent upstreamContent = new StreamContent(upstreamRequestStream);
        //
        // Make a Request to send to the proxied API.
        //
        HttpRequestMessage upstreamRequest = new HttpRequestMessage();
        upstreamRequest.Method = new HttpMethod(clientHttpContext.Request.Method);
        upstreamRequest.RequestUri = new Uri($"{_settings.UpstreamUrl}/api/postdata");
        upstreamRequest.Content = upstreamContent;
        upstreamRequest.Content.Headers.ContentType = System.Net.Http.Headers.MediaTypeHeaderValue.Parse(clientHttpContext.Request.Headers.ContentType.First());
        //
        // Loop through the incoming upstreamRequestStream sending it to the Transform method
        // and then writing it to the upstream data stream.
        //
        int incomingBufferBytesRead = await clientHttpContext.Request.Body.ReadAsync(incomingRequestBuffer, 0, incomingRequestBuffer.Length);
        while (incomingBufferBytesRead > 0)
        {
            //
            // Process (transform) a single chunk prior to its going to the proxied API.
            //
            byte[] transformedBuffer = TransformTheIncomingBuffer(incomingRequestBuffer, incomingBufferBytesRead);
            //
            // Write the transformed data to the upstreamRequestStream that is wrapped in the StreamContent.
            //
            await upstreamRequestStream.WriteAsync(transformedBuffer, 0, transformedBuffer.Length);
            //
            // Clear my transformed buffer and get the next chunk from the input upstreamRequestStream.
            //
            Array.Clear(transformedBuffer);
            incomingBufferBytesRead = await clientHttpContext.Request.Body.ReadAsync(incomingRequestBuffer, 0, incomingRequestBuffer.Length);
        }
        //
        // Reset the upstreamRequestStream pointer on the outgoing upstreamRequestStream.
        // This is a problem - it means the entire object is in memory
        // so large objects will overwhelm this.
        // How can we feed chunks to the upstream request's StreamContent?
        //
        upstreamRequestStream.Position = 0;
        //
        // Send this request on to the httpClient that is bound to the proxied API.
        // But, by now we have read and transformed the entire incoming request clientResponseBody
        // which may be huge. How do we send this using chunks as we transform it?
        //
        response = await _httpClient.SendAsync(upstreamRequest);
    }
    return response;
}
And here is my code using the obsolete HttpWebRequest:
/// <summary>
/// Uses the obsolete WebRequest / WebResponse to communicate with
/// the proxied API. This allows us access to the upstream request stream.
/// </summary>
/// <param name="clientHttpContext">The HttpContext for this request.</param>
/// <returns>An HttpWebResponse which exposes its stream for processing.</returns>
private async Task<HttpWebResponse> SendToProxiedAPIWithWebRequest(HttpContext clientHttpContext)
{
    byte[] incomingRequestBuffer = new byte[_settings.ChunkSize];
    HttpWebResponse? response = null;
    HttpWebRequest webRequest = (HttpWebRequest)WebRequest.CreateHttp($"{_settings.UpstreamUrl}/api/postdata");
    webRequest.Method = "POST";
    webRequest.ContentType = "application/json";
    using (var upstreamRequestStream = webRequest.GetRequestStream())
    {
        int incomingBufferBytesRead = await clientHttpContext.Request.Body.ReadAsync(incomingRequestBuffer, 0, incomingRequestBuffer.Length);
        long contentLength = 0;
        while (incomingBufferBytesRead > 0)
        {
            //
            // Process (transform) a single chunk prior to its going to the proxied API.
            //
            byte[] transformedBuffer = TransformTheIncomingBuffer(incomingRequestBuffer, incomingBufferBytesRead);
            contentLength += transformedBuffer.LongLength;
            //
            // Write the transformed data directly to the outgoing request upstreamRequestStream.
            // (Note: there is no way to do this using the HttpClient)
            //
            upstreamRequestStream.Write(transformedBuffer, 0, transformedBuffer.Length);
            //
            // Clear my transformed buffer and get the next chunk from the input stream.
            //
            Array.Clear(transformedBuffer);
            incomingBufferBytesRead = await clientHttpContext.Request.Body.ReadAsync(incomingRequestBuffer, 0, incomingRequestBuffer.Length);
        }
        webRequest.ContentLength = contentLength;
    }
    //
    // Send the request to the proxied API and get the httpResponseMessage.
    //
    response = (HttpWebResponse)await webRequest.GetResponseAsync();
    return response;
}
Is there any way to use the HttpRequestMessage and still process my data in chunks?
I would like to use the newer method since, from what I understand, it does a much better job of reusing connections, and in a high-volume proxy server this would definitely be an advantage.
Thanks in advance for any guidance.
You can't do this with the standard HttpContent classes, as they all expect the data to be ready upfront. What you need is a class that can pull the data from somewhere else.
Here is one possible solution. It takes a Func which is used to stream the data at the exact point that HttpClient demands it, and it optionally accepts a second Func to supply a length if one is available.
public class PullingStreamContent(Func<Stream, CancellationToken, Task> streamWriter, Func<long?>? getLength = null)
    : HttpContent
{
    private readonly Func<Stream, CancellationToken, Task> _streamWriter = streamWriter;
    private readonly Func<long?>? _getLength = getLength;

    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context) =>
        _streamWriter(stream, default);

    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context, CancellationToken cancellationToken) =>
        _streamWriter(stream, cancellationToken);

    protected override bool TryComputeLength(out long length)
    {
        var l = _getLength?.Invoke();
        length = l.GetValueOrDefault();
        return l.HasValue;
    }
}
The length lambda is optional. If you don't provide it, you won't get a Content-Length header, and the client will use chunked transfer encoding instead.
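To make the effect of TryComputeLength concrete, here is a minimal sketch. LengthDemoContent is a hypothetical stand-in (not part of the solution above) that writes nothing and only reports a length when one is supplied. It relies on the fact that reading Headers.ContentLength on an unbuffered HttpContent consults TryComputeLength.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// No length supplied: TryComputeLength returns false, so no Content-Length is produced.
var unknownLength = new LengthDemoContent(null);
Console.WriteLine(unknownLength.Headers.ContentLength.HasValue); // False

// Length supplied: the header value is taken from TryComputeLength.
var knownLength = new LengthDemoContent(1024);
Console.WriteLine(knownLength.Headers.ContentLength); // 1024

// Hypothetical demo type, used only to illustrate the length behaviour.
sealed class LengthDemoContent(long? knownLength) : HttpContent
{
    // Nothing is written here; the demo is only about the length.
    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context) =>
        Task.CompletedTask;

    protected override bool TryComputeLength(out long length)
    {
        length = knownLength.GetValueOrDefault();
        return knownLength.HasValue;
    }
}
```

Since the transform may change the size of the payload, the length is usually unknown in a proxy like this one, so leaving the length lambda out is probably the common case.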
Then you can pass in a lambda that pulls the data from one side and sends it directly to the other.
private async Task<HttpResponseMessage> SendToProxiedAPIWithStreamContent(HttpContext clientHttpContext)
{
    using var upstreamContent = new PullingStreamContent(async (outputStream, ct) =>
    {
        var incomingRequestBuffer = new byte[_settings.ChunkSize];
        var body = clientHttpContext.Request.Body;
        int incomingBufferBytesRead;
        while ((incomingBufferBytesRead = await body.ReadAsync(incomingRequestBuffer, ct)) > 0)
        {
            // perhaps a reusable transform buffer as well??
            // Pass only the bytes that were actually read in this iteration.
            var transformedBuffer = TransformTheIncomingBuffer(incomingRequestBuffer.AsMemory(0, incomingBufferBytesRead));
            await outputStream.WriteAsync(transformedBuffer, ct);
        }
    });
    upstreamContent.Headers.ContentType = MediaTypeHeaderValue.Parse(clientHttpContext.Request.Headers.ContentType.First());

    using var upstreamRequest = new HttpRequestMessage(HttpMethod.Parse(clientHttpContext.Request.Method), $"{_settings.UpstreamUrl}/api/postdata");
    upstreamRequest.Content = upstreamContent;

    var response = await _httpClient.SendAsync(upstreamRequest);
    return response;
}
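On the "reusable transform buffer" comment in the code above: one possible approach is to rent both buffers from ArrayPool and have the transform write into a caller-supplied destination. The sketch below is only an illustration. ChunkPump, its transform delegate shape, and the maxTransformedSize parameter are assumptions, not part of the original code, and it assumes the transform never produces more than maxTransformedSize bytes per chunk.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class ChunkPump
{
    // Hypothetical helper: reads chunks from input, transforms each one into a
    // pooled buffer, and writes the result to output.
    // The transform returns the number of bytes it wrote to the destination.
    public static async Task PumpAsync(
        Stream input,
        Stream output,
        Func<ReadOnlyMemory<byte>, Memory<byte>, int> transform,
        int chunkSize,
        int maxTransformedSize,
        CancellationToken ct = default)
    {
        // Rent may return larger arrays than requested, so slice explicitly.
        byte[] readBuffer = ArrayPool<byte>.Shared.Rent(chunkSize);
        byte[] writeBuffer = ArrayPool<byte>.Shared.Rent(maxTransformedSize);
        try
        {
            int bytesRead;
            while ((bytesRead = await input.ReadAsync(readBuffer.AsMemory(0, chunkSize), ct)) > 0)
            {
                int bytesWritten = transform(
                    readBuffer.AsMemory(0, bytesRead),
                    writeBuffer.AsMemory(0, maxTransformedSize));
                await output.WriteAsync(writeBuffer.AsMemory(0, bytesWritten), ct);
            }
        }
        finally
        {
            // Always hand the buffers back to the pool, even if a read or write fails.
            ArrayPool<byte>.Shared.Return(readBuffer);
            ArrayPool<byte>.Shared.Return(writeBuffer);
        }
    }
}
```

Inside the PullingStreamContent lambda this could be called as ChunkPump.PumpAsync(clientHttpContext.Request.Body, outputStream, transform, _settings.ChunkSize, maxTransformedSize, ct), so that no new array is allocated per chunk.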
Other points to note:
- Returning an HttpResponseMessage is a bit of a code smell. This function should really deal with it and dispose of it immediately.
- Use AsMemory to pass around segments of arrays.
- Use the pattern while ((bytesRead = DoRead()) > 0) { This removes the need to repeat the read statement.
- HttpMethod.Parse returns singletons, rather than doing new HttpMethod on every run.
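As a rough illustration of the first point, the response can be consumed and disposed inside the same method instead of being returned. The sketch below assumes an ASP.NET Core project. ProxyResponseForwarder and ForwardAsync are hypothetical names, and copying only the status code and content type is a simplification, since a real proxy would probably forward more headers.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;

public static class ProxyResponseForwarder
{
    // Hypothetical helper: sends the upstream request and streams the response
    // straight back to the client, disposing the response before returning.
    public static async Task ForwardAsync(
        HttpClient httpClient,
        HttpRequestMessage upstreamRequest,
        HttpContext clientHttpContext)
    {
        var ct = clientHttpContext.RequestAborted;

        // ResponseHeadersRead returns as soon as the headers arrive, so the
        // response body is streamed rather than buffered in memory.
        using var response = await httpClient.SendAsync(
            upstreamRequest, HttpCompletionOption.ResponseHeadersRead, ct);

        clientHttpContext.Response.StatusCode = (int)response.StatusCode;
        if (response.Content.Headers.ContentType is { } contentType)
        {
            clientHttpContext.Response.ContentType = contentType.ToString();
        }

        // Copy the upstream body to the client as it is received.
        await response.Content.CopyToAsync(clientHttpContext.Response.Body, ct);
    }
}
```

With this shape, the caller never owns an HttpResponseMessage, so there is nothing left to forget to dispose.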