How to create a .NET Regex with a dynamic quantifier


I am trying to extract blocks of JSON data from a data stream in the following format:

    Some-Header-Name:Value
    Content-Length:Value
    Some-Other-Header:Value

    {JSON data string of variable length}

The stream contains many instances of the above pattern, and the length of the JSON data in each instance is different, as indicated by the preceding Content-Length header.

I want to create a Regex that matches each Content-Length header value and uses that value to match the associated content block. I envisage something like this ...

    Content-Length:(?<LENGTH>\d+).*?\r\n\r\n(?<CONTENT>.{$<LENGTH>})

... but I'm not sure how to specify the quantifier for the CONTENT group as a dynamic value.

Note: although the headers are on separate lines and the content is separated from the headers by a blank line, there is no linefeed after the content, so it is not possible to use this to determine the end of content.

Any suggestions would be appreciated.

Thanks, Tim


Solution

  • Regular expressions operate on text, not numbers: a regex can't capture part of the string, convert it to a number, and then use that number as a quantifier within the same pattern.

    You'd have to do it in several steps:

    1. Match the header and extract the length value.
    2. Build a new regex like @"(?<HEADER>...)(?<CONTENT>.{" + length + "})"
    3. Reapply that regex and extract the contents.
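
The three steps above could be sketched in C# roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the class name `ContentBlockExtractor` and the method `ExtractBlocks` are hypothetical, the whole stream is assumed to be available as a single string, and Content-Length is assumed to count characters rather than bytes. `RegexOptions.Singleline` is used because `.` does not match `\n` by default, which both the header-skipping `.*?` and any multi-line JSON content would need.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating the multi-step approach.
public static class ContentBlockExtractor
{
    // Step 1 pattern: capture the Content-Length value, then lazily skip any
    // remaining headers up to the blank line that precedes the content.
    private static readonly Regex HeaderRegex = new Regex(
        @"Content-Length:(?<LENGTH>\d+).*?\r\n\r\n",
        RegexOptions.Singleline);

    public static List<string> ExtractBlocks(string stream)
    {
        var blocks = new List<string>();
        int position = 0;

        while (position < stream.Length)
        {
            // Step 1: find the next header block, searching from 'position'.
            Match header = HeaderRegex.Match(stream, position);
            if (!header.Success) break;

            // Assumption: the value fits in an int (int.Parse throws otherwise).
            int length = int.Parse(header.Groups["LENGTH"].Value);

            // Step 2: build a second regex with the quantifier filled in.
            // \G anchors the match at the index where the search starts.
            var contentRegex = new Regex(
                @"\G(?<CONTENT>.{" + length + "})",
                RegexOptions.Singleline);

            // Step 3: apply it immediately after the blank line.
            int contentStart = header.Index + header.Length;
            Match content = contentRegex.Match(stream, contentStart);
            if (!content.Success) break; // fewer characters left than announced

            blocks.Add(content.Groups["CONTENT"].Value);

            // Resume after the content, so header-like text inside the JSON
            // is never mistaken for a real header.
            position = contentStart + length;
        }

        return blocks;
    }
}
```

Calling `ContentBlockExtractor.ExtractBlocks(text)` on a stream that holds two complete blocks should return the two JSON strings in order. Since the length is already known after step 1, `stream.Substring(contentStart, length)` (after a bounds check) would do the same job as the second regex without constructing a new `Regex` per block; the sketch keeps the second regex only to mirror the steps above. If Content-Length actually counts bytes, as it does in HTTP, the data would need to be sliced as bytes before decoding instead.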