c# stream streamreader

Efficient Way To Get Line from StreamReader which Matches Regex in C#


I have a file, and I want to get the line of the file which matches a regex query.

My code is something like this:

Assembly assembly = typeof(EmbeddedResourceGetter).GetTypeInfo().Assembly;
Stream stream = assembly.GetManifestResourceStream(resourcePath);
StreamReader sr = new StreamReader(stream);

// Read the whole resource, split it into lines, and return the first line that matches.
return sr.ReadToEnd()
    .Split('\n').ToList()
    .Find(l => Regex.IsMatch(l, "regex-query-here"));

However, this feels quite inefficient, and if I need to repeat it multiple times, it can take a long time to complete.

Is there a more efficient way to get a line which matches a regex query without reading the whole file, or will I have to restructure my code to make it faster?


Solution

  • Read the file once and store its contents in a variable, because I/O operations are expensive. Then run the regex against that variable.

    Reading the file into a variable copies it from the hard disk into RAM. Accessing RAM is fast and accessing the hard disk is slow, so it is best to read from the disk only once.

    Also, reading line by line fails if you want to match a pattern that spans multiple lines.

    For example:

    Can
    you
    match
    me
    if
    you
    read
    me
    line
    by
    line?
    

    "Can\s+you" regex would fail to match in this case, because you won't get "Can" and "you" in same string.