
How does the Sprache LINQ query example work?


I came across the following piece of code in the Sprache repository:

Parser<string> identifier =
    from leading in Parse.WhiteSpace.Many()
    from first in Parse.Letter.Once().Text()
    from rest in Parse.LetterOrDigit.Many().Text()
    from trailing in Parse.WhiteSpace.Many()
    select first + rest;

var id = identifier.Parse(" abc123  ");

I see a contradiction here: the from clause docs say the source (Parse.WhiteSpace.Many() or Parse.Letter.Once().Text() in our case) must be IEnumerable:

The data source referenced in the from clause must have a type of IEnumerable, IEnumerable<T>, or a derived type such as IQueryable<T>

But it isn't and the compiler says that's fine!

I thought there might be some implicit cast to IEnumerable, but there isn't: Parse.WhiteSpace.Many() returns Parser<IEnumerable<char>> and Parse.Letter.Once().Text() returns Parser<string> (neither type is an IEnumerable).

1st question: Why does the compiler allow this code?

Also, the final expression select first + rest doesn't mention the leading and trailing variables, yet the resulting identifier parser clearly uses them internally.

2nd question: By what rule or mechanism are the leading and trailing variables incorporated into identifier?

P.S. It'd be great if someone shared a comprehensive doc about the inner workings of LINQ query syntax. I've found nothing on this topic.


Solution

  • After about five minutes of looking at the code, I observed the following:

    1. Parser<T> is a delegate that takes the input and returns an intermediate result:

      public delegate IResult<T> Parser<out T>(IInput input);
      
    2. There are LINQ-compliant methods declared that enable the query syntax, for example:

      public static Parser<U> Select<T, U>(this Parser<T> parser, Func<T, U> convert)
      {
          if (parser == null) throw new ArgumentNullException(nameof(parser));
          if (convert == null) throw new ArgumentNullException(nameof(convert));

          return parser.Then(t => Return(convert(t)));
      }
      

    https://github.com/sprache/Sprache/blob/develop/src/Sprache/Parse.cs#L357

    It is not true that the IEnumerable interface is required for the from x in set syntax to work. The compiler translates a query expression into method calls by name, so all you need is a method (instance or extension) with the right name and a compatible signature. The Select method above is what makes the select clause valid. Here is the Where method:

    public static Parser<T> Where<T>(this Parser<T> parser, Func<T, bool> predicate)
        {
            if (parser == null) throw new ArgumentNullException(nameof(parser));
            if (predicate == null) throw new ArgumentNullException(nameof(predicate));
    
            return i => parser(i).IfSuccess(s =>
                predicate(s.Value) ? s : Result.Failure<T>(i,
                    $"Unexpected {s.Value}.",
                    new string[0]));
        }
    

    https://github.com/sprache/Sprache/blob/develop/src/Sprache/Parse.cs#L614

    and so on.
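
    To see that the rule has nothing to do with IEnumerable, here is a minimal sketch with a made-up type. Box<T>, BoxExtensions and Demo are hypothetical names used only for illustration; they are not part of Sprache. The type does not implement IEnumerable<T>, yet query syntax compiles against it because methods named Select and SelectMany with suitable signatures are in scope.

```csharp
using System;

// Hypothetical wrapper type for illustration; it does NOT implement IEnumerable<T>.
public sealed class Box<T>
{
    public T Value { get; }
    public Box(T value) => Value = value;
}

public static class BoxExtensions
{
    // Makes "from x in box select f(x)" compile.
    public static Box<U> Select<T, U>(this Box<T> box, Func<T, U> convert) =>
        new Box<U>(convert(box.Value));

    // Makes a second (and any later) from clause compile.
    public static Box<V> SelectMany<T, U, V>(
        this Box<T> box,
        Func<T, Box<U>> selector,
        Func<T, U, V> projector)
    {
        T t = box.Value;
        U u = selector(t).Value;
        return new Box<V>(projector(t, u));
    }
}

public static class Demo
{
    public static int Run()
    {
        // The compiler rewrites this query as:
        // new Box<int>(2).SelectMany(a => new Box<int>(3), (a, b) => a + b)
        Box<int> sum =
            from a in new Box<int>(2)
            from b in new Box<int>(3)
            select a + b;
        return sum.Value;
    }
}
```

    If you delete SelectMany from this sketch, the query stops compiling, which shows that the compiler is looking for the method by name rather than for an interface.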

    Sprache's methods are a separate implementation of the LINQ query pattern that has nothing to do with collections; it is about parsing text. It produces a nested chain of delegates that process the given string to verify whether it conforms to a particular grammar, and it returns a structure that describes the parsed text.

    That answers the first question. For the second, you need to follow the code: every from x in set clause except the first one maps to a call to the SelectMany function:

    public static Parser<V> SelectMany<T, U, V>(
            this Parser<T> parser,
            Func<T, Parser<U>> selector,
            Func<T, U, V> projector)
        {
            if (parser == null) throw new ArgumentNullException(nameof(parser));
            if (selector == null) throw new ArgumentNullException(nameof(selector));
            if (projector == null) throw new ArgumentNullException(nameof(projector));
    
            return parser.Then(t => selector(t).Select(u => projector(t, u)));
        }
    

    https://github.com/sprache/Sprache/blob/develop/src/Sprache/Parse.cs#L635

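    The Then method is here: https://github.com/sprache/Sprache/blob/develop/src/Sprache/Parse.cs#L241

    There you will see that the second parser (the single-letter parser) is applied to the remainder of the string only if the first one succeeds (the leading whitespace was matched). So, again, this is not collection processing; it is a chain of steps that parses the string. The leading and trailing variables are not added by any special rule: their parsers run as links in this chain and consume the whitespace, and the final select simply ignores the values they produce.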
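
    The chaining can be sketched with a much-simplified stand-in for Sprache. MiniParser<T>, Mini and IdentifierDemo below are hypothetical names, and the tuple-based result is an assumption made to keep the example short; Sprache's real Parser<T>, IResult<T> and Then are more elaborate. The sketch shows the query from the question next to roughly what the compiler turns it into: three SelectMany calls, with the earlier range variables carried along in anonymous objects.

```csharp
using System;

// Hypothetical, simplified stand-in for Sprache's Parser<T>:
// takes the remaining input, returns success flag, value and leftover input.
public delegate (bool Ok, T Value, string Rest) MiniParser<T>(string input);

public static class Mini
{
    // Zero or more characters matching the predicate; always succeeds.
    public static MiniParser<string> Many(Func<char, bool> pred) => input =>
    {
        int n = 0;
        while (n < input.Length && pred(input[n])) n++;
        return (true, input.Substring(0, n), input.Substring(n));
    };

    // Exactly one character matching the predicate.
    public static MiniParser<string> Once(Func<char, bool> pred) => input =>
        input.Length > 0 && pred(input[0])
            ? (true, input.Substring(0, 1), input.Substring(1))
            : (false, "", input);

    // Runs the first parser, then runs the second one on what the first left over.
    public static MiniParser<V> SelectMany<T, U, V>(
        this MiniParser<T> parser,
        Func<T, MiniParser<U>> selector,
        Func<T, U, V> projector) => input =>
    {
        var first = parser(input);
        if (!first.Ok) return (false, default(V), input);
        var second = selector(first.Value)(first.Rest);
        if (!second.Ok) return (false, default(V), input);
        return (true, projector(first.Value, second.Value), second.Rest);
    };
}

public static class IdentifierDemo
{
    // The query from the question, written against the simplified parser.
    public static MiniParser<string> Query() =>
        from leading in Mini.Many(char.IsWhiteSpace)
        from first in Mini.Once(char.IsLetter)
        from rest in Mini.Many(char.IsLetterOrDigit)
        from trailing in Mini.Many(char.IsWhiteSpace)
        select first + rest;

    // Roughly what the compiler generates for the query above.
    public static MiniParser<string> Desugared() =>
        Mini.Many(char.IsWhiteSpace)
            .SelectMany(leading => Mini.Once(char.IsLetter),
                        (leading, first) => new { leading, first })
            .SelectMany(t => Mini.Many(char.IsLetterOrDigit),
                        (t, rest) => new { t, rest })
            .SelectMany(t2 => Mini.Many(char.IsWhiteSpace),
                        (t2, trailing) => t2.t.first + t2.rest);
}
```

    Calling IdentifierDemo.Query()(" abc123  ") and IdentifierDemo.Desugared()(" abc123  ") runs the same chain: the whitespace parsers consume input on both sides even though only first and rest reach the final projection.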