
RavenDB: How to create a "UniqueVisitorCount by date" index


I have an application to track the page visits for a website. Here's my model:

public class VisitSession {
    public string SessionId { get; set; }
    public DateTime StartTime { get; set; }
    public string UniqueVisitorId { get; set; }
    public IList<PageVisit> PageVisits { get; set; }
}

When a visitor arrives at the website, a visit session starts. One visit session has many page visits. On a visitor's first visit, the tracker writes a UniqueVisitorId (GUID) cookie, so we can tell whether a visitor is a returning visitor.

Now, I want to know how many unique visitors visited the website on each day in a date range. That is, I want to display a table on our webpage like this:

Date        | Unique Visitors Count
------------+-----------------------
2012-05-01  | 100
2012-05-02  | 1000
2012-05-03  | 120

I want to create an index to do this in RavenDB, but I don't know how to write the Map/Reduce query. I thought it could look like this:

public class UniqueVisitor_ByDate : AbstractIndexCreationTask<VisitSession, UniqueVisitorByDate>
{
    public UniqueVisitor_ByDate()
    {
        Map = sessions => from s in sessions
                            select new
                            {
                                s.StartTime.Date,
                                s.UniqueVisitorId
                            };

        Reduce = results => from result in results
                            group result by result.Date into g
                            select new
                            {
                                Date = g.Key,
                                UniqueVisitorCount = g.Distinct()
                            };
    }
}

But it's not working. From Ayende's e-book, I understand that the Map function must output the same shape as the Reduce function. So how can I write the correct map/reduce functions?


Solution

  • This index should do what you want:

    public class UniqueVisitor_ByDate : AbstractIndexCreationTask<VisitSession, UniqueVisitorByDate>
    {
        public UniqueVisitor_ByDate()
        {
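            // Map: emit one entry per session, keyed by the calendar day it started on.
            Map = sessions =>
                from s in sessions
                select new {
                    s.StartTime.Date,
                    s.UniqueVisitorId,
                    Count = 1, // placeholder so the map output has the same fields as the reduce output
                };

            // Reduce: group the entries by day and count the distinct visitor IDs in each group.
            Reduce = results =>
                from result in results
                group result by result.Date
                into g
                select new UniqueVisitorByDate {
                    Date = g.Key,
                    Count = g.Select(x => x.UniqueVisitorId).Distinct().Count(),
                    UniqueVisitorId = g.FirstOrDefault().UniqueVisitorId, // carried only to keep the two shapes identical
                };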
        }
    }
    

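    Note that the map and reduce outputs must have the same fields, which is why the map carries an extra 'Count' property and the reduce carries an extra 'UniqueVisitorId' property. You can ignore both when reading the results: only 'Date' and 'Count' of each reduced entry matter.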
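
To see what the two projections compute, here is a minimal sketch that runs the same map and reduce with plain LINQ to Objects, with no RavenDB involved. The `UniqueVisitorByDate` class is an assumption: the post never shows it, so its shape is inferred from the reduce projection. The sample sessions and the `Program` helper are hypothetical, and the sketch applies the reduce once over all mapped rows, so it does not model how RavenDB batches or re-runs reduce steps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed result shape, inferred from the Reduce projection (not shown in the post).
public class UniqueVisitorByDate
{
    public DateTime Date { get; set; }
    public string UniqueVisitorId { get; set; }
    public int Count { get; set; }
}

// Trimmed copy of the question's model (PageVisits omitted; it is not needed here).
public class VisitSession
{
    public string SessionId { get; set; }
    public DateTime StartTime { get; set; }
    public string UniqueVisitorId { get; set; }
}

public static class Program
{
    // Same projection as the index's Map: one row per session, keyed by day.
    public static IEnumerable<UniqueVisitorByDate> Map(IEnumerable<VisitSession> sessions) =>
        from s in sessions
        select new UniqueVisitorByDate
        {
            Date = s.StartTime.Date,
            UniqueVisitorId = s.UniqueVisitorId,
            Count = 1,
        };

    // Same projection as the index's Reduce: distinct visitor IDs per day.
    public static IEnumerable<UniqueVisitorByDate> Reduce(IEnumerable<UniqueVisitorByDate> results) =>
        from result in results
        group result by result.Date into g
        select new UniqueVisitorByDate
        {
            Date = g.Key,
            Count = g.Select(x => x.UniqueVisitorId).Distinct().Count(),
            UniqueVisitorId = g.First().UniqueVisitorId,
        };

    public static void Main()
    {
        // Hypothetical sample data: visitor-a has two sessions on 2012-05-01.
        var sessions = new List<VisitSession>
        {
            new VisitSession { SessionId = "s1", StartTime = new DateTime(2012, 5, 1, 9, 0, 0), UniqueVisitorId = "visitor-a" },
            new VisitSession { SessionId = "s2", StartTime = new DateTime(2012, 5, 1, 15, 0, 0), UniqueVisitorId = "visitor-a" },
            new VisitSession { SessionId = "s3", StartTime = new DateTime(2012, 5, 1, 16, 0, 0), UniqueVisitorId = "visitor-b" },
            new VisitSession { SessionId = "s4", StartTime = new DateTime(2012, 5, 2, 10, 0, 0), UniqueVisitorId = "visitor-a" },
        };

        foreach (var row in Reduce(Map(sessions)).OrderBy(r => r.Date))
            Console.WriteLine($"{row.Date:yyyy-MM-dd} | {row.Count}");
        // prints:
        // 2012-05-01 | 2
        // 2012-05-02 | 1
    }
}
```

The repeat session from visitor-a on 2012-05-01 is counted once, which is the behaviour the table in the question asks for.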