RavenDB is a document database, and unlike a relational database (say, Microsoft SQL Server), you can’t do your usual group-by type queries quite as simply.

Consider this scenario: I have a series of documents recording when users performed certain actions within an application.
I want to produce a count of unique users, and a count of application starts (“Open” actions).

public class StatisticsEntry
{
    public DateTime Timestamp { get; set; }
    public string UserId { get; set; }
    public string Action { get; set; }
}

 

My documents will look something like:

{ "Timestamp": "2010-01-01", "UserId": "Able", "Action": "Open"} 
{ "Timestamp": "2010-01-01", "UserId": "Barry", "Action": "Open"} 
{ "Timestamp": "2010-01-02", "UserId": "Barry", "Action": "Close"} 
{ "Timestamp": "2010-01-03", "UserId": "Barry", "Action": "Open"} 
{ "Timestamp": "2010-01-03", "UserId": "Charlie", "Action": "Open"} 

In SQL Server, I would do something like:

SELECT UserID, COUNT(Timestamp) AS Count
FROM StatisticEntries
WHERE Action = 'Open'
GROUP BY UserID

This would give me:

Row #  UserID   Count
1      Able     1
2      Barry    2
3      Charlie  1


In RavenDB, I need to use a Map-Reduce index instead. The basic idea is that the first part (the Map) runs over the documents and emits an entry for each match. The second part (the Reduce) then takes those entries and summarises (reduces) them into the result set.

To make this magic happen, we need to know the results we want out at the end, and define this as a class:

public class StatisticResult
{
    public string UserId { get; set; }
    public int Count { get; set; }
}

We then create our index like so, adding a reference to Raven.Abstractions.dll:

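using System.Linq;
using Raven.Client.Indexes;

public class StatisticsUniqueUsers : AbstractIndexCreationTask<StatisticsEntry, StatisticResult>
{
    // The constructor name must match the class name.
    public StatisticsUniqueUsers()
    {
        // Map: emit one entry with Count = 1 for every "Open" document.
        Map = docs => from doc in docs
                      where doc.Action == "Open"
                      select new
                      {
                          UserId = doc.UserId,
                          Count = 1
                      };

        // Reduce: group the mapped entries by user and add up the counts.
        Reduce = results => from result in results
                            group result by result.UserId
                            into g
                            select new
                            {
                                UserId = g.Key,
                                Count = g.Sum(x => x.Count)
                            };
    }
}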

You then feed this class to RavenDB using this bit of magic (this took me a while to find):

Raven.Client.Indexes.IndexCreation.CreateIndexes(typeof(StatisticsUniqueUsers).Assembly, store);


NB: This scans the entire assembly for index classes and creates all of them at that time. If any are invalid, you’ll get exceptions.

The CreateIndexes method should be called when your application starts – but be aware that you may get stale results temporarily while RavenDB produces the index.

So, let me break down the index class we created above:

The Map gives us a list of matching documents, with the intermediate results being:

{ "UserId": "Able", "Count": 1} 
{ "UserId": "Barry", "Count": 1} 
{ "UserId": "Barry", "Count": 1} 
{ "UserId": "Charlie", "Count": 1}

The Reduce then groups these entries by user and sums the counts, giving us:

{ "UserId": "Able", "Count": 1} 
{ "UserId": "Barry", "Count": 2} 
{ "UserId": "Charlie", "Count": 1}
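
To convince myself of the two stages, here is a minimal sketch that runs the same Map and Reduce expressions in memory with plain LINQ to Objects over the five sample documents. It does not touch RavenDB at all; the MapReduceSketch class and its SampleDocs, Map and Reduce helpers are names I made up for illustration, and the real index is evaluated by the server, not like this.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class StatisticsEntry
{
    public DateTime Timestamp { get; set; }
    public string UserId { get; set; }
    public string Action { get; set; }
}

public class StatisticResult
{
    public string UserId { get; set; }
    public int Count { get; set; }
}

// Hypothetical helper class: an in-memory imitation of the index, not RavenDB code.
public static class MapReduceSketch
{
    // The five sample documents shown earlier in the post.
    public static List<StatisticsEntry> SampleDocs()
    {
        return new List<StatisticsEntry>
        {
            new StatisticsEntry { Timestamp = new DateTime(2010, 1, 1), UserId = "Able", Action = "Open" },
            new StatisticsEntry { Timestamp = new DateTime(2010, 1, 1), UserId = "Barry", Action = "Open" },
            new StatisticsEntry { Timestamp = new DateTime(2010, 1, 2), UserId = "Barry", Action = "Close" },
            new StatisticsEntry { Timestamp = new DateTime(2010, 1, 3), UserId = "Barry", Action = "Open" },
            new StatisticsEntry { Timestamp = new DateTime(2010, 1, 3), UserId = "Charlie", Action = "Open" }
        };
    }

    // Map: one entry with Count = 1 per "Open" document.
    public static IEnumerable<StatisticResult> Map(IEnumerable<StatisticsEntry> docs)
    {
        return from doc in docs
               where doc.Action == "Open"
               select new StatisticResult { UserId = doc.UserId, Count = 1 };
    }

    // Reduce: group by user and sum the counts.
    public static IEnumerable<StatisticResult> Reduce(IEnumerable<StatisticResult> results)
    {
        return from result in results
               group result by result.UserId
               into g
               select new StatisticResult { UserId = g.Key, Count = g.Sum(x => x.Count) };
    }

    public static void Main()
    {
        var reduced = Reduce(Map(SampleDocs())).ToList();
        foreach (var r in reduced)
        {
            Console.WriteLine("{0}: {1}", r.UserId, r.Count);
        }
    }
}
```

Note that the Reduce takes and returns the same shape (UserId plus Count), so feeding its output back through Reduce again gives the same answer. That is why the Map has to emit a Count of 1 rather than leaving the counting entirely to the Reduce.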

…so, we now have an index which gives us a per-user count. But didn’t we just want unique users and total Opens? Quite right. It’s a little bit of a round trip, but we get there in the end.

We get the results by performing two queries:

// One reduced entry per user, so the number of index results is the unique user count.
int uniqueUsers = _documentSession.Query<StatisticResult, StatisticsUniqueUsers>().Count();
int totalOpens = _documentSession.Query<StatisticsEntry>().Where(s => s.Action == "Open").Count();

The biggest ‘wtf’ for me was how to actually load the indexes into RavenDB, since there is no apparent function to do so on the IDocumentStore interface.

Comments on “RavenDB: Map Reduce Indexes, and how to install them.”

  1. Rob Ashton says:

    We need to work on the documentation for that, thanks for the feedback