Search code examples
nservicebusnservicebus-sagas

Existing saga instances after applying the [Unique] attribute to IContainSagaData property


I have a bunch of existing sagas in various states of a long running process.

Recently we decided to make one of the properties on our IContainSagaData implementation unique by using the Saga.UniqueAttribute (about which more here http://docs.particular.net/nservicebus/nservicebus-sagas-and-concurrency).

After deploying the change, we realized that all our old saga instances were not being found, and after further digging (thanks Charlie!) discovered that by adding the unique attribute, we were required to data fix all our existing sagas in Raven.

Now, this is pretty poor, kind of like adding a index to a database column and then finding that all the table data no longer select-able, but being what it is, we decided to create a tool for doing this.

So after creating and running this tool we've now patched up the old sagas so that they now resemble the new sagas (sagas created since we went live with the change).

However, despite all the data now looking right we're still not able to find old instances of the saga!

The tool we wrote does two things. For each existing saga, the tool:

  1. Adds a new RavenJToken called "NServiceBus-UniqueValue" to the saga metadata, setting the value to the same value as our unique property for that saga, and
  2. Creates a new document of type NServiceBus.Persistence.Raven.SagaPersister.SagaUniqueIdentity, setting the SagaId, SagaDocId, and UniqueValue fields accordingly.

My questions are:

  1. Is it sufficient to simply make the data look correct or is there something else we need to do?
  2. Another option we have is to revert the change which added the unique attribute. However in this scenario, would those new sagas which have been created since the change went in be OK with this?

Code for adding metadata token:

var policyKey = RavenJToken.FromObject(saga.PolicyKey); // This is the unique field
sagaDataMetadata.Add("NServiceBus-UniqueValue", policyKey);

Code for adding new doc:

var policyKeySagaUniqueId = new SagaUniqueIdentity
                            {
                                Id = "Matlock.Renewals.RenewalSaga.RenewalSagaData/PolicyKey/" + Guid.NewGuid().ToString(), 
                                SagaId = saga.Id,
                                UniqueValue = saga.PolicyKey,
                                SagaDocId = "RenewalSaga/" + saga.Id.ToString()
                            };

 session.Store(policyKeySagaUniqueId);

Any help much appreciated.

EDIT

Thanks to David's help on this we have fixed our problem - the key difference was we used the SagaUniqueIdentity.FormatId() to generate our document IDs rather than a new guid - this was trivial tio do since we were already referencing the NServiceBus and NServiceBus.Core assemblies.


Solution

  • The short answer is that it is not enough to make the data resemble the new identity documents. Where you are using Guid.NewGuid().ToString(), that data is important! That's why your solution isn't working right now. I spoke about the concept of identity documents (specifically about the NServiceBus use case) during the last quarter of my talk at RavenConf 2014 - here are the slides and video.

    So here is the long answer:

    In RavenDB, the only ACID guarantees are on the Load/Store by Id operations. So if two threads are acting on the same Saga concurrently, and one stores the Saga data, the second thread can only expect to get back the correct saga data if it is also loading a document by its Id.

    To guarantee this, the Raven Saga Persister uses an identity document like the one you showed. It contains the SagaId, the UniqueValue (mostly for human comprehension and debugging, the database doesn't technically need it), and the SagaDocId (which is a little duplication as its only the {SagaTypeName}/{SagaId} where we already have the SagaId.

    With the SagaDocId, we can use the Include feature of RavenDB to do a query like this (which is from memory, probably wrong, and should only serve to illustrate the concept as pseudocode)...

    var identityDocId = // some value based on incoming message
    
    var idDoc = RavenSession
        // Look at the identity doc's SagaDocId and pull back that document too!
        .Include<SagaIdentity>(identityDoc => identityDoc.SagaDocId)
        .Load(identityDocId);
    
    var sagaData = RavenSession
        .Load(idDoc.SagaDocId); // Already in-memory, no 2nd round-trip to database!
    

    So then the identityDocId is very important because it describes the uniqueness of the value coming from the message, not just any old Guid will do. So what we really need to know is how to calculate that.

    For that, the NServiceBus saga persister code is instructive:

    void StoreUniqueProperty(IContainSagaData saga)
    {
        var uniqueProperty = UniqueAttribute.GetUniqueProperty(saga);
    
        if (!uniqueProperty.HasValue) return;
    
        var id = SagaUniqueIdentity.FormatId(saga.GetType(), uniqueProperty.Value);
        var sagaDocId = sessionFactory.Store.Conventions.FindFullDocumentKeyFromNonStringIdentifier(saga.Id, saga.GetType(), false);
    
        Session.Store(new SagaUniqueIdentity
        {
            Id = id, 
            SagaId = saga.Id, 
            UniqueValue = uniqueProperty.Value.Value,
            SagaDocId = sagaDocId
        });
    
        SetUniqueValueMetadata(saga, uniqueProperty.Value);
    }
    

    The important part is the SagaUniqueIdentity.FormatId method from the same file.

    public static string FormatId(Type sagaType, KeyValuePair<string, object> uniqueProperty)
    {
        if (uniqueProperty.Value == null)
        {
            throw new ArgumentNullException("uniqueProperty", string.Format("Property {0} is marked with the [Unique] attribute on {1} but contains a null value. Please make sure that all unique properties are set on your SagaData and/or that you have marked the correct properties as unique.", uniqueProperty.Key, sagaType.Name));
        }
    
        var value = Utils.DeterministicGuid.Create(uniqueProperty.Value.ToString());
    
        var id = string.Format("{0}/{1}/{2}", sagaType.FullName.Replace('+', '-'), uniqueProperty.Key, value);
    
        // raven has a size limit of 255 bytes == 127 unicode chars
        if (id.Length > 127)
        {
            // generate a guid from the hash:
            var key = Utils.DeterministicGuid.Create(sagaType.FullName, uniqueProperty.Key);
    
            id = string.Format("MoreThan127/{0}/{1}", key, value);
        }
    
        return id;
    }
    

    This relies on Utils.DeterministicGuid.Create(params object[] data) which creates a Guid out of an MD5 hash. (MD5 sucks for actual security but we are only looking for likely uniqueness.)

    static class DeterministicGuid
    {
        public static Guid Create(params object[] data)
        {
            // use MD5 hash to get a 16-byte hash of the string
            using (var provider = new MD5CryptoServiceProvider())
            {
                var inputBytes = Encoding.Default.GetBytes(String.Concat(data));
                var hashBytes = provider.ComputeHash(inputBytes);
                // generate a guid from the hash:
                return new Guid(hashBytes);
            }
        } 
    }
    

    That's what you need to replicate to get your utility to work properly.

    What's really interesting is that this code made it all the way to production - I'm surprised you didn't run into trouble before this, with messages creating new saga instances when they really shouldn't because they couldn't find the existing Saga data.

    I almost think it might be a good idea if NServiceBus would raise a warning any time you tried to find Saga Data by anything other than a [Unique] marked property, because it's an easy thing to forget to do. I filed this issue on GitHub and submitted this pull request to do just that.