c# cassandra datastax datastax-enterprise

Datastax C# driver 3.3.0 deadlocking on connect to cluster?


To Datastax C# driver engineers:

C# driver 3.3.0 deadlocks when calling Connect(). The following code snippet, run from a Windows Forms application, deadlocks while trying to connect:

    public void SimpleConnectTest()
    {
        const string ip = "127.0.0.1";
        const string keyspace = "somekeyspace";

        QueryOptions queryOptions = new QueryOptions();
        queryOptions.SetConsistencyLevel(ConsistencyLevel.One);

        Cluster cluster = Cluster.Builder()
            .AddContactPoints(ip)
            .WithQueryOptions(queryOptions)
            .Build();

        var cassandraSession = cluster.Connect(keyspace);

        Assert.AreNotEqual(null, cassandraSession);

        cluster.Dispose();
    }

The deadlock happens here, in Cluster.cs:

    private void Init()
    {
        ...
        TaskHelper.WaitToComplete(_controlConnection.Init(), initialAbortTimeout);
        ...
    }

I have tested this against Cassandra 3.9.0 (CQL spec 3.4.2) on my local machine.

The task returned by _controlConnection.Init() never completes. The debugger shows it as:

    task = Id = 11, Status = WaitingForActivation, Method = "{null}", Result = "{Not yet computed}"

The call then blocks for 30000 ms and throws this:

                throw new TimeoutException(
                    "Cluster initialization was aborted after timing out. This mechanism is put in place to" +
                    " avoid blocking the calling thread forever. This usually caused by a networking issue" +
                    " between the client driver instance and the cluster.", ex);

Running the same test on 3.2.0 has no such problem. Can anyone else reproduce this? Maybe it only happens to me.

Edit:

Here is a screenshot of the deadlock:

[Screenshot: deadlocked tasks with blocking awaiting]


Solution

  • Thanks to the details in your comments, we were able to identify the underlying issue.

    Similar to what was proposed by Luke, there were some missing ConfigureAwait() calls.
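
    The mechanism can be reproduced without the driver. The following is a minimal sketch, not the driver's code: `NeverPumpingContext` is a hypothetical stand-in for a UI synchronization context whose thread is blocked and so never runs posted callbacks, and `InitCapturingContext` / `InitWithConfigureAwait` are illustrative names. Blocking on the first method should time out, because its continuation is posted to the context and never runs. The second should complete, because `ConfigureAwait(false)` lets the continuation run on a thread-pool thread.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    // Hypothetical stand-in for a UI context (not part of the driver).
    // Posted callbacks are dropped, which models a UI thread that is
    // blocked and therefore never processes its message queue.
    public sealed class NeverPumpingContext : SynchronizationContext
    {
        public override void Post(SendOrPostCallback d, object state)
        {
            // Intentionally never invokes d.
        }
    }

    public static class SyncOverAsyncDemo
    {
        // The await captures SynchronizationContext.Current, so the rest
        // of the method is posted back to that context.
        public static async Task InitCapturingContext()
        {
            await Task.Delay(50);
        }

        // ConfigureAwait(false) skips the captured context; the rest of
        // the method runs on a thread-pool thread.
        public static async Task InitWithConfigureAwait()
        {
            await Task.Delay(50).ConfigureAwait(false);
        }

        // Installs the stand-in context, blocks on the task with a timeout,
        // and reports whether the task finished in time.
        public static bool CompletesWhileBlocked(Func<Task> init, int timeoutMs = 1000)
        {
            var previous = SynchronizationContext.Current;
            SynchronizationContext.SetSynchronizationContext(new NeverPumpingContext());
            try
            {
                return init().Wait(timeoutMs);
            }
            finally
            {
                SynchronizationContext.SetSynchronizationContext(previous);
            }
        }
    }
    ```

    With these definitions, `SyncOverAsyncDemo.CompletesWhileBlocked(SyncOverAsyncDemo.InitCapturingContext)` returns `false` once the timeout elapses, while passing `SyncOverAsyncDemo.InitWithConfigureAwait` returns `true`. The first case is the same shape as the 30000 ms timeout reported in the question.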

    This issue impacts users that call Cluster.Connect() in environments with a SynchronizationContext, which is not a common use case:

    • For Windows Forms, it's unlikely that an application communicates directly with a database (without a service in the middle). Furthermore, users should call Connect() before creating a form (where there is no SynchronizationContext), so that the same Session instance is shared across all forms.
    • For ASP.NET, users should call Connect() outside of any endpoint action, before the HttpContext is created (where there is no SynchronizationContext).
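
    For code that cannot easily move the Connect() call, one possible way to get the "no SynchronizationContext" condition described above is to clear the current context for the duration of the blocking call and restore it afterwards. This is only a sketch under assumptions: `ContextFree.Run` is a hypothetical helper, not a driver API, and it has not been verified against driver 3.3.0.

    ```csharp
    using System;
    using System.Threading;

    // Hypothetical helper (not part of the driver): runs a blocking call
    // with no SynchronizationContext installed on the current thread, so
    // awaits started inside the call have no context to capture.
    public static class ContextFree
    {
        public static T Run<T>(Func<T> blockingCall)
        {
            var previous = SynchronizationContext.Current;
            SynchronizationContext.SetSynchronizationContext(null);
            try
            {
                return blockingCall();
            }
            finally
            {
                // Always restore the caller's context, even on failure.
                SynchronizationContext.SetSynchronizationContext(previous);
            }
        }
    }
    ```

    Applied to the snippet from the question, the call would look like `var cassandraSession = ContextFree.Run(() => cluster.Connect(keyspace));`. The calling thread still blocks until the connection is established, so connecting once at startup, as recommended above, remains the simpler option.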

    Note that this issue affects only Connect() calls. Other blocking calls like Execute() don't have this issue.

    In any case, this issue could be a showstopper for users getting started with the driver, for example, users creating a simple Windows Forms app to try out a concept.

    I've submitted a pull request with the fix. It also contains a test that scans the source code for uses of await without a ConfigureAwait() call, to avoid having this issue in the future: https://github.com/datastax/csharp-driver/pull/309

    You can expect the fix to land in the next patch release.