HTTPS not working with Kestrel 3.2.187/ASP.NET Core 2.1.5 running on Service Fabric with a custom domain


We are running a Service Fabric application on our remote dev cluster. It consists of several stateful and stateless services and is fronted by several front-end APIs running on Kestrel.

Until now, since it was not used for production, Kestrel was configured to use a self-signed certificate, which was also used for the reverse proxy and the cluster itself. The service was running directly on the default domain provided by Azure, <app>.<region>.cloudapp.azure.com.

We are now at the point in development where the self-signed certificate errors are becoming problematic, with third-party callbacks rejecting the connection, so it is time to start using a proper domain and certificate.

So far, I have done the following:

  • Added an A record for devcluster.someexampledomain.com -> our public IP for the service.
  • Created a Wildcard Azure Application certificate for *.someexampledomain.com.
  • Imported the certificate to Azure Key Vault.
  • Bound the certificate to the Vault Secrets of the cluster, pulling the certificate to Cert:/LocalMachine/My/
  • Modified the application config to use this certificate when initialising Kestrel, and verified that the certificate is found during initialisation.
    • I have tried with and without UseHsts() and UseHttpsRedirection().
    • Kestrel is configured with Listen(IPAddress.IPv6Any, endpoint.Port, ...) and UseHttps(X509Certificate2) on the options object.
    • UseUrls(string) is used with the default URL, which is https://+:<port>, but I also tried manually adding https://*:<port> and even the actual hostname itself.

No matter what I have tried, no HTTPS connection can be established to the server. The endpoints of the other staging servers, which still use the old certificate, work as expected.

Using openssl s_client -connect devcluster.someexampledomain.com:<port> -prexit, I get:

---
no peer certificate available
---
No client certificate CA names sent
---
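
The same check can be scripted so it is easy to repeat after each configuration change. This is only a sketch of the idea: `probe_tls` is a hypothetical helper name, and certificate verification is switched off on purpose, because the question here is whether any certificate is served at all, not whether it is trusted.

```python
import socket
import ssl


def probe_tls(host, port, timeout=5.0):
    """Attempt a TLS handshake and report what the server presented.

    Returns a dict with 'ok' (bool) plus either the negotiated protocol and
    the size of the DER-encoded certificate, or the error text.
    """
    # Verification is disabled deliberately: we only want to know whether
    # the handshake completes and a certificate is sent.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                der = tls.getpeercert(binary_form=True)
                return {
                    "ok": True,
                    "protocol": tls.version(),
                    "cert_bytes": len(der) if der else 0,
                }
    except OSError as exc:  # covers ssl.SSLError, connection resets, timeouts
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}
```

Calling `probe_tls("devcluster.someexampledomain.com", port)` with the service's port should return `ok: False` with the error text while the server keeps closing the connection during the handshake, and `ok: True` once a certificate is actually served.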

There are no errors or exceptions being logged on ETW, everything seems to be in order. I suspect that this might have something to do with the CN of the certificate but I have run out of ideas to try and find out what is going on and how to fix it.
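
One way to rule the certificate name in or out is to check whether the wildcard pattern covers the hostname at all. The sketch below is a simplified version of the usual matching rule (a leading `*` stands for exactly one label); `wildcard_matches` is a hypothetical helper, and it ignores partial wildcards such as `dev*.example.com`.

```python
def wildcard_matches(pattern, hostname):
    """Return True if a certificate name pattern covers the hostname.

    Simplified rule: a leading '*' label matches exactly one non-empty
    label; all other labels must match case-insensitively.
    """
    p = pattern.lower().rstrip(".").split(".")
    h = hostname.lower().rstrip(".").split(".")
    if len(p) != len(h):
        # A wildcard never spans more than one label, so lengths must agree.
        return False
    if p[0] == "*":
        return bool(h[0]) and p[1:] == h[1:]
    return p == h
```

Under this rule `*.someexampledomain.com` covers `devcluster.someexampledomain.com`, but not the bare `someexampledomain.com` or a deeper name such as `a.b.someexampledomain.com`. If the names match, the certificate subject is unlikely to be the cause.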

I have been looking into this using Fiddler without getting much out of it. The session just ends with: fiddler.network.https> HTTPS handshake to <myhost> (for #191) failed. System.IO.IOException Authentication failed because the remote party has closed the transport stream.

Does anybody know how to add some logging on the Kestrel side? I don't think installing Fiddler on the Azure VMs running my cluster is a viable solution.


Solution

  • After delving into the Kestrel source, I found that it logs under "Microsoft-AspNetCore-Server-Kestrel" and "Microsoft-Extensions-Logging". Once I added those two providers to the log transfer, I could see what was happening.

    Connections were terminating with the following exception:

    System.ComponentModel.Win32Exception (0x8009030D): The credentials supplied to the package were not recognized
       at System.Net.SSPIWrapper.AcquireCredentialsHandle(SSPIInterface secModule, String package, CredentialUse intent, SCHANNEL_CRED scc)
       at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(CredentialUse credUsage, SCHANNEL_CRED secureCredential)
       at System.Net.Security.SslStreamPal.AcquireCredentialsHandle(X509Certificate certificate, SslProtocols protocols, EncryptionPolicy policy, Boolean isServer)
       at System.Net.Security.SecureChannel.AcquireServerCredentials(Byte[]& thumbPrint, Byte[] clientHello)
       at System.Net.Security.SecureChannel.GenerateToken(Byte[] input, Int32 offset, Int32 count, Byte[]& output)
       at System.Net.Security.SecureChannel.NextMessage(Byte[] incoming, Int32 offset, Int32 count)
       at System.Net.Security.SslState.StartSendBlob(Byte[] incoming, Int32 count, AsyncProtocolRequest asyncRequest)
       at System.Net.Security.SslState.ProcessReceivedBlob(Byte[] buffer, Int32 count, AsyncProtocolRequest asyncRequest)
       at System.Net.Security.SslState.PartialFrameCallback(AsyncProtocolRequest asyncRequest)
    --- End of stack trace from previous location where exception was thrown ---
       at System.Net.Security.SslState.EndProcessAuthentication(IAsyncResult result)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
    --- End of stack trace from previous location where exception was thrown ---
       at Microsoft.AspNetCore.Server.Kestrel.Https.Internal.HttpsConnectionAdapter.InnerOnConnectionAsync(ConnectionAdapterContext context)
       at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.HttpConnection.ApplyConnectionAdaptersAsync()
    

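    This makes it a manifestation of the issue described in "Certificate problem with a new machine - credentials supplied to package not recognized": the account running the service cannot access the certificate's private key.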

    I spent some time trying to figure out the best way to sort this out. The Service Fabric documentation has a script to modify the permissions, but that did not sound right.

    As it turns out, this can be done directly in the ApplicationManifest as follows:

    <Principals>
        <Users>
            <User Name="NETWORK SERVICE" AccountType="NetworkService" />
        </Users>
    </Principals>
    <Policies>
        <SecurityAccessPolicies>
            <SecurityAccessPolicy ResourceRef="HttpsCert2" PrincipalRef="NETWORK SERVICE" ResourceType="Certificate" />
        </SecurityAccessPolicies>
    </Policies>
    <Certificates>
        <SecretsCertificate X509FindValue="[HttpsCertThumbprint]" Name="HttpsCert2" />
        <EndpointCertificate X509FindValue="[HttpsCertThumbprint]" Name="HttpsCert" />
    </Certificates>
    

    For the SecurityAccessPolicy to find the ResourceRef, it had to point at a SecretsCertificate, not an EndpointCertificate. Since the EndpointBindingPolicy requires an EndpointCertificate, I added both a SecretsCertificate and an EndpointCertificate, with different names. Both refer to the same certificate, so it worked. It doesn't feel particularly clean having to double them up, but that is the solution I have for now.
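
The naming requirement described above is easy to get wrong, so it may be worth checking before deployment. The following is a minimal sketch, assuming the manifest is available as a string; `unresolved_certificate_refs` is a hypothetical helper, not part of any Service Fabric tooling, and it only checks that each certificate ResourceRef names a SecretsCertificate.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace prefix such as '{uri}Name' down to 'Name'."""
    return tag.rsplit("}", 1)[-1]


def unresolved_certificate_refs(manifest_xml):
    """List ResourceRef values that do not name any SecretsCertificate."""
    root = ET.fromstring(manifest_xml)
    secrets = set()
    refs = []
    for element in root.iter():
        name = _local(element.tag)
        if name == "SecretsCertificate":
            secrets.add(element.get("Name"))
        elif (name == "SecurityAccessPolicy"
              and element.get("ResourceType") == "Certificate"):
            refs.append(element.get("ResourceRef"))
    # Anything left here would not be resolved to a secrets certificate.
    return [ref for ref in refs if ref not in secrets]
```

An empty list means every certificate access policy points at a SecretsCertificate entry. A non-empty list names the references that only exist as an EndpointCertificate or not at all.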