c#, .net, azure, azure-service-fabric

Relying on a stateful service for configuration values?


We have approximately 100 microservices running. Each microservice has an entire set of configuration files such as applicationmanifest.xml, settings.xml, node1.xml, etc.

This is getting to be a configuration nightmare.

After exploring this, someone has suggested:

You can keep configs inside stateful service, then change parameters through your API.

The problem I see with this is that there is now a single point of failure: the service that provides the configuration values.

Is there a centralized solution to maintaining so much configuration data for every microservice?


Solution


  • NOTE: Your question seems to blur two separate issues: whether a single configuration service is reliable, and whether to use static or dynamic configuration.

    For the debate on static vs dynamic configuration, see my answer to the OP's other question.


    A config service sounds reasonable, particularly when you consider that Service Fabric is designed to be reliable, even for stateful services.

    MSDN:

    Service Fabric enables you to build and manage scalable and reliable applications composed of microservices that run at high density on a shared pool of machines, which is referred to as a cluster

    Develop highly reliable stateless and stateful microservices.

    Stateful services store state in a reliable distributed dictionary. Each write is enclosed in a transaction, which guarantees that the data is stored if the transaction committed successfully.
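
    As a rough illustration of that transactional write, a configuration service might keep its values in a reliable dictionary along these lines. This is a minimal sketch, not a complete service: the class name ConfigService, the dictionary name "config", and the two helper methods are hypothetical, and the code assumes the Service Fabric SDK packages (Microsoft.ServiceFabric.Data and Microsoft.ServiceFabric.Services) are referenced.

    ```csharp
    using System.Fabric;
    using System.Threading.Tasks;
    using Microsoft.ServiceFabric.Data;
    using Microsoft.ServiceFabric.Data.Collections;
    using Microsoft.ServiceFabric.Services.Runtime;

    // Hypothetical configuration service; only the Reliable Collections calls are SDK APIs.
    internal sealed class ConfigService : StatefulService
    {
        // Assumed dictionary name, chosen for this example only.
        private const string DictionaryName = "config";

        public ConfigService(StatefulServiceContext context) : base(context) { }

        // Stores one setting inside a transaction.
        public async Task SetValueAsync(string key, string value)
        {
            var settings = await this.StateManager
                .GetOrAddAsync<IReliableDictionary<string, string>>(DictionaryName);

            using (ITransaction tx = this.StateManager.CreateTransaction())
            {
                // Adds the key or overwrites its existing value.
                await settings.SetAsync(tx, key, value);

                // The write only takes effect once the commit completes.
                await tx.CommitAsync();
            }
        }

        // Reads one setting; returns null when the key does not exist.
        public async Task<string> GetValueAsync(string key)
        {
            var settings = await this.StateManager
                .GetOrAddAsync<IReliableDictionary<string, string>>(DictionaryName);

            using (ITransaction tx = this.StateManager.CreateTransaction())
            {
                ConditionalValue<string> result = await settings.TryGetValueAsync(tx, key);
                return result.HasValue ? result.Value : null;
            }
        }
    }
    ```

    If the transaction is disposed without calling CommitAsync, the write is discarded, which is what makes a successful commit the point at which the value counts as stored.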

    OP:

    The problem I see with this is that there is now a single point of failure: the service that provides the configuration values.

    Not necessarily. The single point of failure is not really the service itself but the "fault domain" it runs in, as defined by Service Fabric and your chosen Azure data centre deployment options.

    MSDN:

    A Fault Domain is any area of coordinated failure. A single machine is a Fault Domain (since it can fail on its own for various reasons, from power supply failures to drive failures to bad NIC firmware). Machines connected to the same Ethernet switch are in the same Fault Domain, as are machines sharing a single source of power or in a single location. Since it's natural for hardware faults to overlap, Fault Domains are inherently hierarchal and are represented as URIs in Service Fabric.

    It is important that Fault Domains are set up correctly since Service Fabric uses this information to safely place services. Service Fabric doesn't want to place services such that the loss of a Fault Domain (caused by the failure of some component) causes a service to go down. In the Azure environment Service Fabric uses the Fault Domain information provided by the environment to correctly configure the nodes in the cluster on your behalf. For Service Fabric Standalone, Fault Domains are defined at the time that the cluster is set up

    So you would probably want to have at least two configuration services running on two separate fault domains.
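
    With a stateful service, this would typically be done by running several replicas of the one configuration service rather than deploying it twice; Service Fabric then tries to place those replicas in different fault domains. A hedged sketch of creating such a service with three replicas follows. The application name, service name, and service type name are hypothetical placeholders, the replica counts are only an example, and the code assumes it runs somewhere a FabricClient can reach the cluster.

    ```csharp
    using System;
    using System.Fabric;
    using System.Fabric.Description;
    using System.Threading.Tasks;

    internal static class ConfigServiceDeployment
    {
        public static async Task CreateAsync()
        {
            var description = new StatefulServiceDescription
            {
                // Hypothetical names; replace with those from your own manifests.
                ApplicationName = new Uri("fabric:/MyApp"),
                ServiceName = new Uri("fabric:/MyApp/ConfigService"),
                ServiceTypeName = "ConfigServiceType",

                HasPersistedState = true,
                PartitionSchemeDescription = new SingletonPartitionSchemeDescription(),

                // Example values: aim for three replicas, tolerate dropping to two.
                TargetReplicaSetSize = 3,
                MinReplicaSetSize = 2
            };

            using (var client = new FabricClient())
            {
                await client.ServiceManager.CreateServiceAsync(description);
            }
        }
    }
    ```

    The same replica counts can also be set declaratively in the application manifest instead of in code.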
