
How to Implement a Rust Struct with Enum for Config File Parsing to Create Varying Object Types


I need help implementing Rust structs and enums to parse a config file, which will create objects of different types based on a field in the config.

In my use case the configuration specifies datastore details, where the datastore type (e.g. CSV, JSON) influences the structure and fields of the settings section.

Here's an example of a CSV config:

model:
  model_path: octaioxide/tests/testdata/pytorch_nn.onnx

datastore:
  datastore_type: csv
  settings:
    file_path: inference-example/test-data.csv
    delimiter: ;
    headers: true

And a JSON config:

model:
  model_path: octaioxide/tests/testdata/pytorch_nn.onnx

datastore:
  datastore_type: json
  settings:
    file_path: inference-example/test-data.json
    json_specific_field: true

I'd like to create a struct so that when I deserialise the config file, the result has a datastore object that is instantiated from the settings provided. Something like:

pub struct InferenceConfig {
    pub model: Model,
    pub datastore: DataStore
}

let inference = InferenceConfig::new(filepath);
let data = inference.datastore.load_data();

I had thought that there would be a way of creating a DataStore enum which held Csv, Json etc. objects:

pub enum DataStore {
    Csv(CsvDataSource),
    Json(JsonDataSource),
    // more
}

so that when the config is parsed, the datastore field of InferenceConfig holds a CsvDataSource object (which has a load_data method).

However, I have tried many variations on this without success.


Solution

  • I'm assuming you're using serde and serde_yaml to load this config.

    Your datastore structure reflects an adjacently tagged enum, meaning it has only a "tag" field and a "content" field (where the shape of the content is determined by the tag). You can configure that like so:

    #[derive(Deserialize)]
    #[serde(tag = "datastore_type", content = "settings", rename_all = "snake_case")]
    enum DataStore {
        Csv(CsvDataSource),
        Json(JsonDataSource),
        // more
    }
    

    The rename_all = ... part is just so that "json" (lowercase) gets parsed to the Json variant. Same for "csv".

    Full working example available on the playground.
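
Once the enum is deserialised, the load_data call from the question can be forwarded to whichever variant was parsed. The following is a minimal sketch of that dispatch, not the playground example: the struct fields are assumed from the two configs in the question, the load_data bodies are hypothetical stubs that only describe what they would read, and the serde derives are left out so the snippet depends on the standard library alone.

```rust
// Hypothetical settings structs; field names mirror the `settings`
// sections of the CSV and JSON configs in the question.
#[derive(Debug)]
pub struct CsvDataSource {
    pub file_path: String,
    pub delimiter: char,
    pub headers: bool,
}

#[derive(Debug)]
pub struct JsonDataSource {
    pub file_path: String,
    pub json_specific_field: bool,
}

// Same shape as the enum in the answer, minus the serde attributes.
#[derive(Debug)]
pub enum DataStore {
    Csv(CsvDataSource),
    Json(JsonDataSource),
}

impl CsvDataSource {
    // Stub: a real implementation would open and parse `file_path`.
    pub fn load_data(&self) -> Vec<String> {
        vec![format!(
            "csv:{}:{}:{}",
            self.file_path, self.delimiter, self.headers
        )]
    }
}

impl JsonDataSource {
    // Stub: a real implementation would open and parse `file_path`.
    pub fn load_data(&self) -> Vec<String> {
        vec![format!(
            "json:{}:{}",
            self.file_path, self.json_specific_field
        )]
    }
}

impl DataStore {
    // Forward the call to the variant that was parsed from the config.
    pub fn load_data(&self) -> Vec<String> {
        match self {
            DataStore::Csv(csv) => csv.load_data(),
            DataStore::Json(json) => json.load_data(),
        }
    }
}
```

With this in place, inference.datastore.load_data() works regardless of which datastore_type the config named. If the set of datastores grows large, a trait implemented by each settings struct is a common alternative to the match.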