
How can I refactor this C# code, which currently uses Dictionaries, to have even less redundancy and be more type-safe?


Because of business decisions which are above my paygrade, I need to parse and merge multiple XML files.

In order to cut down on redundant code, I have this map:

        private static readonly Dictionary<string, Type> listTypeByFileName = new Dictionary<string, Type> {
            {"a.xml", typeof(List<A>)},
            {"b.xml", typeof(List<B>)},
            {"c.xml", typeof(List<C>)},
            {"d.xml", typeof(List<D>)},
            // etc.
        };

Because of how this map gets used, after downloading and parsing all the XML files the result is of type Dictionary<string, object>, where the keys are the same as the keys in the above map and each value is of the type specified in the map. It is produced by calling DownloadFiles(config):

        private static Dictionary<string, object> DownloadFiles(IConfigurationRoot config) {
            Dictionary<string, object> dataListByFileNames = new Dictionary<string, object>();
            listTypeByFileName.Keys.ToList()
                .ForEach(name => dataListByFileNames.Add(name, DownloadData(name, config)));
            return dataListByFileNames;
        }

        private static object DownloadData(string name, IConfigurationRoot config) {
            _ = listTypeByFileName.TryGetValue(name, out Type listType);
            return new XmlSerializer(listType, new XmlRootAttribute("Document"))
                .Deserialize(new StringReader(DownloadFromBlobStorage(name, config).ToString()));
        }

        private static CloudBlockBlob DownloadFromBlobStorage(string filetoDownload, IConfigurationRoot config) {
            return CloudStorageAccount.Parse(config["AzureWebJobsStorage"])
                .CreateCloudBlobClient()
                .GetContainerReference(config["BlobStorageContainerName"])
                .GetBlockBlobReference(filetoDownload);
        }

First question: Is there a way I can make the return more typesafe? Perhaps using parameterized types?

The second part of the problem is actually consuming this Dictionary.

For each type in this Dictionary, I now need a function like:

        private void AddA(Dictionary<string, object> dataByFileNames) {
            if (dataByFileNames.TryGetValue("a.xml", out object data)) {
                List<A> aList = (List<A>)data;
                aList.ForEach(a => doSomethingWithA(a));
            }
        }

        private void AddB(Dictionary<string, object> dataByFileNames) {
            if (dataByFileNames.TryGetValue("b.xml", out object data)) {
                List<B> bList = (List<B>)data;
                bList.ForEach(b => doSomethingWithB(b));
            }
        }

       // etc.

As I already have the mapping from file names to types (at the top of this question), I feel there should be some way to abstract the above so that it does not need to be repeated again and again. Note that it may be significant that every type (A, B, C, D, etc.) has a property string Id, which will definitely be needed by all of the doSomethingWithX() methods. If useful, I can create an interface to expose it. It is okay if I need to cast to the correct type within each doSomethingWithX() method or when invoking each of these methods.


Solution

  • First, instead of storing the List<T> type in the dictionary, just store the underlying generic type:

    private static readonly Dictionary<string, Type> listTypeByFileName = new Dictionary<string, Type> {
        {"a.xml", typeof(A)},
        {"b.xml", typeof(B)}
        // etc.
    };

    That's going to make future steps a little bit easier. When deserializing, create the generic list type. After getting the type from the dictionary, you can do:

    var listType = typeof(List<>).MakeGenericType(typeRetrievedFromDictionary);
    

    Once you've deserialized it, cast it as IList. That's effectively casting it as a list of object. That's okay. Because you deserialized using a specific type, every item in the list will be of the expected type.
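The two steps above (building the list type at runtime, then treating the result as a non-generic IList) could be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the class A shown here, the RuntimeListDeserializer helper, and its DeserializeList method are hypothetical names, and the "Document" root element is assumed from the question's XmlRootAttribute.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-in for one of the question's data types.
public class A
{
    public string Id { get; set; }
}

// Hypothetical helper, named here only for illustration.
public static class RuntimeListDeserializer
{
    // Builds List<itemType> at runtime and deserializes the XML into it.
    // The "Document" root element name is taken from the question's code.
    public static IList DeserializeList(Type itemType, string xml)
    {
        Type listType = typeof(List<>).MakeGenericType(itemType);
        var serializer = new XmlSerializer(listType, new XmlRootAttribute("Document"));
        using (var reader = new StringReader(xml))
        {
            // The runtime object is a List<itemType>; List<T> also implements
            // the non-generic IList, so this cast succeeds for any item type.
            return (IList)serializer.Deserialize(reader);
        }
    }
}
```

Calling `RuntimeListDeserializer.DeserializeList(typeof(A), xml)` on a document whose root is `<Document>` and whose children are `<A>` elements should give back an IList whose items are all instances of A, even though the static type only promises object.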

    Create a dictionary for the type-safe methods you want to invoke on every item in the list.

    var methodsToInvokeByType = new Dictionary<Type, Action<object>>();
    

    Add methods to the dictionary:

    methodsToInvokeByType.Add(typeof(A), dataItem => DoSomethingWithA((A)dataItem));
    methodsToInvokeByType.Add(typeof(B), dataItem => DoSomethingWithB((B)dataItem));
    

    Now, once you've got your IList full of objects, you retrieve the type-safe method to invoke:

    var methodToInvoke = methodsToInvokeByType[typeRetrievedFromDictionary];
    

    Then do this:

    foreach(object itemInList in list) // this is your deserialized list cast as IList
    {
        methodToInvoke(itemInList);
    }
    

    So if the type is A, you'll be invoking

    DoSomethingWithA((A)itemInList)

    It's not pretty. Bridging between code that uses objects and Type and type-safe generic code can be messy. But ultimately the goal is that whatever those final methods are - DoSomethingWithA, DoSomethingWithB, etc., at least those are type-safe.
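Put together, the dictionary-of-actions approach might look like the sketch below. It works on in-memory lists so that it stays self-contained. The Dispatcher class, its Log list, and the bodies of DoSomethingWithA and DoSomethingWithB are placeholders invented for illustration; only the dictionary shape and the loop come from the steps above.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-ins for the question's data types.
public class A { public string Id { get; set; } }
public class B { public string Id { get; set; } }

public static class Dispatcher
{
    // Records what was processed so the effect is observable (illustration only).
    public static readonly List<string> Log = new List<string>();

    // Placeholder bodies; the real methods would do the actual work.
    private static void DoSomethingWithA(A a) => Log.Add("A:" + a.Id);
    private static void DoSomethingWithB(B b) => Log.Add("B:" + b.Id);

    // Each entry hides the cast inside a lambda, so the target methods stay type-safe.
    private static readonly Dictionary<Type, Action<object>> methodsToInvokeByType =
        new Dictionary<Type, Action<object>>
        {
            { typeof(A), dataItem => DoSomethingWithA((A)dataItem) },
            { typeof(B), dataItem => DoSomethingWithB((B)dataItem) },
        };

    // itemType is the type retrieved from the file-name dictionary;
    // list is the deserialized list cast as IList.
    public static void ProcessAll(Type itemType, IList list)
    {
        Action<object> methodToInvoke = methodsToInvokeByType[itemType];
        foreach (object itemInList in list)
            methodToInvoke(itemInList);
    }
}
```

The casts are confined to the dictionary initializer, which is the one place where a type and its handler can be mismatched; a wrong pairing there would surface as an InvalidCastException at run time rather than as a compile error.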


    You can simplify some more:

    Create a class that deserializes a list and passes it off to a method for processing, and an interface:

    public interface IXmlFileProcessor
    {
        void Process(byte[] xmlFile);
    }
    
    public class XmlFileProcessor<T> : IXmlFileProcessor
    {
        private readonly Action<T> _doSomething;
    
        public XmlFileProcessor(Action<T> doSomething)
        {
            _doSomething = doSomething;
        }
    
        public void Process(byte[] xmlFile) // or string or whatever
        {
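            // Deserialize into a List<T>. Needs System.IO and System.Xml.Serialization;
            // the "Document" root name is the one used in the question.
            var serializer = new XmlSerializer(typeof(List<T>), new XmlRootAttribute("Document"));
            List<T> deserializedList;
            using (var stream = new MemoryStream(xmlFile))
                deserializedList = (List<T>)serializer.Deserialize(stream);

            foreach (T item in deserializedList)
                _doSomething(item);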
        }
    }
    

    Then create a Dictionary<Type, IXmlFileProcessor> and populate it:

    fileProcessors.Add(typeof(A), new XmlFileProcessor<A>(SomeClass.DoSomethingWithA));
    fileProcessors.Add(typeof(B), new XmlFileProcessor<B>(SomeClass.DoSomethingWithB));
    

    That approach (injecting the Action) is intended to keep the "do something" method decoupled from the class responsible for deserialization. DoSomething could also be a generic method in XmlFileProcessor<T>. There are different ways to compose those classes and add them to that dictionary. But either way, having determined the type, you just retrieve the correct type-specific processor from the dictionary, pass your file to it, and it does the rest.

    That approach bridges the generic/non-generic gap by making the class - XmlFileProcessor<T> - generic, but having it implement a non-generic interface. It works as long as you take steps (using the dictionary) to ensure that you're selecting the correct implementation for whatever type you're deserializing.
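As one possible way to wire this up, the sketch below keys the processors by file name instead of by Type and uses a single generic handler that relies on the shared Id property mentioned in the question. Everything beyond IXmlFileProcessor and XmlFileProcessor<T> is an assumption made for illustration: the IHasId interface, the FileProcessors class, its SeenIds list, and the DoSomething<T> method are hypothetical names, and the "Document" root element is taken from the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical interface exposing the Id property that every type shares.
public interface IHasId
{
    string Id { get; set; }
}

// Hypothetical stand-ins for the question's data types.
public class A : IHasId { public string Id { get; set; } }
public class B : IHasId { public string Id { get; set; } }

public interface IXmlFileProcessor
{
    void Process(byte[] xmlFile);
}

public class XmlFileProcessor<T> : IXmlFileProcessor
{
    private readonly Action<T> _doSomething;

    public XmlFileProcessor(Action<T> doSomething)
    {
        _doSomething = doSomething;
    }

    public void Process(byte[] xmlFile)
    {
        var serializer = new XmlSerializer(typeof(List<T>), new XmlRootAttribute("Document"));
        using (var stream = new MemoryStream(xmlFile))
        {
            foreach (T item in (List<T>)serializer.Deserialize(stream))
                _doSomething(item);
        }
    }
}

public static class FileProcessors
{
    // Collects what was seen so the effect is observable (illustration only).
    public static readonly List<string> SeenIds = new List<string>();

    // One generic handler for every type; the constraint gives access to Id without casting.
    private static void DoSomething<T>(T item) where T : IHasId =>
        SeenIds.Add(typeof(T).Name + ":" + item.Id);

    // Keyed by file name, so a downloaded file can be handed straight to its processor.
    public static readonly Dictionary<string, IXmlFileProcessor> ByFileName =
        new Dictionary<string, IXmlFileProcessor>
        {
            { "a.xml", new XmlFileProcessor<A>(DoSomething<A>) },
            { "b.xml", new XmlFileProcessor<B>(DoSomething<B>) },
        };
}
```

With something like this in place, consuming a file reduces to `FileProcessors.ByFileName[name].Process(bytes)`. Whether a shared generic handler is enough depends on how different the per-type work really is; where it differs, separate methods such as DoSomethingWithA can be passed to the constructors instead.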