Reliable calculation of overridden culture-specific resources in .NET


It might be a bit hard to guess from the title what I mean, so let me elaborate.

I have an application that uses resources (resx) for translations. There is a standard Strings.resx file with all the strings, and Strings.xx-XX.resx files that override them for a specific xx-XX culture. The initial Strings.resx file has X strings, while a culture-specific Strings.xx-XX.resx file may have X or fewer.

I'm trying to write a function that can reliably calculate how many strings are overridden in a given language. Compared with the initial number, that gives me a nice overall translation percentage.

For example, say we have 10 strings overall and 6 strings in the es-ES culture. When an es-ES user launches the application, they'll get a message that the es-ES translation is 60% complete.

So far I've managed to write something like this:

ushort defaultResourceSetCount = 0;
ResourceSet defaultResourceSet = Strings.ResourceManager.GetResourceSet(CultureInfo.GetCultureInfo("en-US"), true, true);
if (defaultResourceSet != null) {
    defaultResourceSetCount = (ushort) defaultResourceSet.Cast<object>().Count();
}

ushort currentResourceSetCount = 0;
ResourceSet currentResourceSet = Strings.ResourceManager.GetResourceSet(CultureInfo.CurrentCulture, true, false);
if (currentResourceSet != null) {
    currentResourceSetCount = (ushort) currentResourceSet.Cast<object>().Count();
}

if (currentResourceSetCount < defaultResourceSetCount) {
    // This is our percentage that we want to calculate and show to user
    float translationCompleteness = currentResourceSetCount / (float) defaultResourceSetCount;
}

The above code works, but it has a lot of limitations around dialects that I want to solve. It basically works only for the very generic case of a specific culture such as es-ES with a Strings file in that same culture. If the user uses something else, such as es-UY, they'll get a fallback from .NET of es-ES, but that won't work for our calculation. I could switch the tryParents boolean in GetResourceSet() to true, but then I'd always get a fallback in the form of the en-US strings declared in the original Strings file, so we'd always have 100% translation progress, even if the user picked a totally different culture.

So basically, with our example of 10 resources in Strings.resx and 6 resources in Strings.es-ES.resx, the following should happen:

  • We get 100% completion when using en-US culture.
  • We get 100% completion when using en-GB culture, since it falls back to en-US, and we have that one covered.
  • We get 60% completion when using es-ES culture.
  • We get 60% completion when using es-UY culture, because .NET considers es, and then es-ES, as a fallback. Note that we have es-ES declared, not es.
  • We get 0% completion when using zh-CN, even though the deepest fallback is en-US, which we have covered.
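One way to express "any fallback better than Strings.resx counts as translated" is to walk the culture's parent chain by hand and stop before the invariant culture. The sketch below is only an illustration under that assumption: `CultureFallback`, `SelfAndParents`, `CountTranslated` and the `countFor` delegate are hypothetical names, not part of the question's code. It follows `CultureInfo.Parent` only, so whether es-ES is ever reached from es-UY depends on the parent chain the runtime reports; a same-language lookup for sibling cultures is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CultureFallback
{
    // Yields the culture itself, then each parent, stopping before the
    // invariant culture (whose Name is the empty string).
    public static IEnumerable<CultureInfo> SelfAndParents(CultureInfo culture)
    {
        for (CultureInfo c = culture; !string.IsNullOrEmpty(c.Name); c = c.Parent)
        {
            yield return c;
        }
    }

    // Returns the string count of the first culture in the chain that has
    // its own resources, or 0 when only the default strings would be used.
    // countFor is a hypothetical lookup: it returns null when the culture
    // has no dedicated resource set.
    public static int CountTranslated(CultureInfo culture, Func<CultureInfo, int?> countFor)
    {
        foreach (CultureInfo c in SelfAndParents(culture))
        {
            int? count = countFor(c);
            if (count.HasValue)
            {
                return count.Value;
            }
        }
        return 0;
    }
}
```

In the application, `countFor` could wrap `Strings.ResourceManager.GetResourceSet(c, true, false)` and return the number of entries, or null when that call returns null. The en-US and en-GB cases from the list would still need to be special-cased as 100%.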

I want to solve this problem in the best way, and I'm not quite sure what that best way is. I thought that simply getting the number of resources would work, and it does, but not for dialects. Trying all parents doesn't work either, since it would always end up at en-US. What I want is to treat any fallback that is better than Strings.resx as translated: es-ES is totally fine for an es-UY user, but en-US is bad for a zh-CN user. At the same time, en-US is totally fine for an en-GB user.

Maybe I could somehow get both ResourceSets with tryParents set and check which strings differ? It would need to be a reference comparison though, since it's entirely possible for some strings to have the same translation in two different languages. Is that even possible?

Any suggestions welcome.


Solution

  • This is the best I came up with for this issue:

    if (CultureInfo.CurrentCulture.TwoLetterISOLanguageName.Equals("en")) {
        return;
    }
    
    ResourceSet defaultResourceSet = Strings.ResourceManager.GetResourceSet(CultureInfo.GetCultureInfo("en-US"), true, true);
    if (defaultResourceSet == null) {
        return;
    }
    
    HashSet<DictionaryEntry> defaultStringObjects = new HashSet<DictionaryEntry>(defaultResourceSet.Cast<DictionaryEntry>());
    if (defaultStringObjects.Count == 0) {
        return;
    }
    
    ResourceSet currentResourceSet = Strings.ResourceManager.GetResourceSet(CultureInfo.CurrentCulture, true, true);
    if (currentResourceSet == null) {
        return;
    }
    
    HashSet<DictionaryEntry> currentStringObjects = new HashSet<DictionaryEntry>(currentResourceSet.Cast<DictionaryEntry>());
    if (currentStringObjects.Count >= defaultStringObjects.Count) {
        // Either we have 100% finished translation, or we're missing it entirely and using en-US
        HashSet<DictionaryEntry> testStringObjects = new HashSet<DictionaryEntry>(currentStringObjects);
        testStringObjects.ExceptWith(defaultStringObjects);
    
        // If we got 0 as final result, this is the missing language
        // Otherwise it's just a small amount of strings that happen to be the same
        if (testStringObjects.Count == 0) {
            currentStringObjects = testStringObjects;
        }
    }
    
    if (currentStringObjects.Count < defaultStringObjects.Count) {
        float translationCompleteness = currentStringObjects.Count / (float) defaultStringObjects.Count;
        Console.WriteLine("Do something with translation completeness: " + translationCompleteness);
    }
    

    It requires only two fairly reasonable assumptions:

    1. The current culture's resource set can't have more resources than our default (en-US) culture. This is always the case when the other files only contain translations.
    2. A 100% finished translation must have at least 1 translated resource whose text differs from the original default string. Otherwise there is no way to tell whether this is an intentionally translated resource set or the default one pulled from en-US.

    I'm very happy with this solution.
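To try the set logic without any .resx files, the same idea can be sketched over plain dictionaries. This is a minimal sketch, not the code above verbatim: `TranslationStats` and `Completeness` are hypothetical names, the dictionaries stand in for the two ResourceSets, and returning 1 for a full translation (and for an empty default set) is an assumption, since the original simply does nothing in that case.

```csharp
using System.Collections.Generic;

public static class TranslationStats
{
    // defaults: key/value pairs of the default (en-US) strings.
    // current:  key/value pairs resolved for the current culture.
    // Returns a value between 0 and 1.
    public static float Completeness(
        IReadOnlyDictionary<string, string> defaults,
        IReadOnlyDictionary<string, string> current)
    {
        if (defaults.Count == 0)
        {
            return 1f; // Assumption: nothing to translate counts as complete
        }

        var defaultEntries = new HashSet<KeyValuePair<string, string>>(defaults);
        var currentEntries = new HashSet<KeyValuePair<string, string>>(current);

        if (currentEntries.Count >= defaultEntries.Count)
        {
            // Either a finished translation, or a full fallback to the defaults
            var differing = new HashSet<KeyValuePair<string, string>>(currentEntries);
            differing.ExceptWith(defaultEntries);

            // No differing entry at all means we are looking at the defaults
            return differing.Count == 0 ? 0f : 1f;
        }

        return currentEntries.Count / (float) defaultEntries.Count;
    }
}
```

Entries are compared by key and value, so assumption 2 applies here as well: a complete translation in which every string happens to match the default text is reported as 0.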