Search code examples
.netpowershellkeyhashtable

Is it possible to get the actual key back from a hashtable


Given a hash table (or a dictionary) like:

$HashTable = @{ 'One' = 1; 'Two' = 2; 'Three' = 3 }

(which is case insensitive by default, but doesn't has to be in every situation for my case. In fact, I have no clue of the comparer that is used in the given hashtable or dictionary)

It is easy to get the related value for a specific item:

$HashTable['two'] # Or: $HashTable.get_item('two')
2

But is it also possible to get the actual key (of a specific item) back, something like:

$HashTable.get_Key('two')

Which would return Two (which a capital T) rather than an error or two in lowercase?


Update 1

Apparently I haven't been clear in my question, so let me rephrase it:

For any given hashtable (or dictionary) with an unknown comparer, I am looking for a way to retrieve the original key of a specific item similar as I would retrieve the value of t specific item. Meaning:

  • If the key doesn't exist, I would not expect anything in return (similar as would look for the value using that same key)
  • If the key does exist, I expect a (single) original key in return (similar as would get a single) value in return)

Meanwhile, I am starting to believe that the answer to my question is:

That is not possible

Simply because the key of an hashtable isn't reordered but only the hashcode and collision recovery data but not the whole key (/string, which might virtually unlimited in size).. That doesn't make sense either because the whole key is still available in the .keys collection but apparently the (reverse) link with the actual item is gone...


Update 2

To be honest what I was expecting from this question, was either:

  • a standard way to retrieve original key
  • or "not possible" with possibly a why that is the case

But it turns out that I got a list of interesting alternatives which I might use instead. For my actual use case, the performance is important but more important is the fact whether the provide suggestion reacts according my expectations. Therefore I have setup Pester test bench to confirm this:

Describe 'Retrive the original key from a hashtable or dictionary' { # https://stackoverflow.com/a/78656228

    BeforeAll {
        $CaseInsensitive = [Ordered]@{
            One   = 1
            Two   = 2
            Three = 3
            4 = 'Four'
        }

        $CaseSensitive = [System.Collections.Specialized.OrderedDictionary]::new()
        $CaseSensitive['One']   = 1
        $CaseSensitive['two']   = 2
        $CaseSensitive['Two']   = 2
        $CaseSensitive['three'] = 3
        $CaseSensitive['Three'] = 3
        $CaseSensitive['THREE'] = 3
        $CaseSensitive[[Object]4] = 'Four'

        # mklement0 https://stackoverflow.com/a/78656228/1701026
        function GetKey($Dictionary, $lookupKey) {
            if ($Dictionary.Contains($lookupKey)) {
                $candidateKeys = $Dictionary.Keys.Where({ $_ -eq $lookupKey }, 0, 2)
                if ($candidateKeys.Count -ge 2) { $lookupKey }
                else                            { $candidateKeys[0] }
            }
        }

        # Fabrice SANGA https://stackoverflow.com/a/78656281/1701026
        # $CaseInsensitive | Add-Member -Force -MemberType ScriptMethod -Name get_Key -Value {
            # param([string] $CaseInsensitiveIndex)
            # return $this.Keys.Where{ $_ -ceq ([CultureInfo]::InvariantCulture).TextInfo.ToTitleCase($CaseInsensitiveIndex) }
        # }
        # $CaseSensitive | Add-Member -Force -MemberType ScriptMethod -Name get_Key -Value {
            # param([string] $CaseInsensitiveIndex)
            # return $this.Keys.Where{ $_ -ceq ([CultureInfo]::InvariantCulture).TextInfo.ToTitleCase($CaseInsensitiveIndex) }
        # }
        # function GetKey($Dictionary, $lookupKey) {
            # $Dictionary.get_Key($lookupKey)
        # }

        # Abraham Zinala https://stackoverflow.com/a/78657526/1701026
        # function GetKey($HashTable, $to_match) {
            # [System.Linq.Enumerable]::Where(
            # ($entries = [System.Collections.DictionaryEntry[]]@($HashTable.GetEnumerator())),
                # [Func[System.Collections.DictionaryEntry, bool]]{
                    # param($entry)
                    # $entry.Key -match $to_match -and $entries.Where{$_.Key -match $to_match}.Count -eq 1 -or 
                    # $entry.Key -cmatch $to_match -and $entries.Where{$_.Key -match $to_match}.Count -ge 1
                # }
            # ).Key
        # }

        # iRon
        # function GetKey($Dictionary, $LookupKey) {
            # if ($Dictionary -is [System.Collections.IDictionary]) {
                # if (-not $Dictionary.Contains($LookupKey)) { return }
            # } else {
                # if (-not $Dictionary.ContainsKey($LookupKey)) { return }
            # }
            # foreach ($Key in $Dictionary.get_Keys()) {
                # if ($Key -ceq $LookupKey) { return $Key }
                # if ($Key -ieq $LookupKey) { $iKey = $Key }
            # }
            # if ($Null -ne $iKey) { $iKey } else { $LookupKey }
        # }
    }

    Context 'Case insensitive, existing key' {


        It 'One' {
            $CaseInsensitive['One'] | Should -Be 1
            GetKey $CaseInsensitive 'One' | Should -BeExactly 'One'
        }

        It 'ONE' {
            $CaseInsensitive['ONE'] | Should -Be 1
            GetKey $CaseInsensitive 'ONE' | Should -BeExactly 'One'
        }

        It 'two' {
            $CaseInsensitive['two'] | Should -Be 2
            GetKey $CaseInsensitive 'two' | Should -BeExactly 'Two'
        }

        It 'Two' {
            $CaseInsensitive['Two'] | Should -Be 2
            GetKey $CaseInsensitive 'Two' | Should -BeExactly 'Two'
        }

        It 'TWO' {
            $CaseInsensitive['TWO'] | Should -Be 2
            GetKey $CaseInsensitive 'TWO' | Should -BeExactly 'Two'
        }

        It 'three' {
            $CaseInsensitive['three'] | Should -Be 3
            GetKey $CaseInsensitive 'three' | Should -BeExactly 'Three'
        }

        It 'Three' {
            $CaseInsensitive['Three'] | Should -Be 3
            GetKey $CaseInsensitive 'Three' | Should -BeExactly 'Three'
        }

        It 'THREE' {
            $CaseInsensitive['THREE'] | Should -Be 3
            GetKey $CaseInsensitive 'THREE' | Should -BeExactly 'Three'
        }

        It '4' {
            $CaseInsensitive[[Object]4] | Should -Be 'Four'
            GetKey $CaseInsensitive 4 | Should -Be 4
        }
    }

    Context 'Case insensitive, missing key' {

        It '"4"' {
            $CaseInsensitive['4'] | Should -BeNullOrEmpty
            GetKey $CaseInsensitive '4' | Should -BeNullOrEmpty
        }
    }

    Context 'Case sensitive, existing key' {

        It 'One' {
            $CaseSensitive['One'] | Should -Be 1
            GetKey $CaseSensitive 'One' | Should -BeExactly 'One'
        }

        It 'two' {
            $CaseSensitive['Two'] | Should -Be 2
            GetKey $CaseSensitive 'Two' | Should -BeExactly 'Two'
        }

        It 'Two' {
            $CaseSensitive['Two'] | Should -Be 2
            GetKey $CaseSensitive 'Two' | Should -BeExactly 'Two'
        }

        It 'three' {
            $CaseSensitive['three'] | Should -Be 3
            GetKey $CaseSensitive 'three' | Should -BeExactly 'three'
        }

        It 'Three' {
            $CaseSensitive['Three'] | Should -Be 3
            GetKey $CaseSensitive 'Three' | Should -BeExactly 'Three'
        }

        It 'THREE' {
            $CaseSensitive['THREE'] | Should -Be 3
            GetKey $CaseSensitive 'THREE' | Should -BeExactly 'THREE'
        }

        It '4' {
            $CaseInsensitive[[Object]4] | Should -Be 'Four'
            GetKey $CaseInsensitive 4 | Should -Be 4
        }
    }

    Context 'Case sensitive, missing key' {

        It 'one' {
            $CaseSensitive['one'] | Should -BeNullOrEmpty
            GetKey $CaseSensitive 'one' | Should -BeNullOrEmpty
        }

        It 'ONE' {
            $CaseSensitive['ONE'] | Should -BeNullOrEmpty
            GetKey $CaseSensitive 'ONE' | Should -BeNullOrEmpty
        }

        It 'TWO' {
            $CaseSensitive['TWO'] | Should -BeNullOrEmpty
            GetKey $CaseSensitive 'TWO' | Should -BeNullOrEmpty
        }

        It 'Four' {
            $CaseSensitive['Four'] | Should -BeNullOrEmpty
            GetKey $CaseSensitive 'Four' | Should -BeNullOrEmpty
        }

        It '"4"' {
            $CaseInsensitive['4'] | Should -BeNullOrEmpty
            GetKey $CaseInsensitive '4' | Should -BeNullOrEmpty
        }
    }
}

Solution

  • Preface:

    • Small caveat: The solutions below rely on PowerShell's -eq operator to perform case-insensitive equality comparison, which implicitly occur in the context of the invariant culture, where as given hashtable / dictionary may use a different culture context, such as the current one.
      While this typically won't make a difference, there are edge cases where it may. In the absence of being able to obtain the comparer that a given hashtable / dictionary instance uses internally,[1] there's no robust solution to this problem.

    I'm not aware of an efficient way to obtain the case-exact form of the original key, but the following is a concise one (at the expense of performance, 'First' can be omitted), using the intrinsic .Where() method:

    @{
     'One' = 1
     'Two' = 2
     'Three' = 3
    }.Keys.Where({ $_ -eq 'two' }, 'First')[0] # -> 'Two'
    

    Note:

    • The above only works reliably with case-insensitive dictionaries, which PowerShell's hashtables and [ordered] hashtable literals are (note that the use of 'First' isn't strictly necessary, but is a performance optimization: it stops looking after the first - and by definition only in a case-insensitive dictionary - match; [0] extracts the one and only element of the collection of matches that .Where() invariably returns).

    • With case-sensitive dictionaries, the above may report false positives - see below for a solution that handles such dictionaries too.

      • If, by contrast, you want to retrieve (potentially) all keys from a case-sensitive dictionary that are identical to or case variations of the given lookup key (whether or not the lookup key matches one of them case-exactly), replace
        .Keys.Where({ $_ -eq 'two' }, 'First')[0] with
        .Keys.Where({ $_ -eq 'two' })
    • The reason this is inefficient is the need to perform a linear (O(N)) search through the keys collection, and the fact that a script block must be invoked for each collection element doesn't help.


    Dealing with a dictionary whose case-sensitivity is unknown requires more work, including potentially having to examine all keys:

    # Note: [System.Collections.Specialized.OrderedDictionary] is the type
    #       underlying [ordered] hashtable literals in PowerShell.
    #       It - like [System.Collections.Hashtable] - is case-SENSITIVE by default; 
    #       it is PowerShell that makes them case-INsensitive behind the scenes.
    $dictionary = [System.Collections.Specialized.OrderedDictionary]::new()
    $dictionary['One'] = 1
    $dictionary['TWO'] = 2
    $dictionary['two'] = 2  # case variation
    $dictionary['Three'] = 2
    
    $lookupKey = 'two'
    
    # -> Stores 'two' in $actualKey, the case-exact match.
    $actualKey = 
      if ($dictionary.Contains($lookupKey)) {
        $candidateKeys = $dictionary.Keys.Where({ $_ -eq $lookupKey }, 0, 2)
        if ($candidateKeys.Count -ge 2) { $lookupKey }
        else                            { $candidateKeys }
      }
    
    • .Contains($lookupKey), weeds out lookup keys that do not match an existing key, whether case-insensitively or not.

    • .Where({ $_ -eq $lookupKey }, 0, 2) looks for at most 2 candidate (potential) key matches, case-insensitively; limiting the search to 2 candidates is a performance optimization: from the presence of two (or more) keys that are case variations of each other it can be inferred that the dictionary at hand is case-sensitive.

    • If there are (at least) two candidate keys ($candidateKeys.Count -ge 2), the implication is that the hashtable is case-sensitive and that the lookup key is therefore by definition already in the case-exact form (otherwise .Contains() wouldn't have returned $true), so $lookupKey is returned.

    • Otherwise, if only a single candidate key is returned, that key is by definition the correct one and is returned (note that there's no need to use [0] in this case in order to output the one and only element itself, because the use of an if statement causes any collection (enumerable) to be auto-enumerated, just like in the pipeline).

    Alternative, which optimizes for the case where the lookup key already is in the case-exact form:

    $actualKey =
      if ($dictionary.Contains($lookupKey)) {
          if (-1 -ne [Array]::IndexOf($dictionary.Keys, $lookupKey)) { $lookupKey }
          else { $dictionary.Keys.Where({ $_ -eq $lookupKey }, 'First') }
      }
    
    • Once the existence of a matching key has been confirmed via .Contains(), a case-sensitive (linear) lookup is performed first, using [Array]::IndexOf() (which performs current-culture, case-sensitive comparisons), which performs much better than a .Where() filter.

    • If a case-exact match is found, the lookup key is by definition already in the case-exact form.

    • Otherwise, the implication is that the dictionary is case-insensitive and that that is therefore sufficient to look for the first - and by definition only - key in the .Keys collection that case-insensitively matches the lookup key, using .Where() with -eq.


    An hypothetical efficient solution would require efficient (O(1)) lookups in the .Keys collection that return the matching keys themselves, but as you note, dictionaries aren't designed for that, as they translate keys into abstract hashes (numbers) for lookup of values (and these hashes have no reverse link to the keys they were derived from).

    If it were possible - it isn't[1] - to obtain the System.Collections.IEqualityComparer / System.Collections.Generic.IEqualityComparer`1 used by a given dictinonary instance and the specific instance used is of a type exposed via the static properties of the System.StringComparer class, such as [System.StringComparer]::OrdinalIgnoreCase, you could at least infer case-sensitivity and optimize based on that: if the dictionary is case-sensitive and a value lookup (.Contains()) succeeds, you're by definition dealing with the case-exact form of the key; otherwise, it is sufficient to (linearly) look for the first case-insensitively matching key in .Keys.
    If the given dictionary happens to be of type System.Collections.Generic.SortedDictionary`2, you could perform a binary search of the .Keys collection, taking advantage of the fact that the collection of keys is by definition sorted, along the lines of:
    [Array]::BinarySearch($sortedDict.Keys, $lookupKey, [StringComparer]::InvariantCultureIgnoreCase)

    With System.Collections.Generic.SortedDictionary`2, specifically, you could achieve a relatively efficient solution even without knowing the comparer, via two to three non-linear lookups (though in practice you may not notice a difference from the linear approach except in large dictionaries):

    • Test if the lookup key matches at all with .Contains() and stop if not.

    • If it does, first use the binary search case-sensitively: if it succeeds, the lookup key is by definition already in the case-exact form, so return it.

    • Otherwise (the dictionary must be case-insensitive), repeat the search case-insensitively, which will by definition find the one and only matching key that is a case variation of the lookup key, and return it.


    [1] Neither the specific System.Collections.Hashtable ([hashtable]) type nor the dictionary interfaces (System.Collections.IDictionary and System.Collections.Generic.IDictionary`2) offer (public) members for accessing the comparer (and trying to use reflection to access non-public members is not only generally ill-advised, but would have to be done for each concrete dictionary-like type separately).