attrgetter: Altering Default Order When Sorting by Object Attribute


I am using the attrgetter function from Python 3's operator module in order to sort a list of objects (hits). Each object has 12 attributes, and my sorting function can be fed any of them in order to sort the list in whatever way is needed. The attributes I am interested in sorting on contain strings. Here is the relevant snippet from my code.

from operator import attrgetter
...
def sort_hits_by_attribute(hits, attribute, backwards=False):
    """Take a list of hits and sort them by some attribute."""
    return sorted(hits, key=attrgetter(attribute), reverse=backwards)

Here is an example of a "hit" object, with its attributes.

  name: ...
  entity_1: coffee cultivation
  entity_2: ...
  full_statement: ...
  category: ...
  rule: ...
  syn_configs: ...
  lex_conditions: ...
  sentence_number: ...
  close_call: False
  message: ...
  id: 119

If I sort my list of objects by the attribute entity_1, then the above object is sorted after an instance whose entity_1 field begins with an upper case letter: e.g., "Coffee" or even "Zoo."
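
This follows from Python comparing strings by code point, where every ASCII upper case letter precedes every lower case one. A minimal sketch of the behaviour, assuming a hypothetical stand-in `Hit` class that models only the `entity_1` attribute (the real hit objects have twelve):

```python
from operator import attrgetter


class Hit:
    """Hypothetical stand-in for a hit object; only entity_1 is modelled."""

    def __init__(self, entity_1):
        self.entity_1 = entity_1


hits = [Hit("coffee cultivation"), Hit("Zoo"), Hit("Coffee")]

# Default string comparison is by code point, so "C" and "Z" (65-90)
# sort before any lower case letter (97-122).
default_order = [h.entity_1 for h in sorted(hits, key=attrgetter("entity_1"))]
print(default_order)  # ['Coffee', 'Zoo', 'coffee cultivation']
```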

I would like to use something like casefold(), so that upper case letters are sorted adjacent to and after their lower case counterparts. However, casefold() is a string method, and attrgetter(attribute) returns a callable rather than a string, so using key = attrgetter(attribute).casefold() raises an AttributeError.
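
The failure can be reproduced in isolation. In this small sketch, `SimpleNamespace` is only a hypothetical stand-in for a hit object; it shows that the getter itself has no `casefold`, while the value it returns for an object does:

```python
from operator import attrgetter
from types import SimpleNamespace

getter = attrgetter("entity_1")  # a callable getter, not a string

try:
    getter.casefold()
    error_name = None
except AttributeError as exc:
    error_name = type(exc).__name__

# Calling the getter on an object yields the string, which does have casefold().
hit = SimpleNamespace(entity_1="Coffee")  # hypothetical stand-in for a hit
folded_value = getter(hit).casefold()

print(error_name)    # AttributeError
print(folded_value)  # coffee
```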

How can I preserve the functionality of sort_hits_by_attribute() – i.e., sorting by an attribute passed in during the function call – but force Python to use a different ordering {aAbBcCdDeE...} when doing so?


Solution

  • I found the answer here, thanks to @KylePDavis, who provided a generalized solution in which the attribute can be passed in as an argument. The trick is to define the sort key with a lambda function that fetches the attribute and then calls casefold() on it.

    My code now looks as follows. Note the checks that (1) the list is not empty and (2) the attribute of interest, as found on the first hit, is a str, the only type on which casefold() can be called.

    def sort_hits_by_attribute(hits, attribute, backwards=False):
        """Take a list of hits and sort them by some attribute.

        For instance, group duplicate relation hits together by sorting
        on full_statement.
        """
        if not hits:
            # Nothing to sort; return an empty list rather than None.
            return []

        get_value = attrgetter(attribute)
        # casefold() exists only on str, so inspect the first hit's value.
        if isinstance(get_value(hits[0]), str):
            return sorted(hits,
                          key=lambda hit: get_value(hit).casefold(),
                          reverse=backwards)
        return sorted(hits, key=get_value, reverse=backwards)
    

    I have not marked this question as a duplicate because the cited question's favorited answer is not the one that solves this particular case.
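
For reference, a self-contained sketch of the case-insensitive key in use. The `Hit` class and sample values are hypothetical stand-ins. The tuple key in the second call is an added assumption, not part of the cited answer: it breaks case-insensitive ties with `str.swapcase()` so that a lower case spelling sorts before its upper case counterpart, which is closer to the {aAbBcC...} ordering asked for.

```python
from operator import attrgetter


class Hit:
    """Hypothetical stand-in for a hit object; only entity_1 is modelled."""

    def __init__(self, entity_1):
        self.entity_1 = entity_1


hits = [Hit("Zoo"), Hit("Coffee"), Hit("coffee"), Hit("apple")]
get_entity = attrgetter("entity_1")

# Case-insensitive sort: ties ("Coffee" vs "coffee") keep their input order
# because sorted() is stable.
folded = sorted(hits, key=lambda h: get_entity(h).casefold())
folded_names = [h.entity_1 for h in folded]

# Assumed tie-breaker: after swapcase(), the lower case spelling compares
# smaller, so "coffee" lands before "Coffee".
strict = sorted(
    hits,
    key=lambda h: (get_entity(h).casefold(), get_entity(h).swapcase()),
)
strict_names = [h.entity_1 for h in strict]

print(folded_names)  # ['apple', 'Coffee', 'coffee', 'Zoo']
print(strict_names)  # ['apple', 'coffee', 'Coffee', 'Zoo']
```

With casefold() alone, strings that differ only in case compare as equal, so their relative order depends on the input order.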