Parsing, aggregating & sorting a text file in Python


I have a file named "names.txt" with the following contents:

{"1":[1988, "Anil 4"], "2":[2000, "Chris 4"], "3":[1988, "Rahul 1"],
"4":[2001, "Kechit 3"], "5":[2000, "Phil 3"], "6":[2001, "Ravi 4"],
"7":[1988, "Ramu 3"], "8":[1988, "Raheem 5"], "9":[1988, "Kranti 2"],
"10":[2000, "Wayne 1"], "11":[2000, "Javier 2"], "12":[2000, "Juan 2"],
"13":[2001, "Gaston 2"], "14":[2001, "Diego 5"], "15":[2001, "Fernando 1"]}

Problem statement: the file "names.txt" contains student records in the format:

{"number": [year of birth, "name rank"]}

Parse this file, group the records by year of birth, and then sort the names within each year by rank (grouping first, then sorting). The output should be in the format:

{year : [Names of students in sorted order according to rank]}

So the expected output is:

{1988:["Rahul 1","Kranti 2","Ramu 3","Anil 4","Raheem 5"],
2000:["Wayne 1","Javier 2","Juan 2","Phil 3","Chris 4"],
2001:["Fernando 1","Gaston 2","Kechit 3","Ravi 4","Diego 5"]}

First, how do I load the file contents into a dictionary object? Then, how do I group the records by year and order the names by rank? How can this be done in Python?

Thanks.


Solution

  • It's very simple :)

    #!/usr/bin/python
    # Program: Parsing, Aggregating & Sorting text file in Python
    # Developed By: Pratik Patil
    # Date: 22-08-2015
    
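    import json
    import pprint
    
    # Open file & store the contents in a dictionary object.
    # The records span several lines, so parse the whole file at once;
    # json.load also avoids running eval on file contents.
    with open("names.txt", "r") as f:
        file_contents = json.load(f)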
    
    # Extract all lists from file contents
    file_contents_values = file_contents.values()
    
    # Extract unique years & apply segregation
    years = sorted(set(v[0] for v in file_contents_values))
    file_contents_values_grouped_by_year = [[v[1] for v in file_contents_values if v[0] == y] for y in years]
    
    # Create final dictionary by combining respective keys & values
    output = dict(zip(years, file_contents_values_grouped_by_year))
    
    # Apply sorting based on rank (the number after the name)
    for name_rank in output.values():
        name_rank.sort(key=lambda x: int(x.split()[1]))
    
    # Print output in ascending order of keys
    pprint.pprint(output)
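
As a possible variation on the steps above, the grouping can be done in a single pass with collections.defaultdict instead of rescanning the records once per year. This is only a sketch, not the original answer's method: the function name group_and_sort and the inline sample string are hypothetical, and it assumes the file contents are valid JSON (as the sample in the question is) and that every value ends in an integer rank.

```python
import json
from collections import defaultdict


def group_and_sort(text):
    """Group "name rank" strings by year, then sort each group by rank."""
    # Assumed input shape: {"number": [year, "name rank"], ...}
    records = json.loads(text)

    # One pass over the records: append each name to its year's list.
    grouped = defaultdict(list)
    for year, name_rank in records.values():
        grouped[year].append(name_rank)

    # Sort each list by the trailing rank. rsplit splits only on the last
    # run of whitespace, so a name containing spaces would stay intact.
    # list.sort is stable, so equal ranks keep their order from the file.
    for names in grouped.values():
        names.sort(key=lambda s: int(s.rsplit(None, 1)[1]))

    # Return a plain dict with years in ascending order.
    return dict(sorted(grouped.items()))


# Hypothetical inline sample in the same format as names.txt
sample = '{"1": [1988, "Anil 4"], "2": [2000, "Chris 4"], "3": [1988, "Rahul 1"], "4": [2000, "Wayne 1"]}'
result = group_and_sort(sample)
print(result)
```

To run it against the real file, pass the file's text instead of the sample, for example group_and_sort(open("names.txt").read()). Because the sort is stable, students with the same rank (such as "Javier 2" and "Juan 2") come out in the order they appear in the file.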