python, regex, python-re

How do I use regex to sort this string?


I'm trying to sort the lines of this string by the date at the beginning of each line, but I'm not sure how to split, sort, and join it using this regex. And yes, I'm using re.MULTILINE.

Regex that matches the date at the beginning of a line:

^ [0-9]{4}

Example of the string I need to be sorted:

 string = ''' 
 2013 this is data 3 (more data from 3)
 2016 this is data 6 (more data from 6)
 2011 this is data 1 (more data from 1)
 2012 this is data 2 (more data from 2)
 2014 this is data 4 (more data from 4)'''

What I want it to look like:

 string = ''' 
 2016 this is data 6 (more data from 6)
 2014 this is data 4 (more data from 4)
 2013 this is data 3 (more data from 3)
 2012 this is data 2 (more data from 2)
 2011 this is data 1 (more data from 1)'''

Solution

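  • If I understand you correctly, your input data is one long string. You can call str.splitlines() on it to get a list of lines, then sort that list with sorted(), using the first five characters of each line (the leading space plus the four-digit year) as the key.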

    multiline_str = """ 2013 this is data 3 (blah blah blah)
     2016 this is data 6 (blah blah blah)
     2011 this is data 1 (blah blah blah)
     2012 this is data 2 (blah blah blah)
     2014 this is data 4 (blah blah blah)
    """
    sorted(multiline_str.splitlines(), key=lambda x: x[:5], reverse=True)
    

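    This produces the following list. The trailing ' ' entry is a whitespace-only line left over from the input; to get a single string back, pass the list to "\n".join().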

    [' 2016 this is data 6 (blah blah blah)', 
     ' 2014 this is data 4 (blah blah blah)',
     ' 2013 this is data 3 (blah blah blah)', 
     ' 2012 this is data 2 (blah blah blah)', 
     ' 2011 this is data 1 (blah blah blah)', 
     ' ']
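
Since the question asks specifically about regex and about joining the result back into one string, here is a minimal sketch of that variant. It assumes every line of interest starts with optional whitespace followed by a four-digit year. The names YEAR_RE, sort_lines_by_year, and newest_first are made up for this illustration and are not from the original answer. Because the pattern is applied to one line at a time, re.MULTILINE is not needed here.

```python
import re

# Sample input copied from the question.
data = '''
 2013 this is data 3 (more data from 3)
 2016 this is data 6 (more data from 6)
 2011 this is data 1 (more data from 1)
 2012 this is data 2 (more data from 2)
 2014 this is data 4 (more data from 4)'''

# Hypothetical helper pattern: optional leading whitespace, then a four-digit year.
YEAR_RE = re.compile(r"^\s*([0-9]{4})")


def sort_lines_by_year(text, newest_first=True):
    """Return text with its dated lines sorted by the leading year."""
    # Assumption: lines that do not start with a year (for example blank
    # lines) are not wanted in the result, so they are dropped here.
    dated = [line for line in text.splitlines() if YEAR_RE.match(line)]
    # Sort on the captured year as an integer rather than on a string slice.
    dated.sort(key=lambda line: int(YEAR_RE.match(line).group(1)),
               reverse=newest_first)
    # Join the sorted lines back into a single multi-line string.
    return "\n".join(dated)


sorted_data = sort_lines_by_year(data)
print(sorted_data)
```

Compared with slicing x[:5], keying on the captured group does not depend on exactly how much whitespace precedes the year, and filtering on the match avoids the stray whitespace-only entry in the result.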