Search code examples
linuxbashshellsortingtime

Sort date time field based on AM/PM via shell script


I am trying to sort these two lines reversely based on date and time, i am using the below command, and it is working fine but not considering AM and PM, so the below two line not sorted correctly as the time of first one is 10 AM while the second one is 2 PM

command:

sort -t',' -k2,2 -k7,7r file.csv

lines:

before sorting:

field1,field2,field3,field4,field5,field6,19-NOV-23 10.58.03.000000 AM
field1,field2,field3,field4,field5,field6,19-NOV-23 02.48.47.000000 PM

lines after sorting (same arrangement)

field1,field2,field3,field4,field5,field6,19-NOV-23 10.58.03.000000 AM
field1,field2,field3,field4,field5,field6,19-NOV-23 02.48.47.000000 PM

Solution

  • man sort (GNU sort, that is):

    -k, --key=KEYDEF
           sort via a key; KEYDEF gives location and type
    
    KEYDEF  is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where
    F is a field number and C a character position in the field;  both  are
    origin 1, and the stop position defaults to the line's end. - -  
    OPTS is one or more single-letter ordering options [bdfgiMhnRrV] - -
    

    Field 7 in your data:

             1111111111222222222
    1234567890123456789012345678
    19-NOV-23 10.58.03.000000 AM
    19-NOV-23 02.48.47.000000 PM
    

    First define reverse (or in the OPTS), field separator and the year (field 7, positions 8 and 9, numeric):

    sort -r -t, -k7.8,7.9n file
    

    Then the month (M ordering option in GNU sort):

    sort -r -t, -k7.8,7.9n -k7.4,7.6M file
    

    Then the day and AM/PM (actually A/P):

    sort -r -t, -k7.8,7.9n -k7.4,7.6M -k7.1n,7.2n -k7.27 file
    

    And then the time part (left for exercise).

    With above you get:

    field1,field2,field3,field4,field5,field6,19-NOV-23 02.48.47.000000 PM
    field1,field2,field3,field4,field5,field6,19-NOV-23 10.58.03.000000 AM