Search code examples
excelsortingdata-manipulation

Sort data contained in blocks in excel


I have a large amount of reference data in excel, which I am trying to manipulate in a variety of ways. I'm having some problems with the way it is structured and sorting into a more manageable format.

Problem number 1: I have three columns. Column A contains first a date, and then a designator of high or low. Column B contains times, Column C contains heights.

I would like to sort the data by column B (easy enough) EXCEPT I would like the date headings in Column A preserved. It's almost as though I have 365 tables, each with between 3 and 5 pieces of data - I'm looking to sort the 3 - 5 pieces of data within each date only.

This is what I have currently:

excel data in 3 columns

There's no issue with me taking the data and manipulating it some other way first - this is ultimately around me being able to take a batch of data (5x different reference points, each for 365 days) and develop a process to sanitise it and get it displayed in time order, as well as being able to get it into a usable format for problem 2 (I need to adjust some other data points by the sorted data once I have it).

This is what I would like it to look like (I manually went through each of these blocks and sorted them):

enter image description here


Solution

  • It is possible to do it in Excel as follows in cell E2:

    =LET(rng, A1:C11, set, FILTER(rng, (INDEX(rng,,1) <>"")),
      dates, SCAN("", INDEX(set,,1), LAMBDA(acc, item, IF(ISNUMBER(item), item, acc))),
      in, FILTER(HSTACK(dates, set), INDEX(set,,2)<>""), inDates, INDEX(in,,1),
      out, REDUCE("", UNIQUE(inDates), LAMBDA(acc, date,
        LET(sorted, VSTACK(date, DROP(SORT(FILTER(in, inDates = date),3),,1), {"","",""}),
        VSTACK(acc, sorted)
      ))), IFERROR(DROP(DROP(out,1),-1),"")
    )
    

    Here is the output: sample excel file

    You can avoid the clean-up process except for removing the last row as follow:

    =LET(rng, A1:C11, set, FILTER(rng, (INDEX(rng,,1) <>"")),
      dates, SCAN("", INDEX(set,,1), LAMBDA(acc, item, IF(ISNUMBER(item), item, acc))),
      in, FILTER(HSTACK(dates, set), INDEX(set,,2)<>""), inDates, INDEX(in,,1),
      out, REDUCE("", UNIQUE(inDates), LAMBDA(acc, date,
        LET(sorted, VSTACK(HSTACK(date,"",""), DROP(SORT(FILTER(in, inDates = date),3),,1), 
          {"","",""}), IF(MAX(LEN(acc))=0, sorted, VSTACK(acc, sorted))
      ))), DROP(out, -1)
    )
    

    Explanation

    Basically is to carry out the manual steps but using excel functions. The name set, is the same as the input data (rng) but we removed the empty rows. The name dates, is a column with the same size as rng, repeating all the dates. The condition in the SCAN function to identify a new date is ISNUMBER because dates are stored in Excel as whole numbers. The name in has the data in the format we want for doing the sorting and filter by date removing the date header and adding as the first column the dates.

    Now we use DROP/REDUCE/VSTACK pattern (check the answer to the question: how to transform a table in Excel from vertical to horizontal but with different length provided by David Leal) to append each sorted data for a given unique date. We add the date as the first row, then sorted data, and finally an empty row to separate each group of data. Finally, we do a clean-up via IFERROR/DROP to remove the #N/A values and the first and the last empty row.