Search code examples
excelanalysis

How to analyze data in Excel


I have an Excel document that consists of huge data in need of analysis. The data is basically objects with corresponding error messages. Typical output is:

**REC NO07/121007163**
Valuation for 0001 IFRS16 Balance sheet valuation
Asset transactions already posted need to be reversed
The valuation could not be completed
**REC NO07/121007165**
Valuation for 0001 IFRS16 Balance sheet valuation
Asset transactions already posted need to be reversed
The valuation could not be completed
**REC NO07/121007220**
Valuation for 0001 IFRS16 Balance sheet valuation
Closing balance 5 070,00 NOK liability available
Difference 5 070,00- NOK between clearing and expense available
**REC NO07/121007221**
Valuation for 0001 IFRS16 Balance sheet valuation
Closing balance 5 070,00 NOK liability available
Difference 5 070,00- NOK between clearing and expense available

What you see in bold above, is the object. This is not in bold in Excel, but I have made it bold here to explain. Everything in-between is the error message for that object.

The length (number of lines) of the error message could vary between objects.

What I would like to do, is basically convert the above to this:

REC NO07/121007163  Valuation for 0001 IFRS16 Balance sheet valuation. Asset transactions already posted need to be reversed. The valuation could not be completed
REC NO07/121007165  Valuation for 0001 IFRS16 Balance sheet valuation. Asset transactions already posted need to be reversed. The valuation could not be completed
REC NO07/121007220  Valuation for 0001 IFRS16 Balance sheet valuation. Closing balance 5 070,00 NOK liability available. Difference 5 070,00- NOK between clearing and expense available
REC NO07/121007221  Valuation for 0001 IFRS16 Balance sheet valuation. Closing balance 5 070,00 NOK liability available. Difference 5 070,00- NOK between clearing and expense available

I am adding a tab between the object and the error message.

I am combining all lines of the error message with ". "

Is this possible in Excel and if yes, is there anyone that could help me with that?

Thank you

Best regards Antonis

I have tried to do this with formulas in Excel but as the number of lines for each error message varies, I was not able to solve it.


Solution

  • Assuming all the error codes start with REC and no excel version constraints per tags listed in the question, then you can use the following formula in cell B1:

    =LET(A, A1:A16, m, ROWS(A), seq, SEQUENCE(m), idx, FILTER(seq, (LEFT(A,3)="REC")),
     start, idx+1, end, VSTACK(DROP(idx-1,1), m), MAP(start, end, 
      LAMBDA(s,e, INDEX(A,s-1)&" "&TEXTJOIN(". ",, FILTER(A, (seq>=s) * (seq<=e))))))
    

    Here is the output: excel output

    Basically, it finds first the index position of the error codes (idx) and based on that finds the start and end rows of each error message. Then we use MAP to concatenate the result via TEXTJOIN selecting on each iteration the range via FILTER and prefixing the error code (INDEX(A,s-1)).