Search code examples
sqlexcelvbapowerquery

Transforming large vertical set of data into multiple tables in excel


I have a very large set of data stored vertically which I need to separate into multiple tables.

Currently the data is stored as such when exported (repeated for about 3k rows down)

Raw data:

enter image description here

The date is stored is A1 and beneath is each group of data. Each group is separated by a header row with a name in the first column and no column labels for column 2 and column 3.

The Column 1 header for each "Table" in the data range is the name of the "table" which it shows data for, and the cells in column 1 are identical for each grouping. So for "Name1" the cells beneath will be same as the cells beneath "Name2", and the cells beneath Column2 and Column3's headers will be the data that changes. Which is what has become my biggest headache to solve. The problem is this is the only way which the system exports that data.

I'm attempting to select the entire range of data, so three columns wide and a good few thousand rows down and apply either a macro or SQL automation to convert it into multiple tables.

The format for the tables I'm attempting to generate is keeping the name in the already existing header rows in column 1, and setting the second and third columns of each header row to "Header2" and "Header3". Additionally I'd like to take the date from cell A1 and concatenate it with the Name stored in the first header, EG "Name-19/02/2024" and set it as the table name. EG: Transformed data:

https://i.sstatic.net/ipCBA.png

I've already attempted to perform this using macros however the nature of the problem extends my knowledge in VBA due to the complexity of the transformation, I have the assumption that it may be easier solved in SQL or PowerQuery? So I'm open to using those if its a more elegant solution.


Solution

    • Assuming that cell A contains the only date on the sheet.
    • Note: EG "Name-19/02/2024" and set it as the table name. Table names must be unique; for instance, in your screenshot, two tables have the same headers (with the first cell being "Name"). It throws error 1004.
    Option Explicit
    
    Sub Demo()
        Dim i As Long, iR As Long, oRng
        Dim arrData, rngData As Range, sDate As String
        Dim oCol As Collection, oSht As Worksheet
        Const COL2_NAME = "Header2"
        Const COL3_NAME = "Header3"
        Set oSht = ActiveSheet ' modify as needed
        With oSht.Range("A1").CurrentRegion
            Set rngData = .Resize(.Rows.Count + 1)
        End With
        ' Load data into array
        arrData = rngData.Value
        sDate = arrData(1, 1)
        Set oCol = New Collection
        ' Loop through data
        For i = LBound(arrData) + 1 To UBound(arrData)
            ' Locate the header row of each table
            If Len(arrData(i, 2) & arrData(i, 3)) = 0 Then
                If Len(arrData(i, 1)) > 0 Then
                    ' Populate header
                    arrData(i, 1) = arrData(i, 1) & "-" & sDate
                    arrData(i, 2) = COL2_NAME
                    arrData(i, 3) = COL3_NAME
                    If iR > 0 Then
                        oCol.Add oSht.Range("A" & iR).Resize(i - iR, 3), CStr(i)
                    End If
                    iR = i
                ElseIf i = UBound(arrData) Then
                    oCol.Add oSht.Range("A" & iR).Resize(i - iR, 3), CStr(i)
                End If
            End If
        Next i
        ' Write output to sheet
        rngData.Value = arrData
        ' Create table (ListObject)
        For Each oRng In oCol
            ' table with default name
            oSht.ListObjects.Add xlSrcRange, oRng, , xlYes
            ' table with customized name
            ' oSht.ListObjects.Add(xlSrcRange, oRng, , xlYes).Name = oRng(1)
        Next
    End Sub
    
    

    enter image description here