Search code examples
excelexcel-formulaworksheet-functionvba

How to distribute data of "type:value" format by categories in Excel?


I am on my way to do linear regression analysis of data parsed from a site. I was able to get data in the following inconvenient format only:

Data in 1-st column contains data types separated by ':' symbol. For example:

Year:Storey:Area:Condition:Type:Name

Data in 2-nd column contains information corresponding to data types in 1st column also separated by ':' symbol:

2015:3:170:Renovated:TypeB:John

In my Excel table I have 13000 rows with such data in above mentioned format. I want to categorize this data by categories specified for the data.

To clarify, I provide picture of original data fromat: enter image description here

I try to convert this data to the following format: enter image description here


I tried to split this data into columns by using delimiters, but the problem is that number of data types varies. 1 row may have 5 categories. Another row may have only 2 categories (data types).

Is it possible to categorize that kind of data in Excel?


Solution

  • This should do what you want:

    Sub Andrey()
    Dim catArr() As String
    Dim resArr() As String
    Dim lastrow As Long
    Dim ows As Worksheet
    Dim tws As Worksheet
    Dim i As Long
    Dim j As Long
    Dim startrow As Long
    
    Set ows = Sheets("Sheet9") ' change to where your data is
    Set tws = Sheets("Sheet10") ' change to where you want your data
    
    startrow = 3 ' Change to the first row with data
    
    With ows
        lastrow = .Range("A" & .Rows.Count).End(xlUp).Row
        For i = startrow To lastrow
            tws.Cells(i, 1) = .Cells(i, 1)
            tws.Range(tws.Cells(i, "B"), tws.Cells(i, "G")).Value = "NULL"
            tws.Range(tws.Cells(i, "B"), tws.Cells(i, "G")).Font.Italic = True
            tws.Range(tws.Cells(i, "B"), tws.Cells(i, "G")).Font.Bold = False
            tws.Range(tws.Cells(i, "A"), tws.Cells(i, "G")).HorizontalAlignment = xlCenter
            catArr = Split(.Cells(i, 2), ":")
            resArr = Split(.Cells(i, 3), ":")
            For j = LBound(catArr) To UBound(catArr)
                tws.Cells(i, WorksheetFunction.Match(catArr(j), tws.Range("A1:G1"), 0)) = resArr(j)
                tws.Cells(i, WorksheetFunction.Match(catArr(j), tws.Range("A1:G1"), 0)).Font.Bold = True
                tws.Cells(i, WorksheetFunction.Match(catArr(j), tws.Range("A1:G1"), 0)).Font.Italic = False
            Next j
        Next i
    End With
    
    End Sub
    

    A couple of caveats.

    1) You need to put the data types as titles in the first row of the sheet into which the split data will be placed.

    2) I used the rows and columns from your example. If different then you will need to adjust the cells references.

    Edit: I changed the above to add the "NULL" and format the cells with bold and italics to match better what you want.