Search code examples
vbaexceldata-cleaning

Converting data to panel format using forloops vba-excel


My data set looks like:

AUS, UK, USA, GERMANY, FRANCE, MEXICO, DATE
R1, R1,   R1,    R1  ,   R1  , R1    ,  1
R2, R2,   R2,    R2  ,   R2  , R2    ,  2
...

And so on. I want to convert it so that it looks like

COUNTRY, RETURNS, DATE, 
AUS,     R1,       1
AUS,     R2,       2
...,    ...,     ...,
UK,     R1,        1,
UK,     R2,        2,
...     ...      ...,
MEXICO, R1,        1,
MEXICO, R2,        2,
...     ...      ...

I feel like this should be possible with a simple nested forloop.

I tried:

    sub panel()
'dim variables
Dim i As Integer
Dim j As Integer
Dim reps As Integer
Dim country As String
Dim strfind As String
Dim obs As Integer

'count the number of countries
reps = Range("D1:AL1").Columns.Count

'count the number of observations per country
obs = Range("C4:C5493").Rows.Count

'copy and paste country into panel format
For i = 1 To reps
    'set country name
    country =Range("D1").Cells(1, i)
    For j = 1 To obs
    'copy and paste country values
    Range("AS2").Cells(j, 1) = country
    Next j
Next i

but after the j loops finishes, and the new country name is set, the new values replace the old values in the first batch of cells.


Solution

  • Consider an SQL solution using UNION queries to select each column for long format. If using Excel for PC, Excel can connect to the Jet/ACE SQL Engine (Windows .dll files) via ADO and run SQL queries on worksheets of current workbook.

    With this approach, you avoid any for looping, nested if/then logic, and other data manipulation needs for desired output. Below example assumes data resides in tab called DATA and an empty tab called RESULTS.

    Sub RunSQL()    
        Dim conn As Object, rst As Object
        Dim strConnection As String, strSQL As String
        Dim i As Integer
    
        Set conn = CreateObject("ADODB.Connection")
        Set rst = CreateObject("ADODB.Recordset")
    
        ' CONNECTION STRINGS (TWO VERSIONS)
    '    strConnection = "DRIVER={Microsoft Excel Driver (*.xls, *.xlsx, *.xlsm, *.xlsb)};" _
    '                      & "DBQ=C:\Path\To\Workbook.xlsm;"
        strConnection = "Provider=Microsoft.ACE.OLEDB.12.0;" _
                           & "Data Source='C:\Path\To\Workbook.xlsm';" _
                           & "Extended Properties=""Excel 8.0;HDR=YES;"";"        
        strSQL = " SELECT 'AUS' AS COUNTRY, AUS AS RETURNS, [DATE] FROM [DATA$]" _
                & " UNION ALL SELECT 'UK', UK AS Country, [DATE] FROM [DATA$]" _
                & " UNION ALL SELECT 'USA', USA AS Country, [DATE] FROM [DATA$]" _
                & " UNION ALL SELECT 'GERMANY', GERMANY AS Country, [DATE] FROM [DATA$]" _
                & " UNION ALL SELECT 'FRANCE', FRANCE AS Country, [DATE] FROM [DATA$]" _
                & " UNION ALL SELECT 'MEXICO', MEXICO AS Country, [DATE] FROM [DATA$];"
    
        ' OPEN CONNECTION & RECORDSET
        conn.Open strConnection
        rst.Open strSQL, conn
    
        ' COLUMN HEADERS
        For i = 1 To rst.Fields.Count
            Worksheets("RESULTS").Cells(1, i) = rst.Fields(i - 1).Name
        Next i        
        ' DATA ROWS
        Worksheets("RESULTS").Range("A2").CopyFromRecordset rst
    
        rst.Close: conn.Close    
        Set rst = Nothing: Set conn = Nothing    
    End Sub
    

    Output

    COUNTRY     RETURNS     DATE
    AUS         R1          1
    AUS         R2          2
    UK          R1          1
    UK          R2          2
    USA         R1          1
    USA         R2          2
    GERMANY     R1          1
    GERMANY     R2          2
    FRANCE      R1          1
    FRANCE      R2          2
    MEXICO      R1          1
    MEXICO      R2          2