Search code examples
excelregressiondata-analysisvba

Defining the range of a regression in which the X and Y will change


I need to create a simple regression with the Data Analysis Toolpack. The thing is, the range for Y and X input is always different. To illustrate what I'm trying to say, here's an example of the table I need to work on:

        A   B   C   D   E   F   G   H   I   J   K   L
    1   Y   T   T1  T2  T3  T4  T5  T6  T7  T8  T9  T10
    2   19  1                                       
    3   13  2   19                                  
    4   14  3   13  19                              
    5   16  4   14  13  19                          
    6   17  5   16  14  13  19                      
    7   16  6   17  16  14  13  19                  
    8   20  7   16  17  16  14  13  19              
    9   10  8   20  16  17  16  14  13  19          
    10  20  9   10  20  16  17  16  14  13  19      
    11  11  10  20  10  20  16  17  16  14  13  19  
    12  11  11  11  20  10  20  16  17  16  14  13  19
    13  14  12  11  11  20  10  20  16  17  16  14  13
    14  15  13  14  11  11  20  10  20  16  17  16  14
    15  17  14  15  14  11  11  20  10  20  16  17  16
    16  10  15  17  15  14  11  11  20  10  20  16  17
    17  4   16  10  17  15  14  11  11  20  10  20  16
    18  15  17  4   10  17  15  14  11  11  20  10  20
    19  6   18  15  4   10  17  15  14  11  11  20  10
    20  10  19  6   15  4   10  17  15  14  11  11  20
    21  16  20  10  6   15  4   10  17  15  14  11  11
    22          16  10  6   15  4   10  17  15  14  11
    23              16  10  6   15  4   10  17  15  14
    24                  16  10  6   15  4   10  17  15
    25                      16  10  6   15  4   10  17
    26                          16  10  6   15  4   10
    27                              16  10  6   15  4
    28                                  16  10  6   15
    29                                      16  10  6
    30                                          16  10
    31                                              16

In this example, The Y input would be range A12:A21, that's because the first entry in the last column of the table (the "19" in cell L12) is in row 12 AND The last entry in the first column of the table (the "16" in cell A21) is in row 21; furthermore, The X input would be region B12:L21 for the same reasons.

After doing the first regression, I need to delete two columns out of the table and afterwards do ANOTHER regression. So if, for example I need to delete Columns J and L, the table would look like this:

        A   B   C   D   E   F   G   H   I   J
    1   Y   T   T1  T2  T3  T4  T5  T6  T7  T9  
    2   19  1                                       
    3   13  2   19                                  
    4   14  3   13  19                              
    5   16  4   14  13  19                          
    6   17  5   16  14  13  19                      
    7   16  6   17  16  14  13  19                  
    8   20  7   16  17  16  14  13  19              
    9   10  8   20  16  17  16  14  13  19          
    10  20  9   10  20  16  17  16  14  13      
    11  11  10  20  10  20  16  17  16  14  19  
    12  11  11  11  20  10  20  16  17  16  13  
    13  14  12  11  11  20  10  20  16  17  14  
    14  15  13  14  11  11  20  10  20  16  16  
    15  17  14  15  14  11  11  20  10  20  17  
    16  10  15  17  15  14  11  11  20  10  16  
    17  4   16  10  17  15  14  11  11  20  20  
    18  15  17  4   10  17  15  14  11  11  10  
    19  6   18  15  4   10  17  15  14  11  20  
    20  10  19  6   15  4   10  17  15  14  11  
    21  16  20  10  6   15  4   10  17  15  11  
    22          16  10  6   15  4   10  17  14  
    23              16  10  6   15  4   10  15  
    24                  16  10  6   15  4   17  
    25                      16  10  6   15  10  
    26                          16  10  6   4   
    27                              16  10  15  
    28                                  16  6   
    29                                      10  
    30                                      16  

And now the regression would be with inputs Y (A11:A21) because the first entry in the last column of the table ("19" in cell J11) is in row 11 AND The last entry in the first column of the table ("16" in cell A21) is in row 21. Likewise the X input would be (B11:J21) for the same reasons.

I have tried in a hundred different ways, but no luck. This is the closest I've been to creating what I need, but I'm still lost since it won't execute the regression:

Sub Prueba1()
    Range("A1").Select
    Selection.End(xlToRight).Select
    Selection.End(xlDown).Select
    Selection.End(xlToLeft).Select
    Application.Run "ATPVBAEN.XLAM!Regress", Range(Selection, Selection.End(xlDown)).Select, _
    Range(Selection.Offset(, 1), Selection.End(xlToRight)).Select, False, False, , Range("S1") _
    , False, False, False, False, , False
End Sub

Solution

  • This User Defined Function (aka UDF) will return the range into your Application.Run "ATPVBAEN.XLAM!Regress" as a parameter.

    Function regress_range()
        Dim strAddr As String, c As Long
    
        With Worksheets("Sheet4")   '<~~set this worksheet name!
            With .Cells(1, 1).CurrentRegion
                Set regress_range = .Range(.Cells(.Cells(1, .Columns.Count).End(xlDown).Row, 1), _
                             .Cells(Application.Match(1E+99, .Columns(1)), .Columns.Count))
            End With
        End With
    
    End Function
    

    You need to make sure that it is properly referencing the correct worksheet in the third line.

    This would become part of the run command like,

    Application.Run "ATPVBAEN.XLAM!Regress", regress_range(), False, False, , Range("S1") _
    , False, False, False, False, , False
    

    I'm still concerned how Range("S1") may change (i.e. shift right) if columns are deleted from the regression range. Additionally, it has no explicitly referenced parent worksheet.

    Output starting at your original data block:

    $A$12:$L$21
    $A$11:$J$21