Splitting the summary table of a matrix into several smaller tables in R


I have a numeric matrix with 1000 rows and 17 columns. When I call summary(matrix) on it, I get this:

Picture 1) Summary Matrix

My question is: is there any way to split this summary table into several smaller tables, like this?

          V1   V2   V3   V4   V5   V6
Min
1st Qu
Median
Mean
3rd Qu
Max

          V7   V8   V9   V10  V11  V12
Min
1st Qu
Median
Mean
3rd Qu
Max

          V13  V14  V15  V16  V17
Min
1st Qu
Median
Mean
3rd Qu
Max

I need this because space in my R Shiny app is limited, and I want the summary displayed without its columns colliding with each other, as they do here:

Picture 2) Summary Colliding

Note: sorry that I can only show this as pictures.


Solution

    1) read.dcf/unnest The cells of the summary table are in DCF form ("field: value"), so we can parse them with read.dcf and then unnest the result:

    library(tidyr)
    
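    s <- summary(mtcars)  # character table of formatted "name: value" cells
    # read all cells as one DCF record; repeated fields become list columns
    DF <- read.dcf(textConnection(s), all = TRUE)
    # expand the list columns, then transpose so that variables are columns
    res <- setNames(data.frame(t(unnest(DF)), check.names = FALSE), trimws(colnames(s)))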
    

    giving:

    > res
              mpg   cyl  disp    hp  drat    wt  qsec     vs     am  gear  carb
    Min.    10.40 4.000  71.1  52.0 2.760 1.513 14.50 0.0000 0.0000 3.000 1.000
    1st Qu. 15.43 4.000 120.8  96.5 3.080 2.581 16.89 0.0000 0.0000 3.000 2.000
    Median  19.20 6.000 196.3 123.0 3.695 3.325 17.71 0.0000 0.0000 4.000 2.000
    Mean    20.09 6.188 230.7 146.7 3.597 3.217 17.85 0.4375 0.4062 3.688 2.812
    3rd Qu. 22.80 8.000 326.0 180.0 3.920 3.610 18.90 1.0000 1.0000 4.000 4.000
    Max.    33.90 8.000 472.0 335.0 4.930 5.424 22.90 1.0000 1.0000 5.000 8.000
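
If the formatted strings themselves are not needed, a possible shortcut is to compute the same statistics numerically in base R and skip the parsing step. This is only a minimal sketch, assuming every column is numeric and has no missing values; res_num is an illustrative name.

```r
# Numeric summary: one column per variable, one row per statistic.
# sapply() simplifies the per-column summaries into a 6 x 11 matrix.
res_num <- sapply(mtcars, summary)

# Round only for display; res_num itself keeps the computed values.
print(round(res_num, 2))

# Values can be used directly, with no string parsing.
res_num["Mean", "mpg"]
```

Because res_num is a plain numeric matrix, it can be indexed by column in the same way as res.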
    

    2) subset columns To reduce the width, this could be split into res[1:6] and res[7:11] or, more generally, if there are n columns and we want k columns per group (except possibly for the last group):

    n <- ncol(res)
    k <- 6
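    g <- droplevels(gl(n, k, n)) # grouping vector: 1,1,1,1,1,1,2,2,2,2,2
    # subset the columns of res by group so that each piece keeps its row names
    lapply(split(seq_len(n), g), function(i) res[i])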
    

    giving:

    $`1`
              mpg   cyl  disp    hp  drat    wt
    Min.    10.40 4.000  71.1  52.0 2.760 1.513
    1st Qu. 15.43 4.000 120.8  96.5 3.080 2.581
    Median  19.20 6.000 196.3 123.0 3.695 3.325
    Mean    20.09 6.188 230.7 146.7 3.597 3.217
    3rd Qu. 22.80 8.000 326.0 180.0 3.920 3.610
    Max.    33.90 8.000 472.0 335.0 4.930 5.424
    
    $`2`
             qsec     vs     am  gear  carb
    Min.    14.50 0.0000 0.0000 3.000 1.000
    1st Qu. 16.89 0.0000 0.0000 3.000 2.000
    Median  17.71 0.0000 0.0000 4.000 2.000
    Mean    17.85 0.4375 0.4062 3.688 2.812
    3rd Qu. 18.90 1.0000 1.0000 4.000 4.000
    Max.    22.90 1.0000 1.0000 5.000 8.000
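
The same grouping idea can be applied directly to a numeric matrix such as the 17-column one in the question. The following is a minimal sketch rather than a definitive implementation: the random matrix m stands in for the real data, and split_cols is a hypothetical helper name, not a function from any package.

```r
# Hypothetical stand-in for the question's data: 1000 rows, 17 numeric columns.
set.seed(1)
m <- matrix(rnorm(1000 * 17), ncol = 17,
            dimnames = list(NULL, paste0("V", 1:17)))

# 6 x 17 numeric matrix of summary statistics. This assumes there are no NAs,
# so that every column's summary has the same six entries.
stats <- apply(m, 2, summary)

# split_cols() is a hypothetical helper: it returns a list of sub-matrices
# with at most k columns each, keeping row and column names.
split_cols <- function(x, k) {
  idx <- seq_len(ncol(x))
  lapply(split(idx, ceiling(idx / k)), function(j) x[, j, drop = FALSE])
}

chunks <- split_cols(stats, k = 6)
print(chunks)
```

Here chunks has three elements with 6, 6 and 5 columns, so each one can be printed or rendered on its own.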
    

    3) no transpose Another alternative for reduced width is to just not transpose it:

    data.frame(unnest(DF), row.names = trimws(colnames(s)), check.names = FALSE)
    

    giving:

         Min.    1st Qu. Median  Mean    3rd Qu. Max.   
    mpg    10.40   15.43   19.20   20.09   22.80   33.90
    cyl    4.000   4.000   6.000   6.188   8.000   8.000
    disp    71.1   120.8   196.3   230.7   326.0   472.0
    hp      52.0    96.5   123.0   146.7   180.0   335.0
    drat   2.760   3.080   3.695   3.597   3.920   4.930
    wt     1.513   2.581   3.325   3.217   3.610   5.424
    qsec   14.50   16.89   17.71   17.85   18.90   22.90
    vs    0.0000  0.0000  0.0000  0.4375  1.0000  1.0000
    am    0.0000  0.0000  0.0000  0.4062  1.0000  1.0000
    gear   3.000   3.000   4.000   3.688   4.000   5.000
    carb   1.000   2.000   2.000   2.812   4.000   8.000
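
When the summary ends up in a fixed-width text area, another possible route to reduced width is to let print do the wrapping, since it breaks a wide matrix into column blocks according to the width option. This is a hedged sketch: the width of 50 is an arbitrary example value, and wide and out are illustrative names.

```r
# 6 x 11 numeric matrix of summary statistics, rounded for display
wide <- round(sapply(mtcars, summary), 2)

old <- options(width = 50)          # 50 is an arbitrary example width
out <- capture.output(print(wide))  # print() wraps the columns into blocks
options(old)                        # restore the previous width

cat(out, sep = "\n")
```

The captured lines in out can then be shown as plain text, with each block of columns stacked below the previous one.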
    

    4) psych::describe A simple alternative is to use psych::describe:

    library(psych)
    
    describe(mtcars)
    

    giving:

         vars  n   mean     sd median trimmed    mad   min    max  range  skew kurtosis    se
    mpg     1 32  20.09   6.03  19.20   19.70   5.41 10.40  33.90  23.50  0.61    -0.37  1.07
    cyl     2 32   6.19   1.79   6.00    6.23   2.97  4.00   8.00   4.00 -0.17    -1.76  0.32
    disp    3 32 230.72 123.94 196.30  222.52 140.48 71.10 472.00 400.90  0.38    -1.21 21.91
    hp      4 32 146.69  68.56 123.00  141.19  77.10 52.00 335.00 283.00  0.73    -0.14 12.12
    drat    5 32   3.60   0.53   3.70    3.58   0.70  2.76   4.93   2.17  0.27    -0.71  0.09
    wt      6 32   3.22   0.98   3.33    3.15   0.77  1.51   5.42   3.91  0.42    -0.02  0.17
    qsec    7 32  17.85   1.79  17.71   17.83   1.42 14.50  22.90   8.40  0.37     0.34  0.32
    vs      8 32   0.44   0.50   0.00    0.42   0.00  0.00   1.00   1.00  0.24    -2.00  0.09
    am      9 32   0.41   0.50   0.00    0.38   0.00  0.00   1.00   1.00  0.36    -1.92  0.09
    gear   10 32   3.69   0.74   4.00    3.62   1.48  3.00   5.00   2.00  0.53    -1.07  0.13
    carb   11 32   2.81   1.62   2.00    2.65   1.48  1.00   8.00   7.00  1.05     1.26  0.29
    

    5) Hmisc::describe Hmisc also has a describe function:

    library(Hmisc)
    describe(mtcars)
    

    giving:

    mtcars 
    
     11  Variables      32  Observations
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    mpg 
           n  missing distinct     Info     Mean      Gmd      .05      .10      .25      .50      .75      .90      .95 
          32        0       25    0.999    20.09    6.796    12.00    14.34    15.43    19.20    22.80    30.09    31.30 
    
    lowest : 10.4 13.3 14.3 14.7 15.0, highest: 26.0 27.3 30.4 32.4 33.9
    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    cyl 
           n  missing distinct     Info     Mean      Gmd 
          32        0        3    0.866    6.188    1.948 
    
    Value          4     6     8
    Frequency     11     7    14
    Proportion 0.344 0.219 0.438
    
    ...etc...
    

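    6) skimr::skim This was a new package at the time of writing. It can produce spark graphics (small inline histograms) as part of the summary output; however, that depends on font support, which can be tricky, so that part is disabled below. The skim_with call shown uses the skimr 1.x interface; in skimr 2.x, skim_with instead returns a customized skim function. Note that skim requires a data frame as input, so if your input is a matrix use skim(as.data.frame(input)).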

    library(skimr)
    skim_with(numeric = list(hist = NULL)) # omit spark histogram
    skim(mtcars) 
    

    giving:

    Skim summary statistics
     n obs: 32 
     n variables: 11 
    
    Variable type: numeric 
       variable missing complete  n   mean     sd   min    p25 median    p75    max
    1        am       0       32 32   0.41   0.5   0      0      0      1      1   
    2      carb       0       32 32   2.81   1.62  1      2      2      4      8   
    3       cyl       0       32 32   6.19   1.79  4      4      6      8      8   
    4      disp       0       32 32 230.72 123.94 71.1  120.83 196.3  326    472   
    5      drat       0       32 32   3.6    0.53  2.76   3.08   3.7    3.92   4.93
    6      gear       0       32 32   3.69   0.74  3      3      4      4      5   
    7        hp       0       32 32 146.69  68.56 52     96.5  123    180    335   
    8       mpg       0       32 32  20.09   6.03 10.4   15.43  19.2   22.8   33.9 
    9      qsec       0       32 32  17.85   1.79 14.5   16.89  17.71  18.9   22.9 
    10       vs       0       32 32   0.44   0.5   0      0      0      1      1   
    11       wt       0       32 32   3.22   0.98  1.51   2.58   3.33   3.61   5.42
    

    If you want to try the spark graphics see: Skimr - cant seem to produce the histograms

    7) pastecs::stat.desc The pastecs package also has a function that could be used:

    library(pastecs)
    
    stat.desc(mtcars)
    

    giving:

                         mpg         cyl         disp           hp         drat          wt        qsec          vs          am        gear       carb
    nbr.val       32.0000000  32.0000000 3.200000e+01   32.0000000  32.00000000  32.0000000  32.0000000 32.00000000 32.00000000  32.0000000 32.0000000
    nbr.null       0.0000000   0.0000000 0.000000e+00    0.0000000   0.00000000   0.0000000   0.0000000 18.00000000 19.00000000   0.0000000  0.0000000
    nbr.na         0.0000000   0.0000000 0.000000e+00    0.0000000   0.00000000   0.0000000   0.0000000  0.00000000  0.00000000   0.0000000  0.0000000
    min           10.4000000   4.0000000 7.110000e+01   52.0000000   2.76000000   1.5130000  14.5000000  0.00000000  0.00000000   3.0000000  1.0000000
    max           33.9000000   8.0000000 4.720000e+02  335.0000000   4.93000000   5.4240000  22.9000000  1.00000000  1.00000000   5.0000000  8.0000000
    range         23.5000000   4.0000000 4.009000e+02  283.0000000   2.17000000   3.9110000   8.4000000  1.00000000  1.00000000   2.0000000  7.0000000
    sum          642.9000000 198.0000000 7.383100e+03 4694.0000000 115.09000000 102.9520000 571.1600000 14.00000000 13.00000000 118.0000000 90.0000000
    median        19.2000000   6.0000000 1.963000e+02  123.0000000   3.69500000   3.3250000  17.7100000  0.00000000  0.00000000   4.0000000  2.0000000
    mean          20.0906250   6.1875000 2.307219e+02  146.6875000   3.59656250   3.2172500  17.8487500  0.43750000  0.40625000   3.6875000  2.8125000
    SE.mean        1.0654240   0.3157093 2.190947e+01   12.1203173   0.09451874   0.1729685   0.3158899  0.08909831  0.08820997   0.1304266  0.2855297
    CI.mean.0.95   2.1729465   0.6438934 4.468466e+01   24.7195501   0.19277224   0.3527715   0.6442617  0.18171719  0.17990541   0.2660067  0.5823417
    var           36.3241028   3.1895161 1.536080e+04 4700.8669355   0.28588135   0.9573790   3.1931661  0.25403226  0.24899194   0.5443548  2.6088710
    std.dev        6.0269481   1.7859216 1.239387e+02   68.5628685   0.53467874   0.9784574   1.7869432  0.50401613  0.49899092   0.7378041  1.6152000
    coef.var       0.2999881   0.2886338 5.371779e-01    0.4674077   0.14866382   0.3041285   0.1001159  1.15203687  1.22828533   0.2000825  0.5742933