
Descriptive statistics in R with more than one X


I'm currently working with the psych package. I have the variables StockPrice, BookValuePS (PS = per share), EPS (= Earnings per share) and ESGscore.

I want to compute some descriptive statistics for my final paper.

The code I use at the moment is:

install.packages("psych")
library(psych)
attach(data_excel)
describeBy(StockPrice, group = ggroup, mat = TRUE, skew = FALSE, quant=c(0.25, 0.75))

--> I first tried it with one X (= StockPrice), and this is the output:

   item group1 vars   n     mean       sd   min    max  range       se  Q0.25   Q0.75
11    1   1010    1 133 64.75000 31.27857 12.59 184.07 171.48 2.712196 42.750 83.0000
12    2   1510    1 154 71.27513 39.00316  6.77 221.53 214.76 3.142964 40.675 94.8125
  1. Why is it using item 11 & 12?

Now I wanted to do the same thing, with more Xs and the same grouping.

describeBy(StockPrice, BookValuePS, EPS, ESGscore group = ggroup, mat = TRUE, skew = FALSE, quant=c(0.25, 0.75))

But I get this error:

 unexpected symbol in "describeBy(StockPrice, BookValuePS, EPS, ESGscore group"
  2. Is there any way to get a complete output for all variables "grouped" by ggroup?

It would be so nice if you could help me out and if you need any data, let me know. :)

Happy New Year everybody and thanks for the help :)


Solution

    • First

    11 and 12 are not items. The item column holds 1 and 2, which reflects that your grouping variable ggroup has two levels.

    11 and 12 are row names. In each of them, the first digit stands for the variable (variable 1) and the second digit for the level of the grouping variable, so 11 is variable 1 at level 1 and 12 is variable 1 at level 2. If you wrap the result in tibble(), the row names disappear. The example at the end shows the same naming pattern.
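
To see where such row names come from, and one way to drop them, here is a minimal sketch on the built-in mtcars data instead of data_excel. It resets the row names with base R rather than tibble(), which is a swapped-in alternative to the approach mentioned above. The object names res and res_clean are only illustrative.

```r
library(psych)

# One variable (mpg) described within each level of cyl (three levels: 4, 6, 8)
res <- describeBy(mtcars$mpg, group = mtcars$cyl, mat = TRUE, skew = FALSE)

# Row names: a variable label followed by the index of the group level
print(rownames(res))

# The item column numbers the rows of the output
print(res[, "item"])

# Dropping the row names leaves the statistics themselves untouched
res_clean <- res
rownames(res_clean) <- NULL
print(res_clean)
```

The grouping information is still available in the group1 column after the row names are removed, so nothing is lost by dropping them.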

    • Second

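    You can analyze the complete data set by group with this formula input: describeBy(data_excel ~ ggroup)

    The error in your call is a syntax error: a comma is missing between ESGscore and group. Adding the comma alone is not enough, though, because describeBy() expects the data as its first argument, not as several separate variables. To avoid the error, put the variables on the left-hand side of a formula, joined with +, and pass the data set through data =:

    describeBy(SATV + SATQ ~ gender, data = sat.act)

    Here is an example of this formula input with the mtcars data set: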

    describeBy(disp + mpg ~ cyl, mat = TRUE, skew = FALSE, quant = c(0.25, 0.75), data = mtcars)
    
          item group1 vars  n      mean        sd   min   max range         se  Q0.25  Q0.75
    disp1    1      4    1 11 105.13636 26.871594  71.1 146.7  75.6  8.1020903  78.85 120.65
    disp2    2      6    1  7 183.31429 41.562460 145.0 258.0 113.0 15.7091334 160.00 196.30
    disp3    3      8    1 14 353.10000 67.771324 275.8 472.0 196.2 18.1126481 301.75 390.00
    mpg1     4      4    2 11  26.66364  4.509828  21.4  33.9  12.5  1.3597642  22.80  30.40
    mpg2     5      6    2  7  19.74286  1.453567  17.8  21.4   3.6  0.5493967  18.65  21.00
    mpg3     6      8    2 14  15.10000  2.560048  10.4  19.2   8.8  0.6842016  14.40  16.25
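
Applied to the variables from the question, the same formula pattern might look like the sketch below. The real data_excel is not available here, so the data frame is filled with made-up random numbers purely so that the call can run. Only the column names and the two group labels 1010 and 1510 are taken from the question. Formula input is not available in older psych releases, so an update may be needed if the call fails.

```r
library(psych)

# Placeholder data: random values standing in for the real data_excel
set.seed(1)
n <- 40
data_excel <- data.frame(
  StockPrice  = runif(n, 5, 200),
  BookValuePS = runif(n, 1, 80),
  EPS         = rnorm(n, 3, 2),
  ESGscore    = runif(n, 0, 100),
  ggroup      = rep(c(1010, 1510), each = n / 2)
)

# All four variables on the left of ~, the grouping variable on the right
res_all <- describeBy(StockPrice + BookValuePS + EPS + ESGscore ~ ggroup,
                      data = data_excel, mat = TRUE, skew = FALSE,
                      quant = c(0.25, 0.75))
print(res_all)
```

The result should have one row per variable and group, eight rows in this case. Because the data set is passed through data =, attach() is not needed for this call.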