
R: t.test error


The following occurred:

I set up my workspace, read the .csv, created some subsets, and ran a few t-tests of the form t.test(HtoC2/C2.dur, s1). Everything went fine until, a few t-tests later, I suddenly received the following error message:

Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) 
stop("data are essentially constant") : 
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(y) : argument is not numeric or logical: returning NA
2: In var(y) : NAs introduced by coercion

Since then, no t.test of the kind mentioned above works, not even the ones that ran perfectly fine before. I always receive the same error message.

I looked at similar problems but found no working solution, so I am asking here. I tried redoing my first steps in case I had run some command by accident, but this did not help. I also tried a similar data set with identical columns, for which t.tests had worked before, and received the same error.

Background information:

A portion of my data:

sprecher ident tier1.label testwort  L.zeit  H.zeit C1.start C1.ende V1.start V1.ende C2.start C2.ende V2.start V2.ende C1.dur V1.dur C2.dur V2.dur LtoC1 LtoV1 HtoV1 HtoC2
1       s1     1    ma:mi_01    ma:mi 23912.0 24108.2  23827.4 23937.5  23937.5 24064.5  24064.5 24148.0  24148.0 24214.6  110.1  127.0   83.5   66.6  84.6 -25.5 170.7  43.7
2       s1     1     mami_01     mami 26755.0 26958.8  26700.0 26800.2  26800.2 26887.4  26887.4 26957.1  26957.1 27035.5  100.2   87.2   69.7   78.4  55.0 -45.2 158.6  71.4
3       s1     2    ma:mi_02    ma:mi 33237.6 33451.4  33179.6 33282.1  33282.1 33395.8  33395.8 33473.2  33473.2 33562.0  102.5  113.7   77.4   88.8  58.0 -44.5 169.3  55.6
4       s1     3    ma:mi_03    ma:mi 39100.7 39315.5  39057.8 39162.3  39162.3 39290.1  39290.1 39363.1  39363.1 39441.0  104.5  127.8   73.0   77.9  42.9 -61.6 153.2  25.4
5       s1     2     mami_02     mami 41881.7 42099.5  41825.6 41936.8  41936.8 42028.3  42028.3 42101.4  42101.4 42180.1  111.2   91.5   73.1   78.7  56.1 -55.1 162.7  71.2
6       s1     4    ma:mi_04    ma:mi 44801.2 45028.8  44753.5 44860.2  44860.2 44990.9  44990.9 45070.6  45070.6 45131.3  106.7  130.7   79.7   60.7  47.7 -59.0 168.6  37.9

According to sapply(mode) and sapply(length), all columns are numeric, and each "sprecher" (s1 to s5) has 30 rows, for a total of 150.

Edit1: Forgot to mention how I defined my subsets:

s1 = subset(daten,sprecher=="s1")
s1.mahmi = subset(s1,testwort=="ma:mi")
s1.mammi = subset(s1,testwort=="mami")
s2 = subset(daten,sprecher=="s2")
s2.mahmi = subset(s2,testwort=="ma:mi")
s2.mammi = subset(s2,testwort=="mami")
s3 = subset(daten,sprecher=="s3")
s3.mahmi = subset(s3,testwort=="ma:mi")
s3.mammi = subset(s3,testwort=="mami")  
s4 = subset(daten,sprecher=="s4")
s4.mahmi = subset(s4,testwort=="ma:mi")
s4.mammi = subset(s4,testwort=="mami")
s5 = subset(daten,sprecher=="s5")
s5.mahmi = subset(s5,testwort=="ma:mi")
s5.mammi = subset(s5,testwort=="mami")

Solution

  • This should work for the subset with sprecher == "s1", presuming you want the default t.test options (my_data stands for your data frame, daten in the question):

    t.test(HtoC2/C2.dur ~ testwort, subset(my_data, sprecher == "s1"))
    

    If you just want the p-value for each subset, you could do:

    sapply(levels(factor(my_data$sprecher)), function(lev) {
      t.test(HtoC2/C2.dur ~ testwort, my_data[my_data$sprecher == lev, ])$p.value
    })
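
One plausible reading of the error in the question, offered as a sketch rather than a confirmed diagnosis: in t.test(x, y) the second positional argument is the second sample, so passing the data frame s1 there makes t.test try to average a whole data frame that contains non-numeric columns, which would match the warnings from mean.default(y) and var(y). The example below uses made-up data with the question's column names (the values and the two-speaker layout are assumptions for illustration only) to contrast that call with the formula interface used above.

```r
# Synthetic stand-in for the question's data; the values are invented.
set.seed(1)
daten <- data.frame(
  sprecher = rep(c("s1", "s2"), each = 30),
  testwort = rep(c("ma:mi", "mami"), times = 30),
  HtoC2    = runif(60, 20, 80),
  C2.dur   = runif(60, 60, 90)
)
s1 <- subset(daten, sprecher == "s1")

# Positional call: the data frame s1 is taken as `y`, the second sample.
# Its character columns cannot be averaged, so the standard error is NA
# and the internal `if (stderr < ...)` check fails.
ratio <- s1$HtoC2 / s1$C2.dur
res <- tryCatch(
  suppressWarnings(t.test(ratio, s1)),
  error = function(e) conditionMessage(e)
)
print(res)  # the caught error message (wording depends on the locale)

# Formula interface: the data frame is passed through `data`, and
# testwort defines the two groups to compare.
fit <- t.test(HtoC2 / C2.dur ~ testwort, data = s1)
print(fit$p.value)
```

Naming the argument (data = s1) makes it harder to pass the data frame as a sample by accident.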