r dplyr singleton phonetics

Is there a way to extract rows based on condition of one column?


I want to extract all columns for rows 4, 11 and so on: that is, every row that comes immediately before a row with 'A' in the column 'xsampa'. For example, row 4 qualifies because row 5 has 'A' in 'xsampa'. I can extract these rows manually, but an automated approach would save me a lot of work.

Many thanks if you help me out.

             Filename Speaker Consonant      tdiff xsampa
1  AK_baagge.TextGrid       1  Geminate 0.23165381      B
2  AK_baagge.TextGrid       1  Geminate 0.09607762      b
3  AK_baagge.TextGrid       1  Geminate 0.15799431     A:
4  AK_baagge.TextGrid       1  Geminate 0.08753738     g:
5  AK_baagge.TextGrid       1  Geminate 0.02668823      A
6  AK_baagge.TextGrid       1  Geminate 0.12917102     e:
7  AK_baagge.TextGrid       1  Geminate 0.87323879      E
8   AK_baagi.TextGrid       1 Singleton 0.22415281      B
9   AK_baagi.TextGrid       1 Singleton 0.11448148      b
10  AK_baagi.TextGrid       1 Singleton 0.15873483     A:
11  AK_baagi.TextGrid       1 Singleton 0.09716495      g
12  AK_baagi.TextGrid       1 Singleton 0.05387364      A
13  AK_baagi.TextGrid       1 Singleton 0.10125358     i:
14  AK_baagi.TextGrid       1 Singleton 0.70685099      E
15   AK_baga.TextGrid       1 Singleton 0.78044616      B
16   AK_baga.TextGrid       1 Singleton 0.09659531      b
17   AK_baga.TextGrid       1 Singleton 0.09220461      @
18   AK_baga.TextGrid       1 Singleton 0.05159068      g
19   AK_baga.TextGrid       1 Singleton 0.13482446     A:
20   AK_baga.TextGrid       1 Singleton 0.46999388      E

Solution

  • As @Jon Spring pointed out in the comments, the answer is to use the dplyr::lead() function instead of lag(). lead(xsampa) gives each row the 'xsampa' value of the row after it, so filtering on lead(xsampa) == "A" keeps every row that sits immediately before an 'A', which is the desired output.

    lag() does the opposite: it gives each row the value from the row before it, so the same filter with lag() would return the rows immediately after an 'A'.

    ANSWER:

    library(dplyr)
    mydata_new <- mydata %>% filter(lead(xsampa) == "A")
    

    Output:

                   Filename Speaker Consonant      tdiff xsampa
    1    AK_baagge.TextGrid       1  Geminate 0.08753738     g:
    2     AK_baagi.TextGrid       1 Singleton 0.09716495      g
    3     AK_bagga.TextGrid       1  Geminate 0.11573271     g:
    4     AK_buute.TextGrid       1 Singleton 0.08538239     t`
    5    AK_buutte.TextGrid       1  Geminate 0.21568940    t`:  
    6   AK_chaakki.TextGrid       1  Geminate 0.12341936     k:
    7     AK_chape.TextGrid       1 Singleton 0.06812137      p
    8    AK_chappe.TextGrid       1  Geminate 0.14723284     p:
    9      AK_fati.TextGrid       1 Singleton 0.06677743     t`
    10    AK_fatti.TextGrid       1  Geminate 0.13503550    t`:
    11     AK_gada.TextGrid       1 Singleton 0.06472276    d_d
    12    AK_gadda.TextGrid       1  Geminate 0.13475387   d_d:
    13   AK_jaaddi.TextGrid       1  Geminate 0.12847036   d_d:
    14    AK_jaadi.TextGrid       1 Singleton 0.06732941    d_d
    15    AK_katha.TextGrid       1 Singleton 0.01338915  t_d_h
    16     AK_kute.TextGrid       1 Singleton 0.04600485    t_d
    17    AK_kutte.TextGrid       1  Geminate 0.15318115   t_d:
    18  AK_raajegi.TextGrid       1 Singleton 0.03868537     dZ
    19 AK_raajjegi.TextGrid       1  Geminate 0.10578673    dZ:
    20     AK_sada.TextGrid       1 Singleton 0.05504982    d_d
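
To try the approach without the full dataset, here is a minimal sketch built from the first two files in the question's table. It first compares lead() and lag() on a short vector, then applies the filter. The group_by(Filename) step is an addition that is not part of the original answer: it assumes a match should never cross a file boundary, which could otherwise happen if one file started with an 'A'. The names x and before_A are illustrative only.

```r
library(dplyr)

# lead() looks one element ahead, lag() one element behind.
x <- c("g:", "A", "e:")
lead(x)  # "A"  "e:" NA
lag(x)   # NA   "g:" "A"

# Toy data: the xsampa values of the first two files in the question.
mydata <- data.frame(
  Filename = rep(c("AK_baagge.TextGrid", "AK_baagi.TextGrid"), each = 7),
  xsampa   = c("B", "b", "A:", "g:", "A", "e:", "E",
               "B", "b", "A:", "g",  "A", "i:", "E")
)

# Keep each row whose next row, within the same file, has xsampa == "A".
# group_by(Filename) is an assumption added here, not part of the answer.
before_A <- mydata %>%
  group_by(Filename) %>%
  filter(lead(xsampa) == "A") %>%
  ungroup()

before_A$xsampa  # "g:" "g"
```

Two details are worth noting. The comparison with == is an exact string match, so a row that precedes 'A:' (such as row 18 in the question) is not selected, which matches the output shown above. Also, lead() returns NA for the last row of each group, and filter() drops rows where the condition is NA, so no extra handling is needed there.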