How to load sound files and data in R for bioacoustic analysis (monitoR)?

I am trying to follow this bioacoustics-in-R guide to help me analyse some frog chirps. I am also looking at this monitoR guide, as both guides try to accomplish the same thing (matching templates against sound files).

As I have over 30,000 frog sound files of about 5 MB each, I've created a "dummy" folder with 20 random frog sound files in it and 5 template files. I figured this would make things quicker and once I had my code fully working I would tweak the folder names.

I am working in RStudio and the sound files are WAV files.

I have loaded monitoR and warbleR using library(monitoR, warbleR), and I think that loaded them successfully.

Then I set my working directory to that "dummy" folder with the 20 frog files and 5 chirp template files. I know that step worked because when I call list.files() they all appear.

> list.files()
 [1] "frog1.WAV"          "frog10.WAV"         "frog11.WAV"        
 [4] "frog12.WAV"         "frog13.WAV"         "frog14.WAV"        
 [7] "frog15.WAV"         "frog16.WAV"         "frog17.WAV"        
[10] "frog18.WAV"         "frog19.WAV"         "frog2.WAV"         
[13] "frog20.WAV"         "frog3.WAV"          "frog4.WAV"         
[16] "frog5.WAV"          "frog6.WAV"          "frog7.WAV"         
[19] "frog8.WAV"          "frog9.WAV"          "template_test1.WAV"
[22] "template_test2.WAV" "template_test3.WAV" "template_test4.WAV"
[25] "template_test5.WAV"

It is the second step of the first guide that I keep getting error messages on no matter how I try and tweak it.

Step 1 has this:

x<-c("warbleR", "monitoR")

That bit I understand as loading the packages, though I don't know what x and c are doing there...

lapply(x, function(y) {
  if(!y %in% installed.packages()[,"Package"])  install.packages(y)
  require(y, character.only = T)
})

I have no idea what that bit does... but including it or excluding it makes no difference to the error messages in the next step.

Step 2 is creating templates and where I am getting stuck:

The guide says: # load sound files and data

data(list = c("Phae.long1", "Phae.long2", "Phae.long3", "Phae.long4", "selec.table"))

I am unsure what the selec.table bit of the code does, but again, including or excluding it makes no difference to the error messages.

My attempted code is:

data(list = c("template_test1", "template_test2", "template_test3", "template_test4", "template_test5", "selec.table"))

and it throws up the following errors:

Warning messages:
1: In data(list = c("template_test1", "template_test2", "template_test3", :
  data set ‘template_test1’ not found
2: In data(list = c("template_test1", "template_test2", "template_test3", :
  data set ‘template_test2’ not found
3: In data(list = c("template_test1", "template_test2", "template_test3", :
  data set ‘template_test3’ not found
4: In data(list = c("template_test1", "template_test2", "template_test3", :
  data set ‘template_test4’ not found
5: In data(list = c("template_test1", "template_test2", "template_test3", :
  data set ‘template_test5’ not found

I've tried the steps suggested in both guides (with no success) and searched online for where I am going wrong, but I'm afraid I'm stumped. I've got a long road in R ahead of me for analysing this data set, so any help is greatly appreciated. Thank you.

Solution

library(monitoR)
library(tuneR)
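
For reference, the Step 1 snippet in the question is a defensive way of doing what the library() calls above do. c() builds a character vector of package names, and lapply() runs the same function on each name: install the package if it is missing, then attach it. Note also that library() attaches one package per call; in library(monitoR, warbleR) the second name is matched to the help argument, so warbleR is not attached by that call. A minimal sketch of the pattern, using the base packages stats and utils as stand-ins (an assumption, so that nothing gets installed when you try it):

```r
# Character vector of package names; "stats" and "utils" are stand-ins
# for "warbleR" and "monitoR" so this runs without installing anything.
pkgs <- c("stats", "utils")

# For each name: install it if it is missing, then attach it.
# require() returns TRUE when the package was attached successfully.
attached <- vapply(pkgs, function(pkg) {
  if (!pkg %in% rownames(installed.packages())) {
    install.packages(pkg)
  }
  require(pkg, character.only = TRUE)
}, logical(1))

print(attached)
```

Swapping the stand-ins for "warbleR" and "monitoR" should give the behaviour of the guide's snippet.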

monitoR

Data
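
The warnings in the question come from data(). That function loads R datasets, typically the examples that ship with packages (Phae.long1 and the others in the guide are warbleR example data); it does not read WAV files from the working directory. Your own recordings have to be referred to by file path instead. A small sketch that reproduces the warning and shows a path-based check; the empty file created here is a hypothetical placeholder, not a real recording:

```r
# data() looks for R datasets, so a WAV file name is reported as "not found".
msg <- tryCatch(
  data(list = "template_test1"),
  warning = function(w) conditionMessage(w)
)
print(msg)

# Files on disk are addressed by path instead.
# The empty file below is only a placeholder for a real recording.
demo_dir <- tempfile("frogs_")
dir.create(demo_dir)
wav_path <- file.path(demo_dir, "template_test1.WAV")
invisible(file.create(wav_path))

print(file.exists(wav_path))
```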

First we define where to find our files...

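# Folder holding the WAV files, relative to the working directory.
# getwd() has no trailing slash, so the separator must come first; the
# trailing slash is needed because file names are appended with paste0().
data_path <- paste0(getwd(), "/../data/")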

tpl.1 <- "template_test1.WAV"
tpl.2 <- "template_test2.WAV"
tpl.3 <- "template_test3.WAV"
tpl.4 <- "template_test4.WAV"
tpl.5 <- "template_test5.WAV"

frog1 <- "frog1.WAV"
frog2 <- "frog2.WAV"
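
With five templates, typing the names by hand is fine; for the full set of 30,000 recordings it may be easier to let R build the lists. A sketch using list.files() with a pattern, assuming the naming scheme shown in the question (frog*.WAV and template_test*.WAV); the temporary folder and empty files are placeholders so the example runs on its own:

```r
# Placeholder folder with empty files that mimic the naming in the question.
demo_dir <- tempfile("frogs_")
dir.create(demo_dir)
invisible(file.create(file.path(
  demo_dir,
  c("frog1.WAV", "frog2.WAV", "frog10.WAV",
    "template_test1.WAV", "template_test2.WAV")
)))

# The pattern is a regular expression; full.names = TRUE returns complete
# paths that can be passed to any function expecting a file path.
template_files <- list.files(demo_dir,
                             pattern = "^template_test[0-9]+\\.WAV$",
                             full.names = TRUE)
frog_files <- list.files(demo_dir,
                         pattern = "^frog[0-9]+\\.WAV$",
                         full.names = TRUE)

print(basename(template_files))
print(basename(frog_files))
```

Because the paths are already complete, they would not need to be pasted onto a folder name before use.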

Spectrograms

Then we display the spectrogram of tpl.1:

viewSpec(paste0(data_path, tpl.1))

We see that the interesting frequencies are between 1.6 and 3 kHz, so we restrict the view with frq.lim (given in kHz):

viewSpec(paste0(data_path, tpl.1), frq.lim = c(1.6, 3))

Correlation Templates

wct1 <- makeCorTemplate(paste0(data_path, tpl.1), 
                        name=tpl.1,
                        t.lim=c(0.1, 0.5),
                        frq.lim=c(1.6, 3))

wct2 <- makeCorTemplate(paste0(data_path, tpl.2), 
                        name=tpl.2,
                        t.lim=c(0.0, 0.35),
                        frq.lim=c(1.0, 8.0))

wct3 <- makeCorTemplate(paste0(data_path, tpl.3), 
                        name=tpl.3,
                        t.lim=c(0.0, 0.35),
                        frq.lim=c(1, 8))

wct4 <- makeCorTemplate(paste0(data_path, tpl.4), 
                        name=tpl.4,
                        t.lim=c(0.0, 0.3),
                        frq.lim=c(0.0, 6.0))

wct5 <- makeCorTemplate(paste0(data_path, tpl.5), 
                        name=tpl.5,
                        t.lim=c(0.0, 0.2),
                        frq.lim=c(0.0, 6))

ctemps <- combineCorTemplates(wct1, wct2, wct3, wct4, wct5)

Checking Frog2 recording

cscores <- corMatch(paste0(data_path, frog2), ctemps)

The highest correlation score, about 0.84, is between tpl.1 and frog2:

cscores

                    min.score  max.score  n.scores
template_test1.WAV      -0.15       0.84      5119
template_test2.WAV       0.23       0.71      5124
template_test3.WAV       0.18       0.69      5124
template_test4.WAV       0.52       0.80      5128
template_test5.WAV       0.42       0.72      5138

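Next we look for peaks in the scores. In the findPeaks() output, the n.peaks column shows between 112 and 250 peaks per template, and the n.detections column shows between 5 and 250 detections. For template_test4.WAV and template_test5.WAV every peak counts as a detection, which suggests that even their lowest peak score is above the cutoff in effect at this point.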

cdetects <- findPeaks(cscores)
cdetects

                    n.peaks  n.detections  min.peak.score  max.peak.score
template_test1.WAV      112             5       0.1195016       0.8438077
template_test2.WAV      128             7       0.3092445       0.7053220
template_test3.WAV      135             6       0.2550825       0.6853129
template_test4.WAV      156           156       0.5934782       0.8018123
template_test5.WAV      250           250       0.4687607       0.7191635
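
The difference between the two columns is easier to see on a toy example. A peak is a local maximum in the score series, and a detection is a peak whose score reaches the cutoff. The sketch below is a base R illustration of that idea, not monitoR's actual implementation, and the score values and cutoff are made up:

```r
# Made-up score series (not real output) and an illustrative cutoff.
scores <- c(0.10, 0.30, 0.20, 0.55, 0.50, 0.45, 0.70, 0.20)
cutoff <- 0.5

# Interior points that are higher than both neighbours are peaks.
n <- length(scores)
inner <- 2:(n - 1)
is_peak <- scores[inner] > scores[inner - 1] & scores[inner] > scores[inner + 1]
peak_idx <- inner[is_peak]

# Detections are the peaks at or above the cutoff.
detection_idx <- peak_idx[scores[peak_idx] >= cutoff]

print(scores[peak_idx])
print(scores[detection_idx])
```

Raising the cutoff can only shrink the set of detections; the set of peaks stays the same.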

head(getDetections(cdetects), 6)

  template            date.time            time       score
1 template_test1.WAV  2022-09-08 17:10:03   3.317333  0.5348492
2 template_test1.WAV  2022-09-08 17:10:07   7.786667  0.5271136
3 template_test1.WAV  2022-09-08 17:10:15  15.594667  0.8438077
4 template_test1.WAV  2022-09-08 17:10:19  19.626667  0.6815375
5 template_test1.WAV  2022-09-08 17:10:23  23.797333  0.7162522
6 template_test2.WAV  2022-09-08 17:10:03   3.296000  0.5244355

Let's plot this!

plot(cdetects)

This plot makes no sense to me so I'll just update the cutoff to some arbitrary values.

templateCutoff(cdetects) <- c(template_test1.WAV = 0.6, 
                              template_test2.WAV = 0.6, 
                              template_test3.WAV = 0.6, 
                              template_test4.WAV = 0.7, 
                              template_test5.WAV = 0.6)
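
To preview what a set of cutoffs would keep, the same thresholds can be applied by hand to a detections data frame. The sketch below retypes the six rows printed by head(getDetections(cdetects), 6) and filters them with a named cutoff vector. It is a base R illustration only, not a replacement for templateCutoff():

```r
# The six detections printed earlier, retyped by hand.
detections <- data.frame(
  template = c(rep("template_test1.WAV", 5), "template_test2.WAV"),
  time     = c(3.317333, 7.786667, 15.594667, 19.626667, 23.797333, 3.296000),
  score    = c(0.5348492, 0.5271136, 0.8438077, 0.6815375, 0.7162522, 0.5244355),
  stringsAsFactors = FALSE
)

# Named vector: one cutoff per template, looked up by template name.
cutoffs <- c("template_test1.WAV" = 0.6, "template_test2.WAV" = 0.6)

# Keep the rows whose score reaches the cutoff of their own template.
kept <- detections[detections$score >= cutoffs[detections$template], ]
print(kept)
```

Three of the five template_test1.WAV rows pass a 0.6 cutoff.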

What are our peaks now?

cdetects

                    n.peaks  n.detections  min.peak.score  max.peak.score
template_test1.WAV      112             3       0.1195016       0.8438077
template_test2.WAV      128             2       0.3092445       0.7053220
template_test3.WAV      135             4       0.2550825       0.6853129
template_test4.WAV      156             6       0.5934782       0.8018123
template_test5.WAV      250             6       0.4687607       0.7191635

Looking better!

plot(cdetects)
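
To scale this up from one recording to a whole folder, the same three calls can be wrapped in a function and applied to each file. The following is a sketch under assumptions rather than a tested pipeline: detect_in_file() and survey_dir are hypothetical names, it assumes the ctemps object built earlier and a folder of frog*.WAV files, and findPeaks() here is assumed to use whatever cutoffs are stored with the templates. With 30,000 files it would be worth trying it on the 20-file dummy folder first.

```r
# Hypothetical helper: score one recording against the templates and
# return its detections as a data frame, tagged with the file name.
detect_in_file <- function(path, templates) {
  scores <- monitoR::corMatch(path, templates)
  peaks <- monitoR::findPeaks(scores)
  detections <- monitoR::getDetections(peaks)
  # rep() keeps this valid even when a file yields zero detections.
  detections$file <- rep(basename(path), nrow(detections))
  detections
}

# Usage (commented out because it needs monitoR, ctemps and real recordings):
# survey_dir <- "path/to/recordings"   # hypothetical folder
# frog_files <- list.files(survey_dir, pattern = "^frog[0-9]+\\.WAV$",
#                          full.names = TRUE)
# all_detections <- do.call(rbind, lapply(frog_files, detect_in_file,
#                                         templates = ctemps))
```

Collecting everything into one data frame keeps the file name next to each detection, which makes it easier to go back and listen to individual hits.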