Convert Speech Start and End Time into Time Series

I am looking to convert the following R data frame into one that is indexed by seconds and have no idea how to do it. Maybe dcast but then in confused on how to expand out the word that's being spoken.

startTime endTime           word
1     1.900s  2.300s         hey
2     2.300s  2.800s         I'm
3     2.800s      3s        John
4         3s  3.400s       right
5     3.400s  3.500s         now
6     3.500s  3.800s           I
7     3.800s  4.300s        help

Time           word
1.900s         hey
2.000s         hey
2.100s         hey
2.200s         hey
2.300s         I'm
2.400s         I'm
2.500s         I'm
2.600s         I'm
2.700s         I'm
2.800s         John
2.900s         John
3.000s         right
3.100s         right
3.200s         right
3.300s         right

Solution

One solution can be achieved using tidyr::expand.

EDITED: Based on feedback from OP, as his data got duplicate startTime

library(tidyverse)
step = 0.1
df %>% group_by(rnum = row_number()) %>%
  expand(Time = seq(startTime, max(startTime, (endTime-step)), by=step), word = word) %>%
  arrange(Time) %>% 
  ungroup() %>%
  select(-rnum)

# # A tibble: 24 x 2
# # Groups: word [7]
#    Time word 
#   <dbl> <chr>
# 1  1.90 hey  
# 2  2.00 hey  
# 3  2.10 hey  
# 4  2.20 hey  
# 5  2.30 I'm  
# 6  2.40 I'm  
# 7  2.50 I'm  
# 8  2.60 I'm  
# 9  2.70 I'm  
# 10  2.80 John
# ... with 14 more rows

Data

df <- read.table(text = 
"startTime endTime           word
     1.900  2.300         hey
     2.300  2.800         I'm
     2.800      3        John
     3      3.400       right
     3.400  3.500         now
     3.500  3.800           I
     3.800  4.300        help",
header = TRUE, stringsAsFactors = FALSE)