I have a data frame with a text column and a column holding a word count.
How can I delete the last n words of each text (where n is given in the 'wc' column) and save the output to a third column?
In other words, I need the "introductory" part of a bunch of texts, and I know when the intro ends, so I want to cut the text off at that point and save the intro in a new column.
df <- data.frame(text = c("this is a long text", "this is also a long text", "another long text"), wc = c('1', '2', '1'))
Desired output:
text | wc | chopped_off_text |
---|---|---|
this is a long text | 1 | this is a long |
this is also a long text | 2 | this is also a |
another long text | 1 | another long |
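You can use the `word` function from the `stringr` package to extract "words" from a sentence. `str_count(text, "\\s") + 1` counts the number of words in the sentence (assuming the words are separated by single whitespace characters), and subtracting `wc` from that count gives the position of the last word to keep. Since `wc` is stored as character here, it is converted with `as.integer` first.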
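To see what each piece returns, here is a small sketch on a single string. The variable names `x` and `n_words` are only for illustration and are not part of the question's data.

```r
library(stringr)

x <- "this is also a long text"

# Number of whitespace characters + 1 = number of words,
# assuming words are separated by exactly one space
n_words <- str_count(x, "\\s") + 1
n_words
# [1] 6

# Keep words 1 through n_words - 2, i.e. drop the last two words
word(x, 1, n_words - 2)
# [1] "this is also a"
```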
library(stringr)
library(dplyr)
df %>%
  mutate(chopped_off_text =
           word(text, 1, end = str_count(text, "\\s") + 1 - as.integer(wc)))
                      text wc chopped_off_text
1      this is a long text  1   this is a long
2 this is also a long text  2   this is also a
3        another long text  1     another long
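If the texts might contain repeated spaces, or if you would rather not load extra packages, a base R sketch along the following lines should give the same result. The helper name `drop_last_words` is made up for this example, and it assumes that "words" are simply whitespace-separated tokens.

```r
df <- data.frame(
  text = c("this is a long text", "this is also a long text", "another long text"),
  wc = c("1", "2", "1"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: split on runs of whitespace, keep all but the
# last n words, and paste them back together with single spaces
drop_last_words <- function(text, n) {
  words <- strsplit(trimws(text), "\\s+")[[1]]
  keep <- max(length(words) - n, 0)  # never ask for a negative count
  paste(head(words, keep), collapse = " ")
}

# Apply the helper row by row; wc is character, so convert it first
df$chopped_off_text <- mapply(drop_last_words, df$text, as.integer(df$wc),
                              USE.NAMES = FALSE)
df$chopped_off_text
# [1] "this is a long" "this is also a" "another long"
```

Splitting on `"\\s+"` means that runs of several spaces count as one separator, which is the main difference from counting single whitespace characters.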