I have a table like this:
No, Memo
1, Date: 2020/10/22 City: UA Note: True mastery of any skill takes a lifetime.
2, Date: 2022/11/01 City: CH Note: Sweat is the lubricant of success.
3, Date: 2022y11m1d City: UA Note: Every noble work is at first impossible.
4, Date: 2022y2m15d City: AA Note: Live beautifully, dream passionately, love completely.
I want to extract the strings after "Date:", "City:", and "Note:".
For example, for No. 1, I need to extract "2020/10/22", which is between "Date:" and "City:"; "UA", which is between "City:" and "Note:"; and "True mastery of any skill takes a lifetime.", which comes after "Note:".
Desired output:
No Date City Note
1 2020/10/22 UA True mastery of any skill takes a lifetime.
2 2022/11/01 CH Sweat is the lubricant of success.
3 2022y11m1d UA Every noble work is at first impossible.
4 2022y2m15d AA Live beautifully, dream passionately, love completely.
Does anyone know how to do this? Any help would be great. Thank you.
My solution uses regex with stringr and dplyr:
library(stringr)
library(dplyr)
df <- read.table(
text = "No; Memo
1; Date: 2020/10/22 City: UA Note: True mastery of any skill takes a lifetime.
2; Date: 2022/11/01 City: CH Note: Sweat is the lubricant of success.
3; Date: 2022y11m1d City: UA Note: Every noble work is at first impossible.
4; Date: 2022y2m15d City: AA Note: Live beautifully, dream passionately, love completely.",
sep = ";",
header = TRUE
)
# Pull each field out of Memo with lookbehind/lookahead, then drop Memo
df_test <- df %>%
  mutate(
    date = str_extract(Memo, "(?<=Date: )(.*)(?= City)"),
    city = str_extract(Memo, "(?<=City: )(.*)(?= Note)"),
    note = str_extract(Memo, "(?<=Note: ).*")
  ) %>%
  select(-Memo)
> df_test
No date city note
1 1 2020/10/22 UA True mastery of any skill takes a lifetime.
2 2 2022/11/01 CH Sweat is the lubricant of success.
3 3 2022y11m1d UA Every noble work is at first impossible.
4 4 2022y2m15d AA Live beautifully, dream passionately, love completely.
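Each regex matches the text between two markers: a positive lookbehind, (?<=...), requires the preceding label (for example "Date: "), and a positive lookahead, (?=...), requires the label that follows (for example " City"). Neither label is included in the extracted string.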
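For comparison, here is a minimal sketch of the same extraction done with a single pattern and capture groups instead of three lookaround patterns. It is an alternative, not the approach used above. It assumes every Memo has the three labels in the order Date, City, Note, and it uses base R's strcapture() so no extra packages are needed. The names memo, pattern, proto and parts are made up for illustration.

```r
# Two sample Memo values copied from the question (rows 1 and 4)
memo <- c(
  "Date: 2020/10/22 City: UA Note: True mastery of any skill takes a lifetime.",
  "Date: 2022y2m15d City: AA Note: Live beautifully, dream passionately, love completely."
)

# One capture group per field; the lazy (.*?) stops at the first following label
pattern <- "Date: (.*?) City: (.*?) Note: (.*)"

# Prototype that tells strcapture() the column names and types to return
proto <- data.frame(
  Date = character(),
  City = character(),
  Note = character(),
  stringsAsFactors = FALSE
)

# perl = TRUE selects the PCRE engine, which supports the lazy quantifiers
parts <- strcapture(pattern, memo, proto, perl = TRUE)
print(parts)
```

A Memo that does not match the pattern gets NA in all three columns, so malformed rows are easy to spot afterwards.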