facebook r data-analysis

R: Create column with tags from exported Facebook .csv data


I am analyzing my Facebook page's posts to see what kind of posts attract the most people, so I want to create a column containing the tags used in each post. Here's an example of what the data export looks like:

Post              Likes
Blah   #a          10
Blah Blah #b       12
Blah Bleh #a       10
Bleh   #b           9
Bleh Blah #a #b    15

I want to create this:

Post              Likes   tags
Blah   #a          10      #a
Blah Blah #b       12      #b
Blah Bleh #a       10      #a
Bleh   #b           9      #b
Bleh Blah #a #b    15      #a #b
Bleh #b Blah #a    14      #a #b

Is this possible? I thought of using grepl to check for posts with "#" inside, but I'm stuck on what to do next.


Solution

  • You can use gregexpr for example to find the desired pattern and regmatches to extract it:

    txt = c('Bleh Blah #a #b','Blah Bleh #a')
    regmatches(txt,gregexpr('#[a-z]',txt))   ## assuming a tag is # followed by one lowercase letter
    [[1]]
    [1] "#a" "#b"
    
    [[2]]
    [1] "#a"
    

    Using alexis's example, you can write something like this:

    DF$tag <- regmatches(DF$Post,gregexpr('#[a-z]',DF$Post))
    

    Edit: in case a tag is something like #hi (more than one letter):

    txt = c('Bleh Blah #hi allo #b','Blah Bleh #a')
    regmatches(txt,gregexpr('#[a-z]+',txt))
    
    [[1]]
    [1] "#hi" "#b" 
    
    [[2]]
    [1] "#a"
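
Note that regmatches returns a list, so assigning it directly gives a list column rather than the single "#a #b" string shown in the question. A minimal sketch of one way to get a plain character column, assuming the data frame is called DF with a Post column (the sample rows, the tag_list name and the untagged post are made up for illustration), and assuming tags should be sorted so that "#b ... #a" becomes "#a #b" as in the desired output:

```r
# Hypothetical data frame mirroring the question's example export
DF <- data.frame(
  Post  = c("Blah #a", "Blah Blah #b", "Bleh Blah #a #b",
            "Bleh #b Blah #a", "No tags here"),
  Likes = c(10, 12, 15, 14, 3),
  stringsAsFactors = FALSE
)

# Extract every tag per post; the result is a list of character vectors
tag_list <- regmatches(DF$Post, gregexpr("#[a-z]+", DF$Post))

# Sort each vector and collapse it into one space-separated string,
# so the new column is an ordinary character vector
DF$tags <- vapply(tag_list,
                  function(x) paste(sort(x), collapse = " "),
                  character(1))

print(DF)
```

Posts without any tag end up with an empty string in DF$tags. If tags can contain digits or uppercase letters, the pattern would need widening, for example to '#[[:alnum:]_]+'.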