Research context: speakers (writers) and recipients interact in written communication about a given discussion topic. The first speaker in a thread is the person who originally posted it.
The data look like this:
structure(list(topic = c(1, 1, 1, 1, 1, 1, 2, 2), thread = c(1,
1, 1, 2, 2, 2, 3, 3), speaker_id = c(111, 111, 111, 222, 222,
222, 111, 222), recipient_id = c(222, 333, 444, 111, 555, 444,
222, 111), dyad = structure(c(1L, 2L, 3L, 1L, 5L, 4L, 1L, 1L), .Label = c("111_222",
"111_333", "111_444", "222_444", "222_555"), class = "factor")), class = "data.frame", row.names = c(NA,
-8L), codepage = 65001L)
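The dput() output above is not assigned to a name, while the code further down refers to a data frame called df. For anyone reproducing this, here is a minimal sketch that rebuilds the same eight rows in base R. The name df is an assumption taken from the answer's code, and the dyad column is derived here by putting the smaller ID first, which appears to be how the original labels were formed.

```r
# Rebuild the example data; `df` is an assumed name for the data frame.
df <- data.frame(
  topic        = c(1, 1, 1, 1, 1, 1, 2, 2),
  thread       = c(1, 1, 1, 2, 2, 2, 3, 3),
  speaker_id   = c(111, 111, 111, 222, 222, 222, 111, 222),
  recipient_id = c(222, 333, 444, 111, 555, 444, 222, 111)
)

# A dyad is order-independent: smaller ID first, larger ID second.
df$dyad <- factor(paste(
  pmin(df$speaker_id, df$recipient_id),
  pmax(df$speaker_id, df$recipient_id),
  sep = "_"
))

levels(df$dyad)
```

pmin() and pmax() work element-wise, so 111 -> 222 and 222 -> 111 both map to the same label.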
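The aim is to create two variables:
threads_partnered: the number of distinct threads, within the same topic, in which the two members of the dyad interacted.
threads_present: the number of other threads, within the same topic, in which both members of the dyad appear without interacting with each other.
Based on the example data, the results would look like: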
╔═══════╦════════╦═════════╦═══════════╦═════════╦═══════════╦══════════════════════════════════════════╦═════════╦════════════════════════════════════════════╗
║ topic ║ thread ║ speaker ║ recipient ║ dyad ║ threads ║ note ║ threads ║ note ║
║ ║ ║ id ║ id ║ ║ partnered ║ ║ present ║ ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00 ║ 1 ║ 111 ║ 222 ║ 111_222 ║ 2 ║ 111 and 222 interacted (made a dyad) ║ 0 ║ Outside the given thread (thread #1) of ║
║ ║ ║ ║ ║ ║ ║ in two different threads (thread #1, #2) ║ ║ the given topic (topic #1), 111 and 222 ║
║ ║ ║ ║ ║ ║ ║ within topic 1 ║ ║ are not found together as recipients ║
║ ║ ║ ║ ║ ║ ║ ║ ║ other than being in a dyad ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00 ║ 1 ║ 111 ║ 333 ║ 111_333 ║ 1 ║ 111 and 333 interacted in ║ 0 ║ ║
║ ║ ║ ║ ║ ║ ║ one thread (thread #1) ║ ║ ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00 ║ 1 ║ 111 ║ 444 ║ 111_444 ║ 1 ║ 111 and 444 interacted in ║ 1 ║ 111 and 444 are found in thread #2, ║
║ ║ ║ ║ ║ ║ ║ one thread (thread #1) ║ ║ where they did not interact (made a dyad), ║
║ ║ ║ ║ ║ ║ ║ ║ ║ but were only recipients of ║
║ ║ ║ ║ ║ ║ ║ ║ ║ the original speaker (222) ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00 ║ 2 ║ 222 ║ 111 ║ 111_222 ║ 2 ║ 111 and 222 interacted in two different ║ 0 ║ ║
║ ║ ║ ║ ║ ║ ║ threads within topic 1 ║ ║ ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00 ║ 2 ║ 222 ║ 555 ║ 222_555 ║ 1 ║ 222 and 555 interacted in one thread ║ 0 ║ ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 1.00 ║ 2 ║ 222 ║ 444 ║ 222_444 ║ 1 ║ 222 and 444 interacted in one thread ║ 1 ║ 222 and 444 are found together ║
║ ║ ║ ║ ║ ║ ║ ║ ║ in thread #1, where they did not ║
║ ║ ║ ║ ║ ║ ║ ║ ║ interact ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 2.00 ║ 3 ║ 111 ║ 222 ║ 111_222 ║ 1 ║ 111 and 222 interacted in one thread ║ 0 ║ ║
║ ║ ║ ║ ║ ║ ║ (thread 3) within topic 2 ║ ║ ║
╠═══════╬════════╬═════════╬═══════════╬═════════╬═══════════╬══════════════════════════════════════════╬═════════╬════════════════════════════════════════════╣
║ 2.00 ║ 3 ║ 222 ║ 111 ║ 111_222 ║ 1 ║ same as above ║ 0 ║ ║
╚═══════╩════════╩═════════╩═══════════╩═════════╩═══════════╩══════════════════════════════════════════╩═════════╩════════════════════════════════════════════╝
Not entirely sure this accomplishes what you need, but perhaps it might be helpful in some way.
I created a custom function that takes the speaker, recipient, thread, and topic, and determines threads_present based on your description. It looks at the other threads within the same topic and excludes rows in which the speaker and recipient form a dyad with each other. It then keeps only the threads in which both the speaker and the recipient appear as a recipient in some row. These threads are then counted.
The second variable, threads_partnered, is more straightforward. After you group_by both topic and dyad, you can determine the number of unique threads with n_distinct.
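library(tidyr)
library(dplyr)
library(purrr)

# For one row, count the other threads in the same topic where both the
# speaker and the recipient appear as recipients. Assumes the data frame
# is stored in a global object named df.
my_fun <- function(the_speaker, the_recipient, the_thread, the_topic) {
  df %>%
    filter(
      topic == the_topic,
      thread != the_thread,
      # drop rows where the pair interacted directly as a dyad
      dyad != paste(min(the_speaker, the_recipient), max(the_speaker, the_recipient), sep = "_")
    ) %>%
    group_by(thread) %>%
    # keep threads in which both IDs occur among the recipients
    filter(all(c(the_speaker, the_recipient) %in% recipient_id)) %>%
    ungroup() %>%
    distinct(thread) %>%
    count(name = "threads_present")
}

df %>%
  mutate(threads_present = pmap(
    list(the_speaker = speaker_id, the_recipient = recipient_id, the_thread = thread, the_topic = topic),
    my_fun
  )) %>%
  unnest(cols = threads_present) %>%
  group_by(topic, dyad) %>%
  # number of distinct threads in which each dyad interacted, per topic
  mutate(threads_partnered = n_distinct(thread))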
Output
  topic thread speaker_id recipient_id dyad    threads_present threads_partnered
  <dbl>  <dbl>      <dbl>        <dbl> <fct>             <int>             <int>
1     1      1        111          222 111_222               0                 2
2     1      1        111          333 111_333               0                 1
3     1      1        111          444 111_444               1                 1
4     1      2        222          111 111_222               0                 2
5     1      2        222          555 222_555               0                 1
6     1      2        222          444 222_444               1                 1
7     2      3        111          222 111_222               0                 1
8     2      3        222          111 111_222               0                 1
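If it helps to check the logic without the tidyverse, the same two counts can be sketched in base R. This is only an illustrative re-implementation under the same reading of the question, not a replacement for the code above: the helper name count_present is hypothetical, and the data are rebuilt inline so the block runs on its own.

```r
# Example data rebuilt inline (same eight rows as in the question).
df <- data.frame(
  topic        = c(1, 1, 1, 1, 1, 1, 2, 2),
  thread       = c(1, 1, 1, 2, 2, 2, 3, 3),
  speaker_id   = c(111, 111, 111, 222, 222, 222, 111, 222),
  recipient_id = c(222, 333, 444, 111, 555, 444, 222, 111)
)
df$dyad <- paste(pmin(df$speaker_id, df$recipient_id),
                 pmax(df$speaker_id, df$recipient_id), sep = "_")

# Hypothetical helper mirroring my_fun: for row i, count the other threads in
# the same topic where both IDs occur as recipients, ignoring rows that
# belong to the dyad itself.
count_present <- function(i) {
  a <- df$speaker_id[i]
  b <- df$recipient_id[i]
  other <- df[df$topic == df$topic[i] &
                df$thread != df$thread[i] &
                df$dyad != df$dyad[i], ]
  # One TRUE/FALSE per remaining thread: are both IDs among its recipients?
  hits <- vapply(split(other$recipient_id, other$thread),
                 function(r) all(c(a, b) %in% r), logical(1))
  sum(hits)
}
df$threads_present <- vapply(seq_len(nrow(df)), count_present, integer(1))

# Distinct threads per topic/dyad combination.
df$threads_partnered <- ave(df$thread, df$topic, df$dyad,
                            FUN = function(x) length(unique(x)))
df
```

For the example data this gives threads_present of 0, 0, 1, 0, 0, 1, 0, 0 and threads_partnered of 2, 1, 1, 2, 1, 1, 1, 1, which agrees with the output above.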