I want to run a network analysis but am completely lost on how to structure my data correctly, since most examples already have data structured at the "to" and "from" level.
An example of my data looks like:
df <- data.frame(Name = c("Alice", "Ben", "Tom", "Jane", "Neil", "Alice", "Tom", "Ben", "Jane", "Neil", "Alice", "Tom", "Ben", "Jane", "Bob"),
Location = c("Ward", "Desk", "Op", "Call", "Off",
"Ward", "Desk", "Op", "Call", "Off",
"Ward", "Desk", "Op", "Call", "Off"),
Rating = c(1, 1, 1, 1, 1, 10, 10, 10, 10, 10, 8, 8, 8, 8, 8))
I now wish to get "to" and "from" combinations of people, as denoted by Name, for every Rating. Note that people can be at a different Location during a different rating; I'd prefer Location, in combination with Name, to be the nodes and Rating to be the edges.
I have looked at library(iterpc) but am struggling to comprehend the whole combination thing, with five different lineups.
Is there a potential dplyr solution to my problem? Thank you!
EDIT: It looks as though my question is very similar to this, yet the marked answer does not work for me; instead I get Error: Column name Name must not be duplicated.
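If you want the from column to be Name and the to column to be Location, with Rating carried along as an edge attribute, then tidygraph does this mapping for you: as_tbl_graph() treats the first two columns of a data frame as from and to.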
library(tidygraph)
#> Warning: package 'tidygraph' was built under R version 3.6.3
#>
#> Attaching package: 'tidygraph'
#> The following object is masked from 'package:stats':
#>
#> filter
df <- data.frame(
Name = c(
"Alice", "Ben", "Tom", "Jane", "Neil",
"Alice", "Tom", "Ben", "Jane", "Neil",
"Alice", "Tom", "Ben", "Jane", "Bob"
),
Location = c(
"Ward", "Desk", "Op", "Call", "Off",
"Ward", "Desk", "Op", "Call", "Off",
"Ward", "Desk", "Op", "Call", "Off"
),
Rating = c(
1, 1, 1, 1, 1,
10, 10, 10, 10, 10,
8, 8, 8, 8, 8)
)
tg <- as_tbl_graph(df)
tg
#> # A tbl_graph: 11 nodes and 15 edges
#> #
#> # A directed acyclic multigraph with 4 components
#> #
#> # Node Data: 11 x 1 (active)
#> name
#> <chr>
#> 1 Alice
#> 2 Ben
#> 3 Tom
#> 4 Jane
#> 5 Neil
#> 6 Bob
#> # ... with 5 more rows
#> #
#> # Edge Data: 15 x 3
#> from to Rating
#> <int> <int> <dbl>
#> 1 1 7 1
#> 2 2 8 1
#> 3 3 9 1
#> # ... with 12 more rows
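The integer from and to values above are positions in the node table. As a rough illustration only (this is not tidygraph's own code), the same numbering can be reproduced in base R, assuming nodes are numbered in order of first appearance, first down the Name column and then down the Location column. The object names nodes and edges are chosen here just for the sketch.

```r
# Same data as above; stringsAsFactors = FALSE keeps the columns as character
df <- data.frame(
  Name = c(
    "Alice", "Ben", "Tom", "Jane", "Neil",
    "Alice", "Tom", "Ben", "Jane", "Neil",
    "Alice", "Tom", "Ben", "Jane", "Bob"
  ),
  Location = rep(c("Ward", "Desk", "Op", "Call", "Off"), times = 3),
  Rating = rep(c(1, 10, 8), each = 5),
  stringsAsFactors = FALSE
)

# Assumption: node ids follow order of first appearance,
# Name values first, then Location values
nodes <- unique(c(df$Name, df$Location))

# Each row becomes one edge from Name to Location; Rating is kept as an attribute
edges <- data.frame(
  from = match(df$Name, nodes),
  to = match(df$Location, nodes),
  Rating = df$Rating
)

head(edges, 3)
```

Under that assumption, the first three rows of edges line up with the edge data printed above, and nodes lists the six names followed by the five locations.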
You can double-check that this mapping is correct by looking at the first row of the edge table: it shows an edge between nodes 1 and 7, which are Alice and Ward, matching the first row of your original data frame.
data.frame(tg)
#> name
#> 1 Alice
#> 2 Ben
#> 3 Tom
#> 4 Jane
#> 5 Neil
#> 6 Bob
#> 7 Ward
#> 8 Desk
#> 9 Op
#> 10 Call
#> 11 Off
Created on 2020-09-21 by the reprex package (v0.3.0)
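If the goal is instead the person-to-person pairs within each Rating that the question describes, one possible sketch in base R is a self-join on Rating. This is an assumption about the desired output rather than something the graph above provides; the node label format (Name and Location joined with "@") and the object names members, pairs and edges are made up for illustration.

```r
df <- data.frame(
  Name = c(
    "Alice", "Ben", "Tom", "Jane", "Neil",
    "Alice", "Tom", "Ben", "Jane", "Neil",
    "Alice", "Tom", "Ben", "Jane", "Bob"
  ),
  Location = rep(c("Ward", "Desk", "Op", "Call", "Off"), times = 3),
  Rating = rep(c(1, 10, 8), each = 5),
  stringsAsFactors = FALSE
)

# Hypothetical node label: person and location combined, e.g. "Alice@Ward"
members <- data.frame(
  Rating = df$Rating,
  node = paste(df$Name, df$Location, sep = "@"),
  stringsAsFactors = FALSE
)

# Self-join on Rating: every node paired with every node sharing that Rating
pairs <- merge(members, members, by = "Rating", suffixes = c("_from", "_to"))

# Keep each unordered pair once and drop self-pairs
pairs <- pairs[pairs$node_from < pairs$node_to, ]

# Edge list with from/to first, Rating as the edge attribute
edges <- data.frame(
  from = pairs$node_from,
  to = pairs$node_to,
  Rating = pairs$Rating,
  stringsAsFactors = FALSE
)

nrow(edges)
```

Because from and to are the first two columns, a data frame shaped like edges can be passed to as_tbl_graph() in the same way as df above. Giving the two copies distinct suffixes in the join also avoids ending up with two columns called Name.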