Count frequency based on multiple columns

I have multiple columns with VC's funds.

df <- data.frame(Company = c("X", "Y","Z"),
                 Valuation = c("10","20","30"),
                 VC.1 = c("Zeev Ventures","Bedrock Capital","Activant Capital"),
                 VC.2 = c("Bedrock Capital","Activant Capital","Zeev Ventures"))

Company	Valuatiion	VC.1	VC.2
X	10	Zeev Ventures	Bedrock Capital
Y	20	Bedrock Capital	Activant Capital
Z	30	Activant Capital	Zeev Ventures

I need calcucalte different statistics for each VC fund. For example

VC	Total Valuatiion	Total Investments
Zeev Ventures	40	2
Bedrock Capital	40	2
Activant Capital	50	2

Solution

You can pivot the VC.1 and VC.2 columns and get your results with the usual dplyr functions:

library(tidyr)
library(dplyr)
pivot_longer(df, cols = c("VC.1", "VC.2"), values_to = "VC") %>%
  group_by(VC) %>%
  summarise(TV = sum(as.numeric(Valuation)), TI = n())

# A tibble: 3 × 3
  VC                  TV    TI
  <chr>            <dbl> <int>
1 Activant Capital    50     2
2 Bedrock Capital     30     2
3 Zeev Ventures       40     2

I recommend inspecting the result of the pivot_longer() function call alone for a better understanding of why this pivoting step solves the problem of having identifiers in multiple columns.