Correlation in R between categorical and binary variables

I want to use the attached data to see if there is a correlation between the bootcamp that students attended and the job they end up getting. For example, does someone who attended a software engineering bootcamp end up with a software job, or does attending a data science one lead to a job in data? I have tried doing this, but I don't think it's right. I have attached a screenshot of the data. Please help with the correct code.

data <- data[rowSums(is.na(data)) == 0,]
summary(data)
data <- as.data.frame.matrix(data)
sapply(data,class)
data$Bootcamp <- as.numeric(factor(data$Bootcamp))
sapply(data,class)
data <- data[rowSums(is.na(data)) == 0,]

Solution

Here is how you can compute the correlation (remember that correlation is not causation; there can be confounders). Since I don't have access to your data, I started by generating some random data, which looks like the following (you can replace it with your actual data).
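The generation step itself is not shown in this answer. A minimal sketch of how comparable random data could be produced follows. The seed, the sample size n, and the 0.5 probability are assumptions made for illustration. Only the column names and bootcamp labels come from the output shown in this answer.

```r
# Hypothetical data generation; seed, n and prob are illustrative assumptions
set.seed(42)
n <- 100
job_cols <- c("software", "web", "data", "security",
              "engineer", "developer", "analyst")

# One categorical column with three bootcamp levels
data <- data.frame(
  Bootcamp = factor(sample(
    c("Cybersecurity", "Data Science", "Software Engineering"),
    n, replace = TRUE
  ))
)

# One independent 0/1 indicator column per job keyword
for (col in job_cols) {
  data[[col]] <- rbinom(n, size = 1, prob = 0.5)
}
```

Because the indicators here are drawn independently of Bootcamp, any correlation found in such data is noise. With real data you would skip this step entirely.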

head(data)
#       Bootcamp software web data security engineer developer analyst
#1  Data Science        0   1    0        0        0         1       1
#2  Data Science        1   1    1        0        1         1       1
#3 Cybersecurity        1   1    0        1        0         0       1
#4 Cybersecurity        0   0    0        1        1         0       1
#5 Cybersecurity        0   1    0        1        0         0       0
#6  Data Science        0   1    0        1        0         0       1

Now use the function model.matrix() to create binary dummy variables from the categorical column. It builds a design (or model) matrix, which among other things expands a factor into a set of dummy variables.

bootcamp <- as.data.frame(model.matrix(~ Bootcamp + 0, data)) # with no intercept term
head(bootcamp)
#  BootcampCybersecurity BootcampData Science BootcampSoftware Engineering
#1                     0                    1                            0
#2                     0                    1                            0
#3                     1                    0                            0
#4                     1                    0                            0
#5                     1                    0                            0
#6                     0                    1                            0

Note that the first row has the Bootcamp value Data Science, so only the corresponding dummy variable has the value 1; all the others are 0 for that row.

Note also that only 3 dummy columns were generated for me, because the factor being expanded had only 3 levels. In general, you will get as many columns as there are levels in the factor variable.

Now, compute the correlation:

job <- data[,2:ncol(data)]
corr <- cor(bootcamp, job)
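It may help to see what cor() measures when both variables are binary: the Pearson correlation of two 0/1 vectors equals the phi coefficient of their 2x2 contingency table. The vectors x and y below are made-up values used only to illustrate this; they are not taken from the data.

```r
# Made-up 0/1 vectors, for illustration only
x <- c(1, 1, 1, 0, 0, 0, 1, 0)  # e.g. attended a given bootcamp
y <- c(1, 1, 0, 0, 0, 1, 1, 0)  # e.g. has a given job keyword

tab <- table(x, y)  # 2x2 contingency table
n11 <- tab["1", "1"]; n00 <- tab["0", "0"]
n10 <- tab["1", "0"]; n01 <- tab["0", "1"]

# Phi coefficient computed from the cell counts and the margins
phi <- (n11 * n00 - n10 * n01) /
  sqrt(prod(rowSums(tab)) * prod(colSums(tab)))

# Same value as the Pearson correlation of the raw vectors
all.equal(as.numeric(phi), cor(x, y))
```

So each cell of corr can be read as the phi coefficient between one bootcamp dummy and one job indicator.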

If you want, you can use a nicer plot for easier visualization and interpretation, for example:

library(ggcorrplot)
ggcorrplot(corr, lab = TRUE)

Note from the above visualization that, with my data, the correlation between the binary variable representing a data job and the binary variable representing the Data Science bootcamp is 0.1. That is a weak association, which is unsurprising for randomly generated data.

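You can also fit a regression model to find out whether a particular predictor (e.g., the bootcamp attended) is a significant predictor of the response (e.g., the job type). Because each job variable here is binary, logistic regression is usually a better fit than ordinary linear regression. Hope it answers your question.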
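For a binary response such as a job-type indicator, one common way to run such a regression is a logistic model fitted with glm(). The sketch below uses a made-up data frame toy with a hypothetical response column data_job, and the 0.7 and 0.3 probabilities are assumptions chosen only so that there is an effect to detect. Substitute your own data frame and column names.

```r
# Toy data with an assumed bootcamp effect, for illustration only
set.seed(1)
n <- 200
toy <- data.frame(
  Bootcamp = factor(sample(
    c("Cybersecurity", "Data Science", "Software Engineering"),
    n, replace = TRUE
  ))
)

# Assumption: Data Science attendees get a data job more often
p <- ifelse(toy$Bootcamp == "Data Science", 0.7, 0.3)
toy$data_job <- rbinom(n, size = 1, prob = p)

# Logistic regression of the binary job indicator on the bootcamp factor
fit <- glm(data_job ~ Bootcamp, data = toy, family = binomial)

# Coefficient table: estimates, standard errors, z values, p-values
summary(fit)$coefficients
```

The coefficients are log-odds relative to the first factor level (here Cybersecurity), and the p-value column indicates whether each bootcamp differs significantly from that baseline.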