I'm using the Vectorise
function on psych::phi2poly
, and then selecting certain values from the output using ifelse
. Unfortunately phi2poly
raises warnings whenever it sees a NA
. The warnings are irrelevant as the ifelse
handles NA
s without using phi2poly
. MWE below.
I'm reluctant to suppress all warnings from phi2poly
as that could mask other issues. Is there a clean way to handle this?
MWE:
library(psych)
r_biserial <- function(p, r_pb) {
# Jacobs' formula, r_b = sqrt(pq)/f(z_p) . r_pb:
sqrt(p * (1-p)) / dnorm(qnorm(p)) * r_pb
}
r_tetrachoric <- function(a, b, phi) {
# This causes a bucketload of warnings
Vectorize(phi2poly)(a, b, phi)
# This gets rid of the warnings but is a total hack
# Vectorize(phi2poly)(
# a |> replace_na(0.5),
# b |> replace_na(0.5),
# phi |> replace_na(0.5))
}
adjust_correlation <- function(p, q, cor) {
ifelse(is.na(p),
ifelse(is.na(q),
cor,
r_biserial(q, cor)),
ifelse(is.na(q),
r_biserial(p, cor),
r_tetrachoric(p, q, cor)))
}
adjust_correlation(
c(NA, NA, 0.5, 0.5),
c(NA, 0.6, NA, 0.6),
c(0.3, 0.3, 0.3, 0.3)
)
Edit: Speed is not an issue here. In the end I went with a modification of the accepted answer that I preferred (as it contained the phi2poly
issue more locally).
r_tetrachoric <- function(a, b, phi) {
# We cannot call phi2poly with NAs
call = !is.na(a) & !is.na(b)
result <- rep(NA_real_, length(a))
result[call] <- Vectorize(phi2poly)(phi[call], a[call], b[call])
result
}
adjust_correlation <- function(p, q, cor) {
case_when(
is.na(p) & is.na(q) ~ cor,
is.na(p) & !is.na(q) ~ r_biserial(q, cor),
!is.na(p) & is.na(q) ~ r_biserial(p, cor),
!is.na(p) & !is.na(q) ~ r_tetrachoric(p, q, cor)
)
}
Definitely avoid the ifelse
s. Make a data.frame
and fill a column successively with results using boolean vectors for sub-setting - definitely faster and much easier to read.
> library(psych)
>
> r_biserial <- function(p, r_pb) {sqrt(p*(1 - p))/dnorm(qnorm(p))*r_pb}
>
> adjust_correlation <- function(p, q, corr=.3) {
+ data <- data.frame(p, q, corr, res=NA_real_)
+ na_pq <- with(data, is.na(p) & is.na(q))
+ na_p <- with(data, is.na(p) & !is.na(q))
+ na_q <- with(data, !is.na(p) & is.na(q))
+ na_0 <- with(data, !is.na(p) & !is.na(q))
+ data$res[na_pq] <- corr
+ data$res[na_p] <- r_biserial(q[na_p], corr)
+ data$res[na_q] <- r_biserial(p[na_q], corr)
+ data$res[na_0] <- Vectorize(phi2poly)(p[na_0], q[na_0], corr)
+ data$res
+ }
>
> adjust_correlation(
+ p=c(NA, NA, 0.5, 0.5),
+ q=c(NA, 0.6, NA, 0.6)
+ )
[1] 0.3000000 0.3804121 0.3759942 0.8504852
I assumed you want corr
always be the same value, but you can still change to corr[<na_vector>]
to be able to use corr
as a vector in the arguments.