Update: the problem has been solved by restoring the original data before running the code.
With the help of several answered questions from this forum I wrote my code. However, it does not fully work as expected, and I would appreciate any hints and tips to solve the problem.
My goal is to create a new variable "v_edu_recoded" based on the variables "v_236", which contains the respondent's level of education, and "v_237", which contains the respondent's free-text specification if they chose '6' (meaning "other") in v_236. So the new variable v_edu_recoded should basically be the other two variables merged: it should have the same value as v_236, except when v_236 is '6', in which case it should be recoded to one of the other numbers depending on the answer (most people who chose "other" gave an answer that is already covered by the categories of v_236).
My problem is that only the ten recoded cases (those who chose 6 in v_236) appear in the output. The first part of my conditions (all 832 cases who chose 1-5) did not work and is returned as NA.
Any idea how to solve this? (I also tried it via "mutate", but the result was even worse.) Kind regards and thanks a lot for any help!
Here is my code:
dr_ma$v_edu_recoded <- with(dr_ma, ifelse(
(v_236 == '1' & v_237 == '-99' | v_236 == '6' & v_237 == 'Schüler'), '1', ifelse(
(v_236 == '2' & v_237 == '-99'), '2', ifelse(
(v_236 == '3' & v_237 == '-99'| v_236 == '6' & v_237 == 'Fachabitur'),'3', ifelse(
(v_236 == '4' & v_237 == '-99' | v_236 == '6' & v_237 == 'Verwaltungsfachwirt'), '4', ifelse(
(v_236 == '5' & v_237 == '-99'| v_236 == '6' & v_237 == 'Diplom'| v_236 == '6' & v_237 == 'Universität'),'5', ifelse(
(v_236 == '6' & v_237 == 'meister'|v_236 == '6' & v_237 == 'Meister'|v_236 == '6' & v_237 == 'Fachakademie'),'6',NA
)))))))
And here is my output summary:
> summary(dr_ma$v_edu_recoded)
   Length     Class      Mode
      842 character character
> frq(dr_ma$v_edu_recoded)
x <character>
# total N=842 valid N=10 mean=4.60 sd=1.58
Value | N | Raw % | Valid % | Cum. %
--------------------------------------
1 | 1 | 0.12 | 10 | 10
3 | 1 | 0.12 | 10 | 20
4 | 1 | 0.12 | 10 | 30
5 | 4 | 0.48 | 40 | 70
6 | 3 | 0.36 | 30 | 100
<NA> | 832 | 98.81 | <NA> | <NA>
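For reference, the recoding rule described above can also be sketched without nested ifelse calls. This is only a minimal base R sketch on toy data: the data frame `toy` and the lookup vector `other_map` are hypothetical names, and the values are made up rather than taken from dr_ma. It also differs from the code above in one respect: rows with v_236 in 1-5 keep their value regardless of what v_237 contains.

```r
# Toy data standing in for dr_ma (hypothetical values, not the real survey data)
toy <- data.frame(
  v_236 = c("1", "2", "5", "6", "6", "6"),
  v_237 = c("-99", "-99", "-99", "Fachabitur", "Diplom", "Meister"),
  stringsAsFactors = FALSE
)

# Lookup table for the free-text answers given under "other" (v_236 == "6"),
# using the same mapping as the ifelse conditions in the question
other_map <- c(
  "Schüler"             = "1",
  "Fachabitur"          = "3",
  "Verwaltungsfachwirt" = "4",
  "Diplom"              = "5",
  "Universität"         = "5",
  "meister"             = "6",
  "Meister"             = "6",
  "Fachakademie"        = "6"
)

# Start from v_236, then overwrite only the "other" rows.
# %in% is used instead of == so that NA in v_236 gives FALSE, not NA.
toy$v_edu_recoded <- toy$v_236
is_other <- toy$v_236 %in% "6"
toy$v_edu_recoded[is_other] <- unname(other_map[toy$v_237[is_other]])

print(toy)
```

Free-text answers that are not in the lookup come out as NA, which makes unexpected entries easy to spot with a frequency table.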
@CPak @caldwellst thank you for that super quick reply! I tried case_when, but I got the same result. Probably my conditions are not set up right, but I can't find what's wrong:
dr_ma$v_edu_recoded3 <- case_when (dr_ma$v_236 == 1 & dr_ma$v_237 == -99 | dr_ma$v_236 == 6 & dr_ma$v_237 == 'Schüler' ~1,
dr_ma$v_236 == 2 & dr_ma$v_237 == -99 ~ 2,
dr_ma$v_236 == 3 & dr_ma$v_237 == -99| dr_ma$v_236 == 6 & dr_ma$v_237 == 'Fachabitur' ~3,
dr_ma$v_236 == 4 & dr_ma$v_237 == -99 | dr_ma$v_236 == 6 & dr_ma$v_237 == 'Verwaltungsfachwirt' ~ 4,
dr_ma$v_236 == 5 & dr_ma$v_237 == -99| dr_ma$v_236 == 6 & dr_ma$v_237 == 'Diplom'| dr_ma$v_236 == '6' & dr_ma$v_237 == 'Universität' ~5,
dr_ma$v_236 == 6 & dr_ma$v_237 == 'meister'|dr_ma$v_236 == 6 & dr_ma$v_237 == 'Meister'|dr_ma$v_236 == '6' & dr_ma$v_237 == 'Fachakademie' ~6,TRUE~NA_real_)
summary(dr_ma$v_edu_recoded3)
frq(dr_ma$v_edu_recoded3)
> summary(dr_ma$v_edu_recoded3)
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
1.00 4.25 5.00 4.60 5.75 6.00 832
> frq(dr_ma$v_edu_recoded3)
x <numeric>
# total N=842 valid N=10 mean=4.60 sd=1.58
Value | N | Raw % | Valid % | Cum. %
--------------------------------------
1 | 1 | 0.12 | 10 | 10
3 | 1 | 0.12 | 10 | 20
4 | 1 | 0.12 | 10 | 30
5 | 4 | 0.48 | 40 | 70
6 | 3 | 0.36 | 30 | 100
<NA> | 832 | 98.81 | <NA> | <NA>
The problem was solved by restoring the data before running the code again. Running
dput(head(dr_ma, 10))
as proposed by @CPak showed that the original data had been messed up by my many earlier recoding attempts; resetting it to its initial state was the solution.
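In case it helps others, here is a rough sketch of how this kind of problem could be caught earlier: keep an untouched copy of the imported data, inspect the columns before recoding, and reset the working copy if it has drifted. The names `dr_raw` and `dr_work` are hypothetical, and the "damage" step is only a made-up illustration, since I do not know exactly which of my earlier attempts altered the data.

```r
# Hypothetical stand-in for the freshly imported data
dr_raw <- data.frame(
  v_236 = c(1, 3, 6),
  v_237 = c("-99", "-99", "Diplom"),
  stringsAsFactors = FALSE
)

# Work on a copy so the imported data stays untouched
dr_work <- dr_raw

# Made-up example of an earlier recoding attempt that altered a column
dr_work$v_237[dr_work$v_237 == "-99"] <- NA

# Inspect types and value combinations before writing any conditions
str(dr_work[c("v_236", "v_237")])
print(table(dr_work$v_236, dr_work$v_237, useNA = "ifany"))

# Check whether the working copy still matches the imported data
damaged <- !identical(dr_work, dr_raw)

# If not, reset it to the initial state before recoding again
if (damaged) dr_work <- dr_raw
```

Assigning the recoded values to a new column of the working copy, rather than overwriting v_236 or v_237, keeps the source columns available for checks like these.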