I'm trying to remove rows from my dataset according to specific values in 2 columns, but it seems that I'm setting up the condition the wrong way.
Here is a sample of the dataset:
nquest nord sex anasc ireg eta staciv studio ID tpens
<int> <int> <dbl> <int> <int> <int> <int> <int> <int> <int>
1 173 1 1 1948 18 72 3 5 1 1800
2 2886 1 1 1949 13 71 1 5 2 1211
3 2886 2 0 1952 13 68 1 6 3 2100
4 5416 1 0 1958 8 62 3 3 4 700
5 7886 1 1 1950 9 70 1 5 5 2000
6 20297 1 1 1960 5 60 1 3 6 1200
7 20711 2 1 1944 4 76 1 2 7 2000
8 22169 1 0 1944 15 76 4 2 8 600
9 22276 1 1 1949 8 71 2 5 9 1200
10 22286 1 1 1950 8 70 1 2 10 850
11 22286 2 0 1956 8 64 1 2 11 650
12 22657 1 0 1951 13 69 1 7 12 2400
13 22657 2 1 1946 16 74 1 5 13 1500
14 23490 1 0 1937 5 83 2 5 14 1400
15 24147 1 1 1948 4 72 1 7 15 1730
16 24147 2 0 1958 4 62 1 5 16 1600
17 24853 1 1 1957 13 63 1 3 17 2180
18 27238 1 1 1952 12 68 1 3 19 1050
19 27730 1 1 1939 20 81 1 2 20 1470
20 27734 1 1 1947 20 73 1 2 21 1159
I want to get a dataset that excludes all the rows where the value of tpens is greater than 2000 when ireg == 13 (I need to keep all the other values of tpens and ireg when ireg is different from 13).
I have tried
new <- subset(data, data$ireg == 13 & data$tpens <= 2000)
But it is wrong: although the tpens values are now no greater than 2000, it gives me a dataset with only ireg == 13. I actually need to keep all the other values of ireg (and the tpens values linked to them).
I also tried
new <- data[!(data$ireg == 13 & data$tpens <= 2000), ]
But that doesn't work either. Even using filter from dplyr, I can't seem to set the conditions properly.
How can I remove the rows that satisfy specific conditions on 2 columns at the same time, without deleting everything else?
I hope I was able to explain myself
subset or filter keeps the rows where the condition is met. So you want the inverse selection instead, negating the condition that describes the rows to remove:
filter(data, !(ireg == 13 & tpens > 2000))
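The same idea can be sketched in base R, without dplyr. This is only an illustration on a handful of rows copied from the question's sample (just the nquest, ireg and tpens columns); the names drop, new and new2 are made up for the example. It flags the rows where both conditions hold and keeps everything else, then shows the equivalent "keep" condition obtained through De Morgan's law.

```r
# A few rows taken from the question's sample (only the relevant columns)
data <- data.frame(
  nquest = c(173, 2886, 2886, 22657, 24853),
  ireg   = c(18, 13, 13, 13, 13),
  tpens  = c(1800, 1211, 2100, 2400, 2180)
)

# Flag the rows to remove: ireg is 13 AND tpens is above 2000
drop <- data$ireg == 13 & data$tpens > 2000

# Keep every row that is not flagged
new <- data[!drop, ]

# Equivalent condition by De Morgan's law:
# keep rows where ireg is not 13 OR tpens is at most 2000
new2 <- subset(data, ireg != 13 | tpens <= 2000)

print(new)
```

Here the three rows with ireg == 13 and tpens above 2000 are removed, while the row with ireg == 18 and the row with ireg == 13 and tpens of 1211 are kept. The two original attempts failed because the first one kept only rows with ireg == 13, and the second one negated the wrong comparison (tpens <= 2000 instead of tpens > 2000).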