Before we get to my question please note that I purposefully did not include example data in this post, as my problem occurs on my full dataset and subsets of it. I have two dataset with client data in the following format.
Have_1
+------------+------------+------+
| dt | dt_next | id |
+------------+------------+------+
| 30.09.2010 | 31.10.2010 | 0001 |
+------------+------------+------+
| 31.10.2010 | 30.11.2010 | 0001 |
+------------+------------+------+
| 30.11.2010 | 31.12.2010 | 0001 |
+------------+------------+------+
| 31.12.2010 | 31.01.2011 | 0001 |
+------------+------------+------+
Have_2
+------+-------+------------+------------+
| id | event | start_date | end_date |
+------+-------+------------+------------+
| 0001 | 1 | 31.10.2010 | 30.11.2010 |
+------+-------+------------+------------+
| 0001 | 2 | 31.10.2010 | 31.12.2010 |
+------+-------+------------+------------+
I am trying to use the IFN function to put 1-0 flags in my dataset by using the following logic:
Proc SQL;
Create table want as
Select a.*
,ifn(a.id in (select id from have_2 where a.dt <= end_date and start_date <= a.dt_next), 1, 0) as flg_1
,ifn(a.id in (select id from have_2 where a.dt <= end_date and start_date <= a.dt), 1, 0) as flg_2
From have_1 as a;
Quit;
The code works fine if I take only one client, however, if I take the full dataset (or even a small subset of it such as only 10 clients) then the code gets stuck in the sense that the process begins without error but simply never finishes. I tried setting indexes to both my input datasets, without success. Are there any peculiarities to the IFN function, which can make it behave that way?
So why not just join and take the max of all events if any event's dates fall into those periods? That should eliminate the need to do two subqueries for every observation in HAVE1.
proc sql;
create table want2 as
select a.id
, a.dt
, a.dt_next
, max(a.dt <= b.end_date and b.start_date <= a.dt_next) as flg1
, max(a.dt <= b.end_date and b.start_date <= a.dt) as flg2
from have1 a
left join have2 b
on a.id = b.id
group by 1,2,3
;
quit;
Note the issue is with the subqueries, not the IFN() function call. Also there is no need for IFN() function here. SAS evaluates boolean expressions to 1 or 0. So the expression a=b
returns the same result as IFN(a=b,1,0)
returns.