I have a database of people with information that in which households they are included. How can I group in one statement households in which lives only woman or only a specific group of people?
Example:
So from this example I would like to get only one row (B) and the count that there lives two people.
To better understand I can do it when I want households where lives only people under 20 years old. The following code works:
proc sql;
create table new_tab as
select household, count(*) as freq
from table
group by household
having max(gender) < 21;
quit;
But I couldnt find a similar code when it is not a numeric value. Can someone please give an advise how could I group only households in which lives only woman?
Consider computing an aggregate categorical value and counts that can be used in later analysis.
Example:
select household,
case
when min(gender) ne max(gender) then 'mixed'
else min(gender)
end as gender
, sum (gender eq 'woman') as woman_count
, sum (gender eq 'man') as man_count
, sum (missing(gender)) as missingG_count
, sum (not missing(gender) and gender not in ('woman','man')) as otherG_count
from have
group by household