I have a SAS program that uses PROC FORMAT to define what are effectively new data types (formats) that get applied to columns. I need to rewrite it in Hive, Pig, or even Unix shell, and I'm looking for ideas on how to do that. Any suggestions would be welcome.
Here is an example:
PROC FORMAT;
  VALUE $ABCD
    '3000',
    '3001',
    '8816' - '8817',
    '8817' - '8815' = 'Y'
    OTHER = 'N';
PUT(DDDD,$ABCD.) = 'Y'
Proc Format is just an efficient way to write if/then logic, yes?
In SQL, you would use a CASE expression:
case
  when <column> between 3000 and 3001 then 'Y'
  when <column> between 8816 and 8817 then 'Y'
  when <column> between 8815 and 8817 then 'Y'
  else 'N'
end
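
If you want to sanity-check the mapping before porting it to Hive, here is a minimal sketch that runs the same kind of CASE expression through Python's built-in sqlite3 module. A few assumptions on my part: the table name `codes` and the helper `flag_codes` are made up for illustration, the column is treated as character data (since `$ABCD` is a character format), and the reversed range `'8817' - '8815'` is read as 8815 through 8817. The CASE syntax itself is standard SQL, so it should carry over to Hive largely unchanged, but I have only run it against SQLite.

```python
import sqlite3

# Hypothetical table (codes) and column (dddd) names, used only for illustration.
# The values are compared as strings, mirroring the character format $ABCD.
CASE_SQL = """
SELECT dddd,
       CASE
           WHEN dddd IN ('3000', '3001') THEN 'Y'
           WHEN dddd BETWEEN '8815' AND '8817' THEN 'Y'  -- assumed intent of the ranges
           ELSE 'N'
       END AS flag
FROM codes
"""


def flag_codes(values):
    """Return {code: 'Y' or 'N'} by running the CASE expression in an in-memory SQLite DB."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE codes (dddd TEXT)")
        conn.executemany("INSERT INTO codes VALUES (?)", [(v,) for v in values])
        return dict(conn.execute(CASE_SQL).fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    for code, flag in flag_codes(["3000", "3001", "3002", "8815", "8817", "9999"]).items():
        print(code, flag)
```

The equivalent of `PUT(DDDD,$ABCD.) = 'Y'` would then be a filter on that derived `flag` column, or the CASE expression placed directly in the WHERE clause.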