I have the following sample data to read into SAS
2012-05-0317:36:00NYA
2012-05-0410:29:00SNW
2012-05-2418:45:00NYA
2012-05-2922:24:00NSL
2012-05-3107:26:00DEN
2012-05-2606:10:00PHX
2012-05-0202:30:00FTW
2012-05-0220:45:00HOB
2012-05-0103:01:00HGR
2012-05-0120:30:00RCH
2012-05-1112:00:00NAS
However, there is a strange problem bothering me. Here is my first try.
data test;
informat DT yymmdd10.
TM $TIME8.
orig $3.
;
format DT yymmddd10.
TM TIME8.
orig $3.
;
input
@1 DT_temp
@11 TM_temp
@19 orig
;
datalines;
2012-05-0317:36:00NYA
2012-05-0410:29:00SNW
2012-05-2418:45:00NYA
2012-05-2922:24:00NSL
2012-05-3107:26:00DEN
2012-05-2606:10:00PHX
2012-05-0202:30:00FTW
2012-05-0220:45:00HOB
2012-05-0103:01:00HGR
2012-05-0120:30:00RCH
2012-05-1112:00:00NAS
run;
The result shows
DT TM orig
. . NYA
. . SNW
. . NYA
. . NSL
. . DEN
. . PHX
. . FTW
. . HOB
. . HGR
. . RCH
. . NAS
This means the date and time are not read correctly. A work around I have right now is to read everything as string first and then convert it to date and time respectively.
data test;
informat DT_temp $10.
TM_temp $8.
orig $3.
;
format DT yymmddd10.
TM TIME8.
orig $3.
;
input
@1 DT_temp
@11 TM_temp
@19 orig
;
DT=input(strip(DT_temp),yymmdd10.);
TM=input(strip(TM_temp),time8.);
drop DT_temp TM_temp;
datalines;
2012-05-0317:36:00NYA
2012-05-0410:29:00SNW
2012-05-2418:45:00NYA
2012-05-2922:24:00NSL
2012-05-3107:26:00DEN
2012-05-2606:10:00PHX
2012-05-0202:30:00FTW
2012-05-0220:45:00HOB
2012-05-0103:01:00HGR
2012-05-0120:30:00RCH
2012-05-1112:00:00NAS
run;
In this way, everything gets the correct format.
orig DT TM
NYA 2012-05-03 17:36:00
SNW 2012-05-04 10:29:00
NYA 2012-05-24 18:45:00
NSL 2012-05-29 22:24:00
DEN 2012-05-31 7:26:00
PHX 2012-05-26 6:10:00
FTW 2012-05-02 2:30:00
HOB 2012-05-02 20:45:00
HGR 2012-05-01 3:01:00
RCH 2012-05-01 20:30:00
NAS 2012-05-11 12:00:00
Basically, these two methods used the same informat. I was wondering why the first method does not work. Appreciate for any kind of help. Thank you very much.
Your "first try" code has a couple errors, but I'm guessing they were introduced while writing the question.
Because you are using column-oriented input
, you need to specify the format to be used for each variable. Here is a corrected version:
data test;
informat DT yymmdd10.
TM TIME8.
orig $3.
;
format DT yymmddd10.
TM TIME8.
orig $3.
;
input
@1 DT yymmdd10.
@11 TM TIME8.
@19 orig $3.
;
datalines;
2012-05-0317:36:00NYA
2012-05-0410:29:00SNW
2012-05-2418:45:00NYA
2012-05-2922:24:00NSL
2012-05-3107:26:00DEN
2012-05-2606:10:00PHX
2012-05-0202:30:00FTW
2012-05-0220:45:00HOB
2012-05-0103:01:00HGR
2012-05-0120:30:00RCH
2012-05-1112:00:00NAS
run;