I use the functions countw (to count words) and scan (to extract words) a lot to analyse full file names. (For those interested, I typically use FILENAME docDir PIPE "dir ""&docRoot"" /B/S";)
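For readers who have not used a pipe fileref this way, here is a minimal sketch of how such a listing could be read and picked apart in a DATA step. The value of docRoot, the data set name FILES, and the variable names and lengths are illustrative assumptions, and the dir command implies a Windows host.

```sas
/* Hypothetical root folder; replace with a real path */
%let docRoot = C:\docs;

/* dir /B/S prints one full path per line (Windows) */
filename docDir pipe "dir ""&docRoot"" /B/S";

data FILES;
    infile docDir truncover;
    length fullName $260 fileName $100;
    /* formatted input keeps embedded blanks in the path */
    input fullName $260.;
    /* number of backslash-delimited path components */
    depth = countw(fullName, '\');
    /* last component, i.e. the file name */
    fileName = scan(fullName, -1, '\');
run;
```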
With traditional SAS, countw and scan work the same way for both UNIX-style (slash) and Windows-style (backslash) paths:
data OLD_SCHOOL;
format logic withSlash withBack secondSlash secondBack $20.;
logic = 'OLD_SCHOOL';
withSlash = 'Delimited/With/Slash';
wordsSlash = countw(withSlash, '/');
secondSlash = scan(withSlash, 2, '/');
withBack = 'Delimited\With\Back';
wordsBack = countw(withBack, '\');
secondBack = scan(withBack, 2, '\');
worksTheSame = wordsSlash eq wordsBack and secondSlash eq secondBack;
put _all_;
run;
results in
withSlash=Delimited/With/Slash secondSlash=With wordsSlash=3
withBack=Delimited\With\Back secondBack=With wordsBack=3
worksTheSame=1
Using the newer DS2 syntax, scan and countw handle the backslash differently:
proc ds2;
data DS2_SCHOOL / overwrite=yes;
dcl double wordsSlash wordsBack worksTheSame;
dcl char(20) logic withSlash withBack secondSlash secondBack;
method init();
logic = 'DS2_SCHOOL';
withSlash = 'Delimited/With/Slash';
wordsSlash = countw(withSlash, '/');
secondSlash = scan(withSlash, 2, '/');
withBack = 'Delimited\With\Back';
wordsBack = countw(withBack, '\');
secondBack = scan(withBack, 2, '\');
worksTheSame = (wordsSlash eq wordsBack) and (secondSlash eq secondBack);
end;
enddata;
run;
quit;
data BOTH_SCHOOLS;
set OLD_SCHOOL DS2_SCHOOL;
run;
results in
withSlash=Delimited/With/Slash secondSlash=With wordsSlash=3
withBack=Delimited\With\Back secondBack= wordsBack=1
worksTheSame=0
Is there a good reason for this, or should I report it as a bug to SAS?
(There might be a link with the role of backslash in regular expressions.)
I verified this in 9.3 (which, annoyingly, does not support overwrite=yes). There, the code only behaves as expected with doubled backslashes:
proc ds2;
data DS2_SCHOOL ;
dcl double wordsSlash wordsBack worksTheSame;
dcl char(20) logic withSlash withBack secondSlash secondBack;
method init();
logic = 'DS2_SCHOOL';
withSlash = 'Delimited/With/Slash';
wordsSlash = countw(withSlash, '/');
secondSlash = scan(withSlash, 2, '/');
withBack = 'Delimited\\With\\Back';
wordsBack = countw(withBack, '\\');
secondBack = scan(withBack, 2, '\\');
worksTheSame = (wordsSlash eq wordsBack) and (secondSlash eq secondBack);
end;
enddata;
run;
quit;
The backslash does seem to act as an escape character here: even in your original string you need to double it.
This is no longer the case as of 9.4 TS1M3, so it is unclear in which release between 9.3 TS1M2 and 9.4 TS1M3 this was changed or fixed, and unfortunately none of the change logs mention it.
According to the comments and further verification, it appears to have been changed or fixed in 9.4 TS1M2 specifically.
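Since the change logs are silent on this, probing your own installation may be the quickest check. The following is a minimal sketch that reuses the test string from the question with single backslashes; the use of data _null_ and the put statement is my own choice for illustration, not something taken from the question.

```sas
proc ds2;
data _null_;
    dcl double wordsBack;
    dcl char(20) secondBack;
    method init();
        /* single backslashes, exactly as in the question */
        wordsBack  = countw('Delimited\With\Back', '\');
        secondBack = scan('Delimited\With\Back', 2, '\');
        /* write both results to the log */
        put wordsBack= secondBack=;
    end;
enddata;
run;
quit;
```

Going by the results reported in the question, a release with the DATA step behaviour should log wordsBack=3 and secondBack=With, while an affected release should log wordsBack=1 and a blank secondBack.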