I would like to insert a line feed before the keyword 'data-base-url', but only where it doesn't already have one.
<html>
<et1>
<a data-linked-resource-type="userinfo" data-base-url="https://url.com/c">USERNAME 1</a>
<td class="conTd">
INFO 1
</td>
</et1>
<et2>
<a data-linked-resource-type="userinfo"
data-base-url="https://url.com/c1">USERNAME 2</a>
<td class="conTd">
INFO 2
</td>
</et2>
<et3>
<a data-linked-resource-type="userinfo"
data-base-url=
"https://url.com/c2">USERNAME 3</a>
<td class="conTd">
INFO 3
</td>
</et3>
</html>
/* data program */
data inp;
infile "c:/tmp/output.txt";
input @'data-base-url=' user_info $30000.
@'<td class="conTd">' details $30000.;
run;
/* data program ends */
The et3 block is the required pattern. If you run the above program against the input file, only the et3 block is read correctly into the user_info and details columns. I would like to add the line feed in the first two blocks so that they produce the desired output as well. Thanks in advance.
Regards, AKS
Here is my solution, which is based on your output dataset inp
rather than on your question per se: with this solution there is no need to modify your input file.
Basically, you read every line of your input file as a single SAS row and manipulate the data from there. Adjust the record length at your convenience.
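data inp;
/* read each line of the file as one row of up to 200 characters */
infile "/sascr/user/me/output.txt" truncover lrecl=200;
input string $200. ;
/* lstr holds the previous line */
lstr = lag(string);
/* the line after the <td> line holds the details */
if lstr='<td class="conTd">' then details = string;
/* the line before the <td> line holds the URL and user name */
if string='<td class="conTd">' then _info = lstr;
/* _info was set on the previous row, so fetch it with lag and keep the text after the last '=' */
user_info = scan(lag(_info),-1,'=');
/* write a row only when details holds more than one character */
if length(strip(details))>1 then output;
keep details user_info;
run;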
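If you would still rather do what the question literally asks, a minimal sketch along these lines might work. It copies the file and starts a new line before 'data-base-url' whenever the keyword is preceded by other text on the same line. The output path c:/tmp/output_fixed.txt and the variable names pos, split, head and tail are my own assumptions, only the first occurrence of the keyword on each line is handled, and I have not run it against your real file.

```sas
/* Sketch only: the output file name is hypothetical */
data _null_;
  infile "c:/tmp/output.txt" lrecl=32767;
  file "c:/tmp/output_fixed.txt" lrecl=32767;
  length head tail $32767;
  /* load the current record into the _infile_ buffer */
  input;
  /* position of the keyword on this line, 0 if absent */
  pos = find(_infile_, 'data-base-url');
  /* split only when non-blank text precedes the keyword */
  split = 0;
  if pos > 1 then split = not missing(substr(_infile_, 1, pos - 1));
  if split then do;
    head = substr(_infile_, 1, pos - 1);
    tail = substr(_infile_, pos);
    put head;   /* text before the keyword */
    put tail;   /* keyword onwards, now on its own line */
  end;
  else put _infile_;   /* copy the line unchanged */
run;
```

On the sample above, only the et1 line should be split, since et2 and et3 already have the keyword at the start of a line. Note that list-style put drops leading and trailing blanks from head and tail.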
Hope this helps.