Search code examples
sasenterprise-guidesas-dis

Removing quotes and spaces in SAS dataset


I am working in SAS EG and DI, facing a very peculiar problem.

When I look into a column of a dataset in SAS DI Studio or EG, it is appearing fine. But when I paste the data into notepad, some quotes and spaces are appearing.

The data which I am seeing in EG: without spaces

But the same data when copied into Notepad,

extra quotes and spaces are appearing like this(in 6th row):with spaces and quotes

I found this problem when I am using this field as a key in a join, the other related column values for 6th row are not going to the output as the match is failing for that 6th record.

I tried many things like tranwrd,dequote and compress but none of them is changing my result.

Can someone please help in understanding what the problem is and how can this be solved.


Solution

  • Take a look at what is in the column so that you can decide how to handle it. This query will show you both the character string and the Hexadecimal representation of the string.

    proc sql;
      select postcode,put(trim(postcode),$hex.) as hexcode,count(*) as nobs
        from x
        group by 1,2
      ;
    quit;
    

    So if you see hex characters like 0A, 0D, A0, 08 or other non-printable codes then you can figure out what is happening.

    So you might see that you have POSTCODE='LS5 3BT' with HEXCODE='4C533520334254' for most of the records. But perhaps have some that look like the POSTCODE='LS5 3BT', but the value of HEXCODE is something like '0A4C533520334254' which would mean that you have a linefeed character at the beginning of the string. Or perhaps instead of space ('20'X) you have a tab ('09'X) in the middle of the string.