sas fcmp

Is there an equivalent of the "of some_array{*}" form for use in SAS functions?


Our database predates our database software having good Unicode support, and in its place has a pseudo-base64 encoding which it uses to store UTF-16 characters in an ASCII field. I am writing a function to convert this type of field into straight UTF-8 within SAS.

The function loops through the string, converting each set of three ASCII characters into a Unicode character and placing it in an array. When experimenting with code in a data step I had used cat(of final{*}) to convert the array into a string, but the same code does not appear to be valid within a function.

I am currently building the string inside the loop with collate = trim(collate)!!trim(final{i}) and an arbitrary-length collate variable, but I would like to produce it directly from the array, or at least set the size of collate based on the length of the input string.

I've included a pastebin of the data and function here.

Edit: The version of SAS I was using is 9.3


Solution

  • The same code is valid in a function in SAS 9.4 TS1M3; it may not be in earlier versions (significant changes were made to how arrays are handled in FCMP in 9.4 and in maintenance releases TS1M2 and TS1M3).

    However, this doesn't really solve your arbitrary length problem; when I run your function with

            outtext = cat(of final{*});
            return (outtext);
    

    I get... 1 character! And when I run

            return(cats(of final{*}));
    

    the output is:

    Obs text_enc finaltext 
    1 ABCABlABjABhAB1ABzABlAAgABVABUABGAA4AAgABpABzAAgABoABhAByABk BecauseU 
    2 ABTABpABtABwABsABlAByAAgABsABpABrABlAAgAB0ABoABpABz          Simplerl 
    3 ABJABvAAgABJABvAAgABCAByABvABtABpABvABz                      IoIoBrom 
    

    which is a bit better (cats trims for you), but I still only get 8 characters. That's because 8 characters is the default length in SAS for an undeclared character variable. Expand the length (using a length statement for outtext) and you get:

    Obs text_enc finaltext 
    1 ABCABlABjABhAB1ABzABlAAgABVABUABGAA4AAgABpABzAAgABoABhAByABk  BecauseUTF8ishard 
    2 ABTABpABtABwABsABlAByAAgABsABpABrABlAAgAB0ABoABpABz           Simplerlikethis 
    3 ABJABvAAgABJABvAAgABCAByABvABtABpABvABz                       IoIoBromios 
    

    You'll still need to define whatever length you need, then. FCMP doesn't, as far as I know, allow for a way to have an undefined-length string; you need to define the default (and maximum) length for the string you're going to return. The user is welcome to define a shorter length, and should, when it's appropriate.
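
    As a rough illustration of where the lengths get declared, here is a minimal sketch. It is not your function: join_chars, the library work.funcs.demo, the data sets have and want, and the lengths 200, 4 and 60 are all made up for the example, and the loop just copies one input character per array element as a stand-in for your decoding step. I have not run this exact code, and as noted above the cats(of final{*}) call may not compile before 9.4 TS1M3.

            proc fcmp outlib=work.funcs.demo;
                /* "$ 200" is the default (and maximum) length of the returned string */
                function join_chars(text $) $ 200;
                    length outtext $ 200;    /* without this, outtext defaults to 8 */
                    array final{200} $ 4;    /* one output character per element */

                    /* stand-in for the real decoding loop */
                    do i = 1 to dim(final);
                        if i <= length(text) then final{i} = substr(text, i, 1);
                        else final{i} = ' ';
                    end;

                    outtext = cats(of final{*});    /* cats strips blanks from each element */
                    return (outtext);
                endsub;
            run;

            options cmplib=work.funcs;

            data want;
                length finaltext $ 60;    /* caller picks a shorter length when appropriate */
                set have;
                finaltext = join_chars(text_enc);
            run;

    The two lengths inside the function are the ones that matter for truncation: the one on the function statement caps what can be returned, and the one on outtext caps what cats can build. The length statement in the data step is the "user defines a shorter length" part.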