unicode latex ibm-doors

How to search for any Unicode symbol in a character string?


I've got an existing DOORS module that has some rich text entries, and these entries contain symbols such as 'curly' quotes. I'm trying to upgrade a DXL macro that exports a LaTeX source file. The problem is that these high-numbered symbols are not considered "standard UTF-8" by Texmaker's import function (and in any case probably won't be processed by XeLaTeX or other converters). I can't simply use the UnicodeString functions in DXL because those break the rest of the rich text, and apparently charOf(decimal_number_code) only works over the basic set of characters, i.e. those below some numeric code value. For example, charOf(8217) should create a right curly single quote, but when I tried code along the lines of

if (charOf(8217) == one_char)

I never get a match. I did copy the curly quote from the DOORS module and verified with an online Unicode analyzer that it was definitely Unicode decimal value 8217.

So, what am I missing here? I just want to be able to detect any symbol character, identify it correctly, and then replace it with, e.g., \textquoteright in the output stream.

My overall setup works for lower-numbered characters, since this works (c is a single character pulled from a string):

    thedeg = charOf(176)
    if (thedeg == c)
    {
        temp += "$\\degree$"
    }

Solution

  • Got some help from the DXL coding experts over at the IBM forums.

    Quoting the important part (there are some useful code snippets there as well):

    Hey, you are right it seems intOf(char) and charOf(int) both do some modulo 256 and therefore cut anything above that off. Try:

    int i = 8217;
    char c = addr_(i);
    print c;
    

    Which then allows comparison of c with any input char.
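
    For what it's worth, here is a rough sketch of how that trick might slot into the export loop from the question. The helper names latexFor and escapeForLatex are made up for illustration and are not part of DXL. It assumes, as the reply above suggests, that a char built with addr_ compares equal to the character read from the string, and that indexing the string yields one whole character at a time. I have not tested it against rich text, so treat it as a starting point only.

```dxl
// Sketch only: latexFor and escapeForLatex are hypothetical helpers.
// addr_() is the cast suggested in the forum reply; the assumption is that
// it produces a char that compares equal to the one read from the string.
int quoteCode = 8217
char rightQuote = addr_(quoteCode)   // Unicode decimal 8217, right curly single quote
char degreeSign = charOf(176)        // low code value, so charOf works as in the question

// Return the LaTeX replacement for one character, or the character itself.
string latexFor(char c)
{
    if (c == rightQuote)
    {
        return "\\textquoteright{}"
    }
    if (c == degreeSign)
    {
        return "$\\degree$"
    }
    return c ""                      // char-to-string conversion; passes other chars through
}

// Walk the input one character at a time and build the output in a Buffer.
string escapeForLatex(string s)
{
    Buffer out = create
    int i
    for (i = 0; i < length(s); i++)
    {
        out += latexFor(s[i])
    }
    string result = stringOf(out)
    delete out
    return result
}
```

    Further symbols would be added as extra addr_ constants and extra branches in latexFor. Keeping the mapping in one function means the export loop itself does not change when a new symbol turns up.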