How to encode RDF N-Triples string literals?


The specification for RDF N-Triples states that string literals must be encoded.

https://www.w3.org/TR/n-triples/#grammar-production-STRING_LITERAL_QUOTE

Does this "encoding" have a name I can look up to use it in my programming language? If not, what does it mean in practice?


Solution

  • The grammar productions that you need are right in the document that you linked to:

    [9]     STRING_LITERAL_QUOTE ::= '"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'
    [141s]  BLANK_NODE_LABEL     ::= '_:' (PN_CHARS_U | [0-9]) ((PN_CHARS | '.')* PN_CHARS)?
    [10]    UCHAR                ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX
    [153s]  ECHAR                ::= '\' [tbnrf"'\]
    

    This means that a string literal begins and ends with a double quote ("). Inside of the double quotes, you can have:

    • any character except #x22 (the double quote, "), #x5C (the backslash, \), #xA (line feed), and #xD (carriage return). These four cannot appear raw and must be written using one of the escapes below;
    • a Unicode character given by its code point, written as \u followed by four hex digits, or \U followed by eight hex digits; or
    • an escape character, which is a \ followed by one of t, b, n, r, f, ", ', and \, representing tab, backspace, line feed, carriage return, form feed, double quote, single quote, and backslash respectively.
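
    In practice, the rules above come down to a small escaping function. Here is a minimal sketch in Python, not taken from any RDF library; the function name escape_ntriples_literal is made up for illustration. Only the four excluded characters strictly have to be escaped. Escaping tab, backspace, form feed, and the remaining control characters as well is my own choice for readability, not something the grammar requires.

```python
def escape_ntriples_literal(value: str) -> str:
    """Return `value` wrapped in double quotes as an N-Triples string literal.

    Hypothetical helper for illustration; not part of any RDF library.
    """
    # ECHAR escapes. The first four entries are mandatory per the grammar;
    # the rest are optional and included here as a design choice.
    echar = {
        '"': '\\"',
        "\\": "\\\\",
        "\n": "\\n",
        "\r": "\\r",
        "\t": "\\t",
        "\b": "\\b",
        "\f": "\\f",
    }
    out = []
    for ch in value:
        if ch in echar:
            out.append(echar[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Assumption: write other control characters as UCHAR (\uXXXX)
            # so the output stays printable. The grammar allows them raw.
            out.append(f"\\u{ord(ch):04X}")
        else:
            # Everything else, including non-ASCII text, is allowed as-is.
            out.append(ch)
    return '"' + "".join(out) + '"'


if __name__ == "__main__":
    print(escape_ntriples_literal('He said "hi"\nbye'))
```

    Non-ASCII characters pass through unchanged here because the grammar permits them directly; if your output has to be pure ASCII, you could route them through the \u / \U form instead.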