RDF allows typed literals to specify the datatype of string values. This is usually used for XML Schema datatypes such as xsd:integer and xsd:date, e.g.
<https://example.org/> dc:created "1999-12-17"^^xsd:date.
How can typed literals in RDF be used to specify a datatype defined by (or extending) the IANA MIME Types registry? I'd like to do something like this:
<https://example.org/>
dc:description "I love cookies!" ;
dc:description "I <em>love</em> cookies!"^^<text/html> ;
dc:description "I *love* cookies!"^^<text/x-markdown> ;
dc:description "I \\emph{love} cookies!"^^<application/x-tex> .
But plain MIME types are not valid datatype IRIs. Does an official URI namespace exist for MIME types, and have such URIs been used for RDF typed literals?
There is no official way to use MIME types as RDF (or XML Schema) datatypes, because it is ambiguous what such a thing would mean: MIME types describe a sequence of bytes, while RDF literals are always a sequence of Unicode characters. You'd have to define how the lexical value is converted to a byte sequence and then interpreted, and for non-textual formats you would likely have to start from xsd:base64Binary or xsd:hexBinary. In addition, some of your examples are only fragments, not documents valid on their own, so it is worth looking at other options first.
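Before turning to those options, the byte/character mismatch can be made concrete with a small Python sketch (Python rather than Turtle, because the point is about bytes). The helper name to_binary_lexical_forms is made up for illustration and is not part of any RDF or XSD tooling; it only assumes that you pick a charset yourself and then write the bytes as an xsd:base64Binary or xsd:hexBinary lexical form.

```python
import base64


def to_binary_lexical_forms(text: str, charset: str) -> dict:
    """Encode text under the given charset and return the lexical forms
    that the two XSD binary datatypes would use for those bytes."""
    data = text.encode(charset)
    return {
        "xsd:base64Binary": base64.b64encode(data).decode("ascii"),
        # Upper-case hex digits are the canonical form for xsd:hexBinary.
        "xsd:hexBinary": data.hex().upper(),
    }


# The same single character yields different bytes per charset,
# which is why a bare MIME type on a Unicode literal is underspecified.
for charset in ("utf-8", "latin-1"):
    print(charset, to_binary_lexical_forms("é", charset))
```

The two printed results differ, so a datatype based on a MIME type would also have to fix the charset (or carry it as a parameter) before the literal had a well-defined value.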
I'd recommend first looking for concrete identifiers for the formats you want to support, but even then you will likely have several options:
- rdf:XMLLiteral, rdf:HTML, and rdf:JSON are official and should be used for valid literals in these languages.
- xtypes:Fragment-HTML, xtypes:Fragment-Markdown, or xtypes:Fragment-LaTeX. What might be a bit ambiguous is what exactly a "fragment" means here. I assume it means that something like '<tag attr="a">'^^xtypes:Fragment-XML is valid, while '<tag attr="a">'^^rdf:XMLLiteral is not (it must be self-contained, akin to application/xml-external-parsed-entity).
- http://dbpedia.org/resource/Markdown, but these are not explicitly defined as datatypes, so some processors might have a hard time trying to find their definition.
- tag:yaml.org,2002:yaml for YAML.
- urn:publicid:%2B:ISBN+0-201-13448-9;Knuth:NOTATION+The+TeXbook:EN, but these are not produced anymore (you can find a collection of them here).

I would not recommend using any other URI scheme than http(s), however, since at least humans should be able to find out what it means through HTTP.
If you want to have URIs for MIME types (but not necessarily used as datatypes), you could use something like uri4uri to arrive at RDF descriptions of MIME types, for example https://w3id.org/uri4uri/mime/text/markdown. Note that the charset parameter is required for Markdown, so the URI should really be https://w3id.org/uri4uri/mime/text/markdown;charset=utf-8 (parameters are supported too).
You could also refer to the IANA registration document, such as https://www.iana.org/assignments/media-types/text/markdown, but that's just a document and not all MIME types have one. This URL pattern could also be used for non-standard MIME types, such as https://www.iana.org/assignments/media-types/text/yaml, but these will not be resolvable unless officially registered.
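Both URL patterns are mechanical, so they can be generated from a MIME type string. The following Python sketch only reproduces the two patterns quoted above; the function names are hypothetical, and I have not checked how uri4uri expects unusual parameter values to be escaped, so treat the parameter handling as an assumption.

```python
def uri4uri_mime_uri(mime_type: str, **params: str) -> str:
    """Build a uri4uri-style URI for a MIME type, appending parameters
    such as charset in the ';name=value' form used in the example above."""
    suffix = "".join(f";{name}={value}" for name, value in params.items())
    return f"https://w3id.org/uri4uri/mime/{mime_type.lower()}{suffix}"


def iana_registration_url(mime_type: str) -> str:
    """Build the IANA registration document URL for a MIME type.
    The URL only resolves if the type is actually registered."""
    return f"https://www.iana.org/assignments/media-types/{mime_type.lower()}"


print(uri4uri_mime_uri("text/markdown", charset="utf-8"))
print(iana_registration_url("text/markdown"))
```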
Another option I could come up with is to (ab)use language tags instead of datatypes for this purpose, such as zxx-Latn-x-md for Markdown or zxx-Latn-x-tex for TeX. This is absolutely not standardized (except for zxx, usable for, among other things, program source code, and Latn for texts using the Latin alphabet), and I would not recommend using it for literals that should be parsed. Think of it as affecting the presentation of the text, such as picking a syntax highlighter.
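If you do go down that road, the tags at least have to stay well-formed: a private-use subtag (the part after x-) may only be 1 to 8 ASCII letters or digits. Here is a minimal Python sketch of building and reading such tags; the function names and the idea of mapping the subtag to a highlighter are my own illustration, not an established convention.

```python
import re
from typing import Optional

# Private-use subtags are limited to 1-8 ASCII alphanumerics.
_PRIVATE_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def format_language_tag(format_name: str, script: str = "Latn") -> str:
    """Build a tag like 'zxx-Latn-x-md' for a non-linguistic text format."""
    if not _PRIVATE_SUBTAG.fullmatch(format_name):
        raise ValueError(f"not a valid private-use subtag: {format_name!r}")
    return f"zxx-{script}-x-{format_name.lower()}"


def format_from_tag(tag: str) -> Optional[str]:
    """Return the private-use format hint from a tag, or None if absent.
    Language tags are case-insensitive, so compare in lower case."""
    _, sep, private = tag.lower().partition("-x-")
    return private if sep and private else None


print(format_language_tag("md"))          # build a tag for Markdown
print(format_from_tag("zxx-Latn-x-tex"))  # read the hint back
```

A consumer could use the returned hint to choose a syntax highlighter and fall back to plain text when it is None.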
data: URIs
The only standardized way to combine a text and a MIME type is using the data URI scheme, but you won't get a literal:
<https://example.org/>
dc:description <data:text/markdown;charset=utf-8,I%20*love*%20cookies!> .
<data:text/markdown;charset=utf-8,I%20*love*%20cookies!>
a <https://w3id.org/uri4uri/mime/text/markdown;charset=utf-8> ;
rdf:value "I *love* cookies!" .
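Such data URIs can be produced and read back with a few lines of code. The Python sketch below uses only the standard library; make_data_uri and parse_data_uri are hypothetical helper names, the parser deliberately ignores the ;base64 variant, and keeping * and ! unescaped is just a choice to reproduce the URI shown above.

```python
from urllib.parse import quote, unquote_to_bytes


def make_data_uri(text: str, mime_type: str = "text/plain", charset: str = "utf-8") -> str:
    """Percent-encode text under the given charset and wrap it in a data URI."""
    payload = quote(text.encode(charset), safe="*!")
    return f"data:{mime_type};charset={charset},{payload}"


def parse_data_uri(uri: str) -> tuple:
    """Return (mime_type, text) for a percent-encoded (non-base64) data URI."""
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    header, _, payload = uri[len("data:"):].partition(",")
    mime_type, *params = header.split(";")
    charset = "US-ASCII"  # default charset when none is given
    for param in params:
        name, _, value = param.partition("=")
        if name.lower() == "charset":
            charset = value
    return mime_type or "text/plain", unquote_to_bytes(payload).decode(charset)


uri = make_data_uri("I *love* cookies!", "text/markdown")
print(uri)
print(parse_data_uri(uri))
```

The second statement block in the Turtle above then simply pairs this URI with the decoded text via rdf:value.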