metadata, rdf, atom-feed, rdfa, dublin-core

Which format to use for exposing structured metadata (Dublin Core, RDF, Atom)?


In an altruistic spirit, I would like to expose as much structured data about my website as possible. I also wouldn't mind an SEO boost, but it's secondary.

It seems there are several options:

  • Full on RDF (kill me now XML)
  • Atom with your own custom tags (liking that)
  • RDFa in your webpage (might help SEO)
  • Dublin Core Meta tags
  • Dublin Core using RDFa
  • Atom with RDFa

I'm just trying to make it easy for people to get data off my site.

The nice thing about standards is that there are so many of them to choose from.

Which one do you think I should use?


Solution

  • RDF is not just XML. RDF is a data model built on sets of triples (subject, predicate, object) and on URIs that refer to things unambiguously. In fact, people who work with RDF tend to avoid RDF/XML and prefer Turtle or N-Triples, or even RDF in JSON. These serializations are more readable, easier to construct and easier to parse. Moreover, many tools (e.g. rapper or Jena) let you convert between the whole range of RDF flavors.

    When it comes to publishing information in RDF, you generally have three different choices:
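To make the triple model concrete, here is a minimal sketch in plain Python (no RDF library) that holds a few triples and writes them out as N-Triples. The `Triple` type, the `to_ntriples` helper and the `example.org` URIs are assumptions made up for illustration; a real project would more likely rely on a toolkit such as the ones mentioned above.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str                  # a URI
    predicate: str                # a URI
    obj: str                      # a URI, or plain text when obj_is_literal is True
    obj_is_literal: bool = False


def escape_literal(text):
    """Escape the characters that N-Triples does not allow raw in a literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def to_ntriples(triples):
    """Serialize triples one per line: <s> <p> <o> ."""
    lines = []
    for t in triples:
        if t.obj_is_literal:
            obj = '"' + escape_literal(t.obj) + '"'
        else:
            obj = "<" + t.obj + ">"
        lines.append("<" + t.subject + "> <" + t.predicate + "> " + obj + " .")
    return "\n".join(lines)


# Placeholder URIs, not taken from any real site.
REVIEW = "http://example.org/review/1"
triples = [
    Triple(REVIEW, "http://purl.org/dc/terms/title", "A hot sauce review", True),
    Triple(REVIEW, "http://purl.org/dc/terms/creator", "http://example.org/user/alice"),
]
print(to_ntriples(triples))
```

Each printed line is one self-contained statement, which is why this format is so easy to generate and to parse line by line.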

    1. To provide RDF dumps of your data.
    2. To publish RDF following the Linked Data rules.
    3. To add metadata to your existing Web pages with RDFa.

    ... these are not exclusive. You can go for any combination of them; the most important thing is choosing the correct structure of URIs (see Cool URIs don't change).

    From your SO profile, I see that you're working on a social taste recommendation website (http://evocatus.com/). I assume that you might want to expose information about those reviews. So for a review like http://evocatus.com/sauce/cholula-chipolte-hot-sauce/272645/ you can provide different serializations and give back not just HTML but also:

    • .../cholula-chipolte-hot-sauce/272645/rdf-turtle
    • .../cholula-chipolte-hot-sauce/272645/rdf-xml
    • .../cholula-chipolte-hot-sauce/272645/rdf-json
    • and one for any other type of format you want to expose.

    In addition, the HTML version could be enhanced with RDFa. Following content negotiation rules, you redirect each HTTP request to whichever format the client accepts, as stated in the HTTP Accept header. So a request like the curl one below would be redirected by your application to the RDF/XML version:

    curl -H 'Accept: application/rdf+xml' .../cholula-chipolte-hot-sauce/272645/
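
On the server side, the negotiation step could look roughly like the sketch below. It is framework-agnostic and only decides which URL to redirect to; the function names and the media-type table are assumptions for illustration (the suffixes mirror the ones listed above), and a production setup would normally use the negotiation support of its web framework.

```python
# Hypothetical mapping from media type to the URL suffixes suggested above.
SUFFIX_BY_MEDIA_TYPE = {
    "text/html": "",
    "text/turtle": "rdf-turtle",
    "application/rdf+xml": "rdf-xml",
    "application/rdf+json": "rdf-json",
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q-value."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_suffix(accept_header, default="text/html"):
    """Pick the suffix of the best format we can serve, else the default."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in SUFFIX_BY_MEDIA_TYPE:
            return SUFFIX_BY_MEDIA_TYPE[media_type]
        if media_type == "*/*":
            return SUFFIX_BY_MEDIA_TYPE[default]
    return SUFFIX_BY_MEDIA_TYPE[default]


def redirect_target(resource_path, accept_header):
    """Build the path the application would redirect the request to."""
    return resource_path.rstrip("/") + "/" + choose_suffix(accept_header)


print(redirect_target("/sauce/some-review/1/", "application/rdf+xml"))
```

This sketch ignores partial wildcards such as `application/*` and falls back to HTML when nothing matches, which is a simplification rather than full HTTP negotiation.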
    

    In the future, people would be able to say things about existing reviews on your site just by reusing your URIs in their own RDF data. That's the power of RDF and Linked Data.

    As for Dublin Core, you could use it with either RDF or RDFa. In your case, though, there are some other interesting ontologies to consider, and the right thing would be to use a mix of all of them:

    • FOAF: Friend Of A Friend, to express user personal information and relations between users.
    • Tag Ontology: A very simple ontology to express tag information.
    • RDF Review Vocabulary: Vocabulary for expressing reviews and ratings using RDF.
    • GoodRelations: An ontology to express product information and eCommerce.
    • vCard/RDF: for addresses, normally used in combination with FOAF.

    There is a site called http://revyu.com/ that uses all of these ontologies (except GoodRelations), so you could use it as a guideline. It publishes both HTML and RDF versions of the same review.

    Unlike with Atom, as you can see, with RDF you would be able to reuse existing ontologies, and since RDF is based on URIs, everything would be interlinked.
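
As a rough illustration of mixing vocabularies, the sketch below builds a small Turtle document that describes one review with terms from Dublin Core, FOAF and the RDF Review Vocabulary. The `review_to_turtle` helper, the choice of properties and the `example.org` URIs are assumptions for illustration, not a recommended schema; check each vocabulary's documentation before settling on terms.

```python
# Namespace URIs of the vocabularies being mixed.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "rev": "http://purl.org/stuff/rev#",
}


def turtle_literal(text):
    """Quote a plain string for use as a Turtle literal."""
    escaped = (text.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))
    return '"' + escaped + '"'


def review_to_turtle(review_uri, reviewer_uri, nick, title, rating):
    """Describe one review and its author, each with its own vocabulary."""
    lines = ["@prefix %s: <%s> ." % (name, uri)
             for name, uri in sorted(PREFIXES.items())]
    lines.append("")
    lines.append("<%s> a rev:Review ;" % review_uri)
    lines.append("    dcterms:title %s ;" % turtle_literal(title))
    lines.append("    rev:rating %d ;" % int(rating))
    lines.append("    rev:reviewer <%s> ." % reviewer_uri)
    lines.append("")
    lines.append("<%s> a foaf:Person ;" % reviewer_uri)
    lines.append("    foaf:nick %s ." % turtle_literal(nick))
    return "\n".join(lines)


# Placeholder URIs and values, not real data.
doc = review_to_turtle(
    "http://example.org/review/1",
    "http://example.org/user/alice",
    "alice",
    "A hot sauce review",
    4,
)
print(doc)
```

Because the reviewer is identified by a URI rather than a string, other datasets can point at the same person or review without any coordination.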

    Linked Data Added Value

    What would happen if you invested some time linking your products and reviews to other data sources (e.g. dbpedia.org or freebase.com)? Let's imagine that you start linking all your beer reviews (http://evocatus.com/beer/) to the brewery that makes each product (http://dbpedia.org/page/Alcoholic_beverage). By following the links, you would be able to find out, for instance, where the best-liked beers are made. DBpedia holds that information.

    Freebase, which also provides RDF versions, is another source you could link manufacturers to. See, for instance, http://rdf.freebase.com/rdf/en.budweiser in RDF or http://www.freebase.com/view/en/budweiser in HTML.
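
The "follow the links" idea can be sketched with two tiny in-memory graphs: one standing in for your own data and one standing in for an external source such as DBpedia. Every URI and property name below is a made-up placeholder, and the external data is hard-coded rather than fetched; the point is only that a shared URI lets the two graphs be merged and traversed as one.

```python
EX = "http://example.org/"          # placeholder for your own site
EXT = "http://external.example/"    # placeholder standing in for an external dataset

# Your data: each beer links to the external URI of its brewery.
local = {
    (EX + "beer/1", EX + "vocab/brewedBy", EXT + "resource/BreweryA"),
    (EX + "beer/2", EX + "vocab/brewedBy", EXT + "resource/BreweryB"),
}

# External data (hard-coded here): where each brewery is located.
external = {
    (EXT + "resource/BreweryA", EXT + "vocab/location", EXT + "resource/TownA"),
    (EXT + "resource/BreweryB", EXT + "vocab/location", EXT + "resource/TownB"),
}


def follow(graph, start, predicates):
    """Walk a chain of predicates from start and return the nodes reached."""
    current = {start}
    for predicate in predicates:
        current = {o for (s, p, o) in graph if s in current and p == predicate}
    return current


# Triples are just set members, so merging two sources is a set union.
merged = local | external

towns = follow(merged, EX + "beer/1",
               [EX + "vocab/brewedBy", EXT + "vocab/location"])
print(towns)
```

Neither dataset alone can answer "where is this beer made"; the answer only appears once the link between them exists.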