Tags: javascript, ruby-on-rails, fetch, meta-tags

Fetching URL Metadata from JS


Most social media sites have a feature where you type in a link and the site generates a preview of it. See the example below from Google+.

Let's say I'd like to build my own. I'm using Ruby on Rails as a web framework, but that's probably irrelevant, as I imagine I'll have to use JS to fetch this client-side, right?

  1. Where do I look for this data? I know it's usually in the <meta> tags, but is that standard? When I tried a few links, only the description was in the <meta> tags; the image and title didn't match anything else in them.

  2. How do I go about fetching a remote document asynchronously and parsing its tags? If anyone could point me to an example, I'd be grateful.

Thanks!

[Image: example link preview from Google+]


Solution

  • There are three common ways authors might provide this data in HTML documents (from least expressive to most expressive):

    1. Metadata in the head element: This is plain HTML, i.e., elements such as title, meta, and link.

    2. Microformats: Still using plain HTML, but together with specific class names. All Microformats are described in their wiki.

    3. Structured data: Using extended or additional syntaxes (JSON-LD, Microdata, RDFa, …) and vocabularies (Schema.org, Open Graph Protocol, Dublin Core, …).

    You’ll typically find suitable parsers for your programming language.

    You’ll probably find that most sites use the Open Graph Protocol (in RDFa), since Facebook and Twitter consume it. Schema.org (in JSON-LD, Microdata, or RDFa) probably comes next, as it is sponsored by the major search engines.

    Note that options 2 and 3 also let authors provide data about entities described on (or relevant to) the page, so not all extracted data is suitable for link previews; you have to take the context into account.
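
To make the lookup order concrete, here is a minimal JavaScript sketch that prefers Open Graph properties and falls back to the plain head metadata. It assumes you already have the page's HTML as a string (how you obtain it is outside this sketch). The names `getAttr` and `extractPreview` are hypothetical helpers invented for illustration, and the regex-based matching is a simplification: it does not decode HTML entities and can be fooled by unusual markup, so a real HTML parser is the safer choice in practice.

```javascript
// Hypothetical helper: read one attribute value from a single tag string.
// Handles double-quoted, single-quoted, and unquoted values.
function getAttr(tag, name) {
  const re = new RegExp(
    `\\s${name}\\s*=\\s*(?:"([^"]*)"|'([^']*)'|([^\\s>]+))`,
    'i'
  );
  const m = tag.match(re);
  if (!m) return null;
  return m[1] ?? m[2] ?? m[3];
}

// Hypothetical helper: build link-preview fields from an HTML string.
// Open Graph values win; <title> and <meta name="description"> are fallbacks.
function extractPreview(html) {
  const meta = new Map();

  // Collect every <meta> tag that has a key (property or name) and a content.
  const tags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of tags) {
    const key = getAttr(tag, 'property') || getAttr(tag, 'name');
    const content = getAttr(tag, 'content');
    if (!key || content === null) continue;
    const normalized = key.toLowerCase();
    // Keep the first occurrence of each key.
    if (!meta.has(normalized)) meta.set(normalized, content);
  }

  // Plain-HTML fallback for the title.
  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const pageTitle = titleMatch ? titleMatch[1].trim() : null;

  return {
    title: meta.get('og:title') || pageTitle,
    description: meta.get('og:description') || meta.get('description') || null,
    image: meta.get('og:image') || null,
  };
}

// Usage with a made-up document (example.com is a placeholder domain).
const sampleHtml = `
<html><head>
  <title>Plain title</title>
  <meta charset="utf-8">
  <meta name="description" content="Plain description">
  <meta property="og:title" content="OG title">
  <meta property="og:image" content="https://example.com/a.png">
</head><body></body></html>`;

console.log(extractPreview(sampleHtml));
```

In this sample the title and image come from Open Graph, while the description falls back to the plain meta element because no og:description is present. This mirrors the situation described in the question, where only the description matched the ordinary meta tags.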