AlchemyAPI demo not picking up the correct author field


I'm trying out the online demo: http://www.alchemyapi.com/products/demo/alchemylanguage

I pasted in one of your blog articles: http://www.programmableweb.com/news/alchemyapi-updates-api-brings-deep-learning-to-masses/2013/07/25

For the Author field, AlchemyAPI returns 'Google+', whilst the blog article says the author is 'Amy Castor'.

Any reason why this has happened?

By the way, I recently posted this to IBM dwAnswers but found out that they are moving to Stack Overflow, hence the cross-post.


Solution

  • From the Author Extraction documentation:

    The author information can be embedded into a news article or a blog post in a multitude of different ways, including in the page meta data, using REL links, just plain text, and others. Since there's no standard way to express the author via HTML tags (i.e. like the tags), reliably extracting the author is a complex task. AlchemyAPI uses over a dozen techniques in parallel to attempt to find the author, and then cross references the results to determine the most likely candidate for the author. AlchemyAPI makes the difficult task of author extraction easy to integrate into your application.

    It thinks the author of that specific article is Google+ because that text appears in a REL link.

    At the bottom of the article it says:

    About the author: Amy Castor Follow me on Google+

    The word Google+ links to https://plus.google.com/108856065353244179079?rel=author

    The algorithm picks up this ?rel=author query parameter and therefore concludes that Google+ is the author.

    In many cases this heuristic would give the right answer, but this is one of the cases where it is wrong. Such is the nature of cognitive computing.
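
To illustrate how a rel=author heuristic can go wrong here, below is a minimal Python sketch. It is not AlchemyAPI's actual algorithm, which is not public; it is a hypothetical single-technique extractor that treats the anchor text of any link marked rel=author (as an attribute or as a query parameter) as the author name. The class `RelAuthorFinder`, the function `guess_author`, and the HTML snippet are assumptions reconstructed from the description above, not the real page markup.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs


class RelAuthorFinder(HTMLParser):
    """Hypothetical heuristic: collect the text of links flagged rel=author."""

    def __init__(self):
        super().__init__()
        self.candidates = []          # anchor texts of rel=author links
        self._in_author_link = False  # True while inside such a link
        self._text = []               # text fragments of the current link

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href") or ""
        # rel can appear as an HTML attribute or as a URL query parameter.
        rel_attr = (attrs.get("rel") or "").split()
        rel_query = parse_qs(urlparse(href).query).get("rel", [])
        if "author" in rel_attr or "author" in rel_query:
            self._in_author_link = True
            self._text = []

    def handle_data(self, data):
        if self._in_author_link:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_author_link:
            self.candidates.append("".join(self._text).strip())
            self._in_author_link = False


def guess_author(html):
    """Return the first rel=author anchor text, or None if there is none."""
    finder = RelAuthorFinder()
    finder.feed(html)
    return finder.candidates[0] if finder.candidates else None


# Assumed markup, modelled on the "About the author" line quoted above.
snippet = (
    "<p>About the author: Amy Castor Follow me on "
    '<a href="https://plus.google.com/108856065353244179079?rel=author">'
    "Google+</a></p>"
)

print(guess_author(snippet))
```

On this assumed markup the sketch reports the link text rather than the name in the surrounding sentence, which matches the behaviour described above. A more robust extractor would need to cross-check such a candidate against other signals, such as the plain text preceding the link.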