
OpenCalais returns a person's Linked Data URI instead of the actual person name


I am using the OpenCalais Semantic Web service and receiving an "application/json" response for my submitted content. When I look at the Quotation entity, OpenCalais returns the quote, but the person field contains a "Linked Data" URI rather than the person's name. For example, for a person named Tayyip Erdogan:

http://d.opencalais.com/pershash-1/a7077bd6-bcc9-3419-b75e-c44e1b2eb693

I need the name of the person, not the URI. OpenCalais also sends a URI instead of the person's name in the PersonCareer entity. I don't want to fetch the URI's HTML page and extract the person's name from the DOM, as that would slow everything down. Is there a solution?

Description of the Quotation entity: http://www.opencalais.com/documentation/calais-web-service-api/api-metadata/entity-index-and-definitions#Quotation


Solution

  • It turns out that these person URIs can be read in a format other than HTML: RDF. Any Linked Data URI provided by OpenCalais can also be retrieved as RDF. Just change the URI's extension from .html to .rdf (or append .rdf if the URI has no extension) and you will get all the information about that resource in RDF format.

    For example, for a person named Tayyip Erdogan:

    http://d.opencalais.com/pershash-1/a7077bd6-bcc9-3419-b75e-c44e1b2eb693.rdf

    The following code uses the file_get_dom function from a DOM parsing library; any native function that fetches file contents will work as well. This is just the approach I used to extract person names from the RDF returned by the web service. I am sure you can think of a better solution.

    public function get_persons_from_pershash($url)
    {
        // Fetch the RDF of the person URI (the URL should end in .rdf)
        @$person_rdf = file_get_dom($url);

        if(!empty($person_rdf))
        {
            // Locate the opening and closing <c:name> tags
            $strpos_start = strpos($person_rdf, '<c:name>');
            $strpos_end = strpos($person_rdf, '</c:name>');

            // Extract the name only when both tags were found, in order
            if($strpos_start !== false && $strpos_end !== false && $strpos_end > $strpos_start)
            {
                $strpos_start += 8; // skip past '<c:name>' (8 characters)
                $str_name_length = $strpos_end - $strpos_start;

                return trim(substr($person_rdf, $strpos_start, $str_name_length));
            }
        }
        return '';
    }
    

    If you open the .rdf URL in a browser, you will be prompted to save an RDF file.

    I wanted to parse it programmatically, so I did it this way.

    Hope someone finds this helpful!

    Cheers!
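
As a possible alternative to the string search above, the same extraction can be sketched with PHP's built-in DOM extension instead of a third-party library. This is only a sketch under assumptions: the helper names to_rdf_url and extract_person_name are hypothetical, the sample RDF is a hand-written stand-in rather than a real OpenCalais response, and the namespace bound to the c: prefix in the sample is a placeholder. The XPath matches on the element's local name, so it returns the first element called "name" regardless of prefix.

```php
<?php
// Hypothetical helper: turn a Linked Data URI into its .rdf form.
function to_rdf_url(string $uri): string
{
    // Drop a trailing .html if present, then append .rdf unless already there.
    $base = preg_replace('/\.html$/', '', $uri);
    return substr($base, -4) === '.rdf' ? $base : $base . '.rdf';
}

// Hypothetical helper: return the text of the first <c:name> element, or ''.
function extract_person_name(string $rdf): string
{
    if (trim($rdf) === '') {
        return '';
    }
    $doc = new DOMDocument();
    // Suppress parser warnings so malformed input simply yields ''.
    if (!@$doc->loadXML($rdf)) {
        return '';
    }
    $xpath = new DOMXPath($doc);
    // Match by local name so the namespace URI behind "c:" need not be known.
    $nodes = $xpath->query("//*[local-name()='name']");
    if ($nodes === false || $nodes->length === 0) {
        return '';
    }
    return trim($nodes->item(0)->textContent);
}

// Hand-written sample for illustration only; not an actual service response.
$sample = <<<XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:c="http://example.org/placeholder/">
  <rdf:Description rdf:about="http://d.opencalais.com/pershash-1/a7077bd6-bcc9-3419-b75e-c44e1b2eb693">
    <c:name> Tayyip Erdogan </c:name>
  </rdf:Description>
</rdf:RDF>
XML;

echo extract_person_name($sample), "\n"; // prints "Tayyip Erdogan"
```

To use it against the live service, the RDF could be fetched with something like file_get_contents(to_rdf_url($uri)), assuming allow_url_fopen is enabled, and the result passed to extract_person_name. Caching names by URI would avoid repeated requests for the same person.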