Tags: rest, here-api, wikipedia-api, heremaps

Matching HERE Maps items with Wikipedia content using REST APIs


Using the v7 version of the HERE Maps REST API, I'm trying to match the data I'm getting from the /browse and /discover endpoints unambiguously with data obtained from the Wikimedia Action API, for a tourist guide app.

For example, by calling the following endpoint:

curl "https://discover.search.hereapi.com/v1/browse?in=circle%3A45.4627338,9.1777323%3Br%3D300&limit=30&apiKey={API KEY REDACTED}&categories=300-3000-0025&at=45.4627338,9.1777323&lang=en-US" \
     -H 'Accept: application/json' \
     -H 'Accept-Encoding: gzip'

we get a JSON-formatted array of points of interest around the center of Milan. Here's a single element from it:

{
  "title":"Basilica di Sant'Ambrogio",
  "id":"here:pds:place:380u0nd8-2d259709e0aa42908182e1a03a8d9aca",
  "language":"en",
  "resultType":"place",
  "address":{
    "label":"Basilica di Sant'Ambrogio, Piazza Sant'Ambrogio, 15, 20123 Milan MI, Italy",
    "countryCode":"ITA",
    "countryName":"Italy",
    "state":"Lombardy",
    "countyCode":"MI",
    "county":"Milan",
    "city":"Milan",
    "district":"Duomo",
    "street":"Piazza Sant'Ambrogio",
    "postalCode":"20123",
    "houseNumber":"15"
  },
  "position":{
    "lat":45.4624,
    "lng":9.1757
  },
  "access":[
    {
      "lat":45.46219,
      "lng":9.17481
    }
  ],
  "distance":163,
  "categories":[
    {
      "id":"300-3200-0030",
      "name":"Church",
      "primary":true
    },
    {
      "id":"300-3000-0023",
      "name":"Tourist Attraction"
    },
    {
      "id":"300-3000-0025",
      "name":"Historical Monument"
    },
    {
      "id":"300-3100-0028",
      "name":"History Museum"
    },
    {
      "id":"300-3100-0029",
      "name":"Art Museum"
    },
    {
      "id":"900-9300-0221",
      "name":"Residential Area\/Building"
    }
  ],
  "references":[
    {
      "supplier":{
        "id":"core"
      },
      "id":"1175957532"
    },
    {
      "supplier":{
        "id":"core"
      },
      "id":"1175957749"
    },
    {
      "supplier":{
        "id":"core"
      },
      "id":"50930489"
    },
    {
      "supplier":{
        "id":"core"
      },
      "id":"800801056"
    },
    {
      "supplier":{
        "id":"tripadvisor"
      },
      "id":"591187"
    },
    {
      "supplier":{
        "id":"yelp"
      },
      "id":"QIqHOCwwIy1kFTPiPIeGnQ"
    }
  ],
  "contacts":[
    {
      "phone":[
        {
          "value":"+39028057310"
        },
        {
          "value":"+390286450895",
          "categories":[
            {
              "id":"300-3100-0028"
            },
            {
              "id":"300-3200-0030"
            }
          ]
        }
      ],
      "www":[
        {
          "value":"http:\/\/www.basilicasantambrogio.it",
          "categories":[
            {
              "id":"300-3100-0028"
            },
            {
              "id":"300-3200-0030"
            }
          ]
        }
      ]
    }
  ]
}

I can see HERE Maps returning a series of entity IDs matching both Tripadvisor and Yelp, but nothing explicitly pointing to Wikipedia.

It seems that this information was available in a deprecated version of the API through the show_content parameter, but I can't find anything equivalent in the currently available version.

Has anyone been able to translate between the two systems? I guess I could go for a less precise approach and roughly match on latitude, longitude and place name, but it feels way too 'hacky', and I'd rather avoid multiple requests to Wikipedia for every single location.
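For reference, the rough coordinate-and-name approach mentioned above could be sketched as follows. This is only a sketch under assumptions: it uses the Action API's `list=geosearch` module on English Wikipedia, and the function names, the 300 m radius and the 0.5 token-overlap threshold are illustrative choices, not anything provided by HERE or Wikipedia. It needs one Wikipedia request per HERE item.

```javascript
// Sketch only: names, radius and threshold are illustrative assumptions.
const WIKI_API = "https://en.wikipedia.org/w/api.php";

// Build a geosearch URL for pages near a HERE `position` object ({lat, lng}).
function buildGeosearchUrl(position, radiusMeters = 300, limit = 10) {
  const params = new URLSearchParams({
    format: "json",
    origin: "*",
    action: "query",
    list: "geosearch",
    gscoord: `${position.lat}|${position.lng}`,
    gsradius: String(radiusMeters),
    gslimit: String(limit),
  });
  return `${WIKI_API}?${params.toString()}`;
}

// Lowercase, strip accents and punctuation, and split into a set of words.
function titleTokens(title) {
  const normalized = title
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, " ")
    .trim();
  return new Set(normalized === "" ? [] : normalized.split(" "));
}

// Jaccard overlap between the word sets of two titles (0 = disjoint, 1 = same words).
function titleSimilarity(a, b) {
  const tokensA = titleTokens(a);
  const tokensB = titleTokens(b);
  const shared = [...tokensA].filter((token) => tokensB.has(token)).length;
  const union = new Set([...tokensA, ...tokensB]).size;
  return union === 0 ? 0 : shared / union;
}

// Keep hits that are similar enough, then prefer higher similarity and shorter distance.
function pickBestMatch(hereTitle, geosearchResults, minScore = 0.5) {
  const scored = geosearchResults
    .map((page) => ({ page, score: titleSimilarity(hereTitle, page.title) }))
    .filter((entry) => entry.score >= minScore)
    .sort((a, b) => b.score - a.score || a.page.dist - b.page.dist);
  return scored.length > 0 ? scored[0].page : null;
}

// One request per HERE item; resolves to a geosearch hit ({pageid, title, dist, ...}) or null.
async function findWikipediaPage(hereItem) {
  const response = await fetch(buildGeosearchUrl(hereItem.position));
  const data = await response.json();
  return pickBestMatch(hereItem.title, data.query.geosearch);
}
```

Word overlap is used instead of exact equality because the HERE title and the English Wikipedia title may differ in small words (for instance "di" versus "of"). The match is still heuristic, so ambiguous results are better treated as "no match" than linked blindly.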

Has anyone faced anything similar? I feel like I'm missing something obvious, but I can't figure out which parameters I should actually be using.


Solution

  • First, query Wikipedia using plain wording, such as the city name, and refine the query until the result is most likely the page you need every time. The Wikipedia API returns large nested objects that are hard to work with, but one value matters most: the ID of each Wikipedia page. It is buried inside the response (it is the key under query.pages), and it determines whether you get the correct page or not.

    Sorry for the very unpolished code, but it is very old and the Wikipedia API is quite complex. I hope it serves as an example: you can imagine what the missing values are, and it should give you a place to start. Here I am extracting the page ID of an article by its title, and then the contents of the article with a regular query by name that returns the short intro extract. So I am effectively using two distinct Wikipedia queries and feeding them the data I get from the HERE APIs. You can treat the two snippets as separate examples.

      // `city`, `inputValue` and `country` are the "missing values":
      // they are expected to come from the HERE response or from user input.
      const wikiApiBaseUrl = "https://en.wikipedia.org/w/api.php?format=json&origin=*&action=query";
      let wikiLink = "https://en.wikipedia.org/"; // fallback when no page is found
      let wikiData;

      // Wikipedia title: resolve a title to a page ID and build a link from it
      const titleQuery = typeof city !== 'undefined' ? city : inputValue;
      fetch(wikiApiBaseUrl + "&titles=" + encodeURIComponent(titleQuery))
        .then((response) => response.json())
        .then((data) => {
          // query.pages is keyed by page ID; a missing page is keyed "-1"
          const pageId = Object.keys(data.query.pages)[0];
          if (pageId !== '-1') {
            wikiLink = "https://en.wikipedia.org/?curid=" + pageId;
          }
        });

      // Wikipedia data: fetch the plain-text intro extract, following redirects
      fetch(wikiApiBaseUrl + "&prop=extracts&exintro&explaintext=true&redirects=1&titles=" + encodeURIComponent(country))
        .then((response) => response.json())
        .then((data) => {
          const pageId = Object.keys(data.query.pages)[0];
          const page = data.query.pages[pageId];
          if (typeof page.extract !== 'undefined') {
            wikiData = page.extract;
          }
        });
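The two queries above can also be folded into one request, since a `prop=extracts` response carries the page ID and title alongside the extract. The sketch below assumes the default JSON response format (pages keyed by page ID, missing pages flagged with a `missing` property). The helper names `buildExtractUrl`, `parsePage` and `fetchWikipediaSummary` are made up for illustration.

```javascript
// Sketch only: helper names are illustrative, not part of any library.

// Build a single query that returns page ID, title and plain-text intro extract.
function buildExtractUrl(title) {
  const params = new URLSearchParams({
    format: "json",
    origin: "*",
    action: "query",
    prop: "extracts",
    exintro: "1",
    explaintext: "1",
    redirects: "1",
    titles: title,
  });
  return `https://en.wikipedia.org/w/api.php?${params.toString()}`;
}

// Pull the first page out of the response; return null when nothing usable came back.
function parsePage(data) {
  const pages = (data.query && data.query.pages) || {};
  const page = Object.values(pages)[0];
  if (!page || "missing" in page || typeof page.pageid === "undefined") {
    return null;
  }
  return {
    pageId: page.pageid,
    title: page.title,
    extract: page.extract || "",
    link: `https://en.wikipedia.org/?curid=${page.pageid}`,
  };
}

// Fetch and parse in one step; resolves to the object above or null.
async function fetchWikipediaSummary(title) {
  const response = await fetch(buildExtractUrl(title));
  return parsePage(await response.json());
}
```

Storing the returned `pageId` next to the HERE `id` means later lookups can link by page ID instead of repeating the title search.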