IBM Watson: Extracting Keywords and Concepts


I'm trying to figure out the proper way to extract keywords and concepts from each document in a large batch, one document at a time. On DW it was recommended that I use IBM Watson Knowledge Studio. Knowledge Studio is linked to Discovery, but I cannot find anything in the Discovery API Reference on how to pull the keywords and concepts for an individual document.

I can easily look at concepts on a macro level, but I need the keywords and concepts for each file individually. All of my files have been uploaded to Knowledge Studio, and I also uploaded everything to Discovery. So far I have been unable to extract the information for a single file, and the API Reference does not cover extracting information at that level for a file that has already been uploaded.

Last week I filed a support ticket, and the response was to post the question on Stack Overflow for additional support. What is the correct method for finding keywords and concepts for each file individually in a large batch of files: Discovery or NLU?

Any guidance is greatly appreciated.


Solution

  • I think you should try the Natural Language Understanding (NLU) service. Here is a demo that lets you analyze text and extract concepts and keywords: https://natural-language-understanding-demo.mybluemix.net/.

    I recommend reading the documentation first and then looking at the API Reference, where you will find how to call the method that extracts keywords and concepts in different languages.

    What you need to do is loop through your files, read the content of each one, and send it to NLU.

    Here is an example of how to analyze text to extract concepts and keywords in Node.js:

    const NaturalLanguageUnderstandingV1 = require('watson-developer-cloud/natural-language-understanding/v1.js');

    // Replace the placeholders with your NLU service credentials.
    const service = new NaturalLanguageUnderstandingV1({
      'username': '{username}',
      'password': '{password}',
      'version_date': '2017-02-27'
    });

    const parameters = {
      // The text to analyze; for a batch, this would be the content of one file.
      text: 'IBM is an American multinational technology company headquartered in Armonk, New York, United States, with operations in over 170 countries.',
      features: {
        keywords: {
          emotion: true,   // include emotion scores for each keyword
          sentiment: true, // include sentiment for each keyword
          limit: 2         // return at most 2 keywords
        },
        concepts: {
          limit: 3         // return at most 3 concepts
        }
      }
    };

    service.analyze(parameters, (err, response) => {
      if (err) {
        console.log('error:', err);
      } else {
        console.log(JSON.stringify(response, null, 2));
      }
    });
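
The "loop through your files" step could look roughly like the sketch below. It is only an illustration under a few assumptions: `analyzeDirectory` is a hypothetical helper name, the files are assumed to be plain UTF-8 text in a single directory, and the response is assumed to expose `keywords` and `concepts` at the top level, as in the callback above. The analyze function is passed in as an argument so the loop can be tried with a stub before pointing it at the real service.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper (not part of the Watson SDK): sends each file in `dir`
// to `analyze`, one at a time, and collects keywords and concepts per file name.
// `analyze` is assumed to have the same (parameters, callback) shape as
// service.analyze in the example above.
function analyzeDirectory(dir, analyze, done) {
  // Keep regular files only; subdirectories are skipped.
  const files = fs.readdirSync(dir)
    .filter((name) => fs.statSync(path.join(dir, name)).isFile());
  const results = {};
  let index = 0;

  function next() {
    if (index >= files.length) {
      return done(null, results);
    }
    const name = files[index++];
    // Assumption: every file is plain UTF-8 text.
    const text = fs.readFileSync(path.join(dir, name), 'utf8');
    const parameters = { text, features: { keywords: {}, concepts: {} } };

    analyze(parameters, (err, response) => {
      if (err) {
        // Stop at the first failure and hand back what was collected so far.
        return done(err, results);
      }
      results[name] = {
        keywords: response.keywords,
        concepts: response.concepts
      };
      next(); // move on only after the previous request has finished
    });
  }

  next();
}
```

With the `service` object from the example above, a call might look like `analyzeDirectory('./docs', (p, cb) => service.analyze(p, cb), (err, results) => console.log(err || results))`, where `./docs` is a placeholder path. Processing the files sequentially keeps the sketch simple; a large batch may call for some concurrency and retry handling.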