ibm-watson document-conversion

Can the answer unit content array returned by the Watson Document Conversion service ever have more than one element?


I am writing a program that takes advantage of IBM Watson's Document Conversion service to convert documents of various types into answer units. Each answer unit that is returned by the service contains an array named content which is composed of objects having a media_type and a text element.

I've never seen more than one element in this content array, and I'm not sure how to handle them if there were. Can there ever be more than one element in this array and, if so, what are the possible values? Will they all have the same media_type value? My plan at the moment is to combine all of the text elements into one if more than one exists.


Solution

  • The answer unit content array can have more than one element, but only if you request it (see below). If it does, each element in the array is a representation of the same content in a different media type.

    You get this by listing more than one output media type in your request. The output content array will then contain more than one element, one for each media type you requested.

    For example, if your request contained a config like this:

    {
        conversion_target : 'answer_units',
        answer_units : {
            output_media_types : ['text/plain', 'text/html']
        }
    }
    

    (see https://www.ibm.com/watson/developercloud/document-conversion/api/v1/#convert-document for an explanation of where to put the config)

    Then the content in your response will contain:

    content : [
        {
            text : <the plain text contents of the answer unit>,
            ...
        },
        {
            text : <the HTML contents of the answer unit>,
            ...
        }
    ]
    

    If you don't specify the output media type parameter, you'll get the default value which is:

            output_media_types : ['text/plain']
    

    This is why you always get an array of length 1 containing a plain-text version of the output: by leaving the default config in place, you are implicitly asking for a single output media type.
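
Since each element is a different representation of the same content, concatenating all of the text values would repeat that content once per media type. Picking the element whose media_type you want is one alternative. Below is a minimal sketch of that idea, assuming each element carries the media_type and text fields described in the question; the textFor helper and the sampleUnit data are hypothetical and only illustrate the shape.

```javascript
// Hypothetical helper: return the text of the content entry whose media_type
// matches, or undefined if the answer unit has no such entry.
function textFor(answerUnit, mediaType) {
    const entry = (answerUnit.content || []).find(c => c.media_type === mediaType);
    return entry ? entry.text : undefined;
}

// Made-up answer unit, shaped like the two-media-type response described above.
const sampleUnit = {
    content: [
        { media_type: 'text/plain', text: 'Hello world' },
        { media_type: 'text/html', text: '<p>Hello world</p>' }
    ]
};

console.log(textFor(sampleUnit, 'text/plain')); // prints "Hello world"
console.log(textFor(sampleUnit, 'text/html'));  // prints "<p>Hello world</p>"
```

With the default config, only the 'text/plain' lookup would find an entry; a lookup for any other media type would return undefined.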