I have a number of translated articles that I want to use as training data for IBM Watson Language Translator. What is the correct way to use them? Should each entry in the parallel corpus be a whole article and its translation, or do I need to split each article into sentences and pair each sentence with its translation?
You have two choices.
Either split the text into sentence-level pairs, each with a source segment and its corresponding translated segment, and use those pairs to create a forced_glossary or a parallel_corpus. For a parallel_corpus, each entry should be a sentence pair rather than a whole article and its translation.
Or send just the translated (target-language) text as a single plain-text file to create a monolingual_corpus, which is used to improve the fluency of the model's output rather than to learn translation pairs.
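As a sketch of the sentence-pair approach: assuming you have already aligned each article with its translation sentence by sentence (the alignment step itself is up to you), you can write the pairs out as a TMX 1.4 file, which is the format the parallel_corpus and forced_glossary options accept. The `build_tmx` helper below is a hypothetical script of my own, not part of any Watson SDK:

```python
import xml.etree.ElementTree as ET

def build_tmx(pairs, source_lang="en", target_lang="es"):
    """Serialize (source, target) sentence pairs as a TMX 1.4 document string."""
    tmx = ET.Element("tmx", version="1.4")
    # TMX requires a header; most fields here are descriptive metadata.
    ET.SubElement(tmx, "header", {
        "creationtool": "custom-script",   # hypothetical tool name
        "creationtoolversion": "1.0",
        "datatype": "plaintext",
        "segtype": "sentence",
        "adminlang": source_lang,
        "srclang": source_lang,
        "o-tmf": "unknown",
    })
    body = ET.SubElement(tmx, "body")
    # One <tu> (translation unit) per sentence pair, with one <tuv> per language.
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, src), (target_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")

# Example: two aligned sentences from one article and its translation.
tmx_text = build_tmx([("Hello.", "Hola."), ("Good morning.", "Buenos días.")])
with open("corpus.tmx", "w", encoding="utf-8") as f:
    f.write(tmx_text)
```

The resulting corpus.tmx is what you would upload as the parallel_corpus form field when creating a custom model via the API.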
Detailed documentation on customizing models is available at https://www.ibm.com/watson/developercloud/doc/language-translator/customizing.html#training, and the API reference for creating a model is at https://www.ibm.com/watson/developercloud/language-translator/api/v2/?curl#create-model