Using ssplit options for CoreNLP

According to the documentation, I can use options such as ssplit.isOneSentence for parsing my document into sentences. How exactly do I do this though, given a StanfordCoreNLP object?

Here's my code -

Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, depparse");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
Annotation document = new Annotation(doc);
pipeline.annotate(document);
List<CoreMap> sentences = document.get(SentencesAnnotation.class);

At what point do I add this option and where? Something like this?

pipeline.ssplit.boundaryTokenRegex = '"'

I'd also like to know how to do this for the specific option boundaryTokenRegex.

EDIT:

I think this seems more appropriate -

props.put("ssplit.boundaryTokenRegex", "\"");

But I still have to verify.

Solution

To make a sentence end at every double-quote token ("), set the option on the Properties object before constructing the pipeline, using one of these -

props.setProperty("ssplit.boundaryMultiTokenRegex", "/\'\'/");

or

props.setProperty("ssplit.boundaryMultiTokenRegex", "/\"/");

depending on how the quote is stored in the token text. (CoreNLP normalizes it as the former.)

And if you want both starting and ending quotes -

props.setProperty("ssplit.boundaryMultiTokenRegex", "/''|``/");
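
For reference, here is a minimal sketch of how the pieces fit together. It uses only java.util.Properties so it runs without CoreNLP on the classpath; the class name SsplitProps and the helper build() are made up for illustration and are not part of CoreNLP. The point it shows is that the ssplit option is just another entry in the same Properties object as "annotators", and that it has to be set before the pipeline is constructed.

```java
import java.util.Properties;

// Hypothetical helper class for illustration; not part of CoreNLP.
class SsplitProps {

    // Builds the configuration that would be handed to the pipeline.
    static Properties build() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, depparse");
        // Pattern from the answer above: a token whose text is '' or ``
        props.setProperty("ssplit.boundaryMultiTokenRegex", "/''|``/");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build();
        // With CoreNLP available, the next step would be:
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        // Options set on props after that line are not seen by this pipeline.
        System.out.println(props.getProperty("ssplit.boundaryMultiTokenRegex"));
    }
}
```

Printing the value is a quick way to check what the option looks like after Java has processed the string escapes, which is where the earlier attempts went wrong.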