
Stanford CoreNLP Server disable logging


The server's logging seems quite verbose to me. Is there a way to disable or reduce the logging output? It seems that when I send a document to the server, it writes the content to stdout, which might be a performance killer.

Can I do that somehow?


Update

I found a way to suppress the output from the server. My question remains whether, and how, I can do this with a command-line argument to the server itself. As a dirty workaround, however, the following seems to reduce the overhead.

Running the server with

java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -prettyPrint false 2&>1 >/dev/null

Here >/dev/null redirects standard output to nowhere. Unfortunately, this alone did not help; 2&>1 seems to do the trick. I confess that I do not know what it is actually doing. However, I compared two runs.
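For reference, the conventional spelling for discarding both streams is `>/dev/null 2>&1`: stdout is redirected first, then stderr is duplicated onto it. As far as I can tell, bash reads `2&>1` differently: `2` becomes an ordinary argument and `&>1` redirects to a file named `1`, so the log output on stderr would end up in that file rather than on the terminal. A minimal sketch of the conventional forms, using a hypothetical `noisy` function as a stand-in for the server:

```shell
# Hypothetical stand-in for a chatty server:
# one line on stdout, one log line on stderr.
noisy() {
  echo "result on stdout"
  echo "log line on stderr" >&2
}

# Discard both streams: stdout goes to /dev/null,
# then stderr is pointed at the same place.
noisy >/dev/null 2>&1

# Discard only stderr (the log) and keep stdout.
noisy 2>/dev/null
```

The order matters: writing `2>&1 >/dev/null` would duplicate stderr onto the terminal first and only then redirect stdout, so the log lines would still appear.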

Running with 2&>1 >/dev/null

Processed 100 sentences
Overall time:      2.1797 sec 
Time per sentence: 0.0218 sec 
Processed 200 sentences
Overall time:      6.5694 sec 
Time per sentence: 0.0328 sec 
...
Processed 1300 sentences
Overall time:      30.482 sec 
Time per sentence: 0.0234 sec 
Processed 1400 sentences
Overall time:      32.848 sec 
Time per sentence: 0.0235 sec 
Processed 1500 sentences
Overall time:      35.0417 sec 
Time per sentence: 0.0234 sec 

Running without additional arguments

ParagraphVectorTrainer - Epoch 1 of 6
Processed 100 sentences
Overall time:      2.9826 sec 
Time per sentence: 0.0298 sec 
Processed 200 sentences
Overall time:      5.5169 sec 
Time per sentence: 0.0276 sec 
...
Processed 1300 sentences
Overall time:      54.256 sec 
Time per sentence: 0.0417 sec 
Processed 1400 sentences
Overall time:      59.4675 sec 
Time per sentence: 0.0425 sec 
Processed 1500 sentences
Overall time:      64.0688 sec 
Time per sentence: 0.0427 sec 

This was a very shallow test, but the logging appears to have quite an impact. After 1500 sentences the run without redirection took 64.0688 sec against 35.0417 sec with it, a factor of about 1.828, which adds up over time.

However, this was just a quick test and I cannot guarantee that my results are completely sane!

Further update:

I assume this has to do with how the JVM optimizes the code over time, but the time per sentence becomes comparable to what I get on my local machine. Keep in mind that I got the results below using 2&>1 >/dev/null to eliminate the stdout logging.

Processed 68500 sentences
Overall time:      806.644 sec 
Time per sentence: 0.0118 sec 
Processed 68600 sentences
Overall time:      808.2679 sec 
Time per sentence: 0.0118 sec 
Processed 68700 sentences
Overall time:      809.9669 sec 
Time per sentence: 0.0118 sec 

Solution

  • You're now the third person who has asked for this :) (see "Preventing Stanford Core NLP Server from outputting the text it receives"). In the HEAD of the GitHub repo, and in versions 3.6.1 onwards, there's a -quiet flag that prevents the server from outputting the text it receives. Other logging can then be configured with SLF4J, if it's in your classpath.
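Combined with the command line from the question, this might look like the sketch below. The memory setting and classpath are copied from the question; the wrapper name `start_quiet_server` is purely illustrative, and the flag requires version 3.6.1 or later.

```shell
# Hypothetical wrapper; call it from the directory that holds the CoreNLP jars.
start_quiet_server() {
  # -quiet stops the server from echoing the text it receives
  java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -quiet
}
```

Calling `start_quiet_server` starts the server in the foreground, as the original command does.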