python-3.x, jupyter-notebook, http-headers, box-api

How to suppress or prevent HTTP header responses returned from the Box Python API from printing in the cell output in Jupyter Notebook


I'm new to working with APIs. I'm running an iterator in Jupyter Notebook which calls the Box.com API for some data (.docx and .pdf files). The main function cell prints a lot of HTTP header responses per directory scraped while iterating. This builds up as I iterate through roughly 9000 files, making the notebook very heavy (over 100 MB). At this point the notebook becomes unresponsive, even though I'm using 16 GB of RAM.
Is there a way to suppress those header responses, prevent them from printing in the cell output, or an alternate approach to this?
I have tried a semicolon (;) at the end of the Box API calls and %%capture. I'm not sure what I'm doing wrong here. I need the output for training a word2vec model and I have built the whole data processing pipeline.
[Image: sample snippet from the output cell]


Solution

  • I figured it out. You can use Python's logging module to control the level of log/header output in a notebook cell. The only thing to note (which I missed) is that you must add the logging statement at the top of the particular cell whose output you want to trim. In my experience its effect was limited to that cell, not the whole Jupyter Notebook.

    Note: Python print() statements are not affected by this.
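The answer above does not show the logging statement itself, so here is a minimal sketch of what it might look like. It assumes the Box SDK emits its request and response logs through the standard `logging` module under a logger named `boxsdk`; check that name against your installed version. The `quiet_logger` helper and the `boxsdk.network` child logger are hypothetical stand-ins so the example runs without Box credentials.

```python
import io
import logging

# Assumption: the Box SDK logs through the standard logging module under a
# logger named "boxsdk". Verify the name against your installed version.
SDK_LOGGER_NAME = "boxsdk"


def quiet_logger(name, level=logging.WARNING):
    """Raise a logger's threshold so records below `level` are dropped."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    return logger


# Stand-in for the SDK so the sketch runs without Box credentials:
# a child logger whose records end up in an in-memory buffer.
log_output = io.StringIO()
parent = logging.getLogger(SDK_LOGGER_NAME)
parent.addHandler(logging.StreamHandler(log_output))
parent.propagate = False  # keep the demo records out of the root logger
parent.setLevel(logging.INFO)
fake_network = logging.getLogger(SDK_LOGGER_NAME + ".network")  # hypothetical child name

# Before quieting: INFO records (the "header" noise) reach the handler.
fake_network.info("simulated response headers (before)")
before = log_output.getvalue()

# This is the call that would go at the top of the noisy cell.
quiet_logger(SDK_LOGGER_NAME)
fake_network.info("simulated response headers (after)")  # dropped
fake_network.warning("simulated warning")                # still logged
after = log_output.getvalue()[len(before):]

# print() bypasses logging entirely, so it still shows up in the cell output.
print("this print() call is unaffected by the logging level")
```

In a real notebook only the `quiet_logger(SDK_LOGGER_NAME)` line (or the equivalent `logging.getLogger("boxsdk").setLevel(logging.WARNING)`) would be needed; the buffer and the fake child logger exist only to make the effect visible here. Using `WARNING` rather than `CRITICAL` is a design choice that hides routine output while still surfacing problems.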