Tags: python, printing, subprocess, openshift

Python subprocess not capturing print output after warnings in OpenShift environment


We are facing an issue where a Python script run via subprocess in our OpenShift environment does not capture print outputs after displaying warning messages, while it works as expected locally.

Issue Description:

We have a script (snippet.py) that uses openpyxl (version 3.1.2 in both local and OpenShift environments) to read an Excel file and print specific information. This script is executed as a subprocess in another Python script (app.py). Locally, the script runs perfectly, printing both the warning messages from openpyxl and the expected output. However, in our OpenShift deployment, only the warnings are printed, and none of the print statements after the warnings are captured.

Code Snippets:

snippet.py:

from openpyxl import load_workbook

# Load the workbook and select the 'Sheet C' sheet
wb = load_workbook(r'some_excel_file.xlsx')
sheet = wb['Sheet C']

# Find the 'A/C' task and its duration
for row in sheet.iter_rows(values_only=True):
    if row and 'A/C' in row:
        task_index = row.index('A/C')
        # Assuming 'Duration' is next to 'Task'
        duration_index = task_index + 1
        duration = row[duration_index]
        print(f'The duration for the A/C task is: {duration} days')
        break
    else:
        print('The A/C task was not found or the Duration column is missing.')
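
As written, the loop in snippet.py prints the "not found" message once for every row that does not contain 'A/C', which is why the local output repeats it 40 times. A minimal sketch of a variant that reports only once is shown below. The helper name find_duration and the sample rows are hypothetical; the rows are plain tuples of the shape that iter_rows(values_only=True) yields, so the sketch runs without an Excel file.

```python
def find_duration(rows, task_name="A/C"):
    """Return the value in the cell right of task_name, or None if it is absent."""
    for row in rows:
        if row and task_name in row:
            duration_index = row.index(task_name) + 1
            # Guard against the task sitting in the last column of the row
            if duration_index < len(row):
                return row[duration_index]
            return None
    return None


# Hypothetical sample data standing in for sheet.iter_rows(values_only=True)
sample_rows = [
    ("Task", "Duration"),
    ("Heating", 3),
    ("A/C", 5),
]

duration = find_duration(sample_rows)
if duration is None:
    print("The A/C task was not found or the Duration column is missing.")
else:
    print(f"The duration for the A/C task is: {duration} days")
```

With a real workbook, the call would presumably be find_duration(sheet.iter_rows(values_only=True)).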

app.py:

import subprocess

try:
    result = subprocess.run(
        # No trailing space in the executable name, or it will not be found
        ["python3", "snippet.py"],
        stdout=subprocess.PIPE,
        # Pipe errors/warnings into stdout as well
        stderr=subprocess.STDOUT,
        text=True,
        check=False
    )
except subprocess.CalledProcessError as e:
    print("ERROR")

print("stdout: " + result.stdout)

Output Comparison:

Local:

stdout: 
C:\Users\User\Documents\project_x\.venv\Lib\site-packages\openpyxl\worksheet\_reader.py:329: UserWarning: Data Validation extension is not supported and will be removed
  warn(msg)
C:\Users\User\Documents\project_x\.venv\Lib\site-packages\openpyxl\worksheet\_reader.py:329: UserWarning: Conditional Formatting extension is not supported and will be removed
  warn(msg)

The A/C task was not found or the Duration column is missing.
[... 40 times the same line]
The duration for the A/C task is: =SheetB!Y44 days

OpenShift:

stdout: 
C:\Users\User\Documents\project_x\.venv\Lib\site-packages\openpyxl\worksheet\_reader.py:329: UserWarning: Data Validation extension is not supported and will be removed
  warn(msg)
C:\Users\User\Documents\project_x\.venv\Lib\site-packages\openpyxl\worksheet\_reader.py:329: UserWarning: Conditional Formatting extension is not supported and will be removed
  warn(msg)

Troubleshooting Done

  • Checked for any unhandled exceptions or early exits in the code.
  • Tried setting PYTHONUNBUFFERED=1 (environment variable), flush=True (print), and bufsize=0 (subprocess).
  • Reviewed environment differences between local and OpenShift setups.
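
For reference, the buffering checks above can be combined with a look at the child's return code, which app.py does not print. The following is only a sketch: the helper name run_and_report is hypothetical, and an inline child script stands in for snippet.py so the example runs on its own. A non-zero return code would indicate that the child stopped early rather than that its output was lost; on POSIX systems a negative value means the child was terminated by a signal.

```python
import os
import subprocess
import sys


def run_and_report(args):
    """Run a child Python process unbuffered; return (returncode, combined output)."""
    # PYTHONUNBUFFERED=1 and -u both disable output buffering in the child
    env = dict(os.environ, PYTHONUNBUFFERED="1")
    result = subprocess.run(
        [sys.executable, "-u", *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge warnings/errors into stdout
        text=True,
        env=env,
        check=False,
    )
    return result.returncode, result.stdout


# Stand-in for snippet.py: emit a warning, then print a normal line
child_code = (
    "import warnings\n"
    "warnings.warn('simulated openpyxl warning')\n"
    "print('output after the warning')\n"
)

returncode, output = run_and_report(["-c", child_code])
print(f"return code: {returncode}")
print("stdout: " + output)
```

To apply this to the real script, one would pass ["snippet.py"] instead of the inline child code.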

Environment Details

Spec      Local    OpenShift
python    3.11.2   3.11.5
openpyxl  3.1.2    3.1.2

Questions

  • Why might the print statements after the warnings not be captured in OpenShift?
  • Are there specific configurations in OpenShift that could affect the buffering or capturing of subprocess outputs?

Solution

  • After further investigation, we narrowed the root cause down to openpyxl's load_workbook function rather than subprocess, as we had first presumed. The issue only occurs when read_only is set to False, which is also the default when the argument is omitted. This might be because our OpenShift deployment runs on Linux and therefore lacks some components that seem relevant for copying/editing an Excel file, especially one that contains formulas. Locally we don't encounter the issue because the machine runs Windows and also has MS Office components such as Excel installed.

    Causes the issue on OpenShift:

    from openpyxl import load_workbook
    wb = load_workbook(r'some_excel_file.xlsx', read_only=False)
    

    No issue on OpenShift:

    from openpyxl import load_workbook
    wb = load_workbook(r'some_excel_file.xlsx', read_only=True)
    

    Other questions discuss the possibilities of editing Excel files in Python on a Linux machine, e.g. this one: "On Linux, use Python to edit excel file that contains formulas and then read the resulting values"
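
    A minimal sketch of the read-only variant, assuming openpyxl is installed. The helper name read_rows_read_only and the temporary demo workbook are hypothetical and exist only to make the example self-contained; in our case the path would be some_excel_file.xlsx. A workbook opened with read_only=True keeps the file open, so the sketch closes it explicitly.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def read_rows_read_only(path, sheet_name):
    """Load a workbook in read-only mode and return its rows as tuples of values."""
    wb = load_workbook(path, read_only=True)
    try:
        sheet = wb[sheet_name]
        return list(sheet.iter_rows(values_only=True))
    finally:
        # Read-only workbooks hold the file handle until closed
        wb.close()


# Build a small demo workbook so the sketch runs without the original file
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.xlsx")
    demo_wb = Workbook()
    demo_ws = demo_wb.active
    demo_ws.title = "Sheet C"
    demo_ws.append(["Task", "Duration"])
    demo_ws.append(["A/C", 5])
    demo_wb.save(demo_path)

    rows = read_rows_read_only(demo_path, "Sheet C")

for row in rows:
    print(row)
```

    Read-only mode only covers reading; it does not help if the script also needs to edit and save the workbook.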