Search code examples
pythonpandaspycharmxlsxwriter

Create plots using python while separating excel data to new files


I'll preface this with the fact that I am new to Python and coding in general. I can successfully separate my data based on unique values in a certain column, but as the for loop iterates I also want to create a plot. For some reason, python is creating the separate Excel sheets but is completely ignoring the plot. Any advice/pointing out what I'm doing wrong would be greatly appreciated.

import pandas as pd
import xlsxwriter

excel_file = 'CR1000XSeries_LiftoffData.xlsx'

df = pd.read_excel(excel_file, sheet_name='CR1000XSeries_LiftoffData')

split_values = df['StrandID'].unique()

for value in split_values:
    df1 = df[df['StrandID'] == value]
    output_file_name = "StrandID" + str(value) + "Test.xlsx"

    workbook = xlsxwriter.Workbook(output_file_name)
    ws = workbook.add_worksheet('Sheet1')


    rows = len(df1)

    r1 = 1
    r2 = rows
    C1 = 0
    C2 = 9
    C3 = 14

    chart = workbook.add_chart({'type': 'scatter', 'subtype': 'straight'})

    chart.add_series({
        'categories': ['Sheet1', r1, C1, r2, C1],
        'values': ['Sheet1', r1, C2, r2, C2],
        'name': 'JackPressurePSI'})

    # Second Y axis
    chart.add_series({
        'categories': ['Sheet1', r1, C1, r2, C1],
        'values': ['Sheet1', r1, C3, r2, C3],
        'name': 'AbsoluteDispIN',
        'y2_axis': 1})

    ws.insert_chart('B10', chart)

    df1.to_excel(output_file_name, index=False)

Solution

  • The main issue is that the program is overwriting the xlsx file created by xlsxwriter.Workbook() with one of same name created by df1.to_excel().

    See the XlsxWriter docs on Working with Python Pandas and XlsxWriter for the correct way to use Pandas and XlsxWriter together.

    Fixing that issue the program should work. Here is an updated version:

    import pandas as pd
    
    excel_file = 'CR1000XSeries_LiftoffData.xlsx'
    
    df = pd.read_excel(excel_file, sheet_name='CR1000XSeries_LiftoffData')
    
    split_values = df['StrandID'].unique()
    
    for value in split_values:
        output_df = df[df['StrandID'] == value]
        output_file_name = "StrandID" + str(value) + "Test.xlsx"
    
        # Create a pandas excel writer using XlsxWriter as the engine.
        writer = pd.ExcelWriter(output_file_name, engine='xlsxwriter')
        output_df.to_excel(writer, sheet_name='Sheet1')
    
        # Get the xlsxwriter workbook and worksheet objects.
        workbook = writer.book
        worksheet = writer.sheets['Sheet1']
    
        max_row = df.shape[0]
    
        row1 = 1
        row2 = max_row
        col1 = 0
        col2 = 9
        col3 = 14
    
        chart = workbook.add_chart({'type': 'scatter', 'subtype': 'straight'})
    
        chart.add_series({
            'categories': ['Sheet1', row1, col1, row2, col1],
            'values': ['Sheet1', row1, col2, row2, col2],
            'name': 'JackPressurePSI'})
    
        # Second Y axis
        chart.add_series({
            'categories': ['Sheet1', row1, col1, row2, col1],
            'values': ['Sheet1', row1, col3, row2, col3],
            'name': 'AbsoluteDispIN',
            'y2_axis': True})
    
        worksheet.insert_chart('B10', chart)
    
        writer.save()
    

    I couldn't test this fully because I don't have the input file but it compiles correctly. Also, I changed some of the variable names to make them more explicit.