Python - Pandas - Print 10 rows at a time while keeping them properly aligned


I'm using pandas to create a dataframe and printing 10 rows at a time. If the user responds yes to print more, I print the next 10 rows. This works, but printing it in sections ruins the alignment. Is there a way to keep the columns aligned while still printing only 10 rows at a time?

start_row = 0
end_row = 10
print((df[start_row:end_row]).to_string(index=False))
while end_row < max_rows:
    if user_input("Add More Results? (YES/NO)") == "yes":
        # Deletes the question after "yes" input to make a more readable list
        print("\033[A                             \033[A")

        start_row += 10
        end_row += 10
        print((df[start_row:end_row]).to_string(index=False, header=False))
    else:
        print("\n")
        break
else:
    end_row = max_rows
    print("\nEnd of List\n")

The result is an unaligned mess.


Solution

  • You could try to format the output via formatters:

    # Outside the loop
    formatters = {col: (lambda x: f'{x:>{max(len(str(y)) for y in df[col]) + 2}}')
                  for col in df.columns}
    
    # In the loop
    print(df[start_row:end_row].to_string(index=False, header=False, formatters=formatters))
    

    EDIT: After some consideration I think this is a more robust version:

    formatters = {col: (lambda x: str(x).rjust(max(len(str(y)) for y in df[col]) + 2))
                  for col in df.columns}
    

    This should align the output consistently to the right, with at least 2 blanks of padding on the left (via > in the format string of the first version, or rjust in the second). You can of course change the 2 to anything you want.
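One caveat with the comprehensions above: each lambda looks up col only when it is called, after the comprehension has finished, so every column ends up padded to the width of the last column. The output stays aligned, but Col1 in the sample frame is padded to 7 characters instead of 6. A minimal sketch of a variant that computes each width once and binds it through a default argument follows; make_formatters and pad are names made up for this illustration and are not part of pandas.

```python
import pandas as pd


def make_formatters(df, pad=2):
    """Build one right-aligning formatter per column of df.

    Hypothetical helper, not a pandas API. Each width is computed once
    from the whole column and bound to its lambda as a default argument,
    so every formatter keeps its own width.
    """
    widths = {col: max(len(str(y)) for y in df[col]) + pad
              for col in df.columns}
    return {col: (lambda x, w=w: str(x).rjust(w))
            for col, w in widths.items()}


df = pd.DataFrame({'Col1': [0, 1, 1000, 1001], 'Col2': [0, 1, 10000, 10001]})
formatters = make_formatters(df)

print(repr(formatters['Col1'](0)))  # padded to 4 + 2 = 6 characters
print(repr(formatters['Col2'](0)))  # padded to 5 + 2 = 7 characters
```

The resulting dict can be passed to to_string(formatters=formatters) in the same way as the versions above. Computing the widths up front also avoids rescanning the whole column for every cell.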

    With the sample frame

    df = pd.DataFrame({'Col1': [0, 1, 1000, 1001], 'Col2': [0, 1, 10000, 10001]})
    
       Col1   Col2
    0     0      0
    1     1      1
    2  1000  10000
    3  1001  10001
    

    this

    start_row, end_row = 0, 2
    print(df[start_row:end_row].to_string(index=False, header=False))
    start_row, end_row = 2, 4
    print(df[start_row:end_row].to_string(index=False, header=False))
    

    leads to

     0  0
     1  1
     1000  10000
     1001  10001
    

    whereas this

    start_row, end_row = 0, 2
    print(df[start_row:end_row].to_string(index=False, header=False, formatters=formatters))
    start_row, end_row = 2, 4
    print(df[start_row:end_row].to_string(index=False, header=False, formatters=formatters))
    

    leads to

          0       0
          1       1
       1000   10000
       1001   10001
    

    You can of course also use individual functions for each column. For example this

    formatters = {'Col1': (lambda x: str(x).rjust(max(len(str(y)) for y in df['Col1']) + 2)),
                  'Col2': (lambda x: str(x).ljust(max(len(str(y)) for y in df['Col2']) + 2))}
    

    leads to

         0 0      
         1 1      
      1000 10000  
      1001 10001  
    

    Or you could only define functions for selected columns. But that has the potential to lead to an overall strange output, since the column widths accumulate from left to right.
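To tie this back to the question, here is a rough sketch of how the formatters could be combined with the paging loop. The generator iter_pages and its parameters page_size and pad are hypothetical names for this example only. It assumes every column gets a formatter, that widths are taken from the whole frame (including the column name), and that the header is requested only for the first page.

```python
import pandas as pd


def iter_pages(df, page_size=10, pad=2):
    """Yield df as text blocks of page_size rows that share one layout.

    Hypothetical helper, not a pandas API. Column widths are computed once
    from the full frame, so later pages line up with earlier ones.
    """
    widths = {col: max(len(str(col)), max(len(str(y)) for y in df[col])) + pad
              for col in df.columns}
    # Bind each width via a default argument so every lambda keeps its own.
    formatters = {col: (lambda x, w=w: str(x).rjust(w))
                  for col, w in widths.items()}
    for start in range(0, len(df), page_size):
        yield df[start:start + page_size].to_string(
            index=False, header=(start == 0), formatters=formatters)


df = pd.DataFrame({'Col1': [0, 1, 1000, 1001], 'Col2': [0, 1, 10000, 10001]})
for page in iter_pages(df, page_size=2):
    print(page)
```

In the original loop, the yes/no prompt would go between pages, for example by calling next() on the generator each time the user asks for more results.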