python pandas dataframe

How to print a df with 156636 lines?


I am trying to print a dataframe that contains 156636 rows. It takes ages to print. It is supposed to print the first rows, then '...', and then some of the final rows, but I can't understand why it prints nothing.

I tried print(df), and also just df in a separate cell in Google Colab, but nothing shows up. Does anyone know what is going on?


Solution

  • Printing out a large dataframe is rarely a good idea. What you need to do instead is some data analysis to understand your data better. I suggest plotting the things you are interested in. Use df['some_name'].isnull().values.any() to see whether a column has null values. Use df.dtypes (or df['some_name'].dtype for a single column) to check the data types if you have to. Plot the columns you are interested in across all rows; you might find outliers this way. Indexing with df.iloc[] and df.loc[] (square brackets, not parentheses) lets you pull out the data points that you might be interested in. And as AKX and Gustavo have mentioned, use df.describe(), df.head(), df.tail() etc. to understand your data. Hope this helps!
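
As a rough sketch of the inspection steps above: the dataframe here is a made-up stand-in with the same number of rows as in the question, and the column names 'some_name' and 'category' are placeholders, not the asker's real columns.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: same row count, invented columns.
rng = np.random.default_rng(0)
n_rows = 156636
df = pd.DataFrame({
    "some_name": rng.normal(size=n_rows),
    "category": rng.choice(["a", "b", "c"], size=n_rows),
})
df.loc[10, "some_name"] = np.nan  # plant one missing value for the demo

# Quick overviews instead of printing everything.
print(df.shape)       # (rows, columns)
print(df.dtypes)      # data type of each column
print(df.head())      # first 5 rows
print(df.tail())      # last 5 rows
print(df.describe())  # summary statistics for numeric columns

# Does the column contain any null values?
has_nulls = bool(df["some_name"].isnull().values.any())
print(has_nulls)

# Indexers take square brackets, not parentheses.
first_rows = df.iloc[:3]                                # by position
subset = df.loc[df["category"] == "a", ["some_name"]]   # by label / boolean mask
print(first_rows)
print(len(subset))
```

Each of these prints only a handful of lines, so they should come back quickly even when the full print(df) appears to hang.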