
Python Plotly Express histogram: graph does not show all unique TIME_BUCKET values; it groups TIME_BUCKETs into hourly bins


My CSV has three columns and about 1.1K rows. The TIME_BUCKET column holds 5-minute buckets such as 03:40:00+00:00, 03:45:00+00:00, etc. I expect the histogram to show a bar for each of these distinct TIME_BUCKET values, but it actually plots hourly bins such as 03:00, 04:00, etc.

My code is below:

import pandas as pd
import plotly.express as px

df = pd.read_csv("D:/Work/Issue/5MinTimeBucketHistogramNotWorking1.csv")
graph = px.histogram(df, x='TIME_BUCKET', color='REPORT_NAME', title='Report Category Wise Execution Count (5 minutes sample size)')
graph.show()

My CSV content looks like the sample below; the full file has about 1.1K rows. The whole CSV is shared here for reference.

   ,REPORT_NAME,TIME_BUCKET
23,DashboardReport,2021-01-20 03:30:00+00:00
33,DashboardReport,2021-01-20 03:40:00+00:00
69,ExportReport,2021-01-20 03:40:00+00:00
74,ExportReport,2021-01-20 03:40:00+00:00
97,ExportReport,2021-01-20 03:40:00+00:00
98,ExportReport,2021-01-20 03:40:00+00:00
99,ExportReport,2021-01-20 03:40:00+00:00
101,ExportReport,2021-01-20 03:40:00+00:00
103,ExportReport,2021-01-20 03:40:00+00:00
2821,DashboardReport,2021-01-20 15:40:00+00:00
2822,DashboardReport,2021-01-20 15:40:00+00:00
2823,DashboardReport,2021-01-20 15:45:00+00:00
2896,DashboardReport,2021-01-20 16:15:00+00:00
3283,SQLReport,2021-01-20 19:00:00+00:00
3285,DashboardReport,2021-01-20 19:00:00+00:00
3288,DashboardReport,2021-01-20 19:05:00+00:00
3289,DashboardReport,2021-01-20 19:05:00+00:00
3292,ImportReport,2021-01-20 19:05:00+00:00
3293,DashboardReport,2021-01-20 19:05:00+00:00
3294,DashboardReport,2021-01-20 19:05:00+00:00
3295,DashboardReport,2021-01-20 19:10:00+00:00
3297,DashboardReport,2021-01-20 19:10:00+00:00
3298,SQLReport,2021-01-20 19:10:00+00:00
3300,DashboardReport,2021-01-20 19:10:00+00:00
3303,SQLReport,2021-01-20 19:15:00+00:00
3307,ImportReport,2021-01-20 19:15:00+00:00
3309,DashboardReport,2021-01-20 19:15:00+00:00
3312,DashboardReport,2021-01-20 19:15:00+00:00
3313,DashboardReport,2021-01-20 19:15:00+00:00
3314,SQLReport,2021-01-20 19:15:00+00:00
3315,DashboardReport,2021-01-20 19:15:00+00:00
3316,DashboardReport,2021-01-20 19:15:00+00:00
3317,DashboardReport,2021-01-20 19:15:00+00:00
3318,ImportReport,2021-01-20 19:15:00+00:00
3319,DashboardReport,2021-01-20 19:15:00+00:00
3324,DashboardReport,2021-01-20 19:20:00+00:00
3328,SQLReport,2021-01-20 19:20:00+00:00
3331,ImportReport,2021-01-20 19:20:00+00:00
3332,ImportReport,2021-01-20 19:20:00+00:00
3335,DashboardReport,2021-01-20 19:20:00+00:00
3336,ImportReport,2021-01-20 19:20:00+00:00
3337,DashboardReport,2021-01-20 19:20:00+00:00
3339,DashboardReport,2021-01-20 19:20:00+00:00
3344,DashboardReport,2021-01-20 19:20:00+00:00
3345,DashboardReport,2021-01-20 19:20:00+00:00
3349,DBReport,2021-01-20 19:20:00+00:00
3350,SQLReport,2021-01-20 19:20:00+00:00
3354,DashboardReport,2021-01-20 19:20:00+00:00
3355,DashboardReport,2021-01-20 19:20:00+00:00
3356,DashboardReport,2021-01-20 19:20:00+00:00
3357,DashboardReport,2021-01-20 19:20:00+00:00
3358,DashboardReport,2021-01-20 19:20:00+00:00
3359,DashboardReport,2021-01-20 19:20:00+00:00
3360,DashboardReport,2021-01-20 19:20:00+00:00
3368,DashboardReport,2021-01-20 19:25:00+00:00
3370,DashboardReport,2021-01-20 19:25:00+00:00
3375,DashboardReport,2021-01-20 19:25:00+00:00
3377,DashboardReport,2021-01-20 19:30:00+00:00
3379,DashboardReport,2021-01-20 19:30:00+00:00
3381,DashboardReport,2021-01-20 19:30:00+00:00
3384,DashboardReport,2021-01-20 19:30:00+00:00
3396,ImportReport,2021-01-20 19:40:00+00:00
3398,DashboardReport,2021-01-20 19:40:00+00:00
3403,DashboardReport,2021-01-20 19:45:00+00:00
3404,DashboardReport,2021-01-20 19:45:00+00:00
3408,DashboardReport,2021-01-20 19:45:00+00:00
3410,DashboardReport,2021-01-20 19:45:00+00:00
3418,DashboardReport,2021-01-20 19:50:00+00:00
3419,SQLReport,2021-01-20 19:50:00+00:00
3421,DashboardReport,2021-01-20 19:50:00+00:00
3422,DashboardReport,2021-01-20 19:50:00+00:00
3429,DashboardReport,2021-01-20 19:50:00+00:00
3434,DashboardReport,2021-01-20 19:55:00+00:00
3443,ImportReport,2021-01-20 20:00:00+00:00
3444,ImportReport,2021-01-20 20:00:00+00:00
3450,DBReport,2021-01-20 20:05:00+00:00
3451,DBReport,2021-01-20 20:05:00+00:00
3489,SQLReport,2021-01-20 20:20:00+00:00
3490,ImportReport,2021-01-20 20:20:00+00:00
3496,DashboardReport,2021-01-20 20:20:00+00:00
3499,ImportReport,2021-01-20 20:25:00+00:00
3501,DashboardReport,2021-01-20 20:25:00+00:00
3505,DashboardReport,2021-01-20 20:25:00+00:00
3513,SQLReport,2021-01-20 20:30:00+00:00
3514,DashboardReport,2021-01-20 20:35:00+00:00
3521,SQLReport,2021-01-20 20:35:00+00:00
3522,DashboardReport,2021-01-20 20:35:00+00:00
3523,DashboardReport,2021-01-20 20:35:00+00:00
3527,DashboardReport,2021-01-20 20:40:00+00:00
3537,DashboardReport,2021-01-20 20:40:00+00:00
3538,DashboardReport,2021-01-20 20:40:00+00:00
3540,DashboardReport,2021-01-20 20:45:00+00:00
3549,DashboardReport,2021-01-20 20:50:00+00:00
3552,DashboardReport,2021-01-20 20:55:00+00:00
3555,SQLReport,2021-01-20 20:55:00+00:00
3556,DashboardReport,2021-01-20 20:55:00+00:00
3557,SQLReport,2021-01-20 20:55:00+00:00
3558,DashboardReport,2021-01-20 20:55:00+00:00

The output looks like this:

[Image: histogram with hourly bins on the x-axis]


Solution

  • Your df['TIME_BUCKET'] column is, unsurprisingly, interpreted by Plotly as continuous time and is therefore shown on a continuous x-axis with automatically chosen bins. If you'd like to display the bucket values as categories, just as they occur in your dataframe, add:

    fig.update_xaxes(type='category')
    

    If you adjust the font size of the ticktext a bit as well, then you'll end up with this:

    [Image: resulting histogram with a category x-axis]

    Notice that I've used a formatted version of df['TIME_BUCKET'] that keeps only the HH:MM part of each timestamp:

    df['buckets'] = [dat[11:16] for dat in df['TIME_BUCKET']]
    

    If you don't, you'll end up with this:

    [Image: the same histogram with full timestamp strings as tick labels]
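The slice `dat[11:16]` works because every timestamp string has the same fixed layout, so characters 11 to 15 are always the `HH:MM` part. Here is a minimal standard-library sketch of that step. The helper name `bucket_label` and its `keep_date` option are illustrative assumptions, not part of the original answer. The date-preserving variant may matter if the full CSV spans more than one day, since `HH:MM` alone would merge buckets from different dates.

```python
from datetime import datetime


def bucket_label(timestamp: str, keep_date: bool = False) -> str:
    """Return a short axis label for a 'YYYY-MM-DD HH:MM:SS+00:00' string.

    Hypothetical helper: parsing instead of slicing raises a ValueError
    if a row does not follow the expected timestamp format.
    """
    parsed = datetime.fromisoformat(timestamp)
    return parsed.strftime("%m-%d %H:%M" if keep_date else "%H:%M")


sample = "2021-01-20 03:40:00+00:00"
print(sample[11:16])                         # the slice used in the answer
print(bucket_label(sample))                  # same label, obtained by parsing
print(bucket_label(sample, keep_date=True))  # label that also keeps month and day
```

With a DataFrame, the same helper could presumably be applied as `df['buckets'] = df['TIME_BUCKET'].map(bucket_label)`.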

    Complete code with data sample:

    import pandas as pd
    import plotly.express as px
    
    # Sample data, as produced by calling df.to_dict() on the original dataframe
    df = pd.DataFrame({'   ': {0: 23,
      1: 33,
      2: 69,
      3: 74,
      4: 97,
      5: 98,
      6: 99,
      7: 101,
      8: 103,
      9: 2821,
      10: 2822,
      11: 2823,
      12: 2896,
      13: 3283,
      14: 3285,
      15: 3288,
      16: 3289,
      17: 3292,
      18: 3293,
      19: 3294,
      20: 3295,
      21: 3297,
      22: 3298,
      23: 3300,
      24: 3303,
      25: 3307,
      26: 3309,
      27: 3312,
      28: 3313,
      29: 3314,
      30: 3315,
      31: 3316,
      32: 3317,
      33: 3318,
      34: 3319,
      35: 3324,
      36: 3328,
      37: 3331,
      38: 3332,
      39: 3335,
      40: 3336,
      41: 3337,
      42: 3339,
      43: 3344,
      44: 3345,
      45: 3349,
      46: 3350,
      47: 3354,
      48: 3355,
      49: 3356,
      50: 3357,
      51: 3358,
      52: 3359,
      53: 3360,
      54: 3368,
      55: 3370,
      56: 3375,
      57: 3377,
      58: 3379,
      59: 3381,
      60: 3384,
      61: 3396,
      62: 3398,
      63: 3403,
      64: 3404,
      65: 3408,
      66: 3410,
      67: 3418,
      68: 3419,
      69: 3421,
      70: 3422,
      71: 3429,
      72: 3434,
      73: 3443,
      74: 3444,
      75: 3450,
      76: 3451,
      77: 3489,
      78: 3490,
      79: 3496,
      80: 3499,
      81: 3501,
      82: 3505,
      83: 3513,
      84: 3514,
      85: 3521,
      86: 3522,
      87: 3523,
      88: 3527,
      89: 3537,
      90: 3538,
      91: 3540,
      92: 3549,
      93: 3552,
      94: 3555,
      95: 3556,
      96: 3557,
      97: 3558},
     'REPORT_NAME': {0: 'DashboardReport',
      1: 'DashboardReport',
      2: 'ExportReport',
      3: 'ExportReport',
      4: 'ExportReport',
      5: 'ExportReport',
      6: 'ExportReport',
      7: 'ExportReport',
      8: 'ExportReport',
      9: 'DashboardReport',
      10: 'DashboardReport',
      11: 'DashboardReport',
      12: 'DashboardReport',
      13: 'SQLReport',
      14: 'DashboardReport',
      15: 'DashboardReport',
      16: 'DashboardReport',
      17: 'ImportReport',
      18: 'DashboardReport',
      19: 'DashboardReport',
      20: 'DashboardReport',
      21: 'DashboardReport',
      22: 'SQLReport',
      23: 'DashboardReport',
      24: 'SQLReport',
      25: 'ImportReport',
      26: 'DashboardReport',
      27: 'DashboardReport',
      28: 'DashboardReport',
      29: 'SQLReport',
      30: 'DashboardReport',
      31: 'DashboardReport',
      32: 'DashboardReport',
      33: 'ImportReport',
      34: 'DashboardReport',
      35: 'DashboardReport',
      36: 'SQLReport',
      37: 'ImportReport',
      38: 'ImportReport',
      39: 'DashboardReport',
      40: 'ImportReport',
      41: 'DashboardReport',
      42: 'DashboardReport',
      43: 'DashboardReport',
      44: 'DashboardReport',
      45: 'DBReport',
      46: 'SQLReport',
      47: 'DashboardReport',
      48: 'DashboardReport',
      49: 'DashboardReport',
      50: 'DashboardReport',
      51: 'DashboardReport',
      52: 'DashboardReport',
      53: 'DashboardReport',
      54: 'DashboardReport',
      55: 'DashboardReport',
      56: 'DashboardReport',
      57: 'DashboardReport',
      58: 'DashboardReport',
      59: 'DashboardReport',
      60: 'DashboardReport',
      61: 'ImportReport',
      62: 'DashboardReport',
      63: 'DashboardReport',
      64: 'DashboardReport',
      65: 'DashboardReport',
      66: 'DashboardReport',
      67: 'DashboardReport',
      68: 'SQLReport',
      69: 'DashboardReport',
      70: 'DashboardReport',
      71: 'DashboardReport',
      72: 'DashboardReport',
      73: 'ImportReport',
      74: 'ImportReport',
      75: 'DBReport',
      76: 'DBReport',
      77: 'SQLReport',
      78: 'ImportReport',
      79: 'DashboardReport',
      80: 'ImportReport',
      81: 'DashboardReport',
      82: 'DashboardReport',
      83: 'SQLReport',
      84: 'DashboardReport',
      85: 'SQLReport',
      86: 'DashboardReport',
      87: 'DashboardReport',
      88: 'DashboardReport',
      89: 'DashboardReport',
      90: 'DashboardReport',
      91: 'DashboardReport',
      92: 'DashboardReport',
      93: 'DashboardReport',
      94: 'SQLReport',
      95: 'DashboardReport',
      96: 'SQLReport',
      97: 'DashboardReport'},
     'TIME_BUCKET': {0: '2021-01-20 03:30:00+00:00',
      1: '2021-01-20 03:40:00+00:00',
      2: '2021-01-20 03:40:00+00:00',
      3: '2021-01-20 03:40:00+00:00',
      4: '2021-01-20 03:40:00+00:00',
      5: '2021-01-20 03:40:00+00:00',
      6: '2021-01-20 03:40:00+00:00',
      7: '2021-01-20 03:40:00+00:00',
      8: '2021-01-20 03:40:00+00:00',
      9: '2021-01-20 15:40:00+00:00',
      10: '2021-01-20 15:40:00+00:00',
      11: '2021-01-20 15:45:00+00:00',
      12: '2021-01-20 16:15:00+00:00',
      13: '2021-01-20 19:00:00+00:00',
      14: '2021-01-20 19:00:00+00:00',
      15: '2021-01-20 19:05:00+00:00',
      16: '2021-01-20 19:05:00+00:00',
      17: '2021-01-20 19:05:00+00:00',
      18: '2021-01-20 19:05:00+00:00',
      19: '2021-01-20 19:05:00+00:00',
      20: '2021-01-20 19:10:00+00:00',
      21: '2021-01-20 19:10:00+00:00',
      22: '2021-01-20 19:10:00+00:00',
      23: '2021-01-20 19:10:00+00:00',
      24: '2021-01-20 19:15:00+00:00',
      25: '2021-01-20 19:15:00+00:00',
      26: '2021-01-20 19:15:00+00:00',
      27: '2021-01-20 19:15:00+00:00',
      28: '2021-01-20 19:15:00+00:00',
      29: '2021-01-20 19:15:00+00:00',
      30: '2021-01-20 19:15:00+00:00',
      31: '2021-01-20 19:15:00+00:00',
      32: '2021-01-20 19:15:00+00:00',
      33: '2021-01-20 19:15:00+00:00',
      34: '2021-01-20 19:15:00+00:00',
      35: '2021-01-20 19:20:00+00:00',
      36: '2021-01-20 19:20:00+00:00',
      37: '2021-01-20 19:20:00+00:00',
      38: '2021-01-20 19:20:00+00:00',
      39: '2021-01-20 19:20:00+00:00',
      40: '2021-01-20 19:20:00+00:00',
      41: '2021-01-20 19:20:00+00:00',
      42: '2021-01-20 19:20:00+00:00',
      43: '2021-01-20 19:20:00+00:00',
      44: '2021-01-20 19:20:00+00:00',
      45: '2021-01-20 19:20:00+00:00',
      46: '2021-01-20 19:20:00+00:00',
      47: '2021-01-20 19:20:00+00:00',
      48: '2021-01-20 19:20:00+00:00',
      49: '2021-01-20 19:20:00+00:00',
      50: '2021-01-20 19:20:00+00:00',
      51: '2021-01-20 19:20:00+00:00',
      52: '2021-01-20 19:20:00+00:00',
      53: '2021-01-20 19:20:00+00:00',
      54: '2021-01-20 19:25:00+00:00',
      55: '2021-01-20 19:25:00+00:00',
      56: '2021-01-20 19:25:00+00:00',
      57: '2021-01-20 19:30:00+00:00',
      58: '2021-01-20 19:30:00+00:00',
      59: '2021-01-20 19:30:00+00:00',
      60: '2021-01-20 19:30:00+00:00',
      61: '2021-01-20 19:40:00+00:00',
      62: '2021-01-20 19:40:00+00:00',
      63: '2021-01-20 19:45:00+00:00',
      64: '2021-01-20 19:45:00+00:00',
      65: '2021-01-20 19:45:00+00:00',
      66: '2021-01-20 19:45:00+00:00',
      67: '2021-01-20 19:50:00+00:00',
      68: '2021-01-20 19:50:00+00:00',
      69: '2021-01-20 19:50:00+00:00',
      70: '2021-01-20 19:50:00+00:00',
      71: '2021-01-20 19:50:00+00:00',
      72: '2021-01-20 19:55:00+00:00',
      73: '2021-01-20 20:00:00+00:00',
      74: '2021-01-20 20:00:00+00:00',
      75: '2021-01-20 20:05:00+00:00',
      76: '2021-01-20 20:05:00+00:00',
      77: '2021-01-20 20:20:00+00:00',
      78: '2021-01-20 20:20:00+00:00',
      79: '2021-01-20 20:20:00+00:00',
      80: '2021-01-20 20:25:00+00:00',
      81: '2021-01-20 20:25:00+00:00',
      82: '2021-01-20 20:25:00+00:00',
      83: '2021-01-20 20:30:00+00:00',
      84: '2021-01-20 20:35:00+00:00',
      85: '2021-01-20 20:35:00+00:00',
      86: '2021-01-20 20:35:00+00:00',
      87: '2021-01-20 20:35:00+00:00',
      88: '2021-01-20 20:40:00+00:00',
      89: '2021-01-20 20:40:00+00:00',
      90: '2021-01-20 20:40:00+00:00',
      91: '2021-01-20 20:45:00+00:00',
      92: '2021-01-20 20:50:00+00:00',
      93: '2021-01-20 20:55:00+00:00',
      94: '2021-01-20 20:55:00+00:00',
      95: '2021-01-20 20:55:00+00:00',
      96: '2021-01-20 20:55:00+00:00',
      97: '2021-01-20 20:55:00+00:00'}})
    
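    # Keep only the HH:MM part of each timestamp string as the bucket label
    df['buckets'] = [dat[11:16] for dat in df['TIME_BUCKET']]
    fig = px.histogram(df, x='buckets', color='REPORT_NAME', title='Report Category Wise Execution Count (5 minutes sample size)')
    # Treat the x values as discrete categories instead of continuous time
    fig.update_xaxes(type='category')
    fig.layout.xaxis.tickfont.size = 10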
    
    fig.show()
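One caveat with a category axis: as far as I know, Plotly does not sort the labels chronologically but places them in the order in which they are first encountered in the traces. A bucket that first appears in a later `REPORT_NAME` trace could therefore end up out of sequence. One hedged way to guard against that is to compute the sorted label list yourself and hand it to `px.histogram` through its `category_orders` argument. The sketch below uses only the standard library and a handful of rows copied from the sample data, listed out of order on purpose. It also prints the per-bucket counts that the stacked bar heights should match.

```python
from collections import Counter

# A few (REPORT_NAME, TIME_BUCKET) rows from the sample data,
# deliberately listed out of chronological order.
rows = [
    ("SQLReport", "2021-01-20 19:00:00+00:00"),
    ("DashboardReport", "2021-01-20 19:05:00+00:00"),
    ("DashboardReport", "2021-01-20 03:30:00+00:00"),
    ("ImportReport", "2021-01-20 19:05:00+00:00"),
    ("ExportReport", "2021-01-20 03:40:00+00:00"),
    ("DashboardReport", "2021-01-20 19:00:00+00:00"),
]

# Same HH:MM slice as in the answer.
labels = [timestamp[11:16] for _, timestamp in rows]
counts = Counter(labels)

# Zero-padded HH:MM strings sort chronologically when sorted as text.
ordered = sorted(counts)
print(ordered)
print([counts[label] for label in ordered])

# Assumed usage with the DataFrame from the answer (not executed here):
# fig = px.histogram(df, x='buckets', color='REPORT_NAME',
#                    category_orders={'buckets': sorted(df['buckets'].unique())})
```

Sorting as text is only safe here because the labels are zero-padded and come from a single day; labels that include the date would need a format that also sorts correctly as text.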