Search code examples
looker-studio

How can I add values to a chart that do not exist as 0 in google data studio?


I have got 4 tables in BigQuery that keep statistics for messages in a Message Queue. The tables are : receivedMessages, processedMessages, skippedMessages and failedMessages. Each table has among other things a header.processingMetadata.approximateArrivalTimestamp which as you might have guessed it is a timestamp field.

My purpose is to create 4 charts for each one of this tables aggregating in this field as well as a 5th chart that displays the percentage of each message category each day in regards to the receivedMessages as well as the unknown status messages using the following formula :

UNKNOWN_STATUS_MESSAGES = TOTAL_RECEIVED_MESSAGES - (TOTAL_PROCESSED_MESSAGES + TOTAL_SKIPPED_MESSAGES + TOTAL_FAILED_MESSAGES)

However some days do not have skipped or failed messages, therefore there are no records in Big Query in these two tables. This results to these 2 graphics having dates missing and also not displaying correctly the UNKNOWN_STATUS_MESSAGES in the 5th graph.

I also used the following code as a metric in my graphs with no success (changing the variable name appropriately each time).

CASE WHEN TOTAL_FAILED_MESSAGES IS NULL THEN 0 ELSE TOTAL_FAILED_MESSAGES END

Is there a way to make google data studio to fill the dates with no data with 0s so I can display the charts correctly?


Solution

  • As long as you know the date boundaries of your chart, you can fill those holes with zeros. For instance, if you want to generate your report for last 30 days:

    with dates as (
       select 
          x as date 
       from 
          unnest(generate_date_array(date_sub(current_date(), interval 30 day), current_date())) as x
    )
    
    select 
     date, 
     received_messages, 
     processed_messages, 
     skipped_messages, 
     failed_messages, 
     received_messages - (processed_messages + skipped_messages + failed_messages) as unknown_messages from (
        select 
           d.date,
           coalesce(count(received.*), 0) as received_messages,
           coalesce(count(processed.*), 0) as processed_messages,
           coalesce(count(skipped.*), 0) as skipped_messages,
           coalesce(count(failed.*), 0) as failed_messages
        from dates d 
           left join dataset.receivedMessages received 
              on date(received.header.processingMetadata.approximateArrivalTimestamp) = d.date
           left join dataset.processedMessages processed
              on date(processed.header.processingMetadata.approximateArrivalTimestamp) = d.date
           left join dataset.skippedMessages skipped
              on date(skipped.header.processingMetadata.approximateArrivalTimestamp) = d.date
           left join dataset.failedMessages failed
              on date(failed.header.processingMetadata.approximateArrivalTimestamp) = d.date
        
        group by 1
    )
    order by 1