Search code examples
datetimelooker-studioto-date

Data Studio "Date Range Control" doesn't working when date column with empty values


I have two different DateTime columns with the same format date. "Created_datetime" is well recognized by data studio as a date). "paid_datetime" will contain some empty value, Data Studio does not recognize it as date but as text. Data table shown below.

Example data

After my research, I apply "TODATE" function as TODATE(paid_datetime, "%d/%m/%Y %H:%M", "%d%m%Y%H%M") TODATE FORMULA

But after applying the filter "Date Range Control", the "total_selling" and "grand_total" did not match the correct results. May I know any solution for this issue? enter image description here

Appreciate for help! Thank you! Detail Provided below:

1.)Public Editable DataStudio Report Link:

https://datastudio.google.com/reporting/a3067f59-b01a-411c-aab8-4f930495cf2a

2.) Example Datatable

id total_selling grand_total paid_datetime Created_datetime
1 10 11 23/6/2022 23:22 17/3/2021 14:34
2 17/3/2021 14:34
3 12 14 20/6/2022 20:22 17/4/2021 14:34
4 17/4/2021 14:34
5 12 14 10/6/2022 13:22 17/4/2021 14:34
6 10/4/2021 14:34
7 2 5 5/4/2022 21:22 10/4/2021 14:34
8 11 21 1/6/2022 2:33 30/5/2022 14:44
9 30/5/2022 14:44

Solution

  • 0) Summary

    Broken down into three sections:

    1. Looks at the root cause of the issue (date formatting at the data set)
    2. Explains the issue with the currently attempted TODATE calculated field
    3. Suggestion to solve the issue that only requires Google Data Studio (does not need data cleaning at the Google Sheet, and is intended to work with a formatted date column, plaint text date fields or dirty data (which includes a mix of formatted and unformatted date fields, as is the case here)

    1) Issue at the Data Set

    The issue stems from how the dates are formatted at the Google Sheets data set. While some values in the paid_datetime field were detected as dates (left aligned; id values 5, 7 and 8; highlighted in green, below), others were treated as plain text (right aligned; id values 2 and 4; highlighted red in the image below). Additionally, simply changing the format of the entire column to the date format also does not resolve the issue as the days and months have been identified in the reverse order for values currently in the date format (id values 5, 7 and 8; highlighted in orange, below):

    To explain created a new Google Sheet and added a column that uses the ISDATE and CELL functions and highlighted the issues in red and orange:

    issue_date

    id total_selling grand_total paid_datetime Cell Type
    1 10 11 23/6/2022 23:22 Plain Text
    2 NULL
    3 12 14 20/6/2022 20:22 Plain Text
    4 NULL
    5 12 14 10/6/2022 13:22 Date
    6 NULL
    7 2 5 5/4/2022 21:22 Date
    8 11 21 1/6/2022 2:33 Date
    9 NULL

    NOTE 1: for purposes of simplicity and avoiding confusion, focusing on a single date column, thus excluding the Created_datetime field.

    2) Issue with the current attempt

    The problem with the calculated field in the question (named TODATE_testing in the report) is two fold:

    2.1) Partial Date Recognition

    The calculated field in the question and the respective output are:

    TODATE(paid_datetime, "%d/%m/%Y %H:%M", "%d%m%Y%H%M")
    
    id total_selling grand_total paid_datetime TODATE_testing
    1 10 11 23/6/2022 23:22 Jun 23, 2022
    2
    3 12 14 20/6/2022 20:22 Jun 20, 2022
    4
    5 12 14 2022-10-06 13:22:00
    6
    7 2 5 2022-05-04 21:22:00
    8 11 21 2022-01-06 02:33:00
    9

    It seems like Google Data Studio had recognised the cells that were in the Date format in Google Sheets (id values 5, 7 and 8) and had attempted to turn it into the native format, thus why it has changed the format to:

    YYYY-DD-MM HH:MM:SS

    NOTE 2: the native format was meant to be "YYYY-MM-DD HH:MM:SS", however, in this case (again, the issue stems from the data set as highlighted in the first image), the day (DD) and month (MM) components were detected the other way round for some components (which is accounted for in the suggestion below)

    2.2) Compatibility Date format

    The TODATE function is no longer recommended as of the 17 Sep 2020 update, which saw "New and improved date and time functions"; the top of the respective function page displays the following warning:

    This function only supports compatibility mode dates.

    We recommend upgrading any older date fields in your data sources to the new Date or Date & Time types. After upgrading:

    Learn more.

    3) Suggestion

    As there are two different Date formats, the CASE below ensures that each format is recognised using the the PARSE_DATETIME function:

    CASE
      WHEN REGEXP_CONTAINS(paid_datetime, "/") THEN PARSE_DATETIME("%d/%m/%Y %H:%M", paid_datetime)
      WHEN REGEXP_CONTAINS(paid_datetime, "-") THEN PARSE_DATETIME("%Y-%d-%m %H:%M:%S", paid_datetime)
      ELSE NULL
    END
    

    The calculated field above (titled Date_CASE in the report below) would also work if all the date values in the data column were formatted as expected at the data set (Dates in Google Sheets) or were all plain text values.

    Publicly editable Google Data Studio report (embedded Google Sheets data source) and a GIF to elaborate:

    gif