Search code examples
excelpowerquery

Should the Date and Time be stored in the same DateTime field or in different Dates and Times?


The goal: to connect a CSV file to Excel and analyze it.
Example of an analysis of DateTime:

  1. Groupings;
  2. Sorting;
  3. Filters;
  4. Charting;
  5. DateTime calculations. (presumably)

1. Datetime store
- Var-1. in one field: DateTime
or
- Var-2. in different fields: Date, Time, Milliseconds, time zone?
Or use Var-1 and Var-2 depending on the circumstances and both options are allowed?

So that you can connect CSV to Excel without additional settings?
Or with minimal settings: for example, change the format of the table field in Excel.
Well, or in PowerQuery to "tweak" something in small things.

I.e. so that the data goes the way of CSV -> PowerQuery -> Excel.
Under the maximum DateTime i understand: yyyy-MM-dd HH:mm:ss.MS +zone

Example: 2024-03-19 13:18:31.395 +03:00


2. For example, I made several columns with different representation options DateTime. I have considered several column options:

  • Column1: 2024-03-21 15:24:36,897 +03:00 - Date time,Milliseconds +zone;
  • Column2: 2024-03-21 15:24:36,897 - milliseconds are separated by ",";
  • Column3: 2024-03-21 15:24:36.897 - milliseconds are separated by ".";
  • Column4: 2024-03-21 15:24:36 - Datetime;
  • Column5: 2024-03-21 - Date;
  • Column6: 15:24:36 - Time;
  • Column7: 897 - milliseconds;
  • Column8: +03:00 - zone.

3. My concerns.
3.1 If you use Column1(or Column2).
PowerQuery recognized the field type as Text.
I am afraid that if I build a pivot table or use a table obtained from a query in other tables, the data will not be recognized as datetime

3.2 If you use Column3.
Milliseconds and time zone are not displayed in Column3
Then Column3 should be used together with the fields Column7(milliseconds), Column8(time zone).

csv

Column1 Column2 Column3 Column4 Column5 Column6 Column7 Column8 Column9 Column10
2024-03-21 15:24:36,897 +03:00 2024-03-21 15:24:36,897 21.03.2024 15:24 21.03.2024 15:24 21.03.2024 15:24:36 897 +03:00 INF Messaeg-i - 0
2024-03-21 15:24:38,192 +03:00 2024-03-21 15:24:38,192 21.03.2024 15:24 21.03.2024 15:24 21.03.2024 15:24:38 192 +03:00 INF Messaeg-i - 1
2024-03-21 15:24:39,490 +03:00 2024-03-21 15:24:39,490 21.03.2024 15:24 21.03.2024 15:24 21.03.2024 15:24:39 490 +03:00 INF Messaeg-i - 2
2024-03-21 15:24:40,821 +03:00 2024-03-21 15:24:40,821 21.03.2024 15:24 21.03.2024 15:24 21.03.2024 15:24:40 821 +03:00 INF Messaeg-i - 3
2024-03-21 15:24:40,949 +03:00 2024-03-21 15:24:40,949 21.03.2024 15:24 21.03.2024 15:24 21.03.2024 15:24:40 949 +03:00 INF Messaeg-i - 4

PowerQuery enter image description here

Excel enter image description here

Code

let
    Источник = Csv.Document(File.Contents("E:\Projects\39\01_pr\01\2024-03-21_15-24-36.csv"),
                    [Delimiter="|", 
                    Columns=10, 
                    Encoding=1251]),
    #"Измененный тип" = Table.TransformColumnTypes(Источник,
                    {
                        {"Column1", type text}, 
                        {"Column2", type text}, 
                        {"Column3", type datetime}, 
                        {"Column4", type datetime}, 
                        {"Column5", type date}, 
                        {"Column6", type time}, 
                        {"Column7", Int64.Type}, 
                        {"Column8", type text}, 
                        {"Column9", type text}, 
                        {"Column10", type text}
                    }
                    )
in
    #"Измененный тип"

Solution

  • If your goal is to use this in Excel, Excel does not have a DateTimeZone type, so I would eliminate that.

    But you can convert your DateTime string into a "real" DateTime by transforming the column into a valid textual DateTime string first.

    Here is representative code:

    let
        Source = Csv.Document(File.Contents("C:\Users\ron\Desktop\Times.txt"),[Delimiter="  ", Columns=10, Encoding=1252, QuoteStyle=QuoteStyle.None]),
    
    //We will only process column 2
        DateTime = Table.SelectColumns(Source,"Column2"),
        #"Promoted Headers" = Table.PromoteHeaders(DateTime, [PromoteAllScalars=true]),
    
    //Create valid DateTime string by inserting a `T` before the Time component
        #"DateTime Type" = Table.TransformColumns(#"Promoted Headers", {
            {"Column2", each DateTime.FromText(Text.ReplaceRange(_,10,1,"T"),[Culture="en-150"]), type datetime}
        })
    in
        #"DateTime Type"
    

    In Power Query the dates are shown as in your local culture (mine is en-US or m/d/y h:mm:ss AM/PM but when you load it back to Excel, the milliseconds will be preserved and you can format however you wish.

    enter image description here

    And here is that same data, loaded to Excel and formatted to include seconds:

    enter image description here