Picking Out Time Data From .dat File

I just discovered Octave and, coming from a primarily MATLAB background, I am astonished that such a program exists.

Here is a sample of the data I am processing, using Octave:

#Recipe: VAC Test
#Pressure: TORR
#Temp.: C
#TC Labels: TC0;TC1;TC2;TC3;TC4;TC5;TC6;TC7;TC8;TC9;TC10;TC11;TC12;TC13;TC14;TC15;TC16;TC17;TC18;TC19;TC20;TC21;TC22;TC23
230915 13:51:52;1;1;9.000E+2;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;19.6;19.1
230915 13:51:53;2;2;9.000E+2;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;19.6;19.1
230915 13:51:54;3;3;9.000E+2;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;19.6;19.1
230918 07:27:47;236156;236156;1.396E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:48;236157;236157;1.397E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:49;236158;236158;1.399E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:50;236159;236159;1.398E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:51;236160;236160;1.398E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:52;236161;236161;1.397E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3

I currently have the script set up to replace the space between the date and time with a ';' for consistent delimiting, then drop the data into a matrix. I can easily get the data I primarily need: date (col 1), pressures (col 5), and two temperature values (last two columns).

However, I am stumped at the time, and then the formatting for the date... and I need them for plotting against temperature and pressures.

The date is in YYMMDD format, which Octave doesn't seem to support for output (according to https://octave.sourceforge.io/octave/function/datestr.html).

Using dlmread, the time comes through with only the digits before the first colon. Sample: 13:51:52 comes in as the value 13.

I have tried csv2cell, but it appears to drop everything into a massive Rx1 vector rather than a matrix.

The goal of the script is for it to eventually be turned into an executable, where a user will input the logfile of a vacuum chamber, and the output will be i) the modified file, and ii) two plots of temperatures and pressures against date and time.

Sep-26 edit: I want to use datenum() and datestr() together with the data from the first column. But with the data in YYMMDD format, I cannot use any of the format codes listed here: octave.sourceforge.io/octave/function/datestr.html. From the look of it, I need to split the date into YY, MM, and DD and pass those to datenum() to get it in properly. I've tried:

textscan(file, '%s %s', 'delimiter', ';', 'headerlines', 4)

Along with:

textscan(file, '%d %d', 'delimiter', ';', 'headerlines', 4)

textscan(file, '%d %d:%d:%d', 'delimiter', ';', 'headerlines', 4)

But all of them return as empty vectors, or just a single vector w/ the filename.

Any help would be appreciated, thank you.

Solution

It is by no means the fastest approach, but even though your data is numeric, it may be better to use something other than dlmread. One approach to that:

First, that date format can be read if you use the calling form that tells datevec what format to look for. E.g.,

datevec("230915 13:51:52", "yymmdd HH:MM:SS")
ans =

   2023      9     15     13     51     52
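For plotting, each timestamp usually needs to become a single number. A minimal sketch, assuming the timestamp string from the question; datenum accepts the same format string, and the variable names here are only illustrative:

```octave
% Timestamp as it appears in the logfile (yymmdd, then 24-hour time).
stamp = "230915 13:51:52";
fmt   = "yymmdd HH:MM:SS";    % mm = month, MM = minute

v  = datevec (stamp, fmt);    % [year month day hour minute second]
t  = datenum (v);             % serial date number, in days
t2 = datenum (stamp, fmt);    % same conversion in a single call

% Converting back recovers the original components.
back = round (datevec (t));
```

Serial date numbers are in days, so differences between samples multiply by 86400 to give seconds.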

For reading the data, if I first manually trim out all of the ~ lines, it leaves:

#Recipe: VAC Test
#Pressure: TORR
#Temp.: C
#TC Labels: TC0;TC1;TC2;TC3;TC4;TC5;TC6;TC7;TC8;TC9;TC10;TC11;TC12;TC13;TC14;TC15;TC16;TC17;TC18;TC19;TC20;TC21;TC22;TC23
230915 13:51:52;1;1;9.000E+2;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;19.6;19.1
230915 13:51:53;2;2;9.000E+2;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;19.6;19.1
230915 13:51:54;3;3;9.000E+2;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;19.6;19.1
230918 07:27:47;236156;236156;1.396E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:48;236157;236157;1.397E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:49;236158;236158;1.399E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:50;236159;236159;1.398E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:51;236160;236160;1.398E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3
230918 07:27:52;236161;236161;1.397E-4;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;26.5;26.3

then the following works acceptably:

fid = fopen ('data.txt');
data = textscan(fid, "%d %d:%d:%d;%d;%d;%f;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;-;%f;%f\n", "headerlines",4)
data = 
{
  [1,1] =

    230915
    230915
    230915
    230918
    230918
    230918
    230918
    230918
    230918

  [1,2] =

    13
    13
    13
     7
     7
     7
     7
     7
     7

  [1,3] =

    51
    51
    51
    27
    27
    27
    27
    27
    27

  [1,4] =

    52
    53
    54
    47
    48
    49
    50
    51
    52

  [1,5] =

         1
         2
         3
    236156
    236157
    236158
    236159
    236160
    236161

  [1,6] =

         1
         2
         3
    236156
    236157
    236158
    236159
    236160
    236161

  [1,7] =

     9.0000e+02
     9.0000e+02
     9.0000e+02
     1.3960e-04
     1.3970e-04
     1.3990e-04
     1.3980e-04
     1.3980e-04
     1.3970e-04

  [1,8] =

     19.600
     19.600
     19.600
     26.500
     26.500
     26.500
     26.500
     26.500
     26.500

  [1,9] =

     19.100
     19.100
     19.100
     26.300
     26.300
     26.300
     26.300
     26.300
     26.300

}
fclose(fid);
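One thing worth noting about the attempts in the question: textscan expects a file identifier from fopen (or a string holding the data itself), not a file name. Given a name, it parses the name as if it were the data, which would likely explain the empty results. A minimal sketch with a made-up two-column file, written to a temporary location purely for illustration:

```octave
% Write a tiny stand-in logfile (hypothetical contents, not the real data).
fname = tempname ();
fid = fopen (fname, "w");
fputs (fid, "#header line\n1;2.5\n3;4.5\n");
fclose (fid);

% Open the file first, then hand the identifier to textscan.
fid = fopen (fname, "r");
cols = textscan (fid, "%f %f", "Delimiter", ";", "HeaderLines", 1);
fclose (fid);
delete (fname);

first  = cols{1};   % numeric column vector from field 1
second = cols{2};   % numeric column vector from field 2
```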

Now, each cell entry is one of the columns of your data. Here, they are all still numbers. You may be able to use %s conversion specifiers to leave some as strings, or use a different format string that tries to keep the date components together, but I found this easy enough to parse from the cell array of numbers that textscan outputs. Your dates and times are located in the first four elements of data:

dates = data(1:4)
dates =
{
  [1,1] =

    230915
    230915
    230915
    230918
    230918
    230918
    230918
    230918
    230918

  [1,2] =

    13
    13
    13
     7
     7
     7
     7
     7
     7

  [1,3] =

    51
    51
    51
    27
    27
    27
    27
    27
    27

  [1,4] =

    52
    53
    54
    47
    48
    49
    50
    51
    52

}

Put them into a single char array that datevec can be told how to read:

dates = num2str([dates{1:4}])
dates =

230915      13      51      52
230915      13      51      53
230915      13      51      54
230918       7      27      47
230918       7      27      48
230918       7      27      49
230918       7      27      50
230918       7      27      51
230918       7      27      52

(The above syntax extracts the cell contents with the curly braces {1:4} and places them as a comma-separated list inside the [] brackets to concatenate them horizontally, before passing them to num2str, which turns them into an evenly spaced char array.)
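The comma-separated-list step can be checked on a small stand-in cell array. This is only a sketch: the two-row values are copied from the sample, and int32 is assumed because that is what %d typically yields from textscan.

```octave
% Stand-in for the first four textscan columns (two samples only).
c = {int32([230915; 230918]), int32([13; 7]), int32([51; 27]), int32([52; 47])};

m = [c{1:4}];       % equivalent to [c{1}, c{2}, c{3}, c{4}]
s = num2str (m);    % char array with one row per sample

% Reading a row back shows that nothing was lost in the conversion.
row2 = sscanf (s(2,:), "%d").';
```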

Finally, turn it into a date vector with datevec, using the syntax I mentioned above. The function parses each row as a separate date vector, and seems to be fine with the variable whitespace and with hours sometimes having only one digit:

dates = datevec (dates, "yymmdd HH MM SS")
dates =

   2023      9     15     13     51     52
   2023      9     15     13     51     53
   2023      9     15     13     51     54
   2023      9     18      7     27     47
   2023      9     18      7     27     48
   2023      9     18      7     27     49
   2023      9     18      7     27     50
   2023      9     18      7     27     51
   2023      9     18      7     27     52
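As an alternative to the fixed textscan format, and purely as a sketch rather than the method used above, the lines can be split on ';' so the timestamp stays a string and the number of '-' placeholders does not matter. The sample text below is shortened and hypothetical; in practice it would come from fileread on the logfile. The column positions (field 4 for pressure, last two fields for temperature) are taken from the sample in the question.

```octave
% Shortened stand-in for the logfile; normally: txt = fileread ("data.txt");
txt = strjoin ({"#Recipe: VAC Test", ...
                "#Pressure: TORR", ...
                "230915 13:51:52;1;1;9.000E+2;-;-;19.6;19.1", ...
                "230918 07:27:47;236156;236156;1.396E-4;-;-;26.5;26.3"}, "\n");

recs = strtrim (strsplit (txt, "\n"));      % one cell per line
recs = recs(! cellfun (@isempty, recs));    % drop blank lines
recs = recs(! strncmp (recs, "#", 1));      % drop '#' header lines

n = numel (recs);
stamps   = cell (n, 1);
pressure = zeros (n, 1);
temps    = zeros (n, 2);
for k = 1:n
  % Keep empty fields so column positions never shift.
  f = strsplit (recs{k}, ";", "CollapseDelimiters", false);
  stamps{k}   = f{1};                       % "yymmdd HH:MM:SS"
  pressure(k) = str2double (f{4});
  temps(k,:)  = str2double (f(end-1:end));  % last two fields
end

t = datenum (datevec (stamps, "yymmdd HH:MM:SS"));  % serial dates, in days
elapsed_s = round ((t - t(1)) * 86400);             % seconds since first sample
```

With t in hand, plot (t, temps) followed by datetick ("x") should give date-labelled axes, and semilogy may suit the pressure trace given its range.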