performance bash date awk cygwin

How to execute a date function inside an awk command in a fast way?


I am reading a file with awk, parsing the date on each line, and converting it to seconds since the epoch. My code works, but it runs relatively slowly, and I have large log files to process. Is there another way to do this in less time? Here is my code:

awk '
 {
  currentDateTime=$1 " " $2
  seconds ="date \"+%s\" -d \""currentDateTime" \""
  print "The time in seconds is: [ " seconds "]"
  seconds | getline result
  print result 
 }'  out.log

Here is the output:

The time in seconds is: [ date "+%s" -d "2014-04-01 10:20:22,357 "]
1396336822

Note that the first line of output shows the command string with currentDateTime substituted into it, not the command's result. I am wondering whether I can execute this command without getline, since getline seems slow when I try it on larger files.


Solution

  • The reason getline is slow is that in this case you are running a shell command.

    That's what is slow.

    You are spawning a new shell and running date in it for each entry.

    Stop doing that.

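    Use the GNU awk mktime function if you are using GNU awk. It is built in, so the conversion happens inside the awk process and no external command is run per line.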