Example Input File
AsofDate,Portfolio,Shock,CUSIP,DATE,BALANCE,INTEREST,SCHED_PRIN,UNSCHED_PRIN,LOSS
28-MAR-2024,INV,DN50,28623RAE8,22-APR-2024,25000000,447380,0,0,0
28-MAR-2024,INV,DN50,28623RAE8,22-JUL-2024,25000000,413892,0,0,0
28-MAR-2024,INV,DN50,28623RAE8,21-OCT-2024,25000000,399624.25,0,0,0
28-MAR-2024,INV,DN50,28623RAE8,21-JAN-2025,25000000,386169.5,0,0,0
Lets say above is my input file, I would like to change the format of 1st (AsofDate) and 5th (DATE) column date form from dd-MMM-YYYY (Example: 28-MAR-2024) to mm/dd/YYYY (Example: 03/28/2024).
Tried the below command for 1st column,
awk -F',' '{gsub(/"/,"",$1); cmd="date -d\""$1"\" +\\\"%m/%d/%Y\\\""; cmd |getline $1; close(cmd); print >FILENAME }' OFS=, $FILE
The problem with this command is, its taking more time to process. The file size is 26000KB and I need to iterate for all the files under a directory. How to modify this command to achieve better performance or efficiency?
OP's current code takes so long to run because of the repeated calls out to the OS to have date
reformat the date fields. We can eliminate these expensive OS/date
calls with some simple splitting and reformatting of the date fields.
Assumptions:
One awk
idea:
awk '
function reformat(_date, c) {
split(_date,c,"-") # split _date on "-", place parts in array c[]
return map[c[2]] "/" c[1] "/" c[3] # return parts in desired format
}
BEGIN { FS = OFS = ","
split("JAN,FEB,MAR,APR,MAY,JUN,JUL,AUG,SEP,OCT,NOV,DEC",a)
for (i in a)
map[a[i]] = sprintf("%02d",i) # convert a[1]=JAN to map[JAN]=01
}
FNR>1 { $1 = reformat($1) # skip header line; replace $1 with new date format
$5 = reformat($5) # replace $5 with new date format
}
1 # print current line
' sample.file
This generates:
AsofDate,Portfolio,Shock,CUSIP,DATE,BALANCE,INTEREST,SCHED_PRIN,UNSCHED_PRIN,LOSS
03/28/2024,INV,DN50,28623RAE8,04/22/2024,25000000,447380,0,0,0
03/28/2024,INV,DN50,28623RAE8,07/22/2024,25000000,413892,0,0,0
03/28/2024,INV,DN50,28623RAE8,10/21/2024,25000000,399624.25,0,0,0
03/28/2024,INV,DN50,28623RAE8,01/21/2025,25000000,386169.5,0,0,0