I am trying to replace part of each filename with a matching string looked up in another file. The filenames are in the following format:
36872_20190806_00.csv 40800_20190806_00.csv 41883_20190806_00.csv
38064_20190806_00.csv 40848_20190806_00.csv 41891_20190806_00.csv
38341_20190806_00.csv 40856_20190806_00.csv 41923_20190806_00.csv
40417_20190806_00.csv 40948_20190806_00.csv 44373_20190806_00.csv
40745_20190806_00.csv 41217_20190806_00.csv 45004_20190806_00.csv
40754_20190806_00.csv 41256_20190806_00.csv
where the digits before the first _ represent the station code, which I want to replace with the station name from another file named radiosonde.csv. For example, I want to
change 36872_20190806_00.csv to ALMATY_20190806_00.csv
change 38064_20190806_00.csv to KYZYLORDA_20190806_00.csv
The data in radiosonde.csv is given below:
CODE,LAT,LON,Elevation,STN_NAME
41620,31.35,69.467,1407,ZHOB
41600,32.5,74.5333,255,SIALKOT
41598,32.9333,73.7167,232,JHELUM
41594,32.05,72.667,188,SARGODHA
41571,33.6167,73.1,507,ISLAMABAD_AIRPORT
41560,33.8667,70.0833,1725,PARACHINAR
41529,34.0333,71.9333,329,PESHAWAR
41516,35.9167,74.3333,1453,GILGIT
41515,35.5667,71.7833,1464,DROSH
41506,35.9217,71.8,1499,CHITRAL
41316,17.0439,54.1022,23,SALALAH_AIRPORT
41288,20.667,58.9,19,MASIRAH
41256,23.5953,58.2983,8.4,MUSCAT_INTL_AIRPORT
41217,24.4333,54.65,16,ABU_DHABI_INTL_AIRPOR
41169,25.2731,51.6081,4,HAMAD_INTL_AIRPORT
40990,31.5,65.85,1010,KANDAHAR_AIRPORT
40948,34.55,69.2167,1791,KABUL_AIRPORT
40938,34.217,62.217,977,HERAT
40913,36.6667,68.9167,433,KUNDUZ
40911,36.7,67.2,378,MAZAR-I-SHARIF
40875,27.2167,56.3667,10,BANDARABBASS
40856,29.4667,60.8833,1370,ZAHEDAN
40848,29.5333,52.6,1484,SHIRAZ
40841,30.25,56.9667,1748,KERMAN
40821,31.9,54.2833,1238,YAZD
40811,31.3333,48.6667,20,AHWAZ
40809,32.8667,59.2,1491,BIRJAND
40800,32.5175,51.7061,1550.4,ESFAHAN
40754,35.6833,51.3167,1204,TEHRAN-MEHRABAD
40745,36.2667,59.6333,999,MASHHAD
40427,26.267,50.617,2,BAHRAIN
40417,26.45,49.8167,22,KING_FAHD_INTL_AIRPORT
40416,26.267,50.167,19,DHAHRAN
3992,10.83,106.97,11,AN_LOC
38989,35.9,62.9667,375,TAGTABAZAR
38954,37.5,71.5,2077,KHOROG
38927,37.233,67.267,310,TERMEZ
38880,37.987,58.361,211,ASHGABAT_KESHI
38836,38.55,68.783,800,DUSHANBE
38750,37.467,53.967,-22,ESENGYLY
38687,39.083,63.6,190,CHARDZHEV
38613,40.917,72.95,765,DZHALAL-ABAD
38606,40.55,70.95,499,KOKAND
38599,40.217,69.733,427,KHUDJAND
38507,40.0333,52.9833,90,TURKMENBASHI
38457,41.267,69.267,493,TASHKENT
38413,41.733,64.617,237,TAMDY
38392,41.833,59.983,87,DASHKHOVUZ
38353,42.833,74.583,760,BISHKEK
38341,42.85,71.3,652,TARAZ
38064,44.7667,65.5167,133.4,KYZYLORDA
38001,44.55,50.25,-25,FORT SHEVCHENKO
37985,38.733,48.833,-11,LANKARAN
37860,40.5333,50,27,MASHTAGA
36974,41.433,76,2041,NARYN
36872,43.3633,77.0042,662.7,ALMATY
36859,44.167,80.067,645,ZHARKENT
3369,22.77,88.37,0,BARAKPUR
3368,25.88,89.43,0,LALMANIR_HAT
I looked into this question. As suggested there, I tried:
sort -r radiosonde.csv | awk -F"," '{print "for files in *00.csv; do mv $files ${files/" $1 "/" $5 "}; done" }' | bash
It worked to some extent: it renamed some files, left a few unchanged, and gave these errors:
bash: line 25: unexpected EOF while looking for matching `''
bash: line 113: syntax error: unexpected end of file
I don't understand why it behaves so strangely with some files. If I take those filenames, put them into another file, say test.csv, and run the above command again, i.e.
sort -r test.csv | awk -F"," '{print "for files in *00.csv; do mv $files ${files/" $1 "/" $5 "}; done" }' | bash
then it renames all the files that were left earlier. Is there any way to do this with a shell script? I tried the following script, but it didn't work:
for file in *00.csv ; do
mv $files ${files/" $1 "/" $5 "};
done < radiosonde.csv
What about this:
Make sure that radiosonde.csv and all the csv files that you want to rename are in the same directory.
$ cd <directory of radiosonde.csv, 36872_20190806_00.csv, 38064_20190806_00.csv and so on...>
$ ls *.csv > .tmp; awk -F ',' '{name[$1]=$5}END{for(;(getline filename < ".tmp")>0;){ori=filename;sub(/_.+$/,"",filename);pre=filename;sub(/^[0-9]+/,"",ori);post=ori;if(name[pre]!="")system("mv " pre post " " name[pre] post)}} ' 'radiosonde.csv'
$ rm -f '.tmp'
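Before renaming anything for real, it may help to preview what would happen. The following is only a sketch of the same lookup idea that prints the mv commands instead of running them. The function name preview_renames is hypothetical, and it assumes the table has the station code in column 1 and the station name in column 5, as in radiosonde.csv.

```shell
# Hypothetical dry-run helper: prints the mv commands instead of running them.
# $1 = lookup table (for example radiosonde.csv); file names are read from stdin.
preview_renames() {
    awk -F ',' '
        # First input (the table): build the CODE -> STN_NAME map.
        NR == FNR { name[$1] = $5; next }
        # Second input (stdin): one file name per line.
        {
            pre = $0;  sub(/_.*$/, "", pre)        # station code before the first "_"
            post = $0; sub(/^[0-9]+/, "", post)    # the rest, starting at "_"
            # Only print a command when the code is known in the table.
            if (pre in name) print "mv " pre post " " name[pre] post
        }
    ' "$1" -
}
```

It could be called as: ls *.csv | preview_renames radiosonde.csv
Files whose prefix is not in the table, including radiosonde.csv itself, are skipped silently.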
Explanation:
ls *.csv > .tmp
-> List all csv files in the current directory and write their names into .tmp.
awk -F ','
-> Set , (comma) as the field separator for awk, because we want to split lines like 41620,31.35,69.467,1407,ZHOB into separate fields. We can then get them via $1, $2, $3 and so on.
'{ ... }END{ ... }'
-> These are awk's blocks. The first block runs for every line of the input file, and the END block is executed once all input has been read, before the awk program exits.
'radiosonde.csv'
-> Set this as the input file for awk to read.
{name[$1]=$5}
-> $1 is the first field and $5 is the fifth one. In this case $1 would be 41620, 41600 and so on, and $5 would be ZHOB, SIALKOT, etc. name is an array. When we read the first line (the header), we set name[CODE]=STN_NAME, and for the second line name[41620]=ZHOB.
END{}
-> After we have set all the variables we need, we have to rename the files, and END{} is one of the blocks we can use for that purpose.
for(;(getline filename < ".tmp")>0;) {}
-> This reads the .tmp file, which contains the list of files that we want to rename, one line at a time.
ori=filename;
-> Copy the variable filename to another variable. We want to use the sub() function, which alters the variable it is given, but we still need the original filename to get the remaining part of it.
sub(/_.+$/,"",filename);
-> Remove the characters that we don't want, in this case everything from the first _ to the end. For example, if filename is 41620_20190806_00.csv, then _20190806_00.csv will be removed and filename will become 41620.
pre=filename;
-> Copy filename to another variable called pre for clarity.
sub(/^[0-9]+/,"",ori);
-> This removes the leading digits, so ori will become _20190806_00.csv.
post=ori;
-> Copy ori to another variable, in this case post.
if(name[pre]!="")
-> Because radiosonde.csv will be listed in .tmp and is not one of the files that we want to rename, we need this if statement so that we don't get an error from the next command. The file name has no _ in it, so pre is radiosonde.csv and name["radiosonde.csv"] will be empty.
system("mv " pre post " " name[pre] post)
-> This statement renames your file. If pre is 41620 and post is _20190806_00.csv, it translates into "mv 41620_20190806_00.csv ZHOB_20190806_00.csv".
rm -f '.tmp'
-> Delete the .tmp file because we don't need it anymore.
Ignore my comment below. We do need the if statement.
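Since the question also asks for a plain shell script, here is a minimal sketch of the same idea without awk. It reads the table line by line and renames every file whose name starts with that line's code followed by _. The function name rename_by_station is hypothetical, and the sketch assumes the five-column layout of radiosonde.csv (code first, station name last) with no quoted or embedded commas.

```shell
# Hypothetical helper: rename CODE_rest.csv to STN_NAME_rest.csv
# using a lookup table such as radiosonde.csv (passed as $1).
rename_by_station() {
    # Split each table line on commas into five fields.
    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
    while IFS=, read -r code lat lon elev name || [ -n "$code" ]; do
        # Skip the header row and incomplete lines.
        [ "$code" = "CODE" ] && continue
        [ -n "$code" ] && [ -n "$name" ] || continue
        for file in "${code}"_*.csv; do
            # If nothing matched, the pattern stays literal; skip it.
            [ -e "$file" ] || continue
            # Replace the leading code with the station name.
            mv -- "$file" "${name}${file#"$code"}"
        done
    done < "$1"
}
```

It could be run from the directory holding the files as: rename_by_station radiosonde.csv
Looping over the table rather than over the files means each code is matched as a whole prefix, so a short code such as 3992 cannot accidentally match inside a longer one.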