Before I had this code:
for year in 2000 2001 2002 2003; do
echo $year" LST data being merged"
cd $base_data_dir/$year
# this is the part that takes a long time
cdo -f nc2 mergetime *.nc $output_dir/LST_$year.nc
done
I wanted to use GNU Parallel to try and run this in parallel.
I tried the following:
a) Create a 'controller' script that calls other scripts
b) Pass in an array as arguments to GNU Parallel
# 1. Create monthly LST for each year
cd $working_dir
seq 2000 2003 | parallel 'bash create_yearly_LST_files.sh {}'
# 2. Create monthly NDVI for each year
cd $working_dir
seq 2000 2003 | parallel 'bash create_yearly_NDVI_files.sh {}'
This should be running the following in parallel:
bash create_yearly_LST_files.sh 2000
bash create_yearly_LST_files.sh 2001
...
bash create_yearly_NDVI_files.sh 2000
bash create_yearly_NDVI_files.sh 2001
...
Here create_yearly_LST_files.sh contains:
year="$1"
echo $year" LST data being merged"
cd $base_data_dir/$year
cdo -f nc2 mergetime *.nc $output_dir/LST_$year.nc
So the commands should read:
cd $base_data_dir/2000
cdo -f nc2 mergetime *.nc $output_dir/LST_2000.nc
cd $base_data_dir/2001
cdo -f nc2 mergetime *.nc $output_dir/LST_2001.nc
...
cd $base_data_dir/2000
cdo -f nc2 mergetime *.nc $output_dir/NDVI_2000.nc
cd $base_data_dir/2001
cdo -f nc2 mergetime *.nc $output_dir/NDVI_2001.nc
...
The new code still produces the correct output, but there was no speedup.
Can anyone help me understand how to pass each year to be run in parallel?
And also run both of the scripts in parallel (create_yearly_LST_files.sh and create_yearly_NDVI_files.sh)?
What is stopping you from running each merge as a background job and then waiting for all of them?
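for year in 2000 2001 2002 2003; do
echo "$year LST data being merged"
# this is the part that takes a long time
# cd and cdo run together in a backgrounded subshell, so the loop's own working directory never changes
( cd "$base_data_dir/$year" && cdo -f nc2 mergetime *.nc "$output_dir/LST_$year.nc" ) &
done
# block until every background merge has finished
wait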
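The same background-job idea can be extended to cover both scripts and every year at once. The following is only a sketch under a few assumptions: the function name run_all_years is made up for illustration, and both create_yearly_LST_files.sh and create_yearly_NDVI_files.sh are assumed to sit in the current directory and to take the year as their only argument, as in the question. It also records each job's PID so that a failed merge is reported rather than silently ignored.

```shell
#!/usr/bin/env bash
# Sketch: one background job per (dataset, year) pair, then wait for all of them.
# Assumption: create_yearly_LST_files.sh and create_yearly_NDVI_files.sh are in
# the current directory and take the year as their only argument.
# run_all_years is a hypothetical helper name, not part of any tool.

run_all_years() {
    local pids=() failed=0 dataset year pid

    for dataset in LST NDVI; do
        for year in 2000 2001 2002 2003; do
            # start the job in the background and remember its PID
            bash "create_yearly_${dataset}_files.sh" "$year" &
            pids+=("$!")
        done
    done

    # wait on each PID individually so non-zero exit statuses are counted
    for pid in "${pids[@]}"; do
        wait "$pid" || failed=$((failed + 1))
    done

    echo "$failed job(s) failed"
    # return 0 if everything succeeded, 1 otherwise
    return "$(( failed > 0 ))"
}
```

Calling run_all_years from $working_dir starts all eight jobs at once. If the merge is limited by disk throughput rather than CPU, which may be why the earlier attempt showed no speedup, running more jobs concurrently will not help much and can even slow things down. With GNU Parallel, the same combination can be written with two input sources, and the number of simultaneous jobs capped with -j: parallel -j 4 bash create_yearly_{1}_files.sh {2} ::: LST NDVI ::: 2000 2001 2002 2003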