I have the following directory structure with certain files of interest, on which I have to perform arithmetic calculations using awk.
$ mkdir -p DP1/postProcessing/0/ DP2/postProcessing/0/ DP3/postProcessing/0/;
$ touch DP1/postProcessing/0/wallShearStress.dat DP1/postProcessing/0/wallShearStress_0.02.dat DP2/postProcessing/0/wallShearStress_0.dat DP2/postProcessing/0/wallShearStress_0.1.dat DP3/postProcessing/0/wallShearStress_0.05.dat DP3/postProcessing/0/wallShearStress_0.000012.dat
masterDir/;
$ tree masterDir/
masterDir/
├── DP1
│   └── postProcessing
│       └── 0
│           ├── wallShearStress_0.02.dat
│           └── wallShearStress.dat
├── DP2
│   └── postProcessing
│       └── 0
│           ├── wallShearStress_0.1.dat
│           └── wallShearStress_0.dat
└── DP3
    └── postProcessing
        └── 0
            ├── wallShearStress_0.000012.dat
            ├── wallShearStress_0.05.dat
            └── wallShearStress.dat
Expected output
DP File_processed Output_value #Optional header
DP1 wallShearStress_0.02.dat <some result using AWK>
DP2 wallShearStress_0.1.dat <some result using AWK>
DP3 wallShearStress_0.05.dat <some result using AWK>
My (very basic) attempt failed: the script prints the same file, from the last directory found, three times:
$ for i in $(find -type d -name "DP*"); do
> for j in $(find . -type f -name "wallShearStress*" | tail -n 1); do
> echo $j;
> awk 'NR == 3 {print $0}' $j; # this just for example ...
> # but I wanna do something more here, but no issue with that
> # once I can get the proper files into AWK.
> done;
> done;
./DP3/postProcessing/0/wallShearStress_0.05.dat
./DP3/postProcessing/0/wallShearStress_0.05.dat
./DP3/postProcessing/0/wallShearStress_0.05.dat
Problem definition: I want to,
find the wallShearStress*.dat files in each DP* directory,
where there are multiple wallShearStress*.dat files present in a directory, choose only one of them (e.g. for DP3 only DP3/postProcessing/0/wallShearStress_0.05.dat should be chosen for processing as it has higher precedence than DP3/postProcessing/0/wallShearStress.dat, similarly only DP1/postProcessing/0/wallShearStress_0.02.dat and DP2/postProcessing/0/wallShearStress_0.1.dat should be chosen),
run awk on the chosen wallShearStress*.dat for each directory and output in the masterDir as a .txt/.csv file as shown under "Expected output" above.
Questions
I prefer bash + awk (since it's easier for me to understand than if someone comes up with other programming languages). Thank you so much for your time!
You could just use a for loop for the parent directories and use find for the subdirectories. If your sort has the -V flag, use that.
#!/usr/bin/env bash
# For each DP* directory, print the one wallShearStress* file that sorts first.
for d in masterDir/DP*/; do
  find "$d" -type f -name 'wallShearStress*' | sort -Vk2 -t. | head -n1
done
To loop through the output you can use a while read loop.
#!/usr/bin/env bash
# Read the selected file names one per line from the for loop's output.
while IFS= read -r files; do
  echo Do something with "$files"
done < <(for d in masterDir/DP*/; do find "$d" -type f -name 'wallShearStress*' | sort -Vk2 -t. | head -n1; done)
Another option as per OP's request
#!/usr/bin/env bash
# Same selection, with the while read loop nested inside the directory loop.
for d in masterDir/DP*/; do
  while IFS= read -r files; do
    echo Do something with "$files"
  done < <(find "$d" -type f -name 'wallShearStress*' | sort -Vk2 -t. | head -n1)
done
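Building on the loops above, here is a minimal sketch of feeding the selected file to awk and printing the table from the question. The function name summarise is made up for illustration, and the NR == 3 rule is only the placeholder calculation from the question. It also assumes the only dots in each path are the ones in the file name, since the sort key is split on dots.

```shell
#!/usr/bin/env bash
# Sketch: pick one wallShearStress* file per DP* directory (same sort as above)
# and print a tab-separated "DP  file  value" table.
# "summarise" is a hypothetical helper name, not part of the original answer.
summarise() {
  local root=${1:-masterDir} d file
  printf '%s\t%s\t%s\n' DP File_processed Output_value
  for d in "$root"/DP*/; do
    [ -d "$d" ] || continue        # glob matched nothing
    file=$(find "$d" -type f -name 'wallShearStress*' | sort -Vk2 -t. | head -n1)
    [ -n "$file" ] || continue     # no matching file in this directory
    # Placeholder calculation: print line 3 of the file, as in the question.
    awk -v dp="$(basename "$d")" -v OFS='\t' '
      NR == 3 { n = split(FILENAME, parts, "/"); print dp, parts[n], $0 }
    ' "$file"
  done
}

# Usage (run from the directory that contains masterDir):
#   summarise masterDir > masterDir/results.txt
```

Replace the NR == 3 rule with the real calculation; the dp variable and FILENAME give awk the directory and file names for the first two columns.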
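The sort manual describes the -t option as: -t, --field-separator=SEP use SEP instead of non-blank to blank transition. Here the sorting uses the . as field separator, -k2 starts the sort key at the second field, and -V compares that key as a version number.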
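To see what that key does, the three DP3 file names from the question can be fed to the same sort invocation used in the scripts above. This is only an illustration with GNU sort. Note that the key is the text after the first dot, and version sort compares its digits as integers (05 as 5, 000012 as 12), so this is not a decimal comparison; it happens to select the wanted files for the names in the question.

```shell
#!/usr/bin/env bash
# The three DP3 file names from the question, sorted with the answer's options.
sorted=$(printf '%s\n' \
  'wallShearStress.dat' \
  'wallShearStress_0.000012.dat' \
  'wallShearStress_0.05.dat' |
  sort -Vk2 -t.)
printf '%s\n' "$sorted"
# With GNU sort this prints:
#   wallShearStress_0.05.dat
#   wallShearStress_0.000012.dat
#   wallShearStress.dat

# head -n1 then keeps only the first line.
first=$(printf '%s\n' "$sorted" | head -n1)
echo "selected: $first"
```

If the file names in a real case do not follow this pattern, check the ordering on a sample like this before relying on head -n1.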
The <( ) is process substitution. It behaves like a file (bash implements it with a named pipe or a /dev/fd entry; see the output of ls -l <(:)). To read from a file you need the < redirection sign, and it must be separated by a space from <( ), otherwise you will get an error.
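A small sketch of both points, assuming bash (process substitution is not available in plain sh). The variable names are only for illustration, and the path printed on the first line depends on the system.

```shell
#!/usr/bin/env bash
# <(...) expands to a path-like word that names a pipe; the exact path is
# system dependent.
fd_path=$(echo <(:))
echo "process substitution expanded to: $fd_path"

# The first < is an ordinary input redirection; its target is that path.
# Note the space between the two: "< <(...)".
count=0
while IFS= read -r line; do
  count=$((count + 1))
done < <(printf '%s\n' one two three)

# The loop ran in the current shell, so count is still set here.
echo "lines read: $count"
```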