I want to extract some information from filenames using regex in bash, which I will use to rename them according to BIDS. Here are the filenames:
ACC_svs_ECC.nii.gz
ACC_svs_noECC.nii.gz
ACC_svs_ref.nii.gz
Lt Hippocampus_svs_ECC.nii.gz
Lt Hippocampus_svs_noECC.nii.gz
Lt Hippocampus_svs_ref.nii.gz
Those are literally the only possibilities for filenames, not a pattern or example; they are all the filenames. Each participant will have those 6 for two of their sessions, but the filenames will be the same until I change them to be BIDS compliant. The json files are sidecar files, so each one has exactly the same filename as its .nii.gz file except for the .json extension.
Each filename has a brain region (ACC or Lt Hippocampus) and a type of mrs file (ECC, noECC, or ref). Those are the pieces of information that I will need in the new filenames, so as I loop through each existing filename I'd like to use them in the new filename.
Here is what the code will look like:
#!/bin/bash
ID=$1 # participant ID= user input
ses=$2 # session no (MRI)= user input
bidsdir=path/path/sub-001/ses-MRI1/mrs/ # path to mrs folder as specified by BIDS
for file in "${bidsdir}"; do
voi= # either ACC or Lt Hippocampus
type= # either ECC, noECC, or ref
ext= # file extension- either .nii.gz or .json
newfilename="sub-${ID}_ses-${ses}_voi-${voi}_acq-svs_${type}_mrs.${ext}"
# rest of code to rename each file
mv $bidsdir$file $bidsdir$newfilename
done
I'm used to using regex with python, and I'm pretty good at it, but I don't have time to spend another hour on figuring it out in bash. Here's how far I've gotten with the regex itself:
(?P<voi>ACC|Lt\ Hippocampus)_svs_(?P<type>ECC|noECC|ref)
Assumptions:
file names have the format: <brain-region> + _ + <some-string-to-ignore> + _ + <type> + { .nii.gz or .json }

Adding a .json entry to OP's list of file names:
$ ls -1 /tmp/testd
ACC_svs_ECC.nii.gz
ACC_svs_noECC.nii.gz
ACC_svs_ref.nii.gz
'Lt Hippocampus_svs_ECC.nii.gz'
'Lt Hippocampus_svs_noECC.nii.gz'
'Lt Hippocampus_svs_ref.json'
'Lt Hippocampus_svs_ref.nii.gz'
In this particular case I'd skip the hassles/complexities of a regex and use a combination of parameter expansion (to strip off the path) and the bash read builtin (with the two delimiters _ and . set via IFS) to parse the file names into the desired variables. Because ext is the last variable named in the read, it receives everything left over, so .nii.gz files end up with ext=nii.gz:
ID='myid'
ses='myses'
bidsdir='/tmp/testd'
for path_file in "${bidsdir}"/*{.nii.gz,.json}
do
    oldfilename="${path_file##*/}" # strip off the path (via parameter expansion)
    IFS='_.' read -r voi x type ext <<< "${oldfilename}" # parse old file name into variables based on the two delimiters "_" and "."; x holds the ignored middle field
    newfilename="sub-${ID}_ses-${ses}_voi-${voi}_acq-svs_${type}_mrs.${ext}"
    echo "path/file = ${path_file}"
    echo "old file = ${oldfilename}"
    echo "new file = ${newfilename}"
    echo ""
done
This generates:
path/file = /tmp/testd/ACC_svs_ECC.nii.gz
old file = ACC_svs_ECC.nii.gz
new file = sub-myid_ses-myses_voi-ACC_acq-svs_ECC_mrs.nii.gz
path/file = /tmp/testd/ACC_svs_noECC.nii.gz
old file = ACC_svs_noECC.nii.gz
new file = sub-myid_ses-myses_voi-ACC_acq-svs_noECC_mrs.nii.gz
path/file = /tmp/testd/ACC_svs_ref.nii.gz
old file = ACC_svs_ref.nii.gz
new file = sub-myid_ses-myses_voi-ACC_acq-svs_ref_mrs.nii.gz
path/file = /tmp/testd/Lt Hippocampus_svs_ECC.nii.gz
old file = Lt Hippocampus_svs_ECC.nii.gz
new file = sub-myid_ses-myses_voi-Lt Hippocampus_acq-svs_ECC_mrs.nii.gz
path/file = /tmp/testd/Lt Hippocampus_svs_noECC.nii.gz
old file = Lt Hippocampus_svs_noECC.nii.gz
new file = sub-myid_ses-myses_voi-Lt Hippocampus_acq-svs_noECC_mrs.nii.gz
path/file = /tmp/testd/Lt Hippocampus_svs_ref.nii.gz
old file = Lt Hippocampus_svs_ref.nii.gz
new file = sub-myid_ses-myses_voi-Lt Hippocampus_acq-svs_ref_mrs.nii.gz
path/file = /tmp/testd/Lt Hippocampus_svs_ref.json
old file = Lt Hippocampus_svs_ref.json
new file = sub-myid_ses-myses_voi-Lt Hippocampus_acq-svs_ref_mrs.json
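If you would still rather use a regex, since that is what the question asked for, here is a minimal sketch of how the Python pattern might be ported to bash's =~ operator. Bash uses POSIX extended regular expressions, so there are no named groups; the captures come back by position in the BASH_REMATCH array instead. The function name bids_name is made up for illustration, and the pattern assumes the six file names (plus their .json sidecars) are the only inputs.

```shell
#!/bin/bash
# Hypothetical helper (not from the question): print the new name for one
# old file name, or return 1 if the name does not match the pattern.
bids_name() {
    local id=$1 ses=$2 name=$3
    # Keep the pattern in a variable so the space and the escaped dots
    # reach the regex engine unchanged; leave $re unquoted in the test.
    local re='^(ACC|Lt Hippocampus)_svs_(ECC|noECC|ref)\.(nii\.gz|json)$'
    [[ $name =~ $re ]] || return 1
    # BASH_REMATCH[0] is the whole match; [1], [2], [3] are the groups.
    local voi=${BASH_REMATCH[1]} type=${BASH_REMATCH[2]} ext=${BASH_REMATCH[3]}
    printf 'sub-%s_ses-%s_voi-%s_acq-svs_%s_mrs.%s\n' \
        "$id" "$ses" "$voi" "$type" "$ext"
}

# Dry run over a directory: prints the mv commands instead of running them.
bidsdir=${bidsdir:-/tmp/testd}
for path_file in "$bidsdir"/*; do
    [[ -e $path_file ]] || continue          # glob matched nothing
    if new=$(bids_name myid myses "${path_file##*/}"); then
        echo mv -- "$path_file" "$bidsdir/$new"   # remove "echo" to rename
    fi
done
```

Files that do not match the pattern are simply skipped. Note that, as in the output above, the voi value for Lt Hippocampus still contains a space. As far as I know BIDS labels are restricted to letters and digits, so you may want to strip the space first, for example with "${voi// /}".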