Selecting specific text from text file, BASH scripting

I've been running simulations on a cluster, and I would like to check temporary results by going through all the cluster nodes and copying the files I need.

What I've been trying to do is extract the job ID and node name as strings from the output that I get after typing qstat -rn -u djsavic, which looks like this:

fermi: 
                                                                               Req'd    Req'd      Elap
Job ID               Username    Queue    Jobname          SessID NDS   TSK    Memory   Time   S   Time
-------------------- ----------- -------- ---------------- ------ ----- ------ ------ -------- - --------
59281.fermi          djsavic     xlarge   Smith2            30676     1      2    --  96:00:00 R 24:19:14
    fermi-node08/1+fermi-node08/0
59282.fermi          djsavic     xlarge   Smith2            30686     1      2    --  96:00:00 R 24:18:56
    fermi-node08/3+fermi-node08/2
59283.fermi          djsavic     xlarge   Smith2            30700     1      2    --  96:00:00 R 24:18:56
    fermi-node08/5+fermi-node08/4
59284.fermi          djsavic     xlarge   Smith2            30729     1      2    --  96:00:00 R 24:21:09
    fermi-node08/7+fermi-node08/6
59285.fermi          djsavic     xlarge   Smith2             9076     1      2    --  96:00:00 R 24:19:24
    fermi-node07/1+fermi-node07/0
59286.fermi          djsavic     xlarge   Smith2             9078     1      2    --  96:00:00 R 24:19:23
    fermi-node07/3+fermi-node07/2
59287.fermi          djsavic     xlarge   Smith2             9079     1      2    --  96:00:00 R 24:19:41
    fermi-node07/5+fermi-node07/4
59288.fermi          djsavic     xlarge   Smith2             9080     1      2    --  96:00:00 R 24:19:57
    fermi-node07/7+fermi-node07/6

In reality, the list is longer, around 80 lines.

What I need are the job IDs and node names, so I can copy files, e.g. from directory fermi-node08/59281/ to some /location

After a lot of digging and searching through the internet, I have so far come up with something like this:

for i in `qstat -rn -u djsavic`; do
    for j in `echo $i|grep fermi`; do
             echo $j|sed -r 's/(.{12}).*/\1/'|sed  's/.fermi//';
    done;
done

and what I get is a list like this:

fermi:
59281
fermi-node08
59282
fermi-node08
59283
fermi-node08
59284
fermi-node08
59285
fermi-node07
59286
fermi-node07
59287
fermi-node07
59288
fermi-node07

At this point, I would like to copy files from every /fermi-node##/JobID/ directory to a desired location, and also to remove the fermi: line from the top of the list. I am new to bash scripting and would really appreciate it if anyone could help me with the final step.

Thanks in advance.

Solution

awk to the rescue!

If your input has exactly that form (each record spans two lines, preceded by three header lines, so that every job line falls on an even line number), you can extract the information you need with this:

$ awk 'NR>3{ if(!(NR%2)) {sub(".fermi","",$1); n=$1}
              else {sub("/.*","",$1); print $1"/"n}}' file

fermi-node08/59281
fermi-node08/59282
fermi-node08/59283
fermi-node08/59284
fermi-node07/59285
fermi-node07/59286
fermi-node07/59287
fermi-node07/59288

You can feed this into a while loop for further processing, such as

$ while IFS= read -r f; do echo "$f"; done < <(awk ...)

Just replace echo "$f" with whatever you want to do with each node/JobID pair.
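
As a rough sketch of that final step, the loop body could copy each job directory. The function name copy_job_dirs and both of its arguments are made up for illustration: the question doesn't say where the node directories are mounted, so the source root is passed in as an assumption you will need to adjust for your cluster.

```shell
#!/usr/bin/env bash
# Sketch only: copy every "node/jobid" directory named on stdin.
# Assumption (not from the question): job directories are reachable
# on this machine as "<src_root>/<node>/<jobid>/".
copy_job_dirs() {
    local src_root="$1"
    local dest="$2"
    local f
    while IFS= read -r f; do
        if [ -d "$src_root/$f" ]; then
            # Recreate node/jobid under the destination, then copy the
            # directory contents (the trailing /. includes hidden files).
            mkdir -p "$dest/$f"
            cp -r "$src_root/$f/." "$dest/$f/"
        else
            # Report and keep going rather than aborting the whole run.
            echo "skipping missing directory: $src_root/$f" >&2
        fi
    done
}
```

It would be called as copy_job_dirs / /location < <(awk ...), with the same awk command as above. Keeping the node/JobID layout under the destination avoids files from different jobs overwriting each other.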

UPDATE: if the number of header lines is not fixed, this may be more robust, because it recognises job lines by their content rather than by their line number:

$ awk '/^[0-9]*\.fermi/ {sub(".fermi","",$1); n=$1; next}
                       n{sub("/.*","",$1); print $1"/"n;n=""}' file
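
To check that the pattern-based version really ignores the header, here is a small self-contained run against a shortened excerpt of the sample output. The wrapper name node_job_pairs is only for illustration, and the columns in the excerpt are trimmed for brevity; the awk program itself is the one above.

```shell
#!/usr/bin/env bash
# Sketch: wrap the pattern-based awk command so it reads qstat-style
# text from stdin and prints one "node/jobid" pair per job.
node_job_pairs() {
    awk '/^[0-9]*\.fermi/ {sub(".fermi","",$1); n=$1; next}   # job line: remember the ID
         n {sub("/.*","",$1); print $1"/"n; n=""}             # next line: node name
        '
}

# Shortened excerpt of the sample output, header lines included.
node_job_pairs <<'EOF'
fermi: 
Job ID               Username    Queue    Jobname
-------------------- ----------- -------- ----------------
59281.fermi          djsavic     xlarge   Smith2
    fermi-node08/1+fermi-node08/0
59285.fermi          djsavic     xlarge   Smith2
    fermi-node07/1+fermi-node07/0
EOF
# prints:
# fermi-node08/59281
# fermi-node07/59285
```

None of the header lines start with digits followed by .fermi, so they never set n and are skipped no matter how many there are.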