On one of my servers, I have a script that, at one stage, starts tcpdump in the background under nohup.
start_dump() {
2>&1 /usr/bin/nohup /usr/sbin/tcpdump -s 0 -i $IFACE host $HOST -C 1000 -w $DUMP_DIR/$LOGIN/$DATE\_$HOST.pcap | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }' >> /var/log/dump/nohup_$LOGIN.out &
}
I need to make sure everything went well and the dump is being written. To do this, I check whether the process shows up in ps, but in some cases the check fails even though the process is there.
dump_check() {
ps u -C tcpdump | grep $HOST > /dev/null
}
For debugging, I added a loop of checks, because I suspected that the dump did not have time to start before the condition was checked.
start_dump() {
2>&1 /usr/bin/nohup /usr/sbin/tcpdump -s 0 -i $IFACE host $HOST -C 1000 -w $DUMP_DIR/$LOGIN/$DATE\_$HOST.pcap | awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }' >> /var/log/dump/nohup_$LOGIN.out &
}
dump_check_check() {
ps u -C tcpdump | grep $HOST
echo $?
}
...
start_dump
for run in {1..10}; do
dump_check_check
done
And apparently I was right. This is what I get:
+ start_dump
+ for run in {1..10}
+ dump_check_check
+ grep 172.x.x.x
+ ps u -C tcpdump
+ awk '{ print strftime("%Y-%m-%d %H:%M:%S"), $0; fflush(); }'
+ /usr/bin/nohup /usr/sbin/tcpdump -s 0 -i ppp0 host x.x.x.x -C 1000 -w /root/dumps/xxxx/2021-01-21_17:31:51_172.19.5.234.pcap
+ echo 1
1
+ for run in {1..10}
+ dump_check_check
+ grep 172.x.x.x
+ ps u -C tcpdump
+ echo 1
1
+ for run in {1..10}
+ dump_check_check
+ grep 172.x.x.x
+ ps u -C tcpdump
root 768 0.0 0.0 10020 1468 pts/0 D+ 17:31 0:00 /usr/sbin/tcpdump -s 0 -i ppp0 host 172.x.x.x -C 1000 -w /root/dumps/xxxx/2021-01-21_17:31:51_172.19.5.234.pcap
+ echo 0
0
First, the trace shows tcpdump being launched only after the first check has already run. Why? Second, even after the launch, the next check also fails. As I understand it, this is because the command is started in the background under nohup and tcpdump has not finished starting before the next check. The third check succeeds.
Question: the obvious fix is to add a delay before the check, but a fixed sleep does not suit me, because sometimes the check succeeds on the first attempt and sometimes only on the fifth. I can't afford to waste that much time; it's critical for me. I am looking for a solution where the success check will run multiple times before success, but no longer than a specific time. If this time expires, an error should appear.
P.S. I hope I haven't overdone the details. This is my first question here. Thanks in advance, friends!
I am looking for a solution where the success check will run multiple times before success, but no longer than a specific time. If this time expires, an error should appear.
You can always use something like this:
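check_dump()
{
    # Poll up to 10 times, 0.1 s apart; return as soon as tcpdump for $HOST appears.
    for run in {1..10}
    do
        sleep .1
        # -F matches $HOST literally, so the dots in the IP are not regex wildcards.
        ps u -C tcpdump | grep -qF -- "$HOST" && return 0
    done
    return 1
}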
start_dump
if check_dump; then echo SUCCESS; else echo ERROR; fi
This will run for no longer than about one second (the time taken by ps | grep should be negligible). You can adjust the maximum number of checks and the interval between them as needed.
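If you would rather express the limit as a time than as a number of attempts, the same idea can be written against a deadline. The following is only a sketch under some assumptions: it relies on bash (for the built-in SECONDS counter), on a sleep that accepts fractional seconds (GNU coreutils does), and the names wait_for and dump_running are made up for illustration and are not part of the original script. It checks first and sleeps afterwards, so a dump that is already running is detected without any delay.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original script): retry a command until it
# succeeds or TIMEOUT seconds have elapsed.
# Usage: wait_for TIMEOUT INTERVAL COMMAND [ARGS...]
wait_for() {
    local timeout=$1 interval=$2
    shift 2
    # SECONDS is bash's built-in counter of whole seconds since the shell started.
    local deadline=$((SECONDS + timeout))
    while :; do
        "$@" && return 0                       # success: stop immediately
        (( SECONDS >= deadline )) && return 1  # deadline reached: give up
        sleep "$interval"
    done
}

# Hypothetical check, the same test as in the question:
# is a tcpdump whose command line mentions $HOST running?
dump_running() {
    ps u -C tcpdump | grep -qF -- "$HOST"
}

# Intended use (start_dump is the function from the question):
#   start_dump
#   if wait_for 2 0.1 dump_running; then echo SUCCESS; else echo ERROR; fi
```

Because SECONDS counts whole seconds, the timeout is only accurate to about one second; that should be fine for a limit of a few seconds, but it is not a precise timer.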