An application continually writes to a log. Each line is a new entry, and the log is in CSV format. Example:
123123123,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
444444222,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
563434535,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
234234334,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
234234534,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
546456456,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
567567567,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
234232342,asdf,asdf,3453456,sdfgsfgs,4567asd,zxc,aa
I need to poll the log and extract the data in chunks, appending it to another log file called newLog.csv.
I need to ensure that I don't copy data that has already been moved to the new file. If fewer than 200 new lines are available, it should copy however many there are, without creating duplicates.
Can I change this tail statement to do the above?
tail -n 200 $REMOTE_HOME/data/log.csv >> $SCRIPT_DIR/$project/newLog.csv
Provided the first field of each line is some sort of timestamp (Unix time?) that increases from one line to the next, you could do the following:
1. Check the timestamp of the last line written to the new log.
LAST_LINE=$(tail -n 1 /PATH/new_log | awk -F',' '{print $1}')
2. Check the timestamp of the first line you want to write.
FIRST_LINE=$(tail -n 200 /PATH/old_log | head -n 1 | awk -F',' '{print $1}')
3. If the last line in the new log is older than the first of the 200 lines, write all 200 lines.
if [ "$LAST_LINE" -lt "$FIRST_LINE" ]; then
    tail -n 200 /PATH/old_log >> /PATH/new_log
fi
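Now you have to put it in a loop so that it still works when the windows overlap, e.g. when only 3 of the last 200 lines are new. Basically you do the same as before, but shrink the window one line at a time, starting from the last 200 lines, until its first line is newer than the last line already copied.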
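LAST_LINE=$(tail -n 1 /PATH/new_log | awk -F',' '{print $1}')
LAST_LINE=${LAST_LINE:-0}   # treat an empty new log as "nothing copied yet"
COUNT=200
while [ "$COUNT" -gt 0 ]; do
    # Timestamp of the first line in the current window
    FIRST_LINE=$(tail -n "$COUNT" /PATH/old_log | head -n 1 | awk -F',' '{print $1}')
    if [ "$LAST_LINE" -lt "$FIRST_LINE" ]; then
        tail -n "$COUNT" /PATH/old_log >> /PATH/new_log
        break
    fi
    COUNT=$((COUNT - 1))   # shrink the window and try again
done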
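The loop above runs tail up to 200 times per poll. As a rough alternative sketch, the same comparison can be done in a single pass by letting awk drop every line whose first field is not newer than the last one already copied. This swaps the shrinking window for an awk filter; it still assumes the first field is a numeric timestamp that never decreases. The function name append_new_lines and its arguments are made up for illustration.

```shell
#!/bin/sh
# append_new_lines SRC DST [MAX]
# Hypothetical helper: appends to DST those of the last MAX lines of SRC
# whose first CSV field is greater than the first field of DST's last line.
# Assumption: the first field is a numeric, non-decreasing timestamp.
append_new_lines() {
    src=$1
    dst=$2
    max=${3:-200}

    # Timestamp of the last line already copied; 0 if DST is missing or empty.
    last=0
    if [ -s "$dst" ]; then
        last=$(tail -n 1 "$dst" | awk -F',' '{print $1}')
    fi

    # Keep only lines that are strictly newer than what DST already holds.
    tail -n "$max" "$src" |
        awk -F',' -v last="$last" '$1 + 0 > last + 0' >> "$dst"
}
```

It could be called as append_new_lines "$REMOTE_HOME/data/log.csv" "$SCRIPT_DIR/$project/newLog.csv" 200 from whatever does the polling. As with the loop, lines that share a timestamp with the last copied line are skipped, and if more than 200 lines arrive between polls the oldest of them are missed.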