unix filesystems hbase tail flume

Tailing only the updated lines of a file


I want to tail only the newly added lines of a file, not its entire content.

tail -F /path to file

displays all the lines of the file. I need to display only the new lines added to the file. Can anyone help me with this?

For example, if the file has 10 lines, tail -F shows me those 10 lines on the terminal. If 5 more lines are then added, I should see only the 5 new lines, not all 15.

EDIT: I have configured Flume to send the log data to HBase. I am using "tail -F /path to file", which gives me all the lines every time the file is updated. Only the updated log data (say, the 5 added lines) should be sent to HBase; otherwise the data will be duplicated.

Regards, Chhaya


Solution

  • I assume that the file is a log file?

    So instead of trying to remember what was written last time and display only what is new, you probably want to use a logging system (such as syslogd, or a newer variant of it) and tell it to both log to the file AND send the data to Flume.

    Otherwise, here is a dirty hack: create a "shownew.sh" file containing:

    #try to be as "atomic" as possible: we do everything on a copy of ${1}, to "freeze" time
    cp -p "${1}" "${1}.cur"  #very important. "freezes" the state of $1
    
    if [ -f "${1}.old" ]; then
    
       diff "${1}.old" "${1}.cur" | grep '^> ' | sed -e 's/^> //'
    
    else
    
       cat "${1}.cur" #show the file at the time of invocation
    
    fi
    
    mv -f "${1}.cur" "${1}.old"  #we just showed "${1}.cur" (or the diff between ${1}.cur and a previous ${1}.old).
      # so we now move that ${1}.cur to ${1}.old, for the next iteration
      #We used ${1}.cur instead of ${1} because ${1} may be updated at any time, so a copy of "$1" taken now could already contain lines we have not displayed yet. By using ${1}.cur instead, this won't be a problem
    
    #edit: after the OP's comment he wants to tail -f the file too:
    #and now that we showed the diffs since $1.old, we continue to display what is new in $1, using tail -f:
    
    #since we showed ${1}.cur (now known as ${1}.old), $1 may have changed in the meantime
    diff "${1}.old" "${1}" | grep '^> ' | sed -e 's/^> //'
    
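    #and now we tail -f on $1 to show what's incoming, until the user presses ctrl+C
    #(the empty trap keeps ctrl+C from also killing this script, so the final cp below still runs)
    trap : INT
    tail -n 0 -f "${1}"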
    
    #we showed the complete ${1}, this becomes the new ${1}.old
    cp "${1}" "${1}.old"
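
To see in isolation what the diff | grep | sed filter in the script does, here is a small sketch. The function name new_lines and the temporary file names are made up for this illustration; the filter itself is the one used above. diff prefixes lines that exist only in the second file with "> ", grep keeps just those lines, and sed strips the prefix again.

```shell
#!/bin/sh
# new_lines OLD CUR: print the lines that are in CUR but not in OLD.
# (Hypothetical helper name; same filter as in shownew.sh.)
new_lines() {
    # diff marks lines found only in the second file with a leading "> "
    diff "$1" "$2" | grep '^> ' | sed -e 's/^> //'
}

# Small demonstration on throwaway files
tmpdir=$(mktemp -d)
printf 'line 1\nline 2\n' > "$tmpdir/old"
printf 'line 1\nline 2\nline 3\nline 4\n' > "$tmpdir/cur"
new_lines "$tmpdir/old" "$tmpdir/cur"   # prints "line 3" and "line 4"
rm -rf "$tmpdir"
```

Note that a line which was modified (not only appended) also shows up in the output, since diff reports its new version with the same "> " prefix.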
    
    • The first time you call it on a file, say shownew.sh /some/file, it displays the file's whole content.

    • Each further time you call the script (shownew.sh /some/file), it shows only the lines that are now in "${1}" and were not in "${1}.old" before. I hope that's what you wanted?
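
If the file is only ever appended to, as a log file usually is, a lighter variant of the same idea is to remember how many bytes were already shown instead of keeping a full copy. The following is only a sketch under that assumption: the function name show_appended and the FILE.offset state file are invented for this example, and it relies on head -c, which is widely available but not required by POSIX.

```shell
#!/bin/sh
# show_appended FILE: print only what was appended to FILE since the last call.
# The byte offset already shown is kept in FILE.offset (hypothetical convention).
show_appended() {
    file=$1
    state="${file}.offset"
    last=0
    [ -f "$state" ] && last=$(cat "$state")
    size=$(wc -c < "$file")
    size=$((size))    # normalise: some wc implementations pad with spaces
    # If the file shrank (truncated or rotated), start again from the beginning
    [ "$size" -lt "$last" ] && last=0
    if [ "$size" -gt "$last" ]; then
        # tail -c +N starts at byte N (1-based); head -c stops at the size
        # measured above, so bytes written in the meantime wait for the next call
        tail -c "+$((last + 1))" "$file" | head -c "$((size - last))"
    fi
    echo "$size" > "$state"
}

# Small demonstration on a throwaway file
log=$(mktemp)
printf 'one\ntwo\n' >> "$log"
show_appended "$log"    # first call: prints "one" and "two"
printf 'three\n' >> "$log"
show_appended "$log"    # second call: prints only "three"
rm -f "$log" "$log.offset"
```

Compared with the diff approach, this never rereads old content, but it cannot detect changes to lines that were already shown.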