
Bash script to start Solr deltaimporthandler


I am after a bash script which I can use to trigger a delta import of XML files via CRON. After a bit of digging and modification I have this:

#!/bin/bash
# Bash to initiate Solr Delta Import Handler

# Setup Variables
urlCmd='http://localhost:8080/solr/dataimport?command=delta-import&clean=false'
statusCmd='http://localhost:8080/solr/dataimport?command=status'
outputDir=. 

# Operations
wget -O $outputDir/check_status_update_index.txt ${statusCmd} 
2>/dev/null 
status=`fgrep idle $outputDir/check_status_update_index.txt` 
if [[ ${status} == *idle* ]] 
then 
wget -O $outputDir/status_update_index.txt ${urlCmd} 
2>/dev/null 
fi 

Can I get any feedback on this? Is there a better way of doing it? Any optimisations or improvements would be most welcome.


Solution

  • This certainly looks usable. Just to confirm, you intend to run this every X minutes from your crontab? That seems reasonable.

    The only major quibble (IMHO) is discarding STDERR information with 2>/dev/null. Of course, it depends on your expectations for this system. If this is for a paying customer or employer, do you want to have to explain to the boss, "gosh, I didn't know I was getting the error message 'Can't connect to host X' for the last 3 months because we redirect STDERR to /dev/null"? If this is for your own project, and you're monitoring the work via other channels, then it's not so terrible, but why not capture STDERR to a file and check that there are no errors? As a general idea:

    myStdErrLog=/tmp/myProject/myProg.stderr.$(/bin/date +%Y%m%d.%H%M)
    wget -O "$outputDir/check_status_update_index.txt" "${statusCmd}" 2> "${myStdErrLog}"
    
    # -s is true when the file exists and is non-empty, i.e. something was written to STDERR
    if [[ -s ${myStdErrLog} ]] ; then
       mail -s "error on myProg" you@example.com < "${myStdErrLog}"   # placeholder address
    fi
    rm "${myStdErrLog}"
    

    Depending on what wget includes in its STDERR output (it writes its progress messages there even on success), you may need to filter what is in the StdErrLog to see if there are "real" error messages that you need to have sent to you.
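
One way to sidestep the filtering problem is to decide on the command's exit status and keep the STDERR log only when that status is non-zero. The following is a minimal sketch of that idea, not a drop-in replacement; run_and_log is a hypothetical helper name, and the log path in the usage note is an assumption.

```shell
#!/bin/bash
# run_and_log LOGFILE CMD [ARGS...]
# Hypothetical helper: runs CMD with its STDERR captured in LOGFILE.
# The log is deleted on success and kept (with a notice) on failure.
run_and_log() {
    local errLog=$1
    shift
    local rc=0

    # Capture the exit status without letting a failure abort the function.
    "$@" 2> "${errLog}" || rc=$?

    if (( rc == 0 )); then
        rm -f "${errLog}"      # success: the log only holds progress chatter
    else
        echo "command failed with exit status ${rc}; see ${errLog}" >&2
    fi
    return "${rc}"
}
```

In the script from the question this could be called as run_and_log /tmp/myProject/status.stderr wget -O "$outputDir/check_status_update_index.txt" "$statusCmd", and the mail step would then run only when the call returns non-zero.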

    A medium quibble is your use of backticks for command substitution. If you're using double square brackets for evaluations, then why not embrace complete ksh93/bash semantics? The only reason to use backticks is if you think you need to be ultra-backwards compatible and that you'll be running this script under the original Bourne shell (POSIX shells, including stripped-down ones like dash, accept $( ... ) too). Backticks have been deprecated in ksh since at least 1993. Try

    status=$(fgrep idle $outputDir/check_status_update_index.txt) 
    

    The $( ... ) form of command substitution makes it very easy to nest multiple command substitutions, e.g. echo $(echo one $(echo two ) ). (Bad example, as the need to nest command substitutions is pretty rare; I can't think of a better example right now.)

    Depending on your situation, in a large production environment where new software is installed to version-numbered directories, you might want to construct your paths from variables, e.g.
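
A slightly more practical case of nesting is pulling one component out of a path, where the inner substitution feeds the outer one and each level keeps its own quoting. This is only an illustration; the logFile path is made up for the example.

```shell
#!/bin/bash
# Hypothetical path, used only to demonstrate nesting.
logFile=/var/log/solr/import/status.txt

# Inner substitution: directory part of the path.
# Outer substitution: last component of that directory.
parentName=$(basename "$(dirname "${logFile}")")

echo "${parentName}"   # prints: import
```

With backticks the inner command would need escaped backticks, and the nested double quotes become hard to get right.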

    hostName=localhost
    portNum=8080
    SOLRPATH=/solr
    SOLRCMD='delta-import&clean=false'
    # double quotes, so that the variables are expanded
    urlCmd="http://${hostName}:${portNum}${SOLRPATH}/dataimport?command=${SOLRCMD}"
    

    The final, minor quibble ;-). Are you sure ${status} == *idle* does what you want?

    Try using something like

     case "${status}" in 
        *idle* ) .... ;; 
        * ) echo "unknown status = ${status} or similar" 1>&2 ;; 
     esac
    

    Yes, your if ... fi certainly works, but if you want to start doing more refined processing of the information that you put in your ${status} variable, then case ... esac is the way to go.
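
To make the "more refined processing" concrete, here is a rough sketch of a case statement with several branches. The function name classify_status and the action words it prints are invented for this example, and the assumption that a running import shows up as "busy" in the status response should be checked against your own Solr output.

```shell
#!/bin/bash
# classify_status TEXT
# Hypothetical helper: maps the text of a status response to an action word.
classify_status() {
    case "$1" in
        *idle* ) echo "start-import" ;;   # handler is free, safe to trigger
        *busy* ) echo "skip" ;;           # assumed marker for a running import
        "" )     echo "no-response" ;;    # empty file, e.g. the fetch failed
        * )      echo "unknown" ;;        # anything else deserves a closer look
    esac
}

classify_status '<str name="status">idle</str>'   # prints: start-import
```

The calling script can then branch on the printed word, or each branch can run its commands directly in place of the echo.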

    EDIT

    I agree with @alinsoar that 2>/dev/null on a line by itself will be a no-op. I assumed that it was a formatting issue, but looking in edit mode at your code I see that it appears to be on its own line. If you really want to discard STDERR messages, then you need cmd ... 2>/dev/null all on one line, or, as alinsoar advocates, you can put the redirection at the front of the line (the shell accepts that too), but again, all on one line.
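
The difference between the three placements can be checked without touching Solr at all. The sketch below uses a made-up function, noisy, as a stand-in for wget; it writes one line to each stream so the effect of each redirection is visible.

```shell
#!/bin/bash
# Stand-in for wget: one line to STDERR, one line to STDOUT.
noisy() { echo "to stderr" >&2; echo "to stdout"; }

# Redirection at the end of the same line: STDERR is discarded.
out1=$(noisy 2>/dev/null)

# Redirection at the front of the same line: also discarded.
out2=$(2>/dev/null noisy)

# Redirection on its own line: it is a separate, empty command,
# so the STDERR of noisy still gets through (captured here via 2>&1).
out3=$( { noisy
2>/dev/null
} 2>&1 )

echo "out1=${out1}"
echo "out2=${out2}"
echo "out3=${out3}"
```

Only out3 contains the "to stderr" line, which matches what happens in the script from the question.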