
What does ${img_file%.*} in a shell script mean?


I know that .* means fetch all files regardless of the extension (I hope I'm not wrong). However, I can't for the life of me figure out what that extra % sign means!

Here are two code snippets that might help describe the situation a bit more:

img_files=${img_files}' '$(ls ${TRAINING_DIR}/*.exp${exposure}.tif)

for img_file in ${img_files}; do
        run_command tesseract ${img_file} ${img_file%.*} \
            ${box_config} ${config} &

For those who need even more details, here's the full script.


Solution

  • The expression ${img_file%.*} expands to the value of img_file with the rightmost dot and everything after it removed, which in practice strips the file extension. The variable itself is not modified. From man bash:

    ${parameter%word}
    ${parameter%%word}
           Remove matching suffix pattern.  The word is expanded to produce
           a pattern just as in pathname expansion.  If the pattern matches
           a  trailing portion of the expanded value of parameter, then the
           result of the expansion is the expanded value of parameter  with
           the  shortest  matching  pattern (the ``%'' case) or the longest
           matching pattern (the ``%%'' case) deleted.
    

    Example:

    >var="word1 word2"
    >echo ${var%word2}
    word1
    >echo ${var%word1}
    word1 word2
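
Applied to a file name like the ones in the question, the difference between % and %% can be sketched as follows. The path below is made up for illustration; only its shape (several dots, ending in .tif) mirrors the *.exp${exposure}.tif files from the script. Note that inside ${...%...} the .* is a pattern meaning "a literal dot followed by anything", not a list of files.

```bash
#!/usr/bin/env bash
# Hypothetical file name, chosen only for illustration.
img_file="/tmp/training/eng.arial.exp0.tif"

# % removes the shortest suffix matching ".*", i.e. only the last extension.
shortest="${img_file%.*}"

# %% removes the longest suffix matching ".*", i.e. everything from the first dot on.
longest="${img_file%%.*}"

# If the pattern does not match at the end of the value, nothing is removed.
unchanged="${img_file%.png}"

printf '%s\n' "$shortest" "$longest" "$unchanged"
# prints:
# /tmp/training/eng.arial.exp0
# /tmp/training/eng
# /tmp/training/eng.arial.exp0.tif
```

So in the loop from the question, tesseract would presumably receive the image path as its input and the same path without .tif as its output base name.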