
Shell: Extract Text Before Digits in a String


I've found several examples of extracting the text before a single character, and of extracting numbers, but nothing about extracting the characters that come before a number.

My question: Some of the strings I have look like this:

NUC320 Syllabus Template - 8wk
SLA School Template - UL
CJ101 Syllabus Template - 8wk
TECH201 Syllabus Template - 8wk
Test Clone ID17

In cases where the string doesn't contain the data I want, I need it to be skipped. The desired output would be:

NUC-320
CJ-101
TECH-201

The strings "SLA School Template - UL" and "Test Clone ID17" would be skipped.

I imagine the process being something to the effect of:

  1. Extract text before " "
  2. Condition - Check for digits in the string
  3. Extract text before digits and assign it to a variable x
  4. Extract digits and assign to a variable y
  5. Concatenate $x"-"$y and assign to another variable z
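
The five steps above can be sketched in plain POSIX sh using only parameter expansion and `case`, with no external commands. This is a minimal sketch, not a definitive implementation: the function name `course_code` is hypothetical, and it assumes that a usable code is a run of letters immediately followed by a run of digits in the first word of the string.

```shell
#!/bin/sh

# course_code is a hypothetical helper name, not from the original post.
# It prints PREFIX-DIGITS when the first word of its argument is letters
# followed by digits; otherwise it prints nothing and returns 1.
# Note: plain sh has no local variables, so word, x, y and z are global.
course_code() {
    word=${1%% *}                 # 1. text before the first space
    case $word in
        *[0-9]*) ;;               # 2. carry on only if the word has a digit
        *) return 1 ;;
    esac
    x=${word%%[0-9]*}             # 3. text before the first digit
    y=${word#"$x"}                # 4. the remainder, starting at that digit
    # Assumption: skip words whose prefix is empty or not purely letters,
    # or whose remainder is not purely digits.
    case $x in ''|*[!A-Za-z]*) return 1 ;; esac
    case $y in *[!0-9]*) return 1 ;; esac
    z="$x-$y"                     # 5. join the two parts with a hyphen
    printf '%s\n' "$z"
}

# Usage: feed the sample strings through the function; skipped strings
# produce no output.
while IFS= read -r title; do
    course_code "$title" || true
done <<'EOF'
NUC320 Syllabus Template - 8wk
SLA School Template - UL
CJ101 Syllabus Template - 8wk
TECH201 Syllabus Template - 8wk
Test Clone ID17
EOF
```

Because "Test Clone ID17" is cut at its first space in step 1, only "Test" is checked for digits, which is why that string is skipped even though it contains a number further along.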

More information: The strings are extracted from a line in a couple thousand text docs using a loop. They will be used to append to a hyperlink and rename a file during the loop.

Edit:

    #!/bin/sh

    # My files are named 1.txt through 9999.txt; i both
    # increments the loop and sets the filename to be searched.

    i=1

    while [ "$i" -lt 10000 ]
    do
        # Take line 31 of the file and drop its first six characters.
        x=$(head -n 31 "$i.txt" | tail -n 1 | cut -c 7-)
        if [ -n "$x" ] && [ "$x" != " " ]; then
            # I'd like to insert the hyperlink with the output on the
            # same line (1.txt;cj101 Syllabus Template - 8wk;www.link.com/cj101)
            echo "$i.txt;$x" >> syllabus.txt
    #   else
    #       rm $i.txt
        fi
        i=$((i + 1))
        sleep .1
    done

Solution

  • Use sed to print only the lines that start with capital letters followed by digits, inserting a - between the two groups:

    sed -n 's/^\([A-Z]\+\)\([0-9]\+\) .*/\1-\2/p' input 
    

    Gives:

    NUC-320
    CJ-101
    TECH-201
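
To use this inside the loop from the question, the same substitution can be applied to one string at a time. The sketch below is only one possible arrangement: the function name `link_line` is hypothetical, the link format `www.link.com/<code>` is assumed from the comment in the question's script, and `\{1,\}` is used in place of `\+` because `\+` is a GNU extension to basic regular expressions and may not work in other sed implementations.

```shell
#!/bin/sh

# link_line is a hypothetical helper: given a file name and the title read
# from it, print "file;title;link", or print nothing and return 1 when the
# title does not start with capital letters followed by digits and a space.
link_line() {
    # Same match as the sed command above, written with \{1,\} instead of \+.
    code=$(printf '%s\n' "$2" |
        sed -n 's/^\([A-Z]\{1,\}\)\([0-9]\{1,\}\) .*/\1-\2/p')
    # sed -n prints nothing for a non-matching title, so empty means "skip".
    [ -n "$code" ] || return 1
    # Assumed link format; adjust to the real URL scheme.
    printf '%s;%s;www.link.com/%s\n' "$1" "$2" "$code"
}

# Usage with sample values standing in for "$i.txt" and "$x" from the loop.
link_line 1.txt 'CJ101 Syllabus Template - 8wk'
link_line 2.txt 'SLA School Template - UL' || echo "2.txt skipped" >&2
```

In the original loop, the `echo "$i.txt;$x" >> syllabus.txt` line could be replaced by something like `link_line "$i.txt" "$x" >> syllabus.txt`, so that files without a matching title add nothing to the output.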