bash subsequence matching speed-up

I am wondering if there is an easy way to check if a string is a subsequence of another string in bash, actually a subsequence with an extra rule. I will explain.

Some subsequences of "apple" are "aple", "al", "pp" and "ale". The subsequences with an extra rule, I want to get are those that start and end with the same letter as the string so only "aple" and "ale" fit my desire.

I have made the following program:

#!/bin/bash
while read line
do
    search=$(echo "$line" | tr -s 'A-Za-z' | sed 's/./\.\*&/g;s/^\.\*//' )
    expr match "$1" "$search" >/dev/null && echo "$line"
done

It is executed as followed:

./program.sh greogdgedlqfe < words.txt

This program works, but is very slow.

It takes every line of the file, modify it to regex expression and then check if they match and then print the original line. So example:

one of the lines has the word google

$search becomes g.*o.*g.*l.*e (repeated letters become squeezed, extra rule )

then we check that expression with the given parameter and if it matches, we print the line: google

This works fine, however when the file words.txt gets too big, this program becomes too slow. How can I speed up my program, possibly by faster matching subsequences.

Edit after possible solution of Kamilcuk

That solution returns quick,quiff,quin,qwerty for the string "qwertyuihgfcvbnhjk" and only quick should be returned, so it is almost correct, but not quite yet.

Solution

Try it like so:

grep -x "$(<<<"$1" tr -s 'A-Za-z' | sed 's/./&*/g;s/\*$//;s/\*//1')" words.txt

Tested against:

set -- apple  
cat >words.txt <<EOF
aple
al
pp
ale
fdafda
apppppppple
apple
google
EOF

outputs:

aple
ale
apppppppple
apple

And for set -- greogdgedlqfe it outputs just google.

If I understand you correctly, a "subsequent" of apple is everything that mathes ap*l*e.

Tested on repl