regex grep

How to grep only [[:punct:]] and [[:alnum:]] characters from a list, without any other characters


I need help grepping only alphanumeric ([[:alnum:]]) and special ([[:punct:]]) characters from a wordlist, without any other characters.

For example, this is the list:

love
great2
pl@ce
&rt
joHn
&^~%$@!$#)(_-+
hearʥ
ʯbattle
fiʬeld

I need a regular expression that prints the whole list but strips the characters "ʥ", "ʯ", and "ʬ" from the last three words, so the output looks like this:

love
great2
pl@ce
&rt
joHn
&^~%$@!$#)(_-+
hear
battle
field

Thanks, wish you all a great day :)

I tried this Linux command, but I didn't get the expected output:

grep -E -o --text "[[:punct:]]*[[:alnum:]]|[[:alnum:]]*[[:punct:]]" list.txt > Final.txt

Solution

  • grep is not the right tool here, because the -o option prints each match on its own line, so a word with a non-ASCII character in the middle gets split across two lines.

    Better to use a tool like GNU sed, which can remove all non-ASCII characters from the input:

    sed 's/[^\x00-\x7F]//g' file
    
    love
    great2
    pl@ce
    &rt
    joHn
    &^~%$@!$#)(_-+
    hear
    battle
    field
    

    Similarly, using perl:

    perl -pe 's/[^[:ascii:]]+//g' file
    
    love
    great2
    pl@ce
    &rt
    joHn
    &^~%$@!$#)(_-+
    hear
    battle
    field
    

    To illustrate the point above about grep, here is its output (note that "fiʬeld" is split into "fi" and "eld"):

    grep -oP "[[:ascii:]]+" file
    
    love
    great2
    pl@ce
    &rt
    joHn
    &^~%$@!$#)(_-+
    hear
    battle
    fi
    eld
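
The question asks specifically for [[:alnum:]] and [[:punct:]] only, which is slightly stricter than "all ASCII". Here is a minimal sketch of that filter, assuming a POSIX shell and sed; the function name keep_alnum_punct is made up for illustration. Forcing LC_ALL=C is the important part: in a UTF-8 locale a letter such as ʥ may itself count as [[:alnum:]] and survive the substitution.

```shell
# Hypothetical helper (not from the original answer): delete every byte that
# is not an ASCII letter, digit, or punctuation character.
# LC_ALL=C makes sed work byte by byte, so the bytes of a multibyte
# character such as ʥ never match [:alnum:] or [:punct:].
keep_alnum_punct() {
    LC_ALL=C sed 's/[^[:alnum:][:punct:]]//g' "$@"
}

# Sample lines taken from the question; single quotes keep $@ and $# literal.
printf '%s\n' 'pl@ce' '&^~%$@!$#)(_-+' 'hearʥ' 'ʯbattle' 'fiʬeld' | keep_alnum_punct
```

With a file, it could be called as keep_alnum_punct list.txt > Final.txt. Note that this variant also deletes spaces and tabs, which is fine for a one-word-per-line wordlist but probably not for free text.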