My file test.csv:
Col1,Col2,Col3,Col4
1,AAA,1,
2,BBB,0,
3,CCCÆ,,ttt
4,DDD,1,
5,EEE,0,
Expected output:
3,CCCÆ,,ttt
Tried:
grep -a "[^\x20-\x7e]+" test.csv
grep -a '[^\x20-\x7e]+' test.csv
grep "[^\x20-\x7e]+" test.csv
grep '[^\x20-\x7e]+' test.csv
I also tried the -P and -E flags, but none of these returns the result I want. In PowerShell, I ran
Select-String -Pattern '[^\x20-\x7E]+' test.csv
and it returned the expected result.
Could someone point me in the right direction for MINGW64 bash grep (GNU grep) 3.1 on Windows 10?
It was installed with Git for Windows, downloaded from here: https://git-scm.com/download/win
It appears that the POSIX BRE and ERE syntaxes in grep for Windows do not support the \xXX notation.
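To see why the attempts without -P fail, here is a small sketch. It assumes GNU grep and pins LC_ALL=C so the bracket range is interpreted by byte value; the file name sample.csv is made up for the demo. Inside a BRE/ERE bracket expression the backslash is an ordinary character, so the class reads as "not one of \, x, e, or anything in the range 0 through \", which is not a hex range at all.

```shell
# Hypothetical sample file with CRLF line endings, resembling test.csv.
# \303\206 is the UTF-8 encoding of Æ, written in octal to keep this script ASCII.
demo_dir=$(mktemp -d)
demo_file="$demo_dir/sample.csv"
printf 'Col1,Col2,Col3,Col4\r\n1,AAA,1,\r\n2,BBB,0,\r\n3,CCC\303\206,,ttt\r\n4,DDD,1,\r\n' > "$demo_file"

# ERE: the class matches e.g. the commas, so every line is counted.
# "|| true" keeps the script going when grep finds nothing (exit status 1).
ere_count=$(LC_ALL=C grep -c -E '[^\x20-\x7e]+' "$demo_file" || true)

# BRE: "+" is a literal plus sign here, and the file contains none.
bre_count=$(LC_ALL=C grep -c '[^\x20-\x7e]+' "$demo_file" || true)

echo "ERE matched lines: $ere_count, BRE matched lines: $bre_count"
```

Under these assumptions the ERE form should count all five lines and the BRE form none, which fits the "nothing useful either way" result described in the question.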
You may use the -P option to enable the PCRE regex engine and then use
grep -P "[^\x{00}-\x{7E}]" file
Or,
grep -P "[^[:ascii:]]" file
to find any line containing a non-ASCII character.
Note that you cannot use the [^\x20-\x7E] class, because it also matches the CR that is part of the CRLF line ending in Windows text files. Every line would then match, except the last one if it has no trailing line break. You can, however, add the CR to the negated character class:
grep -P "[^\x{0D}\x{20}-\x{7E}]" file
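A minimal sketch to compare the three patterns side by side, assuming a grep built with PCRE support (-P). The file name sample.csv and the variable names are made up for the demo, and the sample is written with explicit CRLF endings to mimic a Windows text file.

```shell
# Hypothetical CRLF sample resembling test.csv.
# \303\206 is the UTF-8 encoding of Æ, written in octal to keep this script ASCII.
work_dir=$(mktemp -d)
sample="$work_dir/sample.csv"
printf 'Col1,Col2,Col3,Col4\r\n1,AAA,1,\r\n2,BBB,0,\r\n3,CCC\303\206,,ttt\r\n4,DDD,1,\r\n' > "$sample"

# Lines containing anything outside the ASCII range.
non_ascii=$(grep -c -P '[^\x{00}-\x{7E}]' "$sample")

# Naive printable-ASCII class: the CR at the end of each line also matches.
naive=$(grep -c -P '[^\x{20}-\x{7E}]' "$sample")

# Same class with CR (0x0D) excluded.
cr_safe=$(grep -c -P '[^\x{0D}\x{20}-\x{7E}]' "$sample")

echo "non-ASCII: $non_ascii, naive: $naive, CR excluded: $cr_safe"
```

On a CRLF file like this one, the naive class should count every line, while the other two should count only the line with Æ. Drop -c to print the matching lines instead of counting them.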