How to exclude everything after a character/string from the output in Linux

This is my file:

abc.test.com
efg.test.com:80/test1/123/xyz
xyz.test.com:443/test1
xab.test.com:80
lmn.test.com/100
com.test.com:10

I am trying to remove all characters after the string ".com", but I want to include ".com" in it. I tried sed 's/.com.*//', however it seems to exclude ".com" as well:

$ cat test1.txt | grep .com | sed 's/.com.*//'
abc.test
efg.test
xyz.test
xab.test
lmn.test
com.test

Is there a way to remove all characters after a particular string while still keeping that string in the output?

Solution

You don't have to use both grep and sed; either one of them is enough.

Your command 's/.com.*//' replaces the match with an empty string instead of the .com that you want to keep. Also note that you have to escape the dot as \. because otherwise it matches any character.
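As a minimal sketch of that fix applied to sed itself: escape the dot and put .com back in the replacement. The temporary file and the `result` variable are only there to make the snippet self-contained; the three input lines are taken from the question.

```shell
# Sample input from the question, written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'abc.test.com' 'efg.test.com:80/test1/123/xyz' 'lmn.test.com/100' > "$tmp"

# \. matches a literal dot; the replacement re-inserts ".com",
# so only the text after it is dropped.
result=$(sed 's/\.com.*/.com/' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

With your own file this is simply sed 's/\.com.*/.com/' test1.txt.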

If you are using grep, and there is just a single occurrence of .com on the line, you can match everything up to and including .com and print only the matched part with -o:

grep -o ".*\.com" file

An alternative using awk, replacing the match with .com:

awk '{sub(/\.com.*/, ".com")}1' file

Both will output

abc.test.com
efg.test.com
xyz.test.com
xab.test.com
lmn.test.com
com.test.com
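The "single occurrence" condition for grep matters because .* is greedy. A small sketch with a made-up line (not from the question) that contains .com twice shows how the two commands diverge: grep keeps everything up to the last .com, while the awk substitution cuts at the first one.

```shell
# Hypothetical input line with two occurrences of ".com".
line='a.com.example.com/path'

# grep -o prints the longest match, which ends at the LAST ".com".
greedy=$(printf '%s\n' "$line" | grep -o '.*\.com')

# sub() replaces starting from the FIRST ".com" on the line.
first=$(printf '%s\n' "$line" | awk '{sub(/\.com.*/, ".com")}1')

printf 'grep: %s\nawk:  %s\n' "$greedy" "$first"
```

Which result is the right one depends on your data, so check this case if your lines can contain .com more than once.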

Note the difference: the sed and awk solutions will also print a line that does not contain .com, because they perform a substitution and then print the whole line whether or not anything was replaced.

The grep solution will not display a line without .com, because it only prints the matched part of lines that match.
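To see that difference, here is a small sketch using a two-line input in which the second line (example.org:8080) is invented for illustration and contains no .com. It also shows one way to make sed behave like grep here, using -n together with the p flag so that only lines where a substitution happened are printed.

```shell
# One line from the question's style of data, one hypothetical line without ".com".
tmp=$(mktemp)
printf '%s\n' 'abc.test.com:80' 'example.org:8080' > "$tmp"

# grep -o prints only matches, so the second line disappears.
grep_out=$(grep -o '.*\.com' "$tmp")

# awk substitutes where it can and then prints every line.
awk_out=$(awk '{sub(/\.com.*/, ".com")}1' "$tmp")

# sed -n suppresses default printing; the p flag prints only substituted lines.
sed_out=$(sed -n 's/\.com.*/.com/p' "$tmp")

printf 'grep:\n%s\n\nawk:\n%s\n\nsed -n:\n%s\n' "$grep_out" "$awk_out" "$sed_out"

rm -f "$tmp"
```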