I'm trying to find a way to grep grammatical clauses from an ebook sample. Here's what the input looks like:
This is a test my friend, this is just a test; I'm going to do some shopping:`what do you need?`
Nothing, he said.
Desired output:
This is a test my friend
this is just a test
I'm going to do some shopping
what do you need
Nothing
he said
Any ideas on how one could achieve this?
Thank you very much!
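You can use GNU awk (gawk), which lets the record separator RS be a regular expression. Here every run of newlines or of the listed punctuation characters ends a record, and $1=$1 rebuilds each record so that leading and trailing whitespace is dropped:
awk -v RS='[\n.,;:`?]+' -v ORS='\n' '{$1=$1} 1' file
Output: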
This is a test my friend
this is just a test
I'm going to do some shopping
what do you need
Nothing
he said
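If gawk is not available, roughly the same result can be had with tr and a plain POSIX awk. This is only a sketch, not the command above: the function name split_clauses is made up for illustration, and it assumes the same six delimiter characters. tr turns every delimiter into a newline and squeezes repeated newlines, then awk skips blank lines and trims the rest.

```shell
# split_clauses is a hypothetical helper name, not a standard tool.
# It reads text on stdin and prints one clause per line.
# tr: map each delimiter to a newline, squeezing runs of newlines into one.
# awk: NF skips lines with no fields; $1 = $1 rebuilds the line, which
#      drops leading/trailing whitespace and collapses inner runs of spaces.
split_clauses() {
  tr -s '.,;:`?' '[\n*]' | awk 'NF { $1 = $1; print }'
}

# Usage with the sample from the question (quoted heredoc, so the
# backticks and the apostrophe are passed through untouched).
split_clauses <<'EOF'
This is a test my friend, this is just a test; I'm going to do some shopping:`what do you need?`
Nothing, he said.
EOF
```

For the sample input this prints the same six lines as the gawk one-liner. Unlike that one-liner, it never emits an empty line when the text starts with a delimiter, because of the NF condition.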