I would like to extract a full sentence (from one "." to the next ".") from a document, given a word that the sentence contains. For example, given this text:
Dijkstra's original algorithm does not use a min-priority queue. For a given source vertex (node) in the graph, the algorithm finds the path with lowest cost (i.e. the shortest path) between that vertex and every other vertex. It can also be used for finding costs of shortest paths from a single vertex to a single destination vertex by stopping the algorithm once the shortest path to the destination vertex has been determined.
I would like to get the entire sentence that contains "graph":
For a given source vertex (node) in the graph, the algorithm finds the path with lowest cost (i.e. the shortest path) between that vertex and every other vertex.
It would also be useful to include the first sentence of the text in the results if it contains "graph", since there is no dot before it.
Assuming the text file dijk doesn't actually contain any newlines, you could do this in Perl:
perl -MLingua::EN::Sentence=get_sentences -ne '
print "$_\n" for grep { /graph/ } @{get_sentences($_)}' dijk
The Lingua::EN::Sentence module is smart enough to deal with well-known abbreviations and you can add your own if necessary.
Output:
For a given source vertex (node) in the graph, the algorithm finds the path with lowest cost (i.e. the shortest path) between that vertex and every other vertex.
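On adding your own abbreviations: a minimal sketch, assuming the module's add_acronyms function, which takes the abbreviations without their trailing dot. The wrapper name sentences_with_graph and the abbreviations "approx" and "fig" are only illustrative and are not part of the original command.

```shell
# Hypothetical wrapper around the one-liner above; pass the file name(s) as arguments.
sentences_with_graph() {
  perl -MLingua::EN::Sentence=get_sentences,add_acronyms -ne '
    # Illustrative abbreviations only, given without the trailing dot.
    BEGIN { add_acronyms("approx", "fig") }
    # get_sentences returns an array reference (or undef for empty input).
    print "$_\n" for grep { /graph/ } @{ get_sentences($_) || [] }' "$@"
}
```

Called as `sentences_with_graph dijk`, this should behave like the original command on the sample text, since neither abbreviation occurs in it.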
If the input does contain newlines, the script can be adapted to read the whole file and join the lines first:
perl -MLingua::EN::Sentence=get_sentences -0777 -e '
$t = <>; # slurp the whole file
$t =~ tr{\n}{ }; # convert newlines to spaces
print "$_\n" for grep { /graph/ } @{get_sentences($t)}' dijk
Of course, by now this is looking a lot more like a full-blown Perl script than a one-liner!
Alternatively, as mentioned by @mklement0, you could use the external tool tr to convert the newlines to spaces and pass the result to the original script through process substitution (which needs a shell such as bash, ksh or zsh):
perl -MLingua::EN::Sentence=get_sentences -ne '
print "$_\n" for grep { /graph/ } @{get_sentences($_)}' <(tr '\n' ' ' < dijk)