Following a c++ tutorial and teaching about find()
the following code was implemented to search for all the "cat" occurrences in a string
:
std::string input;
std::size_t i = 0, x_appearances = 0;
std::getline(std::cin,input);
for(i = input.find("cat",0); i != std::string::npos; i=input.find("cat", i))
{
++x_appearances;
++i; //Move past the last discovered instance to avoid finding the same string
}
Then the tutorial challenges the apprentice to change find()
for rfind()
, and that's where the problems came in, first I tried what seemed to be the obvious approach:
for(i = input.rfind("cat",input.length()); i != std::string::npos; i=input.rfind("cat", i))
{
++x_appearances;
--i; //Move past the last discovered instance to avoid finding the same string
}
but with this solution I fell into an infinite loop. Then I discovered that it was happening because the increment is performed before the condition check, and that the increment rfind()
was always finding a match even with i==std::string::npos
(if the match is on the beginning of the string, for example "cats"). My final solution came to be:
int n=input.length();
for(i = input.rfind("cat",input.length()); n>0 && i!=std::string::npos; i=input.rfind("cat", i))
{
++x_appearances;
n=i;
--i; //Move past the last discovered instance to avoid finding the same string
}
With n
I can keep the track of the position in the string
, and with it exit the for loop
when the entire string
had been searched.
So my question is: Is my approach correct? Did I need an extra variable or is there any other simpler way of doing this?
for(i = input.rfind("cat",input.length()); i != std::string::npos; i=input.rfind("cat", i))
{
++x_appearances;
--i; //Move past the last discovered instance to avoid finding the same string
}
The problem with the above is the --i
inside the loop. Suppose the input string starts with "cat". Your algorithm will eventually find that "cat" with i
being 0. Since you've declared i
as a std::size_t
, subtracting 1 from 0 results in the largest possible std::size_t
. There's no warning, no overflow, no undefined behavior. This is exactly how unsigned integers must work, per the standard.
Somehow you need to handle this special case. You could use an auxiliary variable and a more convoluted test in your loop. An alternative is to keep your code simple and at the same time make it blatantly obvious you are explicitly handling this special case:
for (i = input.rfind("cat"); i != std::string::npos; i=input.rfind("cat", i-1))
{
++x_appearances;
// Finding "cat" at the start means we're done.
if (i == 0) {
break;
}
}
Note also that I've changed the loop statement a bit. The default value for pos
is std::string::npos
, which means search from the end of the string. There's no need for that second argument with the initializer. I also moved the --i
into the update part of the for loop, changing input.rfind("cat",i)
to input.rfind("cat",i-1)
. Since i
is always positive at this point, there's no danger in subtracting one.