Tags: c, string, substring, strpos

C version of strpos and substr?


I'm really surprised I can't figure out a way to do this effectively. I've tried strstr, a combination of things with sscanf, and nothing seems to work the way I would expect it to based on my experience in other languages.

I have a char * string containing "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS". I do not know where "BEGINTheMiddleEND" is in the string, and I would like to end up with a string equal to "TheMiddle" by finding the occurrences of "BEGIN" and "END" and grabbing what is in between.

What is the most efficient way to accomplish this (find and sub-string)?

Thanks!

-- EDIT BASED ON ANSWERS --

I have tried this:

char *searchString = "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS";
char *t1, *t2;
t1 = strstr(searchString, "BEGIN");
t2 = strstr(t1, "END");

But something must be wrong from a pointer standpoint, as it doesn't work for me. strstr only takes two arguments, so I'm not sure what you mean by starting at the previous pointer. I'm also not sure how to then use those pointers to take the substring, as they are not integer values like the ones strpos returns, but character pointers.

Thanks again.

-- EDIT WITH FINAL CODE --

For anyone else who hits this, the final, working code:

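/* An array, not a pointer to a literal: the buffer is modified below. */
char searchString[] = "ABCDEFG HIJ K BEGINTheMiddleEND LMNO PQRS";
char *b = strstr(searchString, "BEGIN");  /* NULL if "BEGIN" is absent */
char *e = strstr(b, "END");               /* NULL if "END" is absent */
int offset = e - b;                       /* length of "BEGINTheMiddle" */
b[offset] = 0;                            /* cut the string at the 'E' of "END" */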

Where "b" is now equal to "BEGINTheMiddle" (which, as it turns out, is what I needed in this case).

Thanks again everyone.


Solution

  • You need to realize what a string is in C: a 0-terminated sequence of chars.

    strstr does what it says: it finds the beginning of the given substring.

    So calling strstr with the needle "BEGIN" takes you to the position of this substring. That pointer, advanced by 5 characters (the length of "BEGIN"), points to "TheMiddle" and everything after it, up to the next 0 char. By searching for "END" from there you can find the end pointer, and then you need to copy the substring into a new string array (or cut it, by replacing the "E" with a 0; or implement your own string functions that do not use 0-terminated strings, so they can arbitrarily overlap).

    That is probably the step that you are still missing: actually copy the string. E.g. using

     t3 = strndup(t1, t2 - t1);
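
Putting the search and the copy together, a minimal sketch could look like the following. The helper name extract_between is hypothetical (it is not a standard function), and it uses malloc plus memcpy instead of strndup on the assumption that strndup, a POSIX function, may not be available on every toolchain. It skips past the begin marker, so it yields "TheMiddle" rather than "BEGINTheMiddle".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the C library): returns a newly
   allocated copy of the text between the first `begin` marker and the
   next `end` marker, or NULL if either marker is missing or the
   allocation fails. The caller must free() the result. */
char *extract_between(const char *haystack, const char *begin, const char *end)
{
    assert(haystack != NULL && begin != NULL && end != NULL);

    const char *start = strstr(haystack, begin);
    if (start == NULL)
        return NULL;
    start += strlen(begin);            /* skip past the begin marker */

    const char *stop = strstr(start, end);
    if (stop == NULL)
        return NULL;

    size_t len = (size_t)(stop - start);
    char *result = malloc(len + 1);    /* +1 for the 0 terminator */
    if (result == NULL)
        return NULL;

    memcpy(result, start, len);
    result[len] = '\0';
    return result;
}
```

Called with the string from the question and the markers "BEGIN" and "END", this returns a fresh buffer holding "TheMiddle" and leaves the original string untouched.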
    

    Take the string ABCDEF0, where 0 is an actual 0 character. A pointer to the beginning points to the full string, a pointer pointing to the E points to "EF" only. If you want to get a string "AB", you need to either copy that to "AB0", or replace C by 0.
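
The "replace C by 0" variant can be sketched as below. The helper name truncate_at is made up for illustration, and the sketch assumes the buffer is writable (an array or malloc'd memory, never a string literal).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: shorten the string in `s` to at most `n`
   characters by overwriting s[n] with the 0 terminator. `s` must be
   writable memory, not a string literal. */
void truncate_at(char *s, size_t n)
{
    assert(s != NULL);
    if (n < strlen(s))
        s[n] = '\0';   /* everything from index n on is cut off */
}
```

With char buf[] = "ABCDEF", calling truncate_at(buf, 2) leaves buf reading as "AB", while buf + 4 still reads as "EF": the bytes after the new terminator are unchanged, they are just no longer reachable from the start of the string.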

    strstr does not do the copying for you. It just finds the position. If you want an index, you can do int offset = newPosition - oldPosition;, but if you need to continue searching, it's easier to work with the newPosition pointer.
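
If an integer index in the style of strpos is really what you want, the pointer subtraction can be wrapped like this. The name str_pos and the choice of -1 for "not found" are assumptions made for this sketch, not an existing C API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical strpos-like helper: returns the index of the first
   occurrence of `needle` in `haystack`, or -1 if it does not occur. */
ptrdiff_t str_pos(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    const char *found = strstr(haystack, needle);
    return found != NULL ? found - haystack : -1;
}
```

For the string in the question, str_pos returns 14 for "BEGIN" and 28 for "END".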

    All this is less intuitive than, e.g., string operations in Java. Except for truncating strings, it is actually more efficient as far as I know, and once you understand the 0-terminated memory layout, it makes a lot of sense. It's only when you think of strings as arrays that it may seem odd to have a pointer somewhere in the middle and continue using it like a regular array. That makes "sub = string + offset" the C way of writing "sub = string.substring(offset)".
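
The "sub = string + offset" idiom can be sketched as a tiny function. The name suffix_from is hypothetical, and the clamping of an out-of-range offset to the end of the string is an added safety assumption, not something plain pointer arithmetic does for you.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: the C counterpart of string.substring(offset).
   No copy is made; the result points into the same memory as `s`.
   An offset past the end is clamped, giving the empty string. */
const char *suffix_from(const char *s, size_t offset)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (offset > len)
        offset = len;
    return s + offset;
}
```

Because the result shares memory with the original, it stays valid only as long as the original string does, and a change to one is visible through the other.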