I'm trying to write code that outputs the length of a string.
int main(){
string s1, s2;
scanf("%s %s", &s1, &s2); //Doesn't work.
cin >> s1, s2; //Works, I inputed abcd abc
int s2len = s2.length(); //Incorrect, outputted 0 instead of 3
int s2len = sizeof(s2.c_str()); //Incorrect, outputted 8
int s2len = strlen(s2.c_str()); //Incorrect, outputted 0
}
Can someone tell me why line 3 doesn't work while 4 does?
Also, I have an understanding that I declared s1 and s2 as strings which can only interact with .length(), which didn't work. Then I tried converting the strings into cstrings with .c_str(), which means I must use sizeof() or strlen(). But that still didn't work.
Can someone tell me what I'm missing?
The reason is that you are making several invalid assumptions about how things work in C++. It is your expectations that are incorrect.
In explaining, I'm also going to have to make assumptions (to fill in relevant information that you omitted). I assume that the code you have shown is incomplete, and that it is actually preceded by something like:
#include <string> // for the std::string type
#include <iostream> // iostreams, including std::cin
#include <cstdio> // C I/O functions like std::scanf()
#include <cstring> // C string functions like std::strlen()
using namespace std; // so you can avoid std:: prefix in your code
When I quote part of your code, I leave the comments intact, and then explain why your code (and therefore your comments) are incorrect.
The first line that doesn't work in your code
scanf("%s %s", &s1, &s2); //Doesn't work.
is wrong because the %s format tells scanf() to ASSUME that the corresponding arguments are pointers to char (i.e. a char *) and that each of those pointers points at an array of char long enough to hold input received.
In reality, s1 and s2 are both C++ objects (formally, instances of std::string, which is another name for std::basic_string<char>). Neither is an array of char.
Since scanf() is being told by the %s format to ASSUME the two arguments passed are arrays of char, and the arguments being passed are not, the call of scanf() has undefined behaviour.
Some compilers will issue a warning on the scanf() call due to the type mismatch. But compilers are not required to do that (that is part of the nature of undefined behaviour: no diagnostics are required).
In the following, I'll assume the call to scanf() did nothing (i.e. it didn't change s1 or s2 or anything else at all). In reality, depending on what you input, it could overwrite arbitrary memory.
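If you do want to use the scanf() family, a minimal sketch of how %s is meant to be used looks like the following. It assumes std::sscanf() (which applies the same format rules as scanf(), but reads from a string instead of standard input), an arbitrary buffer size of 64, and a helper name read_two_words() that I've made up for illustration.

```cpp
#include <cstdio>   // std::sscanf
#include <string>   // std::string

// Hypothetical helper: parses two whitespace-separated words from `line`
// using the same %s rules that scanf() applies to standard input.
// Returns true only if both words were read.
bool read_two_words(const char* line, std::string& first, std::string& second) {
    char buf1[64];  // %s needs a real array of char to write into
    char buf2[64];  // (64 is an arbitrary size chosen for this sketch)

    // The width 63 leaves room for the terminating '\0' in each buffer,
    // so over-long input cannot write past the end of the array.
    if (std::sscanf(line, "%63s %63s", buf1, buf2) != 2) {
        return false;
    }

    first = buf1;   // copy the C strings into std::string objects
    second = buf2;
    return true;
}
```

Calling read_two_words("abcd abc", s1, s2) leaves "abcd" in s1 and "abc" in s2. Note that the arrays are passed as buf1 and buf2 (no &), because an array name already converts to a char *.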
The next assumption you make is that
cin >> s1, s2; //Works, I inputed abcd abc
actually receives input into objects s1 and s2. It doesn't. In reality, the expression cin >> s1, s2 is equivalent to (cin >> s1), s2. This actually has two distinct sub-expressions, (cin >> s1) and s2, separated by the comma operator. The comma operator causes (cin >> s1) to be evaluated (which has the effect of reading data from std::cin to s1), then evaluates s2 (without modifying it).
(There's more to the comma operator than that, but for purposes of explaining here, the above is enough).
Because of that, if you entered abcd abc to that statement, then the result will be that s1 will be a std::string containing "abcd", s2 will be unmodified, and the trailing data abc will be unread (if your code were to read from std::cin later, that data is available to be read then).
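You can check this for yourself with a small sketch. It assumes a std::istringstream standing in for std::cin (so the input is fixed), and the CommaResult struct and read_with_comma() function are names I've invented for the demonstration.

```cpp
#include <sstream>  // std::istringstream
#include <string>   // std::string

// Hypothetical record of what ended up where after the comma expression.
struct CommaResult {
    std::string s1;
    std::string s2;
    std::string leftover;
};

CommaResult read_with_comma(const std::string& input) {
    std::istringstream in(input);  // stands in for std::cin
    CommaResult r;

    // Parsed as (in >> r.s1), r.s2 so only r.s1 is read.
    // Some compilers warn that the right operand has no effect.
    in >> r.s1, r.s2;

    // The word that was never read is still waiting in the stream.
    in >> r.leftover;
    return r;
}
```

With the input "abcd abc", r.s1 holds "abcd", r.s2 stays empty, and r.leftover picks up "abc".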
If you want to read abcd to s1 and abc to s2, one option is cin >> s1 >> s2.
Now we get to
int s2len = s2.length(); //Incorrect, outputted 0 instead of 3
Because of that, s2.length() will give a result of zero. s2 was created as an empty string (with length zero) and is never modified.
Next we get to
int s2len = sizeof(s2.c_str()); //Incorrect, outputted 8
You seem to be assuming this should produce information based upon the user input. Actually, what happens is that the return type of s2.c_str() is a const char *, and sizeof(char *) (for your compiler) gives a size of 8 (indicating you are using a 64-bit compiler). Since sizeof is a compile-time operator, it does not evaluate the size of user input.
Technically sizeof(char *) is implementation-defined. This means that different compilers can give different results. For example, a 32-bit compiler would often evaluate sizeof(char *) as 4. But that value still has no relationship whatsoever to the data you input to your program.
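A short sketch makes the point. The function name pointer_size_of() is something I've made up; it simply wraps the same sizeof expression you wrote.

```cpp
#include <cstddef>  // std::size_t
#include <string>   // std::string

// Hypothetical helper wrapping the expression from the question.
std::size_t pointer_size_of(const std::string& s) {
    // The operand of sizeof is not evaluated: only its type (const char *)
    // matters, so the contents of s play no part in the result.
    return sizeof(s.c_str());
}
```

Whether you pass an empty string or one with a thousand characters, the result is the same value as sizeof(const char *) on your implementation.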
Next we get to
int s2len = strlen(s2.c_str()); //Incorrect, outputted 0
Now, since s2 is an empty string (discussed above), s2.c_str() gives a pointer to the first element of an array (managed internally by std::string) that has a first character '\0'. strlen() counts the number of non-zero characters preceding that '\0', and so gives a result of zero.
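To illustrate, here is a minimal sketch with a helper I've named c_length() (not a standard function). It assumes the string contains no embedded '\0' characters, since strlen() stops at the first one it finds.

```cpp
#include <cstddef>  // std::size_t
#include <cstring>  // std::strlen
#include <string>   // std::string

// Hypothetical helper: measures a std::string the C way.
std::size_t c_length(const std::string& s) {
    // c_str() returns a pointer to a '\0'-terminated array of char;
    // strlen() counts the characters before that terminator.
    return std::strlen(s.c_str());
}
```

For an empty string this gives 0, and for a string holding "abc" it gives 3, the same as calling .length() directly.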
I suggest you find a good textbook on C++, rather than guessing about things like this. Because your guesswork has been badly wrong.