MRE:
std::vector<std::string> someFunction() {
    auto vec;
    return vec;
}
What's stopping "auto" from inferring the type of vec as std::vector<std::string>?
Why can't auto infer the type when the return type is specified?
Because the C++ specification says it can't (see also the answer by @Anya here): auto deduces a variable's type only from its initializer, and auto vec; has none. More specifically, the people who write the specification apparently don't want it to.
So, I'll just explain below how the compiler actually works, not how the compiler should, might, or could work. Languages like Rust, however, are very different, and can do this. Thanks, @Blindy, for pointing that out.
This is a good question.
From your question, it appears that you believe that since the compiler knows the return type, it should be able to deduce that the variable being returned must be of that return type. But, that is not how the C++ compiler works!
Rather, the compiler knows the return type, so it tries to implicitly convert vec to that known return type upon returning. But, it doesn't know how to perform this implicit conversion because it doesn't know the original type of vec.
That's just how the compiler works. Specifying a return type doesn't change the type of vec; it just ensures that the compiler tries to implicitly convert vec to that type.
So, with your code, you get this error when compiled with the GNU g++ compiler as C++17:
main.cpp: In function ‘std::vector<std::__cxx11::basic_string<char> > someFunction()’:
main.cpp:15:5: error: declaration of ‘auto vec’ has no initializer
   15 |     auto vec;
      |     ^~~~
To make my point, check this out. This code compiles! I am forcing a reinterpret cast of the address of an int to a pointer to a std::vector<std::string>, and then dereferencing it:
Run this code online here: https://onlinegdb.com/RaDEmExfF
#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> someFunction()
{
    int i = 0;
    // return by value (copy)
    return *((std::vector<std::string>*)(&i)); // C-style cast
    // OR (same thing)
    // return *reinterpret_cast<std::vector<std::string>*>(&i); // C++ cast
}

int main()
{
    std::vector<std::string> vector = someFunction();
    return 0;
}
That makes no sense, however, and is totally undefined behavior: no std::vector<std::string> object exists at that address, and since sizeof(int) < sizeof(std::vector<std::string>), the space allocated for the int is too small to contain a std::vector<std::string> object anyway. The copy reads garbage, so I get a run-time crash. Here is that run-time crash output on Linux x86-64:
terminate called after throwing an instance of 'std::bad_array_new_length'
what(): std::bad_array_new_length
Just remember, the return type doesn't specify the source type of vec. Rather, it simply enforces the type that vec gets implicitly converted to when it is returned.
Note: the below memory pool demos are not a good use of memory pools in this case. They are just a demo to teach some interesting concepts that are useful in other scenarios, such as when implementing your own malloc() or new from scratch in a deterministic, O(1) sort of way, using memory pools (either statically-allocated, or dynamically-allocated at initialization), rather than relying on the built-in dynamic memory allocators, which are generally not deterministic and not guaranteed to be O(1) in time complexity.
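Check this out. To make things even weirder for you, this next program compiles and runs as expected on my system. Strictly speaking, though, it still has undefined behavior: no std::vector<std::string> object was ever constructed in the buffer, and a byte array is not guaranteed to be suitably aligned for one. It apparently works here only because, in this implementation, an all-zero bit pattern happens to look like an empty vector.
I simply used an array of bytes as a memory pool, reinterpreting the zeroed bytes of a 100-byte pool as a std::vector<std::string> object. I could have used a memory pool of ints too, as int[25].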
#include <cstdint>  // std::uint8_t, std::uint16_t
#include <cstring>  // std::memset
#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> someFunction()
{
    constexpr std::uint16_t NUM_BYTES = 100;
    // make sure the pool is at least large enough to hold the object
    static_assert(NUM_BYTES >= sizeof(std::vector<std::string>));
    std::uint8_t memory_pool[NUM_BYTES];
    // set all bytes in the memory pool to zero
    std::memset(memory_pool, 0, sizeof(memory_pool));
    // return by value (copy)
    return *((std::vector<std::string>*)(memory_pool));
}

int main()
{
    std::vector<std::string> vector = someFunction();
    vector.push_back("hello ");
    vector.push_back("world");
    std::cout << vector[0] << vector[1] << "\n";
    return 0;
}
Now it runs just fine on my system. The output is:
hello world
One more version of this function, which now has optimal memory usage (just like the actual, proper way to do it below does):
std::vector<std::string> someFunction()
{
    uint8_t memory_pool[sizeof(std::vector<std::string>)];
    memset(memory_pool, 0, sizeof(memory_pool));
    // return by value (copy)
    return *((std::vector<std::string>*)(memory_pool));
}
Or, just skip the memory pool and use the proper type in the first place [<== key takeaway here--just do this!]:
std::vector<std::string> someFunction()
{
    std::vector<std::string> vec;
    return vec;
}
...but you can't use auto here:
std::vector<std::string> someFunction()
{
    auto vec; // NOT ok!
    return vec;
}
Notes:
1. The functions above that return a std::vector<std::string> by copy are probably triggering the compiler to make use of "copy elision" and "return value optimization", which from what I understand basically just makes the compiler skip the copy and treat the original variable inside the function as though it was in a scope outside the function.
2. std::cout << "sizeof(std::vector<std::string>) = " << sizeof(std::vector<std::string>) << "\n"; on my Linux x86-64 system with the g++ -std=c++17 compiler outputs sizeof(std::vector<std::string>) = 24, meaning that a vector object of that type is 24 bytes. Therefore, any memory pool that is at least 24 bytes is large enough to hold a std::vector<std::string> object. Since the std::vector just contains pointers to allocated memory where it stores its elements, sizeof(std::vector<std::string>) stays constant even while the vector grows.
auto
Personally, I hate auto. It obfuscates at every turn. I talk about that a bit in my other answer here. The use of auto is a debated topic, however. Some people really love it. And...that's one of the reasons I don't want to work with those people. :) I'd rather go program in C than fight with a C++ developer about auto.
Also, to see how C++ sees your code, run it through https://cppinsights.io/. I've found this tool to be amazing to help me better understand C++! I only found out about it around the time I wrote my other answer, just a couple of months ago (Mar. 2023).