So, we found that a sort wasn’t working correctly. The code for the comparison function looked something like this:
const char* a_name = a->get_name().c_str();
const char* b_name = b->get_name().c_str();
return strcmp(a_name, b_name) < 0
Odd code, but before we change it, let’s figure out why it’s failing. It turned out that this change made it work:
string a_name = a->get_name();
string b_name = b->get_name();
return strcmp(a_name.c_str(), b_name.c_str()) < 0
At first blush, it doesn’t look like that should make a difference. It does, though. To see why, let’s rewrite the original to explicitly show the temporary variables that the compiler must generate.
const char* a_name;
{
string tmp = a->get_name();
a_name = tmp.c_str();
}
const char* b_name;
{
string tmp = b->get_name();
b_name = tmp.c_str();
}
return strcmp(a_name, b_name) < 0
The temporary string
objects go out-of-scope at the end of the statement. (Actually, at the end of the “full expression”.) The c_str
function—in most implementations—returns a pointer to the temporary’s internal data. When the temporary goes out-of-scope, that pointer becomes invalid. Like is so often the case with this kind of thing, it might seem to work because the data at that memory location may not have changed.
In the “fixed” code, the temporaries get promoted to full local variables. There are const char*
temporaries instead (which I think the compiler can optimize away), but they don’t have a destructor to cause the kind of havoc the string
destructor did.
My co-worker wondered why temporaries don’t just last until the end of the function. In section 6.3.2 of The Design and Evolution of C++, Stroustrup says that this was originally the case. He found, however, that his users were wrapping statements that might generate temporaries in their own blocks because enough programs would generate enough large temporaries that this would sometimes exhaust memory.
I’d known there were these kinds of issues with C++ temporaries (and even guessed that that might be the issue here), but I’m embarrassed that I never took the time to understand it well enough to recognize exactly what was happening here.
Of course, there shouldn’t have been an issue here at all, because it should’ve been written:
return a->get_name() < b->get_name();