Monday, December 8, 2014

C++ Bloopers - references to temporary objects

Breaking the encapsulation of a class to expose its internal state is always a potential source of errors.

The STL had to do this to allow communication with C's lower-level strings and arrays. Here's how you can mess things up when using std::string.



It is quite easy to get the following wrong, when you're in a hurry.

    #include <cstdio>
    #include <string>

    const char *make_c_string (const char *text)
    {
        std::string s = "hello, the text is: ";
        s += text;
        return s.c_str();   // BUG: s is destroyed on return, so this pointer dangles
    }

    int main()
    {
        const char *p = make_c_string ("133");
        printf ("%s \n", p);
    }

The problem is that s is a local object, which goes out of scope and is destroyed when make_c_string() returns. Therefore, the string's internal buffer, which is exposed by c_str(), is also gone by the time control returns to main().

The worst thing is that it will sometimes work! If you printf() the string, you'll probably get "hello, the text is: 133" as the output, which looks correct.

The reason is that the string's buffer is allocated on the heap, and printing it right after it is deallocated is likely to succeed, because the memory has not been overwritten yet. It is still undefined behavior, though.
But if you insert some code that allocates memory before printing the string, the error is likely to surface:

    int main()
    {
        const char *p = make_c_string ("133");
        std::string x = "inserted some code here";
        printf ("%s \n", p);
    }

Now, printing p is likely to fail.

Beware that you still return a pointer to the internals of a short-lived object if you do this:

    const char *make_c_string (std::string s)
    {
        s += std::string(" and some text is appended");
        return s.c_str();
    }

Again, s is a string that lives only for the duration of the call to make_c_string(). It is a copy of the caller's argument. Hence, the following code is likely to break:

    std::string x = "133";
    const char *p = make_c_string(x);
    printf ("%s \n", p);




Note, however, that even if you change the code to

    const char *make_c_string_from_ref (const std::string& s)
    {
        ... ;
        return s.c_str();
    }

then it can still fail for the same reason:


  • This is valid code:
    std::string x = "133";
    const char *p = make_c_string_from_ref(x);
    printf ("%s \n", p);
p will point to the internal representation of x, which still lives when p is printed.

  • But this will fail:

    const char *p = make_c_string_from_ref("133");
    printf ("%s \n", p);
"133" creates a temporary std::string, which is passed as an argument to make_c_string_from_ref(). p will point to the internal representation of this temporary object, but the temporary is destroyed at the end of the full-expression, that is, as soon as the statement that initializes p finishes.

  • This will fail also:

    const char *p = make_c_string_from_ref(std::string("133"));
    printf ("%s \n", p);
Again, std::string("133") is itself a temporary object, which is passed to make_c_string_from_ref(). p will point to the internal representation of this temporary object, which is destroyed at the end of the statement that initializes p.



You can make the function more robust by returning a std::string:

    std::string make_string (std::string s)
    {
        s += ... ; 
        return s;
    }

Now this calling method is valid:

    printf ("%s \n", make_string("133").c_str());   // safe
Here, make_string() returns a temporary string. The temporary is guaranteed to live until the end of the full-expression that contains it. Hence, it is still valid during the call to printf().
But watch how naive usage can still fail:

    const char *p = make_string("133").c_str();
    printf ("%s \n", p);
Again, make_string() returns a temporary string. This time we store a pointer to the internals of that temporary, the temporary is destroyed at the end of the statement, and printing p may crash again.

The safest way to guard against this error is to make sure that the lifetime of the string object is at least as long as the pointer's:

    std::string s = make_string("133");
    const char *p = s.c_str();     // safe
    printf ("%s \n", p);



Summary:

  • never store pointers or references to the internals of temporary objects


    printf("%s", i_return_a_std_string().c_str()); // safe

    std::string str = i_return_a_std_string();
    printf("%s", str.c_str());                     // safe

    const char *p = i_return_a_std_string().c_str();
    printf("%s", p);                               // error

    const char *p = i_return_a_C_string();
    printf("%s", p);                               // error, or unsafe
