Search code examples
multithreadingc++11valgrindthread-sanitizer

Thread safety of std::cout insertion operator


I've always thought that using std::cout << something was thread safe.

For this little example

#include <iostream>
#include <thread>

void f()
{
   std::cout << "Hello from f\n";
}

void g()
{
   std::cout << "Hello from g\n";
}

int main()
{
   std::thread t1(f);
   std::thread t2(g);
   t1.join();
   t2.join();
}

my expectation was that the order of the two outputs would be undefined (and indeed that is what I observe in practice), but that the calls to operator<< are thread safe.

However, ThreadSanitizer, DRD and Helgrind all seem to give various errors regarding access to std::__1::ios_base::width(long) and std::__1::basic_ios<char, std::__1::char_traits >::fill()

On Compiler Explorer I don't see any errors.

On FreeBSD 13, ThreadSanitizer gives me 3 warnings, the two listed above plus the malloc/memcpy to the underlying i/o buffer.

Again in FreeBSD 13, DRD gives 4 errors, width() and fill() times two for the two threads.

Finally FreeBSD 13 Helgrind gives one known false positive related to TLS in thread creation, fill()and width() twice.

On Fedora 34

  • No errors with g++ 11.2.1 and ThreadSanitizer
  • DRD complains about malloc/memcpy in fwrite with g++ compiled exe
  • Helgrind also complains about fwrite and also for the construction of cout, again with the g++ compiled exe
  • clang++ 12 ThreadSanitizer complains about fill() and width()
  • DRD with the clang++ compiler exe complains about fill(), width(), fwrite and one other in start_thread
  • Helgrind with the clang++ exe complains about some TLS, fill(), width(), fwrite

macOS XCode clang++ ThreadSanitizer generates warnings as well (which will be libc++).

Looking at the libc++ and libstdc++ code I don't see anything at all that protects width(). So I don't understand why there are no complaints on compiler explorer.

I tried running with TSAN_OPTIONS=print_suppressions=1 and there was no more output (g++ Fedora ThreadSanitizer)

There does seem to be some consensus over the width() and fill() calls.

Looking more closely at the libstdc++ source I see that there is (with some trimming and comments):

// ostream_insert.h
// __n is the length of the string pointed to by __s
  template<typename _CharT, typename _Traits>
    basic_ostream<_CharT, _Traits>&
    __ostream_insert(basic_ostream<_CharT, _Traits>& __out,
             const _CharT* __s, streamsize __n)
{
    typedef basic_ostream<_CharT, _Traits>       __ostream_type;
    typedef typename __ostream_type::ios_base    __ios_base;

    typename __ostream_type::sentry __cerb(__out);
    if (__cerb)
    {
        __try
        {
            const streamsize __w = __out.width();
            if (__w > __n)
            {
                // snipped
                // handle padding
            }
            else
              __ostream_write(__out, __s, __n);
          // why no hazard here?
          __out.width(0);
      }

__out is the stream object, global cout in this case. I don't see anything like locks or atomics.

Any suggestions as to how ThreadSanitizer/g++ is getting a "clean" output?

There is this somewhat cryptic comment


  template<typename _CharT, typename _Traits>
    basic_ostream<_CharT, _Traits>::sentry::
    sentry(basic_ostream<_CharT, _Traits>& __os)
    : _M_ok(false), _M_os(__os)
    {
      // XXX MT
      if (__os.tie() && __os.good())
    __os.tie()->flush();

The libc++ code looks similar. In iostream

template<class _CharT, class _Traits>
basic_ostream<_CharT, _Traits>&
__put_character_sequence(basic_ostream<_CharT, _Traits>& __os,
                          const _CharT* __str, size_t __len)
{
#ifndef _LIBCPP_NO_EXCEPTIONS
    try
    {
#endif // _LIBCPP_NO_EXCEPTIONS
        typename basic_ostream<_CharT, _Traits>::sentry __s(__os);
        if (__s)
        {
            typedef ostreambuf_iterator<_CharT, _Traits> _Ip;
            if (__pad_and_output(_Ip(__os),
                                 __str,
                                 (__os.flags() & ios_base::adjustfield) == ios_base::left ?
                                     __str + __len :
                                     __str,
                                 __str + __len,
                                 __os,
                                 __os.fill()).failed())
                __os.setstate(ios_base::badbit | ios_base::failbit);

and in locale


template <class _CharT, class _OutputIterator>
_LIBCPP_HIDDEN
_OutputIterator
__pad_and_output(_OutputIterator __s,
                 const _CharT* __ob, const _CharT* __op, const _CharT* __oe,
                 ios_base& __iob, _CharT __fl)
{
    streamsize __sz = __oe - __ob;
    streamsize __ns = __iob.width();
    if (__ns > __sz)
        __ns -= __sz;
    else
        __ns = 0;
    for (;__ob < __op; ++__ob, ++__s)
        *__s = *__ob;
    for (; __ns; --__ns, ++__s)
        *__s = __fl;
    for (; __ob < __oe; ++__ob, ++__s)
        *__s = *__ob;
    __iob.width(0);
    return __s;
}

Again I see no thread protection, but also this time the tools detect a hazard.

Are these real issues? For plain calls to operator<< the value of width doesn't change, and is always 0.


Solution

  • I got the answer from Jonathan Wakely. Makes me feel rather stupid.

    The difference is that on Fedora, libstdc++.so contains an explicit instantiation of the iostream classes. libstdc++.so isn't instrumented for ThreadSanitizer so it cannot detect any hazards related to it.