Search code examples
c++c++11compiler-optimizationfseek

c++ O3 optimization breaks working while loop


I have this simple code that searches the file for "data" step by step using fseek in a pcm wav file:

  FILE * waveFile;
  waveFile = fopen ( this->fileLocation.c_str ( ), "rb" );

  // ... some other code here between, then ... //

  int seekTo = 0;
  bool found = false;
  char data[4];

  rewind ( waveFile );
  while ( !found && ( fseek ( waveFile, seekTo, SEEK_SET ) == 0 )) {
    fread ( data, sizeof ( data ), 1, waveFile );
    if (( std::strcmp ( data, "data" ) == 0 ) || ( std::strcmp ( data, "Data" ) == 0 ) || ( std::strcmp ( data, "DATA" ) == 0 )) {
      found = true;
      fread ( &waveHeader->DATA_SIZE, sizeof ( waveHeader->DATA_SIZE ), 1, waveFile );
    }
    seekTo++;
  }

The code works correctly, and on test files it finds the data, reads the remaining. As the "data" is near to the beginning on even the biggest files, this code is ok for me.

But, when I add the cpp flag -O3, code goes haywire, while loop never finishes.

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -O3")

I'm using cmake + lldb (osx, clion), same thing happens if I use GDB.

What can be the problem, how can I resolve this?

PS. I'm not trying to improve the code you see, I'm trying to understand why compiler optimization hacks this while loop.

PSS. Here is the null terminated working code:

  int seekTo = 0;
  char data[5];

  rewind ( waveFile );
  while (( fseek ( waveFile, seekTo, SEEK_SET ) == 0 )) {
    fread ( data, 4, 1, waveFile );
    data[ 4 ] = '\0';

    if (( std::strcmp ( data, "data" ) == 0 ) || ( std::strcmp ( data, "Data" ) == 0 ) || ( std::strcmp ( data, "DATA" ) == 0 )) {
      fread ( &waveHeader->DATA_SIZE, sizeof ( waveHeader->DATA_SIZE ), 1, waveFile );
      break;
    }
    seekTo += 1;
  }

Solution

  • Since nobody else wants to write an answer... When code works with optimizations off but stops working with optimizations on, it's likely some undefined behavior that's revealed by a compiler optimization. In your case that bug is:

    char data[4];
    ...
    fread ( data, sizeof ( data ), 1, waveFile );
    if (( std::strcmp ( data, "data" ) == 0 ) || ( std::strcmp ( data, "Data" ) == 0 ) || ( std::strcmp ( data, "DATA" ) == 0 )) {
    

    strcmp is for:

    Compares two null-terminated byte strings lexicographically.

    So either data happens to have a \0 in it somewhere, and the comparison is false (because data would be too short). Or it doesn't, and you're going to read way off the end of data toward some random null byte in memory. As a result, the compiler could deduce that there's no way that comparison could be true and optimize your code into:

    if (false) { ... }
    

    and then drop the if statement completely.

    Perhaps in the non-optimized build, you happened to always have zero memory immediately after data and the if was never optimized out?


    An easy fix for this would be to ensure that data is null-terminated:

    char data[5];
    data[4] = '\0';
    // rest as before
    

    Or to replace your calls of strcmp to memcmp, providing sizeof(data) as the additional length argument.