Tags: c, language-lawyer, strict-aliasing, memset

Is it well-defined to use memset on a dynamic bool array?


Is the behavior of this code well-defined with respect to strict aliasing?

_Bool* array = malloc(n);
memset(array, 0xFF, n);
_Bool x = array[0];

The rule of effective type has special cases for memcpy and memmove (C17 6.5 §6) but not for memset.

My take is that the effective type becomes unsigned char, because the second parameter of memset is required to be converted to unsigned char (C17 7.24.6.1) and because of the rule of effective type (C17 6.5 §6):

...or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

  • Question 1: What is the effective type of the data stored in array after the memset call?
  • Question 2: Does the array[0] access therefore violate strict aliasing, given that _Bool, unlike the character types, is not exempt from the strict aliasing rule?

Solution

    1. memset does not change the effective type. C11 (C17) 6.5p6:

      1. The effective type of an object for an access to its stored value is the declared type of the object, if any. [ This clearly is not the case. An allocated object has no declared type. ]

        If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. [ this is not the case as an lvalue of character type is used by memset! ]

        If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. [ this too is not the case here - it is not copied with memcpy, memmove or an array of characters ]

        For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access. [ therefore, this has to apply in our case. Notice that this applies to accessing it as characters inside memset as well as dereferencing array. ]

      Since memset stores the values through an lvalue of character type, rather than copying bytes from another object through lvalues of character type (that clause exists to equate memcpy and memmove with an explicit for loop doing the same thing), the object does not get an effective type from the memset call, and the effective type of the elements is _Bool for the accesses made through array.

      There might be parts in the C17 standard that are underspecified, but this certainly is not one of those cases.
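
      To make the character-lvalue argument concrete, here is a minimal sketch of a memset-like loop. The names byte_fill and fill_and_inspect are hypothetical, and this is not how any particular C library implements memset; it only illustrates that every store goes through an unsigned char lvalue, and that reading the bytes back through unsigned char is always permitted.

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdlib.h>

      /* Hypothetical stand-in for memset (not the library implementation):
         every store goes through an lvalue of type unsigned char, so the
         "not a character type" sentence of 6.5p6 never applies. */
      static void *byte_fill(void *dst, int value, size_t n)
      {
          assert(dst != NULL);
          unsigned char *p = dst;
          for (size_t i = 0; i < n; i++) {
              p[i] = (unsigned char)value;
          }
          return dst;
      }

      /* Fills a fresh allocation and inspects it through unsigned char only.
         Returns 1 if every byte reads back as 0xFF, 0 if not,
         and -1 if the allocation failed. */
      int fill_and_inspect(size_t n)
      {
          unsigned char *raw = malloc(n);
          if (raw == NULL) {
              return -1;
          }
          byte_fill(raw, 0xFF, n);

          int all_ff = 1;
          for (size_t i = 0; i < n; i++) {
              if (raw[i] != 0xFF) {
                  all_ff = 0;
              }
          }
          free(raw);
          return all_ff;
      }
      ```

      Inspecting the bytes this way stays within the rules regardless of what type is later used to access the allocation; it says nothing about whether those bytes form a valid _Bool.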

    2. array[0] would not violate the effective type rule.

      That does not make using the value of array[0] any more legal. The byte 0xFF need not be a valid object representation of _Bool, so it can be (and most probably is) a trap representation!

      I tried the following functions:

      #include <stdio.h>
      #include <stdbool.h>        
      
      void f1(bool x, bool y) {
          if (!x && !y) {
              puts("both false");
          }
      }
      
      
      void f2(bool x, bool y) {
          if (x && y) {
              puts("both true");
          }
      }
      
      void f3(bool x) {
          if (x) {
              puts("true");
          }
      }
      
      void f4(bool x) {
          if (!x) {
              puts("false");
          }
      }
      

      with array[0] passed as each of the arguments. To avoid compile-time optimizations across the calls, the functions were compiled separately from the caller. When compiled with -O3, the following messages were printed:

      both true
      true
      

      And without any optimization:

      both false
      both true
      true
      false
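
For contrast, a minimal sketch of an initialization that avoids the problem, assuming the goal is simply an array of n elements that are all true or all false. The helper names make_flags and count_true are hypothetical. Each element is stored through a _Bool lvalue, so the effective type becomes _Bool and the stored representation is a valid one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: returns n bools all set to `initial`,
   or NULL if the allocation fails. The caller frees the result. */
bool *make_flags(size_t n, bool initial)
{
    bool *flags = malloc(n * sizeof *flags);
    if (flags == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        flags[i] = initial;  /* store through a _Bool lvalue */
    }
    return flags;
}

/* Counts the elements that are true. Reading is fine here because
   every element was written as a _Bool beforehand. */
size_t count_true(const bool *flags, size_t n)
{
    assert(flags != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (flags[i]) {
            count++;
        }
    }
    return count;
}
```

If all elements should be false, calloc or memset with 0 also works, since the all-bits-zero representation is guaranteed to be the value zero for every integer type (C17 6.2.6.2 §5).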