
Why is this VLA (variable-length array) definition unreliable?


Why doesn't this code defining and using a VLA (variable-length array) work reliably?

#include <stdio.h>

int main(void)
{
    int n;
    double vla[n];

    if (scanf("%d", &n) != 1)
        return 1;
    for (int i = 0; i < n; i++)
    {
        if (scanf("%lf", &vla[i]) != 1)
            return 1;
    }
    for (int i = 0; i < n; i++)
        printf("[%d] = %.2f\n", i, vla[i]);
    return 0;
}

Solution

    Diagnosis

    In the code in the question, the variable n is uninitialized when it is used in the definition of vla. Indeed, with GCC's strict warning options enabled, the code shown produces a compilation error (it gives only a warning if you are careless enough to omit -Werror from your compilation options; don't do that!):

    $ gcc -std=c11 -O3 -g -Wall -Wextra -Werror -Wstrict-prototypes -Wmissing-prototypes -Wshadow -pedantic-errors  vla37.c -o vla37  
    vla37.c: In function ‘main’:
    vla37.c:6:5: error: ‘n’ is used uninitialized [-Werror=uninitialized]
        6 |     double vla[n];
          |     ^~~~~~
    vla37.c:5:9: note: ‘n’ declared here
        5 |     int n;
          |         ^
    cc1: all warnings being treated as errors
    $
    

    (That's from GCC 11.2.0 on a machine running Red Hat Enterprise Linux (RHEL) 7.4.)

    The trouble is that the size of a VLA is fixed at the moment execution reaches its declaration, but the value in n is indeterminate at that point because it is uninitialized. It could be huge; it could be zero; it could be negative. Reading a value into n afterwards does not resize the array.

    This analysis also applies to multi-dimensional arrays with variable bounds — whether the bounds are the same for each dimension or different dimensions have different bounds. All the following declarations are flawed — and the extension to three or more dimensions should be obvious.

    int n1, n2;
    double array1[n1][n1];
    double array2[n1][n2];
    double array3[10][n2];
    double array4[n1][10];
    

    Prescription

    The cure for the problem is simple — make sure the size is known and sane before it is used to declare the VLA:

    #include <stdio.h>
    
    int main(void)
    {
        int n;
    
        if (scanf("%d", &n) != 1 || n <= 0 || n > 1024)
            return 1;
    
        double vla[n];
        for (int i = 0; i < n; i++)
        {
            if (scanf("%lf", &vla[i]) != 1)
                return 1;
        }
        for (int i = 0; i < n; i++)
            printf("[%d] = %.2f\n", i, vla[i]);
        return 0;
    }
    

    Now you can run the result:

    $ vla41 <<<'9 2.34 3.45 6.12 8.12 99.60 -12.31 1 2 3'
    [0] = 2.34
    [1] = 3.45
    [2] = 6.12
    [3] = 8.12
    [4] = 99.60
    [5] = -12.31
    [6] = 1.00
    [7] = 2.00
    [8] = 3.00
    $
    

    (That assumes your shell is Bash, or is compatible with Bash, and supports 'here strings', which is the <<<'…' notation.)

    The code shown in the question and in this answer is barely adequate in handling I/O errors; it detects input problems but doesn't provide useful feedback to the user. The code shown in the answer does rudimentary validation of the value of n for plausibility, but the error reporting is abysmal (non-existent). You should ensure that the size is larger than zero and less than some upper bound. The maximum size depends on the size of the data being stored in the VLA and the platform you're on.
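    As a minimal sketch of what more useful feedback could look like, the size check can be moved into a helper that says what went wrong. The name read_count and its parameters are hypothetical, not part of the original answer, and the sketch assumes diagnostics belong on stderr.

```c
#include <stdio.h>

/*
 * Read an element count from fp and check that it lies in 1..max.
 * On success, store the count in *out and return 0.
 * On failure, print a diagnostic to stderr and return -1.
 * (read_count is a hypothetical helper name used for illustration.)
 */
int read_count(FILE *fp, int max, int *out)
{
    int n;

    if (fscanf(fp, "%d", &n) != 1)
    {
        fprintf(stderr, "error: could not read the number of values\n");
        return -1;
    }
    if (n <= 0 || n > max)
    {
        fprintf(stderr, "error: count %d is outside the range 1..%d\n", n, max);
        return -1;
    }
    *out = n;
    return 0;
}
```

    In main(), a call such as read_count(stdin, 1024, &n) would come before the definition of vla, so the array is only defined once n is known to be sane.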

    If you're on a Unix-like machine, you probably have 8 MiB of stack; if you're on a Windows machine, you probably have 1 MiB of stack; if you're on an embedded system, you may have much less stack available to you. You need to leave some stack space for other code too, so you should probably check that the array size is not more than, for sake of discussion, 1024 — that would be 8 KiB of stack for an array of double, which is not huge at all but it provides plenty of space for most homework programs. Tweak the number larger to suit your purposes, but when the number grows, you should use malloc() et al to dynamically allocate the array instead of using an on-stack VLA. For example, on a Windows machine, if you use a VLA of type int, setting the size above 262,144 (256 * 1024) almost guarantees that your program will crash, and it may crash at somewhat smaller sizes than that.
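    The arithmetic above (1024 elements of 8 bytes is 8 KiB; 262,144 elements of 4 bytes is 1 MiB) can be turned into a check against a byte budget. This is only a sketch: vla_size_ok and the idea of passing an explicit budget are assumptions made for illustration, and the element sizes quoted assume an 8-byte double and a 4-byte int.

```c
#include <stddef.h>

/*
 * Return 1 if n elements of elem_size bytes fit within budget bytes,
 * and 0 otherwise (including when n or elem_size is zero).
 * Dividing the budget, instead of multiplying n by elem_size, avoids
 * overflow in the size calculation.
 * (vla_size_ok is a hypothetical helper name used for illustration.)
 */
int vla_size_ok(size_t n, size_t elem_size, size_t budget)
{
    if (n == 0 || elem_size == 0)
        return 0;
    return n <= budget / elem_size;
}
```

    A caller might use vla_size_ok(n, sizeof(double), 8192) before defining double vla[n]. The budget should be well below the total stack size so that other code still has room.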

    Be aware that if you try to create an automatically allocated variable (e.g. a large VLA) that exceeds the available stack space, your program will be terminated 'with extreme prejudice'. It will probably not have a chance to report an error; it will simply stop. This happens whenever any automatic variable allocation exceeds the available space, not only with VLAs. If this is a concern, use dynamic memory allocation (malloc() et al) instead of automatic memory allocation; you can handle allocation failures, and the space available is usually far larger than the space available for automatic variable allocations.
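    A minimal sketch of the dynamic-allocation alternative follows. The helper name read_values is hypothetical; the point is only that a failed malloc() can be detected and reported, which a failed VLA allocation cannot.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Allocate n doubles on the heap and fill them from fp.
 * Returns the array (the caller must free() it), or NULL after
 * printing a diagnostic to stderr.
 * (read_values is a hypothetical helper name used for illustration.)
 */
double *read_values(FILE *fp, size_t n)
{
    /* Reject empty requests and sizes whose byte count would overflow. */
    if (n == 0 || n > SIZE_MAX / sizeof(double))
    {
        fprintf(stderr, "error: invalid number of values: %zu\n", n);
        return NULL;
    }

    double *data = malloc(n * sizeof(*data));
    if (data == NULL)
    {
        fprintf(stderr, "error: failed to allocate %zu values\n", n);
        return NULL;
    }
    for (size_t i = 0; i < n; i++)
    {
        if (fscanf(fp, "%lf", &data[i]) != 1)
        {
            fprintf(stderr, "error: failed to read value %zu\n", i);
            free(data);
            return NULL;
        }
    }
    return data;
}
```

    The caller indexes the returned pointer exactly as it would index the VLA, and calls free() when the data is no longer needed.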

    Lessons to learn

    • Compile with stringent warning options.
    • Compile with -Werror or its equivalent so warnings are treated as errors.
    • Make sure the variable defining the size of a VLA is initialized, and its value checked, before defining the array.
      • Not too small (not zero, not negative).
      • Not too big (not using more than 1 megabyte on Windows, 8 megabytes on Unix).
      • Leave a decent margin for other code to use as well.

    Note that all compilers that support VLAs also support variables defined at arbitrary points within a function. Both features were added in C99. VLAs were made optional in C11 — and a compiler should define __STDC_NO_VLA__ if it does not support VLAs at all but claims conformance with C11 or later.
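    One way to act on the macro is a compile-time test, sketched below. HAVE_AUTO_VLA and have_auto_vla are hypothetical names. The sketch assumes that a compiler reporting C99 or later without defining __STDC_NO_VLA__ supports VLAs, and conservatively treats anything older as lacking them.

```c
/*
 * HAVE_AUTO_VLA is a hypothetical macro name used only in this sketch.
 * __STDC_NO_VLA__ is meaningful from C11 onwards; C99 requires VLA
 * support, and compilers older than C99 are assumed not to have it.
 */
#if defined(__STDC_NO_VLA__)
#define HAVE_AUTO_VLA 0
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define HAVE_AUTO_VLA 1
#else
#define HAVE_AUTO_VLA 0
#endif

/* Expose the compile-time result at run time, e.g. for a diagnostic. */
int have_auto_vla(void)
{
    return HAVE_AUTO_VLA;
}
```

    Code that needs a variable-sized buffer can then use #if HAVE_AUTO_VLA to choose between a bounded VLA and a malloc()-based fallback.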

    C23 and variable-length arrays

    The proposed C23 standard will require support for VLAs as arguments to functions and, therefore, in prototype definitions too. This is a good move; for matrix operations, the ability to write generic functions which work with arbitrary array sizes is beneficial. However, it permits implementations that do not support VLAs with automatic storage to document this by defining the __STDC_NO_VLA__ macro to 1:

    __STDC_NO_VLA__
    The integer constant 1, intended to indicate that the implementation does not support variable length arrays with automatic storage duration. Parameters declared with variable length array types are adjusted and then define objects of automatic storage duration with pointer types. Thus, support for such declarations is mandatory.

    Thus, the meaning of __STDC_NO_VLA__ changes in C23 from what it meant in C11 and C18.

    Thanks to Lundin for pointing out this change. See N3054 §6.10.9 Conditional feature macros (or a later draft of the proposed C23 standard) for more information. Some later drafts are password protected.
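    To illustrate the kind of generic matrix function this enables, here is a minimal sketch of a function with a VLA parameter. The name matrix_sum is hypothetical. The bounds are passed before the array so that they are in scope when the array parameter is declared.

```c
#include <stddef.h>

/*
 * Sum every element of a rows x cols matrix.
 * The parameter m is adjusted to a pointer to an array of cols doubles,
 * so no variable-length array object is created on the stack here.
 * (matrix_sum is a hypothetical function name used for illustration.)
 */
double matrix_sum(size_t rows, size_t cols, double m[rows][cols])
{
    double total = 0.0;

    for (size_t r = 0; r < rows; r++)
    {
        for (size_t c = 0; c < cols; c++)
            total += m[r][c];
    }
    return total;
}
```

    The same function accepts a fixed-size array such as double m[2][3], or heap storage obtained with double (*m)[cols] = malloc(rows * sizeof(*m)), so callers do not need an automatic VLA to use it.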

    C++ and variable-length arrays

    Standard C++ does not support C-style VLAs. However, GCC (g++) does support them by default as an extension. This can cause confusion. If you're writing C++, you should not use VLAs. C++ has better facilities: std::vector (from <vector>) for arrays whose size is chosen at run time, and std::array (from <array>) for arrays whose size is fixed at compile time.