Tags: c++, for-loop, parsing, stack-overflow, compile-time

Compile time for loop throwing "parser stack overflow, program too complex"


The following is a compile-time for loop:

namespace nstl
{
template <typename index_type, index_type begin, index_type stride, index_type... indices, typename function_type>
void compile_time_for_internal(std::index_sequence<indices...>, function_type function)
{
  (function(begin + indices * stride), ...);
}
template <typename index_type, index_type begin, index_type end, index_type stride, typename function_type>
void compile_time_for(function_type function)
{
  compile_time_for_internal<index_type, begin, stride>(std::make_index_sequence<end-begin>{}, function);
}
}

When I attempt to run this loop with up to 596 elements it works:

int main(int argc, char** argv)
{
  auto data = std::vector<double>(596);
  std::iota(data.begin(), data.end(), 0);

  auto sum(0.0);
  nstl::compile_time_for<std::size_t, 0, 596, 1>([&] (const auto& i)
  {
    sum += data[i];
  });
}

If I use 597 elements, compilation fails with "parser stack overflow, program too complex". I have some ideas about why this might happen, such as the fold expression using some sort of recursion to expand the function calls at compile time, but this is just a theory. So I'd like to ask: What are the reasons for this? Are there workarounds to avoid this issue?


Solution

  • GCC compiles your code without issues with 597 elements as well: https://godbolt.org/z/15ETd9rKM. I tried increasing the depth further, but Compiler Explorer killed the process before GCC itself reported an error.

    Clang reports the following error message:

    <source>:11:40: fatal error: instantiating fold expression with 597 arguments exceeded expression nesting limit of 256
      (function(begin + indices * stride), ...);
      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
    <source>:16:3: note: in instantiation of function template specialization 'nstl::compile_time_for_internal<unsigned long, 0UL, 1UL, 0UL, 1UL, 2UL, 3UL, 4UL, 5UL, 6UL, 7UL, 8UL, 9UL, 10UL, 11UL, 12UL, 13UL, 14UL, 15UL, 16UL, 17UL, 18UL, 19UL, 20UL, 21UL, 22UL, 23UL, 24UL, 25UL, 26UL, 27UL, 28UL, 29UL, 30UL, 31UL, 32UL, 33UL, 34UL, 35UL, 36UL, 37UL, 38UL, 39UL, 40UL, 41UL, 42UL, 43UL, 44UL, 45UL, 46UL, 47UL, 48UL, 49UL, 50UL, 51UL, 52UL, 53UL, 54UL, 55UL, 56UL, 57UL, 58UL, 59UL, 60UL, 61UL, 62UL, 63UL, 64UL, 65UL, 66UL, 67UL, 68UL, 69UL, 70UL, 71UL, 72UL, 73UL, 74UL, 75UL, 76UL, 77UL, 78UL, 79UL, 80UL, 81UL, 82UL, 83UL, 84UL, 85UL, 86UL, 87UL, 88UL, 89UL, 90UL, 91UL, 92UL, 93UL, 94UL, 95UL, 96UL, 97UL, 98UL, 99UL, 100UL, 101UL, 102UL, 103UL, 104UL, 105UL, 106UL, 107UL, 108UL, 109UL, 110UL, 111UL, 112UL, 113UL, 114UL, 115UL, 116UL, 117UL, 118UL, 119UL, 120UL, 121UL, 122UL, 123UL, 124UL, 125UL, 126UL, 127UL, 128UL, 129UL, 130UL, 131UL, 132UL, 133UL, 134UL, 135UL, 136UL, 137UL, 138UL, 139UL, 140UL, 141UL, 142UL, 143UL, 144UL, 145UL, 146UL, 147UL, 148UL, 149UL, 150UL, 151UL, 152UL, 153UL, 154UL, 155UL, 156UL, 157UL, 158UL, 159UL, 160UL, 161UL, 162UL, 163UL, 164UL, 165UL, 166UL, 167UL, 168UL, 169UL, 170UL, 171UL, 172UL, 173UL, 174UL, 175UL, 176UL, 177UL, 178UL, 179UL, 180UL, 181UL, 182UL, 183UL, 184UL, 185UL, 186UL, 187UL, 188UL, 189UL, 190UL, 191UL, 192UL, 193UL, 194UL, 195UL, 196UL, 197UL, 198UL, 199UL, 200UL, 201UL, 202UL, 203UL, 204UL, 205UL, 206UL, 207UL, 208UL, 209UL, 210UL, 211UL, 212UL, 213UL, 214UL, 215UL, 216UL, 217UL, 218UL, 219UL, 220UL, 221UL, 222UL, 223UL, 224UL, 225UL, 226UL, 227UL, 228UL, 229UL, 230UL, 231UL, 232UL, 233UL, 234UL, 235UL, 236UL, 237UL, 238UL, 239UL, 240UL, 241UL, 242UL, 243UL, 244UL, 245UL, 246UL, 247UL, 248UL, 249UL, 250UL, 251UL, 252UL, 253UL, 254UL, 255UL, 256UL, 257UL, 258UL, 259UL, 260UL, 261UL, 262UL, 263UL, 264UL, 265UL, 266UL, 267UL, 268UL, 269UL, 270UL, 271UL, 272UL, 273UL, 274UL, 275UL, 276UL, 277UL, 278UL, 279UL, 280UL, 281UL, 
282UL, 283UL, 284UL, 285UL, 286UL, 287UL, 288UL, 289UL, 290UL, 291UL, 292UL, 293UL, 294UL, 295UL, 296UL, 297UL, 298UL, 299UL, 300UL, 301UL, 302UL, 303UL, 304UL, 305UL, 306UL, 307UL, 308UL, 309UL, 310UL, 311UL, 312UL, 313UL, 314UL, 315UL, 316UL, 317UL, 318UL, 319UL, 320UL, 321UL, 322UL, 323UL, 324UL, 325UL, 326UL, 327UL, 328UL, 329UL, 330UL, 331UL, 332UL, 333UL, 334UL, 335UL, 336UL, 337UL, 338UL, 339UL, 340UL, 341UL, 342UL, 343UL, 344UL, 345UL, 346UL, 347UL, 348UL, 349UL, 350UL, 351UL, 352UL, 353UL, 354UL, 355UL, 356UL, 357UL, 358UL, 359UL, 360UL, 361UL, 362UL, 363UL, 364UL, 365UL, 366UL, 367UL, 368UL, 369UL, 370UL, 371UL, 372UL, 373UL, 374UL, 375UL, 376UL, 377UL, 378UL, 379UL, 380UL, 381UL, 382UL, 383UL, 384UL, 385UL, 386UL, 387UL, 388UL, 389UL, 390UL, 391UL, 392UL, 393UL, 394UL, 395UL, 396UL, 397UL, 398UL, 399UL, 400UL, 401UL, 402UL, 403UL, 404UL, 405UL, 406UL, 407UL, 408UL, 409UL, 410UL, 411UL, 412UL, 413UL, 414UL, 415UL, 416UL, 417UL, 418UL, 419UL, 420UL, 421UL, 422UL, 423UL, 424UL, 425UL, 426UL, 427UL, 428UL, 429UL, 430UL, 431UL, 432UL, 433UL, 434UL, 435UL, 436UL, 437UL, 438UL, 439UL, 440UL, 441UL, 442UL, 443UL, 444UL, 445UL, 446UL, 447UL, 448UL, 449UL, 450UL, 451UL, 452UL, 453UL, 454UL, 455UL, 456UL, 457UL, 458UL, 459UL, 460UL, 461UL, 462UL, 463UL, 464UL, 465UL, 466UL, 467UL, 468UL, 469UL, 470UL, 471UL, 472UL, 473UL, 474UL, 475UL, 476UL, 477UL, 478UL, 479UL, 480UL, 481UL, 482UL, 483UL, 484UL, 485UL, 486UL, 487UL, 488UL, 489UL, 490UL, 491UL, 492UL, 493UL, 494UL, 495UL, 496UL, 497UL, 498UL, 499UL, 500UL, 501UL, 502UL, 503UL, 504UL, 505UL, 506UL, 507UL, 508UL, 509UL, 510UL, 511UL, 512UL, 513UL, 514UL, 515UL, 516UL, 517UL, 518UL, 519UL, 520UL, 521UL, 522UL, 523UL, 524UL, 525UL, 526UL, 527UL, 528UL, 529UL, 530UL, 531UL, 532UL, 533UL, 534UL, 535UL, 536UL, 537UL, 538UL, 539UL, 540UL, 541UL, 542UL, 543UL, 544UL, 545UL, 546UL, 547UL, 548UL, 549UL, 550UL, 551UL, 552UL, 553UL, 554UL, 555UL, 556UL, 557UL, 558UL, 559UL, 560UL, 561UL, 562UL, 563UL, 564UL, 565UL, 566UL, 
567UL, 568UL, 569UL, 570UL, 571UL, 572UL, 573UL, 574UL, 575UL, 576UL, 577UL, 578UL, 579UL, 580UL, 581UL, 582UL, 583UL, 584UL, 585UL, 586UL, 587UL, 588UL, 589UL, 590UL, 591UL, 592UL, 593UL, 594UL, 595UL, 596UL, (lambda at <source>:27:50)>' requested here
      compile_time_for_internal<index_type, begin, stride>(std::make_index_sequence<end-begin>{}, function);
      ^
    <source>:27:9: note: in instantiation of function template specialization 'nstl::compile_time_for<unsigned long, 0UL, 597UL, 1UL, (lambda at <source>:27:50)>' requested here
      nstl::compile_time_for<std::size_t, 0, 597, 1>([&] (const auto& i)
            ^
    <source>:11:40: note: use -fbracket-depth=N to increase maximum nesting level
      (function(begin + indices * stride), ...);
                                           ^
    1 error generated.
    Compiler returned: 1
    

    And indeed, with -fbracket-depth=600 Clang compiles the code as well: https://godbolt.org/z/59Y8ze6a8.
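
    To see why the nesting depth grows with the number of elements, it may help to look at what a fold over the comma operator stands for. The following is a small sketch with hypothetical helper names (call_each, fold_order, nested_order): a unary right fold with three arguments is equivalent to a single nested comma expression, so every additional argument adds one more level of nesting. That matches Clang's message about the expression nesting limit; whether MSVC's parser limit has the same cause is an assumption.

    ```cpp
    #include <cassert>
    #include <string>

    // Hypothetical helper: calls function once per argument using a
    // unary right fold over the comma operator.
    template <typename function_type, typename... args_type>
    void call_each(function_type function, args_type... args)
    {
      (function(args), ...);
    }

    // Records the call order produced by the fold expression.
    inline std::string fold_order()
    {
      std::string log;
      call_each([&](int i) { log += std::to_string(i); }, 1, 2, 3);
      return log;
    }

    // Records the call order of the hand-written equivalent.
    inline std::string nested_order()
    {
      std::string log;
      auto f = [&](int i) { log += std::to_string(i); };
      // What the fold above stands for with three arguments:
      // one expression, nested one level per additional argument.
      (f(1), (f(2), f(3)));
      return log;
    }
    ```

    Both functions visit the arguments in the same left-to-right order; only the way the expression is written differs.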

    I did not try MSVC, but I suppose all three major compilers have a flag that lets you change that limit.

    Update: I am not very familiar with MSVC (it is the only one that reported a message saying "program too complex" when I did try it, so I suppose it is the one you are using). I looked for a flag to increase the limit, but I didn't find one. So my assumption that all three major compilers have such a flag seems to be wrong. A workaround is needed.

    You can manually unroll the loop to reduce the nesting depth of each fold expression:

    #include <iostream>
    #include <utility>
    #include <vector>
    #include <numeric>
    
    namespace nstl
    {
    template <typename index_type, index_type begin, index_type stride, index_type... indices, typename function_type>
    void compile_time_for_internal(std::index_sequence<indices...>, function_type function)
    {
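      // Two folds, each with half as many operands as the original single fold.
      // Note: all even offsets are visited first, then all odd ones,
      // so the call order differs from that of a plain loop.
      (function(begin + indices*2*stride),...);
      (function(begin + (indices*2+1)*stride),...);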
    }
    template <typename index_type, index_type begin, index_type end, index_type stride, typename function_type>
    void compile_time_for(function_type function)
    {
      compile_time_for_internal<index_type, begin, stride>(std::make_index_sequence<(end-begin)/2>{}, function);
    }
    }
    
    
    int main(int argc, char** argv)
    {
      auto data = std::vector<double>(600);
      std::iota(data.begin(), data.end(), 0);
    
      auto sum(0.0);
      nstl::compile_time_for<std::size_t, 0, 600, 1>([&] (const auto& i)
      {
        sum += data[i];
      });
      std::cout << sum << "\n";
      // Reference result from an ordinary run-time loop over the same 600 values.
      sum = 0.0;
      for (int i=0; i < 600; ++i) sum+= i;
      std::cout << sum << "\n";
    }
    

    This works only for an even number of iterations. To also support an odd number of iterations, you would need to distinguish the two cases and handle the last iteration separately. Of course, you can unroll more iterations to push the limit further.
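
    One possible way to cover odd counts and to unroll further at the same time is to split the range into fixed-size chunks: an outer fold walks over the chunks, an inner fold walks over the elements of one chunk, and a final call handles the remainder. The following is a minimal C++17 sketch and was not tested with MSVC. The helper names for_chunk and for_chunks, the chunk parameter with its default of 64, and the demo function sum_iota are assumptions made for illustration. Like the original, it performs end - begin iterations regardless of stride, and it keeps the call order of a plain loop.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <numeric>
    #include <utility>
    #include <vector>

    namespace nstl
    {
    // Hypothetical helper: one fold over a single chunk,
    // calling function(first + k * stride) for each k in indices.
    template <typename index_type, index_type first, index_type stride, typename function_type, std::size_t... indices>
    void for_chunk(function_type& function, std::index_sequence<indices...>)
    {
      (void)function;  // avoids an unused-parameter warning when the chunk is empty
      (function(static_cast<index_type>(first + static_cast<index_type>(indices) * stride)), ...);
    }

    // Hypothetical helper: one fold over all full chunks, then the remainder.
    template <typename index_type, index_type begin, index_type stride, std::size_t chunk, std::size_t count, typename function_type, std::size_t... chunks>
    void for_chunks(function_type& function, std::index_sequence<chunks...>)
    {
      // Each operand of this fold processes one full chunk of iterations.
      (for_chunk<index_type, static_cast<index_type>(begin + static_cast<index_type>(chunks * chunk) * stride), stride>(
         function, std::make_index_sequence<chunk>{}), ...);

      // Remaining iterations that do not fill a whole chunk (possibly none).
      constexpr std::size_t done = sizeof...(chunks) * chunk;
      for_chunk<index_type, static_cast<index_type>(begin + static_cast<index_type>(done) * stride), stride>(
        function, std::make_index_sequence<count - done>{});
    }

    // Same interface as the original, plus an assumed chunk size parameter.
    template <typename index_type, index_type begin, index_type end, index_type stride, std::size_t chunk = 64, typename function_type>
    void compile_time_for(function_type function)
    {
      static_assert(chunk > 0, "chunk size must be positive");
      constexpr std::size_t count = static_cast<std::size_t>(end - begin);
      for_chunks<index_type, begin, stride, chunk, count>(function, std::make_index_sequence<count / chunk>{});
    }

    // Demo: sums 0, 1, ..., count - 1 through the compile-time loop.
    template <std::size_t count>
    double sum_iota()
    {
      auto data = std::vector<double>(count);
      std::iota(data.begin(), data.end(), 0.0);

      auto sum = 0.0;
      compile_time_for<std::size_t, 0, count, 1>([&](const auto& i) { sum += data[i]; });
      return sum;
    }
    }
    ```

    With 597 iterations and a chunk size of 64, the outer fold has 9 operands, each inner fold has 64, and the remainder has 21, so no single fold comes close to the limit of 256 that Clang reported above.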

    Live Demo with Clang, this time with only -fbracket-depth=300 (note: 600 iterations). Unfortunately, I wasn't able to test it with MSVC due to an internal Compiler Explorer error. Note, though, that Clang needs -fbracket-depth=600 for the non-unrolled version, while it can handle the unrolled one with half the depth.