Why does non-executed compile-time code increase Raku's bytecode size? Does it slow runtime performance?

Consider the following two programs:

unit module Comp;
say 'Hello, world!'

and

unit module Comp;
CHECK { if $*DISTRO.is-win { say 'compiling on Windows' }}
say 'Hello, world!'

Naively, I would have expected both programs to compile to exactly the same bytecode: the CHECK block specifies code to run at the end of compilation; checking a variable and then doing nothing has no effect on the run-time behavior of the program, and thus (I would have thought) shouldn't need to be included in the compiled bytecode.

However, compiling these two programs does not result in the same bytecode. Specifically, compiling the version without the CHECK block creates 24K of bytecode versus 60K for the version with it. Why is the bytecode different for these two versions? Does this difference in bytecode have (or potentially have) a runtime cost? (It seems like it must, but I want to be sure).

And one more related question: how do DOC CHECK blocks fit in with the above? My understanding is that even the compiler skips DOC CHECK blocks when it's not run with the --doc flag. Consistent with that, the bytecode for a hello-world program does not increase in size when given a DOC CHECK block like the one above. However, it does increase in size if the block includes a use statement. From that, I conclude that use is somehow special-cased and gets executed even in DOC CHECK blocks. Is that correct? If so, are there other similarly special-cased forms I should know about?

Solution

A CHECK or BEGIN block (or any other BEGIN-time construct) may contain code that escapes the block, meaning it stays reachable after compilation has finished. For example:

BEGIN SomeClass.^add_method('foo', anon method foo() { 42 })

This adds a method to a class, and that method outlives the BEGIN block. The method's bytecode is therefore required in the compiled output. Currently, Rakudo conservatively includes the bytecode of everything in a BEGIN or CHECK block. It may be possible to avoid that for some simple cases in the future.
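A self-contained version of the example above may make the "escaping" point easier to see. This is only a sketch: the empty SomeClass declaration and the explicit ^compose call (added on the assumption that a class which is already composed should be recomposed so the new method is visible to dispatch) are not part of the original one-liner.

```raku
# SomeClass is a placeholder class, declared here only for illustration.
class SomeClass { }

BEGIN {
    # The anonymous method is created at compile time...
    SomeClass.^add_method('foo', anon method foo() { 42 });
    # ...and the class is recomposed (an assumption of this sketch).
    SomeClass.^compose;
}

# This call happens at run time, long after the BEGIN block has finished,
# so the method body has to be present in the compiled bytecode.
say SomeClass.new.foo;
```

Because the compiler cannot cheaply prove that nothing created inside such a block is still referenced afterwards, keeping all of the block's bytecode is the safe default.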

So far as the runtime cost goes, the implementation goes to some lengths to minimize the cost of bytecode that is never run (not so much for this case, but because the standard library is huge but many programs use only a fraction of it). For example:

- Bytecode is mmap'd, so some unused parts of it may not actually be paged into memory
- Bytecode is only validated on the first call to that frame
- Frame meta-data (what lexicals does it have) is only deserialized on the first call to the frame
- Unless something references it, the code object will not be deserialized

So far as use goes, its action is performed as soon as it is parsed. Being inside a DOC CHECK block does not suppress that, and in general cannot, because the use might bring in things that need to be known in order to finish parsing the contents of that block.
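As a rough illustration of why the use cannot simply be skipped, consider a block whose remaining lines only parse because of what the use imported. The sketch below assumes the core NativeCall module and its exported Pointer type; any module that exports a type, term, or operator would serve equally well.

```raku
DOC CHECK {
    use NativeCall;
    # 'Pointer' is a type name exported by NativeCall. The parser can only
    # accept this declaration because the 'use' above has already been
    # acted on, even though the block itself does not run without --doc.
    my Pointer $p;
}

say 'Hello, world!';
```

The import is lexically scoped to the block, but loading the module still happens during compilation, which is consistent with the larger bytecode observed in the question.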