Why does C build small programs faster than Go or D?

Go and D are advertised as having incredibly fast compilers, thanks to modern language designs made with concurrent, single-pass parsing in mind.

I understand that most of the build time is spent in the linking phase, but I still wonder why gcc is faster on small programs.

C

#include <stdio.h>
int main() {
    printf("Hello\n");
}

$ time gcc hello.c
real    0m0.724s
user    0m0.030s
sys     0m0.046s

D (idiomatic)

import std.stdio;
void main() {
    writeln("Hello"); // writeln already appends a newline
}

$ time dmd hello.d

real    0m1.620s
user    0m0.047s
sys     0m0.015s

D (with hacks)

import core.stdc.stdio;
void main() {
    printf("Hello\n");
}

$ time dmd hello.d
real    0m1.593s
user    0m0.061s
sys     0m0.000s

$ time dmd -c hello.d
real    0m1.203s
user    0m0.030s
sys     0m0.031s

Go

package main
import "fmt"
func main() {
    fmt.Println("Hello.")
}

$ time go build hello.go
real    0m2.109s
user    0m0.016s
sys     0m0.031s

Java

public class Hello {
    public static void main(String[] args) {
        System.out.println("Hello.");
    }
}

$ time javac Hello.java
real    0m1.500s
user    0m0.031s
sys     0m0.031s

Solution

Running "compiler filename" still runs the linker, which may copy a good amount of the standard library into the generated executable. This especially hurts D and Go, which statically link their language runtimes by default for better compatibility.

Given this trivial D hello world:

import std.stdio;
void main() { writeln("hello world"); }

Let me show you some timings on my computer:

$ time dmd hello.d

real    0m0.204s
user    0m0.177s
sys     0m0.025s

Contrast that with skipping the link step using -c, which means "compile, do not link":

$ time dmd -c hello.d

real    0m0.054s
user    0m0.048s
sys     0m0.006s

That cuts the time down to about 1/4 of the first run: in this small program, nearly 3/4 of the build time is actually linking.
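To make that arithmetic explicit, here is a small Python sketch that just plugs in the two "real" times above. It assumes the link time is simply whatever -c removed from the full build, which ignores any fixed startup cost shared by both runs, so treat the result as an approximation.

```python
# Wall-clock ("real") times in seconds, copied from the two runs above.
full_build = 0.204    # dmd hello.d     (compile + link)
compile_only = 0.054  # dmd -c hello.d  (compile, do not link)

# Assumption: everything that -c removed is link time.
link_time = full_build - compile_only
compile_share = compile_only / full_build
link_share = link_time / full_build

print(f"link step: {link_time:.3f}s")
print(f"compile share: {compile_share:.0%}, link share: {link_share:.0%}")
```

On these numbers the compile step is roughly a quarter of the build and the link step roughly three quarters.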

Now, let me modify the program a bit:

import core.stdc.stdio;
void main() { printf("hello world\n"); }

$ time dmd -c hello.d

real    0m0.017s
user    0m0.015s
sys     0m0.001s

Cut to less than a third by using printf instead of writeln! I'll come back to this.

And, just for comparison's sake, compile+link:

$ time dmd hello.d

real    0m0.099s
user    0m0.083s
sys     0m0.014s

This gives us an idea of what's going on:

The linker eats a good chunk of time. Using -c removes it from the equation.
Parsing the standard library also takes a good chunk of time. Using just the C function instead of the D library cuts that out and gives a more apples-to-apples comparison. Still, using the standard library is important for seeing how the compiler scales.
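The two effects can be separated with a little arithmetic on the four dmd timings above. This is only a sketch: the function name split_phases is made up for illustration, and it assumes that the difference between a full build and a -c build is all link time.

```python
# "real" times in seconds from the four dmd runs above.
timings = {
    "writeln": {"full": 0.204, "compile_only": 0.054},  # import std.stdio
    "printf": {"full": 0.099, "compile_only": 0.017},   # import core.stdc.stdio
}

def split_phases(full, compile_only):
    """Approximate the link time as whatever -c removed from the full build."""
    return {"compile": compile_only, "link": full - compile_only}

phases = {name: split_phases(**t) for name, t in timings.items()}
for name, p in phases.items():
    print(f"{name}: compile {p['compile']:.3f}s, link {p['link']:.3f}s")

# Extra compile time that goes with importing std.stdio instead of core.stdc.stdio.
stdlib_compile_cost = phases["writeln"]["compile"] - phases["printf"]["compile"]
print(f"extra compile time with std.stdio: {stdlib_compile_cost:.3f}s")
```

In both variants the link step is larger than the compile step, and the standard library import accounts for most of the compile time in the writeln version.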

What D (and, I presume, Go, though I don't know much about it) aims to do is reduce the time to compile medium to large programs. Small programs are already fast: waiting a fraction of a second for a small build isn't a big deal. (It may be a second or two on slower computers. The one I'm on now has a nice SSD, which speeds this up; running the same command on a spinning hard disk about doubles the time!)

Waiting several minutes for a big build is a problem though. If we can cut that down to several seconds, it is a major win.

The time to compile 100,000 lines is more important than the time to compile 10. So init times aren't really important. Link times matter, but there's not much the compiler itself does about that (the linker is a separate program written by separate teams, though there are efforts to improve that too elsewhere).

So the time D takes to build, including the standard library, is where it gets impressive. It is slower than a C hello world (because the C compiler does less work with the smaller library), but you already see benefits over a C++ hello world, which is slower per line and also tends to have more work to do on each build (parsing #includes, etc.).

A good compiler benchmark would want to isolate these issues and test scalability more than small programs. That said, D does very well on small programs too, if you set up the test to ensure a fair comparison.
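One rough way to test scalability is to generate source files of increasing size and time the compile step alone on each. The Python sketch below is only an illustration: make_d_source is a made-up helper, and the generated functions are trivial, so they say little about real code that leans on templates or heavy imports.

```python
def make_d_source(n_functions):
    """Return D source text with n_functions trivial functions plus an empty main."""
    lines = [f"int f{i}(int x) {{ return x + {i}; }}" for i in range(n_functions)]
    lines.append("void main() {}")
    return "\n".join(lines) + "\n"

# Print a tiny sample; a real test would write much larger files to disk.
print(make_d_source(3), end="")
```

Writing the output for, say, 1,000, 10,000 and 100,000 functions to separate .d files and running "time dmd -c" on each would show how compile time grows with program size, with the linker left out of the picture.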