tensorflow bazel

Building TensorFlow from source on Ubuntu 14.04 LTS: gcc: internal compiler error: Killed (program cc1plus)


I have successfully built TensorFlow from source under Debian, but at present I cannot get it to build on a new virtual machine running Ubuntu 14.04 LTS. IIRC, on Debian I tried g++/gcc 5.2 but had to downgrade to g++/gcc 4.9, and then it worked. Following the "Installing from sources" instructions, installing g++ gives me version 4.8, and the build failed with:

gcc: internal compiler error: Killed (program cc1plus)

I have not tried 4.9 yet.

I checked the log of the last Jenkins build but could not find the build tools and their versions listed anywhere. I even opened an issue: Build tools and versions listed in Jenkins build log

Which version(s) of g++/gcc are known to work?
What version of g++/gcc do the build machines use?

EDIT

Found this: TensorFlow.org Continuous Integration


Solution

  • The problem is not the g++/gcc version but the number of CPU cores Bazel uses to build TensorFlow.

    I ran multiple builds on VMware Workstation 7.1 with a fresh install of Ubuntu 14.04 LTS. With one CPU core, 2 GB of RAM, a 2 GB swap partition and a 2 GB swap file, the builds ran the fastest. This may not be the best setup, but it is the best one I have found so far that works consistently. If I allow 4 cores via VMware and build with Bazel, the build fails. If I limit the resources with the Bazel option --local_resources, using

    --local_resources 2048,2.0,1.0
    

    builds successfully

    INFO: Elapsed time: 11683.908s, Critical Path: 11459.26s
    

    using

    --local_resources 4096,2.0,1.0
    

    builds successfully

    INFO: Elapsed time: 39765.257s, Critical Path: 39578.52s
    

    using

    --local_resources 4096,1.0,1.0
    

    builds successfully

    INFO: Elapsed time: 6562.744s, Critical Path: 6443.80s
    

    using

    --local_resources 6144,1.0,1.0
    

    builds successfully

    INFO: Elapsed time: 2810.509s, Critical Path: 2654.90s
    

    In summary, more memory and fewer CPU cores work best in my environment.
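The elapsed times above are easier to compare once converted to hours and minutes. The following is a minimal sketch that does only that arithmetic. The numbers are copied from the INFO lines in this answer, while `RUNS` and `summarize` are names made up for the illustration.

```python
from datetime import timedelta

# (--local_resources value, elapsed seconds), copied from the INFO lines above.
RUNS = [
    ("2048,2.0,1.0", 11683.908),
    ("4096,2.0,1.0", 39765.257),
    ("4096,1.0,1.0", 6562.744),
    ("6144,1.0,1.0", 2810.509),
]


def summarize(runs):
    """Return (setting, h:mm:ss, slowdown vs. fastest) rows, fastest first."""
    fastest = min(seconds for _, seconds in runs)
    rows = []
    for setting, seconds in sorted(runs, key=lambda run: run[1]):
        # Fractions of a second are dropped before formatting.
        elapsed = str(timedelta(seconds=int(seconds)))
        rows.append((setting, elapsed, round(seconds / fastest, 1)))
    return rows


if __name__ == "__main__":
    for setting, elapsed, slowdown in summarize(RUNS):
        print(f"{setting:<14} {elapsed:>9}  {slowdown:>5.1f}x")
```

Read this way, the fastest run (6144,1.0,1.0) took a little under 47 minutes, while the slowest (4096,2.0,1.0) took just over 11 hours.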

    TL;DR

    While watching the build, I noticed that certain source files took a long time to compile and appeared to hold up the rest of the build. It is as if they compete with other source files for a resource that Bazel does not know about, so Bazel lets the competing files compile at the same time. Thus, the more files compete for that unknown resource, the slower the build.
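The --local_resources values used in this answer are three comma-separated fields: available RAM in MB, CPU cores, and I/O capacity. The sketch below assembles a full command line from those fields without running it. The helper names are hypothetical, and the `-c opt` option and the build target are assumptions taken from TensorFlow's source-install instructions rather than from this post. Later Bazel releases replaced this form of the flag with options such as --local_ram_resources and --local_cpu_resources, so check `bazel help build` for the version you have.

```python
import shlex

# Assumption: the pip-package target from TensorFlow's source-install
# instructions; the original post does not name the target it built.
ASSUMED_TARGET = "//tensorflow/tools/pip_package:build_pip_package"


def local_resources_flag(ram_mb, cpu_cores, io_capacity=1.0):
    """Format the three fields used in the post: RAM (MB), CPU cores, I/O."""
    if ram_mb <= 0 or cpu_cores <= 0 or io_capacity <= 0:
        raise ValueError("all resource values must be positive")
    value = f"{int(ram_mb)},{float(cpu_cores)},{float(io_capacity)}"
    return ["--local_resources", value]


def build_command(ram_mb, cpu_cores, target=ASSUMED_TARGET):
    """Return the command as one shell-quoted string; nothing is executed."""
    args = ["bazel", "build", "-c", "opt"]
    args += local_resources_flag(ram_mb, cpu_cores)
    args.append(target)
    return shlex.join(args)


if __name__ == "__main__":
    # The fastest setting reported above: 6144 MB of RAM and one core.
    print(build_command(6144, 1))
```

Printing the command instead of running it keeps the sketch safe to try on a machine without Bazel installed; paste the output into a shell once the values match your VM.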