Missing dependencies in Apache Crunch Scala build

I'm trying to build the Apache Crunch source code on my CentOS 7 machine, but I get the following error in the crunch-spark project when I run mvn package:

[ERROR] /home/bwatson/programming/git/crunch/crunch-spark/src/it/scala/org/apache/crunch/scrunch/spark/PageRankClassTest.scala:71: error: bad symbolic reference. A signature in PTypeH.class refers to term protobuf
[ERROR] in package which is not available.
[ERROR] It may be completely missing from the current classpath, or the version on
[ERROR] the classpath might be incompatible with the version used when compiling PTypeH.class.
[ERROR]       .map(line => { val urls = line.split("\\t"); (urls(0), urls(1)) })
[ERROR]           ^

Other SO questions about similar errors seem to involve PATH or version issues. I've been experimenting with both but can't resolve the error. For completeness, here are the versions I'm using:

[bwatson@ben-pc crunch]$ scala -version
Scala code runner version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL

[bwatson@ben-pc crunch]$ java -version
java version "1.8.0_31"
Java(TM) SE Runtime Environment (build 1.8.0_31-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.31-b07, mixed mode)

[bwatson@ben-pc crunch]$ mvn -version
Apache Maven 3.0.5 (Red Hat 3.0.5-16)
Maven home: /usr/share/maven
Java version: 1.8.0_31, vendor: Oracle Corporation
Java home: /usr/java/jdk1.8.0_31/jre
Default locale: en_GB, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-123.20.1.el7.x86_64", arch: "amd64", family: "unix"

Any advice? I'm not really sure where Scala is looking for its dependencies, but I'd have thought that Maven would take care of it.


  • It turns out the official documentation for Crunch was missing a Maven parameter. The issue was solved by building using:

    mvn package -Dcrunch.platform=2
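
To avoid forgetting the property on later Maven runs, the command above could be wrapped in a small helper. This is only a sketch: `crunch_mvn` is a hypothetical name, it prints the command line instead of running it, and the `com.google.protobuf` filter assumes the missing `protobuf` symbol comes from Google's Protocol Buffers library, which the question does not confirm.

```shell
# Hypothetical helper (not part of Crunch): print a Maven command line
# with -Dcrunch.platform appended. Pipe the output to `sh` to execute it.
crunch_mvn() {
    platform="$1"   # value for crunch.platform, e.g. 2 as in the answer above
    shift           # remaining arguments are the Maven goals and options
    echo "mvn $* -Dcrunch.platform=${platform}"
}

# The build command from the answer.
crunch_mvn 2 package

# Assumption: list protobuf artifacts on the classpath to confirm that the
# property pulls in the dependency (dependency:tree is a standard goal of
# the Maven dependency plugin).
crunch_mvn 2 dependency:tree -Dincludes=com.google.protobuf
```

Printing the command first makes it easy to check what will run; `crunch_mvn 2 package | sh` then performs the actual build from the Crunch checkout.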