This may be an old question but it is still pending a solution. The entire question stemmed from a small detail in the development of Apache Spark, one of the largest open source projects in history.
During the delivery and release of Spark 1.x and 2.x, a key library dependency (Apache Hive 1.x) was found to pull in too many obsolete transitive dependencies and to be prone to conflicts when deployed with YARN/HDFS. Realising that they would not have enough resources to enforce the mono-repo principle (namely, ensuring that each library in the dependency tree has exactly 1 version), the team made, compiled and published a hard fork of Apache Hive:
https://github.com/JoshRosen/hive
https://mvnrepository.com/artifact/org.spark-project.hive/hive-common/1.2.1.spark2
Its only difference from the official Apache Hive is that all source code references to the package "org.apache.hive" were replaced with "org.spark-project.hive".
This is obviously a lousy way of using another project: the fork either falls behind the development of the Apache Hive community, or requires mundane, repetitive work to keep it up to date. It also opens a dangerous exploit, where an unsigned jar could be used to swap out the migrated jar (also unsigned) in an Apache Spark installation. As a result, the migrated project was discontinued after Spark 3.0: with enough resources available, the original Apache Hive 2.x was adopted, with most obsolete dependencies upgraded.
One would hope that, 5 years after the release of Apache Spark 2.0, such a process would be largely automated thanks to improvements in compilation tools and plugins. In particular, 2 plugins (maven shade plugin and gradle shadow plugin) are designed specifically for relocating packages in dependencies, and could be used to generate the migrated bytecode of Apache Hive directly from the canonical Hive. But a quick experiment revealed that neither of them can accomplish such a simple task:
https://github.com/tribbloid/autoshade
This project contains 2 subprojects that exist only for repacking, one written in maven and the other in gradle.
The maven subproject uses maven shade plugin to relocate json4s into repacked.test1.org.json4s:
<dependencies>
  <dependency>
    <groupId>org.json4s</groupId>
    <artifactId>json4s-jackson_${vs.scalaBinaryV}</artifactId>
    <version>4.0.4</version>
  </dependency>
</dependencies>
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <!-- <createSourcesJar>true</createSourcesJar>-->
            <createDependencyReducedPom>true</createDependencyReducedPom>
            <dependencyReducedPomLocation>${project.build.directory}/dependency-reduced-pom.xml</dependencyReducedPomLocation>
            <!-- <generateUniqueDependencyReducedPom>true</generateUniqueDependencyReducedPom>-->
            <keepDependenciesWithProvidedScope>false</keepDependenciesWithProvidedScope>
            <promoteTransitiveDependencies>false</promoteTransitiveDependencies>
            <!-- <shadedClassifierName>${spark.classifier}</shadedClassifierName>-->
            <relocations>
              <relocation>
                <pattern>org.json4s</pattern>
                <shadedPattern>repacked.test1.org.json4s</shadedPattern>
              </relocation>
            </relocations>
            <filters>
              <filter>
                <artifact>*:*</artifact>
                <excludes>
                  <exclude>META-INF/*.SF</exclude>
                  <exclude>META-INF/*.DSA</exclude>
                </excludes>
              </filter>
            </filters>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
The gradle project uses shadow plugin to relocate json4s into repacked.test2.org.json4s:
dependencies {
    api("org.json4s:json4s-jackson_${vs.scalaBinaryV}:4.0.4")
}
tasks {
    shadowJar {
        exclude("META-INF/*.SF")
        exclude("META-INF/*.DSA")
        relocate("org.json4s", "repacked.test2.org.json4s")
    }
}
After that, a third project (in gradle, but it doesn't matter) declares both as dependencies and uses Scala to access the relocated classes:
dependencies {
    api(project(":repack:gradle", configuration = "shadow"))
    api("com.tribbloids.autoshade:repack-maven:0.0.1")
}
class Json4sTest {
  classOf[test1.org.json4s.Formats]
  classOf[test2.org.json4s.Formats]
}
Surprisingly, it cannot be compiled:
[Error] /home/peng/git-proto/autoshade/main/src/main/scala/com/tribbloids/spookystuff/Json4sTest.scala:7:11: Symbol 'term org.json4s' is missing from the classpath.
This symbol is required by ' <none>'.
Make sure that term json4s is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'package.class' was compiled against an incompatible version of org.
[Error] /home/peng/git-proto/autoshade/main/src/main/scala/com/tribbloids/spookystuff/Json4sTest.scala:7:28: type Formats is not a member of package repacked.test1.org.json4s
[Error] /home/peng/git-proto/autoshade/main/src/main/scala/com/tribbloids/spookystuff/Json4sTest.scala:10:11: Symbol 'term org.json4s' is missing from the classpath.
This symbol is required by ' <none>'.
Make sure that term json4s is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
A full rebuild may help if 'package.class' was compiled against an incompatible version of org.
[Error] /home/peng/git-proto/autoshade/main/src/main/scala/com/tribbloids/spookystuff/Json4sTest.scala:10:28: type Formats is not a member of package repacked.test2.org.json4s
The 1st and 3rd error messages do not appear when referring to a class that simply does not exist. It can therefore be speculated that the package migration was incomplete and inconsistent, which makes it useless compared to the manual migration of source code that the Apache Spark team did before.
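One way to probe this speculation is to look inside the shaded jars directly. The following is a minimal sketch, not part of the autoshade project: it lists the class entries whose raw bytes still mention the original package (in slash form, as class files store it) outside of the relocated prefix. The names `ShadedJarCheck` and `staleEntries` are hypothetical, and the scan only sees names stored as plain text, so any metadata kept in an encoded form would go unnoticed.

```scala
import java.io.File
import java.nio.charset.StandardCharsets.ISO_8859_1
import java.util.jar.JarFile
import scala.jdk.CollectionConverters._

// Hypothetical helper, for illustration only.
object ShadedJarCheck {

  // Returns the .class entries whose raw bytes still contain `original`
  // (slash form, e.g. "org/json4s") outside of the relocated prefix `shaded`.
  def staleEntries(jar: File, original: String, shaded: String): List[String] = {
    val jarFile = new JarFile(jar)
    try {
      jarFile.entries().asScala
        .filter(e => !e.isDirectory && e.getName.endsWith(".class"))
        .filter { e =>
          val in = jarFile.getInputStream(e)
          val text =
            try new String(in.readAllBytes(), ISO_8859_1) // readAllBytes needs Java 9+
            finally in.close()
          // Drop relocated names first, since they contain the original as a substring.
          text.replace(shaded, "").contains(original)
        }
        .map(_.getName)
        .toList // force evaluation before the jar is closed
    } finally jarFile.close()
  }
}
```

For the maven subproject it could be called as `ShadedJarCheck.staleEntries(new File(pathToShadedJar), "org/json4s", "repacked/test1/org/json4s")`, where `pathToShadedJar` is a placeholder for wherever the shaded jar was written. An empty result would suggest that the plain bytecode references were relocated and that the inconsistency lies elsewhere.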
So why is it so hard for such a simple task to be automated? What extra steps are required in either maven or gradle to make it work?
At this moment (Oct 13 2022), the only working solution is through sbt. The following build file is used in https://github.com/tribbloid/autoshade/blob/main/repack/sbt/build.sbt, which calls AssemblyPlugin to publish a shaded assembly jar:
project
  .in(file("."))
  .settings(commonSettings)
  .settings(
    scalacOptions += "-Ymacro-annotations",
    libraryDependencies ++= Seq(
      "org.json4s" %% "json4s-jackson" % "4.0.4"
    ),
    addArtifact(
      Artifact("repack-sbt", "assembly"),
      sbtassembly.AssemblyKeys.assembly
    ),
    ThisBuild / assemblyMergeStrategy := {
      case PathList("module-info.class") => MergeStrategy.discard
      case x if x.endsWith("/module-info.class") => MergeStrategy.discard
      case x =>
        val oldStrategy = (ThisBuild / assemblyMergeStrategy).value
        oldStrategy(x)
    },
    artifact in (Compile, assembly) := {
      val art = (artifact in (Compile, assembly)).value
      art.withClassifier(Some("assembly"))
    },
    ThisBuild / assemblyJarName := {
      s"${name.value}-${scalaBinaryVersion.value}-${version.value}-assembly.jar"
    },
    ThisBuild / assemblyShadeRules := Seq(
      ShadeRule.rename("org.json4s.**" -> "repacked.test3.org.json4s.@1").inAll
    )
  )
  .enablePlugins(AssemblyPlugin)
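The shade rule in this build file uses a jarjar-style pattern, where `**` captures the rest of the qualified name and `@1` refers to that capture. As a rough illustration of how such a rule maps class names, here is a simplified sketch. It is an assumption-laden approximation, not the code sbt-assembly runs: `ShadeRuleSketch` and `rename` are hypothetical names, and it works on dotted names with a plain regex.

```scala
import java.util.regex.Pattern

// Hypothetical, simplified model of a jarjar-style rename rule.
object ShadeRuleSketch {

  // "**" matches any run of characters (dots included),
  // "*" matches a single package segment (no dots).
  private def toRegex(pattern: String): Pattern = {
    val body = pattern
      .split("\\*\\*", -1)
      .map(_.split("\\*", -1).map(s => Pattern.quote(s)).mkString("([^.]*)"))
      .mkString("(.*)")
    Pattern.compile(body)
  }

  // Applies the rule to a fully qualified class name; None if it does not match.
  def rename(pattern: String, result: String, className: String): Option[String] = {
    val m = toRegex(pattern).matcher(className)
    if (!m.matches()) None
    else
      Some(
        // Substitute higher-numbered placeholders first so "@1" never clobbers "@10".
        (m.groupCount() to 1 by -1).foldLeft(result) { (acc, i) =>
          acc.replace(s"@$i", m.group(i))
        }
      )
  }
}
```

With the rule from the build file, `ShadeRuleSketch.rename("org.json4s.**", "repacked.test3.org.json4s.@1", "org.json4s.Formats")` yields `Some("repacked.test3.org.json4s.Formats")`, while a name outside `org.json4s` yields `None`.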
After publishing:
sbt "clean;publishM2"
...
[success] Total time: 0 s, completed Oct. 13, 2022, 4:19:49 p.m.
[info] Wrote /home/peng/git-proto/autoshade/repack/sbt/target/scala-2.13/repack-sbt_2.13-0.0.1-SNAPSHOT.pom
[info] Strategy 'discard' was applied to 9 files (Run the task at debug level to see details)
[info] Strategy 'rename' was applied to 4 files (Run the task at debug level to see details)
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT-sources.jar
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT-javadoc.jar
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT.jar
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT.pom
[info] published repack-sbt_2.13 to file:/home/peng/.m2/repository/com/tribbloids/autoshade/repack-sbt_2.13/0.0.1-SNAPSHOT/repack-sbt_2.13-0.0.1-SNAPSHOT-assembly.jar
[success] Total time: 3 s, completed Oct. 13, 2022, 4:19:53 p.m.
...
Any class in the assembly jar can be referenced under the relocated package repacked.test3.org.json4s.
It is not yet known which part of the sbt plugin does this correctly. Once that has been figured out, the same subroutine should ideally be ported to maven-shade-plugin and gradle-shadow-plugin respectively.