We recently added a second build machine to our build environment and began experiencing very odd occasional build failures.
I have two separate Maven build machines, A and B, each running Maven 2.2.1 and communicating with a shared Nexus 1.5.0 repository manager. My problem is that builds on B occasionally fail because B refuses to download a newer version of a common dependency, 'acme-1.0.0-SNAPSHOT', that was previously built by A and uploaded to Nexus.
Looking inside the local repositories on both machines I noticed some oddities in the repository metadata.
Machine A's acme\1.0.0-SNAPSHOT\maven-metadata-nexus.xml:
<metadata>
  <groupId>acme</groupId>
  <artifactId>acme</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <buildNumber>1</buildNumber>
    </snapshot>
    <lastUpdated>20100525173546</lastUpdated>
  </versioning>
</metadata>
Machine B's acme\1.0.0-SNAPSHOT\maven-metadata-nexus.xml:
<metadata>
  <groupId>acme</groupId>
  <artifactId>acme</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <versioning>
    <snapshot>
      <buildNumber>2</buildNumber>
    </snapshot>
    <lastUpdated>20100519232317</lastUpdated>
  </versioning>
</metadata>
In Nexus's acme/1.0.0-SNAPSHOT/maven-metadata.xml:
<metadata>
  <groupId>acme</groupId>
  <artifactId>acme</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <versioning />
</metadata>
If I'm interpreting the metadata files correctly (documentation online is scant), machine B believes it has a newer version of the acme dependency (based on buildNumber), even though machine A last built it nearly 6 days after machine B did (based on the lastUpdated timestamp). Nexus's own metadata has an empty versioning element, so it records no buildNumber that both machines could agree on.
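To make that reading concrete, here is a minimal Python sketch that parses the three metadata snippets quoted above and checks whether buildNumber and lastUpdated point in opposite directions. The helper names (read_snapshot_info, disagree) are made up for illustration, and the assumption that a higher buildNumber should always come with a later lastUpdated is my interpretation, not documented Maven behaviour.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Metadata copied from the three files quoted above.
MACHINE_A = """<metadata>
  <groupId>acme</groupId>
  <artifactId>acme</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <versioning>
    <snapshot><buildNumber>1</buildNumber></snapshot>
    <lastUpdated>20100525173546</lastUpdated>
  </versioning>
</metadata>"""

MACHINE_B = """<metadata>
  <groupId>acme</groupId>
  <artifactId>acme</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <versioning>
    <snapshot><buildNumber>2</buildNumber></snapshot>
    <lastUpdated>20100519232317</lastUpdated>
  </versioning>
</metadata>"""

NEXUS = """<metadata>
  <groupId>acme</groupId>
  <artifactId>acme</artifactId>
  <version>1.0.0-SNAPSHOT</version>
  <versioning />
</metadata>"""


def read_snapshot_info(xml_text):
    """Return (buildNumber, lastUpdated); either is None if absent."""
    root = ET.fromstring(xml_text)
    build_number = root.findtext("versioning/snapshot/buildNumber")
    last_updated = root.findtext("versioning/lastUpdated")
    return (
        int(build_number) if build_number is not None else None,
        datetime.strptime(last_updated, "%Y%m%d%H%M%S") if last_updated else None,
    )


def disagree(info_a, info_b):
    """True when the higher buildNumber belongs to the older timestamp."""
    (build_a, time_a), (build_b, time_b) = info_a, info_b
    if None in (build_a, time_a, build_b, time_b):
        return False  # not enough information to compare
    return (build_a - build_b) * (time_a - time_b).total_seconds() < 0


print(disagree(read_snapshot_info(MACHINE_A), read_snapshot_info(MACHINE_B)))  # prints True
```

For the Nexus file, read_snapshot_info returns (None, None), which matches the observation that the server-side metadata carries no build number at all.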
How could this situation possibly arise? What could I do to prevent my builds from failing due to inconsistent metadata? Have you experienced anything similar?
Important notes:
It took me a while, but I tracked the underlying issue down to Maven bug MNG-4142.
Here's what happened:
My acme-1.0.0-SNAPSHOT (build 1) was installed on A and uploaded to Nexus. The project was next built on B, where the newly built acme-1.0.0-SNAPSHOT (build 2) was installed and uploaded to Nexus, replacing build 1.
Then, when a build that had acme-1.0.0-SNAPSHOT as a dependency ran on machine A, MNG-4142 kicked in. The local repository metadata contained <localCopy>true</localCopy>, which prevented A from downloading the more recent build 2 of acme-1.0.0-SNAPSHOT, so Maven built my project against the older build 1, which caused the build failures. This was still the case even when -U was used.
As I mentioned on the issue, I'm quite surprised at this behaviour and struggle to see how other distributed build environments work in the presence of this bug. We currently have some cron jobs that frequently change the "localCopy" value in the local repository metadata to false in order to get what I believe should be the default, and correct, behaviour.
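The cron job itself isn't shown here, so the following is only a hypothetical Python sketch of what such a job might do: walk a local repository and rewrite localCopy from true to false in every maven-metadata-*.xml file. The function name reset_local_copy is invented for this example. It edits the text with a regular expression rather than re-serialising the XML, so the rest of each file is left byte-for-byte as Maven wrote it, and it makes no assumption about where inside the file the localCopy element sits.

```python
import re
from pathlib import Path

# Matches <localCopy>true</localCopy>, tolerating whitespace around the value.
LOCAL_COPY_TRUE = re.compile(r"<localCopy>\s*true\s*</localCopy>")


def reset_local_copy(repo_root):
    """Set localCopy to false in every maven-metadata-*.xml under repo_root.

    Returns the list of files that were actually modified.
    """
    changed = []
    for path in sorted(Path(repo_root).rglob("maven-metadata-*.xml")):
        text = path.read_text(encoding="utf-8")
        new_text, count = LOCAL_COPY_TRUE.subn("<localCopy>false</localCopy>", text)
        if count:
            path.write_text(new_text, encoding="utf-8")
            changed.append(path)
    return changed


# Example call (the path is an assumption; point it at your own local repository):
# reset_local_copy(Path.home() / ".m2" / "repository")
```

Run on a schedule on each build machine, something like this approximates the workaround described above. It does not fix MNG-4142 itself, and the flag will be set again the next time the artifact is installed locally.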