Searching an artifact on nexus


I'm using the following code to search for an artifact on Nexus. The URL returns XML when I open it in a browser, but the unmarshalled object is not mapped properly and contains no data.

import java.net.URL;
import java.net.URLConnection;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;

import org.sonatype.nexus.rest.model.NexusNGArtifact;
import org.sonatype.nexus.rest.model.SearchNGResponse;
import org.sonatype.nexus.rest.model.SearchResponse;

public class TestNexus {

    public static void main(String[] args) throws Exception {
        JAXBContext context = JAXBContext.newInstance(SearchResponse.class, SearchNGResponse.class);
        Unmarshaller unmarshaller = context.createUnmarshaller();
        URLConnection connection = new URL("https://oss.sonatype.org/service/local/lucene/search?sha1=4a7b16bae95be72f7d591a517bd03e1172ede7ee").openConnection();
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(10000);
        Object resp = unmarshaller.unmarshal(connection.getInputStream());
        if (resp instanceof SearchNGResponse) {
            SearchNGResponse srsp = (SearchNGResponse) resp;
            for (NexusNGArtifact ar : srsp.getData()) {
                System.out.println("group:" + ar.getGroupId());
                System.out.println("artifact:" + ar.getArtifactId());
                System.out.println("version:" + ar.getVersion());
            }
        }
    }
}

I use the following pom:

<project xmlns="http://maven.apache.org/POM/4.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>nexus</groupId>
    <artifactId>nexus</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
    </properties>
    <dependencies>
        <dependency>
            <groupId>com.sun.xml.bind</groupId>
            <artifactId>jaxb-impl</artifactId>
            <version>2.3.8</version>
        </dependency>
        <dependency>
            <groupId>javax.xml.bind</groupId>
            <artifactId>jaxb-api</artifactId>
            <version>2.3.1</version>
        </dependency>
        <dependency>
            <groupId>org.glassfish.jaxb</groupId>
            <artifactId>jaxb-runtime</artifactId>
            <version>2.3.3</version>
        </dependency>
        <dependency>
            <groupId>org.sonatype.nexus.plugins</groupId>
            <artifactId>nexus-indexer-lucene-model</artifactId>
            <version>2.15.1-02</version>
        </dependency>
    </dependencies>
</project>

The XML response is the following:

<searchNGResponse>
  <totalCount>1</totalCount>
  <from>-1</from>
  <count>-1</count>
  <tooManyResults>false</tooManyResults>
  <collapsed>false</collapsed>
  <repoDetails>
    <org.sonatype.nexus.rest.model.NexusNGRepositoryDetail>
      <repositoryId>releases</repositoryId>
      <repositoryName>Releases</repositoryName>
      <repositoryContentClass>maven2</repositoryContentClass>
      <repositoryKind>hosted</repositoryKind>
      <repositoryPolicy>RELEASE</repositoryPolicy>
      <repositoryURL>https://oss.sonatype.org/service/local/repositories/releases</repositoryURL>
    </org.sonatype.nexus.rest.model.NexusNGRepositoryDetail>
  </repoDetails>
  <data>
    <artifact>
      <groupId>org.slf4j</groupId>
      <artifactId>slf4j-nop</artifactId>
      <version>2.1.0-alpha1</version>
      <latestRelease>2.1.0-alpha1</latestRelease>
      <latestReleaseRepositoryId>releases</latestReleaseRepositoryId>
      <artifactHits>
        <artifactHit>
          <repositoryId>releases</repositoryId>
          <artifactLinks>
            <artifactLink>
              <extension>pom</extension>
            </artifactLink>
            <artifactLink>
              <extension>jar</extension>
            </artifactLink>
          </artifactLinks>
        </artifactHit>
      </artifactHits>
    </artifact>
  </data>
</searchNGResponse>

But the Java program finds no <data> element.


Solution

  • The search API at https://oss.sonatype.org/service/local/lucene/search appears to be very old (well over a decade) and does not seem to have been maintained recently.

    The Problem

    The specific problem in the above code is that the XML response from the search request does not match the structure of the Java object org.sonatype.nexus.rest.model.SearchNGResponse.

    For example, the XML contains <repoDetails>:

    <searchNGResponse>
        <repoDetails>
            ...
        </repoDetails>
        ...
    </searchNGResponse>
    

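    This element contains zero, one, or many NexusNGRepositoryDetail entries, each written as an <org.sonatype.nexus.rest.model.NexusNGRepositoryDetail> element.

    But the Java object does not expect a child element with that name.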

    You can see this for yourself by adding debugging logic to the unmarshaller:

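    // Needs: import javax.xml.bind.ValidationEvent;
    //        import javax.xml.bind.ValidationEventHandler;
    unmarshaller.setEventHandler(new ValidationEventHandler() {
        @Override
        public boolean handleEvent(ValidationEvent event) {
            // Fail fast instead of silently skipping the unexpected element.
            throw new RuntimeException(event.getMessage(),
                    event.getLinkedException());
        }
    });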
    

    Now, instead of failing silently (and appearing to simply return no data), the unmarshaller throws an exception with a stack trace:

    java.lang.RuntimeException: unexpected element (uri:"", local:"org.sonatype.nexus.rest.model.NexusNGRepositoryDetail"). Expected elements are <{}repositoryDetail>

    I can't explain this mismatch, and I did not find an alternative Java object with the expected structure in the library you are using. The documentation did not shed any light on it either.

    One Possible Solution

    I recommend using a more modern replacement for this search API. It includes a REST call that searches by hash, equivalent to the call in the question.

    In your case, the new search would be this:

    https://search.maven.org/solrsearch/select?q=1:4a7b16bae95be72f7d591a517bd03e1172ede7ee&rows=20&wt=json
    

    Note that the wt parameter controls the response format: wt=json returns JSON and wt=xml returns XML.

    I chose JSON, because then I can use Jackson to deserialize the response into a JSON node quite easily:

    URL url = new URI("https://search.maven.org/solrsearch/select?q=1:4a7b16bae95be72f7d591a517bd03e1172ede7ee&rows=20&wt=json").toURL();
    ObjectMapper mapper = new ObjectMapper();
    JsonNode json = mapper.readTree(url);
    JsonNode docs = json.get("response").get("docs");
    for (JsonNode doc : docs) {
        // example ID is: org.slf4j:slf4j-nop:2.1.0-alpha1
        // coordinates[0] group ID    = "org.slf4j"
        // coordinates[1] artifact ID = "slf4j-nop"
        // coordinates[2] version     = "2.1.0-alpha1"
        String[] coordinates = doc.get("id").asText().split(":");
        System.out.println(coordinates[2]); // "2.1.0-alpha1"
    }
    

    The newer API provides the group ID, artifact ID and version as a single string of Maven coordinates, so the above code splits that string on the colon character (:).
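
    As a small sketch of that splitting step, the helper below wraps it in a method that also rejects unexpected input. The names MavenCoordinates, Gav and parse are made up for this illustration and are not part of any library. It assumes the id always has exactly three colon-separated parts, as in the example above, and it needs Java 16 or later for the record.

```java
public class MavenCoordinates {

    /** Group ID, artifact ID and version taken from one "id" value (hypothetical type). */
    public record Gav(String groupId, String artifactId, String version) {}

    /** Splits "group:artifact:version" into its three parts. */
    public static Gav parse(String id) {
        String[] parts = id.split(":");
        // Assumption: exactly three parts; anything else is treated as an error.
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                    "Expected group:artifact:version but got: " + id);
        }
        return new Gav(parts[0], parts[1], parts[2]);
    }

    public static void main(String[] args) {
        Gav gav = parse("org.slf4j:slf4j-nop:2.1.0-alpha1");
        System.out.println("group:" + gav.groupId());       // group:org.slf4j
        System.out.println("artifact:" + gav.artifactId()); // artifact:slf4j-nop
        System.out.println("version:" + gav.version());     // version:2.1.0-alpha1
    }
}
```

    Inside the loop above, this could be called as MavenCoordinates.parse(doc.get("id").asText()).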

    I used Jackson v1.9.2 because I happened to have it in my project, but it is not the latest version.

    (Also bear in mind that you may need to URL-encode some URLs, as noted in the guide.)
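
    For example, the hash query can be encoded with the JDK's URLEncoder before it is added to the URL. This is only a sketch: SearchUrlBuilder and sha1SearchUrl are hypothetical names, and the base URL and parameters are simply the ones from the example request above. URLEncoder produces form-style encoding, which is normally acceptable for query-string values.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SearchUrlBuilder {

    // Base URL copied from the example request; not verified against any other endpoint.
    private static final String BASE = "https://search.maven.org/solrsearch/select";

    /** Builds a search URL for a SHA-1 hash, encoding the query value. */
    public static String sha1SearchUrl(String sha1, int rows) {
        // "1:" is the field prefix used in the example query; the colon becomes %3A.
        String query = URLEncoder.encode("1:" + sha1, StandardCharsets.UTF_8);
        return BASE + "?q=" + query + "&rows=" + rows + "&wt=json";
    }

    public static void main(String[] args) {
        System.out.println(
                sha1SearchUrl("4a7b16bae95be72f7d591a517bd03e1172ede7ee", 20));
        // prints https://search.maven.org/solrsearch/select?q=1%3A4a7b16bae95be72f7d591a517bd03e1172ede7ee&rows=20&wt=json
    }
}
```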

    Other Approaches

    If you want to continue using the older API, you could create your own set of Java objects that match the structure of the returned XML, but that could be error-prone in the absence of a formal definition for the XML.
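
    A lighter-weight variation on that idea, which is not JAXB and not what the question's code does, is to skip object mapping and read only the fields you need with the JDK's built-in DOM and XPath classes. The sketch below assumes the response keeps the searchNGResponse/data/artifact layout shown in the question. NexusSearchXml and readCoordinates are hypothetical names, and the sample XML is a trimmed copy of the response from the question so that the example runs without a network call.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class NexusSearchXml {

    /** Returns one "group:artifact:version" string per <artifact> element. */
    public static List<String> readCoordinates(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder().parse(xml);

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Assumption: artifacts sit under /searchNGResponse/data, as in the question.
        NodeList artifacts = (NodeList) xpath.evaluate(
                "/searchNGResponse/data/artifact", doc, XPathConstants.NODESET);

        List<String> result = new ArrayList<>();
        for (int i = 0; i < artifacts.getLength(); i++) {
            Node artifact = artifacts.item(i);
            // Relative paths are evaluated against the current <artifact> node.
            String groupId = xpath.evaluate("groupId", artifact);
            String artifactId = xpath.evaluate("artifactId", artifact);
            String version = xpath.evaluate("version", artifact);
            result.add(groupId + ":" + artifactId + ":" + version);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String sample =
                "<searchNGResponse>"
              + "<totalCount>1</totalCount>"
              + "<data><artifact>"
              + "<groupId>org.slf4j</groupId>"
              + "<artifactId>slf4j-nop</artifactId>"
              + "<version>2.1.0-alpha1</version>"
              + "</artifact></data>"
              + "</searchNGResponse>";
        InputStream in = new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8));
        for (String coordinates : readCoordinates(in)) {
            System.out.println(coordinates); // org.slf4j:slf4j-nop:2.1.0-alpha1
        }
    }
}
```

    With the code from the question, connection.getInputStream() could be passed to readCoordinates in place of the sample stream.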

    Using Jackson, you could also choose XML over JSON, as noted earlier. Jackson also provides other ways to read and navigate JSON besides readTree(), so the approach shown above is not the only one.