Group_by and group_concat in shell script


I want to identify duplicate jars on the classpath, so I used the following command to do some preprocessing:

mvn -o dependency:list | grep ":.*:.*:.*" | cut -d] -f2- | sed 's/:[a-z]*$//g' | sort -u -t: -k2

The resulting file has one entry per line in this format:

group_id:artifact_id:type:version

For example, suppose the file contains these two lines:

com.sun.jersey:jersey-client:jar:1.19.1
org.glassfish.jersey.core:jersey-client:jar:2.26

I want to produce a file with the following content:

jersey-client | com.sun.jersey:1.19.1,org.glassfish.jersey.core:2.26

The content of the file varies, and the same library can appear several times with different versions. Any idea how to do this with a shell script? I want to avoid a database query.

Here is a sample of the file:

    org.glassfish.jaxb:jaxb-runtime:jar:2.4.0-b180725.0644
    org.jboss.spec.javax.annotation:jboss-annotations-api_1.2_spec:jar:1.0.2.Final
    org.jboss.logging:jboss-logging:jar:3.3.2.Final
    org.jboss.spec.javax.transaction:jboss-transaction-api_1.2_spec:jar:1.0.1.Final
    org.jboss.spec.javax.websocket:jboss-websocket-api_1.1_spec:jar:1.1.3.Final
    com.github.stephenc.jcip:jcip-annotations:jar:1.0-1
    com.beust:jcommander:jar:1.72
    com.sun.jersey.contribs:jersey-apache-client4:jar:1.19.1
    org.glassfish.jersey.ext:jersey-bean-validation:jar:2.26
    com.sun.jersey:jersey-client:jar:1.19.1
    org.glassfish.jersey.core:jersey-client:jar:2.26
    org.glassfish.jersey.core:jersey-common:jar:2.26
    org.glassfish.jersey.containers:jersey-container-servlet:jar:2.26
    org.glassfish.jersey.containers:jersey-container-servlet-core:jar:2.26
    com.sun.jersey:jersey-core:jar:1.19.1
    org.glassfish.jersey.ext:jersey-entity-filtering:jar:2.26
    org.glassfish.jersey.inject:jersey-hk2:jar:2.31
    org.glassfish.jersey.media:jersey-media-jaxb:jar:2.26
    org.glassfish.jersey.media:jersey-media-json-jackson:jar:2.26
    org.glassfish.jersey.media:jersey-media-multipart:jar:2.26
    org.glassfish.jersey.core:jersey-server:jar:2.26
    org.glassfish.jersey.ext:jersey-spring4:jar:2.26
    net.minidev:json-smart:jar:2.3
    com.google.code.findbugs:jsr305:jar:3.0.1
    javax.ws.rs:jsr311-api:jar:1.1.1
    org.slf4j:jul-to-slf4j:jar:1.7.25
    junit:junit:jar:4.12
    org.latencyutils:LatencyUtils:jar:2.0.3
    org.liquibase:liquibase-core:jar:3.5.5
    log4j:log4j:jar:1.2.16
    org.apache.logging.log4j:log4j-api:jar:2.10.0
    com.googlecode.log4jdbc:log4jdbc:jar:1.2
    org.apache.logging.log4j:log4j-to-slf4j:jar:2.10.0
    ch.qos.logback:logback-classic:jar:1.2.3
    ch.qos.logback:logback-core:jar:1.2.3
    io.dropwizard.metrics:metrics-core:jar:4.1.6
    io.dropwizard.metrics:metrics-healthchecks:jar:4.1.6
    io.dropwizard.metrics:metrics-jmx:jar:4.1.6
    io.micrometer:micrometer-core:jar:1.0.6
    org.jvnet.mimepull:mimepull:jar:1.9.6
    com.microsoft.sqlserver:mssql-jdbc:jar:6.2.2.jre8
    com.netflix.netflix-commons:netflix-commons-util:jar:0.3.0
    com.netflix.netflix-commons:netflix-statistics:jar:0.1.1
    io.netty:netty-buffer:jar:4.1.27.Final
    io.netty:netty-codec:jar:4.1.27.Final
    io.netty:netty-codec-http:jar:4.1.27.Final
    io.netty:netty-common:jar:4.1.27.Final
    io.netty:netty-resolver:jar:4.1.27.Final
    io.netty:netty-transport:jar:4.1.27.Final
    io.netty:netty-transport-native-epoll:jar:4.1.27.Final
    io.netty:netty-transport-native-unix-common:jar:4.1.27.Final
    com.nimbusds:nimbus-jose-jwt:jar:8.3

Solution

  • There may be easier methods, but this is what I can offer for now; it can probably be reduced to a single line with some tweaking. As written, it assumes the file contains entries for a single artifact id only.

    [07:38 am alex ~]$ date; cat a
    Wed  4 Nov 07:38:21 GMT 2020
    com.sun.jersey:jersey-client:jar:1.19.1
    org.glassfish.jersey.core:jersey-client:jar:2.26
    
    [07:38 am alex ~]$ FIRST=$(awk -F: '{print $2}' a | uniq)
    [07:38 am alex ~]$ SECOND=$(awk -F: '{print $1":"$4}' a | xargs | sed 's/ /,/g')
    [07:38 am alex ~]$ echo "$FIRST | $SECOND"
    jersey-client | com.sun.jersey:1.19.1,org.glassfish.jersey.core:2.26
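For a file with many different artifacts, such as the larger sample in the question, the same idea can be done in one awk pass that groups on the artifact id. The following is only a sketch, not a tested drop-in: the function name `group_by_artifact` is made up for illustration, it assumes the artifact id is always the second colon-separated field and the version the last one, and it trims the leading whitespace seen in the sample.

```shell
#!/bin/sh
# group_by_artifact (hypothetical helper name): read lines of the form
#   group_id:artifact_id:type:version
# from the given files (or stdin) and print one line per artifact_id:
#   artifact_id | group_id:version,group_id:version,...
group_by_artifact() {
    awk -F: '
        # Trim surrounding whitespace; changing $0 makes awk re-split the fields.
        { gsub(/^[ \t]+|[ \t]+$/, "", $0) }

        # Assumption: field 2 is the artifact id and the last field is the version.
        NF >= 4 {
            key = $2
            val = $1 ":" $NF
            if (key in groups) {
                groups[key] = groups[key] "," val
            } else {
                order[++n] = key          # remember first-seen order
                groups[key] = val
            }
        }

        END {
            for (i = 1; i <= n; i++)
                print order[i] " | " groups[order[i]]
        }
    ' "$@"
}
```

Assuming the preprocessed list is saved in a file called `deps.txt` (a placeholder name), `group_by_artifact deps.txt` prints every artifact, and `group_by_artifact deps.txt | grep ','` keeps only the artifact ids that appear more than once, which is the duplicate check the question is after. The filter relies on group ids and versions not containing commas.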