I am using a bash script to parse hundreds of poms, arranged in a weird hierarchy, and extract an overview of all the projects into a single report (the kind of thing that maven-info-projects:project-team can't do in one go). For undisclosed reasons I don't want to mess with a parent pom or try to configure maven-info-projects sections.
I am using XMLStarlet because it is installed, and xmllint is not.
Given a pom.xml extract that contains:
<developer>
<id>devId</id>
<name>Developer Name</name>
<email>dev@nowhere.com</email>
<roles>
<role>Project manager</role>
<role>Developer</role>
</roles>
</developer>
How can I extract all developer info, including the multiple roles, with a single call to XMLStarlet?
At the moment, I can extract the bulk of my information with:
# Developers
locate_section_values $pom_file_name "/x:project/x:developers/x:developer" \
"concat( \
x:id, '|', x:name, '|', x:email, '|', x:roles, '|', \
x:organization, '|', x:organizationUrl, '|', x:timezone
)"
where
function locate_section_values(){
    local xml_file=$1
    local section=$2
    local value_table=$3
    OLD_IFS=$IFS
    IFS=$'\n'   # split the xmlstarlet output on newlines only
    xml_values=()
    xml_values=($(xmlstarlet sel -B -N x="http://maven.apache.org/POM/4.0.0" -t -m "$section" -v "$value_table" -nl "$xml_file"))
    IFS=$OLD_IFS
}
I then split the results:
for developer in "${xml_values[@]}"; do
    IFS='|'
    set -- $developer # split into $1, $2, etc. using | as separator
    #echo "id:${1}, name:${2}, roles:${4}"
    if [ -n "${1}" ]; then # id
        developer_id=${1}
        developer_ids+=( "$developer_id" )
    fi
...
The problem is, a developer with multiple roles gets their roles concatenated:
Project managerDeveloper
Is there a way to tell the original call to xmlstarlet to combine multiple roles into, for example, a comma-separated list?
I think the following gives approximately what you want, but you'll have to change the interface to locate_section_values:
xmlstarlet sel -T -B -N x="http://maven.apache.org/POM/4.0.0" \
-t -m "/x:project/x:developers/x:developer" -v "x:id" -o "|" \
-v "x:name" -o "|" -v "x:email" -o "|" \
-m "x:roles/x:role" -v "." -o "," -b -o "|" \
-v "x:organization" -o "|" -v "x:organizationUrl" -o "|" \
-v "x:timezone" --nl \
"$pom_file_name"
That produces the roles as a comma-terminated list (every role, including the last, is followed by a comma) because it's easier to code.
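The trailing comma is easy to remove on the bash side. The following is only a sketch: the record is hard-coded and is assumed to be what the command above prints for the sample developer (with empty organization, organizationUrl and timezone), and the variable names are illustrative.

```bash
#!/usr/bin/env bash
# Sample record, assumed to match the output for the example <developer>
# element; the last three fields are empty.
record='devId|Developer Name|dev@nowhere.com|Project manager,Developer,|||'

# Split on '|'. IFS is changed only for the duration of this read.
IFS='|' read -r id name email roles org org_url timezone <<<"$record"

# Drop the trailing comma left behind by the inner -m loop.
roles=${roles%,}

# Optionally split the roles into an array, one element per role.
IFS=',' read -r -a role_list <<<"$roles"

printf '%s has %d roles: %s\n' "$name" "${#role_list[@]}" "$roles"
```

Because the prefix assignment `IFS='|' read ...` applies to that one command only, there is no need to save and restore IFS around it.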
locate_section_values sans eval:
function locate_section_values() {
    local xml_file=$1    # $local_project_dir/$fixed_name/pom.xml
    local section=$2     # /x:project/x:modules/x:module
    local value_table=("${@:3}")
    OLD_IFS=$IFS
    IFS=$'\n'
    xml_values=($(xmlstarlet sel -B -N x=http://maven.apache.org/POM/4.0.0 \
        -t -m "$section" "${value_table[@]}" --nl "$xml_file"))
    IFS=$OLD_IFS
}
call:
locate_section_values "$pom_file_name" '/x:project/x:developers/x:developer' \
-v 'x:id' -o '|' -v 'x:name' -o '|' -v 'x:email' -o '|' \
-m 'x:roles/x:role' -v '.' -o ', ' -b -o '|' \
-v 'x:organization' -o '|' -v 'x:organizationUrl' -o '|' \
-v 'x:timezone'
loop over developers and extract fields:
for developer in "${xml_values[@]}"; do
    # get | separated fields
    IFS='|' read -r id name email roles org orgUrl timezone <<<"$developer"
    # strip the trailing ", "; assigned unconditionally so that a developer
    # without roles does not inherit the previous developer's list
    developer_roles_csv=${roles%, }
    echo "$name ($id) has roles: $developer_roles_csv."
done # developer
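If saving and restoring IFS inside the function feels fragile, mapfile (bash 4 or later) may be an alternative for collecting the records. This is a sketch rather than a drop-in replacement: emit_records is a hypothetical stand-in for the xmlstarlet call so the example runs without a pom, and the second record is invented sample data for a developer with no roles.

```bash
#!/usr/bin/env bash
# emit_records is a hypothetical stand-in for the xmlstarlet call; the second
# record is made-up sample data with an empty roles field.
emit_records() {
  printf '%s\n' \
    'devId|Developer Name|dev@nowhere.com|Project manager, Developer, |||' \
    'dev2|Second Dev|dev2@nowhere.com||||'
}

# mapfile -t stores one line per array element without touching IFS and
# without glob-expanding the data.
mapfile -t xml_values < <(emit_records)

summaries=()
for developer in "${xml_values[@]}"; do
  IFS='|' read -r id name email roles org org_url timezone <<<"$developer"
  roles=${roles%, }    # strip the trailing ", " if present
  summaries+=("$name ($id): ${roles:-no roles listed}")
done

printf '%s\n' "${summaries[@]}"
```

In the real function, the process substitution would wrap the xmlstarlet command in place of emit_records.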