I am trying to get a Scala XML node's tag together with its attributes. I would like just the tag name and attributes, not the child elements.
I have this input:
<substance-classes>
<nucleic-acid-sequence display-name="Nucleic Acid Sequence">
<nucleic-acid-base>
<base-symbol>a</base-symbol>
<count>295</count>
</nucleic-acid-base>
<nucleic-acid-base>
<base-symbol>c</base-symbol>
<count>329</count>
</nucleic-acid-base>
<nucleic-acid-base>
<base-symbol>g</base-symbol>
<count>334</count>
</nucleic-acid-base>
<nucleic-acid-base>
<base-symbol>t</base-symbol>
<count>268</count>
</nucleic-acid-base>
</nucleic-acid-sequence>
<genbank-information>
<genbank-accession-number>EU186063</genbank-accession-number>
</genbank-information>
</substance-classes>
I am trying to replace the contents of <nucleic-acid-sequence> by doing this:
val newNucleicAcidSequenceNode = <nucleic-acid-sequence>{ myfunction } </nucleic-acid-sequence>
But some <nucleic-acid-sequence> elements have attributes, like <nucleic-acid-sequence display-name="Nucleic Acid Sequence">. Since my newNucleicAcidSequenceNode is a hardcoded tag, I am losing the attributes.
How do I retain the optional attributes and still pass { myfunction } to the <nucleic-acid-sequence> tag?
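So, if I have understood you well:
- you want to replace the children of nucleic-acid-sequence under substance-classes
- you want to keep the attributes of nucleic-acid-sequence
- (you have a function, myFunction, that produces the new children)
So my answer in that case would be: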
import scala.xml.{Node, Elem}
val myXml: Elem =
<substance-classes>
<nucleic-acid-sequence display-name="Nucleic Acid Sequence">
<nucleic-acid-base>
<base-symbol>a</base-symbol>
<count>295</count>
</nucleic-acid-base>
<nucleic-acid-base>
<base-symbol>c</base-symbol>
<count>329</count>
</nucleic-acid-base>
<nucleic-acid-base>
<base-symbol>g</base-symbol>
<count>334</count>
</nucleic-acid-base>
<nucleic-acid-base>
<base-symbol>t</base-symbol>
<count>268</count>
</nucleic-acid-base>
</nucleic-acid-sequence>
<genbank-information>
<genbank-accession-number>EU186063</genbank-accession-number>
</genbank-information>
</substance-classes>
def myFunction(children: Seq[Node]) : Seq[Node] = ??? // whatever you want it to be
// Here's the replacement:
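myXml.copy(child = myXml.child.map {
  // The extractor binds e as a Node, hence the cast back to Elem.
  // copy keeps the label and attributes and swaps only the children.
  case e@Elem(_, "nucleic-acid-sequence", _, _, children@_*) =>
    e.asInstanceOf[Elem].copy(child = myFunction(children))
  case other => other
})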
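If you would rather keep the XML literal from the question, another option is to copy the old element's attributes onto the new one with the % operator of Elem. A minimal sketch, meant for the REPL or a script, where oldNode and the body of myfunction are placeholders made up for the illustration:

```scala
import scala.xml.{Elem, Node}

// Placeholder (assumption): stands in for the question's myfunction.
def myfunction: Seq[Node] =
  Seq(<nucleic-acid-base><count>329</count></nucleic-acid-base>)

// Hypothetical existing element, which may or may not carry attributes.
val oldNode: Elem =
  <nucleic-acid-sequence display-name="Nucleic Acid Sequence"><count>295</count></nucleic-acid-sequence>

// Build the element as a literal, then re-attach whatever attributes the old one had.
val newNucleicAcidSequenceNode: Elem =
  <nucleic-acid-sequence>{ myfunction }</nucleic-acid-sequence> % oldNode.attributes
```

If oldNode carries no attributes, oldNode.attributes is empty and the literal comes out unchanged, so the same line covers both cases.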
For instance, myFunction could keep only the children whose count is above 300, and could look something like this:
import scala.util.Try
def myFunction(children: Seq[Node]): Seq[Node] = children.filter { e =>
  // Nodes without a numeric count (whitespace text nodes included) fail the parse and are dropped.
  Try((e \ "count").text.toInt > 300).getOrElse(false)
}
In that case, if you replace the unimplemented myFunction in the first snippet with this one, the replacement gives:
<substance-classes>
<nucleic-acid-sequence display-name="Nucleic Acid Sequence"><nucleic-acid-base>
<base-symbol>c</base-symbol>
<count>329</count>
</nucleic-acid-base><nucleic-acid-base>
<base-symbol>g</base-symbol>
<count>334</count>
</nucleic-acid-base></nucleic-acid-sequence>
<genbank-information>
<genbank-accession-number>EU186063</genbank-accession-number>
</genbank-information>
</substance-classes>
As you can see, no attribute of nucleic-acid-sequence is lost, and your function has kept the two nodes out of four that satisfy the condition.
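The copy above only looks at the direct children of the root. If nucleic-acid-sequence can sit deeper in the tree, the same case can be wrapped in a RewriteRule from scala.xml.transform. A rough sketch for the REPL or a script, in which the doc value and the body of myFunction are assumptions made for the example:

```scala
import scala.xml.{Elem, Node}
import scala.xml.transform.{RewriteRule, RuleTransformer}

// Placeholder (assumption): keeps element children and drops whitespace text nodes.
def myFunction(children: Seq[Node]): Seq[Node] =
  children.collect { case e: Elem => e }

// Rewrites every nucleic-acid-sequence element, at any depth, keeping its attributes.
val replaceSequence = new RewriteRule {
  override def transform(n: Node): Seq[Node] = n match {
    case e: Elem if e.label == "nucleic-acid-sequence" =>
      // toSeq guards against scala-xml versions that type child as collection.Seq.
      e.copy(child = myFunction(e.child.toSeq))
    case other => other
  }
}

// Hypothetical document where the element is nested one level down.
val doc: Elem =
  <root><wrapper><nucleic-acid-sequence display-name="X"> <count>1</count> </nucleic-acid-sequence></wrapper></root>

val transformer = new RuleTransformer(replaceSequence)
val result: Node = transformer(doc)
```

The rule is applied to every node in the tree, so it costs more than the single copy and is only worth it when the element's position is not fixed.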
Hope it helps.