Search code examples
xmlscalascala-xml

Scala XML attribute replacement results in appending modified node as a child


Im trying to write an XML parsing utility that allows a user to modify an XML attribute through providing an attribute name, the current attribute value and the new value they would like the attribute to have. Here is my code:

def main (args: Array[String]) {
   val xml= <Rule debug="true" expression="testing"/>
   val printable= replaceXMLEntryAttribute(xml, "debug", "true", "false")
   println(printable)
}
/**
   * This method is used to iterate over the entirety of the xml presented and modify the XML attribute desired 
   */
  def replaceXMLEntryAttribute(elem: Elem, attrName: String, curVal: String, desiredVal: String): Elem = {

    def replace(current: Elem): Elem = current.copy(
      child = current.map {
        case e: Elem if isReplacementEntry(current, attrName, curVal) ⇒ generateReplacementXMLAttribute(current)
        case e: Elem ⇒ replace(e)
        case other⇒ other
      }
    )

   def generateReplacementXMLAttribute(node: Elem): Elem = {
      val currentXML= node.toString()
      val newAttr= currentXML.replace(curVal, desiredVal)
      return XML.loadString(newAttr)
    }
    replace(elem)
  }

  private def isReplacementEntry(node: Elem, attributeName: String,  currentAttrValue: String): Boolean = {
   val attr = "@" + attributeName
   val exists = node \\ attr find { _.text == currentAttrValue }
   exists match{
     case None => false
     case _ => true
   }

The desired output is <Rule debug="false" expression="testing"/> while the result of the program is <Rule debug="true" expression="testing"><Rule expression="testing" debug="false"/></Rule>

I can only guess and say the replace method is messing up here.


Solution

  • The documentation of the Elem.map method doesn't have any text to explain what it is supposed to do, and its type is confusing. To obtain a more concrete type, we can use the Scala interpreter:

    scala> import scala.xml._
    scala> val elem: Elem = <numbers><one/><two/><three/></numbers>
    scala> :t elem.map(identity)
    scala.xml.NodeSeq
    

    Strange, why would it produce a NodeSeq? If Elem.map was mapping over the element's children, returning an Elem with the same label and attributes but new children, the return type should be Elem, not NodeSeq. To verify whether Elem.map is really iterating over its children, let's accumulate the nodes we encounter into a list.

    scala> var nodes = Seq[Node]()
    scala> elem.map {node =>
         |   nodes :+= node
         |   node
         | }
    res: scala.xml.NodeSeq = NodeSeq(<numbers><one/><two/><three/></numbers>)
    scala> nodes
    res: Seq[scala.xml.Node] = List(<numbers><one/><two/><three/></numbers>)
    

    If we were iterating over the children, I would have expected List(<one/>, <two/>, <three/>), but that's not what we got. So it seems we are iterating over a 1-element collection containing the element itself, which isn't very useful. Looking at the code, however, this seems to be intentional: Node, and by extension Elem, is a subclass of NodeSeq whose sequence consists of a single element, itself.

    So, to summarize, the reason you get your unexpected result is that you start with <Rule debug="true" expression="testing"/>, you then map over it to obtain the result <Rule debug="true" expression="testing"/>, and then you replace the children of Rule with that result, obtaining <Rule debug="true" expression="testing"><Rule expression="testing" debug="false"/></Rule>.

    The solution to that part of the problem is to use current.child.map instead of current.map. However, since you are only examining Rule's zero children and not Rule itself, the body of the map is never executed, so the debug attribute is left unchanged. I advise swapping the pattern-matching and the map:

    def replace: Node => Node =
      {
        case e: Elem if isReplacementEntry(e, attrName, curVal) ⇒ generateReplacementXMLAttribute(e)
        case e: Elem ⇒ e.copy(
          child = e.child.map { replace(_) }
        )
        case other⇒ other
      }
    

    After fixing the types to work with Nodes instead of Elems, I obtain the desired result.