How do I read a node that has a text value and child elements? #30

@wattsjp2

I have an XML element that looks like this:

<Card>
    4111111111111111
    <Type>VISA</Type>
</Card>

and I'm trying to write an XmlReader for it. My case class looks like:

case class Card(
    number: String,
    cardType: String
)

The problem I'm having is extracting the card number. I've played around in the REPL and looked at the source code. My first thought was that I could just read the root element:

scala> val xml = <Card>
     |   4111111111111111
     |   <Type>VISA</Type>
     | </Card>
xml: scala.xml.Elem =
<Card>
	4111111111111111
	<Type>VISA</Type>
</Card>

scala> val path = __
path: com.lucidchart.open.xtract.XPath.type =

scala> path(xml)
res0: scala.xml.NodeSeq =
<Card>
	4111111111111111
	<Type>VISA</Type>
</Card>

scala> val reader = path.read[String]
reader: com.lucidchart.open.xtract.XmlReader[String] = com.lucidchart.open.xtract.XmlReader$$anon$1@52f47f0c

scala> reader.read(xml)
res6: com.lucidchart.open.xtract.ParseResult[String] =
ParseSuccess(
	4111111111111111
	VISA
)

This reads all of the text under the root node, though, including the text of <Type>. My next thought was that maybe I could loop through the child nodes:

scala> path.children(xml)
res8: scala.xml.NodeSeq = NodeSeq(<Type>VISA</Type>)

but that doesn't return the text node. My last thought was: what if <Card> weren't the root element? Would that change anything?

scala> val xml = <Root>
     |   <Card>
     |           4111111111111111
     |           <Type>VISA</Type>
     |   </Card>
     | </Root>
xml: scala.xml.Elem =
<Root>
	<Card>
		4111111111111111
		<Type>VISA</Type>
	</Card>
</Root>

scala> val path = (__ \ "Card")
path: com.lucidchart.open.xtract.XPath = /Card

scala> path(xml)
res10: scala.xml.NodeSeq =
NodeSeq(<Card>
		4111111111111111
		<Type>VISA</Type>
	</Card>)

scala> path.read[String].read(xml)
res11: com.lucidchart.open.xtract.ParseResult[String] =
ParseSuccess(
		4111111111111111
		VISA
	)

So that gives the same behavior. Under the hood, stringReader appears to call the text function on NodeSeq:

  /**
   * [[XmlReader]] matches the text of a single node.
   */
  implicit val stringReader: XmlReader[String] = XmlReader { xml =>
    getNode(xml).map(_.text)
  }

The concatenation seems to come from scala.xml's text function itself, which returns the text of all descendant nodes:

scala> val xml: NodeSeq = <Card>
     |   4111111111111111
     |   <Type>VISA</Type>
     | </Card>
xml: scala.xml.NodeSeq =
<Card>
  4111111111111111
  <Type>VISA</Type>
</Card>

scala> xml.text
res16: String =
"
  4111111111111111
  VISA
"
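
For comparison, here is a minimal sketch using plain scala.xml (no xtract) that keeps only the text nodes sitting directly under <Card>. The helper name ownText is made up for illustration and is not part of xtract or scala.xml; it assumes the number is stored as ordinary Text children of the element. It is written to be pasted into the REPL.

```scala
import scala.xml.{Elem, Text}

// Hypothetical helper: concatenate only the Text nodes that are direct
// children of `elem`, ignoring text nested inside child elements such as <Type>.
def ownText(elem: Elem): String =
  elem.child.collect { case t: Text => t.text }.mkString.trim

val card: Elem =
  <Card>
    4111111111111111
    <Type>VISA</Type>
  </Card>

val number   = ownText(card)          // "4111111111111111"
val cardType = (card \ "Type").text   // "VISA"
```

The same collect could presumably be placed inside a hand-written XmlReader { xml => ... }, in the style of the stringReader definition quoted above, but that wiring is not shown or tested here.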
