XMLParser
XMLParser is a... synchronous XML parser.
It takes the source XML text as a String (which must be loaded entirely) and returns an XMLNode instance representing its root (aka document) node with the whole hierarchy filled in.
Elements and text/CDATA nodes are present in the output; everything else (comments, prolog/processing instructions and exotic stuff like DTD) is ignored.
In application code, it may be more convenient not to use the raw XMLNode, but to normalize it with either:
- the detach instance method (as shown in the example below), or
- the XMLNode.toObject standalone transformer.
```javascript
const fs = require ('fs')
const {XMLParser} = require ('xml-toolkit')

// XMLParser expects a String; without an encoding, readFileSync returns a Buffer
const xml = fs.readFileSync ('doc.xml', 'utf8')
const parser = new XMLParser ({...options}) // options: see the table below
const document = parser.process (xml)

for (const element of document.detach ().children) {
  console.log (element.attributes)
}
```

| Name | Default | Description |
|---|---|---|
| useEntities | true | If true, the EntityResolver is in use, otherwise &...; may occur in output |
| useNamespaces | true | If true, all element attributes are scanned for xmlns... prefixes |
| stripSpace | true | If true, text fragments are trimmed |
...
| Name, Params | Type | Description |
|---|---|---|
| src | String | XML to parse |
Return value: the XMLNode object representing the document element.
Sequences of adjacent text/CDATA fragments are concatenated into atomic Characters nodes.
If the useEntities option is on (the default), text fragments are transformed by EntityResolver (CDATA never is).
To drop insignificant whitespace, use the stripSpace option. When it's set to true, every aggregated text fragment is trimmed, and fragments left empty are ignored completely. So, for example, <foo/>\n\n<bar/> yields no Characters at all, but within a <![CDATA[cdata]]> section spaces are left in place.
XMLParser and XMLIterator are both synchronous XML parsers reading a complete String. But:
| Name | Proto | Pro | Contra |
|---|---|---|---|
| XMLIterator | Iterable | scans XML tag by tag, allows early completion | doesn't build hierarchy |
| XMLParser | none | simple | always allocates the complete document tree in memory |
In fact, XMLParser is built on top of XMLIterator.
XMLParser and XMLReader are both high level XML parsers producing XMLNodes. But:
| Name | Proto | XML Source | Pro | Contra |
|---|---|---|---|---|
| XMLReader | Transform | Readable | allows scanning huge XML with a limited memory footprint | asynchronous by nature |
| XMLParser | none | String | can be used in synchronous contexts, e.g. in object constructors | limited size XML only |
So, XMLReader vs. XMLParser is basically like fs.createReadStream vs. fs.readFileSync.