XMLReader

XMLReader is

an object mode Transform stream
consuming XMLLexer's output (distinct XML tags and text fragments),
transforming incoming strings into XMLNode objects,
adjusting data (see Data transformation below),
putting XMLNodes to the output,
and emitting them as SAX events.

Usage

const {XMLLexer, XMLReader} = require ('node-xml-toolkit')

const reader = new XMLReader ({...options})

someInputStream.pipe (new XMLLexer ()).pipe (reader)

// `for await` needs an async context in a CommonJS module
async function main () {
  for await (const e of reader) {
    console.log (e)
  }
}

main ()

Options

Name	Default	Description
stripSpace	false	If `true`, text fragments are trimmed
useEntities	true	If `true`, the EntityResolver is in use, otherwise `&...;` may occur in output
useNamespaces	true	If `true`, all element attributes are scanned for `xmlns...` prefixes
filter	(xmlNode) => true	If set, this function is called for each XMLNode before `push`ing it out. Unless it returns a truthy value, the push is skipped. Think Array.filter.
find		Same as `filter`, but on the first node published the stream `unpipe`s itself from the source (if any) and is `destroy`ed. Think Array.find.
collect	(xmlNode) => false	If set, this function is called on each StartElement except for self-enclosed tags. If it returns true, the `children` array is created for this node and for its whole subtree.

Computed properties

Name	Type	Description
fixText	string => string	If `useEntities` is on, exposes EntityResolver's main method; otherwise it is the identity transformation

Data transformation

Event substitution

on stream end, a fake {type: 'EndDocument'} event (a plain Object, not a SAXEvent instance) is published;
for EndElement tags, the same XMLNode is published as for the matching StartElement, with its type field altered;
for self-enclosed elements, XMLNodes are likewise published twice: as StartElement and as EndElement.

Text aggregation

Sequences of text/CDATA fragments are reported as atomic Characters events.

If the useEntities option is on (the default), Characters fragments are processed by EntityResolver (CDATA sections never are).

To drop insignificant whitespace, use the stripSpace option. When it is set to true, every aggregated text fragment is trimmed, and fragments left empty are dropped entirely. So, for example, <foo/>\n\n<bar/> yields no Characters at all, whereas inside a <![CDATA[cdata]]> section spaces are left in place.

Nodes subtree collection

By default, XMLReader works much like a SAX parser, emitting atomic objects one by one. These are XMLNodes rather than plain SAXEvents, but by default each XMLNode is aware only of its parent, not of its children (the children member is null, not an empty array).

To start collecting children, the application developer can explicitly mark the necessary set of nodes with the function passed as the collect option. For example, for the complete DOM tree one can use

collect: e => true,

for the root element's direct children (which may make sense for huge linear XML):

collect: e => e.level === 1,

and so on.

Each XMLNode matching the collect criterion gets children = [] by default, and each XMLNode that has it as its parent:

is added to that list;
if it is an element, collects its own children in turn.

The complete children content is available at the EndElement event.
