Overview

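The VoDoo/Stream project is based on three concepts: a stream representation, automata for stream recognition, and transducers for stream transformation.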

Transducer Architecture

Finally, an XSLT-like language is defined to express data transformations.

Stream representation

A stream is a simple formalism based on opening and closing a level, labels and text. This small grammar is enough to denote a simple tree such as an XML document (the XML denotation is produced by a dedicated SAX handler). The currently supported formats are XML and free text; other formalisms can be added through the stream extension facility. A stream interpretation is also provided for the Document Object Model, so a stream can operate on plain text, an ad hoc stream or DOM-based data.
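The idea of denoting an XML tree as a flat sequence of open, close and text events can be sketched as follows. This is only an illustration built on the standard SAX API: the names StreamSketch, Kind, Event, Recorder and fromXml are hypothetical and are not taken from the VoDoo/Stream code base, and attributes are ignored for brevity.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class StreamSketch {
    // Hypothetical event kinds: open a level, close a level, or carry text.
    enum Kind { OPEN, CLOSE, TEXT }

    static final class Event {
        final Kind kind;
        final String value; // label for OPEN/CLOSE, characters for TEXT
        Event(Kind kind, String value) { this.kind = kind; this.value = value; }
        @Override public String toString() { return kind + "(" + value + ")"; }
    }

    // SAX handler that records each callback as one stream event.
    // Attributes are deliberately dropped in this sketch.
    static final class Recorder extends DefaultHandler {
        final List<Event> events = new ArrayList<>();
        @Override public void startElement(String uri, String local, String qName, Attributes atts) {
            events.add(new Event(Kind.OPEN, qName));
        }
        @Override public void endElement(String uri, String local, String qName) {
            events.add(new Event(Kind.CLOSE, qName));
        }
        @Override public void characters(char[] ch, int start, int length) {
            // SAX may deliver one text run in several chunks.
            events.add(new Event(Kind.TEXT, new String(ch, start, length)));
        }
    }

    // Parses the XML text and returns the recorded event sequence.
    static List<Event> fromXml(String xml) throws Exception {
        Recorder recorder = new Recorder();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), recorder);
        return recorder.events;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fromXml("<a><b>hi</b></a>"));
    }
}
```

A free-text input would simply produce TEXT events with no OPEN or CLOSE around them, which is why the same representation can cover both formats.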

In comparison, the StAX approach is a low-level XML matching integration based on a token stream representation of XML fragments. Using the Stream representation with a classical switch/case conditional structure is similar to the StAX approach, but such an integration is too low level: it does not provide an expressive layer for XML management and in fact sits at the same level as SAX.
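To make the "too low level" remark concrete, here is a minimal sketch of the switch/case style using the standard StAX API (javax.xml.stream). The class name StaxSwitchSketch, the method titles and the element name title are assumptions chosen for illustration only. Note how the state (are we inside the element of interest?) has to be tracked by hand.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxSwitchSketch {
    // Collects the text found inside <title> elements.
    static List<String> titles(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        List<String> result = new ArrayList<>();
        StringBuilder current = null; // non-null only while inside <title>
        while (reader.hasNext()) {
            switch (reader.next()) {
                case XMLStreamConstants.START_ELEMENT:
                    if ("title".equals(reader.getLocalName())) {
                        current = new StringBuilder();
                    }
                    break;
                case XMLStreamConstants.CHARACTERS:
                    if (current != null) {
                        current.append(reader.getText());
                    }
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    if ("title".equals(reader.getLocalName()) && current != null) {
                        result.add(current.toString());
                        current = null;
                    }
                    break;
                default:
                    break; // comments, processing instructions, etc. are ignored
            }
        }
        reader.close();
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(titles(
            "<lib><book><title>A</title></book><book><title>B</title></book></lib>"));
    }
}
```

Every additional pattern needs more flags and more cases, which is the motivation for the automata and transducer layers.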

Automata for Stream recognition

Automata provide a high-level mechanism for pattern recognition and variable binding. An automaton is built from a stream written in an extended formalism that includes patterns such as repetition, choice, and wildcards for any label or any text. This stream is analysed to produce a directed acyclic graph (DAG), with specific attributes denoting variables, from which the automaton is generated (the classical approach). Such an automaton can either find occurrences in a given stream or match the stream itself.
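The pattern constructs mentioned above (literal, wildcard with variable binding, choice, repetition) can be illustrated with a small sketch. It is important to note that this is a plain backtracking matcher over a list of string tokens, not the DAG-based automaton generation the project uses; all names (PatternSketch, Step, lit, any, choice, star, match) are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PatternSketch {
    // A step tests one token and may record a variable binding.
    interface Step { boolean accept(String token, Map<String, String> env); }

    // Marker wrapper: the inner step may match zero or more tokens.
    static final class Star implements Step {
        final Step inner;
        Star(Step inner) { this.inner = inner; }
        public boolean accept(String token, Map<String, String> env) {
            return inner.accept(token, env);
        }
    }

    static Step lit(String s) { return (t, env) -> t.equals(s); }
    static Step any(String var) { return (t, env) -> { env.put(var, t); return true; }; }
    static Step choice(String... alts) { return (t, env) -> Arrays.asList(alts).contains(t); }
    static Step star(Step inner) { return new Star(inner); }

    // Returns the variable bindings if the whole input matches, else null.
    static Map<String, String> match(List<Step> pattern, List<String> input) {
        return match(pattern, 0, input, 0, new HashMap<>());
    }

    private static Map<String, String> match(List<Step> p, int i, List<String> in, int j,
                                             Map<String, String> env) {
        if (i == p.size()) {
            return j == in.size() ? env : null; // pattern done: input must be done too
        }
        Step s = p.get(i);
        if (s instanceof Star) {
            // First try zero further occurrences of the repeated step.
            Map<String, String> skipped = match(p, i + 1, in, j, env);
            if (skipped != null) return skipped;
        }
        if (j == in.size()) return null;
        // Work on a copy so that failed branches leave no stale bindings.
        Map<String, String> next = new HashMap<>(env);
        if (!s.accept(in.get(j), next)) return null;
        // A repeated step stays current; any other step advances.
        return match(p, s instanceof Star ? i : i + 1, in, j + 1, next);
    }

    public static void main(String[] args) {
        List<Step> pattern = Arrays.asList(lit("item"), any("x"), star(choice("a", "b")), lit("/item"));
        System.out.println(match(pattern, Arrays.asList("item", "apple", "a", "b", "/item")));
    }
}
```

A compiled automaton avoids the repeated work that backtracking can incur, which is one reason to generate it from a DAG instead of interpreting the pattern directly.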

Transducer for Stream transformation

A transducer is an ordered set of rules. A rule has a selection part and a body. A selection can involve paths (tree visitor) and the current entity. The first kind of entity is the tree node, which can be selected by filtering on its name or attributes. The second kind is the string, which can be filtered using the usual pattern matching. A body is a piece of Java code that decides whether or not to continue parsing (recursive descent).
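The rule structure described above (ordered rules, a selection, and a body that may continue the descent) can be sketched over a standard DOM tree. This is a simplified illustration, not the project's API: TransducerSketch, Rule, rule, apply, applyChildren and demo are hypothetical names, path-based selection is omitted, and nodes that no rule selects are simply dropped.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Predicate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TransducerSketch {
    // A rule pairs a selection with a body producing output text.
    static final class Rule {
        final Predicate<Node> selection;
        final BiFunction<Node, TransducerSketch, String> body;
        Rule(Predicate<Node> selection, BiFunction<Node, TransducerSketch, String> body) {
            this.selection = selection;
            this.body = body;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    TransducerSketch rule(Predicate<Node> selection,
                          BiFunction<Node, TransducerSketch, String> body) {
        rules.add(new Rule(selection, body));
        return this;
    }

    // Rules are ordered: the first one whose selection accepts the node wins.
    String apply(Node node) {
        for (Rule r : rules) {
            if (r.selection.test(node)) return r.body.apply(node, this);
        }
        return ""; // assumption of this sketch: unselected nodes are dropped
    }

    // Called by a body that chooses to continue the recursive descent.
    String applyChildren(Node node) {
        StringBuilder out = new StringBuilder();
        NodeList kids = node.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) out.append(apply(kids.item(i)));
        return out.toString();
    }

    static Node parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
    }

    static TransducerSketch demo() {
        return new TransducerSketch()
            // Node entity filtered by name; the body continues the descent.
            .rule(n -> n.getNodeType() == Node.ELEMENT_NODE && "b".equals(n.getNodeName()),
                  (n, t) -> "*" + t.applyChildren(n) + "*")
            // Any other element: transparent, just descend.
            .rule(n -> n.getNodeType() == Node.ELEMENT_NODE, (n, t) -> t.applyChildren(n))
            // String entity filtered by a regular expression; no further descent.
            .rule(n -> n.getNodeType() == Node.TEXT_NODE && n.getNodeValue().matches("\\d+"),
                  (n, t) -> "[" + n.getNodeValue() + "]")
            .rule(n -> n.getNodeType() == Node.TEXT_NODE, (n, t) -> n.getNodeValue());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo().apply(parse("<p>Hello <b>42</b></p>")));
    }
}
```

Because the rules are tried in order, the more specific selection (the element named b, the all-digit string) must be registered before the generic one.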

Transducer Stream Processor language: XSP

Finally, a transducer language called XSP, expressed in XML, is defined. This language has a bootstrap definition in XML (for XML and text transformation only, for the moment). The XSP definition has been extended to provide rules supporting code written in any language that provides a BSF handler (JavaScript, BeanShell, JRuby, Jython ...).