Content Pipelines

A content pipeline is a mechanism for translating an XML or SGML document into another XML or SGML document. It consists of a sequence of Java objects called filters.

A content pipeline begins with a special filter called a generator, which translates an XML or SGML document into a series of SAX events. It ends with a special filter called a serializer, which translates a series of SAX events back into an XML or SGML document and places it in memory or on disk. Each filter between the generator and the serializer accepts a stream of SAX events from the previous filter in the pipeline and passes a stream of SAX events to the next filter.
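The generator, filter, and serializer arrangement can be sketched with the standard SAX2 classes that ship with the JDK. This is a minimal illustration only, not the PTC Arbortext filter API: the class name PipelineSketch and its run method are hypothetical, a plain SAX parser stands in for the generator, and an identity TransformerHandler stands in for the serializer.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch of a three-stage pipeline built from JDK SAX2 classes.
class PipelineSketch {

    /** Sends XML text through generator -> filter -> serializer and returns the result. */
    static String run(String xml) {
        try {
            // Generator stand-in: a SAX parser turns the document into SAX events.
            SAXParserFactory parserFactory = SAXParserFactory.newInstance();
            parserFactory.setNamespaceAware(true);
            XMLReader generator = parserFactory.newSAXParser().getXMLReader();

            // Intermediate filter: XMLFilterImpl forwards every event unchanged.
            XMLFilterImpl passThrough = new XMLFilterImpl(generator);

            // Serializer stand-in: an identity TransformerHandler writes the
            // incoming events back out as XML, here into an in-memory buffer.
            SAXTransformerFactory transformerFactory =
                    (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler serializer = transformerFactory.newTransformerHandler();
            serializer.getTransformer()
                    .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.setResult(new StreamResult(out));

            // Connect the last filter to the serializer and start the event stream.
            passThrough.setContentHandler(serializer);
            passThrough.parse(new InputSource(new StringReader(xml)));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so that callers of this sketch need no checked-exception handling.
            throw new IllegalStateException("pipeline failed", e);
        }
    }
}
```

Calling PipelineSketch.run with a small document such as `<doc><p>hi</p></doc>` should return equivalent XML, because the only intermediate filter passes every event through. Additional filters could be chained by giving each one the previous filter as its parent.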

A filter can remove content from the XML document passing through it by omitting some SAX events from its output stream. It can insert content by adding SAX events. Some filters make only minor modifications to the document; some may replace the document entirely by applying complex transformation rules to the input stream.
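As a rough illustration of removing and inserting content, the following sketch subclasses the standard SAX2 helper XMLFilterImpl. The class name EditingFilter, the element names draft and note, and the apply helper are all assumptions made for this example; none of them comes from PTC Arbortext software. The sketch handles only element and character events.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: drops every <draft> element and appends a <note>
// element just before the document element closes.
class EditingFilter extends XMLFilterImpl {
    private int depth = 0;     // nesting depth of elements passed downstream
    private int skipDepth = 0; // greater than 0 while inside a removed <draft>

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        if (skipDepth > 0 || "draft".equals(qName)) {
            skipDepth++; // removal: omit this start event from the output
            return;
        }
        depth++;
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        if (skipDepth > 0) {
            skipDepth--; // removal: omit the matching end event
            return;
        }
        if (depth == 1) {
            // Insertion: emit extra events before the document element closes.
            char[] text = "filtered".toCharArray();
            super.startElement("", "note", "note", new AttributesImpl());
            super.characters(text, 0, text.length);
            super.endElement("", "note", "note");
        }
        depth--;
        super.endElement(uri, localName, qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) {
            super.characters(ch, start, length); // keep text outside <draft>
        }
    }

    /** Helper for the sketch: parse, filter, and serialize XML text in memory. */
    static String apply(String xml) {
        try {
            SAXParserFactory parserFactory = SAXParserFactory.newInstance();
            parserFactory.setNamespaceAware(true);
            EditingFilter filter = new EditingFilter();
            filter.setParent(parserFactory.newSAXParser().getXMLReader());

            TransformerHandler serializer =
                    ((SAXTransformerFactory) TransformerFactory.newInstance())
                            .newTransformerHandler();
            serializer.getTransformer()
                    .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.setResult(new StreamResult(out));

            filter.setContentHandler(serializer);
            filter.parse(new InputSource(new StringReader(xml)));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("filtering failed", e);
        }
    }
}
```

Applied to `<doc><p>keep</p><draft>drop <b>me</b></draft></doc>`, the filter should pass the p element through, omit the draft element together with everything nested inside it, and add a note element as the last child of doc.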

The content pipeline and filters used in PTC Arbortext software are an implementation of a freely available technology called SAX2, the second version of the Simple API for XML. You can find information about SAX2, the SAX2 parser, and SAX filters on the web. For details of the PTC Arbortext implementation, consult the Content Pipeline Guide; the following sections summarize that information.

When Arbortext Editor uses an Arbortext Publishing Engine server (Arbortext PE server) for publishing, most content pipelines run on the Arbortext PE server rather than on the client. However, some auxiliary pipelines run on the client if their primary function is to prepare data for transmission to the server.