FAQ

What's the easiest way to get started with Smooks?

The easiest way to get started with Smooks is to:

  1. Have a quick read of the Smooks Basics section.
  2. Download and try out some of the examples.

The tutorials are the perfect base upon which to integrate Smooks into your application.

How do I use Smooks in a Maven/Ant based Project?

Take a look at Maven & Ant.

What "Processing Model" does Smooks employ?

Smooks supports DOM and SAX based processing models, but adds a more "code friendly" layer on top of them. See the User Guide.

What "Configuration Model" does Smooks employ?

See the SmooksResourceConfiguration javadocs.

What is a "Message Fragment"?

See Fragment based Data Processing with Smooks.

What is a "selector"?

A Smooks transformation is specified as a series of "resource" configurations (typically in an XML file). At runtime, Smooks looks up ("selects") a resource through the "selector" value of its resource configuration. For an XML fragment (Element) transformation resource, the selector value is the XML Element name. As Smooks passes over the message and encounters each Element, it uses the Element's name to "select" the resources to be applied to that message fragment.

A "selector" is not always an XML Element name, however. An example of this would be message parser resource configurations (CSV parser, EDI parser etc), where the selector is always "org.xml.sax.driver".
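
The element-name lookup described above can be pictured with a small sketch. This is not Smooks code: the class SelectorLookupSketch, its resource table and the resource names are all hypothetical, and only the JDK's DOM API is used. It simply walks a message and records which configured resources each Element name would "select".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; this is not how Smooks is implemented.
public class SelectorLookupSketch {
    // Assumed resource table: selector value -> names of resources configured with it.
    private final Map<String, List<String>> resourcesBySelector = new LinkedHashMap<>();

    public void addResource(String selector, String resourceName) {
        resourcesBySelector.computeIfAbsent(selector, key -> new ArrayList<>()).add(resourceName);
    }

    // Walk the message and record, in document order, which resources each element selects.
    public List<String> apply(String xml) {
        List<String> log = new ArrayList<>();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            visit(doc.getDocumentElement(), log);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse message", e);
        }
        return log;
    }

    private void visit(Element element, List<String> log) {
        // The element's name is the lookup key into the resource table.
        List<String> selected =
                resourcesBySelector.getOrDefault(element.getTagName(), Collections.emptyList());
        for (String resource : selected) {
            log.add(element.getTagName() + " -> " + resource);
        }
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                visit((Element) child, log);
            }
        }
    }

    public static void main(String[] args) {
        SelectorLookupSketch sketch = new SelectorLookupSketch();
        sketch.addResource("order-item", "OrderItemTransformer");
        sketch.addResource("header", "HeaderTemplate");
        System.out.println(sketch.apply("<order><header/><order-item/><order-item/></order>"));
    }
}
```

Elements with no matching selector (the order element here) select nothing and are simply passed over.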

Is Smooks an alternative to technologies such as XSLT?

No! Smooks is a framework for performing Fragment based Data Processing using existing XML processing technologies (such as XSLT). Read about Why Smooks was Created.

Other technologies supported by Smooks are: raw Java, Groovy script, FreeMarker and StringTemplate templating.

Is Smooks going to be yet another Transformation Configuration Model I'll have to learn?

First off, don't think of Smooks as an alternative to the likes of XSLT, and don't think of the Smooks configuration as a "Transformation Configuration" in the same way as an XSL Stylesheet.

The Smooks configuration should be thought of more as a "Framework Configuration" - most frameworks have a configuration of one sort or another. It's nothing like XSLT, which is basically a programming language in XML. It just defines how transformation/analysis resources are targeted at messages and message fragments. It doesn't define the low level details of each individual transformation. Also note that the Smooks configuration is quite simple in terms of the number of configuration elements you need to remember - there are only a few.

See the SmooksResourceConfiguration javadocs.

How does Smooks simplify my XSLT?

Smooks can be used to simplify your XSLT in a number of ways:

  1. Because Smooks supports targeting of transformation resources (including XSLT) at message fragments, it's easier to modularize and reuse your XSLT.
  2. Smooks allows you to mix and match different technologies within the context of a single message transform. This means you can transform (or pre-process) fragments of the message that are not easily consumed by XSLT using other technologies, e.g. Java or Groovy. See the examples. Unlike XSL Extensions, however, it supports mixing these technologies with your XSLT without affecting the XSLT's portability.

What version of XSLT is supported by Smooks?

The simple answer to this question is that Smooks supports the version of XSLT supported by the installed XSL Processor.

A more accurate answer would be that Smooks is not an XSL Processor and so does not rely on a specific version of the XSLT Specification. Smooks simply uses the javax.xml.transform API within the JDK when applying XSLTs. So as far as dependencies are concerned, the only dependency Smooks has is the javax.xml.transform API.
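
To make the point about javax.xml.transform concrete, here is a minimal sketch of applying a stylesheet through that API directly. It is plain JDK code, not Smooks code, and the class name JaxpXsltSketch and the sample message are made up for illustration. Whichever XSL Processor the TransformerFactory lookup finds is the one that determines the supported XSLT version; the stylesheet asks the processor to report it via the standard system-property('xsl:version') function.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical illustration of using the javax.xml.transform API on its own.
public class JaxpXsltSketch {
    // Apply the given stylesheet to the given XML using the installed XSL Processor.
    public static String transform(String stylesheet, String xml) {
        try {
            // newInstance() locates whichever processor implementation is installed.
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                    factory.newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String stylesheet =
              "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:output method='text'/>"
            + "<xsl:template match='/'>"
            + "XSLT version: <xsl:value-of select=\"system-property('xsl:version')\"/>; "
            + "customer: <xsl:value-of select='/order/customer'/>"
            + "</xsl:template>"
            + "</xsl:stylesheet>";
        // The reported version depends on the processor found at runtime.
        System.out.println(transform(stylesheet, "<order><customer>Joe</customer></order>"));
    }
}
```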

XSLT supports custom Extensions (Java, Javascript etc), so can't I "mix and match" other technologies with XSLT in this way?

Sure, XSLT supports custom extensions. The problem with XSLT Extensions is the effect they often have on your XSLT in terms of portability across processor implementations, as well as general stylesheet maintenance. Take a quick look at the mailing lists for some of the main XSL Processor implementations. You'll see that extension portability is a recurring topic of conversation.

Smooks helps you solve the same type of problems that XSLT Extensions are designed to solve, but by keeping the extension logic separate from the XSLT. Your stylesheets should always be vanilla XSLT, without any reference to extension code.

What sort of processing overhead is incurred when using Smooks to apply XSLT?

We carried out some profiling in order to get an answer to this very question. The scenario we used was purposely geared in favor of XSLT. The message being processed was very flat (not hierarchical) and was not normalized and the XSLT we chose to apply was very simple. This type of scenario is especially suited to Stream/SAX based XSL processing.

What this profiling demonstrated was that in this (worst case) scenario, a 5% - 15% overhead is incurred when comparing Smooks based application of XSLT to direct DOM based XSLT processing. However, when comparing DOM based application of XSLT (direct or via Smooks) to direct Stream/SAX based XSLT processing, Stream/SAX based processing is clearly faster in this type of scenario (Smooks currently only supports DOM based processing). That Stream/SAX based processing can (given the right conditions) outperform DOM based processing has long been known.

This approach to profiling gives users of Smooks a "worst case scenario" idea of how Smooks performs when applying XSL Transforms. What users of Smooks need to remember is that Stream/SAX based processing is not well suited to all types of transforms. As well as that, Smooks offers many other features that help simplify otherwise complex transforms, while at the same time maintaining stylesheet portability across XSL Processors. We're also very keen to add Stream and SAX based processing to Smooks.

For more on the profiling we carried out, see blog.

NOTE: Smooks v1.0 supports a SAX based processing model. More information on this to follow, or mail the User mailing list.

Can Smooks be used to process message formats other than XML?

Absolutely! Smooks allows you to configure a Reader on a per-transform basis (based on a message profile if necessary). As long as a message is hierarchical in nature, SAX events can be generated for that message, allowing it to be consumed by Smooks. See the [[1]] and csv-to-xml tutorials.
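
As a rough illustration of "generating SAX events" for a non-XML message, the sketch below turns CSV text into a SAX event stream using only JDK classes. It is not the Smooks CSV reader: the class CsvSaxEventsSketch and the element names csv-set and csv-record are assumptions made for this example, and the naive comma split does not handle quoted fields. The events are fed to a JDK identity TransformerHandler purely so the result can be seen as XML.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical illustration; not the Smooks CSV reader.
public class CsvSaxEventsSketch {
    // Emit SAX events for each CSV record: one element per record, one child per field.
    public static void parse(String csv, String[] fieldNames, ContentHandler handler)
            throws SAXException {
        AttributesImpl noAttributes = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "csv-set", "csv-set", noAttributes);
        for (String line : csv.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] values = line.split(",", -1); // naive split, no quoting support
            handler.startElement("", "csv-record", "csv-record", noAttributes);
            for (int i = 0; i < fieldNames.length && i < values.length; i++) {
                char[] text = values[i].trim().toCharArray();
                handler.startElement("", fieldNames[i], fieldNames[i], noAttributes);
                handler.characters(text, 0, text.length);
                handler.endElement("", fieldNames[i], fieldNames[i]);
            }
            handler.endElement("", "csv-record", "csv-record");
        }
        handler.endElement("", "csv-set", "csv-set");
        handler.endDocument();
    }

    // Serialize the generated events to XML text so they can be inspected.
    public static String toXml(String csv, String... fieldNames) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            TransformerHandler serializer = factory.newTransformerHandler();
            serializer.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.setResult(new StreamResult(out));
            parse(csv, fieldNames, serializer);
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not generate SAX events", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toXml("Tom,Fennelly\nMike,Fennelly", "firstname", "lastname"));
    }
}
```

Any ContentHandler could take the place of the serializer; the point is only that once a message is expressed as SAX events, a downstream consumer no longer needs to know it started life as CSV.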

What's the difference between the Javabean Cartridge and XML Binding frameworks like JAXB, XMLBeans, XStream etc?

The Javabean Cartridge is not intended as a straight alternative to existing XML Binding frameworks such as those listed above. If your only requirement is that of binding XML to and from Java Objects, then you should probably go with one of these other frameworks.

The Smooks Java binding functionality can be a very useful alternative:

  1. For binding non-XML data e.g. EDI, CSV, Java (i.e. performing Java to Java transforms).
  2. For binding XML data that doesn't line up with the target java object model.
  3. In situations where your source data model does not conform to any schema. Some tools (e.g. JAXB and XMLBeans) require you to have an XSD, from which the Java model is generated and against which the source message is validated.
  4. For performing Expression Based Bindings.
  5. For creating Virtual Object Models. This can be very useful when performing model driven transformations.
  6. For binding XML data where the XML model's dataset is a superset of the Java model's dataset i.e. where you need to selectively pick data from the source XML.
  7. In support of complex splitting, transformation and routing (and other operations).

We're sure there are other use cases where Java binding using Smooks makes sense, but basically what we're saying is that if all you are interested in is straightforward marshalling and unmarshalling between Java and XML, then JAXB/XStream etc is probably a more straightforward option, as long as your use case fits inside the parameters set down by these frameworks. If your use case cannot be solved using JAXB (etc) without major headaches (which can often be the case), then Smooks can be an option for you!

How do I target a resource at the document root fragment of a message without having to specify the name of the root Element?

Specify the selector as "$document".

How do I target a resource at all Elements of a message?

Specify the selector as "*".
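
The two special selector values can be pictured as extra cases in the matching rule. The sketch below is a hypothetical model of the behaviour described in this FAQ, not the Smooks implementation; the class and method names are made up.

```java
// Hypothetical model of selector matching as described in this FAQ.
public class SpecialSelectorsSketch {
    // Decide whether a resource with the given selector targets the given element.
    public static boolean matches(String selector, String elementName, boolean isDocumentRoot) {
        if ("*".equals(selector)) {
            return true; // targets every element in the message
        }
        if ("$document".equals(selector)) {
            return isDocumentRoot; // targets the root fragment, whatever its name
        }
        return selector.equals(elementName); // ordinary element-name selector
    }

    public static void main(String[] args) {
        System.out.println(matches("$document", "order", true));
        System.out.println(matches("$document", "order-item", false));
        System.out.println(matches("*", "order-item", false));
    }
}
```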

What happens to message elements I don't target any resources at?

They remain in the resulting document, untouched. This is unlike a templating type system (e.g. XSLT) where this would result in these fragments being omitted from the resulting document.

Can I target more than one resource at a message fragment?

You can. They will be sorted and applied by Smooks in order of their configuration specificity. See the SmooksResourceConfigurationSortComparator.

What technologies are supported by Smooks?

Smooks supports a number of technologies and can easily be extended to support more. These technologies are bundled in what we call "Cartridges". A single cartridge may support more than one type of processing technology.

See the current list of Smooks cartridges and the technologies they add support for.

Apart from message Transformation, what other forms of message processing are possible with Smooks?

In most of the Smooks documentation we talk about Smooks in the context of data transformation. However, the core of Smooks (smooks-core) knows nothing about data transformation and so does nothing specific in this area. It's basically an engine for applying "visitor" logic to data "fragments". This logic can be data transformation logic, or it can be logic for processing/analyzing data in any way you require.

So, the answer to this question is - "whatever type of processing you require". Just write the visitor implementation(s) and get Smooks to apply the logic on the target fragments. Visitor implementations can interact with each other via the ExecutionContext.
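
The idea of visitors cooperating through a shared context can be sketched in plain Java. Everything here is hypothetical: FragmentVisitor, the Map standing in for the execution context, and the fragment names are assumptions for illustration, and none of it is the Smooks visitor API. One visitor accumulates a total; a second visitor reads that total from the context and records a decision.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; these types are stand-ins, not the Smooks API.
public class VisitorContextSketch {
    // A piece of logic applied to each fragment, with access to shared per-execution state.
    public interface FragmentVisitor {
        void visit(String fragmentName, String fragmentText, Map<String, Object> context);
    }

    // Visitor 1: sums the quantity fragments into the context.
    public static final FragmentVisitor QUANTITY_TOTALLER = (name, text, context) -> {
        if (name.equals("quantity")) {
            int soFar = (Integer) context.getOrDefault("totalQuantity", 0);
            context.put("totalQuantity", soFar + Integer.parseInt(text.trim()));
        }
    };

    // Visitor 2: at the end of the order, reads the total left by visitor 1.
    public static final FragmentVisitor LARGE_ORDER_FLAGGER = (name, text, context) -> {
        if (name.equals("order-end")) {
            int total = (Integer) context.getOrDefault("totalQuantity", 0);
            context.put("largeOrder", total > 10);
        }
    };

    // Apply every visitor to every fragment, sharing one context for the whole run.
    public static Map<String, Object> run(List<String[]> fragments, List<FragmentVisitor> visitors) {
        Map<String, Object> context = new HashMap<>();
        for (String[] fragment : fragments) {
            for (FragmentVisitor visitor : visitors) {
                visitor.visit(fragment[0], fragment[1], context);
            }
        }
        return context;
    }

    public static void main(String[] args) {
        List<String[]> fragments = Arrays.asList(
                new String[] {"quantity", "4"},
                new String[] {"quantity", "8"},
                new String[] {"order-end", ""});
        System.out.println(run(fragments, Arrays.asList(QUANTITY_TOTALLER, LARGE_ORDER_FLAGGER)));
    }
}
```

Neither visitor transforms anything, which is the point: the same fragment-plus-visitor arrangement serves analysis or routing decisions just as well as transformation.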

Can a single Smooks instance be used concurrently?

Absolutely!

Can Smooks be extended to support other transformation/processing technologies?

Sure. To add support for another transformation or processing technology, you need to implement a ContentHandlerFactory for that technology. As examples, see the following ContentHandlerFactory implementations:

  1. XslContentHandlerFactory: Adds support for XSLT.
  2. StringTemplateContentHandlerFactory: Adds support for the StringTemplate templating framework.
  3. GroovyContentHandlerFactory: Adds support for Groovy scripted resources.

Hooking the content delivery unit (CDU) creator implementation into Smooks is a matter of specifying a Smooks resource for the CDU creator. See the configuration file for the above templating "CDU Creators".

Note how the selector for all CDU Creators is always "cdu-creator". This is how Smooks looks up all the CDU Creators targeted at a message profile. Also note how each of the CDU Creator configurations has a "restype" parameter denoting the resource type. Smooks uses this to select the appropriate CDU Creator for a given resource.

NOTE: Since Smooks v1.0, other transformation/processing technologies are supported by implementing the ContentHandlerFactory interface. Registering the ContentHandlerFactory is just a matter of adding a file named "content-handlers.inf" to the META-INF folder of the factory's jar file and listing the implementation class(es) therein (one per line).