- What’s the easiest way to get started with Smooks?
- How do I use Smooks in a Maven based Project?
- What “Processing Model” does Smooks employ?
- What “Configuration Model” does Smooks employ?
- What is a “Message Fragment”?
- What is a “selector”?
- How do I configure a Java resource?
- How do I configure an XSLT resource?
- How do I configure a FreeMarker resource?
- How do I configure a StringTemplate resource?
- How do I configure a Groovy resource?
- Is Smooks an alternative to technologies such as XSLT?
- Is Smooks going to be yet another Transformation Configuration Model I’ll have to learn?
- How does Smooks simplify my XSLT?
- When would I use Smooks to apply an XSLT versus using an XSLT Processor directly?
- What sort of processing overhead is incurred when using Smooks to apply XSLT?
- Can I use Java (or Groovy) to pre-process one fragment of a message and then apply an XSLT to the whole document?
- Does Smooks support a Stream or SAX based processing model?
- Can Smooks be used to process message formats other than XML?
- What’s the difference between the Javabean Cartridge and XML Binding frameworks like JAXB, XMLBeans, XStream etc?
- How do I target a resource at the document root fragment of a message without having to specify the name of the root Element?
- How do I target a resource at all Elements of a message?
- What happens to message elements I don’t target any resources at?
- Can I target more than one resource at a message fragment?
- What technologies are supported by Smooks?
- Apart from message Transformation, what other forms of message processing are possible with Smooks?
- Can a single Smooks instance be run concurrently?
- Can Smooks be extended to support other transformation/processing technologies?
What’s the easiest way to get started with Smooks?
The easiest way to get started with Smooks is to:
The tutorials are the perfect base upon which to integrate Smooks into your application.
How do I use Smooks in a Maven based Project?
Take a look at this page.
What “Processing Model” does Smooks employ?
Smooks supports DOM and SAX based processing models, but adds a more “code friendly” layer on top of them. See the user guide.
What “Configuration Model” does Smooks employ?
See the SmooksResourceConfiguration javadocs.
What is a “Message Fragment”?
What is a “selector”?
A Smooks transformation is specified as a series of “resource” configurations (typically made in an XML file). Smooks loads (looks up, or “selects”) a resource at runtime through the resource configuration’s “selector” value. For an XML fragment (Element) transformation resource, the selector value is the XML Element name. As Smooks passes over the message and encounters each Element, it uses the Element’s name to “select” the resources to be applied to that message fragment.
A “selector” is not always an XML Element name, however. An example of this would be message parser resource configurations (CSV parser, EDI parser etc), where the selector is always “org.xml.sax.driver”. What happens when more than one parser configuration (under selector “org.xml.sax.driver”) is targeted at a message e.g. where profiling is in use?
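To make the selector idea concrete, here is a minimal sketch in plain Java (JDK only). It is not the Smooks API: the class `SelectorSketch`, its methods and the resource names used with it are all hypothetical, and the real lookup also takes profiles and configuration specificity into account. It only illustrates selecting resources by Element name, plus the “*” and “$document” selectors covered elsewhere in this FAQ.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; none of these names are Smooks classes.
class SelectorSketch {

    // selector value -> names of the resources configured under it
    private final Map<String, List<String>> resources = new LinkedHashMap<>();

    void register(String selector, String resourceName) {
        resources.computeIfAbsent(selector, k -> new ArrayList<>()).add(resourceName);
    }

    // Resources selected for one element: exact name match, then "*",
    // then "$document" if the element is the document root.
    List<String> select(Element element) {
        List<String> selected = new ArrayList<>();
        selected.addAll(resources.getOrDefault(element.getTagName(), List.of()));
        selected.addAll(resources.getOrDefault("*", List.of()));
        if (element.isSameNode(element.getOwnerDocument().getDocumentElement())) {
            selected.addAll(resources.getOrDefault("$document", List.of()));
        }
        return selected;
    }

    // Walks the message depth first and records "element -> selected resources".
    List<String> walk(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> log = new ArrayList<>();
            visit(doc.getDocumentElement(), log);
            return log;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse message", e);
        }
    }

    private void visit(Element element, List<String> log) {
        log.add(element.getTagName() + " -> " + select(element));
        for (Node child = element.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child instanceof Element) {
                visit((Element) child, log);
            }
        }
    }
}
```

In this sketch an element with no matching selector simply gets an empty list, which mirrors the idea that untargeted fragments are left alone.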
How do I configure a Java resource?
Check out the java-basic tutorial.
How do I configure an XSLT resource?
How do I configure a FreeMarker resource?
How do I configure a StringTemplate resource?
How do I configure a Groovy resource?
Is Smooks an alternative to technologies such as XSLT?
Is Smooks going to be yet another Transformation Configuration Model I’ll have to learn?
First off, don’t think of Smooks as an alternative to the likes of XSLT and don’t think of the Smooks configuration as a “Transformation Configuration” in the same way as an XSL Stylesheet.
The Smooks configuration should be thought of more as a “Framework Configuration” - most frameworks have a configuration of one sort or another. It’s nothing like XSLT, which is basically a programming language in XML. It just defines how transformation/analysis resources are targeted at messages and message fragments. It doesn’t define the low level details of each individual transformation. Also note that the Smooks configuration is quite simple in terms of the number of configuration elements you need to remember - there are only a few.
See the SmooksResourceConfiguration javadocs.
How does Smooks simplify my XSLT?
Smooks can be used to simplify your XSLT in a number of ways:
- Because Smooks supports targeting of transformation resources (including XSLT) at message fragments, it’s easier to modularize and reuse your XSLT.
- Smooks allows you to mix and match different technologies within the context of a single message transform. This means you can transform (or pre-process) fragments of the message, not easily consumed by XSLT, using other technologies e.g. Java or Groovy. See the xslt-groovy tutorial. Unlike XSL Extensions however, it supports mixing these technologies with your XSLT, without affecting the XSLT’s portability.
Sure, XSLT supports custom extensions. The problem with XSLT Extensions is the effect they often have on your XSLT in terms of portability across processor implementations, as well as general stylesheet maintenance. Take a quick look at the mailing lists for some of the main XSL Processor implementations. You’ll see that extension portability is a recurring topic of conversation.
Smooks helps you solve the same type of problems that XSLT Extensions are designed to solve, but by keeping the extension logic separate from the XSLT. Your stylesheets should always be vanilla XSLT, without any reference to extension code. See the xslt-groovy tutorial. It uses a Groovy Smooks resource (“extension”) to pre-process a date field into XML nodes that are more consumable by XSLT, removing the convoluted date field processing logic from the XSLT, keeping the stylesheet nice and simple.
When would I use Smooks to apply an XSLT versus using an XSLT Processor directly?
- No one transformation technology (including XSLT) is ideally suited to all transformation use cases. Some parts (fragments) of a message are going to be easily transformed via a templating approach such as XSLT (structural transformations), while others are more easily transformed using a procedural language such as Java. When you encounter situations such as this and want to avoid the type of portability issues outlined here, Smooks offers a nice clean solution. See the xslt-groovy tutorial. It uses a Groovy Smooks resource (“extension”) to pre-process a date field into XML nodes that are more consumable by XSLT, removing the convoluted date field processing logic from the XSLT, keeping the stylesheet nice and simple.
- Where you want to componentize your XSLTs and apply them against message fragments (versus a whole message using a single monolithic XSLT).
- Where you have a larger message set and require a mechanism for selecting and applying the appropriate XSLT based on message profiles.
What sort of processing overhead is incurred when using Smooks to apply XSLT?
We carried out some profiling in order to get an answer to this very question. The scenario we used was purposely geared in favor of XSLT. The message being processed was very flat (not hierarchical) and was not normalized, and the XSLT we chose to apply was very simple. This type of scenario is especially suited to Stream/SAX based XSL processing.
What this profiling demonstrated was that in this scenario (a worst case scenario for Smooks) a 5% - 15% overhead is incurred when comparing Smooks based application of XSLT to direct DOM based XSLT processing. However, when comparing DOM based application of XSLT (direct or via Smooks) to direct Stream/SAX based XSLT processing, we see that Stream/SAX based processing wins hands down in this type of scenario. It has long been known that Stream/SAX based processing can (given the right conditions) outperform DOM based processing.
This approach to profiling gives users of Smooks a “worst case scenario” idea of how Smooks performs when applying XSL Transforms. What users of Smooks need to remember is that Stream/SAX based processing is not well suited to all types of transforms. As well as that, Smooks offers many other features that help simplify otherwise complex transforms, while at the same time maintaining stylesheet portability across XSL Processors. We’re also very keen to add Stream and SAX based processing to Smooks.
For more on the profiling we carried out, see blog.
NOTE: Smooks v1.0 supports a SAX based processing model. More information on this to follow, or mail the User mailing list.
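For readers unfamiliar with the distinction, the following sketch shows the two ways of handing a message to an XSLT processor through the standard JAXP API: as a pre-built DOM tree (`DOMSource`) or as a character stream (`StreamSource`). It is an illustration only and not the profiling code referred to above. The class name `XsltInputSketch`, the stylesheet and the sample message are made up for the example, and whether a given processor truly streams a `StreamSource` is implementation specific.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical illustration using only JDK (JAXP) classes.
class XsltInputSketch {

    // A deliberately simple stylesheet: prints each order id followed by ';'.
    static final String XSL =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/orders'>"
        + "<xsl:for-each select='order'><xsl:value-of select='@id'/>;</xsl:for-each>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the stylesheet to whatever kind of Source it is given.
    private static String apply(Source message) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(message, new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transform failed", e);
        }
    }

    // DOM based: the whole message is parsed into a tree before the transform starts.
    static String viaDom(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document document = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return apply(new DOMSource(document));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse message", e);
        }
    }

    // Stream based: the processor reads the message itself, with no intermediate DOM.
    static String viaStream(String xml) {
        return apply(new StreamSource(new StringReader(xml)));
    }
}
```

Both methods produce the same result for the same message; only the way the input is built and consumed differs, which is where the performance differences discussed above come from.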
Can I use Java (or Groovy) to pre-process one fragment of a message and then apply an XSLT to the whole document?
Absolutely! This is a classic Smooks use case. See the xslt-groovy tutorial. It uses a Groovy Smooks resource (“extension”) to pre-process a date field into XML nodes that are more consumable by XSLT, removing the convoluted date field processing logic from the XSLT, keeping the stylesheet nice and simple. The XSLT is then applied to the “$document” (root) node.
Does Smooks support a Stream or SAX based processing model?
Smooks Core does support a SAX based processing model. Not all components have been updated to leverage the SAX processing model yet, but they will be in time.
That said, we feel that people should remember that it’s not as simple as “Stream/SAX based processing is faster than DOM based processing”. Stream/SAX is not suited to all types of transforms. For normalized messages, the performance of Stream/SAX based processing can often suffer a lot more than an equivalent DOM based transform. Counteracting this for Stream/SAX can result in more complex and unmaintainable transformations.
Can Smooks be used to process message formats other than XML?
Absolutely! Smooks allows you to configure a SAX parser on a per transform basis (based on a message profile if necessary). As long as a message is hierarchical in nature, SAX events can be generated for that message, allowing it to be consumed by Smooks. See the edi-to-xml and csv-to-xml tutorials.
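The idea of generating SAX events for a non-XML message can be sketched with the JDK alone. The class below is a hypothetical illustration and not the Smooks CSV parser: the class name `CsvToSaxSketch` and the element names `csv-set` and `csv-record` are assumptions made for the example, and the quoting and escaping rules of real CSV are ignored.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not the Smooks CSV parser.
class CsvToSaxSketch {

    // Emits one <csv-record> per line and one child element per field,
    // so any SAX ContentHandler can consume the CSV as if it were XML.
    static void parse(String csv, String[] fieldNames, ContentHandler handler) throws SAXException {
        AttributesImpl noAttributes = new AttributesImpl();
        handler.startDocument();
        handler.startElement("", "csv-set", "csv-set", noAttributes);
        for (String line : csv.split("\\R")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] values = line.split(",", -1);
            handler.startElement("", "csv-record", "csv-record", noAttributes);
            for (int i = 0; i < fieldNames.length && i < values.length; i++) {
                char[] text = values[i].trim().toCharArray();
                handler.startElement("", fieldNames[i], fieldNames[i], noAttributes);
                handler.characters(text, 0, text.length);
                handler.endElement("", fieldNames[i], fieldNames[i]);
            }
            handler.endElement("", "csv-record", "csv-record");
        }
        handler.endElement("", "csv-set", "csv-set");
        handler.endDocument();
    }

    // Demo helper: records the name of every element that was started.
    static List<String> startedElements(String csv, String... fieldNames) {
        List<String> names = new ArrayList<>();
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                names.add(qName);
            }
        };
        try {
            parse(csv, fieldNames, recorder);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

Once a message is expressed as SAX events like this, anything downstream that consumes SAX events does not need to know that the original data was not XML.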
What’s the difference between the Javabean Cartridge and XML Binding frameworks like JAXB, XMLBeans, XStream etc?
The Javabean Cartridge is not intended as a straight alternative to existing XML Binding frameworks such as those listed above. If your only requirement is that of binding XML to and from Java Objects, then you should probably go with one of these other frameworks.
The Smooks Java binding functionality can be a very useful alternative:
- For binding non-XML data e.g. EDI, CSV, Java (i.e. performing Java to Java transforms).
- For binding XML data that doesn’t line up with the target Java object model.
- In situations where your source data model does not conform to any schema. Some tools (e.g. JAXB and XMLBeans) require you to have an XSD, from which the Java model is generated and against which the source message is validated.
- For performing Expression Based Bindings.
- For creating Virtual Object Models. This can be very useful when performing model driven transformations.
- For binding XML data where the XML model’s dataset is a superset of the Java model’s dataset i.e. where you need to selectively pick data from the source XML.
- In support of complex splitting, transformation and routing (and other operations).
We’re sure there are other use cases where Java binding using Smooks makes sense, but basically what we’re saying is that if all you are interested in is straightforward marshalling and unmarshalling between Java and XML, then JAXB/XStream etc is probably a more straightforward option, as long as your use case fits inside the parameters set down by these frameworks. If your use case cannot be solved using JAXB (etc) without major headaches (which can often be the case), then Smooks can be an option for you!
How do I target a resource at the document root fragment of a message without having to specify the name of the root Element?
Specify the selector as “$document”.
How do I target a resource at all Elements of a message?
Specify the selector as “*”.
What happens to message elements I don’t target any resources at?
They remain in the resulting document, untouched. This is unlike a templating type system (e.g. XSLT) where this would result in these fragments being omitted from the resulting document.
Can I target more than one resource at a message fragment?
You can. They will be sorted and applied by Smooks in order of their configuration specificity. See the SmooksResourceConfigurationSortComparator.
What technologies are supported by Smooks?
Smooks supports a number of technologies and can easily be extended to support more. These technologies are bundled in what we call “Cartridges”. A single cartridge may support more than one type of processing technology.
Apart from message transformation, what other forms of message processing are possible with Smooks?
In most of the Smooks documentation we talk about Smooks in the context of data transformation. However, the core of Smooks (smooks-core) knows nothing about data transformation and so does nothing specific in this area. It’s basically an engine for applying “visitor” logic to data “fragments”. This logic can be data transformation logic, or it can be logic for processing/analyzing data in any way you require.
So, the answer to this question is - “whatever type of processing you require”. Just write the visitor implementation(s) and get Smooks to apply the logic on the target fragments. Visitor implementations can interact with each other via the ExecutionContext.
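As a rough picture of visitor logic that analyzes data rather than transforming it, here is a small sketch in plain Java. It does not use the Smooks API: `VisitorSketch`, `FragmentVisitor` and `Context` are hypothetical stand-ins for the visitor and ExecutionContext concepts, and fragments are reduced to name/text pairs to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; these are not the Smooks visitor or ExecutionContext types.
class VisitorSketch {

    // Stand-in for a per-execution context shared by all visitors.
    static class Context {
        private final Map<String, Object> attributes = new HashMap<>();
        void set(String key, Object value) { attributes.put(key, value); }
        Object get(String key) { return attributes.get(key); }
    }

    // Logic applied to one named fragment of the message.
    interface FragmentVisitor {
        void visit(String fragmentName, String text, Context context);
    }

    // Analysis logic: adds up the value of every "price" fragment.
    static class PriceTotaller implements FragmentVisitor {
        public void visit(String fragmentName, String text, Context context) {
            if (!fragmentName.equals("price")) {
                return;
            }
            double soFar = context.get("total") == null ? 0.0 : (Double) context.get("total");
            context.set("total", soFar + Double.parseDouble(text));
        }
    }

    // Second visitor: reads what the first one stored once the order ends.
    static class OrderReporter implements FragmentVisitor {
        final List<String> reports = new ArrayList<>();
        public void visit(String fragmentName, String text, Context context) {
            if (fragmentName.equals("order-end")) {
                reports.add("total=" + context.get("total"));
            }
        }
    }

    // Applies every visitor to every fragment, in order, with one shared context.
    static void process(String[][] fragments, List<FragmentVisitor> visitors) {
        Context context = new Context();
        for (String[] fragment : fragments) {
            for (FragmentVisitor visitor : visitors) {
                visitor.visit(fragment[0], fragment[1], context);
            }
        }
    }
}
```

Neither visitor knows about the other; they cooperate only through the shared context, which is the pattern the answer above describes.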
Can a single Smooks instance be run concurrently?
Can Smooks be extended to support other transformation/processing technologies?
Sure. To add support for another transformation or processing technology, you need to implement a ContentHandlerFactory for that technology. As examples, see the following ContentHandlerFactory implementations:
- XslContentHandlerFactory: Adds support for XSLT.
- StringTemplateContentHandlerFactory: Adds support for the StringTemplate templating framework.
- GroovyContentHandlerFactory: Adds support for Groovy scripted resources.
Registering the ContentHandlerFactory is just a matter of adding a file named “content-handlers.inf” to the META-INF folder of the factory’s jar file and listing the implementation class(es) therein (one per line).
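To illustrate the one-class-per-line format, the sketch below reads such a listing and returns the class names. It is a hypothetical illustration and not how Smooks itself loads the file: the class `ContentHandlersInfSketch` is made up, the `com.example` factory names in the usage note are placeholders, and skipping blank lines is an assumption of the sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Smooks loading code.
class ContentHandlersInfSketch {

    // Returns the class names listed in the file, one per line.
    // Skipping blank lines is an assumption of this sketch.
    static List<String> readClassNames(Reader infFile) throws IOException {
        List<String> classNames = new ArrayList<>();
        BufferedReader reader = new BufferedReader(infFile);
        String line;
        while ((line = reader.readLine()) != null) {
            String name = line.trim();
            if (!name.isEmpty()) {
                classNames.add(name);
            }
        }
        return classNames;
    }

    // Convenience overload for file content already held in memory.
    static List<String> readClassNames(String content) {
        try {
            return readClassNames(new StringReader(content));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, a file containing the two lines `com.example.FooContentHandlerFactory` and `com.example.BarContentHandlerFactory` would yield a list of those two names, each of which could then be loaded by name.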