Frequently Asked Questions

1. What is the easiest way to get started with Smooks?
2. How do I use Smooks from a project managed with Maven?
3. What is the Smooks processing model?
4. What is the configuration model of Smooks?
5. What is a fragment/chunk?
6. What is a selector?
7. Is Smooks an alternative to XML transformation technologies such as XSLT?
8. How does Smooks compare with enterprise application integration (EAI) frameworks like Apache Camel or Mule?
9. Can I pre-process individual chunks of a stream and then apply a transformation to the overall stream?
10. Can Smooks ingest non-XML formats?
11. What is the difference between the JavaBean cartridge and XML data binding frameworks like JAXB, Apache XMLBeans, and XStream?
12. How do I configure a Java resource?
13. How do I configure an XSLT resource?
14. How do I configure a FreeMarker resource?
15. How do I configure a StringTemplate resource?
16. How do I configure a Groovy resource?
17. How do I target the root event of a stream without specifying the name of the root event?
18. How do I target all events of a stream?
19. What happens to events that I do not target?
20. Can I target a fragment with multiple resources?
21. What are the technologies that Smooks supports?
22. Apart from transformation, what other forms of processing are possible with Smooks?
23. Can a single Smooks instance have multiple concurrent executions?
24. Can Smooks be extended to support other libraries, frameworks, and platforms?

1. What is the easiest way to get started with Smooks?

A practical way to get started with Smooks is to:

Read the Smooks fundamentals section.
Download and run some of the examples.

2. How do I use Smooks from a project managed with Maven?

3. What is the Smooks processing model?

The Smooks processing model is designed around the concepts of readers, events, chunks/fragments, selectors, and visitors. A reader parses a structured data source (e.g., JSON) and Smooks fires a hierarchy of events from it. An event is the smallest unit of data that a Smooks application can target for data integration. User-defined and built-in visitors use XPath-like selectors to target emitted events or chunks of these emitted events. A visitor is a unit of behaviour that processes a targeted event or chunk of events (i.e., fragment) to execute a data integration task such as transformation, routing, enrichment, binding, or filtering.
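The relationship between readers, events, selectors, and visitors can be illustrated with a small plain-Java sketch. It is a toy model written for this FAQ, not the Smooks API: the Event and Binding classes and the readOrder, filter, and run methods are hypothetical names, and the selector is reduced to a node name or * rather than a real XPath-like expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model only: none of these types or methods belong to the Smooks API.
class ProcessingModelSketch {

    /** An event fired while reading the source: the start or end of a named node. */
    static final class Event {
        final String kind; // "start" or "end"
        final String name;

        Event(String kind, String name) {
            this.kind = kind;
            this.name = name;
        }
    }

    /** Ties a selector to a visitor, roughly the role a resource config plays. */
    static final class Binding {
        final String selector;
        final Consumer<Event> visitor;

        Binding(String selector, Consumer<Event> visitor) {
            this.selector = selector;
            this.visitor = visitor;
        }
    }

    /** Stand-in for a reader: the events for <order><item/><item/></order>. */
    static List<Event> readOrder() {
        return List.of(
                new Event("start", "order"),
                new Event("start", "item"),
                new Event("end", "item"),
                new Event("start", "item"),
                new Event("end", "item"),
                new Event("end", "order"));
    }

    /** Fires each event and applies every visitor whose selector matches it. */
    static void filter(List<Event> events, List<Binding> bindings) {
        for (Event event : events) {
            for (Binding binding : bindings) {
                boolean matches = binding.selector.equals("*")
                        || binding.selector.equals(event.name);
                if (matches) {
                    binding.visitor.accept(event);
                }
            }
        }
    }

    /** Targets only the "item" events and records what the visitor saw. */
    static List<String> run() {
        List<String> log = new ArrayList<>();
        List<Binding> bindings = List.of(
                new Binding("item", e -> log.add(e.kind + ":" + e.name)));
        filter(readOrder(), bindings);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [start:item, end:item, start:item, end:item]
    }
}
```

The visitor is invoked only for the item events; the order events are fired but no visitor in this sketch targets them.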

4. What is the configuration model of Smooks?

Resource configs define resources driving the behaviour of Smooks. These resources could be readers, visitors, system resources, or even ad-hoc resources. A resource config is primarily composed of a selector, a type, the resource itself, and a key/value set of parameters. Resource configs can be declared from the Java or XML API of Smooks.

5. What is a fragment/chunk?

A chunk, also known as a fragment, is a hierarchy of events emitted from a source. Specifically, it consists of (1) a start event, (2) zero or more child events, and (3) a closing end event. The application defines the chunk’s boundaries but, practically speaking, a chunk could correspond to a CSV record, an XML element, a JSON field, an EDI segment, and so on.
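As a rough illustration of these boundaries, the following plain-Java sketch cuts a flat list of events into fragments. It is not Smooks code: events are modelled as hypothetical "start:name" and "end:name" strings, and the fragments method is an invented helper that collects everything from a matching start event up to its corresponding end event.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: the string-encoded events and this helper are not part of Smooks.
class FragmentSketch {

    /** Returns one list of events per fragment rooted at a node with the given name. */
    static List<List<String>> fragments(List<String> events, String name) {
        List<List<String>> result = new ArrayList<>();
        List<String> current = null; // the fragment being collected, if any
        int depth = 0;               // nesting depth inside the current fragment

        for (String event : events) {
            if (current == null && event.equals("start:" + name)) {
                current = new ArrayList<>(); // (1) the start event opens a fragment
            }
            if (current != null) {
                current.add(event);          // (2) child events belong to the fragment
                if (event.startsWith("start:")) {
                    depth++;
                } else if (event.startsWith("end:")) {
                    depth--;
                }
                if (depth == 0) {            // (3) the matching end event closes it
                    result.add(current);
                    current = null;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> events = List.of(
                "start:orders",
                "start:order", "start:id", "end:id", "end:order",
                "start:order", "end:order",
                "end:orders");
        // Two fragments: one with a child id node, one with no children.
        System.out.println(fragments(events, "order"));
    }
}
```

Choosing "orders" instead of "order" as the boundary would yield a single fragment spanning the whole stream, which is the sense in which the application defines the chunk's boundaries.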

6. What is a selector?

At a fundamental level, a Smooks application is composed of a sequence of resource configs, typically declared in an XML file. Smooks activates a resource at runtime through its resource config's selector attribute. With some exceptions, the selector is an XPath-like expression that points to one or more events in the stream. When Smooks fires an event (e.g., an end event), if the resource config selector matches the event, Smooks applies the corresponding resource to the event or fragment, which results in an operation such as transformation, filtering, routing, or binding.

7. Is Smooks an alternative to XML transformation technologies such as XSLT?

While Smooks does indeed borrow a number of concepts from XSLT, and XML more generally, it should not be conflated with XSLT or other transformation solutions. XSLT is an XML language that uses pattern-matching for transforming XML documents into other XML documents. Smooks is a framework for event-driven, fragment-based data integration using XML (e.g., XSLT, DFDL) and non-XML (e.g., Java, Groovy, FreeMarker, Docker) solutions. Smooks empowers you to apply multiple independent XSLT resources on an event stream alongside other resources in the context of a single execution.

8. How does Smooks compare with enterprise application integration (EAI) frameworks like Apache Camel or Mule?

Camel and Mule are frameworks for building message-oriented middleware. The unit of data within these frameworks is a message, which is a packet consisting of a payload and metadata. While this data abstraction is certainly useful in many integration scenarios, it can lead to unnecessary complexity when the data needs to be broken down and processed efficiently. Depending on the context, a splitter can help manage the complexity. On the other hand, Smooks is arguably a better candidate since it is geared towards this problem domain. Although a clear-cut answer cannot be given, we generally recommend leveraging an EAI framework for connectivity, content-based routing, and transformation/mapping, and letting Smooks do the heavy lifting when the data needs to be pulled apart and consumed in different ways.

Apache Camel has a Smooks component providing seamless integration between the two frameworks.

9. Can I pre-process individual chunks of a stream and then apply a transformation to the overall stream?

This is a classic Smooks use case. See the xslt-groovy example. It uses the core:smooks (i.e., pipeline) construct together with the core:rewrite construct to pre-process a date event so that the date is easier for XSLT to consume. The XSLT script is then applied to the event stream with the #document (i.e., root) selector. This removes any need for convoluted date processing logic within the XSLT script.

10. Can Smooks ingest non-XML formats?

Certainly. You can leverage one of the many out-of-the-box readers, or develop your own Smooks reader, to ingest a non-XML source (based on a profile if necessary). The reader emits events from the source, allowing the downstream Smooks resources to target and process the data. See the edi-to-xml and csv-to-xml examples.

11. What is the difference between the JavaBean cartridge and XML data binding frameworks like JAXB, Apache XMLBeans, and XStream?

The JavaBean cartridge is not designed to be an alternative to existing XML data binders such as those listed above. A dedicated XML data binding solution should be considered if the only requirement is that of marshalling Java objects into XML and unmarshalling XML into objects.

The JavaBean cartridge is a useful alternative:

For binding non-XML data.
For binding XML data that does not line up with the target Java object model.
In situations where your source data model does not conform to any schema. Some tools (e.g., JAXB and XMLBeans) require you to have an XSD, from which the Java model is generated and against which the source message is validated.
For performing expression-based bindings.
For creating virtual object models. This can be very useful when performing model-driven transformations.
For binding XML data where the XML model’s dataset is a superset of the Java model’s dataset, i.e., where you need to selectively pick data from the source XML.
In support of complex splitting, transformation, and routing (as well as other operations).

The preceding list is not exhaustive. There are other use cases where Java binding with Smooks makes sense. However, the rule of thumb is that if all you are interested in is straightforward marshalling and unmarshalling between Java and XML, then an XML data binder is likely a more practical solution. That is, as long as your use case fits inside the parameters set down by these frameworks. If your use case cannot be solved using an XML data binder without major headaches, then Smooks could be the solution for you!

12. How do I configure a Java resource?

Check out the java-basic example.

13. How do I configure an XSLT resource?

Check out the user guide.
Check out the examples.

14. How do I configure a FreeMarker resource?

Check out the user guide.
Check out the templating examples.

15. How do I configure a StringTemplate resource?

Check out the StringTemplateContentHandlerFactory Javadocs.
Check out the templating examples.

16. How do I configure a Groovy resource?

Check out the user guide.
Check out the examples.

17. How do I target the root event of a stream without specifying the name of the root event?

Set the resource config selector to #document.

18. How do I target all events of a stream?

Set the resource config selector to *.

19. What happens to events that I do not target?

Untargeted events are directed to the final output. This is unlike most templating solutions (e.g., XSLT), where untargeted fragments would be omitted from the result.

20. Can I target a fragment with multiple resources?

You can. Smooks sorts the resources and applies them in order of their configuration specificity. See the SmooksResourceConfigurationSortComparator.

21. What are the technologies that Smooks supports?

Smooks does not limit you to any single technology. You can write your own extensions but Smooks comes with its own extensions that are bundled together to form cartridges. A cartridge adds one or more resources to Smooks to support a particular technology.

22. Apart from transformation, what other forms of processing are possible with Smooks?

The core of Smooks has no notion of data transformation. Smooks is an engine for applying visitors to events emitted from a data source. A visitor could be a transformer, or it could implement logic for processing or analysing data in any way to fit your use case. In other words, the answer to this question is "whatever type of processing you require". You only need to implement a visitor and hook it up with Smooks from a resource config.

Multiple visitors can interact with each other via the execution context.

23. Can a single Smooks instance have multiple concurrent executions?

Absolutely! org.smooks.Smooks and org.smooks.api.ApplicationContext are thread-safe. However, each execution should have its own instance of org.smooks.api.ExecutionContext since an execution context is NOT thread-safe.
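The usage pattern implied here (one shared instance, one context per execution) can be sketched with plain Java concurrency utilities. The Engine class below is a hypothetical stand-in for org.smooks.Smooks and a HashMap stands in for org.smooks.api.ExecutionContext; the method names and the "processed:" prefix are assumptions made for illustration and do not reflect the real Smooks signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Toy model only: Engine is a hypothetical stand-in, not the Smooks API.
class ConcurrencySketch {

    /** Safe to share between threads because it holds only immutable state. */
    static final class Engine {
        private final String prefix;

        Engine(String prefix) {
            this.prefix = prefix;
        }

        /** Creates a fresh, non-thread-safe context for exactly one execution. */
        Map<String, Object> createExecutionContext() {
            return new HashMap<>();
        }

        /** Runs one execution; all mutable state lives in the given context. */
        String filter(Map<String, Object> context, String input) {
            context.put("input", input);
            return prefix + context.get("input");
        }
    }

    /** Processes every input on a thread pool, sharing one engine across all tasks. */
    static List<String> runAll(List<String> inputs) {
        Engine engine = new Engine("processed:");
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String input : inputs) {
                // Each task creates its own context instead of sharing one.
                futures.add(pool.submit(
                        () -> engine.filter(engine.createExecutionContext(), input)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // results come back in submission order
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAll(List.of("a", "b", "c"))); // [processed:a, processed:b, processed:c]
    }
}
```

Sharing a single context between the tasks instead would let concurrent executions overwrite each other's state, which is the situation the answer above warns against.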

24. Can Smooks be extended to support other libraries, frameworks, and platforms?

Yes. To add support for another transformation or processing technology, you need to implement an org.smooks.api.delivery.ContentHandlerFactory for the provider. As examples, see the following ContentHandlerFactory implementations:

XslContentHandlerFactory: Adds support for XSLT.
StringTemplateContentHandlerFactory: Adds support for the StringTemplate templating framework.
GroovyContentHandlerFactory: Adds support for Groovy scripted resources.

Registering the ContentHandlerFactory is a matter of adding a configuration file named org.smooks.api.delivery.ContentHandlerFactory to the META-INF/services directory and declaring the ContentHandlerFactory implementation class(es).