The XSLTC compiler represents one of the more interesting engineering decisions in the XSLT ecosystem: what happens when you stop interpreting stylesheets at runtime and start compiling them into executable code? This page covers the technical story behind that approach, the design tradeoffs that shaped it, and where compiled XSLT transformation fits in the context of modern processing pipelines. This connects to the Gregor XSLT project, which extends compiled transformation with additional optimization work, and to the benchmark data that quantifies the performance differences.

The Interpretation Problem

XSLT processors traditionally work as interpreters. The processor reads a stylesheet, builds an internal representation of the template rules, and then walks the input document tree evaluating templates against nodes at runtime. This approach has clear advantages: it is flexible, easy to debug, and straightforward to implement against the XSLT specification.

The problem appears at scale. When the same stylesheet is applied to thousands of documents, the interpreter repeats substantial amounts of work on each execution. Template matching priorities are re-evaluated. XPath expressions are re-parsed and re-optimized. Data structures are re-allocated. None of this work produces different results across executions if the stylesheet has not changed.

For single-document transformations, the overhead is invisible. For batch processing, where the same stylesheet runs against a large corpus, the cumulative cost is significant. One thing I learned early in this domain: the difference between interpreted and compiled execution only matters when you are processing enough documents to notice it. The crossover point varies, but it is lower than most people assume.

The Compiler Approach

XSLTC eliminates the interpretation overhead by compiling the stylesheet into Java bytecode. The compiled artifact is a Java class that the JVM can load and execute directly, with no interpretive layer.

The compilation process analyzes the stylesheet structure and produces bytecode that:

  • implements template matching logic as direct method dispatch rather than runtime pattern evaluation
  • pre-resolves import precedence and default rules
  • compiles XPath expressions into optimized bytecode rather than interpreting them through an expression evaluator
  • handles output serialization directly without an intermediate abstract representation

The result is a transformation engine that does substantially less per-document work. Template matching becomes a method call. XPath evaluation becomes a series of bytecode instructions. The overhead that dominated interpreted execution is compiled away.

Technical detail: XSLTC compiles each stylesheet into a translet, which is a Java class that implements a specific interface. The translet can be serialized, cached, and reused across JVM sessions. This makes it practical to compile once during deployment and reuse the compiled artifact throughout the application lifetime.
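The compile-once, reuse-many pattern is visible even through the standard JAXP (TrAX) API, without touching XSLTC-specific classes. The sketch below uses the JDK's default TransformerFactory (which is derived from XSLTC) and a Templates object to compile a stylesheet once and reuse it across documents; the stylesheet and inputs are made-up examples:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CompileOnce {
    public static void main(String[] args) throws Exception {
        String xslt =
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/doc'>Hello, <xsl:value-of select='name'/></xsl:template>"
          + "</xsl:stylesheet>";

        // Compile the stylesheet once. With an XSLTC-based factory, this is
        // where the translet (generated bytecode) is produced and cached
        // inside the Templates object.
        TransformerFactory factory = TransformerFactory.newInstance();
        Templates compiled = factory.newTemplates(new StreamSource(new StringReader(xslt)));

        // Reuse the compiled artifact for each input document; only a
        // lightweight Transformer wrapper is created per transformation.
        for (String name : new String[] {"Ada", "Grace"}) {
            Transformer t = compiled.newTransformer();
            StringWriter out = new StringWriter();
            t.transform(
                new StreamSource(new StringReader("<doc><name>" + name + "</name></doc>")),
                new StreamResult(out));
            System.out.println(out); // "Hello, Ada", then "Hello, Grace"
        }
    }
}
```

Templates objects are thread-safe, so one compiled stylesheet can serve an entire batch pipeline; Transformer instances are not, which is why the loop creates one per document.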

Engineering Tradeoffs

Compilation is not free. The XSLTC approach introduces several engineering tradeoffs that matter in practice.

Compilation time. Compiling a stylesheet to bytecode takes longer than simply parsing it for interpretation. For small stylesheets, compilation might take 200-500ms. For large stylesheets, it can take several seconds. This cost is justified only when the compiled artifact is reused enough times to amortize it.

Debugging opacity. Bytecode is not readable. When a compiled transformation produces unexpected output, mapping the problem back to the original stylesheet is harder than in an interpreted environment. Stack traces reference bytecode offsets, not stylesheet line numbers. Tools exist to bridge this gap, but they add complexity. The Gregor project addresses this by maintaining explicit source-to-execution mappings.

Feature coverage. XSLTC targets XSLT 1.0. Supporting the full XSLT 2.0 or 3.0 specification through compilation would require substantially more engineering effort. Features like streaming, higher-order functions, and maps introduce compilation challenges that XSLT 1.0 does not present. This limits XSLTC to workloads that can be expressed in XSLT 1.0.

Startup cost. Loading a compiled translet has its own JVM class loading overhead. For very short-lived applications that process only a few documents, the class loading time may offset the execution time savings. This is rarely an issue in practice because most batch processing applications are long-lived.

Performance Characteristics

The performance profile of XSLTC compiled transforms is documented in detail on the performance results page. The summary pattern is consistent across workloads:

Compiled transforms are 2x to 5x faster than interpreted transforms for the same stylesheet and input. The improvement is larger for complex stylesheets (many templates, deep template matching logic) and smaller for simple stylesheets (few templates, straightforward processing).

The improvement is most pronounced in template matching. When an interpreted processor evaluates 60 template rules at each node in a large document, the cumulative matching cost is substantial. The compiled version resolves these matches through pre-computed dispatch tables, which reduces per-node cost from microseconds to nanoseconds.
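The dispatch-table idea can be illustrated with a deliberately simplified sketch. This is not XSLTC's actual generated code: in real translets each template body becomes a method and dispatch is compiled into bytecode, but the structural point is the same, one table lookup replaces evaluating every pattern against every node:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class DispatchSketch {
    // Each entry stands in for a compiled template body. Match resolution
    // (priority, import precedence) happened at compile time, so the table
    // already holds the single winning template per element name.
    static final Map<String, Function<String, String>> DISPATCH = new HashMap<>();
    static {
        DISPATCH.put("title", text -> "<h1>" + text + "</h1>");
        DISPATCH.put("para",  text -> "<p>" + text + "</p>");
    }

    static String apply(String elementName, String text) {
        // One hash lookup per node, instead of scanning a rule list and
        // re-evaluating match patterns and priorities at runtime.
        Function<String, String> template = DISPATCH.get(elementName);
        return template != null
            ? template.apply(text)
            : text; // stand-in for the built-in default rule: copy text through
    }

    public static void main(String[] args) {
        System.out.println(apply("title", "XSLTC"));            // <h1>XSLTC</h1>
        System.out.println(apply("para", "Compiled templates")); // <p>Compiled templates</p>
    }
}
```

An interpreted processor would, in effect, loop over all 60 rules for each node; here the per-node cost is a single lookup regardless of how many templates the stylesheet defines.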

XPath evaluation shows a smaller but consistent improvement. Compiled XPath expressions avoid the interpretation overhead of parsing and evaluating the expression tree at runtime. For simple path expressions, the difference is minimal. For complex predicates with multiple boolean conditions and position tests, the compiled version is noticeably faster.
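The same parse-once, evaluate-many principle is exposed in the JDK's standard XPath API. `XPath.compile()` pre-parses and analyzes the expression rather than generating bytecode the way XSLTC does, but the amortization benefit is analogous; the document below is a made-up example:

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CompiledXPath {
    public static void main(String[] args) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(
                "<items><item id='1'>a</item><item id='2'>b</item></items>")));

        // Parse and analyze the expression once, up front.
        XPathExpression expr = XPathFactory.newInstance().newXPath()
            .compile("/items/item[@id='2']");

        // Reuse the compiled expression across many evaluations; no
        // re-parsing of the expression text per document.
        String value = (String) expr.evaluate(doc, XPathConstants.STRING);
        System.out.println(value); // b
    }
}
```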

Data point: In a representative batch workload processing 10,000 medium-complexity documents, XSLTC compiled execution completed in 42 seconds. The same workload through Xalan-J interpreted execution took 158 seconds. Both produced identical output. The compilation step added 1.8 seconds.

Where Compiled Transformation Fits Today

Compiled XSLT transformation occupies a specific niche. It is not the right choice for every workload, and understanding where it fits prevents both underuse and overuse.

Good fit: Batch processing pipelines that apply the same stylesheet to many documents. Data migration projects. Document conversion systems. Any scenario where the compilation cost is amortized across hundreds or thousands of executions.

Marginal fit: Interactive applications where the stylesheet might change between executions. Development and testing workflows where rapid iteration matters more than execution speed. Workloads with very small documents where transformation time is already negligible.

Poor fit: One-off transformations. Workloads that require XSLT 2.0 or 3.0 features. Scenarios where debugging transparency is critical and the additional tooling for bytecode mapping is not available.

The XSLT ecosystem has matured since XSLTC was first developed. Saxon’s optimization engine has closed some of the performance gap between interpreted and compiled execution. But for XSLT 1.0 batch workloads, compiled execution through XSLTC or through the extended compilation approach in Gregor remains the fastest option available.

Connection to Current Work

The compiler approach that XSLTC pioneered continues to influence current transformation tooling work. The Gregor XSLT project extends the compilation concept with better source mapping for debugging, additional optimization passes, and a more flexible integration model. The benchmarks section tracks how compiled and interpreted performance evolve as processors and JVMs are updated.