Benchmarks are easy to produce and hard to interpret honestly. This page covers how the performance data published on this site was gathered, what the numbers actually tell you about different XSLT processors, and the caveats that separate useful analysis from misleading comparisons. The detailed results are on the performance results page, and The XSLTC Story provides context on compiled transformation approaches. The XSLT debugging workflow guide is relevant for anyone trying to isolate performance issues in their own pipelines.
## How Benchmark Data Is Interpreted
Every benchmark number on this site comes with context. Raw timing data without context is worse than no data, because it invites false conclusions.
The basic framework is straightforward: run a transformation N times, discard the first few iterations to account for JVM warmup and class loading, record the timing of the remaining iterations, and report the median. Mean values are less useful because outliers from garbage collection or I/O contention can skew them significantly.
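The framework above can be sketched as a small harness. This is a minimal illustration, not the actual measurement code used for this site; the class and method names (`BenchSketch`, `benchmark`, `medianNanos`) are hypothetical.

```java
import java.util.Arrays;

// Hypothetical harness: run a task N times, discard warmup iterations,
// and report the median of the measured iterations.
public class BenchSketch {

    // Median resists skew from GC pauses and I/O outliers better than the mean.
    static long medianNanos(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1 ? sorted[n / 2]
                          : (sorted[n / 2 - 1] + sorted[n / 2]) / 2;
    }

    static long benchmark(Runnable task, int warmup, int measured) {
        for (int i = 0; i < warmup; i++) task.run();   // discarded: JVM warmup
        long[] samples = new long[measured];
        for (int i = 0; i < measured; i++) {
            long t0 = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - t0;
        }
        return medianNanos(samples);
    }

    public static void main(String[] args) {
        // Stand-in workload; a real run would wrap an XSLT transformation.
        long med = benchmark(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) sb.append(i);
        }, 5, 21);
        System.out.println("median ns: " + med);
    }
}
```

An odd measured-iteration count (21 here) keeps the median a single observed sample rather than an interpolated value.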
We report timing for several dimensions:
- Engine: Which XSLT processor was used (Saxon HE, Saxon EE, Xalan-J, XSLTC, browser XSLTProcessor).
- Stylesheet complexity: Simple passthrough, moderate restructuring, or complex multi-template transforms.
- Input size: Small (under 5 KB), medium (20-100 KB), and large (over 500 KB) documents.
- Execution mode: Interpreted, compiled, or streaming where applicable.
The goal is not to crown a winner. It is to help practitioners make informed decisions about which engine fits their workload profile.
## Caveats Around Environment, Input Shape, and I/O
Benchmarks run in a controlled environment. Production runs in the opposite of a controlled environment. The gap between the two is where most false confidence originates.
JVM version matters. The same Saxon version can show 15-20% performance differences between JVM implementations. We standardize on a specific JDK version for each benchmark run and report it.
Input shape matters more than input size. A 50 KB document with deep nesting and heavy use of namespaced attributes exercises different code paths than a 50 KB document with flat, repetitive structure. Template matching cost scales with tree depth and branching factor, not just byte count.
I/O is often the bottleneck. In batch processing scenarios, disk read time, network latency, and output serialization frequently dominate total elapsed time. A 3x improvement in pure transformation speed delivers negligible improvement if the transform accounts for only 10% of end-to-end time.
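The 10% figure above is just Amdahl's law: overall speedup is 1 / ((1 − p) + p/s), where p is the fraction of time spent in the transform and s is the transform speedup. A quick sanity check (the class name `Amdahl` is illustrative):

```java
// Amdahl's-law check: a 3x transform speedup when the transform
// is only 10% of end-to-end time.
public class Amdahl {
    static double overallSpeedup(double fraction, double speedup) {
        return 1.0 / ((1.0 - fraction) + fraction / speedup);
    }

    public static void main(String[] args) {
        // 3x faster transform, 10% of total time: roughly a 7% overall gain.
        System.out.printf("%.3f%n", overallSpeedup(0.10, 3.0));
    }
}
```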
Warmup is real. JVM-based processors show substantially different performance in the first few executions compared to steady state. XSLTC compiled transforms reduce this gap but do not eliminate it.
## Engine Comparison Overview
The following summary reflects general patterns observed across multiple benchmark runs. Individual results vary by workload.
| Engine | XSLT Version | Compiled Mode | Streaming | Typical Use Case |
|---|---|---|---|---|
| Saxon HE | 3.0 | No | No | General purpose, standards compliance |
| Saxon EE | 3.0 | Yes | Yes | High-volume production, large documents |
| Xalan-J | 1.0 | Via XSLTC | No | Legacy systems, embedded processing |
| XSLTC | 1.0 | Yes | No | Batch processing, compiled speed |
| Browser XSLTProcessor | 1.0 | N/A | No | Client-side preview, lightweight transforms |
Saxon HE is the default recommendation for most new projects. It supports XSLT 3.0, has excellent standards compliance, and performs well on medium-sized workloads. For high-volume batch processing where every millisecond matters, compiled execution via Saxon EE or XSLTC offers measurable gains.
The browser’s built-in XSLTProcessor is a viable option for client-side rendering of small documents. Performance degrades noticeably above 100 KB input sizes and with complex stylesheets.
## Common Mistakes When Comparing Transformations
The most frequent benchmarking mistakes we encounter:
Comparing across XSLT versions. An XSLT 1.0 stylesheet does not exercise the same processor features as an XSLT 3.0 stylesheet. Performance comparisons between the two are meaningless.
Ignoring compilation time. XSLTC compilation is a one-time cost that amortizes over many executions. Reporting first-run time as representative of steady-state performance overstates the cost of compiled transforms.
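The amortization point is visible in the standard JAXP API (part of the JDK): `Templates` holds a compiled stylesheet, so the compilation cost is paid once and each subsequent `newTransformer()` call is cheap. A minimal sketch, with an inline identity-style stylesheet for illustration:

```java
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

public class CompileOnce {
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:template match='/'><out><xsl:value-of select='doc'/></out>"
        + "</xsl:template></xsl:stylesheet>";

    // One execution against an already-compiled stylesheet.
    static String run(Templates compiled, String xml) throws Exception {
        Transformer t = compiled.newTransformer(); // cheap per run
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new StreamSource(new StringReader(xml)),
                    new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        TransformerFactory tf = TransformerFactory.newInstance();
        // One-time compilation cost, amortized over every later execution.
        Templates compiled = tf.newTemplates(
            new StreamSource(new StringReader(XSLT)));
        for (int i = 0; i < 3; i++) {
            System.out.println(run(compiled, "<doc>hi</doc>")); // <out>hi</out>
        }
    }
}
```

Benchmark the `run` calls and the `newTemplates` call separately; folding compilation into first-run timing is exactly the mistake described above.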
Using trivially small inputs. A benchmark on a 200-byte XML document mostly measures startup overhead. Use input sizes representative of your actual workload.
Measuring wall clock time on shared infrastructure. Cloud VMs with noisy neighbors produce unreliable timing data. Use dedicated hardware or at minimum report the variance across runs.
Cherry-picking stylesheet features. A benchmark that tests only xsl:for-each tells you nothing about xsl:apply-templates performance. Design benchmarks to exercise the patterns your production stylesheets actually use.
## Internal Benchmark Resources
For detailed data tables and per-engine analysis:
- Performance Results contains the raw timing data and comparative analysis across processors.
- The XSLTC Story explains the compiler architecture that enables compiled transformation performance.
- Gregor XSLT documents the transformation tool that originated several of the benchmark workloads used on this site.