Benchmarks are easy to produce and hard to interpret honestly. This page covers how the performance data published on this site was gathered, what the numbers actually tell you about different XSLT processors, and the caveats that separate useful analysis from misleading comparisons. The detailed results are in the performance results page, and the XSLTC Story provides context on compiled transformation approaches. The XSLT debugging workflow guide is relevant for anyone trying to isolate performance issues in their own pipelines.

How Benchmark Data Is Interpreted

Every benchmark number on this site comes with context. Raw timing data without context is worse than no data, because it invites false conclusions.

The basic framework is straightforward: run a transformation N times, discard the first few iterations to account for JVM warmup and class loading, record the timing of the remaining iterations, and report the median. Mean values are less useful because outliers from garbage collection or I/O contention can skew them significantly.
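The loop described above can be sketched with JAXP (`javax.xml.transform`, shipped with the JDK). This is an illustrative harness, not the site's actual benchmark code; the class name, stylesheet, and iteration counts are stand-ins, and real runs use at least 100 measured iterations.

```java
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import java.io.*;
import java.util.Arrays;

public class BenchHarness {
    // Median of recorded timings; preferred over the mean because GC or
    // I/O outliers skew the mean (see discussion above).
    static double median(long[] samples) {
        long[] s = samples.clone();
        Arrays.sort(s);
        int n = s.length;
        return (n % 2 == 1) ? s[n / 2] : (s[n / 2 - 1] + s[n / 2]) / 2.0;
    }

    public static void main(String[] args) throws Exception {
        // Tiny illustrative stylesheet and input; real runs use representative documents.
        String xslt = "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
                    + "<xsl:template match='/'><out><xsl:value-of select='root'/></out></xsl:template>"
                    + "</xsl:stylesheet>";
        String xml = "<root>hello</root>";

        // Compile the stylesheet once; Templates is thread-safe and reusable.
        Templates templates = TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(xslt)));

        int warmup = 5, measured = 20;  // illustrative counts only
        long[] times = new long[measured];
        for (int i = 0; i < warmup + measured; i++) {
            Transformer t = templates.newTransformer();
            long start = System.nanoTime();
            t.transform(new StreamSource(new StringReader(xml)),
                        new StreamResult(new StringWriter()));
            long elapsed = System.nanoTime() - start;
            if (i >= warmup) times[i - warmup] = elapsed;  // discard warmup iterations
        }
        System.out.printf("median: %.0f ns%n", median(times));
    }
}
```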

We report timing for several dimensions:

  • Engine: Which XSLT processor was used (Saxon HE, Saxon EE, Xalan-J, XSLTC, browser XSLTProcessor).
  • Stylesheet complexity: Simple passthrough, moderate restructuring, or complex multi-template transforms.
  • Input size: Small (under 5 KB), medium (20-100 KB), and large (over 500 KB) documents.
  • Execution mode: Interpreted, compiled, or streaming where applicable.

The goal is not to crown a winner. It is to help practitioners make informed decisions about which engine fits their workload profile.

Caveats Around Environment, Input Shape, and I/O

Benchmarks run in a controlled environment. Production runs in the opposite of a controlled environment. The gap between the two is where most false confidence originates.

JVM version matters. The same Saxon version can show 15-20% performance differences between JVM implementations. We standardize on a specific JDK version for each benchmark run and report it.

Input shape matters more than input size. A 50 KB document with deep nesting and heavy use of namespaced attributes exercises different code paths than a 50 KB document with flat, repetitive structure. Template matching cost scales with tree depth and branching factor, not just byte count.
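To compare shape rather than size, a benchmark suite needs inputs that differ in depth and branching while staying roughly the same byte count. A minimal sketch of such a fixture generator (class and element names are hypothetical):

```java
public class ShapeGen {
    // Deeply nested document: <n><n>...<leaf/>...</n></n>
    // Exercises tree depth during template matching.
    static String deep(int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append("<n>");
        sb.append("<leaf/>");
        for (int i = 0; i < depth; i++) sb.append("</n>");
        return sb.toString();
    }

    // Flat document with many repeated siblings: <root><item/><item/>...</root>
    // Exercises branching factor instead of depth.
    static String flat(int count) {
        StringBuilder sb = new StringBuilder("<root>");
        for (int i = 0; i < count; i++) sb.append("<item/>");
        sb.append("</root>");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(deep(3));  // <n><n><n><leaf/></n></n></n>
        System.out.println(flat(2));  // <root><item/><item/></root>
    }
}
```

Scaling `depth` and `count` until both documents reach the same target size (say 50 KB) isolates shape as the only variable.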

I/O is often the bottleneck. In batch processing scenarios, disk read time, network latency, and output serialization frequently dominate total elapsed time. A 3x speedup in pure transformation delivers little end-to-end benefit if the transform accounts for only 10% of elapsed time.
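The 10% figure is an instance of Amdahl's law: if a fraction f of total time is accelerated by a factor s, the overall speedup is 1 / ((1 - f) + f / s). A quick check of the numbers above:

```java
public class AmdahlCheck {
    // End-to-end speedup when a fraction f of total time is accelerated by factor s.
    static double overallSpeedup(double f, double s) {
        return 1.0 / ((1.0 - f) + f / s);
    }

    public static void main(String[] args) {
        // Transform is 10% of elapsed time and gets 3x faster:
        System.out.printf("%.3f%n", overallSpeedup(0.10, 3.0));  // prints 1.071
    }
}
```

So a 3x transformation speedup yields only about a 7% end-to-end improvement in that scenario.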

Warmup is real. JVM-based processors show substantially different performance in the first few executions compared to steady state. XSLTC compiled transforms reduce this gap but do not eliminate it.

Methodology Note: All benchmarks reported on this site use a minimum of 100 iterations after warmup. We report median, P95, and P99 values. The specific JDK version, XSLT processor version, and test hardware are documented alongside each result set.
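The median, P95, and P99 values can all be computed with a nearest-rank percentile helper. A sketch (a real harness would likely use a statistics library instead):

```java
import java.util.Arrays;

public class Percentiles {
    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are <= it. For even sample counts this
    // returns the lower-middle value at p = 50 rather than an average.
    static long percentile(long[] samples, double p) {
        long[] s = samples.clone();
        Arrays.sort(s);
        int rank = (int) Math.ceil(p / 100.0 * s.length);  // 1-based nearest rank
        return s[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] times = {12, 10, 11, 50, 13, 11, 12, 14, 90, 12};  // hypothetical ms samples
        System.out.println(percentile(times, 50));  // prints 12
        System.out.println(percentile(times, 95));  // prints 90
        System.out.println(percentile(times, 99));  // prints 90
    }
}
```

Note how a single 90 ms outlier leaves the median untouched but dominates P95 and P99, which is exactly why all three are reported.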

Engine Comparison Overview

The following summary reflects general patterns observed across multiple benchmark runs. Individual results vary by workload.

Engine                 XSLT Version   Compiled Mode   Streaming   Typical Use Case
Saxon HE               3.0            No              No          General purpose, standards compliance
Saxon EE               3.0            Yes             Yes         High-volume production, large documents
Xalan-J                1.0            Via XSLTC       No          Legacy systems, embedded processing
XSLTC                  1.0            Yes             No          Batch processing, compiled speed
Browser XSLTProcessor  1.0            N/A             No          Client-side preview, lightweight transforms

Saxon HE is the default recommendation for most new projects. It supports XSLT 3.0, has excellent standards compliance, and performs well on medium-sized workloads. For high-volume batch processing where every millisecond matters, compiled execution via Saxon EE or XSLTC offers measurable gains.

The browser’s built-in XSLTProcessor is a viable option for client-side rendering of small documents. Performance degrades noticeably above 100 KB input sizes and with complex stylesheets.

Common Mistakes When Comparing Transformations

The most frequent benchmarking mistakes I encounter:

Comparing across XSLT versions. An XSLT 1.0 stylesheet does not exercise the same processor features as an XSLT 3.0 stylesheet. Performance comparisons between the two are meaningless.

Ignoring compilation time. XSLTC compilation is a one-time cost that amortizes over many executions. Reporting first-run time as representative of steady-state performance overstates the cost of compiled transforms.
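Amortization is what JAXP's `Templates` interface is designed for: compile once, execute many times, and time the two costs separately. A sketch (the identity-style stylesheet and run count are illustrative):

```java
import javax.xml.transform.*;
import javax.xml.transform.stream.*;
import java.io.*;

public class AmortizedCompile {
    // Minimal stand-in workload: copies the input document to the output.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='/'><xsl:copy-of select='.'/></xsl:template>"
      + "</xsl:stylesheet>";

    // One-time compilation cost; the resulting Templates is immutable and thread-safe.
    static Templates compile(String xslt) {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Per-execution cost: a fresh Transformer from the already-compiled Templates.
    static String apply(Templates compiled, String xml) {
        try {
            StringWriter out = new StringWriter();
            Transformer t = compiled.newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        Templates compiled = compile(XSLT);          // pay compilation once
        long compileNs = System.nanoTime() - t0;

        int runs = 50;
        long execNs = 0;
        for (int i = 0; i < runs; i++) {             // amortize over many executions
            long s = System.nanoTime();
            apply(compiled, "<doc><item>a</item></doc>");
            execNs += System.nanoTime() - s;
        }
        System.out.printf("compile: %d ns; mean exec: %d ns%n", compileNs, execNs / runs);
    }
}
```

Reporting compile time and steady-state execution time as separate numbers avoids the first-run distortion described above.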

Using trivially small inputs. A benchmark on a 200-byte XML document mostly measures startup overhead. Use input sizes representative of your actual workload.

Measuring wall clock time on shared infrastructure. Cloud VMs with noisy neighbors produce unreliable timing data. Use dedicated hardware or at minimum report the variance across runs.

Cherry-picking stylesheet features. A benchmark that tests only xsl:for-each tells you nothing about xsl:apply-templates performance. Design benchmarks to exercise the patterns your production stylesheets actually use.

Pitfall: Do not use benchmarks to choose an XSLT engine without first profiling your actual workload. The engine that wins a generic benchmark may lose on your specific document structure and template patterns.

Internal Benchmark Resources

For detailed data tables and per-engine analysis:

  • Performance Results contains the raw timing data and comparative analysis across processors.
  • The XSLTC Story explains the compiler architecture that enables compiled transformation performance.
  • Gregor XSLT documents the transformation tool that originated several of the benchmark workloads used on this site.