The standard rendering workflow at Ambrosoft moves XML documents through a predictable sequence of stages. Each stage has a clear input, a defined operation, and a verifiable output. This note describes the full pipeline.
Stage 1: Source Ingestion
The workflow begins with an XML source document. Before any processing, validate the document against its schema or DTD. Catching structural errors at the ingestion stage prevents cascading failures in downstream stages.
Validation should check:
- Well-formedness (mandatory)
- Schema conformance (recommended)
- Encoding declaration presence
- Namespace declarations for all prefixed elements
If the source document fails validation, log the specific errors and reject the document. Do not attempt partial processing of invalid input.
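As a minimal sketch of the ingestion checks above, the following uses only the Python standard library. The function name check_source is hypothetical, and the encoding check assumes an ASCII-compatible encoding such as UTF-8. Schema conformance is left out because the standard library has no schema validator; a third-party validator would be needed for that step.

```python
import re
import xml.etree.ElementTree as ET

# An XML declaration (optionally after a UTF-8 byte order mark) that
# carries an explicit encoding pseudo-attribute.
_ENCODING_DECL = re.compile(
    rb'^(?:\xef\xbb\xbf)?<\?xml[^>]*\bencoding\s*=\s*["\'][A-Za-z][A-Za-z0-9._-]*["\']'
)


def check_source(raw: bytes) -> list[str]:
    """Return a list of ingestion errors; an empty list means the checks passed."""
    errors = []
    if not _ENCODING_DECL.match(raw):
        errors.append("missing encoding declaration")
    try:
        # The parser is namespace-aware, so a prefix without a matching
        # xmlns declaration is reported as a parse error as well.
        ET.fromstring(raw)
    except ET.ParseError as exc:
        errors.append(f"not well-formed: {exc}")
    return errors
```

A caller would reject the document and log the returned messages whenever the list is non-empty, rather than passing the document to later stages.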
Stage 2: Stylesheet Selection
Select the XSLT stylesheet based on the document type. The selection criteria depend on your architecture:
- Document element name. Match the root element to a stylesheet mapping table.
- Processing instruction. Some XML documents include an xml-stylesheet processing instruction. This is useful for browser-rendered XML but should not be the sole mechanism for server-side routing.
- Schema namespace. For namespace-qualified documents like UBL, the namespace URI identifies the document version and type.
Maintain a registry that maps document types to stylesheet paths. This avoids hardcoding stylesheet references in processing code.
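One way such a registry could look is a table keyed on the root element's namespace URI and local name. This is a sketch only: the stylesheet paths, the registry contents, and the function name select_stylesheet are illustrative assumptions, not part of the actual pipeline.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: (namespace URI, root local name) -> stylesheet path.
STYLESHEET_REGISTRY = {
    ("urn:oasis:names:specification:ubl:schema:xsd:Invoice-2", "Invoice"):
        "xslt/ubl-invoice.xsl",
    ("", "article"): "xslt/article.xsl",
}


def select_stylesheet(raw: bytes) -> str:
    """Look up the stylesheet path for a document by its root element."""
    root = ET.fromstring(raw)
    # ElementTree spells qualified names as "{namespace}local".
    if root.tag.startswith("{"):
        namespace, local = root.tag[1:].split("}", 1)
    else:
        namespace, local = "", root.tag
    try:
        return STYLESHEET_REGISTRY[(namespace, local)]
    except KeyError:
        raise LookupError(
            f"no stylesheet registered for {{{namespace}}}{local}"
        ) from None
```

Keeping the mapping in data rather than in branching code means a new document type is added by registering one entry, and an unregistered type fails loudly instead of falling through to a default stylesheet.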
Stage 3: Parameter Assembly
Most stylesheets accept external parameters for configuration. Assemble parameter values before invoking the transformation. Common parameters include:
- Output format (HTML, PDF intermediate, plain text)
- Language or locale identifier
- Date formatting pattern
- Base URL for relative links in output
- Debug flag for verbose output
Validate parameter values against expected types and ranges. A malformed parameter can produce subtly incorrect output that passes automated checks but fails human review.
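Parameter validation might be sketched as follows. The parameter names (output-format, locale, base-url, debug), the allowed format values, and the locale pattern are assumptions chosen for illustration; a real stylesheet defines its own parameter names and ranges.

```python
import re
from urllib.parse import urlsplit

# Assumed set of output formats; adjust to what the stylesheet accepts.
ALLOWED_FORMATS = {"html", "xsl-fo", "text"}
# Assumed locale shape: language code with an optional region, e.g. "en-GB".
LOCALE_PATTERN = re.compile(r"[a-z]{2,3}(-[A-Z]{2})?")


def validate_params(params: dict) -> dict:
    """Return params unchanged if every check passes, else raise ValueError."""
    problems = []
    if params.get("output-format") not in ALLOWED_FORMATS:
        problems.append(
            "output-format must be one of " + ", ".join(sorted(ALLOWED_FORMATS))
        )
    locale = params.get("locale")
    if not isinstance(locale, str) or not LOCALE_PATTERN.fullmatch(locale):
        problems.append("locale must look like 'en' or 'en-GB'")
    base = urlsplit(str(params.get("base-url", "")))
    if base.scheme not in ("http", "https") or not base.netloc:
        problems.append("base-url must be an absolute http(s) URL")
    if not isinstance(params.get("debug", False), bool):
        problems.append("debug must be a boolean")
    if problems:
        # Report every problem at once so the operator fixes them in one pass.
        raise ValueError("; ".join(problems))
    return params
```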
Stage 4: Transformation
Execute the transformation using your chosen XSLT processor. The processor determines which XSLT version is available, so each stylesheet must target a version its processor supports:
| Processor | XSLT Support | Typical Use |
|---|---|---|
| Saxon HE | 3.0 | Server-side batch processing |
| Xalan-J | 1.0 | Legacy Java applications |
| XSLTC | 1.0 | Compiled, high throughput |
| Browser | 1.0 | Client-side preview |
Capture transformation time for monitoring. A sudden increase in transformation duration often indicates a source document with unexpected structure (deeply nested elements, unusually large text nodes) rather than a processor issue.
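Timing capture can be kept independent of the processor by wrapping whatever callable performs the transformation. In this sketch, timed_transform, is_slow, and the factor of 3 over a baseline are hypothetical choices, and the transform argument stands in for a call into the real processor.

```python
import time
from typing import Callable


def timed_transform(
    transform: Callable[[bytes], bytes], source: bytes
) -> tuple[bytes, float]:
    """Run transform(source) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = transform(source)
    elapsed = time.perf_counter() - start
    return result, elapsed


def is_slow(elapsed: float, baseline: float, factor: float = 3.0) -> bool:
    """Flag a run that took more than factor times the usual duration."""
    return elapsed > baseline * factor
```

When is_slow fires, the source document's structure is the first thing to inspect, in line with the observation above.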
Stage 5: Output Validation
After transformation, validate the output:
- HTML output. Parse the result and check for well-formedness. Verify that required elements (title, meta description, canonical link) are present.
- UBL rendering. Verify that required fields (invoice number, line items, tax totals) appear in the output.
- Text output. Check encoding and line endings.
Automated output validation should compare against a known-good reference for regression detection. Maintain a set of reference documents and their expected outputs.
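The HTML checks and the reference comparison could be sketched with the standard library's HTML parser. The names below are illustrative, and the scanner only confirms that the required elements are present; it tolerates malformed markup, so a separate well-formedness check is still needed.

```python
from html.parser import HTMLParser

REQUIRED = {"title", "meta description", "canonical link"}


class RequiredElementScanner(HTMLParser):
    """Records which of the required head elements appear in the output."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self.found.add("title")
        elif tag == "meta" and attributes.get("name") == "description":
            self.found.add("meta description")
        elif tag == "link" and attributes.get("rel") == "canonical":
            self.found.add("canonical link")


def missing_elements(html: str) -> set[str]:
    """Return the required elements that the HTML output lacks."""
    scanner = RequiredElementScanner()
    scanner.feed(html)
    scanner.close()
    return REQUIRED - scanner.found


def matches_reference(actual: str, expected: str) -> bool:
    """Compare against a known-good output, ignoring CRLF/LF differences."""
    return actual.replace("\r\n", "\n") == expected.replace("\r\n", "\n")
```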
Stage 6: Post-Processing
Some outputs need post-processing:
- HTML. Minification, asset linking, Pagefind indexing.
- PDF. If the transformation produces an intermediate format (XSL-FO), run it through an FO formatter such as Apache FOP.
- Archive. Compress and timestamp the output for audit trails.
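For the archive step, a minimal sketch might gzip the output under a name that carries a UTC timestamp. The function name, the file naming scheme, and the directory layout are assumptions made for illustration.

```python
import gzip
from datetime import datetime, timezone
from pathlib import Path


def archive_output(data: bytes, doc_id: str, archive_dir: Path) -> Path:
    """Write data as <doc_id>-<UTC timestamp>.gz and return the file path."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{doc_id}-{stamp}.gz"
    with gzip.open(target, "wb") as handle:
        handle.write(data)
    return target
```

Two archives of the same document within one second would collide under this naming scheme, so a production version would likely add a sequence number or content hash.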
Error Handling
Each stage should produce structured error output with:
- Stage name where the error occurred
- Input document identifier
- Specific error message
- Suggested remediation
Avoid generic error messages like “transformation failed.” The error should tell the operator exactly which stage failed and why, so they can fix the problem without re-running the entire pipeline with debug logging enabled.
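The four fields listed above map naturally onto a small record type. The sketch below is one possible shape; the class name StageError, the field names, and the JSON serialization are assumptions rather than an existing interface.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StageError:
    """Structured error emitted by a pipeline stage."""

    stage: str          # stage name where the error occurred
    document_id: str    # input document identifier
    message: str        # specific error message
    remediation: str    # suggested remediation

    def to_json(self) -> str:
        """Serialize for logs; sorted keys keep the output stable."""
        return json.dumps(asdict(self), sort_keys=True)


error = StageError(
    stage="parameter-assembly",
    document_id="inv-001",
    message="locale 'english' does not match the expected pattern",
    remediation="pass a language code such as 'en' or 'en-GB'",
)
print(error.to_json())
```

Because every stage emits the same record shape, an operator or a monitoring job can filter by stage and document without parsing free-form messages.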
For processor-specific benchmarks, see the benchmarks section. For debugging transformation issues, see the XSLT debugging guide.