performance: improve performance for large template files #630
+200
−133
Description
This PR improves rendering performance for large docxtpl templates by removing a major XML manipulation bottleneck and reducing repeated per-render overhead in XML parsing, regex compilation, and Jinja2 environment setup.
The primary change avoids replacing the entire `<w:body>` element during rendering and instead mutates its children in place. On very large documents, this reduces render time from hours to minutes.
Problem
Rendering large templates (especially those with many tables and repeated data blocks) is extremely slow. Profiling showed that the dominant cost came from replacing the `<w:body>` element via `root.replace(body, new_body)` during render.
Replacing a large XML subtree triggers expensive libxml2 operations.
For documents with millions of nodes, this results in effectively O(size of subtree) work and becomes the main performance bottleneck.
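
As a rough illustration of the alternative to `root.replace(body, new_body)`, the sketch below keeps the existing `<w:body>` node attached and only swaps its children. It uses plain lxml; the helper name `replace_body_children` and the tiny sample documents are hypothetical and are not taken from this PR's diff.

```python
from lxml import etree

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def replace_body_children(root, new_body):
    """Move new_body's children into root's existing <w:body>, in order.

    Hypothetical helper for illustration; not the PR's actual map_tree().
    """
    body = root.find(f"{{{W_NS}}}body")
    if body is None:
        raise ValueError("document has no <w:body> element")
    # Drop the old children but leave the <w:body> node itself in the tree.
    for child in list(body):
        body.remove(child)
    # lxml moves an element when it is appended to a new parent,
    # so iterate over a snapshot of the rendered children.
    for child in list(new_body):
        body.append(child)
    return body


old_doc = etree.fromstring(
    f'<w:document xmlns:w="{W_NS}"><w:body><w:p/></w:body></w:document>'
)
rendered_doc = etree.fromstring(
    f'<w:document xmlns:w="{W_NS}">'
    "<w:body><w:tbl/><w:p/><w:sectPr/></w:body></w:document>"
)
replace_body_children(old_doc, rendered_doc[0])
child_names = [etree.QName(c).localname for c in old_doc[0]]
```

After the call, `child_names` holds the rendered children in their original order, and the source body in `rendered_doc` is left empty because its children were moved rather than copied.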
Additional overhead came from:
- `Environment` objects constructed on every render path
Current behavior
- The `<w:body>` element is replaced entirely using `root.replace(...)`
- `fix_tables()` uses `etree.fromstring()`, producing raw lxml elements
- Regex patterns in `patch_xml()` and `resolve_listing()` are compiled on every render
- `Environment` objects are repeatedly constructed
On large real-world templates, this can result in render times of multiple hours.
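
To make the regex point concrete, here is a minimal sketch of moving a pattern from a per-call string to a module-level compiled object. The pattern and both function names are hypothetical; the actual patterns used in `patch_xml()` and `resolve_listing()` are not shown in this description.

```python
import re

# Hypothetical pattern, for illustration only. Compiled once at import time.
_VAR_RE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_vars_per_call(xml):
    # Passing a string pattern makes `re` look it up in its internal cache
    # on every call, and recompile it if the entry was evicted.
    return re.findall(r"\{\{\s*(\w+)\s*\}\}", xml)


def find_vars_precompiled(xml):
    # Reuses the already compiled pattern object directly.
    return _VAR_RE.findall(xml)


sample = "<w:t>{{ name }} and {{age}}</w:t>"
names = find_vars_precompiled(sample)
```

Both functions return the same matches; the second only skips the per-call pattern lookup, which matters when the function runs many times per render.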
Expected behavior
- Rendering mutates the existing `<w:body>` element rather than replacing it
Fix
1. Mutate `<w:body>` children instead of replacing the element
- `map_tree()` now removes existing `<w:body>` children and appends rendered children in order
2. Use `parse_xml()` in `fix_tables()` with safe fallback
- Uses `docx.opc.oxml.parse_xml()` to ensure OXML element subclasses
- Falls back to `etree.fromstring(..., recover=True)` to preserve robustness for malformed XML
3. Ensure new table grid columns use OXML elements
- `<w:gridCol>` elements are created using `OxmlElement` and `qn(...)`
4. Precompile commonly used regex patterns
- Regex patterns in `patch_xml()` and `resolve_listing()` are compiled once and reused
5. Cache Jinja2 environments
- `Environment` instances are cached (autoescape and non-autoescape)
6. Header/footer fast path
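
For the Jinja2 environment caching in item 5, one possible shape is a small memoized factory keyed on the autoescape flag. This is only a sketch under that assumption: the function name `get_environment` is hypothetical, and the PR may store its cached instances differently.

```python
from functools import lru_cache

from jinja2 import Environment


@lru_cache(maxsize=None)
def get_environment(autoescape: bool) -> Environment:
    """Return one shared Environment per autoescape setting (hypothetical helper)."""
    return Environment(autoescape=autoescape)


# Repeated calls reuse the same object instead of building a new Environment.
plain = get_environment(False).from_string("Hello {{ name }}").render(name="<b>W</b>")
escaped = get_environment(True).from_string("Hello {{ name }}").render(name="<b>W</b>")
```

A shared environment is only safe if callers do not mutate it per render (for example by registering filters on it); a caller who needs custom settings would still have to build a separate `Environment`.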
Performance impact
On a large real-world template:
The majority of the improvement comes from avoiding `<w:body>` replacement.
Backward compatibility
Notes