
Get started

wvbe edited this page Oct 30, 2022 · 3 revisions

You have two options, with a lot of flexibility between them:

  • Use docxml components directly
  • Use XML rendering rules to compose components

Components

docxml comes with a set of components that, like web components, can be composed into an OOXML document. The components provide an API with which you can set their font size, spacing, color and so on.

/** @jsx Docx.jsx */
import Docx, { Paragraph } from 'https://deno.land/x/docxml/mod.ts';

await Docx.fromJsx(
        <Paragraph>This is the simplest document you could make.</Paragraph>
    )
    .toFile('example-1.docx');

In the code example above a few things happened:

  • We specified a JSX pragma using the /** @jsx Docx.jsx */ comment, so that we could use JSX in the rest of the code.
  • We created a new DOCX using Docx.fromJsx() and one of the most common DOCX components, <Paragraph>.
  • We wrote file example-1.docx to disk using the .toFile() method.

The components provided by docxml are all exported from mod.ts. Please see the API documentation generated for this module's exports for the full list.

Using JSX

The JSX pragma provided by docxml is exported as a function from mod.ts, as a static member of the default export, and as an instance member of that class:

import Docx, { jsx } from 'https://deno.land/x/docxml/mod.ts';

const doc = Docx.fromNothing();

// jsx === Docx.jsx === doc.jsx

It is recommended to use the instance member (doc.jsx in the example above) if you have an instance of Docx -- in the future it may contain more context than the exported function or static member.

The JSX function provides a few conveniences:

  • Supports the JSX syntax
  • Attempts some corrections to the component nesting if you've made a boo-boo:
    • Wraps strings in <Text> if you did not already
    • Splits or unwraps contents if they were illegally nested
  • Awaits asynchronous child components
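As a rough illustration of the "wrap strings in <Text>" correction, here is a minimal sketch. The ToyText type and the wrapStrings helper are hypothetical stand-ins written for this page. They are not docxml's classes or its actual implementation, which also deals with splitting, unwrapping and asynchronous children.

```typescript
// Toy stand-in for a text component. This is NOT docxml's <Text> class.
type ToyText = { kind: 'text'; value: string };

// Hypothetical helper: wrap every bare string child in a ToyText and
// leave children that are already a ToyText untouched.
function wrapStrings(children: Array<string | ToyText>): ToyText[] {
    return children.map((child): ToyText =>
        typeof child === 'string' ? { kind: 'text', value: child } : child
    );
}

// Two bare strings and one pre-wrapped child all come out as ToyText.
const wrapped = wrapStrings(['This is ', { kind: 'text', value: 'paragraph' }, ' text']);
console.log(wrapped);
```

The point of the sketch is only that a caller may mix bare strings and text components freely, and the normalization step makes the result uniform.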

Deno supports JSX out of the box. Make sure to name your file with the .tsx extension and put the @jsx pragma at the top of your file:

/** @jsx Docx.jsx */

Not using JSX

If you prefer not to use JSX, simply create new instances of the component classes. Every component follows the same pattern when it comes to constructor arguments:

new Paragraph(
    props: ParagraphProps,
    ...children: ParagraphChild[]
)

For example:

new Paragraph(
    {},
    new Text(
        {},
        'This is '
    ),
    new Text(
        { isBold: true },
        'paragraph'
    ),
    new Text(
        {},
        ' text'
    )
)

Be mindful that this API is less forgiving than JSX, because it does not attempt fixes in case you're nesting components in an invalid way. However, if you're using TypeScript your IDE should warn you in case you're putting a component where it doesn't belong.

Rendering XML

The simplest example of building an XML-to-DOCX transformation is as follows:

/** @jsx Docx.jsx */
import Docx, { Paragraph, Text } from 'https://deno.land/x/docxml/mod.ts';

await Docx.fromNothing()
    .withXmlRule('self::node()', ({ traverse }) => traverse('./*'))
    .withXmlRule('self::text()', ({ node }) => <Text>{node.nodeValue}</Text>)
    .withXmlRule('self::p', ({ traverse }) => <Paragraph>{traverse()}</Paragraph>)
    .withXmlRule('self::strong', ({ traverse }) => <Text isBold>{traverse()}</Text>)
    .withXml(
        `<html>
            <body>
                <p>This is a very simple <strong>XML transformation</strong>.</p>
            </body>
        </html>`,
        {},
    )
    .toFile('example-2.docx');

In the code example above the following happened:

  • We specified a JSX pragma using the /** @jsx Docx.jsx */ comment, so that we could use JSX in the rest of the code.
  • We created an empty DOCX document using Docx.fromNothing() and proceeded to configure it immediately with a few chaining methods.
  • We specified the "template" for several XML nodes and elements using the .withXmlRule method. Using this we mapped a default behavior for "any node", we told the system to always return the text of any text nodes, and mapped <p> and <strong> XML elements to the <Paragraph> and <Text> docxml components. This is quite similar to how XSLT templates work.
  • We loaded some XML using the .withXml method. It's hardcoded here but could obviously come from anywhere.
  • We wrote file example-2.docx to disk using the .toFile() method.

XPath

.withXmlRule uses XPath 3.1 to select XML elements and to traverse further down the XML tree. For example, use the following selectors to select content meeting various conditions:

  • self::p
    Any <p> element
  • self::a[@title]
    Any <a> element that has a title attribute
  • self::section[@type="chapter"]
    Any <section> element where the type attribute is set to chapter
  • self::figure[child::img and not(descendant::fn)]
    Any <figure> element that directly contains an <img> and does not contain an <fn> anywhere
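To make the difference between child:: and descendant:: in the last selector concrete, the following sketch spells out the same condition by hand on a toy tree. ToyElement, toyEl, hasChild, hasDescendant and matchesFigureRule are hypothetical names for illustration only. docxml evaluates real XPath, so you would not write code like this yourself.

```typescript
// Toy element type; a simplified stand-in for a real XML element.
type ToyElement = { name: string; children: ToyElement[] };

// Small hypothetical helper to build toy elements.
const toyEl = (name: string, children: ToyElement[] = []): ToyElement => ({ name, children });

// child::name, only direct children count.
function hasChild(element: ToyElement, name: string): boolean {
    return element.children.some((child) => child.name === name);
}

// descendant::name, any depth below the element counts.
function hasDescendant(element: ToyElement, name: string): boolean {
    return element.children.some(
        (child) => child.name === name || hasDescendant(child, name)
    );
}

// Hand-written equivalent of self::figure[child::img and not(descendant::fn)].
function matchesFigureRule(element: ToyElement): boolean {
    return (
        element.name === 'figure' &&
        hasChild(element, 'img') &&
        !hasDescendant(element, 'fn')
    );
}

const plainFigure = toyEl('figure', [toyEl('img')]);
const figureWithFootnote = toyEl('figure', [toyEl('img'), toyEl('caption', [toyEl('fn')])]);
const figureWithNestedImg = toyEl('figure', [toyEl('div', [toyEl('img')])]);
console.log(
    matchesFigureRule(plainFigure),
    matchesFigureRule(figureWithFootnote),
    matchesFigureRule(figureWithNestedImg)
);
```

Only the first figure matches: the second contains an <fn> deeper in the tree, and in the third the <img> is not a direct child.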

Template context

Every template used for an element (the callback argument to .withXmlRule) is given some helper context to make rendering child content or doing XML lookups easier.

For example, you may sometimes want to skip XML content in a section that is meant for a different audience. Using XPath you can tell the system to "render all child elements, except any element with the audience="advanced" attribute":

docx.withXmlRule('self::section', ({ traverse }) => (
    <Section>{traverse('./*[not(@audience="advanced")]')}</Section>
));

The entire list of context that is passed into your template function is as follows:

({ traverse, node, document }) => ()
  • traverse(xPathQuery?: string) => RuleResult
    A function with which you can select and render the next part of the document wherever you like.
  • node: Node
    The XML node that matched the XPath selector for this rule. Often you are dealing with elements, but rendering rules can apply to text nodes, document nodes, XML comments, processing instructions and so on too!
  • document: OfficeDocument
    The instance of the OOXML document that is being worked on. This instance correlates with word/document.xml (in the eventual OOXML archive), and via its relationships you can reach helper classes to deal with (custom) styles, comments, change tracking and more.
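The way traverse and node work together can be sketched with a toy renderer. Everything below (ToyXml, ToyContext, ToyTemplate, toyRender) is hypothetical. It looks templates up by element name instead of XPath, renders to plain strings instead of components, and says nothing about how docxml actually resolves rules.

```typescript
// Toy XML tree: a string is a text node, an object is an element.
type ToyXml = string | { name: string; children: ToyXml[] };

// The context handed to each toy template, mirroring the { traverse, node } shape.
type ToyContext = { node: ToyXml; traverse: () => string };
type ToyTemplate = (context: ToyContext) => string;

function toyRender(node: ToyXml, templates: Map<string, ToyTemplate>): string {
    // Text nodes render as their own value.
    if (typeof node === 'string') {
        return node;
    }
    const children = node.children;
    // traverse() renders the children of the current node, wherever the template calls it.
    const traverse = (): string =>
        children.map((child) => toyRender(child, templates)).join('');
    // Elements without a template simply pass through to their children.
    const template = templates.get(node.name);
    return template ? template({ node, traverse }) : traverse();
}

const toyTemplates = new Map<string, ToyTemplate>([
    ['p', ({ traverse }) => `[P:${traverse()}]`],
    ['strong', ({ traverse }) => `*${traverse()}*`],
]);

const toyTree: ToyXml = {
    name: 'body',
    children: [
        { name: 'p', children: ['A ', { name: 'strong', children: ['bold'] }, ' word.'] },
    ],
};

const toyOutput = toyRender(toyTree, toyTemplates);
console.log(toyOutput);
```

Each template decides where its child content lands by choosing where to call traverse(), which is the same idea as placing {traverse()} inside a component in the rules above.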

You can pass in additional context if you like:

const docx = new Docx<{ publicationTime: Date }>();

docx.withXmlRule('self::meta', ({ publicationTime }) => (
    <Paragraph>{publicationTime.toString()}</Paragraph>
));

docx.withXml('<test><meta /></test>', {
    publicationTime: new Date()
});
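The typing idea behind this can be sketched without docxml: a generic parameter describes the extra context, and each template receives the built-in context merged with it. ToyDocx, ToyBaseContext, withRule and render are hypothetical names for this sketch and do not describe docxml's internals.

```typescript
// Built-in part of the context; a simplified stand-in.
type ToyBaseContext = { nodeName: string };

class ToyDocx<Extra extends object> {
    private readonly rules = new Map<string, (context: ToyBaseContext & Extra) => string>();

    // Register a template for a node name; returns this so calls can be chained.
    withRule(nodeName: string, template: (context: ToyBaseContext & Extra) => string): this {
        this.rules.set(nodeName, template);
        return this;
    }

    // Merge the caller-supplied extra context into what the template receives.
    render(nodeName: string, extra: Extra): string {
        const template = this.rules.get(nodeName);
        return template ? template({ nodeName, ...extra }) : '';
    }
}

const toyDoc = new ToyDocx<{ publicationTime: Date }>().withRule(
    'meta',
    // publicationTime is typed as Date here because of the generic parameter.
    ({ publicationTime }) => publicationTime.toISOString()
);

const renderedMeta = toyDoc.render('meta', { publicationTime: new Date(0) });
console.log(renderedMeta);
```

Because the extra context is part of the type, a template that destructures a property you never declared fails at compile time rather than at render time.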
