Copy a tree

A key benefit of building a site as a tree is that we can seamlessly move between browsing that tree and rendering that tree as static content.

Build pipeline

When working with trees, the “build” pipeline can be conceptually simple:

Real markdown files → Virtual HTML files → Real HTML files
  1. We define a tree of the source markdown content — as an object, as real files, or as values generated by a function.
  2. We use one or more transforms to create a virtual tree of the final content that we want — here, HTML. We can directly serve and browse this virtual tree during development.
  3. We copy the virtual tree of HTML pages into some persistent form such as real .html files. We can then deploy these files.
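The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: objectTree, htmlTree, and copyInto are hypothetical helpers, toHtml is a placeholder rather than a real markdown translator, and plain objects stand in for the real files at both ends.

```javascript
// Step 1: a source tree. A plain object stands in for real markdown files.
function objectTree(obj) {
  return {
    async get(key) {
      return obj[key];
    },
    async keys() {
      return Object.keys(obj);
    },
  };
}

// Step 2: a virtual tree whose keys and values are computed on demand.
// toHtml is a placeholder, not a real markdown translator.
function htmlTree(tree) {
  const toHtml = (text) => `<p>${text}</p>\n`;
  return {
    async get(key) {
      const text = await tree.get(key.replace(/\.html$/, ".md"));
      return text === undefined ? undefined : toHtml(text);
    },
    async keys() {
      const keys = await tree.keys();
      return keys.map((key) => key.replace(/\.md$/, ".html"));
    },
  };
}

// Step 3: copy the virtual tree into a persistent form. A plain object
// stands in for real .html files here.
async function copyInto(result, tree) {
  for (const key of await tree.keys()) {
    result[key] = await tree.get(key);
  }
  return result;
}

async function demo() {
  const markdown = objectTree({
    "Alice.md": "Hello, Alice.",
    "Bob.md": "Hello, Bob.",
  });
  return copyInto({}, htmlTree(markdown));
}

demo().then((result) => console.log(result));
```

Nothing is computed in step 2 until step 3 asks for a value, which is what lets the same virtual tree be either browsed or copied.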

For the last step, we could write the files out directly using a file system API. But we’ve gained a lot by abstracting away the file system read operations; we can make similar gains by abstracting away the file system write operations too.

Setting tree values

Let’s extend our AsyncTree interface with an optional method set(key, value). This updates the tree so that getting the corresponding key will now return the new value. We can support deleting a key/value from the tree by declaring that, if value is undefined, the key and its corresponding value will be removed from the tree.

This is straightforward for our object-based tree:

  /* In src/set/ObjectTree.js */

  async set(key, value) {
    if (value === undefined) {
      delete this.obj[key];
    } else {
      this.obj[key] = value;
    }
  }
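As a quick check of that contract, here is a minimal sketch of the method in use. It assumes a stripped-down ObjectTree with only get, keys, and set; the tutorial's real class may contain more, and the sample keys and values are made up for illustration.

```javascript
// Stripped-down stand-in for the tutorial's ObjectTree (an assumption:
// the real class in src/set/ObjectTree.js may include more than this).
class ObjectTree {
  constructor(obj) {
    this.obj = obj;
  }

  async get(key) {
    return this.obj[key];
  }

  async keys() {
    return Object.keys(this.obj);
  }

  async set(key, value) {
    if (value === undefined) {
      delete this.obj[key];
    } else {
      this.obj[key] = value;
    }
  }
}

// Hypothetical sample data, for illustration only.
async function demo() {
  const tree = new ObjectTree({ Alice: "Hello, **Alice**." });

  await tree.set("Bob", "Hello, **Bob**."); // add a key
  await tree.set("Alice", undefined); // delete a key

  return tree.keys();
}

demo().then((keys) => console.log(keys)); // logs [ 'Bob' ]
```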

And a fair bit of work for our file system-based tree:

  /* In src/set/FileTree.js */

  async set(key, value) {
    // Where are we going to write this value?
    const destPath = path.resolve(this.dirname, key ?? "");

    if (value === undefined) {
      // Delete the file or directory.
      let stats;
      try {
        stats = await stat(destPath);
      } catch (/** @type {any} */ error) {
        if (error.code === "ENOENT" /* File not found */) {
          return;
        }
        throw error;
      }
      if (stats.isDirectory()) {
        // Delete directory.
        await fs.rm(destPath, { recursive: true });
      } else {
        // Delete file.
        await fs.unlink(destPath);
      }
      // Nothing to write after a deletion.
      return;
    }

    const isAsyncDictionary =
      typeof value?.get === "function" &&
      typeof value?.keys === "function";

    if (isAsyncDictionary) {
      // Write out the contents of the value tree to the destination.
      const destTree = key === undefined ? this : new FileTree(destPath);
      for (const subKey of await value.keys()) {
        const subValue = await value.get(subKey);
        await destTree.set(subKey, subValue);
      }
    } else {
      // Ensure this directory exists.
      await fs.mkdir(this.dirname, { recursive: true });
      // Write out the value as the contents of a file.
      await fs.writeFile(destPath, value);
    }
  }

Half the work here involves handling the case where we want to delete a file or subfolder by passing in an undefined value.

The other complex case we handle is when the value itself is an async tree node, and we have to recursively write out that value as a set of files or folders. We didn’t have to handle that case specially for ObjectTree, as it’s perfectly fine for an ObjectTree instance to have a value which is an async tree.

The file system is not so flexible. The good news is that all this complexity can live inside of the FileTree class — from the outside, we can just call set and trust that the file system will be updated as expected.

This leads to another way to think about async trees: async trees are software adapters or drivers for any real or virtual hierarchical storage.
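To make the adapter idea concrete, here is a sketch of one more adapter, this time over a JavaScript Map. MapTree and rename are hypothetical names that do not appear in the tutorial's source; the point is only that calling code written against get, keys, and set does not need to know what storage sits behind the tree.

```javascript
// Hypothetical adapter: exposes a JavaScript Map through the same
// get/keys/set methods used by the tutorial's trees.
class MapTree {
  constructor(map = new Map()) {
    this.map = map;
  }

  async get(key) {
    return this.map.get(key);
  }

  async keys() {
    return [...this.map.keys()];
  }

  async set(key, value) {
    if (value === undefined) {
      this.map.delete(key); // undefined means "remove this key"
    } else {
      this.map.set(key, value);
    }
  }
}

// Calling code relies only on the interface, not on the backing store.
async function rename(tree, oldKey, newKey) {
  const value = await tree.get(oldKey);
  await tree.set(newKey, value);
  await tree.set(oldKey, undefined);
}

async function demo() {
  const tree = new MapTree(new Map([["draft.html", "<p>Hello</p>"]]));
  await rename(tree, "draft.html", "index.html");
  return tree.keys();
}

demo().then((keys) => console.log(keys)); // logs [ 'index.html' ]
```

The same rename function should work unchanged with any other tree that implements these three methods.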

setDeep

We can now introduce a new helper function, setDeep(target, source), which handles the general case of writing values from the source tree into the target tree.

/* src/set/setDeep.js */

// Apply all updates from the source to the target.
export default async function setDeep(target, source) {
  for (const key of await source.keys()) {
    const sourceValue = await source.get(key);
    const sourceIsAsyncDictionary =
      typeof sourceValue?.get === "function" &&
      typeof sourceValue?.keys === "function";

    if (sourceIsAsyncDictionary) {
      const targetValue = await target.get(key);
      const targetIsAsyncDictionary =
        typeof targetValue?.get === "function" &&
        typeof targetValue?.keys === "function";

      if (targetIsAsyncDictionary) {
        // Both source and target are async dictionaries; recurse.
        await setDeep(targetValue, sourceValue);
        continue;
      }
    }

    // Copy the value from the source to the target.
    await target.set(key, sourceValue);
  }
}
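One consequence worth seeing in action: because setDeep recurses when both sides hold a subtree, it merges into the target rather than replacing whole branches. The sketch below runs the same logic against in-memory trees. It assumes a minimal ObjectTree whose get wraps plain sub-objects in a new ObjectTree (the tutorial's class may differ), and the sample data is invented.

```javascript
// Minimal in-memory tree. Assumption: get() wraps plain sub-objects in a
// new ObjectTree so that nested levels are also trees.
class ObjectTree {
  constructor(obj) {
    this.obj = obj;
  }

  async get(key) {
    const value = this.obj[key];
    const isPlainObject =
      value !== null &&
      typeof value === "object" &&
      Object.getPrototypeOf(value) === Object.prototype;
    return isPlainObject ? new ObjectTree(value) : value;
  }

  async keys() {
    return Object.keys(this.obj);
  }

  async set(key, value) {
    if (value === undefined) {
      delete this.obj[key];
    } else {
      this.obj[key] = value;
    }
  }
}

const isTree = (value) =>
  typeof value?.get === "function" && typeof value?.keys === "function";

// Same logic as setDeep above, repeated so this sketch runs on its own.
async function setDeep(target, source) {
  for (const key of await source.keys()) {
    const sourceValue = await source.get(key);
    if (isTree(sourceValue)) {
      const targetValue = await target.get(key);
      if (isTree(targetValue)) {
        await setDeep(targetValue, sourceValue);
        continue;
      }
    }
    await target.set(key, sourceValue);
  }
}

async function demo() {
  const targetObj = { "index.html": "old", more: { "Eve.html": "old Eve" } };
  const source = new ObjectTree({
    "index.html": "new",
    more: { "David.html": "David" },
  });
  await setDeep(new ObjectTree(targetObj), source);
  return targetObj;
}

demo().then((result) => console.log(JSON.stringify(result)));
// logs {"index.html":"new","more":{"Eve.html":"old Eve","David.html":"David"}}
```

Note that Eve.html survives: keys that exist only in the target are left alone, so a build into a folder that already has content will not remove stale files on its own.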

Build real files from virtual content

We’re now ready to build real static files for our site by copying the virtual tree of HTML pages into a real file system folder. All we need to do is wrap a real folder called distFiles in a FileTree:

/* src/set/distFiles.js */

import path from "node:path";
import { fileURLToPath } from "node:url";
import FileTree from "./FileTree.js";

const moduleFolder = path.dirname(fileURLToPath(import.meta.url));
const dirname = path.resolve(moduleFolder, "dist");

export default new FileTree(dirname);

And then create a build.js utility that copies the virtual tree defined in siteTree.js into that real dist folder:

/* src/set/build.js */

import distFiles from "./distFiles.js";
import setDeep from "./setDeep.js";
import siteTree from "./siteTree.js";

await setDeep(distFiles, siteTree);

Run this new build tool from inside the src/set directory to copy the virtual tree into files. The set method for FileTree takes care of creating the target directory (dist), so it’s fine if that directory doesn’t exist when we start.

$ cd ../set
$ ls dist
ls: dist: No such file or directory
$ node build
$ ls dist
Alice.html  Bob.html  Carol.html  index.html  more

Inspect the individual files in the dist folder to confirm their contents — or use our json utility to dump the entire dist folder to the console.

$ node json distFiles.js
{
  "Alice.html": "<p>Hello, <strong>Alice</strong>.</p>\n",
  "Bob.html": "<p>Hello, <strong>Bob</strong>.</p>\n",
  "Carol.html": "<p>Hello, <strong>Carol</strong>.</p>\n",
  "more": {
    "David.html": "<p>Hello, <strong>David</strong>.</p>\n",
    "Eve.html": "<p>Hello, <strong>Eve</strong>.</p>\n",
    "index.html": "<!DOCTYPE html>\n<html>\n  <body>\n    <ul>\n      <li><a href=\"David.html\">David</a></li>\n      <li><a href=\"Eve.html\">Eve</a></li>\n    </ul>\n  </body>\n</html>"
  },
  "index.html": "<!DOCTYPE html>\n<html>\n  <body>\n    <ul>\n      <li><a href=\"Alice.html\">Alice</a></li>\n      <li><a href=\"Bob.html\">Bob</a></li>\n      <li><a href=\"Carol.html\">Carol</a></li>\n      <li><a href=\"more\">more</a></li>\n    </ul>\n  </body>\n</html>"
}

We can see that we’ve generated HTML pages for all the markdown content, and also see that each level of this tree has an index.html page.

Browse the built HTML files

You could now deploy the HTML files in the dist folder anywhere, such as a CDN (Content Delivery Network).

As a quick test, serve the dist folder with any static server, such as http-server.

$ npx http-server dist
Starting up http-server, serving dist

(You could also temporarily hack serve.js to serve the tree defined by distFiles.js instead of siteTree.js. Everything here is a tree, and you can serve any of those trees the same way.)

Browse to the static server and confirm that the static results are the same as what you can see running the dynamically-generated tree.

The results will look identical, but a key difference is that no real work is necessary to display the HTML files served from the dist folder.

Before moving on, in the terminal window, stop the server by pressing Ctrl+C.

In this tutorial, the markdown-to-HTML translation happens almost instantly, but in real projects, the data or transformations could easily take some time. Viewing an individual page might require non-trivial work, resulting in a perceptible delay before the page appears. Building the pages into static files performs all the work at once, so your users can browse the resulting static files as fast as the web can deliver them.

We’ve now solved our original problem: we’ve created a system in which our team can write content for our web site using markdown, and end up with HTML pages we can deploy.

A general approach for building things

In this tutorial, we’re using real markdown files to create virtual HTML files and then save those as real HTML files. But this type of build pipeline doesn’t really have anything to do with the web specifically — HTML pages are just a convenient and common example of content that can be created this way.

You could apply this same async tree pattern in build pipelines for many other kinds of artifacts: data sets, PDF documents, application binaries, etc. The pattern can benefit any situation in which you are transforming trees of values.

  • In some cases, the source information will be an obvious tree. In others, you might start with a single block of content (a document, say) and parse that to construct a virtual tree. Or you might wrap a data set to interpret it as an async tree.
  • You can then apply multiple transforms to that source tree to create additional virtual trees, each one step closer to your desired result.
  • Finally, you can save the last virtual tree in some persistent form. That might be a hierarchical set of files as in the example above, or you might reduce the tree in some fashion to a single result, perhaps a single file.
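As an illustration of that last point, reducing a tree to a single result, here is a small sketch that concatenates every value in a tree, depth first, into one string. objectTree and concatDeep are hypothetical helpers invented for this example, and the sample data is made up.

```javascript
// Hypothetical in-memory tree; nested plain objects become subtrees.
function objectTree(obj) {
  return {
    async get(key) {
      const value = obj[key];
      const isPlainObject =
        value !== null &&
        typeof value === "object" &&
        Object.getPrototypeOf(value) === Object.prototype;
      return isPlainObject ? objectTree(value) : value;
    },
    async keys() {
      return Object.keys(obj);
    },
  };
}

// Reduce a whole tree to one string by visiting keys in order and
// descending into any value that is itself a tree.
async function concatDeep(tree) {
  let result = "";
  for (const key of await tree.keys()) {
    const value = await tree.get(key);
    const isTree =
      typeof value?.get === "function" && typeof value?.keys === "function";
    result += isTree ? await concatDeep(value) : String(value);
  }
  return result;
}

async function demo() {
  const tree = objectTree({
    "a.txt": "one\n",
    sub: { "b.txt": "two\n" },
    "c.txt": "three\n",
  });
  return concatDeep(tree);
}

demo().then((text) => console.log(JSON.stringify(text)));
// logs "one\ntwo\nthree\n"
```

The resulting string could then be written out as a single file with one call to a tree's set method.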

 
