This audits the site defined by the given tree for broken internal links. It first crawls the site using the same crawler as dev:crawl; see that page for details on what kinds of files and references are crawled.
- The audit only verifies links to internal pages and resources. It does not verify links to external pages and resources, i.e., outside the site being audited.
- The audit process currently ignores errors: if retrieving a resource produces an error, the audit skips that resource.
- A link to a page is considered valid if the page exists; if a link includes an anchor, the audit does not confirm that the specific anchor exists on that page. E.g., a link to foo.html#example is valid if foo.html exists, regardless of whether that page has an #example anchor.
- Links to foo, foo/, and foo/index.html are considered equal.
- The audit process is unaware of any redirects you may have configured for your web server. For example, you might configure a web host with redirects that arrange for other kinds of equivalence so that foo/ and foo.html are equivalent. But audit will consider those to be different paths.
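The link-comparison rules in the list above can be illustrated with a small Python sketch. This is not Origami's implementation; the function names is_internal and normalize are hypothetical, and the sketch only assumes the rules as stated here: external links are not checked, anchors are ignored, and foo, foo/, and foo/index.html name the same page.

```python
from urllib.parse import urlsplit


def is_internal(href: str) -> bool:
    """Hypothetical helper: a link is internal if it names no scheme or host."""
    parts = urlsplit(href)
    return not parts.scheme and not parts.netloc


def normalize(href: str) -> str:
    """Hypothetical helper: reduce a link to the path that gets compared."""
    path = urlsplit(href).path  # drops any #anchor (and any ?query)
    if path == "index.html" or path.endswith("/index.html"):
        path = path[: -len("index.html")]  # foo/index.html -> foo/
    return path.rstrip("/")  # foo/ -> foo


# foo, foo/, and foo/index.html compare as equal
print(normalize("foo") == normalize("foo/") == normalize("foo/index.html"))
# an anchor does not affect which page a link points to
print(normalize("foo.html#example") == normalize("foo.html"))
# without knowledge of server redirects, foo/ and foo.html stay distinct
print(normalize("foo/") == normalize("foo.html"))
```

Under these assumptions the first two comparisons are true and the third is false, which matches the last bullet: an equivalence that exists only in a web server's redirect configuration is invisible to a check that works on paths alone.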
Auditing an Origami site
You can give audit the top-level file that defines your site’s root.
Example: a file defines a tiny site whose index.html page links to a.html, which in turn links to b.html:
// missingPage.ori
{
index.html: `<a href="a.html">A</a>`
a.html: `<a href="b.html">B</a>`
}
$ ori audit missingPage.ori
a.html:
- b.html
Here audit reports that a.html has a link to a non-existent page b.html.
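To make the shape of that report concrete, here is a minimal Python sketch of the same check over an in-memory site. It is an illustration under simplifying assumptions, not how Origami performs the audit: the site is a flat dictionary of path to HTML, every page is checked rather than only pages reachable by crawling from the root, only a elements are inspected, and links are compared literally. The names audit and HrefCollector are hypothetical.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href value of every <a> element fed to the parser."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def audit(site: dict[str, str]) -> dict[str, list[str]]:
    """Map each page to the links on it that point at pages missing from site."""
    broken = {}
    for path, html in site.items():
        collector = HrefCollector()
        collector.feed(html)
        missing = [href for href in collector.hrefs if href not in site]
        if missing:
            broken[path] = missing
    return broken


# The same two-page site as missingPage.ori, as a plain dictionary.
site = {
    "index.html": '<a href="a.html">A</a>',
    "a.html": '<a href="b.html">B</a>',
}

for page, links in audit(site).items():
    print(f"{page}:")
    for link in links:
        print(f"- {link}")
```

For this two-page site the sketch reports a.html with the single missing target b.html, the same grouping of broken links by the page that contains them.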
Because Origami treats all trees equally, you can also audit a folder of HTML pages. For example, if you’re using copy to build your site, you could audit the build output folder:
$ ori audit build
Auditing a site directly (via the .ori example above) lets you audit it without having to build it first.
Auditing a live site
Using Origami’s httpstree: protocol, you can treat a live site as a traversable tree that audit can audit.
Example: The venerable Space Jam web site has hundreds of pages which, despite being written by hand, contain very few broken internal links. As of this writing (April 2025), an audit of that site produces the following:
$ ori audit httpstree://www.spacejam.com/1996/
cmp/junior/juniornoframes.html:
- bin/junior.map><img src=
cmp/lineup/lineupnoframes.html:
- cmp/lineup/triviaframes.html
cmp/lineup/quiz6.html:
- cmp/lineup/quiz6b.html
- cmp/lineup/quiz6d.html
The audit shows that, for example, the quiz6.html page in the site’s Trivia Quiz contains two broken links, and the juniornoframes.html page has an <a> element whose href is missing a closing quote.