Stag Architecture

This document describes the internal architecture of Stag. It may be useful during Stag development, or for plugin authors who want to better understand Stag’s data model.

Site

The central object of Stag is Site, which holds all processed data:

pages - mapping of pages known by Stag
signals - a list of global signals emitted by Site. See Plugins Programmer Manual
list of registered readers

Pages

Page is the object which encodes input and output data for generating files. It is made of smaller entities, each of which encodes a specific data format used by plugins.

Page internal structure (pseudocode)

Page:
  base: string
  path: string

  source: Optional[Path]
  metadata: Optional[Metadata]
  input: Optional[Content]
  output: Optional[Content]
  toc: Optional[Content]
  taxonomy: Optional[Taxonomy]
  term: Optional[Term]
  cached: Optional[Cache]

Identifying pages

Each page must have a single identifier that is unique across all pages: path. Stag also uses path to decide the location of the output file. From base (the configured base URL) and path, Stag generates absolute and relative URLs for any page.

Relative URLs are relative to the base: when base points to a directory (e.g. in a vhost deployment), relative URLs are relative to that directory.
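As a rough illustration of how an absolute URL could be built from base and path, here is a minimal sketch. The helper name absolute_url is hypothetical and is not part of Stag’s API; it only assumes that base is a URL string and path is a site-local path.

```python
def absolute_url(base: str, path: str) -> str:
    """Join a base URL and a page path (hypothetical helper, not Stag's API)."""
    # Normalize the slashes so exactly one separates base and path.
    return base.rstrip("/") + "/" + path.lstrip("/")


# base pointing at the host root
root_url = absolute_url("https://example.com", "/posts/hello")

# base pointing at a directory, e.g. a site deployed under /blog
dir_url = absolute_url("https://example.com/blog/", "/posts/hello")
```

In both cases path stays the same; only the configured base changes where the page ends up.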

Entities

Each page in site.pages can have a different type. For example, some are typical pages created by the user, others are auto-generated taxonomies, and so on. Stag detects the type of each page by inspecting the so-called entities inside each Page object. Each entity is optional, so the set of entities present on a page decides its type and how Stag behaves for it.

Entities are named after entity-component-system, a well-known game development pattern.

The following entities can be attached to the page:

Source

Entity specific to physical files, which encodes the file’s path and its root directory. All files inside the content directory will have their source set.

This entity is typically used by readers to create Metadata and Input entities.

When a page has a source entity but doesn’t have an input entity, it is considered a static file, which is simply copied according to its URL.
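The static file rule can be expressed as a simple check on which entities are set. The sketch below uses a stripped-down stand-in for Page and a hypothetical is_static_file helper; neither is Stag’s actual code, and plain strings stand in for the real entity types.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Page:
    """Simplified stand-in for Stag's Page, with only two of its entities."""
    path: str
    source: Optional[Any] = None  # set for physical files
    input: Optional[Any] = None   # set once a reader has parsed the file


def is_static_file(page: Page) -> bool:
    """A page with a source but without an input is a static file."""
    return page.source is not None and page.input is None


image = Page(path="/img/logo.png", source="content/img/logo.png")
post = Page(path="/posts/hello", source="content/posts/hello.md", input="# Hello")
generated = Page(path="/tags")  # auto-generated page, no physical file
```

Only image counts as a static file here: post has been parsed by a reader, and generated has no source at all.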

Metadata

When a file parsed by the reader has metadata attached, it is stored in this entity as a key-value mapping. Metadata, together with input, is typically used heavily during later stages, when generating the output entity.

Keys stored inside metadata can be accessed as in ordinary Python dictionaries (page.metadata["key"]) or via attributes (page.metadata.key).

As a shortcut, and for users’ convenience, metadata can also be accessed via page.md.
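One common way to get both access styles in Python is a dict subclass with a __getattr__ fallback. The sketch below is an assumption about how this could be done, not Stag’s actual implementation; the Metadata and Page classes here are simplified stand-ins.

```python
class Metadata(dict):
    """Hypothetical mapping whose keys are also readable as attributes."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods
        # such as .keys() or .items() keep working as usual.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


class Page:
    """Simplified stand-in for a page that carries a metadata entity."""

    def __init__(self, metadata=None):
        self.metadata = metadata

    @property
    def md(self):
        # Shortcut alias for the metadata entity.
        return self.metadata


page = Page(Metadata(title="Hello", tags=["python", "stag"]))
```

With this approach, a key that shares its name with a dict method (for example "items") is reachable only through the subscript form.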

Metadata is content-aware. It automatically performs the following normalizations:

metadata.title is always set (it is empty if page’s metadata doesn’t have it)
the following fields are considered as dates: date, lastmod.
dates are parsed into Python’s datetime objects
floating-point timestamp and lastmod_timestamp are automatically calculated and set from date and lastmod
presence of date, lastmod, timestamp and lastmod_timestamp is assured if at least date or lastmod exists in input metadata. If one of them is missing, it is inferred from the other one.
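The normalizations listed above can be sketched as a single function. This is an illustration only: the normalize function is hypothetical, and it assumes that dates arrive either as ISO 8601 strings or as datetime objects, which may differ from what Stag’s readers actually produce.

```python
from datetime import datetime

DATE_FIELDS = ("date", "lastmod")


def normalize(raw: dict) -> dict:
    """Return a normalized copy of raw metadata (hypothetical helper)."""
    md = dict(raw)

    # The title is always set; it is empty when the page doesn't define one.
    md.setdefault("title", "")

    # Parse date fields given as ISO 8601 strings into datetime objects.
    for key in DATE_FIELDS:
        if isinstance(md.get(key), str):
            md[key] = datetime.fromisoformat(md[key])

    # If only one of the two dates exists, infer the other one from it.
    if md.get("date") is None and md.get("lastmod") is not None:
        md["date"] = md["lastmod"]
    elif md.get("lastmod") is None and md.get("date") is not None:
        md["lastmod"] = md["date"]

    # Derive floating-point timestamps once both dates are known.
    if md.get("date") is not None:
        md["timestamp"] = md["date"].timestamp()
        md["lastmod_timestamp"] = md["lastmod"].timestamp()

    return md


dated = normalize({"date": "2024-01-15"})
undated = normalize({"author": "someone"})
```

Pages without date and lastmod are left without any of the four date-related keys, but still receive an empty title.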

A page must have metadata to be considered outputable. Outputable pages are passed to the templating engine and are subsequently rendered.

Input

Contents of the input file, typically returned by the reader. Together with metadata, it forms the complete information about the source page, which can be transformed into the output.

Output

Contents of the output file, usually generated from the input and metadata.

This isn’t the same as the output file itself: the content is typically passed to the template engine (Jinja), which produces the source code of the output file that is then written to the filesystem.

A page must have output to be considered outputable. Outputable pages are passed to the templating engine and are subsequently rendered.
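Combining this requirement with the one stated for metadata, an outputable check could look like the sketch below. The Page stand-in and the is_outputable helper are hypothetical, and treating the two conditions as jointly required is an assumption drawn from this document rather than from Stag’s source.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Page:
    """Simplified stand-in for Stag's Page, with only the relevant entities."""
    path: str
    metadata: Optional[dict] = None
    output: Optional[Any] = None


def is_outputable(page: Page) -> bool:
    """Assumed rule: both the metadata and output entities must be present."""
    return page.metadata is not None and page.output is not None


rendered = Page("/posts/hello", metadata={"title": "Hello"}, output="<h1>Hello</h1>")
unprocessed = Page("/posts/draft", metadata={"title": "Draft"})
static = Page("/img/logo.png")
```

Only rendered would be handed to the templating engine under this rule.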

TOC

Source code of the output Table of Contents. Plugins might choose to generate it separately from the output, because this makes embedding it in the output file easier and more flexible (for example, templates can detect it and wrap it in specific, plugin-agnostic code).

Taxonomy

Automatically generated pages which group term pages for a given taxonomy, along with some additional metadata. Term pages are also auto-generated and they group specific "ordinary" pages.

For example, a tags taxonomy page keeps a list of all tags used in ordinary pages, while each term page keeps a list of pages which use a particular tag.

Taxonomy and term entities are mutually exclusive.

Term

Automatically generated page which holds a list of pages which use a particular taxonomy term (e.g. a particular tag).

Taxonomy and term entities are mutually exclusive.
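The relation between a taxonomy, its terms and ordinary pages can be illustrated with a small grouping function. This is a sketch under stated assumptions: group_by_term is a hypothetical helper, and pages are modelled as a plain mapping from path to metadata instead of real Page objects.

```python
from collections import defaultdict


def group_by_term(pages: dict, taxonomy: str) -> dict:
    """Map each term of a taxonomy to the sorted paths of pages using it."""
    terms = defaultdict(list)
    for path, metadata in pages.items():
        # Pages which don't use the taxonomy contribute nothing.
        for term in metadata.get(taxonomy, []):
            terms[term].append(path)
    return {term: sorted(paths) for term, paths in terms.items()}


pages = {
    "/posts/a": {"tags": ["python", "stag"]},
    "/posts/b": {"tags": ["python"]},
    "/about": {},
}

# Each key corresponds to one term page; each value is that term's page list.
tags = group_by_term(pages, "tags")

# The taxonomy page itself would list all the terms.
all_tags = sorted(tags)
```

Here the keys of tags play the role of term pages, and all_tags corresponds to what the tags taxonomy page would list.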

Cache

Entity set for pages which are saved into the cache and then loaded from it. Newly created pages which weren’t cached won’t have this entity set. Some plugins might need this information explicitly.