
Architecture

Most application stacks follow a predictable pattern. You define a data model, then you need a persistence layer, then CRUD methods, then API endpoints, then server handlers, then client code. For every entity. The shapes are almost identical each time — only the names and fields change.

Ontogen exists because that pattern is mechanical enough to automate. It’s a build-time code generator that runs in build.rs and produces everything downstream from an annotated Rust struct.

The central architectural concept is a layered pipeline of independent generator functions. Each function takes configuration and optionally the output of a previous stage, then writes files and returns typed metadata for the next stage.

┌──────────────────────────────────────────────────────┐
│                       build.rs                       │
└──────────────────────────────────────────────────────┘

                     ┌──────────────────┐
  schema/*.rs ──►    │  parse_schema()  │ ──► SchemaOutput
                     └──────────────────┘     { entities: Vec<EntityDef> }
                              │
          ┌───────────────────┼──────────────────┐
          ▼                   ▼                  ▼
  ┌───────────────┐ ┌───────────────────┐ ┌──────────────┐
  │  gen_seaorm() │ │ gen_markdown_io() │ │  gen_dtos()  │
  └───────────────┘ └───────────────────┘ └──────────────┘
          │               (files)             (files)
          ▼
     SeaOrmOutput
     { entity_tables, junction_tables, conversion_fns }
          │
          ▼
  ┌───────────────┐
  │  gen_store()  │ ──► StoreOutput
  └───────────────┘     { methods, scaffolded_hooks, change_channels }
          │
          ▼
  ┌───────────────┐
  │   gen_api()   │ ──► ApiOutput
  └───────────────┘     { modules: Vec<ApiModule> }
          │
          ▼
  ┌───────────────┐
  │ gen_servers() │ ──► ServersOutput
  └───────────────┘     { http_routes, ipc_commands, mcp_tools }
                        (also emits TypeScript clients and the
                         admin registry, controlled by
                         ServersConfig.client_generators)

Data flows top to bottom. Each arrow is a plain Rust struct — no framework types, no trait objects, no magic. Just data.
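To make that concrete, here is a minimal sketch of the kind of plain-data IR an arrow carries. Only SchemaOutput { entities: Vec<EntityDef> } appears in the diagram above; the fields of EntityDef below are illustrative assumptions, not Ontogen's actual definition.

```rust
// Hypothetical sketch of the plain-data IR passed between stages. Only
// `SchemaOutput { entities: Vec<EntityDef> }` comes from the diagram above;
// the fields of `EntityDef` are assumed for illustration.
#[derive(Debug, Clone)]
pub struct EntityDef {
    pub name: String,                  // e.g. "Task"
    pub fields: Vec<(String, String)>, // (field name, Rust type), simplified
}

#[derive(Debug, Clone)]
pub struct SchemaOutput {
    pub entities: Vec<EntityDef>,
}

pub fn demo_schema() -> SchemaOutput {
    SchemaOutput {
        entities: vec![EntityDef {
            name: "Task".to_string(),
            fields: vec![("title".to_string(), "String".to_string())],
        }],
    }
}

fn main() {
    // A downstream generator just iterates over this data; there is nothing
    // to downcast or unwrap from a framework context.
    let schema = demo_schema();
    assert_eq!(schema.entities[0].name, "Task");
    println!("{} entities", schema.entities.len());
}
```

Because the IR is ordinary owned data, any stage can be unit-tested by constructing its input by hand.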

Every generator in Ontogen is a standalone function. Here’s what gen_store actually looks like:

pub fn gen_store(
    entities: &[EntityDef],
    seaorm: Option<&SeaOrmOutput>,
    config: &StoreConfig,
) -> Result<StoreOutput, CodegenError> { ... }

Three things to notice:

  1. It’s a regular function. Not a method on a builder, not a step in a pipeline object, not a trait implementation. Just a function you call.
  2. The seaorm parameter is an Option. If you have SeaORM metadata, the store generator uses it for exact table and column names. If you don’t, it falls back to naming conventions. Either way it works.
  3. It returns StoreOutput — a concrete struct that downstream generators can consume. No dynamic dispatch, no type erasure.

This pattern repeats across every generator in the pipeline.
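The Option fallback in point 2 can be sketched as follows. Both the stripped-down SeaOrmOutput stand-in and the table_name helper are hypothetical simplifications (assuming entity_tables maps entity names to table names), not Ontogen's actual code.

```rust
use std::collections::HashMap;

// Minimal stand-in for SeaOrmOutput: just a map from entity name to table
// name. The real struct carries more (junction tables, conversion fns).
pub struct SeaOrmOutput {
    pub entity_tables: HashMap<String, String>,
}

// Hypothetical helper showing the fallback behaviour: use exact SeaORM
// metadata when it is available, otherwise derive a snake_case table name
// by convention.
pub fn table_name(seaorm: Option<&SeaOrmOutput>, entity: &str) -> String {
    if let Some(s) = seaorm {
        if let Some(t) = s.entity_tables.get(entity) {
            return t.clone(); // exact name from upstream metadata
        }
    }
    // Convention fallback: "TaskItem" -> "task_item".
    let mut out = String::new();
    for (i, c) in entity.chars().enumerate() {
        if c.is_uppercase() {
            if i > 0 {
                out.push('_');
            }
            out.extend(c.to_lowercase());
        } else {
            out.push(c);
        }
    }
    out
}

fn main() {
    // Without metadata: convention wins.
    assert_eq!(table_name(None, "TaskItem"), "task_item");

    // With metadata: the exact table name wins.
    let mut tables = HashMap::new();
    tables.insert("TaskItem".to_string(), "tasks".to_string());
    let seaorm = SeaOrmOutput { entity_tables: tables };
    assert_eq!(table_name(Some(&seaorm), "TaskItem"), "tasks");
}
```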

Each generator receives configuration and optional upstream IR. Nothing else. No shared mutable state, no global registry, no framework context object.

This means you can use generators independently:

  • Just persistence: Call parse_schema and gen_seaorm to get SeaORM entities. Stop there.
  • Persistence + store: Add gen_store for CRUD methods with lifecycle hooks.
  • Full stack: Chain everything through gen_servers (with client_generators populated) for HTTP, IPC, MCP, and TypeScript clients.
  • Skip the middle: Use gen_markdown_io alongside gen_seaorm without involving the store or API layers at all.

The generators at the top of the pipeline (seaorm, markdown_io, dtos) are siblings — they all consume SchemaOutput independently and don’t depend on each other. The lower layers form a chain, where each stage enriches the next.

Generated code lives in generated/ subdirectories within your project structure:

src/
├── store/
│   ├── generated/        ← gen_store writes here
│   │   ├── mod.rs
│   │   ├── task.rs
│   │   └── agent.rs
│   └── hooks/            ← scaffolded once, you own these
│       ├── mod.rs
│       ├── task.rs
│       └── agent.rs
├── api/
│   └── v1/
│       ├── generated/    ← gen_api writes here
│       │   ├── mod.rs
│       │   └── task.rs
│       ├── task.rs       ← your custom endpoints live alongside
│       └── reports.rs    ← entirely hand-written modules too
└── schema/
    ├── task.rs           ← your annotated structs (the source of truth)
    └── agent.rs

This separation matters for three reasons:

  1. You can git diff the generated/ directories to see exactly what changed after a schema modification.
  2. You never accidentally edit generated code, and the generator never overwrites your code.
  3. Your IDE can distinguish between files you maintain and files the build system maintains.

The generated/ directories are cleaned on each build — stale files from renamed or deleted entities are removed automatically.
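A minimal sketch of that clean-then-write step, assuming a hypothetical write_generated helper (not Ontogen's actual code):

```rust
use std::fs;
use std::path::Path;

// Hypothetical sketch of clean-then-write: wipe generated/ so output for
// renamed or deleted entities does not linger, then write the fresh set.
pub fn write_generated(dir: &Path, files: &[(&str, &str)]) -> std::io::Result<()> {
    if dir.exists() {
        fs::remove_dir_all(dir)?; // drop stale output from earlier builds
    }
    fs::create_dir_all(dir)?;
    for (name, contents) in files {
        fs::write(dir.join(name), contents)?;
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    let dir = std::env::temp_dir().join("ontogen_demo_generated");
    write_generated(&dir, &[("task.rs", "// task"), ("agent.rs", "// agent")])?;

    // Entity "agent" is deleted from the schema; the next build cleans it up.
    write_generated(&dir, &[("task.rs", "// task")])?;
    assert!(dir.join("task.rs").exists());
    assert!(!dir.join("agent.rs").exists());
    Ok(())
}
```

Keeping hand-written hooks/ outside the wiped directory is what makes this safe: only files the build system owns are ever deleted.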

parse_schema() reads your src/schema/ directory, finds structs with #[derive(OntologyEntity)] and #[ontology(...)] annotations, and parses them into EntityDef metadata using syn. This is always the starting point.
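As a toy illustration of what that discovery step does (the real parse_schema() uses syn for proper Rust parsing; this line-based scan is only a sketch):

```rust
// Toy illustration only — parse_schema() uses syn, not string matching.
// This just shows the shape of the job: scan schema source for structs
// carrying the OntologyEntity derive and record their names.
pub fn annotated_struct_names(src: &str) -> Vec<String> {
    let mut names = Vec::new();
    let mut saw_derive = false;
    for line in src.lines() {
        let line = line.trim();
        if line.starts_with("#[derive(") && line.contains("OntologyEntity") {
            saw_derive = true;
        } else if saw_derive && line.starts_with("pub struct ") {
            // Take the identifier right after `pub struct `.
            if let Some(name) = line["pub struct ".len()..]
                .split(|c: char| !c.is_alphanumeric())
                .next()
            {
                names.push(name.to_string());
            }
            saw_derive = false;
        }
    }
    names
}

fn main() {
    let src = r#"
#[derive(OntologyEntity)]
#[ontology(table = "tasks")]
pub struct Task {
    pub title: String,
}

pub struct NotAnEntity;
"#;
    // Only the annotated struct is picked up.
    assert_eq!(annotated_struct_names(src), vec!["Task".to_string()]);
}
```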

gen_seaorm() generates SeaORM entity modules: table models with typed columns, relation enums, junction table definitions for many-to-many relationships, and from_model()/to_active_model() conversion functions.

gen_markdown_io() generates YAML-frontmatter parsers, Markdown writers, and filesystem operations for reading/writing entities as Markdown files. Useful for content-as-code workflows.
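A sketch of the file shape this implies: YAML frontmatter between --- fences, Markdown body below. The toy parser here handles only flat key: value lines and is not the generated code.

```rust
// Toy sketch of the entity-as-Markdown format: YAML frontmatter between
// `---` fences, body below. The generated parsers are more capable; this
// handles only flat `key: value` lines.
pub fn split_frontmatter(doc: &str) -> Option<(Vec<(String, String)>, String)> {
    let rest = doc.strip_prefix("---\n")?;
    let (head, body) = rest.split_once("\n---\n")?;
    let fields = head
        .lines()
        .filter_map(|l| l.split_once(": "))
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    Some((fields, body.to_string()))
}

fn main() {
    let doc = "---\ntitle: Write docs\nstatus: open\n---\nBody text here.\n";
    let (fields, body) = split_frontmatter(doc).unwrap();
    assert_eq!(fields[0], ("title".to_string(), "Write docs".to_string()));
    assert_eq!(body, "Body text here.\n");
}
```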

gen_dtos() generates CreateEntityInput and UpdateEntityInput structs — the input types for create and update operations. Also invoked internally by gen_store, but available independently if you want input types without a full store layer.

gen_store() generates CRUD methods (list, get, create, update, delete) with lifecycle hook call sites, EntityUpdate structs with apply() methods, and relation population helpers. Scaffolds hook files once per entity.
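The EntityUpdate-with-apply() pattern can be sketched like this, assuming a hypothetical Task entity; each update field is an Option so unset fields are left untouched:

```rust
// Sketch of the EntityUpdate pattern for a hypothetical Task entity. The
// generated struct mirrors the entity's fields, each wrapped in Option so
// a partial update only touches what was actually set.
pub struct Task {
    pub title: String,
    pub done: bool,
}

#[derive(Default)]
pub struct TaskUpdate {
    pub title: Option<String>,
    pub done: Option<bool>,
}

impl TaskUpdate {
    pub fn apply(self, task: &mut Task) {
        if let Some(title) = self.title {
            task.title = title;
        }
        if let Some(done) = self.done {
            task.done = done;
        }
    }
}

fn main() {
    let mut task = Task { title: "draft".to_string(), done: false };

    // Partial update: only `done` is set, `title` survives unchanged.
    TaskUpdate { done: Some(true), ..Default::default() }.apply(&mut task);
    assert_eq!(task.title, "draft");
    assert!(task.done);
}
```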

gen_api() generates forwarding functions that bridge the store layer to transport handlers. Scans hand-written API directories and merges custom endpoints with generated CRUD into a unified ApiOutput.

gen_servers() generates transport-specific handlers: Axum HTTP route handlers, Tauri IPC commands, and MCP tool definitions. It also drives client-side generation — the same call emits TypeScript clients and the admin-UI registry, controlled by ServersConfig.client_generators. Each transport reads the same ApiOutput and produces output appropriate for its protocol.

The recommended way to wire the pipeline is through the Pipeline builder, which calls all of the above functions for you with sensible defaults:

use ontogen::Pipeline;

fn main() {
    Pipeline::new("src/schema")
        .seaorm(
            "src/persistence/db/entities/generated",
            "src/persistence/db/conversions/generated",
        )
        .store(
            "src/store/generated",
            Some::<std::path::PathBuf>("src/store/hooks".into()),
        )
        .api("src/api/v1/generated", "AppState")
        // .servers(servers_config) - add when you want HTTP/IPC/MCP/clients
        .build()
        .expect("ontogen pipeline failed");
}

If you’d rather call each generator function directly (one per stage, every config field spelled out), that’s still fully supported — see the Build Script Setup guide. There’s no hidden orchestration either way: Pipeline is a thin wrapper over the same parse_schema → gen_seaorm → ... → gen_servers calls.