Erik Engelen

AI Platform

AngelsWorks Hub

Central service hub orchestrating my personal AI infrastructure — routing, observability, RBAC, dashboards.

  • 22 Modules
  • 67 Containers
  • 12+ LLM providers via gateway
  • ~90k Free AI calls per day
  • 9 Mission Control stations

Figure: AngelsWorks Hub architecture (modules, LiteLLM gateway, RBAC trust scoring, Mission Control surface)

Hub is the orchestration layer for everything else I run. It centralizes LLM routing, exposes a single observability surface, enforces access control across modules, and gives me one place to act when something is off.

Decisions

  1. Single LLM gateway, every call

    All AI calls route through LiteLLM with seven model tiers and ~90k free calls per day across 12+ providers. During the day, calls hit hosted endpoints (Groq, Cerebras, Gemini); at night they fall back to local Ollama. Routing self-heals on quota errors. Putting every call through one gateway carries a real integration tax, but it bought me unified cost reporting, audit logs, retry policy, and the ability to swap a provider without touching application code.

  2. Service + repository layered pattern, applied strangler-fig

    New hub modules, and any module being touched for non-trivial work, must follow a route → service → repo shape rather than the original single-file pattern. The migration is strangler-fig: existing modules don't have to convert until they're modified. The horizontal pattern that worked for "split by file type" stopped scaling, so the next axis had to be vertical (route → service → repo). The shift surfaced a real recurring cost in the old shape, but converting everything at once would have stalled feature work for weeks, hence the incremental approach.

  3. RBAC with trust scoring, not just roles

    Agents (and humans) accumulate a trust score from observed behavior. Sandboxed actions are checked against the score, not just against a static role. Patrol workers continuously verify that agents stay inside their declared sandboxes. The Mission Control Security station surfaces denials, trust-score deltas, and any sandbox escapes. Strict deny-by-default stays behind a feature flag until every agent is confirmed registered; then it is turned on.

  4. Mission Control as the observability surface

    Mission Control has nine stations: modules, agents, providers, security, telemetry, backups, knowledge, boot, and ops. Every module exposes a health endpoint and a manifest; the dashboard pulls from those instead of having modules push metrics. When something is wrong, I look in the same place every time, and the source of truth is the module's own /health rather than a parallel monitoring config that drifts.

  5. Modules are containers

    Each module ships as its own Docker container with its own manifest declaring backup paths, ports, dependencies, and health endpoint. Adding a module is a docker-compose entry plus a manifest. Removing one doesn't leave residue elsewhere. This costs a bit more memory than running everything in one process, but the operational story (independent restart, isolated failure, clear dependency graph) is worth it.
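The "clear dependency graph" that per-module manifests give you can be sketched roughly as follows. This is a minimal illustration, not Hub's actual manifest format: the field names (`ports`, `backup_paths`, `depends_on`, `health_path`) and the three module names are assumptions made up for the example. It uses the standard library's `graphlib` to derive a start order from declared dependencies.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Manifest:
    """Hypothetical per-module manifest; field names are assumptions."""
    name: str
    ports: tuple[int, ...] = ()
    backup_paths: tuple[str, ...] = ()
    depends_on: tuple[str, ...] = ()
    health_path: str = "/health"


def start_order(manifests: list[Manifest]) -> list[str]:
    """Return module names so that every dependency starts before its dependents."""
    known = {m.name for m in manifests}
    graph = {}
    for m in manifests:
        missing = set(m.depends_on) - known
        if missing:
            raise ValueError(f"{m.name} depends on unknown modules: {sorted(missing)}")
        graph[m.name] = set(m.depends_on)
    # static_order() raises graphlib.CycleError on circular dependencies.
    return list(TopologicalSorter(graph).static_order())


# Illustrative modules only; real names and ports will differ.
MANIFESTS = [
    Manifest("dashboard", ports=(8080,), depends_on=("gateway", "auth")),
    Manifest("gateway", ports=(4000,), depends_on=("auth",)),
    Manifest("auth", ports=(9000,), backup_paths=("/data/auth",)),
]

if __name__ == "__main__":
    print(start_order(MANIFESTS))
```

Because each manifest names its own dependencies, removing a module is a matter of deleting one entry; a dangling reference shows up as a `ValueError` instead of a silent startup failure.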

The reason Hub exists is mundane. I was running 22 modules across 67 containers, calling 12+ AI providers, and the only way to answer “is this thing healthy right now” was to SSH to a box and read logs. Hub took over answering that question. Once it could, I started routing every LLM call through one place, enforcing access from one place, instrumenting from one place. The shape of the system stopped fighting me.
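Answering "is this thing healthy right now" by pulling from each module's own /health endpoint might look something like the sketch below. The module registry, the URLs, and the `{"status": "ok"}` payload shape are all assumptions for illustration; only the /health path comes from the description above. The `fetch` parameter is injectable so the polling logic can be exercised without a network.

```python
import json
import urllib.request
from typing import Callable

# Hypothetical module registry: name -> base URL. Real names and ports will differ.
MODULES = {
    "gateway": "http://gateway:4000",
    "auth": "http://auth:9000",
}


def http_fetch(url: str, timeout: float = 2.0) -> dict:
    """Fetch a URL and parse the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def poll_health(modules: dict[str, str],
                fetch: Callable[[str], dict] = http_fetch) -> dict[str, str]:
    """Ask every module for its own /health; no separate monitoring config is kept."""
    status = {}
    for name, base_url in modules.items():
        try:
            body = fetch(f"{base_url}/health")
            # Assumed payload shape: {"status": "ok"}; anything else counts as degraded.
            status[name] = "ok" if body.get("status") == "ok" else "degraded"
        except Exception:
            # Unreachable, timed out, or malformed: the module cannot vouch for itself.
            status[name] = "down"
    return status
```

The design point is that the dashboard holds no health logic of its own: a module that changes what "healthy" means only changes its own endpoint.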

A few things I think transfer to a team. Instrumentation pays back faster than features: every time I waited to add metrics until “after I shipped the thing”, I shipped a thing I couldn’t operate. A single gateway for AI calls lets you change everything later: swapping providers, adding observability, and enforcing rate limits all happen without touching the consumers. And the boring parts (RBAC, backups, manifests, health endpoints) are what determine whether the interesting parts (agents, routing, evaluation) survive contact with a real workload.
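To make the single-gateway point concrete, here is a minimal sketch of the idea in plain Python. It deliberately does not use LiteLLM's API; `Provider`, `QuotaExceeded`, `provider_chain`, and `complete` are hypothetical names, and the 07:00 to 23:00 daytime window is an assumption, not a documented value. It shows hosted providers tried first during the day, a local model at night, and a fall-through when a provider reports an exhausted quota.

```python
from dataclasses import dataclass
from datetime import time
from typing import Callable


class QuotaExceeded(Exception):
    """Raised by a provider client when its free quota is exhausted (hypothetical)."""


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # prompt -> completion


def provider_chain(now: time, hosted: list[Provider], local: Provider) -> list[Provider]:
    """Hosted providers first during the day; local model only at night.

    The 07:00-23:00 window is an assumption made for this sketch.
    """
    daytime = time(7, 0) <= now < time(23, 0)
    return [*hosted, local] if daytime else [local]


def complete(prompt: str, chain: list[Provider]) -> tuple[str, str]:
    """Try providers in order, skipping any that report an exhausted quota."""
    for provider in chain:
        try:
            return provider.name, provider.call(prompt)
        except QuotaExceeded:
            continue  # fall through to the next provider in the chain
    raise RuntimeError("all providers in the chain are out of quota")
```

Consumers only ever call `complete`, so reordering the chain, adding a provider, or wrapping the call with logging changes nothing on their side.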

Hub is also where the strangler-fig refactor lives. Old single-file modules still work; new and touched modules go through the route → service → repo layering. That migration is in progress in modules/resources/; the lessons file is honest about what worked and what was harder than expected.
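For readers unfamiliar with the route → service → repo layering, a stripped-down sketch follows. It is not taken from modules/resources/; the `Resource` type, the in-memory repo, and the function names are invented for illustration, and the route is a plain function standing in for whatever web framework is in use.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Resource:
    id: int
    name: str


class ResourceRepo(Protocol):
    """Repo layer: storage access only, no business rules."""
    def get(self, resource_id: int) -> Resource | None: ...
    def add(self, name: str) -> Resource: ...


class InMemoryResourceRepo:
    """Stand-in storage for the sketch; a real repo would talk to a database."""
    def __init__(self) -> None:
        self._rows: dict[int, Resource] = {}

    def get(self, resource_id: int) -> Resource | None:
        return self._rows.get(resource_id)

    def add(self, name: str) -> Resource:
        resource = Resource(id=len(self._rows) + 1, name=name)
        self._rows[resource.id] = resource
        return resource


class ResourceService:
    """Service layer: validation and business rules, no HTTP and no storage details."""
    def __init__(self, repo: ResourceRepo) -> None:
        self._repo = repo

    def create(self, name: str) -> Resource:
        name = name.strip()
        if not name:
            raise ValueError("name must not be empty")
        return self._repo.add(name)


def post_resource(service: ResourceService, payload: dict) -> tuple[int, dict]:
    """Route layer: translate a request into a service call and a status code."""
    try:
        resource = service.create(payload.get("name", ""))
    except ValueError as exc:
        return 400, {"error": str(exc)}
    return 201, {"id": resource.id, "name": resource.name}
```

Each layer only knows the one below it, which is what lets an old single-file module be converted one route at a time while the rest keeps working.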

Open to permanent AI Architect roles, EU remote.

Email, LinkedIn, or grab 30 minutes on the calendar.