W17 — Spark Nemotron, flow templates lineage, vLLM

By raw commit count, this two-commit week looks empty. But the second commit alone is 6.3k LOC of carefully structured work.

What shipped

Spark Nemotron skill. A dedicated path for the Nvidia/Spark-hosted Nemotron model. A small commit (162 LOC) but architecturally important: it’s the first model that justified its own skill rather than fitting into the generic LLM dispatch layer. It gets its own prompt scaffolding and its own token accounting.
Flow templates: lineage analytics + vLLM support. 6.3k LOC. Two threads in one commit:
- Lineage analytics. When a flow is forked from a template, the relationship is now first-class metadata. The cockpit can answer “which flows came from this template?” and “how have those flows diverged since?” The bones of a proper template ecosystem.
- vLLM support. The Spark-hosted vLLM server topology is now a first-class option in the model selector, with its own config, health checks, and fallback routing. This unlocks the GPT-OSS-120B nvfp4 path without compromising the local-ownership posture: the vLLM box is local to me, even if “local” means a different machine on the same wire.
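
A minimal sketch of what health checks plus fallback routing could look like. This is not the cockpit's actual code: the `Backend` type, the `pick_backend` helper, and the example hosts are hypothetical names for illustration. The one thing borrowed from vLLM itself is the `GET /health` route its OpenAI-compatible server answers when it is up. The probe is injectable, so the routing logic can be exercised without a live server.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence
import urllib.error
import urllib.request


@dataclass(frozen=True)
class Backend:
    name: str       # label shown in the model selector (hypothetical)
    base_url: str   # e.g. "http://spark.lan:8000" (hypothetical host)


def http_health_probe(backend: Backend, timeout: float = 2.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    try:
        url = f"{backend.base_url}/health"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Refused connection, timeout, DNS failure, non-2xx, or bad URL:
        # all of them mean "do not route here".
        return False


def pick_backend(
    backends: Sequence[Backend],
    probe: Callable[[Backend], bool] = http_health_probe,
) -> Optional[Backend]:
    """Walk the backends in priority order and return the first healthy one.

    Returns None when nothing is healthy, so the caller decides what to do.
    """
    for backend in backends:
        if probe(backend):
            return backend
    return None
```

Usage is a priority-ordered list, with the Spark vLLM box first and whatever fallback makes sense second: `pick_backend([spark_vllm, local_fallback])`. Keeping the ordering in data rather than in code means the model selector can reorder or drop backends without touching the routing logic.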

Architecture moves

Template lineage as first-class metadata is the precursor to a much larger idea: every flow in the cockpit exposing its provenance, meaning what it was forked from, what it inherited, and where it diverged. That data shape is what will eventually feed a /metrics-style narrative about how flows evolve.
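
To make the data shape concrete, here is a rough sketch of lineage as metadata, under simplifying assumptions. The `Flow` type, its `forked_from` field, and both helper functions are hypothetical, and a flow is reduced to a mapping of step name to a config string. The real schema is certainly richer, but the two questions the cockpit answers fall out of the same shape.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Flow:
    flow_id: str
    steps: Dict[str, str]               # step name -> config (simplified to a string)
    forked_from: Optional[str] = None   # template id, or None if built from scratch


def flows_from_template(flows: List[Flow], template_id: str) -> List[str]:
    """Answer "which flows came from this template?"."""
    return [f.flow_id for f in flows if f.forked_from == template_id]


def divergence(template: Flow, fork: Flow) -> Dict[str, List[str]]:
    """Answer "how has this flow diverged?" as added / removed / changed steps."""
    t, f = template.steps, fork.steps
    return {
        "added": sorted(f.keys() - t.keys()),
        "removed": sorted(t.keys() - f.keys()),
        "changed": sorted(k for k in t.keys() & f.keys() if t[k] != f[k]),
    }
```

With a template holding `fetch` and `summarize` steps, a fork that edits `summarize` and adds `notify` would report `notify` as added and `summarize` as changed. Whatever a fork still shares unchanged with its template is, by elimination, what it inherited.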

By the numbers

2 commits
+5,923 / −533 (net +5,390)
2 milestone-class drops

What’s next

The Telegram bridge. The flow builder upgrades that have been queued for two weeks. A 7.7k-LOC single-commit drop arriving in W18.