Backend / distributed depth — a foundation in progress

CiteStreak

The greenfield 2026 GEO / AI-visibility SaaS I am currently building. The tested platform foundation — RLS tenant isolation, an append-only credits ledger, SELECT FOR UPDATE quota reservation, replay-safe Stripe webhooks, a Temporal refund workflow — landed across Phases 0–3 with 24/24 tests green. The product surface is still in progress.

NESTJS · TEMPORAL · RLS · STRIPE · GEO

Honest outcomes

tests green: 24 / 24, across Phases 0–3

architecture decisions: 52, append-only register

Postgres schema namespaces

AI answer engines targeted at launch: roughly six

prod Docker services: ~14

01 —

Why

CiteStreak is the greenfield 2026 GEO / AI-visibility monitoring SaaS I am currently building — tracking whether and how brands get mentioned and cited across roughly six AI answer engines, and producing scheduled reports on a published methodology.

A metered, multi-tenant LLM product fails in specific, expensive ways before it ever fails on features: one tenant seeing another tenant’s data, a fan-out bug quietly running up thousands in provider cost, a webhook replayed twice double-granting credits, a long run crashing halfway and re-charging everything on retry. I wanted those failure modes designed out at the foundation, with tests proving it, before building the product surface on top.

So the honest claim here is a platform foundation, not a shipped product. The engine adapters, the extraction pipeline, and ~24 planned features are deliberately stubbed; what is real is the tested backend it will stand on.

The foundation is genuinely built and tested; the revenue-generating feature surface is still forward-looking — and I would rather say that plainly than call a scaffold a product.

the honest framing

02 —

What

The backend is a NestJS 11 + Fastify monorepo on Drizzle / PostgreSQL 16, multi-tenant from day one with a four-tier User → Org → Workspace → Project model. Tenant isolation is defense-in-depth: Postgres Row-Level Security anchored on org_id, propagated per-request via CLS, with a standing tenant-isolation test that runs on every commit.
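To make the RLS-plus-CLS idea concrete, here is a minimal sketch of how a per-request tenant could be carried through the async call chain and handed to Postgres. It is an illustration under assumptions, not the CiteStreak code: the `projects` table, the `app.current_org_id` setting name, and the helpers `withTenant`, `currentOrgId` and `tenantSettingStatement` are all hypothetical. It uses Node's built-in `AsyncLocalStorage` directly rather than any particular CLS library.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Hypothetical names throughout: TenantContext, withTenant, currentOrgId,
// app.current_org_id and the projects table are illustrative only.
interface TenantContext {
  orgId: string;
}

const tenantStore = new AsyncLocalStorage<TenantContext>();

// Run a request handler with the tenant bound to its async call chain.
function withTenant<T>(orgId: string, handler: () => Promise<T>): Promise<T> {
  return tenantStore.run({ orgId }, handler);
}

// Read the tenant for the current request; fail closed when none is bound.
function currentOrgId(): string {
  const ctx = tenantStore.getStore();
  if (!ctx) throw new Error("no tenant bound to this request");
  return ctx.orgId;
}

// Example policy text. Postgres evaluates it on every row, whatever the
// application happens to query. (Table owners bypass RLS unless the table
// also has FORCE ROW LEVEL SECURITY set.)
const TENANT_POLICY_SQL = `
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON projects
  USING (org_id = current_setting('app.current_org_id')::uuid);
`;

// Statement to issue at the start of each transaction. The third argument
// (is_local = true) scopes the setting to the current transaction only.
function tenantSettingStatement(): { text: string; values: string[] } {
  return {
    text: "SELECT set_config('app.current_org_id', $1, true)",
    values: [currentOrgId()],
  };
}
```

The point of the split is that the application layer only decides which tenant a request belongs to; the database decides which rows that tenant may see, so a forgotten `WHERE org_id = ...` returns nothing rather than another tenant's data.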

The money plumbing is the heart of it: an append-only credits ledger where the balance is a SUM and rows are never updated or deleted; pre-enqueue quota reservation via SELECT … FOR UPDATE before any paid provider call, so a fan-out bug cannot cost thousands; replay-safe Stripe webhooks with a dedup table; and a Temporal credit-refund compensation workflow that unwinds charges when a run fails.
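The ledger and reservation rules above can be sketched in a few lines. This is a hypothetical in-memory model, not the actual schema: `CreditsLedger`, `LedgerEntry`, `reserve` and `refund` are names invented for illustration, and the place where a real implementation would take a Postgres row lock is marked in a comment.

```typescript
// Hypothetical in-memory model. A real implementation would keep these rows
// in Postgres and take a row lock (SELECT ... FOR UPDATE) where noted.
type EntryKind = "grant" | "reservation" | "refund";

interface LedgerEntry {
  readonly id: number;
  readonly orgId: string;
  readonly kind: EntryKind;
  readonly amount: number; // positive adds credits, negative spends them
  readonly runId?: string;
}

class CreditsLedger {
  private readonly entries: LedgerEntry[] = [];

  // Rows are only ever appended; there is no update or delete path.
  private append(entry: Omit<LedgerEntry, "id">): LedgerEntry {
    const row = Object.freeze({ id: this.entries.length + 1, ...entry });
    this.entries.push(row);
    return row;
  }

  // The balance is derived, never stored: SUM(amount) for the org.
  balance(orgId: string): number {
    return this.entries
      .filter((e) => e.orgId === orgId)
      .reduce((sum, e) => sum + e.amount, 0);
  }

  grant(orgId: string, amount: number): LedgerEntry {
    if (amount <= 0) throw new Error("grant must be positive");
    return this.append({ orgId, kind: "grant", amount });
  }

  // Reserve the full cost of a run before any job is enqueued. In Postgres
  // this check-then-append would run inside one transaction that first locks
  // the org's row with SELECT ... FOR UPDATE, so concurrent reservations
  // serialize instead of both passing the balance check.
  reserve(orgId: string, runId: string, cost: number): boolean {
    if (cost <= 0) throw new Error("cost must be positive");
    if (this.balance(orgId) < cost) return false; // refuse: nothing is enqueued
    this.append({ orgId, kind: "reservation", amount: -cost, runId });
    return true;
  }

  // A failed run is unwound by appending a compensating row, not by editing
  // the reservation. Refunding the same run twice is a no-op.
  refund(orgId: string, runId: string): number {
    const reserved = this.entries.find(
      (e) => e.kind === "reservation" && e.orgId === orgId && e.runId === runId,
    );
    const alreadyRefunded = this.entries.some(
      (e) => e.kind === "refund" && e.orgId === orgId && e.runId === runId,
    );
    if (!reserved || alreadyRefunded) return 0;
    this.append({ orgId, kind: "refund", amount: -reserved.amount, runId });
    return -reserved.amount;
  }
}
```

Because the reservation happens before any work is enqueued, the worst a fan-out bug can do in this model is exhaust the reserved amount; it cannot spend credits the org does not have.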

Long-running, cost-sensitive runs are orchestrated on self-hosted Temporal (with OpenSearch visibility) as a saga with native retries and compensation; BullMQ is kept only for fire-and-forget jobs. Observability is wired from the start (OpenTelemetry → Grafana Tempo, Pino → Loki). The whole system runs as a ~14-service Docker Compose stack, and a CI migration-safety gate blocks destructive schema changes.
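The saga shape mentioned above, in its simplest form: each step pairs an action with a compensation, and a failure unwinds the completed steps in reverse order. The sketch below is plain TypeScript, deliberately not Temporal SDK code; `SagaStep` and `runSaga` are hypothetical names. In Temporal the retries would come from activity retry policies and the compensation loop would live in the workflow's error handling.

```typescript
// Plain TypeScript illustration of the saga pattern, not Temporal SDK code.
// SagaStep and runSaga are hypothetical names.
interface SagaStep {
  name: string;
  action: () => Promise<void>;
  compensate: () => Promise<void>;
}

interface SagaResult {
  ok: boolean;
  compensated: string[]; // names of steps that were undone, in undo order
}

async function runSaga(steps: SagaStep[]): Promise<SagaResult> {
  const completed: SagaStep[] = [];
  try {
    for (const step of steps) {
      await step.action();
      completed.push(step);
    }
    return { ok: true, compensated: [] };
  } catch {
    const compensated: string[] = [];
    // Undo only what actually completed, newest first.
    for (const step of completed.reverse()) {
      await step.compensate();
      compensated.push(step.name);
    }
    return { ok: false, compensated };
  }
}
```

A step that fails is never compensated, because it never completed; only the steps before it are undone. That is what keeps a crash halfway through a run from leaving credits reserved with nothing to show for them.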

03 —

How

Every significant call lives in a 52-decision append-only architecture register, each row carrying the rationale: why RLS for the security boundary and the app layer for the UX boundary, why Temporal over a simple queue, why an append-only ledger over metered billing, how quota reservation prevents cost blowups. The decisions are written to be interview-defensible, not just made.

The foundation shipped across Phases 0–3 with the full end-to-end path — ledger, Stripe webhook, API-key crypto, idempotency, credit-refund workflow — working and covered by 24 passing tests (tenant isolation, credits, log-redaction, API-key crypto, feature-flags, idempotency). The marketing site is a separate Next.js 15 rebuild with Playwright visual-diff QA and deliberate GEO plumbing.
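The idempotency item in that list is the kind of property a small test can pin down. As a hedged illustration, here is an in-memory sketch of replay-safe event handling; `IdempotencyStore` and `handleOnce` are hypothetical names, and the `evt_`-style ids in the usage note are made up. In Postgres the claim would more likely be an insert into a dedup table with a unique key on the event id, made in the same transaction as the side effect.

```typescript
// Hypothetical sketch: IdempotencyStore and handleOnce are illustrative names.
class IdempotencyStore {
  private readonly seen = new Set<string>();

  // Returns true only for the first caller to claim this key.
  claim(key: string): boolean {
    if (this.seen.has(key)) return false;
    this.seen.add(key);
    return true;
  }

  // Give the key back when the side effect failed, so a retry can proceed.
  release(key: string): void {
    this.seen.delete(key);
  }
}

function handleOnce(
  store: IdempotencyStore,
  eventId: string,
  effect: () => void,
): "applied" | "duplicate" {
  if (!store.claim(eventId)) return "duplicate";
  try {
    effect();
  } catch (err) {
    store.release(eventId); // a later redelivery should still be processed
    throw err;
  }
  return "applied";
}
```

Delivering the same event twice, for example `handleOnce(store, "evt_1", grantCredits)` called two times, applies the grant once and reports the second delivery as a duplicate.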

What I am explicit about: maturity here is evidenced by code, tests, and the decision register, not by scale. There are no latency, throughput, revenue, or customer numbers, because the run pipeline is still stubbed and the product is pre-GA. I do not quote numbers I cannot back up.

04 —

Where it stands

The platform scaffold is real and runtime-exercised — 24/24 tests green across Phases 0–3, a 52-decision register, and a full production-shaped DevOps and observability stack — which is exactly the part of a metered SaaS that is most expensive to retrofit later.

The product itself is in progress: engine adapters, the extraction pipeline, and the planned feature surface are still ahead of me. I frame CiteStreak as a tested foundation in progress, and that framing is the point — the correctness primitives are in place before the features land on top of them.

05 —

Stack

NestJS 11 + Fastify · Drizzle / PostgreSQL 16 · Temporal · BullMQ + Redis · Stripe · OpenTelemetry / Grafana
