Neon → Kwiz Quants Site Integration

Status: Design proposal Owner: Kwiz Quants Last updated: 2026-05-07

1. Goal

Surface live, auto-updating data from the Kwiz Quants Neon Postgres pipeline on quants.kwizresearch.com:

Headline counters on the Strategy Library section of index.qmd (combinations scanned, instruments covered, % surviving, etc.) reflect the current state of the pipeline rather than hand-edited numbers.
Strategy cards (“Lead-Lag Pairs”, “Cointegration Pairs”, etc.) carry a live status badge — how many strategies in that family are queued / running / completed / promoted — instead of the static In Pipeline label.
Per-strategy “glimpse” cards / page sections for completed strategies show a high-level subset of the deep-dive: equity curve thumbnail, headline risk/return metrics, symbol + timeframe. Visitors can browse progress without ever seeing proprietary signal logic.

This is not a real-time trading dashboard and not a logged-in customer portal. It’s a public marketing surface that is honest about progress.

2. Constraints

Constraint	Implication
Site is static Quarto, deployed to Firebase Hosting	No server-side rendering at request time.
Neon DB credentials must never ship to the client	No browser-side `pg` client; no JWT-with-DB-access pattern.
Numbers should be fresh (hours, not days)	A pure render-time snapshot is too stale unless we redeploy frequently.
Strategy signal logic, parameters, fold splits must stay private	Public payloads are pre-filtered server-side; no raw row dump.
Existing Firebase project (`fit-asset-476910-t5`) already has Functions	We can reuse `content-dashboard/functions/` runtime for a Neon proxy.
Quants R repo already has a clean Postgres adapter (`R/utils/cloud_store.R`)	We can run an R-side exporter as the source of truth, not re-implement queries in Node.

3. Source of Truth

From Kwiz Quants/R/utils/cloud_store.R and dashboard/app/logic/pipeline_queue.R:

pipeline_queue (Postgres in cloud mode): key, sym_tf_group, source (ma_cross / ll / corr / kalman / …), timeframe, sharpe_priority, required_symbols, first_sym, second_sym, status (pending / claimed / done / failed / skipped), claimed_by, claimed_at, completed_at.
strategy_index (Postgres, lightweight): key + per-strategy metric columns updated by runners — total_return_pct, annualized_return_pct, max_drawdown_pct, sharpe_ratio, sortino_ratio, calmar_ratio, profit_factor, win_rate_pct, n_trades, avg_trade, expectancy, total_costs, last_updated.
all_strategies_deduped.parquet (DuckDB-queryable, on disk in the Quants repo): the richer per-strategy metric store; not in Neon today.
Connection: NEON_DATABASE_URL (with sslmode=require), via RPostgres. Helper queue_progress() already returns the exact totals we want for the headline counters.

Key point: the queue and the index are already in Neon. The expensive metrics live in parquet inside the Quants repo. The website integration must be designed around what’s actually in Neon and a thin “publisher” pattern for whatever else we want to expose.

4. What to Surface, and Where

4.1 `sites/quants/index.qmd` — Strategy Library counter strip

Replace the four hand-edited <div> counters (currently 52,248, 105, <1%, 6) with values from a single JSON artifact:

Counter	Source
Combinations in pipeline	`SELECT count(*) FROM pipeline_queue`
Completed	`... WHERE status = 'done'`
In progress	`... WHERE status = 'claimed'`
Failed / skipped	`... WHERE status IN ('failed','skipped')`
Pass rate (top tier)	`count(strategy_index WHERE crs_tier='top') / count(done)` once `crs_tier` is exported
Instruments covered	`count(distinct first_sym) + pairs`
Last updated	`max(completed_at)`

The “6 robustness dimensions” figure is a constant about the methodology, not a DB count — keep it static.

4.2 Strategy cards (“Lead-Lag Pairs”, etc.)

Each card maps to a source value (ll, ma_cross, corr, kalman, …). For each card surface: - queued / in-progress / completed counts (status histogram, scoped by source), - a colored badge: Queued (only pending), Running (any claimed), Backtested (any done), or Validated (any with CRS ≥ gate, once exported).

4.3 New section: “Pipeline Progress — Recent Completions”

A horizontally-scrolling row of glimpse cards, one per recently done strategy: - Strategy family (source → human label). - sym_tf_group rendered as e.g. EURUSD · D1 or EURUSD/GBPUSD · H4. - Sharpe ratio, max drawdown, total return — to one decimal. - Equity curve thumbnail (~200×80 SVG/PNG sparkline, no axis labels). - “Completed 2 days ago” relative timestamp. - No entry/exit logic, no parameters, no fold-by-fold breakdown, no DSR / PBO numerics.

4.4 Optional later: per-strategy “glimpse” pages

sites/quants/strategies/<key>/index.qmd rendered at build time from a per-key JSON file. Same content as the card, plus a slightly larger equity curve, a returns-distribution histogram, and a top-line CRS band (Top / Validated / Monitored) — never the underlying score numbers.

5. Architecture Options

Option	How it works	Pros	Cons
A. Build-time R query in CI	The deploy workflow runs an R script before `quarto render` that reads `NEON_DATABASE_URL`, queries Neon, writes JSON files into `sites/quants/_data/` (or similar), and Quarto picks them up via shortcode/JS.	Zero extra runtime infra. Numbers committed nowhere. Cheapest.	Stale until next push (and CI only fires on `main` push). Adds R toolchain to the deploy workflow (currently Quarto-only). DB credential is a GitHub Action secret, fine.
B. Pre-rendered JSON artifacts published by the Quants repo	The Quants repo (which already has Neon access, R, and parquet) runs a scheduled job that writes a small set of public JSON files, then `git push`es them into this repo (or uploads to a public Firebase Storage bucket / CDN). The website fetches the JSON client-side.	Source of truth lives where the data lives. Update cadence is independent of website deploys. Can include parquet-derived metrics (Sharpe, drawdown) that aren’t in Neon. No DB credential anywhere near the website.	Two-repo coupling. Need a publish mechanism (GHA cross-repo push, or upload to GCS). Client-side fetch means the page flashes empty until JSON arrives.
C. Firebase Functions proxy	New endpoint under `content-dashboard/functions/` (e.g. `/api/quants/stats`) connects to Neon server-side and returns JSON. The static site fetches it with `<script>`.	Real-time freshness. Cleanly fits the existing Functions infra. Easy to add caching.	Adds a Postgres dependency to a Node Functions runtime that today doesn’t have one. Per-request DB connections to Neon are slow without pooling — we’d want Neon’s HTTP/serverless driver (`@neondatabase/serverless`) and short cache TTLs. Cold-start latency on every visitor.
D. Plumber API in Quants repo, fronted by Cloud Run	Extend `dashboard/plumber/api.R` (or a new microservice) with read-only public endpoints, deploy to Cloud Run, point the website at it via CORS.	Single language (R) all the way through. Reuses queue progress helpers verbatim.	New service to operate, scale, monitor. Overkill for a 4-counter strip.
E. Browser-direct to Neon via PostgREST or Hasura	Stand up PostgREST in front of a read-only role with row-level security; site fetches JSON directly.	Pure-static, real-time, no custom backend code.	Yet another service. RLS rules become the security boundary. Far more surface area than the use case justifies.

5.1 Why not just plug `pg` into the browser?

Even with a “read-only” role, putting a Postgres credential into a public JS bundle is a hard no — it lets anyone open a psql session with that role’s grants, run pg_sleep(...), exhaust the connection pool, and discover schema we’d rather not advertise. Every option above keeps the credential on the server side (CI secret, Functions env, Cloud Run env).

6. Recommendation

Hybrid: Option B (pre-rendered JSON, published from Quants) for the slow-changing payload, with a thin Option C (Firebase Functions proxy) only if/when we want true live numbers later.

Reasoning: - The headline counters and per-source histograms move on the order of hours, not seconds. JSON refreshed every 30 min is indistinguishable from “live” for a marketing site. - The expensive metrics (Sharpe, drawdown, equity curve points) live in parquet inside the Quants repo, not Neon. Generating them needs R + DuckDB there. A publisher in that repo is the natural seam. - Keeps deploy.yml clean — it stays a pure Quarto build. No R toolchain in the website CI. - Keeps Neon credentials confined to the Quants repo’s existing secrets (where they already live). - Gives us a single, versioned JSON contract between the two repos. Easy to audit what’s public.

Defer Option C until either (a) we want sub-minute freshness on a specific counter (e.g. a live “claimed right now” ticker) or (b) we want gated features for logged-in clients (which belongs on the dashboard subdomain anyway, not here).

7. Recommended Architecture

                       ┌──────────────────────────────────┐
                       │  Kwiz Quants repo                │
                       │  ────────────────                │
                       │  scripts/publish_public_stats.R  │
                       │     │  reads:                    │
                       │     │   • Neon pipeline_queue    │
                       │     │   • Neon strategy_index    │
                       │     │   • parquet strategy store │
                       │     │  filters → public-safe     │
                       │     ▼                            │
                       │  data/public/                    │
                       │   ├─ stats.json                  │
                       │   ├─ sources.json                │
                       │   └─ strategies/<key>.json       │
                       └─────────────┬────────────────────┘
                                     │  GHA: cron 30m
                                     │  upload to GCS or cross-repo PR
                                     ▼
                       ┌──────────────────────────────────┐
                       │  Firebase Storage / CDN          │
                       │  https://quants.kwizresearch.com │
                       │      /data/stats.json            │
                       └─────────────┬────────────────────┘
                                     │  fetch() at page load
                                     ▼
                       ┌──────────────────────────────────┐
                       │  sites/quants/index.qmd          │
                       │   <div data-stat="combinations"  │
                       │   …                              │
                       │   small JS hydrates counters,    │
                       │   builds glimpse cards.          │
                       └──────────────────────────────────┘

7.1 JSON contract (sketch)

stats.json — counters strip + last-updated badge:

{
  "generated_at": "2026-05-07T12:00:00Z",
  "queue": {
    "total": 52248, "pending": 41112, "claimed": 38, "done": 10894,
    "failed": 168, "skipped": 36
  },
  "instruments": { "count": 105, "asset_classes": 9 },
  "validation": { "passing_top_tier_pct": 0.7 }
}

sources.json — strategy-card status histograms:

[
  { "source": "ll",       "label": "Lead-Lag Pairs",     "queued": 8120, "running": 12, "done": 2104, "validated": 18, "badge": "backtested" },
  { "source": "corr",     "label": "Correlation Pairs",  "queued": 6044, "running":  6, "done": 1430, "validated":  7, "badge": "running"     },
  { "source": "kalman",   "label": "Cointegration Pairs","queued": 7211, "running":  9, "done": 1801, "validated": 11, "badge": "backtested" },
  { "source": "ma_cross", "label": "ML-Based Strategies","queued": 5602, "running": 11, "done": 1209, "validated":  4, "badge": "running"     }
]

strategies/<key>.json — one per recently completed strategy (rolling window of, say, 50):

{
  "key": "ll_EURUSD_GBPUSD_H4_v2",
  "source": "ll",
  "label": "Lead-Lag Pairs · EURUSD/GBPUSD · H4",
  "completed_at": "2026-05-05T08:14:00Z",
  "metrics": {
    "sharpe_ratio": 1.12,
    "max_drawdown_pct": -8.4,
    "annualized_return_pct": 14.6,
    "win_rate_pct": 54.0,
    "n_trades": 412
  },
  "equity_curve": [/* ~120 sparkline points, normalized 0..1 */],
  "tier": "validated"
}

What is deliberately not in any payload: required_symbols (the full portfolio), parameter values, fold-level breakdowns, DSR / PBO numerics, claimed_by (runner identity), MT5 tester raw output. The publisher script is the security boundary and must explicitly whitelist columns.

7.2 Hydration on the static site

Add sites/quants/quants-stats.js (small, no framework). On DOMContentLoaded: 1. fetch('/data/stats.json') and sources.json. 2. For each [data-stat="..."], write the value. 3. For each .strategy-card[data-source="..."], swap the badge text/class. 4. For #recent-completions, render N glimpse cards from strategies/*.json (or a single rolled-up recent.json). 5. On fetch failure, leave the existing static numbers in place (graceful degradation — the site is never broken by the data layer being down).

Wire it via _quarto.yml:

format:
  html:
    include-after-body:
      - text: |
          <script src="quants-stats.js" defer></script>

7.3 Where the JSON lives

Two viable hosts:

Same Firebase Hosting site (quants.kwizresearch.com/data/...). Pro: one origin, no CORS. Con: requires the JSON to be in the website’s _site/ at deploy time, which means either a build-step fetch or a cross-repo PR per refresh.
Public GCS bucket under the existing project, e.g. https://storage.googleapis.com/kwiz-quants-public/stats.json. Pro: independent update cadence, no website redeploy. Con: CORS preflight
- a second hostname.

Recommend GCS — it matches Option B’s “publish independently” premise.

8. Security

Concern	Mitigation
Leaking the `NEON_DATABASE_URL`	Never present in this repo, in CI secrets here, or in any client-side bundle. Stays in the Quants repo’s secrets where it already lives.
A read-only DB role	Create a `kwiz_public_reader` role in Neon with `SELECT` on `pipeline_queue` and `strategy_index` only — no other tables, no functions. The publisher uses this role.
Connection abuse	The publisher runs once per cron tick, not per visitor. No public traffic ever hits Neon.
Schema leakage via JSON	The publisher script defines an explicit allowlist of columns/keys. It is the only place that touches Postgres for public output. Anything new appearing in `stats.json` requires a code change there.
Alpha leakage	`required_symbols`, parameters, fold-level metrics, DSR/PBO, MT5 tester raw — none ever serialized. CRS is exposed as a tier name (Top / Validated / Monitored), not a number.
`claimed_by` (runner identity)	Aggregated only — the public payload exposes counts, never per-runner rows.
Tampered JSON	Hosted on GCS under our project; integrity inherited from the bucket’s IAM. Optionally subresource-integrity-hash the JS, not the JSON.
CORS	If served from GCS, set `Access-Control-Allow-Origin: https://quants.kwizresearch.com` on the bucket. Don’t use `*`.

9. Is this reasonable for a static site?

Yes — this is the standard pattern for “static site that needs live-ish data”:

The HTML/CSS/Quarto build stays fully static and CI-friendly.
The data layer is a JSON file at a known URL.
The publisher is decoupled, run on a cadence, owned by the team that owns the data.
If the publisher is ever down, the page degrades gracefully to the existing hand-coded numbers — visitors see something credible, not a blank.
No new long-running service to operate.

The main thing we lose vs. a dynamic site is sub-minute freshness. For this audience (prospective clients reading marketing copy), that doesn’t matter.

10. Implementation Phases

Phase 1 — Static-to-data refactor of the page (no DB yet) ~½ day

Move the four counter values out of index.qmd and into a checked-in sites/quants/_data/stats.json placeholder.
Add quants-stats.js that reads it via fetch() and hydrates [data-stat="..."] elements.
Verify the page is unchanged visually (same numbers, just sourced from JSON now). This decouples the structural change from the data change.

Phase 2 — Publisher in the Kwiz Quants repo ~1 day

New script Kwiz Quants/scripts/publish_public_stats.R. Reuses cloud_store_init() and queue_progress(). Writes stats.json and sources.json to a local data/public/ dir.
Explicit allowlist of columns. Unit-test the redaction.
New Neon role kwiz_public_reader with SELECT on the two tables.

Phase 3 — Publish & wire ~½ day

GHA workflow in the Quants repo: cron: '*/30 * * * *', runs the script, uploads to a public GCS bucket via existing service account.
Update quants-stats.js to fetch from the GCS URL.
CORS header on the bucket.

Phase 4 — Strategy-card live badges ~½ day

Tag each .strategy-card with data-source="ll" etc.
Extend quants-stats.js to set badge text/class from sources.json.

Phase 5 — Recent-completions row ~1 day

Extend the publisher to emit recent.json (top N most recently done strategies, with redacted metrics + sparkline data from the parquet store).
Add a “Pipeline Progress” section to index.qmd rendered client-side.

Phase 6 (optional, later) — per-strategy glimpse pages

Quarto pre-render at build time using strategies/*.json. Or stay client-side. Decide based on whether we want each glimpse to be its own shareable URL.

Phase 7 (optional, much later) — Firebase Functions proxy

Only if we want sub-minute live counters (e.g. a “claimed right now” ticker). Reuse content-dashboard/functions/. Use @neondatabase/serverless. Aggressive caching (e.g. 60s Cache-Control).

11. Open Questions

Does the Quants repo’s GHA runner have gcloud + a service account configured for the bucket we’d write to, or does that need new IAM?
Where should the kwiz_public_reader role’s password be stored — same secret manager as NEON_DATABASE_URL, or split?
For the equity-curve sparklines: are we comfortable publishing 120 normalized points per completed strategy? If yes, the publisher needs to read parquet (not just Neon). If no, the glimpse cards drop the curve and show metrics-only.
Do we want recent.json to filter on a tier gate (only show strategies with CRS ≥ 0.50)? Probably yes — “recent completions” should mean “interesting recent completions”, not “the last 50 strategies the runner finished, including the ones that flunked”.

12. Non-Goals

Real-time tick streaming.
Logged-in customer features (those belong on admin.kwizresearch.com).
Embedding the Shiny dashboard in the marketing page.
Historical time-series of pipeline progress (a chart of “% done over time”). Nice eventually, not in scope here.