Neon → Kwiz Quants Site Integration
Status: Design proposal Owner: Kwiz Quants Last updated: 2026-05-07
1. Goal
Surface live, auto-updating data from the Kwiz Quants Neon Postgres pipeline on quants.kwizresearch.com:
- Headline counters on the Strategy Library section of
index.qmd(combinations scanned, instruments covered, % surviving, etc.) reflect the current state of the pipeline rather than hand-edited numbers. - Strategy cards (“Lead-Lag Pairs”, “Cointegration Pairs”, etc.) carry a live status badge — how many strategies in that family are queued / running / completed / promoted — instead of the static In Pipeline label.
- Per-strategy “glimpse” cards / page sections for completed strategies show a high-level subset of the deep-dive: equity curve thumbnail, headline risk/return metrics, symbol + timeframe. Visitors can browse progress without ever seeing proprietary signal logic.
This is not a real-time trading dashboard and not a logged-in customer portal. It’s a public marketing surface that is honest about progress.
2. Constraints
| Constraint | Implication |
|---|---|
| Site is static Quarto, deployed to Firebase Hosting | No server-side rendering at request time. |
| Neon DB credentials must never ship to the client | No browser-side pg client; no JWT-with-DB-access pattern. |
| Numbers should be fresh (hours, not days) | A pure render-time snapshot is too stale unless we redeploy frequently. |
| Strategy signal logic, parameters, fold splits must stay private | Public payloads are pre-filtered server-side; no raw row dump. |
Existing Firebase project (fit-asset-476910-t5) already has Functions |
We can reuse content-dashboard/functions/ runtime for a Neon proxy. |
Quants R repo already has a clean Postgres adapter (R/utils/cloud_store.R) |
We can run an R-side exporter as the source of truth, not re-implement queries in Node. |
3. Source of Truth
From Kwiz Quants/R/utils/cloud_store.R and dashboard/app/logic/pipeline_queue.R:
pipeline_queue(Postgres in cloud mode):key,sym_tf_group,source(ma_cross/ll/corr/kalman/ …),timeframe,sharpe_priority,required_symbols,first_sym,second_sym,status(pending/claimed/done/failed/skipped),claimed_by,claimed_at,completed_at.strategy_index(Postgres, lightweight):key+ per-strategy metric columns updated by runners —total_return_pct,annualized_return_pct,max_drawdown_pct,sharpe_ratio,sortino_ratio,calmar_ratio,profit_factor,win_rate_pct,n_trades,avg_trade,expectancy,total_costs,last_updated.all_strategies_deduped.parquet(DuckDB-queryable, on disk in the Quants repo): the richer per-strategy metric store; not in Neon today.- Connection:
NEON_DATABASE_URL(withsslmode=require), viaRPostgres. Helperqueue_progress()already returns the exact totals we want for the headline counters.
Key point: the queue and the index are already in Neon. The expensive metrics live in parquet inside the Quants repo. The website integration must be designed around what’s actually in Neon and a thin “publisher” pattern for whatever else we want to expose.
4. What to Surface, and Where
4.1 sites/quants/index.qmd — Strategy Library counter strip
Replace the four hand-edited <div> counters (currently 52,248, 105, <1%, 6) with values from a single JSON artifact:
| Counter | Source |
|---|---|
| Combinations in pipeline | SELECT count(*) FROM pipeline_queue |
| Completed | ... WHERE status = 'done' |
| In progress | ... WHERE status = 'claimed' |
| Failed / skipped | ... WHERE status IN ('failed','skipped') |
| Pass rate (top tier) | count(strategy_index WHERE crs_tier='top') / count(done) once crs_tier is exported |
| Instruments covered | count(distinct first_sym) + pairs |
| Last updated | max(completed_at) |
The “6 robustness dimensions” figure is a constant about the methodology, not a DB count — keep it static.
4.2 Strategy cards (“Lead-Lag Pairs”, etc.)
Each card maps to a source value (ll, ma_cross, corr, kalman, …). For each card surface: - queued / in-progress / completed counts (status histogram, scoped by source), - a colored badge: Queued (only pending), Running (any claimed), Backtested (any done), or Validated (any with CRS ≥ gate, once exported).
4.3 New section: “Pipeline Progress — Recent Completions”
A horizontally-scrolling row of glimpse cards, one per recently done strategy: - Strategy family (source → human label). - sym_tf_group rendered as e.g. EURUSD · D1 or EURUSD/GBPUSD · H4. - Sharpe ratio, max drawdown, total return — to one decimal. - Equity curve thumbnail (~200×80 SVG/PNG sparkline, no axis labels). - “Completed 2 days ago” relative timestamp. - No entry/exit logic, no parameters, no fold-by-fold breakdown, no DSR / PBO numerics.
4.4 Optional later: per-strategy “glimpse” pages
sites/quants/strategies/<key>/index.qmd rendered at build time from a per-key JSON file. Same content as the card, plus a slightly larger equity curve, a returns-distribution histogram, and a top-line CRS band (Top / Validated / Monitored) — never the underlying score numbers.
5. Architecture Options
| Option | How it works | Pros | Cons |
|---|---|---|---|
| A. Build-time R query in CI | The deploy workflow runs an R script before quarto render that reads NEON_DATABASE_URL, queries Neon, writes JSON files into sites/quants/_data/ (or similar), and Quarto picks them up via shortcode/JS. |
Zero extra runtime infra. Numbers committed nowhere. Cheapest. | Stale until next push (and CI only fires on main push). Adds R toolchain to the deploy workflow (currently Quarto-only). DB credential is a GitHub Action secret, fine. |
| B. Pre-rendered JSON artifacts published by the Quants repo | The Quants repo (which already has Neon access, R, and parquet) runs a scheduled job that writes a small set of public JSON files, then git pushes them into this repo (or uploads to a public Firebase Storage bucket / CDN). The website fetches the JSON client-side. |
Source of truth lives where the data lives. Update cadence is independent of website deploys. Can include parquet-derived metrics (Sharpe, drawdown) that aren’t in Neon. No DB credential anywhere near the website. | Two-repo coupling. Need a publish mechanism (GHA cross-repo push, or upload to GCS). Client-side fetch means the page flashes empty until JSON arrives. |
| C. Firebase Functions proxy | New endpoint under content-dashboard/functions/ (e.g. /api/quants/stats) connects to Neon server-side and returns JSON. The static site fetches it with <script>. |
Real-time freshness. Cleanly fits the existing Functions infra. Easy to add caching. | Adds a Postgres dependency to a Node Functions runtime that today doesn’t have one. Per-request DB connections to Neon are slow without pooling — we’d want Neon’s HTTP/serverless driver (@neondatabase/serverless) and short cache TTLs. Cold-start latency on every visitor. |
| D. Plumber API in Quants repo, fronted by Cloud Run | Extend dashboard/plumber/api.R (or a new microservice) with read-only public endpoints, deploy to Cloud Run, point the website at it via CORS. |
Single language (R) all the way through. Reuses queue progress helpers verbatim. | New service to operate, scale, monitor. Overkill for a 4-counter strip. |
| E. Browser-direct to Neon via PostgREST or Hasura | Stand up PostgREST in front of a read-only role with row-level security; site fetches JSON directly. | Pure-static, real-time, no custom backend code. | Yet another service. RLS rules become the security boundary. Far more surface area than the use case justifies. |
5.1 Why not just plug pg into the browser?
Even with a “read-only” role, putting a Postgres credential into a public JS bundle is a hard no — it lets anyone open a psql session with that role’s grants, run pg_sleep(...), exhaust the connection pool, and discover schema we’d rather not advertise. Every option above keeps the credential on the server side (CI secret, Functions env, Cloud Run env).
6. Recommendation
Hybrid: Option B (pre-rendered JSON, published from Quants) for the slow-changing payload, with a thin Option C (Firebase Functions proxy) only if/when we want true live numbers later.
Reasoning: - The headline counters and per-source histograms move on the order of hours, not seconds. JSON refreshed every 30 min is indistinguishable from “live” for a marketing site. - The expensive metrics (Sharpe, drawdown, equity curve points) live in parquet inside the Quants repo, not Neon. Generating them needs R + DuckDB there. A publisher in that repo is the natural seam. - Keeps deploy.yml clean — it stays a pure Quarto build. No R toolchain in the website CI. - Keeps Neon credentials confined to the Quants repo’s existing secrets (where they already live). - Gives us a single, versioned JSON contract between the two repos. Easy to audit what’s public.
Defer Option C until either (a) we want sub-minute freshness on a specific counter (e.g. a live “claimed right now” ticker) or (b) we want gated features for logged-in clients (which belongs on the dashboard subdomain anyway, not here).
7. Recommended Architecture
┌──────────────────────────────────┐
│ Kwiz Quants repo │
│ ──────────────── │
│ scripts/publish_public_stats.R │
│ │ reads: │
│ │ • Neon pipeline_queue │
│ │ • Neon strategy_index │
│ │ • parquet strategy store │
│ │ filters → public-safe │
│ ▼ │
│ data/public/ │
│ ├─ stats.json │
│ ├─ sources.json │
│ └─ strategies/<key>.json │
└─────────────┬────────────────────┘
│ GHA: cron 30m
│ upload to GCS or cross-repo PR
▼
┌──────────────────────────────────┐
│ Firebase Storage / CDN │
│ https://quants.kwizresearch.com │
│ /data/stats.json │
└─────────────┬────────────────────┘
│ fetch() at page load
▼
┌──────────────────────────────────┐
│ sites/quants/index.qmd │
│ <div data-stat="combinations" │
│ … │
│ small JS hydrates counters, │
│ builds glimpse cards. │
└──────────────────────────────────┘
7.1 JSON contract (sketch)
stats.json — counters strip + last-updated badge:
{
"generated_at": "2026-05-07T12:00:00Z",
"queue": {
"total": 52248, "pending": 41112, "claimed": 38, "done": 10894,
"failed": 168, "skipped": 36
},
"instruments": { "count": 105, "asset_classes": 9 },
"validation": { "passing_top_tier_pct": 0.7 }
}sources.json — strategy-card status histograms:
[
{ "source": "ll", "label": "Lead-Lag Pairs", "queued": 8120, "running": 12, "done": 2104, "validated": 18, "badge": "backtested" },
{ "source": "corr", "label": "Correlation Pairs", "queued": 6044, "running": 6, "done": 1430, "validated": 7, "badge": "running" },
{ "source": "kalman", "label": "Cointegration Pairs","queued": 7211, "running": 9, "done": 1801, "validated": 11, "badge": "backtested" },
{ "source": "ma_cross", "label": "ML-Based Strategies","queued": 5602, "running": 11, "done": 1209, "validated": 4, "badge": "running" }
]strategies/<key>.json — one per recently completed strategy (rolling window of, say, 50):
{
"key": "ll_EURUSD_GBPUSD_H4_v2",
"source": "ll",
"label": "Lead-Lag Pairs · EURUSD/GBPUSD · H4",
"completed_at": "2026-05-05T08:14:00Z",
"metrics": {
"sharpe_ratio": 1.12,
"max_drawdown_pct": -8.4,
"annualized_return_pct": 14.6,
"win_rate_pct": 54.0,
"n_trades": 412
},
"equity_curve": [/* ~120 sparkline points, normalized 0..1 */],
"tier": "validated"
}What is deliberately not in any payload: required_symbols (the full portfolio), parameter values, fold-level breakdowns, DSR / PBO numerics, claimed_by (runner identity), MT5 tester raw output. The publisher script is the security boundary and must explicitly whitelist columns.
7.2 Hydration on the static site
Add sites/quants/quants-stats.js (small, no framework). On DOMContentLoaded: 1. fetch('/data/stats.json') and sources.json. 2. For each [data-stat="..."], write the value. 3. For each .strategy-card[data-source="..."], swap the badge text/class. 4. For #recent-completions, render N glimpse cards from strategies/*.json (or a single rolled-up recent.json). 5. On fetch failure, leave the existing static numbers in place (graceful degradation — the site is never broken by the data layer being down).
Wire it via _quarto.yml:
format:
html:
include-after-body:
- text: |
<script src="quants-stats.js" defer></script>7.3 Where the JSON lives
Two viable hosts:
- Same Firebase Hosting site (
quants.kwizresearch.com/data/...). Pro: one origin, no CORS. Con: requires the JSON to be in the website’s_site/at deploy time, which means either a build-step fetch or a cross-repo PR per refresh. - Public GCS bucket under the existing project, e.g.
https://storage.googleapis.com/kwiz-quants-public/stats.json. Pro: independent update cadence, no website redeploy. Con: CORS preflight- a second hostname.
Recommend GCS — it matches Option B’s “publish independently” premise.
8. Security
| Concern | Mitigation |
|---|---|
Leaking the NEON_DATABASE_URL |
Never present in this repo, in CI secrets here, or in any client-side bundle. Stays in the Quants repo’s secrets where it already lives. |
| A read-only DB role | Create a kwiz_public_reader role in Neon with SELECT on pipeline_queue and strategy_index only — no other tables, no functions. The publisher uses this role. |
| Connection abuse | The publisher runs once per cron tick, not per visitor. No public traffic ever hits Neon. |
| Schema leakage via JSON | The publisher script defines an explicit allowlist of columns/keys. It is the only place that touches Postgres for public output. Anything new appearing in stats.json requires a code change there. |
| Alpha leakage | required_symbols, parameters, fold-level metrics, DSR/PBO, MT5 tester raw — none ever serialized. CRS is exposed as a tier name (Top / Validated / Monitored), not a number. |
claimed_by (runner identity) |
Aggregated only — the public payload exposes counts, never per-runner rows. |
| Tampered JSON | Hosted on GCS under our project; integrity inherited from the bucket’s IAM. Optionally subresource-integrity-hash the JS, not the JSON. |
| CORS | If served from GCS, set Access-Control-Allow-Origin: https://quants.kwizresearch.com on the bucket. Don’t use *. |
9. Is this reasonable for a static site?
Yes — this is the standard pattern for “static site that needs live-ish data”:
- The HTML/CSS/Quarto build stays fully static and CI-friendly.
- The data layer is a JSON file at a known URL.
- The publisher is decoupled, run on a cadence, owned by the team that owns the data.
- If the publisher is ever down, the page degrades gracefully to the existing hand-coded numbers — visitors see something credible, not a blank.
- No new long-running service to operate.
The main thing we lose vs. a dynamic site is sub-minute freshness. For this audience (prospective clients reading marketing copy), that doesn’t matter.
10. Implementation Phases
Phase 1 — Static-to-data refactor of the page (no DB yet) ~½ day
- Move the four counter values out of
index.qmdand into a checked-insites/quants/_data/stats.jsonplaceholder. - Add
quants-stats.jsthat reads it viafetch()and hydrates[data-stat="..."]elements. - Verify the page is unchanged visually (same numbers, just sourced from JSON now). This decouples the structural change from the data change.
Phase 2 — Publisher in the Kwiz Quants repo ~1 day
- New script
Kwiz Quants/scripts/publish_public_stats.R. Reusescloud_store_init()andqueue_progress(). Writesstats.jsonandsources.jsonto a localdata/public/dir. - Explicit allowlist of columns. Unit-test the redaction.
- New Neon role
kwiz_public_readerwithSELECTon the two tables.
Phase 3 — Publish & wire ~½ day
- GHA workflow in the Quants repo:
cron: '*/30 * * * *', runs the script, uploads to a public GCS bucket via existing service account. - Update
quants-stats.jsto fetch from the GCS URL. - CORS header on the bucket.
Phase 4 — Strategy-card live badges ~½ day
- Tag each
.strategy-cardwithdata-source="ll"etc. - Extend
quants-stats.jsto set badge text/class fromsources.json.
Phase 5 — Recent-completions row ~1 day
- Extend the publisher to emit
recent.json(top N most recentlydonestrategies, with redacted metrics + sparkline data from the parquet store). - Add a “Pipeline Progress” section to
index.qmdrendered client-side.
Phase 6 (optional, later) — per-strategy glimpse pages
- Quarto pre-render at build time using
strategies/*.json. Or stay client-side. Decide based on whether we want each glimpse to be its own shareable URL.
Phase 7 (optional, much later) — Firebase Functions proxy
- Only if we want sub-minute live counters (e.g. a “claimed right now” ticker). Reuse
content-dashboard/functions/. Use@neondatabase/serverless. Aggressive caching (e.g. 60sCache-Control).
11. Open Questions
- Does the Quants repo’s GHA runner have
gcloud+ a service account configured for the bucket we’d write to, or does that need new IAM? - Where should the
kwiz_public_readerrole’s password be stored — same secret manager asNEON_DATABASE_URL, or split? - For the equity-curve sparklines: are we comfortable publishing 120 normalized points per completed strategy? If yes, the publisher needs to read parquet (not just Neon). If no, the glimpse cards drop the curve and show metrics-only.
- Do we want
recent.jsonto filter on a tier gate (only show strategies with CRS ≥ 0.50)? Probably yes — “recent completions” should mean “interesting recent completions”, not “the last 50 strategies the runner finished, including the ones that flunked”.
12. Non-Goals
- Real-time tick streaming.
- Logged-in customer features (those belong on
admin.kwizresearch.com). - Embedding the Shiny dashboard in the marketing page.
- Historical time-series of pipeline progress (a chart of “% done over time”). Nice eventually, not in scope here.