REX

REX is a small LLM orchestration project with a FastAPI backend, a React/Vite frontend, and an optional Go fanout service for high-throughput parallel model calls.

It’s built to be easy to run locally, while still including practical engineering features:

Provider abstraction (swap direct calls vs Go fanout)
Deadlines + cancellation for sub-queries
Structured JSON logs, Prometheus metrics, and optional OpenTelemetry tracing
A tiny deterministic evaluation harness you can run in CI

Architecture

flowchart LR
  UI[React/Vite UI] -->|HTTP/WebSocket| API[FastAPI API]
  API --> ORCH[Orchestrator + Pipeline]
  ORCH -->|Provider abstraction| PROV[Gemini Provider]
  PROV -->|Optional| GO[Go fanout-service]
  GO --> GEM[Gemini API]

  API -->|optional| REDIS[(Redis)]
  REDIS -->|pubsub| WS[WebSocket clients]

  API --> METRICS[/Prometheus metrics/]
  GO --> METRICS
  API --> TRACE[(OpenTelemetry spans)]
  GO --> TRACE

Where to look in the code:

Backend entrypoint: src/main.py
API routes + orchestration pipeline: src/api/routes.py
Provider layer (Go fanout vs direct): src/providers/gemini.py
Go fanout-service: go/fanout-service/main.go

Quickstart

1) Backend (FastAPI)

From recursion/:

python -m uvicorn src.main:app --reload --host 0.0.0.0 --port 8000

2) Frontend (React/Vite)

From recursion/frontend/:

npm install
npm run dev

Open the UI at http://localhost:5173.

3) Optional: Go fanout-service (high throughput)

From recursion/go/fanout-service/:

go run .

Enable it from the Python side by setting FANOUT_URL. On Windows (Command Prompt):

set FANOUT_URL=http://127.0.0.1:8099

On macOS/Linux:

export FANOUT_URL=http://127.0.0.1:8099
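How the provider layer reacts to FANOUT_URL is not spelled out here. As a minimal sketch, assuming it simply checks the variable and falls back to direct calls when it is unset, the switch could look like this. The names ProviderChoice and choose_provider are hypothetical and are not taken from src/providers/gemini.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProviderChoice:
    """Illustrative result type; not part of the REX codebase."""
    mode: str                 # "fanout" or "direct"
    base_url: Optional[str]   # only set in fanout mode


def choose_provider(env: Mapping[str, str] = os.environ) -> ProviderChoice:
    """Pick the Go fanout path when FANOUT_URL is set, else direct calls."""
    url = env.get("FANOUT_URL", "").strip()
    if url:
        # Normalize a trailing slash so paths can be appended safely.
        return ProviderChoice(mode="fanout", base_url=url.rstrip("/"))
    return ProviderChoice(mode="direct", base_url=None)
```

Passing the environment as a mapping keeps the function easy to test without mutating the real process environment.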

Configuration

Required (for real model calls)

GEMINI_API_KEY (or GOOGLE_API_KEY)

Optional performance & reliability

FANOUT_URL — if set, routes model fanout through the Go service
SUBQUERY_DEADLINE_MS — per-subquery hard deadline (default 30000)
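One common way to enforce a per-subquery deadline in an asyncio backend is asyncio.wait_for, which cancels the awaited call when the timeout expires. The sketch below assumes that approach; the helper names deadline_seconds and run_with_deadline are hypothetical, and the actual pipeline may enforce the deadline differently.

```python
import asyncio
import os
from typing import Awaitable, Optional, TypeVar

T = TypeVar("T")


def deadline_seconds(default_ms: int = 30000) -> float:
    """Read SUBQUERY_DEADLINE_MS, falling back to the documented default."""
    raw = os.environ.get("SUBQUERY_DEADLINE_MS", "")
    try:
        ms = int(raw)
    except ValueError:
        ms = default_ms
    # Guard against zero or negative values (an assumption of this sketch).
    return max(ms, 1) / 1000.0


async def run_with_deadline(call: Awaitable[T], seconds: float) -> Optional[T]:
    """Await a sub-query; return None if it misses the deadline."""
    try:
        # wait_for cancels the underlying call when the timeout fires.
        return await asyncio.wait_for(call, timeout=seconds)
    except asyncio.TimeoutError:
        return None
```

Returning None on timeout lets the caller keep partial results from the sub-queries that did finish in time.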

Optional caching

REX_CACHE_ENABLED — set to 0/false to disable caching
REX_CACHE_TTL_SECONDS — cache TTL (default 86400)
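To make the two cache settings concrete, here is a small in-memory sketch. It assumes caching is on by default, that keys are derived from the client ID, prompt, and model list (matching the isolation described under "Client identity"), and that entries expire after the TTL. The function and class names are hypothetical, and the real cache (for example, when REDIS_URL is set) may work differently.

```python
import hashlib
import os
import time
from typing import Dict, Optional, Sequence, Tuple


def cache_enabled() -> bool:
    """Caching stays on unless REX_CACHE_ENABLED is 0 or false."""
    value = os.environ.get("REX_CACHE_ENABLED", "1").strip().lower()
    return value not in ("0", "false")


def cache_ttl_seconds() -> int:
    """Read REX_CACHE_TTL_SECONDS, defaulting to 86400 (one day)."""
    try:
        return int(os.environ.get("REX_CACHE_TTL_SECONDS", "86400"))
    except ValueError:
        return 86400


def cache_key(client_id: str, prompt: str, models: Sequence[str]) -> str:
    """Hash client ID + prompt + models so clients never share entries."""
    payload = "\x1f".join([client_id, prompt, *sorted(models)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._items: Dict[str, Tuple[float, str]] = {}

    def set(self, key: str, value: str, now: Optional[float] = None) -> None:
        now = time.monotonic() if now is None else now
        self._items[key] = (now + self._ttl, value)

    def get(self, key: str, now: Optional[float] = None) -> Optional[str]:
        now = time.monotonic() if now is None else now
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if now >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._items[key]
            return None
        return value
```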

Optional async jobs

REDIS_URL — enables Redis caching and (if RQ is installed and configured) async jobs

Optional tracing (OpenTelemetry)

OTEL_ENABLED=1 — turns on tracing (best-effort; safe to leave off)
OTEL_EXPORTER_OTLP_ENDPOINT — e.g. http://127.0.0.1:4318 (otherwise spans print to console)
OTEL_SERVICE_NAME — overrides service name (defaults to rex-api / Go defaults)

Observability

Structured logs

Both Python and Go emit JSON logs and propagate request IDs.
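As an illustration of JSON logs that carry a request ID, the sketch below uses only the standard library: a logging.Formatter that serializes each record to JSON and a contextvars.ContextVar that holds the current request ID. The field names and the logger name rex.sketch are assumptions; the fields REX actually emits may differ.

```python
import contextvars
import json
import logging
from typing import TextIO

# Holds the ID of the request being handled; "-" means "no request".
request_id_var: contextvars.ContextVar[str] = contextvars.ContextVar(
    "request_id", default="-"
)


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
                "request_id": request_id_var.get(),
            }
        )


def build_logger(stream: TextIO) -> logging.Logger:
    """Create a logger that writes JSON lines to the given stream."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("rex.sketch")
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

In a web handler, the request ID would typically be set once per request (for example, in middleware) and forwarded to downstream services in a header so both sides log the same value.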

Prometheus metrics

Python API metrics: GET http://127.0.0.1:8000/metrics
Go fanout-service metrics: GET http://127.0.0.1:8099/metrics

Distributed tracing

Tracing spans connect API → orchestrator → provider → Go fanout-service. Trace context is propagated using standard W3C headers (for example, traceparent).
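For reference, a version 00 traceparent header has four dash-separated lowercase hex fields: version, a 16-byte trace ID, an 8-byte parent span ID, and one byte of flags whose lowest bit marks the trace as sampled. The sketch below builds and parses that format with the standard library only. In REX the OpenTelemetry SDK would normally handle this, so the helper names here are purely illustrative.

```python
import re
import secrets
from typing import NamedTuple, Optional

# version - trace-id (32 hex) - parent-id (16 hex) - flags (2 hex)
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceContext(NamedTuple):
    trace_id: str
    parent_id: str
    sampled: bool


def new_traceparent(sampled: bool = True) -> str:
    """Create a fresh version-00 traceparent value with random IDs."""
    trace_id = secrets.token_hex(16)   # 32 hex characters
    span_id = secrets.token_hex(8)     # 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse a traceparent value; return None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero IDs are invalid per the specification.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 1))
```

A downstream service keeps the trace ID, records the incoming parent ID as the parent of its own span, and sends a new span ID onward.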

API surface (high level)

POST /api/run — run a query (sync)
POST /api/run-async — enqueue async job if Redis/RQ available (falls back to sync)
WebSocket — pushes query_started / query_partial / query_completed (and query_rate_limited)
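The request schema is not documented in this README, so the following is only a sketch of how a client might build a call to either endpoint. The JSON fields prompt and models are assumptions, as is the helper name build_run_request; check src/api/routes.py for the real payload. The optional X-Client-Id header is the one described in the next section.

```python
import json
import urllib.request
from typing import Optional, Sequence


def build_run_request(
    base_url: str,
    prompt: str,
    models: Sequence[str],
    client_id: Optional[str] = None,
    use_async: bool = False,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/run or /api/run-async."""
    path = "/api/run-async" if use_async else "/api/run"
    # Hypothetical body shape; the real schema may differ.
    body = json.dumps({"prompt": prompt, "models": list(models)}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if client_id:
        headers["X-Client-Id"] = client_id
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=body, headers=headers, method="POST"
    )
```

With the backend running, the request can be sent with urllib.request.urlopen(build_run_request("http://127.0.0.1:8000", "hello", ["some-model"])).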

Client identity (no auth)

For multi-tenant readiness, you can pass a lightweight client identity header:

X-Client-Id: your-workspace-name

This identity is used for:

Per-client in-memory rate limiting (REX_CLIENT_QPS, REX_CLIENT_BURST, optional REX_CLIENT_LIMITS_JSON)
Cache key isolation (same prompt + models but different client IDs won’t share cached results)
Per-client metrics (client IDs are hashed before being used as metric labels)
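The QPS and burst settings suggest a token bucket, although the README does not name the algorithm. Under that assumption, a per-client in-memory limiter and a hashed metric label could be sketched as follows. All class and function names are hypothetical, and the 12-character label length is an arbitrary choice for the example.

```python
import hashlib
import time
from typing import Dict, Optional


class TokenBucket:
    """Refills at `qps` tokens per second, holding at most `burst` tokens."""

    def __init__(self, qps: float, burst: int, now: float) -> None:
        self.qps = qps
        self.burst = burst
        self.tokens = float(burst)   # start full so bursts are allowed
        self.updated = now

    def allow(self, now: float) -> bool:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.qps)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class ClientRateLimiter:
    """Keeps one bucket per client ID, created lazily on first use."""

    def __init__(self, qps: float, burst: int) -> None:
        self.qps = qps
        self.burst = burst
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, client_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        bucket = self._buckets.get(client_id)
        if bucket is None:
            bucket = self._buckets[client_id] = TokenBucket(self.qps, self.burst, now)
        return bucket.allow(now)


def metric_label(client_id: str) -> str:
    """Hash the client ID so raw identifiers never appear in metric labels."""
    return hashlib.sha256(client_id.encode("utf-8")).hexdigest()[:12]
```

Because the buckets live in process memory, limits in a sketch like this apply per API process rather than across a cluster.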

Tests & evaluation

Unit tests

From recursion/:

python -m pytest -q

Eval harness (deterministic)

The eval harness runs without external network calls (it uses the simulated pipeline), so it’s stable in CI.

python -m pytest -m eval

Dataset: eval/dataset.jsonl

CI

GitHub Actions runs:

Python: ruff (bug gate) + mypy (baseline) + pytest
Go: golangci-lint (govet baseline) + go test
Frontend: npm run build

Workflow: .github/workflows/ci.yml

Load/perf scripts

scripts/fanout_smoke_test.py — quick contract smoke test for fanout-service
scripts/fanout_load_test.py — produces throughput/latency artifacts under results/
