Skip to content

feat(query): safe Query Explorer — structured DSL → parameterized SQL (#10)#172

Open
LeoLin990405 wants to merge 2 commits into
hoangsonww:masterfrom
LeoLin990405:feat/query-explorer
Open

feat(query): safe Query Explorer — structured DSL → parameterized SQL (#10)#172
LeoLin990405 wants to merge 2 commits into
hoangsonww:masterfrom
LeoLin990405:feat/query-explorer

Conversation

@LeoLin990405

Copy link
Copy Markdown
Contributor

Summary

Implements an Advanced Query Explorer (issue #10) so power users can answer custom questions across events / agents / sessions without SQL access — via a safe, structured query DSL that compiles to fully parameterized SQL, plus a schema-driven builder UI, saved queries, and CSV/JSON export.

Changes

Backend (read-only, injection-safe by construction)

  • server/lib/query-dsl.js — a structured DSL validated against a strict per-entity field + operator allowlist, compiled to parameterized SQL. Every value flows through a ? placeholder; the only user-influenced tokens reaching SQL (column/table names) are taken verbatim from the SCHEMA allowlist after an exact hasOwnProperty lookup — never the raw request. SELECT lists are explicit (never SELECT *), so blob columns (data/metadata) cannot leak.
  • server/routes/query.jsPOST /api/query/run (?format=csv|json, pagination, truncation + warnings, tookMs), GET /api/query/schema (drives the UI builder), and saved-query CRUD (the DSL is validated before persist, so a saved query always re-runs safely). Saved queries are user config and survive Clear Data, like alert_rules.
  • server/db.js — additive saved_queries table (CREATE TABLE IF NOT EXISTS, migration-safe). server/index.js — mount.
  • Hardening: operator-by-type gating (like text-only), in array cap (100), filter/sort count caps, limit clamp (1–1000), offset clamped to a safe integer, unknown keys rejected at every level, non-string fields rejected.

Frontend

  • client/src/pages/QueryExplorer.tsx — schema-driven filter builder (field → type-aware operator → value), AND/OR match, sort, limit; results table with total / tookMs / truncation + warning banners; CSV + JSON export; saved-query load/save/delete. Reuses the app's Select/EmptyState/Skeleton + utility classes.
  • api.query client + types, route + sidebar entry, full en/zh/vi i18n.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change
  • Refactor (no functional changes)
  • Documentation update
  • Infrastructure / CI / DevOps
  • Dependency update

How to Test

  1. npm run test:server — all server tests pass (373), including server/__tests__/query.test.js (29 real-SQLite tests).
  2. cd client && npm run build — clean tsc -b && vite build.
  3. Manual: npm run dev, open Query Explorer in the sidebar. Pick an entity, add filters (e.g. events where tool_name = Bash), Run, sort/paginate, export CSV/JSON, save & reload a query.
  4. API: curl -s localhost:4820/api/query/schema and curl -s -XPOST localhost:4820/api/query/run -H 'content-type: application/json' -d '{"entity":"events","filters":[{"field":"tool_name","op":"eq","value":"Bash"}],"limit":20}'.

Security notes

The DSL is the entire attack surface, so it was built defensively and verified with an adversarial test battery (query.test.js): SQL-injection attempts in field names and values, allowlist/operator bypass (incl. prototype-pollution-style keys like __proto__/constructor), oversized in lists, too-many-filters/sort, and a huge offset are all rejected or safely parameterized, with the table left intact. No user value is ever concatenated into SQL.

Checklist

  • I have read the contributing guidelines
  • My code follows the project's coding standards
  • I have added/updated tests that prove my feature works
  • All new and existing tests pass (npm test) — see note below
  • Code is formatted (npm run format:check)
  • I have updated documentation where necessary (self-describing /schema endpoint + i18n; happy to add wiki/README copy if you'd like)

Screenshots

I wasn't able to attach a live screenshot from this environment. The page follows the existing dashboard layout (cards, results table, sidebar entry "Query Explorer"); happy to add screenshots on request.

Note on tests: npm run test:server is green (373/373) and cd client && npm run build compiles cleanly (zero TS errors). npm run test:client reports the same ~11 pre-existing failures (Tabby, Dashboard snapshot) that occur on a clean master checkout under Node 26 (jsdom 27 leaves localStorage undefined); they're unrelated to this change and untouched here. CI runs Node 22, where the client suite passes. I intentionally did not add the new page to screens.snapshot.test.tsx since I couldn't regenerate a trustworthy baseline on Node 26 — glad to add it if you can confirm the baseline.

…hoangsonww#10)

Adds an Advanced Query Explorer so power users can answer custom questions
across events / agents / sessions without SQL access, addressing hoangsonww#10.

Backend (read-only, injection-safe by construction):
- server/lib/query-dsl.js — a structured query DSL validated against a strict
  per-entity field + operator allowlist, compiled to fully parameterized SQL.
  Every value flows through a `?` placeholder; the only user-influenced tokens
  reaching SQL (column/table names) come verbatim from the SCHEMA allowlist
  after an exact hasOwnProperty lookup, never the raw request. SELECT lists are
  explicit (never SELECT *), so blob columns (data/metadata) can't leak.
- server/routes/query.js — POST /api/query/run (?format=csv|json, pagination,
  truncation + warnings, tookMs), GET /api/query/schema (drives the UI builder),
  and saved-query CRUD (validated before persist, so a saved query always
  re-runs safely). Saved queries are user config and survive Clear Data.
- server/db.js — additive saved_queries table (CREATE TABLE IF NOT EXISTS).
- Hardening: operator-by-type gating, `in` array cap (100), filter/sort count
  caps, limit clamp (1..1000), offset clamped to a safe integer, unknown keys
  rejected at every level, non-string fields rejected.

Frontend:
- client/src/pages/QueryExplorer.tsx — schema-driven filter builder (field →
  type-aware operator → value), AND/OR, sort, limit; results table with total /
  tookMs / truncation banners; CSV + JSON export; saved-query load/save/delete.
- api.query client, types, route + sidebar entry, full en/zh/vi i18n.

Tests: server/__tests__/query.test.js — 29 real-SQLite integration tests
covering filters/sort/pagination/CSV/saved-CRUD plus an adversarial battery
(injection in field/value, allowlist & operator bypass, oversized/DoS inputs).
@LeoLin990405 LeoLin990405 requested a review from hoangsonww as a code owner June 17, 2026 18:39

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a schema-driven Query Explorer feature, allowing users to build ad-hoc queries against dashboard data, export results to CSV/JSON, and manage saved queries. It includes a frontend query builder UI, localization support, and a secure backend DSL compiler that validates queries against an allowlist and generates parameterized SQL. The review feedback highlights a critical issue where querying integer fields will fail because the frontend sends string values instead of numbers. To resolve this, suggestions were made to update the TypeScript types to support numbers, parse integer inputs on the frontend, and handle value casting when loading saved queries. A defensive check was also recommended to prevent a runtime crash if a saved query fails to parse on the backend.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread client/src/lib/types.ts
Comment on lines +720 to +724
export interface QueryFilter {
field: string;
op: QueryOperator;
value?: string | string[];
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The QueryFilter.value type needs to support number and number[] to accommodate integer fields (such as id in events). Currently, it only allows string | string[], which will cause TypeScript compilation errors when the frontend is updated to parse integer inputs into numbers before sending them to the backend.

Suggested change
export interface QueryFilter {
field: string;
op: QueryOperator;
value?: string | string[];
}
export interface QueryFilter {
field: string;
op: QueryOperator;
value?: string | number | (string | number)[];
}

Comment on lines +225 to +237
const builtFilters: QueryFilter[] = filters.map((row) => {
if (NO_VALUE_OPS.has(row.op)) {
return { field: row.field, op: row.op };
}
if (row.op === IN_OP) {
const values = row.value
.split(",")
.map((v) => v.trim())
.filter((v) => v.length > 0);
return { field: row.field, op: row.op, value: values };
}
return { field: row.field, op: row.op, value: row.value };
});

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The backend strictly validates that values for int fields must be of type number (using typeof value === 'number'). Since the frontend input values are always strings, submitting a query on an integer field (like id in events) will always fail with a 400 Bad Request error. We need to parse the string values to numbers if the field's schema type is int.

      const builtFilters: QueryFilter[] = filters.map((row) => {
        if (NO_VALUE_OPS.has(row.op)) {
          return { field: row.field, op: row.op };
        }
        const fieldType = schema?.entities[entity]?.fields[row.field]?.type;
        if (row.op === IN_OP) {
          const values = row.value
            .split(",")
            .map((v) => v.trim())
            .filter((v) => v.length > 0)
            .map((v) => (fieldType === "int" ? Number(v) : v));
          return { field: row.field, op: row.op, value: values };
        }
        const parsedValue = fieldType === "int" ? Number(row.value) : row.value;
        return { field: row.field, op: row.op, value: parsedValue };
      });

Comment on lines +324 to +328
const rows: FilterRow[] = (q.filters ?? []).map((f) => ({
field: f.field,
op: f.op,
value: Array.isArray(f.value) ? f.value.join(", ") : (f.value ?? ""),
}));

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

With QueryFilter.value updated to support number | number[], assigning f.value ?? "" directly to FilterRow.value (which is strictly typed as string) will cause a TypeScript compilation error. We should explicitly cast/convert the loaded values to strings.

Suggested change
const rows: FilterRow[] = (q.filters ?? []).map((f) => ({
field: f.field,
op: f.op,
value: Array.isArray(f.value) ? f.value.join(", ") : (f.value ?? ""),
}));
const rows: FilterRow[] = (q.filters ?? []).map((f) => ({
field: f.field,
op: f.op,
value: Array.isArray(f.value)
? f.value.join(", ")
: f.value !== undefined && f.value !== null
? String(f.value)
: "",
}));

Comment on lines +317 to +320
(item: SavedQuery) => {
if (!schema) return;
const q = item.query;
const targetEntity = schema.entities[q.entity] ? q.entity : entity;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If a saved query fails to parse on the backend, the API returns query: null. Accessing q.entity directly without checking if q is null/undefined will cause a runtime crash. Let's add a defensive guard here.

Suggested change
(item: SavedQuery) => {
if (!schema) return;
const q = item.query;
const targetEntity = schema.entities[q.entity] ? q.entity : entity;
(item: SavedQuery) => {
if (!schema) return;
const q = item.query;
if (!q) return;
const targetEntity = schema.entities[q.entity] ? q.entity : entity;

@hoangsonww hoangsonww added bug Something isn't working documentation Improvements or additions to documentation enhancement New feature or request help wanted Extra attention is needed good first issue Good for newcomers question Further information is requested labels Jun 17, 2026
Addresses review feedback on the Query Explorer (hoangsonww#10):

- The backend requires `int` columns (events.id) to be a JS number, but the
  builder always submitted filter values as strings, so any query on an int
  field 400'd. buildBody now coerces a filter value to Number when the field's
  schema type is `int` (both scalar and `in`-list values).
- QueryFilter.value widened to `string | number | (string | number)[]`.
- Loading a saved query: stringify numeric values back into the (string-typed)
  builder inputs, and guard against a null `query` (the API returns
  `query: null` when a saved DSL fails to parse) before dereferencing it.
- SavedQuery.query typed as `QueryBody | null` to match that response.
- Add a server test locking the int-field contract: a numeric value on
  events.id is accepted; a string there is still rejected.
@LeoLin990405

Copy link
Copy Markdown
Contributor Author

Thanks for the review — all four points addressed in 5a0f928:

  • int fields sent as numbers (the critical one): buildBody now coerces a filter value to Number when the field's schema type is int (scalar and in-list), so querying events.id no longer 400s.
  • QueryFilter.value widened to string | number | (string | number)[].
  • Loading a saved query: numeric values are stringified back into the (string-typed) builder inputs, and there's now a if (!q) return; guard before dereferencing — SavedQuery.query is typed QueryBody | null to match the API returning query: null on a parse failure.
  • Added a server test locking the contract: a numeric value on events.id is accepted; a string there is still rejected.

npm run test:server → 374 pass; cd client && npm run build clean.

@hoangsonww hoangsonww linked an issue Jun 24, 2026 that may be closed by this pull request
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working documentation Improvements or additions to documentation enhancement New feature or request good first issue Good for newcomers help wanted Extra attention is needed question Further information is requested

Projects

Development

Successfully merging this pull request may close these issues.

Feature: Advanced Query Explorer for Events, Agents, and Sessions

2 participants