Skip to content

fix(web-search): allow keyless Brave as a server-side default#642

Open
ly-wang19 wants to merge 1 commit into
THU-MAIC:mainfrom
ly-wang19:fix/keyless-brave-server-default
Open

fix(web-search): allow keyless Brave as a server-side default#642
ly-wang19 wants to merge 1 commit into
THU-MAIC:mainfrom
ly-wang19:fix/keyless-brave-server-default

Conversation

@ly-wang19

Copy link
Copy Markdown
Contributor

Summary

Brave Search is requiresApiKey: false and searchWithBrave already scrapes Brave's public results page when no key is set — but the server-side wiring never let keyless Brave activate or be selected, so in the autonomous classroom-generation flow resolveClassroomWebSearchConfig returned undefined and web search was silently disabled for operators who wanted free, keyless Brave.

This mirrors the existing keyless handling for Ollama/Lemonade: set BRAVE_BASE_URL (e.g. https://search.brave.com) to enable keyless Brave, which then becomes a valid server default (after any keyed providers).

Related Issues

Closes #641

Changes

  • lib/server/provider-config.ts: pass keylessProviders: new Set(['brave']) to the web-search loadEnvSection (so BRAVE_BASE_URL alone activates Brave), and add Brave to the resolveServerWebSearchProviderId fallback chain (after keyed providers).
  • .env.example: document BRAVE_BASE_URL for keyless Brave.
  • tests/server/provider-config.test.ts: tests that keyless Brave is selected via BRAVE_BASE_URL, and that a keyed provider still takes priority.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)

Verification

Steps to reproduce / test

  1. With no web-search keys but BRAVE_BASE_URL=https://search.brave.com, the server now resolves brave and web search works (previously: undefined, disabled).
  2. pnpm vitest run tests/server/provider-config.test.ts → 31 passed (2 new).

What you personally verified

  • New tests cover keyless selection + keyed-provider priority.
  • Behavior-preserving for existing setups: Brave only activates when BRAVE_BASE_URL is explicitly set; keyed-provider resolution is unchanged.

Evidence

  • CI passes (pnpm check ✓, pnpm lint ✓, npx tsc --noEmit ✓, vitest 31/31 ✓)

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have added/updated documentation as needed (.env.example)
  • My changes do not introduce new warnings

🤖 AI-assisted PR (Claude). Self-reviewed before submission.

@ly-wang19 ly-wang19 force-pushed the fix/keyless-brave-server-default branch from 0a5eef1 to 70ebbe4 Compare June 2, 2026 09:48
Brave Search is `requiresApiKey: false` and already falls back to scraping
its public results page when no key is set (`searchWithBrave`). But the
server-side wiring never let it activate without a key:

- `loadEnvSection` for web search was not given `keylessProviders`, so Brave
  could only be configured via `BRAVE_API_KEY`, defeating keyless mode.
- `resolveServerWebSearchProviderId` only auto-selected providers with an
  `apiKey`, so keyless Brave was never chosen.

Net effect: in the autonomous classroom-generation flow,
`resolveClassroomWebSearchConfig` returned `undefined` and web search was
silently disabled for operators who wanted free, keyless Brave.

This mirrors the existing keyless handling for Ollama/Lemonade: set
`BRAVE_BASE_URL` (e.g. https://search.brave.com) to enable keyless Brave,
which then becomes a valid server default (after any keyed providers).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
wyuc
wyuc previously approved these changes Jun 7, 2026

@wyuc wyuc left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on the logic — keyless Brave mirrors the Ollama/Lemonade pattern, keyed providers still win in resolveServerWebSearchProviderId, and the tests cover both cases. Confirmed it stays opt-in: webSearch.brave is only set when BRAVE_BASE_URL is configured, and the classroom flow is still gated by enableWebSearch (off by default), so no silent scraping.

Before it can land:

  1. Branch conflicts with main — please rebase.
  2. Keyless mode scrapes Brave's public page; please confirm you verified it returns results end-to-end.

@ly-wang19

Copy link
Copy Markdown
Contributor Author

Thanks @wyuc — both points addressed:

1. Rebase ✅ — rebased onto main; the provider-config.ts conflict (your ttsDisabled vs my keyless webSearch) is resolved keeping both. Now MERGEABLE, 43 tests green.

2. Verify keyless returns results end-to-end — I did, and it turned up a real problem: it currently returns 0 results. A live scrape of search.brave.com (app headers) is HTTP 200 with 20 data-type="web" snippets, but parseBraveSearchHtml extracts nothing — Brave moved the result title from <span class="search-snippet-title"> to <div class="… search-snippet-title …">, so the title regex misses and every snippet is skipped. (The existing brave test stayed green because its fixture still used the old <span> markup.)

Fixed in #688 (accept <span>/<div>, fixtures updated to current markup). With that fix the same scrape returns real results end-to-end, e.g. Photosynthesis - Wikipedia -> en.wikipedia.org/wiki/Photosynthesis.

So this PR should land after / together with #688 — otherwise it enables a keyless provider that returns nothing.

One caveat worth a note: keyless scraping is best-effort — rapid repeat requests get rate-limited/challenged by Brave (some responses come back empty). The keyed Brave API path is unaffected. Happy to add a short "best-effort, keyed recommended" note to the docs/env example if you want it in this PR.

@tongshu2023

Copy link
Copy Markdown
Contributor

Heads-up on a merge-order interaction with #589 (SearXNG provider), which touches the same two spots in provider-config.ts:

  1. loadEnvSection optsfeat: add SearXNG self-hosted web search provider support #589 passes keylessProviders: new Set(['searxng']) at the same call site. Trivial union on whichever lands second: new Set(['brave', 'searxng']).
  2. resolveServerWebSearchProviderId tail — a real semantic difference, not just a textual conflict. feat: add SearXNG self-hosted web search provider support #589 currently ends with return Object.keys(webSearch).find((id) => webSearch[id]?.apiKey) (keyless providers never auto-selected). Your if (webSearch.brave) return 'brave' is the better call given the keylessProviders gating: anything in the map got there via an explicit operator opt-in (BRAVE_BASE_URL / SEARXNG_BASE_URL), so treating it as a valid fallback default is right, and feat: add SearXNG self-hosted web search provider support #589's blanket skip would silently swallow your keyless-Brave default.

No action needed on this PR — flagging so the rebase burden lands on #589, not here. Once this merges I'll rebase #589 to: union the keyless set, and follow this PR's pattern (if (webSearch.brave) return 'brave'; if (webSearch.searxng) return 'searxng';) instead of the apiKey-only fallback.

@wyuc

wyuc commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Second-pass review after #688 landed — end-to-end confirmed working; three findings before merge

First, verification: on current main (with #688), a live fetch of search.brave.com/search?q=photosynthesis using the app's headers parses 20 data-type="web" snippets into real results (Photosynthesis - Wikipedia -> en.wikipedia.org/wiki/Photosynthesis), so the keyless path now delivers end-to-end. A repeat request from a different egress got HTTP 429, which confirms the best-effort caveat — so yes, please do include the short "best-effort, keyed recommended" note you offered for .env.example.

Three findings from a second review pass, all verified against the code:

🟠 P2 — Setting BRAVE_BASE_URL silently downgrades users' own Brave API keys to scraping

Once the operator sets BRAVE_BASE_URL, Brave becomes server-managed: the web-search route drops the client-sent key (managed guard in app/api/web-search/route.ts), and resolveSectionApiKey returns the managed entry's apiKey || '' — empty by construction for a keyless entry. So a user who configured their own (paid) Brave Search API key in Settings silently goes from the official JSON API to regex-scraping the public page. The classroom path (lib/server/web-search-config.tsresolveWebSearchApiKey) downgrades the same way. This is the first managed-with-empty-key case in webSearch, where the managed-is-authoritative rule masks a real client credential with an empty one.

Two ways to resolve — your pick:

  • Minimal: document the override in .env.example — a line under BRAVE_BASE_URL noting it makes Brave server-managed, so client Brave keys are then ignored (scrape mode).
  • Better: for a keyless-capable managed entry, fall back to the client key when the managed entry's key is empty. This doesn't reintroduce the Fix server API key fallback when client echoes provider base URL #533 tri-state: searchWithBraveApi always talks to the fixed api.search.brave.com, so the client key never flows to an operator-controlled base URL.

🟡 P3 — resolveServerWebSearchProviderId(preferredProviderId) rejects keyless Brave as an explicit preference while accepting it as a fallback

The preference check requires a truthy apiKey (webSearch[preferredProviderId]?.apiKey), so with keyless Brave plus any keyed provider configured, preferredProviderId: 'brave' returns the keyed provider — contradicting the new fallback a few lines below. No caller passes a preference today, so it's latent, but it's this PR's own feature interacting with the exported API. A presence-based check for keyless providers fixes it.

🟡 P3 — New tests can leak host env: ENV_PREFIXES_TO_CLEAR clears neither BRAVE nor BAIDU

A dev/CI host with BAIDU_API_KEY set fails "selects keyless Brave when only BRAVE_BASE_URL is set" (baidu is checked before the keyless fallback); a host with BRAVE_API_KEY makes it pass for the wrong reason (keyed path instead of the keyless fallback under test). Adding both prefixes keeps host state out of the tests, per the file's own header comment.

All three are small. Once they're addressed (P2 can be consciously resolved as doc-only if you prefer), this is good to merge — and #589 will follow this PR's pattern per the earlier thread.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Keyless Brave Search can't be used as a server-side web search provider

3 participants