Reagan/hitl#443
The outer bordered container around the picker was removed in 9bdcd99; bring it back so the option grid reads as a card on the chat surface again. Also render two skeleton cards at the tail while the fence is still streaming, mirroring AskForm's TRAILING_SKELETONS pattern, so the UI doesn't appear frozen between option arrivals.
Match Google's actual challenge UI: small lead-in ("Select all images
with"), big bold subject ("a bus"), small trailer ("Click verify
once there are none left."). If the agent emits only `prompt` as a
full sentence, splitCapturePrompt() recovers the three parts via the
standard "... with X. ..." shape; an explicit `target` short-circuits
the split.
Also:
- accept an optional `targetImage` URL in the payload (renders as the
reCAPTCHA example thumbnail on the right of the banner).
- hug the grid width (`width: fit-content; max-width: min(360px, 100%)`)
so the container no longer stretches across the chat surface.
- bump the outer border to `--color-border-strong` so the card is
visible against the chat background in light mode.
- flip the tile gutter color to `--color-bg-base` so the grout reads
white in light mode / near-black in dark, matching Google.
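The prompt split described above can be sketched as follows. This is a minimal illustration, not the actual `splitCapturePrompt()` implementation: the name `splitPromptSketch`, the `PromptParts` shape, the regular expression, and the behavior when no "with X." shape is found are all assumptions.

```typescript
// Hypothetical sketch; the real splitCapturePrompt() may differ.
interface PromptParts {
  lead: string;    // small lead-in, e.g. "Select all images with"
  target: string;  // big bold subject, e.g. "a bus"
  trailer: string; // small trailer, may be empty
}

function splitPromptSketch(prompt: string, target?: string): PromptParts {
  const text = prompt.trim();

  // An explicit target short-circuits the split. Treating the whole
  // prompt as the lead-in in that case is an assumption.
  if (target !== undefined && target.trim() !== "") {
    return { lead: text, target: target.trim(), trailer: "" };
  }

  // Standard "... with X. ..." shape: lazily match up to the first
  // "with", take everything up to the next period as the subject,
  // and keep the rest as the trailer.
  const match = /^([\s\S]*?\bwith)\s+([^.]+)\.\s*([\s\S]*)$/.exec(text);
  if (match === null) {
    // Assumed fallback: show the whole prompt as the bold subject.
    return { lead: "", target: text, trailer: "" };
  }
  const [, lead = "", subject = "", trailer = ""] = match;
  return { lead: lead.trim(), target: subject.trim(), trailer: trailer.trim() };
}
```

Under these assumptions, `splitPromptSketch("Select all images with a bus. Click verify once there are none left.")` yields the lead "Select all images with", the target "a bus", and the trailer "Click verify once there are none left.".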
Agents kept clipping with `.rc-imageselect-table-33`'s bounding rect, which is up to 100px taller than the visible tile area (the floating Verify toolbar lives absolutely-positioned inside that table). The resulting screenshots had a misaligned bottom row and an unwanted toolbar strip. The reliable recipe is to take the union of `td.rc-imageselect-tile` rects inside the bframe (same-origin with the parent) and pass that to `Page.captureScreenshot`'s `clip` in CSS pixels. Document the verification checklist (near-square ratio, no toolbar visible, 9 cells edge-to-edge) and the common ways agents have gotten this wrong, plus the bframe-attach fallback for the rare non-standard layout.
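As a rough sketch of the tile-union step, the helpers below compute the bounding box of a set of tile rects and apply the "near-square ratio" check from the verification checklist. The names `TileRect`, `unionRects`, and `looksNearSquare`, and the 10% tolerance, are illustrative assumptions rather than part of the documented recipe.

```typescript
// Plain rect in CSS pixels, e.g. copied from getBoundingClientRect().
interface TileRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Bounding box that covers every rect; null when there are no tiles.
function unionRects(rects: TileRect[]): TileRect | null {
  if (rects.length === 0) return null;
  let left = Infinity;
  let top = Infinity;
  let right = -Infinity;
  let bottom = -Infinity;
  for (const r of rects) {
    left = Math.min(left, r.x);
    top = Math.min(top, r.y);
    right = Math.max(right, r.x + r.width);
    bottom = Math.max(bottom, r.y + r.height);
  }
  return { x: left, y: top, width: right - left, height: bottom - top };
}

// Checklist item: the clip should be close to square. The tolerance
// value is an assumption chosen for illustration.
function looksNearSquare(clip: TileRect, tolerance = 0.1): boolean {
  if (clip.height <= 0) return false;
  return Math.abs(clip.width / clip.height - 1) <= tolerance;
}
```

The input rects would come from `td.rc-imageselect-tile` elements, and the resulting box can be passed as the `clip` of `Page.captureScreenshot` with a `scale` field added (for example `{ ...box, scale: 1 }`). A clip that fails `looksNearSquare` is a hint that the toolbar strip or the table's extra height was captured.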
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| 33203192 | Triggered | Username Password | 4dbb658 | app/src/renderer/hub/chat-v2/htmlBlocks.ts |
4 issues found across 16 files
The bframe is cross-origin on any third-party host that embeds reCAPTCHA (parent is e.g. `2captcha.com` while the bframe is `google.com`), so the previous `bf.contentDocument` recipe would throw a SecurityError on those sites — only Google's own /recaptcha/api2/demo page happens to be same-origin with the bframe. Rewrite the canonical recipe to attach to the bframe target via `Target.attachToTarget`, run the tile-union + prompt-text queries inside that session, then translate inner-iframe coords back to page coords using the outer <iframe> rect read from the parent page. Works for both same-origin and cross-origin deployments. Also drop two stale fallback sections that referred to constants (32/120/336) and variables (`bf`) that never existed in this document — leftovers from an earlier draft.
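The coordinate translation in this recipe amounts to offsetting the inner rect by the outer iframe's position. A minimal sketch, assuming the iframe element has no border or padding and no CSS transform applied (the `Box` type and `toPageCoords` name are hypothetical):

```typescript
interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// inner: rect measured inside the bframe document (its own viewport).
// outerFrame: rect of the <iframe> element measured in the parent page.
// Assumes the frame content starts exactly at the iframe's top-left
// corner, i.e. no border, padding, or transform on the element.
function toPageCoords(inner: Box, outerFrame: Box): Box {
  return {
    x: outerFrame.x + inner.x,
    y: outerFrame.y + inner.y,
    width: inner.width,
    height: inner.height,
  };
}
```

For example, a tile union at (5, 60) inside a bframe whose iframe sits at (100, 200) in the parent page maps to (105, 260) in page coordinates, with width and height unchanged.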
The previous `window.electronAPI?.sessions?.resume(...)` optional-chain silently resolved to `undefined` when the preload bridge wasn't wired in (in unit tests, in window types that drift, or if the IPC handler was removed). `result?.error` was then `undefined`, so the submit path took the success branch — recording a cache entry and locking the picker into the "Sent to agent" state even though no message ever left the renderer. Resolve `resume` first; if it's missing, surface a 'sessions bridge unavailable' error and roll the local state back so the user can retry.
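The guarded submit path can be sketched like this. The `SessionsBridge` shape, the `resume(sessionId, text)` signature, and the `SubmitOutcome` type are assumptions made for illustration; only the "resolve `resume` first, then fail loudly" pattern comes from the change described above.

```typescript
// Assumed shapes; the real preload bridge typings may differ.
type ResumeResult = { error?: string } | undefined;
type ResumeFn = (sessionId: string, text: string) => Promise<ResumeResult>;
interface SessionsBridge {
  sessions?: { resume?: ResumeFn };
}
type SubmitOutcome = { ok: true } | { ok: false; error: string };

async function submitToAgent(
  api: SessionsBridge | undefined,
  sessionId: string,
  text: string,
): Promise<SubmitOutcome> {
  // Resolve the function first instead of optional-chaining the call,
  // so a missing bridge cannot be mistaken for a successful send.
  const resume = api?.sessions?.resume;
  if (typeof resume !== "function") {
    return { ok: false, error: "sessions bridge unavailable" };
  }
  try {
    const result = await resume(sessionId, text);
    if (result?.error) return { ok: false, error: result.error };
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

A caller would record the cache entry and switch to the "Sent to agent" state only when `ok` is true, and roll the local state back otherwise so the user can retry.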
1 issue found across 2 files (changes from recent commits).
The previous recipe called a `cdp('Runtime.evaluate', ..., sessionId)`
helper that doesn't exist in browser-harness-js — only the Python
harness exposes `cdp(...)`. The JS runner's documented pattern for
routing Runtime/DOM calls into an OOPIF is `session.use(targetId)`
followed by ordinary `session.Runtime.evaluate(...)`, as shown in
cross-origin-iframes.md.
Rewrite the recipe around `session.use()`: resolve the outer iframe
rect from the parent target, route into the bframe, run the tile-union
+ challenge-text query there, then route back to the parent target
(in a `finally` so a thrown error doesn't strand subsequent
`Page.captureScreenshot` calls on the wrong target).
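The route-in, route-back pattern might look like the helper below. The `TargetRouter` interface is a stand-in that models only a `use(targetId)` method, and `withTarget` is a hypothetical name; neither is taken from the browser-harness-js typings.

```typescript
// Minimal shape assumed for this sketch: something that can route
// subsequent calls to a target, like the session.use() described above.
interface TargetRouter {
  use(targetId: string): void | Promise<void>;
}

// Runs fn while routed into targetId, then always routes back to the
// parent target, even if fn throws.
async function withTarget<T>(
  session: TargetRouter,
  targetId: string,
  parentTargetId: string,
  fn: () => Promise<T>,
): Promise<T> {
  await session.use(targetId);
  try {
    return await fn();
  } finally {
    // Without this, a thrown error would leave later
    // Page.captureScreenshot calls pointed at the wrong target.
    await session.use(parentTargetId);
  }
}
```

Wrapping the bframe queries in a helper like this keeps the restore step in one place, so each call site does not need its own `try`/`finally`.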
Lets the agent emit an iframe fence carrying { url, prompt?, width?,
height?, submitLabel? } to embed an arbitrary https URL inline in a
chat turn. The renderer mounts a sandboxed iframe and surfaces a foot
button the user clicks once they are done inside the frame.
Tiny flat payload, no streaming-partial path -- the picker cannot
render meaningfully until the URL is in hand. URL must be absolute
http(s); non-http(s) schemes (javascript:, data:, etc.) are rejected
at parse time. width and height are clamped to safe ranges.
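A parse-time check along these lines could look like the sketch below. The function name, the `IframePayloadSketch` type, and the clamp ranges are assumptions (the commit message only says "safe ranges"); the real parser in `htmlBlocks` may validate differently.

```typescript
interface IframePayloadSketch {
  url: string;
  prompt?: string;
  width?: number;
  height?: number;
  submitLabel?: string;
}

// Assumed clamp ranges in CSS pixels, chosen for illustration only.
const WIDTH_RANGE = { min: 240, max: 960 };
const HEIGHT_RANGE = { min: 160, max: 720 };

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

function parseIframePayloadSketch(raw: string): IframePayloadSketch | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null || Array.isArray(data)) return null;
  const obj = data as Record<string, unknown>;

  const url = obj["url"];
  if (typeof url !== "string") return null;
  let parsed: URL;
  try {
    parsed = new URL(url); // throws for relative URLs (no base given)
  } catch {
    return null;
  }
  // Reject javascript:, data:, and any other non-http(s) scheme.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") return null;

  const out: IframePayloadSketch = { url: parsed.href };
  const { prompt, submitLabel, width, height } = obj;
  if (typeof prompt === "string") out.prompt = prompt;
  if (typeof submitLabel === "string") out.submitLabel = submitLabel;
  if (typeof width === "number" && Number.isFinite(width)) {
    out.width = clamp(width, WIDTH_RANGE.min, WIDTH_RANGE.max);
  }
  if (typeof height === "number" && Number.isFinite(height)) {
    out.height = clamp(height, HEIGHT_RANGE.min, HEIGHT_RANGE.max);
  }
  return out;
}
```

Returning `null` for any invalid input keeps the payload all-or-nothing, which matches the "no streaming-partial path" decision: the block either has a usable URL or does not render.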
Sandboxed iframe (allow-scripts allow-same-origin allow-forms allow-popups) inside a chat-turn card matching the OptionList / CaptureBlock shell. Foot button offers "I'm done" / "Skip" -- submission resumes the agent with "Iframe interaction complete." or "Iframe skipped." as the next user turn. The component cannot read the iframe's DOM (sandbox + cross-origin), so the agent only learns that the user finished, not what they did. That is a feature: the iframe runs in the renderer's session, decoupled from whatever the agent is doing in its CDP-driven Chromium instance. Also widen hub.html's CSP from default-src 'self' to allow frame-src 'self' https: -- without this the new iframe is blocked. Unit tests cover the streaming + parser contract (7 cases) and the hasStructuredBlock dispatch in ChatTurn is extended to route the new iframe_block event variant.
Documents when the iframe fence is the right tool (public forms, OAuth screens, demos) and -- more importantly -- when it is not. The reCAPTCHA bframe section is based on a live experiment: the bframe URL does load inside our renderer (Google sets no X-Frame-Options and no frame-ancestors restriction), but it renders completely blank because the challenge UI is populated by postMessage from the matching anchor iframe on the original host. Loading the URL standalone gets you Google's bootstrap script plus an empty <div></div>. For captcha challenges, the capture fence remains the right tool -- screenshot + proxied clicks keep the live agent session intact.
1 issue found across 7 files (changes from recent commits).
P2 (app/src/renderer/hub/chat-v2/IframeBlock.tsx, line 57): The cache key is too coarse. It is derived from the session and the URL only, so repeated iframe blocks with the same URL can be falsely restored as already submitted.
<file context>
@@ -0,0 +1,209 @@
+function IframeReady({ payload, sessionId, streaming, nextUserText }: ReadyProps): React.ReactElement {
+ const { url, prompt, width, height, submitLabel } = payload;
+ const cacheKey = useMemo(
+ () => `iframe:${submissionKey(sessionId, [url])}`,
+ [sessionId, url],
+ );
</file context>
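One possible way to address this finding is to fold a per-block discriminator into the key. The sketch below is hypothetical: `submissionKeySketch` stands in for the real `submissionKey` helper (whose implementation is not shown here), and `blockId` is an assumed identifier such as the turn index plus the block's position in the turn.

```typescript
// Stand-in for the real submissionKey(sessionId, parts) helper; the
// encoding used here is an assumption for illustration.
function submissionKeySketch(sessionId: string, parts: string[]): string {
  return [sessionId, ...parts].map((p) => encodeURIComponent(p)).join("|");
}

// blockId is hypothetical: any value that is stable across re-renders
// of one block but differs between two blocks with the same URL.
function iframeCacheKey(sessionId: string, blockId: string, url: string): string {
  return `iframe:${submissionKeySketch(sessionId, [blockId, url])}`;
}
```

With a discriminator in the key, two iframe blocks that embed the same URL in one session get distinct cache entries, so submitting the first does not mark the second as already submitted.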
Summary by cubic
Adds three human-in-the-loop blocks: `login` for credential entry, `capture` for reCAPTCHA tile selection, and `iframe` for embedding a live URL inline. This lets agents pause, collect input safely in chat, or let users complete a remote step, then resume browser automation.

New Features
- `LoginForm`, `CaptureBlock`, and `IframeBlock` with transcript-based hydration.
- `htmlBlocks` now recognizes `login`, `capture`, and `iframe` fences with strict JSON validation (`LoginPayload`, `CapturePayload`, `IframePayload` + parsers).
- `loginBlockGuidanceLines()` and wired into `browsercode`, `claude-code`, and `codex`.
- `ChatTurn` dispatches all new blocks; `ChatPanel` listens for `chatv2:open-browser`; CSP updated to allow `frame-src 'self' https:`.
- `capture-block.md` (per-tile-union recipe, cross-origin bframe via `session.use()`, verification) and `iframe-block.md` (when to use, limits).
- Tests (`CaptureBlock.spec.tsx`, `captureBlocks.test.ts`, `iframeBlocks.test.ts`).

UI Improvements
- `CaptureBlock`: reCAPTCHA-style banner (lead/bold/trailer, optional `targetImage`), NxM CSS tiling, clear submitted state; guarded submit when the sessions IPC bridge is missing.
- `IframeBlock`: sandboxed frame with “I’m done”/“Skip” foot actions; read-only (no DOM access).
- `OptionList`: restored outer card chrome and added streaming skeleton cards.
- Stylesheets (`captureBlock.css`, `loginForm.css`, `iframeBlock.css`) and minor polish across blocks.

Written for commit d1fabba. Summary will update on new commits.