
feat(outline): Design Brief mode — natural-language brief + structured media manifest#754

Draft
wyuc wants to merge 1 commit into main from feat/outline-design-brief-and-media

@wyuc wyuc commented Jun 17, 2026

Contributor

Design Brief mode — natural-language brief + structured media manifest

Draft / RFC — opening for design discussion. Default behavior is unchanged; everything is behind an opt-in flag.

Motivation

The outliner today describes each slide with a terse {description, keyPoints}. Two problems show up downstream:

  1. Weak layout signal. A bullet list under-specifies the page, so the slide-content model has to invent most of the layout. Richer, in-distribution guidance produces noticeably better and more consistent slides.
  2. Decorative image spam. Because image intent lives only in prose ("add an icon", "a diagram here") rather than in a structured field, the slide model emits an image element for every mention. With image generation enabled, those become per-card placeholders that mostly add noise; with it disabled, they are dead boxes.

What this adds (opt-in via requirements.designBriefMode)

For every slide scene, the outliner now emits:

  • brief — a detailed natural-language design brief (page goal, visual style + palette in words, region-based layout, the actual written-out content, focal emphasis). It becomes the authoritative slide-content input when present.
  • media[] — a structured manifest of the images/videos the slide actually needs, unifying real assets and to-be-generated media:
    { id, source: 'asset' | 'generate', type: 'image' | 'video', prompt?, caption?, aspectRatio? }
    The brief references each item by id (gen_img_1, img_2), with mandatory two-way consistency (every brief id ⇔ a media entry).
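
To make the two-way consistency rule concrete, here is a minimal sketch of a checker. The `SlideMediaItem` fields are copied from the shape above; the id pattern (`gen_img_N`, `img_N`, plus assumed video variants) and the helper name `checkBriefMediaConsistency` are illustrative assumptions, not code from this PR.

```typescript
// Field list copied from the PR description; everything else is illustrative.
type SlideMediaItem = {
  id: string;
  source: 'asset' | 'generate';
  type: 'image' | 'video';
  prompt?: string;
  caption?: string;
  aspectRatio?: string;
};

// Assumption: ids look like gen_img_1 / img_2 (and gen_vid_1 / vid_2 for video).
// The real id scheme may differ.
const MEDIA_ID_PATTERN = /\b(?:gen_)?(?:img|vid)_\d+\b/g;

// Hypothetical helper: reports ids that break the brief <-> media[] pairing.
function checkBriefMediaConsistency(brief: string, media: SlideMediaItem[]) {
  const briefIds = new Set<string>(brief.match(MEDIA_ID_PATTERN) ?? []);
  const mediaIds = new Set<string>(media.map((m) => m.id));
  return {
    // Referenced in the brief but absent from media[].
    missingFromMedia: Array.from(briefIds).filter((id) => !mediaIds.has(id)),
    // Declared in media[] but never mentioned in the brief.
    unreferencedInBrief: Array.from(mediaIds).filter((id) => !briefIds.has(id)),
  };
}
```

A scene passes when both arrays are empty. Whether a violation should fail the outline or just drop the orphaned entry is a policy choice this sketch leaves open.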

The prompt also adds media discipline: most slides need zero images (icons/dividers are drawn with shapes; table/chart/latex carry information), and media is only requested when it carries irreplaceable visual information.

How it wires into the existing pipeline

  • media[] generate items are bridged into the existing mediaGenerations (and asset items into suggestedImageIds) inside uniquifyMediaElementIds, so the current media-orchestrator dispatches generation unchanged. The id-uniquification now rewrites mediaGenerations, media[] and the inline brief references together, so all three stay consistent.
  • slide-content consumes brief as the authoritative layout spec (new {{#if brief}} block); media flows through the existing generated-image instructions once bridged. Model-agnostic — no change to model routing.
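
The bridge and the joint id rewrite described above could look roughly like the following sketch. The `MediaGeneration` shape (in particular the `elementId` field), the helper names, and the fallback from `prompt` to `caption` are assumptions for illustration; the actual `uniquifyMediaElementIds` signature and field names are not shown in this PR description.

```typescript
type SlideMediaItem = {
  id: string;
  source: 'asset' | 'generate';
  type: 'image' | 'video';
  prompt?: string;
  caption?: string;
  aspectRatio?: string;
};

// Hypothetical shape of an existing mediaGenerations entry.
type MediaGeneration = {
  elementId: string;
  type: 'image' | 'video';
  prompt: string;
  aspectRatio?: string;
};

type SceneOutline = {
  brief?: string;
  media?: SlideMediaItem[];
  mediaGenerations?: MediaGeneration[];
  suggestedImageIds?: string[];
};

// Bridge: 'generate' items feed mediaGenerations, 'asset' items feed suggestedImageIds.
function bridgeMediaManifest(outline: SceneOutline): SceneOutline {
  const media = outline.media ?? [];
  const generated: MediaGeneration[] = media
    .filter((m) => m.source === 'generate')
    .map((m) => ({
      elementId: m.id,
      type: m.type,
      prompt: m.prompt ?? m.caption ?? '', // assumed fallback
      aspectRatio: m.aspectRatio,
    }));
  const assetIds = media.filter((m) => m.source === 'asset').map((m) => m.id);
  return {
    ...outline,
    mediaGenerations: [...(outline.mediaGenerations ?? []), ...generated],
    suggestedImageIds: [...(outline.suggestedImageIds ?? []), ...assetIds],
  };
}

function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Rename one id in media[], mediaGenerations and the inline brief references together.
function renameMediaId(outline: SceneOutline, oldId: string, newId: string): SceneOutline {
  const swap = (id: string) => (id === oldId ? newId : id);
  // Word boundaries keep gen_img_1 from matching inside gen_img_10.
  const pattern = new RegExp(`\\b${escapeRegExp(oldId)}\\b`, 'g');
  return {
    ...outline,
    brief: outline.brief?.replace(pattern, () => newId),
    media: outline.media?.map((m) => ({ ...m, id: swap(m.id) })),
    mediaGenerations: outline.mediaGenerations?.map((g) => ({
      ...g,
      elementId: swap(g.elementId),
    })),
  };
}
```

Doing the rename in a single pass over all three places is what keeps them in agreement; rewriting only `mediaGenerations` would leave the brief pointing at an id that no longer exists.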

Scope / files

8 files, ~150 lines: types (SlideMediaItem, SceneOutline.brief/media, UserRequirements.designBriefMode), the requirements-to-outlines prompt (gated blocks + two worked examples), the media[]→mediaGenerations bridge, and slide-content brief consumption.

Open questions for reviewers

  • Flag surface: a request-level designBriefMode here — should it instead be a setting / per-stage option, and should there be a UI toggle (not included yet)?
  • Whether to make the media[] manifest the canonical media field longer-term and derive mediaGenerations from it everywhere, vs the current additive bridge.
  • Prompt-token budget of the gated blocks (only added when the mode is on).

Status

Types compile (tsc --noEmit clean). Not yet wired to a UI toggle (backend accepts the flag). Looking for feedback on the interface before fleshing out tests + UI.

…d media manifest

When requirements.designBriefMode is set, the outliner emits, per slide scene:
- brief: a detailed natural-language design brief (the primary slide-content input)
- media[]: a structured manifest unifying source (asset) and to-be-generated media,
  referenced by id from the brief

Media discipline in the prompt stops the model from over-emitting decorative image
placeholders (most slides need none; icons are drawn with shapes). media[] generate
items are bridged into the existing mediaGenerations pipeline (uniquifyMediaElementIds),
keeping ids consistent across media[], mediaGenerations and the brief. slide-content
consumes the brief as the authoritative layout spec when present. Fully opt-in; default
behavior unchanged.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
