feat: Evaluate Prompt Orchestration Conflict Resolution #309

## Description
Evaluate and validate that USER instructions correctly override SERVER instructions when the two conflict.

## Context
The prompt orchestration system composes prompts from multiple sources:
- SERVER-side (developer-controlled): timing, system prompt, deep research, tools, code assistant, chain-of-thought
- USER-side (user-controlled): project instruction, custom instruction, memory, tone/style

When these conflict (e.g., server says "JSON" but user says "Markdown"), user settings should take priority.
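
As a rough illustration of the expected priority rule, the sketch below models each side as a flat dictionary of settings and lets USER values win on any shared key. This is only an assumption made for illustration: the real orchestration composes free-text prompts, and the names `resolve`, `find_conflicts`, and the `output_format` / `language` keys are hypothetical, not part of the actual system.

```python
# Hypothetical model of the priority rule; not the actual orchestration code.

def resolve(server: dict[str, str], user: dict[str, str]) -> dict[str, str]:
    """Merge settings so that USER values win on any conflicting key."""
    merged = dict(server)   # start from the server-side defaults
    merged.update(user)     # user-side values overwrite shared keys
    return merged


def find_conflicts(server: dict[str, str], user: dict[str, str]) -> list[str]:
    """Return the keys that both sides set to different values."""
    return sorted(k for k in server.keys() & user.keys() if server[k] != user[k])


# The example from this issue: server says "JSON", user says "Markdown".
server = {"output_format": "JSON", "language": "English"}
user = {"output_format": "Markdown"}

resolved = resolve(server, user)
conflicts = find_conflicts(server, user)
print(resolved)
print(conflicts)
```

Listing the conflicting keys separately makes it easier to confirm that a test scenario really exercises a conflict rather than two compatible instructions.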

Models to test:
- Jan 30B Instruct
- Jan 30B Thinking

## Scope
* Test conflict scenarios: SERVER vs USER instruction priority
* Test toggle behavior: Enable/disable optional server prompts one-by-one
* Validate with LLM-as-judge evaluation
* Document findings and edge cases
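
One possible way to lay out the evaluation is sketched below: it builds a case matrix over both models and the one-by-one toggles, then formats a prompt for the judge model. Several things here are assumptions, not taken from the implementation: which server prompts count as optional (`OPTIONAL_SERVER_PROMPTS`), the `EvalCase` structure, and the judge wording with its `USER` / `SERVER` / `NEITHER` verdicts. No model or judge is actually called.

```python
from dataclasses import dataclass
from itertools import product

MODELS = ["Jan 30B Instruct", "Jan 30B Thinking"]

# Assumption: this issue does not say which server prompts are optional.
OPTIONAL_SERVER_PROMPTS = ["deep research", "tools", "code assistant", "chain-of-thought"]


@dataclass(frozen=True)
class EvalCase:
    model: str
    enabled: tuple[str, ...]     # optional server prompts switched on
    server_instruction: str
    user_instruction: str


def one_by_one_toggles(prompts):
    """Yield the all-off baseline, then each optional prompt enabled alone."""
    yield ()
    for prompt in prompts:
        yield (prompt,)


def build_cases(server_instruction: str, user_instruction: str) -> list[EvalCase]:
    """Cross every model with every toggle setting for one conflict scenario."""
    return [
        EvalCase(model, enabled, server_instruction, user_instruction)
        for model, enabled in product(MODELS, one_by_one_toggles(OPTIONAL_SERVER_PROMPTS))
    ]


def judge_prompt(case: EvalCase, response: str) -> str:
    """Format a hypothetical LLM-as-judge prompt for one model response."""
    return (
        "Two instructions conflict.\n"
        f"SERVER instruction: {case.server_instruction}\n"
        f"USER instruction: {case.user_instruction}\n"
        f"Model response:\n{response}\n\n"
        "Which instruction does the response follow? "
        "Answer with exactly one word: USER, SERVER, or NEITHER."
    )


cases = build_cases("Respond in JSON.", "Respond in Markdown.")
print(len(cases))
print(judge_prompt(cases[0], "# Heading\nSome text"))
```

Under these assumptions each conflict scenario expands to one case per model and toggle setting, so findings can be recorded per case and compared across the two models.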

## Out of Scope
* Changing the actual prompt orchestration implementation
* Production deployment

