The Converse MCP Server provides four main tools through the Model Context Protocol (MCP):
- Chat Tool - Single-provider conversational AI with context support and AI summarization
- Consensus Tool - Multi-provider parallel execution with response aggregation and combined summaries
- Check Status Tool - Monitor and retrieve results from asynchronous operations with intelligent summaries
- Cancel Job Tool - Cancel running background operations
All tools support both synchronous (immediate response) and asynchronous (background processing) execution modes. When AI summarization is enabled, tools automatically generate titles and summaries for better context understanding.
The server supports two transport modes:
HTTP Transport (default):
- Endpoint: `http://localhost:3157/mcp`
- Protocol: HTTP streaming with JSON-RPC 2.0
- Usage: Best for development, debugging, and web integrations
- Features: Health endpoints, CORS support, session management
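For a quick sanity check against the HTTP transport, a tool call can be issued as a plain JSON-RPC 2.0 POST. The sketch below assumes the standard MCP `tools/call` request shape; `buildToolCall` is a helper name invented here, not part of the server API.

```javascript
// Sketch: invoking the chat tool over the HTTP transport.
// Assumes the standard MCP JSON-RPC 2.0 `tools/call` shape; `buildToolCall`
// is a local helper, not part of the server API.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

const body = buildToolCall(1, 'chat', { prompt: 'Hello', model: 'auto' });

// To send it (requires a running server on the default port):
// await fetch('http://localhost:3157/mcp', {
//   method: 'POST',
//   headers: { 'Content-Type': 'application/json' },
//   body: JSON.stringify(body),
// });
```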
Stdio Transport:
- Protocol: Standard input/output with JSON-RPC 2.0
- Usage: Traditional MCP client integrations
- Features: Process-based communication, lower latency
Transport Selection:
# Default (HTTP)
npm start
# Explicit HTTP
npm start -- --transport=http
# Stdio transport
npm start -- --transport=stdio
# Environment variable
MCP_TRANSPORT=stdio npm start

Description: General conversational AI with context and continuation support.
{
"type": "object",
"properties": {
"prompt": {
"type": "string",
"description": "Your question or topic with relevant context. Example: 'How should I structure the authentication module for this Express.js API?'"
},
"model": {
"type": "string",
"description": "AI model to use. Examples: 'auto' (recommended), 'gemini-2.5-flash', 'gpt-5', 'grok-4'. Default: 'auto'"
},
"files": {
"type": "array",
"items": {"type": "string"},
"description": "File paths to include as context (absolute paths required). Example: ['/path/to/src/auth.js', '/path/to/config.json']"
},
"images": {
"type": "array",
"items": {"type": "string"},
"description": "Image paths for visual context (absolute paths or base64). Example: ['/path/to/diagram.png', 'data:image/jpeg;base64,...']"
},
"continuation_id": {
"type": "string",
"description": "Continuation ID for persistent conversation. Example: 'chat_1703123456789_abc123'"
},
"temperature": {
"type": "number",
"minimum": 0.0,
"maximum": 1.0,
"default": 0.5,
"description": "Response randomness (0.0-1.0). Examples: 0.2 (focused), 0.5 (balanced), 0.8 (creative)"
},
"reasoning_effort": {
"type": "string",
"enum": ["minimal", "low", "medium", "high", "max"],
"default": "medium",
"description": "Reasoning depth for thinking models. Examples: 'minimal' (fastest, few reasoning tokens), 'low' (light analysis), 'medium' (balanced), 'high' (complex analysis)"
},
"verbosity": {
"type": "string",
"enum": ["low", "medium", "high"],
"default": "medium",
"description": "Output verbosity for GPT-5 models. Examples: 'low' (concise answers), 'medium' (balanced), 'high' (thorough explanations)"
},
"use_websearch": {
"type": "boolean",
"default": false,
"description": "Enable web search for current information. Example: true for framework docs, false for private code analysis"
},
"media_resolution": {
"type": "string",
"enum": ["MEDIA_RESOLUTION_LOW", "MEDIA_RESOLUTION_MEDIUM", "MEDIA_RESOLUTION_HIGH", "MEDIA_RESOLUTION_UNSPECIFIED"],
"default": "MEDIA_RESOLUTION_HIGH",
"description": "Control image/PDF/video processing quality (Gemini 3.0). Defaults to 'MEDIA_RESOLUTION_HIGH' for Gemini 3.0. Examples: 'MEDIA_RESOLUTION_LOW' (faster, less detail), 'MEDIA_RESOLUTION_MEDIUM' (balanced), 'MEDIA_RESOLUTION_HIGH' (maximum detail)"
},
"async": {
"type": "boolean",
"default": false,
"description": "Execute in background mode. Returns continuation_id immediately for status monitoring. Example: true for long-running analysis"
},
"export": {
"type": "boolean",
"default": false,
"description": "Export conversation to disk. Creates folder with continuation_id name containing numbered request/response files and metadata. Example: true to save for documentation"
}
},
"required": ["prompt"]
}

Synchronous Response (async=false):
{
"content": "AI response text",
"continuation": {
"id": "conv_d6a6a5ec-6900-4fd8-a4e0-1fa4f75dfc42",
"provider": "openai",
"model": "gpt-5-mini",
"messageCount": 3
},
"metadata": {
"model": "gpt-5-mini",
"usage": {
"input_tokens": 150,
"output_tokens": 85,
"total_tokens": 235
},
"response_time_ms": 1247,
"provider": "openai"
},
"title": "Authentication Module Structure Guide", // When summarization enabled
"final_summary": "Provided architectural recommendations for Express.js auth module with JWT tokens and role-based access control." // When summarization enabled
}

Asynchronous Response (async=true):
{
"content": "⏳ PROCESSING | CHAT | conv_abc123def | 1/1 | Started: 2023-12-01 10:30:00 | openai/gpt-5",
"continuation": {
"id": "conv_abc123def",
"status": "processing"
},
"async_execution": true
}

Basic query:
{
"prompt": "Review this authentication function for security issues",
"model": "o3",
"files": ["/project/src/auth.js", "/project/config/security.json"],
"temperature": 0.2,
"reasoning_effort": "high"
}

With conversation export:
{
"prompt": "Help me design a scalable architecture for our system",
"model": "gpt-5",
"export": true,
"continuation_id": "conv_architecture_design"
}

When export is enabled, the conversation is saved to disk in the following structure:
conv_architecture_design/
├── 1_request.txt # First user prompt
├── 1_response.txt # First AI response
├── 2_request.txt # Second user prompt (if continuing)
├── 2_response.txt # Second AI response
└── metadata.json # Conversation metadata and settings
Description: Multi-provider parallel execution with cross-model feedback for gathering perspectives from multiple AI models.
{
"type": "object",
"properties": {
"prompt": {
"type": "string",
"description": "The problem or proposal to gather consensus on. Example: 'Should we use microservices or monolith architecture for our e-commerce platform?'"
},
"models": {
"type": "array",
"items": {"type": "string"},
"minItems": 1,
"description": "List of models to consult. Example: ['o3', 'gemini-2.5-flash', 'grok-4']"
},
"files": {
"type": "array",
"items": {"type": "string"},
"description": "File paths for additional context. Example: ['/path/to/architecture.md', '/path/to/requirements.txt']"
},
"images": {
"type": "array",
"items": {"type": "string"},
"description": "Image paths for visual context. Example: ['/path/to/architecture.png', '/path/to/user_flow.jpg']"
},
"continuation_id": {
"type": "string",
"description": "Thread continuation ID for multi-turn conversations. Example: 'consensus_1703123456789_xyz789'"
},
"enable_cross_feedback": {
"type": "boolean",
"default": true,
"description": "Enable refinement phase where models see others' responses. Example: true (recommended), false (faster)"
},
"cross_feedback_prompt": {
"type": "string",
"description": "Custom prompt for refinement phase. Example: 'Focus on scalability trade-offs in your refinement'"
},
"temperature": {
"type": "number",
"minimum": 0.0,
"maximum": 1.0,
"default": 0.2,
"description": "Response randomness. Examples: 0.1 (very focused), 0.2 (analytical), 0.5 (balanced)"
},
"reasoning_effort": {
"type": "string",
"enum": ["minimal", "low", "medium", "high", "max"],
"default": "medium",
"description": "Reasoning depth. Examples: 'medium' (balanced), 'high' (complex analysis), 'max' (thorough evaluation)"
},
"async": {
"type": "boolean",
"default": false,
"description": "Execute in background mode with per-provider progress tracking. Returns continuation_id immediately for monitoring."
},
"export": {
"type": "boolean",
"default": false,
"description": "Export conversation to disk. Creates folder with continuation_id name containing numbered request/response files and metadata. Example: true to save consensus results"
}
},
"required": ["prompt", "models"]
}

Synchronous Response (async=false):
{
"status": "consensus_complete",
"models_consulted": 3,
"successful_initial_responses": 3,
"failed_responses": 0,
"refined_responses": 3,
"title": "Architecture Review Recommendations", // When summarization enabled
"final_summary": "All models agree on microservices approach with event-driven architecture for scalability.", // When summarization enabled
"phases": {
"initial": [
{
"model": "o3",
"status": "success",
"response": "Initial analysis from O3...",
"metadata": {
"provider": "openai",
"input_tokens": 200,
"output_tokens": 150,
"response_time": 2500
}
}
],
"refined": [
{
"model": "o3",
"status": "success",
"initial_response": "Initial analysis...",
"refined_response": "After considering other perspectives...",
"metadata": {
"total_response_time": 4800,
"total_input_tokens": 450,
"total_output_tokens": 320
}
}
],
"failed": []
},
"continuation": {
"id": "consensus_xyz789",
"messageCount": 2
},
"settings": {
"enable_cross_feedback": true,
"temperature": 0.2,
"models_requested": ["o3", "gemini-2.5-flash", "grok-4"]
}
}

Asynchronous Response (async=true):
{
"content": "⏳ PROCESSING | CONSENSUS | consensus_xyz789 | 0/3 | Started: 2023-12-01 10:30:00 | gpt-5,gemini-2.5-pro,grok-4",
"continuation": {
"id": "consensus_xyz789",
"status": "processing"
},
"async_execution": true,
"metadata": {
"total_models": 3,
"successful_models": 0,
"models_list": "gpt-5,gemini-2.5-pro,grok-4"
}
}

Example request:
{
"prompt": "What's the best database solution for a high-traffic social media platform?",
"models": [
{"model": "o3"},
{"model": "gemini-2.5-pro"},
{"model": "grok-4"}
],
"files": ["/docs/requirements.md", "/docs/current_architecture.md"],
"enable_cross_feedback": true,
"temperature": 0.1,
"reasoning_effort": "high"
}

| Model | Context | Tokens | Features | Use Cases |
|---|---|---|---|---|
| gpt-5.1 | 1M | 128K | Latest GPT | Multimodal, general purpose |
| gpt-5 | 1M | 64K | Advanced | Complex reasoning, analysis |
| gpt-5-mini | 1M | 64K | Fast | Balanced performance/speed |
| gpt-5-nano | 1M | 64K | Ultra-fast | Quick responses, simple queries |
| gpt-5-pro | 1M | 128K | Pro tier | Extended capabilities |
| o3 | 200K | 100K | Reasoning | Logic, analysis, complex problems |
| o3-pro | 200K | 100K | Extended reasoning | Deep analysis |
| o4-mini | 200K | 100K | Fast reasoning | General purpose, rapid reasoning |
| gpt-4.1 | 1M | 32K | Large context | Long documents, analysis |
| Model | Alias | Context | Tokens | Features | Use Cases |
|---|---|---|---|---|---|
| gemini-3-pro-preview | pro | 1M | 64K | Thinking levels, enhanced reasoning | Complex problems, deep analysis |
| gemini-2.5-pro | pro 2.5 | 1M | 65K | Thinking mode | Deep reasoning, architecture |
| gemini-2.5-flash | flash | 1M | 65K | Ultra-fast | Quick analysis, simple queries |
Note: The short model name gemini now routes to Gemini CLI (OAuth-based). For Google API access, use specific model names like gemini-2.5-pro or gemini-2.5-flash.
| Model | Alias | Context | Tokens | Features | Use Cases |
|---|---|---|---|---|---|
| grok-4-0709 | grok, grok-4 | 256K | 256K | Advanced | Latest capabilities |
| grok-code-fast-1 | grok-code-fast | 256K | 256K | Code optimization | Agentic coding |
| Model | Alias | Context | Tokens | Features | Use Cases |
|---|---|---|---|---|---|
| claude-opus-4-5-20250220 | opus-4.5, opus | 200K | 32K | Extended thinking, images, caching | Most capable reasoning |
| claude-opus-4-1-20250805 | opus-4.1, opus-4 | 200K | 32K | Extended thinking, images, caching | Complex reasoning tasks |
| claude-sonnet-4-5-20250929 | sonnet-4.5, sonnet | 200K | 64K | Extended thinking, images, caching | Enhanced reasoning |
| claude-sonnet-4-20250514 | sonnet-4 | 200K | 64K | Extended thinking, images, caching | High performance, balanced |
| claude-haiku-4-5-20251001 | haiku-4.5, haiku | 200K | 64K | Extended thinking, caching | Fast and intelligent |
Prompt Caching (Always Enabled):
- System prompts are automatically cached for 1 hour using Anthropic's prompt caching
- Reduces latency and costs for repeated requests with the same system prompt
- Minimum 1024 tokens required for caching (2048 for Haiku models)
- Cache information available in response metadata: `cache_creation_input_tokens` and `cache_read_input_tokens`
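As a rough sketch of how those metadata fields can be used: the field names below are Anthropic's documented usage fields, while `cacheStats` is a hypothetical helper, not part of this server.

```javascript
// Sketch: derive cache effectiveness from Anthropic usage metadata.
// `cacheStats` is a hypothetical helper; the field names are Anthropic's.
function cacheStats(usage) {
  const written = usage.cache_creation_input_tokens ?? 0; // tokens written to cache
  const read = usage.cache_read_input_tokens ?? 0;        // tokens served from cache
  const uncached = usage.input_tokens ?? 0;               // regular input tokens
  return { written, read, hitRate: read / ((read + uncached) || 1) };
}

// A repeat request whose 1960-token system prompt was served from cache:
console.log(cacheStats({ input_tokens: 40, cache_read_input_tokens: 1960 }).hitRate); // 0.98
```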
| Model | Alias | Context | Tokens | Features | Use Cases |
|---|---|---|---|---|---|
| deepseek-v3 | deepseek-chat, deepseek | 128K | 64K | Latest model | General purpose AI |
| deepseek-coder-v2.5 | deepseek-coder | 128K | 16K | Code optimization | Programming tasks |
| Model | Alias | Context | Tokens | Features | Use Cases |
|---|---|---|---|---|---|
| magistral-medium-2506 | magistral, magistral-medium | 40K | 8K | Reasoning model | Complex reasoning |
| magistral-small-2506 | magistral-small | 40K | 8K | Small reasoning | Fast reasoning |
| mistral-medium-2505 | mistral-medium, mistral | 128K | 32K | Multimodal | General + images |
| Model | Alias | Context | Tokens | Features | Use Cases |
|---|---|---|---|---|---|
| kimi/k2 | k2, kimi-k2 | 256K | 128K | Latest Kimi | Large context tasks |
| qwen/qwen-2.5-coder-32b-instruct | qwen-coder | 32K | 32K | Code focus | Programming |
| qwen/qwq-32b-preview | qwen-thinking, qwq | 32K | 32K | Reasoning | Step-by-step thinking |
Codex is an agentic coding assistant with direct filesystem access:
- Model: `codex`
- Thread-based sessions: Persistent conversation history via `continuation_id`
- Direct file access: Reads files from working directory (paths relative to CLIENT_CWD)
- Response times: 6-20 seconds typical (complex tasks may take minutes)
- Authentication: Requires ChatGPT login OR the `CODEX_API_KEY` environment variable
Gemini CLI provides subscription-based access to Gemini models through OAuth:
- Model: `gemini` (routes to gemini-3-pro-preview)
- Authentication: OAuth via Gemini CLI (requires one-time setup)
- Setup: Install `@google/gemini-cli` globally and run `gemini` to authenticate
- Billing: Uses Google subscription (Google One AI Premium or Gemini Advanced) instead of API credits
- Credentials: Stored in `~/.gemini/oauth_creds.json`
- Features: Access to enhanced agentic features available through CLI
- Context: 1M tokens (inherited from gemini-3-pro-preview)
- Output: 64K tokens
Authentication Setup:
# Install Gemini CLI globally
npm install -g @google/gemini-cli
# Run interactive authentication
gemini
# Follow prompts to authenticate via browser
# Credentials are saved to ~/.gemini/oauth_creds.json

Usage Example:
{
"name": "chat",
"arguments": {
"prompt": "Explain the event loop in JavaScript",
"model": "gemini"
}
}

Codex-Specific Behavior:
- `continuation_id` - Required for thread continuation (maintains full conversation history)
- `files` parameter - Files accessed directly from the working directory, not passed as message content
- `temperature`, `use_websearch` - Not supported by Codex (ignored if specified)
- Response times significantly longer than API-based providers
Configuration (see Codex Configuration section):
- `CODEX_SANDBOX_MODE` - Filesystem access control
- `CODEX_SKIP_GIT_CHECK` - Git repository requirement
- `CODEX_APPROVAL_POLICY` - Command approval behavior
Use "auto" for automatic selection or specify exact models:
// Automatic selection (recommended)
{"model": "auto"}
// Specific models
{"model": "gemini-2.5-flash"}
{"model": "o3"}
{"model": "grok-4-0709"}
// Using aliases
{"model": "flash"} // -> gemini-2.5-flash
{"model": "pro"} // -> gemini-2.5-pro
{"model": "grok"} // -> grok-4-0709
{"model": "grok-4"}   // -> grok-4-0709

Configure intelligent title and summary generation for better context understanding:
# Environment variables
ENABLE_RESPONSE_SUMMARIZATION=true # Enable AI-powered summarization (default: false)
SUMMARIZATION_MODEL=gpt-5-nano # Model for summarization (default: gpt-5-nano)

When Enabled:
- Automatic title generation (up to 60 chars) for each request
- Status check returns an up-to-date summary of the progress based on the partially streamed response
- Final summaries (1-2 sentences) for completed responses
- Enhanced check_status display with titles and summaries
- Persistent storage of summaries with async jobs
Implementation Details:
- Uses fast models (gpt-5-nano, gemini-2.5-flash) for minimal latency
- Temperature set to 0.3 for consistent, focused summaries
- Graceful fallback to text snippets when disabled or on errors
- Non-blocking - summarization failures don't affect main flow
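The graceful-fallback behavior can be illustrated with a small sketch. `titleFor` and `summarize` are hypothetical names invented for illustration, not the server's internals:

```javascript
// Sketch of the graceful-fallback pattern described above.
// `titleFor` and `summarize` are hypothetical names, not server internals.
function titleFor(text, summarize) {
  try {
    return summarize(text).slice(0, 60); // AI-generated title, capped at 60 chars
  } catch {
    return text.slice(0, 60);            // plain snippet when disabled or on error
  }
}

// A summarizer failure never breaks the main flow:
console.log(titleFor('How should I structure auth?', () => { throw new Error('model down'); }));
```

The same pattern applies to final summaries: a thrown error or disabled flag simply yields the text snippet.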
Control Codex behavior through environment variables:
CODEX_SANDBOX_MODE - Filesystem access control:
- `read-only` (default): Can read files but not modify
- `workspace-write`: Can modify files in workspace only
- `danger-full-access`: Full filesystem access (use in containers only)
CODEX_SKIP_GIT_CHECK - Git repository requirement:
- `true` (default): Works in any directory
- `false`: Requires working directory to be a Git repository
CODEX_APPROVAL_POLICY - Command approval behavior:
- `never` (default): Never prompt for approval (recommended for servers)
- `untrusted`: Prompt for untrusted commands
- `on-failure`: Prompt when commands fail
- `on-request`: Let model decide (may hang in headless mode)
Authentication:
- Requires ChatGPT login (system-wide, persists across restarts)
- Alternative: Set the `CODEX_API_KEY` environment variable for headless deployments
Example Configuration (.env file):
# Codex authentication (optional if ChatGPT login available)
CODEX_API_KEY=your_codex_api_key_here
# Codex behavior
CODEX_SANDBOX_MODE=read-only # Default: read-only
CODEX_SKIP_GIT_CHECK=true # Default: true
CODEX_APPROVAL_POLICY=never # Default: never

Supported Text Formats:
- `.txt`, `.md`, `.js`, `.ts`, `.json`, `.yaml`, `.yml`
- `.py`, `.java`, `.c`, `.cpp`, `.h`, `.css`, `.html`
- `.xml`, `.csv`, `.sql`, `.sh`, `.bat`, `.log`
Supported Image Formats:
- `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.bmp`
Size Limits:
- Text files: 1MB default
- Image files: 10MB default
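A client can pre-check attachments against these defaults before sending. The helper below is purely illustrative (`withinLimit` is not part of the server API); only the 1MB/10MB defaults come from the text above:

```javascript
// Sketch: client-side pre-check against the documented default size limits.
// `withinLimit` is an illustrative helper, not part of the server API.
const LIMITS = { text: 1 * 1024 * 1024, image: 10 * 1024 * 1024 }; // 1MB text, 10MB image
const IMAGE_EXTS = new Set(['.jpg', '.jpeg', '.png', '.gif', '.webp', '.bmp']);

function withinLimit(path, sizeBytes) {
  const ext = path.slice(path.lastIndexOf('.')).toLowerCase();
  const limit = IMAGE_EXTS.has(ext) ? LIMITS.image : LIMITS.text;
  return sizeBytes <= limit;
}

console.log(withinLimit('/project/src/auth.js', 200_000));    // true  (under 1MB)
console.log(withinLimit('/diagrams/system.png', 15_000_000)); // false (over 10MB)
```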
{
"files": [
"/absolute/path/to/file.js",
"./relative/path/to/file.md"
]
}

Response includes:
- File content with line numbers
- Metadata (size, last modified)
- Error handling for inaccessible files
{
"images": [
"/path/to/diagram.png",
"data:image/jpeg;base64,/9j/4AAQ..."
]
}

Features:
- Base64 encoding for AI processing
- MIME type detection
- Size validation
- Security path checking
First request creates a continuation automatically:
{
"prompt": "Start a conversation about architecture",
"model": "auto"
}

Response includes continuation ID:
{
"content": "Let's discuss architecture...",
"continuation": {
"id": "conv_abc123",
"provider": "openai",
"model": "gpt-5-mini",
"messageCount": 2
}
}

Use the continuation ID in subsequent requests:
{
"prompt": "What about microservices?",
"continuation_id": "conv_abc123"
}

Features:
- Persistent conversation history
- Provider and model consistency
- Message count tracking
- Automatic expiration
Continuation ID Missing (Critical):
// Some responses may not include continuation metadata
{
"content": "Response without continuation...",
// Missing: continuation field
}

Workaround: Use single-turn interactions until fixed. Track conversation manually if needed.
Status: Implementation gap identified in integration testing. High priority fix planned.
Missing API Key:
{
"error": "Provider not available. Check API key configuration.",
"code": "PROVIDER_UNAVAILABLE",
"provider": "openai"
}

Invalid Model:
{
"error": "Model not found: invalid-model",
"code": "MODEL_NOT_FOUND",
"provider": "openai"
}

Rate Limiting:
{
"error": "OpenAI rate limit exceeded",
"code": "RATE_LIMIT_EXCEEDED",
"provider": "openai",
"retry_after": 60
}

Context Too Large:
{
"error": "Context length exceeded for model",
"code": "CONTEXT_LENGTH_EXCEEDED",
"max_tokens": 128000,
"provided_tokens": 150000
}

OpenAI:
- Rate limits vary by model and tier
- Automatic retry with exponential backoff
- Error codes: `rate_limit_error`, `insufficient_quota`
Google:
- Free tier: 50 requests/day
- Paid: Based on quota settings
- Automatic retry for temporary failures
X.AI:
- Based on account tier
- Higher limits for paid accounts
- Standard HTTP 429 handling
Default Limits:
- Max output tokens: 25,000 (configurable to 200,000)
- Request timeout: 5 minutes
- Concurrent requests: Unlimited
Configuration:
MAX_MCP_OUTPUT_TOKENS=200000
REQUEST_TIMEOUT_MS=300000

Environment Variables:
OPENAI_API_KEY=sk-proj-...
GOOGLE_API_KEY=AIzaSy...
XAI_API_KEY=xai-...

MCP Client Configuration:
{
"env": {
"OPENAI_API_KEY": "sk-proj-...",
"GOOGLE_API_KEY": "AIzaSy...",
"XAI_API_KEY": "xai-..."
}
}

Features:
- API keys never logged or exposed
- Path traversal protection for files
- File access limited to allowed directories
- Input validation on all parameters
Typical Performance:
- Simple chat: 500-2000ms
- Complex reasoning: 2-10 seconds
- Consensus (3 models): 3-15 seconds
- File processing: <100ms per file
Optimization:
- Parallel consensus execution
- Efficient context processing
- Connection pooling
- Response caching for repeated requests
Metrics Available:
- Response times per provider
- Token usage statistics
- Error rates and types
- Request concurrency
Logging:
LOG_LEVEL=debug # Detailed operation logs
LOG_LEVEL=info # Standard operation logs
LOG_LEVEL=error # Errors only

Basic chat:
{
"tool": "chat",
"arguments": {
"prompt": "Explain the benefits of TypeScript over JavaScript",
"model": "gemini-2.5-flash",
"temperature": 0.3
}
}

Chat with file context:
{
"tool": "chat",
"arguments": {
"prompt": "Review this code for potential security vulnerabilities",
"model": "o3",
"files": ["/project/src/auth.js", "/project/src/middleware.js"],
"reasoning_effort": "high",
"temperature": 0.1
}
}

Quick consensus (no cross-feedback):
{
"tool": "consensus",
"arguments": {
"prompt": "What's the best approach for implementing real-time notifications?",
"models": [
{"model": "o3"},
{"model": "flash"},
{"model": "grok"}
],
"enable_cross_feedback": false,
"temperature": 0.2
}
}

In-depth consensus with files and images:
{
"tool": "consensus",
"arguments": {
"prompt": "Design a scalable architecture for a video streaming platform",
"models": [
{"model": "o3"},
{"model": "gemini-2.5-pro"},
{"model": "grok-4"}
],
"files": [
"/docs/requirements.md",
"/docs/current_architecture.md",
"/docs/performance_goals.md"
],
"images": ["/diagrams/current_system.png"],
"enable_cross_feedback": true,
"cross_feedback_prompt": "Focus on scalability and cost optimization in your refinement",
"temperature": 0.15,
"reasoning_effort": "max"
}
}

Enable detailed logging:
LOG_LEVEL=debug npx converse-mcp-server

# Test OpenAI
curl -H "Authorization: Bearer $OPENAI_API_KEY" https://api.openai.com/v1/models
# Test Google (replace YOUR_KEY)
curl "https://generativelanguage.googleapis.com/v1beta/models?key=YOUR_KEY"
# Test X.AI
curl -H "Authorization: Bearer $XAI_API_KEY" https://api.x.ai/v1/models

"No providers available":
- Check API key environment variables
- Verify API key format and validity
- Ensure at least one provider is configured
"Context length exceeded":
- Reduce file content or prompt length
- Use shorter conversation history
- Switch to model with larger context window
Slow responses:
- Check network connectivity
- Verify API service status
- Consider using faster models (flash, mini variants)
Provider-Specific Issues:
Google Provider:
{
"error": "genAI.getGenerativeModel is not a function",
"status": "connected_with_issues",
"workaround": "Provider handles gracefully, requests still processed"
}

XAI Provider:
{
"error": "grok-beta does not exist or your team does not have access",
"status": "api_key_limitations",
"workaround": "Try different model names or contact XAI support"
}

Input Validation:
{
"issue": "Missing required parameters may not be rejected",
"impact": "Some invalid requests may be processed",
"workaround": "Always provide required parameters like 'prompt'"
}

Performance Benchmarks (From Integration Testing):
- Chat Tool: 581ms average (OpenAI), excellent performance
- Consensus Tool: 496ms parallel execution (3 providers), excellent
- File Processing: 1779ms for analysis, good performance
- Auto Selection: 1900ms, acceptable for complex selection
- Success Rate: 75% (6/8 tests passing), core functionality working
Validated Functionality:
- ✅ Real API connectivity to all three providers
- ✅ Chat tool with actual AI responses
- ✅ Consensus tool with parallel execution
- ✅ File context processing and analysis
- ✅ HTTP transport for MCP protocol
- ✅ Automatic provider selection
- ✅ Graceful error handling for provider issues
Create a new provider by implementing the standard interface:
// src/providers/newprovider.js
export async function invoke(messages, options = {}) {
// Validate API key availability
if (!process.env.NEWPROVIDER_API_KEY) {
throw new Error('NEWPROVIDER_API_KEY not configured');
}
try {
// Implement API call logic
const response = await apiCall(messages, options);
return {
content: response.text,
stop_reason: response.stop_reason || 'stop',
rawResponse: response
};
} catch (error) {
throw new Error(`New Provider error: ${error.message}`);
}
}
export function isAvailable() {
return Boolean(process.env.NEWPROVIDER_API_KEY);
}
export const supportedModels = ['model-1', 'model-2'];
export const name = 'newprovider';

Registration:
Add to src/providers/index.js:
import * as newprovider from './newprovider.js';
export const providers = {
// ... existing providers
newprovider: newprovider
};

Create a new tool following the MCP tool pattern:
// src/tools/newtool.js
import { createToolResponse, createToolError } from './index.js';
export async function newTool(args, dependencies) {
const { config, providers, continuationStore } = dependencies;
try {
// Validate required arguments
if (!args.requiredParam) {
return createToolError('requiredParam is required');
}
// Implement tool logic
const result = await processToolLogic(args, dependencies);
return createToolResponse(result);
} catch (error) {
return createToolError(`Tool execution failed: ${error.message}`);
}
}
// Tool definition for MCP registration
export const newToolDefinition = {
name: 'newtool',
description: 'Description of what the new tool does',
inputSchema: {
type: 'object',
properties: {
requiredParam: {
type: 'string',
description: 'Description of required parameter'
},
optionalParam: {
type: 'boolean',
default: false,
description: 'Description of optional parameter'
}
},
required: ['requiredParam']
}
};

Registration:
Add to src/tools/index.js:
import { newTool, newToolDefinition } from './newtool.js';
export const tools = {
// ... existing tools
newtool: newTool
};
export const toolDefinitions = {
// ... existing definitions
newtool: newToolDefinition
};

Add new configuration options:
// src/config.js
export const config = {
// ... existing config
newFeature: {
enabled: process.env.NEW_FEATURE_ENABLED === 'true',
timeout: parseInt(process.env.NEW_FEATURE_TIMEOUT) || 30000,
customOption: process.env.NEW_FEATURE_OPTION || 'default'
}
};

Create tests for new components:
// tests/providers/newprovider.test.js
import { describe, it, expect } from 'vitest';
import * as newProvider from '../../src/providers/newprovider.js';
describe('New Provider', () => {
it('should implement required interface', () => {
expect(newProvider.invoke).toBeDefined();
expect(newProvider.isAvailable).toBeDefined();
expect(newProvider.name).toBe('newprovider');
});
it('should handle API calls correctly', async () => {
// Test implementation
});
});

Description: Monitor progress and retrieve results from asynchronous operations.
{
"type": "object",
"properties": {
"continuation_id": {
"type": "string",
"description": "Optional job continuation ID to query. If not provided, returns the 10 most recent jobs."
},
"full_history": {
"type": "boolean",
"default": false,
"description": "When used with continuation_id, returns the full conversation history for that continuation ID."
}
},
"additionalProperties": false
}

Status Check Response:
{
"content": {
"id": "conv_abc123def",
"status": "completed",
"tool": "chat",
"progress": {
"completed": 1,
"total": 1,
"percentage": 100
},
"result": {
"content": "Final AI response...",
"metadata": {
"provider": "openai",
"model": "gpt-5",
"usage": {
"input_tokens": 150,
"output_tokens": 85
}
}
},
"elapsed_seconds": 4.2,
"completed_at": "2023-12-01T10:30:04.200Z"
}
}

Recent Jobs List Response:
{
"content": {
"jobs": [
{
"id": "conv_abc123def",
"status": "completed",
"tool": "chat",
"elapsed_seconds": 4.2,
"completed_at": "2023-12-01T10:30:04.200Z"
},
{
"id": "consensus_xyz789",
"status": "processing",
"tool": "consensus",
"progress": {
"completed": 2,
"total": 3,
"percentage": 67
},
"elapsed_seconds": 8.5
}
]
}
}

// Check specific job
{
"continuation_id": "conv_abc123def"
}
// List recent jobs
{}
// Get full history for completed job
{
"continuation_id": "conv_abc123def",
"full_history": true
}

Description: Cancel running asynchronous operations when needed.
{
"type": "object",
"properties": {
"continuation_id": {
"type": "string",
"description": "The continuation_id of the job to cancel"
}
},
"required": ["continuation_id"],
"additionalProperties": false
}

Successful Cancellation:
{
"content": {
"status": "cancelled",
"message": "Job conv_abc123def cancelled successfully",
"job_id": "conv_abc123def",
"elapsed_seconds": 2.1,
"cancelled_at": "2023-12-01T10:30:02.100Z"
}
}

Already Completed:
{
"content": {
"status": "completed",
"message": "Job conv_abc123def has already completed and cannot be cancelled",
"job_id": "conv_abc123def"
}
}

Example:
{
"continuation_id": "conv_abc123def"
Both the Chat and Consensus tools support an asynchronous execution mode for long-running operations. When `async: true` is specified:
- Immediate Response: Returns a `continuation_id` instantly
- Background Processing: Job runs in the background with streaming support
- Status Monitoring: Use the `check_status` tool to monitor progress
- Result Retrieval: Full results available when job completes
- Cancellation: Use the `cancel_job` tool to stop running operations
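A client-side polling loop over this workflow might look like the sketch below. `callTool(name, args)` is a hypothetical stand-in for your MCP client's tool invocation; the status values are the ones documented in the job-state table:

```javascript
// Sketch: poll check_status until a background job reaches a terminal state.
// `callTool(name, args)` is a hypothetical stand-in for an MCP client call.
function isTerminal(status) {
  return ['completed', 'failed', 'cancelled', 'completed_with_errors'].includes(status);
}

async function waitForJob(callTool, continuationId, { intervalMs = 2000, maxTries = 150 } = {}) {
  for (let i = 0; i < maxTries; i++) {
    const res = await callTool('check_status', { continuation_id: continuationId });
    if (isTerminal(res.content.status)) return res.content;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${continuationId} did not finish after ${maxTries} checks`);
}
```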
sequenceDiagram
participant Client
participant Server
participant Provider
Client->>Server: chat(prompt, async=true)
Server-->>Client: continuation_id (immediate)
Server->>Provider: Background execution
Provider-->>Server: Streaming response
loop Status Checking
Client->>Server: check_status(continuation_id)
Server-->>Client: Progress update
end
Provider->>Server: Final response
Server->>Server: Cache result
Client->>Server: check_status(continuation_id)
Server-->>Client: Complete result
| Status | Description | Actions Available |
|---|---|---|
| processing | Job is running | Cancel, Check Status |
| completed | Job finished successfully | Get Results |
| failed | Job encountered an error | Check Error Details |
| cancelled | Job was cancelled by user | None |
| completed_with_errors | Partial success (consensus only) | Get Partial Results |
Memory Cache (24 hours):
- Active jobs and recent completions
- Fast lookup for status checks
- Automatic cleanup
Disk Cache (3 days):
- Long-term result storage
- Survives server restarts
- Automatic cleanup of old results
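The two-tier lookup can be sketched as follows. Helper and store names are hypothetical; only the 24-hour and 3-day TTLs come from the text above:

```javascript
// Sketch: two-tier result lookup — memory cache first, then disk cache.
// Store shapes and names are hypothetical; the TTLs match the text above.
const MEMORY_TTL_MS = 24 * 60 * 60 * 1000;    // 24 hours
const DISK_TTL_MS = 3 * 24 * 60 * 60 * 1000;  // 3 days

function lookup(id, memory, disk, now = Date.now()) {
  const hot = memory.get(id);
  if (hot && now - hot.storedAt <= MEMORY_TTL_MS) return hot.result;
  const cold = disk.get(id);
  if (cold && now - cold.storedAt <= DISK_TTL_MS) return cold.result;
  return null; // unknown job or expired entry
}
```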
Async Benefits:
- Non-blocking client operations
- Better resource utilization
- Parallel processing for consensus
- Graceful handling of long operations
When to Use Async:
- Long analysis tasks (>30 seconds)
- Large file processing
- Multi-model consensus
- Complex reasoning operations
- Batch operations
Provider Development:
- Always check API key availability in `isAvailable()`
- Implement consistent error handling
- Follow the standard response format
- Add comprehensive logging
- Handle rate limiting gracefully
Tool Development:
- Validate all input parameters
- Use dependency injection pattern
- Return standardized responses
- Implement proper error handling
- Add detailed input schema
Testing:
- Write unit tests for core logic
- Add integration tests with mocked APIs
- Test error conditions thoroughly
- Validate input/output formats
Documentation:
- Update API documentation with new tools/providers
- Add usage examples
- Document configuration options
- Include troubleshooting guides
For more examples and integration patterns, see EXAMPLES.md.