AI-powered browser automation using Claude's Computer Use API. Describe a task in plain language and CopyCat will see your screen and do it — clicking, typing, scrolling, and filling out forms automatically.
- Node.js 18+
- An Anthropic API key
- Chrome or Chromium-based browser
git clone <repo-url>
cd browser-assistant
npm install
npm run build
- Open chrome://extensions
- Enable Developer mode (top right)
- Click Load unpacked and select the dist/ folder
- Click the CopyCat icon in your toolbar to open the side panel
Open Settings (gear icon) and:
- Enter your Anthropic API key
- Choose a model (Sonnet 4 or Opus 4)
- Optionally fill in your profile info for automatic form filling
- Create custom templates (e.g. "Job Application") with fields the agent can use
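As a rough illustration of how a custom template could feed into a prompt, the sketch below models a template as a named list of label/value fields and renders it to plain text. The `Template` and `TemplateField` types and the `renderTemplate` helper are hypothetical names chosen for this example; the shape CopyCat actually stores may differ.

```typescript
// Hypothetical shape for a user-defined template; not taken from the CopyCat source.
interface TemplateField {
  label: string;
  value: string;
}

interface Template {
  name: string;
  fields: TemplateField[];
}

// Render a template as plain text that could be appended to the task prompt.
// Fields with empty values are skipped so the agent is not asked to type blanks.
function renderTemplate(template: Template): string {
  const lines = template.fields
    .filter((field) => field.value.trim() !== "")
    .map((field) => `${field.label}: ${field.value}`);
  return [`Template: ${template.name}`, ...lines].join("\n");
}

// Example data (made up for illustration).
const jobApplication: Template = {
  name: "Job Application",
  fields: [
    { label: "Full name", value: "Ada Lovelace" },
    { label: "Email", value: "ada@example.com" },
    { label: "Portfolio", value: "" },
  ],
};

console.log(renderTemplate(jobApplication));
```

Rendering to labeled plain text keeps the template readable by the model without requiring it to parse a structured format.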
- Navigate to any website
- Type a prompt like "Fill out this form with my job application info"
- Hit Go — the agent takes a screenshot, plans its actions, and executes them
- Each step shows the agent's reasoning and a screenshot of what it sees
- Click Stop to cancel at any time
npm run dev # watch mode — rebuilds on file changes
npm run build # production build

After rebuilding, go to chrome://extensions and click the refresh icon on CopyCat to reload.
User prompt → Screenshot (1024x768) → Anthropic API → Parse actions
                                                            ↓
                                             Execute via Chrome Debugger API
                                                            ↓
                                            New screenshot → Loop until done
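The flow above can be sketched as a loop over injected dependencies. This is a minimal sketch, not the project's service worker: `runAgentLoop`, `AgentDeps`, `ModelTurn`, and the `maxSteps` cap are all assumptions made for illustration, and it omits the conversation history a real Computer Use loop would send back on each turn.

```typescript
// All names here are illustrative; they are not taken from the CopyCat source.
type Action = { type: string; [key: string]: unknown };

interface ModelTurn {
  reasoning: string;
  actions: Action[]; // empty when the model considers the task finished
}

interface AgentDeps {
  captureScreenshot: () => Promise<string>; // e.g. a base64 image, already scaled
  callModel: (prompt: string, screenshot: string) => Promise<ModelTurn>;
  executeAction: (action: Action) => Promise<void>;
  shouldStop: () => boolean; // true once the user clicks Stop
}

// Screenshot -> model -> execute, repeated until the model returns no actions,
// the user stops the run, or the step cap is reached. Returns the number of
// model calls that were made.
async function runAgentLoop(
  prompt: string,
  deps: AgentDeps,
  maxSteps = 25, // assumed safety cap so a confused agent cannot loop forever
): Promise<number> {
  let steps = 0;
  while (steps < maxSteps && !deps.shouldStop()) {
    const screenshot = await deps.captureScreenshot();
    const turn = await deps.callModel(prompt, screenshot);
    steps += 1;
    if (turn.actions.length === 0) break; // nothing left to do
    for (const action of turn.actions) {
      if (deps.shouldStop()) break; // honor Stop between actions
      await deps.executeAction(action);
    }
  }
  return steps;
}
```

Passing the screenshot, model, and executor in as functions keeps the loop testable outside the browser, since none of it touches `chrome.*` APIs directly.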
- Side panel (React + Zustand) — chat UI with reasoning chain
- Service worker — orchestrates the agent loop
- Anthropic API — direct fetch to Claude Computer Use (computer-use-2025-01-24 beta)
- Chrome Debugger API — executes clicks, keystrokes, and scrolling via CDP
- Screenshot service — captures and scales to 1024x768 via OffscreenCanvas
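To make the "direct fetch" component concrete, here is a sketch of how a Computer Use request might be assembled. `buildComputerUseRequest` is a hypothetical helper, and the `max_tokens` value and PNG media type are assumptions; the endpoint, headers, and `computer_20250124` tool type follow Anthropic's published Messages API for this beta.

```typescript
// Hypothetical helper; the real anthropicApi.ts may be structured differently.
interface ComputerUseRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildComputerUseRequest(
  apiKey: string,
  model: string, // whichever model ID the user picked in Settings
  prompt: string,
  screenshotBase64: string,
): ComputerUseRequest {
  const body = {
    model,
    max_tokens: 4096, // assumed value
    tools: [
      {
        type: "computer_20250124",
        name: "computer",
        // Must match the size the screenshots are scaled to.
        display_width_px: 1024,
        display_height_px: 768,
      },
    ],
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: prompt },
          {
            type: "image",
            // PNG is an assumption; use the format the screenshot service emits.
            source: { type: "base64", media_type: "image/png", data: screenshotBase64 },
          },
        ],
      },
    ],
  };
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey,
        "anthropic-version": "2023-06-01",
        "anthropic-beta": "computer-use-2025-01-24",
        // Opt-in header for calling the API directly from browser contexts.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify(body),
    },
  };
}
```

The result can be passed straight to `fetch(request.url, request.init)`. Separating request construction from the network call makes the payload easy to inspect and test.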
src/
├── background/service-worker.ts # Agent loop
├── content/content-script.ts # Minimal content script
├── services/
│ ├── anthropicApi.ts # API client + system prompt
│ ├── screenshotService.ts # Capture + scale screenshots
│ ├── debuggerService.ts # Chrome Debugger wrapper
│ └── actionExecutor.ts # Map actions → CDP commands
├── sidepanel/
│ ├── App.tsx # Main app shell
│ ├── store/agentStore.ts # Zustand state
│ ├── hooks/ # useSettings, useAgentLoop, useAutoScroll
│ └── components/ # Header, PromptInput, ReasoningChain, Settings
└── types/ # TypeScript definitions
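The "Map actions → CDP commands" step in `actionExecutor.ts` could look roughly like the sketch below. Because the model sees a 1024x768 screenshot, its coordinates have to be scaled back to the real viewport before dispatch. `toCdpCommands`, `toViewportPoint`, the `AgentAction` union, and the 100-pixel scroll step are assumptions for illustration; the CDP method names (`Input.dispatchMouseEvent`, `Input.insertText`) are real.

```typescript
const SCREENSHOT_WIDTH = 1024;
const SCREENSHOT_HEIGHT = 768;
const PIXELS_PER_SCROLL_STEP = 100; // assumed conversion from scroll_amount to pixels

// A small subset of Computer Use actions, enough to show the mapping.
type AgentAction =
  | { action: "left_click"; coordinate: [number, number] }
  | { action: "type"; text: string }
  | {
      action: "scroll";
      coordinate: [number, number];
      scroll_direction: "up" | "down";
      scroll_amount: number;
    };

interface CdpCommand {
  method: string;
  params: Record<string, unknown>;
}

interface Viewport {
  width: number;
  height: number;
}

// Scale a point from screenshot space back to viewport space.
function toViewportPoint([x, y]: [number, number], viewport: Viewport) {
  return {
    x: Math.round((x * viewport.width) / SCREENSHOT_WIDTH),
    y: Math.round((y * viewport.height) / SCREENSHOT_HEIGHT),
  };
}

// Translate one agent action into the CDP commands that perform it.
function toCdpCommands(action: AgentAction, viewport: Viewport): CdpCommand[] {
  switch (action.action) {
    case "left_click": {
      const base = { ...toViewportPoint(action.coordinate, viewport), button: "left", clickCount: 1 };
      // A click is a press followed by a release at the same point.
      return [
        { method: "Input.dispatchMouseEvent", params: { type: "mousePressed", ...base } },
        { method: "Input.dispatchMouseEvent", params: { type: "mouseReleased", ...base } },
      ];
    }
    case "type":
      return [{ method: "Input.insertText", params: { text: action.text } }];
    case "scroll": {
      const point = toViewportPoint(action.coordinate, viewport);
      const sign = action.scroll_direction === "down" ? 1 : -1;
      const deltaY = sign * action.scroll_amount * PIXELS_PER_SCROLL_STEP;
      return [
        { method: "Input.dispatchMouseEvent", params: { type: "mouseWheel", ...point, deltaX: 0, deltaY } },
      ];
    }
  }
}
```

In the extension, each returned command would then be sent with `chrome.debugger.sendCommand({ tabId }, command.method, command.params)` after the debugger has been attached to the tab.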
MIT
