feat(slack): trigger automations from watched channel messages#819
feat(slack): trigger automations from watched channel messages#819ColeMurray wants to merge 20 commits into
Conversation
Add a `slack` automation event source so messages posted in watched Slack channels can auto-trigger coding sessions, extending the existing automation engine rather than forking a parallel path. Flow: the slack-bot filters ambient channel messages (kill switch, bot-mention suppression, watched-channel pre-filter), normalizes them, and forwards to the control plane's /internal/slack-event endpoint. The scheduler selects candidates by channel (new automation_slack_channels join table), evaluates conditions (text contains/exact/regex, slack_channel, slack_actor), enforces a per-automation hourly rate limit and per-thread concurrency, dispatches a session, and posts the result back into the originating thread. - shared: `slack` event source, SlackAutomationEvent + normalizer, and the three condition handlers, co-located under triggers/slack/. - control-plane: D1 migration 0025 (join table + run thread coordinates + blast-radius columns), store queries, scheduler slack handling, and the /internal/slack-event forwarding route. - slack-bot: channel-trigger ingress, bot-identity resolution (fail-closed auth.test cache), watched-channel cache, and thread-reply / concurrency-skip callbacks. - web: slack trigger type, condition editors, reply-in-thread + max-runs/hour controls, and an auto-triage template. Ships dark behind SLACK_TRIGGERS_ENABLED (Terraform var, default false). Structural cleanups landed alongside (behavior-preserving): a shared createAutomationEventRoute factory backing the github + slack webhook routes; a generic two-tier cache fetcher behind getRoutingRules/getWatchedChannels; a single rejectInvalidCallback guard for the signed callback routes; channel ingress extracted to its own module; and a shared isDuplicateKeyError / SlackRunColumns contract in the DB layer.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds Slack message-trigger automation across shared trigger primitives, control-plane persistence and routes, scheduler processing, Slack-bot ingress and callbacks, web UI configuration, and docs/production kill-switch settings. ChangesSlack Channel-Message Automation Triggers
Sequence Diagram(s)Slack channel-trigger ingress sequenceDiagram
participant Slack as Slack
participant ChannelTrigger as handleChannelTrigger
participant ControlPlaneRoute as createAutomationEventRoute
participant SchedulerDO
participant SlackAPI as Slack API
Slack->>ChannelTrigger: message event
ChannelTrigger->>ChannelTrigger: getBotUserId, getWatchedChannels, normalizeSlackEvent
ChannelTrigger->>ControlPlaneRoute: POST /internal/slack-event
ControlPlaneRoute->>SchedulerDO: POST /internal/event
SchedulerDO-->>ControlPlaneRoute: { triggered, skipped }
ControlPlaneRoute-->>ChannelTrigger: JSON response
ChannelTrigger->>SlackAPI: reactions.add eyes
Automation completion and skip fan-out sequenceDiagram
participant SchedulerDO
participant CompleteCallback as handleAutomationComplete
participant SkipCallback as handleAutomationSkip
participant SlackAPI as Slack API
SchedulerDO->>CompleteCallback: POST /callbacks/automation-complete
CompleteCallback->>SlackAPI: chat.postMessage
CompleteCallback->>SlackAPI: reactions.remove
SchedulerDO->>SkipCallback: POST /callbacks/automation-skip
SkipCallback->>SlackAPI: chat.postEphemeral
Estimated code review effort🎯 5 (Critical) | ⏱️ ~90+ minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
| if (!binding || !secret) return; | ||
|
|
||
| const automation = await store.getById(run.automation_id); | ||
| const body = buildSlackCompletionNotification({ |
There was a problem hiding this comment.
This ignores the stored reply_in_thread setting. buildSlackCompletionNotification is called for every Slack-origin run with thread coordinates, so an automation saved with replyInThread: false will still post the completion back into Slack. Please gate this on automation?.reply_in_thread !== 0 (or pass the flag into the builder); if we still want the 👀 reaction cleared when replies are disabled, that should be handled separately from posting the completion message.
There was a problem hiding this comment.
Fixed in 6f28d528. reply_in_thread is now threaded from the scheduler (notifySlackCompletion → buildSlackCompletionNotification) through to the bot's AutomationCompletePayload. When it's false the bot skips the thread post but still clears the 👀 reaction, so the triggering message isn't left marked. Added coverage in slack-completion.test.ts and callbacks.test.ts.
There was a problem hiding this comment.
Summary
Reviewed PR #819, feat(slack): trigger automations from watched channel messages, by @ColeMurray. The change adds Slack channel-message automation triggers across shared types, control-plane scheduling/storage, slack-bot ingress, web UI, tests, docs, and Terraform. GitHub reports 57 files changed with +3994/-238.
Overall the architecture is cohesive and the test coverage is strong, but I found one blocking correctness issue: the stored replyInThread / reply_in_thread control is ignored when the scheduler posts Slack completions.
Critical Issues
- [Functionality]
packages/control-plane/src/scheduler/durable-object.ts:626-reply_in_threadis persisted and exposed in the UI, but completion delivery does not check it, so automations saved withreplyInThread: falsestill post results back into Slack threads. I left an inline comment with the suggested fix.
Suggestions
- [Input Validation]
packages/control-plane/src/routes/automations.ts:488- ValidatemaxRunsPerHourserver-side asnullor a positive integer on create and update before persisting it. I left an inline comment.
Positive Feedback
- The Slack trigger path is integrated through the existing automation event/scheduler model rather than adding a parallel dispatch path.
- The watched-channel join table avoids full scans and JSON parsing for Slack candidate selection.
- Good coverage around condition validation, watched-channel lookup, rate limits, concurrency skips, deduplication, and slack-bot ingress filtering.
Verification
- Ran
npm run build -w @open-inspect/shared && npm run typecheck -w @open-inspect/control-plane; the targeted typecheck passes.
Verdict
Request Changes.
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (1)
packages/slack-bot/src/callbacks.test.ts (1)
314-381: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd failure-path tests for Slack API
ok: falseand thrown fetch errors.These new route tests currently assert validation + happy path only. Adding explicit failure-path assertions for
chat.postMessage/chat.postEphemeralwill prevent silent regressions in the fire-and-forget handlers.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/slack-bot/src/callbacks.test.ts` around lines 314 - 381, The test suites for the callback routes (automation-complete and automation-skip) are missing failure-path test cases for when Slack API calls fail. Add new test cases within the existing describe blocks that mock the fetch to simulate Slack API returning ok: false responses or throwing fetch errors, then verify that the error handlers (fire-and-forget handlers in postCallback) respond appropriately without crashing or silently failing. Include assertions on the response status and any error handling behavior to prevent regressions when Slack API calls fail.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/control-plane/src/db/automation-store.ts`:
- Around line 644-647: The isDuplicateKeyError function in automation-store.ts
currently matches any UNIQUE constraint violation by checking for the generic
"UNIQUE constraint failed" string, which can mask unrelated uniqueness failures
like scheduled_at collisions in scheduler dispatch. Instead of a generic UNIQUE
check, modify the return statement in isDuplicateKeyError to specifically detect
only the trigger-key constraint violation by checking for the actual constraint
name associated with the trigger-key field, ensuring that only the intended
duplicate-key error is detected while allowing other constraint violations to
surface properly.
In `@packages/control-plane/src/routes/automations.ts`:
- Around line 85-95: The validateSlackRequiredConditions function assumes that
triggerConfig.conditions is an array when calling .some() on it at lines 89 and
92, but it does not validate that conditions is actually an array. Since this
helper runs before shape validation in the update path, malformed input could
provide a non-array value and cause a runtime error. Add a guard using
Array.isArray() to check that the conditions variable is actually an array
before calling .some() on it, and return an appropriate error message if it is
not an array.
In `@packages/control-plane/src/scheduler/durable-object.ts`:
- Around line 636-640: Add timeout handling and response status validation to
both binding.fetch calls. For the fetch call to
"https://internal/callbacks/automation-complete" at line 636, create an
AbortController with a reasonable timeout and pass it in the fetch options to
prevent blocking on slow/unhealthy services. Additionally, check the response
status code and handle non-2xx responses appropriately instead of silently
ignoring them. Apply the same timeout and error handling pattern to the second
binding.fetch call mentioned at line 679 to ensure consistent behavior across
all cross-service calls.
In `@packages/control-plane/src/webhooks/automation-event.ts`:
- Around line 52-55: The code casts body to Record<string, unknown> but does not
validate that it is actually an object before accessing the source property,
which means null or primitive JSON values will throw when accessing
event.source. Add a guard clause immediately after the event variable assignment
to check if the parsed body is a non-null object using typeof and hasOwnProperty
or a similar validation method, and return the error function with a 400 status
code if the validation fails, before attempting to access event.source in the
subsequent condition.
In `@packages/slack-bot/src/callbacks.ts`:
- Around line 551-565: The postMessage call can return a response with ok: false
without throwing an error, but the current code does not check the response
status before proceeding to clear reactions and log success. Capture the return
value from the postMessage function call and check if the response has ok:
false. If it does, log an appropriate error message and return early to prevent
the success log from being recorded. Only proceed with the clearThinkingReaction
call and the success logging if postMessage returned a successful response.
- Around line 580-602: The postEphemeral() call in the handleAutomationSkip
function can throw exceptions due to network or runtime failures, but it is not
wrapped in a try/catch block. Currently only response errors are handled via the
if (!result.ok) check, leaving thrown exceptions unhandled and causing
background task rejections. Wrap the entire postEphemeral() call in a try/catch
block and add a catch handler that logs the caught error using the same logging
pattern as the existing error case, including trace_id, channel, user, and
outcome fields to ensure all failure modes are properly captured.
In `@packages/web/src/components/automations/automation-form.tsx`:
- Around line 236-240: The code uses Number.parseInt to parse maxRunsPerHour
which truncates decimal values (e.g., "1.9" becomes 1) even though the HTML
number input allows decimal entries. Replace Number.parseInt with
Number.parseFloat to preserve decimal values while maintaining the existing
validation logic that checks if the parsed value is greater than 0 and returns
null for empty or invalid inputs. This change should be made in the conditional
block where isSlack is true.
In `@packages/web/src/components/automations/condition-builder.tsx`:
- Around line 300-304: The onChange handler in the condition-builder.tsx file is
overwriting the entire flags string when toggling the case-insensitive option,
which causes loss of other valid regex flags like "m". Instead of setting flags
to just "i" or an empty string, preserve existing flags by conditionally adding
or removing only the "i" flag. When the checkbox is checked, append "i" to the
existing condition.value.flags if it is not already present; when unchecked,
remove "i" from the existing flags while keeping any other flags intact.
---
Nitpick comments:
In `@packages/slack-bot/src/callbacks.test.ts`:
- Around line 314-381: The test suites for the callback routes
(automation-complete and automation-skip) are missing failure-path test cases
for when Slack API calls fail. Add new test cases within the existing describe
blocks that mock the fetch to simulate Slack API returning ok: false responses
or throwing fetch errors, then verify that the error handlers (fire-and-forget
handlers in postCallback) respond appropriately without crashing or silently
failing. Include assertions on the response status and any error handling
behavior to prevent regressions when Slack API calls fail.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d0bac659-6499-4820-bc1a-3f36ea6d8ec5
📒 Files selected for processing (57)
docs/AUTOMATIONS.mddocs/integrations/SLACK.mdpackages/control-plane/src/db/automation-store.test.tspackages/control-plane/src/db/automation-store.tspackages/control-plane/src/routes/automations.tspackages/control-plane/src/scheduler/durable-object.tspackages/control-plane/src/scheduler/slack-completion.test.tspackages/control-plane/src/scheduler/slack-completion.tspackages/control-plane/src/webhooks/automation-event.tspackages/control-plane/src/webhooks/github.tspackages/control-plane/src/webhooks/index.tspackages/control-plane/src/webhooks/slack.tspackages/control-plane/test/integration/automation-store.test.tspackages/control-plane/test/integration/automations-slack-route.test.tspackages/control-plane/test/integration/cleanup.tspackages/control-plane/test/integration/scheduler-slack-events.test.tspackages/control-plane/test/integration/webhooks-slack.test.tspackages/shared/src/slack/client.tspackages/shared/src/slack/index.tspackages/shared/src/slack/mrkdwn.test.tspackages/shared/src/slack/mrkdwn.tspackages/shared/src/triggers/conditions.tspackages/shared/src/triggers/index.tspackages/shared/src/triggers/registry.tspackages/shared/src/triggers/slack/conditions.test.tspackages/shared/src/triggers/slack/conditions.tspackages/shared/src/triggers/slack/index.tspackages/shared/src/triggers/slack/normalizer.test.tspackages/shared/src/triggers/slack/normalizer.tspackages/shared/src/triggers/testing.tspackages/shared/src/triggers/types.tspackages/shared/src/types/index.tspackages/slack-bot/src/bot-identity.test.tspackages/slack-bot/src/bot-identity.tspackages/slack-bot/src/callbacks.test.tspackages/slack-bot/src/callbacks.tspackages/slack-bot/src/channel-trigger.test.tspackages/slack-bot/src/channel-trigger.tspackages/slack-bot/src/classifier/repos.test.tspackages/slack-bot/src/classifier/repos.tspackages/slack-bot/src/dm-utils.test.tspackages/slack-bot/src/dm-utils.tspackages/slack-bot/src/index.tspackages/slack-bot/src/internal-auth.tspackages/slack-bot/src/types/index.tspackages/web/src/app/(app)/automations/[id]/edit/page.tsxpackages/web/src/app/(app)/automations/[id]/page.tsxpackages/web/src/components/automations/automation-form.test.tsxpackages/web/src/components/automations/automation-form.tsxpackages/web/src/components/automations/condition-builder.test.tsxpackages/web/src/components/automations/condition-builder.tsxpackages/web/src/components/automations/trigger-type-selector.tsxpackages/web/src/lib/automation-templates.test.tspackages/web/src/lib/automation-templates.tsterraform/d1/migrations/0025_slack_triggers.sqlterraform/environments/production/variables.tfterraform/environments/production/workers-slack.tf
Addresses code-review feedback on #819 (PR review by open-inspect + CodeRabbit bots). Correctness: - Honor the stored reply_in_thread setting on Slack completion. The flag was persisted and shown in the UI but ignored: replyInThread=false still posted. Thread the flag from the scheduler through to the bot, which now skips the thread post but still clears the eyes reaction. Input validation / boundaries (untrusted JSON): - Validate maxRunsPerHour as null-or-positive-integer on create and update before it reaches the scheduler's rate-limit comparison. - Guard validateSlackRequiredConditions against a non-array `conditions` so the update path returns 400 instead of throwing a 500. - Reject non-object JSON bodies in the automation-event webhook factory (a null body returned 500 instead of 400). Robustness: - handleAutomationComplete now checks postMessage ok:false before logging success; handleAutomationSkip wraps its Slack call in try/catch (it runs in waitUntil); the scheduler's skip callback logs non-2xx responses. - Narrow isDuplicateKeyError to the trigger_key dedup index so unrelated UNIQUE violations surface as real errors instead of being swallowed. Atomicity: - Persist an automation and its watched-channel index in a single db.batch via createWithSlackChannels / updateWithSlackChannels, so trigger_config and the scheduler's channel index can never drift apart on a partial write. Web: - Preserve other regex flags when toggling case-insensitive; constrain the max-runs-per-hour input to integers. Deferred (follow-up): refactoring the scheduler's per-source Slack branches behind a source-policy boundary — premature with a single specialized source.
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
packages/control-plane/test/integration/automations-slack-route.test.ts (1)
145-151: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winTighten the
maxRunsPerHour: nullassertion to avoid false positives.
expect(res.status).not.toBe(400)can pass on unrelated failures, so this test may stop proving that validation truly acceptednull.Suggested tightening
it("accepts a null maxRunsPerHour (use the app default)", async () => { const res = await postAutomation( createBody({ triggerConfig: { conditions: validSlackConditions }, maxRunsPerHour: null }) ); - // Passes validation; only later fails at repo resolution in the test env. - expect(res.status).not.toBe(400); + // Passes maxRunsPerHour validation; request may still fail later in test env. + const body = await res.text(); + expect(body).not.toContain("maxRunsPerHour"); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/control-plane/test/integration/automations-slack-route.test.ts` around lines 145 - 151, The assertion in the test case for null maxRunsPerHour is too loose and can produce false positives by passing on unrelated failures. Replace the `expect(res.status).not.toBe(400)` assertion with a specific assertion checking for the expected success status code (likely 201 for a successful POST operation). This ensures the test definitively proves that validation accepts null maxRunsPerHour values rather than just confirming a validation error did not occur.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/slack-bot/src/callbacks.test.ts`:
- Around line 381-384: The throw-path tests in the test file can pass even if
background scheduling regresses because flushWaitUntil(ctx) no-ops when no call
exists. Add an explicit assertion before calling flushWaitUntil(ctx) in both
test blocks (the ones around line 381-384 and line 427-430) to verify that
waitUntil was actually called on the context object. This will ensure the test
fails if the background scheduling functionality breaks, making the tests more
robust and preventing silent regressions.
---
Nitpick comments:
In `@packages/control-plane/test/integration/automations-slack-route.test.ts`:
- Around line 145-151: The assertion in the test case for null maxRunsPerHour is
too loose and can produce false positives by passing on unrelated failures.
Replace the `expect(res.status).not.toBe(400)` assertion with a specific
assertion checking for the expected success status code (likely 201 for a
successful POST operation). This ensures the test definitively proves that
validation accepts null maxRunsPerHour values rather than just confirming a
validation error did not occur.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 77d6d53d-040c-4009-9761-2c2752789547
📒 Files selected for processing (14)
packages/control-plane/src/db/automation-store.test.tspackages/control-plane/src/db/automation-store.tspackages/control-plane/src/routes/automations.tspackages/control-plane/src/scheduler/durable-object.tspackages/control-plane/src/scheduler/slack-completion.test.tspackages/control-plane/src/scheduler/slack-completion.tspackages/control-plane/src/webhooks/automation-event.tspackages/control-plane/test/integration/automation-store.test.tspackages/control-plane/test/integration/automations-slack-route.test.tspackages/control-plane/test/integration/webhooks-slack.test.tspackages/slack-bot/src/callbacks.test.tspackages/slack-bot/src/callbacks.tspackages/web/src/components/automations/automation-form.tsxpackages/web/src/components/automations/condition-builder.tsx
🚧 Files skipped from review as they are similar to previous changes (11)
- packages/control-plane/src/scheduler/slack-completion.test.ts
- packages/control-plane/src/scheduler/slack-completion.ts
- packages/web/src/components/automations/condition-builder.tsx
- packages/control-plane/src/webhooks/automation-event.ts
- packages/control-plane/src/routes/automations.ts
- packages/control-plane/test/integration/automation-store.test.ts
- packages/control-plane/test/integration/webhooks-slack.test.ts
- packages/control-plane/src/scheduler/durable-object.ts
- packages/slack-bot/src/callbacks.ts
- packages/web/src/components/automations/automation-form.tsx
- packages/control-plane/src/db/automation-store.ts
|
Thanks both — addressed the review in Fixed
Deferred → #820: refactoring Push-backs (reasoning in-thread)
All suites green: control-plane 1326 unit + 416 integration, slack-bot 155, web 492; shared / github-bot / linear-bot unchanged. |
reply_in_thread was a dedicated NOT NULL column on the shared automations row, but it is slack_event-only and never SQL-queried — a Single-Table- Inheritance nullable-column smell that also wrote a meaningless value onto every cron/github automation. Move it into trigger_config JSON (keyed by trigger_type, like conditions) and drop the column from migration 0025, which is still unmerged so there is no data migration. - shared: add optional TriggerConfig.replyInThread - store: drop the column from AutomationRow/INSERT/UPDATE; parse once in toAutomation; add getReplyInThread() as the single default-true source - routes: fold the top-level API flag into the stored config (slack only); the update path preserves it across a conditions-only edit - scheduler: read reply-in-thread from the parsed trigger_config - keep max_runs_per_hour a column, recommented as a generic rate limit - drop the now-unneeded #716 issue citations from inline comments The camelCase API (Automation.replyInThread) is unchanged, so web and bot consumers are unaffected. Behavior-preserving: the prior column default is now a default-in-code (?? true).
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/control-plane/src/db/automation-store.ts (1)
542-567: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick winAvoid a public write path that can desync the Slack channel index.
setSlackChannels()updatesautomation_slack_channelsby itself, while the rest of this change treats that table as a derived index that should move in lockstep withtrigger_config. Keeping this public makes it easy for a future caller to drift the index from the canonical config.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/control-plane/src/db/automation-store.ts` around lines 542 - 567, `setSlackChannels()` is exposing a standalone write path for `automation_slack_channels`, even though that table should stay synchronized with the canonical `trigger_config` state. Make the Slack-channel update flow private/internal and drive it only from the existing trigger-config update path in `automation-store.ts` (using `bindSlackChannelStatements()`), so external callers can’t mutate the derived index independently.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@packages/control-plane/src/db/automation-store.ts`:
- Around line 542-567: `setSlackChannels()` is exposing a standalone write path
for `automation_slack_channels`, even though that table should stay synchronized
with the canonical `trigger_config` state. Make the Slack-channel update flow
private/internal and drive it only from the existing trigger-config update path
in `automation-store.ts` (using `bindSlackChannelStatements()`), so external
callers can’t mutate the derived index independently.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e8605dac-9a7d-47f5-b745-53235634b746
📒 Files selected for processing (9)
packages/control-plane/src/db/automation-store.test.tspackages/control-plane/src/db/automation-store.tspackages/control-plane/src/routes/automations.tspackages/control-plane/src/scheduler/durable-object.tspackages/control-plane/src/scheduler/slack-completion.tspackages/control-plane/test/integration/automation-store.test.tspackages/control-plane/test/integration/automations-slack-route.test.tspackages/shared/src/triggers/conditions.tsterraform/d1/migrations/0025_slack_triggers.sql
🚧 Files skipped from review as they are similar to previous changes (5)
- packages/control-plane/src/scheduler/slack-completion.ts
- packages/control-plane/test/integration/automations-slack-route.test.ts
- packages/control-plane/src/db/automation-store.test.ts
- packages/control-plane/src/routes/automations.ts
- packages/control-plane/src/scheduler/durable-object.ts
Move all slack_event watched-channel persistence out of the generic AutomationStore into a dedicated SlackChannelStore, so trigger-source- specific code no longer leaks into the canonical automation store (slack was the only source with bespoke store methods). The automation+channels write stays atomic: the create/update routes compose one db.batch from the two single-table stores' prepared-statement builders (AutomationStore.bindAutomationInsert/Update + SlackChannelStore.bindChannelStatements), replacing the former createWithSlackChannels/updateWithSlackChannels helpers. The automation_slack_channels table and migration 0025 are unchanged.
Resolve the scheduler conflict in durable-object.ts: keep the slack imports (SlackChannelStore, getReplyInThread, SlackRunColumns, isDuplicateKeyError, slack-completion) and SLACK_RATE_LIMIT_WINDOW_MS alongside main's new RECOVERY_SWEEP_LIMIT and limit-bearing recovery-sweep query signatures. Renumber the slack migration 0025_slack_triggers -> 0026_slack_triggers to avoid colliding with main's 0025_optimize_active_run_lookup. d1-migrate.sh tracks applied migrations by numeric prefix, so two 0025s would record one version and silently skip the second (the slack tables) in prod.
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
The automation_runs ledger is shared by all six trigger sources, but it
carried four slack-only columns (slack_channel, slack_thread_ts,
slack_message_ts, actor_user_id) that were NULL for the other five — a
Single-Table-Inheritance nullable-column smell. None were ever queried
(no WHERE/JOIN/index); they are pure payload read back only to post a
run's result into its originating Slack thread. actor_user_id was
write-only (nothing read it).
Collapse the three live coordinates into one generic, source-agnostic
trigger_run_metadata TEXT column holding {channel, threadTs?, messageTs}
for slack runs (absent otherwise), and drop actor_user_id. The canonical
run-row layer (AutomationRunRow) is now free of slack specifics;
interpretation lives in the slack files — slackRunMetadata() serializes
on the write side, getSlackRunMetadata() parses on the read side. Keeps
the single-row INSERT (no atomicity change) and mirrors the earlier
reply_in_thread -> trigger_config move. Migration 0026 reshaped in place
(branch unmerged; no backfill).
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
Remove the duplicate top-level replyInThread from the shared Automation, CreateAutomationRequest, and UpdateAutomationRequest types. The setting now lives only inside TriggerConfig, interpreted via trigger_type like conditions. Delete the server-side fold (create) and preservation-merge (update): the web form owns the full trigger_config blob and always submits replyInThread within it. toAutomation no longer echoes the field; getReplyInThread stays the single read-path accessor (scheduler and edit page read straight from trigger_config). Also align maxRunsPerHour doc comments — a generic per-automation hourly cap enforced for slack_event today, not a slack-only field.
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
Revert getRoutingRules to its main form (inline routingRulesLocalCache + getRoutingRulesFromCache) and remove the generic CacheSlot/fetchCachedList engine this PR introduced. Implement getWatchedChannels as a KV-backed read-through with no in-memory tier: serve the KV last-known-good set, refresh from the control plane on a miss, write KV, and fail closed to an empty set. KV absorbs the per-message hot path. bot-identity's cache is left unchanged.
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
Address PR review on slack_event automations: - Validate slack condition value shapes at the handler level (untrusted JSON is type-asserted, not checked): a string slack_channel value passed the type's .length check and was then iterated character-by- character into the watched-channel index. slack_channel/slack_actor now require a non-empty string array; text_match requires an object value with a string pattern (a null/primitive value previously threw a 500). - Reject a non-boolean replyInThread on write and coerce any non-boolean to the default on read, so the scheduler -> bot completion payload is always the boolean the callback validator requires. - Reject clearing triggerConfig to null on slack_event updates, which would leave the automation enabled but untriggerable. - Mark setSlackChannels @internal (test-only); production writes batch bindChannelStatements with the automation row. - Web form validates maxRunsPerHour with Number/Number.isInteger instead of parseInt, so 1.5 is rejected rather than truncated to 1.
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
The 'Automation completed/failed' thread reply was noise. The scheduler → bot completion callback now only clears the 👀 reaction on the triggering message; success/failure is observed in the web UI run history. With no message ever posted, the replyInThread setting controlled nothing, so it is removed end to end (TriggerConfig field, store accessor, route validation, scheduler payload, and the web form checkbox). SlackRunMetadata drops its now-dead threadTs.
The per-automation hourly run cap added arbitrary friction: the default was too low and any fixed ceiling is a design smell. Remove it end to end — the form field, the maxRunsPerHour request/response fields, the route validation, the scheduler rate-limit branch, the countRunsInWindow query, and the column (migration 0027 drops it; 0026 stays untouched). The per-thread concurrency guard remains the backstop against runaway slack_event runs.
A slack_event automation no longer requires a text_match condition — only a slack_channel. Without a text filter it responds to every message in the watched channel, which is the desired 'respond to any reply' behavior. Channel scoping stays required so a trigger is always bounded to specific channels.
Pasting raw channel IDs into the slack_channel condition was impractical. Add a channel picker: a new conversations.list-backed shared client method (listChannels), an internal control-plane route GET /integration-settings/slack/channels, a web proxy + useSlackChannels hook, and a SlackChannelPicker that resolves IDs to #names and stores IDs. It degrades to manual channel-ID entry when listing is unavailable (no bot token, missing channels:read/groups:read scopes, or a Slack error), and flags channels the bot hasn't been invited to.
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
A reply in a thread whose automation run is still in flight was dropped
with an ephemeral "a run is already active" notice, so users could never
steer a running channel-trigger agent.
Route such follow-ups to the active run's session via /internal/prompt
(the same queue the interactive @mention/DM path uses): the message lands
as the next turn on the persistent OpenCode session, and its reply posts
back in-thread via a slack completion callback. The active-run lookup now
precedes condition matching, since a natural reply ("also do X") won't
contain the trigger keyword — conditions gate new runs, not replies.
Falls back to the "already active" notice only when the run has no
session yet (still starting) or the enqueue fails. Non-slack sources keep
condition-then-concurrency ordering unchanged.
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
Broaden the channel-automation thread lookup from the active run to the latest run with a session within a 24h window, so every reply continues the same session — during the run and after it completes or fails — matching the interactive @mention path. Steering precedes condition matching: a reply is a steer, not a new trigger. - add AutomationStore.getLatestSteerableRunForThread + the backing index idx_runs_thread_continuity (migration 0028) - SLACK_THREAD_CONTINUITY_WINDOW_MS (24h, measured from the root trigger, matching the interactive thread->session KV TTL) - steer failure falls through to the trigger path (stale-session recovery) - non-slack dispatch is unchanged; the active-run lookup just moves after condition matching Extends 9a7b7a0's in-run steering to cover replies after completion.
Terraform Validation Results
Pushed by: @ColeMurray, Action: |
Summary
Adds a
slackautomation event source: messages posted in watched Slack channels auto-trigger coding sessions. Built as an extension of the existing automation engine (event source + condition handlers + scheduler dispatch) rather than a parallel path. Closes #716.Ships dark behind
SLACK_TRIGGERS_ENABLED(defaultfalse) — no behavior change until an operator flips the flag.How it works
automation_slack_channelsjoin table (no full-scan/JSON parse), evaluates conditions, enforces a per-automation hourly rate limit + per-thread concurrency, dispatches, and (scheduler-owned) posts the thread reply.Conditions
text_matchcontains/exact/regex(nativeRegExp, length + flag allowlist capped)slack_channelslack_actorinclude/excludeby Slack userWhat's included
slackevent source,SlackAutomationEvent+ normalizer, and the three condition handlers, co-located undertriggers/slack/.0025(join table + run thread coordinates + blast-radius columns), store queries, scheduler slack handling,/internal/slack-eventroute.slack_triggers_enabledvariable wired to the bot binding.Safety / rollout
SLACK_TRIGGERS_ENABLED(default off).max_runs_per_hour, default 10) + per-thread concurrency guard; both record auditableskippedruns.docs/integrations/SLACK.md.Structural cleanups (behavior-preserving)
Landed alongside the feature, each removing real duplication:
createAutomationEventRoutefactory backing the github + slack webhook routes (was ~90% duplicated).getRoutingRules/getWatchedChannels(3rd hand-rolled copy → one engine).rejectInvalidCallbackguard for the four signed callback routes (one signature-verification path).channel-trigger.tsmodule (keptindex.tsfrom sprawling further).isDuplicateKeyError+SlackRunColumnscontract moved to the DB layer.Testing
Operator rollout checklist (post-merge, not in this PR)
message.channels(+message.groupsfor private) event subscriptions andchannels:history/groups:historyscopes to the Slack app; reinstall.slack_eventautomation with a channel + text condition; invite the bot to that channel.slack_triggers_enabled = trueand apply.falsehalts ingestion.Summary by CodeRabbit