feat: transaction response API for device set/get commands#30774

Open
rusty-art wants to merge 1 commit into Koenkk:dev from rusty-art:master

@rusty-art rusty-art commented Jan 24, 2026

Closes #30679

What

Adds request/response support for device commands, mirroring the existing bridge pattern (docs).

  • zigbee2mqtt/FRIENDLY_NAME/request/set → response on zigbee2mqtt/FRIENDLY_NAME/response/set
  • zigbee2mqtt/FRIENDLY_NAME/request/get → response on zigbee2mqtt/FRIENDLY_NAME/response/get
  • Existing /set and /get topics are unchanged — fully backward compatible

Response shape

/response/set — echoes requested values

Success:

{"data": {"state": "ON", "brightness": 200}, "status": "ok", "z2m_transaction": "my-id"}

Error:

{"data": {}, "status": "error", "error": "failed:brightness", "z2m_transaction": "my-id"}

Superseded (sleepy device, newer command replaced queued one):

{"data": {}, "status": "error", "error": "superseded:brightness", "z2m_transaction": "my-id"}

data echoes the values from the request, not from the device or cache. It confirms what was accepted, not current device state.

/response/get — status only, no data

{"data": {}, "status": "ok", "z2m_transaction": "my-id"}

GET responses deliberately omit data values. Actual device values arrive on the state topic (zigbee2mqtt/FRIENDLY_NAME), which is the authoritative source. This avoids a race condition where the state cache may not yet be updated when the response is built.
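The response shapes above can be written down as a type plus a runtime check. This is a minimal sketch, not code from the PR: the names `DeviceResponse` and `isDeviceResponse` are hypothetical, and `z2m_transaction` is assumed to be a string because the examples use one.

```typescript
// Hypothetical names; the field names are taken from the example payloads.
interface DeviceResponse {
    data: Record<string, unknown>;
    status: "ok" | "error";
    error?: string;
    // Assumption: a string, as in the examples.
    z2m_transaction?: string;
}

// Narrow an untrusted, already-parsed MQTT payload to DeviceResponse.
function isDeviceResponse(value: unknown): value is DeviceResponse {
    if (typeof value !== "object" || value === null) return false;
    const v = value as Record<string, unknown>;
    const dataOk = typeof v.data === "object" && v.data !== null && !Array.isArray(v.data);
    const statusOk = v.status === "ok" || v.status === "error";
    const errorOk = v.error === undefined || typeof v.error === "string";
    const txOk = v.z2m_transaction === undefined || typeof v.z2m_transaction === "string";
    return dataOk && statusOk && errorOk && txOk;
}

// The SET success and GET examples from this description.
const setOk: unknown = JSON.parse('{"data": {"state": "ON", "brightness": 200}, "status": "ok", "z2m_transaction": "my-id"}');
const getOk: unknown = JSON.parse('{"data": {}, "status": "ok", "z2m_transaction": "my-id"}');
```

A client would still read actual device values from the state topic; the check only validates the envelope.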

Common fields

  • z2m_transaction: optional, echoed back if provided in request
  • error format: "group:key1,key2|group:key3", parseable via split('|') → split(':') → split(',')
  • superseded: herdsman replaced a queued command with a newer one (not a real failure)
  • QoS of response matches request QoS
  • retain: false (responses are events, not state)
  • Ping: send {"z2m_transaction": "ping1"} (or {}) to /request/set → {"data": {}, "status": "ok"}
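The error format can be unpacked on the client with the three splits described in the list. The helper below is a sketch; `parseErrorString` is a hypothetical name, and the combined input string is illustrative, built from the `failed:` and `superseded:` prefixes shown earlier.

```typescript
// Parse "group:key1,key2|group:key3" into a map of group -> keys.
function parseErrorString(error: string): Map<string, string[]> {
    const groups = new Map<string, string[]>();
    for (const part of error.split("|")) {
        if (part === "") continue;
        const pieces = part.split(":");
        const group = pieces[0] ?? "";
        const keys = (pieces[1] ?? "").split(",").filter((k) => k !== "");
        // Merge if the same group appears more than once.
        groups.set(group, (groups.get(group) ?? []).concat(keys));
    }
    return groups;
}

// Illustrative input mixing both prefixes.
const parsedError = parseErrorString("failed:brightness|superseded:state,color");
```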

Why z2m_transaction instead of transaction

The bridge uses transaction. We use z2m_transaction because CSM-300ZB has a real device attribute called transaction (shinasystem.js). A flat transaction field would collide.
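To illustrate the collision, the sketch below separates the correlation ID from the part of the payload that would reach the converters. It is an assumption-laden example: `splitTransaction` is a hypothetical helper that copies instead of mutating, and the value 500 for the device attribute is made up.

```typescript
type Payload = Record<string, unknown>;

// Split a request payload into the correlation ID and the part forwarded
// to converters. The input object is left untouched.
function splitTransaction(payload: Payload): {transaction: unknown; forwarded: Payload} {
    const {z2m_transaction: transaction, ...forwarded} = payload;
    return {transaction, forwarded};
}

// A CSM-300ZB style request: `transaction` is a device attribute here
// (hypothetical value), while `z2m_transaction` is the correlation ID.
const collisionExample = splitTransaction({transaction: 500, z2m_transaction: "my-id"});
// collisionExample.forwarded still carries the device's `transaction` attribute.
```

With a flat `transaction` correlation field, the two meanings could not be told apart in the same payload.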

Diff

 lib/extension/publish.ts   | +56 (regex, ParsedTopic.isRequest, response logic, superseded detection)
 lib/mqtt.ts                | +3  (QoS propagation through event system)
 lib/extension/frontend.ts  | +2  (QoS for WebSocket messages)
 lib/types/types.d.ts       | +1  (qos field on MQTTMessage event)
 package.json               | +1  (zigbee-herdsman 9.0.2 → 9.0.6 for superseded error)
 test/extensions/publish.ts | +131 (13 tests covering all response branches)
 test/controller.test.ts    | +2/-2 (update existing tests for qos field)

Test plan

  • 13 unit tests covering all response branches (set success, get status-only, errors, superseded, ping, groups, QoS, endpoint/attribute topics)
  • 752 tests pass, 0 failures
  • Biome lint clean
  • Manual testing with mains and sleepy devices

Related PRs for Frontends

Frontend PR (minimal): Nerivec/zigbee2mqtt-windfront#409

Frontend PR (full implementation with visual feedback):
https://github.com/rusty-art/zigbee2mqtt-windfront/tree/transaction-response-api

Testing Together

To test the full feature end-to-end, use the transaction-response-api branch of the windfront fork:

  • Backend: Publishes structured responses on <device>/response/set and <device>/response/get
  • Frontend: Consumes responses to show real-time command status (pending/success/error indicators, sleepy device support, retry on
    failure)
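On the consuming side, pairing responses with requests comes down to tracking `z2m_transaction` values. The following is a rough sketch of that idea, not the windfront implementation: `RequestTracker`, its methods, and the `req-N` ID scheme are all hypothetical, and MQTT transport is left out so the logic stays self-contained.

```typescript
interface DeviceResponse {
    data: Record<string, unknown>;
    status: "ok" | "error";
    error?: string;
    z2m_transaction?: string;
}

type UiState = "pending" | "success" | "error";

class RequestTracker {
    private readonly states = new Map<string, UiState>();
    private counter = 0;

    // Build topic and payload for a set request and mark it as pending.
    createSetRequest(
        baseTopic: string,
        friendlyName: string,
        values: Record<string, unknown>,
    ): {id: string; topic: string; payload: string} {
        const id = `req-${++this.counter}`;
        this.states.set(id, "pending");
        return {
            id,
            topic: `${baseTopic}/${friendlyName}/request/set`,
            payload: JSON.stringify({...values, z2m_transaction: id}),
        };
    }

    // Feed the raw message received on <device>/response/set.
    handleResponse(raw: string): void {
        const response = JSON.parse(raw) as DeviceResponse;
        const id = response.z2m_transaction;
        if (id === undefined || !this.states.has(id)) return; // not one of ours
        this.states.set(id, response.status === "ok" ? "success" : "error");
    }

    stateOf(id: string): UiState | undefined {
        return this.states.get(id);
    }
}
```

The caller publishes `topic` and `payload` with whatever MQTT or WebSocket client it already uses, then passes incoming response messages to `handleResponse`.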

@Koenkk
Owner

Koenkk commented Jan 24, 2026

I think it's good to have a request/response-like API similar to how it is implemented for the bridge (docs). For this, existing functions like getResponse can be reused (which should also reduce the amount of code to add by a lot). I propose something like: zigbee2mqtt/FRIENDLY_NAME/request/set, which then sends a response to zigbee2mqtt/FRIENDLY_NAME/response/set (same as the current zigbee2mqtt/FRIENDLY_NAME/set

@rusty-art
Author

The consistency with bridge makes a lot of sense. I've worked through the changes and have a few questions - happy to go with whatever approach you prefer.

1. Correlation field naming

Nerivec previously flagged that CSM-300ZB has a "transaction" device attribute (0-1000ms enum). If we use flat "transaction" for correlation, those users can't set their device attribute AND use request-response simultaneously - we'd have to strip it before forwarding, breaking their device config.

Options:

  • Nested: "z2m: { request_id }" (current) - entire z2m object stripped before forwarding, no collision possible
  • Flat with prefix: "transaction_id" or "transaction_request" (or more unique like z2m_id) - more similar to bridge pattern, avoids CSM-300ZB collision
  • Flat: "transaction" - matches bridge exactly but breaks device(s): CSM-300ZB users can't set the transaction attribute

Preference?

2. Response metadata

Bridge responses are minimal. For device commands, I've added fields that frontends/automation can use for better sleepy-device handling and performance monitoring:

  • "elapsed_ms" - latency monitoring (frontend can show "responded in 45ms")
  • "status: pending" - for sleepy battery devices (command queued, will deliver when device wakes)
  • "status: partial" - some attributes succeeded, some failed (per-attribute tracking)
  • "transmission_type: multicast" + "member_count" - for group commands (no per-device ACK per ZCL spec)
  • "final" - enables multi-response streaming (e.g., if a command triggers multiple state updates over time); but currently always 'true'

These can be wrapped in the z2m structure (item 1 above) or flattened like z2m_transaction_id, z2m_status, z2m_type, etc.
They allow frontends to give users clear feedback on the status of set requests and on sleepy-device handling.

Should we keep these, or go minimal to match bridge exactly for now (foregoing some better frontend UX)?

3. Request topic (optional)

We could add {device}/request/set as an alternative to {device}/set (and similar for get) for clients that want explicit request-response semantics. This would mirror bridge exactly, but introduce extra messages and the slow/painful migration for clients to deprecate {device}/[set|get]/ and replace with {device}/request/[get|set]. Probably unnecessary, but just wanted to check if you wanted to introduce that as well.

(FYI, Ecosystem impact: I checked Home Assistant and windfront - neither will break. It seems that HA uses specific topics from discovery (not wildcards on device topics). Windfront uses WebSocket with the backend forwarding all MQTT messages. Both approaches just add new topics; existing /set and /get behavior unchanged.)

Let me know your thoughts and I'll adjust accordingly.

@rusty-art rusty-art marked this pull request as draft January 25, 2026 02:33
@rusty-art rusty-art force-pushed the master branch 3 times, most recently from 25d57ea to 210e118 Compare March 1, 2026 03:12
@rusty-art rusty-art changed the title feat: implement Transaction Response API feat: transaction response API for device set/get commands Mar 1, 2026
Adds request/response support for device commands, mirroring the
existing bridge pattern. Clients send to /request/set or /request/get
and receive structured responses on /response/set or /response/get.

SET responses echo the requested values. GET responses return status
only (no data) — actual values arrive on the state topic to avoid a
race condition between convertGet() resolving and the cache update.

Includes z2m_transaction correlation, superseded command detection,
QoS propagation, and ping support.

@rusty-art rusty-art left a comment

explanatory comments only.

  // Used by `publish.test.ts` to reload regex when changing `mqtt.base_topic`.
  export const loadTopicGetSetRegex = (): void => {
-     topicGetSetRegex = new RegExp(`^${settings.get().mqtt.base_topic}/(?!bridge)(.+?)/(get|set)(?:/(.+))?$`);
+     topicGetSetRegex = new RegExp(`^${settings.get().mqtt.base_topic}/(?!bridge)(.+?)/(request/)?(get|set)(?:/(.+))?$`);

The (request/)? group is optional (?), so this regex still matches legacy /set and /get topics exactly as before. The new capture group shifts the subsequent group indices (match[2] → isRequest, match[3] → type, match[4] → attribute). Backwards compatibility is verified by all existing tests passing unchanged, plus an explicit "Should NOT publish response for legacy /set topic" test.
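The behaviour of the new pattern can be checked in isolation. The sketch below assumes `base_topic` is the default `zigbee2mqtt` (the real code reads it from settings); `TopicParts` and `parseTopic` are hypothetical, simplified names and not the `ParsedTopic` type in `publish.ts`.

```typescript
// Assumption: default base topic; the PR builds this from settings.
const baseTopic = "zigbee2mqtt";
const topicGetSetRegex = new RegExp(`^${baseTopic}/(?!bridge)(.+?)/(request/)?(get|set)(?:/(.+))?$`);

// Simplified, hypothetical result type.
interface TopicParts {
    id: string;
    isRequest: boolean;
    type: "get" | "set";
    attribute: string | undefined;
}

function parseTopic(topic: string): TopicParts | undefined {
    const match = topicGetSetRegex.exec(topic);
    if (!match) return undefined;
    // Capture groups in order: ID, optional "request/", get|set, optional attribute.
    const [, id = "", request, type, attribute] = match;
    return {id, isRequest: request !== undefined, type: type as "get" | "set", attribute};
}

const legacyTopic = parseTopic("zigbee2mqtt/lamp/set"); // isRequest: false
const requestTopic = parseTopic("zigbee2mqtt/lamp/request/get"); // isRequest: true
const bridgeTopic = parseTopic("zigbee2mqtt/bridge/request/set"); // undefined: excluded by (?!bridge)
```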

return;
}

// Extract and strip z2m_transaction before forwarding to converters

We strip z2m_transaction from the message before forwarding to converters. This is necessary because the CSM-300ZB device (shinasystem.js) has a real device attribute called transaction — if we'd used that name, the converter would consume it as a device setting. We use z2m_transaction to avoid the collision, and strip it here so converters never see it.

delete message.z2m_transaction;
}

// Ping: /request/ topic with empty payload after stripping z2m_transaction

Ping pattern: after stripping z2m_transaction, if the payload is empty ({}), we short-circuit with an immediate {data: {}, status: "ok"} response. This lets clients verify the bridge is responsive without generating any Zigbee traffic. Useful for health checks and connection validation.
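As a sketch of that short-circuit, the hypothetical helper below returns an immediate response when nothing is left after removing `z2m_transaction`, and `undefined` otherwise so normal handling can continue. Echoing the ID in the ping response is an assumption based on the "echoed back if provided" rule in the description.

```typescript
type Payload = Record<string, unknown>;

interface PingResponse {
    data: Record<string, never>;
    status: "ok";
    z2m_transaction?: unknown;
}

// Hypothetical helper: detect a ping and build its response.
function tryPing(payload: Payload): PingResponse | undefined {
    const {z2m_transaction: transaction, ...rest} = payload;
    if (Object.keys(rest).length > 0) return undefined; // real command, not a ping
    const response: PingResponse = {data: {}, status: "ok"};
    // Assumption: the correlation ID is echoed for pings as for other requests.
    if (transaction !== undefined) response.z2m_transaction = transaction;
    return response;
}
```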

// biome-ignore lint/style/noNonNullAssertion: always Error
logger.debug((error as Error).stack!);
if (parsedTopic.isRequest) {
((error as Error).message?.includes("Request superseded") ? supersededKeys : failedKeys).push(originalKey);

Herdsman ≥9.0.5 throws "Request superseded" when a queued command for a sleepy device is replaced by a newer one. We distinguish this from generic failures so clients can tell the difference between "your command was replaced" vs "your command failed." Both map to status: "error" but with different error string prefixes (superseded: vs failed:).
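The classification can be sketched as a small pure function that sorts per-attribute errors and joins them into the `group:key1,key2|group:key3` format. `buildErrorString` is a hypothetical name, and emitting `failed` before `superseded` is an assumption, since the PR does not state the ordering.

```typescript
// Sort per-attribute errors into failed vs superseded and build the error string.
function buildErrorString(errors: Array<{key: string; error: Error}>): string | undefined {
    const failedKeys: string[] = [];
    const supersededKeys: string[] = [];
    for (const {key, error} of errors) {
        // Same message check as in the diff above.
        (error.message.includes("Request superseded") ? supersededKeys : failedKeys).push(key);
    }
    const parts: string[] = [];
    // Assumption: "failed" group first, then "superseded".
    if (failedKeys.length > 0) parts.push(`failed:${failedKeys.join(",")}`);
    if (supersededKeys.length > 0) parts.push(`superseded:${supersededKeys.join(",")}`);
    return parts.length > 0 ? parts.join("|") : undefined;
}
```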

  if (!this.publishedTopics.has(topic)) {
      logger.debug(() => `Received MQTT message on '${topic}' with data '${message.toString()}'`, NS);
-     this.eventBus.emitMQTTMessage({topic, message: message.toString()});
+     this.eventBus.emitMQTTMessage({topic, message: message.toString(), qos: /* v8 ignore next */ packet?.qos ?? 0});

Propagates the MQTT QoS level from the incoming packet into the event system, so the response can be published at the same QoS as the request. Previously QoS was discarded at the event boundary. The ?? 0 fallback handles edge cases where packet is undefined (e.g. WebSocket messages via frontend.ts).
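A stripped-down sketch of that flow, with hypothetical type and function names standing in for the real event bus types: the incoming packet's QoS is carried on the event, and the response is published with the same QoS and `retain: false`.

```typescript
type QoS = 0 | 1 | 2;

// Hypothetical, simplified stand-ins for the real packet and event types.
interface IncomingPacket {
    qos?: QoS;
}

interface MessageEvent_ {
    topic: string;
    message: string;
    qos: QoS;
}

// Carry the request QoS across the event boundary; default to 0 when absent.
function toMessageEvent(topic: string, message: string, packet?: IncomingPacket): MessageEvent_ {
    return {topic, message, qos: packet?.qos ?? 0};
}

// Publish options for the response: same QoS as the request, never retained.
function responseOptions(request: MessageEvent_): {qos: QoS; retain: false} {
    return {qos: request.qos, retain: false};
}
```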

await flushPromises();
expect(mockLogger.error).toHaveBeenCalledWith("Entity 'an_unknown_entity' is unknown");
});


13 tests covering all transaction response branches: success with/without z2m_transaction, converter failure, superseded error, legacy topic (no response), ping pattern, QoS matching, group commands, attribute-in-topic, endpoint-in-topic, GET response (status only, no data), partial success, and mixed superseded+failed outcomes.

@rusty-art rusty-art marked this pull request as ready for review March 1, 2026 04:51
@rusty-art
Author

@Koenkk, now implemented and rebased onto dev. Thanks for the guidance on the bridge pattern, it worked out really well!

Quick summary of what's in the updated PR:

  • Bridge-style topics and response shape: {device}/request/set → {device}/response/set with the same {data, status, error} structure as getResponse. Legacy /set and /get are fully unchanged - no response published.

  • z2m_transaction instead of transaction: Used a prefixed name to avoid the CSM-300ZB collision that Nerivec flagged (it has a real transaction device attribute). Otherwise follows the same correlation pattern as bridge.

  • Didn't reuse getResponse directly - two small differences: we need z2m_transaction instead of transaction, and on partial success we return the succeeded attributes in data alongside the error (bridge always returns empty data on error). The response building is only ~12 lines inline in publish.ts though. Happy to refactor into a shared helper if you think that's cleaner!

  • Kept it minimal: Dropped the extra metadata (elapsed_ms, pending status, etc.) from my earlier proposal. Just data, status, error, and z2m_transaction - simple and consistent with bridge.

I've added inline comments on the key design decisions. Total diff is +187/-10 across 6 files with 13 new tests. Let me know if anything needs adjusting.

@Koenkk
Owner

Koenkk commented Mar 2, 2026

Nice! I need to do an in-depth review. @Nerivec, do you think this is also useful for the frontend (to get better error messages when changing stuff through the exposes tab)?
