Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Updates the library to v1.12.0 by expanding audio and LLM capabilities and syncing the JS/TS API with new native bindings.
Changes:
- Replaces CactusVAD with a broader CactusAudio API (VAD + diarization + speaker embeddings) and updates exports/docs.
- Adds LM “prefill” support and optional “thinking” output, plus STT language override support.
- Updates native interfaces (Nitro spec + C++ FFI) and bumps version/runtime metadata to 1.12.0.
Reviewed changes
Copilot reviewed 31 out of 37 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/types/CactusVAD.ts | Removes legacy VAD-only type definitions. |
| src/types/CactusSTT.ts | Adds optional language to stream transcription start options. |
| src/types/CactusLM.ts | Adds enableThinking, prefill params/result types, and optional thinking field in completion result. |
| src/types/CactusAudio.ts | Introduces unified audio types for VAD/diarize/embedSpeaker. |
| src/specs/Cactus.nitro.ts | Extends Nitro spec with prefill, diarize, and embedSpeaker. |
| src/native/Cactus.ts | Wires new options and adds JS wrappers for prefill, diarize, and embedSpeaker. |
| src/modelRegistry.ts | Bumps runtime version constant to 1.12.0. |
| src/index.tsx | Switches public exports to CactusAudio/useCactusAudio and new LM types. |
| src/hooks/useCactusVAD.ts | Reworks hook to manage CactusAudio and adds diarize/embedSpeaker methods. |
| src/classes/CactusVAD.ts | Reworks class into CactusAudio with new audio methods. |
| src/classes/CactusLM.ts | Adds prefill() API and internal buffer sizing logic. |
| package.json | Bumps package version to 1.12.0. |
| nitrogen/generated/shared/c++/HybridCactusSpec.hpp | Adds prefill, diarize, and embedSpeaker virtual methods. |
| nitrogen/generated/shared/c++/HybridCactusSpec.cpp | Registers new Nitro hybrid methods. |
| ios/cactus.xcframework/ios-arm64/cactus.framework/Headers/kernel_utils.h | Adds SIMD helpers and Android thread pinning utilities. |
| ios/cactus.xcframework/ios-arm64/cactus.framework/Headers/kernel.h | Extends kernel API surface (ops, attention params, conv, pooling, etc.). |
| ios/cactus.xcframework/ios-arm64/cactus.framework/Headers/graph.h | Extends graph op types/params and adds helpers/new ops. |
| ios/cactus.xcframework/ios-arm64/cactus.framework/Headers/gemma_tools.h | Improves tool formatting/parsing and supports alternate tags. |
| ios/cactus.xcframework/ios-arm64/cactus.framework/Headers/engine.h | Extends engine config/model/tokenizer features and adds vocab bias support. |
| ios/cactus.xcframework/ios-arm64/cactus.framework/Headers/cactus_utils.h | Adds audio preprocessing, options parsing refactors, thinking extraction, and validation helpers. |
| ios/cactus.xcframework/ios-arm64/cactus.framework/Headers/cactus_ffi.h | Extends FFI with prefill, diarize, speaker embedding, logging, and graph export. |
| ios/cactus.xcframework/ios-arm64-simulator/cactus.framework/Headers/kernel_utils.h | Mirrors kernel_utils updates for simulator slice. |
| ios/cactus.xcframework/ios-arm64-simulator/cactus.framework/Headers/kernel.h | Mirrors kernel API updates for simulator slice. |
| ios/cactus.xcframework/ios-arm64-simulator/cactus.framework/Headers/graph.h | Mirrors graph updates for simulator slice. |
| ios/cactus.xcframework/ios-arm64-simulator/cactus.framework/Headers/gemma_tools.h | Mirrors gemma_tools updates for simulator slice. |
| ios/cactus.xcframework/ios-arm64-simulator/cactus.framework/Headers/engine.h | Mirrors engine updates for simulator slice. |
| ios/cactus.xcframework/ios-arm64-simulator/cactus.framework/Headers/cactus_ffi.h | Mirrors FFI updates for simulator slice. |
| example/ios/Podfile.lock | Updates example pod version/checksum to Cactus 1.12.0. |
| cpp/cactus_ffi.h | Updates C++ header to match new/extended FFI surface. |
| cpp/HybridCactus.hpp | Adds new Hybrid methods for prefill, diarize, and embedSpeaker. |
| cpp/HybridCactus.cpp | Implements new Hybrid methods and updates telemetry version string. |
| README.md | Renames VAD section to Audio Processing and documents new APIs/options. |
| .gitattributes | Adds LFS rule for Android static library artifact. |
export { CactusLM } from './classes/CactusLM';
export { CactusSTT } from './classes/CactusSTT';
export { CactusVAD } from './classes/CactusVAD';
export { CactusAudio } from './classes/CactusAudio';
These exports reference ./classes/CactusAudio and ./hooks/useCactusAudio, but the diff shows the implementations were edited in src/classes/CactusVAD.ts and src/hooks/useCactusVAD.ts (no new files/renames shown). As-is, this will fail module resolution at build time. Either rename/move the files to match the new import paths, or update the export paths to the actual file locations (and ideally add backward-compatible re-exports for the old names if this is not a major version bump).
export { useCactusLM } from './hooks/useCactusLM';
export { useCactusSTT } from './hooks/useCactusSTT';
export { useCactusVAD } from './hooks/useCactusVAD';
export { useCactusAudio } from './hooks/useCactusAudio';
These exports reference ./classes/CactusAudio and ./hooks/useCactusAudio, but the diff shows the implementations were edited in src/classes/CactusVAD.ts and src/hooks/useCactusVAD.ts (no new files/renames shown). As-is, this will fail module resolution at build time. Either rename/move the files to match the new import paths, or update the export paths to the actual file locations (and ideally add backward-compatible re-exports for the old names if this is not a major version bump).
const auto &audioDoubles = std::get<std::vector<double>>(audio);
std::vector<uint8_t> audioBytes;
audioBytes.reserve(audioDoubles.size());
for (double d : audioDoubles) {
  d = std::clamp(d, 0.0, 255.0);
  audioBytes.emplace_back(static_cast<uint8_t>(d));
}
The FFI audio path uses pcm_buffer/pcm_buffer_size and (per the new validation code) expects 16-bit PCM byte data with an even byte length. This conversion treats number[] as raw bytes (0..255) and produces an arbitrary byte length (often odd), which will either be rejected or lead to incorrect diarization/embedding. Align the JS-to-native contract: either accept a byte buffer type from JS and pass through verbatim, or convert float samples (e.g., [-1, 1]) to int16 little-endian PCM bytes (ensuring pcm_buffer_size is even and matches the sample count * 2).
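The second option above (float samples to int16 little-endian PCM) could look roughly like the sketch below. It assumes the JS side sends normalized samples in [-1, 1], which the PR does not confirm, and toPcm16Le is a hypothetical helper name, not something defined in this change.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the PR): converts normalized samples,
// assumed to be in [-1, 1], to 16-bit little-endian PCM bytes.
// The output length is always samples.size() * 2, so it is always even.
std::vector<uint8_t> toPcm16Le(const std::vector<double> &samples) {
  std::vector<uint8_t> bytes;
  bytes.reserve(samples.size() * 2);
  for (double s : samples) {
    if (!std::isfinite(s)) {
      s = 0.0; // treat NaN/Inf as silence
    }
    s = std::clamp(s, -1.0, 1.0);
    // Scale to the symmetric int16 range [-32767, 32767] and round.
    const auto value = static_cast<int16_t>(std::lround(s * 32767.0));
    const auto raw = static_cast<uint16_t>(value);
    bytes.push_back(static_cast<uint8_t>(raw & 0xFF)); // low byte first
    bytes.push_back(static_cast<uint8_t>(raw >> 8));   // then high byte
  }
  return bytes;
}
```

With a helper like this, the resulting vector's data() and size() would be the values passed as pcm_buffer and pcm_buffer_size, and the even-length check would hold by construction.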
std::string responseBuffer;
responseBuffer.resize(responseBufferSize);
int result = cactus_prefill(this->_model, messagesJson.c_str(),
                            responseBuffer.data(), responseBufferSize,
responseBufferSize is a double in the Hybrid method signature, but it is used as a byte count in std::string::resize() and passed to the FFI as a size_t. This relies on implicit narrowing/truncation and can misbehave if a non-integer or negative value is ever passed. Cast to size_t once (after validating it is finite and > 0), and use the casted size for both resize() and the FFI call.
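A minimal sketch of the validate-then-cast step suggested above. The helper name toBufferSize and the 64 MiB upper bound are assumptions made for illustration; the PR does not define either, and the right cap depends on the model and platform.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper (not part of the PR): validates a JS number and
// converts it to a byte count exactly once.
// maxBytes is an assumed safety cap, not a value taken from the library.
std::size_t toBufferSize(double requested,
                         std::size_t maxBytes = 64u * 1024u * 1024u) {
  if (!std::isfinite(requested) || requested < 1.0) {
    throw std::invalid_argument(
        "responseBufferSize must be a finite number >= 1");
  }
  if (requested > static_cast<double>(maxBytes)) {
    throw std::invalid_argument(
        "responseBufferSize exceeds the allowed maximum");
  }
  // Safe after the range checks; any fractional part is truncated.
  return static_cast<std::size_t>(requested);
}
```

The returned value would then be stored in a single local variable and used for both responseBuffer.resize() and the size argument of the FFI call, so the two can never disagree.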