Skip to content

fix(macOS): updates whisper from v1.8.4 to v.1.8.5 for metal#2309

Merged
fl0rianr merged 3 commits into
mainfrom
fl0rianr/whispher_ci_fix
Jun 20, 2026
Merged

fix(macOS): updates whisper from v1.8.4 to v.1.8.5 for metal#2309
fl0rianr merged 3 commits into
mainfrom
fl0rianr/whispher_ci_fix

Conversation

@fl0rianr

@fl0rianr fl0rianr commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator

updates whisper from v1.8.4 to v.1.8.5 for metal only (since only this one is available right now) to try a ci error fix.

@github-actions github-actions Bot added engine::whispercpp whisper.cpp backend; audio transcription runtime::metal Apple Metal runtime (macOS) labels Jun 18, 2026
@jeremyfowers

Copy link
Copy Markdown
Member

@claude review

@jeremyfowers jeremyfowers requested a review from Geramy June 18, 2026 19:06
@jeremyfowers

Copy link
Copy Markdown
Member

@Geramy this seems reminiscent of the problems we had in llamacpp for macOS. Can you review please?

@github-actions

github-actions Bot commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Claude finished @jeremyfowers's task in 1m 22s —— View job


Review: fix(macOS): launch whisper Metal server with dyld env

I read the full diff against origin/main and cross-checked the helper methods and the llama.cpp launcher this PR mirrors. Overall this is a clean, well-scoped fix and the core diagnosis is correct: on macOS the loader honors DYLD_LIBRARY_PATH, not LD_LIBRARY_PATH, so the previous code never made the bundled dylib dir visible to the spawned whisper-server. The refactor and the new diagnostics are solid.

  • Read the PR diff
  • Cross-check helper methods (consume_process_handle_for_cleanup, stop_backend_watchdog, reap_process)
  • Compare against llama.cpp launcher for parity claims
  • Post review

What looks good

  • Correct platform split. #ifdef __APPLE__DYLD_LIBRARY_PATH, #elif !defined(_WIN32)LD_LIBRARY_PATH. This is the actual bug fix and it's right.
  • prepend_to_env_path helper (lines 16–24) cleanly de-duplicates the "prepend dir, preserve existing value" logic and handles the empty/unset cases correctly.
  • fs::absolute(exe_dir) (line 319) makes the loader path robust to a relative exe_path, an improvement over the old exe_dir.string().
  • HOME / GGML_METAL_NO_RESIDENCY forwarding matches the llama.cpp launcher (llamacpp_server.cpp:509–539), so the parity claim holds.
  • Failure-path diagnostics (lines 379–395) correctly distinguish "exited before ready" (reap + exit code) from "alive but failed readiness" (stop). Using consume_process_handle_for_cleanup() clears the member handle so a later unload() won't double-stop — good.

Suggestions / things to consider

  1. filter_health_logs flipped from falsetrue for all platforms (line 367). Previously the call passed false; now it's true unconditionally. On Linux/Windows in is_debug() mode this means whisper-server health-check log lines will now be filtered out where they weren't before. That's almost certainly desirable, but it's a behavior change beyond macOS, which slightly undercuts the "does not change workflow behavior" framing. Worth a one-line note in the PR description.

  2. inherit_output logic diverges from llama.cpp. llama.cpp uses (log_level_ == "info") || is_debug() on every platform (llamacpp_server.cpp:552), whereas whisper here keeps is_debug() and only adds the metal && info case for macOS (lines 355–360). The result is that on Linux at info level, llama-server output is visible but whisper-server output is not. Not a bug — and the narrow scoping is defensible — but if the goal is parity with the llama launcher, matching that one expression would be simpler and more consistent.

  3. GGML_METAL_NO_RESIDENCY empty-string check differs from llama.cpp. Here you guard with no_residency[0] != '\0' (line 333); llama.cpp forwards on bare if (no_residency) (line 514), i.e. even an empty value. Your version is arguably the better behavior, but the two now differ — minor inconsistency if someone reads both.

  4. macOS caveat worth verifying: DYLD_* environment variables are stripped by dyld for processes with hardened runtime / restricted entitlements (and not inherited across such spawns). If the shipped whisper-server is ever signed with hardened runtime, this DYLD_LIBRARY_PATH won't take effect and you'd be back to relying on @rpath/@loader_path. Notably the llama.cpp launcher sets no DYLD_LIBRARY_PATH at all and presumably resolves its colocated dylibs via rpath — so if whisper's dylibs are similarly colocated, it's worth confirming whether the real CI failure is the missing search path vs. something dyld would strip anyway. The change is harmless either way; just flagging so the CI result is interpreted correctly. If this lands and CI is still red, that's the first thing I'd check.

None of these are blocking. The fix is correct and the added logging will make any remaining CI failure far easier to diagnose.
· fl0rianr/whispher_ci_fix

@Geramy

Geramy commented Jun 18, 2026

Copy link
Copy Markdown
Member

@Geramy this seems reminiscent of the problems we had in llamacpp for macOS. Can you review please?

yes im a little confused on why we would set GGML_METAL_NO_RESIDENCY=1 this is a workaround feature to stop memory accumulation from happening but it also makes the memory not become "wired" to the process specifically and the model could get ejected or "free'd" its a work around, second, why do we need DYLD_LIBRARY_PATH set? if it is compiled for metal correctly we don't need that. This was an issue on sd-server as well but we later fixed it and the libraries became linked properly and it was no longer an issue. I could be wrong but im pretty sure that was a issue with sd that got fixed which could be the same issue here.

@Geramy

Geramy commented Jun 18, 2026

Copy link
Copy Markdown
Member

The model is loading fine here this must be a CI specific issue, I would say its safe to make whatever changes needed only to effect and work for fixing CI. Github uses fake macOS virtual machines and virtual m3's which sometimes can have problems, I am not sure if GGML_METAL_NO_RESIDENCY being 1 or 0 can solve this specifically but adjusting DYLD_LIBRARY_PATH should not be needed ever we should not touch that. if compiled right it launches I don't want to redirect dynamic library paths, its a no no for Mac really, unless you have to do it to accomplish a specific user case. otherwise its what you would call a workaround just like setting GGML_METAL_NO_RESIDENCY Is a workaround for stopping memory leaks.

2026-06-18 14:32:24.891 [Debug] (ProcessManager) Starting process with inherited output: /Library/Application Support/lemonade/.cache/bin/whispercpp/metal/whisper-server -m /Library/Application Support/lemonade/hub/models--ggerganov--whisper.cpp/snapshots/5359861c739e955e79d9a303bcbc70fb988958b1/ggml-large-v3-turbo.bin --port 8001
2026-06-18 14:32:24.896 [Info] (ProcessManager) Process started successfully, PID: 33153
2026-06-18 14:32:24.897 [Info] (WhisperServer) Process started with PID: 33153
2026-06-18 14:32:24.897 [Debug] (WrappedServer) Waiting for whisper-server to be ready...
2026-06-18 14:32:28.178 [Debug] (Server) GET /api/v1/models
2026-06-18 14:32:30.841 [Info] (WrappedServer) whisper-server is ready!
2026-06-18 14:32:30.841 [Info] (BackendWatchdog) Started watchdog for whisper-server using http://127.0.0.1:8001/health (streaming=on, non_streaming=on)
2026-06-18 14:32:30.841 [Info] (WhisperServer) Server is ready!
2026-06-18 14:32:30.841 [Debug] (Router) Backend started successfully in 5950ms
2026-06-18 14:32:30.841 [Info] (Router) Model loaded successfully. Total loaded: 1
2026-06-18 14:32:30.841 [Debug] (Server) POST /api/v1/load - 200
2026-06-18 14:32:30.860 [Debug] (Server) GET /api/v1/models
2026-06-18 14:32:33.258 [Debug] (Server) GET /api/v1/models
2026-06-18 14:32:33.266 [Debug] (Server) GET /api/v1/models
2026-06-18 14:32:33.272 [Debug] (Server) GET /internal/config
2026-06-18 14:32:33.274 [Debug] (Server) GET /internal/config - 200
2026-06-18 14:32:38.353 [Debug] (Server) GET /api/v1/models
2026-06-18 14:32:43.435 [Debug] (Server) GET /api/v1/models
2026-06-18 14:32:48.526 [Debug] (Server) GET /api/v1/models

@fl0rianr fl0rianr force-pushed the fl0rianr/whispher_ci_fix branch from 0bbf41e to 3f875b9 Compare June 18, 2026 21:52
@fl0rianr

fl0rianr commented Jun 18, 2026

Copy link
Copy Markdown
Collaborator Author

Ok, I reworked it to be 100% CI only again and an isolated runner with better logs should improve the situation more.

@fl0rianr fl0rianr changed the title fix(macOS): launch whisper Metal server with dyld env fix ci(macOS): isolate whisper Metal test on fresh runner Jun 18, 2026
@fl0rianr fl0rianr force-pushed the fl0rianr/whispher_ci_fix branch from c8326c1 to 033070c Compare June 19, 2026 21:34
@fl0rianr fl0rianr changed the title fix ci(macOS): isolate whisper Metal test on fresh runner fix(macOS): harder whisper server with DYLD_LIBRARY_PATH fallback Jun 19, 2026
@fl0rianr fl0rianr enabled auto-merge June 20, 2026 00:55

@Geramy Geramy left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The right thing to do.....

@fl0rianr fl0rianr added this pull request to the merge queue Jun 20, 2026
@fl0rianr fl0rianr changed the title fix(macOS): harder whisper server with DYLD_LIBRARY_PATH fallback fix(macOS): updates whisper from v1.8.4 to v.1.8.5 for metal Jun 20, 2026
Merged via the queue into main with commit aab5528 Jun 20, 2026
78 checks passed
@fl0rianr fl0rianr deleted the fl0rianr/whispher_ci_fix branch June 20, 2026 02:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

engine::whispercpp whisper.cpp backend; audio transcription runtime::metal Apple Metal runtime (macOS)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants