Conversation
There was a problem hiding this comment.
Pull request overview
This PR replaces TurboMind’s legacy printf-style logger with a new turbomind::core::Logger that uses {}-style formatting via fmtlib and supports async logging, then migrates call sites and build targets accordingly.
Changes:
- Introduce a new fmt-backed async logger in
src/turbomind/core(with Catch2 coverage). - Migrate
TM_LOG_*call sites from printf-style (%d/%s/...) to fmt-style ({}) across core/engine/models/kernels/comm/tests. - Remove the old
src/turbomind/utils/logger.*library and update CMake targets/dependencies (FetchContent fmt + concurrentqueue).
Reviewed changes
Copilot reviewed 61 out of 61 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| CMakeLists.txt | Fetch fmtlib via FetchContent; add nvcc diag suppression for fmt headers. |
| tests/csrc/unittests/CMakeLists.txt | Removes legacy logger linkage from unittest target link sets. |
| tests/csrc/unittests/gtest_utils.h | Migrates test logging to {}-style TM_LOG_*. |
| tests/csrc/unittests/test_logprob_kernels.cu | Switches include to new core/logger.h. |
| tests/csrc/unittests/test_sampling_layer.cu | Converts debug logging to fmt-style formatting. |
| tests/csrc/unittests/unittest_utils.h | Converts some test logging to fmt-style formatting. |
| src/turbomind/utils/logger.h | Deletes legacy logger header. |
| src/turbomind/utils/logger.cc | Deletes legacy logger implementation. |
| src/turbomind/utils/CMakeLists.txt | Drops logger lib; updates cuda_utils to link fmt; removes logger deps from utils libs. |
| src/turbomind/utils/cuda_utils.h | Uses new logger; converts CUDA error logging to fmt-style strings. |
| src/turbomind/utils/cuda_utils.cc | Uses new logger macros for sync-and-check logging. |
| src/turbomind/utils/anomaly_handler.cu | Migrates warnings to TM_LOG_WARN with fmt-style formatting; switches include to core logger. |
| src/turbomind/turbomind.cc | Converts various warnings/errors/info logs to fmt-style formatting. |
| src/turbomind/python/bind.cpp | Converts memcpy failure log to fmt-style formatting. |
| src/turbomind/models/CMakeLists.txt | Removes logger from models link list (models already links core). |
| src/turbomind/models/llama/CMakeLists.txt | Removes logger from Llama target link list. |
| src/turbomind/models/llama/unified_attention_layer.cc | Switches include to core logger; converts logs to fmt-style. |
| src/turbomind/models/llama/SequenceManager.h | Adds <map> include (compile fix for new usage). |
| src/turbomind/models/llama/SequenceManager.cc | Switches include to core logger; converts many logs to fmt-style. |
| src/turbomind/models/llama/LlamaWeight.cc | Converts padding warnings to fmt-style. |
| src/turbomind/models/llama/LlamaLinear.cu | Converts error log to fmt-style. |
| src/turbomind/models/llama/LlamaDecoderLayerWeight.cc | Switches include to core logger; converts env log to fmt-style. |
| src/turbomind/models/llama/llama_kernels.cu | Adds string_utils.h include (likely for helpers used elsewhere in file). |
| src/turbomind/models/llama/GatedDeltaNetLayer.cc | Switches include to core logger; converts info log to fmt-style. |
| src/turbomind/models/llama/BlockTrie.cc | Converts warnings/info to fmt-style. |
| src/turbomind/models/llama/BlockManager.h | Switches include to core logger. |
| src/turbomind/models/llama/BlockManager.cc | Switches include to core logger; converts logs to fmt-style. |
| src/turbomind/models/llama/Barrier.h | Switches include to core logger; converts log to fmt-style. |
| src/turbomind/kernels/sampling_topk_kernels.h | Switches include to core logger. |
| src/turbomind/kernels/sampling_topk_kernels.cu | Adds string_utils.h include. |
| src/turbomind/kernels/sampling_penalty_kernels.cu | Converts debug log to fmt-style. |
| src/turbomind/kernels/logprob_kernels.cu | Switches include to core logger; converts debug log to fmt-style. |
| src/turbomind/kernels/decoding_kernels.cu | Adds string_utils.h include. |
| src/turbomind/kernels/attention/CMakeLists.txt | Replaces logger linkage with core/cuda_utils/fmt::fmt for tests. |
| src/turbomind/generation/utils.h | Converts warning log to fmt-style. |
| src/turbomind/generation/sampling.cc | Switches include to core logger; converts debug logs to fmt-style. |
| src/turbomind/generation/logits_processor.cc | Converts debug/warn logs to fmt-style; minor cast cleanup. |
| src/turbomind/engine/request.cc | Converts callback error logs to fmt-style. |
| src/turbomind/engine/gateway.h | Switches include to core logger; converts warnings/info to fmt-style. |
| src/turbomind/engine/gateway.cc | Converts gateway errors to fmt-style and removes legacy prefixes. |
| src/turbomind/engine/engine.cc | Switches include to core logger; converts various logs to fmt-style. |
| src/turbomind/core/CMakeLists.txt | Fetches concurrentqueue; adds logger.cc to core; updates core link deps; adds test_logger. |
| src/turbomind/core/logger.h | New Logger API and TM_LOG_* macros using fmt format strings. |
| src/turbomind/core/logger.cc | Implements async worker + formatting/prefixing + env var parsing + optional signal handling. |
| src/turbomind/core/test_logger.cc | Adds Catch2 tests for logger formatting, filtering, env vars, sync/async behavior, fatal/signal cases (POSIX-guarded). |
| src/turbomind/core/tensor.cc | Updates warning logging to fmt-style. |
| src/turbomind/core/stream.h | Updates CUDA destroy error logs to fmt-style. |
| src/turbomind/core/check.h | Extends CheckErrorStream to store source location. |
| src/turbomind/core/check.cc | Switches check failures to go through new logger fatal path. |
| src/turbomind/core/allocator.h | Updates commented-out debug logging to fmt-style placeholders. |
| src/turbomind/comm/CMakeLists.txt | Removes legacy logger linkage from comm libs. |
| src/turbomind/comm/host_comm.h | Switches include to core logger. |
| src/turbomind/comm/env.h | Switches include to core logger; converts env logging to fmt-style. |
| src/turbomind/comm/test_comm.cu | Converts info logs to fmt-style; updates commented debug log format. |
| src/turbomind/comm/nccl/CMakeLists.txt | Removes logger from nccl_comm linkage. |
| src/turbomind/comm/nccl/nccl.cu | Switches include to core logger; converts warnings/errors and NCCL check message formatting. |
| src/turbomind/comm/gloo/CMakeLists.txt | Removes logger from gloo_comm linkage. |
| src/turbomind/comm/gloo/tcp_store.cc | Switches include to core logger; converts warning/error logs to fmt-style. |
| src/turbomind/comm/gloo/gloo_comm.cc | Switches include to core logger; converts info log to fmt-style. |
| src/turbomind/comm/cuda_ipc/CMakeLists.txt | Removes logger from cuda_ipc_comm linkage. |
| src/turbomind/comm/cuda_ipc/cuda_ipc_comm.cu | Switches include to core logger; converts warnings to fmt-style. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| explicit TestFailureError(std::string name, std::string msg = "") | ||
| { | ||
| msg_ = fmtstr("TEST FAIL [%s] %s", name.c_str(), msg.c_str()); | ||
| msg_ = fmtstr("TEST FAIL [%s] %s", name, msg.c_str()); | ||
| } |
There was a problem hiding this comment.
fmtstr uses snprintf/printf-style formatting. Passing name (a std::string) to a %s specifier is undefined behavior and can corrupt the error message or crash. Use name.c_str() (and msg.c_str()), or switch this to fmt::format/TM_LOG_*-style formatting instead of fmtstr.
| param.kind = cudaMemcpyDefault; | ||
|
|
||
| if (auto ec = cudaMemcpy3DAsync(¶m, stream.handle()); ec == cudaSuccess) { | ||
| TM_LOG_WARNING(cudaGetErrorString(ec)); | ||
| TM_LOG_WARN("{}", cudaGetErrorString(ec)); | ||
| return; | ||
| } |
There was a problem hiding this comment.
The cudaMemcpy3DAsync return-code check is inverted: it logs a warning and returns when ec == cudaSuccess, which will emit a spurious "no error" message and short-circuit the normal fallback path. This should return silently on success, and only log/fallback when ec != cudaSuccess.
src/turbomind/core/logger.cc
Outdated
|
|
||
| static void OnFatalSignal(int signum) | ||
| { | ||
| AsyncLogWorker::Instance().Stop(); |
There was a problem hiding this comment.
OnFatalSignal calls AsyncLogWorker::Stop(), which enqueues to a concurrent queue and join()s a std::thread. Those operations are not async-signal-safe and can deadlock/crash if the signal interrupts code holding internal locks/allocators. Consider making the signal handler minimal (e.g., disable async + best-effort write(2, ...)), or gate this draining behavior behind an opt-in env var with a safe default.
| AsyncLogWorker::Instance().Stop(); | |
| #ifndef _WIN32 | |
| // Best-effort, async-signal-safe diagnostic. | |
| static const char msg[] = "turbomind: caught fatal signal, aborting\n"; | |
| ::write(STDERR_FILENO, msg, sizeof(msg) - 1); | |
| #endif |
| TM_LOG_FATAL("CUDA runtime error: {} {}:{}", _cudaGetErrorEnum(result), file, line); | ||
| std::abort(); | ||
| } |
There was a problem hiding this comment.
TM_LOG_FATAL is [[noreturn]] (it calls Logger::LogFatal which aborts), so the subsequent std::abort() is unreachable/redundant. Dropping the extra abort avoids dead code and potential compiler warnings.
| if (failures > tol_failures) { | ||
| TM_LOG_ERROR("%s (failures: %.2f%% atol: %.2e rtol: %.2e rel_gap: %.2e%%)", | ||
| name.c_str(), 100. * failures / size, atol, rtol, 100. * relative_gap); | ||
| TM_LOG_ERROR({} (failures: {:.2f}% atol: {:.2e} rtol: {:.2e} rel_gap: {:.2e}%)", | ||
| name, (double)(100. * failures / size), atol, rtol, 100. * relative_gap); |
There was a problem hiding this comment.
This TM_LOG_ERROR call is missing the opening quote for the format string (TM_LOG_ERROR({} ...), which makes the file fail to compile. Also, this file still includes the removed legacy header (src/turbomind/utils/logger.h), so even after fixing the quote it should include src/turbomind/core/logger.h (or another header that provides the new {}-based TM_LOG_* macros).
| target_link_libraries( # Libs for test_attention_kernels | ||
| unittest PUBLIC | ||
| CUDA::cudart CUDA::curand | ||
| gpt_kernels gtest memory_utils tensor unfused_attention_kernels cuda_utils logger) | ||
| gpt_kernels gtest memory_utils tensor unfused_attention_kernels cuda_utils) | ||
| target_link_libraries( # Libs for test_logprob_kernels | ||
| unittest PUBLIC | ||
| CUDA::cudart | ||
| logprob_kernels memory_utils cuda_utils logger) | ||
| logprob_kernels memory_utils cuda_utils) |
There was a problem hiding this comment.
logger was removed from the unittest link lineups, but the new TM_LOG_* macros now live in core/logger.h and require core (for turbomind::core::Logger implementation in logger.cc) to be linked into the unittest executable. As-is, unit tests that use TM_LOG_* are likely to hit undefined references at link time; add core (or a higher-level target that already brings it in) to the relevant target_link_libraries(unittest ...) lines.
Add an async logger backed by fmtlib