14 commits
f1387bc
fix: ensure content+role fields always present in streaming deltas fo…
bong-water-water-bong Jun 27, 2026
6849d92
fix: download TheRock ROCm backend for all detected GPU architectures
bong-water-water-bong Jun 27, 2026
184d3b7
fix: SSE heartbeat for long prefill + download resilience (#1364, #1546)
bong-water-water-bong Jun 27, 2026
cba818a
ci: add PR-Agent (DeepSeek) + Qodo dual review workflows
bong-water-water-bong Jun 27, 2026
31a64eb
fix: pre-load OOM memory warning in router (#1804)
bong-water-water-bong Jun 27, 2026
132b094
fix: SSE heartbeat, download resilience, OOM guard, PR-Agent
bong-water-water-bong Jun 27, 2026
74918a2
fix: model-level resume for interrupted HF downloads (#1546a)
bong-water-water-bong Jun 27, 2026
11991bc
ci: remove claude-review.yml (no Anthropic API key available)
bong-water-water-bong Jun 27, 2026
5aa839c
Revert "ci: remove claude-review.yml (no Anthropic API key available)"
bong-water-water-bong Jun 27, 2026
b96224c
fix: prevent deadlock in get_system_info_with_cache during recipe det…
bong-water-water-bong Jun 27, 2026
7048b17
fix: skip optional draft checkpoint in are_required_checkpoints_compl…
bong-water-water-bong Jun 27, 2026
406e82d
feat: add setup-repo-secrets script and document required PR-Agent se…
bong-water-water-bong Jun 27, 2026
d33b12e
fix: skip redundant TheRock download when already installed (#2413)
bong-water-water-bong Jun 27, 2026
f4b2dc4
fix: show warning when pulling model in offline mode (#2412)
bong-water-water-bong Jun 27, 2026
106 changes: 106 additions & 0 deletions .github/scripts/setup-repo-secrets.sh
@@ -0,0 +1,106 @@
#!/usr/bin/env bash
# ---------------------------------------------------------------------------
# Setup GitHub repo secrets for PR-Agent (DeepSeek) and Qodo Merge workflows.
#
# Reads API keys from local files in ~/Documents/ and pushes them as
# encrypted GitHub Actions secrets via `gh secret set`.
#
# Usage:
#   bash .github/scripts/setup-repo-secrets.sh                     # current repo
#   bash .github/scripts/setup-repo-secrets.sh --repo owner/repo
#   bash .github/scripts/setup-repo-secrets.sh --dry-run           # preview only
#
# Key sources (edit paths below to match your setup):
#   ~/Documents/deepseek api key.txt → DEEPSEEK_API_KEY
#   ~/Documents/qodo api key.txt → QODO_API_KEY
#
# Prerequisites:
#   - gh CLI installed and authenticated (`gh auth status`)
#   - Write / admin access to the target repo
# ---------------------------------------------------------------------------
set -euo pipefail

REPO=""
DRY_RUN=false

while [[ $# -gt 0 ]]; do
  case "$1" in
    --repo) REPO="$2"; shift 2 ;;
    --dry-run) DRY_RUN=true; shift ;;
    *) echo "Unknown: $1"; exit 1 ;;
  esac
done

# --- Config: key file paths -----------------------------------------------
DEEPSEEK_KEY_FILE="$HOME/Documents/deepseek api key.txt"
QODO_KEY_FILE="$HOME/Documents/qodo api key.txt"

# --- Checks ----------------------------------------------------------------
if ! command -v gh &>/dev/null; then
  echo "❌ gh CLI not found — install it from https://cli.github.com/"
  exit 1
fi

if ! gh auth status &>/dev/null; then
  echo "❌ gh CLI not authenticated — run 'gh auth login' first."
  exit 1
fi

GH_ARGS=()
[[ -n "$REPO" ]] && GH_ARGS+=(--repo "$REPO")

echo "🔍 Target: ${REPO:-$(gh repo view --json nameWithOwner -q .nameWithOwner 2>/dev/null || echo 'current repo')}"

# --- Read keys -------------------------------------------------------------
read_key() {
  local path="$1" label="$2"
  if [[ ! -f "$path" ]]; then
    echo "⚠️ $label key file not found at: $path"
    return 1
Comment on lines +58 to +59

P2: Send missing-key warnings to stderr

When a key file is missing or empty, read_key prints the warning to stdout and returns nonzero, but the caller captures stdout with DEEPSEEK_KEY="$(read_key ...)" || true. That makes the warning text a non-empty secret value, so set_secret uploads it instead of skipping the unset key. Emit diagnostics on stderr or explicitly clear the captured value on failure.


  fi
  local val
  val="$(tr -d '[:space:]' < "$path")"
  if [[ -z "$val" ]]; then
    echo "⚠️ $label key file is empty: $path"
    return 1
  fi
  echo "$val"
}

echo ""
echo "📂 Reading keys from ~/Documents/ …"

DEEPSEEK_KEY="$(read_key "$DEEPSEEK_KEY_FILE" "DeepSeek")" || true
QODO_KEY="$(read_key "$QODO_KEY_FILE" "Qodo")" || true

# --- Set secrets -----------------------------------------------------------
set_secret() {
  local name="$1" value="$2"
  if [[ -z "$value" ]]; then
    echo " ⏭️ Skipping $name (no value)"
    return
  fi
  if $DRY_RUN; then
    echo " 🏁 [DRY-RUN] gh secret set $name ${GH_ARGS[*]}"
  else
    echo " 🔐 Setting $name …"
    echo -n "$value" | gh secret set "$name" "${GH_ARGS[@]}"
    echo " ✅ $name set"
  fi
}

echo ""
$DRY_RUN && echo "🏁 DRY RUN — no secrets will be written" || echo "🚀 Setting secrets …"
echo ""

set_secret "DEEPSEEK_API_KEY" "$DEEPSEEK_KEY"
set_secret "QODO_API_KEY" "$QODO_KEY"

# --- Summary ---------------------------------------------------------------
echo ""
if $DRY_RUN; then
  echo "🏁 Dry run complete. Run without --dry-run to apply."
else
  echo "✅ Done! Secrets are now available to GitHub Actions workflows."
  echo " Verify: gh secret list ${GH_ARGS[*]}"
fi
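
The review comment on read_key above suggests emitting diagnostics on stderr so that command substitution captures only the key. A minimal sketch of that change follows, assuming the rest of the script stays as submitted. The demo file names and the `sk-example-123` value are made up for illustration, and the emoji prefixes are swapped for a plain `warning:` label.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Variant of read_key that writes warnings to stderr, so "$(read_key ...)"
# captures the key on success and an empty string on failure.
read_key() {
  local path="$1" label="$2"
  if [[ ! -f "$path" ]]; then
    echo "warning: $label key file not found at: $path" >&2
    return 1
  fi
  local val
  val="$(tr -d '[:space:]' < "$path")"
  if [[ -z "$val" ]]; then
    echo "warning: $label key file is empty: $path" >&2
    return 1
  fi
  printf '%s\n' "$val"
}

# Demo with throwaway files (hypothetical paths, not the script's real ones).
demo_dir="$(mktemp -d)"
printf '  sk-example-123 \n' > "$demo_dir/present.txt"

# Explicitly clear the variable on failure instead of relying on "|| true".
PRESENT_KEY="$(read_key "$demo_dir/present.txt" "Present")" || PRESENT_KEY=""
MISSING_KEY="$(read_key "$demo_dir/missing.txt" "Missing")" || MISSING_KEY=""

rm -rf "$demo_dir"
```

With this shape, set_secret would see an empty value for the missing file and take its existing "Skipping" branch rather than uploading warning text.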
37 changes: 37 additions & 0 deletions .github/workflows/pr-agent-review.yml
@@ -0,0 +1,37 @@
# ---------------------------------------------------------------------------
# PR-Agent review powered by DeepSeek.
#
# Required repo secret: DEEPSEEK_API_KEY
# - Setup: bash .github/scripts/setup-repo-secrets.sh
# - Or set manually at: https://github.com/<owner>/<repo>/settings/secrets/actions
# ---------------------------------------------------------------------------
name: pr-agent-review
on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
  issue_comment:
    types: [created]
permissions:
  contents: read
  pull-requests: write
  issues: write
jobs:
  pr_agent_job:
    name: PR-Agent (DeepSeek)
    runs-on: ubuntu-latest
    if: ${{ github.event.sender.type != 'Bot' && secrets.DEEPSEEK_API_KEY != '' }}

P2: Move secret checks out of job conditions

GitHub's Actions docs state that "Secrets cannot be directly referenced in if: conditionals" (https://docs.github.com/en/actions/how-tos/write-workflows/choose-what-workflows-do/use-secrets), but this job-level condition uses secrets.DEEPSEEK_API_KEY; the same pattern is present in qodo-merge.yml. These newly added review jobs will not be evaluated as intended before any steps run. Use a preliminary check step/job and condition on a non-secret output instead.


    steps:
      - name: PR Agent review
        uses: the-pr-agent/pr-agent@main
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_API_KEY }}
          config.model: "deepseek/deepseek-chat"
          config.fallback_models: '["deepseek/deepseek-chat"]'
          github_action_config.auto_review: "true"
          github_action_config.auto_describe: "true"
          github_action_config.auto_improve: "true"
          pr_description.publish_labels: "true"
          pr_description.publish_description_as: "suggestion"
          pr_reviewer.require_score_review: "false"
          pr_reviewer.num_code_suggestions: "4"
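
The review comment above recommends a preliminary check step that exposes a non-secret output for later conditions. A minimal sketch of what the shell body of such a step might look like is below. It assumes the secret is mapped into the step's environment as `DEEPSEEK_API_KEY`; the function name `write_has_key` and the output name `has_key` are hypothetical. `GITHUB_OUTPUT` is the file path the Actions runner provides for step outputs, and the temp-file fallback is only there so the sketch runs locally.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: record whether a secret-backed value is non-empty.
# Only the literal strings "true"/"false" are written, never the secret.
write_has_key() {
  local value="$1" output_file="$2"
  if [[ -n "$value" ]]; then
    echo "has_key=true" >> "$output_file"
  else
    echo "has_key=false" >> "$output_file"
  fi
}

# On a runner GITHUB_OUTPUT is already set; fall back to a temp file locally.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"
write_has_key "${DEEPSEEK_API_KEY:-}" "$GITHUB_OUTPUT"
```

A later job could then list the checking job under `needs:` and use a condition such as `needs.check.outputs.has_key == 'true'`, where `check` is a hypothetical job id. The checking job would also need to expose the step output in its own `outputs:` map.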
26 changes: 26 additions & 0 deletions .github/workflows/qodo-merge.yml
@@ -0,0 +1,26 @@
# ---------------------------------------------------------------------------
# Qodo Merge — automated PR reviews powered by Qodo.
#
# Required repo secret: QODO_API_KEY
# - Setup: bash .github/scripts/setup-repo-secrets.sh
# - Or set manually at: https://github.com/<owner>/<repo>/settings/secrets/actions
# ---------------------------------------------------------------------------
name: qodo-merge
on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
jobs:
  qodo-merge:
    if: ${{ secrets.QODO_API_KEY != '' }}
    runs-on: ubuntu-latest
    permissions:
      issues: write
      pull-requests: write
      contents: read
    steps:
      - name: Qodo Merge Review
        uses: qodo-ai/qodo-merge@main
        env:
          QODO_API_KEY: ${{ secrets.QODO_API_KEY }}
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
22 changes: 22 additions & 0 deletions CMakeLists.txt
@@ -1834,3 +1834,25 @@ if(EXISTS "${_AUTO_TUNE_TEST_SRC}")
     include(CTest)
     add_test(NAME AutoTuneTest COMMAND test_auto_tune)
 endif()
+
+# Streaming proxy reasoning content normalization: ensures `content` field is
+# always present alongside `reasoning_content` in SSE delta chunks.
+set(_STREAMING_REASONING_TEST_SRC
+    "${CMAKE_CURRENT_SOURCE_DIR}/test/cpp/test_streaming_proxy_reasoning_content.cpp"
+)
+if(EXISTS "${_STREAMING_REASONING_TEST_SRC}")
+    add_executable(test_streaming_proxy_reasoning_content
+        test/cpp/test_streaming_proxy_reasoning_content.cpp
+    )
+    target_include_directories(test_streaming_proxy_reasoning_content PRIVATE
+        ${CMAKE_CURRENT_SOURCE_DIR}/src/cpp/include
+        ${CMAKE_CURRENT_BINARY_DIR}/include
+    )
+    target_link_libraries(test_streaming_proxy_reasoning_content PRIVATE
+        lemonade-server-core
+        nlohmann_json::nlohmann_json
+    )
+
+    include(CTest)
+    add_test(NAME StreamingProxyReasoningContentTest COMMAND test_streaming_proxy_reasoning_content)
+endif()
16 changes: 15 additions & 1 deletion src/cpp/cli/lemonade_client.cpp
@@ -696,6 +696,15 @@ int LemonadeClient::pull_model(const json& model_data, const std::string& displa
             if (event_type == "complete") {
                 std::cout << std::endl;
                 state.success = true;
+                // Check for warnings (e.g. offline mode skipped the download)
+                try {
+                    auto complete_json = json::parse(event_data);
+                    if (complete_json.contains("warning") && complete_json["warning"].is_string()) {
+                        state.warning_message = complete_json["warning"].get<std::string>();
+                    }
+                } catch (...) {
+                    // Ignore parse errors on the completion event
+                }
             } else if (event_type == "error") {
                 try {
                     auto error_json = json::parse(event_data);
@@ -730,7 +739,12 @@ int LemonadeClient::pull_model(const json& model_data, const std::string& displa
             throw std::runtime_error("Model pull failed");
         }

-        std::cout << "Model pulled successfully: " << output_name << std::endl;
+        if (!state.warning_message.empty()) {
+            std::cout << "Model pulled with warning: " << output_name << std::endl;
+            std::cout << " Warning: " << state.warning_message << std::endl;
+        } else {
+            std::cout << "Model pulled successfully: " << output_name << std::endl;
+        }
         return 0;
     } catch (const HttpError& e) {
         std::cerr << "Error pulling model: " << extract_server_error_message(e) << std::endl;
4 changes: 3 additions & 1 deletion src/cpp/include/lemon/model_manager.h
@@ -296,8 +296,10 @@ class ModelManager {
     // Download from a JSON manifest
     void download_from_manifest(const json& manifest, std::map<std::string, std::string>& headers, DownloadProgressCallback progress_callback);

-    // Download from Hugging Face
+    // Download from Hugging Face. When do_not_upgrade is true and a .completed
+    // sentinel exists in the snapshot directory, the HF API call is skipped.
     void download_from_huggingface(const ModelInfo& info,
+                                   bool do_not_upgrade = false,
                                    DownloadProgressCallback progress_callback = nullptr);

     // Download from FLM
8 changes: 7 additions & 1 deletion src/cpp/include/lemon/streaming_proxy.h
@@ -51,8 +51,14 @@ class StreamingProxy {
         std::function<void()> on_chunk = nullptr
     );

+    // Normalize streaming chat.completion.chunk SSE deltas for OpenAI API
+    // compatibility. Applies:
+    //   1. Injects `role: "assistant"` when null/missing on assistant deltas
+    //   2. Injects `content: ""` alongside `reasoning_content` when absent
+    static std::string normalize_chat_completion_chunk(const std::string& sse_chunk);
+
 private:
     static TelemetryData parse_telemetry(const std::string& buffer);
 };

-} // namespace lemon
+} // namespace lemon
7 changes: 7 additions & 0 deletions src/cpp/include/lemon/system_info.h
@@ -108,7 +108,14 @@ class SystemInfo {
     static std::string get_system_llamacpp_version();

     // Device support detection
+    // Return the first (primary) ROCm architecture — typically the iGPU.
+    // Used for rocm_channel selection and single-arch contexts.
     static std::string get_rocm_arch();
+
+    // Return ALL detected AMD GPU ROCm architectures, ordered iGPU first
+    // then dGPUs. Used by backend download paths to install ROCm binaries
+    // for every GPU on the system, not just the first one detected.
+    static std::vector<std::string> get_rocm_arches();
     static std::string get_cuda_arch();

     // CUDA release assets are architecture-specific (sm_89, sm_120, etc.).
1 change: 1 addition & 0 deletions src/cpp/include/lemon_cli/lemonade_client.h
@@ -37,6 +37,7 @@ struct StreamingRequestState {
     bool success = false;
     std::string error_message;
     std::string error_code;
+    std::string warning_message;
     bool total_size_printed = false;
     uint64_t last_file_size = 0;
     std::chrono::steady_clock::time_point file_start_time;
23 changes: 20 additions & 3 deletions src/cpp/server/backend_manager.cpp
@@ -215,11 +215,28 @@ void install_therock_if_needed(const std::string& os, const json& backend_versio
         return;
     }

-    std::string rocm_arch = SystemInfo::get_rocm_arch();
+    // Already installed — skip to avoid redundant 3 GB download
+    if (is_therock_installed_for_current_arch(backend_versions)) {
+        LOG(DEBUG, "BackendManager") << "TheRock already installed, skipping" << std::endl;
+        return;
Comment on lines +219 to +221

P2: Check all TheRock architectures before skipping

On a multi-AMD-GPU machine, this early return only checks the primary architecture from get_rocm_arch(), but the new code below is intended to install TheRock for every architecture returned by get_rocm_arches(). If the primary/iGPU runtime is already installed and a secondary dGPU architecture is missing, installation exits here and the dGPU still lacks its TheRock libraries. Please skip only when every detected architecture has the expected version.


+    }
+
+    std::vector<std::string> rocm_arches = SystemInfo::get_rocm_arches();
     std::string version = backend_versions["therock"]["version"].get<std::string>();

-    // Install TheRock for this architecture
-    backends::BackendUtils::install_therock(rocm_arch, version, progress_cb);
+    if (rocm_arches.empty()) {
+        // Fall back to single-arch detection for backward compatibility
+        std::string single_arch = SystemInfo::get_rocm_arch();
+        if (!single_arch.empty()) {
+            backends::BackendUtils::install_therock(single_arch, version, progress_cb);
+        }
+        return;
+    }
+
+    // Install TheRock for each detected GPU architecture
+    for (const auto& rocm_arch : rocm_arches) {
+        backends::BackendUtils::install_therock(rocm_arch, version, progress_cb);
+    }
 }

 } // namespace