Fix Corrective-RAG never triggering web search (exact-match grade parsing) by douxiao398 · Pull Request #246 · patchy631/ai-engineering-hub

douxiao398 · 2026-06-18T14:17:09Z

The Corrective-RAG demos never actually run their corrective web search, because the trigger condition can't match real LLM output.

In corrective-rag/workflow.py (and the same logic in firecrawl-agent/workflow.py), each retrieved document is graded by an LLM and the full response is stored:

relevancy_results.append(relevancy.text.lower().strip())

The decision to do a web search is then:

if "no" in relevancy_results:
    return WebSearchEvent(...)

relevancy_results is a list of full grader responses, so "no" in relevancy_results is a membership test that is only true when a response is exactly the string "no". The grading prompt asks for a binary yes/no, but in practice the model replies with things like "no.", "No, the document is not relevant", etc. None of those equal "no", so the condition is almost never true and the corrective web search is silently skipped — the core Corrective-RAG behavior doesn't fire.
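The difference between list membership and a per-string check can be seen in a small standalone sketch. The grader replies below are illustrative values, not captured model output.

```python
# Illustrative grader replies, already lower-cased and stripped as in the workflow.
relevancy_results = ["no.", "no, the document is not relevant", "yes"]

# List membership compares whole elements, so this is an exact-match test.
exact_match = "no" in relevancy_results

# A prefix test has to be applied to each string individually.
per_item_prefix = [r.startswith("no") for r in relevancy_results]

print(exact_match)      # False
print(per_item_prefix)  # [True, True, False]
```

Two of the three replies are clearly negative, yet the membership test reports no match, which is why the old condition never routes to the web search.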

corrective-rag has the same issue on the relevant-doc filter (result == "yes"), which drops any document whose grade isn't exactly "yes" (e.g. "yes, relevant").
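A minimal sketch of the filter behavior, assuming hypothetical (document, grade) pairs rather than the workflow's actual data structures:

```python
# Hypothetical (document text, grade) pairs; grades are lower-cased and stripped.
graded = [
    ("doc a", "yes"),
    ("doc b", "yes, relevant"),
    ("doc c", "no."),
]

# Exact match keeps only grades that are precisely "yes".
kept_exact = [doc for doc, grade in graded if grade == "yes"]

# Prefix match also keeps grades that add punctuation or a justification.
kept_prefix = [doc for doc, grade in graded if grade.startswith("yes")]

print(kept_exact)   # ['doc a']
print(kept_prefix)  # ['doc a', 'doc b']
```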

Fix — make the grade parsing robust instead of exact-match:

corrective-rag: a document is relevant when its grade starts with "yes", and the web search triggers when any document is not a clear "yes".
firecrawl-agent: it already treats a document as relevant when the grade contains "yes", so I made the trigger consistent with that — web search when any document's grade does not contain "yes". (It had switched the relevant-doc filter to fuzzy matching but left the "no" in ... trigger as exact, so the two were inconsistent.)
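The two rules described above can be sketched as follows. The helper names (`normalize`, `relevant_by_prefix`, `relevant_by_substring`, `needs_web_search`) are hypothetical and chosen for illustration; they are not taken from either workflow file.

```python
from typing import Callable, Iterable


def normalize(grade: str) -> str:
    """Lower-case and strip a raw grader reply."""
    return grade.lower().strip()


def relevant_by_prefix(grade: str) -> bool:
    """Rule described for corrective-rag: the grade starts with "yes"."""
    return normalize(grade).startswith("yes")


def relevant_by_substring(grade: str) -> bool:
    """Rule described for firecrawl-agent: the grade contains "yes"."""
    return "yes" in normalize(grade)


def needs_web_search(grades: Iterable[str], is_relevant: Callable[[str], bool]) -> bool:
    """Trigger the corrective search when any grade is not relevant."""
    return any(not is_relevant(g) for g in grades)


mixed = ["Yes.", "No, the document is not relevant"]
print(needs_web_search(mixed, relevant_by_prefix))     # True
print(needs_web_search(mixed, relevant_by_substring))  # True

# The two rules can disagree on replies that mention "yes" later in the text.
ambiguous = "no, although yes in part"
print(relevant_by_prefix(ambiguous))     # False
print(relevant_by_substring(ambiguous))  # True
```

The substring rule is the more permissive of the two, so a reply that begins with "no" but mentions "yes" later counts as relevant under it and not under the prefix rule.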

Easy way to see it before the fix: grade an irrelevant document so the model answers "No, ..." — the old code returns QueryEvent (no web search) instead of WebSearchEvent.

Summary by CodeRabbit

Bug Fixes
- Improved relevancy evaluation to be more tolerant of varied language model output formatting, ensuring corrective web searches are reliably triggered for irrelevant content regardless of response capitalization or punctuation variations.

…actually triggers

The corrective-rag and firecrawl-agent workflows decide whether to run a corrective web search with `if "no" in relevancy_results`. Each element of that list is the grader LLM's full response (lower-cased/stripped), so this only matches when a response is exactly "no". Real outputs like "no.", "No, the document is not relevant", etc. never match, so the corrective web search is silently skipped and the core Corrective-RAG mechanism never fires.

corrective-rag also filtered relevant docs with `result == "yes"`, which drops any doc whose grade isn't exactly "yes" (e.g. "yes, relevant").

Make the parsing robust:

- corrective-rag: treat a doc as relevant when its grade starts with "yes", and trigger the web search when any doc is not a clear "yes".
- firecrawl-agent: it already treats a doc as relevant when the grade contains "yes", so trigger the web search when any doc does not contain "yes" (consistent with that definition).

coderabbitai · 2026-06-18T14:17:32Z

Walkthrough

Two Corrective-RAG workflow files update eval_relevance to replace exact-match LLM grade parsing with prefix/substring checks. corrective-rag now treats any grade that does not start with "yes" as irrelevant, and firecrawl-agent any grade that does not contain "yes"; either case triggers a corrective web search. This handles LLM outputs that include punctuation or brief justifications.

Changes

Tolerant LLM Relevancy Grading

Layer / File(s)	Summary
Prefix-based relevancy gate `corrective-rag/workflow.py`, `firecrawl-agent/workflow.py`	Both files replace exact-match grade checks (`== "yes"`, `"no" in results`) with `startswith("yes")` / `"yes" not in` substring checks. `corrective-rag` also updates the `relevant_text` assembly to use the new rule. Both now trigger `WebSearchEvent` for any grade that fails the file's respective check.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~5 minutes

Poem

A fuzzy "Yes!" the LLM cried,
With punctuation tucked inside.
Old code ignored it, search was skipped,
But now the prefix check has gripped.
startswith("yes") — the rabbit's fix,
No more ambiguous parser tricks! 🐇✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main change: fixing a bug where Corrective-RAG web search doesn't trigger due to exact-match grade parsing logic.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

firecrawl-agent/workflow.py (1)

188-193: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Route empty grading results to corrective web search.

At Line 188, an empty relevancy_results_striped makes any(...) false, so the workflow skips WebSearchEvent when retrieval returns no documents. That leaves the user on the non-corrective path with no context.

Suggested fix

-        if any("yes" not in result.lower() for result in relevancy_results_striped):
+        if not relevancy_results_striped or any(
+            "yes" not in result.lower() for result in relevancy_results_striped
+        ):
             print("DEBUG: Some documents irrelevant, returning WebSearchEvent")
             return WebSearchEvent(relevant_text=relevant_text)
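
The empty-list case can be checked in isolation with a short sketch. `should_search` is a hypothetical helper that mirrors the suggested condition; it is not a function from the workflow.

```python
def should_search(grades):
    """Guarded condition: search on an empty list or on any grade without "yes"."""
    return not grades or any("yes" not in g.lower() for g in grades)


# any() over an empty iterable is False, so the unguarded check never fires.
unguarded_on_empty = any("yes" not in g.lower() for g in [])

print(unguarded_on_empty)              # False
print(should_search([]))               # True
print(should_search(["yes"]))          # False
print(should_search(["Yes", "no."]))   # True
```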

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@firecrawl-agent/workflow.py` around lines 188 - 193, The condition at line
188 using any() returns False when relevancy_results_striped is empty, causing
the workflow to incorrectly route to QueryEvent instead of WebSearchEvent when
there are no documents to evaluate. Modify the condition to explicitly check if
relevancy_results_striped is empty or if any result does not contain "yes",
ensuring that both empty retrieval results and documents with missing "yes"
values trigger the WebSearchEvent for corrective web search. You can do this by
adding an additional check like `if not relevancy_results_striped or any(...)`
to handle the empty case separately and return WebSearchEvent.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 485a9728-6d5a-44a1-99c0-b9858ee7262d

📥 Commits

Reviewing files that changed from the base of the PR and between cfd74dc and 1766dcc.

📒 Files selected for processing (2)

corrective-rag/workflow.py
firecrawl-agent/workflow.py

coderabbitai · 2026-06-18T14:19:49Z

+        if any(not result.startswith("yes") for result in relevancy_results):
            return WebSearchEvent(relevant_text=relevant_text)
        else:
            return QueryEvent(relevant_text=relevant_text, search_text="")


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle empty retrieval as a web-search trigger.

At Line 144, any(...) is false for an empty relevancy_results, so zero retrieved docs bypass corrective search and go straight to QueryEvent. That misses the fallback path exactly when retrieval fails.

Suggested fix

-        if any(not result.startswith("yes") for result in relevancy_results):
+        if not relevancy_results or any(
+            not result.startswith("yes") for result in relevancy_results
+        ):
             return WebSearchEvent(relevant_text=relevant_text)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate.

In `@corrective-rag/workflow.py` around lines 144 - 147, The condition at line 144 using `any(not result.startswith("yes") for result in relevancy_results)` evaluates to False when relevancy_results is empty, causing the code to incorrectly return QueryEvent instead of triggering a web search. Modify the condition to explicitly check if relevancy_results is empty OR if any result doesn't start with "yes", so that empty retrieval results trigger the WebSearchEvent fallback path as intended for corrective search behavior.

coderabbitai Bot reviewed Jun 18, 2026

douxiao398 wants to merge 1 commit into patchy631:main from douxiao398:fix/corrective-rag-relevancy-parsing
