fix(v2): don't mutate instance hybrid_search_config in similarity searches#309

Open
Humphrey (HumphreySun98) wants to merge 1 commit into
langchain-ai:mainfrom
HumphreySun98:fix/v2-asimilarity-search-mutates-hybrid-search-config

Conversation

@HumphreySun98

Description

`AsyncPGVectorStore.asimilarity_search` and `asimilarity_search_with_score` back-fill `fts_query` on the `HybridSearchConfig` object retrieved from kwargs or, by default, from `self.hybrid_search_config`:

```python
hybrid_search_config = kwargs.get("hybrid_search_config", self.hybrid_search_config)
if hybrid_search_config and not hybrid_search_config.fts_query:
    hybrid_search_config.fts_query = query  # ← in-place
    kwargs["hybrid_search_config"] = hybrid_search_config
```

Because the assignment mutates the shared object in place, the first call's `query` sticks on it, and every subsequent search reuses that `fts_query` regardless of the new `query` argument. The same hazard applies to callers that pass one `HybridSearchConfig` to multiple searches. This is exactly the symptom reported in #288.
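
The aliasing can be reproduced without Postgres. Below is a minimal sketch, assuming a stand-in `FakeConfig` dataclass and `FakeStore` class (both hypothetical, not the library's types) that copy only the back-fill logic quoted above:

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeConfig:
    """Hypothetical stand-in for HybridSearchConfig; only fts_query matters here."""

    fts_query: str = ""


class FakeStore:
    """Hypothetical stand-in that reproduces only the back-fill logic."""

    def __init__(self, config: Optional[FakeConfig]) -> None:
        self.hybrid_search_config = config

    def search(self, query: str, **kwargs: Any) -> str:
        config = kwargs.get("hybrid_search_config", self.hybrid_search_config)
        if config and not config.fts_query:
            config.fts_query = query  # mutates the shared object in place
            kwargs["hybrid_search_config"] = config
        # Return the full-text query this call would actually use.
        return config.fts_query if config else query


store = FakeStore(FakeConfig())
first = store.search("apples")
second = store.search("oranges")
print(first, second)  # apples apples
```

The second call silently searches for `apples` because the first call already filled in `fts_query` on the instance-level config.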

The fix is to copy the config before back-filling, mirroring how other LangChain integrations handle caller-owned input (see e.g. langchain-anthropic's recent `bind_tools` mutation fix).
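
As a rough illustration of the copy-before-back-fill pattern, here is a sketch using a hypothetical `FakeConfig` stand-in and a hypothetical helper `backfill_fts_query`. Neither name exists in `langchain_postgres`, and the actual change is inlined in the two methods rather than factored out like this:

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeConfig:
    """Hypothetical stand-in for HybridSearchConfig."""

    fts_query: str = ""


def backfill_fts_query(config: Optional[FakeConfig], query: str) -> Optional[FakeConfig]:
    """Return a config with fts_query set, without touching the caller's object."""
    if config and not config.fts_query:
        config = copy.copy(config)  # work on a private shallow copy
        config.fts_query = query
    return config


shared = FakeConfig()
a = backfill_fts_query(shared, "apples")
b = backfill_fts_query(shared, "oranges")
print(a.fts_query, b.fts_query, repr(shared.fts_query))  # apples oranges ''
```

Each call gets its own back-filled copy, and `shared` keeps its empty `fts_query`. A config that already carries an explicit `fts_query` is returned as-is, with no copy made.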

Relevant issues

Fixes #288

Changes

  • `langchain_postgres/v2/async_vectorstore.py`: in both `asimilarity_search` and `asimilarity_search_with_score`, call `copy.copy(hybrid_search_config)` before assigning `fts_query`. The `copy` module is already imported at the top of the file.
  • `tests/unit_tests/v2/test_async_vectorstore_no_mutation.py`: four async unit tests that bypass `AsyncPGVectorStore.create(...)` (no Postgres needed) and pin:
    • instance-level config is preserved across two calls;
    • caller-provided config is preserved when reused across calls;
    • existing `fts_query` is not overwritten (back-fill branch correctly skips);
    • `asimilarity_search_with_score` has the same invariants.
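
One way a test of this shape can avoid a database is sketched below. This is an assumption about the general approach, not the PR's actual test code: `FakeAsyncStore` and `FakeConfig` are hypothetical stand-ins, the object is built with `__new__` to skip construction, and the downstream search is replaced with `unittest.mock.AsyncMock`.

```python
import asyncio
import copy
from dataclasses import dataclass
from typing import Any, Optional
from unittest.mock import AsyncMock


@dataclass
class FakeConfig:
    """Hypothetical stand-in for HybridSearchConfig."""

    fts_query: str = ""


class FakeAsyncStore:
    """Hypothetical stand-in for the store, with the copy-before-back-fill fix."""

    hybrid_search_config: Optional[FakeConfig]

    async def asimilarity_search_by_vector(self, **kwargs: Any) -> list:
        raise RuntimeError("would hit the database")

    async def asimilarity_search(self, query: str, **kwargs: Any) -> list:
        config = kwargs.get("hybrid_search_config", self.hybrid_search_config)
        if config and not config.fts_query:
            config = copy.copy(config)
            config.fts_query = query
            kwargs["hybrid_search_config"] = config
        return await self.asimilarity_search_by_vector(**kwargs)


async def check_instance_config_preserved() -> list:
    store = FakeAsyncStore.__new__(FakeAsyncStore)  # skip any constructor
    store.hybrid_search_config = FakeConfig()
    stub = AsyncMock(return_value=[])
    store.asimilarity_search_by_vector = stub  # no database access
    await store.asimilarity_search("apples")
    await store.asimilarity_search("oranges")
    # The instance-level config must be untouched after both calls.
    assert store.hybrid_search_config.fts_query == ""
    # Collect the fts_query each downstream call received.
    return [c.kwargs["hybrid_search_config"].fts_query for c in stub.await_args_list]


seen = asyncio.run(check_instance_config_preserved())
print(seen)  # ['apples', 'oranges']
```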

Testing

```
$ python -m pytest tests/unit_tests/v2/test_async_vectorstore_no_mutation.py -q
4 passed in 0.49s
```

Reverting only the `copy.copy(...)` line while keeping the new tests makes three of the four tests fail (the fourth covers the no-back-fill branch and is unaffected), confirming that the new tests catch the regression. `ruff check` and `ruff format --check` pass on both files.

Note

Fix is intentionally minimal — a shallow `copy.copy` is sufficient because `HybridSearchConfig` is a flat dataclass and only `fts_query` (a string) is reassigned. No callers need updating; non-hybrid-search paths are untouched.
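
To make the shallow-copy reasoning concrete, here is a small sketch with a hypothetical `FakeConfig` dataclass. The `extra` dict field is invented purely to show what a shallow copy shares and is not a claim about `HybridSearchConfig`'s fields:

```python
import copy
from dataclasses import dataclass, field


@dataclass
class FakeConfig:
    fts_query: str = ""
    # Hypothetical mutable field, included only to show what a shallow copy shares.
    extra: dict = field(default_factory=dict)


original = FakeConfig(extra={"k": 1})
clone = copy.copy(original)

clone.fts_query = "apples"  # rebinding an attribute affects only the clone
print(repr(original.fts_query))  # ''

clone.extra["k"] = 2  # mutating a shared nested object affects both
print(original.extra["k"])  # 2
```

Reassigning a string attribute on the copy is safe, which is all the back-fill does. A deep copy would only become necessary if the search path ever mutated a nested object in place.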

Disclaimer: this PR was prepared with the assistance of an AI agent (Claude Code). All code and test changes were reviewed by the author before submission.

…rches

`AsyncPGVectorStore.asimilarity_search` and `asimilarity_search_with_score`
back-filled `fts_query` on the `hybrid_search_config` object retrieved
from kwargs or, by default, from `self.hybrid_search_config`. Because
the assignment hit the shared instance variable in place, the very
first call's query "stuck" on it and every subsequent search reused that
fts_query regardless of the new `query` argument. Same hazard for
callers that reuse a single `HybridSearchConfig` across calls.

Copy the config before setting `fts_query`. Adds regression tests that
bypass `AsyncPGVectorStore.create(...)` (no Postgres needed) and pin
all three invariants: instance-level config is preserved, caller-
provided config is preserved, and an existing `fts_query` is left
alone when the back-fill branch is skipped.

Fixes langchain-ai#288
Development

Successfully merging this pull request may close these issues.

Instance variable hybrid_search_config incorrectly mutated