Similarity search with distance filter failed: (psycopg.errors.UndefinedColumn) column "product_ids" does not exist #304

@esseti

Hello,

I have a PGVectorStore into which I've inserted Documents with content like this:

id: fff86dca-918c-421d-9124-333e8ac3bf8a
metadata: {
  "scope": "company",
  "title": "data_test.txt",
  "filename": "data_test.txt",
  "document_id": "31927",
  "product_ids": [],
  "start_index": 0,
  "source_document_id": "31927"
}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Vestibulum in justo sodales, accumsan orci quis, malesuada ligula. Ut dapibus lectus a eros efficitur, at convallis  … (+510 chars)

Now I'm running a query of this type:

all_results = vector_store.similarity_search_with_score(
    query=query,
    k=k,
    filter=filter,
)

where the actual values, as logged by my code, are:

Performing similarity search with query Qual è l'architettura di conservazione dei dati e dove si trovano fisicamente i server? , filter={'product_ids': {'$in': [34]}} and k=50
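
To make the link between this filter and the SQL in the log easier to see, here is a minimal sketch. `render_in_filter` is a hypothetical helper of mine, not langchain_postgres code; it only mimics the shape of the WHERE fragment in the traceback (the real parameter name carries a suffix such as `_bc347a3b`, which is omitted here).

```python
def render_in_filter(field: str, values: list) -> tuple[str, dict]:
    """Hypothetical helper: mimic the WHERE fragment seen in the log.

    The field name is emitted as a bare SQL identifier, so Postgres will
    look for a table column with that name.
    """
    param = f"{field}_in"
    return f"{field} = ANY(%({param})s)", {param: values}


clause, params = render_in_filter("product_ids", [34])
print(clause)  # product_ids = ANY(%(product_ids_in)s)
print(params)  # {'product_ids_in': [34]}
```

Judging only from the logged SQL, `product_ids` is treated as a column of "rag_company_11", while in my data it is a key inside the `langchain_metadata` JSON value.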

I get this error:

2026-05-14 15:16:54.281 | [14/May/2026 13:16:54] ERROR [4699.ai.rag:504]  Similarity search with distance filter failed: (psycopg.errors.UndefinedColumn) column "product_ids" does not exist
2026-05-14 15:16:54.281 | LINE 2:         FROM "public"."rag_company_11" WHERE product_ids = A...
2026-05-14 15:16:54.281 |                                                      ^
2026-05-14 15:16:54.281 | [SQL: SELECT "langchain_id", "content", "embedding", "langchain_metadata", cosine_distance("embedding", %(query_embedding)s) as distance
2026-05-14 15:16:54.281 |         FROM "public"."rag_company_11" WHERE product_ids = ANY(%(product_ids_in_bc347a3b)s) ORDER BY "embedding" <=> %(query_embedding)s LIMIT %(dense_limit)s;
2026-05-14 15:16:54.281 |         ]
2026-05-14 15:16:54.281 | [parameters: {'query_embedding': '[-0.004596710205078125, 0.01456451416015625, 0.0518798828125, 0.012725830078125, 0.0634765625, 0.03546142578125, -0.046844482421875, -0.0014133453369 ... (30701 characters truncated) ... 34942626953125, 0.002788543701171875, -0.0008635520935058594, -0.0179290771484375, -0.034637451171875, 0.002643585205078125, -0.00012230873107910156]', 'product_ids_in_bc347a3b': [34], 'dense_limit': 50}]
2026-05-14 15:16:54.281 | (Background on this error at: https://sqlalche.me/e/20/f405)
2026-05-14 15:16:54.281 | Traceback (most recent call last):
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
2026-05-14 15:16:54.281 |     self.dialect.do_execute(
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 952, in do_execute
2026-05-14 15:16:54.281 |     cursor.execute(statement, parameters)
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/dialects/postgresql/psycopg.py", line 673, in execute
2026-05-14 15:16:54.281 |     result = self.await_(self._cursor.execute(query, params, **kw))
2026-05-14 15:16:54.281 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 132, in await_only
2026-05-14 15:16:54.281 |     return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
2026-05-14 15:16:54.281 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 196, in greenlet_spawn
2026-05-14 15:16:54.281 |     value = await result
2026-05-14 15:16:54.281 |             ^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/psycopg/cursor_async.py", line 117, in execute
2026-05-14 15:16:54.281 |     raise ex.with_traceback(None)
2026-05-14 15:16:54.281 | psycopg.errors.UndefinedColumn: column "product_ids" does not exist
2026-05-14 15:16:54.281 | LINE 2:         FROM "public"."rag_company_11" WHERE product_ids = A...
2026-05-14 15:16:54.281 |                                                      ^
2026-05-14 15:16:54.281 | 
2026-05-14 15:16:54.281 | The above exception was the direct cause of the following exception:
2026-05-14 15:16:54.281 | 
2026-05-14 15:16:54.281 | Traceback (most recent call last):
2026-05-14 15:16:54.281 |   File "/opt/src/ai/rag.py", line 491, in similarity_search_with_distance_filter
2026-05-14 15:16:54.281 |     all_results = vector_store.similarity_search_with_score(
2026-05-14 15:16:54.281 |                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/langchain_postgres/v2/vectorstores.py", line 714, in similarity_search_with_score
2026-05-14 15:16:54.281 |     return self._engine._run_as_sync(
2026-05-14 15:16:54.281 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/langchain_postgres/v2/engine.py", line 131, in _run_as_sync
2026-05-14 15:16:54.281 |     return asyncio.run_coroutine_threadsafe(coro, self._loop).result()  # type: ignore[arg-type]
2026-05-14 15:16:54.281 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 456, in result
2026-05-14 15:16:54.281 |     return self.__get_result()
2026-05-14 15:16:54.281 |            ^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/usr/local/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
2026-05-14 15:16:54.281 |     raise self._exception
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/langchain_postgres/v2/async_vectorstore.py", line 738, in asimilarity_search_with_score
2026-05-14 15:16:54.281 |     docs = await self.asimilarity_search_with_score_by_vector(
2026-05-14 15:16:54.281 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/langchain_postgres/v2/async_vectorstore.py", line 765, in asimilarity_search_with_score_by_vector
2026-05-14 15:16:54.281 |     results = await self.__query_collection(
2026-05-14 15:16:54.281 |               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/langchain_postgres/v2/async_vectorstore.py", line 636, in __query_collection
2026-05-14 15:16:54.281 |     result = await conn.execute(text(dense_query_stmt), param_dict)
2026-05-14 15:16:54.281 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/ext/asyncio/engine.py", line 659, in execute
2026-05-14 15:16:54.281 |     result = await greenlet_spawn(
2026-05-14 15:16:54.281 |              ^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 201, in greenlet_spawn
2026-05-14 15:16:54.281 |     result = context.throw(*sys.exc_info())
2026-05-14 15:16:54.281 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1419, in execute
2026-05-14 15:16:54.281 |     return meth(
2026-05-14 15:16:54.281 |            ^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/sql/elements.py", line 527, in _execute_on_connection
2026-05-14 15:16:54.281 |     return connection._execute_clauseelement(
2026-05-14 15:16:54.281 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1641, in _execute_clauseelement
2026-05-14 15:16:54.281 |     ret = self._execute_context(
2026-05-14 15:16:54.281 |           ^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
2026-05-14 15:16:54.281 |     return self._exec_single_context(
2026-05-14 15:16:54.281 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1986, in _exec_single_context
2026-05-14 15:16:54.281 |     self._handle_dbapi_exception(
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 2363, in _handle_dbapi_exception
2026-05-14 15:16:54.281 |     raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
2026-05-14 15:16:54.281 |     self.dialect.do_execute(
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 952, in do_execute
2026-05-14 15:16:54.281 |     cursor.execute(statement, parameters)
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/dialects/postgresql/psycopg.py", line 673, in execute
2026-05-14 15:16:54.281 |     result = self.await_(self._cursor.execute(query, params, **kw))
2026-05-14 15:16:54.281 |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 132, in await_only
2026-05-14 15:16:54.281 |     return current.parent.switch(awaitable)  # type: ignore[no-any-return,attr-defined] # noqa: E501
2026-05-14 15:16:54.281 |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/sqlalchemy/util/_concurrency_py3k.py", line 196, in greenlet_spawn
2026-05-14 15:16:54.281 |     value = await result
2026-05-14 15:16:54.281 |             ^^^^^^^^^^^^
2026-05-14 15:16:54.281 |   File "/opt/.venv/lib/python3.12/site-packages/psycopg/cursor_async.py", line 117, in execute
2026-05-14 15:16:54.281 |     raise ex.with_traceback(None)
2026-05-14 15:16:54.281 | sqlalchemy.exc.ProgrammingError: (psycopg.errors.UndefinedColumn) column "product_ids" does not exist
2026-05-14 15:16:54.281 | LINE 2:         FROM "public"."rag_company_11" WHERE product_ids = A...
2026-05-14 15:16:54.281 |                                                      ^
2026-05-14 15:16:54.281 | [SQL: SELECT "langchain_id", "content", "embedding", "langchain_metadata", cosine_distance("embedding", %(query_embedding)s) as distance
2026-05-14 15:16:54.281 |         FROM "public"."rag_company_11" WHERE product_ids = ANY(%(product_ids_in_bc347a3b)s) ORDER BY "embedding" <=> %(query_embedding)s LIMIT %(dense_limit)s;
2026-05-14 15:16:54.281 |         ]
2026-05-14 15:16:54.281 | [parameters: {'query_embedding': '[-0.004596710205078125, 0.01456451416015625, 0.0518798828125, 0.012725830078125, 0.0634765625, 0.03546142578125, -0.046844482421875, -0.0014133453369 ... (30701 characters truncated) ... 34942626953125, 0.002788543701171875, -0.0008635520935058594, -0.0179290771484375, -0.034637451171875, 0.002643585205078125, -0.00012230873107910156]', 'product_ids_in_bc347a3b': [34], 'dense_limit': 50}]
2026-05-14 15:16:54.281 | (Background on this error at: https://sqlalche.me/e/20/f405)

Where am I making a mistake?
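
A possible workaround while the cause is unclear, sketched under assumptions rather than offered as a confirmed fix: call the search without `filter` and apply the `product_ids` condition in Python on the returned (document, score) pairs. `Doc` is a stand-in for the LangChain document type (only `page_content` and `metadata` are used), and `filter_by_product_ids` is a hypothetical name.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Stand-in for a LangChain Document: text plus a metadata dict."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def filter_by_product_ids(results, wanted):
    """Keep (doc, score) pairs whose metadata 'product_ids' list
    shares at least one id with `wanted`."""
    wanted_set = set(wanted)
    return [
        (doc, score)
        for doc, score in results
        if wanted_set & set(doc.metadata.get("product_ids") or [])
    ]


# Fake search results, shaped like the output of similarity_search_with_score.
results = [
    (Doc("a", {"product_ids": []}), 0.12),       # like the sample document above
    (Doc("b", {"product_ids": [34, 7]}), 0.20),
    (Doc("c", {}), 0.31),                        # no product_ids key at all
]

kept = filter_by_product_ids(results, [34])
print([doc.page_content for doc, _ in kept])  # ['b']
```

Two caveats with this approach: `k` is applied before the Python filter, so fewer than `k` results may remain, and the sample document shown earlier has an empty `product_ids` list, so it would not match `34` in any case.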
