Arax pathfinder#64

Open
mohsenht wants to merge 12 commits into main from arax-pathfinder

@mohsenht
Collaborator

Hi @maximusunc,

Please review this pull request.

@mohsenht mohsenht requested a review from maximusunc January 21, 2026 16:17
@codecov

codecov bot commented Jan 21, 2026

Codecov Report

❌ Patch coverage is 2.82486% with 172 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.48%. Comparing base (928e4d8) to head (ce1a5e6).
⚠️ Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| workers/arax_pathfinder/worker.py | 0.00% | 111 Missing ⚠️ |
| workers/arax/worker.py | 0.00% | 35 Missing ⚠️ |
| shepherd_utils/inject_shepherd_arax_provenance.py | 0.00% | 26 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
|---|---|
| shepherd_server/main.py | 0.00% <ø> (ø) |
| shepherd_utils/config.py | 100.00% <100.00%> (ø) |
| shepherd_utils/inject_shepherd_arax_provenance.py | 0.00% <0.00%> (ø) |
| workers/arax/worker.py | 0.00% <0.00%> (ø) |
| workers/arax_pathfinder/worker.py | 0.00% <0.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8a561fa...ce1a5e6.


@maximusunc
Collaborator

I tried running a query through the ARAX pathfinder and it's unclear what happened. I got these logs:

arax_pathfinder  | 2026-02-04T16:02:05.361576: DEBUG: lookup map not here! /tmp/biolink/biolink_lookup_map_4.2.5_v5.pickle
arax_pathfinder  | 2026-02-04T16:02:07.267560: INFO: Building local Biolink 4.2.5 ancestor/descendant lookup map because one doesn't yet exist
arax_pathfinder  | [2026-02-04 16:02:07,299: INFO/shepherd.arax.pathfinder.e3355332.00527f4e]: Model release date: 12/01/2025
arax_pathfinder  | [2026-02-04 16:02:07,299: INFO/shepherd.arax.pathfinder.e3355332.00527f4e]: Finding paths process has started
arax_pathfinder  | [2026-02-04 16:02:07,300: INFO/shepherd.arax.pathfinder.e3355332.00527f4e]: Expanding CHEBI:45783
arax_pathfinder  | [2026-02-04 16:02:07,301: INFO/shepherd.arax.pathfinder.e3355332.00527f4e]: Expanding MONDO:0004979

but nothing else before it timed out after 5 minutes. I sent it Imatinib->Asthma.

@mohsenht
Collaborator Author

mohsenht commented Feb 4, 2026

Hi @maximusunc,

I changed the parameters to make it faster for now. I will get back to Shepherd-pathfinder, probably next week, to figure out what the problem is.

Collaborator

@maximusunc maximusunc left a comment

When testing with Imatinib->Asthma, your Pathfinder is returning 0 paths. Is this intended?

try:
    start = time.perf_counter()
    logger.info("Starting pathfinder.get_paths()")
    result, aux_graphs, knowledge_graph = pathfinder.get_paths(
Collaborator

I'm not sure if your pathfinder code is asynchronous or not, but this call is blocking and so your pathfinder implementation can only handle one query at a time. Is this intended?
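
One common way to keep a blocking call like this from holding up other queries, assuming the worker runs on an asyncio event loop (the thread does not confirm this), is to hand the call to a worker thread with `asyncio.to_thread`. In the sketch below, `blocking_get_paths` is a hypothetical stand-in for `pathfinder.get_paths`, not the real implementation.

```python
import asyncio
import time


def blocking_get_paths(source: str, target: str) -> list[str]:
    """Hypothetical stand-in for a synchronous pathfinder call."""
    time.sleep(0.2)  # simulate slow, blocking work
    return [f"{source}->{target}"]


async def handle_query(source: str, target: str) -> list[str]:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_get_paths, source, target)


async def run_concurrently(pairs: list[tuple[str, str]]) -> list[list[str]]:
    # gather returns results in the same order as the input pairs.
    return await asyncio.gather(*(handle_query(s, t) for s, t in pairs))


if __name__ == "__main__":
    pairs = [("CHEBI:45783", "MONDO:0004979")] * 3
    start = time.perf_counter()
    results = asyncio.run(run_concurrently(pairs))
    print(len(results), f"{time.perf_counter() - start:.2f}s")
```

Threads only help when the blocking call spends its time waiting on I/O or otherwise releases the GIL; purely CPU-bound Python code would need separate processes instead.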

Collaborator Author

Hi @maximusunc

Could you please share the JSON query that you sent and that returned 0 paths?

Collaborator Author

@mohsenht mohsenht Feb 18, 2026

Here is my query; I got results for this one.

{
    "message": {
        "query_graph": {
            "nodes": {
                "n0": {
                    "ids": [
                        "CHEBI:31690"
                    ]
                },
                "n1": {
                    "ids": [
                        "MONDO:0004979"
                    ]
                }
            },
            "paths": {
                "p0": {
                    "subject": "n0",
                    "object": "n1",
                    "predicates": [
                        "biolink:related_to"
                    ],
                    "constraints": []
                }
            }
        }
    }
}
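
For testing other node pairs, a query of the same shape can be generated programmatically. This is a minimal sketch that mirrors only the fields in the JSON above; `build_pathfinder_query` is a hypothetical helper, not part of Shepherd or ARAX.

```python
import json


def build_pathfinder_query(subject_id: str, object_id: str,
                           predicate: str = "biolink:related_to") -> dict:
    """Build a query with the same shape as the JSON above (hypothetical helper)."""
    return {
        "message": {
            "query_graph": {
                "nodes": {
                    "n0": {"ids": [subject_id]},
                    "n1": {"ids": [object_id]},
                },
                "paths": {
                    "p0": {
                        "subject": "n0",
                        "object": "n1",
                        "predicates": [predicate],
                        "constraints": [],
                    }
                },
            }
        }
    }


if __name__ == "__main__":
    # Same node pair as the query above.
    query = build_pathfinder_query("CHEBI:31690", "MONDO:0004979")
    print(json.dumps(query, indent=4))
```

Passing `CHEBI:45783` as the subject instead would produce a query for the node pair that appears in the Feb 4 log lines.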

Collaborator Author

@mohsenht mohsenht Feb 18, 2026

It can now handle multiple queries.

@maximusunc
Collaborator

I ran some tests last night and ran into some issues. I was able to run your query and get back results, but then I tried sending 5 concurrent queries and while they all fired off, I got this error for all of them:

arax_pathfinder  | requests.exceptions.ConnectionError: HTTPSConnectionPool(host='kg2cplover3.rtx.ai', port=9990): Max retries exceeded with url: /query (Caused by NewConnectionError("HTTPSConnection(host='kg2cplover3.rtx.ai', port=9990): Failed to establish a new connection: [Errno 111] Connection refused"))

And then I tried backing off and just sending one query and got this error:

arax_pathfinder  | [2026-02-26 02:18:32,536: ERROR/shepherd.arax.pathfinder.d18b72ad.f07114e6]: Path MONDO:0004979MONDO:0011786 raised an exception: MySQL connection failed: 2003 (HY000): Can't connect to MySQL server on 'arax-databases-mysql.rtx.ai:3306' (111)
arax_pathfinder  | [2026-02-26 02:18:34,166: ERROR/shepherd.arax.pathfinder.d18b72ad.f07114e6]: Path CHEBI:31690NCBIGene:1544 raised an exception: MySQL connection failed: 2003 (HY000): Can't connect to MySQL server on 'arax-databases-mysql.rtx.ai:3306' (111)
arax_pathfinder  | [2026-02-26 02:18:34,186: ERROR/shepherd.arax.pathfinder.d18b72ad.f07114e6]: PathFinder failed to find paths between on and sn. Error message is: MySQL connection failed: 2003 (HY000): Can't connect to MySQL server on 'arax-databases-mysql.rtx.ai:3306' (111)

This doesn't seem to be an issue with Shepherd; it looks like it is coming from these external services. So my follow-up questions are:
1. Can these external services handle concurrent queries? If they can't, then your Pathfinder shouldn't either.
2. Is your Pathfinder heavily CPU-bound? If so, we will potentially want to set up some multiprocessing.
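
If the external services turn out to tolerate only limited parallelism, one option is to cap the number of in-flight requests. Below is a minimal sketch using `asyncio.Semaphore`; `call_external_service` and the limit of 2 are assumptions for illustration, and nothing here reflects the real limits of the services named in the errors above.

```python
import asyncio


async def call_external_service(query_id: int, state: dict) -> int:
    """Hypothetical stand-in for a request to an external service."""
    state["in_flight"] += 1
    state["peak"] = max(state["peak"], state["in_flight"])
    await asyncio.sleep(0.05)  # simulate network latency
    state["in_flight"] -= 1
    return query_id


async def run_bounded(n_queries: int, limit: int) -> tuple[list[int], int]:
    semaphore = asyncio.Semaphore(limit)
    state = {"in_flight": 0, "peak": 0}

    async def guarded(query_id: int) -> int:
        # At most `limit` coroutines hold the semaphore at any moment.
        async with semaphore:
            return await call_external_service(query_id, state)

    results = await asyncio.gather(*(guarded(i) for i in range(n_queries)))
    return list(results), state["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(run_bounded(n_queries=5, limit=2))
    print(results, "peak concurrency:", peak)
```

All five queries are still accepted at once; the semaphore only delays the outbound calls so the downstream service never sees more than `limit` of them simultaneously.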

@mohsenht
Collaborator Author

PloverDB Concurrency: The error came from PloverDB, which is actually designed to handle thousands of requests in parallel for Pathfinder and other services. The KG2 team is currently working hard on its stability.

Pathfinder Performance: Pathfinder is both CPU-bound and IO-bound. It already uses multiprocessing in its core code to calculate rankings and expand nodes while building paths and trees.
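
For readers unfamiliar with the pattern, mixed workloads are often split so that CPU-bound steps run in processes and I/O-bound steps run in threads, both available through `concurrent.futures`. The sketch below is generic and does not reproduce Pathfinder's core code; `score_path` is a hypothetical placeholder for a ranking computation.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def score_path(path: tuple[str, ...]) -> int:
    """Hypothetical CPU-bound scoring step; a toy computation stands in for it."""
    return sum(len(node) for node in path)


def rank_paths(paths: list[tuple[str, ...]], executor: Executor) -> list[tuple[str, ...]]:
    # Executor.map yields scores in the same order as the input paths.
    scores = list(executor.map(score_path, paths))
    # Highest score first; sorted() is stable, so ties keep input order.
    ranked = sorted(zip(scores, paths), key=lambda item: item[0], reverse=True)
    return [path for _, path in ranked]


if __name__ == "__main__":
    paths = [
        ("CHEBI:31690", "MONDO:0004979"),
        ("CHEBI:31690", "NCBIGene:1544", "MONDO:0004979"),
    ]
    # Processes sidestep the GIL for CPU-bound work.
    with ProcessPoolExecutor(max_workers=2) as pool:
        print(rank_paths(paths, pool))
    # Threads are usually enough for I/O-bound work.
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(rank_paths(paths, pool))
```

Taking the executor as a parameter keeps the ranking logic independent of whether it is backed by threads or processes.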

The subsequent failure for the single query shows that the database connection to arax-databases-mysql was also down. I will ping the KG2 team about this one.

Thanks Max
