papercopilot/paperlists-agent

paperlists-agent

AI-native query and agent tooling for the papercopilot/paperlists conference-paper corpus.

This repo is the dedicated home for the agent layer that started in papercopilot/paperlists#29. It keeps the data corpus separate from the agent/query product, which ships as three surfaces:

Surface | Directory | Use it when
FastAPI query service | query-api/ | You want HTTPS or localhost access to the corpus
MCP server | mcp-server/ | You want Claude Code, Cursor, Codex, Claude Desktop, or another MCP host to query papers
Cross-tool Skill + CLI | skill/ | You want a portable markdown skill and stdlib-only command-line client

The core verbs are research-evolution oriented, not just keyword search:

  • topic_trend: yearly topic volume and citation-weighted volume
  • topic_evolution: per-year/per-window keywords, venues, and landmark papers
  • compare_periods: emerged/faded/sustained terms, authors, and affiliations
  • author_trajectory: papers by author across years
  • field_landscape: single-year field snapshot
  • corpus_manifest: corpus freshness/provenance contract for the data pipeline

Quick start

Use the hosted demo only for evaluation:

export PAPERLISTS_API_URL=https://api-production-18d3.up.railway.app
python3 skill/scripts/paperlists.py coverage
python3 skill/scripts/paperlists.py corpus_manifest  # confirm api.version/build identity
python3 skill/scripts/paperlists.py topic_evolution q="LLM reasoning" year_from=2024 year_to=2025 conferences=iclr,nips,icml,acl,emnlp match_mode=token_and
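The CLI calls above take a verb followed by key=value arguments. When driving the CLI from another script, a small helper can assemble that argument list. This is a minimal sketch: `build_cli_argv` is a hypothetical name, not part of the repo, and joining list values with commas is inferred from the `conferences=iclr,nips,...` form shown above.

```python
# Hypothetical helper (not part of the repo): builds an argv list that mirrors
# the "verb key=value ..." form used in the Quick start commands.
def build_cli_argv(verb, script="skill/scripts/paperlists.py", **params):
    argv = ["python3", script, verb]
    for key, value in params.items():
        # Assumption: multi-valued parameters are comma-joined, as in
        # conferences=iclr,nips,icml,acl,emnlp.
        if isinstance(value, (list, tuple)):
            value = ",".join(str(item) for item in value)
        argv.append(f"{key}={value}")
    return argv


argv = build_cli_argv(
    "topic_evolution",
    q="LLM reasoning",
    year_from=2024,
    year_to=2025,
    conferences=["iclr", "nips", "icml", "acl", "emnlp"],
    match_mode="token_and",
)
```

Passing `argv` to `subprocess.run(argv, check=True)` avoids shell quoting for values that contain spaces, such as the query string.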

For longitudinal claims, require corpus_manifest.api.version >= 0.2.0. For deployment canaries that must prove the endpoint is running current HEAD, also require a known corpus_manifest.api.git_sha, because a version check alone only rejects pre-0.2 demos. Those older demos used token-AND query semantics and lacked match_mode, query_expression, venue_diff, and query-noise metadata.
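The version and SHA gate described above could be sketched as follows. The sketch assumes the manifest is a JSON object with an `api` object holding `version` and `git_sha` strings (inferred from the dotted names above) and that versions are plain dotted integers. `check_manifest` and `parse_version` are hypothetical names.

```python
# Hypothetical gate (not part of the repo). Assumes the manifest looks like
# {"api": {"version": "0.2.0", "git_sha": "..."}}.
def parse_version(text):
    # Handles plain dotted integers only, e.g. "0.2.0" -> (0, 2, 0).
    return tuple(int(part) for part in text.split("."))


def check_manifest(manifest, min_version="0.2.0", expected_git_sha=None):
    api = manifest.get("api", {})
    version = api.get("version")
    # Reject manifests without a version or older than the minimum.
    if version is None or parse_version(version) < parse_version(min_version):
        return False
    # For deployment canaries, additionally pin the build to a known commit.
    if expected_git_sha is not None and api.get("git_sha") != expected_git_sha:
        return False
    return True
```

Comparing integer tuples rather than strings keeps a version such as 0.10.0 ordered after 0.2.0.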

For a local API:

cd query-api
uv run python -m paperlists_api.indexer /path/to/paperlists ./papers.db
PAPERLISTS_DB=$PWD/papers.db uv run uvicorn paperlists_api.main:app --reload

Then visit http://127.0.0.1:8000/docs.
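Besides the interactive docs, the local server can be probed from a script. The sketch below uses only the standard library and the /healthz and /v1/corpus_manifest paths named under Runtime knobs. The helper names are hypothetical, and the response bodies are assumed to be JSON.

```python
import json
import urllib.request


# Hypothetical helpers (not part of the repo).
def endpoint_url(base_url, path):
    # Join base and path with exactly one slash between them.
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def fetch_json(base_url, path, timeout=10):
    # Assumption: the endpoint returns a JSON body.
    with urllib.request.urlopen(endpoint_url(base_url, path), timeout=timeout) as resp:
        return json.load(resp)


# Example, with the local server from the commands above running:
#   health = fetch_json("http://127.0.0.1:8000", "/healthz")
#   manifest = fetch_json("http://127.0.0.1:8000", "/v1/corpus_manifest")
```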

Deploy

The root Dockerfile fetches the upstream paperlists JSON archive during build and bakes a SQLite FTS5 index into the runtime image, so the raw data never has to be committed or uploaded.

Railway can deploy from the repo root:

railway up

Runtime knobs:

  • WEB_CONCURRENCY=4
  • PAPERLISTS_RATE_PER_MIN=60
  • PAPERLISTS_RATE_BURST=20
  • PAPERLISTS_TRUST_PROXY=auto
  • PAPERLISTS_DB=/app/papers.db
  • PAPERLISTS_GIT_SHA, PAPERLISTS_GIT_BRANCH, PAPERLISTS_DEPLOYMENT_ID, PAPERLISTS_ENVIRONMENT are exposed in /, /healthz, and /v1/corpus_manifest; the Dockerfile maps Railway's Git build args into these fields when Railway provides them.
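For tooling that needs to read the same knobs, a loader might look like the sketch below. It assumes the values listed above are the fallbacks when a variable is unset, which the list does not state explicitly. `load_knobs` is a hypothetical name, and the service's own parsing may differ.

```python
import os

# Assumption: the values shown in the knob list act as defaults.
DEFAULTS = {
    "WEB_CONCURRENCY": "4",
    "PAPERLISTS_RATE_PER_MIN": "60",
    "PAPERLISTS_RATE_BURST": "20",
    "PAPERLISTS_TRUST_PROXY": "auto",
    "PAPERLISTS_DB": "/app/papers.db",
}
INT_KNOBS = ("WEB_CONCURRENCY", "PAPERLISTS_RATE_PER_MIN", "PAPERLISTS_RATE_BURST")


def load_knobs(environ=None):
    # Hypothetical loader (not part of the repo); accepts a mapping for testing.
    env = os.environ if environ is None else environ
    knobs = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    for name in INT_KNOBS:
        knobs[name] = int(knobs[name])  # raises ValueError on non-numeric input
    return knobs
```

Calling `load_knobs({"PAPERLISTS_RATE_PER_MIN": "120"})` overrides only the per-minute rate and leaves the other knobs at the listed values.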

Development

cd query-api
uv run --extra dev pytest -q
uv run python -m compileall paperlists_api ../mcp-server/paperlists_mcp ../skill/scripts/paperlists.py

The hosted demo currently indexes 237,735 papers across 31 venues. Local papers.db files are generated artifacts and must not be committed.
