PMID GeneReviews entries (e.g., PMID:31536184) fall back to abstract_only instead of fetching NBK full text #41

@cmungall

Summary

When caching PMID:31536184 (GeneReviews chapter for GA1), linkml-reference-validator stores content_type: abstract_only and does not fetch the full NCBI Bookshelf chapter text, even though full text is publicly available.

Version

  • linkml-reference-validator==0.1.4rc8

Repro

linkml-reference-validator cache reference PMID:31536184 --force --verbose

Output includes:

  • Content type: abstract_only

Cached file frontmatter:

reference_id: PMID:31536184
content_type: abstract_only
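
A quick way to spot affected cache entries is to read the frontmatter fields back. The sketch below is a hypothetical helper (`parse_frontmatter_fields` is not part of linkml-reference-validator); it assumes the frontmatter consists of flat `key: value` lines like the two shown above, possibly wrapped in `---` delimiters.

```python
def parse_frontmatter_fields(text: str) -> dict[str, str]:
    """Parse flat `key: value` lines into a dict (hypothetical helper).

    Assumes no nested YAML; splits on the first colon only, so values
    such as `PMID:31536184` keep their own colon.
    """
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "---":
            continue  # skip blank lines and assumed frontmatter delimiters
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


sample = "reference_id: PMID:31536184\ncontent_type: abstract_only\n"
fields = parse_frontmatter_fields(sample)
is_abstract_only = fields.get("content_type") == "abstract_only"
print(fields["reference_id"], is_abstract_only)
```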

Why this is a problem

For GeneReviews PMIDs, users often cite statements that appear in the full NBK chapter but not in the PubMed abstract/summary text. Validation then fails because only the abstract text is available to match against.

Expected behavior

One of:

  1. PMID source should detect NCBI Bookshelf/GeneReviews links and fetch chapter full text (e.g., printable view), or
  2. Provide a built-in NBK source and/or an automatic PubMed->Bookshelf handoff for PMIDs that resolve to Bookshelf chapters.
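
As a rough illustration of option 1, detection could key off the PubMed record itself. The sketch below assumes that the efetch XML for a Bookshelf-hosted record carries an `ArticleId` element with `IdType="bookaccession"` holding the NBK accession, and that the chapter is reachable under `https://www.ncbi.nlm.nih.gov/books/<accession>/` with a `?report=printable` view; both should be checked against real responses. The function names and the `NBK0000000` accession are placeholders, not part of the tool or the real record.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def find_bookshelf_accession(efetch_xml: str) -> Optional[str]:
    """Return an NBK accession from PubMed efetch XML, or None.

    Assumption: Bookshelf records expose the accession as
    <ArticleId IdType="bookaccession">NBK...</ArticleId>.
    """
    root = ET.fromstring(efetch_xml)
    for article_id in root.iter("ArticleId"):
        if article_id.get("IdType") == "bookaccession":
            text = (article_id.text or "").strip()
            if text.startswith("NBK"):
                return text
    return None


def bookshelf_printable_url(accession: str) -> str:
    """Build the assumed printable-view URL for a Bookshelf chapter."""
    return f"https://www.ncbi.nlm.nih.gov/books/{accession}/?report=printable"


# Synthetic, trimmed-down record; NBK0000000 is a placeholder accession.
SAMPLE_XML = """
<PubmedArticleSet>
  <PubmedBookArticle>
    <BookDocument>
      <PMID>31536184</PMID>
      <ArticleIdList>
        <ArticleId IdType="bookaccession">NBK0000000</ArticleId>
      </ArticleIdList>
    </BookDocument>
  </PubmedBookArticle>
</PubmedArticleSet>
"""

accession = find_bookshelf_accession(SAMPLE_XML)
if accession is not None:
    print(bookshelf_printable_url(accession))
```

Keeping detection as a pure function over the XML string makes it testable without network access; the actual HTTP fetch and HTML-to-text step would sit behind it.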

Additional evidence

Full chapter is available at:

(HTML content is large and includes full sections/tables/references, not just the summary.)

Notes

Current PMID implementation appears to do:

  • PubMed abstract fetch
  • optional PMC full text fetch via PMCID
  • fallback to abstract_only when no PMCID

This misses Bookshelf-hosted full text for GeneReviews PMIDs.
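
The flow above, with a Bookshelf step added before the fallback, might look roughly like this. It is a sketch, not the project's actual code: `FetchResult`, `resolve_content`, and the injected fetcher callables are hypothetical, and the `"full_text"` label is an assumed counterpart to the `abstract_only` value seen in the cache.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical fetcher type: takes an identifier, returns text or None.
Fetcher = Callable[[str], Optional[str]]


@dataclass
class FetchResult:
    content: str
    content_type: str  # "full_text" (assumed label) or "abstract_only"


def resolve_content(
    abstract: str,
    pmcid: Optional[str],
    nbk_accession: Optional[str],
    fetch_pmc: Fetcher,
    fetch_bookshelf: Fetcher,
) -> FetchResult:
    """Try PMC, then Bookshelf, then fall back to the abstract."""
    if pmcid:
        text = fetch_pmc(pmcid)
        if text:
            return FetchResult(text, "full_text")
    if nbk_accession:  # proposed new step for GeneReviews-style PMIDs
        text = fetch_bookshelf(nbk_accession)
        if text:
            return FetchResult(text, "full_text")
    return FetchResult(abstract, "abstract_only")


# Stub fetchers stand in for real network calls.
result = resolve_content(
    abstract="Summary text",
    pmcid=None,
    nbk_accession="NBK0000000",  # placeholder accession
    fetch_pmc=lambda _id: None,
    fetch_bookshelf=lambda _id: "Full chapter text",
)
print(result.content_type)  # full_text
```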

Metadata

Labels: enhancement (New feature or request)
