docs(extraction): sync post-#2179 extraction doc fixes to main #2203
kheiss-uwzoo wants to merge 5 commits into
Greptile Summary
This PR syncs post-#2179 extraction documentation fixes from the
| Filename | Overview |
|---|---|
| docs/docs/extraction/custom-metadata.md | Deleted — content consolidated into vdbs.md#metadata-and-filtering; no content or links are stranded |
| docs/docs/extraction/deployment-options.md | One-line anchor retarget from #image-captioning-2605 to #image-captioning-nim-hardware, which is the new explicit anchor on prerequisites-support-matrix.md |
| docs/docs/extraction/faq.md | Two link updates: chart-captioning FAQ now points to verified in-page anchors on multimodal-extraction.md; Docker Compose disclaimer added to env-vars answer |
| docs/docs/extraction/multimodal-extraction.md | Simplified OCR prose; chart-caption scope sentence added under #image-captioning; cross-links updated to in-page anchors; all targets verified present |
| docs/docs/extraction/prerequisites-support-matrix.md | Section headings given explicit anchor IDs; #image-captioning-nim-hardware is new canonical anchor; legacy #image-captioning-2605 preserved as a span alias; admonition block removed |
| docs/docs/extraction/vdbs.md | Metadata-and-filtering section expanded with inline guidance replacing the deleted custom-metadata.md page; old links to that page replaced with notebook links |
| docs/mkdocs.yml | Nav renumbered after removing the single-page '7. Retrieval & ranking' section; redirect for custom-metadata.md added, consistent with fragment-redirect pattern already used elsewhere |
| nemo_retriever/tests/test_src_documentation_snippets.py | custom-metadata.md removed from the doc-snippet test list to match the deleted file |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["custom-metadata.md\n(deleted)"] -- "redirect" --> B["vdbs.md\n#metadata-and-filtering"]
C["faq.md"] -- "chart-caption FAQ" --> D["multimodal-extraction.md\n#charts-and-infographics"]
C -- "image captioning FAQ" --> E["multimodal-extraction.md\n#image-captioning"]
E -- "NIM & hardware" --> F["prerequisites-support-matrix.md\n#image-captioning-nim-hardware"]
F -- "legacy alias" --> G["span id=image-captioning-2605"]
H["deployment-options.md"] -- "offline captioning link" --> F
I["integrations doc"] -- "filter search link" --> B
J["workflow-agentic-retrieval.md"] -- "metadata link" --> B
Reviews (11): Last reviewed commit: "docs(extraction): fix caption anchor and..."
477f48a to 9a24d31
9a24d31 to 019547d
…a page, remove chart admonition Remove custom-metadata.md in favor of vdbs.md#metadata-and-filtering and the metadata filtering notebook. Drop the PDF chart caption admonition from multimodal-extraction.md per review feedback.
PR NVIDIA#2194 merged into 26.05 on 2026-06-02 but never reached main. This backport keeps main aligned with the release branch and the published docs.nvidia.com site after Randy's follow-up review.

Timeline:
- Friday: 26.05 docs built for docs.nvidia upload; branch differed from NRL GitHub Pages source and the uploaded docs were incorrect.
- Saturday: diff main vs 26.05 produced PR NVIDIA#2179 to sync extraction docs.
- Monday: PR NVIDIA#2179 merged and docs uploaded to the public site.
- Follow-up: Randy opened PR NVIDIA#2194 on 26.05 with additional fixes found after the NVIDIA#2179 sync. Those fixes landed on 26.05 only.
- This commit: cherry-pick of c5b257e onto main (five extraction doc files only).

Changes from NVIDIA#2194:
- Fix audio-video.md indented code block rendering
- Restore custom-metadata example service variables and storage prose
- Move caption scope admonition to multimodal-extraction.md
- Trim redundant Helm/OCR deploy detail per review feedback
- Restore FAQ Docker Compose note and support-matrix section anchors
Rename the support-matrix caption section for main and keep a legacy #image-captioning-2605 alias so existing deep links keep working.
6a46fb2 to 56fa45c
Resolve modify/delete conflict on custom-metadata.md by keeping the PR deletion (content consolidated in vdbs.md with redirect). Bring in main mkdocstrings path fixes and support-matrix updates.
…A#2203 Point deployment-options.md at #image-captioning-nim-hardware. Add one sentence under multimodal-extraction #image-captioning so FAQ cross-references have scope detail without restoring the admonition.
Summary
Sync extraction doc structure on main with post-#2179 review fixes that landed on 26.05 in PR #2194 but never reached main. This is not a literal cherry-pick of #2194: review feedback on this PR evolved the approach.

What changed
- custom-metadata.md: consolidate metadata/filtering guidance into vdbs.md#metadata-and-filtering; add mkdocs.yml redirect; update cross-links in workflow-agentic-retrieval.md, integrations-langchain-llamaindex-haystack.md, and the doc-snippet test list.
- multimodal-extraction.md links from prerequisites-support-matrix.md#image-captioning-2605 to in-page anchors; rename the support-matrix section to #image-captioning-nim-hardware with a legacy #image-captioning-2605 span for external bookmarks; fix deployment-options.md to use the new anchor.
- #image-captioning in multimodal-extraction.md.
- multimodal-extraction.md; explicit section anchor IDs in prerequisites-support-matrix.md.

Out of scope (already on main)
- audio-video.md markdown rendering fixes from "fix audio-video.md markdown rendering (follow-up to #2179)" (#2194) are already present on main via the 26.05 merge and follow-up doc PRs.

Follow-up (eng, not this PR)

Reviewer checklist
- custom-metadata.md; link to metadata filtering notebook from vdbs.md
- 26.05 from image-captioning heading; keep legacy anchor alias
- main; conflicts resolved

Test plan
- extraction/custom-metadata.md redirects to vdbs.md#metadata-and-filtering
- multimodal-extraction.md#charts-and-infographics and #image-captioning
- deployment-options.md offline caption link targets #image-captioning-nim-hardware
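The anchor checks in the test plan could be scripted rather than clicked through by hand. The sketch below is a minimal illustration, not part of this PR. It assumes headings carry explicit IDs in `{#id}` (attr_list) form and that the legacy alias is an inline `<span id="...">`. The page bodies, the heading text, and the helper names `collect_anchors` and `missing_targets` are all hypothetical stand-ins for the real files.

```python
import re

# Hypothetical, trimmed-down page bodies; the real files are not reproduced here.
PAGES = {
    "prerequisites-support-matrix.md": (
        '<span id="image-captioning-2605"></span>\n'
        "## Image captioning NIM and hardware {#image-captioning-nim-hardware}\n"
    ),
    "multimodal-extraction.md": (
        "## Charts and infographics {#charts-and-infographics}\n"
        "## Image captioning {#image-captioning}\n"
    ),
    "vdbs.md": "## Metadata and filtering {#metadata-and-filtering}\n",
}

# Explicit heading IDs, e.g. "## Title {#some-id}" (assumed attr_list syntax).
HEADING_ID = re.compile(r"^#{1,6} .*\{#([A-Za-z0-9_-]+)\}\s*$", re.MULTILINE)
# Inline HTML ids, e.g. '<span id="legacy-id"></span>'.
HTML_ID = re.compile(r'\bid="([A-Za-z0-9_-]+)"')


def collect_anchors(text):
    """Return every anchor ID a page exposes, from headings or inline HTML."""
    return set(HEADING_ID.findall(text)) | set(HTML_ID.findall(text))


def missing_targets(pages, links):
    """Return the links whose page or fragment cannot be resolved."""
    anchors = {name: collect_anchors(body) for name, body in pages.items()}
    missing = []
    for link in links:
        page, _, fragment = link.partition("#")
        if page not in anchors or (fragment and fragment not in anchors[page]):
            missing.append(link)
    return missing


# Targets named in the test plan, plus the legacy alias kept for old bookmarks.
LINKS = [
    "vdbs.md#metadata-and-filtering",
    "multimodal-extraction.md#charts-and-infographics",
    "multimodal-extraction.md#image-captioning",
    "prerequisites-support-matrix.md#image-captioning-nim-hardware",
    "prerequisites-support-matrix.md#image-captioning-2605",
]

if __name__ == "__main__":
    print(missing_targets(PAGES, LINKS))
```

Against the real docs tree, `PAGES` would be populated by reading the markdown files from disk. An empty result would mean every listed target resolves, including the legacy alias.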