feat: add schema modification MCP tools (add-fields, add-field-types) by adityamparikh · Pull Request #131 · apache/solr-mcp

adityamparikh · 2026-05-18T15:35:59Z

Summary

Adds add-fields and add-field-types MCP tools to extend a Solr collection's schema additively through the MCP layer.
Partially closes Add schema and configuration modification tools #30. Replace/delete operations, add-copy-field, add-dynamic-field, and add-codec-factory are deliberately deferred (see spec for rationale).
Both tools take List<Map<String, Object>> matching Solr's Schema API JSON shape; batched via SchemaRequest.MultiUpdate. No transformation layer.
Both tools are annotated with @PreAuthorize("isAuthenticated()"): the HTTP transport enforces authentication, while the stdio transport bypasses it (same pattern as existing tools).
Renames metadata/ package to schema/ since it only ever held schema-related types.
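To make the input shape in the summary concrete, here is a minimal sketch of the kind of `List<Map<String, Object>>` an MCP client might pass to add-fields. The class and helper names are hypothetical, and the field-to-type pairing (`text_general`, `string`) is an assumption for illustration; only the key names follow Solr's Schema API add-field JSON.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of this PR: builds an add-fields argument.
final class AddFieldsPayloadSketch {

    // One field definition, keyed the way the Schema API add-field command expects.
    static Map<String, Object> field(String name, String type, boolean multiValued) {
        Map<String, Object> def = new LinkedHashMap<>();
        def.put("name", name);
        def.put("type", type);
        def.put("stored", true);
        def.put("indexed", true);
        def.put("multiValued", multiValued);
        return def;
    }

    // Two illustrative fields; the chosen types are assumptions.
    static List<Map<String, Object>> showsFields() {
        return List.of(
                field("title", "text_general", false),
                field("genres", "string", true));
    }

    public static void main(String[] args) {
        showsFields().forEach(System.out::println);
    }
}
```

Because each map already has the Schema API shape, a service can hand it to SolrJ without an intermediate DTO, which is the "no transformation layer" point above.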

Design and plan

Why two tools instead of one combined? LLM tool-use ergonomics: single-purpose tools eliminate cross-wiring risk between field and field-type definitions. The orphan-field-type cost of separation is harmless.

Why no replace/delete? Schema replace/delete silently breaks indexed data without an explicit reindex. AI-driven workflows are the wrong place to expose that footgun without a guardrail design.

Why Map<String, Object> instead of records? Solr field-type shape includes analyzers/tokenizers/filters with class-specific param bags; records collapse to Map<String, Object> at the leaves anyway. Map shape matches SolrJ's AddField(Map) constructor — zero transformation.

Tool descriptions include inline recipes

add-field-types description includes three common recipes inline (case-insensitive exact match, dense vector for semantic search, autocomplete) to give the LLM a diagnostic-to-fix bridge — symptom in user prompt → recipe in description → exact analyzer chain.
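As a rough illustration of two of those recipes, the sketch below builds field-type definitions as nested maps. The type names (`string_ci`, `vector_384`) and the dimension 384 are hypothetical; the class names and attribute keys are standard Solr ones, but the exact recipes in the tool description may differ.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of two add-field-types payload entries.
final class FieldTypeRecipesSketch {

    // Case-insensitive exact match: keep the whole value as one token, then lowercase it.
    static Map<String, Object> caseInsensitiveExact() {
        Map<String, Object> analyzer = Map.of(
                "tokenizer", Map.of("class", "solr.KeywordTokenizerFactory"),
                "filters", List.of(Map.of("class", "solr.LowerCaseFilterFactory")));
        return Map.of(
                "name", "string_ci",          // assumed type name
                "class", "solr.TextField",
                "analyzer", analyzer);
    }

    // Dense vector for semantic search: no analyzer, only vector attributes.
    static Map<String, Object> denseVector() {
        return Map.of(
                "name", "vector_384",         // assumed type name
                "class", "solr.DenseVectorField",
                "vectorDimension", 384,       // assumed embedding size
                "similarityFunction", "cosine");
    }

    public static void main(String[] args) {
        System.out.println(caseInsensitiveExact());
        System.out.println(denseVector());
    }
}
```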

Test plan

Unit tests (Mockito, @DisabledInNativeImage) — validation, happy path, error propagation, single-analyzer, separate index/query analyzers, DenseVectorField
Integration tests (Testcontainers, real Solr) — schema round-trip, custom analyzer behavior at index + query time, DenseVectorField round-trip, duplicate-field / unknown-type error propagation
MCP protocol tests in McpClientIntegrationTestBase — end-to-end add-fields → index-json-documents → search workflow, exercised against both HTTP and stdio transports
./gradlew clean build — BUILD SUCCESSFUL, all tests pass
./gradlew jibDockerBuild — Jib JVM image built
./gradlew bootBuildImage -Pnative — Paketo native stdio image built
./gradlew bootBuildImage -Pnative -Pprofile=http — Paketo native HTTP image built
./gradlew nativeTest -Pnative — 168/168 native tests pass (113 Mockito tests appropriately skipped)

The nativeTest run also caught a real native-image reflection gap: Jackson's ObjectMapper.convertValue(map, AnalyzerDefinition.class) needs reflection metadata for AnalyzerDefinition (FieldTypeDefinition is registered as well, defensively). Fixed in fix(native): register AnalyzerDefinition and FieldTypeDefinition for reflection.

🤖 Generated with Claude Code

Spec and plan for adding add-fields and add-field-types MCP tools per issue #30. See docs/superpowers/specs/ and docs/superpowers/plans/. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

… tools Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

@JsonIgnoreProperties

…oject DTO convention @JsonIgnoreProperties, @JsonInclude(NON_NULL), and @JsonFormat on the timestamp field match the pattern used by every record in Dtos.java. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

Package previously named "metadata" only contained schema-related types (SchemaService, SchemaUpdateResult, and their tests). Renaming to "schema" makes the package name accurate to its contents. Moves preserve git history via git mv. Imports updated in Main, MainTest, and McpToolRegistrationTest. Spec and plan paths updated. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

Closes part of #30. Adds one or more fields per call via SolrJ's SchemaRequest.MultiUpdate. Input is List<Map<String, Object>> matching the Solr Schema API add-field JSON shape; validation is limited to collection name and non-empty list (Solr returns clear errors for malformed field defs). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>
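The limited validation described in this commit could look roughly like the sketch below. The class name, method name, exception type, and messages are assumptions; the source only states that the collection name and a non-empty list are checked, and that everything else is left to Solr.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the up-front checks; not the PR's actual code.
final class AddFieldsValidationSketch {

    static void validate(String collection, List<Map<String, Object>> fields) {
        // Rejects null, empty, and whitespace-only collection names.
        if (collection == null || collection.isBlank()) {
            throw new IllegalArgumentException("collection must not be blank");
        }
        // An empty batch has nothing to send to Solr.
        if (fields == null || fields.isEmpty()) {
            throw new IllegalArgumentException("fields must not be empty");
        }
        // Field contents are deliberately not inspected: Solr reports malformed definitions.
    }

    public static void main(String[] args) {
        validate("shows", List.of(Map.of("name", "title", "type", "text_general")));
        System.out.println("valid input accepted");
    }
}
```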

String.valueOf(null) returns the literal "null"; direct cast yields a real null on missing key, which makes the result honest about input shape (Solr's error surfaces before any result is returned). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>
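The difference this commit describes can be reproduced with plain JDK calls. The class and method names below are hypothetical; the behavior of `String.valueOf(Object)` and of a cast on a missing map key is standard Java.

```java
import java.util.Map;

// Illustrates why a cast is more honest than String.valueOf for a missing key.
final class NullNameSketch {

    // String.valueOf(Object) turns a missing value into the four-character text "null".
    static String nameViaValueOf(Map<String, Object> def) {
        return String.valueOf(def.get("name"));
    }

    // A direct cast preserves the real null for a missing key.
    static String nameViaCast(Map<String, Object> def) {
        return (String) def.get("name");
    }

    public static void main(String[] args) {
        Map<String, Object> noName = Map.of("type", "string");
        System.out.println("valueOf yields text: " + "null".equals(nameViaValueOf(noName)));
        System.out.println("cast yields null: " + (nameViaCast(noName) == null));
    }
}
```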

…elper Supports single analyzer, separate index/query analyzers, and non-analyzer field types like DenseVectorField. Manual conversion from flat input map to SolrJ FieldTypeDefinition because name/class go into attributes map and analyzers are typed sub-objects. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

…es test Aligns addFieldTypes_blankCollection_throws with the parallel addFields_blankCollection_throws test which already covers null + empty + whitespace. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

…flection Same pattern as the other @McpTool response records: invisible to AOT because MCP dispatches via Object. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

End-to-end against real Solr via Testcontainers. Verifies schema round-trip, custom analyzer behavior, vector field type registration, and error propagation on duplicate field / unknown type. SolrJ's MultiUpdate.process() throws natively on Schema API errors, so no explicit response-body inspection was needed in SchemaService. Note: SolrJ 10 moved SolrQuery to org.apache.solr.client.solrj.request.SolrQuery; vectorDimension attribute is returned as String by the schema API, handled via toString/parseInt. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>
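The toString/parseInt handling mentioned for vectorDimension can be sketched as follows. The helper name and the error handling for a missing attribute are assumptions; the point is that the same code reads the value whether it arrives as a String or as a number.

```java
import java.util.Map;

// Hypothetical helper for reading vectorDimension from field-type attributes.
final class VectorDimensionSketch {

    static int vectorDimension(Map<String, Object> attributes) {
        Object raw = attributes.get("vectorDimension");
        if (raw == null) {
            throw new IllegalArgumentException("vectorDimension is missing");
        }
        // toString() works for both "384" (String) and 384 (Integer).
        return Integer.parseInt(raw.toString());
    }

    public static void main(String[] args) {
        System.out.println(vectorDimension(Map.of("vectorDimension", "384")));
        System.out.println(vectorDimension(Map.of("vectorDimension", 384)));
    }
}
```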

Adds ordered tests 16-18 exercising the add-fields → index → search workflow through the MCP protocol against both HTTP and stdio transports. Also asserts add-fields and add-field-types appear in listTools output. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

…n tools Reflects the metadata→schema package rename and the new add-fields and add-field-types capabilities. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

…reflection GraalVM native test caught that Jackson's convertValue(map, AnalyzerDefinition.class) in SchemaService.toAnalyzerDefinition fails at runtime without reflection metadata. The spec anticipated this; adding both SolrJ types defensively. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

The get-schema MCP tool returns SolrJ's SchemaRepresentation, which Spring AI serializes to JSON for MCP clients. Without reflection metadata in the native image, the JSON Spring AI produces is missing the fields/fieldTypes/dynamicFields/copyFields arrays — silently breaking any consumer that introspects the schema. JVM tests didn't catch this because the pre-existing get-schema test only asserts the response is non-empty. The new end-to-end shows workflow in the next commit parses the schema JSON, which surfaced the gap. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

Adds 9 new ordered MCP-protocol tests (orders 19-27) covering the full workflow an LLM would drive against the shows-collection use case from issue #30:
1. create-collection → "shows"
2. add-fields → 16 user-defined fields (title, platform, genres, release_year, ..., imdb_rating, description, tags)
3. index-json-documents → 61 docs from src/test/resources/shows.json (Netflix, Prime Video, HBO Max, Disney+, Apple TV+, Hulu, Peacock, Paramount+)
4. search → numFound=61
5. search + facet → platform facet returns Netflix=20, Prime=20
6. search + filter → multi-valued genres:Sci-Fi
7. search + keyword → description:apocalyptic with platform filter
8. get-schema → all 16 added fields present
9. get-collection-stats → numDocs=61
Because the test base class is reused by both HTTP and stdio MCP client transports and is also compiled into the GraalVM native test binary, these tests exercise all four combinations:
- JVM + stdio (McpClientStdioIntegrationTest)
- JVM + HTTP (McpClientIntegrationTest)
- Native + stdio (via nativeTest -Pnative)
- Native + HTTP (via nativeTest -Pnative)
The shows collection inherits the same _default configset that prior tests in this class have already modified via schemaless indexing and add-fields against mcp-client-test. addShowsSchema() calls get-schema first to filter the desired field list down to the fields not already present in the shared configset's managed-schema, which is exactly what the add-fields tool description tells the LLM to do.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>
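The filtering step that addShowsSchema() performs, as described in this commit, might be sketched like this. The class and method names are hypothetical, and the set of existing names is assumed to have been extracted from a prior get-schema call.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: keep only field definitions the schema does not already contain.
final class MissingFieldsSketch {

    static List<Map<String, Object>> onlyMissing(
            List<Map<String, Object>> desired, Set<String> existingNames) {
        return desired.stream()
                .filter(def -> {
                    Object name = def.get("name");
                    // Definitions without a String name are kept so Solr can report the error.
                    return !(name instanceof String) || !existingNames.contains(name);
                })
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Object>> desired = List.of(
                Map.of("name", "title", "type", "text_general"),
                Map.of("name", "platform", "type", "string"));
        // Assume get-schema reported that "title" already exists.
        System.out.println(onlyMissing(desired, Set.of("title")));
    }
}
```

Filtering first matters because the shared configset may already hold some of the fields, and adding a field that exists is one of the error cases the integration tests cover.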

…keys Two correctness fixes flagged by the Solr-expert review of PR apache#131:
1. MultiUpdate IS transactional, not sequential
Per the Solr Schema API reference guide and SolrJ's SchemaRequest.MultiUpdate Javadoc, all commands in a single call either succeed or fail together. The previous tool descriptions wrongly told the LLM "commands run in input order; if one fails mid-batch, prior commands remain applied" / "partial application possible on failure", which would lead an LLM to issue partial-recovery commands that double-apply on retry. Updated both add-fields and add-field-types descriptions and the spec doc to say "Solr's Schema API is transactional — if any command in the batch fails, none are applied."
2. toAnalyzerDefinition silently dropped unknown analyzer keys
SolrJ's AnalyzerDefinition only exposes typed setters for charFilters, tokenizer, and filters. A naive objectMapper.convertValue(raw, AnalyzerDefinition.class) silently drops every other top-level analyzer key (class, luceneMatchVersion, positionIncrementGap, ...). This broke the valid single-class analyzer form {"analyzer":{"class":"solr.WhitespaceAnalyzer"}} which the Solr Ref Guide documents for the StandardAnalyzer / WhitespaceAnalyzer / KeywordAnalyzer / per-language analyzer (ArabicAnalyzer etc.) patterns. Rewrote the helper to manually split the map: charFilters, tokenizer, filters go through the typed setters; everything else is preserved via setAttributes(). Added a unit-test regression guard that serializes the captured MultiUpdate's wire body and asserts the analyzer-level class key is present.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>
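The manual split in the second fix can be illustrated without SolrJ. In the sketch below, `SplitAnalyzer` is a hypothetical stand-in for SolrJ's AnalyzerDefinition, and the method name is an assumption; only the partitioning rule (three typed keys, everything else kept as attributes) comes from the commit message.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of splitting a raw analyzer map; not the PR's actual helper.
final class AnalyzerSplitSketch {

    // Stand-in for AnalyzerDefinition: typed parts plus a bag of remaining attributes.
    record SplitAnalyzer(Object charFilters, Object tokenizer, Object filters,
                         Map<String, Object> attributes) {}

    private static final Set<String> TYPED_KEYS = Set.of("charFilters", "tokenizer", "filters");

    static SplitAnalyzer split(Map<String, Object> raw) {
        Map<String, Object> attributes = new LinkedHashMap<>();
        raw.forEach((key, value) -> {
            // Anything without a typed slot (class, luceneMatchVersion, ...) is preserved here.
            if (!TYPED_KEYS.contains(key)) {
                attributes.put(key, value);
            }
        });
        return new SplitAnalyzer(
                raw.get("charFilters"), raw.get("tokenizer"), raw.get("filters"), attributes);
    }

    public static void main(String[] args) {
        // The single-class analyzer form keeps its class key.
        System.out.println(split(Map.of("class", "solr.WhitespaceAnalyzer")));
        // A tokenizer/filter chain lands in the typed slots.
        System.out.println(split(Map.of(
                "tokenizer", Map.of("class", "solr.StandardTokenizerFactory"),
                "filters", List.of(Map.of("class", "solr.LowerCaseFilterFactory")))));
    }
}
```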

Apply the same drop-noise treatment to SchemaUpdateResult that PR apache#132 applies to Dtos.java records:
- success (boolean): always true on return; failures throw before reaching the result.
- timestamp (Date): always "now"; the operation is sub-second, and the MCP host already records call timing.
After this change, SchemaUpdateResult is just (collection, addedNames): both fields carry real information. addedNames echoes the field names back so the LLM can confirm what landed. The java.util.Date and com.fasterxml.jackson.annotation.JsonFormat imports are dropped from the record entirely. Test assertions referencing the dropped fields are removed.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>
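A slimmed-down result of this shape might look like the sketch below. The component types (a String and a List of Strings) and the `resultFor` helper are assumptions; the source only names the two components, collection and addedNames.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the two-component result; types are assumed.
final class SchemaUpdateResultSketch {

    record SchemaUpdateResult(String collection, List<String> addedNames) {}

    // Echoes the names from the submitted definitions back to the caller.
    static SchemaUpdateResult resultFor(String collection, List<Map<String, Object>> definitions) {
        List<String> names = new ArrayList<>();
        for (Map<String, Object> def : definitions) {
            // Direct cast: a missing name stays a real null instead of the text "null".
            names.add((String) def.get("name"));
        }
        return new SchemaUpdateResult(collection, Collections.unmodifiableList(names));
    }

    public static void main(String[] args) {
        System.out.println(resultFor("shows", List.of(
                Map.of("name", "title", "type", "text_general"),
                Map.of("name", "platform", "type", "string"))));
    }
}
```

With no success flag, a caller treats a returned value as success and an exception as failure, which matches the reasoning in the commit message.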

…safety
README:
- Add `add-fields` and `add-field-types` to the Tools table (PR apache#131).
- Add a sentence on MCP behavior hints (`readOnlyHint`, `destructiveHint`, `idempotentHint`) so client integrators know to read them (PR apache#134).
- Add an "MCP Prompts" section listing the six workflow prompts introduced in the prompts PR.
- Clarify that `solr://{collection}/schema` autocompletion is prefix-filtered, case-insensitive, and capped at 100, describing the behavior shipped with the completion PR.
CONTRIBUTING:
- Add a "Null safety" subsection covering the project-wide `@NullMarked` contract and NullAway enforcement (PR apache#133).
These docs land alongside the corresponding feature PRs at apache/main.
Signed-off-by: adityamparikh <[email protected]>

adityamparikh and others added 17 commits May 17, 2026 01:20

feat(metadata): add SchemaUpdateResult record for schema modification…

2049a1c

… tools Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

docs: document add-fields and add-field-types MCP tools in README

87ca85d

Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]> Signed-off-by: adityamparikh <[email protected]>

adityamparikh mentioned this pull request May 18, 2026

refactor(dtos): drop noise fields from MCP tool response records #132

Open

4 tasks

This was referenced May 19, 2026

Add schema and configuration modification tools #30

Open

docs(mcp): add add-fields and add-field-types to Schema tool table adityamparikh/solr-site#1

Open

adityamparikh wants to merge 18 commits into apache:main from adityamparikh:schema-modification
