[Core] Add support for draft models for speculative decoding by TJ5 · Pull Request #535 · ome-projects/ome

TJ5 · 2026-03-05T07:44:00Z

What this PR does

Adds a draft model to the inference service spec for use in speculative decoding. Also adds configurations for using the feature with nvidia/gpt-oss-120b-eagle3-long-context.

Why we need it

Speculative decoding increases inference speed, which is useful for long-running agentic tasks. OME previously did not support deploying draft models, which are required for speculative decoding.

How to test

go test ./pkg/runtimeselector/...

Checklist

Tests added/updated (if applicable)
Docs updated (if applicable)
make test passes locally

gemini-code-assist · 2026-03-05T07:44:03Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

TJ5 added 9 commits March 4, 2026 13:28

init

f70b5ca

sample configs

0bc62d5

list draftmodel details

5087804

draft model size

1b56af4

addendum

e7a8020

revert basemodel to model

5a5bb85

add supporteddraftmodelformat

918d9df

supportedDraftModelFormats

c1a7072

revert

a1577e4

TJ5 requested review from CatherineSue, XinyueZhang369, YouNeedCryDear, beiguo218, pallasathena92 and slin1237 as code owners March 5, 2026 07:44

TJ5 changed the title ~~[Core] Add support for draft modles for speculative decoding~~ [Core] Add support for draft models for speculative decoding Mar 5, 2026

TJ5 marked this pull request as draft March 5, 2026 17:37

TJ5 wants to merge 9 commits into ome-projects:main from TJ5:spec-decoding
