
Add Sglang deployment instructions #48

Merged
alay2shah merged 5 commits into Liquid4All:main from vincentzed:vz/sglang
Feb 17, 2026

Conversation

@vincentzed
Contributor

No description provided.

@vincentzed
Contributor Author

cc @tugot17


* `--chunked-prefill-size -1`: Disables chunked prefill for lower latency

### Ultra Low Latency on Blackwell (B300)
Contributor


I think this could be a separate section overall: low latency, and all of the flags.
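For context, a dedicated low-latency section might combine the flag under discussion with the launch invocation used elsewhere in this PR, roughly like this sketch (the exact flag set is an assumption, not the PR's final command):

```shell
# Sketch: low-latency launch, assuming the flag discussed in this thread.
# --chunked-prefill-size -1 disables chunked prefill for lower latency.
python3 -m sglang.launch_server \
  --model LiquidAI/LFM2.5-1.2B-Instruct \
  --host 0.0.0.0 \
  --port 30000 \
  --chunked-prefill-size -1
```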


For more details on tool parsing configuration, see the [SGLang Tool Parser documentation](https://docs.sglang.io/advanced_features/tool_parser.html).
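To illustrate what the tool-parser setup enables, here is a sketch of an OpenAI-compatible tool-calling request payload for a local SGLang server started with `--tool-call-parser lfm2`. The server URL and the `get_weather` tool are assumptions for illustration, not part of the PR:

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Request body for POST http://localhost:30000/v1/chat/completions
# (assumes the server from the launch command above is running).
payload = {
    "model": "LiquidAI/LFM2.5-1.2B-Instruct",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
}
body = json.dumps(payload)
```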

## Vision Models
Contributor


This is not yet officially supported (not merged into sglang); we should drop it.

Contributor Author


We can just merge this PR when it is merged. Thoughts?

Contributor


Let's drop the vision models for now, please. We will do this update step by step, and merging the vision models will take a while.

Signed-off-by: vincentzed <[email protected]>

WIP

Signed-off-by: vincentzed <[email protected]>
Signed-off-by: vincentzed <[email protected]>
@vincentzed
Contributor Author

Changes as requested, done in b8dacc3.

@alay2shah
Contributor

I'll take a look at this and make sure the formatting is consistent with the rest of the docs.

- Add supported models table (dense, MoE coming in 0.5.9, vision not yet)
- Add install-from-main instructions for MoE support
- Consolidate launch command with --tool-call-parser lfm2 by default
- Move Docker under Launching the Server section
- Replace verbose chat examples with concise curl + Python tool calling
- Simplify low latency section with key metrics only
- Fold precision info into a Note
Contributor

@tugot17 left a comment


Added my changes; now I think it is fine and we can merge this.



```shell
python3 -m sglang.launch_server \
  --model LiquidAI/LFM2.5-1.2B-Instruct \
  --host 0.0.0.0 \
  --port 30000
```
Contributor


`--tool-call-parser lfm2`

Let's add this here, add an example tool call in the Chat completions section, and drop the separate Tool Calling section?
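The suggested change would make the launch command look roughly like this sketch (flag placement is an assumption; only the flags shown appear in this PR):

```shell
# Sketch: launch with the LFM2 tool-call parser enabled, per the review suggestion.
python3 -m sglang.launch_server \
  --model LiquidAI/LFM2.5-1.2B-Instruct \
  --host 0.0.0.0 \
  --port 30000 \
  --tool-call-parser lfm2
```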

@tugot17
Contributor

tugot17 commented Feb 17, 2026

@Paulescu could you give it a "go"?

- Add Tip callout at top with concise use-case summary
- Use Tabs (Python/Docker) for server launch section
- Show Python example directly under Usage, curl in Accordion
- Trim redundant pip/uv install lines from MoE install

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@alay2shah alay2shah merged commit 3031eac into Liquid4All:main Feb 17, 2026
2 checks passed