@@ -100,7 +100,7 @@ For GPU-accelerated inference at scale, consider using [vLLM](/docs/inference/vl
## Downloading GGUF Models
-llama.cpp uses the GGUF format, which stores quantized model weights for efficient inference. All LFM models are available in GGUF format on Hugging Face. See the [Models page](/docs/models/complete-library) for all available GGUF models.
+llama.cpp uses the GGUF format, which stores quantized model weights for efficient inference. All LFM models are available in GGUF format on Hugging Face. See the [Models page](/lfm/models/complete-library) for all available GGUF models.
You can download LFM models in GGUF format from Hugging Face as follows:
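For a programmatic alternative to the CLI, the download step can be sketched in Python with `huggingface_hub` (the repo id and `Q4_K_M` quantization below are illustrative assumptions; substitute the model and quant level you actually want):

```python
def gguf_filename(model: str, quant: str) -> str:
    """Build the conventional '<model>-<quant>.gguf' filename used by GGUF repos."""
    return f"{model}-{quant}.gguf"


if __name__ == "__main__":
    from huggingface_hub import hf_hub_download

    # Assumed example repo; any LFM *-GGUF repo follows the same layout.
    repo_id = "LiquidAI/LFM2.5-1.2B-Instruct-GGUF"
    filename = gguf_filename("LFM2.5-1.2B-Instruct", "Q4_K_M")

    # Downloads into the local Hugging Face cache and returns the file path.
    path = hf_hub_download(repo_id=repo_id, filename=filename)
    print(path)
```

The returned path can be passed directly to `llama-cli -m <path>` or any other llama.cpp tool.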
diff --git a/docs/inference/lm-studio.mdx b/deployment/on-device/lm-studio.mdx
similarity index 98%
rename from docs/inference/lm-studio.mdx
rename to deployment/on-device/lm-studio.mdx
index 19ea48a..e961050 100644
--- a/docs/inference/lm-studio.mdx
+++ b/deployment/on-device/lm-studio.mdx
@@ -18,7 +18,7 @@ Download and install LM Studio directly from [lmstudio.ai](https://lmstudio.ai/d
3. Select a model and quantization level (`Q4_K_M` recommended)
4. Click **Download**
-See the [Models page](/docs/models/complete-library) for all available GGUF models.
+See the [Models page](/lfm/models/complete-library) for all available GGUF models.
## Using the Chat Interface
diff --git a/docs/inference/mlx.mdx b/deployment/on-device/mlx.mdx
similarity index 95%
rename from docs/inference/mlx.mdx
rename to deployment/on-device/mlx.mdx
index 199f001..35b0bfa 100644
--- a/docs/inference/mlx.mdx
+++ b/deployment/on-device/mlx.mdx
@@ -21,7 +21,7 @@ pip install mlx-lm
The `mlx-lm` package provides a simple interface for text generation with MLX models.
-See the [Models page](/docs/models/complete-library) for all available MLX models, or browse MLX community models at [mlx-community LFM2 models](https://huggingface.co/models?sort=created&search=mlx-communityLFM2).
+See the [Models page](/lfm/models/complete-library) for all available MLX models, or browse MLX community models at [mlx-community LFM2 models](https://huggingface.co/models?sort=created&search=mlx-communityLFM2).
```python
from mlx_lm import load, generate
diff --git a/docs/inference/ollama.mdx b/deployment/on-device/ollama.mdx
similarity index 98%
rename from docs/inference/ollama.mdx
rename to deployment/on-device/ollama.mdx
index a149954..65939f7 100644
--- a/docs/inference/ollama.mdx
+++ b/deployment/on-device/ollama.mdx
@@ -68,7 +68,7 @@ You can run LFM2 models directly from Hugging Face:
ollama run hf.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF
```
-See the [Models page](/docs/models/complete-library) for all available GGUF repositories.
+See the [Models page](/lfm/models/complete-library) for all available GGUF repositories.
To use a local GGUF file, first download a model from Hugging Face:
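One way to fetch only the weights you need for a local Ollama setup is `snapshot_download` with a glob filter, so other quantization levels in the repo are skipped (repo id and quant level here are illustrative assumptions):

```python
def quant_pattern(quant: str) -> str:
    """Glob pattern matching only the GGUF files for one quantization level."""
    return f"*{quant}*.gguf"


if __name__ == "__main__":
    from huggingface_hub import snapshot_download

    # Assumed example repo; adjust to the model you want to run locally.
    local_dir = snapshot_download(
        repo_id="LiquidAI/LFM2.5-1.2B-Instruct-GGUF",
        allow_patterns=[quant_pattern("Q4_K_M")],  # skip other quant variants
    )
    print(local_dir)
```

The downloaded `.gguf` file can then be referenced from an Ollama Modelfile via its `FROM` directive.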
diff --git a/docs/inference/onnx.mdx b/deployment/on-device/onnx.mdx
similarity index 98%
rename from docs/inference/onnx.mdx
rename to deployment/on-device/onnx.mdx
index 7d363a9..9cf5cf7 100644
--- a/docs/inference/onnx.mdx
+++ b/deployment/on-device/onnx.mdx
@@ -72,7 +72,7 @@ For complete documentation and advanced options, see the [LiquidONNX GitHub repo
## Pre-exported Models
-Many LFM models are available as pre-exported ONNX packages from [LiquidAI](https://huggingface.co/LiquidAI/models?search=onnx) and the [onnx-community](https://huggingface.co/onnx-community). Check the [Model Library](/docs/models/complete-library) for a complete list of available formats.
+Many LFM models are available as pre-exported ONNX packages from [LiquidAI](https://huggingface.co/LiquidAI/models?search=onnx) and the [onnx-community](https://huggingface.co/onnx-community). Check the [Model Library](/lfm/models/complete-library) for a complete list of available formats.
### Quantization Options
diff --git a/leap/leap-bundle/authentication.mdx b/deployment/tools/model-bundling/authentication.mdx
similarity index 100%
rename from leap/leap-bundle/authentication.mdx
rename to deployment/tools/model-bundling/authentication.mdx
diff --git a/leap/leap-bundle/bundle-creation.mdx b/deployment/tools/model-bundling/bundle-creation.mdx
similarity index 100%
rename from leap/leap-bundle/bundle-creation.mdx
rename to deployment/tools/model-bundling/bundle-creation.mdx
diff --git a/leap/leap-bundle/bundle-management.mdx b/deployment/tools/model-bundling/bundle-management.mdx
similarity index 100%
rename from leap/leap-bundle/bundle-management.mdx
rename to deployment/tools/model-bundling/bundle-management.mdx
diff --git a/leap/leap-bundle/changelog.mdx b/deployment/tools/model-bundling/changelog.mdx
similarity index 100%
rename from leap/leap-bundle/changelog.mdx
rename to deployment/tools/model-bundling/changelog.mdx
diff --git a/leap/leap-bundle/configuration.mdx b/deployment/tools/model-bundling/configuration.mdx
similarity index 100%
rename from leap/leap-bundle/configuration.mdx
rename to deployment/tools/model-bundling/configuration.mdx
diff --git a/leap/leap-bundle/data-privacy.mdx b/deployment/tools/model-bundling/data-privacy.mdx
similarity index 100%
rename from leap/leap-bundle/data-privacy.mdx
rename to deployment/tools/model-bundling/data-privacy.mdx
diff --git a/leap/leap-bundle/download.mdx b/deployment/tools/model-bundling/download.mdx
similarity index 100%
rename from leap/leap-bundle/download.mdx
rename to deployment/tools/model-bundling/download.mdx
diff --git a/leap/leap-bundle/quick-start.mdx b/deployment/tools/model-bundling/quick-start.mdx
similarity index 96%
rename from leap/leap-bundle/quick-start.mdx
rename to deployment/tools/model-bundling/quick-start.mdx
index f9a71a4..4710156 100644
--- a/leap/leap-bundle/quick-start.mdx
+++ b/deployment/tools/model-bundling/quick-start.mdx
@@ -69,7 +69,7 @@ If model uploads fail with connectivity errors, verify that your network allows
3. Select the [`API keys` tab](https://leap.liquid.ai/profile#/api-keys) and create a new API key
- 
+ 
4. Authenticate the Model Bundling Service with your API token:
@@ -228,4 +228,4 @@ If model uploads fail with connectivity errors, verify that your network allows
## Next Steps
* Visit the [LEAP Model Library](https://leap.liquid.ai/models) to explore available models.
-* Check the [Bundle Creation](/leap/leap-bundle/bundle-creation) page for detailed command reference.
+* Check the [Bundle Creation](/deployment/tools/model-bundling/bundle-creation) page for detailed command reference.
diff --git a/leap/leap-bundle/reference.mdx b/deployment/tools/model-bundling/reference.mdx
similarity index 100%
rename from leap/leap-bundle/reference.mdx
rename to deployment/tools/model-bundling/reference.mdx
diff --git a/docs.json b/docs.json
index 67a08f3..71c6ad4 100644
--- a/docs.json
+++ b/docs.json
@@ -1,7 +1,7 @@
{
"$schema": "https://mintlify.com/docs.json",
"banner": {
- "content": "🎉 New: LFM2.5-1.2B-Instruct and LFM2.5-1.2B-Thinking are now available! [Learn more →](/docs/models/text-models)",
+ "content": "🎉 New: LFM2.5-1.2B-Instruct and LFM2.5-1.2B-Thinking are now available! [Learn more →](/lfm/models/text-models)",
"dismissible": true
},
"theme": "mint",
@@ -36,7 +36,7 @@
"logo": {
"light": "/logo/light.svg",
"dark": "/logo/dark.svg",
- "href": "/docs/getting-started/welcome"
+ "href": "/lfm/getting-started/welcome"
},
"navbar": {
"links": [
@@ -54,136 +54,161 @@
"navigation": {
"tabs": [
{
- "tab": "Documentation",
+ "tab": "LFM",
"groups": [
{
- "group": "Get Started",
+ "group": "Getting Started",
"icon": "rocket",
"pages": [
- "docs/getting-started/welcome",
- "docs/getting-started/connect-ai-tools"
+ "lfm/getting-started/welcome",
+ "lfm/getting-started/connect-ai-tools"
]
},
{
"group": "Models",
"icon": "brain",
"pages": [
- "docs/models/complete-library",
- "docs/models/text-models",
- "docs/models/vision-models",
- "docs/models/audio-models",
- "docs/models/liquid-nanos"
+ "lfm/models/complete-library",
+ "lfm/models/text-models",
+ "lfm/models/vision-models",
+ "lfm/models/audio-models",
+ "lfm/models/liquid-nanos"
]
},
{
"group": "Key Concepts",
"icon": "lightbulb",
"pages": [
- "docs/key-concepts/chat-template",
- "docs/key-concepts/text-generation-and-prompting",
- "docs/key-concepts/tool-use"
+ "lfm/key-concepts/chat-template",
+ "lfm/key-concepts/text-generation-and-prompting",
+ "lfm/key-concepts/tool-use"
]
},
{
- "group": "Inference",
- "icon": "play",
+ "group": "Help",
+ "icon": "book",
"pages": [
- "docs/inference/transformers",
- "docs/inference/llama-cpp",
- "docs/inference/vllm",
- "docs/inference/sglang",
- "docs/inference/mlx",
- "docs/inference/ollama",
- "docs/inference/onnx",
- {
- "group": "Other Frameworks",
- "icon": "server",
- "pages": [
- "docs/inference/lm-studio",
- "docs/inference/modal-deployment",
- "docs/inference/baseten-deployment",
- "docs/inference/fal-deployment"
- ]
- }
+ "lfm/help/faqs",
+ "lfm/help/troubleshooting",
+ "lfm/help/contributing"
+ ]
+ }
+ ]
+ },
+ {
+ "tab": "Customization",
+ "groups": [
+ {
+ "group": "Getting Started",
+ "icon": "rocket",
+ "pages": [
+ "customization/getting-started/welcome",
+ "customization/getting-started/connect-ai-tools"
]
},
{
- "group": "Fine-tuning",
- "icon": "sliders",
+ "group": "Tools",
+ "icon": "wrench",
"pages": [
- "docs/fine-tuning/workbench",
- "docs/fine-tuning/datasets",
- "docs/fine-tuning/trl",
- "docs/fine-tuning/unsloth"
+ "customization/tools/workbench"
]
},
{
- "group": "Help",
- "icon": "book",
+ "group": "Finetuning Frameworks",
+ "icon": "sliders",
"pages": [
- "docs/help/faqs",
- "docs/help/troubleshooting",
- "docs/help/contributing"
+ "customization/finetuning-frameworks/datasets",
+ "customization/finetuning-frameworks/trl",
+ "customization/finetuning-frameworks/unsloth"
]
}
]
},
{
- "tab": "SDK Reference",
+ "tab": "Deployment",
"groups": [
{
- "group": "Get Started",
+ "group": "Getting Started",
"icon": "rocket",
"pages": [
- "leap/edge-sdk/overview",
- "docs/getting-started/connect-ai-tools"
+ "deployment/getting-started/welcome",
+ "deployment/getting-started/connect-ai-tools"
]
},
{
- "group": "iOS",
- "icon": "apple",
+ "group": "On-Device",
+ "icon": "mobile",
"pages": [
- "leap/edge-sdk/ios/ios-quick-start-guide",
- "leap/edge-sdk/ios/ai-agent-usage-guide",
- "leap/edge-sdk/ios/model-loading",
- "leap/edge-sdk/ios/conversation-generation",
- "leap/edge-sdk/ios/messages-content",
- "leap/edge-sdk/ios/advanced-features",
- "leap/edge-sdk/ios/utilities",
- "leap/edge-sdk/ios/cloud-ai-comparison",
- "leap/edge-sdk/ios/constrained-generation",
- "leap/edge-sdk/ios/function-calling"
+ {
+ "group": "iOS SDK",
+ "icon": "apple",
+ "pages": [
+ "deployment/on-device/ios/ios-quick-start-guide",
+ "deployment/on-device/ios/ai-agent-usage-guide",
+ "deployment/on-device/ios/model-loading",
+ "deployment/on-device/ios/conversation-generation",
+ "deployment/on-device/ios/messages-content",
+ "deployment/on-device/ios/advanced-features",
+ "deployment/on-device/ios/utilities",
+ "deployment/on-device/ios/cloud-ai-comparison",
+ "deployment/on-device/ios/constrained-generation",
+ "deployment/on-device/ios/function-calling"
+ ]
+ },
+ {
+ "group": "Android SDK",
+ "icon": "robot",
+ "pages": [
+ "deployment/on-device/android/android-quick-start-guide",
+ "deployment/on-device/android/ai-agent-usage-guide",
+ "deployment/on-device/android/model-loading",
+ "deployment/on-device/android/conversation-generation",
+ "deployment/on-device/android/messages-content",
+ "deployment/on-device/android/advanced-features",
+ "deployment/on-device/android/utilities",
+ "deployment/on-device/android/cloud-ai-comparison",
+ "deployment/on-device/android/constrained-generation",
+ "deployment/on-device/android/function-calling"
+ ]
+ },
+ "deployment/on-device/llama-cpp",
+ "deployment/on-device/lm-studio",
+ "deployment/on-device/mlx",
+ "deployment/on-device/onnx",
+ "deployment/on-device/ollama"
]
},
{
- "group": "Android",
- "icon": "robot",
+ "group": "GPU Inference",
+ "icon": "microchip",
"pages": [
- "leap/edge-sdk/android/android-quick-start-guide",
- "leap/edge-sdk/android/ai-agent-usage-guide",
- "leap/edge-sdk/android/model-loading",
- "leap/edge-sdk/android/conversation-generation",
- "leap/edge-sdk/android/messages-content",
- "leap/edge-sdk/android/advanced-features",
- "leap/edge-sdk/android/utilities",
- "leap/edge-sdk/android/cloud-ai-comparison",
- "leap/edge-sdk/android/constrained-generation",
- "leap/edge-sdk/android/function-calling"
+ "deployment/gpu-inference/transformers",
+ "deployment/gpu-inference/vllm",
+ "deployment/gpu-inference/sglang",
+ "deployment/gpu-inference/modal",
+ "deployment/gpu-inference/baseten",
+ "deployment/gpu-inference/fal"
]
},
{
- "group": "Model Bundling Service",
- "icon": "box",
+ "group": "Tools",
+ "icon": "toolbox",
"pages": [
- "leap/leap-bundle/quick-start",
- "leap/leap-bundle/authentication",
- "leap/leap-bundle/configuration",
- "leap/leap-bundle/bundle-creation",
- "leap/leap-bundle/bundle-management",
- "leap/leap-bundle/download",
- "leap/leap-bundle/reference",
- "leap/leap-bundle/data-privacy",
- "leap/leap-bundle/changelog"
+ {
+ "group": "Model Bundling Services",
+ "icon": "box",
+ "pages": [
+ "deployment/tools/model-bundling/quick-start",
+ "deployment/tools/model-bundling/authentication",
+ "deployment/tools/model-bundling/configuration",
+ "deployment/tools/model-bundling/bundle-creation",
+ "deployment/tools/model-bundling/bundle-management",
+ "deployment/tools/model-bundling/download",
+ "deployment/tools/model-bundling/reference",
+ "deployment/tools/model-bundling/data-privacy",
+ "deployment/tools/model-bundling/changelog"
+ ]
+ }
]
}
]
@@ -192,11 +217,11 @@
"tab": "Examples",
"groups": [
{
- "group": "Get Started",
+ "group": "Getting Started",
"icon": "rocket",
"pages": [
"examples/index",
- "docs/getting-started/connect-ai-tools"
+ "examples/connect-ai-tools"
]
},
{
@@ -244,8 +269,100 @@
},
"redirects": [
{
- "source": "/lfm/:slug*",
- "destination": "/docs/:slug*"
+ "source": "/docs/getting-started/welcome",
+ "destination": "/lfm/getting-started/welcome"
+ },
+ {
+ "source": "/docs/getting-started/connect-ai-tools",
+ "destination": "/lfm/getting-started/connect-ai-tools"
+ },
+ {
+ "source": "/docs/models/:slug*",
+ "destination": "/lfm/models/:slug*"
+ },
+ {
+ "source": "/docs/key-concepts/:slug*",
+ "destination": "/lfm/key-concepts/:slug*"
+ },
+ {
+ "source": "/docs/help/:slug*",
+ "destination": "/lfm/help/:slug*"
+ },
+ {
+ "source": "/docs/fine-tuning/workbench",
+ "destination": "/customization/tools/workbench"
+ },
+ {
+ "source": "/docs/fine-tuning/datasets",
+ "destination": "/customization/finetuning-frameworks/datasets"
+ },
+ {
+ "source": "/docs/fine-tuning/trl",
+ "destination": "/customization/finetuning-frameworks/trl"
+ },
+ {
+ "source": "/docs/fine-tuning/unsloth",
+ "destination": "/customization/finetuning-frameworks/unsloth"
+ },
+ {
+ "source": "/docs/inference/llama-cpp",
+ "destination": "/deployment/on-device/llama-cpp"
+ },
+ {
+ "source": "/docs/inference/mlx",
+ "destination": "/deployment/on-device/mlx"
+ },
+ {
+ "source": "/docs/inference/onnx",
+ "destination": "/deployment/on-device/onnx"
+ },
+ {
+ "source": "/docs/inference/ollama",
+ "destination": "/deployment/on-device/ollama"
+ },
+ {
+ "source": "/docs/inference/lm-studio",
+ "destination": "/deployment/on-device/lm-studio"
+ },
+ {
+ "source": "/docs/inference/transformers",
+ "destination": "/deployment/gpu-inference/transformers"
+ },
+ {
+ "source": "/docs/inference/vllm",
+ "destination": "/deployment/gpu-inference/vllm"
+ },
+ {
+ "source": "/docs/inference/sglang",
+ "destination": "/deployment/gpu-inference/sglang"
+ },
+ {
+ "source": "/docs/inference/modal-deployment",
+ "destination": "/deployment/gpu-inference/modal"
+ },
+ {
+ "source": "/docs/inference/baseten-deployment",
+ "destination": "/deployment/gpu-inference/baseten"
+ },
+ {
+ "source": "/docs/inference/fal-deployment",
+ "destination": "/deployment/gpu-inference/fal"
+ },
+ {
+ "source": "/leap/edge-sdk/overview",
+ "destination": "/deployment/on-device/ios/ios-quick-start-guide"
+ },
+ {
+ "source": "/leap/edge-sdk/ios/:slug*",
+ "destination": "/deployment/on-device/ios/:slug*"
+ },
+ {
+ "source": "/leap/edge-sdk/android/:slug*",
+ "destination": "/deployment/on-device/android/:slug*"
+ },
+ {
+ "source": "/leap/leap-bundle/:slug*",
+ "destination": "/deployment/tools/model-bundling/:slug*"
}
],
"ai": {
diff --git a/docs/models/complete-library.mdx b/docs/models/complete-library.mdx
deleted file mode 100644
index a78bd7e..0000000
--- a/docs/models/complete-library.mdx
+++ /dev/null
@@ -1,94 +0,0 @@
----
-title: "Model Library"
-description: "Liquid Foundation Models (LFMs) are a new class of multimodal architectures built for fast inference and on-device deployment. Browse all available models and formats here."
----
-
-
-
-All of our models share the following capabilities:
-
-- 32k token context length for extended conversations and document processing
-- Designed for fast inference with [Transformers](/docs/inference/transformers), [llama.cpp](/docs/inference/llama-cpp), [vLLM](/docs/inference/vllm), [MLX](/docs/inference/mlx), [Ollama](/docs/inference/ollama), and [LEAP](/docs/frameworks/leap)
-- Trainable via SFT, DPO, and GRPO with [TRL](/docs/fine-tuning/trl) and [Unsloth](/docs/fine-tuning/unsloth)
-
-
-
-## Model Families
-
-Choose a model based on your desired functionalities. Each individual model card has specific details on deployment and customization.
-
-
-
-
- Chat, tool calling, structured output, and classification.
-
-
-
- Image understanding with LFM backbones and custom encoders.
-
-
-
- Interleaved audio/text models for TTS, ASR, and voice chat.
-
-
-
- Task-specific models for extraction, summarization, RAG, and translation.
-
-
-
-
-## Model Formats
-
-All LFM2 models are available in multiple formats for flexible deployment:
-
-- **GGUF** β Best for local CPU/GPU inference on any platform. Use with [llama.cpp](/docs/inference/llama-cpp), [LM Studio](/docs/inference/lm-studio), or [Ollama](/docs/inference/ollama). Append `-GGUF` to any model name.
-- **MLX** β Best for Mac users with Apple Silicon. Leverages unified memory for fast inference via [MLX](/docs/inference/mlx). Browse at [mlx-community](https://huggingface.co/mlx-community/collections?search=LFM).
-- **ONNX** β Best for production deployments and edge devices. Cross-platform with ONNX Runtime across CPUs, GPUs, and accelerators. Append `-ONNX` to any model name.
-
-### Quantization
-
-Quantization reduces model size and speeds up inference with minimal quality loss. Available options by format:
-
-- **GGUF** β Supports `Q4_0`, `Q4_K_M`, `Q5_K_M`, `Q6_K`, `Q8_0`, `BF16`, and `F16`. `Q4_K_M` offers the best balance of size and quality.
-- **MLX** β Available in `3bit`, `4bit`, `5bit`, `6bit`, `8bit`, and `BF16`. `8bit` is recommended.
-- **ONNX** β Supports `FP32`, `FP16`, `Q4`, and `Q8` (MoE models also support `Q4F16`). `Q4` is recommended for most deployments.
-
-## Model Chart
-
-| Model | HF | GGUF | MLX | ONNX | Trainable? |
-| ----- | -- | ---- | --- | ---- | ---------- |
-| **Text-to-text Models** | | | | | |
-| LFM2.5 Models (Latest Release) | | | | | |
-| [LFM2.5-1.2B-Instruct](/docs/models/lfm25-1.2b-instruct) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-MLX-8bit) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-ONNX) | Yes (TRL) |
-| [LFM2.5-1.2B-Thinking](/docs/models/lfm25-1.2b-thinking) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking-GGUF) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking-MLX-8bit) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking-ONNX) | Yes (TRL) |
-| [LFM2.5-1.2B-Base](/docs/models/lfm25-1.2b-base) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base-GGUF) | ❌ | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base-ONNX) | Yes (TRL) |
-| [LFM2.5-1.2B-JP](/docs/models/lfm25-1.2b-jp) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-GGUF) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-MLX-8bit) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-ONNX) | Yes (TRL) |
-| LFM2 Models | | | | | |
-| [LFM2-8B-A1B](/docs/models/lfm2-8b-a1b) | [✅](https://huggingface.co/LiquidAI/LFM2-8B-A1B) | [✅](https://huggingface.co/LiquidAI/LFM2-8B-A1B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-8B-A1B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-8B-A1B-ONNX) | Yes (TRL) |
-| [LFM2-2.6B](/docs/models/lfm2-2.6b) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-2.6B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-2.6B-ONNX) | Yes (TRL) |
-| [LFM2-2.6B-Exp](/docs/models/lfm2-2.6b-exp) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-Exp) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-Exp-GGUF) | ❌ | ❌ | Yes (TRL) |
-| [LFM2-1.2B](/docs/models/lfm2-1.2b) Deprecated | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-1.2B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-1.2B-ONNX) | Yes (TRL) |
-| [LFM2-700M](/docs/models/lfm2-700m) | [✅](https://huggingface.co/LiquidAI/LFM2-700M) | [✅](https://huggingface.co/LiquidAI/LFM2-700M-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-700M-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-700M-ONNX) | Yes (TRL) |
-| [LFM2-350M](/docs/models/lfm2-350m) | [✅](https://huggingface.co/LiquidAI/LFM2-350M) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-350M-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-350M-ONNX) | Yes (TRL) |
-| **Vision Language Models** | | | | | |
-| LFM2.5 Models (Latest Release) | | | | | |
-| [LFM2.5-VL-1.6B](/docs/models/lfm25-vl-1.6b) | [✅](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B) | [✅](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2.5-VL-1.6B-8bit) | [✅](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B-ONNX) | Yes (TRL) |
-| LFM2 Models | | | | | |
-| [LFM2-VL-3B](/docs/models/lfm2-vl-3b) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-3B) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-3B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-VL-3B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-VL-3B-ONNX) | Yes (TRL) |
-| [LFM2-VL-1.6B](/docs/models/lfm2-vl-1.6b) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-1.6B) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-1.6B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-VL-1.6B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-VL-1.6B-ONNX) | Yes (TRL) |
-| [LFM2-VL-450M](/docs/models/lfm2-vl-450m) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-450M) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-450M-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-VL-450M-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-VL-450M-ONNX) | Yes (TRL) |
-| **Audio Models** | | | | | |
-| LFM2.5 Models (Latest Release) | | | | | |
-| [LFM2.5-Audio-1.5B](/docs/models/lfm25-audio-1.5b) | [✅](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) | [✅](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-GGUF) | ❌ | [✅](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-ONNX) | Yes (TRL) |
-| LFM2 Models | | | | | |
-| [LFM2-Audio-1.5B](/docs/models/lfm2-audio-1.5b) | [✅](https://huggingface.co/LiquidAI/LFM2-Audio-1.5B) | [✅](https://huggingface.co/LiquidAI/LFM2-Audio-1.5B-GGUF) | ❌ | ❌ | No |
-| **Liquid Nanos** | | | | | |
-| [LFM2-1.2B-Extract](/docs/models/lfm2-1.2b-extract) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-Extract) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-Extract-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-1.2B-Extract-ONNX) | Yes (TRL) |
-| [LFM2-350M-Extract](/docs/models/lfm2-350m-extract) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-Extract) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-Extract-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-350M-Extract-ONNX) | Yes (TRL) |
-| [LFM2-350M-ENJP-MT](/docs/models/lfm2-350m-enjp-mt) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-ENJP-MT) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-ENJP-MT-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-350M-ENJP-MT-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-350M-ENJP-MT-ONNX) | Yes (TRL) |
-| [LFM2-1.2B-RAG](/docs/models/lfm2-1.2b-rag) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-1.2B-RAG-ONNX) | Yes (TRL) |
-| [LFM2-1.2B-Tool](/docs/models/lfm2-1.2b-tool) Deprecated | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-Tool) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-Tool-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-1.2B-Tool-ONNX) | Yes (TRL) |
-| [LFM2-350M-Math](/docs/models/lfm2-350m-math) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-Math) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-Math-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-350M-Math-ONNX) | Yes (TRL) |
-| [LFM2-350M-PII-Extract-JP](/docs/models/lfm2-350m-pii-extract-jp) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-PII-Extract-JP) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-PII-Extract-JP-GGUF) | ❌ | ❌ | Yes (TRL) |
-| [LFM2-ColBERT-350M](/docs/models/lfm2-colbert-350m) | [✅](https://huggingface.co/LiquidAI/LFM2-ColBERT-350M) | ❌ | ❌ | ❌ | Yes (PyLate) |
-| [LFM2-2.6B-Transcript](/docs/models/lfm2-2.6b-transcript) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-2.6B-Transcript-ONNX) | Yes (TRL) |
diff --git a/examples/connect-ai-tools.mdx b/examples/connect-ai-tools.mdx
new file mode 100644
index 0000000..69ee17b
--- /dev/null
+++ b/examples/connect-ai-tools.mdx
@@ -0,0 +1,8 @@
+---
+title: "Connect AI Tools"
+description: "Connect your AI coding tools to Liquid Docs via MCP for live, queryable access to documentation"
+---
+
+import ConnectAiTools from "/snippets/connect-ai-tools.mdx";
+
+<ConnectAiTools />
diff --git a/examples/laptop-examples/audio-to-text-in-real-time.mdx b/examples/laptop-examples/audio-to-text-in-real-time.mdx
index 40d5ece..3fc5000 100644
--- a/examples/laptop-examples/audio-to-text-in-real-time.mdx
+++ b/examples/laptop-examples/audio-to-text-in-real-time.mdx
@@ -6,7 +6,7 @@ title: "Audio transcription in real-time"
Browse the complete example on GitHub
-This example demonstrates how to use the [LFM2-Audio-1.5B](https://docs.liquid.ai/docs/models/lfm2-audio-1.5b) model with llama.cpp to transcribe audio files locally in real-time.
+This example demonstrates how to use the [LFM2-Audio-1.5B](https://docs.liquid.ai/lfm/models/lfm2-audio-1.5b) model with llama.cpp to transcribe audio files locally in real-time.
Intelligent audio assistants on the edge are possible, and this repository is just one step towards that.
@@ -120,7 +120,7 @@ For example, we can use
### What is LFM2-350M?
-[LFM2-350M](https://docs.liquid.ai/docs/models/lfm2-350m) is a small text-to-text model that can be used for tasks like text cleaning. To achieve optimal performance for your particular use case, you need to optimize your system and user prompts.
+[LFM2-350M](https://docs.liquid.ai/lfm/models/lfm2-350m) is a small text-to-text model that can be used for tasks like text cleaning. To achieve optimal performance for your particular use case, you need to optimize your system and user prompts.
Use our no-code tool to optimize your system and user prompts, and get your model ready for deployment.
diff --git a/examples/laptop-examples/flight-search-assistant.mdx b/examples/laptop-examples/flight-search-assistant.mdx
index a7d6ba4..bef9128 100644
--- a/examples/laptop-examples/flight-search-assistant.mdx
+++ b/examples/laptop-examples/flight-search-assistant.mdx
@@ -6,7 +6,7 @@ title: "Flight search assistant with tool calling"
Browse the complete example on GitHub
-This project demonstrates a Python CLI that leverages the [LFM2.5-1.2B-Thinking](/docs/models/lfm25-1.2b-thinking) model to help users find and book plane tickets through multi-step reasoning and tool calling.
+This project demonstrates a Python CLI that leverages the [LFM2.5-1.2B-Thinking](/lfm/models/lfm25-1.2b-thinking) model to help users find and book plane tickets through multi-step reasoning and tool calling.

diff --git a/examples/laptop-examples/invoice-extractor-tool-with-liquid-nanos.mdx b/examples/laptop-examples/invoice-extractor-tool-with-liquid-nanos.mdx
index 1a3efc1..b8c345c 100644
--- a/examples/laptop-examples/invoice-extractor-tool-with-liquid-nanos.mdx
+++ b/examples/laptop-examples/invoice-extractor-tool-with-liquid-nanos.mdx
@@ -22,7 +22,7 @@ In this example, you will learn how to:
* **Set up local AI inference** using llama.cpp to run Liquid models entirely on your machine without requiring cloud services or API keys
* **Build a file monitoring system** that automatically processes new files dropped into a directory
-* **Extract structured output from images** using [LFM2.5-VL-1.6B](https://docs.liquid.ai/docs/models/lfm25-vl-1.6b), a small vision-language model.
+* **Extract structured output from images** using [LFM2.5-VL-1.6B](https://docs.liquid.ai/lfm/models/lfm25-vl-1.6b), a small vision-language model.
## Environment setup
diff --git a/examples/web/vl-webgpu-demo.mdx b/examples/web/vl-webgpu-demo.mdx
index a80dfed..a9c2671 100644
--- a/examples/web/vl-webgpu-demo.mdx
+++ b/examples/web/vl-webgpu-demo.mdx
@@ -6,7 +6,7 @@ title: "Real-time video captioning with LFM2.5-VL-1.6B and WebGPU"
Browse the complete example on GitHub
-This example demonstrates how to run a vision-language model directly in your web browser using WebGPU acceleration. The demo showcases real-time video captioning with the [LFM2.5-VL-1.6B](/docs/models/lfm25-vl-1.6b) model, eliminating the need for cloud-based inference services.
+This example demonstrates how to run a vision-language model directly in your web browser using WebGPU acceleration. The demo showcases real-time video captioning with the [LFM2.5-VL-1.6B](/lfm/models/lfm25-vl-1.6b) model, eliminating the need for cloud-based inference services.
## Key Features
@@ -43,7 +43,7 @@ This example demonstrates how to run a vision-language model directly in your we
## Understanding the Architecture
-This demo uses the **[LFM2.5-VL-1.6B](/docs/models/lfm25-vl-1.6b)** model, a 1.6 billion parameter vision-language model that has been quantized for efficient browser-based inference. The model runs entirely client-side using ONNX Runtime Web with WebGPU acceleration.
+This demo uses the **[LFM2.5-VL-1.6B](/lfm/models/lfm25-vl-1.6b)** model, a 1.6 billion parameter vision-language model that has been quantized for efficient browser-based inference. The model runs entirely client-side using ONNX Runtime Web with WebGPU acceleration.
### Remote vs. Local Inference
@@ -57,7 +57,7 @@ With WebGPU and local inference, everything runs directly in your browser:
### Technical Stack
-- **Model**: [LFM2.5-VL-1.6B](/docs/models/lfm25-vl-1.6b) (quantized ONNX format)
+- **Model**: [LFM2.5-VL-1.6B](/lfm/models/lfm25-vl-1.6b) (quantized ONNX format)
- **Inference Engine**: ONNX Runtime Web with WebGPU backend
- **Build Tool**: Vite for fast development and optimized production builds
- **Browser Requirements**: WebGPU-compatible browser (Chrome, Edge)
diff --git a/leap/edge-sdk/overview.mdx b/leap/edge-sdk/overview.mdx
index 9e4d1ca..fa91a45 100644
--- a/leap/edge-sdk/overview.mdx
+++ b/leap/edge-sdk/overview.mdx
@@ -36,4 +36,4 @@ The current list of main features includes:
We are consistently adding to this list - see our [changelog](/leap/changelog) for detailed updates.
-[Edit this page](https://github.com/Liquid4All/docs/tree/main/leap/edge-sdk/overview.mdx)
+[Edit this page](https://github.com/Liquid4All/docs/tree/main/deployment/on-device/ios/ios-quick-start-guide.mdx)
diff --git a/lfm/getting-started/connect-ai-tools.mdx b/lfm/getting-started/connect-ai-tools.mdx
new file mode 100644
index 0000000..69ee17b
--- /dev/null
+++ b/lfm/getting-started/connect-ai-tools.mdx
@@ -0,0 +1,8 @@
+---
+title: "Connect AI Tools"
+description: "Connect your AI coding tools to Liquid Docs via MCP for live, queryable access to documentation"
+---
+
+import ConnectAiTools from "/snippets/connect-ai-tools.mdx";
+
+
diff --git a/docs/getting-started/welcome.mdx b/lfm/getting-started/welcome.mdx
similarity index 84%
rename from docs/getting-started/welcome.mdx
rename to lfm/getting-started/welcome.mdx
index 09d08ac..2e40ad5 100644
--- a/docs/getting-started/welcome.mdx
+++ b/lfm/getting-started/welcome.mdx
@@ -42,16 +42,19 @@ Built on a new hybrid architecture, LFM2 sets a new standard in quality, speed,
## Get Started
-
-
+
+
Browse our collection of language models and their capabilities
-
+
Learn how to run models for different use cases and platforms
-
+
Customize models for your specific requirements
+
+ End-to-end examples for mobile, laptop, and web
+
diff --git a/docs/help/contributing.mdx b/lfm/help/contributing.mdx
similarity index 92%
rename from docs/help/contributing.mdx
rename to lfm/help/contributing.mdx
index 0f84180..d545dc5 100644
--- a/docs/help/contributing.mdx
+++ b/lfm/help/contributing.mdx
@@ -102,8 +102,8 @@ Use Mintlify components appropriately:
### Links
-- Use relative links for internal pages: `/docs/inference/transformers`
-- Use descriptive link text: "See the [inference guide](/docs/inference/transformers)" not "Click [here](/docs/inference/transformers)"
+- Use relative links for internal pages: `/deployment/gpu-inference/transformers`
+- Use descriptive link text: "See the [inference guide](/deployment/gpu-inference/transformers)" not "Click [here](/deployment/gpu-inference/transformers)"
## What to Contribute
diff --git a/docs/help/faqs.mdx b/lfm/help/faqs.mdx
similarity index 72%
rename from docs/help/faqs.mdx
rename to lfm/help/faqs.mdx
index fa81927..d455f22 100644
--- a/docs/help/faqs.mdx
+++ b/lfm/help/faqs.mdx
@@ -15,12 +15,12 @@ All LFM models support a 32k token text context length for extended conversation
LFM models are compatible with:
-- [Transformers](/docs/inference/transformers) - For research and development
-- [llama.cpp](/docs/inference/llama-cpp) - For efficient CPU inference
-- [vLLM](/docs/inference/vllm) - For high-throughput production serving
-- [MLX](/docs/inference/mlx) - For Apple Silicon optimization
-- [Ollama](/docs/inference/ollama) - For easy local deployment
-- [LEAP](/leap/edge-sdk/overview) - For edge and mobile deployment
+- [Transformers](/deployment/gpu-inference/transformers) - For research and development
+- [llama.cpp](/deployment/on-device/llama-cpp) - For efficient CPU inference
+- [vLLM](/deployment/gpu-inference/vllm) - For high-throughput production serving
+- [MLX](/deployment/on-device/mlx) - For Apple Silicon optimization
+- [Ollama](/deployment/on-device/ollama) - For easy local deployment
+- [LEAP](/deployment/on-device/ios/ios-quick-start-guide) - For edge and mobile deployment
## Model Selection
@@ -39,7 +39,7 @@ LFM2.5 models are updated versions with improved training that deliver higher pe
-[Liquid Nanos](/docs/models/liquid-nanos) are task-specific models fine-tuned for specialized use cases like:
+[Liquid Nanos](/lfm/models/liquid-nanos) are task-specific models fine-tuned for specialized use cases like:
- Information extraction (LFM2-Extract)
- Translation (LFM2-350M-ENJP-MT)
- RAG question answering (LFM2-1.2B-RAG)
@@ -49,7 +49,7 @@ LFM2.5 models are updated versions with improved training that deliver higher pe
## Deployment
-Yes! Use the [LEAP SDK](/leap/edge-sdk/overview) to deploy models on iOS and Android devices. LEAP provides optimized inference for edge deployment with support for quantized models.
+Yes! Use the [LEAP SDK](/deployment/on-device/ios/ios-quick-start-guide) to deploy models on iOS and Android devices. LEAP provides optimized inference for edge deployment with support for quantized models.
@@ -69,7 +69,7 @@ For most use cases, Q4_K_M or Q5_K_M provide good quality with significant size
## Fine-tuning
-Yes! Most LFM models support fine-tuning with [TRL](/docs/fine-tuning/trl) and [Unsloth](/docs/fine-tuning/unsloth). Check the [Model Library](/docs/models/complete-library) for trainability information.
+Yes! Most LFM models support fine-tuning with [TRL](/customization/finetuning-frameworks/trl) and [Unsloth](/customization/finetuning-frameworks/unsloth). Check the [Model Library](/lfm/models/complete-library) for trainability information.
@@ -82,4 +82,4 @@ Yes! Most LFM models support fine-tuning with [TRL](/docs/fine-tuning/trl) and [
- Join our [Discord community](https://discord.gg/DFU3WQeaYD) for real-time help
- Check the [Cookbook](https://github.com/Liquid4All/cookbook) for examples
-- See [Troubleshooting](/docs/help/troubleshooting) for common issues
+- See [Troubleshooting](/lfm/help/troubleshooting) for common issues
diff --git a/docs/help/troubleshooting.mdx b/lfm/help/troubleshooting.mdx
similarity index 100%
rename from docs/help/troubleshooting.mdx
rename to lfm/help/troubleshooting.mdx
diff --git a/docs/key-concepts/chat-template.mdx b/lfm/key-concepts/chat-template.mdx
similarity index 98%
rename from docs/key-concepts/chat-template.mdx
rename to lfm/key-concepts/chat-template.mdx
index db15e2f..2f8ea9c 100644
--- a/docs/key-concepts/chat-template.mdx
+++ b/lfm/key-concepts/chat-template.mdx
@@ -23,7 +23,7 @@ LFM2 supports four conversation roles:
* **`system`** – (Optional) Defines who the assistant is and how it should respond.
* **`user`** – Messages from the user containing questions and instructions.
* **`assistant`** – Responses from the model.
-* **`tool`** – Results from tool/function execution. Used for [tool use](/docs/key-concepts/tool-use) workflows.
+* **`tool`** – Results from tool/function execution. Used for [tool use](/lfm/key-concepts/tool-use) workflows.
The complete chat template definition can be found in the `chat_template.jinja` file in each model's Hugging Face repository.
diff --git a/docs/key-concepts/text-generation-and-prompting.mdx b/lfm/key-concepts/text-generation-and-prompting.mdx
similarity index 95%
rename from docs/key-concepts/text-generation-and-prompting.mdx
rename to lfm/key-concepts/text-generation-and-prompting.mdx
index 35b5852..7fc71f8 100644
--- a/docs/key-concepts/text-generation-and-prompting.mdx
+++ b/lfm/key-concepts/text-generation-and-prompting.mdx
@@ -73,7 +73,7 @@ Control text generation behavior, balancing creativity, determinism, and quality
* **`repetition_penalty`** (1.0+) - Reduces repetition. 1.0 = no penalty; >1.0 = prevents repetition.
* **`max_tokens`** / **`max_new_tokens`** - Maximum tokens to generate.
-Parameter names and syntax vary by platform. See [Transformers](/docs/inference/transformers), [vLLM](/docs/inference/vllm), or [llama.cpp](/docs/inference/llama-cpp) for details.
+Parameter names and syntax vary by platform. See [Transformers](/deployment/gpu-inference/transformers), [vLLM](/deployment/gpu-inference/vllm), or [llama.cpp](/deployment/on-device/llama-cpp) for details.
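As a rough illustration of how these parameters interact, here is a minimal, framework-free sketch of temperature scaling, nucleus (top_p) filtering, and a CTRL-style repetition penalty over a toy logits vector. This is the underlying arithmetic only, not the LFM or Transformers API:

```python
import math

def apply_repetition_penalty(logits, generated_ids, penalty=1.1):
    """Discourage already-generated tokens by shrinking their logits."""
    out = list(logits)
    for i in set(generated_ids):
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out

def sample_filter(logits, temperature=0.7, top_p=0.9):
    """Temperature scaling followed by nucleus (top_p) filtering.
    Returns renormalized probabilities of the surviving tokens."""
    # Temperature divides logits before softmax: <1.0 sharpens, >1.0 flattens.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus filtering: keep the smallest set of tokens whose
    # cumulative probability reaches top_p; drop the rest.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}
```

Lowering `temperature` concentrates probability on the top token, while lowering `top_p` prunes the low-probability tail before sampling.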
### Recommended Settings Text
@@ -132,5 +132,5 @@ min_image_tokens = 32
* `do_image_splitting=True`
-**Liquid Nanos** (task-specific models like LFM2-Extract, LFM2-RAG, LFM2-Tool, etc.) may have special prompting requirements and different generation parameters. For the best usage guidelines, refer to the individual model cards on the [Liquid Nanos](/docs/models/liquid-nanos) page.
+**Liquid Nanos** (task-specific models like LFM2-Extract, LFM2-RAG, LFM2-Tool, etc.) may have special prompting requirements and different generation parameters. For the best usage guidelines, refer to the individual model cards on the [Liquid Nanos](/lfm/models/liquid-nanos) page.
diff --git a/docs/key-concepts/tool-use.mdx b/lfm/key-concepts/tool-use.mdx
similarity index 100%
rename from docs/key-concepts/tool-use.mdx
rename to lfm/key-concepts/tool-use.mdx
diff --git a/docs/models/audio-models.mdx b/lfm/models/audio-models.mdx
similarity index 94%
rename from docs/models/audio-models.mdx
rename to lfm/models/audio-models.mdx
index a81c50e..1434969 100644
--- a/docs/models/audio-models.mdx
+++ b/lfm/models/audio-models.mdx
@@ -32,7 +32,7 @@ icon: "headphones"
-
+
1.5B · Recommended
Best audio model for most use cases. Fast, accurate, and CPU-friendly.
@@ -44,7 +44,7 @@ icon: "headphones"
-
+
1.5B · Deprecated
Use the new LFM2.5-Audio-1.5B checkpoint instead.
diff --git a/lfm/models/complete-library.mdx b/lfm/models/complete-library.mdx
new file mode 100644
index 0000000..1d7a6f8
--- /dev/null
+++ b/lfm/models/complete-library.mdx
@@ -0,0 +1,94 @@
+---
+title: "Model Library"
+description: "Liquid Foundation Models (LFMs) are a new class of multimodal architectures built for fast inference and on-device deployment. Browse all available models and formats here."
+---
+
+
+
+All of our models share the following capabilities:
+
+- 32k token context length for extended conversations and document processing
+- Designed for fast inference with [Transformers](/deployment/gpu-inference/transformers), [llama.cpp](/deployment/on-device/llama-cpp), [vLLM](/deployment/gpu-inference/vllm), [MLX](/deployment/on-device/mlx), [Ollama](/deployment/on-device/ollama), and [LEAP](/deployment/on-device/ios/ios-quick-start-guide)
+- Trainable via SFT, DPO, and GRPO with [TRL](/customization/finetuning-frameworks/trl) and [Unsloth](/customization/finetuning-frameworks/unsloth)
+
+
+
+## Model Families
+
+Choose a model based on the capabilities you need. Each individual model card has specific details on deployment and customization.
+
+
+
+
+ Chat, tool calling, structured output, and classification.
+
+
+
+ Image understanding with LFM backbones and custom encoders.
+
+
+
+ Interleaved audio/text models for TTS, ASR, and voice chat.
+
+
+
+ Task-specific models for extraction, summarization, RAG, and translation.
+
+
+
+
+## Model Formats
+
+All LFM2 models are available in multiple formats for flexible deployment:
+
+- **GGUF** – Best for local CPU/GPU inference on any platform. Use with [llama.cpp](/deployment/on-device/llama-cpp), [LM Studio](/deployment/on-device/lm-studio), or [Ollama](/deployment/on-device/ollama). Append `-GGUF` to any model name.
+- **MLX** – Best for Mac users with Apple Silicon. Leverages unified memory for fast inference via [MLX](/deployment/on-device/mlx). Browse at [mlx-community](https://huggingface.co/mlx-community/collections?search=LFM).
+- **ONNX** – Best for production deployments and edge devices. Cross-platform with ONNX Runtime across CPUs, GPUs, and accelerators. Append `-ONNX` to any model name.
+
+### Quantization
+
+Quantization reduces model size and speeds up inference with minimal quality loss. Available options by format:
+
+- **GGUF** – Supports `Q4_0`, `Q4_K_M`, `Q5_K_M`, `Q6_K`, `Q8_0`, `BF16`, and `F16`. `Q4_K_M` offers the best balance of size and quality.
+- **MLX** – Available in `3bit`, `4bit`, `5bit`, `6bit`, `8bit`, and `BF16`. `8bit` is recommended.
+- **ONNX** – Supports `FP32`, `FP16`, `Q4`, and `Q8` (MoE models also support `Q4F16`). `Q4` is recommended for most deployments.
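As a back-of-the-envelope guide to these trade-offs, the sketch below estimates on-disk size from parameter count and bits per weight. The bits-per-weight figures are rough assumptions (K-quants mix block formats, and GGUF files include some metadata), not exact values:

```python
# Approximate bits per weight for common quantization levels.
# These are ballpark assumptions, not exact on-disk figures.
BITS_PER_WEIGHT = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "F16": 16.0,
}

def estimated_size_gb(num_params: float, quant: str) -> float:
    """Estimate the on-disk size of a quantized model in GB."""
    bits = BITS_PER_WEIGHT[quant]
    return num_params * bits / 8 / 1e9

# e.g. a 1.2B-parameter model at different quantization levels:
for quant in ("Q4_K_M", "Q8_0", "F16"):
    print(f"{quant}: ~{estimated_size_gb(1.2e9, quant):.1f} GB")
```

The arithmetic makes the recommendation above concrete: `Q4_K_M` is roughly a third the size of `F16` while typically preserving most of the quality.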
+
+## Model Chart
+
+| Model | HF | GGUF | MLX | ONNX | Trainable? |
+| ----- | -- | ---- | --- | ---- | ---------- |
+| **Text-to-text Models** | | | | | |
+| LFM2.5 Models (Latest Release) | | | | | |
+| [LFM2.5-1.2B-Instruct](/lfm/models/lfm25-1.2b-instruct) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-GGUF) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-MLX-8bit) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct-ONNX) | Yes (TRL) |
+| [LFM2.5-1.2B-Thinking](/lfm/models/lfm25-1.2b-thinking) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking-GGUF) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking-MLX-8bit) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Thinking-ONNX) | Yes (TRL) |
+| [LFM2.5-1.2B-Base](/lfm/models/lfm25-1.2b-base) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base-GGUF) | ❌ | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Base-ONNX) | Yes (TRL) |
+| [LFM2.5-1.2B-JP](/lfm/models/lfm25-1.2b-jp) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-GGUF) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-MLX-8bit) | [✅](https://huggingface.co/LiquidAI/LFM2.5-1.2B-JP-ONNX) | Yes (TRL) |
+| LFM2 Models | | | | | |
+| [LFM2-8B-A1B](/lfm/models/lfm2-8b-a1b) | [✅](https://huggingface.co/LiquidAI/LFM2-8B-A1B) | [✅](https://huggingface.co/LiquidAI/LFM2-8B-A1B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-8B-A1B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-8B-A1B-ONNX) | Yes (TRL) |
+| [LFM2-2.6B](/lfm/models/lfm2-2.6b) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-2.6B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-2.6B-ONNX) | Yes (TRL) |
+| [LFM2-2.6B-Exp](/lfm/models/lfm2-2.6b-exp) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-Exp) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-Exp-GGUF) | ❌ | ❌ | Yes (TRL) |
+| [LFM2-1.2B](/lfm/models/lfm2-1.2b) Deprecated | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-1.2B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-1.2B-ONNX) | Yes (TRL) |
+| [LFM2-700M](/lfm/models/lfm2-700m) | [✅](https://huggingface.co/LiquidAI/LFM2-700M) | [✅](https://huggingface.co/LiquidAI/LFM2-700M-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-700M-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-700M-ONNX) | Yes (TRL) |
+| [LFM2-350M](/lfm/models/lfm2-350m) | [✅](https://huggingface.co/LiquidAI/LFM2-350M) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-350M-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-350M-ONNX) | Yes (TRL) |
+| **Vision Language Models** | | | | | |
+| LFM2.5 Models (Latest Release) | | | | | |
+| [LFM2.5-VL-1.6B](/lfm/models/lfm25-vl-1.6b) | [✅](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B) | [✅](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2.5-VL-1.6B-8bit) | [✅](https://huggingface.co/LiquidAI/LFM2.5-VL-1.6B-ONNX) | Yes (TRL) |
+| LFM2 Models | | | | | |
+| [LFM2-VL-3B](/lfm/models/lfm2-vl-3b) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-3B) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-3B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-VL-3B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-VL-3B-ONNX) | Yes (TRL) |
+| [LFM2-VL-1.6B](/lfm/models/lfm2-vl-1.6b) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-1.6B) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-1.6B-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-VL-1.6B-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-VL-1.6B-ONNX) | Yes (TRL) |
+| [LFM2-VL-450M](/lfm/models/lfm2-vl-450m) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-450M) | [✅](https://huggingface.co/LiquidAI/LFM2-VL-450M-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-VL-450M-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-VL-450M-ONNX) | Yes (TRL) |
+| **Audio Models** | | | | | |
+| LFM2.5 Models (Latest Release) | | | | | |
+| [LFM2.5-Audio-1.5B](/lfm/models/lfm25-audio-1.5b) | [✅](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) | [✅](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-GGUF) | ❌ | [✅](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B-ONNX) | Yes (TRL) |
+| LFM2 Models | | | | | |
+| [LFM2-Audio-1.5B](/lfm/models/lfm2-audio-1.5b) | [✅](https://huggingface.co/LiquidAI/LFM2-Audio-1.5B) | [✅](https://huggingface.co/LiquidAI/LFM2-Audio-1.5B-GGUF) | ❌ | ❌ | No |
+| **Liquid Nanos** | | | | | |
+| [LFM2-1.2B-Extract](/lfm/models/lfm2-1.2b-extract) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-Extract) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-Extract-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-1.2B-Extract-ONNX) | Yes (TRL) |
+| [LFM2-350M-Extract](/lfm/models/lfm2-350m-extract) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-Extract) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-Extract-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-350M-Extract-ONNX) | Yes (TRL) |
+| [LFM2-350M-ENJP-MT](/lfm/models/lfm2-350m-enjp-mt) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-ENJP-MT) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-ENJP-MT-GGUF) | [✅](https://huggingface.co/mlx-community/LFM2-350M-ENJP-MT-8bit) | [✅](https://huggingface.co/onnx-community/LFM2-350M-ENJP-MT-ONNX) | Yes (TRL) |
+| [LFM2-1.2B-RAG](/lfm/models/lfm2-1.2b-rag) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-1.2B-RAG-ONNX) | Yes (TRL) |
+| [LFM2-1.2B-Tool](/lfm/models/lfm2-1.2b-tool) Deprecated | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-Tool) | [✅](https://huggingface.co/LiquidAI/LFM2-1.2B-Tool-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-1.2B-Tool-ONNX) | Yes (TRL) |
+| [LFM2-350M-Math](/lfm/models/lfm2-350m-math) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-Math) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-Math-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-350M-Math-ONNX) | Yes (TRL) |
+| [LFM2-350M-PII-Extract-JP](/lfm/models/lfm2-350m-pii-extract-jp) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-PII-Extract-JP) | [✅](https://huggingface.co/LiquidAI/LFM2-350M-PII-Extract-JP-GGUF) | ❌ | ❌ | Yes (TRL) |
+| [LFM2-ColBERT-350M](/lfm/models/lfm2-colbert-350m) | [✅](https://huggingface.co/LiquidAI/LFM2-ColBERT-350M) | ❌ | ❌ | ❌ | Yes (PyLate) |
+| [LFM2-2.6B-Transcript](/lfm/models/lfm2-2.6b-transcript) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript) | [✅](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript-GGUF) | ❌ | [✅](https://huggingface.co/onnx-community/LFM2-2.6B-Transcript-ONNX) | Yes (TRL) |
diff --git a/docs/models/lfm2-1.2b-extract.mdx b/lfm/models/lfm2-1.2b-extract.mdx
similarity index 97%
rename from docs/models/lfm2-1.2b-extract.mdx
rename to lfm/models/lfm2-1.2b-extract.mdx
index 2995713..eccaa68 100644
--- a/docs/models/lfm2-1.2b-extract.mdx
+++ b/lfm/models/lfm2-1.2b-extract.mdx
@@ -3,7 +3,7 @@ title: "LFM2-1.2B-Extract"
description: "1.2B parameter model for structured information extraction from documents"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
LFM2-1.2B-Extract is optimized for extracting structured data (JSON, XML, YAML) from unstructured documents. It handles complex nested schemas and multi-field extraction with high accuracy.
diff --git a/docs/models/lfm2-1.2b-rag.mdx b/lfm/models/lfm2-1.2b-rag.mdx
similarity index 97%
rename from docs/models/lfm2-1.2b-rag.mdx
rename to lfm/models/lfm2-1.2b-rag.mdx
index a43d963..bf219d5 100644
--- a/docs/models/lfm2-1.2b-rag.mdx
+++ b/lfm/models/lfm2-1.2b-rag.mdx
@@ -3,7 +3,7 @@ title: "LFM2-1.2B-RAG"
description: "1.2B parameter model optimized for Retrieval-Augmented Generation"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
LFM2-1.2B-RAG is optimized for answering questions grounded in provided context documents. It excels at extracting relevant information from retrieved documents while avoiding hallucination.
diff --git a/docs/models/lfm2-1.2b-tool.mdx b/lfm/models/lfm2-1.2b-tool.mdx
similarity index 82%
rename from docs/models/lfm2-1.2b-tool.mdx
rename to lfm/models/lfm2-1.2b-tool.mdx
index 91e7a63..a62e330 100644
--- a/docs/models/lfm2-1.2b-tool.mdx
+++ b/lfm/models/lfm2-1.2b-tool.mdx
@@ -3,10 +3,10 @@ title: "LFM2-1.2B-Tool"
description: "1.2B parameter model for tool calling (deprecated)"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
-This model is deprecated. Use [LFM2.5-1.2B-Instruct](/docs/models/lfm25-1.2b-instruct) for tool calling instead, which offers improved accuracy and follows the standard tool use format.
+This model is deprecated. Use [LFM2.5-1.2B-Instruct](/lfm/models/lfm25-1.2b-instruct) for tool calling instead, which offers improved accuracy and follows the standard tool use format.
LFM2-1.2B-Tool was optimized for efficient and precise tool calling. It has been superseded by LFM2.5-1.2B-Instruct which provides better tool calling performance alongside general chat capabilities.
@@ -40,4 +40,4 @@ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
# See the Tool Use guide for complete examples
```
-See the [Tool Use](/docs/key-concepts/tool-use) guide for detailed tool calling documentation.
+See the [Tool Use](/lfm/key-concepts/tool-use) guide for detailed tool calling documentation.
diff --git a/docs/models/lfm2-1.2b.mdx b/lfm/models/lfm2-1.2b.mdx
similarity index 91%
rename from docs/models/lfm2-1.2b.mdx
rename to lfm/models/lfm2-1.2b.mdx
index 72a379d..7af719b 100644
--- a/docs/models/lfm2-1.2b.mdx
+++ b/lfm/models/lfm2-1.2b.mdx
@@ -7,10 +7,10 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
-This model is deprecated. Use [LFM2.5-1.2B-Instruct](/docs/models/lfm25-1.2b-instruct) for improved performance.
+This model is deprecated. Use [LFM2.5-1.2B-Instruct](/lfm/models/lfm25-1.2b-instruct) for improved performance.
LFM2-1.2B was the original 1.2B parameter model in the LFM2 series. It has been superseded by LFM2.5-1.2B-Instruct, which offers better chat, instruction-following, and tool-calling performance.
diff --git a/docs/models/lfm2-2.6b-exp.mdx b/lfm/models/lfm2-2.6b-exp.mdx
similarity index 95%
rename from docs/models/lfm2-2.6b-exp.mdx
rename to lfm/models/lfm2-2.6b-exp.mdx
index 967c712..552a05d 100644
--- a/docs/models/lfm2-2.6b-exp.mdx
+++ b/lfm/models/lfm2-2.6b-exp.mdx
@@ -7,7 +7,7 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
LFM2-2.6B-Exp is an experimental checkpoint of LFM2-2.6B with RL-only post-training, delivering improved performance on math and reasoning benchmarks. Use this model when you need stronger analytical capabilities.
diff --git a/docs/models/lfm2-2.6b-transcript.mdx b/lfm/models/lfm2-2.6b-transcript.mdx
similarity index 98%
rename from docs/models/lfm2-2.6b-transcript.mdx
rename to lfm/models/lfm2-2.6b-transcript.mdx
index 6a71d13..c1afe91 100644
--- a/docs/models/lfm2-2.6b-transcript.mdx
+++ b/lfm/models/lfm2-2.6b-transcript.mdx
@@ -3,7 +3,7 @@ title: "LFM2-2.6B-Transcript"
description: "2.6B parameter model for private, on-device meeting summarization"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
LFM2-2.6B-Transcript is designed for private, on-device meeting summarization from transcripts. It generates executive summaries, detailed summaries, action items, key decisions, and participant lists.
diff --git a/docs/models/lfm2-2.6b.mdx b/lfm/models/lfm2-2.6b.mdx
similarity index 96%
rename from docs/models/lfm2-2.6b.mdx
rename to lfm/models/lfm2-2.6b.mdx
index 4c3f768..4dd6635 100644
--- a/docs/models/lfm2-2.6b.mdx
+++ b/lfm/models/lfm2-2.6b.mdx
@@ -7,7 +7,7 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
LFM2-2.6B is a versatile mid-sized model delivering strong performance across chat, reasoning, and tool-calling tasks. Optimized for deployment on consumer devices including phones and laptops.
diff --git a/docs/models/lfm2-350m-enjp-mt.mdx b/lfm/models/lfm2-350m-enjp-mt.mdx
similarity index 97%
rename from docs/models/lfm2-350m-enjp-mt.mdx
rename to lfm/models/lfm2-350m-enjp-mt.mdx
index 1b153fb..3d5efd6 100644
--- a/docs/models/lfm2-350m-enjp-mt.mdx
+++ b/lfm/models/lfm2-350m-enjp-mt.mdx
@@ -3,7 +3,7 @@ title: "LFM2-350M-ENJP-MT"
description: "350M parameter model for bidirectional English-Japanese translation"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
LFM2-350M-ENJP-MT is a specialized translation model for near real-time bidirectional Japanese/English translation. Optimized for short-to-medium text with low latency.
diff --git a/docs/models/lfm2-350m-extract.mdx b/lfm/models/lfm2-350m-extract.mdx
similarity index 97%
rename from docs/models/lfm2-350m-extract.mdx
rename to lfm/models/lfm2-350m-extract.mdx
index fff4455..9162c46 100644
--- a/docs/models/lfm2-350m-extract.mdx
+++ b/lfm/models/lfm2-350m-extract.mdx
@@ -3,7 +3,7 @@ title: "LFM2-350M-Extract"
description: "350M parameter extraction model for edge deployment"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
LFM2-350M-Extract is the fastest extraction model, optimized for edge deployment with strict memory and compute constraints. It delivers structured data extraction with minimal latency.
diff --git a/docs/models/lfm2-350m-math.mdx b/lfm/models/lfm2-350m-math.mdx
similarity index 96%
rename from docs/models/lfm2-350m-math.mdx
rename to lfm/models/lfm2-350m-math.mdx
index a053910..62db0e1 100644
--- a/docs/models/lfm2-350m-math.mdx
+++ b/lfm/models/lfm2-350m-math.mdx
@@ -3,7 +3,7 @@ title: "LFM2-350M-Math"
description: "350M parameter model for math problem solving"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
LFM2-350M-Math is a tiny reasoning model optimized for mathematical problem solving. It provides step-by-step solutions while maintaining a small footprint for edge deployment.
diff --git a/docs/models/lfm2-350m-pii-extract-jp.mdx b/lfm/models/lfm2-350m-pii-extract-jp.mdx
similarity index 97%
rename from docs/models/lfm2-350m-pii-extract-jp.mdx
rename to lfm/models/lfm2-350m-pii-extract-jp.mdx
index cf70d24..e24748c 100644
--- a/docs/models/lfm2-350m-pii-extract-jp.mdx
+++ b/lfm/models/lfm2-350m-pii-extract-jp.mdx
@@ -3,7 +3,7 @@ title: "LFM2-350M-PII-Extract-JP"
description: "350M parameter model for Japanese PII detection and extraction"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
LFM2-350M-PII-Extract-JP extracts personally identifiable information (PII) from Japanese text as structured JSON. Output can be used to mask sensitive information on-device for privacy-preserving applications.
diff --git a/docs/models/lfm2-350m.mdx b/lfm/models/lfm2-350m.mdx
similarity index 96%
rename from docs/models/lfm2-350m.mdx
rename to lfm/models/lfm2-350m.mdx
index 5176bd2..00dae65 100644
--- a/docs/models/lfm2-350m.mdx
+++ b/lfm/models/lfm2-350m.mdx
@@ -7,7 +7,7 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
LFM2-350M is Liquid AI's smallest text model, designed for edge devices with strict memory and compute constraints. Delivers surprisingly strong performance for its size, making it ideal for low-latency applications.
diff --git a/docs/models/lfm2-700m.mdx b/lfm/models/lfm2-700m.mdx
similarity index 96%
rename from docs/models/lfm2-700m.mdx
rename to lfm/models/lfm2-700m.mdx
index 854ed38..8cdb0c2 100644
--- a/docs/models/lfm2-700m.mdx
+++ b/lfm/models/lfm2-700m.mdx
@@ -7,7 +7,7 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
LFM2-700M is a compact model balancing capability and efficiency. Suitable for deployment on a wide range of devices including phones, tablets, and laptops with limited resources.
diff --git a/docs/models/lfm2-8b-a1b.mdx b/lfm/models/lfm2-8b-a1b.mdx
similarity index 96%
rename from docs/models/lfm2-8b-a1b.mdx
rename to lfm/models/lfm2-8b-a1b.mdx
index 8a40718..9dc8964 100644
--- a/docs/models/lfm2-8b-a1b.mdx
+++ b/lfm/models/lfm2-8b-a1b.mdx
@@ -7,7 +7,7 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
LFM2-8B-A1B is Liquid AI's Mixture-of-Experts model, combining 8B total parameters with only 1.5B active parameters per forward pass. This delivers the quality of larger models with the speed and efficiency of smaller ones, ideal for on-device deployment.
diff --git a/docs/models/lfm2-audio-1.5b.mdx b/lfm/models/lfm2-audio-1.5b.mdx
similarity index 94%
rename from docs/models/lfm2-audio-1.5b.mdx
rename to lfm/models/lfm2-audio-1.5b.mdx
index 6cc4296..0b5a6f2 100644
--- a/docs/models/lfm2-audio-1.5b.mdx
+++ b/lfm/models/lfm2-audio-1.5b.mdx
@@ -3,10 +3,10 @@ title: "LFM2-Audio-1.5B"
description: "1.5B audio model (deprecated - use LFM2.5-Audio-1.5B instead)"
---
-← Back to Audio Models
+← Back to Audio Models
-This model is deprecated. Use [LFM2.5-Audio-1.5B](/docs/models/lfm25-audio-1.5b) for improved ASR, TTS, and CPU-friendly inference.
+This model is deprecated. Use [LFM2.5-Audio-1.5B](/lfm/models/lfm25-audio-1.5b) for improved ASR, TTS, and CPU-friendly inference.
LFM2-Audio-1.5B was the original fully interleaved audio/text model. It has been superseded by LFM2.5-Audio-1.5B, which features a custom LFM-based audio detokenizer and improved performance.
diff --git a/docs/models/lfm2-colbert-350m.mdx b/lfm/models/lfm2-colbert-350m.mdx
similarity index 97%
rename from docs/models/lfm2-colbert-350m.mdx
rename to lfm/models/lfm2-colbert-350m.mdx
index 7150405..b691250 100644
--- a/docs/models/lfm2-colbert-350m.mdx
+++ b/lfm/models/lfm2-colbert-350m.mdx
@@ -3,7 +3,7 @@ title: "LFM2-ColBERT-350M"
description: "350M parameter ColBERT model for multi-language document retrieval and reranking"
---
-← Back to Liquid Nanos
+← Back to Liquid Nanos
LFM2-ColBERT-350M generates dense embeddings for document retrieval and reranking using the ColBERT late-interaction architecture. It supports 8 languages and excels at semantic search tasks.
diff --git a/docs/models/lfm2-vl-1.6b.mdx b/lfm/models/lfm2-vl-1.6b.mdx
similarity index 91%
rename from docs/models/lfm2-vl-1.6b.mdx
rename to lfm/models/lfm2-vl-1.6b.mdx
index c617000..dc4ff52 100644
--- a/docs/models/lfm2-vl-1.6b.mdx
+++ b/lfm/models/lfm2-vl-1.6b.mdx
@@ -7,10 +7,10 @@ import { VlTransformers } from "/snippets/quickstart/vl-transformers.mdx";
import { VlVllm } from "/snippets/quickstart/vl-vllm.mdx";
import { VlLlamacpp } from "/snippets/quickstart/vl-llamacpp.mdx";
-← Back to Vision Models
+← Back to Vision Models
-This model is deprecated. Use [LFM2.5-VL-1.6B](/docs/models/lfm25-vl-1.6b) for improved performance.
+This model is deprecated. Use [LFM2.5-VL-1.6B](/lfm/models/lfm25-vl-1.6b) for improved performance.
LFM2-VL-1.6B was the original 1.6B vision-language model. It has been superseded by LFM2.5-VL-1.6B, which offers better visual understanding and reasoning through extended reinforcement learning.
diff --git a/docs/models/lfm2-vl-3b.mdx b/lfm/models/lfm2-vl-3b.mdx
similarity index 96%
rename from docs/models/lfm2-vl-3b.mdx
rename to lfm/models/lfm2-vl-3b.mdx
index 03ed3f2..a53db13 100644
--- a/docs/models/lfm2-vl-3b.mdx
+++ b/lfm/models/lfm2-vl-3b.mdx
@@ -7,7 +7,7 @@ import { VlTransformers } from "/snippets/quickstart/vl-transformers.mdx";
import { VlVllm } from "/snippets/quickstart/vl-vllm.mdx";
import { VlLlamacpp } from "/snippets/quickstart/vl-llamacpp.mdx";
-← Back to Vision Models
+← Back to Vision Models
LFM2-VL-3B is Liquid AI's highest-capacity multimodal model, delivering enhanced visual reasoning and detailed image understanding. Ideal for complex vision tasks requiring deeper comprehension.
diff --git a/docs/models/lfm2-vl-450m.mdx b/lfm/models/lfm2-vl-450m.mdx
similarity index 96%
rename from docs/models/lfm2-vl-450m.mdx
rename to lfm/models/lfm2-vl-450m.mdx
index 3009611..6c82036 100644
--- a/docs/models/lfm2-vl-450m.mdx
+++ b/lfm/models/lfm2-vl-450m.mdx
@@ -7,7 +7,7 @@ import { VlTransformers } from "/snippets/quickstart/vl-transformers.mdx";
import { VlVllm } from "/snippets/quickstart/vl-vllm.mdx";
import { VlLlamacpp } from "/snippets/quickstart/vl-llamacpp.mdx";
-← Back to Vision Models
+← Back to Vision Models
LFM2-VL-450M is Liquid AI's smallest vision-language model, designed for edge deployment with strict memory and compute constraints. Delivers fast multimodal inference on resource-limited devices.
diff --git a/docs/models/lfm25-1.2b-base.mdx b/lfm/models/lfm25-1.2b-base.mdx
similarity index 97%
rename from docs/models/lfm25-1.2b-base.mdx
rename to lfm/models/lfm25-1.2b-base.mdx
index 52bdaef..3dc3bfb 100644
--- a/docs/models/lfm25-1.2b-base.mdx
+++ b/lfm/models/lfm25-1.2b-base.mdx
@@ -3,7 +3,7 @@ title: "LFM2.5-1.2B-Base"
description: "Pre-trained 1.2B parameter base model for fine-tuning and custom applications"
---
-← Back to Text Models
+← Back to Text Models
LFM2.5-1.2B-Base is the pre-trained foundation model for the LFM2.5 series. Ideal for fine-tuning on custom datasets or building specialized checkpoints. Not instruction-tuned; use LFM2.5-1.2B-Instruct for chat applications.
diff --git a/docs/models/lfm25-1.2b-instruct.mdx b/lfm/models/lfm25-1.2b-instruct.mdx
similarity index 96%
rename from docs/models/lfm25-1.2b-instruct.mdx
rename to lfm/models/lfm25-1.2b-instruct.mdx
index 155a8f8..d32ee03 100644
--- a/docs/models/lfm25-1.2b-instruct.mdx
+++ b/lfm/models/lfm25-1.2b-instruct.mdx
@@ -7,7 +7,7 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
LFM2.5-1.2B-Instruct is Liquid AI's flagship instruction-tuned model, delivering exceptional performance for chat, instruction-following, and tool-calling tasks. Built on the LFM2.5 architecture with extended pre-training and reinforcement learning.
diff --git a/docs/models/lfm25-1.2b-jp.mdx b/lfm/models/lfm25-1.2b-jp.mdx
similarity index 96%
rename from docs/models/lfm25-1.2b-jp.mdx
rename to lfm/models/lfm25-1.2b-jp.mdx
index 4bd5fdd..734e985 100644
--- a/docs/models/lfm25-1.2b-jp.mdx
+++ b/lfm/models/lfm25-1.2b-jp.mdx
@@ -7,7 +7,7 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
LFM2.5-1.2B-JP is fine-tuned for Japanese language tasks, delivering high-quality Japanese text generation, translation, and conversation. Built on LFM2.5 with specialized Japanese training data.
diff --git a/docs/models/lfm25-1.2b-thinking.mdx b/lfm/models/lfm25-1.2b-thinking.mdx
similarity index 96%
rename from docs/models/lfm25-1.2b-thinking.mdx
rename to lfm/models/lfm25-1.2b-thinking.mdx
index fa5476b..7627d90 100644
--- a/docs/models/lfm25-1.2b-thinking.mdx
+++ b/lfm/models/lfm25-1.2b-thinking.mdx
@@ -7,7 +7,7 @@ import { TextTransformers } from "/snippets/quickstart/text-transformers.mdx";
import { TextVllm } from "/snippets/quickstart/text-vllm.mdx";
import { TextLlamacpp } from "/snippets/quickstart/text-llamacpp.mdx";
-← Back to Text Models
+← Back to Text Models
LFM2.5-1.2B-Thinking is optimized for reasoning tasks, delivering strong performance on math, logic, and multi-step problem-solving. Built on the LFM2.5 architecture with specialized training for chain-of-thought reasoning.
diff --git a/docs/models/lfm25-audio-1.5b.mdx b/lfm/models/lfm25-audio-1.5b.mdx
similarity index 98%
rename from docs/models/lfm25-audio-1.5b.mdx
rename to lfm/models/lfm25-audio-1.5b.mdx
index 25ad96d..3b47d04 100644
--- a/docs/models/lfm25-audio-1.5b.mdx
+++ b/lfm/models/lfm25-audio-1.5b.mdx
@@ -3,7 +3,7 @@ title: "LFM2.5-Audio-1.5B"
description: "1.5B fully interleaved audio/text model for TTS, ASR, and voice chat"
---
-← Back to Audio Models
+← Back to Audio Models
LFM2.5-Audio-1.5B is Liquid AI's flagship audio model, featuring a custom LFM-based audio detokenizer. It delivers natural speech synthesis, multilingual speech recognition, and fully interleaved voice chat with reasoning capabilities in a single compact model.
diff --git a/docs/models/lfm25-vl-1.6b.mdx b/lfm/models/lfm25-vl-1.6b.mdx
similarity index 96%
rename from docs/models/lfm25-vl-1.6b.mdx
rename to lfm/models/lfm25-vl-1.6b.mdx
index aeb81dd..527151c 100644
--- a/docs/models/lfm25-vl-1.6b.mdx
+++ b/lfm/models/lfm25-vl-1.6b.mdx
@@ -7,7 +7,7 @@ import { VlTransformers } from "/snippets/quickstart/vl-transformers.mdx";
import { VlVllm } from "/snippets/quickstart/vl-vllm.mdx";
import { VlLlamacpp } from "/snippets/quickstart/vl-llamacpp.mdx";
-← Back to Vision Models
+← Back to Vision Models
LFM2.5-VL-1.6B is Liquid AI's flagship vision-language model, delivering exceptional performance on image understanding, visual reasoning, and multimodal tasks. Built on LFM2.5 with a dynamic SigLIP2 image encoder.
diff --git a/docs/models/liquid-nanos.mdx b/lfm/models/liquid-nanos.mdx
similarity index 79%
rename from docs/models/liquid-nanos.mdx
rename to lfm/models/liquid-nanos.mdx
index eff2033..b5dcba7 100644
--- a/docs/models/liquid-nanos.mdx
+++ b/lfm/models/liquid-nanos.mdx
@@ -32,55 +32,55 @@ icon: "sparkles"
-
+
1.2B · Extraction
Extract structured JSON from unstructured documents.
-
+
350M · Extraction
Fastest extraction model for edge deployment.
-
+
350M · Extraction
Japanese PII detection into structured JSON.
-
+
2.6B · Summarization
Private, on-device meeting summarization from transcripts.
-
+
1.2B · RAG
Answer questions grounded in provided context documents.
-
+
350M · Retrieval
Multi-language document embeddings for retrieval and reranking.
-
+
350M · Translation
Near real-time bidirectional Japanese/English translation.
-
+
350M · Reasoning
Tiny reasoning model for math problem solving.
-
+
1.2B · Deprecated
Use LFM2.5-1.2B-Instruct for tool calling instead.
diff --git a/docs/models/text-models.mdx b/lfm/models/text-models.mdx
similarity index 86%
rename from docs/models/text-models.mdx
rename to lfm/models/text-models.mdx
index 36c00f2..f65d210 100644
--- a/docs/models/text-models.mdx
+++ b/lfm/models/text-models.mdx
@@ -32,25 +32,25 @@ icon: "comment"
-
+
1.2B · Recommended
Instruction-tuned for chat. Best for most use cases.
-
+
1.2B · Reasoning
Optimized for math and logical problem-solving.
-
+
1.2B · Pre-trained
Base model for finetuning or custom checkpoints.
-
+
1.2B · Japanese
Fine-tuned model for high-quality Japanese text generation.
@@ -62,37 +62,37 @@ icon: "comment"
-
+
8B · 1.5B active · MoE
Mixture-of-experts model for on-device speed and quality.
-
+
2.6B
Highly capable model for deployment on most phones and laptops.
-
+
2.6B
RL-only post-trained checkpoint for improved math and reasoning.
-
+
1.2B · Deprecated
Use the new LFM2.5-1.2B-Instruct checkpoint instead.
-
+
700M
Mid-sized model for deployment on most devices.
-
+
350M · Fastest
Our smallest model for edge devices and low latency deployments.
diff --git a/docs/models/vision-models.mdx b/lfm/models/vision-models.mdx
similarity index 93%
rename from docs/models/vision-models.mdx
rename to lfm/models/vision-models.mdx
index f9bec43..1e92172 100644
--- a/docs/models/vision-models.mdx
+++ b/lfm/models/vision-models.mdx
@@ -32,7 +32,7 @@ icon: "eye"
-
+
1.6B · Recommended
Best vision model for most use cases. Fast and accurate.
@@ -44,19 +44,19 @@ icon: "eye"
-
+
3B
Highest-capacity multimodal model with enhanced visual reasoning.
-
+
1.6B · Deprecated
Use the new LFM2.5-VL-1.6B checkpoint instead.
-
+
450M · Fastest
Compact multimodal model for edge deployment and fast inference.
diff --git a/docs/getting-started/connect-ai-tools.mdx b/snippets/connect-ai-tools.mdx
similarity index 89%
rename from docs/getting-started/connect-ai-tools.mdx
rename to snippets/connect-ai-tools.mdx
index 44a6fe0..7f80b29 100644
--- a/docs/getting-started/connect-ai-tools.mdx
+++ b/snippets/connect-ai-tools.mdx
@@ -1,8 +1,3 @@
----
-title: "Connect AI Tools"
-description: "Connect your AI coding tools to Liquid Docs via MCP for live, queryable access to documentation"
----
-
## What is MCP?
The Model Context Protocol (MCP) is an open standard that gives AI applications a standardized way to connect to external data sources and tools. By connecting your AI coding tool to Liquid docs via MCP, you're giving it live, queryable access to the complete documentation: not a snapshot, not a cached file, but a real-time search against our official documentation.
@@ -111,10 +106,10 @@ You're all set! Cursor now has real-time access to Liquid AI documentation.
## Next Steps
-
+
Browse our collection of language models
-
+
Get started with the LEAP SDK