diff --git a/docs.json b/docs.json
index ed4d996..80ec89a 100644
--- a/docs.json
+++ b/docs.json
@@ -36,7 +36,7 @@
"logo": {
"light": "/logo/light.svg",
"dark": "/logo/dark.svg",
- "href": "/"
+ "href": "https://liquid.ai"
},
"navbar": {
"links": [
@@ -52,20 +52,6 @@
}
},
"navigation": {
- "global": {
- "anchors": [
- {
- "anchor": "About Us",
- "icon": "building",
- "href": "https://www.liquid.ai/company/about"
- },
- {
- "anchor": "Blog",
- "icon": "pencil",
- "href": "https://www.liquid.ai/company/blog"
- }
- ]
- },
"tabs": [
{
"tab": "Documentation",
@@ -202,7 +188,7 @@
]
},
{
- "tab": "Guides",
+ "tab": "Examples",
"groups": [
{
"group": "Get Started",
diff --git a/docs/fine-tuning/leap-finetune.mdx b/docs/fine-tuning/leap-finetune.mdx
index f7b7c11..d25add6 100644
--- a/docs/fine-tuning/leap-finetune.mdx
+++ b/docs/fine-tuning/leap-finetune.mdx
@@ -20,10 +20,10 @@ LEAP Finetune will provide:
While LEAP Finetune is in development, you can fine-tune models using:
-
+
Hugging Face's training library with LoRA/QLoRA support
-
+
Memory-efficient fine-tuning with 2x faster training
diff --git a/docs/frameworks/outlines.mdx b/docs/frameworks/outlines.mdx
index 750b6ff..1984a1e 100644
--- a/docs/frameworks/outlines.mdx
+++ b/docs/frameworks/outlines.mdx
@@ -19,7 +19,7 @@ pip install outlines transformers torch
## Setup[](#setup "Direct link to Setup")
-Outlines provides a simple interface for constrained generation. The examples below use Transformers, but Outlines works with all major inference frameworks including [vLLM](/lfm/inference/vllm), [llama.cpp](/lfm/inference/llama-cpp), [MLX](/lfm/inference/mlx), [Ollama](/lfm/inference/ollama), and more. See the [Outlines documentation](https://dottxt-ai.github.io/outlines/latest/) for framework-specific examples.
+Outlines provides a simple interface for constrained generation. The examples below use Transformers, but Outlines works with all major inference frameworks including [vLLM](/docs/inference/vllm), [llama.cpp](/docs/inference/llama-cpp), [MLX](/docs/inference/mlx), [Ollama](/docs/inference/ollama), and more. See the [Outlines documentation](https://dottxt-ai.github.io/outlines/latest/) for framework-specific examples.
Start by wrapping your model:
@@ -263,5 +263,3 @@ For a detailed example of using Outlines with LFM2-350M for smart home control,
* [Outlines GitHub](https://github.com/dottxt-ai/outlines)
* [Outlines Documentation](https://dottxt-ai.github.io/outlines/)
* [LFM2 × .txt Collaboration Blog Post](https://www.liquid.ai/blog/liquid-txt-collaboration)
-
-[Edit this page](https://github.com/Liquid4All/docs/tree/main/lfm/frameworks/outlines.md)
diff --git a/docs/help/contributing.mdx b/docs/help/contributing.mdx
index 31e0bc1..0f84180 100644
--- a/docs/help/contributing.mdx
+++ b/docs/help/contributing.mdx
@@ -102,8 +102,8 @@ Use Mintlify components appropriately:
### Links
-- Use relative links for internal pages: `/lfm/inference/transformers`
-- Use descriptive link text: "See the [inference guide](/lfm/inference/transformers)" not "Click [here](/lfm/inference/transformers)"
+- Use relative links for internal pages: `/docs/inference/transformers`
+- Use descriptive link text: "See the [inference guide](/docs/inference/transformers)" not "Click [here](/docs/inference/transformers)"
## What to Contribute
diff --git a/docs/help/faqs.mdx b/docs/help/faqs.mdx
index 2a0b8d2..4610e6f 100644
--- a/docs/help/faqs.mdx
+++ b/docs/help/faqs.mdx
@@ -15,11 +15,11 @@ All LFM models support a 32k token text context length for extended conversation
LFM models are compatible with:
-- [Transformers](/lfm/inference/transformers) - For research and development
-- [llama.cpp](/lfm/inference/llama-cpp) - For efficient CPU inference
-- [vLLM](/lfm/inference/vllm) - For high-throughput production serving
-- [MLX](/lfm/inference/mlx) - For Apple Silicon optimization
-- [Ollama](/lfm/inference/ollama) - For easy local deployment
+- [Transformers](/docs/inference/transformers) - For research and development
+- [llama.cpp](/docs/inference/llama-cpp) - For efficient CPU inference
+- [vLLM](/docs/inference/vllm) - For high-throughput production serving
+- [MLX](/docs/inference/mlx) - For Apple Silicon optimization
+- [Ollama](/docs/inference/ollama) - For easy local deployment
- [LEAP](/leap/index) - For edge and mobile deployment
@@ -39,7 +39,7 @@ LFM2.5 models are updated versions with improved training that deliver higher pe
-[Liquid Nanos](/lfm/models/liquid-nanos) are task-specific models fine-tuned for specialized use cases like:
+[Liquid Nanos](/docs/models/liquid-nanos) are task-specific models fine-tuned for specialized use cases like:
- Information extraction (LFM2-Extract)
- Translation (LFM2-350M-ENJP-MT)
- RAG question answering (LFM2-1.2B-RAG)
@@ -69,7 +69,7 @@ For most use cases, Q4_K_M or Q5_K_M provide good quality with significant size
## Fine-tuning
-Yes! Most LFM models support fine-tuning with [TRL](/lfm/fine-tuning/trl) and [Unsloth](/lfm/fine-tuning/unsloth). Check the [Complete Model Library](/lfm/models/complete-library) for trainability information.
+Yes! Most LFM models support fine-tuning with [TRL](/docs/fine-tuning/trl) and [Unsloth](/docs/fine-tuning/unsloth). Check the [Model Library](/docs/models/complete-library) for trainability information.
@@ -82,4 +82,4 @@ Yes! Most LFM models support fine-tuning with [TRL](/lfm/fine-tuning/trl) and [U
- Join our [Discord community](https://discord.gg/DFU3WQeaYD) for real-time help
- Check the [Cookbook](https://github.com/Liquid4All/cookbook) for examples
-- See [Troubleshooting](/lfm/help/troubleshooting) for common issues
+- See [Troubleshooting](/docs/help/troubleshooting) for common issues
diff --git a/docs/index.mdx b/docs/index.mdx
index 9fcb65b..456d6a7 100644
--- a/docs/index.mdx
+++ b/docs/index.mdx
@@ -3,6 +3,6 @@ title: "LFM Documentation"
description: "Redirect to LFM Getting Started"
---
-
+
-Redirecting to [Getting Started](/lfm/getting-started/intro)...
+Redirecting to [Getting Started](/docs/getting-started/welcome)...
diff --git a/docs/inference/llama-cpp.mdx b/docs/inference/llama-cpp.mdx
index 2d95ab5..f6a8294 100644
--- a/docs/inference/llama-cpp.mdx
+++ b/docs/inference/llama-cpp.mdx
@@ -114,67 +114,25 @@ hf download LiquidAI/LFM2.5-1.2B-Instruct-GGUF lfm2.5-1.2b-instruct-q4_k_m.gguf
## Basic Usage
-llama.cpp offers three main interfaces for running inference: `llama-cpp-python` (Python bindings), `llama-server` (OpenAI-compatible server), and `llama-cli` (interactive CLI).
+llama.cpp offers two main interfaces for running inference: `llama-server` (OpenAI-compatible server) and `llama-cli` (interactive CLI).
-
- For Python applications, use the `llama-cpp-python` package.
-
- **Installation:**
- ```bash
- pip install llama-cpp-python
- ```
-
- For GPU support:
- ```bash
- CMAKE_ARGS="-DLLAMA_CUDA=on" pip install llama-cpp-python
- ```
-
- **Model Setup:**
- ```python
- from llama_cpp import Llama
-
- # Load model
- llm = Llama(
- model_path="lfm2.5-1.2b-instruct-q4_k_m.gguf",
- n_ctx=4096,
- n_threads=8
- )
-
- # Generate text
- output = llm(
- "What is artificial intelligence?",
- max_tokens=512,
- temperature=0.7,
- top_p=0.9
- )
- print(output["choices"][0]["text"])
- ```
-
- **Chat Completions:**
- ```python
- response = llm.create_chat_completion(
- messages=[
- {"role": "system", "content": "You are a helpful assistant."},
- {"role": "user", "content": "Explain quantum computing."}
- ],
- temperature=0.7,
- max_tokens=512
- )
- print(response["choices"][0]["message"]["content"])
- ```
-
-
llama-server provides an OpenAI-compatible API for serving models locally.
**Starting the Server:**
```bash
+ llama-server -hf LiquidAI/LFM2.5-1.2B-Instruct-GGUF -c 4096 --port 8080
+ ```
+
+ The `-hf` flag downloads the model directly from Hugging Face. Alternatively, use a local model file:
+ ```bash
llama-server -m lfm2.5-1.2b-instruct-q4_k_m.gguf -c 4096 --port 8080
```
Key parameters:
- * `-m`: Path to GGUF model file
+ * `-hf`: Hugging Face model ID (downloads automatically)
+ * `-m`: Path to local GGUF model file
* `-c`: Context length (default: 4096)
* `--port`: Server port (default: 8080)
* `-ngl 99`: Offload layers to GPU (if available)
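+
+  Once the server is running, any OpenAI-compatible client can query it. A minimal `curl` sketch (assumes the default port 8080 used above):
+
+  ```bash
+  curl http://localhost:8080/v1/chat/completions \
+    -H "Content-Type: application/json" \
+    -d '{
+      "messages": [
+        {"role": "system", "content": "You are a helpful assistant."},
+        {"role": "user", "content": "What is artificial intelligence?"}
+      ],
+      "max_tokens": 256,
+      "temperature": 0.7
+    }'
+  ```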
@@ -216,12 +174,18 @@ llama.cpp offers three main interfaces for running inference: `llama-cpp-python`
llama-cli provides an interactive terminal interface for chatting with models.
+ ```bash
+ llama-cli -hf LiquidAI/LFM2.5-1.2B-Instruct-GGUF -c 4096 --color -i
+ ```
+
+ The `-hf` flag downloads the model directly from Hugging Face. Alternatively, use a local model file:
```bash
llama-cli -m lfm2.5-1.2b-instruct-q4_k_m.gguf -c 4096 --color -i
```
Key parameters:
- * `-m`: Path to GGUF model file
+ * `-hf`: Hugging Face model ID (downloads automatically)
+ * `-m`: Path to local GGUF model file
* `-c`: Context length
* `--color`: Colored output
* `-i`: Interactive mode
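+
+  For one-shot (non-interactive) generation, pass a prompt directly — a minimal sketch, assuming the same model as above:
+
+  ```bash
+  llama-cli -hf LiquidAI/LFM2.5-1.2B-Instruct-GGUF -p "What is artificial intelligence?" -n 256
+  ```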
@@ -242,43 +206,6 @@ Control text generation behavior using parameters in the OpenAI-compatible API o
* **`repetition_penalty`** / **`--repeat-penalty`** (`float`, default 1.1): Penalty for repeating tokens (>1.0 = discourage repetition). Typical range: 1.0-1.5
* **`stop`** (`str` or `list[str]`): Strings that terminate generation when encountered
-
- ```python
- from llama_cpp import Llama
-
- llm = Llama(
- model_path="lfm2.5-1.2b-instruct-q4_k_m.gguf",
- n_ctx=4096,
- n_threads=8
- )
-
- # Text generation with sampling parameters
- output = llm(
- "What is machine learning?",
- max_tokens=512,
- temperature=0.7,
- top_p=0.9,
- top_k=40,
- repeat_penalty=1.1,
- stop=["<|im_end|>", "<|endoftext|>"]
- )
- print(output["choices"][0]["text"])
-
- # Chat completion with sampling parameters
- response = llm.create_chat_completion(
- messages=[
- {"role": "user", "content": "Explain quantum computing."}
- ],
- temperature=0.7,
- top_p=0.9,
- top_k=40,
- max_tokens=512,
- repeat_penalty=1.1
- )
- print(response["choices"][0]["message"]["content"])
- ```
-
-
```python
from openai import OpenAI
@@ -407,38 +334,6 @@ hf download LiquidAI/LFM2-VL-1.6B-GGUF mmproj-LFM2-VL-1.6B-Q8_0.gguf --local-dir
```
-
- ```python
- from llama_cpp import Llama
- from llama_cpp.llama_chat_format import Llava15ChatHandler
-
- # Initialize with vision support
- # Note: Use the correct chat handler for your model architecture
- chat_handler = Llava15ChatHandler(clip_model_path="mmproj-model-f16.gguf")
-
- llm = Llama(
- model_path="lfm2.5-vl-1.6b-q4_k_m.gguf",
- chat_handler=chat_handler,
- n_ctx=4096
- )
-
- # Generate with image
- response = llm.create_chat_completion(
- messages=[
- {
- "role": "user",
- "content": [
- {"type": "image_url", "image_url": {"url": "file:///path/to/image.jpg"}},
- {"type": "text", "text": "Describe this image."}
- ]
- }
- ],
- max_tokens=256
- )
- print(response["choices"][0]["message"]["content"])
- ```
-
-
For a complete working example with step-by-step instructions, see the [llama.cpp Vision Model Colab notebook](https://colab.research.google.com/drive/1q2PjE6O_AahakRlkTNJGYL32MsdUcj7b?usp=sharing).
diff --git a/docs/inference/transformers.mdx b/docs/inference/transformers.mdx
index 3129191..a8509e4 100644
--- a/docs/inference/transformers.mdx
+++ b/docs/inference/transformers.mdx
@@ -14,9 +14,14 @@ Transformers provides the most flexibility for model development and is ideal fo
Install the required dependencies:
```bash
-pip install transformers>=4.57.1 torch>=2.6
+pip install "transformers>=5.0.0" torch
```
+> **Note:** Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
+> ```bash
+> pip install git+https://github.com/huggingface/transformers.git@0c9a72e4576fe4c84077f066e585129c97bfd4e6 torch
+> ```
+
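+To confirm which version was installed, a quick optional check:
+
+```bash
+python -c "import transformers; print(transformers.__version__)"
+```
+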
GPU is recommended for faster inference.
## Basic Usage
diff --git a/docs/inference/vllm.mdx b/docs/inference/vllm.mdx
index 21da2b0..c327514 100644
--- a/docs/inference/vllm.mdx
+++ b/docs/inference/vllm.mdx
@@ -185,9 +185,16 @@ To use LFM Vision Models with vLLM, install the precompiled wheel along with the
VLLM_PRECOMPILED_WHEEL_COMMIT=72506c98349d6bcd32b4e33eec7b5513453c1502 VLLM_USE_PRECOMPILED=1 pip install git+https://github.com/vllm-project/vllm.git
```
+```bash
+pip install "transformers>=5.0.0" pillow
+```
+
+
+Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow
```
+
This installs vLLM with the necessary changes for LFM Vision Model support. Once these changes are merged upstream, you'll be able to use the standard vLLM installation.
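+
+As a quick sanity check of the install, you can launch vLLM's OpenAI-compatible server with a vision model — a minimal sketch, assuming `LiquidAI/LFM2-VL-1.6B` as the model and default settings:
+
+```bash
+vllm serve LiquidAI/LFM2-VL-1.6B --max-model-len 4096
+```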
diff --git a/docs/key-concepts/models.mdx b/docs/key-concepts/models.mdx
index 3609c36..e8ca0e2 100644
--- a/docs/key-concepts/models.mdx
+++ b/docs/key-concepts/models.mdx
@@ -5,7 +5,7 @@ description: "The LFM model collection includes general-purpose language models,
* These models are built on the backbone of a new hybrid architecture that's designed for incredibly fast training and inference. Learn more in our [blog post](https://www.liquid.ai/blog/liquid-foundation-models-v2-our-second-series-of-generative-ai-models).
* All models support a **32k token text context length** for extended conversations and document processing.
-* Our models are compatible with various open-source deployment libraries including [Transformers](/lfm/inference/transformers), [llama.cpp](/lfm/inference/llama-cpp), [vLLM](/lfm/inference/vllm), [MLX](/lfm/inference/mlx), [Ollama](/lfm/inference/ollama), and our own edge deployment platform [LEAP](/lfm/frameworks/leap).
+* Our models are compatible with various open-source deployment libraries including [Transformers](/docs/inference/transformers), [llama.cpp](/docs/inference/llama-cpp), [vLLM](/docs/inference/vllm), [MLX](/docs/inference/mlx), [Ollama](/docs/inference/ollama), and our own edge deployment platform [LEAP](/leap/index).
| Model | HF | GGUF | MLX | ONNX | Trainable? |
@@ -119,25 +119,25 @@ description: "The LFM model collection includes general-purpose language models,
| Model | Description |
| ----------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| [`LiquidAI/LFM2-1.2B-Extract`](https://huggingface.co/LiquidAI/LFM2-1.2B-Extract) | Extract important information from a wide variety of unstructured documents into structured outputs like JSON. [See prompting guidelines](/lfm/key-concepts/text-generation-and-prompting#lfm2-extract) |
-| [`LiquidAI/LFM2-350M-Extract`](https://huggingface.co/LiquidAI/LFM2-350M-Extract) | Smaller version of the extraction model. [See prompting guidelines](/lfm/key-concepts/text-generation-and-prompting#lfm2-extract) |
-| [`LiquidAI/LFM2-350M-ENJP-MT`](https://huggingface.co/LiquidAI/LFM2-350M-ENJP-MT) | Near real-time bi-directional Japanese/English translation of short-to-medium inputs. > [See prompting guidelines](/lfm/key-concepts/text-generation-and-prompting#lfm2-350m-enjp-mt) |
-| [`LiquidAI/LFM2-1.2B-RAG`](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG) | Answer questions based on provided contextual documents, for use in RAG systems. > [See prompting guidelines](/lfm/key-concepts/text-generation-and-prompting#lfm2-rag) |
+| [`LiquidAI/LFM2-1.2B-Extract`](https://huggingface.co/LiquidAI/LFM2-1.2B-Extract) | Extract important information from a wide variety of unstructured documents into structured outputs like JSON. [See prompting guidelines](/docs/key-concepts/text-generation-and-prompting#lfm2-extract) |
+| [`LiquidAI/LFM2-350M-Extract`](https://huggingface.co/LiquidAI/LFM2-350M-Extract) | Smaller version of the extraction model. [See prompting guidelines](/docs/key-concepts/text-generation-and-prompting#lfm2-extract) |
+| [`LiquidAI/LFM2-350M-ENJP-MT`](https://huggingface.co/LiquidAI/LFM2-350M-ENJP-MT) | Near real-time bi-directional Japanese/English translation of short-to-medium inputs. [See prompting guidelines](/docs/key-concepts/text-generation-and-prompting#lfm2-350m-enjp-mt) |
+| [`LiquidAI/LFM2-1.2B-RAG`](https://huggingface.co/LiquidAI/LFM2-1.2B-RAG) | Answer questions based on provided contextual documents, for use in RAG systems. [See prompting guidelines](/docs/key-concepts/text-generation-and-prompting#lfm2-rag) |
| [`LiquidAI/LFM2-350M-Math`](https://huggingface.co/LiquidAI/LFM2-350M-Math) | Tiny reasoning model designed for tackling tricky math problems. |
-| [`LiquidAI/LFM2-350M-PII-Extract-JP`](https://huggingface.co/LiquidAI/LFM2-350M-PII-Extract-JP) | Extract personally identifiable information (PII) from Japanese text and output it in JSON format. [See prompting guidelines](/lfm/key-concepts/text-generation-and-prompting#lfm2-350m-pii-extract-jp) |
+| [`LiquidAI/LFM2-350M-PII-Extract-JP`](https://huggingface.co/LiquidAI/LFM2-350M-PII-Extract-JP) | Extract personally identifiable information (PII) from Japanese text and output it in JSON format. [See prompting guidelines](/docs/key-concepts/text-generation-and-prompting#lfm2-350m-pii-extract-jp) |
| [`LiquidAI/LFM2-ColBERT-350M`](https://huggingface.co/LiquidAI/LFM2-ColBERT-350M) | Embed documents and queries for fast retrieval and reranking across many languages. |
-| [`LiquidAI/LFM2-2.6B-Transcript`](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript) | Designed for private, on-device meeting summarization. [See prompting guidelines](/lfm/key-concepts/text-generation-and-prompting#lfm2-2.6b-transcript) |
+| [`LiquidAI/LFM2-2.6B-Transcript`](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript) | Designed for private, on-device meeting summarization. [See prompting guidelines](/docs/key-concepts/text-generation-and-prompting#lfm2-2.6b-transcript) |
| [`LiquidAI/LFM2-1.2B-Tool`](https://huggingface.co/LiquidAI/LFM2-1.2B-Tool) | Deprecated Model optimized for concise and precise tool calling. See updated [`LFM2.5-1.2B-Instruct`](https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct) instead. |
## GGUF Models[](#gguf-models "Direct link to GGUF Models")
-GGUF quantized versions are available for all LFM2 models for efficient inference with [llama.cpp](/lfm/inference/llama-cpp), [LM Studio](/lfm/inference/lm-studio), and [Ollama](/lfm/inference/ollama). These models offer reduced memory usage and faster CPU inference.
+GGUF quantized versions are available for all LFM2 models for efficient inference with [llama.cpp](/docs/inference/llama-cpp), [LM Studio](/docs/inference/lm-studio), and [Ollama](/docs/inference/ollama). These models offer reduced memory usage and faster CPU inference.
To access our official GGUF models, append `-GGUF` to any model repository name (e.g., `LiquidAI/LFM2-1.2B-GGUF`). All models are available in multiple quantization levels (`Q4_0`, `Q4_K_M`, `Q5_K_M`, `Q6_K`, `Q8_0`, `F16`).
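+
+For example, a single quantization can be downloaded with the Hugging Face CLI (the same command used in the [llama.cpp](/docs/inference/llama-cpp) guide):
+
+```bash
+hf download LiquidAI/LFM2.5-1.2B-Instruct-GGUF lfm2.5-1.2b-instruct-q4_k_m.gguf
+```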
## MLX Models[](#mlx-models "Direct link to MLX Models")
-MLX quantized versions are available for many of the LFM2 model library for efficient inference on Apple Silicon with [MLX](/lfm/inference/mlx). These models leverage unified memory architecture for optimal performance on M-series chips.
+MLX quantized versions are available for many models in the LFM2 library for efficient inference on Apple Silicon with [MLX](/docs/inference/mlx). These models leverage the unified memory architecture for optimal performance on M-series chips.
Browse all MLX-compatible models at [mlx-community LFM2 models](https://huggingface.co/mlx-community/collections?search=LFM). All models are available in multiple quantization levels (`4-bit`, `5-bit`, `6-bit`, `8-bit`, `bf16`).
@@ -146,5 +146,3 @@ Browse all MLX-compatible models at [mlx-community LFM2 models](https://huggingf
ONNX versions are available for many LFM2 models for cross-platform deployment and inference with ONNX Runtime. These models enable deployment across diverse hardware including CPUs, GPUs, and specialized accelerators.
To access our official ONNX models, append `-ONNX` to any model repository name (e.g., `LiquidAI/LFM2.5-1.2B-Instruct-ONNX`).
-
-[Edit this page](https://github.com/Liquid4All/docs/tree/main/lfm/key-concepts/models.md)
diff --git a/docs/key-concepts/text-generation-and-prompting.mdx b/docs/key-concepts/text-generation-and-prompting.mdx
index 365540c..2cba4ef 100644
--- a/docs/key-concepts/text-generation-and-prompting.mdx
+++ b/docs/key-concepts/text-generation-and-prompting.mdx
@@ -90,8 +90,6 @@ Parameter names and syntax vary by platform. See [Transformers](/docs/inference/
* `min_p=0.15`
* `repetition_penalty=1.05`
-Note that Liquid Nanos have [special requirements](#liquid-nanos) with different parameters.
-
## Vision Models
LFM2-VL models use a **variable resolution encoder** to control the quality/speed tradeoff by adjusting how images are tokenized.
@@ -133,319 +131,6 @@ min_image_tokens = 32
* `max_image_tokens=256`
* `do_image_splitting=True`
-## Liquid Nanos
-
-### LFM2-Extract
-
-Structured information extraction models. Use `temperature=0` (greedy decoding).
-
-**System Prompt Format:**
-
-```
-Identify and extract information matching the following schema.
-Return data as a JSON object. Missing data should be omitted.
-
-Schema:
-- field_name: "Description of what to extract"
-- nested_object:
- - nested_field: "Description"
-```
-
-If no system prompt is provided, defaults to JSON. Specify format (JSON, XML, or YAML) and schema for better accuracy.
-
-
- **System Prompt:**
-
- ```
- Identify and extract information matching the following schema.
- Return data as a JSON object. Missing data should be omitted.
-
- Schema:
- - project_name: "The name of the martial arts venture"
- - executive_summary: "A brief overview of the business"
- - financial_projections:
- - startup_costs: "Initial investment required"
- - projected_revenue_year_1: "Expected revenue in the first year"
- - projected_revenue_year_3: "Expected revenue in the third year"
- - return_on_investment: "Timeframe for recouping investment"
- - facility_details:
- - location:
- - address: "Full address of the facility"
- - latitude: "Latitude coordinate"
- - longitude: "Longitude coordinate"
- - distance_to_park_boundary: "Distance to park boundary"
- - noise_level_assessment: "Noise level rating"
- ```
-
- **User Prompt:**
-
- ```
- We have secured a lease for a facility located at 345 Ocean View Drive,
- Seward, AK 99664. The facility offers 3000 sq ft of dedicated training
- space. Its location is approximately 2 miles from the boundary of the
- Kenai Fjords National Park. A noise level assessment has been conducted,
- resulting in a rating of Moderate. The coordinates of the facility are
- 60.1728° N latitude and 149.3422° W longitude.
- ```
-
- **Assistant Response:**
-
- ```json
- {
- "project_name": "Arctic Spirit Martial Arts",
- "executive_summary": "This business plan outlines the establishment of a premier martial arts training center in Seward, Alaska.",
- "financial_projections": {
- "startup_costs": "$75000",
- "projected_revenue_year_1": "$120000",
- "projected_revenue_year_3": "$250000",
- "return_on_investment": "3 years"
- },
- "facility_details": {
- "location": {
- "address": "345 Ocean View Drive, Seward, AK 99664",
- "latitude": "60.1728° N",
- "longitude": "149.3422° W",
- "distance_to_park_boundary": "2 miles",
- "noise_level_assessment": "Moderate"
- }
- }
- }
- ```
-
-
-⚠️ **The model is intended for single turn conversations.**
-
-### LFM2-RAG
-
-Optimized for Retrieval-Augmented Generation. Use `temperature=0` (greedy decoding). Provide relevant documents in the system prompt.
-
-**System Prompt Format:**
-
-```
-The following documents may provide you additional information to answer questions:
-
-
-[Document content here]
-
-
-
-[Document content here]
-
-```
-
-
- **System Prompt:**
-
- ```
- The following documents may provide you additional information to answer questions:
-
-
- The centre, which was created in 1906, has been instrumental in advancing
- agriculture research. The library at the Agriculture Canada research centre
- in Lethbridge serves 48 scientists and 85 technicians, along with many
- visiting staff and students.
-
- ```
-
- **User Prompt:**
-
- ```
- How many individuals were reported to be served by the library at the
- Agriculture Canada research centre in Lethbridge?
- ```
-
- **Assistant Response:**
-
- ```
- The library at the Agriculture Canada research centre in Lethbridge was
- reported to serve 48 scientists and 85 technicians, along with many
- visiting staff and students.
- ```
-
-
-### LFM2-Tool
-
-Optimized for efficient and precise tool calling. Use `temperature=0` (greedy decoding). See the [Tool Use](/docs/key-concepts/tool-use) guide for details.
-
-### LFM2-350M-ENJP-MT
-
-Specialized English-Japanese translation model. **Requires** one of these system prompts:
-
-* `"Translate to Japanese."` - English → Japanese
-* `"Translate to English."` - Japanese → English
-
-Use `.apply_chat_template()` with the system prompt to specify direction.
-
-
- ```python
- from transformers import AutoTokenizer
-
- tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2-350M-ENJP-MT")
-
- chat = [
- {"role": "system", "content": "Translate to Japanese."},
- {"role": "user", "content": "What is C. elegans?"}
- ]
-
- prompt = tokenizer.apply_chat_template(chat, tokenize=False)
- print(prompt)
- ```
-
- **Output:**
-
- ```
- <|startoftext|><|im_start|>system
- Translate to Japanese.<|im_end|>
- <|im_start|>user
- What is C. elegans?<|im_end|>
- <|im_start|>assistant
- C. elegansとは何ですか?<|im_end|>
- ```
-
-
-⚠️ **The model is intended for single turn conversations.**
-
-### LFM2-350M-PII-Extract-JP
-
-Extracts PII from Japanese text as JSON. Use `temperature=0` (greedy decoding). Output can mask sensitive information on-device.
-
-**System Prompt Format:**
-
-```
-Extract , , , ,
-```
-
-Extract specific entities by listing only what you need (e.g., `Extract `). List categories in alphabetical order for optimal performance.
-
-**Output Format:** JSON with lists per category. Empty lists for missing entities. Outputs entities exactly as they appear (including notation variations) for exact-match masking.
-
-
- **System Prompt:**
-
- ```
- Extract , , , ,
- ```
-
- **User Prompt:**
-
- ```
- こんにちは、ラミンさんに B200 GPU を 10000 台 至急請求してください。
- 連絡先は celegans@liquid.ai (電話番号010-000-0000) で、これは C. elegans
- 線虫に着想を得たニューラルネットワークアーキテクチャを 今すぐ構築するために不可欠です。
- ```
-
- **Assistant Response:**
-
- ```json
- {
- "address": [],
- "company_name": [],
- "email_address": ["celegans@liquid.ai"],
- "human_name": ["ラミン"],
- "phone_number": ["010-000-0000"]
- }
- ```
-
-
-⚠️ **The model is intended for single turn conversations.**
-
-### LFM2-2.6B-Transcript
-
-Designed for private, on-device meeting summarization. Use `temperature=0.3` for optimal results.
-
-**Generation Parameters:**
-
-* `temperature=0.3` (strongly recommended)
-
-**System Prompt Format:**
-
-```
-You are an expert meeting analyst. Analyze the transcript carefully
-and provide clear, accurate information based on the content.
-```
-
-**Input Format:**
-
-The model expects meeting transcripts in a specific format:
-
-```
-
-
-Title (example: Claims Processing training module)
-Date (example: July 2, 2021)
-Time (example: 1:00 PM)
-Duration (example: 45 minutes)
-Participants (example: Julie Franco (Training Facilitator), Amanda Newman (Subject Matter Expert))
-
-----------
-
-**Speaker 1**: Message 1 (example: **Julie Franco**: Good morning, everyone. Thanks for joining me today.)
-**Speaker 2**: Message 2 (example: **Amanda Newman**: Good morning, Julie. Happy to be here.)
-etc.
-```
-
-Replace `` with one of the following summary types, or combine multiple prompts:
-
-| Summary type | User prompt |
-| ----------------- | ------------------------------------------------------------------------------------------------------------------------------- |
-| Executive summary | Provide a brief executive summary (2-3 sentences) of the key outcomes and decisions from this transcript. |
-| Detailed summary | Provide a detailed summary of the transcript, covering all major topics, discussions, and outcomes in paragraph form. |
-| Action items | List the specific action items that were assigned during this meeting. Include who is responsible for each item when mentioned. |
-| Key decisions | List the key decisions that were made during this meeting. Focus on concrete decisions and outcomes. |
-| Participants | List the participants mentioned in this transcript. Include their roles or titles when available. |
-| Topics discussed | List the main topics and subjects that were discussed in this meeting. |
-
-
- **Example inputs and outputs:**
-
- | Title | Input meeting | Model output |
- | --------------------------- | ----------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------- |
- | Budget planning | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/meeting1.txt) | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/output1.txt) |
- | Design review | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/meeting2.txt) | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/output2.txt) |
- | Coffee chat / social hour | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/meeting3.txt) | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/output3.txt) |
- | Procurement / vendor review | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/meeting4.txt) | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/output4.txt) |
- | Task force meeting | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/meeting5.txt) | [Link](https://huggingface.co/LiquidAI/LFM2-2.6B-Transcript/resolve/main/examples/output5.txt) |
-
- **System Prompt:**
-
- ```
- You are an expert meeting analyst. Analyze the transcript carefully
- and provide clear, accurate information based on the content.
- ```
-
- **User Prompt:**
-
- ```
- Provide a detailed summary of the transcript, covering all major topics,
- discussions, and outcomes in paragraph form.
-
- Title: Budget Planning Meeting
- Date: March 15, 2024
- Time: 2:00 PM
- Duration: 60 minutes
- Participants: Sarah Chen (Finance Director), Mike Johnson (Operations Manager), Lisa Park (HR Manager)
-
- ----------
-
- **Sarah Chen**: Good afternoon everyone. Let's start by reviewing our Q1 budget performance.
- **Mike Johnson**: Thanks Sarah. Operations came in 5% under budget this quarter.
- **Lisa Park**: HR expenses were slightly over due to the new hiring initiative, but we're on track for Q2.
- **Sarah Chen**: That's good to hear. For Q2, we need to allocate additional funds for the expansion project.
- **Mike Johnson**: I can provide a detailed breakdown by next week.
- **Lisa Park**: I'll coordinate with Mike on staffing needs for the expansion.
- ```
-
- **Assistant Response:**
-
- ```
- The Budget Planning Meeting on March 15, 2024, reviewed Q1 financial performance
- and planned for Q2. Operations performed well, coming in 5% under budget, while
- HR expenses were slightly over due to new hiring. The team discussed allocating
- additional funds for an expansion project in Q2, with Mike Johnson committing to
- provide a detailed breakdown by next week and Lisa Park coordinating staffing needs.
- ```
-
-
-**Try it yourself:** See the [meeting summarization cookbook example](https://github.com/Liquid4All/cookbook/tree/main/examples/meeting-summarization) for a complete implementation.
-
-⚠️ **The model is intended for single turn conversations with a specific format.**
+
+**Liquid Nanos** (task-specific models like LFM2-Extract, LFM2-RAG, LFM2-Tool, etc.) may have special prompting requirements and different generation parameters. For detailed usage guidelines, refer to the individual model cards on the [Liquid Nanos](/docs/models/liquid-nanos) page.
+
diff --git a/docs/models/complete-library.mdx b/docs/models/complete-library.mdx
index d1f8383..64a2abd 100644
--- a/docs/models/complete-library.mdx
+++ b/docs/models/complete-library.mdx
@@ -1,5 +1,5 @@
---
-title: "Complete Model Library"
+title: "Model Library"
description: "Liquid Foundation Models (LFMs) are a new class of multimodal architectures built for fast inference and on-device deployment. Browse all available models and formats here."
---
diff --git a/docs/models/lfm2-8b-a1b.mdx b/docs/models/lfm2-8b-a1b.mdx
index f41baae..1a99bc8 100644
--- a/docs/models/lfm2-8b-a1b.mdx
+++ b/docs/models/lfm2-8b-a1b.mdx
@@ -45,8 +45,15 @@ LFM2-8B-A1B is Liquid AI's Mixture-of-Experts model, combining 8B total paramete
**Install:**
```bash
+ pip install "transformers>=5.0.0" torch
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
+ ```bash
pip install git+https://github.com/huggingface/transformers.git@0c9a72e4576fe4c84077f066e585129c97bfd4e6 torch
```
+
**Download & Run:**
```python
diff --git a/docs/models/lfm2-vl-1.6b.mdx b/docs/models/lfm2-vl-1.6b.mdx
index db66865..b331a7d 100644
--- a/docs/models/lfm2-vl-1.6b.mdx
+++ b/docs/models/lfm2-vl-1.6b.mdx
@@ -31,8 +31,15 @@ LFM2-VL-1.6B was the original 1.6B vision-language model. It has been superseded
**Install:**
```bash
+ pip install "transformers>=5.0.0" pillow torch
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
+ ```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow torch
```
+
**Download & Run:**
```python
@@ -86,9 +93,16 @@ LFM2-VL-1.6B was the original 1.6B vision-language model. It has been superseded
pip install git+https://github.com/vllm-project/vllm.git
```
+ ```bash
+ pip install "transformers>=5.0.0" pillow
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow
```
+
**Run:**
```python
diff --git a/docs/models/lfm2-vl-3b.mdx b/docs/models/lfm2-vl-3b.mdx
index 7f5c71c..009981f 100644
--- a/docs/models/lfm2-vl-3b.mdx
+++ b/docs/models/lfm2-vl-3b.mdx
@@ -45,8 +45,15 @@ LFM2-VL-3B is Liquid AI's highest-capacity multimodal model, delivering enhanced
**Install:**
```bash
+ pip install "transformers>=5.0.0" pillow torch
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
+ ```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow torch
```
+
**Download & Run:**
```python
@@ -100,9 +107,16 @@ LFM2-VL-3B is Liquid AI's highest-capacity multimodal model, delivering enhanced
pip install git+https://github.com/vllm-project/vllm.git
```
+ ```bash
+ pip install "transformers>=5.0.0" pillow
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow
```
+
**Run:**
```python
diff --git a/docs/models/lfm2-vl-450m.mdx b/docs/models/lfm2-vl-450m.mdx
index 4393517..ddaa038 100644
--- a/docs/models/lfm2-vl-450m.mdx
+++ b/docs/models/lfm2-vl-450m.mdx
@@ -45,8 +45,15 @@ LFM2-VL-450M is Liquid AI's smallest vision-language model, designed for edge de
**Install:**
```bash
+ pip install "transformers>=5.0.0" pillow torch
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
+ ```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow torch
```
+
**Download & Run:**
```python
@@ -100,9 +107,16 @@ LFM2-VL-450M is Liquid AI's smallest vision-language model, designed for edge de
pip install git+https://github.com/vllm-project/vllm.git
```
+ ```bash
+ pip install "transformers>=5.0.0" pillow
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow
```
+
**Run:**
```python
diff --git a/docs/models/lfm25-vl-1.6b.mdx b/docs/models/lfm25-vl-1.6b.mdx
index db79510..61cbccf 100644
--- a/docs/models/lfm25-vl-1.6b.mdx
+++ b/docs/models/lfm25-vl-1.6b.mdx
@@ -46,8 +46,15 @@ LFM2.5-VL-1.6B is Liquid AI's flagship vision-language model, delivering excepti
**Install:**
```bash
+ pip install "transformers>=5.0.0" pillow torch
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
+ ```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow torch
```
+
**Download & Run:**
```python
@@ -101,9 +108,16 @@ LFM2.5-VL-1.6B is Liquid AI's flagship vision-language model, delivering excepti
pip install git+https://github.com/vllm-project/vllm.git
```
+ ```bash
+ pip install "transformers>=5.0.0" pillow
+ ```
+
+
+ Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
```bash
pip install git+https://github.com/huggingface/transformers.git@3c2517727ce28a30f5044e01663ee204deb1cdbe pillow
```
+
**Run:**
```python
diff --git a/examples/index.mdx b/examples/index.mdx
index 192499f..b6c0e61 100644
--- a/examples/index.mdx
+++ b/examples/index.mdx
@@ -1,5 +1,5 @@
---
-title: "Guides Library"
+title: "Examples Library"
---
diff --git a/leap/edge-sdk/android/cloud-ai-comparison.mdx b/leap/edge-sdk/android/cloud-ai-comparison.mdx
index 87a039f..7f6ca41 100644
--- a/leap/edge-sdk/android/cloud-ai-comparison.mdx
+++ b/leap/edge-sdk/android/cloud-ai-comparison.mdx
@@ -162,4 +162,4 @@ lifecycleScope.launch {
## Next steps
-For more information, please refer to the [quick start guide](android-quick-start-guide) and [API reference](android-api-spec).
+For more information, please refer to the [quick start guide](./android-quick-start-guide).
diff --git a/leap/edge-sdk/android/constrained-generation.mdx b/leap/edge-sdk/android/constrained-generation.mdx
index 5dab90e..f01220e 100644
--- a/leap/edge-sdk/android/constrained-generation.mdx
+++ b/leap/edge-sdk/android/constrained-generation.mdx
@@ -4,7 +4,7 @@ description: "Generate structured JSON output with compile-time validation using
sidebar_position: 3
---
-Setting the `jsonSchemaConstraint` field in [`GenerationOptions`](./android-api-spec#generationoptions) will enable constrained generation. While it is possible to
+Setting the `jsonSchemaConstraint` field in [`GenerationOptions`](./conversation-generation#generationoptions) will enable constrained generation. While it is possible to
directly set the constraint with raw JSON Schema strings, we recommend to create the constraints with the `Generatable` annotation.
## `Generatable` annotation
diff --git a/leap/edge-sdk/android/conversation-generation.mdx b/leap/edge-sdk/android/conversation-generation.mdx
index 540a402..7b80ef8 100644
--- a/leap/edge-sdk/android/conversation-generation.mdx
+++ b/leap/edge-sdk/android/conversation-generation.mdx
@@ -74,7 +74,7 @@ Register a function for the model to invoke. See [function calling](./function-c
Export the whole conversation history into a `JSONArray`. Each element can be interpreted as a `ChatCompletionRequestMessage` instance in OpenAI API schema.
-See also: [Gson Support](../utilities#gson-support).
+See also: [Gson Support](./utilities#gson-support).
### Cancellation of the generation
@@ -129,12 +129,12 @@ Fields
- `topP`: Nucleus sampling parameter.In nucleus sampling, the model only considers the results of the tokens with `topP` probability mass.
- `minP`: Minimal possibility for a token to be considered in generation.
- `repetitionPenalty`: Repetition penalty parameter. A positive value will decrease the model's likelihood to repeat the same line verbatim.
-- `jsonSchemaConstraint`: Enable constrained generation with a [JSON Schema](https://json-schema.org/). See [constrained generation](../advanced-features#constrained-generation) for more details.
+- `jsonSchemaConstraint`: Enable constrained generation with a [JSON Schema](https://json-schema.org/). See [constrained generation](./constrained-generation) for more details.
- `functionCallParser`: Define the parser for function calling requests from the model. See [function calling](./function-calling) guide for more details.
Methods
-- `setResponseFormatType`: Enable constrained generation with a `Generatable` data class. See [constrained generation](../advanced-features#constrained-generation) for more details.
+- `setResponseFormatType`: Enable constrained generation with a `Generatable` data class. See [constrained generation](./constrained-generation) for more details.
Kotlin builder function `GenerationOptions.build` is also available. For example,
diff --git a/leap/edge-sdk/android/function-calling.mdx b/leap/edge-sdk/android/function-calling.mdx
index 0afffff..0bfdc81 100644
--- a/leap/edge-sdk/android/function-calling.mdx
+++ b/leap/edge-sdk/android/function-calling.mdx
@@ -10,7 +10,7 @@ Not all models support function calling. Please check the model card before usin
## Register functions to conversations
-To enable function calling, function definitions should be registered to the [`Conversation`](./android-api-spec#conversation) instance before content generation.
+To enable function calling, function definitions should be registered to the [`Conversation`](./conversation-generation#conversation) instance before content generation.
`Conversation.registerFunction` takes a `LeapFunction` instance as the input, which describes the name, parameters and ability of the function.
```kotlin
diff --git a/leap/edge-sdk/android/messages-content.mdx b/leap/edge-sdk/android/messages-content.mdx
index 7335527..94b5347 100644
--- a/leap/edge-sdk/android/messages-content.mdx
+++ b/leap/edge-sdk/android/messages-content.mdx
@@ -31,7 +31,7 @@ ChatMessage.fromJSONObject(obj: JSONObject): ChatMessage
Return a `JSONObject` that represents the chat message. The returned object is compatible with `ChatCompletionRequestMessage` from OpenAI API. It contains 2 fields: `role` and `content` .
-See also: [Gson Support](../utilities#gson-support).
+See also: [Gson Support](./utilities#gson-support).
### `fromJSONObject`
@@ -42,7 +42,7 @@ Construct a `ChatMessage` instance from a `JSONObject`. Not all JSON object vari
message.
-See also: [Gson Support](../utilities#gson-support).
+See also: [Gson Support](./utilities#gson-support).
### `ChatMessage.Role`
diff --git a/leap/edge-sdk/android/model-loading.mdx b/leap/edge-sdk/android/model-loading.mdx
index 0691404..09c54dc 100644
--- a/leap/edge-sdk/android/model-loading.mdx
+++ b/leap/edge-sdk/android/model-loading.mdx
@@ -30,7 +30,7 @@ Download a model from the LEAP Model Library and load it into memory. If the mod
**Returns**
-`ModelRunner`: A [`ModelRunner`](../conversation-generation#modelrunner) instance that can be used to interact with the loaded model.
+`ModelRunner`: A [`ModelRunner`](./conversation-generation#modelrunner) instance that can be used to interact with the loaded model.
### `downloadModel`
@@ -126,7 +126,7 @@ It is rarely necessary to instantiate a `Manifest` class directly. It is created
### `loadModel`
- This function can be called from UI thread. The app should hold the `ModelRunner` object returned by this function until there is no need to interact with the model anymore. See [`ModelRunner`](../conversation-generation#modelrunner) for more details.
+ This function can be called from UI thread. The app should hold the `ModelRunner` object returned by this function until there is no need to interact with the model anymore. See [`ModelRunner`](./conversation-generation#modelrunner) for more details.
**Arguments**
diff --git a/leap/edge-sdk/android/utilities.mdx b/leap/edge-sdk/android/utilities.mdx
index 09903ad..8448780 100644
--- a/leap/edge-sdk/android/utilities.mdx
+++ b/leap/edge-sdk/android/utilities.mdx
@@ -26,8 +26,8 @@ dependencies {
The following types are supported:
-- [`ChatMessage`](../messages-content#chatmessage)
-- [`ChatMessageContent`](../messages-content#chatmessagecontent)
+- [`ChatMessage`](./messages-content#chatmessage)
+- [`ChatMessageContent`](./messages-content#chatmessagecontent)
### Create Gson Object
@@ -42,7 +42,7 @@ val gson = GsonBuilder().registerLeapAdapters().create()
### Serializing and Deserializing Conversation History
-With a [`Conversation`](../conversation-generation#conversation) object, simply call `Gson.toJson` to convert the chat message history into a JSON string. The returned JSON will be an array.
+With a [`Conversation`](./conversation-generation#conversation) object, simply call `Gson.toJson` to convert the chat message history into a JSON string. The returned JSON will be an array.
```kotlin
val json = gson.toJson(conversation.history)
@@ -59,7 +59,7 @@ val chatHistory: List = gson.fromJson(json, LeapGson.messageListTyp
## Model Downloader (deprecated)
-This module is deprecated and will be removed in the near future. To download models using the Edge SDK, see [`LeapDownloader`](../model-loading#leapdownloader).
+This module is deprecated and will be removed in the near future. To download models using the Edge SDK, see [`LeapDownloader`](./model-loading#leapdownloader).
LeapSDK Android Model Downloader module is a helper for downloading models from Leap Model Library.
diff --git a/leap/edge-sdk/ios/cloud-ai-comparison.mdx b/leap/edge-sdk/ios/cloud-ai-comparison.mdx
index 6beacc9..e4419ac 100644
--- a/leap/edge-sdk/ios/cloud-ai-comparison.mdx
+++ b/leap/edge-sdk/ios/cloud-ai-comparison.mdx
@@ -178,4 +178,4 @@ func sendMessage(_ text: String) {
## Next steps
-For more information, please refer to the [quick start guide](./ios-quick-start-guide) and [API reference](ios-api-spec).
+For more information, please refer to the [quick start guide](./ios-quick-start-guide).
diff --git a/leap/edge-sdk/ios/function-calling.mdx b/leap/edge-sdk/ios/function-calling.mdx
index ceb86a6..401e153 100644
--- a/leap/edge-sdk/ios/function-calling.mdx
+++ b/leap/edge-sdk/ios/function-calling.mdx
@@ -17,7 +17,7 @@ your messages and tool responses.
## Register functions to conversations
-To enable function calling, function definitions should be registered to the [`Conversation`](./ios-api-spec#conversation) instance before content generation.
+To enable function calling, function definitions should be registered to the [`Conversation`](./conversation-generation#conversation) instance before content generation.
`Conversation.registerFunction` takes a `LeapFunction` instance as the input, which describes the name, parameters and ability of the function.
```swift
diff --git a/leap/edge-sdk/ios/model-loading.mdx b/leap/edge-sdk/ios/model-loading.mdx
index cc36861..4c56242 100644
--- a/leap/edge-sdk/ios/model-loading.mdx
+++ b/leap/edge-sdk/ios/model-loading.mdx
@@ -33,7 +33,7 @@ public struct Leap {
**Returns**
-`ModelRunner`: A [`ModelRunner`](../conversation-generation#modelrunner) instance that can be used to interact with the loaded model.
+`ModelRunner`: A [`ModelRunner`](./conversation-generation#modelrunner) instance that can be used to interact with the loaded model.
diff --git a/quickstarts/LFM2-8B-A1B__transformers.md b/quickstarts/LFM2-8B-A1B__transformers.md
index 2061940..b891bd5 100644
--- a/quickstarts/LFM2-8B-A1B__transformers.md
+++ b/quickstarts/LFM2-8B-A1B__transformers.md
@@ -11,9 +11,14 @@ Perfect for research, prototyping, and quick experimentation in Jupyter notebook
## Install Python dependencies
```shell
-pip install git+https://github.com/huggingface/transformers.git@0c9a72e4576fe4c84077f066e585129c97bfd4e6 bitsandbytes
+pip install "transformers>=5.0.0" bitsandbytes
```
+> **Note:** Transformers v5 is newly released. If you encounter issues, fall back to the pinned git source:
+> ```shell
+> pip install git+https://github.com/huggingface/transformers.git@0c9a72e4576fe4c84077f066e585129c97bfd4e6 bitsandbytes
+> ```
+
## Run inference
```python
diff --git a/style.css b/style.css
index 712f910..4814ff5 100644
--- a/style.css
+++ b/style.css
@@ -62,6 +62,12 @@ a[href*="discord.gg"]:hover img {
background-color: #864bc4 !important;
}
+/* Hide external link arrow on Discord card */
+a[href*="discord"] svg[class*="arrow"],
+a[href*="discord.gg"] svg[class*="arrow"] {
+ display: none !important;
+}
+
/* Light mode LEFT sidebar - selected item styling */
/* Target links with bg-primary-light class (Mintlify's active state) */
:root:not(.dark) a[class*="bg-primary-light"],
diff --git a/styles.js b/styles.js
index 302e5e1..48727bc 100644
--- a/styles.js
+++ b/styles.js
@@ -84,6 +84,12 @@
background-color: #864bc4 !important;
}
+ /* Hide external link arrow on Discord card */
+ a[href*="discord"] svg[class*="arrow"],
+ a[href*="discord.gg"] svg[class*="arrow"] {
+ display: none !important;
+ }
+
/* Light mode LEFT sidebar - selected item styling */
/* Target links with bg-primary-light class (Mintlify's active state) */
:root:not(.dark) a[class*="bg-primary-light"],