wso2 · SashenkaG · Feb 16, 2026 · Feb 18, 2026 · Feb 18, 2026 · Feb 18, 2026
@@ -1,7 +1,4 @@
----
-title: RAG Application
-description: Learn about Retrieval-Augmented Generation (RAG) ingestion and retrieval in Devant.
----
+# RAG Ingestion
 
 ## Introduction
 
@@ -10,13 +7,14 @@ Retrieval-Augmented Generation (RAG) is an AI framework that improves how Large
 RAG works through two main processes, which are ingestion and retrieval.
 
 ## RAG ingestion
-To make use of RAG effectively, data must be systematically ingested into vector databases. This process, known as RAG Ingestion, involves setting up a vector database, utilizing embedding models, processing source files and chunking data.
+To make use of RAG effectively, data must be systematically ingested into vector databases. This process, known as RAG ingestion, involves setting up a vector database, utilizing embedding models, processing source files and chunking data.
 Devant offers a platform to efficiently ingest and manage unstructured documents for RAG.
-This guide walks through the key steps of RAG Ingestion in Devant.
+This guide walks through the key steps of RAG ingestion in Devant.
+Devant RAG ingestion supports multiple file types including PDFs (including scanned PDFs), DOCX, PPTX, XLSX, CSV, HTML, MD, images, and audio files (MP3, WAV, M4A, FLAC, OGG).
 
-Go to your Organization by selecting the organization from the **Organization** dropdown in the top left corner. Select **RAG Ingestion** from the **Admin** dropdown at the bottom of the left navigation. 
+Navigate to your organization using the **Organization** dropdown in the top left of the Devant console header. In the left navigation menu, click **RAG**, then select **Ingestion**.
 
-### Step 1: Initialize Vector Store
+### Step 1: Initialize vector store
 
 LLMs receive contextual information as numerical vectors (embeddings). A vector database stores these embeddings for efficient retrieval.
 Devant supports a wide range of vector databases like Pinecone, Weaviate, Chroma, and so on. 
@@ -25,22 +23,22 @@ Devant supports a wide range of vector databases like Pinecone, Weaviate, Chroma
 2. Enter the API key in the **API Key** field.
 
     ???+ info "Info"
-        To create an API key, refer to the [Pinecone API Key documentation](https://docs.pinecone.io/guides/projects/manage-api-keys#create-an-api-key).
+        To create an API key, refer to the [Pinecone API key documentation](https://docs.pinecone.io/guides/projects/manage-api-keys#create-an-api-key).
 
 3. Enter the **Collection Name**. The collection will be automatically created if it does not exist.
 4. Click **Next**.
 
-### Step 2: Configure the Embedding Model
+### Step 2: Configure the embedding model
 
 1. Select `text-embedding-ada-002` embedding model from the **Open AI** dropdown.
 2. Enter the API key in the **Embedding Model API Key** field.
 
     ???+ info "Info"
-        To create an API key, refer to the [OpenAI Platform documentation](https://platform.openai.com/docs/guides/embeddings).
+        To create an API key, refer to the [OpenAI platform documentation](https://platform.openai.com/docs/guides/embeddings).
 
 3. Click **Next**.
 
-### Step 3: Configure Chunking
+### Step 3: Configure chunking
 
 Chunking is used to break large documents into manageable parts because processing them all at once is not feasible.
 **Chunking strategy**, **Max segment size**, and **Max overlap size** are automatically populated with default values. You can modify them if needed.
@@ -50,7 +48,16 @@ Chunking is used to break large documents into manageable parts because processi
     - **Max segment size** determines the maximum length of tokens for each chunk.
     - **Max overlap size** defines how many tokens repeat between consecutive chunks.
 
-### Step 4: Upload Source Files
+ ![RAG ingestion](../assets/img/ai/rag-application/rag-ingestion1.gif)
+
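To make the relationship between **Max segment size** and **Max overlap size** concrete, here is a minimal sketch of a fixed-size sliding-window chunker. It is an illustration only, not Devant's chunking implementation: the function name `chunk_tokens` is hypothetical, and whitespace-separated words stand in for model tokens.

```python
def chunk_tokens(tokens, max_segment_size, max_overlap_size):
    """Split tokens into chunks of at most max_segment_size items,
    repeating max_overlap_size items between consecutive chunks."""
    if max_segment_size <= 0:
        raise ValueError("max_segment_size must be positive")
    if not 0 <= max_overlap_size < max_segment_size:
        raise ValueError("max_overlap_size must be in [0, max_segment_size)")

    # Each new chunk starts this many tokens after the previous one.
    step = max_segment_size - max_overlap_size
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + max_segment_size])
        # Stop once a chunk has reached the end of the input.
        if start + max_segment_size >= len(tokens):
            break
    return chunks


# Assumption: words are used as a stand-in for tokens.
words = "one two three four five six seven eight nine ten".split()
chunks = chunk_tokens(words, max_segment_size=4, max_overlap_size=1)
for chunk in chunks:
    print(chunk)
```

With a segment size of 4 and an overlap of 1, the last word of each chunk is repeated as the first word of the next, which helps preserve context across chunk boundaries.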
+### Step 4: Choose ingestion mode
+
+Choose how you want to perform RAG ingestion:
+
+- **Upload Now**: Instantly upload and ingest files into your vector store. The steps below will guide you through the **Upload Now** workflow for immediate ingestion.
+- **Schedule RAG Ingestion**: Set up automated, scheduled ingestion from a selected data source. For step-by-step instructions, refer to the [Schedule Automation](schedule-rag-automation.md) guide.
+
+### Step 5: Upload source files
 
 Next, upload your source files (e.g., PDFs, CSVs, or text documents) for processing.
 
@@ -61,7 +68,9 @@ Next, upload your source files (e.g., PDFs, CSVs, or text documents) for process
     !!! note
         When you click **Upload** it will generate embeddings for the uploaded files and store them in the vector database.
 
-### Step 5: Verify
+ ![RAG ingestion](../assets/img/ai/rag-application/rag-ingestion2.gif)
+
+### Step 6: Verify
 
 Once processing is complete, execute test queries to ensure proper data retrieval.
 
@@ -72,15 +81,9 @@ Once processing is complete, execute test queries to ensure proper data retrieva
         - **Maximum chunks to retrieve** defines the number of matching chunks to retrieve against the query.
         - **Minimum similarity threshold** determines whether a chunk is relevant enough to be considered a match for a given query. Expressed as a value between 0 and 1 (for example, 0.7 or 70% similarity).
 
-3. Click **Retrieve**. The search results will display the chunks that match the query.
-
-    <a href="{{base_path}}/assets/img/ai/rag-application/rag-ingestion.gif"><img src="{{base_path}}/assets/img/ai/rag-application/rag-ingestion.gif" alt="RAG Ingestion" width="80%"></a>
+3. Click **Retrieve**. The search results will display the chunks that match your query.
 
-!!! note  
+ ![RAG ingestion](../assets/img/ai/rag-application/rag-ingestion3.gif)
+
+!!! note
     Follow this detailed tutorial [video](https://www.youtube.com/watch?v=8GlrHYS-EYI&list=PLp0TUr0bmhX4colDnjhEKAnZ3RmjCv5y2&ab_channel=WSO2) to understand how to set up the RAG ingestion and create your vector index.
-
-## RAG retrieval
-
-After completing the RAG ingestion process, you need to implement a rag retrieval to connect your vector database with user queries and generate responses.
-
-For detailed implementation steps and configuration, refer to the [RAG retrieval](https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval) tutorial in the WSO2 Integrator: BI documentation.
@@ -0,0 +1,52 @@
+# RAG Retrieval
+
+## Introduction
+Retrieval-Augmented Generation (RAG) retrieval is the process of searching a vector database for the most relevant information in response to a user query.
+
+!!! note 
+    - This guide assumes you have already ingested files into your vector store. If you haven't, follow the [Ingestion](rag-ingestion.md) guide first.
+
+To retrieve chunks that have already been ingested (without uploading new files), navigate to your organization using the **Organization** dropdown in the top left of the Devant console header. In the left navigation menu, click **RAG**, then select **Retrieval**.
+
+### Step 1: Initialize vector store
+
+1. Select `Pinecone` as the vector database.
+2. Enter the API key in the **API Key** field.
+
+    ???+ info "Info"
+        To create an API key, refer to the [Pinecone API key documentation](https://docs.pinecone.io/guides/projects/manage-api-keys#create-an-api-key).
+
+3. Enter the **Collection Name** from which you want to retrieve data.
+4. Click **Next**.
+
+### Step 2: Configure the embedding model
+
+1. Select `text-embedding-ada-002` embedding model from the **OpenAI** dropdown.
+2. Enter the API key in the **Embedding Model API Key** field.
+
+    ???+ info "Info"
+        To create an API key, refer to the [OpenAI platform documentation](https://platform.openai.com/docs/guides/embeddings).
+
+3. Click **Next**.
+
+### Step 3: Query and retrieve chunks
+
+Execute queries to ensure proper data retrieval.
+
+1. Enter a query according to the content of the files ingested previously.
+2. **Maximum chunks to retrieve** and **Minimum similarity threshold** are automatically populated with default values. You can modify them if needed.
+
+    ???+ info "Info"
+        - **Maximum chunks to retrieve** defines the number of matching chunks to retrieve against the query.
+        - **Minimum similarity threshold** determines whether a chunk is relevant enough to be considered a match for a given query. Expressed as a value between 0 and 1 (for example, 0.7 or 70% similarity).
+
+3. Click **Retrieve**. The search results will display the chunks that match the query.
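The two retrieval settings above can be sketched as a filter followed by a top-k cut. This is a simplified illustration under stated assumptions: it uses cosine similarity over toy two-dimensional vectors, the names `cosine_similarity` and `retrieve` are hypothetical, and it does not model the similarity metric or reranking that Devant actually applies.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, max_chunks, min_similarity):
    """Return up to max_chunks (score, text) pairs whose score is at
    least min_similarity, ordered from most to least similar."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    matches = [pair for pair in scored if pair[0] >= min_similarity]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return matches[:max_chunks]


# Toy embeddings; real embeddings have hundreds or thousands of dimensions.
stored = [
    ("chunk A", [1.0, 0.0]),
    ("chunk B", [0.8, 0.6]),
    ("chunk C", [0.0, 1.0]),
]
results = retrieve([1.0, 0.0], stored, max_chunks=2, min_similarity=0.7)
for score, text in results:
    print(f"{text}: {score:.2f}")
```

Raising the threshold narrows the results to closer matches, while lowering the maximum chunk count caps how many of those matches are returned.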
+
+???+ info "Info"
+     - Devant's retrieval process uses a reranking model to ensure that only the most accurate and contextually relevant chunks are returned.
+
+![RAG retrieval](../assets/img/ai/rag-application/rag-retrieval.gif)
+
+After completing the RAG ingestion process, you can also implement RAG retrieval with WSO2 Integrator: BI to connect your vector database with user queries and generate responses.
+
+For detailed implementation steps and configuration, refer to the [RAG retrieval](https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval) tutorial in the WSO2 Integrator: BI documentation.
@@ -0,0 +1,152 @@
+# RAG Service
+
+## Introduction
+Devant provides a set of RESTful API endpoints for Retrieval-Augmented Generation (RAG) workflows. These endpoints enable you to ingest, retrieve, and process documents programmatically.
+
+### Step 1: Create service
+Navigate to your organization using the **Organization** dropdown in the top left of the Devant console header. In the left navigation menu, click **RAG**, then select **Service**.
+
+Fill in the required fields in the **Create RAG Ingestion Service** form to set up a new RAG service in Devant.
+
+| **Field**        | **Value**         |
+| ---------------- | ----------------- |
+| **Project**      | Select the target project from the dropdown, which lists the available projects in your organization |
+| **Display Name** | Sample RAG Service  |
+| **Name**   | sample-rag-service  |
+| **Description (Optional)**  | My RAG service description |
+
+Click the **Create Service** button and your service will be created.
+
+!!! note
+    - When the service is created, Devant automatically increases the container resources (CPU and memory) to ensure reliable operation.
+
+### Step 2: Test endpoints
+Once the component is created, you will be redirected to the Overview page.
+
+1. On the development environment card, click **Test** to open the OpenAPI Console, where you will be able to try out all the available endpoints from the endpoint list.
+2. Expand the resource you want to test.
+3. Click **Try it out** to enable it.
+4. Provide values for the parameters.
+5. Click **Execute**. The response will be displayed under the **Responses** section.
+
+!!! note 
+    - Some parameters are automatically populated with default values. You can modify them as needed.
+
+![RAG Service](../assets/img/ai/rag-application/rag-service.gif)    
+
+### Available API endpoints
+The image below shows how the available endpoints are listed on the **Console** page. You can expand each endpoint and try it out as needed.
+
+![RAG Service](../assets/img/ai/rag-application/rag-service.png)  
+
+#### **POST `/upload`**
+
+Upload a file and ingest it into your vector store. Supports PDF (including scanned PDFs), DOCX, PPTX, XLSX, CSV, HTML, MD, images, and audio (MP3, WAV, M4A, FLAC, OGG).
+
+**Required in the request:**
+
+- File to upload. 
+- Vector DB provider (e.g., Pinecone, Chroma, Weaviate, Postgres) and connection/API key details.
+- Collection name: where data will be stored.
+- Embedding model provider and model (e.g., OpenAI, Azure, Mistral) and API key.
+- Chunking strategy: `recursive`, `sentence`, or `character`.
+- Max chunk size and max overlap size for chunking.
+
+**Expected response:**
+
+Returns a JSON object indicating successful ingestion, including the file name and type. 
+
+```json
+{
+    "message": "Added data to vector store successfully",
+    "filename": "example.pdf",
+    "file_type": "document"
+}
+```
+
+---
+
+#### **POST `/retrieve`**
+
+Retrieve relevant chunks from your vector store based on a user query. Supports semantic search and optional reranking with Cohere.
+
+**Required in the request:**
+
+- Vector DB provider and connection/API key details.
+- Name of the collection from which you want to retrieve chunks.
+- Embedding model provider and model, and API key.
+- User query for which you want to retrieve chunks.
+- Max number of chunks to retrieve and minimum similarity threshold.
+
+**Optional:**
+
+- Cohere re-ranking model and API key (if using reranking) and the number of top results to rerank.
+
+???+ info "Info" 
+    - To create a Cohere API key, refer to the [Cohere documentation](https://dashboard.cohere.com/api-keys).
+
+**Expected response:**
+
+Returns a JSON object containing the user query and an array of retrieved chunks. Each chunk includes the content, the file it is from, and a timestamp.
+
+```json
+{
+    "query": "What is Devant?",
+    "retrieved_chunks": [
+        {
+            "text": "Devant is ...",
+            "source": "example.pdf",
+            "timestamp": "2026-02-16T12:02:25.076312"
+        },
+        ...
+    ]
+}
+```
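As a rough illustration of consuming this response, the sketch below parses a body shaped like the example above and groups the retrieved text by source file. Only the field names shown in the example response are used; the helper name `group_chunks_by_source` is hypothetical, and the sample body is hard-coded rather than fetched from a live service.

```python
import json
from collections import defaultdict
from datetime import datetime

# Sample body mirroring the documented response shape.
sample_response = """
{
    "query": "What is Devant?",
    "retrieved_chunks": [
        {
            "text": "Devant is ...",
            "source": "example.pdf",
            "timestamp": "2026-02-16T12:02:25.076312"
        }
    ]
}
"""


def group_chunks_by_source(response_body):
    """Map each source file name to the list of chunk texts retrieved from it."""
    payload = json.loads(response_body)
    grouped = defaultdict(list)
    for chunk in payload.get("retrieved_chunks", []):
        grouped[chunk["source"]].append(chunk["text"])
    return dict(grouped)


grouped = group_chunks_by_source(sample_response)
print(grouped)

# The timestamp in the example is ISO 8601, so the standard library can parse it.
first = json.loads(sample_response)["retrieved_chunks"][0]
ingested_at = datetime.fromisoformat(first["timestamp"])
print(ingested_at.year)
```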
+
+---
+
+#### **POST `/chunks`**
+
+Parse and chunk an uploaded file, returning the chunks as a JSON array. Does not store data in the vector DB.
+
+This endpoint accepts all the same file types as the `/upload` endpoint, including PDF (with scanned PDF support), DOCX, PPTX, XLSX, CSV, HTML, MD, images, and audio files (MP3, WAV, M4A, FLAC, OGG).
+
+**Required in the request:**
+
+- File to upload.
+- Chunk type: `recursive`, `sentence`, or `character`.
+- Max chunk size and max overlap size.
+
+**Expected response:**
+
+Returns a JSON object containing the file name and an array of chunks. Each chunk includes a chunk ID and its content.
+
+```json
+{
+    "filename": "example.pdf",
+    "chunks": [
+        { "chunk_id": 0, "content": "First chunk content..." },
+        { "chunk_id": 1, "content": "Second chunk content..." }
+    ]
+}
+```
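Because `/chunks` does not write to the vector DB, its output can be used to preview how a file would be split before uploading it. The sketch below is one possible way to inspect such a response, assuming a body shaped like the example above; the helper name `summarize_chunks` is hypothetical and the sample body is hard-coded.

```python
import json

# Sample body mirroring the documented response shape.
sample_response = """
{
    "filename": "example.pdf",
    "chunks": [
        { "chunk_id": 1, "content": "Second chunk content..." },
        { "chunk_id": 0, "content": "First chunk content..." }
    ]
}
"""


def summarize_chunks(response_body):
    """Return the file name, chunk count, and chunk contents ordered by chunk_id."""
    payload = json.loads(response_body)
    ordered = sorted(payload["chunks"], key=lambda chunk: chunk["chunk_id"])
    return {
        "filename": payload["filename"],
        "count": len(ordered),
        "contents": [chunk["content"] for chunk in ordered],
    }


summary = summarize_chunks(sample_response)
for index, content in enumerate(summary["contents"]):
    # Character counts give a quick feel for how large each chunk is.
    print(index, len(content), content)
```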
+
+---
+
+#### **GET `/health`**
+
+Health check endpoint. 
+
+**Expected response:**
+
+Returns a JSON object indicating the service status.
+
+```json
+{
+    "status": "ok"
+}
+```
+
+???+ info "Info"
+    - For more details on RAG ingestion and retrieval, and how to obtain API keys and credentials, refer to the [RAG Ingestion](rag-ingestion.md) guide.
+
@@ -0,0 +1,66 @@
+# Schedule RAG Automation
+
+## Introduction
+
+Devant provides a platform to automate document ingestion on a schedule and manage unstructured data for Retrieval-Augmented Generation (RAG) workflows.
+
+This guide outlines the steps to set up scheduled RAG ingestion automation in Devant.
+
+!!! note
+    - The **Schedule RAG Ingestion** option is available only to paid Devant users.
+    - This guide assumes you have completed steps 1 to 4 in the [RAG ingestion](rag-ingestion.md) guide.
+
+After selecting **Schedule RAG Ingestion** as the ingestion mode, follow these steps:
+
+### Step 1: Create automation
+Fill in the required fields in the details form to create an automation in Devant for scheduled RAG ingestion.
+
+| **Field**        | **Value**         |
+| ---------------- | ----------------- |
+| **Project**      | Select the target project from the dropdown, which lists the available projects in your organization |
+| **Display Name** | Sample Automation   |
+| **Name**   | sample-automation   |
+| **Description (Optional)**  | My sample automation description |
+
+![RAG schedule](../assets/img/ai/rag-application/rag-schedule1.gif)
+
+### Step 2: Configure datasource 
+
+The datasource specifies the location from which files will be ingested. Devant supports both Google Drive folders and Amazon S3 buckets as datasources.
+
+1. Select `Google Drive` as the datasource.
+
+2. Enter the API key in the **API Key** field.
+
+    ???+ info "Info"
+        To obtain a key, create a project in the [Google Cloud Console](https://console.cloud.google.com/), generate an API key as described in the [Google documentation](https://cloud.google.com/docs/authentication/api-keys#create), and restrict the key to the **Google Drive API**.
+
+        **Note:** The target folder must be **public** ("Anyone with the link"), as API keys cannot access private files.
+
+3. Provide the **Folder ID** of the Google Drive folder containing the files to be ingested.
+
+    ???+ info "Info"
+        The Google Drive folder ID can be found in the URL when viewing the folder in Google Drive. It is the string that appears after `/folders/` in the URL.
+
+4. Click **Create Automation** to complete the setup. You will be redirected to the automation overview page.
+
+    !!! note
+        - When a scheduled RAG ingestion automation is created, Devant automatically increases the container resources (CPU and memory) for the automation to ensure reliable operation.
+
+        - If you need to process very large files or expect high ingestion volumes, you can further scale your container resources in the **Containers** tab, available from the **Admin** dropdown at the bottom of the left navigation.
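The folder ID lookup described in step 3 can also be done programmatically. The sketch below extracts the path segment that follows `/folders/` from a Google Drive folder URL; the function name `extract_folder_id` and the placeholder ID `EXAMPLE_FOLDER_ID` are illustrative assumptions, not values from Devant or Google.

```python
import re
from urllib.parse import urlparse


def extract_folder_id(url):
    """Return the segment after /folders/ in a Drive folder URL, or None."""
    # Work on the path only so query strings such as ?usp=sharing are ignored.
    path = urlparse(url).path
    match = re.search(r"/folders/([^/]+)", path)
    return match.group(1) if match else None


# EXAMPLE_FOLDER_ID is a placeholder, not a real folder.
folder_url = "https://drive.google.com/drive/folders/EXAMPLE_FOLDER_ID?usp=sharing"
folder_id = extract_folder_id(folder_url)
print(folder_id)
```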
+
+![RAG schedule](../assets/img/ai/rag-application/rag-schedule2.gif)
+
+### Step 3: Schedule ingestion
+
+Once created, the automation is automatically deployed in the development environment with all previously entered configurations prefilled.
+
+- To trigger an immediate ingestion, click the **Test** button.
+- To schedule ingestion for a specific time interval, click the **Schedule** button and select your desired time.
+
+You can verify successful ingestion by reviewing the automation logs.
+
+![RAG schedule](../assets/img/ai/rag-application/rag-schedule3.gif)
+
+As shown below, you can run your ingestion workflow at specified intervals (e.g., every minute, hourly, daily, or monthly). During each scheduled run, the system detects new files in the data source and ingests them into the vector store.
+
+![RAG schedule](../assets/img/ai/rag-application/rag-schedule.png)
@@ -85,7 +85,11 @@ nav:
       - Explore the Demo Organization: devant-samples/explore-the-demo-organization.md
       - "Deploy with WSO2 Integrator: MI": devant-samples/deploy-with-wso2-integrator-mi.md
   - AI:
-      - RAG Application: ai/rag-application.md
+      - RAG:
+          - Ingestion: ai/rag-ingestion.md
+          - Retrieval: ai/rag-retrieval.md
+          - Schedule Automation: ai/schedule-rag-automation.md
+          - Service: ai/rag-service.md
       - 'Develop an AI Agent Created Using WSO2 Integrator: BI': ai/develop-an-ai-agent-created-using-wso2-integrator-bi.md
   - Devant Concepts:
       - Resource Hierarchy: devant-concepts/resource-hierarchy.md