51 changes: 27 additions & 24 deletions en/docs/ai/rag-application.md → en/docs/ai/rag-ingestion.md
Original file line number Diff line number Diff line change
@@ -1,7 +1,4 @@
---
title: RAG Application
description: Learn about Retrieval-Augmented Generation (RAG) ingestion and retrieval in Devant.
---
# RAG Ingestion

## Introduction

@@ -10,13 +7,14 @@ Retrieval-Augmented Generation (RAG) is an AI framework that improves how Large
RAG works through two main processes, which are ingestion and retrieval.

## RAG ingestion
To make use of RAG effectively, data must be systematically ingested into vector databases. This process, known as RAG Ingestion, involves setting up a vector database, utilizing embedding models, processing source files and chunking data.
To make use of RAG effectively, data must be systematically ingested into vector databases. This process, known as RAG ingestion, involves setting up a vector database, utilizing embedding models, processing source files and chunking data.
Devant offers a platform to efficiently ingest and manage unstructured documents for RAG.
This guide walks through the key steps of RAG Ingestion in Devant.
This guide walks through the key steps of RAG ingestion in Devant.
Devant RAG ingestion supports multiple file types including PDFs (including scanned PDFs), DOCX, PPTX, XLSX, CSV, HTML, MD, images, and audio files (MP3, WAV, M4A, FLAC, OGG).

Go to your Organization by selecting the organization from the **Organization** dropdown in the top left corner. Select **RAG Ingestion** from the **Admin** dropdown at the bottom of the left navigation.
Navigate to your organization using the **Organization** dropdown in the top left of the Devant console header. In the left navigation menu, click **RAG**, then select **Ingestion**.

### Step 1: Initialize Vector Store
### Step 1: Initialize vector store

LLMs receive contextual information as numerical vectors (embeddings). A vector database stores these embeddings for efficient retrieval.
Devant supports a wide range of vector databases like Pinecone, Weaviate, Chroma, and so on.
@@ -25,22 +23,22 @@ Devant supports a wide range of vector databases like Pinecone, Weaviate, Chroma
2. Enter the API key in the **API Key** field.

???+ info "Info"
To create an API key, refer to the [Pinecone API Key documentation](https://docs.pinecone.io/guides/projects/manage-api-keys#create-an-api-key).
To create an API key, refer to the [Pinecone API key documentation](https://docs.pinecone.io/guides/projects/manage-api-keys#create-an-api-key).

3. Enter the **Collection Name**. The collection will be automatically created if it does not exist.
4. Click **Next**.

### Step 2: Configure the Embedding Model
### Step 2: Configure the embedding model

1. Select `text-embedding-ada-002` embedding model from the **Open AI** dropdown.
2. Enter the API key in the **Embedding Model API Key** field.

???+ info "Info"
To create an API key, refer to the [OpenAI Platform documentation](https://platform.openai.com/docs/guides/embeddings).
To create an API key, refer to the [OpenAI platform documentation](https://platform.openai.com/docs/guides/embeddings).

3. Click **Next**.

### Step 3: Configure Chunking
### Step 3: Configure chunking

Chunking is used to break large documents into manageable parts because processing them all at once is not feasible.
**Chunking strategy**, **Max segment size**, and **Max overlap size** are automatically populated with default values. You can modify them if needed.
@@ -50,7 +48,16 @@ Chunking is used to break large documents into manageable parts because processi
- **Max segment size** determines the maximum length of tokens for each chunk.
- **Max overlap size** defines how many tokens repeat between consecutive chunks.
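
To illustrate how the two size settings interact, here is a minimal sketch of fixed-size chunking with overlap. It is not Devant's implementation: it splits on whitespace as a stand-in for real tokenization, and `chunk_tokens`, `max_segment_size`, and `max_overlap_size` are illustrative names chosen to mirror the form fields.

```python
def chunk_tokens(tokens, max_segment_size, max_overlap_size):
    """Split a token list into chunks of at most max_segment_size tokens,
    repeating max_overlap_size tokens between consecutive chunks."""
    if max_segment_size <= 0:
        raise ValueError("max_segment_size must be positive")
    if not 0 <= max_overlap_size < max_segment_size:
        raise ValueError("max_overlap_size must be in [0, max_segment_size)")
    # Each new chunk starts this many tokens after the previous one.
    step = max_segment_size - max_overlap_size
    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_segment_size])
        # Stop once the current chunk reaches the end of the input.
        if start + max_segment_size >= len(tokens):
            break
        start += step
    return chunks


# Whitespace splitting is an assumption; real tokenizers differ.
tokens = "the quick brown fox jumps over the lazy dog".split()
chunks = chunk_tokens(tokens, max_segment_size=4, max_overlap_size=1)
```

With these values, the last token of each chunk is repeated as the first token of the next one, which is what a non-zero overlap is meant to achieve.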

### Step 4: Upload Source Files
![RAG ingestion](../assets/img/ai/rag-application/rag-ingestion1.gif)

### Step 4: Choose ingestion mode

Suggested change

Choose how you want to perform RAG ingestion:

- **Upload Now**: Instantly upload and ingest files into your vector store. The steps below will guide you through the **Upload Now** workflow for immediate ingestion.
- **Schedule RAG Ingestion**: Set up automated, scheduled ingestion from a selected data source. For step-by-step instructions, refer to the [Schedule Automation](schedule-rag-automation.md) guide.

### Step 5: Upload source files

Next, upload your source files (e.g., PDFs, CSVs, or text documents) for processing.

@@ -61,7 +68,9 @@ Next, upload your source files (e.g., PDFs, CSVs, or text documents) for process
!!! note
When you click **Upload** it will generate embeddings for the uploaded files and store them in the vector database.

### Step 5: Verify
![RAG ingestion](../assets/img/ai/rag-application/rag-ingestion2.gif)

### Step 6: Verify

Once processing is complete, execute test queries to ensure proper data retrieval.

@@ -72,15 +81,9 @@ Once processing is complete, execute test queries to ensure proper data retrieva
- **Maximum chunks to retrieve** defines the number of matching chunks to retrieve against the query.
- **Minimum similarity threshold** determines whether a chunk is relevant enough to be considered a match for a given query. Expressed as a value between 0 and 1 (for example, 0.7 or 70% similarity).

3. Click **Retrieve**. The search results will display the chunks that match the query.

<a href="{{base_path}}/assets/img/ai/rag-application/rag-ingestion.gif"><img src="{{base_path}}/assets/img/ai/rag-application/rag-ingestion.gif" alt="RAG Ingestion" width="80%"></a>
3. Click **Retrieve**. The search results will display the chunks that match your query.
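
The two retrieval settings from this step can be pictured with a small sketch. It assumes cosine similarity as the scoring function, which the guide does not confirm, and uses hand-written two-dimensional vectors in place of real embeddings. The function and parameter names are illustrative only.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, max_chunks, min_similarity):
    """Return up to max_chunks (score, text) pairs scoring at least min_similarity."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    # Drop chunks below the similarity threshold, then keep the best ones.
    matches = [pair for pair in scored if pair[0] >= min_similarity]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return matches[:max_chunks]


# Toy vectors standing in for embeddings (assumption for illustration).
chunks = [
    ("about Devant", [1.0, 0.0]),
    ("about pricing", [0.8, 0.6]),
    ("unrelated", [0.0, 1.0]),
]
results = retrieve([1.0, 0.0], chunks, max_chunks=2, min_similarity=0.7)
```

Raising the threshold removes weaker matches before the chunk limit is applied, so a high threshold can return fewer chunks than the configured maximum.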

!!! note
![RAG ingestion](../assets/img/ai/rag-application/rag-ingestion3.gif)

!!! note
Follow this detailed tutorial [video](https://www.youtube.com/watch?v=8GlrHYS-EYI&list=PLp0TUr0bmhX4colDnjhEKAnZ3RmjCv5y2&ab_channel=WSO2) to understand how to set up the RAG ingestion and create your vector index.

## RAG retrieval

After completing the RAG ingestion process, you need to implement a rag retrieval to connect your vector database with user queries and generate responses.

For detailed implementation steps and configuration, refer to the [RAG retrieval](https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval) tutorial in the WSO2 Integrator: BI documentation.
52 changes: 52 additions & 0 deletions en/docs/ai/rag-retrieval.md
@@ -0,0 +1,52 @@
# RAG Retrieval

## Introduction
Retrieval-Augmented Generation (RAG) retrieval is the process of searching a vector database for the most relevant information in response to a user query.

!!! note
- This guide assumes you have already ingested files into your vector store. If you have not, follow the [Ingestion](rag-ingestion.md) guide first.

To retrieve chunks that have already been ingested (without uploading new files), navigate to your organization using the **Organization** dropdown in the top left of the Devant console header. In the left navigation menu, click **RAG**, then select **Retrieval**.

### Step 1: Initialize vector store

1. Select `Pinecone` as the vector database.
2. Enter the API key in the **API Key** field.

???+ info "Info"
To create an API key, refer to the [Pinecone API key documentation](https://docs.pinecone.io/guides/projects/manage-api-keys#create-an-api-key).

3. Enter the **Collection Name** from which you want to retrieve data.
4. Click **Next**.

### Step 2: Configure the embedding model

1. Select the `text-embedding-ada-002` embedding model from the **OpenAI** dropdown.
2. Enter the API key in the **Embedding Model API Key** field.

???+ info "Info"
To create an API key, refer to the [OpenAI platform documentation](https://platform.openai.com/docs/guides/embeddings).

3. Click **Next**.

### Step 3: Query and retrieve chunks

Execute queries to ensure proper data retrieval.

1. Enter a query related to the content of the previously ingested files.
2. **Maximum chunks to retrieve** and **Minimum similarity threshold** are automatically populated with default values. You can modify them if needed.

???+ info "Info"
- **Maximum chunks to retrieve** defines the number of matching chunks to retrieve against the query.
- **Minimum similarity threshold** determines whether a chunk is relevant enough to be considered a match for a given query. Expressed as a value between 0 and 1 (for example, 0.7 or 70% similarity).

3. Click **Retrieve**. The search results will display the chunks that match the query.

???+ info "Info"
- Devant's retrieval process uses a reranking model to ensure that only the most accurate and contextually relevant chunks are returned.
Comment on lines +45 to +46
⚠️ Potential issue | 🟡 Minor

???+ block content indented with 5 spaces instead of 4 — potential rendering issue.

Python-Markdown's admonition extension expects exactly 4 spaces of indentation for block content. The current 5-space indent ( -) could cause the line to be parsed as a code block rather than a list item within the collapsible note.

✏️ Proposed fix
 ???+ info "Info"
-     - Devant's retrieval process uses a reranking model to ensure that only the most accurate and contextually relevant chunks are returned.
+    - Devant's retrieval process uses a reranking model to ensure that only the most accurate and contextually relevant chunks are returned.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@en/docs/ai/rag-retrieval.md` around lines 45 - 46, The admonition block
starting with '???+ info "Info"' has its list item line indented with 5 spaces,
which can render as a code block; change the indentation of the line beginning
with '-' to exactly 4 spaces so the list item is parsed correctly (locate the
'???+ info "Info"' block and the subsequent line that currently starts with five
spaces and reduce it to four).


![RAG retrieval](../assets/img/ai/rag-application/rag-retrieval.gif)

After completing the RAG ingestion process, you can also implement RAG retrieval with WSO2 Integrator: BI to connect your vector database with user queries and generate responses.

For detailed implementation steps and configuration, refer to the [RAG retrieval](https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval) tutorial in the WSO2 Integrator: BI documentation.
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Check the HTTP status of the WSO2 BI RAG tutorial link
curl -o /dev/null -s -w "%{http_code}\n" \
  "https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval"

Repository: wso2/docs-devant

Length of output: 63


Remove or replace the inaccessible external link. The referenced WSO2 Integrator: BI documentation link returns a 403 Forbidden status, making it inaccessible to users. Either provide a working link, embed the relevant information directly in the documentation, or remove the reference if the external resource is no longer available.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@en/docs/ai/rag-retrieval.md` at line 52, The external link labeled "RAG
retrieval"
(https://bi.docs.wso2.com/integration-guides/ai/rag/build-a-rag-application/#rag-retrieval)
in en/docs/ai/rag-retrieval.md is returning 403 and must be removed or replaced;
update the sentence that references the RAG retrieval tutorial to either (a)
point to a working public URL, (b) embed the essential implementation
steps/configuration directly into this doc under a new "RAG retrieval"
subsection, or (c) remove the cross-reference entirely and adjust surrounding
text accordingly so there are no broken links or references to the inaccessible
WSO2 Integrator: BI documentation.

152 changes: 152 additions & 0 deletions en/docs/ai/rag-service.md
@@ -0,0 +1,152 @@
# RAG Service

## Introduction
Devant provides a set of RESTful API endpoints for Retrieval-Augmented Generation (RAG) workflows. These endpoints enable you to ingest, retrieve, and process documents programmatically.

### Step 1: Create service
Navigate to your organization using the **Organization** dropdown in the top left of the Devant console header. In the left navigation menu, click **RAG**, then select **Service**.

Fill in the required fields in the **Create RAG Ingestion Service** form to set up a new RAG service in Devant.

| **Field** | **Value** |
| ---------------- | ----------------- |
| **Project** | Select the target project from the dropdown, which lists the projects available in your organization |
| **Display Name** | Sample RAG Service |
| **Name** | sample-rag-service |
| **Description (Optional)** | My rag service description |

Click the **Create Service** button and your service will get created

!!! note
- When the service is created, Devant automatically increases the container resources (CPU and memory) to ensure reliable operation.

### Step 2: Test endpoints
Once the component is created you will be redirected to the Overview page.

1. On the development environment card, click **Test** to open the OpenAPI Console, where you will be able to try out all the available endpoints from the endpoint list.
2. Expand the resource you want to test.
3. Click Try it out to enable it.
Comment on lines +18 to +28
⚠️ Potential issue | 🟡 Minor

Several minor prose issues in the Create/Test service steps.

  • Line 18: missing period and trailing space; "will get created" → "will be created."
  • Line 24: missing comma — "Once the component is created**,** you will be redirected…"
  • Line 28: Try it out is a UI element and should be bold-formatted, consistent with **Execute** on line 30.
✏️ Proposed fix
-Click the **Create Service** button and your service will get created 
+Click the **Create Service** button and your service will be created.
-Once the component is created you will be redirected to the Overview page.
+Once the component is created, you will be redirected to the Overview page.
-3. Click Try it out to enable it.
+3. Click **Try it out** to enable it.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Click the **Create Service** button and your service will get created
!!! note
- When the service is created, Devant automatically increases the container resources (CPU and memory) to ensure reliable operation.
### Step 2: Test endpoints
Once the component is created you will be redirected to the Overview page.
1. On the development environment card, click **Test** to open the OpenAPI Console, where you will be able to try out all the available endpoints from the endpoint list.
2. Expand the resource you want to test.
3. Click Try it out to enable it.
Click the **Create Service** button and your service will be created.
!!! note
- When the service is created, Devant automatically increases the container resources (CPU and memory) to ensure reliable operation.
### Step 2: Test endpoints
Once the component is created, you will be redirected to the Overview page.
1. On the development environment card, click **Test** to open the OpenAPI Console, where you will be able to try out all the available endpoints from the endpoint list.
2. Expand the resource you want to test.
3. Click **Try it out** to enable it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@en/docs/ai/rag-service.md` around lines 18 - 28, Fix minor prose issues in
the Create/Test service steps: change "Click the **Create Service** button and
your service will get created " to "Click the **Create Service** button and your
service will be created." (remove trailing space and add period), add a comma in
"Once the component is created you will be redirected to the Overview page." so
it reads "Once the component is created, you will be redirected to the Overview
page.", and bold-format the UI element "Try it out" so it appears as **Try it
out** (to match the already bolded **Execute**).

4. Provide values for the parameters.
5. Click **Execute**. The response will be displayed under the **Responses** section.

!!! note
- Some parameters are automatically populated with default values. You can modify them as needed.

![RAG Service](../assets/img/ai/rag-application/rag-service.gif)

### Available API endpoints
The image below shows how the available endpoints are listed on the **Console** page. You can expand each endpoint and try it out as needed.

![RAG Service](../assets/img/ai/rag-application/rag-service.png)

#### **POST `/upload`**

Upload a file and ingest it into your vector store. Supports PDF (including scanned PDFs), DOCX, PPTX, XLSX, CSV, HTML, MD, images, and audio (MP3, WAV, M4A, FLAC, OGG).

**Required in the request:**

- File to upload.
- Vector DB provider (e.g., Pinecone, Chroma, Weaviate, Postgres) and connection/API key details.
- Collection name: where data will be stored.
- Embedding model provider and model (e.g., OpenAI, Azure, Mistral) and API key.
- Chunking strategy: `recursive`, `sentence`, or `character`.
- Max segment size and max overlap size for chunking.
⚠️ Potential issue | 🟡 Minor

Terminology inconsistency between /upload and /chunks: "Max segment size" vs "Max chunk size".

Line 53 (/upload) calls the parameter "Max segment size", while line 118 (/chunks) calls the same parameter "Max chunk size". Using different names for the same field across two closely related endpoints will confuse users who reference both sections while implementing API calls.

Pick one term (e.g., "Max chunk size") and use it consistently across both endpoints.

✏️ Proposed fix
-- Max segment size and max overlap size for chunking.
+- Max chunk size and max overlap size for chunking.
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- Max segment size and max overlap size for chunking.
- Max chunk size and max overlap size for chunking.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@en/docs/ai/rag-service.md` at line 53, The docs use two different names for
the same parameter — "Max segment size" in the /upload section and "Max chunk
size" in the /chunks section; standardize on a single term (use "Max chunk
size") by replacing all occurrences of "Max segment size" in the /upload
endpoint text with "Max chunk size" and ensure any related descriptions,
examples, and parameter headings in both the /upload and /chunks sections
reference the same exact phrase so the parameter name is consistent across the
documentation.


**Expected response:**

Returns a JSON object indicating successful ingestion, including the file name and type.

```json
{
  "message": "Added data to vector store successfully",
  "filename": "example.pdf",
  "file_type": "document"
}
```
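
As a hedged illustration of consuming this response, the sketch below parses a body shaped like the example above and checks that the three documented fields are present. It does not call the service. `parse_upload_response` is a hypothetical helper, and the field names come only from the sample response shown here.

```python
import json


def parse_upload_response(body):
    """Return (filename, file_type) from an /upload response body.

    Raises ValueError if a field from the documented example is missing.
    """
    data = json.loads(body)
    missing = [key for key in ("message", "filename", "file_type") if key not in data]
    if missing:
        raise ValueError(f"unexpected /upload response, missing: {missing}")
    return data["filename"], data["file_type"]


# Sample body copied from the documented response.
sample_body = json.dumps({
    "message": "Added data to vector store successfully",
    "filename": "example.pdf",
    "file_type": "document",
})
filename, file_type = parse_upload_response(sample_body)
```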

---

#### **POST `/retrieve`**

Retrieve relevant chunks from your vector store based on a user query. Supports semantic search and optional reranking with Cohere.

**Required in the request:**

- Vector DB provider and connection/API key details.
- Name of the collection from which you want to retrieve chunks.
- Embedding model provider and model, and API key.
- User query for which you want to retrieve chunks.
- Max number of chunks to retrieve and minimum similarity threshold.

**Optional:**

- Cohere reranking model and API key (if using reranking), and the number of top results to rerank.

???+ info "Info"
- To create a Cohere API key, refer to the [Cohere documentation](https://dashboard.cohere.com/api-keys).

**Expected response:**

Returns a JSON object containing the user query and an array of retrieved chunks. Each chunk includes the content, the file it is from, and a timestamp.

```json
{
  "query": "What is Devant?",
  "retrieved_chunks": [
    {
      "text": "Devant is ...",
      "source": "example.pdf",
      "timestamp": "2026-02-16T12:02:25.076312"
    },
    ...
  ]
}
```
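
One way a client might post-process this response is to group the retrieved chunks by source file. The sketch below is an assumption-laden example rather than part of the service: `group_chunks_by_source` is a hypothetical helper, the second and third sample chunks are invented for illustration, and the timestamp is assumed to be ISO 8601 as in the example above.

```python
import json
from collections import defaultdict
from datetime import datetime


def group_chunks_by_source(body):
    """Group retrieved chunks by source file, latest timestamp first."""
    data = json.loads(body)
    grouped = defaultdict(list)
    for chunk in data["retrieved_chunks"]:
        # Timestamps in the documented example parse as ISO 8601.
        stamp = datetime.fromisoformat(chunk["timestamp"])
        grouped[chunk["source"]].append((stamp, chunk["text"]))
    for entries in grouped.values():
        entries.sort(key=lambda entry: entry[0], reverse=True)
    return dict(grouped)


# Illustrative body; only the first chunk mirrors the documented sample.
sample_body = json.dumps({
    "query": "What is Devant?",
    "retrieved_chunks": [
        {"text": "Devant is ...", "source": "example.pdf",
         "timestamp": "2026-02-16T12:02:25.076312"},
        {"text": "Devant supports ...", "source": "example.pdf",
         "timestamp": "2026-02-16T12:05:00.000000"},
        {"text": "Notes on RAG ...", "source": "notes.md",
         "timestamp": "2026-02-15T09:00:00"},
    ],
})
grouped = group_chunks_by_source(sample_body)
```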

---

#### **POST `/chunks`**

Parse and chunk an uploaded file, returning the chunks as a JSON array. Does not store data in the vector DB.

This endpoint accepts all the same file types as the `/upload` endpoint, including PDF (with scanned PDF support), DOCX, PPTX, XLSX, CSV, HTML, MD, images, and audio files (MP3, WAV, M4A, FLAC, OGG).

**Required in the request:**

- File to upload.
- Chunk type: `recursive`, `sentence`, or `character`.
- Max chunk size and max overlap size.

**Expected response:**

Returns a JSON object containing the file name and an array of chunks. Each chunk includes a chunk ID and its content.

```json
{
  "filename": "example.pdf",
  "chunks": [
    { "chunk_id": 0, "content": "First chunk content..." },
    { "chunk_id": 1, "content": "Second chunk content..." }
  ]
}
```
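
Because `/chunks` does not write to the vector database, it can serve as a dry run for chunking settings. The following sketch summarizes a response shaped like the example above. `summarize_chunks` is a hypothetical helper, and measuring length in characters is an assumption made for simplicity (the chunk size settings are described in tokens elsewhere in these guides).

```python
import json


def summarize_chunks(body):
    """Summarize a /chunks response: file name, chunk count, longest chunk."""
    data = json.loads(body)
    # Character counts are a rough proxy; token counts would need a tokenizer.
    lengths = [len(chunk["content"]) for chunk in data["chunks"]]
    return {
        "filename": data["filename"],
        "count": len(lengths),
        "longest": max(lengths, default=0),
    }


# Sample body copied from the documented response.
sample_body = json.dumps({
    "filename": "example.pdf",
    "chunks": [
        {"chunk_id": 0, "content": "First chunk content..."},
        {"chunk_id": 1, "content": "Second chunk content..."},
    ],
})
summary = summarize_chunks(sample_body)
```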

---

#### **GET `/health`**

Health check endpoint.

**Expected response:**

Returns a JSON object indicating the service status.

```json
{
  "status": "ok"
}
```
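
A client could poll this endpoint before sending work to the service. The sketch below shows one possible retry loop. The HTTP call is left to a caller-supplied `fetch_health` function (a hypothetical callable that returns the response body as a string), so the example makes no assumptions about the service URL or authentication. The `"starting"` status in the demo is invented for illustration; only `"ok"` appears in the documented response.

```python
import json
import time


def wait_until_healthy(fetch_health, attempts=5, delay_seconds=1.0):
    """Poll until the health body reports status "ok" or attempts run out."""
    for attempt in range(attempts):
        try:
            if json.loads(fetch_health()).get("status") == "ok":
                return True
        except (OSError, ValueError, AttributeError):
            # Network failure, invalid JSON, or a non-object body: retry.
            pass
        if attempt < attempts - 1:
            time.sleep(delay_seconds)
    return False


# Fake fetcher standing in for a real HTTP GET on /health.
responses = iter(['{"status": "starting"}', '{"status": "ok"}'])
healthy = wait_until_healthy(lambda: next(responses), attempts=3, delay_seconds=0)
```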

???+ info "Info"
- For more details on RAG ingestion and retrieval, and on how to obtain API keys and credentials, refer to the [RAG Ingestion](rag-ingestion.md) guide.

66 changes: 66 additions & 0 deletions en/docs/ai/schedule-rag-automation.md
@@ -0,0 +1,66 @@
# Schedule RAG Automation

## Introduction

Devant provides a platform to automate document ingestion on a schedule and manage unstructured data for Retrieval-Augmented Generation (RAG) workflows.

This guide outlines the steps to set up scheduled RAG ingestion automation in Devant.

!!! note
- The **Schedule RAG Ingestion** option is available only to paid Devant users.
- This guide assumes you have completed steps 1 to 4 in the [RAG ingestion](rag-ingestion.md) guide.
Comment on lines +10 to +11
⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and display all step headings in rag-ingestion.md
fd -e md -g "rag-ingestion.md" --exec cat -n {} | grep -E "^[^:]*:\s*#{2,4} (Step|step)"

Repository: wso2/docs-devant

Length of output: 42


🏁 Script executed:

#!/bin/bash
# Find rag-ingestion.md in the repository
git ls-files | grep -i "rag-ingestion"

Repository: wso2/docs-devant

Length of output: 258


🏁 Script executed:

#!/bin/bash
# Read rag-ingestion.md and find all step headings
cat -n en/docs/ai/rag-ingestion.md | grep -E "#{2,4} Step"

Repository: wso2/docs-devant

Length of output: 295


🏁 Script executed:

#!/bin/bash
# View the exact content around line 11 in schedule-rag-automation.md to confirm the reference
head -15 en/docs/ai/schedule-rag-automation.md | tail -8

Repository: wso2/docs-devant

Length of output: 369


Update the prerequisite step reference to match the current RAG ingestion guide.

Line 11 references "steps 1 to 4" in the RAG ingestion guide, but rag-ingestion.md now contains 6 steps (Step 1–6: Initialize vector store, Configure embedding model, Configure chunking, Choose ingestion mode, Upload source files, and Verify). Either update the reference to "steps 1 to 6" or remove the specific step count if only a subset is actually required as a prerequisite.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@en/docs/ai/schedule-rag-automation.md` around lines 10 - 11, Update the
prerequisite reference in schedule-rag-automation.md so it correctly matches the
current rag-ingestion.md: either change the text that currently says "steps 1 to
4" to "steps 1 to 6" or remove the specific step count and refer generically to
"the RAG ingestion guide" (e.g., "This guide assumes you have completed the
steps in the RAG ingestion guide.") to avoid mismatches; locate and edit the
line that mentions rag-ingestion.md to apply the chosen wording.


After selecting **Schedule RAG Ingestion** as the ingestion mode, follow these steps:

### Step 1: Create automation
Fill in the required fields in the details form to create an automation in Devant for scheduled RAG ingestion.

| **Field** | **Value** |
| ---------------- | ----------------- |
| **Project** | Select the target project from the dropdown, which lists the projects available in your organization |
| **Display Name** | Sample Automation |
| **Name** | sample-automation |
| **Description (Optional)** | My sample automation description |

![RAG schedule](../assets/img/ai/rag-application/rag-schedule1.gif)

### Step 2: Configure datasource

The datasource specifies the location from which files will be ingested. Devant supports both Google Drive folders and Amazon S3 buckets as datasources.

1. Select `Google Drive` as the datasource.

2. Enter the API key in the **API Key** field.

???+ info "Info"
To obtain a key, create a project in the [Google Cloud Console](https://console.cloud.google.com/), generate an API key as described in the [Google Cloud documentation](https://cloud.google.com/docs/authentication/api-keys#create), and restrict the key to the **Google Drive API**.

**Note:** The target folder must be **public** ("Anyone with the link"), as API keys cannot access private files.

3. Provide the **Folder ID** of the Google Drive folder containing the files to be ingested.

???+ info "Info"
The Google Drive folder ID can be found in the URL when viewing the folder in Google Drive. It is the string that appears after `/folders/` in the URL.
Comment on lines +29 to +43
⚠️ Potential issue | 🟡 Minor

S3 datasource is mentioned as supported but its configuration steps are not documented.

Line 29 explicitly states both Google Drive and Amazon S3 are supported, yet the guide only walks through Google Drive. Users choosing Amazon S3 as the datasource have no instructions for providing the bucket URL, region, access key, or secret key.

Either document the S3 configuration in a sub-section (parallel to the Google Drive sub-section), or note that S3 documentation is covered elsewhere and link to it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@en/docs/ai/schedule-rag-automation.md` around lines 29 - 43, The docs
currently say both Google Drive and Amazon S3 are supported but only show Google
Drive steps; add a parallel "Amazon S3" subsection under the datasource
selection that lists the required fields (Bucket name/URL, Region, Access Key
ID, Secret Access Key, optional Endpoint for S3-compatible providers, and any
IAM role or public/private access notes), provide brief guidance on where to
find the bucket name and region, and either include example steps for entering
these values in the UI or add a cross-link to existing S3 configuration
documentation if one exists (mirror the structure used in the Google Drive steps
and include an info block for important notes such as bucket permissions).


4. Click **Create Automation** to complete the setup. You will be redirected to the automation overview page.

!!! note
- When a scheduled RAG ingestion automation is created, Devant automatically increases the container resources (CPU and memory) for the automation to ensure reliable operation.

- If you need to process very large files or expect high ingestion volumes, you can further scale your container resources in the **Containers** tab, available from the **Admin** dropdown at the bottom of the left navigation.
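
The **Folder ID** requested in this step is the path segment that follows `/folders/` in the folder URL. A minimal sketch of extracting it is shown below. `drive_folder_id` is a hypothetical helper, and `EXAMPLE_FOLDER_ID` is a placeholder rather than a real folder.

```python
from urllib.parse import urlparse


def drive_folder_id(url):
    """Return the path segment that follows /folders/ in a Google Drive URL."""
    parts = urlparse(url).path.split("/")
    if "folders" in parts:
        index = parts.index("folders")
        if index + 1 < len(parts) and parts[index + 1]:
            return parts[index + 1]
    raise ValueError("no /folders/<id> segment found in URL")


# Placeholder URL; the query string is ignored by the parser.
folder_id = drive_folder_id(
    "https://drive.google.com/drive/folders/EXAMPLE_FOLDER_ID?usp=sharing"
)
```

Parsing the path rather than searching the raw string keeps query parameters such as `?usp=sharing` out of the extracted value.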

![RAG schedule](../assets/img/ai/rag-application/rag-schedule2.gif)
### Step 3: Schedule ingestion

Once created, the automation is automatically deployed in the development environment with all previously entered configurations prefilled.

- To trigger an immediate ingestion, click the **Test** button.
- To schedule ingestion for a specific time interval, click the **Schedule** button and select your desired time.

You can verify successful ingestion by reviewing the automation logs.

![RAG schedule](../assets/img/ai/rag-application/rag-schedule3.gif)

As shown below, you can automate your ingestion workflow at specified intervals (e.g., minutely, hourly, daily, monthly). During each scheduled run, the system detects new files in the data source and ingests them into the vector store.

![RAG schedule](../assets/img/ai/rag-application/rag-schedule.png)
Binary file not shown.
6 changes: 5 additions & 1 deletion en/mkdocs.yml
@@ -85,7 +85,11 @@ nav:
- Explore the Demo Organization: devant-samples/explore-the-demo-organization.md
- "Deploy with WSO2 Integrator: MI": devant-samples/deploy-with-wso2-integrator-mi.md
- AI:
- RAG Application: ai/rag-application.md
- RAG:
- Ingestion: ai/rag-ingestion.md
- Retrieval: ai/rag-retrieval.md
- Schedule Automation: ai/schedule-rag-automation.md
- Service: ai/rag-service.md
- 'Develop an AI Agent Created Using WSO2 Integrator: BI': ai/develop-an-ai-agent-created-using-wso2-integrator-bi.md
- Devant Concepts:
- Resource Hierarchy: devant-concepts/resource-hierarchy.md