Commit 5678700

Update documentation
Signed-off-by: fbugarski <filipbugarski@gmail.com>
1 parent e42fdfa commit 5678700

2 files changed

+207
-52
lines changed

docs/api/embeddings.md

Lines changed: 114 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -1,26 +1,124 @@
11
---
22
id: embeddings
3-
title: Embeddings
3+
title: Embeddings & RAG
44
---
55

6+
# Embeddings & Retrieval-Augmented Generation (RAG)
7+
8+
Large Language Models (LLMs) are powerful, but they **do not know your private or internal data**.
9+
They are trained on public information and cannot access your documents, databases, or source
10+
code unless you explicitly provide that context.
11+
12+
This is the problem that **Embeddings** and **Retrieval-Augmented Generation (RAG)** solve.
13+
14+
---
15+
16+
## The Problem LLMs Have
17+
18+
Without embeddings and RAG:
19+
20+
- LLMs cannot answer questions about private data
21+
- responses are often generic or inaccurate
22+
- models may hallucinate answers
23+
- updating knowledge requires retraining (slow and expensive)
24+
25+
---
26+
27+
## What Are Embeddings?
28+
29+
An **embedding** is a numerical (vector) representation of text that captures its meaning.
30+
31+
Embeddings allow Cube AI to:
32+
33+
- compare text by semantic similarity
34+
- search documents by meaning instead of keywords
35+
- retrieve relevant context for LLM prompts
36+
37+
In simple terms:
38+
39+
> Embeddings allow Cube AI to understand and search your data.
40+
41+
Cube AI embeddings are generated inside **Trusted Execution Environments (TEEs)**,
42+
ensuring that both input text and resulting vectors remain confidential.
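
Comparing text by semantic similarity usually comes down to measuring the angle between two embedding vectors. The sketch below shows cosine similarity, a common choice for that comparison; the three-dimensional vectors are hypothetical stand-ins, and the source does not state which metric Cube AI itself uses.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical vectors; real embeddings typically have hundreds of dimensions.
invoice = [0.9, 0.1, 0.0]
billing = [0.8, 0.2, 0.1]
weather = [0.0, 0.1, 0.9]

# Related texts should score higher than unrelated ones.
print(cosine_similarity(invoice, billing) > cosine_similarity(invoice, weather))
```

A score near 1 means the two texts point in nearly the same direction in embedding space; a score near 0 means they are unrelated.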
43+
44+
---
45+
46+
## What Is RAG?
47+
48+
**Retrieval-Augmented Generation (RAG)** is a technique where:
49+
50+
1. Your data is converted into embeddings
51+
2. Relevant content is retrieved based on a user query
52+
3. The retrieved content is injected into the LLM prompt
53+
4. The model generates an answer grounded in your data
54+
55+
Instead of asking the model to guess, RAG lets it **answer using facts you provide**.
56+
57+
---
58+
59+
## How RAG Works in Cube AI
60+
61+
<!-- IMAGE: rag-flow-diagram -->
62+
<!-- Diagram: Documents → Embeddings → Vector Store → Retrieved Context → LLM -->
63+
64+
The RAG flow in Cube AI looks like this:
65+
66+
1. Documents are split into chunks
67+
2. Each chunk is converted into an embedding
68+
3. Embeddings are stored in a vector database
69+
4. A user asks a question
70+
5. Cube AI retrieves the most relevant chunks
71+
6. The LLM generates an answer using retrieved context
72+
73+
All processing stays inside your Cube AI deployment.
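
The six steps above can be sketched end to end in a few lines. This is a minimal illustration, not Cube AI's implementation: `toy_embed` is a word-count stand-in for a real embedding model, the "vector store" is a plain Python list, and every function name here is hypothetical.

```python
import math
from collections import Counter

def chunk_text(text, max_words=8):
    """Step 1: split a document into fixed-size word chunks."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text):
    """Step 2 stand-in: a word-count vector. A real pipeline would call an embedding model."""
    return Counter(word.strip(".,?!").lower() for word in text.split())

def similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(store, question, k=1):
    """Step 5: return the k chunks most similar to the question."""
    query = toy_embed(question)
    ranked = sorted(store, key=lambda item: similarity(item[1], query), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

def build_prompt(question, context_chunks):
    """Step 6 input: inject the retrieved chunks into the prompt sent to the LLM."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

document = (
    "Backups run every night at two. "
    "Restore requests go through the operations team. "
    "The VPN requires a hardware token for every login."
)
# Steps 1-3: chunk, embed, and keep (chunk, vector) pairs in an in-memory store.
store = [(chunk, toy_embed(chunk)) for chunk in chunk_text(document, max_words=6)]
# Step 4: a user asks a question.
question = "When do backups run?"
print(build_prompt(question, retrieve(store, question, k=1)))
```

In a real deployment the embedding step would call the embeddings endpoint and the list would be replaced by a vector database; the control flow stays the same.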
74+
75+
---
76+
77+
## Why Use RAG with Cube AI?
78+
79+
Using RAG enables:
80+
81+
- chat over internal documentation
82+
- question answering over PDFs and files
83+
- AI assistants for support and operations
84+
- safer and more accurate LLM responses
85+
- no data leakage to external providers
86+
87+
---
88+
89+
## Common Use Cases
90+
91+
### Internal Documentation Assistant
92+
Ask questions about internal docs, wikis, or README files.
93+
94+
### Support & Helpdesk Bots
95+
Answer customer questions using company knowledge bases.
96+
97+
### Codebase Search
98+
Query large repositories using natural language.
99+
100+
### Knowledge-Based AI Assistants
101+
Build enterprise-grade ChatGPT-like systems backed by private data.
102+
103+
---
104+
105+
## Embeddings API Reference
106+
6107
The embeddings endpoint allows you to generate vector representations of text.
7108
These vectors can be used for semantic search, clustering, retrieval-augmented
8109
generation (RAG), and similarity comparisons.
9110

10-
Cube AI embeddings are generated inside **Trusted Execution Environments (TEEs)**,
11-
ensuring that input text and resulting vectors remain confidential.
12-
13111
---
14112

15-
## Endpoint
113+
### Endpoint
16114

17115
```http
18116
POST /proxy/{domain_id}/v1/embeddings
19117
```
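
As a rough illustration of how a client might target this endpoint, the sketch below builds (but does not send) a request using only the Python standard library. The path comes from the endpoint above; the bearer-token header and the `model`/`input` body fields are assumptions based on the OpenAI embeddings request format, and `my-domain-id` and `my-token` are placeholder values.

```python
import json
import urllib.request

def build_embeddings_request(base_url, domain_id, token, texts, model="nomic-embed-text"):
    """Build (but do not send) a POST request for the embeddings endpoint."""
    url = f"{base_url.rstrip('/')}/proxy/{domain_id}/v1/embeddings"
    # Assumption: the request body follows the OpenAI embeddings format.
    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

request = build_embeddings_request(
    "https://localhost", "my-domain-id", "my-token", ["hello world"]
)
print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call; that step is left out so the sketch runs without a live deployment.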
20118

21119
---
22120

23-
## Example Request
121+
### Example Request
24122

25123
```bash
26124
curl -k https://localhost/proxy/<domain_id>/v1/embeddings \
@@ -34,15 +132,23 @@ curl -k https://localhost/proxy/<domain_id>/v1/embeddings \
34132

35133
---
36134

37-
## Response
135+
### Response
38136

39137
Returns an OpenAI-compatible `embeddings` response object containing one or more
40138
embedding vectors.
41139

42140
---
43141

44-
## Notes
142+
### Notes
45143

46144
- Embeddings are **domain-scoped**
47145
- Input text is processed securely inside a TEE
48146
- Use embedding models such as `nomic-embed-text` for best results
147+
148+
---
149+
150+
## Next Steps
151+
152+
- Combine Embeddings with **Chat Completions**
153+
- Explore available **Models**
154+
- Build a complete RAG pipeline using Cube AI

docs/integrations/continue.md

Lines changed: 93 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -4,48 +4,85 @@ title: Continue for VS Code
44
sidebar_position: 1
55
---
66

7-
## Continue Integration for VS Code
7+
# Continue Integration for VS Code
88

9-
The **Continue** extension brings Cube AI’s LLM capabilities directly into
10-
Visual Studio Code, enabling inline completions, refactoring help, and
11-
chat-based assistance.
9+
The **Continue** extension brings **Cube AI** LLM capabilities directly into **Visual Studio Code**, enabling:
1210

13-
This guide explains how to connect Continue with a Cube AI domain.
11+
- inline code completions
12+
- refactoring assistance
13+
- chat-based explanations
14+
- test and documentation generation
15+
16+
This guide shows how to connect **Continue** with a **Cube AI domain** in a few simple steps.
17+
18+
---
19+
20+
## What You Will Get
21+
22+
After completing this guide, you will be able to:
23+
24+
- use Cube AI models inside VS Code
25+
- chat with your codebase
26+
- refactor and explain code using enterprise-grade LLMs
27+
- keep all data inside your Cube AI deployment
28+
29+
---
30+
31+
## Architecture Overview
32+
33+
Continue runs locally inside VS Code and forwards requests to Cube AI, which handles authentication, model routing, and security, while all data remains inside your Cube AI deployment.
34+
35+
<!-- IMAGE: architecture-diagram -->
36+
<!-- Add diagram: Continue → Cube AI → Models -->
1437

1538
---
1639

1740
## 1. Install Requirements
1841

19-
1. Install **Visual Studio Code**
20-
[https://code.visualstudio.com](https://code.visualstudio.com)
42+
### Install Visual Studio Code
43+
https://code.visualstudio.com
2144

22-
2. Install the **Continue** extension
23-
[https://www.continue.dev](https://www.continue.dev)
45+
### Install the Continue Extension
46+
https://www.continue.dev
2447

2548
---
2649

2750
## 2. Open Continue Configuration
2851

29-
In Visual Studio Code:
52+
In **Visual Studio Code**:
3053

31-
1. Click the **Continue** icon
32-
2. Open the **Settings / gear** menu
54+
1. Click the **Continue** icon in the sidebar
55+
2. Open the **Settings (⚙️)** menu
3356
3. Select **Configure Continue**
3457

3558
This opens the configuration file:
3659

37-
```yaml
60+
```
3861
.continue/config.yaml
3962
```
4063

64+
<!-- IMAGE: continue-open-config -->
65+
<!-- Screenshot: Continue icon + Configure option -->
66+
4167
---
4268

43-
## 3. Configure Continue to Use Cube AI
69+
## 3. Generate a Cube AI Access Token
4470

45-
Replace the contents of `config.yaml` with the configuration below.
71+
Before configuring Continue, generate an access token in **Cube AI UI**:
4672

47-
Before editing the file, make sure you have generated a Cube AI access token.
48-
You can obtain it from the Cube AI UI under **Profile → Tokens**.
73+
1. Open Cube AI UI
74+
2. Go to **Profile → Tokens**
75+
3. Click **Generate token**
76+
4. Copy the token value
77+
78+
<!-- IMAGE: cube-token-generation -->
79+
<!-- Screenshot: Profile → Tokens -->
80+
81+
---
82+
83+
## 4. Configure Continue to Use Cube AI
84+
85+
Replace the contents of `.continue/config.yaml` with the configuration below.
4986

5087
```yaml
5188
name: Cube AI
@@ -57,15 +94,15 @@ models:
5794
provider: ollama
5895
model: tinyllama:1.1b
5996
apiKey: <access_token>
60-
apiBase: https://<your-cube-instance>/proxy/<your-domain-id>
97+
apiBase: https://<cube-instance>/proxy/<domain-id>
6198
requestOptions:
6299
verifySsl: false
63100

64101
- name: starcoder2
65102
provider: ollama
66103
model: starcoder2:7b
67104
apiKey: <access_token>
68-
apiBase: https://<your-cube-instance>/proxy/<your-domain-id>
105+
apiBase: https://<cube-instance>/proxy/<domain-id>
69106
requestOptions:
70107
verifySsl: false
71108

@@ -78,54 +115,66 @@ context:
78115
- provider: docs
79116
```
80117
81-
### Replace
118+
### Replace the placeholders
82119
83120
- `<access_token>` → your Cube AI access token
84-
- `<your-cube-instance>` → usually `localhost`
85-
- `<your-domain-id>` the domain ID you want VS Code to use
121+
- `<cube-instance>` → usually `localhost`
122+
- `<domain-id>` → your Cube AI domain ID
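
A leftover placeholder is an easy mistake to miss. As an optional check, the sketch below scans configuration text for any remaining `<name>` tokens; the helper name and the sample text are hypothetical, and the pattern only covers placeholders written in the style used in this guide.

```python
import re

# Matches tokens such as <access_token> or <domain-id>.
PLACEHOLDER = re.compile(r"<([a-z][a-z0-9_-]*)>")

def unfilled_placeholders(config_text):
    """Return the sorted names of `<placeholder>` tokens still present in the text."""
    return sorted(set(PLACEHOLDER.findall(config_text)))

# Sample text with two placeholders left in and one already replaced.
config_text = """
apiKey: <access_token>
apiBase: https://localhost/proxy/<domain-id>
"""
print(unfilled_placeholders(config_text))
```

An empty list means every placeholder has been replaced.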
86123

87-
> `verifySsl: false` should be used **only for local development**.
124+
⚠️ `verifySsl: false` is for local development only.
88125

89126
---
90127

91-
## 4. Using Continue With Cube AI
92-
93-
Once configured:
128+
## 5. Verify the Connection
94129

95-
- Press **Ctrl + L** to open the Continue chat
96-
- Ask questions or request explanations
97-
- Use inline completions powered by Cube AI models
130+
1. Open Continue chat using **Ctrl + L**
131+
2. Select a configured model
132+
3. Ask:
98133

99-
Example prompts:
134+
```
135+
Explain what this project does
136+
```
100137

101-
- “Explain this function”
102-
- “Refactor this TypeScript file”
103-
- “Write unit tests for this module”
138+
<!-- IMAGE: continue-chat-success -->
139+
<!-- Screenshot: Continue chat with response -->
104140

105141
---
106142

107-
## 5. Troubleshooting
143+
## 6. Example Prompts
144+
145+
- Explain this function
146+
- Refactor this file
147+
- Write unit tests
148+
- Summarize this folder
108149

109-
### Connection issues
150+
---
110151

111-
- Ensure Cube AI is running (`make up`)
112-
- Verify that the domain exists
113-
- Check that your access token is valid
152+
## 7. Troubleshooting
114153

115-
### SSL issues
154+
### Connection Issues
155+
- Ensure Cube AI is running
156+
- Verify domain ID
157+
- Check access token
116158

117-
If you are running Cube AI locally without valid TLS certificates, set:
159+
### Unauthorized (401)
160+
- The access token is expired or invalid; generate a new one under **Profile → Tokens**
118161

162+
### SSL Errors
119163
```yaml
120164
requestOptions:
121165
verifySsl: false
122166
```
123167

124-
For production deployments, always use valid TLS certificates.
168+
---
169+
170+
## 8. Video Tutorial
171+
172+
https://www.youtube.com/watch?v=BGpv_iTB2NE
125173

126174
---
127175

128-
## 6. Video Tutorial
176+
## Next Steps
129177

130-
A complete walkthrough is available here:
131-
[https://www.youtube.com/watch?v=BGpv_iTB2NE](https://www.youtube.com/watch?v=BGpv_iTB2NE)
178+
- Embeddings & RAG
179+
- Models overview
180+
- API integrations
