๐Ÿ•ท๏ธ Ingest clean documentation into LLM pipelines effortlessly, filtering out noise for better data quality and improved insights.

๐Ÿ•ท๏ธ doc-crawler-rag - Easy Documentation Crawler in Docker

🚀 Getting Started

Welcome to doc-crawler-rag! This tool helps you collect and organize documentation easily. It's designed to work with RAG (Retrieval-Augmented Generation) pipelines and is optimized for use with large language models. Follow the steps below to download and run the application, even if you don't have any programming knowledge.

📥 Download Link

Download doc-crawler-rag

๐ŸŒ Overview

doc-crawler-rag is a clean, user-friendly documentation crawler that you can run on your local machine. It gathers documentation and chunks it into manageable parts. Because it runs in a Docker container, you don't need to worry about installation conflicts on your computer.

📋 Features

  • User-Friendly: Designed for non-technical users.
  • Optimized for RAG & LLM: Prepares documentation for Retrieval-Augmented Generation and large language model pipelines.
  • Dockerized: Easy to set up and run without complex installations.
  • Chunking Capabilities: Organizes documents into smaller, manageable sections.
  • Self-Hosted: You can run it locally without needing a cloud connection.

๐Ÿ› ๏ธ System Requirements

Before you begin, ensure you have the following:

  • Operating System: Windows, macOS, or Linux
  • Docker: Installed on your machine. You can download it from the Docker website.

📥 How to Download & Install

To get started, follow these steps:

  1. Visit the Releases Page: Click the link below to access the releases page: Download from the Releases Page

  2. Find the Latest Version: On the releases page, look for the latest version of doc-crawler-rag. This is usually listed at the top.

  3. Download the Docker Image: Once you find the latest version, click on the Docker image link to download it.

  4. Run the Docker Container: After the image downloads, open your terminal or command prompt and run the following command (replace latest with the actual version number if different; a variant that keeps the output in a local folder is shown after this list):

    docker run -p 8501:8501 doc-crawler-rag:latest
  5. Access the Application: After running the command, open your web browser and go to http://localhost:8501. You should see the doc-crawler-rag interface.
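
If you want the crawled and chunked output to outlive the container, you can add a volume mount to the command from step 4. The container-side path below (/app/output) is only an assumption for illustration; check the image's documentation for the directory it actually writes to:

    docker run -p 8501:8501 -v "$(pwd)/output:/app/output" doc-crawler-rag:latest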

โš™๏ธ Using doc-crawler-rag

Once the application is running, you can start using it to crawl your documents:

  1. Upload Documentation: Use the interface to upload documents. The application supports various formats, including PDFs and Word files.

  2. Choose Options: After uploading, select how you want to chunk the documents. You can decide on the chunk size or the format of the output; a rough sketch of what chunking means follows this list.

  3. Start Crawling: Click the "Crawl" button to start the process. The application will gather the information and present it in a structured layout.

  4. Download Your Results: After crawling, you can download the organized documentation directly from the application.
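
The README does not describe the exact chunking strategy, so here is a minimal, illustrative Python sketch of what step 2 usually means in practice: splitting text into fixed-size, slightly overlapping pieces. The function name and defaults are assumptions for illustration only, not the application's actual API:

    def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
        """Split text into overlapping, roughly fixed-size chunks.

        Illustrative only: the real application may chunk by headings,
        tokens, or another strategy entirely.
        """
        step = max(chunk_size - overlap, 1)  # guard against a non-positive step
        chunks = []
        for start in range(0, len(text), step):
            chunks.append(text[start:start + chunk_size])
        return chunks

    if __name__ == "__main__":
        sample = "RAG pipelines work best with small, focused passages. " * 40
        for i, chunk in enumerate(chunk_text(sample, chunk_size=200, overlap=20)):
            print(f"chunk {i}: {len(chunk)} characters")

Smaller chunks with a little overlap usually retrieve more precisely in a RAG pipeline, at the cost of storing more pieces; whichever sizes the interface offers, the trade-off is the same.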

๐Ÿ“ Support and Documentation

For any questions or issues, check our wiki. We provide detailed explanations and FAQs to help you.

🔗 Conclusion

doc-crawler-rag simplifies the process of gathering and managing documentation. With its easy setup and user-friendly interface, you can focus more on content and less on technology. Enjoy exploring what the application can do!

📥 Download Link Again

Access the releases page to download the latest version of doc-crawler-rag: Download From Releases Page
