generated from kbase/kbase-template
tmp pangenome_id #8
Merged
4 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,6 @@ | ||
| trash/ | ||
| docs/DEMO_SCRIPT.md | ||
| docs/QUICKSTART.md | ||
|
|
||
| .DS_Store | ||
| .idea | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,128 +1,102 @@ | ||
| # TableScanner | ||
|
|
||
| FastAPI application for table scanning operations with MinIO storage integration. | ||
| **High-Performance Tabular Data Microservice for KBase** | ||
|
|
||
| # Local Dev | ||
| TableScanner is a professional-grade FastAPI application designed to provide lightning-fast, filtered, and paginated access to massive datasets stored within KBase. By leveraging local SQLite caching and automatic indexing, it transforms slow object retrievals into instantaneous API responses. | ||
|
|
||
| ``` | ||
| bash scripts/dev.sh | ||
| ``` | ||
| --- | ||
|
|
||
| ## 🚀 Key Features | ||
|
|
||
| ## Features | ||
| - **Instant Queries**: Query millions of rows with sub-second response times. | ||
| - **Intelligent Caching**: Automatic local caching of KBase blobs for repeated access. | ||
| - **Dynamic Indexing**: Automatically optimizes database performance on first-access. | ||
| - **Dual-API Support**: Choose between a flexible **Flat POST** for scripts or a hierarchical **RESTful Path** for web apps. | ||
| - **Zero Memory Overhead**: Handles massive datasets without loading them into RAM. | ||
|
|
||
| - FastAPI web framework | ||
| - Search endpoint accepting ID parameters | ||
| - Docker and Docker Compose support | ||
| - Dependency management with uv | ||
| - MinIO client integration | ||
| - KBUtilLib utilities | ||
| --- | ||
|
|
||
| ## Prerequisites | ||
| ## 🛠️ Architecture Overview | ||
|
|
||
| - Docker | ||
| - Docker Compose | ||
| TableScanner acts as a high-speed bridge between KBase's persistent storage and your application. | ||
|
|
||
| ## Quick Start | ||
| 1. **KBase Blobstore**: Raw data is stored as SQLite databases. | ||
| 2. **TableScanner Cache**: Downloads and indexes the database locally. | ||
| 3. **FastAPI Layer**: Provides a clean, modern interface for selective data retrieval. | ||
|
|
||
| ### Using Docker Compose | ||
| For a deep dive into the service internals, see [ARCHITECTURE.md](docs/ARCHITECTURE.md). | ||
|
|
||
| 1. Build and start the application: | ||
| ```bash | ||
| docker compose up --build | ||
| ``` | ||
|
|
||
| 2. The API will be available at `http://localhost:8000` | ||
| --- | ||
|
|
||
| 3. Access the interactive API documentation at `http://localhost:8000/docs` | ||
| ## 📖 Quick Start | ||
|
|
||
| ### API Endpoints | ||
| ### 1. Run via Docker (Production) | ||
|
|
||
| #### Root Endpoint | ||
| - **URL**: `GET /` | ||
| - **Description**: Returns service information | ||
| - **Response**: | ||
| ```json | ||
| { | ||
| "service": "TableScanner", | ||
| "version": "1.0.0", | ||
| "status": "running" | ||
| } | ||
| ``` | ||
|
|
||
| #### Search Endpoint | ||
| - **URL**: `GET /search` | ||
| - **Parameters**: | ||
| - `id` (required): The ID to search for | ||
| - **Description**: Searches for a table by ID | ||
| - **Example**: `GET /search?id=12345` | ||
| - **Response**: | ||
| ```json | ||
| { | ||
| "query_id": "12345", | ||
| "status": "success", | ||
| "message": "Search completed for ID: 12345" | ||
| } | ||
| ```bash | ||
| docker compose up --build -d | ||
| ``` | ||
| The service will be available at `http://localhost:8000`. | ||
| Interactive documentation is at `/docs`. | ||
|
|
||
| ## Development | ||
| ### 2. Local Development | ||
|
|
||
| ### Project Structure | ||
| ``` | ||
| . | ||
| ├── app/ | ||
| │ ├── __init__.py | ||
| │ ├── main.py # FastAPI application factory | ||
| │ └── routes.py # API route definitions | ||
| ├── Dockerfile # Docker build configuration | ||
| ├── docker-compose.yml # Docker Compose configuration | ||
| ├── pyproject.toml # Python project metadata | ||
| ├── requirements.txt # Python dependencies | ||
| └── README.md | ||
| ```bash | ||
| # Setup environment | ||
| cp .env.example .env | ||
| # Start dev server | ||
| bash scripts/dev.sh | ||
| ``` | ||
|
|
||
| ### Dependencies | ||
| --- | ||
|
|
||
| The application requires: | ||
| - `fastapi` - Web framework | ||
| - `uvicorn[standard]` - ASGI server | ||
| - `minio` - MinIO client for object storage | ||
| - `KBUtilLib` - KBase utility library | ||
| ## 🔌 API Usage Styles | ||
|
|
||
| ### Local Development | ||
| TableScanner provides two primary ways to interact with your data. | ||
|
|
||
| To run locally without Docker: | ||
| ### A. Flat POST (Recommended for Scripts) | ||
| Everything you need in a single JSON body. Ideal for Python scripts and complex filters. | ||
|
|
||
| 1. Install dependencies: | ||
| ```bash | ||
| pip install -r requirements.txt | ||
| ```python | ||
| import requests | ||
| payload = { | ||
| "berdl_table_id": "76990/7/2", | ||
| "table_name": "Genes", | ||
| "limit": 100 | ||
| } | ||
| response = requests.post("http://localhost:8000/table-data", json=payload) | ||
| ``` | ||
|
|
||
| 2. Run the application: | ||
| ### B. Path-based REST (Recommended for Web Apps) | ||
| Clean, hierarchical URLs that mirror your data structure. | ||
|
|
||
| ```bash | ||
| uvicorn app.main:app --reload --host 0.0.0.0 --port 8000 | ||
| # List all tables in a KBase object | ||
| GET /object/76990/7/2/tables | ||
|
|
||
| # Get specific table data | ||
| GET /object/76990/7/2/tables/Genes/data?limit=100 | ||
| ``` | ||
|
|
||
| ## Docker | ||
| --- | ||
|
|
||
| ### Build the Image | ||
| ```bash | ||
| docker build -t tablescanner . | ||
| ``` | ||
| ## 📈 Use Cases | ||
|
|
||
| ### Run the Container | ||
| ```bash | ||
| docker run -p 8000:8000 tablescanner | ||
| ``` | ||
| - **High-Throughput Analytics**: Powering large-scale pangenome comparisons. | ||
| - **Interactive Dashboards**: Real-time filtering for community structure visualizations. | ||
| - **CLI Tools**: Integrating KBase data into local bioinformatics pipelines. | ||
|
|
||
| ## Health Check | ||
| --- | ||
|
|
||
| ## 👨‍💻 Development | ||
|
|
||
| ### Project Structure | ||
| - `app/`: Core logic and FastAPI routes. | ||
| - `app/utils/`: Caching, SQLite, and Workspace integration. | ||
| - `docs/`: Detailed technical documentation. | ||
| - `scripts/`: Demo clients and deployment scripts. | ||
|
|
||
| The application includes a health check that verifies the service is running: | ||
| - Endpoint: `GET /` | ||
| - Interval: 30 seconds | ||
| - Timeout: 10 seconds | ||
| - Start period: 40 seconds | ||
| --- | ||
|
|
||
| ## License | ||
| ## ⚖️ License | ||
|
|
||
| See [LICENSE](LICENSE) file for details. | ||
| Distributed under the MIT License. See `LICENSE` for more information. |
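
The Architecture Overview in the README diff describes a flow of caching a SQLite database locally, indexing it on first access, and serving paginated slices. The sketch below illustrates that idea with the standard-library `sqlite3` module only. It is not TableScanner's actual implementation: the helper names `ensure_index` and `fetch_page`, the `gene_id` and `length` columns, and the in-memory database standing in for a downloaded blob are all assumptions made for illustration.

```python
import sqlite3


def ensure_index(conn: sqlite3.Connection, table: str, column: str) -> str:
    """Create an index the first time a column is queried; later calls are no-ops."""
    index_name = f"idx_{table}_{column}"
    # Identifiers cannot be bound as SQL parameters, so they are quoted inline.
    # Real code should validate them against the database schema first.
    conn.execute(
        f'CREATE INDEX IF NOT EXISTS "{index_name}" ON "{table}" ("{column}")'
    )
    return index_name


def fetch_page(conn, table, order_by, limit=100, offset=0):
    """Return one page of rows without loading the whole table into memory."""
    cur = conn.execute(
        f'SELECT * FROM "{table}" ORDER BY "{order_by}" LIMIT ? OFFSET ?',
        (limit, offset),
    )
    return cur.fetchall()


# Hypothetical stand-in for a cached KBase blob: an in-memory "Genes" table.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "Genes" (gene_id TEXT, length INTEGER)')
conn.executemany(
    'INSERT INTO "Genes" VALUES (?, ?)',
    [(f"g{i:04d}", 100 + i) for i in range(250)],
)

ensure_index(conn, "Genes", "gene_id")
page = fetch_page(conn, "Genes", order_by="gene_id", limit=100, offset=200)
print(len(page), page[0])
```

Ordering by an indexed column keeps `LIMIT`/`OFFSET` pages stable between requests, which is one plausible reason to index on first access.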
The CORS middleware is configured with `allow_origins=["*"]`, which allows any origin to access the API. This may be intentional for development, but consider restricting it to specific trusted origins in production.
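
One way to act on this review comment is to read the allowed origins from configuration instead of hard-coding a wildcard. The sketch below is a suggestion only, not code from this PR: the `ALLOWED_ORIGINS` environment variable, its localhost default, and the `parse_allowed_origins` and `build_app` helpers are hypothetical names. `CORSMiddleware` and its `allow_origins`, `allow_methods`, and `allow_headers` parameters come from FastAPI itself.

```python
import os


def parse_allowed_origins(raw: str) -> list[str]:
    """Split a comma-separated origin list, dropping blanks and stray whitespace."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]


def build_app():
    """Build a FastAPI app whose CORS policy lists explicit origins."""
    # Imported lazily so the parsing helper stays usable without FastAPI installed.
    from fastapi import FastAPI
    from fastapi.middleware.cors import CORSMiddleware

    app = FastAPI()
    # ALLOWED_ORIGINS is an assumed variable name; the default suits local dev only.
    origins = parse_allowed_origins(
        os.environ.get("ALLOWED_ORIGINS", "http://localhost:3000")
    )
    app.add_middleware(
        CORSMiddleware,
        allow_origins=origins,
        allow_methods=["GET", "POST"],
        allow_headers=["*"],
    )
    return app
```

A deployment could then set something like `ALLOWED_ORIGINS=https://app.example.org` in its `.env` file, while local development keeps the permissive default.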