This project implements a complete text completion system based on Samsung's Tiny Recursive Model (TRM). The system is designed to be parameter-efficient while enabling high-quality text completion through iterative refinement.
```
.
├── api/
│   ├── Dockerfile
│   ├── requirements.txt
│   └── server.py
├── dummy_data.json
├── requirements.txt
├── tests/
│   ├── __init__.py
│   ├── test_components.py
│   └── test_trm.py
└── trm_text_completion/
    ├── __init__.py
    ├── model/
    │   ├── __init__.py
    │   ├── components.py
    │   └── trm.py
    ├── scripts/
    │   ├── __init__.py
    │   └── run_training.py
    └── utils/
        ├── __init__.py
        ├── data.py
        └── loss.py
```
- Clone the repository:

  ```
  git clone <repository-url>
  cd <repository-name>
  ```

- Install dependencies for training:

  ```
  pip install -r requirements.txt
  ```

- Install dependencies for the API:

  ```
  pip install -r api/requirements.txt
  ```
To train the model, run the `run_training.py` script. You will need to provide a path to your training data in JSON format; a `dummy_data.json` file is provided for testing purposes.

```
python -m trm_text_completion.scripts.run_training --data_path dummy_data.json
```

The trained model will be saved to the `./trm_model` directory by default.
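The training script expects its data as a JSON file; the exact schema is defined in `trm_text_completion/utils/data.py`, so check `dummy_data.json` for the authoritative format. As an illustration only (the `prompt`/`completion` field names here are assumptions, not the project's confirmed schema), a small dataset could be generated like this:

```python
import json

# Hypothetical record layout -- inspect dummy_data.json for the actual schema.
records = [
    {"prompt": "Once upon a time", "completion": " there was a tiny recursive model."},
    {"prompt": "The quick brown fox", "completion": " jumps over the lazy dog."},
]

# Write the records as a JSON file that --data_path can point to.
with open("my_data.json", "w") as f:
    json.dump(records, f, indent=2)
```

You would then pass `--data_path my_data.json` to the training command above.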
To run the API server, you first need a trained model. Once you have one, start the server with:

```
python api/server.py
```

By default, the server runs on http://0.0.0.0:5000.
- `GET /health`: Returns the status of the API.
- `GET /model_info`: Returns the configuration of the loaded model.
- `POST /complete`: Generates a text completion.
  - Body:

    ```json
    {
      "prompt": "Write a story about a robot who dreams of becoming a chef.",
      "max_new_tokens": 256,
      "temperature": 0.8,
      "include_stats": true
    }
    ```

- `POST /batch_complete`: (Not yet implemented)
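A minimal Python client for the `/complete` endpoint can be sketched with only the standard library. The request body mirrors the fields listed above; the shape of the JSON *response* is an assumption here, so inspect `api/server.py` for the actual keys before relying on them:

```python
import json
import urllib.request

def complete(prompt, host="http://localhost:5000", **params):
    """POST to the /complete endpoint; assumes the server is running locally."""
    payload = {"prompt": prompt, **params}
    req = urllib.request.Request(
        f"{host}/complete",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # Response keys depend on server.py; this just returns the parsed JSON.
        return json.load(resp)

# Example usage (requires a running server):
# result = complete("Write a story about a robot who dreams of becoming a chef.",
#                   max_new_tokens=256, temperature=0.8, include_stats=True)
```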
To build and run the API server using Docker, use the following commands from the root of the project:
- Build the Docker image:

  ```
  docker build -t trm-api -f api/Dockerfile .
  ```

- Run the Docker container:

  ```
  docker run -p 5000:5000 trm-api
  ```
This will start the API server, and it will be accessible at http://localhost:5000.
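To confirm the container is up, you can probe the `/health` endpoint described earlier. A small sketch, assuming the default port mapping of 5000:

```python
import urllib.request

def is_healthy(base_url="http://localhost:5000"):
    """Return True if the /health endpoint responds with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, or timeout: treat as unhealthy.
        return False
```

The same check works for a server started directly with `python api/server.py`.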
Currently, there are no pre-trained TRM models available for direct download. However, you can train your own models using the provided training script and datasets. For more information on training, please refer to the official TRM GitHub repository.
The API also includes an OpenAI-compatible endpoint at /v1/chat/completions. This allows you to use the TRM model with tools and libraries that are compatible with the OpenAI API.
```
curl -X POST http://localhost:5000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a short story about a friendly dragon."}
    ],
    "max_tokens": 128,
    "temperature": 0.7
  }'
```

You can also use the `openai` Python library to interact with the endpoint. You will need to install it first (`pip install openai`).
```python
import openai

client = openai.OpenAI(
    base_url="http://localhost:5000/v1",
    api_key="trm-is-great",  # The API key is not actually used, but the library requires one.
)

response = client.chat.completions.create(
    model="trm-text-completion",
    messages=[
        {"role": "user", "content": "Write a short story about a friendly dragon."}
    ],
)

print(response.choices[0].message.content)
```