Proxy that allows for:
- pushing Prometheus metrics signed with Bittensor wallets. Operating in this mode does not require a database or Redis.
- verifying incoming signed metrics. Operating in this mode does not require a wallet. Verification is two-fold:
  - the full payload is signed, and both the signature and the hotkey are included in the headers; that signature is verified
  - the metrics data blob is unpacked and each metric is checked for the "hotkey" label, which has to match the value in the header
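The label check (the second verification step) can be sketched in a few lines. This is an illustration only, not the proxy's actual code: it assumes the unpacked blob has already been converted into one label dictionary per metric, and the names `labels_match_hotkey`, `accepted`, `rejected` and `unlabelled` are hypothetical. The signature check over the full payload is not shown.

```python
from typing import Iterable, Mapping


def labels_match_hotkey(metrics: Iterable[Mapping[str, str]], header_hotkey: str) -> bool:
    """Return True only if every metric has a "hotkey" label equal to the header value.

    `metrics` is an assumed shape: one mapping of label name to label value per metric.
    Note that an empty iterable passes this check, so a real implementation
    may want to reject empty payloads separately.
    """
    return all(labels.get("hotkey") == header_hotkey for labels in metrics)


# Hypothetical label sets, as they might look after unpacking the metrics blob.
accepted = [
    {"__name__": "up", "hotkey": "5Fexample"},
    {"__name__": "node_load1", "hotkey": "5Fexample"},
]
rejected = accepted + [{"__name__": "up", "hotkey": "5Fother"}]
unlabelled = [{"__name__": "up"}]

print(labels_match_hotkey(accepted, "5Fexample"))    # all labels match the header
print(labels_match_hotkey(rejected, "5Fexample"))    # one metric claims another hotkey
print(labels_match_hotkey(unlabelled, "5Fexample"))  # the label is missing entirely
```

A metric with a missing label is treated the same as one with a mismatching label, which matches the description above: every metric has to carry the header's hotkey.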
- docker with compose plugin
- python 3.11
- pdm
- nox
./setup-dev.sh
docker compose up -d # this will also start node_exporter and two Prometheus instances
cd app/src
pdm run manage.py wait_for_database --timeout 10
pdm run manage.py migrate
pdm run manage.py runserver 0.0.0.0:8000
This setup requires a working Bittensor wallet (so that the on-site Prometheus can read the hotkey and the proxy can sign requests). Requests are sent from the on-site Prometheus to the proxy, then to the same proxy again (a different view, though), and finally to the central Prometheus. Starting Celery and Celery beat is not required for local development: instead of having a periodic task populate the validator list, one can add records to it manually using
python manage.py debug_add_validator <hotkey>
This sets up "deployment by pushing to git storage on remote", so that:
- git push origin ... just pushes code to GitHub / other storage without any consequences;
- git push production master pushes code to a remote server running the app and triggers a git hook to redeploy the application.
Local .git ------------> Origin .git
\
------> Production .git (redeploy on push)
Use ssh-keygen to generate a key pair for the server, then add the public key with read-only access to the repository in the "deployment keys" section (ssh -A is easy to use, but not safe).
# remote server
mkdir -p ~/repos
cd ~/repos
git init --bare --initial-branch=master bittensor-prometheus-proxy.git
mkdir -p ~/domains/bittensor-prometheus-proxy
# locally
git remote add production root@<server>:~/repos/bittensor-prometheus-proxy.git
git push production master
# remote server
cd ~/repos/bittensor-prometheus-proxy.git
cat <<'EOT' > hooks/post-receive
#!/bin/bash
unset GIT_INDEX_FILE
export ROOT=/root
export REPO=bittensor-prometheus-proxy
while read oldrev newrev ref
do
if [[ $ref =~ .*/master$ ]]; then
export GIT_DIR="$ROOT/repos/$REPO.git/"
export GIT_WORK_TREE="$ROOT/domains/$REPO/"
git checkout -f master
cd $GIT_WORK_TREE
./deploy.sh
else
echo "Doing nothing: only the master branch may be deployed on this server."
fi
done
EOT
chmod +x hooks/post-receive
./hooks/post-receive
cd ~/domains/bittensor-prometheus-proxy
sudo bin/prepare-os.sh
./setup-prod.sh
# adjust the `.env` file
mkdir letsencrypt
./letsencrypt_setup.sh
./deploy.sh
Only the master branch is used to redeploy the application.
To deploy another branch, force-push it to the remote's master:
git push --force production local-branch-to-deploy:master
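The branch filter in the post-receive hook can be modelled in Python to see which pushed refs would trigger a deploy. This is a sketch for illustration, not part of the deployment: the function name `refs_to_deploy` and the sample revisions are hypothetical, and only the pattern `.*/master$` is taken from the hook.

```python
import re

# Same pattern as the hook's `[[ $ref =~ .*/master$ ]]` test. Bash `=~` is an
# unanchored search, so re.search is the closest Python equivalent.
MASTER_REF = re.compile(r".*/master$")


def refs_to_deploy(stdin_lines):
    """Yield refs from post-receive input lines ("oldrev newrev ref") that match the filter."""
    for line in stdin_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # ignore malformed lines in this sketch
        _oldrev, _newrev, ref = parts
        if MASTER_REF.search(ref):
            yield ref


# Hypothetical input, one line per updated ref, as git feeds it to the hook.
pushed = [
    "1111111 2222222 refs/heads/master",
    "3333333 4444444 refs/heads/local-branch-to-deploy",
    "5555555 6666666 refs/heads/feature/master",
]
print(list(refs_to_deploy(pushed)))
```

The pattern matches any ref whose last path component is `master`, so a push to a branch such as `feature/master` would also pass the filter, and the hook would then check out and deploy `master` itself.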
There is a special queue named `dead_letter` that is used to store tasks that failed for some reason.
A task should be annotated with on_failure=send_to_dead_letter_queue.
Once the cause of the failure is fixed, the tasks can be re-processed
by moving them from the dead letter queue to the main one ("celery"):
manage.py move_tasks "dead_letter" "celery"
If a task fails again, it will be put back into the dead letter queue.
To flush all tasks in a specific queue, use
manage.py flush_tasks "dead_letter"
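The intended semantics of the two commands can be pictured with in-memory queues. This is a conceptual model only, under the assumption that moving preserves task order. It does not show how the management commands talk to the broker, and the `queues` dictionary and task names are hypothetical.

```python
from collections import deque

# In-memory stand-ins for the broker queues named in the text.
queues = {"celery": deque(), "dead_letter": deque()}


def move_tasks(source: str, destination: str) -> int:
    """Move every task from `source` to `destination`, keeping order. Returns the count moved."""
    moved = 0
    while queues[source]:
        queues[destination].append(queues[source].popleft())
        moved += 1
    return moved


def flush_tasks(name: str) -> int:
    """Discard every task in the named queue. Returns the count discarded."""
    discarded = len(queues[name])
    queues[name].clear()
    return discarded


# Two failed tasks are waiting in the dead letter queue.
queues["dead_letter"].extend(["task-a", "task-b"])
print(move_tasks("dead_letter", "celery"))  # both go back to the main queue

# A task that is not worth retrying can be dropped instead.
queues["dead_letter"].append("task-c")
print(flush_tasks("dead_letter"))
```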
Running the app requires proper certificates to be put into nginx/monitoring_certs,
see nginx/monitoring_certs/README.md for more details.
Somewhere, probably in metrics.py:
import prometheus_client

some_calculation_time = prometheus_client.Histogram(
    'some_calculation_time',
    'How long it took to calculate something',
    namespace='django',
    unit='seconds',
    labelnames=['task_type_for_example'],
    buckets=[0.5, 1, *range(2, 30, 2), *range(30, 75, 5), *range(75, 135, 15)]
)
Somewhere else:
with some_calculation_time.labels('blabla').time():
    do_some_work()
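The `buckets` expression in the histogram example is compact but hard to read at a glance. Evaluating it on its own shows the resulting upper bounds: 2-second steps up to 28, 5-second steps from 30 to 70, and 15-second steps from 75 to 120. The helper `bucket_for` is hypothetical and only illustrates which bound an observation first falls under; `prometheus_client` adds the final `+Inf` bucket itself.

```python
# Same expression as the `buckets=` argument in the example, evaluated standalone.
buckets = [0.5, 1, *range(2, 30, 2), *range(30, 75, 5), *range(75, 135, 15)]

print(len(buckets))   # 29 explicit upper bounds
print(buckets[:4])    # [0.5, 1, 2, 4]
print(buckets[-4:])   # [75, 90, 105, 120]


def bucket_for(seconds: float):
    """Return the smallest upper bound >= `seconds`, or None for the implicit +Inf bucket."""
    for upper_bound in buckets:
        if seconds <= upper_bound:
            return upper_bound
    return None


print(bucket_for(0.3))   # 0.5
print(bucket_for(29))    # 30
print(bucket_for(121))   # None, i.e. only the +Inf bucket
```

Anything slower than 120 seconds lands only in `+Inf`, so the upper range may need extending if the measured work regularly takes longer than two minutes.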
To schedule a nightly database backup, add to crontab:
# crontab -e
30 0 * * * cd ~/domains/bittensor-prometheus-proxy && ./bin/backup-db.sh > ~/backup.log 2>&1
Set BACKUP_LOCAL_ROTATE_KEEP_LAST to keep only a specific number of the most recent backups in the local .backups directory.
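The effect of a keep-last rotation can be sketched as follows. This is not the logic of bin/backup-db.sh, which is not shown here. The function name `backups_to_delete` and the file names are hypothetical, and the sketch assumes backup names sort chronologically (for example, by a timestamp in the name).

```python
def backups_to_delete(filenames, keep_last):
    """Return the backups that fall outside the `keep_last` most recent ones.

    Assumes names sort chronologically; a zero or negative limit is treated
    as "keep everything" in this sketch.
    """
    if keep_last <= 0:
        return []
    ordered = sorted(filenames)
    return ordered[:-keep_last]


# Hypothetical contents of the local .backups directory.
existing = ["2024-01-03.dump", "2024-01-01.dump", "2024-01-02.dump"]
print(backups_to_delete(existing, keep_last=2))   # only the oldest one is removed
print(backups_to_delete(existing, keep_last=10))  # fewer backups than the limit: nothing to remove
```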
Backups are put in the local .backups directory; additionally, they can be stored offsite in the following ways:
Backblaze
Set in .env file:
- BACKUP_B2_BUCKET_NAME
- BACKUP_B2_KEY_ID
- BACKUP_B2_KEY_SECRET
Email
Set in .env file:
- EMAIL_HOST
- EMAIL_PORT
- EMAIL_HOST_USER
- EMAIL_HOST_PASSWORD
- EMAIL_TARGET
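Before relying on offsite backups, it can help to confirm that a whole group of variables is set rather than only some of them. The check below is a hypothetical helper, not something shipped with the project. The variable names come from the lists above, while `missing_vars` and the sample values are assumptions made for illustration.

```python
B2_VARS = ("BACKUP_B2_BUCKET_NAME", "BACKUP_B2_KEY_ID", "BACKUP_B2_KEY_SECRET")
EMAIL_VARS = (
    "EMAIL_HOST",
    "EMAIL_PORT",
    "EMAIL_HOST_USER",
    "EMAIL_HOST_PASSWORD",
    "EMAIL_TARGET",
)


def missing_vars(env, required):
    """Return the names from `required` that are absent or empty in the `env` mapping."""
    return [name for name in required if not env.get(name)]


# Sample environment with made-up values; pass os.environ to check the real one.
sample_env = {
    "BACKUP_B2_BUCKET_NAME": "example-bucket",
    "BACKUP_B2_KEY_ID": "example-id",
    "BACKUP_B2_KEY_SECRET": "",
}
print(missing_vars(sample_env, B2_VARS))
print(missing_vars(sample_env, EMAIL_VARS))
```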
- Follow the instructions above to set up a new production environment
- Restore the database using bin/restore-db.sh
- See if everything works
- Set up backups on the new machine
- Make sure everything is set up: .env values, error reporting integration, email accounts, etc.
The skeleton of this project was generated using cookiecutter-rt-django.
Use cruft update to update the project to the latest version of the template, with all current bugfixes and features.