The aim is to set up a list of tools that can be used with Vast.ai. The tools are free to use, modify and distribute. If you find this helpful and would like to donate, you can send your donations to the following wallets.
BTC 15qkQSYXP2BvpqJkbj2qsNFb6nd7FyVcou
XMR 897VkA8sG6gh7yvrKrtvWningikPteojfSgGff3JAUs3cu7jxPDjhiAZRdcQSYPE2VGFVHAdirHqRZEpZsWyPiNK6XPQKAg
RVN RSgWs9Co8nQeyPqQAAqHkHhc5ykXyoMDUp
USDT(ETH ERC20) 0xa5955cf9fe7af53bcaa1d2404e2b17a1f28aac4f
PayPal PayPal.Me/cryptolabsZA
These tools have evolved into a complete datacenter management suite under the CryptoLabs organisation. If you're running GPU infrastructure on Vast.ai, RunPod, or bare metal, check out the full toolkit:
| Tool | Description | Link |
|---|---|---|
| IPMI Monitor | IPMI/BMC server monitoring with AI-powered insights, SSH log collection, and web dashboard. Monitors SEL events, sensors, GPU health, and more. | cryptolabsza/ipmi-monitor |
| DC Overview | Prometheus/Grafana datacenter monitoring dashboards. Full visibility into your fleet with pre-built dashboards for GPU, network, and system metrics. | cryptolabsza/dc-overview |
| DC Exporter | Rust-based GPU metrics exporter (dc-exporter-rs). Collects GPU core, hotspot, and VRAM temps including GDDR6X. Runs alongside Vast.ai and RunPod without interference. | cryptolabsza/dc-exporter-releases |
| DC Watchdog | SaaS uptime monitoring for your fleet. Replaces the old Telegram uptime bot with a managed service — multi-machine agents, alerts, and a dashboard. | cryptolabsza/dc-overview |
Recommended: Start with DC Overview + DC Exporter for monitoring, and IPMI Monitor if you have IPMI/BMC access to your servers.
- CryptoLabs Datacenter Tools
- Host install guide for Vast.ai
- Self-verification test
- Speedtest-cli fix for vast
- Analytics dashboard
- Monitor your Nvidia 3000/4000 Core, GPU Hotspot and VRAM temps
- NVML error when using Ubuntu 22 and 24
- Remove Persistent red error messages
- Memory OC
- OC monitor
- Stress testing GPUs on Vast with Python benchmark of RTX3090's
- Telegram-Vast-Uptime-Bot / DC Watchdog
- Auto update the price for host listing based on mining profits
- Background job or idle job for Vast.ai
- Setting fan speeds if you have a headless system
- Remove unattended-upgrades package
- How to update a host
- How to move your Vast.ai Docker driver to another drive
- Backup varlibdocker to another machine on your network
- Connecting to running instance with VNC to see applications GUI
- Setting up 3D accelerated desktop in a web browser on Vast.ai
- Useful commands
- How to set up a Docker registry for the systems on your network
#Start with a clean install of ubuntu 22.04.x HWE Kernel server. Just add openssh.
sudo apt update && sudo apt upgrade -y && sudo apt dist-upgrade -y && sudo apt install update-manager-core -y
#if you did not install HWE kernel do the following
sudo apt install --install-recommends linux-generic-hwe-22.04 -y
sudo reboot
#install the drivers.
sudo apt install build-essential -y
sudo add-apt-repository ppa:graphics-drivers/ppa -y
sudo apt update
# to search for available NVIDIA drivers: use this command
sudo apt search nvidia-driver | grep nvidia-driver | sort -r
sudo apt install nvidia-driver-560 -y # assuming the latest is 560
#Remove unattended-upgrades Package so that the drivers don't upgrade when you have clients
sudo apt purge --auto-remove unattended-upgrades -y
sudo systemctl disable apt-daily-upgrade.timer
sudo systemctl mask apt-daily-upgrade.service
sudo systemctl disable apt-daily.timer
sudo systemctl mask apt-daily.service
# This is needed to remove xserver and GNOME if you started with Ubuntu desktop. Clients can't run a desktop GUI in a container without an X server.
bash -c 'sudo apt-get update; sudo apt-get -y upgrade; sudo apt-get install -y libgtk-3-0; sudo apt-get install -y xinit; sudo apt-get install -y xserver-xorg-core; sudo apt-get remove -y gnome-shell; sudo update-grub; sudo nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration --enable-all-gpus'
#if Ubuntu is installed to an SSD and you plan to have the Vast.ai client data stored on an NVMe follow the below instructions.
#WARNING IF YOUR OS IS ON /dev/nvme0n1 IT WILL BE WIPED. CHECK TWICE. Change this device to the intended device name that you plan to use.
# This is one command that will create the XFS partition and write it to the disk /dev/nvme0n1.
echo -e "n\n\n\n\n\n\nw\n" | sudo cfdisk /dev/nvme0n1 && sudo mkfs.xfs /dev/nvme0n1p1
sudo mkdir /var/lib/docker
#I added discard so that the SSD is trimmed by Ubuntu and nofail so that if there is some problem with the drive the system will still boot.
sudo bash -c 'uuid=$(sudo xfs_admin -lu /dev/nvme0n1p1 | sed -n "2p" | awk "{print \$NF}"); echo "UUID=$uuid /var/lib/docker/ xfs rw,auto,pquota,discard,nofail 0 0" >> /etc/fstab'
sudo mount -a
# check that /dev/nvme0n1p1 is mounted to /var/lib/docker/
df -h
#this will enable Persistence mode on reboot so that the GPUs can go to idle power when not used
sudo bash -c '(crontab -l; echo "@reboot nvidia-smi -pm 1" ) | crontab -'
#run the install command for Vast.ai
sudo apt install python3 -y
sudo wget https://console.vast.ai/install -O install; sudo python3 install YourKey; history -d $((HISTCMD-1));
sudo nano /etc/default/grub # find the GRUB_CMDLINE_LINUX="" line and ensure it looks like this.
GRUB_CMDLINE_LINUX="amd_iommu=on nvidia_drm.modeset=0 systemd.unified_cgroup_hierarchy=false"
#only run this command if you plan to support VMs on your machines. Read the Vast.ai guide to understand more https://vast.ai/docs/hosting/vms
sudo bash -c 'sed -i "/^GRUB_CMDLINE_LINUX=\"\"/s/\"\"/\"amd_iommu=on nvidia_drm.modeset=0\"/" /etc/default/grub && update-grub'
sudo update-grub
#if you get an NVML error then run this
sudo wget https://raw.githubusercontent.com/jjziets/vasttools/main/nvml_fix.py
sudo python3 nvml_fix.py
sudo reboot
#follow the Configure Networking instructions as per https://console.vast.ai/host/setup
#test the ports by running sudo nc -l -p <port> on the host machine, then use https://portchecker.co to verify
sudo bash -c 'echo "40000-40019" > /var/lib/vastai_kaalia/host_port_range'
sudo reboot
#After reboot, check that the drive is mounted to /var/lib/docker and that your systems show up on the Vast.ai dashboard.
df -h # look for /var/lib/docker mount
sudo systemctl status vastai
sudo systemctl status docker
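The mount check above can also be scripted. The following is a minimal sketch, not part of the Vast.ai installer: the function name check_docker_mount is hypothetical, and it assumes the kernel lists the pquota option from /etc/fstab as prjquota in /proc/mounts, so both spellings are accepted.

```bash
#!/usr/bin/env bash
# check_docker_mount [MOUNTS_FILE]
# Succeeds only if /var/lib/docker is an XFS mount with project quotas.
# MOUNTS_FILE defaults to /proc/mounts (fields: device, mountpoint, fstype, options).
check_docker_mount() {
    local mounts_file="${1:-/proc/mounts}"
    awk '$2 == "/var/lib/docker" && $3 == "xfs" && $4 ~ /(^|,)(pquota|prjquota)(,|$)/ { found = 1 }
         END { exit(found ? 0 : 1) }' "$mounts_file"
}

if check_docker_mount; then
    echo "/var/lib/docker is an XFS mount with project quotas"
else
    echo "/var/lib/docker is not mounted as expected, check /etc/fstab"
fi
```

If the check fails after a reboot, compare the UUID in /etc/fstab with the output of sudo xfs_admin -lu for the partition.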
You can run the following test to ensure your new machine will be on the shortlist for verification testing. If you pass, there is a high chance that your machine will be eligible for verification. Take note that your router needs to allow loopback if you run this from a machine on the same network as the machine you want to test. If you do not know how to enable loopback it will be better to run this on a VM from a cloud provider or with a mobile connection to your PC.
Download the latest vastcli and set your API key
wget https://raw.githubusercontent.com/vast-ai/vast-python/master/vast.py; chmod +x vast.py;./vast.py set api-key xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-
- Single machine, fail if not meeting requirements:
./vast.py self-test machine 54321
If it fails, you see the failing requirements in the output, and the test ends.
- Single machine, continue testing anyway:
./vast.py self-test machine 54321 --ignore-requirements
Prints the failure reasons but still runs the tests.
- Multiple machines from a single host ID, ignoring requirements:
python3 vast_machine_tester.py --host_id 123456 --ignore-requirements
In a few minutes, you’ll have passed_machines.txt and failed_machines.txt with a summary.
The autoverify_machineid.sh script is part of a suite of tools designed to automate the testing of machines on the Vast.ai marketplace. This script specifically tests a single machine to determine if it meets the minimum requirements necessary for further verification.
Before you start using ./autoverify_machineid.sh, ensure you have the following:
- Vast.ai Command Line Interface (vastcli): This tool is used to interact with the Vast.ai platform.
- Vast.ai Listing: The machine should be listed on the Vast.ai marketplace.
- Ubuntu OS: The scripts are designed to run on Ubuntu 20.04 or newer.
- Download and set up vastcli:
  - Download the Vast.ai CLI tool using the following commands:
    wget https://raw.githubusercontent.com/vast-ai/vast-python/master/vast.py -O vast
    chmod +x vast
  - Set your Vast.ai API key:
    ./vast set api-key 6189d1be9f15ad2dced0ac4e3dfd1f648aeb484d592e83d13aaf50aee2d24c07
- Download autoverify_machineid.sh:
  - Use wget to download autoverify_machineid.sh to your local machine:
    wget https://github.com/jjziets/VastVerification/releases/download/0.4-beta/autoverify_machineid.sh
- Make Scripts Executable:
  - Change the permissions of the main scripts to make them executable:
    chmod +x autoverify_machineid.sh
- Dependencies:
  - Run the following to install the required packages:
    apt update
    apt install bc jq
- Check Machine Requirements:
  - The ./autoverify_machineid.sh script is designed to test if a single machine meets the minimum requirements for verification. This is useful for hosts who want to verify their own machines.
  - To test a specific machine by its machine_id, use the following command:
    ./autoverify_machineid.sh <machine_id>
    Replace <machine_id> with the actual ID of the machine you want to test.
- To Ignore Requirements Check:
  ./autoverify_machineid.sh --ignore-requirements <machine_id>
  This command runs the tests for the machine, regardless of whether it meets the minimum requirements.
- Progress and Results Logging:
  - The script logs the progress and results of the tests.
  - Successful results and machines that pass the requirements will be logged in Pass_testresults.log.
  - Machines that do not meet the requirements or encounter errors during testing will be logged in Error_testresults.log.
- Understanding the Logs:
  - Pass_testresults.log: This file contains entries for machines that successfully passed all tests.
  - Error_testresults.log: This file contains entries for machines that failed to meet the minimum requirements or encountered errors during testing.
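After a run, a quick tally of the two log files shows how the machines fared. This is a minimal sketch that assumes each log holds one entry per non-empty line, which the script's documentation does not confirm. The helper name count_entries is hypothetical.

```bash
#!/usr/bin/env bash
# count_entries FILE
# Prints the number of non-empty lines in FILE, or 0 if FILE does not exist.
count_entries() {
    if [ -f "$1" ]; then
        # grep -c exits non-zero when the count is 0, so mask that status.
        grep -c . "$1" || true
    else
        echo 0
    fi
}

echo "passed: $(count_entries Pass_testresults.log)"
echo "failed: $(count_entries Error_testresults.log)"
```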
Here’s how you can run the autoverify_machineid.sh script to test a machine with machine_id 10921:
./autoverify_machineid.sh 10921
- API Key Issues: Ensure your API key is correctly set using ./vast set api-key <your-api-key>.
- Permission Denied: If you encounter permission issues, make sure the script files have executable permissions (chmod +x <script_name>).
- Connection Issues: Verify your network connection and ensure the Vast.ai CLI can communicate with the Vast.ai servers.
By following this guide, you will be able to use the ./autoverify_machineid.sh script to test individual machines on the Vast.ai marketplace. This process helps ensure that machines meet the required specifications for GPU and system performance, making them candidates for further verification and use in the marketplace.
If your machine is not showing its upload and download speeds correctly, first check whether there is a problem by forcing the speedtest to run:
cd /var/lib/vastai_kaalia
./send_mach_info.py --speedtest
output should look like this
2024-10-03 08:50:04.587469
os version
running df
checking errors
nvidia-smi
560035003
/usr/bin/fio
checking speedtest
/usr/bin/speedtest
speedtest
running speedtest on random server id 19897
{"type":"result","timestamp":"2024-10-03T08:50:24Z","ping":{"jitter":0.243,"latency":21.723,"low":21.526,"high":22.047},"download":{"bandwidth":116386091,"bytes":1010581968,"elapsed":8806,"latency":{"iqm":22.562,"low":20.999,"high":296.975,"jitter":3.976}},"upload":{"bandwidth":116439919,"bytes":980885877,"elapsed":8508,"latency":{"iqm":36.457,"low":6.852,"high":349.495,"jitter":34.704}},"packetLoss":0,"isp":"Vox Telecom","interface":{"internalIp":"192.168.1.101","name":"bond0","macAddr":"F2:6A:67:0C:85:8B","isVpn":false,"externalIp":"41.193.204.66"},"server":{"id":19897,"host":"speedtest.wibernet.co.za","port":8080,"name":"Wibernet","location":"Cape Town","country":"South Africa","ip":"102.165.64.110"},"result":{"id":"18bb02e4-466d-43dd-b1fc-3f106319a9f6","url":"https://www.speedtest.net/result/c/18bb02e4-466d-43dd-b1fc-3f106319a9f6","persisted":true}}
....
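The bandwidth fields in the JSON line above are easier to compare with your ISP plan once converted to megabits per second. The sketch below assumes the Ookla CLI reports bandwidth in bytes per second. The helper name bandwidth_to_mbps is hypothetical, and the two input values are copied from the sample output.

```bash
#!/usr/bin/env bash
# bandwidth_to_mbps BYTES_PER_SECOND
# Converts a bandwidth value (assumed bytes per second) to Mbit/s, one decimal.
bandwidth_to_mbps() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.1f\n", b * 8 / 1000000 }'
}

bandwidth_to_mbps 116386091   # download.bandwidth from the sample -> 931.1
bandwidth_to_mbps 116439919   # upload.bandwidth from the sample   -> 931.5
```

Both values come out just above 930 Mbit/s, which is what a healthy gigabit link typically shows.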
If the above speedtest does not work, you can try installing a newer alternative. Because the newer speed test output does not have the same format, a script translates it so that Vast.ai can use the new speed test. Here are all the commands combined:
bash -c "sudo apt-get install curl -y && sudo curl -s https://packagecloud.io/install/repositories/ookla/speedtest-cli/script.deb.sh | sudo bash && sudo apt-get install speedtest -y && sudo apt install python3 -y && cd /var/lib/vastai_kaalia/latest && sudo mv speedtest-cli speedtest-cli.old && sudo wget -O speedtest-cli https://raw.githubusercontent.com/jjziets/vasttools/main/speedtest-cli.py && sudo chmod +x speedtest-cli"
or step by step
sudo apt-get install curl
sudo curl -s https://packagecloud.io/install/repositories/ookla/speedtest-cli/script.deb.sh | sudo bash
sudo apt-get install speedtest -y
sudo apt install python3 -y
cd /var/lib/vastai_kaalia/latest
sudo mv speedtest-cli speedtest-cli.old
sudo wget -O speedtest-cli https://raw.githubusercontent.com/jjziets/vasttools/main/speedtest-cli.py
sudo chmod +x speedtest-cli
This updates your speed test to the newer one and translates the output so that the Vast daemon can use it. If you now get slower speeds, follow this:
## If migrating from prior bintray install instructions please first...
# sudo rm /etc/apt/sources.list.d/speedtest.list
# sudo apt-get update
# sudo apt-get remove speedtest -y
## Other non-official binaries will conflict with Speedtest CLI
# Example how to remove using apt-get
# sudo apt-get remove speedtest-cli
sudo apt-get install curl
curl -s https://packagecloud.io/install/repositories/ookla/speedtest-cli/script.deb.sh | sudo bash
sudo apt-get install speedtest
Recommended: Use the CryptoLabs monitoring stack for a modern, production-ready solution:
- DC Overview — Prometheus/Grafana dashboards with pre-built panels for GPU, network, earnings, and system metrics.
- DC Exporter — Rust-based GPU metrics exporter. Runs as a lightweight binary alongside Vast.ai and RunPod without affecting your rentals.
- IPMI Monitor — IPMI/BMC monitoring with AI-powered insights, SSH log collection, and a web dashboard.
Legacy DCMontoring (archived): https://github.com/jjziets/DCMontoring
Run the script below if you have a problem with the Vast.ai installer on Ubuntu 22 or 24 and receive an NVML error. This script is based on Bo26fhmC5M, so credit goes to him.
sudo wget https://raw.githubusercontent.com/jjziets/vasttools/main/nvml_fix.py
sudo python3 nvml_fix.py
If you have a red error message on your machine that you have confirmed has been addressed, it might help to delete /var/lib/vastai_kaalia/kaalia.log and restart the vastai service (or reboot).
sudo rm /var/lib/vastai_kaalia/kaalia.log
sudo systemctl restart vastai
Recommended: dc-exporter-rs — A modern Rust-based GPU metrics exporter that collects GPU core, hotspot, and VRAM temperatures (including GDDR6X). Exposes Prometheus metrics for Grafana dashboards.
Key advantage: Runs as a lightweight system binary and does not interfere with Vast.ai or RunPod — your rentals and workloads are unaffected.
# Quick install (see repo for full instructions)
wget https://github.com/cryptolabsza/dc-exporter-releases/releases/latest/download/dc-exporter-rs-linux-amd64
chmod +x dc-exporter-rs-linux-amd64
sudo ./dc-exporter-rs-linux-amd64
Legacy nvml_direct_access tool (archived):

sudo wget https://github.com/jjziets/gddr6_temps/raw/master/nvml_direct_access
sudo chmod +x nvml_direct_access
sudo ./nvml_direct_access
Set the OC of the RTX 3090. It requires the following:
On the host run the following command:
sudo apt-get install libgtk-3-0 && sudo apt-get install xinit && sudo apt-get install xserver-xorg-core && sudo update-grub && sudo nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration --enable-all-gpus
wget https://raw.githubusercontent.com/jjziets/vasttools/main/set_mem.sh
sudo chmod +x set_mem.sh
sudo ./set_mem.sh 2000 # this will set the memory OC to +1000MHz on all the GPUs. You can use 3000 on some GPUs, which will give 1500MHz OC.
Set up the monitoring program that will change the memory OC based on what program is running. It is designed for RTX 3090s and targets ethminer at this stage. It requires both set_mem.sh and ocminitor.sh to run as root.
wget https://raw.githubusercontent.com/jjziets/vasttools/main/ocminitor.sh
sudo chmod +x ocminitor.sh
sudo ./ocminitor.sh # I suggest running this in tmux or screen so that when you close the SSH connection it keeps running. It looks for ethminer and if it finds it, it will set the OC based on your choice. You can also set power limits with nvidia-smi -pl 350
To load at reboot use the crontab below
sudo bash -c '(crontab -l; echo "@reboot screen -dmS ocmonitor /home/jzietsman/ocminitor.sh") | crontab -' # replace the user with your user
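Running the crontab one-liner more than once adds the same @reboot line each time. A small filter avoids that by appending the entry only when it is missing. This is a sketch under that assumption, and the function name append_once is hypothetical.

```bash
#!/usr/bin/env bash
# append_once ENTRY
# Reads an existing crontab on stdin and writes it back to stdout,
# adding ENTRY as a new last line only if no identical line exists.
append_once() {
    local entry="$1" current
    current="$(cat)"
    if [ -n "$current" ]; then
        printf '%s\n' "$current"
    fi
    if ! printf '%s\n' "$current" | grep -Fxq -- "$entry"; then
        printf '%s\n' "$entry"
    fi
}

# Intended use as root (commented out so this file is safe to source):
# crontab -l 2>/dev/null | append_once "@reboot nvidia-smi -pm 1" | crontab -
```

The same filter works for the other @reboot lines in this guide, since it compares whole lines literally.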
Mining does not stress your system the same as Python workloads do, so this is a good test to run as well.
First, set a maintenance window, and then once you have no clients running, you can do the stress testing.
https://github.com/jjziets/pytorch-benchmark-volta
A full suite of stress tests can be found in the docker image jjziets/vastai-benchmarks:latest in folder /app/
stress-ng - CPU stress
stress-ng - Drive stress
stress-ng - Memory stress
sysbench - Memory latency and speed benchmark
dd - Drive speed benchmark
Hashcat - Benchmark
bandwidthTest - GPU bandwidth benchmark
pytorch - Pytorch DL benchmark
#test or bash interface
sudo docker run --shm-size 1G --rm -it --gpus all jjziets/vastai-benchmarks /bin/bash
apt update && apt upgrade -y
./benchmark.sh
#Run using default settings Results are saved to ./output.
sudo docker run -v ${PWD}/output:/app/output --shm-size 1G --rm -it --gpus all jjziets/vastai-benchmarks
Run with parameters SLEEP_TIME/BENCH_TIME
sudo docker run -v ${PWD}/output:/app/output --shm-size 1G --rm -it -e SLEEP_TIME=2 -e BENCH_TIME=2 --gpus all jjziets/vastai-benchmarks
You can also do a GPU burn test.
sudo docker run --gpus all --rm oguzpastirmaci/gpu-burn <test duration in seconds>
If you want to run it for one GPU, run the command below, replacing the x with the GPU number starting at 0.
sudo docker run --gpus '"device=x"' --rm oguzpastirmaci/gpu-burn <test duration in seconds>
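To burn each GPU in turn, the per-GPU command can be generated in a loop. The sketch below only prints the commands so they can be reviewed first. The function name gpu_burn_commands is hypothetical, and the GPU count is passed in by hand (counting the lines of nvidia-smi -L is one way to get it).

```bash
#!/usr/bin/env bash
# gpu_burn_commands NUM_GPUS SECONDS
# Prints one gpu-burn docker command per GPU index, starting at 0.
gpu_burn_commands() {
    local num_gpus="$1" seconds="$2" i=0
    while [ "$i" -lt "$num_gpus" ]; do
        printf "sudo docker run --gpus '\"device=%d\"' --rm oguzpastirmaci/gpu-burn %d\n" "$i" "$seconds"
        i=$((i + 1))
    done
}

# Example: two GPUs, 60 seconds each.
gpu_burn_commands 2 60
```

Piping the output to bash would run the tests one after another, which keeps the power draw of a single failing card easy to spot.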
*Based on Leona / vast.ai-tools
Recommended: DC Watchdog — A managed SaaS uptime monitoring service that replaces the self-hosted Telegram bot. Features include:
- Multi-machine monitoring with lightweight agents
- Alerts via Telegram, email, and push notifications
- Centralized dashboard with history and analytics
- No need to run your own server — the agents report to the CryptoLabs cloud
Legacy self-hosted bot (still works for simple setups): This is a set of scripts for monitoring machine crashes. Run the client on your Vast.ai machine and the server on a remote one. You get notifications on Telegram if no heartbeats are sent within the timeout (default 12 seconds). https://github.com/jjziets/Telegram-Vast-Uptime-Bot
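The heartbeat idea behind the legacy bot can be illustrated in a few lines. This is not the bot's actual code, only a sketch of the timeout comparison. The function name heartbeat_expired is hypothetical and the default of 12 seconds is taken from the description above.

```bash
#!/usr/bin/env bash
# heartbeat_expired LAST_SEEN NOW [TIMEOUT]
# Succeeds when more than TIMEOUT seconds (default 12) have passed
# between LAST_SEEN and NOW, both given as Unix timestamps.
heartbeat_expired() {
    local last_seen="$1" now="$2" timeout="${3:-12}"
    [ $(( now - last_seen )) -gt "$timeout" ]
}

now=$(date +%s)
if heartbeat_expired "$((now - 30))" "$now"; then
    echo "no heartbeat for more than 12 seconds, machine may be down"
fi
```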
The pricing is based on an RTX 3090 at 120 MH/s for ETH. It sets the price of my two hosts. It works with a custom Vast CLI, which can be found here: https://github.com/jjziets/vast-python/blob/master/vast.py The manager is here: https://github.com/jjziets/vasttools/blob/main/setprice.sh
This should be run on a VPS, not on a host. Do not expose your Vast API keys by using it on the host.
wget https://raw.githubusercontent.com/jjziets/vast-python/master/vast.py
sudo chmod +x vast.py
./vast.py set api-key UseYourVasset
wget https://raw.githubusercontent.com/jjziets/vasttools/main/setprice.sh
sudo chmod +x setprice.sh
The best way to manage your idle job is via the Vast CLI. To my knowledge, setting the job through the GUI is broken, so use the steps below to set an idle job. You will need to download the Vast CLI and run the following commands. The idea is to rent your own machine as an interruptible job. The Vast CLI allows you to set one idle job for all the GPUs or one GPU per instance. You can also set the SSH connection method or any other method. Go to https://cloud.vast.ai/cli/ and install your CLI flavor.
Set up your account key so that you can use the Vast CLI. You get this key from your account page.
./vast set api-key API_KEY
You can use my SetIdleJob.py script to set up your idle job based on the minimum price set on your machines.
wget https://raw.githubusercontent.com/jjziets/vasttools/main/SetIdleJob.py
Here is an example of how I mine to NiceHash
python3 SetIdleJob.py --args 'env | grep _ >> /etc/environment; echo "starting up"; apt -y update; apt -y install wget; apt -y install libjansson4; apt -y install xz-utils; wget https://github.com/develsoftware/GMinerRelease/releases/download/3.44/gminer_3_44_linux64.tar.xz; tar -xvf gminer_3_44_linux64.tar.xz; while true; do ./miner --algo kawpow --server stratum+tcp://kawpow.auto.nicehash.com:9200 --user 3LNHVWvUEufL1AYcKaohxZK2P58iBHdbVH.${VAST_CONTAINERLABEL:2}; done'
Or the full command if you don't want to use the defaults
python3 SetIdleJob.py --image nvidia/cuda:12.4.1-runtime-ubuntu22.04 --disk 16 --args 'env | grep _ >> /etc/environment; echo "starting up"; apt -y update; apt -y install wget; apt -y install libjansson4; apt -y install xz-utils; wget https://github.com/develsoftware/GMinerRelease/releases/download/3.44/gminer_3_44_linux64.tar.xz; tar -xvf gminer_3_44_linux64.tar.xz; while true; do ./miner --algo kawpow --server stratum+tcp://kawpow.auto.nicehash.com:9200 --user 3LNHVWvUEufL1AYcKaohxZK2P58iBHdbVH.${VAST_CONTAINERLABEL:2}; done' --api-key b149b011a1481cd852b7a1cf1ccc9248a5182431b23f9410c1537fca063a68b1
Troubleshoot your bash -c command by using the logs on the instance page
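The worker name in the miner commands relies on the bash substring expansion ${VAST_CONTAINERLABEL:2}, which drops the first two characters of the variable. The sketch below shows the effect with a made-up label. The value C.1234567 is an assumption about the label format, not something taken from Vast.ai output.

```bash
#!/usr/bin/env bash
# Hypothetical container label; on a real instance Vast.ai sets this variable.
sample_label="C.1234567"

# ${VAR:2} is bash syntax, so run it through bash -c as the idle job does.
worker=$(VAST_CONTAINERLABEL="$sample_label" bash -c 'echo "${VAST_CONTAINERLABEL:2}"')

echo "worker name: $worker"   # prints: worker name: 1234567
```

If the idle job logs show a "Bad substitution" error, the command is probably being run by sh instead of bash.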
Alternatively, you can rent yourself with the following command and then log in and load what you want to run. Make sure to add your process to onstart.sh. To rent yourself first find your machine with the machine ID
./vast search offers "machine_id=14109 verified=any gpu_frac=1 " # gpu_frac=1 will give you the instance with all the gpus.
or
./vast search offers -i "machine_id=14109 verified=any min_bid>0.1 num_gpus=1" # it will give you the instance with one GPU
Once you have the offer ID, you can create the instance. In this case, the search with the -i switch returns the ID of an interruptible offer.
Let's assume you want to mine with lolminer
./vast create instance 9554646 --price 0.2 --image nvidia/cuda:12.0.1-devel-ubuntu20.04 --env '-p 22:22' --onstart-cmd 'bash -c "apt -y update; apt -y install wget; apt -y install libjansson4; apt -y install xz-utils; wget https://github.com/Lolliedieb/lolMiner-releases/releases/download/1.77b/lolMiner_v1.77b_Lin64.tar.gz; tar -xf lolMiner_v1.77b_Lin64.tar.gz -C ./; cd 1.77b; ./lolMiner --algo ETCHASH --pool etc.2miners.com:1010 --user 0xYour_Wallet_Goes_Here.VASTtest"' --ssh --direct --disk 100
It will start the instance at price 0.2.
./vast show instances # will give you the list of instances
./vast change bid 9554646 --price 0.3 # This will change the price to 0.3 for the instance
Here is a repo with two programs and a few scripts that you can use to manage your fans https://github.com/jjziets/GPU_FAN_OC_Manager/tree/main
bash -c "wget https://github.com/jjziets/GPU_FAN_OC_Manager/raw/main/set_fan_curve; chmod +x set_fan_curve; CURRENT_PATH=\$(pwd); nohup bash -c \"while true; do \$CURRENT_PATH/set_fan_curve 65; sleep 1; done\" > output.txt & (crontab -l; echo \"@reboot screen -dmS gpuManger bash -c 'while true; do \$CURRENT_PATH/set_fan_curve 65; sleep 1; done'\") | crontab -"
If your system updates while Vast.ai is running, or even worse when a client is renting you, then you might get de-verified or banned. It's advised to only update when the system is unrented and delisted. The best approach would be to set an end date for your listing and conduct updates and upgrades at that stage. To stop unattended-upgrades run the following commands.
sudo apt purge --auto-remove unattended-upgrades -y
sudo systemctl disable apt-daily-upgrade.timer
sudo systemctl mask apt-daily-upgrade.service
sudo systemctl disable apt-daily.timer
sudo systemctl mask apt-daily.service
When the system is idle and delisted, run the following command. It stops the Vast daemon and Docker services, upgrades the packages, and then starts the services again. It is also a good idea to upgrade Nvidia drivers this way. If you don't, and an upgrade breaks a package, you might get de-verified or even banned from Vast.ai.
bash -c ' sudo systemctl stop vastai; sudo systemctl stop docker.socket; sudo systemctl stop docker; sudo apt update; sudo apt upgrade -y; sudo systemctl start docker.socket ; sudo systemctl start docker; sudo systemctl start vastai'
This guide illustrates how to back up Vast.ai Docker data from an existing drive and transfer it to a new drive, in this case a RAID drive, /dev/md0.
- Ensure no clients are running and that you are unlisted from the Vast.ai market.
- Docker data exists on the current drive.
- Install required tools:
sudo apt install pv pixz
- Stop and disable relevant services:
sudo systemctl stop vastai docker.socket docker
sudo systemctl disable vastai docker.socket docker
- Backup the Docker directory: Create a compressed backup of the /var/lib/docker directory. Ensure there's enough space on the OS drive for this backup, or move the data to a backup server. See https://github.com/jjziets/vasttools/blob/main/README.md#backup-varlibdocker-to-another-machine-on-your-network
sudo tar -c -I 'pixz -k -1' -f ./docker.tar.pixz /var/lib/docker | pv # you can change ./ to a destination directory
Note: pixz utilizes multiple cores for faster compression.
- Unmount the Docker directory: If you're planning to shut down and install a new drive:
sudo umount /var/lib/docker
- Update /etc/fstab: Disable auto-mounting of the current Docker directory at startup to prevent boot issues:
sudo nano /etc/fstab
Comment out the line associated with /var/lib/docker by adding a # at the start of the line.
- Partition the New Drive: (Adjust the device name based on your system. The guide uses /dev/md0 for RAID and /dev/nvme0n1 for NVMe drives as examples.)
sudo cfdisk /dev/md0
- Format the new partition with XFS:
sudo mkfs.xfs -f /dev/md0p1
- Retrieve the UUID: You'll need the UUID for updating /etc/fstab.
sudo xfs_admin -lu /dev/md0p1
- Update /etc/fstab with the New Drive:
sudo nano /etc/fstab
Add the following line (replace the UUID with the one you retrieved):
UUID="YOUR_UUID_HERE" /var/lib/docker xfs rw,auto,pquota,discard,nofail 0 0
- Mount the new partition:
sudo mount -a
Confirm the mount:
df -h
Ensure /dev/md0p1 (or the appropriate device name) is mounted to /var/lib/docker.
- Restore the Docker data: Navigate to the root directory:
cd /
Decompress and restore. Ensure you change the user to the relevant name:
sudo cat /home/user/docker.tar.pixz | pv | sudo tar -x -I 'pixz -d -k'
- Enable services:
sudo systemctl enable vastai docker.socket docker
- Reboot:
sudo reboot
Check if the desired drive is mounted to /var/lib/docker and ensure vastai is operational.
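Before starting the backup step, it helps to confirm that the destination has room for the archive. The sketch below compares the uncompressed size of the source with the free space at the destination, which is a conservative check because the pixz archive is normally smaller. The function name enough_space is hypothetical.

```bash
#!/usr/bin/env bash
# enough_space SRC_DIR DEST_DIR
# Succeeds if the filesystem holding DEST_DIR has at least as many free
# kilobytes as SRC_DIR currently uses (uncompressed, so conservative).
enough_space() {
    local src="$1" dest="$2" used_kb free_kb
    [ -e "$src" ] || return 1
    used_kb=$(du -sk "$src" | awk '{ print $1 }')
    free_kb=$(df -Pk "$dest" | awk 'NR == 2 { print $4 }')
    [ "$free_kb" -ge "$used_kb" ]
}

# Intended use as root before running tar (commented out so sourcing is safe):
# enough_space /var/lib/docker . || echo "not enough room here, use a backup server"
```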
If you're looking to migrate your Docker setup to another machine, whether for replacing the drive or setting up a RAID, follow this guide. For this example, we'll assume the backup server's IP address is 192.168.1.100.
- Temporarily Enable Root SSH Login: It's essential to ensure uninterrupted SSH communication during the backup process, especially when transferring large files like compressed Docker data.
a. Open the SSH configuration:
sudo nano /etc/ssh/sshd_config
b. Locate and change the line:
PermitRootLogin no
to:
PermitRootLogin yes
c. Reload the SSH configuration:
sudo systemctl restart sshd
- Generate an SSH Key and Transfer it to the Backup Server:
a. Create the SSH key:
sudo ssh-keygen
b. Copy the SSH key to the backup server:
sudo ssh-copy-id -i ~/.ssh/id_rsa root@192.168.1.100
- Disable Root Password Authentication: Ensure only the SSH key can be used for root login, enhancing security.
a. Modify the SSH configuration:
sudo nano /etc/ssh/sshd_config
b. Change the line to:
PermitRootLogin prohibit-password
c. Reload the SSH configuration:
sudo systemctl restart sshd
- Preparation for Backup: Before backing up, ensure relevant services are halted:
sudo systemctl stop docker.socket
sudo systemctl stop docker
sudo systemctl stop vastai
sudo systemctl disable vastai
sudo systemctl disable docker.socket
sudo systemctl disable docker
- Backup Procedure: This procedure compresses the /var/lib/docker directory and transfers it to the backup server.
a. Switch to the root user and install necessary tools:
sudo su
apt install pixz
apt install pv
It might be a good idea to run the backup command in tmux or screen so that if you lose the SSH connection the process will finish.
b. Perform the backup:
tar -c -I 'pixz -k -0' -f - /var/lib/docker | pv | ssh root@192.168.1.100 "cat > /mnt/backup/machine/docker.tar.pixz"
- Restoring the Backup: Make sure your new drive is mounted at /var/lib/docker.
a. Switch to the root user:
sudo su
b. Restore from the backup:
cd /
ssh root@192.168.1.100 "cat /mnt/backup/machine/docker.tar.pixz" | pv | sudo tar -x -I 'pixz -d -k'
- Reactivate Services:
sudo systemctl enable vastai
sudo systemctl enable docker.socket
sudo systemctl enable docker
sudo reboot
Post-reboot: Ensure your target drive is mounted to /var/lib/docker and that vastai is operational.
This uses an instance with open ports. If the display does not render correctly at 16-bit color depth, try another VNC viewer. TightVNC worked for me on Windows.
First tell Vast.ai to allow a port to be assigned. Use the -p 8081:8081 and tick the direct command.
Find a host with open ports and then rent it, preferably on demand. Go to the client instances page and wait for the connect button.
Use SSH to connect to the instances.

Run the commands below. The second part can be placed in the onstart.sh to run on restart.
bash -c 'apt-get update; apt-get -y upgrade; apt-get install -y x11vnc; apt-get install -y xvfb; apt-get install -y firefox;apt-get install -y xfce4;apt-get install -y xfce4-goodies'
export DISPLAY=:20
Xvfb :20 -screen 0 1920x1080x16 &
x11vnc -passwd TestVNC -display :20 -N -forever -rfbport 8081 &
startxfce4
To connect, use the IP of the host and the port that was provided. In this case it is 40010.

Then enjoy the desktop. Sadly this is not hardware accelerated, so no games will work.
We will be using ghcr.io/ehfd/nvidia-glx-desktop:latest
Use these environment parameters
-e TZ=UTC -e SIZEW=1920 -e SIZEH=1080 -e REFRESH=60 -e DPI=96 -e CDEPTH=24 -e VIDEO_PORT=DFP -e PASSWD=mypasswd -e WEBRTC_ENCODER=nvh264enc -e BASIC_AUTH_PASSWORD=mypasswd -p 8080:8080
Find a system that has open ports

The username is user and the password is what you set, mypasswd in this case.

3D accelerated desktop environment in a web browser

This will reduce the number of pull requests from your public IP. Docker Hub is restricted to 100 pulls per six hours for unauthenticated users, and a local registry can also speed up the startup time for your rentals. This guide provides instructions on how to set up a Docker registry server using Docker Compose, as well as configuring Docker clients to use this registry.
Prerequisites
- Docker and Docker Compose are installed on a server on your local LAN that has a lot of fast storage.
- Docker is installed on all client machines.
Setting Up the Docker Registry Server
Install docker-compose if you have not already.
sudo su
curl -L "https://github.com/docker/compose/releases/download/v2.24.4/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
ln -s /usr/local/bin/docker-compose /usr/bin/docker-compose
apt-get update && sudo apt-get install -y gettext-base
Create a docker-compose.yml file: Create a file named docker-compose.yml on your server with the following content:
version: '3'
services:
  registry:
    restart: unless-stopped
    image: registry:2
    ports:
      - 5000:5000
    environment:
      - REGISTRY_PROXY_REMOTEURL=https://registry-1.docker.io
      - REGISTRY_STORAGE_DELETE_ENABLED="true"
    volumes:
      - data:/var/lib/registry
volumes:
  data:
This configuration sets up a Docker registry server running on port 5000 and uses a volume named data for storage. Start the Docker Registry:
Run the following command in the directory where your docker-compose.yml file is located:
sudo docker-compose up -d
This command will start the Docker registry in detached mode.
If space is limited, you can run this cleanup task as a cron job on the server.
wget https://github.com/jjziets/vasttools/raw/main/cleanup-registry.sh
chmod +x cleanup-registry.sh
Add this line to your crontab -e
0 * * * * /path/to/cleanup-registry.sh
Replace /path/to/ with where the file is saved.
To configure Docker clients to use the registry, follow these steps on each client machine: Edit the Docker Daemon Configuration: Run the following command to add your Docker registry as a mirror in the Docker daemon configuration:
echo '{
"runtimes": {
"nvidia": {
"path": "nvidia-container-runtime",
"runtimeArgs": []
}
},
"registry-mirrors": ["http://192.168.100.7:5000"]
}' | sudo tee /etc/docker/daemon.json
Replace 192.168.100.7:5000 with the IP address and port of your Docker registry server. Restart the Docker daemon:
sudo systemctl restart docker
Verifying the setup To verify that the Docker registry is set up correctly, you can try pulling an image from the registry:
docker pull 192.168.100.7:5000/your-image
Replace 192.168.100.7:5000/your-image with the appropriate registry URL and image name.
If you set up the Vast CLI, you can enter this
./vast show machines | grep "current_rentals_running_on_demand"
If it returns 0, then it's an interruptible rent.
Command on a host that provides logs of the daemon running
tail /var/lib/vastai_kaalia/kaalia.log -f
Uninstall Vast
wget https://s3.amazonaws.com/vast.ai/uninstall.py
sudo python3 uninstall.py






