2 changes: 1 addition & 1 deletion Cargo.toml
@@ -1,6 +1,6 @@
[package]
name = "pastebin"
version = "0.1.3"
version = "0.1.4"
authors = ["Kaczanowski Mateusz <kaczanowski.mateusz@gmail.com>"]
edition = "2018"

Expand Down
58 changes: 42 additions & 16 deletions README.md
Expand Up @@ -13,30 +13,35 @@
[github-workflow]: https://github.com/mkaczanowski/pastebin/workflows/Test%20and%20Build/badge.svg

# Pastebin

Simple, fast, standalone pastebin service.

## Why?

Whenever you need to share a code snippet, diff, logs, or a secret with another human being, the Pastebin service is invaluable. However, using public services such as [pastebin.com](https://pastebin.com), [privnote.com](https://privnote.com), etc. should be avoided when you're sharing data that should be available only for a selected audience (i.e., your company, private network). Instead of trusting external providers, you could host your own Pastebin service and take ownership of all your data!

**There are numerous [Pastebin implementations](https://github.com/awesome-selfhosted/awesome-selfhosted#pastebins) out there, why would you implement another one?**

While the other implementations are great, I couldn't find one that would satisfy my requirements:

* no dependencies - one binary is all I want, no python libs, ruby runtime magic, no javascript or external databases to setup
* storage - fast, lightweight, self-hosted key-value storage able to hold a lot of data.
* speed - it must be fast. Once deployed in a mid-sized company you can expect high(er) traffic with low latency expectations from users
* reliability - no one wants to fix things that should just work (and are that simple!)
* cheap - low-cost service that would not steal too much CPU time, thus adding up to your bill
* CLI + GUI - it must be easy to interface from both ends (but still, no deps!)
* other features:
* on-demand encryption
* syntax highlighting
* destroy after reading
* destroy after expiration date
* on-demand encryption
* syntax highlighting
* destroy after reading
* destroy after expiration date

This Pastebin implementation satisfies all of the above requirements!

## Implementation

This is a rust version of Pastebin service with [rocksdb](https://rocksdb.org/) database as storage. In addition to the previously mentioned features, it's worth mentioning:

* all-in-one binary - all the data, including css/javascript files are compiled into the binary. This way you don't need to worry about external dependencies, it's all within. (see: [std::include_bytes](https://doc.rust-lang.org/std/macro.include_bytes.html))
* [REST endpoint](https://rocket.rs/) - you can add/delete pastes via a standard HTTP client (e.g. curl)
* [RocksDB compaction filter](https://github.com/facebook/rocksdb/wiki/Compaction-Filter) - the expired pastes will be automatically removed by custom compaction filter
Expand All @@ -45,55 +50,68 @@ This is a rust version of Pastebin service with [rocksdb](https://rocksdb.org/)
* Encryption - password-protected pastes are AES encrypted/decrypted in the browser via [CryptoJS](https://code.google.com/archive/p/crypto-js/)
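
The expiry rule behind the compaction-filter bullet above can be sketched as a pure decision function. This is only an illustration, not the project's actual filter: the `Decision` type, the `filter_entry` name, and the convention that an expiry timestamp of `0` means "never expires" are all assumptions made for this sketch, and no RocksDB API is used.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Outcome of inspecting one entry during compaction (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Decision {
    Keep,
    Remove,
}

/// Decide whether an entry should survive compaction.
/// `expires_at` and `now` are seconds since the Unix epoch.
/// Assumption: `expires_at == 0` marks an entry that never expires.
fn filter_entry(expires_at: u64, now: u64) -> Decision {
    if expires_at != 0 && now >= expires_at {
        Decision::Remove
    } else {
        Decision::Keep
    }
}

fn main() {
    let now = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is set before 1970")
        .as_secs();

    // An entry without expiry, one that expired a second ago, one valid for another hour.
    println!("{:?}", filter_entry(0, now));
    println!("{:?}", filter_entry(now - 1, now));
    println!("{:?}", filter_entry(now + 3600, now));
}
```

Keeping the decision free of I/O means the rule can be unit-tested without opening a database.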

### Plugins

The default configuration enables only one plugin, this is syntax highlighting through `prism.js`. This should be enough for p90 of the users but if you need extra features you might want to use the plugin system (`src/plugins`).

To enable additional plugins, pass:
```

```shell
--plugins prism <custom_plugin_name>
```

Currently supported:

* [prism.js](https://prismjs.com/)
* [mermaid.js](https://github.com/mermaid-js/mermaid)


## Usage

Pastebin builds only with the `rust-nightly` toolchain and requires the `llvm` compiler (a rocksdb dependency). To skip the build process, you can use the docker image.

### Cargo
```

```shell
cargo build --release
cargo run
```

### Docker

x86 image:
```

```shell
docker pull mkaczanowski/pastebin:latest
docker run --init --network host mkaczanowski/pastebin --address localhost --port 8000
```

ARM images:
```

```shell
docker pull mkaczanowski/pastebin:armv7
docker pull mkaczanowski/pastebin:armv8
```

Compose setup:
```

```shell
URI="http://localhost" docker-compose up
curl -L "http://localhost"
```

### Client
```

```shell
alias pastebin="curl -w '\n' -q -L --data-binary @- -o - http://localhost:8000/"

echo "hello World" | pastebin
http://localhost:8000/T9kGrI5aNkI4Z-PelmQ5U
```

## Nginx (optional)

The Pastebin service serves `/static` files from memory. To reduce the load on the service, consider setting up nginx with caching and compression enabled, as shown here:
```

```nginx
map $sent_http_content_type $expires {
default off;
text/css 30d;
Expand All @@ -119,17 +137,21 @@ server {
```

## REST API

See [REST API doc](https://github.com/mkaczanowski/pastebin/blob/master/API.md)

## Benchmark
I used [k6.io](https://k6.io/) for benchmarking the read-by-id HTTP endoint. Details:

I used [k6.io](https://k6.io/) for benchmarking the read-by-id HTTP endpoint. Details:

* CPU: Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz (4 CPUs, 8 threads = 16 rocket workers)
* Mem: 24 GiB
* Storage: NVMe SSD Controller SM981/PM981/PM983
* both client (k6) and server (pastebin) running on the same machine

### Setup
```

```shell
$ cargo run --release

$ echo "Hello world" | curl -q -L -d @- -o - http://localhost:8000/
Expand All @@ -147,7 +169,8 @@ $ docker pull loadimpact/k6
```

### Test 1: 5 concurrent clients, duration: 15s
```

```shell
$ docker run --network=host -i loadimpact/k6 run --vus 5 -d 15s - <script.js

data_received..............: 206 MB 14 MB/s
Expand All @@ -167,7 +190,8 @@ vus_max....................: 5 min=5 max=5
```

### Test 2: Every 15s double concurrent clients
```

```shell
docker run --network=host -i loadimpact/k6 run --vus 2 --stage 15s:4,15s:8,15s:16,15s:32 - <script.js

data_received..............: 654 MB 11 MB/s
Expand All @@ -187,11 +211,13 @@ vus_max....................: 32 min=32 max=32
```

### Interpretation

At first glance, the performance is pretty good. In the simplest scenario (5 concurrent clients), we can get up to `1000 rps` with the p95 response time at `6.59 ms` (`14986` total requests made).

As we add more concurrent clients, the rps drops a bit (`794 rps`) but the service still provides good timing (p95 `38.67ms`) with high throughput: `47699` requests made over the 60s run (four 15s stages), about 3x the total of Test 1.

The CPU utilization is at 100% on every available core and the memory usage is stable at `~13 MB RSS`.
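
The rps figures quoted above follow from dividing total requests by the run duration. A small sketch of that arithmetic, assuming Test 2 ran for 60s in total (four stages of 15s, as given by the `--stage` flag); the function name `requests_per_second` is made up for this example:

```rust
/// Average requests per second over a whole test run.
fn requests_per_second(total_requests: u64, duration_secs: u64) -> f64 {
    total_requests as f64 / duration_secs as f64
}

fn main() {
    // Test 1: 14986 requests over a single 15s run.
    let test1 = requests_per_second(14_986, 15);

    // Test 2: 47699 requests; duration assumed to be 4 stages x 15s = 60s.
    let test2 = requests_per_second(47_699, 4 * 15);

    println!("test 1: {:.1} rps", test1);
    println!("test 2: {:.1} rps", test2);
}
```

k6 measures the real elapsed time, so its reported rate can differ slightly from this back-of-the-envelope value.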

## Demo

[![Pastebin service demo](https://i.imgur.com/Fv19H71.png)](https://www.youtube.com/watch?v=BG7f61H7C4I "Pastebin service demo")
39 changes: 31 additions & 8 deletions docker-compose.yml
@@ -1,17 +1,40 @@
version: "3.7"

services:
pastebin:
image: mkaczanowski/pastebin:latest
container_name: pastebin
volumes:
- $DOCKERDIR/pastebin:/var/lib/pastebin
# The following variables are optional; default values are shown. You can pass command line arguments directly instead, but not both.
# environment:
# - PASTEBIN_ADDRESS=localhost # IP address or host to listen on
# - PASTEBIN_PORT=8000 # Port to listen on
# - PASTEBIN_ENVIRONMENT=production # Rocket server environment
# - PASTEBIN_WORKERS=0 # Number of concurrent thread workers
# - PASTEBIN_KEEP_ALIVE=5 # Keep-alive timeout in seconds
# - PASTEBIN_LOG_LEVEL=normal # Max log level
# - PASTEBIN_TTL=0 # Time to live for entries, by default '0' kept forever
# - PASTEBIN_DB_PATH=./pastebin.db # Database file path
# - PASTEBIN_TLS_CERTS_PATH="" # Path to certificate chain in PEM format
# - PASTEBIN_TLS_KEY="" # Path to private key for tls-certs in PEM format
# - PASTEBIN_URI="" # Override default URI
# - PASTEBIN_URI_PREFIX="" # Prefix appended to the URI (ie. '/pastebin')
# - PASTEBIN_SLUG_CHARSET="[A-Za-z0-9_-]" # Character set (expressed as rust compatible regex) to use for generating the URL slug
# - PASTEBIN_SLUG_LENGTH=21 # Length of URL slug
# - PASTEBIN_UI_EXPIRY_TIMES="5 minutes, 10 minutes, 1 hour, 1 day, 1 week, 1 month, 1 year, Never" # List of paste expiry times rendered in the UI dropdown selector
# - PASTEBIN_UI_LINE_NUMBERS=true # Display line numbers
# - PASTEBIN_PLUGINS=prism # Enable additional functionalities (ie. prism, mermaid)
restart: unless-stopped
command: --address 0.0.0.0 --port 8081 --uri ${URI} --db=/var/lib/pastebin/
ports:
- "8081:8081"
volumes:
- ./db:/var/lib/pastebin/
# You can also pass command line arguments directly
command: --address 0.0.0.0 --uri ${URI} --db=/var/lib/pastebin/
# Uncomment below to expose pastebin service directly
# ports:
# - 8000:8000
healthcheck:
test: sh -c 'ls -l /proc/*/exe | grep pastebin'
interval: 5m00s
timeout: 10s
retries: 2
start_period: 30s

nginx:
image: "nginx"
Expand Down Expand Up @@ -46,7 +69,7 @@ services:

expires $$expires;
location / {
proxy_pass http://pastebin:8081;
proxy_pass http://pastebin:8000;

}

Expand Down
47 changes: 40 additions & 7 deletions src/main.rs
Original file line number Diff line number Diff line change
Expand Up @@ -201,97 +201,130 @@ const VERSION: &str = env!("CARGO_PKG_VERSION");
struct PastebinConfig {
#[structopt(
long = "address",
env = "PASTEBIN_ADDRESS",
help = "IP address or host to listen on",
default_value = "localhost"
)]
address: String,

#[structopt(
long = "port",
env = "PASTEBIN_PORT",
help = "Port number to listen on",
default_value = "8000"
)]
port: u16,

#[structopt(
long = "environment",
env = "PASTEBIN_ENVIRONMENT",
help = "Rocket server environment",
default_value = "production"
)]
environment: String,

#[structopt(
long = "workers",
env = "PASTEBIN_WORKERS",
help = "Number of concurrent thread workers",
default_value = "0"
)]
workers: u16,

#[structopt(
long = "keep-alive",
env = "PASTEBIN_KEEP_ALIVE",
help = "Keep-alive timeout in seconds",
default_value = "5"
)]
keep_alive: u32,

#[structopt(long = "log", help = "Max log level", default_value = "normal")]
#[structopt(
long = "log",
env = "PASTEBIN_LOG_LEVEL",
help = "Max log level",
default_value = "normal"
)]
log: rocket::config::LoggingLevel,

#[structopt(
long = "ttl",
help = "Time to live for entries, by default kept forever",
env = "PASTEBIN_TTL",
help = "Time to live for entries, by default '0' kept forever",
default_value = "0"
)]
ttl: u64,

#[structopt(
long = "db",
env = "PASTEBIN_DB_PATH",
help = "Database file path",
default_value = "./pastebin.db"
)]
db_path: String,

#[structopt(long = "tls-certs", help = "Path to certificate chain in PEM format")]
#[structopt(
long = "tls-certs",
env = "PASTEBIN_TLS_CERTS_PATH",
help = "Path to certificate chain in PEM format"
)]
tls_certs: Option<String>,

#[structopt(
long = "tls-key",
env = "PASTEBIN_TLS_KEY",
help = "Path to private key for tls-certs in PEM format"
)]
tls_key: Option<String>,

#[structopt(long = "uri", help = "Override default URI")]
#[structopt(
long = "uri",
env = "PASTEBIN_URI",
help = "Override default URI"
)]
uri: Option<String>,

#[structopt(
long = "uri-prefix",
env = "PASTEBIN_URI_PREFIX",
help = "Prefix appended to the URI (ie. '/pastebin')",
default_value = ""
)]
uri_prefix: String,

#[structopt(
long = "slug-charset",
env = "PASTEBIN_SLUG_CHARSET",
help = "Character set (expressed as rust compatible regex) to use for generating the URL slug",
default_value = "[A-Za-z0-9_-]"
)]
slug_charset: String,

#[structopt(long = "slug-len", help = "Length of URL slug", default_value = "21")]
#[structopt(
long = "slug-len",
env = "PASTEBIN_SLUG_LENGTH",
help = "Length of URL slug",
default_value = "21"
)]
slug_len: usize,

#[structopt(
long = "ui-expiry-times",
help = "List of paste expiry times redered in the UI dropdown selector",
env = "PASTEBIN_UI_EXPIRY_TIMES",
help = "List of paste expiry times rendered in the UI dropdown selector",
default_value = "5 minutes, 10 minutes, 1 hour, 1 day, 1 week, 1 month, 1 year, Never"
)]
ui_expiry_times: Vec<String>,

#[structopt(long = "ui-line-numbers", help = "Display line numbers")]
#[structopt(
long = "ui-line-numbers",
env = "PASTEBIN_UI_LINE_NUMBERS",
help = "Display line numbers"
)]
ui_line_numbers: bool,

#[structopt(
long = "plugins",
env = "PASTEBIN_PLUGINS",
help = "Enable additional functionalities (ie. prism, mermaid)",
default_value = "prism"
)]
Expand Down
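
The `env = "..."` attributes added in this diff give each option a fallback chain. As far as I understand structopt, a value given on the command line wins, then the environment variable, then `default_value`. A std-only sketch of that lookup order, where `resolve` is a hypothetical helper written for illustration and not part of structopt or this repository:

```rust
use std::env;

/// Pick the effective value for one option.
/// Order assumed here: command-line value, then environment value, then the default.
fn resolve(cli: Option<String>, env_value: Option<String>, default: &str) -> String {
    cli.or(env_value).unwrap_or_else(|| default.to_string())
}

fn main() {
    // No command-line flag in this sketch: fall back to PASTEBIN_PORT, then to "8000".
    let port = resolve(None, env::var("PASTEBIN_PORT").ok(), "8000");
    println!("port = {}", port);
}
```

Passing the environment value in as an argument, rather than reading it inside `resolve`, keeps the function deterministic and easy to test.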
2 changes: 1 addition & 1 deletion static/index.html
Expand Up @@ -8,7 +8,7 @@

<title>Pastebin</title>

<link rel="icon" href="/static/favicon.ico">
<link rel="icon" href="{{uri_prefix}}/static/favicon.ico">

{{#each css_imports as |url|}}
<link href="{{format_url ../uri_prefix url}}" rel="stylesheet" />
Expand Down