Home

librsync reimplementation in Rust

Observations

The librsync code is not bad, but has some performance problems and may have safety problems.
I wouldn't choose to write code that handles untrusted input in C, in 2016.
I would write a lot more tests.
I like Rust.
It's probably possible to support the existing librsync C API on top of a Rust implementation.
There is already a Rust binding to the C API: https://github.com/mbrt/librsync-rs
Advantages of Rust?
- Rust safety guarantees seem like a good fit for the buffer management librsync needs to do
- Potentially can run multiple threads for IO and hashing
- Built-in dictionary/tree structures might perform better than the fairly naive ones in librsync
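
As a rough illustration of the last point, block signatures could be indexed with the standard library's `HashMap`. This is only a sketch: `weak_sum`, `build_index` and `find_block` are hypothetical names, the checksum is a simple stand-in rather than librsync's actual rolling checksum, and a direct byte comparison stands in for the strong hash check.

```rust
use std::collections::HashMap;

/// Illustrative weak checksum (NOT librsync's real one): two running sums
/// packed into a u32.
fn weak_sum(block: &[u8]) -> u32 {
    let mut a: u32 = 0;
    let mut b: u32 = 0;
    for &byte in block {
        a = a.wrapping_add(byte as u32);
        b = b.wrapping_add(a);
    }
    (b << 16) | (a & 0xffff)
}

/// Map each weak sum to the indices of the basis blocks that produce it.
/// `block_len` must be non-zero, because `chunks` panics on a zero size.
fn build_index(basis: &[u8], block_len: usize) -> HashMap<u32, Vec<usize>> {
    let mut index: HashMap<u32, Vec<usize>> = HashMap::new();
    for (i, block) in basis.chunks(block_len).enumerate() {
        index.entry(weak_sum(block)).or_default().push(i);
    }
    index
}

/// Look up a candidate block: the weak sum narrows the search, then a byte
/// comparison (standing in for a strong hash) confirms the match.
fn find_block(
    index: &HashMap<u32, Vec<usize>>,
    basis: &[u8],
    block_len: usize,
    candidate: &[u8],
) -> Option<usize> {
    index
        .get(&weak_sum(candidate))?
        .iter()
        .copied()
        .find(|&i| {
            let start = i * block_len;
            let end = (start + block_len).min(basis.len());
            &basis[start..end] == candidate
        })
}
```

Whether this actually beats the existing lookup would need measuring; the sketch only shows that the data structure comes for free.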

A plan

Rename this to librdiff, to avoid perpetuating the confusion that it implements the format or all the capabilities of the rsync tool.
Make a new implementation in pure Rust that supports the same formats.
Four layers:
- librdiff-impl-rs: pure Rust, non-blocking state machine, on buffers provided by caller
  - Build just with Cargo.
  - Unit tests.
- rdiff-rs: a pure Rust binary based on librdiff-impl-rs.
- librdiff-capi-rs: support the same librsync C API on top
  - Build with Cargo(?) but includes a C header?
  - Includes tests, in C, for the wrapper and implementation.
- librsync-crosstest: check interoperability
  - They produce exactly the same output (although the deltas could potentially be better).
  - Can consume each other's output.
Implement the core librsync
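
To make "non-blocking state machine, on buffers provided by caller" more concrete, here is a minimal sketch of what that shape might look like. Everything in it is an assumption for illustration: `CopyJob`, `Status`, `Progress` and the `DEMO` header are invented names, and the job only emits a fixed header and then copies input through, rather than producing a real signature or delta. The point is the calling convention: the job never does IO and never allocates buffers, it just reports how much it consumed and produced and whether it is blocked or done.

```rust
/// Result of one step: either the job wants more input or output space,
/// or it has finished.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Blocked,
    Done,
}

/// How far a single call to `step` got.
#[derive(Debug)]
struct Progress {
    consumed: usize,
    produced: usize,
    status: Status,
}

#[derive(Debug, Clone, Copy)]
enum State {
    Header { sent: usize },
    Body,
    Finished,
}

/// Hypothetical stream header, purely for the example.
const MAGIC: [u8; 4] = *b"DEMO";

struct CopyJob {
    state: State,
}

impl CopyJob {
    fn new() -> Self {
        CopyJob { state: State::Header { sent: 0 } }
    }

    /// Advance as far as possible using only the caller's buffers.
    /// `eof` tells the job that no input will follow `input`.
    fn step(&mut self, input: &[u8], eof: bool, output: &mut [u8]) -> Progress {
        let mut consumed = 0;
        let mut produced = 0;
        loop {
            match self.state {
                State::Header { sent } => {
                    let n = (MAGIC.len() - sent).min(output.len() - produced);
                    output[produced..produced + n].copy_from_slice(&MAGIC[sent..sent + n]);
                    produced += n;
                    if sent + n == MAGIC.len() {
                        self.state = State::Body;
                    } else {
                        // Output is full part-way through the header.
                        self.state = State::Header { sent: sent + n };
                        return Progress { consumed, produced, status: Status::Blocked };
                    }
                }
                State::Body => {
                    let n = (input.len() - consumed).min(output.len() - produced);
                    output[produced..produced + n]
                        .copy_from_slice(&input[consumed..consumed + n]);
                    consumed += n;
                    produced += n;
                    if consumed == input.len() && eof {
                        self.state = State::Finished;
                    } else {
                        // Either input is exhausted or output is full.
                        return Progress { consumed, produced, status: Status::Blocked };
                    }
                }
                State::Finished => {
                    return Progress { consumed, produced, status: Status::Done };
                }
            }
        }
    }
}
```

A caller would loop on `step`, refilling the input and draining the output according to `consumed` and `produced`, until it sees `Status::Done`. Blocking or threaded wrappers could then be layered on top without changing the core.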

Things to improve:

Be clearer about the push vs pull APIs?

Questions

Do I care about supporting the full existing C API, or perhaps just making it possible to call from C?

This might be a good time to drop things from the API that are:

- Exposing too many internals
- Hard to implement safely/efficiently in Rust
- Redundant with other parts of the API, eg having too many different usage modes

However, where none of these apply, stay compatible.

Should the API use the exact same names (rs_foo) or perhaps a different prefix allowing them to both be linked in to the same test?

How to build the cross-tests? Maybe just at the rdiff layer? We essentially care that the black boxes are compatible: the library should be tested separately.

C API changes

A cleaned-up API:

Only expose struct implementations/sizes where the caller is really expected to look in them, eg stats.
Don't expose any statics, even readonly. eg, have a function to get the library version.
Don't take FILE*?
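
A sketch of how the first two points might look when the C API is exported from Rust. All names here are hypothetical: the `rdiff_` prefix, `RdiffJob`, `RdiffStats` and the version string are placeholders, not a settled API. The job type stays opaque (a C header would only forward-declare it), the stats struct is `#[repr(C)]` because callers are expected to read its fields, and the version comes from a function rather than an exported static.

```rust
use std::os::raw::c_char;

// Placeholder version string, NUL-terminated so C callers can use it directly.
static VERSION: &[u8] = b"0.1.0\0";

/// Opaque to C: callers only ever hold a pointer to this.
pub struct RdiffJob {
    bytes_in: u64,
}

/// Deliberately exposed to C, so its layout is fixed with repr(C).
#[repr(C)]
pub struct RdiffStats {
    pub bytes_in: u64,
}

// Note: the 2024 edition spells this attribute `#[unsafe(no_mangle)]`.
#[no_mangle]
pub extern "C" fn rdiff_version() -> *const c_char {
    VERSION.as_ptr() as *const c_char
}

#[no_mangle]
pub extern "C" fn rdiff_job_new() -> *mut RdiffJob {
    Box::into_raw(Box::new(RdiffJob { bytes_in: 0 }))
}

/// Safety: `job` must be null or a pointer returned by `rdiff_job_new`
/// that has not already been freed.
#[no_mangle]
pub unsafe extern "C" fn rdiff_job_free(job: *mut RdiffJob) {
    if !job.is_null() {
        drop(unsafe { Box::from_raw(job) });
    }
}

/// Copy the job's statistics into `out`. Returns 0 on success and -1 if
/// either pointer is null.
///
/// Safety: non-null pointers must be valid for reads (`job`) and writes (`out`).
#[no_mangle]
pub unsafe extern "C" fn rdiff_job_stats(job: *const RdiffJob, out: *mut RdiffStats) -> i32 {
    if job.is_null() || out.is_null() {
        return -1;
    }
    unsafe {
        (*out).bytes_in = (*job).bytes_in;
    }
    0
}
```

Using a distinct prefix such as `rdiff_` would also let this be linked alongside the original `rs_` functions in a single test binary, which bears on the naming question above.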
