
This is Part 2 of an article about creating Data Structures in Rust. We’ll look at rules 6 to 9:
-
- Define operators and fast operations.
-
- Follow the "Nine Rules of Good API Design" especially "write good documentation".
-
- Optimize performance using representative data, Criterion Benchmarking, and profiling.
-
- Test coverage, docs, traits, compiler errors, and correctness.
See Part 1 for rules 1 to 5.
- Plagiarize your API, documentation, and even code – from the standard library.
- Design constructors for ease-of-use, compatibility, and speed.
- Create more Rust iterators than you expect.
- Make illegal values unpresentable with traits.
- Define generic iterators with guaranteed properties and useful methods.
These rules are informed by my experience creating [range-set-blaze](https://crates.io/crates/range-set-blaze)
, a new range-set crate. A range set is a data structure that represents sets of, for example, integers as sorted & disjoint ranges. For example, 0..=5, 10..=10
. Compared to other range-set crates, range-set-blaze
offers a full collection of set operations and is performant.
Let’s start by looking at how the crate offers set operations.
Rule 6: Define Operators and Fast Operations
The standard BTreeSet
offers set operators, for example, if a
and b
are each a BTreeSet
, then a | b
will be a BTreeSet
of their union. Decide if operators make sense for your data structure. Operators can be logical such as |
,&
,!
and/or arithmetic such as +
,-
,*
,/
,^
. This table lists Rust’s "overloadable" operators and their method names.
The range-set-blaze
crate offers set-related operators on both the RangeSetBlaze
struct:
use range_set_blaze::RangeSetBlaze;
let a = RangeSetBlaze::from_iter([1..=4]);
let b = RangeSetBlaze::from_iter([0..=0, 3..=5, 10..=10]);
let union = a | b;
assert_eq!(union, RangeSetBlaze::from_iter([0..=5, 10..=10]));
and any iterators implementing the SortedDisjoint
trait (see Rule 5 in Part 1):
use range_set_blaze::prelude::*;
let a = CheckSortedDisjoint::new(vec![1..=2, 5..=100].into_iter());
let b = CheckSortedDisjoint::from([2..=6]);
let union = a | b;
assert_eq!(union.to_string(), "1..=100");
BitOr
is Rust’s method name for the |
operator (again, see table). Here is the definition of the union operator on RangeSetBlaze
:
impl<T: Integer> BitOr<RangeSetBlaze<T>> for RangeSetBlaze<T> {
type Output = RangeSetBlaze<T>;
fn bitor(mut self, other: Self) -> RangeSetBlaze<T> {
// code omitted for now
}
}
This says that for all RangeSetBlaze
, we define a |
operator that takes a RangeSetBlaze
as a second input and that returns a RangeSetBlaze
.
Subrule #1: Support Borrowed Inputs: The definition above lets users do a | b
, but not &a | &b
, &a | b
, or a | &b
. Those "borrow combinations" require three additional definitions of this form:
impl<T: Integer> BitOr<&RangeSetBlaze<T>> for &RangeSetBlaze<T> {...}
impl<T: Integer> BitOr<RangeSetBlaze<T>> for &RangeSetBlaze<T> {...}
impl<T: Integer> BitOr<&RangeSetBlaze<T>> for RangeSetBlaze<T> {...}
Alternatively, you can use the [gen_ops](https://crates.io/crates/gen_ops)
crate. It lets you define all borrow combinations across multiple operators in one statement.
Subrule #2: Offer Operators on Traits (Kind of): Set-related methods, such as complement
and union
, can be defined efficiently on our generic SortedDisjoint
iterators. The methods run in one pass and require minimal memory. (In Rule 5 in Part 1, we saw the definition of complement
.) Instead of a.complement()
, however, we’d like to say !a
. Can we? No and yes.
No, you can’t implement trait ops::Not
(the !
operator) on SortedDisjoint
– see Internet discussion. (You could but should not make ops::Not
a supertype of SortedDisjoint
, because that would make it impossible to implement SortedDisjoint
on outside types, such as [Itertools::Tee](https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.tee)
.)
You can and should, however, define the operators on as many of your own types as you can. **** For example, here is the implementation of ops::Not
for CheckSortedDisjoint
:
impl<T: Integer, I> ops::Not for CheckSortedDisjoint<T, I>
where
I: Iterator<Item = RangeInclusive<T>>,
{
type Output = NotIter<T, Self>;
fn not(self) -> Self::Output {
self.complement()
}
}
which lets us use !
in this function:
fn print_first_complement_gap() {
let a = CheckSortedDisjoint::from([-10i16..=0, 1000..=2000]);
println!("{:?}", (!a).next().unwrap()); // prints -32768..=-11
}
Aside: This example function runs in constant time and memory.
Subrule #3: Exploit Ownership: I originally did not support the |=
(BitOrAssign
) operator. I figured that if users wanted to union two RangeSetBlaze
objects and put the results in the first, they could just write a = a | b
.
Later, however, I noticed a potential speed-up. Specifically, when the number of ranges in b
is very small relative to a
, we should add the ranges of b
to a
, one at a time, rather than merging together their two ranges
iterators. So, I implemented the BitOrAssign
. (I have no such speed-up for intersection, so I haven’t implemented BitAndAssign
.)
But wait there is more! When a user calls a | &b
, the operator owns the first input. This means the code can modify the first input in place (if it so desires) and return it as the result. Here is the definition of that operator:
impl<T: Integer> BitOr<&RangeSetBlaze<T>> for RangeSetBlaze<T> {
type Output = RangeSetBlaze<T>;
fn bitor(mut self, other: &Self) -> RangeSetBlaze<T> {
self |= other;
self
}
}
I ended up creating special, optimized implementations for a | &b
and a | b,
and &a | b
. Rust even allowed me to optimize &mut a |= b
both when b
is smaller (as before) and when a
is smaller (we put a
into the owned b
, one range at a time, and then assign a
to b
.
Subrule #4: Offer fast multiway operations: We can define the intersection (operator &
) of two RangeSetBlaze
objects in terms the intersection of their ranges:
(a.ranges() & b.ranges()).into_range_set_blaze()
And, thanks to the miracle of Boolean algebra, we can define the intersection of two RangesIter
in terms of complement and union:
!(self.complement() | other.into_iter().complement())
This gives us a constant-memory one-pass intersection for all SortedDisjoint
iterators. (The same trick works for set difference. Symmetric difference requires two passes, but still only constant memory.)
Aside:
BTreeSet
doesn’t offercomplement
. Why? Because its complement of, for example,[0, 3, 4, 5, 10]
would require 4 billion entries. Range-based representations, however, store just 4 entries:-2147483648..=-1, 1..=2, 6..=9, 11..=2147483647
.
What about the intersection of more than two RangeSetBlaze
objects? We could do them two at a time but that would create unnecessary intermediate objects. Instead, we can offer multiway intersection, so users can do this:
use range_set_blaze::prelude::*;
let a = RangeSetBlaze::from_iter([1..=6, 8..=9, 11..=15]);
let b = RangeSetBlaze::from_iter([5..=13, 18..=29]);
let c = RangeSetBlaze::from_iter([-100..=100]);
let intersection = [a, b, c].intersection();
assert_eq!(intersection, RangeSetBlaze::from_iter([5..=6, 8..=9, 11..=13]));
Multiway intersection is defined as a method on an iterator of RangeSetBlaze
objects:
pub trait MultiwayRangeSetBlaze<'a, T: Integer + 'a>:
IntoIterator<Item = &'a RangeSetBlaze<T>> + Sized
{
fn intersection(self) -> RangeSetBlaze<T> {
self.into_iter()
.map(RangeSetBlaze::ranges)
.intersection()
.into_range_set_blaze()
}
}
Which calls a new multiway SortedDisjoint
intersection
and union
:
pub trait MultiwaySortedDisjoint<T: Integer, I>: IntoIterator<Item = I> + Sized
where
I: SortedDisjoint<T>,
{
fn intersection(self) -> BitAndKMerge<T, I> {
self.into_iter()
.map(|seq| seq.into_iter().complement())
.union()
.complement()
}
fn union(self) -> BitOrKMerge<T, I> {
UnionIter::new(KMerge::new(self))
}
}
Putting this all together, we get the ability to intersect any number of RangeSetBlaze
objects in one-pass and with constant memory (plus the memory for the final RangeSetBlaze
result).
Subrule #5: If needed, use Box<dyn
…>
types to offer multiway operations across different types: Multiway operators on SortedDisjoint
can have a problem. You can union inputs of the same type, here RangesIter<T>
:
let _i0 = [a.ranges(), b.ranges(), c.ranges()].intersection();
But not inputs of mixed type, here NotIter<T, RangesIter<T>>
and RangesIter<T>:
// doesn't compile: let _i1 = [!a.ranges(), b.ranges(), c.ranges()].intersection();
The fix is to wrap all the inputs in a new common type (details on GitHub):
pub struct DynSortedDisjoint<'a, T: Integer> {
iter: Box<dyn SortedDisjoint<T> + 'a>,
}
You can use the new type either explicitly or via a macro.
let _i2 = [
DynSortedDisjoint::new(!a.ranges()),
DynSortedDisjoint::new(b.ranges()),
DynSortedDisjoint::new(c.ranges()),
]
.intersection();
// or
let _i3 = intersection_dyn!(!a.ranges(), b.ranges(), c.ranges());
Rule 7: Follow the Nine Rules of Good API Design especially "Write Good Documentation"
All the rules from the article Nine Rules for Elegant Rust Library APIs apply. Here is a discussion of the most interesting six, starting with:
Write good documentation to keep your design honest. Create examples that don’t embarrass you.
You should put #![warn(missing_docs)]
in your lib.rs
. This will remind you to write documentation for every public type, trait, and method. Include an example in (almost) every bit of documentation.
In one case, I got tired of writing warnings to users that some iterators should contain only sorted & disjoint ranges. This led me to Rule 5 and having the compiler enforce the needed guarantee.
In another case, I found myself explaining in multiple places how to call union. Specifically, I told users to use the extend
method with short second inputs and the | (union)
operator, otherwise. I found this embarrassing to explain. This led me to offering a |=
operator which always does the fastest thing.
Use Clippy
This rule is important. I now take it for granted. So, use Clippy.
Accept all types of types.
I mostly tried to follow this rule, but with one big exception. Although Rust offers many range types, I only accept ranges of the form start..=end
. This not only simplifies the code, but I think it also simplifies the documentation and examples. (Given user requests, I’d revisit this.)
Define and return nice errors.
Surprisingly, this rule doesn’t come up much in many data structures. For example, the standard BTreeSet
doesn’t return any results that might be errors. Likewise, RangeSetBlaze
has a few places where it might panic – for example, if CheckSortedDisjoint
finds a problem – but it never returns error results.
Know your users’ needs, ideally by eating your own dogfood.
I feel bad that I didn’t follow this rule. I created RangeSetBlaze
for fun. None of my current projects need it. I hope other folks will use it in their projects and give me feedback on what they need from it.
Rule 8: Optimize Performance using Representative Data, Criterion Benchmarking, and Profiling
Aside: The latest version of the benchmark report shows the most recent results, including comparisons with the popular Roaring Compressed Bitmaps.
Subrule #1: Use Representative Data: On some data, RangeSetBlaze
will be worse than the standard HashSet
and BTreeSet
. For example, its construction from random integers (uniform and with replacement) is about 2.5 times slower than HashSet
‘s. Here is a plot of construction time as a function of the number of random integers:

But what if the integers are ‘clumpy’? For example, 12, 13, 14, 998, 999, 1000, 1001, -333, -332, etc. To see, we’ll construct 1 million clumpy integers. We’ll vary the average clump size from 1 (no clumps) to 100K (ten big clumps). (For more details, see Benchmarks for (some) Range-Related Rust Crates.) With this data, as clump sizes reach 100, RangeSetBlaze
is 30 times faster than HashTable
and 15 times faster than BTreeSet
. And if we’re allowed to provide the clumps as ranges (instead of as individual integers), RangeSetBlaze
is 700 times faster than the standard data types when the clump size is 1000.

The advice here is to test on data (either real or synthetic) that you believe to be interesting. For this project, I created "random clumpy integer iterators" with control over:
- the span of the integers (for example, all integers will be in
0..=9_999_999
) - number of integers (for example, 1 million)
- the average clump size (for example, 100)
- number of iterators (for example, 1)
- coverage, this is the fraction of the span expected to be covered by the iterators (unioned or intersected, if more than one) – (for example, 10% coverage).
Subrule #2: Use the Criterion crate to benchmark programs in Rust: Here are some tips:
- To get good HTML reports and excellent plots, put
criterion = { version = "
Current version", features = ["html_reports"] }
in the[dev-dependencies]
section of yourCargo.toml
. (This instruction is currently missing from the Criterion documentation.) - Put the following in your
Cargo.toml
. It declares a benchmark called "bench" with a 3rd party benchmarking harness, namely, Criterion.
[[bench]]
name = "bench"
harness = false
- At the top level of your project create folder
benches
and filebench.rs
. Make the last line of the filecriterion_main!(benches);
- Create benchmarks as described in the Criterion users guide.
- Don’t fight Criterion. It may complain that your benchmarks are too slow. If you can, listen to it and make them faster. This lets it generate better statistics.
Aside: You can run Criterion experiments under a debugger and as integration tests. See
[fn debug_k_play](https://github.com/CarlKCarlK/range-set-blaze/blob/main/tests/integration_test.rs#L634)
in[tests/integration.rs](https://github.com/CarlKCarlK/range-set-blaze/blob/main/tests/integration_test.rs)
for an example. The key is to put common code into its own project (for example,[tests_common](https://github.com/CarlKCarlK/range-set-blaze/tree/main/tests_common)
) and then create a workspace in your main[Cargo.toml](https://github.com/CarlKCarlK/range-set-blaze/blob/main/Cargo.toml#LL16C1-L18C32)
.
Subrule #3: Use benchmarks to drive design decisions: Eight years ago, when I created the Python version of this data structure, I stored the sorted ranges in a vector. For the new version, I wondered if using Rust’s standard BTreeMap
would be faster. To see, let’s compare two BTreeMap
-based crates – [rangemap](https://crates.io/crates/rangemap)
and RangeSetBlaze -
to two SmallVec
-based crates – [Range-collections](https://crates.io/crates/range-collections)
and [range-set](https://crates.io/crates/range-set)
:

(Please see the benchmark report for details.) In summary, the fastest vector-based method is 14 times slower than the slowest tree-based method. It is also 50 times slower than RangeSetBlaze
. We should not be surprised because vector-based methods are not designed for the large numbers of inserts required by our data of interest.
Here is another example. When we intersect (or union) many RangeSet
objects together, is it OK to do it two-at-a-time or is multiway faster? The benchmark gives the answer:

On two sets, all methods are similar but as the number of sets increases two-at-a-time falls behind. At 100 sets, two-at-a-time must create about 100 intermediate sets and is about 14 times slower than multiway. Dynamic multiway is not used by RangeSetBlaze
but is sometimes needed by SortedDisjoint
iterators. It is 5% to 10% slower than static multiway. (Details).
Subrule #4: Profile your code to find bottlenecks: The most popular profiler for Rust is [flamegraph](https://github.com/flamegraph-rs/flamegraph)
. I found it easy to use but I missed being able to zoom in and manipulate the results interactively. It turns out; however, that you can use almost any profiler with Rust.
For example, I have a paid subscription to Visual Studio 2022. I previously used its full-featured profiler on C++ and C# code. To use it on Rust, I just add this (temporarily) to my Cargo.toml
file and recompile in release mode.
[profile.release]
debug = true
Other full-featured profilers that I haven’t used but look like they would work with Rust include AMD μProf and, on Windows, Superluminal.
Rule 9: Test Coverage, Docs, Traits, Compiler Errors, and Correctness
Data structures typically support many methods and are used by other projects. They, thus, require extensive testing.
Subrule #1: Combine integration and coverage tests: Integration tests see only the public interface of your data structure. I put mine in tests/integration_test.rs
. Each test function is prefaced with #[test]
.
Coverage tests ensure that every line of your code is run at least once by your tests. Use the powerful and easy [llvm-cov](https://github.com/taiki-e/cargo-llvm-cov#)
to measure coverage. Installation instructions are here. Run it with the command: cargo llvm-cov --open.
To get your coverage to 100%, you’ll need to add tests. I suggest that, to the degree possible, make these coverage tests part of your integration tests. This will let you experience the usability of your public interface.
To create coverage (and other) tests across the types you support, use the wonderful syntactic-for crate:
#[test]
fn lib_coverage_6() {
syntactic_for! { ty in [i8, u8, isize, usize, i16, u16, i32, u32, i64, u64, isize, usize, i128, u128] {
$(
let mut a = RangeSetBlaze::<$ty>::from_iter([1..=3, 5..=7, 9..=120]);
a.ranges_insert(2..=100);
assert_eq!(a, RangeSetBlaze::from_iter([1..=120]));
)*
}};
}
Subrule #2: Put many examples in your documentation and run them as tests: Here is an example from the documentation for SortedDisjoint::equals
:
/// Given two [`SortedDisjoint`] iterators, efficiently tells if they
/// are equal. Unlike most equality testing in Rust,
/// this method takes ownership of the iterators and consumes them.
///
/// # Examples
///
/// ```
/// use range_set_blaze::prelude::*;
///
/// let a = CheckSortedDisjoint::from([1..=2]);
/// let b = RangeSetBlaze::from_iter([1..=2]).into_ranges();
/// assert!(a.equal(b));
/// ```
This runs with cargo test
or cargo test --doc
. (It doesn’t, however, run when measuring coverage.)
Subrule #3: Test all structs and enums for missing traits: I want RangeSetBlaze::Iter
to (generally) support the same traits as the standard BTreeSet::Iter
. Also, all types should, if practical, auto implement Sized
, Send
, Sync
, and Unpin
. (See this Let’s Get Rusty video). Here is how to test traits:
fn is_sssu<T: Sized + Send + Sync + Unpin>() {}
fn is_like_btreeset_iter<T: Clone + std::fmt::Debug + FusedIterator + Iterator>() {}
// removed DoubleEndedIterator +ExactSizeIterator for now
#[test]
fn iter_traits() {
type ARangesIter<'a> = RangesIter<'a, i32>;
type AIter<'a> = Iter<i32, ARangesIter<'a>>;
is_sssu::<AIter>();
is_like_btreeset_iter::<AIter>();
}
When I first did this, I found that I was missing Debug
, FusedIterator
, DoubleEndedIterator
, and ExactSizeIterator
. I added the first two and decided not to add the second two for now.
Similarly, a test that compared the traits of RangeSetBlaze
to those of the standard BTreeSet
reminded me to implement Clone
, PartialOrd
, Hash
, Eq
, Ord
, and IntoIterator
.
Subrule #4: Test that Illegal Values really are Unrepresentable: You want your data structure to raise a compiler error if a user applies it to the wrong types (for example, tries to create a set of characters rather than integers). Likewise, some of your methods should raise a compiler error if the user gives them a regular iterator rather than an iterator like SortedDisjoint
. How do we test this?
Use [trybuild](https://docs.rs/trybuild/latest/trybuild/)
to create "ui" tests. First, create an integration test like this:
#[test]
fn ui() {
let t = trybuild::TestCases::new();
t.compile_fail("tests/ui/*.rs");
}
Then in tests/ui
, create files such as untrusted_pairs.rs
that test the compiler error messages of interest.
use range_set_blaze::RangeSetBlaze;
fn main() {
let guaranteed = RangeSetBlaze::from_iter([1..=2, 3..=4, 5..=6]).into_ranges();
let _range_set_int = RangeSetBlaze::from_sorted_disjoint(guaranteed); // yep
let not_guaranteed = [5..=6, 1..=3, 3..=4].into_iter();
let _range_set_int = RangeSetBlaze::from_sorted_disjoint(not_guaranteed); // nope
}
Set the environment variable TRYBUILD=overwrite
to record the expected compiler error message. See [trybuild](https://docs.rs/trybuild/latest/trybuild/)
for details. If you have touble getting the same results locally as on CI (for example, GitHub Action), see this thread.
Subrule #5: Test correctness with automatic QuickCheck tests: As recommended by Rüdiger Klaehn, I implemented automatic QuickCheck
tests. Here is a test that checks that RangeSetBlaze::is_disjoint
gets the same answer as BTreeSet::is_disjoint
over values that QuickCheck
generates.
type Element = i64;
type Reference = std::collections::BTreeSet<Element>;
#[quickcheck]
fn disjoint(a: Reference, b: Reference) -> bool {
let a_r = RangeSetBlaze::from_iter(&a);
let b_r = RangeSetBlaze::from_iter(&b);
a.is_disjoint(&b) == a_r.is_disjoint(&b_r)
}
So, there you have it, nine rules for creating a data structure in Rust. Creating a data structure in Rust took more time than I expected. Half of this was my fault for creating a full SortedDisjoint
sub-library. Half was Rust’s fault for, for example, requiring me to define 13 public iterator structs. The payoff – in terms of Rust’s speed and safety – however, makes the time worthwhile. Follow these nine rules to create your own powerful Rust data structures.
Aside: If you’re interested in future articles, please follow me on Medium. I write on scientific Programming in Rust and Python, machine learning, and statistics. I tend to write about one article per month.