Zero-cost abstractions
How Rust gives safety without runtime cost
What is Zero-cost?
The zero-cost abstraction principle was articulated by Bjarne Stroustrup, the creator of C++:
"What you don't use, you don't pay for. And further: what you do use, you couldn't hand code any better."
Rust embraces this principle throughout its design. High-level abstractions -- iterators, generics, traits, pattern matching, the ownership system itself -- compile down to machine code that is as efficient as if you had written the low-level code by hand.
The Zero-cost Spectrum
────────────────────────────────────────────
Language    Abstractions    Runtime cost
────────────────────────────────────────────
Assembly    None            None
C           Minimal         None
C++         Rich            Zero-cost*
Rust        Rich + Safe     Zero-cost
Java/C#     Rich + Safe     GC + JIT
Python      Very rich       Interpreter
────────────────────────────────────────────
* C++ zero-cost is not guaranteed; virtual
dispatch, exceptions, and RTTI add overhead.
Rust achieves BOTH high-level expressiveness
AND low-level performance, with the added
guarantee of memory safety.

This is not marketing. It is a measurable, verifiable property. You can inspect the generated assembly and confirm that Rust's abstractions add no overhead compared to the equivalent hand-written code.
Iterators
Rust iterators are a chain of lazy transformations. They look high-level and functional, but the compiler fuses the entire chain into a single loop with no intermediate allocations and no function call overhead.
// Functional style: filter, map, fold
fn sum_of_squares_of_odds(numbers: &[i32]) -> i32 {
    numbers.iter()
        .filter(|&&n| n % 2 != 0) // keep odd numbers
        .map(|&n| n * n)          // square each
        .sum()                    // add them up
}

// This compiles to the SAME assembly as:
fn sum_of_squares_of_odds_manual(numbers: &[i32]) -> i32 {
    let mut total = 0;
    let mut i = 0;
    while i < numbers.len() {
        let n = numbers[i];
        if n % 2 != 0 {
            total += n * n;
        }
        i += 1;
    }
    total
}

The compiler applies an optimization called iterator fusion (or loop fusion). The chain .iter().filter().map().sum() does not make three separate passes over the data; it compiles into a single pass with every operation inlined.
Iterator Fusion: What the Compiler Does
──────────────────────────────────────────
Source code (chained iterators):
numbers.iter()
.filter(|&&n| n % 2 != 0)
.map(|&n| n * n)
.sum()
After compilation (single fused loop):
xor eax, eax ; total = 0
.loop:
cmp rcx, rdx ; while i < len
jge .done
mov esi, [rdi+rcx*4] ; n = numbers[i]
test esi, 1 ; if n % 2 != 0
jz .skip
imul esi, esi ; n * n
add eax, esi ; total += n*n
.skip:
inc rcx ; i++
jmp .loop
.done:
ret ; return total
No heap allocation. No function calls.
One tight loop, identical to hand-written C.

In contrast, the equivalent Python code sum(n*n for n in numbers if n % 2 != 0) creates a generator object, calls __next__ repeatedly, and boxes every integer. Rust gives you the same expressiveness with none of the cost.
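Short of reading the assembly, you can at least check the two versions agree on real data. A minimal sketch, repeating both functions from above so the snippet compiles on its own:

```rust
// Functional version, repeated from the text.
fn sum_of_squares_of_odds(numbers: &[i32]) -> i32 {
    numbers.iter()
        .filter(|&&n| n % 2 != 0)
        .map(|&n| n * n)
        .sum()
}

// Manual version, repeated from the text.
fn sum_of_squares_of_odds_manual(numbers: &[i32]) -> i32 {
    let mut total = 0;
    let mut i = 0;
    while i < numbers.len() {
        let n = numbers[i];
        if n % 2 != 0 {
            total += n * n;
        }
        i += 1;
    }
    total
}

fn main() {
    let data: Vec<i32> = (1..=10).collect();
    // odds are 1, 3, 5, 7, 9; squares sum to 1 + 9 + 25 + 49 + 81 = 165
    assert_eq!(sum_of_squares_of_odds(&data), 165);
    assert_eq!(sum_of_squares_of_odds(&data),
               sum_of_squares_of_odds_manual(&data));
    println!("both versions agree: {}", sum_of_squares_of_odds(&data));
}
```

To compare the generated code itself, pasting both functions into the Compiler Explorer (godbolt.org) with optimizations enabled shows them compiling to the same loop.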
Generics and Monomorphization
Generics let you write code that works with any type. In Java, generics use type erasure: at runtime, a List<Integer> and a List<String> are the same class, and elements are accessed through object pointers with runtime casts.
Rust uses monomorphization: the compiler generates a separate, specialized version of the function for each concrete type it is used with. No indirection, no boxing, no type casts at runtime.
// Generic function: works with any type that supports addition
fn add<T: std::ops::Add<Output = T>>(a: T, b: T) -> T {
    a + b
}

fn main() {
    let int_sum = add(5_i32, 10_i32);      // calls add::<i32>
    let float_sum = add(1.5_f64, 2.5_f64); // calls add::<f64>
    println!("{}, {}", int_sum, float_sum);
}

Monomorphization: What the Compiler Generates
───────────────────────────────────────────────
Source: fn add<T: Add>(a: T, b: T) -> T
The compiler generates specialized versions:
┌────────────────────────────────────────┐
│ fn add_i32(a: i32, b: i32) -> i32 { │
│ a + b // compiled to: add eax,edx│
│ } │
├────────────────────────────────────────┤
│ fn add_f64(a: f64, b: f64) -> f64 { │
│ a + b // compiled to: addsd xmm0 │
│ } │
└────────────────────────────────────────┘
Each version uses the optimal CPU instruction
for its type. No indirection, no vtable, no
boxing. The generic code is as fast as if you
had written two separate functions.
Tradeoff: binary size increases (more code),
but execution speed is identical to
hand-written code.

This is fundamentally different from C++ templates in one important way: Rust generics are type-checked at the definition site using trait bounds, not at each call site. You get a clear error message pointing at the violated constraint, instead of C++'s notoriously cryptic template errors.
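The binary-size tradeoff noted above has a common mitigation, used inside the standard library itself (std::fs::read delegates to a private non-generic inner function this way): a thin generic shim that immediately calls a non-generic body, so only the tiny conversion code is duplicated per type. A sketch with a hypothetical function name:

```rust
use std::path::{Path, PathBuf};

// Generic shim: one tiny monomorphized copy per concrete P.
// (describe_path is a hypothetical name for illustration.)
fn describe_path<P: AsRef<Path>>(path: P) -> String {
    // Only this conversion is duplicated per type.
    inner(path.as_ref())
}

// Non-generic body: compiled exactly once, no matter how
// many types describe_path is instantiated with.
fn inner(path: &Path) -> String {
    format!("path has {} component(s)", path.components().count())
}

fn main() {
    // Three different concrete argument types, one shared body.
    println!("{}", describe_path("a/b"));                 // &str
    println!("{}", describe_path(String::from("a/b/c"))); // String
    println!("{}", describe_path(PathBuf::from("a")));    // PathBuf
}
```

The caller still enjoys a flexible generic API; the monomorphization cost is confined to the one-line conversion shim.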
Traits vs Virtual Dispatch
In C++ and Java, polymorphism typically means virtual functions: a vtable lookup at runtime to determine which method to call. This adds a pointer indirection for every call and prevents inlining. Rust defaults to static dispatch and only uses dynamic dispatch when you explicitly ask for it.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle { radius: f64 }
struct Rectangle { width: f64, height: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

// STATIC dispatch: the compiler knows the concrete type.
// The trait method is inlined directly. No vtable.
fn print_area<T: Shape>(shape: &T) {
    println!("Area: {:.2}", shape.area());
}

// DYNAMIC dispatch: only when you explicitly use 'dyn'.
// Uses a vtable, like C++ virtual functions.
fn print_area_dyn(shape: &dyn Shape) {
    println!("Area: {:.2}", shape.area());
}

fn main() {
    let c = Circle { radius: 5.0 };
    let r = Rectangle { width: 4.0, height: 6.0 };
    print_area(&c);     // static: inlined Circle::area
    print_area(&r);     // static: inlined Rectangle::area
    print_area_dyn(&c); // dynamic: vtable lookup
}

Static vs Dynamic Dispatch
──────────────────────────────────────────
Static dispatch (default in Rust):
┌────────────────────────────┐
│ print_area(&circle)        │
│ → inlined Circle::area()   │
│ → PI * r * r               │ direct computation
│ → no indirection           │
└────────────────────────────┘
Dynamic dispatch (opt-in with 'dyn'):
┌────────────────────────────┐
│ print_area_dyn(&circle)    │
│ → load vtable pointer      │
│ → look up area() slot      │ pointer indirection
│ → call through pointer     │
└────────────────────────────┘
C++ / Java use dynamic dispatch by default
(virtual functions). Rust uses static dispatch
by default, so you only pay for indirection
when you choose to.
Use dyn Trait when you need a heterogeneous collection (e.g., a vector of different shapes). Use generics with trait bounds for everything else. The compiler will monomorphize the generic version and inline the trait methods, giving you polymorphism at zero cost.
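The heterogeneous-collection case looks like this in practice. A sketch, repeating Shape, Circle, and Rectangle from the earlier example so the snippet stands alone:

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Circle { radius: f64 }
struct Rectangle { width: f64, height: f64 }

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Rectangle {
    fn area(&self) -> f64 {
        self.width * self.height
    }
}

// Each area() call here goes through the vtable: the one place
// you pay for indirection, because you asked for it with dyn.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|s| s.area()).sum()
}

fn main() {
    // Different concrete types in one Vec: this REQUIRES dyn,
    // because the elements have different sizes and vtables.
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Rectangle { width: 2.0, height: 3.0 }),
    ];
    println!("total area: {:.2}", total_area(&shapes)); // PI + 6.0
}
```

The Box is needed because trait objects are unsized; the Vec stores uniform (pointer, vtable) pairs.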
Ownership is Free
The ownership and borrowing system is Rust's most distinctive feature -- and it has absolutely zero runtime cost. The borrow checker exists only at compile time. Once compilation succeeds, no ownership tracking, no reference counting, and no garbage collection happens at runtime.
fn main() {
    let s1 = String::from("hello");

    // This MOVE is free at runtime.
    // No reference count increment, no deep copy.
    // The compiler just copies the (ptr, len, capacity) triple;
    // the heap buffer is untouched.
    let s2 = s1;

    // This BORROW is free at runtime.
    // No reference count, no bookkeeping.
    // It's just passing a pointer.
    let len = calculate_length(&s2);

    println!("{} has {} bytes", s2, len);
} // s2 is dropped here: one call to free().
  // The compiler inserted this. You didn't write it.

fn calculate_length(s: &String) -> usize {
    s.len()
} // s is a reference. Nothing to free.

What Exists at Runtime vs Compile Time
──────────────────────────────────────────
COMPILE TIME ONLY (zero runtime cost):
┌──────────────────────────────────────┐
│ Ownership tracking                   │
│ Borrow checking                      │
│ Lifetime verification                │
│ Move semantics enforcement           │
│ Send/Sync trait checking             │
│ Type parameter resolution            │
└──────────────────────────────────────┘
These exist only in the compiler. They
generate NO runtime code.
RUNTIME (minimal, predictable cost):
┌──────────────────────────────────────┐
│ Drop calls (deterministic free)      │
│ Bounds checks on array indexing      │
│ Option/Result pattern matching      │
└──────────────────────────────────────┘
These are predictable and optimizable.
No GC pauses. No reference counting.
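Even the bounds checks listed under runtime cost are frequently eliminated: iterating with an iterator involves no index at all, and with explicit indexing the optimizer can usually prove the index is in range and drop the check. A sketch of the two styles, with hypothetical function names:

```rust
// Indexed access: each numbers[i] carries a bounds check that
// the optimizer must prove away (here it typically can, since
// i is visibly bounded by numbers.len()).
fn sum_indexed(numbers: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..numbers.len() {
        total += numbers[i];
    }
    total
}

// Iterator access: no index exists, so there is nothing to check.
fn sum_iter(numbers: &[i32]) -> i32 {
    numbers.iter().sum()
}

fn main() {
    let data = [1, 2, 3, 4];
    assert_eq!(sum_indexed(&data), sum_iter(&data));
    println!("both: {}", sum_iter(&data));
}
```

Preferring iterators is thus not just a style choice; it removes the one check the table admits to.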
Compare this to other safe languages: Java's garbage collector pauses all threads to reclaim memory. Python uses reference counting plus a cycle collector. Go has a concurrent GC that still adds latency. Rust achieves the same safety with none of these runtime mechanisms. The compiler does all the work before your program runs.
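The claim that a move copies no heap data is directly observable: a String's buffer address (readable via as_ptr) is identical before and after a move. A small check:

```rust
// Returns true if the heap buffer address is unchanged across a move.
fn buffer_survives_move() -> bool {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // address of the heap buffer

    let s2 = s1; // MOVE: only the (ptr, len, capacity) triple is copied

    // Same heap buffer: nothing was cloned, counted, or traced.
    before == s2.as_ptr()
}

fn main() {
    assert!(buffer_survives_move());
    println!("buffer survived the move unchanged");
}
```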
The compiled output is identical to C
If you write equivalent code in C (manually calling malloc and free at the right places), the generated assembly will be virtually identical to Rust's. The difference is that in C, you wrote the memory management manually and hoped you got it right. In Rust, the compiler wrote it for you and proved it correct.
Summary
Zero-cost abstractions are not a feature of Rust -- they are a fundamental design principle that permeates the entire language. Every major abstraction in Rust is designed to compile away completely.
Iterators compile to tight loops
Chains of .filter().map().sum() are fused into a single loop with no intermediate allocations. The generated assembly is identical to a hand-written while loop.
Generics are monomorphized
Generic functions produce specialized versions for each type. No boxing, no indirection, no type erasure. The code runs at the same speed as if you had written a separate function for each type.
Traits use static dispatch by default
Trait method calls are resolved at compile time and inlined. Dynamic dispatch with dyn is available when needed but never forced upon you.
Ownership has zero runtime presence
The entire ownership and borrowing system is erased after compilation. No garbage collector, no reference counting, no runtime checks. Safety is enforced at compile time and costs nothing at runtime.
This is Rust's core promise: you do not have to choose between writing expressive, maintainable code and writing fast code. The compiler bridges the gap, turning high-level abstractions into optimal machine code. Abstraction without compromise.