Why does the lifetime name appear as part of the function type?
Let me expand on the previous answers…
What does the annotation <'a> after the function name mean?
I wouldn't use the word "annotation" for that. Much like <T> introduces a generic type parameter, <'a> introduces a generic lifetime parameter. You can't use any generic parameters without introducing them first, and for generic functions this introduction happens right after their name. You can think of a generic function as a family of functions. So, essentially, you get one function for every combination of generic parameters. substr::<'x> would be a specific member of that function family for some lifetime 'x.
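To make the "family of functions" idea a bit more concrete, here is a small sketch. The body of substr is taken from the examples further down in this thread, and the demo function is a hypothetical helper added only for illustration. The caller never names the lifetime; the compiler picks a member of the family at each call site.

```rust
// Signature from the question; the body matches the substr example later in the thread.
fn substr<'a>(s: &'a str, until: u32) -> &'a str {
    &s[..until as usize]
}

// Hypothetical helper: calls substr with two very different lifetimes.
fn demo() -> (&'static str, String) {
    // String literals are &'static str, so here 'a is chosen as 'static.
    let from_literal: &'static str = substr("hello world", 5);

    // Here 'a is much shorter: it only covers the borrow of `owned` in this function.
    let owned = String::from("goodbye");
    let from_local = substr(&owned, 4);

    // Copy the short-lived slice into a String so it can leave the function.
    (from_literal, from_local.to_string())
}
```

Both calls go through the same source-level function, but each one uses a different lifetime for 'a.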
If you're unclear on when and why we have to be explicit about lifetimes, read on…
Every reference type carries a lifetime parameter. When you write
fn main() {
    let x = 28374;
    let r = &x;
}
the compiler knows that x lives in the main function's scope, the region enclosed in curly braces. Internally, it identifies this scope with some lifetime parameter. For us, it is unnamed. When you take the address of x, you get a value of a specific reference type. A reference type is a member of a two-dimensional family of reference types. One axis is the type of what the reference points to, and the other axis is a lifetime that is used for two constraints:
- The lifetime parameter of a reference type represents an upper bound for how long you can hold on to that reference.
- The lifetime parameter of a reference type represents a lower bound for the lifetime of the things you can make the reference point to.
Together, these constraints play a vital role in Rust's memory safety story. The goal here is to avoid dangling references. We would like to rule out references that point to some memory region we are not allowed to use anymore because the thing they used to point to no longer exists.
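The two constraints can be seen in a small sketch like the one below. The function name read_outer and the variable names are made up for this illustration. The commented-out variant is the classic dangling case that the compiler rejects, while the live function stays within both bounds.

```rust
// Rejected if uncommented: `inner` does not live long enough,
// because `r` would still be used after `inner` is gone.
//
// fn dangling() -> i32 {
//     let r;
//     {
//         let inner = 5;
//         r = &inner;
//     }
//     *r
// }

// Hypothetical example that respects both constraints.
fn read_outer() -> i32 {
    let outer = 28374;
    let r;
    {
        let inner = 5;
        // Upper bound respected: `short` is never used after this block ends.
        let short = &inner;
        // Lower bound respected: `outer` outlives every later use of `r`.
        r = &outer;
        assert_eq!(*short, 5);
    }
    *r
}
```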
One potential source of confusion is probably the fact that lifetime parameters are invisible most of the time. But that does not mean they are not there. References always have a lifetime parameter in their type. Such a lifetime parameter does not have to have a name, though, and most of the time we don't need to mention it anyway because the compiler can assign names for lifetime parameters automatically. This is called "lifetime elision". For example, in the following case, you don't see any lifetime parameters being mentioned:
fn substr(s: &str, until: u32) -> &str {…}
But it's okay to write it like this. It's actually a short-cut syntax for the more explicit
fn substr<'a>(s: &'a str, until: u32) -> &'a str {…}
Here, the compiler automatically assigns the same name to the "input lifetime" and the "output lifetime" because it's a very common pattern and most likely exactly what you want. Because this pattern is so common, the compiler lets us get away without saying anything about lifetimes. It assumes that this more explicit form is what we meant, based on a couple of "lifetime elision" rules (which are at least documented here).
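One way to convince yourself that the two spellings mean the same thing is the following sketch. The names substr_elided, substr_explicit, Substr and both_forms are made up for this example, and the function bodies are an assumption since the signatures above elide them. Both functions fit the same function-pointer type, which is what you would expect if the elided form is just shorthand for the explicit one.

```rust
// Elided form: the compiler fills in the lifetime for us.
fn substr_elided(s: &str, until: u32) -> &str {
    &s[..until as usize]
}

// Explicit form: the same signature with the lifetime spelled out.
fn substr_explicit<'a>(s: &'a str, until: u32) -> &'a str {
    &s[..until as usize]
}

// A single function-pointer type that both functions can be coerced to.
type Substr = for<'a> fn(&'a str, u32) -> &'a str;

// Hypothetical helper returning both forms behind the same type.
fn both_forms() -> [Substr; 2] {
    let elided: Substr = substr_elided;
    let explicit: Substr = substr_explicit;
    [elided, explicit]
}
```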
There are situations in which explicit lifetime parameters are not optional. For example, if you write
fn min<T: Ord>(x: &T, y: &T) -> &T {
    if x <= y {
        x
    } else {
        y
    }
}
the compiler will complain. With two input references, the elision rules cannot tell which one the result borrows from, so it is as if the declaration were
fn min<'a, 'b, 'c, T: Ord>(x: &'a T, y: &'b T) -> &'c T { … }
So, for each reference a separate lifetime parameter is introduced. But no information on how the lifetime parameters relate to each other is available in this signature. The user of this generic function could use any lifetimes. And that's a problem inside its body. We're trying to return either x or y. But the type of x is &'a T. That's not compatible with the return type &'c T. The same is true for y. Since the compiler knows nothing about how these lifetimes relate to each other, it's not safe to return these references as a reference of type &'c T.
Can it ever be safe to go from a value of type &'a T to &'c T? Yes. It's safe if the lifetime 'a is equal to or greater than the lifetime 'c. Or in other words, 'a: 'c (read "'a outlives 'c"). So, we could write this
fn min<'a, 'b, 'c, T: Ord>(x: &'a T, y: &'b T) -> &'c T
where 'a: 'c, 'b: 'c
{ … }
and get away with it without the compiler complaining about the function's body. But it's actually unnecessarily complex. We can also simply write
fn min<'a, T: Ord>(x: &'a T, y: &'a T) -> &'a T { … }
and use a single lifetime parameter for everything. The compiler is able to deduce 'a as the minimum lifetime of the argument references at the call site just because we used the same lifetime name for both parameters. And this lifetime is precisely what we need for the return type.
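Here is a sketch that puts both variants side by side with their bodies filled in. The name min_three and the demo function are made up for this illustration. The arguments deliberately live in different scopes, so you can see that both signatures accept them.

```rust
// The verbose variant: three lifetimes tied together by outlives bounds.
fn min_three<'a, 'b, 'c, T: Ord>(x: &'a T, y: &'b T) -> &'c T
where
    'a: 'c,
    'b: 'c,
{
    if x <= y { x } else { y }
}

// The simple variant: one lifetime for both arguments and the result.
fn min<'a, T: Ord>(x: &'a T, y: &'a T) -> &'a T {
    if x <= y { x } else { y }
}

// Hypothetical caller whose two arguments live in different scopes.
fn demo() -> (i32, i32) {
    let long_lived = 10;
    let first;
    let second;
    {
        let short_lived = 3;
        // The returned references are only valid inside this block,
        // so we copy the values out before `short_lived` goes away.
        first = *min_three(&long_lived, &short_lived);
        second = *min(&long_lived, &short_lived);
    }
    (first, second)
}
```

In both calls the result may only be used while short_lived is still alive, which is the "minimum lifetime" mentioned above.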
I hope this answers your question. :) Cheers!
The <'a> annotation just declares the lifetimes used in the function, exactly like generic parameters <T>.
fn subslice<'a, T>(s: &'a [T], until: u32) -> &'a [T] {
    &s[..until as usize]
}
Note that in your example, all lifetimes can be elided.
fn subslice<T>(s: &[T], until: u32) -> &[T] {
    &s[..until as usize]
}

fn substr(s: &str, until: u32) -> &str {
    &s[..until as usize]
}
What does the annotation <'a> after the function name mean?
fn substr<'a>(s: &'a str, until: u32) -> &'a str;
//       ^^^^
This is declaring a generic lifetime parameter. It's similar to a generic type parameter (often seen as <T>), in that the caller of the function gets to decide what the lifetime is. Like you said, the lifetime of the result will be the same as the lifetime of the first argument.
All lifetime names are equivalent, except for one: 'static. This lifetime is pre-set to mean "guaranteed to live for the entire life of the program".
The most common lifetime parameter name is probably 'a, but you can use any letter or string. Single letters are most common, but any snake_case identifier is acceptable.
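A small sketch of what that looks like in practice. All function names and the lifetime names 'input and 'name are invented for this example; the only name with a fixed meaning is 'static.

```rust
// A descriptive lifetime name works exactly like 'a would.
fn first_word<'input>(text: &'input str) -> &'input str {
    text.split_whitespace().next().unwrap_or("")
}

// 'static is the one special name: a string literal lives for the whole program.
fn default_name() -> &'static str {
    "anonymous"
}

// A 'static reference can be used wherever a shorter lifetime is expected.
fn name_or_default<'name>(name: Option<&'name str>) -> &'name str {
    name.unwrap_or(default_name())
}
```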
Why does the compiler need it, and what does it do with it?
Rust generally favors things to be explicit, unless there's a very good ergonomic benefit. For lifetimes, lifetime elision takes care of something like 85+% of cases, which seemed like a clear win.
Type parameters live in the same namespace as other types: is T a generic type or did someone name a struct that? Thus type parameters need to have an explicit annotation that shows that T is a parameter and not a real type. However, lifetime parameters don't have this same problem, so that's not the reason.
Instead, the main benefit of explicitly listing lifetime parameters is that you can control how multiple parameters interact. A nonsense example:
fn better_str<'a, 'b, 'c>(a: &'a str, b: &'b str) -> &'c str
where
    'a: 'c,
    'b: 'c,
{
    if a.len() < b.len() {
        a
    } else {
        b
    }
}
We have two strings and say that the input strings may have different lifetimes, but must both outlive the lifetime of the result value.
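For contrast, here is a sketch of the opposite situation, where the result is tied to only one of the inputs. The names keep_first and demo are hypothetical. Because the return type mentions only 'a, the second argument can be as short-lived as it likes without restricting the result.

```rust
// The result borrows only from `a`; the lifetime of `_b` is unrelated to it.
fn keep_first<'a, 'b>(a: &'a str, _b: &'b str) -> &'a str {
    a
}

// Hypothetical caller: the result is 'static even though `temp` is a local.
fn demo() -> &'static str {
    let kept: &'static str = "kept";
    let temp = String::from("temp");
    let result = keep_first(kept, &temp);
    result
}
```

If both parameters shared a single lifetime, demo would not compile, since the result would then be limited by the borrow of temp.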
Another example, as pointed out by DK, is that structs can have their own lifetimes. I made this example also a bit of nonsense, but it hopefully conveys the point:
struct Player<'a> {
    name: &'a str,
}

fn name<'p, 'n>(player: &'p Player<'n>) -> &'n str {
    player.name
}
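A possible usage sketch for this, with a made-up demo function. Since the result is tied to 'n and not to 'p, the returned name can still be used after the Player value itself is gone, as long as the string it was built from is still alive.

```rust
struct Player<'a> {
    name: &'a str,
}

fn name<'p, 'n>(player: &'p Player<'n>) -> &'n str {
    player.name
}

// Hypothetical caller: the extracted name outlives the Player it came from.
fn demo() -> String {
    let stored = String::from("Ferris");
    let extracted;
    {
        let player = Player { name: &stored };
        extracted = name(&player);
    } // `player` is gone here, but `extracted` still borrows from `stored`.
    extracted.to_string()
}
```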
Lifetimes can be one of the more mind-bending parts of Rust, but they are pretty great when you start to grasp them.