Member constraints
A member constraint 'm member of ['c_1..'c_N]
expresses that the
region 'm
must be equal to some choice regions 'c_i
(for
some i
). These constraints cannot be expressed by users, but they
arise from impl Trait
due to its lifetime capture rules. Consider a
function such as the following:
fn make(a: &'a u32, b: &'b u32) -> impl Trait<'a, 'b> { .. }
Here, the true return type (often called the "hidden type") is only
permitted to capture the lifetimes 'a
or 'b
. You can kind of see
this more clearly by desugaring that impl Trait
return type into its
more explicit form:
type MakeReturn<'x, 'y> = impl Trait<'x, 'y>;
fn make(a: &'a u32, b: &'b u32) -> MakeReturn<'a, 'b> { .. }
Here, the idea is that the hidden type must be some type that could
have been written in place of the impl Trait<'x, 'y>
-- but clearly
such a type can only reference the regions 'x
or 'y
(or
'static
!), as those are the only names in scope. This limitation is
then translated into a restriction to only access 'a
or 'b
because
we are returning MakeReturn<'a, 'b>
, where 'x
and 'y
have been
replaced with 'a
and 'b
respectively.
Detailed example
To help us explain member constraints in more detail, let's spell out
the make
example in a bit more detail. First off, let's assume that
you have some dummy trait:
trait Trait<'a, 'b> { }
impl<T> Trait<'_, '_> for T { }
and this is the make
function (in desugared form):
type MakeReturn<'x, 'y> = impl Trait<'x, 'y>;
fn make(a: &'a u32, b: &'b u32) -> MakeReturn<'a, 'b> {
(a, b)
}
What happens in this case is that the return type will be (&'0 u32, &'1 u32)
,
where '0
and '1
are fresh region variables. We will have the following
region constraints:
'0 live at {L}
'1 live at {L}
'a: '0
'b: '1
'0 member of ['a, 'b, 'static]
'1 member of ['a, 'b, 'static]
Here the "liveness set" {L}
corresponds to that subset of the body
where '0
and '1
are live -- basically the point from where the
return tuple is constructed to where it is returned (in fact, '0
and
'1
might have slightly different liveness sets, but that's not very
interesting to the point we are illustrating here).
The 'a: '0
and 'b: '1
constraints arise from subtyping. When we
construct the (a, b)
value, it will be assigned type (&'0 u32, &'1 u32)
-- the region variables reflect that the lifetimes of these
references could be made smaller. For this value to be created from
a
and b
, however, we do require that:
(&'a u32, &'b u32) <: (&'0 u32, &'1 u32)
which means in turn that &'a u32 <: &'0 u32
and hence that 'a: '0
(and similarly that &'b u32 <: &'1 u32
, 'b: '1
).
Note that if we ignore member constraints, the value of '0
would be
inferred to some subset of the function body (from the liveness
constraints, which we did not write explicitly). It would never become
'a
, because there is no need for it too -- we have a constraint that
'a: '0
, but that just puts a "cap" on how large '0
can grow to
become. Since we compute the minimal value that we can, we are happy
to leave '0
as being just equal to the liveness set. This is where
member constraints come in.
Choices are always lifetime parameters
At present, the "choice" regions from a member constraint are always lifetime
parameters from the current function. As of October 2021,
this falls out from the placement of impl Trait, though in the future it may not
be the case. We take some advantage of this fact, as it simplifies the current
code. In particular, we don't have to consider a case like '0 member of ['1, 'static]
, in which the value of both '0
and '1
are being inferred and hence
changing. See rust-lang/rust#61773 for more information.
Applying member constraints
Member constraints are a bit more complex than other forms of
constraints. This is because they have a "or" quality to them -- that
is, they describe multiple choices that we must select from. E.g., in
our example constraint '0 member of ['a, 'b, 'static]
, it might be
that '0
is equal to 'a
, 'b
, or 'static
. How can we pick the
correct one? What we currently do is to look for a minimal choice
-- if we find one, then we will grow '0
to be equal to that minimal
choice. To find that minimal choice, we take two factors into
consideration: lower and upper bounds.
Lower bounds
The lower bounds are those lifetimes that '0
must outlive --
i.e., that '0
must be larger than. In fact, when it comes time to
apply member constraints, we've already computed the lower bounds of
'0
because we computed its minimal value (or at least, the lower
bounds considering everything but member constraints).
Let LB
be the current value of '0
. We know then that '0: LB
must
hold, whatever the final value of '0
is. Therefore, we can rule out
any choice 'choice
where 'choice: LB
does not hold.
Unfortunately, in our example, this is not very helpful. The lower
bound for '0
will just be the liveness set {L}
, and we know that
all the lifetime parameters outlive that set. So we are left with the
same set of choices here. (But in other examples, particularly those
with different variance, lower bound constraints may be relevant.)
Upper bounds
The upper bounds are those lifetimes that must outlive '0
--
i.e., that '0
must be smaller than. In our example, this would be
'a
, because we have the constraint that 'a: '0
. In more complex
examples, the chain may be more indirect.
We can use upper bounds to rule out members in a very similar way to
lower bounds. If UB is some upper bound, then we know that UB: '0
must hold, so we can rule out any choice 'choice
where UB: 'choice
does not hold.
In our example, we would be able to reduce our choice set from ['a, 'b, 'static]
to just ['a]
. This is because '0
has an upper bound
of 'a
, and neither 'a: 'b
nor 'a: 'static
is known to hold.
(For notes on how we collect upper bounds in the implementation, see the section below.)
Minimal choice
After applying lower and upper bounds, we can still sometimes have
multiple possibilities. For example, imagine a variant of our example
using types with the opposite variance. In that case, we would have
the constraint '0: 'a
instead of 'a: '0
. Hence the current value
of '0
would be {L, 'a}
. Using this as a lower bound, we would be
able to narrow down the member choices to ['a, 'static]
because 'b: 'a
is not known to hold (but 'a: 'a
and 'static: 'a
do hold). We
would not have any upper bounds, so that would be our final set of choices.
In that case, we apply the minimal choice rule -- basically, if
one of our choices if smaller than the others, we can use that. In
this case, we would opt for 'a
(and not 'static
).
This choice is consistent with the general 'flow' of region propagation, which always aims to compute a minimal value for the region being inferred. However, it is somewhat arbitrary.
Collecting upper bounds in the implementation
In practice, computing upper bounds is a bit inconvenient, because our
data structures are setup for the opposite. What we do is to compute
the reverse SCC graph (we do this lazily and cache the result) --
that is, a graph where 'a: 'b
induces an edge SCC('b) -> SCC('a)
. Like the normal SCC graph, this is a DAG. We can then do a
depth-first search starting from SCC('0)
in this graph. This will
take us to all the SCCs that must outlive '0
.
One wrinkle is that, as we walk the "upper bound" SCCs, their values
will not yet have been fully computed. However, we have already
applied their liveness constraints, so we have some information about
their value. In particular, for any regions representing lifetime
parameters, their value will contain themselves (i.e., the initial
value for 'a
includes 'a
and the value for 'b
contains 'b
). So
we can collect all of the lifetime parameters that are reachable,
which is precisely what we are interested in.