Returning refs

16 February 2012

One commonly requested feature for regions is the ability to return references to the inside of structures. I did not allow that in the proposal in my previous post because I did not want to have any region annotations beyond a simple &. I think, however, that if you want to allow returning references to the interior of a parameter, you need a way for the user to denote region names explicitly.

The big problem with returning references to the interior of data structures is ensuring the lifetime and validity of that reference. I think it can be supported in some cases but not particularly well in general. It will work ok for structures allocated on the stack but be quite limited for structures allocated in the heap, unless we have strong support from the garbage collector. Another scenario where it could work well would be user-managed memory pools (which we probably do want to support eventually).

Simplest case

Let’s start with the simplest case:

type T = {mut f: uint};
fn get_f(t: r&T) -> r&mut uint { &t.f }

Here we have a function that returns a pointer to a field of its parameter. On the callee side, I think this is fairly straightforward. You name the region in which the parameter is located (r) and say that the return value is a mutable pointer to uint in the same region (r&mut uint). The notation may leave something to be desired, but the concept is hopefully clear enough.

Returning references to the stack

But what about the caller? Again we’ll start with the simplest case, where get_f() is invoked with data that lives on the stack:

fn caller_on_stack() {   // let region `b` refer to the function body
    let t = &{mut f: 3}; // type of t: `b&T`
    let p = get_f(t);    // type of p: `b&mut uint`
    *p = 5;              // legal, `b` region is in scope.
}

In this case, there is no concern about memory management per se. The returned pointer is in the region b. The type checker will enforce the rule that if a function returns a reference type, the reference must be in scope (note: I haven’t thought through what this means for generic types with the ref kind, but I guess they can be handled one way or another). Generally, because the parameter regions must be in scope for the call, the returned region will be in scope too—but we’ll see that this does not always hold.

Returning references to the heap, take 1

Let’s move on to a more complicated case where the data being accessed is in the heap. I’ll first discuss how it could work in some theoretical world and then show how this can cause problems:

fn caller_on_heap() {    // (note: not fully sound)
    let t = @{mut f: 3}; // type of t: `@T`
    let p = get_f(t);    // type of p: `@mut uint`
    *p = 5;              // legal, `@` region is in scope.
}

The only difference is that the variable t is in the heap region @. Now the returned pointer is considered to be in that region as well, and so the assignment is permitted. This seems reasonable at first, but it is of course incompatible with ref counting (p is not a ref-counted entity). If we moved to a garbage collector which could handle interior pointers, this would be reasonably safe. Interior pointer support is needed to accomodate cases where p gets returned, like this one:

fn caller_on_heap_r() -> @mut uint {
    let t = @{mut f: 3};
    ret get_f(t);
}

So is there anything we can do that is compatible with ref. counting? The answer is “sort of”.

Returning references to the heap, take 2

One thought is that we say that @ is not actually a region. It’s never been the best fit, due to ref counting requirements and implicit headers. Instead, we say that regions always refer to some block-scoped slice of the program execution. The most common case would be a block in the program, but in some cases the region might be “the time in which a given expression is evaluated” and so forth (as an aside: this is basically what I called an “interval” in my thesis…minus all the parallel parts).

An @T pointer could then be implicitly coerced into an r&T pointer where the region r is the biggest region for which the type checker can guarantee the validity of the @T pointer. So, we can now revisit the previous examples. The first example works more-or-less the same as before:

fn caller_on_heap() {    // region of the block is `r`
    let t = @{mut f: 3}; // type of t: `@T`
    let p = get_f(t);    // type of p: `r&mut uint`
    *p = 5;              // legal, `r` region is in scope.
}

The differences here lie in the type checker. The pointer p is no longer in the @ region but rather in the region r corresponding to the function body. The reason that the region r could be safely used is that t is an immutable local variable (I am assuming issue #1273 is implemented…working on that right now). This means that the memory will remain valid as long as t is in scope.

EDIT: There is no reason to impose the following restriction. See discussion below. This implies that if t were mutable, the example would not work In that case, the validity of the memory to t could not be guaranteed for the entire block, as t could be overwritten. In other words, a program might do something like this:

fn caller_on_heap() {
    let mut t = @{mut f: 3};
    let p = get_f(t);
    t = @{mut f: 22};    // original memory is now freed
    *p = 5;              // memory error.
}

However, such a program would not type check. The reason is that, because t is mutable, when t was coerced to a region type, a narrow region s would be assigned. The region s would correspond to precisely the call to get_f(). The result of get_f() would therefore have type s&mut uint, but the region s would be out of scope after get_f() returned, and so a type error occurs (this is that rule I mentioned before: when returning a reference, the region must be in scope).

As an aside, coercing a unique pointer ~T into a region would work similarly to the second case: that is, the resulting region is always a very narrow one. It does not matter if the variable storing the unique pointer is immutable or not. The reason is that the local variable is being borrowed for the lifetime of the region. If we assigned a large region, the local variable would be inaccessible after the call, because we would not be able to guarantee the uniqueness invariant, as there might be escaped region-typed pointers into its interior. Anyway, I don’t want to go into details about unique pointers in this post as it’s already plenty long.

EDIT: pcwalton pointed out to me that there is no reason to treat mutable variables specially. Instead, we can basically just increment the ref count whenever we coerce an @T to a r&T. The region r would still be the region of an enclosing block b (probably the innermost one, or perhaps the one where the variable is declared) and we would release the reference upon exiting the block b can still optimize immutable variables to not increase the reference at all because it is unnecessary. I rejected this approach initially because I was thinking that we would want to keep it very predictable when references would be dropped, but that’s not actually an important property. Garbage collection traditionally does not define precisely when dead memory will be reclaimed, after all (and, as graydon correctly points out, RC+CC is garbage collection). Note though that borrowing unique pointers probably still ought to use a narrow region corresponding to the call or alt statement in which the borrow occurs.

But does it scale?

So, I think this system I showed above is reasonable. I think it has a clear story, too, which is important to me, because it helps me believe that it is sound even though I haven’t made any kind of formal proof or argument. The story is basically that

  • a region is always a block-scoped slice of the dynamic execution;
  • when a @T is coerced into a region pointer, the result is the largest region for which the validity of the @T pointer can be guaranteed;
  • when a ~T is coerced into a region, the result is a narrow region corresponding to just the duration of the borrow (I haven’t gone into details on this… perhaps in a later post)

However, in all of these examples I showed only the simplest case, where a pointer was returned directly into one of the parameters. A more realistic scenario is probably returning some interior pointer to a record reachable from a parameter. For example, a common C trick is to have a hashtable lookup not return the value which was found but rather a pointer to the value. This allows the caller to update the value without having to use any further API calls. Let’s look at that example: we will find that the regions don’t scale up.

The prototype for such a function might be:

fn get_value_ptr<K,V>(m: r&map<K,V>, k: &K) -> option<r&V> {
    ...
}

This looks reasonable, but once we start digging into the details things don’t work so well. Let’s assume a hashmap with chains for each key:

enum bucket<K,V> = {k: K, mut v: V, mut next: option<@bucket<K,V>>};
enum map<K,V> = {buckets: [mut option<@bucket<K,V>>], ...};

fn get_value_ptr<K,V>(m: r&map<K,V>, k: &K) -> option<r&mut V> {
    let bkt_idx: uint = find_bucket_index(m, k);
    alt search_bucket_chain(m.buckets[bkt_idx], k) {
        none { none }     // no bucket with key k
        some(bkt) {       // found a bucket with key k
                          // bkt has type @bucket<K,V>
            some(&bkt.v)  // (*) error
        }
    }
}

The example should be straightforward if you’ve ever coded up a hashtable. The tricky part is marked with a (*): once we’ve found the bucket containing the value, we attempt to return a pointer to its interior. But this is not safe, of course! The caller expects a pointer in the same region as the map: that is, with the same lifetime. There is no way for get_value_ptr() to guarantee that bkt is valid as long as the map m is valid. The caller might, for example, remove the key from the hashtable, thus invalidating the bucket. This error manifests itself as a type error, because the type of &bkt.v is a region pointer corresponding to the region of the alt, not the region r which was provided as a parameter.

Could this example be made to work?

I don’t think it can without a lot of work. You could imagine that the caller would consider the map “borrowed” while the returned value is in scope and thus try not to modify it. But there may be aliases to the map, so there are still no guarantees.

I think there are two ways to handle an example like that in a safe fashion:

  • Improve the GC, as described before
  • Re-write the example into a top-down style.

The second “solution” is what we support today, and what is supported by the “Regions Lite” idea I described before. You would write:

fn get_value_ptr<K,V,R>(
    m: r&map<K,V>,
    k: &K,
    f: fn(option<&mut V>) -> R) -> R {
    ...
}

In other words, the get_value_ptr() method will call the closure f with the result of the lookup. In this way, the get_value_ptr() method itself can guarantee that the reference is valid.

Is it all worth it?

It may still be worth having labeled region parameters and supporting a limited form of returning references, but I am not sure. I feel like writing things in a “top-down”, CPS-ish style is perhaps just a better solution—it’s certainly more widely applicable.

Regardless, I do like the idea of formalizing regions as representing a slice of time, and defining coercions from @T and ~T. I think that feels intuitively sensible and I think it can be explained in a reasonable way.

There is however one potentially important future case where the simple form of returning references would be enough. If we had user-defined memory pools, then it would be possible to have large, dynamically allocated, multi-object data structures that all lie within one region (the memory pool). A map, for example, might have an associated arena. This scheme would then work. When you think about it, treating @ as a region and having a GC which can handle interior pointers is really the same thing as this scheme, from an abstract point of view.

Another important thing is that I think there is a path from all of these more limited forms to the more general ones. If we say, for now, that @ is not a region and we do not support labeled parameters, we can add both of those later. Existing programs will continue to type check.