Rust has higher kinded types already... sort of

Oct 18, 2018 [ Rust, Types ]

Discuss this post on Reddit.

Refresher

First, a very quick refresher.

In Rust, a type which takes type parameters (Rc<T>, Vec<T>, HashMap<K, V>, etc) is only a valid type when all type parameters are specified. In other words, Rc, Vec, and HashMap<K> are not types. You can't have a variable of type Rc. You can't pass Rc as a parameter to other types.

The ability to have such things be actual types is a feature called higher kinded types (HKT). It's primarily useful when combined with traits which also take type parameters. Consider, for example, a hypothetical trait to abstract over different kinds of reference-counted pointers.

trait RcPtr<T>: Clone + Deref<Target=T> {}

You can easily implement this for Rc and Arc:

impl<T> RcPtr<T> for Rc<T> {}
impl<T> RcPtr<T> for Arc<T> {}

However, with HKTs, RcPtr (as opposed to RcPtr<T>) would be a valid trait, and Rc and Arc would implement it (as opposed to Rc<T> and Arc<T>).

This would allow you to do things like:

struct TreeNode<T, P: RcPtr> {
    val: T,
    right: P<TreeNode<T, P>>,
    left: P<TreeNode<T, P>>,
}

This is a tree data structure where the caller gets to decide whether it's thread-safe or not. There are many applications which would benefit greatly from allowing the caller to specify some property such as thread safety, caching behavior, etc - all things which require HKTs in order to leave type parameters unbound until later than is allowed in normal Rust today (I say "normal" because, well, read the next section).

HKTs in Rust Today

One way to view the limitation that prevents us from writing this code is that, a) when declaring a trait bound, all type parameters must be bound (i.e., P: RcPtr is not a valid trait bound since RcPtr takes a type parameter) and, b) when declaring a type, all type parameters must be bound (i.e., Rc and Arc aren't types; Rc<T> and Arc<T> are).

However, it's not quite true that there's no way to define a trait which allows some type parameters to remain unbound, or that there's no way to implement those traits with types which also leave some type parameters unbound. Consider, for example, the following function from the standard library's Iterator trait:

trait Iterator {
    fn sum<S: Sum<Self::Item>>(self) -> S;
}

Even once a type has implemented the Iterator trait, the S type parameter on sum is still unbound - it's only bound once the method is called. This implies a general pattern - if you want to allow specifying a trait bound while leaving type parameters unbound, just move them into the methods on that trait. Let's give it a try:

trait RcPtr {
    fn clone<T>(&self) -> Self;
    fn deref<T>(&self) -> &T;
}

If this trait gives you an uneasy feeling, you've got the right intuition. Normally, with a trait with a type parameter like:

trait RcPtr<T>  {
    fn clone(&self) -> Self;
    fn deref(&self) -> &T;
}

You declare variables like:

let foo: RcPtr<T> = ...;

Then, whenever you operate on foo, the compiler guarantees that the correct type T is used. This is important, or else you'd be able to operate on the same variable as different types at different points in the program:

let foo: usize = 0;
(foo: &str) = "hello there";
(foo: &[u8])[0] += 1;

In other words, the fact that Rust tracks the type of variables is what allows us to have memory safety! So if the type parameter to RcPtr is moved from the trait itself into each of the methods, then you could do something like:

let foo: RcPtr = ...;
println!("{}", foo.deref::<&str>());
let bar = foo.clone::<&[u8]>();

In other words, you could violate memory safety. No bueno. I guess we're out of luck...

Although... There is a giant, hairy, awful, scary escape hatch in the language, so what if we just...

unsafe trait RcPtr {
    unsafe fn clone<T>(&self) -> Self;
    unsafe fn deref<T>(&self) -> &T;
}

That's better :)

Of course, using unsafe means that you're opting into manually verifying certain properties. So what properties does the programmer pinky-swear to uphold here? Essentially, it becomes the programmer's responsibility to track the types of variables and only ever access them consistently. In other words, for a given instance of a variable, it must only ever be accessed with the same T parameter.

Luckily, Rust is pretty good at encapsulation, so we can just encapsulate this in a safe type that uses unsafe under the hood, and guarantees to always use the same T:

struct RcWrapper<R: RcPtr, T> {
    rc: R,
    _marker: PhantomData<T>,
}

impl<R: RcPtr, T> RcWrapper<R, T> {
    fn new(t: T) -> Self {
        RcWrapper {
            rc: R::new::<T>(t),
            _marker: PhantomData,
        }
    }
}

impl<R: RcPtr, T> Deref for RcWrapper<R, T> {
    type Target = T;
    fn deref(&self) -> &T {
        unsafe { self.rc.deref() }
    }
}

Given an R: RcPtr and a T, instead of writing R<T> as we would if Rust supported HKT natively, we write RcWrapper<R, T>. We can then treat the RcWrapper<R, T> just as we would R<T> if we had native HKT - we can pass it around, call methods on it, clone it, etc.

OK, so we can define the trait, and we can even make it safe to use. But how on earth could we implement the trait? What would such an implementation look like?

Awful, that's what. While one of the benefits of giving variables types is that you can guarantee that you always operate on them as the proper type, another benefit is that you always know what size they are. What size is something whose type isn't known until you call methods on it? God only knows...

And when only God knows something, you've got no choice but to figure it out at runtime, and stick it on the heap.

struct SingleThreadedRc {
    // Actually *mut Rc<T>, but we don't know
    // what T is until runtime...
    inner: *mut (),
}

impl SingleThreadedRc {
    // Dear SingleThreadedRc, I pinky-swear that
    // you're actually holding a pointer to an Rc<T>.
    // I swear on my mother's grave.
    unsafe fn inner<T>(&self) -> &Rc<T> {
        &*(self.inner as *const T)
    }
}

unsafe impl RcPtr for SingleThreadedRc {
    unsafe fn clone<T>(&self) -> Self {
        // Get access to the Rc<T> we knew we had 
        // inside of us deep down all along...
        let inner: &Rc<T> = self.inner();
        let new_inner: &Rc<T> = inner.clone();
        let new_inner = Box::new(new_inner);
        SingleThreadedRc {
            inner: Box::into_raw(new_inner) as *mut (),
        }
    }

    unsafe fn deref<T>(&self) -> &T {
        let inner: &Rc<T> = self.inner();
        inner.deref()
    }
}

Does anybody have any Purell?

But it works. You can see a full example including usage here.

Conclusion

To recap, here's the general pattern. You would like a trait, MyTrait<T>, to be higher-kinded, but we don't support that in Rust right now. Instead, you create a trait, MyTrait, which takes no type parameters, but each of its methods take a type parameter, T. Then, if you have an M: MyTrait, instead of doing M<T> as you normally would, you create a concrete wrapper type, MyTraitWrapper<M: MyTrait, T>. M<T> is then written as MyTraitWrapper<M, T>.

Of course, the code presented here is far from pretty or optimal. One obvious drawback is the heap allocation. If you could guarantee an upper bound on the size of the internal type (e.g., if you knew an upper bound on the size of Rc<T> for all T), you could stack- instead of heap-allocate.

This also doesn't cover everything you might want HKT for. For example, how do you emulate the following trait, where you'd like to be able to do M: MyTrait and then, later, M::<T>::MyType?

trait MyTrait<T> {
    type MyType: Clone;
}

I'm pretty sure it's possible, but I haven't actually gotten it working in code yet. In the interest of keeping this post short (and also reclaiming some of my free time), I've decided to leave it an exercise to readers to either get it working, or argue that it can't be done (and thus that the approach I've proposed here falls short of full HKT).

Finally, in fairness, calling this "HKT" is a bit of a stretch. It's not a general-purpose mechanism - you have to manually implement it for every trait and every type - and it doesn't provide the same soundness guarantees that a language feature would. But, for any given application that needs a particular trait to be HKT-ified, it gets the job done.

Rust has HKT right now, but only the sinners can use it.

joshlf.com

Rust has higher kinded types already... sort of

Refresher

HKTs in Rust Today

Conclusion