Statically safe dynamic typing à la Python

June 28, 2019 // haskell, types

One of my hobby projects includes a long-running service, so it’d be nice if the service provided some metrics (say, using the ekg library) to the outside world for monitoring and alerts. As a consequence, the service needs an internal metrics storage that encapsulates all things related to creating them as needed, updating them, and so on.

Writing a metrics storage (especially on top of ekg) is trivial, but one cannot just solve a problem when doing recreational programming. You’ve got to abstract things away, generalize, and then abstract further and generalize further. So, quite soon I found myself writing an extensible and customizable storage supporting unknown metrics of unknown types in such a way that new metrics could be added in different modules without touching any existing definitions. This deserves a post or two on its own, but today we’ll consider just a tiny part of the solution: writing a type-safe wrapper over types that are only known at runtime. So, yeah, something like dynamic typing but with static guarantees that we don’t do any nonsense.

I don’t think this short post will reveal anything new for seasoned haskellers, but at least we’ll get this little part done and out of our way in our next articles about the storage itself. Or I could be less shy and claim instead that I created a new programming pattern.

Problem statement

Anyway, first things first. Let’s spell out what problem we are trying to solve. So, we want to be able to associate some objects (whose types aren’t known before the runtime) with some values of some other type (which we don’t use so we don’t care about). In other words, we want objects of more or less arbitrary (and different) types to be the keys in the same associative container.

Of course, if we are talking about associative containers, we still need to be able to perform at least some operations. Thus, we’ll stick to a tree-based map and require that those (apriori unknown) types support ordering (as in the Ord class). Alternatively, we could go with a hashmap, but it’d make the task a bit less challenging.

Part 0: existentials

When it comes to the notion of a value of some unknown type satisfying some (known) constraints, existential types are typically used to capture it. For instance, if we want to have a container for some value of unknown type that implements the Ord class, we could have this:

data SomeOrd where
  MkSomeOrd :: Ord a => a -> SomeOrd

Then, if we have a value of type SomeOrd and we pattern-match on it:

foo :: SomeOrd -> Bar
foo (MkSomeOrd val) = ... (1) ...

then we know at (1) that the type of val implements Ord, even though we don’t know the exact type of val. And, perhaps more importantly, the typechecker also knows that val implements Ord, so it allows us to use the methods of Ord.

Sort of. There’s a gotcha.

Let’s take a closer look and try to compare two values of unknown types:

tryCompare :: SomeOrd -> SomeOrd -> Bool
tryCompare (MkSomeOrd val1) (MkSomeOrd val2) = ?

Ord methods require their arguments to have the same type, but this is not necessarily the case for val1 and val2, so we still cannot compare them. Thus our SomeOrd is virtually useless. What do we do then?

TypeRep and type equality

Despite Haskell being a compiled language with aggressive type erasure, the compiler can still generate the runtime witnesses for the types (if asked). The witness for a type a is a value of the type TypeRep a (from the Data.Typeable module), and the way to ask the compiler to generate it is to use the Typeable class.

Moreover, there is a special function eqTypeRep that, given two different values rep1 :: TypeRep a, rep2 :: TypeRep b, checks if they actually represent the same type. If they indeed do, then it produces a proof that could be used to persuade the typechecker in this fact. Let’s see how.

Firstly, let’s look at its type:

eqTypeRep :: forall k1 k2 (a :: k1) (b :: k2). TypeRep a -> TypeRep b -> Maybe (a :~~: b)

What the hell does this mean?

  1. Of course, first of all, it means that eqTypeRep is a function.
  2. Then, it’s a polymorphic function, but not just types-polymorphic but also kinds-polymorphic, and that’s the forall k1 k2 (a :: k1) (b :: k2) part that shows this. Theoretically, this means that we could do some cool stuff with promoted types (as in DataKinds), but I haven’t got to playing around with that yet.
  3. eqTypeRep accepts two runtime witnesses of potentially different types, TypeRep a and TypeRep b for types a and b respectively.
  4. eqTypeRep produces a value of the type Maybe (a :~~: b). And this is where the typechecker magic happens. If the types do not match, the function just returns Nothing. But if they do match, then it returns Just val, where val is of type a :~~: b.

So let’s now move on to the :~~: type. It’s defined as

-- | Kind heterogeneous propositional equality. Like ':~:', @a :~~: b@ is
-- inhabited by a terminating value if and only if @a@ is the same type as @b@.
--
-- @since 4.10.0.0
data (a :: k1) :~~: (b :: k2) where
   HRefl :: a :~~: a

Now assume we have a value of the type a :~~: b. The only way to construct such a value is to use the HRefl constructor (since this is the only constructor of the type), but this constructor requires the same type on both sides of :~~:. Thus a and b indeed coincide. Moreover, if we pattern-match on val :: a :~~: b (via its sole constructor), the typechecker will also be aware of this fact. The bottom line is that eqTypeRep indeed returns a proof of two types being equal if they are in fact equal.

Digression: it’s a matter of trust

Except I lied. Nothing stops you from writing something like

wrong :: Int :~~: String
wrong = wrong    -- yay infinite recursion

in Haskell, or

wrong :: Int :~~: String
wrong = undefined

and use these functions as if they were legitimately defined. There are a few other ways to break the system, and all this is just an instance of a somewhat well-known statement that Haskell is inconsistent as a logic. Languages with more powerful type systems (think Agda or Idris) won’t type check the above definitions (or, at least, they’ll disallow using them in contexts where the logical consistency matters). And yet another way to look at this is as a manifestation of the special bottom value ⊥ inhabiting every type in Haskell: the functions above are actually equivalent to the ⊥.

Anyway, this is precisely why the comment from the quoted definition of :~~: mentions terminating values. Despite all the inconsistencies, the wrong functions above don’t produce a terminating value. This note allows us to get back some peace of mind, although with weaker guarantees: that is, if our Haskell program terminates, then its output is consistent with its types. There are a couple of issues related to the evaluation strategy, though, but let’s disregard that.

By the way, another manifestation of the weakness of Haskell’s type system is the type of eqTypeRep. In a stronger language with more expressive types it could have returned either a proof of type equality if the types match (as it does now), or a proof that the type equality leads to Void… err, I meant absurdity (instead of the boring Nothing that it returns now). On the other hand, this only starts to matter when the language is used to actually prove some theorems, and Haskell isn’t really used in this role, so nobody cares.

Anyway, that’s enough of logic and type theory, let’s get back to our metrics.

Witness-carrying existential wrapper

Next, we draw the rest of the owl the discussion above hints that we should just carry along the TypeRep in our existential type, which leads us to the following definition:

data Dyn where
  Dyn :: Ord a => TypeRep a -> a -> Dyn

  toDyn :: (Typeable a, Ord a) => a -> Dyn
  toDyn val = Dyn (typeOf val) val

Let’s now write a function that accepts the following:

  1. Two values of the type Dyn.
  2. A function that, given two values of any type satisfying the constraints of Dyn, produces a value of some predefined type (determined by the caller). This function will be called when the types of the Dyn values match.
  3. A fallback function that’s called instead of the previous one if the types don’t match.

All this results in the following function:

withDyns :: (forall a. Ord a => a -> a -> b) ->
            (SomeTypeRep -> SomeTypeRep -> b) ->
            Dyn -> Dyn -> b
withDyns f def (Dyn ty1 v1) (Dyn ty2 v2) = case eqTypeRep ty1 ty2 of
  Nothing -> def (SomeTypeRep ty1) (SomeTypeRep ty2)
  Just HRefl -> f v1 v2

Here, SomeTypeRep is the existential wrapper over arbitrary TypeRep a.

Note that the forall really matters. The type forall a. Ord a => a -> a -> b means a function shall accept any type a that satisfies Ord, and it produces a value of type b that cannot possibly be related to a and that’s determined at the function call site.

Given the above, we can finally write an equality check and an ordering function:

instance Eq Dyn where
  (==) = withDyns (==) (\_ _ -> False)

instance Ord Dyn where
  compare = withDyns compare compare

Here we rely on the SomeTypeReps also being comparable, so we just pass compare again as the fallback function.

Done. Problem solved.

Generalizing

But it’d be a shame to not generalize this: note that Dyn, toDyn, withDyns don’t really rely on the particular Ord constraint, so we can turn the ConstraintKinds extension on and generalize by parametrizing Dyn with the specific set of constraints that we need:

data Dyn ctx where
  Dyn :: ctx a => TypeRep a -> a -> Dyn ctx

toDyn :: (Typeable a, ctx a) => a -> Dyn ctx
toDyn val = Dyn (typeOf val) val

withDyns :: (forall a. ctx a => a -> a -> b) ->
            (SomeTypeRep -> SomeTypeRep -> b) ->
            Dyn ctx -> Dyn ctx -> b
withDyns (Dyn ty1 v1) (Dyn ty2 v2) f def = case eqTypeRep ty1 ty2 of
  Nothing -> def (SomeTypeRep ty1) (SomeTypeRep ty2)
  Just HRefl -> f v1 v2

So Dyn Ord is the type equivalent to the unparametrized one we wrote earlier, while, say, Dyn Monoid would allow us to store arbitrary monoids and do something monoidal with them.

The instances for our new generic Dyn would then be

instance Eq (Dyn Eq) where
  (==) = withDyns (==) (\_ _ -> False)

instance Ord (Dyn Ord) where
  compare = withDyns compare compare

…except that it doesn’t work. The type checker does not know that Dyn Ord also implements Eq, so we’ll have to duplicate the corresponding parts of the hierarchy:

instance Eq (Dyn Eq) where
  (==) = withDyns d1 d2 (==) (\_ _ -> False)

instance Eq (Dyn Ord) where
  (==) = withDyns d1 d2 (==) (\_ _ -> False)

instance Ord (Dyn Ord) where
  compare = withDyns d1 d2 compare compare

although in our particular case we don’t really have Dyn Eq anywhere so it could be dropped.

Done. This time for real.

…although looks like in modern Haskell we could make the type checker derive the instances of the form

instance C_i (Dyn (C_1, ..., C_n)) where
  ...

automatically, although it’d require some further investigation.

One more thing to note is that our Dyn looks awfully lot like a dependent pair (ty : Type ** val : ty) from dependently typed languages. The major difference is that in the latter languages one cannot really match on the type (at least, in the languages that I’m aware of), because parametricity. Haskell, on the other hand, allows us to have something resembling a type switch (although not exactly — any function doing the type switch has the Typeable constraint, and this typeclass is the escape hatch that effectively allows us to bypass parametricity while still adhering to it formally).

But, most importantly, we can now have something like Map (Dyn Ord) SomeType and use values of arbitrary types as keys as long as they are comparable (and the type checker will check for us that they are!). For instance, we could use some metrics identifiers in the keys without any prior knowledge about them, but that’s for a next post.