Statically safe dynamic typing à la Python
One of my hobby projects includes a long-running service, so it’d be nice if the service provided some
metrics (say, using the ekg
library) to the outside world for monitoring and alerts.
As a consequence, the service needs an internal metrics storage that encapsulates all things
related to creating them as needed, updating them, and so on.
Writing a metrics storage (especially on top of ekg
) is trivial,
but one cannot just solve a problem when doing recreational programming.
You’ve got to abstract things away, generalize, and then abstract further and generalize further.
So, quite soon I found myself writing an extensible and customizable storage
supporting unknown metrics of unknown types in such a way
that new metrics could be added in different modules without touching any existing definitions.
This deserves a post or two on its own, but today we’ll consider just a tiny part of the solution:
writing a type-safe wrapper over types that are only known at runtime.
So, yeah, something like dynamic typing but with static guarantees that we don’t do any nonsense.
I don’t think this short post will reveal anything new for seasoned haskellers, but at least we’ll get this little part done and out of our way in our next articles about the storage itself. Or I could be less shy and claim instead that I created a new programming pattern.
Problem statement
Anyway, first things first. Let’s spell out what problem we are trying to solve. So, we want to be able to associate some objects (whose types aren’t known before the runtime) with some values of some other type (which we don’t use so we don’t care about). In other words, we want objects of more or less arbitrary (and different) types to be the keys in the same associative container.
Of course, if we are talking about associative containers, we still need to be able to perform
at least some operations. Thus, we’ll stick to a tree-based map
and require that those (apriori unknown) types support ordering (as in the Ord
class).
Alternatively, we could go with a hashmap, but it’d make the task a bit less challenging.
Part 0: existentials
When it comes to the notion of a value of some unknown type satisfying some (known) constraints,
existential types are typically used to capture it. For instance, if we want to have a container
for some value of unknown type that implements the Ord
class, we could have this:
data SomeOrd where
MkSomeOrd :: Ord a => a -> SomeOrd
Then, if we have a value of type SomeOrd
and we pattern-match on it:
foo :: SomeOrd -> Bar
foo (MkSomeOrd val) = ... (1) ...
then we know at (1)
that the type of val
implements Ord
,
even though we don’t know the exact type of val
.
And, perhaps more importantly, the typechecker also knows that val
implements Ord
,
so it allows us to use the methods of Ord
.
Sort of. There’s a gotcha.
Let’s take a closer look and try to compare two values of unknown types:
tryCompare :: SomeOrd -> SomeOrd -> Bool
tryCompare (MkSomeOrd val1) (MkSomeOrd val2) = ?
Ord
methods require their arguments to have the same type,
but this is not necessarily the case for val1
and val2
,
so we still cannot compare them.
Thus our SomeOrd
is virtually useless. What do we do then?
TypeRep
and type equality
Despite Haskell being a compiled language with aggressive type erasure, the compiler can still
generate the runtime witnesses for the types (if asked).
The witness for a type a
is a value of the type TypeRep a
(from the Data.Typeable
module),
and the way to ask the compiler to generate it is to use the Typeable
class.
Moreover, there is a special function eqTypeRep
that, given two different values
rep1 :: TypeRep a, rep2 :: TypeRep b
, checks if they actually represent the same type.
If they indeed do, then it produces a proof that could be used to persuade the typechecker
in this fact. Let’s see how.
Firstly, let’s look at its type:
eqTypeRep :: forall k1 k2 (a :: k1) (b :: k2). TypeRep a -> TypeRep b -> Maybe (a :~~: b)
What the hell does this mean?
- Of course, first of all, it means that
eqTypeRep
is a function. - Then, it’s a polymorphic function, but not just types-polymorphic but also kinds-polymorphic,
and that’s the
forall k1 k2 (a :: k1) (b :: k2)
part that shows this. Theoretically, this means that we could do some cool stuff with promoted types (as inDataKinds
), but I haven’t got to playing around with that yet. eqTypeRep
accepts two runtime witnesses of potentially different types,TypeRep a
andTypeRep b
for typesa
andb
respectively.eqTypeRep
produces a value of the typeMaybe (a :~~: b)
. And this is where the typechecker magic happens. If the types do not match, the function just returnsNothing
. But if they do match, then it returnsJust val
, whereval
is of typea :~~: b
.
So let’s now move on to the :~~:
type. It’s defined as
-- | Kind heterogeneous propositional equality. Like ':~:', @a :~~: b@ is
-- inhabited by a terminating value if and only if @a@ is the same type as @b@.
--
-- @since 4.10.0.0
data (a :: k1) :~~: (b :: k2) where
HRefl :: a :~~: a
Now assume we have a value of the type a :~~: b
. The only way to construct such a value
is to use the HRefl
constructor (since this is the only constructor of the type), but this
constructor requires the same type on both sides of :~~:
. Thus a
and b
indeed coincide.
Moreover, if we pattern-match on val :: a :~~: b
(via its sole constructor),
the typechecker will also be aware of this fact.
The bottom line is that eqTypeRep
indeed returns a proof of two types being equal
if they are in fact equal.
Digression: it’s a matter of trust
Except I lied. Nothing stops you from writing something like
wrong :: Int :~~: String
wrong = wrong -- yay infinite recursion
in Haskell, or
wrong :: Int :~~: String
wrong = undefined
and use these functions as if they were legitimately defined. There are a few other ways to break the system, and all this is just an instance of a somewhat well-known statement that Haskell is inconsistent as a logic. Languages with more powerful type systems (think Agda or Idris) won’t type check the above definitions (or, at least, they’ll disallow using them in contexts where the logical consistency matters). And yet another way to look at this is as a manifestation of the special bottom value ⊥ inhabiting every type in Haskell: the functions above are actually equivalent to the ⊥.
Anyway, this is precisely why the comment from the quoted definition of :~~:
mentions
terminating values.
Despite all the inconsistencies, the wrong
functions above don’t produce a terminating value.
This note allows us to get back some peace of mind, although with weaker guarantees:
that is, if our Haskell program terminates, then its output is consistent with its types.
There are a couple of issues related to the evaluation strategy, though, but let’s disregard that.
By the way, another manifestation of the weakness of Haskell’s type system is the type of eqTypeRep
.
In a stronger language with more expressive types it could have returned
either a proof of type equality if the types match (as it does now), or a proof that the type equality
leads to Void
… err, I meant absurdity (instead of the boring Nothing
that it returns now).
On the other hand, this only starts to matter when the language is used to actually prove some theorems,
and Haskell isn’t really used in this role, so nobody cares.
Anyway, that’s enough of logic and type theory, let’s get back to our metrics.
Witness-carrying existential wrapper
Next, we draw the rest of the owl the discussion above hints that we should just carry along the
TypeRep
in our existential type, which leads us to the following definition:
data Dyn where
Dyn :: Ord a => TypeRep a -> a -> Dyn
toDyn :: (Typeable a, Ord a) => a -> Dyn
toDyn val = Dyn (typeOf val) val
Let’s now write a function that accepts the following:
- Two values of the type
Dyn
. - A function that, given two values of any type satisfying the constraints of
Dyn
, produces a value of some predefined type (determined by the caller). This function will be called when the types of theDyn
values match. - A fallback function that’s called instead of the previous one if the types don’t match.
All this results in the following function:
withDyns :: (forall a. Ord a => a -> a -> b) ->
(SomeTypeRep -> SomeTypeRep -> b) ->
Dyn -> Dyn -> b
withDyns f def (Dyn ty1 v1) (Dyn ty2 v2) = case eqTypeRep ty1 ty2 of
Nothing -> def (SomeTypeRep ty1) (SomeTypeRep ty2)
Just HRefl -> f v1 v2
Here, SomeTypeRep
is the existential wrapper over arbitrary TypeRep a
.
Note that the forall
really matters.
The type forall a. Ord a => a -> a -> b
means a function shall accept any type a
that satisfies Ord
,
and it produces a value of type b
that cannot possibly be related to a
and that’s determined at the function call site.
Given the above, we can finally write an equality check and an ordering function:
instance Eq Dyn where
(==) = withDyns (==) (\_ _ -> False)
instance Ord Dyn where
compare = withDyns compare compare
Here we rely on the SomeTypeRep
s also being comparable,
so we just pass compare
again as the fallback function.
Done. Problem solved.
Generalizing
But it’d be a shame to not generalize this: note that Dyn, toDyn, withDyns
don’t really rely
on the particular Ord
constraint, so we can turn the ConstraintKinds
extension on and generalize
by parametrizing Dyn
with the specific set of constraints that we need:
data Dyn ctx where
Dyn :: ctx a => TypeRep a -> a -> Dyn ctx
toDyn :: (Typeable a, ctx a) => a -> Dyn ctx
toDyn val = Dyn (typeOf val) val
withDyns :: (forall a. ctx a => a -> a -> b) ->
(SomeTypeRep -> SomeTypeRep -> b) ->
Dyn ctx -> Dyn ctx -> b
withDyns (Dyn ty1 v1) (Dyn ty2 v2) f def = case eqTypeRep ty1 ty2 of
Nothing -> def (SomeTypeRep ty1) (SomeTypeRep ty2)
Just HRefl -> f v1 v2
So Dyn Ord
is the type equivalent to the unparametrized one we wrote earlier,
while, say, Dyn Monoid
would allow us to store arbitrary monoids and do something monoidal with them.
The instances for our new generic Dyn
would then be
instance Eq (Dyn Eq) where
(==) = withDyns (==) (\_ _ -> False)
instance Ord (Dyn Ord) where
compare = withDyns compare compare
…except that it doesn’t work. The type checker does not know that Dyn Ord
also implements Eq
,
so we’ll have to duplicate the corresponding parts of the hierarchy:
instance Eq (Dyn Eq) where
(==) = withDyns d1 d2 (==) (\_ _ -> False)
instance Eq (Dyn Ord) where
(==) = withDyns d1 d2 (==) (\_ _ -> False)
instance Ord (Dyn Ord) where
compare = withDyns d1 d2 compare compare
although in our particular case we don’t really have Dyn Eq
anywhere so it could be dropped.
Done. This time for real.
…although looks like in modern Haskell we could make the type checker derive the instances of the form
instance C_i (Dyn (C_1, ..., C_n)) where
...
automatically, although it’d require some further investigation.
One more thing to note is that our Dyn
looks awfully lot like
a dependent pair (ty : Type ** val : ty)
from dependently typed languages.
The major difference is that in the latter languages one cannot really match on the type
(at least, in the languages that I’m aware of), because parametricity. Haskell, on the other hand,
allows us to have something resembling a type switch (although not exactly — any function doing
the type switch has the Typeable
constraint, and this typeclass is the escape hatch that
effectively allows us to bypass parametricity while still adhering to it formally).
But, most importantly, we can now have something like Map (Dyn Ord) SomeType
and
use values of arbitrary types as keys as long as they are comparable
(and the type checker will check for us that they are!).
For instance, we could use some metrics identifiers in the keys without any prior knowledge about them,
but that’s for a next post.