Book review: "Thinking with Types" by Sandy Maguire
There is a shortage of good books on intermediate Haskell and above, and I think it somewhat contributes to the ivory-tower feel of the language. Back when I was getting into C++ a couple of decades ago, there were some pretty well-acclaimed books and even whole series like “Effective C++” by Meyers, “Exceptional C++” by Sutter, and so on. They don’t necessarily teach writing production-level code (and I doubt that kind of coaching is at all possible), but they are good at conveying the language’s philosophy, spirit, and style.
I can’t quickly think of as many Haskell-oriented (or even general FP-oriented) resources, so when I heard about the book “Thinking with Types: Type-Level Programming in Haskell” by Sandy Maguire a few months ago, it really caught my attention — I even got to read it the same year!
Reading this book, though, left me with mixed feelings. On the one hand, I learned a few interesting tricks and corner cases, so the hours I invested in this book were well spent. On the other hand, I think this book has some shortcomings, and, while interesting, it still fails to address the audience it aims at (and, given the first paragraph, the audience I’d like it to aim at).
To summarize my feedback: I think the best title would be “A Few Case Studies in the Intricacies of GHC’s Type System.” Hence, the audience benefitting the most is the folks who already have a decent grasp of type-level computations in other, more expressive languages (like Idris), or who have read a book or two on type theory. Why would they want a book like this, then? Well, eventually, one needs to solve real-world problems with the production-ready compiler and mature ecosystem that GHC provides, instead of recursing infinitely into yak-shaving (as in Idris). So, this book is quite good at showing by example how these advanced abstract type-level things can be leveraged in a production-level language.
This book would have been a reasonable 9/10 with a title like “A Few Case Studies…”. But with the title it actually has and the target audience it claims, it’s more like 6 or 7 out of 10.
For a more detailed review, follow the spoiler!
The good
Alright, let’s start with the good parts.
First and foremost, even though I’ve been doing Haskell for some time now, I learned quite a few things from this book, ranging from tiny bits to things that might affect the design of the code I write.
For instance, I’ve spent a couple of months writing a defunctionalizer for a System F-like language in one of my projects, and it was… let’s say it was painful. Thus, the chapter on first-class type families, where Maguire shows how defunctionalization applies in the specific case of Haskell’s type-level thingies, was particularly interesting!
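For flavor: the encoding represents type-level functions as inert labels and gives them meaning through a single open Eval family. A sketch in the spirit of the book and the first-class-families package (reconstructed from memory, so details may differ):

{-# LANGUAGE DataKinds, PolyKinds, TypeFamilies, TypeOperators, UndecidableInstances #-}

import Data.Kind (Type)

-- A "first-class" type-level function returning a is just a label...
type Exp a = a -> Type

-- ...and this open family is the interpreter giving labels meaning.
type family Eval (e :: Exp a) :: a

-- A defunctionalized "snd": an empty data type plus an Eval equation.
data Snd :: (a, b) -> Exp b
type instance Eval (Snd '(a, b)) = b

-- Labels can be passed around unapplied, e.g. to a type-level map.
data Map :: (a -> Exp b) -> [a] -> Exp [b]
type instance Eval (Map f '[]) = '[]
type instance Eval (Map f (x ': xs)) = Eval (f x) ': Eval (Map f xs)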
The chapter on open sums and products also got me thinking about implementing them without unsafeCoerce but still in O(1) time. This was also quite an enlightening mental exercise!
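For context: the book’s open sum is, roughly, an Int tag recording which member of a type-level list we hold, next to a payload with its type erased, and that erasure is exactly where unsafeCoerce comes in. A sketch from memory (names and details may differ from the book’s version):

{-# LANGUAGE DataKinds, ExistentialQuantification, FlexibleContexts, PolyKinds,
             ScopedTypeVariables, TypeApplications, TypeFamilies,
             TypeOperators, UndecidableInstances #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.TypeLits (type (+), KnownNat, Nat, natVal)
import Unsafe.Coerce (unsafeCoerce)

-- O(1) by construction: an Int tag saying which member of ts we hold,
-- plus the payload with its type forgotten.
data OpenSum (f :: k -> Type) (ts :: [k]) =
  forall t. UnsafeOpenSum !Int (f t)

-- The type-level index of t within ts.
type family IndexOf (t :: k) (ts :: [k]) :: Nat where
  IndexOf t (t ': ts) = 0
  IndexOf t (u ': ts) = 1 + IndexOf t ts

inject :: forall t ts f. KnownNat (IndexOf t ts) => f t -> OpenSum f ts
inject = UnsafeOpenSum (fromIntegral (natVal (Proxy @(IndexOf t ts))))

-- Safe only because inject is the sole way to build an OpenSum:
-- matching tags imply matching types.
project :: forall t ts f. KnownNat (IndexOf t ts) => OpenSum f ts -> Maybe (f t)
project (UnsafeOpenSum tag payload)
  | tag == fromIntegral (natVal (Proxy @(IndexOf t ts))) = Just (unsafeCoerce payload)
  | otherwise = Nothing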
As for the tiny bits: for example, it never occurred to me that instances differing only in the kind:
instance HasPrintf a => HasPrintf ((text :: Symbol) :<< a)
instance HasPrintf a => HasPrintf ((param :: Type) :<< a)
are actually legit. Sure, it makes sense once you think about this since the kind is implicit but also participates in instance resolution, but you have to internalize this.
Or, as another example, in my mental model of GHC’s type checker, it’d complain about a being ambiguous in foo’s type in

type family AlwaysUnit a where
  AlwaysUnit a = ()

foo :: b -> AlwaysUnit a -> b
yet this works! A good example, model adjusted.
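And indeed, a is never pinned down: AlwaysUnit a reduces to () no matter what a is, so both the signature’s ambiguity check and any call site go through. A quick sanity check (my example, not the book’s):

bar :: Bool
bar = foo True ()  -- a is left undetermined, yet this compiles and runs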
Or, as yet another example, I learned about the inspection-testing plugin/library from this book, and it’s a really cool thing!
So, I definitely recommend reading this book or, at the very least, skimming through to iron out such wrinkles.
Moreover, a 5- or 10-year-younger me would have been grateful for this book, since it lays out in an accessible (for the most part) form the knowledge I otherwise had to earn by trial and error.
The bad
The impression a book leaves is mainly about the expectations it sets. For a fiction book, the expectations are about its genre. For a technical book, though, they are about the target audience and the overall depth.
The audience
What does the author say about his target audience? I’ll quote the introduction:
So whom is this book for? The target audience I’ve been trying to write for are intermediate-to-proficient with the language. They’re capable of solving real problems in Haskell, and doing it without too much hassle. They need not have strong opinions on ExceptT vs throwing exceptions in IO, nor do they need to know how to inspect generated Core to find performance bottlenecks.
But the target reader should have a healthy sense of unease about the programs they write. They should look at their comments saying “don’t call this function with n = 5 because it will crash,” and wonder if there’s some way to teach the compiler about that. The reader should nervously eyeball their calls to error that they’re convinced can’t possibly happen, but are required to make the type-checker happy.
In short, the reader should be looking for opportunities to make less code compile. This is not out of a sense of masochism, anarchy, or any such thing. Rather, this desire comes from a place of benevolence—a little frustration with the type-checker now is preferable to a hard-to-find bug making its way into production.
These are the words I can stand by!
But, suppose I’ve joined a Haskell shop, where my coworkers are excellent at your typical enterprise programming. Let’s say my coworker Jason is a pro at marshaling JSONs from one service to another, figuring out business requirements, negotiating key objectives with stakeholders, and all that jazz. We all want to write more type-safe code, but maybe Jason and his team lack experience with the riot of GHC extensions. Am I comfortable just referring Jason and friends to this book, leaving them for a while, and then coming back a couple of weeks later, expecting them to have become InjectiveTypeFamilies masters, DataKinds gurus, RankNTypes ninjas, GADTs wizards?
Unfortunately, no, and for several reasons.
Shifting gears
Different chapters (and sometimes different sections within a chapter) feel like they were written by different people, or at least in very different moods. Unfortunately, the chapters describing the more difficult ideas are written in the more hurried style.
As perhaps the best example, Maguire spends about the same number of pages on type variable scoping (which really amounts to “use forall and enable ScopedTypeVariables”) as on encoding dependent pairs with singletons. Setting aside the example code (there’s just inherently more code about singletons), the textual description of the singletons machinery is, if anything, even shorter than that of the scoping stuff.
But the whole purpose of human-readable books, papers, blog posts, you name it, is to communicate the ideas, the intent, and perhaps the intuition behind things. There’s plenty of code already in the singletons package itself!
I get it; it’s the last chapter, and a proper text on singletons deserves its own book (or at least a few lengthy blog posts, like the ones Justin Le wrote half a decade ago). But I find singletons to be the most obscure part of the Haskell type-level culture, even though I have some experience with fully dependently typed languages. I don’t know if a chapter this short actually helps to build those ideas and intuitions.
The ivory tower
There are a few places where that JSON-happy Jason would stumble or, more likely, where the author just shows off. This is a problem with many tutorials and posts aimed at the average Jason, so it’s worth spelling out more thoroughly.
For example, the whole first chapter is dedicated to the algebra of types, and it contains a couple of statements that are incomplete at best and misleading at worst; the whole chapter made me feel uneasy.
Maguire claims that any two types with the same cardinality are always isomorphic. On the surface, it makes sense, but the devil’s in the details. What about types of infinite cardinality? [a] and Either [a] [a] are both countably infinite, but are they isomorphic? What do you even mean by isomorphism here? Are Sum Int and Product Int isomorphic despite having deliberately different Semigroup instances?
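For concreteness, the isomorphism in question is the usual pair of mutually inverse total functions, something like this sketch (my formulation):

-- Evidence that a and b are isomorphic: to and from must compose
-- to identity in both directions.
data Iso a b = Iso
  { to   :: a -> b
  , from :: b -> a
  }

-- E.g., Bool has cardinality 2, and so does Either () ():
boolIso :: Iso Bool (Either () ())
boolIso = Iso t f
  where
    t False = Left ()
    t True  = Right ()
    f (Left  ()) = False
    f (Right ()) = True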
Moreover, Maguire claims that for each type there exists a canonical sum-of-products representation, presumably to prepare the reader for the machinery behind generics. For instance, he explicitly says that (a, Bool) is not in its canonical representation. But if you actually do
data Foo a = Foo (a, Bool) deriving (Generic)
and check Rep (Foo a), you’ll see that GHC doesn’t do any canonicalization. Hence, this whole exercise (if not the whole chapter, and that is the first chapter) does little to help the reader understand generics.
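You can see it for yourself in GHCi (output paraphrased, with the metadata arguments elided as underscores; the real output is noisier):

-- ghci> :set -XDeriveGeneric
-- ghci> import GHC.Generics
-- ghci> data Foo a = Foo (a, Bool) deriving (Generic)
-- ghci> :kind! Rep (Foo Int)
-- ...roughly: D1 _ (C1 _ (S1 _ (Rec0 (Int, Bool))))
-- i.e., a single field of type (Int, Bool), not Rec0 Int :*: Rec0 Bool.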
I think it’d be more educational to say that different constructors are represented by an Either-like :+:, and the fields of each constructor are represented by a (,)-like :*:. That’s it: no need for isomorphisms, cardinality, canonical representations, tricky questions about the canonical representation of lists, and so on. No semantics, just pure syntax.
As another example, in the chapter on existential types, Maguire shows how to implement the ST monad with the ST trick using rank-N types, and he casually says

GHC calls s a rigid skolem type variable. […] A skolem is, for all intents and purposes, any existential type.¹
with a footnote
¹Mathematically, it’s an existentially quantified (∃) variable expressed in terms of a forall quantifier (∀). Since in Haskell we have only forall quantifiers, all existential variables are necessarily skolems.
Yeah, I’ve done my share of exercises on Church/Boehm-Berarducci/etc. encodings in λC, so I know what he’s talking about, but only because I knew it already. Do you expect our Jason, who is
capable of solving real problems in Haskell without too much hassle
but who actually needs this book, to say, “a-ha, that makes sense!”? Or, rather, “WTF is he talking about here? Screw these skolems-golems and those existentials-credentials; GHC errors suck.”
Overengineering and correctness
Some other things are just incorrect.
The chapter on dependent types gives a singletons example of controlling at runtime whether logging happens. Why not a simple Bool and a branch? Well, quoting the author:
While one approach to this problem is to simply branch at runtime depending on whether logging is enabled, this carries with it a performance impact. Instead, we can conditionally choose our monad stack depending on a runtime value.
Oh, performance! That’s my shtick. Let’s take a closer look.
So, the end result looks like
main :: Bool -> IO ()
main bool = do
  withSomeSBool (toSBool bool) $ \(sb :: SBool b) ->
    case dict @MonadLogging sb of
      Dict -> runLogging @b program
where MonadLogging is indexed by the singletonized Bool, and dict retrieves the MonadLogging dictionary associated with the required instance, which is then used by runLogging :: MonadLogging b => ….
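For context, since dict and Dict do all the work here: Dict reifies a constraint as a value, and dict recovers the instance matching whichever singleton we hold at runtime. A sketch of the Bool-specialized version (the book generalizes it; I’m inlining SBool to keep this self-contained):

{-# LANGUAGE ConstraintKinds, DataKinds, GADTs #-}

import Data.Kind (Constraint)

-- The Bool singleton, as provided by the singletons library.
data SBool (b :: Bool) where
  STrue  :: SBool 'True
  SFalse :: SBool 'False

-- Holding a Dict c means holding the instance dictionary for c.
data Dict (c :: Constraint) where
  Dict :: c => Dict c

-- Given instances at both indices, matching on the singleton
-- reveals which one applies to the runtime value of b.
dict :: (c 'True, c 'False) => SBool b -> Dict (c b)
dict STrue  = Dict
dict SFalse = Dict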
The problem is that now the compiler cannot monomorphize runLogging and inline the calls to MonadLogging methods since, well, it doesn’t know at compile time which methods to call. Instead, the compiler literally takes the dictionary returned by dict and passes it as a (hidden) parameter to runLogging, which, in turn, resorts to indirect calls through a pointer (for C++-inclined folks, that’s like being unable to devirtualize a virtual function call).
This means a jump for every logging call, and, what’s worse, no inlining and no subsequent optimizations. Even if we’ve chosen no logging for a particular run, we’ll still have to pay the price of passing around the dictionary, saving the registers that need to be saved before the call, jumping to the right function, executing it, restoring the registers, etc. If we instead had written a simple
main :: Bool -> IO ()
main bool = case bool of
  True  -> runLogging @True program
  False -> runLogging @False program
all this could have been inlined and thus avoided.
Curiously, that’s precisely the problem I faced some time ago when playing around with wc, and I ended up doing roughly what I suggest here: generating all the possible branches so that the compiler could see (and inline!) the right dictionary.
So, Haskell-style “dependent types” here are both an overcomplication and a pessimization.
As an example of either another overcomplication or my stupidity, consider the proposed structured logging, where we might want to log values of different types. Maguire says
While an existentialized GADT approach might suffice—where we stuff proofs of each typeclass into the datatype—it requires us to know all of our desired dictionaries in advance. Such an approach isn’t particularly extensible, since it requires us to anticipate everything we might like to do with our log.
and proposes a singletonized enumeration of possible log types along with an associated data family:
singletons [d|
  data LogType
    = JsonMsg
    | TextMsg
    deriving (Eq, Ord, Show)
  |]
data family LogMsg (msg :: LogType)
What do the instances look like? Generally, they have the shape of
data instance LogMsg 'JsonMsg = Json Value
  deriving (Eq, Show)
Then, we also need to sort-of-explicitly-forall-quantify any constraint over the LogMsg a:
instance ( c (LogMsg 'JsonMsg)
         , c (LogMsg 'TextMsg)
         ) => Dict1 c LogMsg where
  dict1 SJsonMsg = Dict
  dict1 STextMsg = Dict
because otherwise GHC can’t see that the constraint is total over LogType, and we’ll have a bad time later.
Now, suppose we want to add a new log type. We need to:
- add a new member to the LogType enum,
- then not forget to add a new data instance,
- and also not forget to add a new constraint and case to Dict1.
What if we want to add a new class (say, Read)?
- We either go through all the data instances and add the corresponding deriving clause,
- or, if we’re defining a new class of our own, we write the corresponding instances for all LogTypes,
- or, if we’re “connecting” a third-party class with this logging thing while pretending the latter is also third-party, we resort to orphan instances.
The upside is that it’s supposedly easy to extract messages of one particular type:
catSigmas
  :: forall k (a :: k) f
   . (SingI a, SDecide k)
  => [Sigma f]
  -> [f a]
catSigmas = mapMaybe fromSigma

jsonLogs :: [LogMsg 'JsonMsg]
jsonLogs = catSigmas logs
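For reference, Sigma is the book’s dependent pair: a singleton pinning down the type index, stored next to a value indexed by it, and fromSigma succeeds only when the stored index matches the requested one. Roughly (reconstructed from memory; the singletons module names vary a bit between versions):

{-# LANGUAGE DataKinds, GADTs, PolyKinds, ScopedTypeVariables, TypeApplications #-}

import Data.Kind (Type)
import Data.Singletons (Sing, SingI (..))
import Data.Singletons.Decide (Decision (..), SDecide (..))
import Data.Type.Equality ((:~:) (Refl))

data Sigma (f :: k -> Type) where
  Sigma :: Sing a -> f a -> Sigma f

-- Compare the stored index against the requested one; on a match,
-- the Refl proof lets us return the payload at the right type.
fromSigma
  :: forall k (a :: k) (f :: k -> Type)
   . (SingI a, SDecide k)
  => Sigma f
  -> Maybe (f a)
fromSigma (Sigma s f) =
  case s %~ sing @a of
    Proved Refl -> Just f
    Disproved _ -> Nothing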
Now, suppose we throw away all this singletons stuff and instead use a plain old ADT:
data LogMessage
  = LMJson Value
  | LMText String
  deriving (Eq, Show)
Adding a new log type boils down to adding a new member to this ADT. No need to write data instances, no need to update Dict1s. Adding a new class instance is also the same as (or easier than) in the original solution: we have the same options, except we need to modify just one deriving clause instead of O(|LogMessage|) of them.
Extracting just one type of message is also straightforward. Even without resorting to Template Haskell, it’s a matter of
toJsonLog :: LogMessage -> Maybe Value
toJsonLog (LMJson value) = Just value
toJsonLog _ = Nothing
jsonLogs = mapMaybe toJsonLog logs
but if you’re open to TH (and using singletons means you are), just grab any lens library, which will generate the right prisms for you.
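For instance, with the lens package (a sketch; Value is aeson’s JSON type, as above):

{-# LANGUAGE TemplateHaskell #-}

import Control.Lens
import Data.Aeson (Value)

data LogMessage
  = LMJson Value
  | LMText String
  deriving (Eq, Show)

-- Generates _LMJson :: Prism' LogMessage Value
-- and _LMText :: Prism' LogMessage String.
makePrisms ''LogMessage

jsonLogs :: [LogMessage] -> [Value]
jsonLogs = toListOf (folded . _LMJson)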
So, what do singletons give us here? What kind of extensibility do they enable or make easier? It’s not obvious to me, but it seems like they actually make things worse. Maybe it’s overengineering, or maybe I’m just stupid.
But, to be fair, I’ve also quite overcomplicated a part of one of my projects literally the previous weekend, so perhaps it’s a natural instinct of somebody who just loves to code when they get a tool powerful enough. And dependent types are hella powerful!
The minor stuff
I’m still unsure why the ST trick belongs to the chapter on existential types rather than the (previous) one on rank-N types.
But that’s minor stuff, indeed. Other minor stuff includes some typos, a couple of places where the text obviously describes some previous version of the code that differs from what ended up in the final version of the book, and so on.
Again, that’s all minor and editorial, but when I buy a technical book, I expect to pay, among other things, for the editor’s efforts to make the exposition and flow as smooth as possible to simplify focusing on the ideas.
The ugly
Some issues follow from the static nature of books (even e-books!). This book was released in 2018, and the world has moved on a little since then.
For instance, a couple of pages are spent on how to do indexed monads with do-notation. But it looks like the same effect could be achieved much more simply with QualifiedDo, introduced in GHC 9.0 (which was released in 2021).
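A sketch of what I mean (the indexed monad, the Ix module, and the door example are all mine, not the book’s): put the indexed operations in a module under the standard names, and QualifiedDo routes the do-notation through them.

-- Ix.hs: a minimal indexed monad over IO, with bind and then exported
-- under the standard names so that "Ix.do" picks them up.
{-# LANGUAGE PolyKinds #-}
module Ix where

import Prelude hiding ((>>), (>>=))
import qualified Prelude

newtype IxIO i j a = IxIO { runIxIO :: IO a }

(>>=) :: IxIO i j a -> (a -> IxIO j k b) -> IxIO i k b
IxIO m >>= f = IxIO (m Prelude.>>= (runIxIO . f))

(>>) :: IxIO i j a -> IxIO j k b -> IxIO i k b
m >> n = m >>= \_ -> n

-- Main.hs: the index tracks a protocol state, here a door.
{-# LANGUAGE DataKinds, QualifiedDo #-}

import Ix (IxIO (..))
import qualified Ix

data DoorState = Opened | Closed

open :: IxIO 'Closed 'Opened ()
open = IxIO (putStrLn "open")

close :: IxIO 'Opened 'Closed ()
close = IxIO (putStrLn "close")

main :: IO ()
main = runIxIO $ Ix.do
  open
  close  -- swapping these two statements is now a type error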
Or, both the ST trick and some applications of indexed monads are now accompanied by the recently added linear types, documentation on which is still scarce. So, having a chapter or two on those would be great.
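For a taste of the syntax, in case it’s new: the multiplicity-annotated arrow promises its argument is consumed exactly once (my toy example, requires GHC 9.0+):

{-# LANGUAGE LinearTypes #-}

-- Fine: each component of the pair is consumed exactly once.
swapL :: (a, b) %1 -> (b, a)
swapL (x, y) = (y, x)

-- Rejected by the linearity checker: x would be consumed twice.
-- dupL :: a %1 -> (a, a)
-- dupL x = (x, x)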
Maybe a wiki-like format is much better suited for something like this. But how can a wiki-based book be monetized? Dunno.
Anyway, all this applies to any book on a subject as fluid as programming, so it’s just something to be aware of.
tl;dr
Ugh, that came out harsh, I guess. Nevertheless, I recommend getting and reading this book, just to different folks than the ones the book’s own introduction aims at.
“A Few Case Studies in the Intricacies of GHC’s Type System” would have been a better title, much closer to what the book really is about. Shaving off a couple of chapters and expanding the non-trivial ones some more would also be very beneficial.
The book that actually teaches one to think in types is, IMO, either “Type-Driven Development with Idris,” if one’s looking for the coding side of things, or a mix of Pierce’s TAPL and Nederpelt and Geuvers’s “Type Theory and Formal Proof,” if one’s more inclined to learn the theory. In addition, TDDwI is indeed a very good resource on, well, type-driven development: leveraging the types to arrive at the right solution effectively.