Book review: "Thinking with Types" by Sandy Maguire

September 24, 2023

There is a shortage of good books on intermediate Haskell and above, and I think it somewhat contributes to the ivory tower feel of the language. Back when I was getting into C++ a couple of decades ago, there were some pretty well-acclaimed books and even whole series like “Effective C++” by Meyers, “Exceptional C++” by Sutter, and so on. They don’t necessarily teach writing production-level code (and I doubt that kind of coaching is at all possible), but these books are good at building the language philosophy, spirit, and style.

I can’t quickly think of as many Haskell-oriented (or even general FP-oriented) resources, so when I heard about the book “Thinking with Types: Type-Level Programming in Haskell” by Sandy Maguire a few months ago, it really caught my attention — I even got to read it the same year!

That said, reading this book left me with mixed feelings. On the one hand, I learned a few interesting tricks and corner cases, so the hours I invested in it were well-spent. On the other hand, I think this book has some shortcomings, and, while interesting, it still fails to address the audience it aims at (and, given the first paragraph, the audience I’d like it to aim at).

To summarize my feedback, I think the best title would be “A Few Case Studies in the Intricacies of GHC’s Type System.” Hence, the audience benefiting the most is the folks who already have a decent grasp of type-level computations in other, more expressive languages (like Idris) or who have read a book or two on type theory. Why would they want a book like this, then? Well, eventually, one would need to solve real-world problems with the production-ready compiler and mature ecosystem that GHC provides instead of recursing infinitely into yak-shaving (as in Idris). So, this book is quite good at showing by example how these advanced abstract type-level things can be leveraged in a production-level language.

This book would have been a reasonable 9/10 with a title like “A Few Case Studies…”. But with the title it actually has and the target audience it claims to serve, it’s more like 6 or 7 out of 10.

For a more detailed review, follow the spoiler!

The good

Alright, let’s start with the good parts.

First and foremost, even though I’ve been doing Haskell for some time now, I’ve learned quite a few things from this book, ranging from tiny bits to things that might affect the design of the code I write.

For instance, I’ve spent a couple of months writing a defunctionalizer for a System F-like language in one of my projects, and it was… let’s say it was painful. Thus, the chapter on first-class type families, where Maguire shows how defunctionalization applies in the specific case of Haskell’s type-level thingies, was particularly interesting!

The chapter on open sums and products also got me thinking about implementing them without unsafeCoerces but still in O(1) time. This was also quite an enlightening mental exercise!

As for the tiny bits: as an example, it never occurred to me that instances differing only in kind:

instance HasPrintf a => HasPrintf ((text :: Symbol) :<< a)
instance HasPrintf a => HasPrintf ((param :: Type) :<< a)

are actually legit. Sure, it makes sense once you think about it, since the kind is implicit but still participates in instance resolution, but you have to internalize this first.
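Here’s a minimal self-contained sketch (my reconstruction in the spirit of the book’s printf chapter, not its exact code) that GHC happily accepts:

{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import GHC.TypeLits (Symbol)

-- a poly-kinded cons for format descriptors
data (a :: k1) :<< (b :: k2)
infixr 5 :<<

class HasPrintf a -- methods omitted; only instance resolution matters here

-- these two instances differ only in the kind of the first argument of :<<
instance HasPrintf a => HasPrintf ((text :: Symbol) :<< a)
instance HasPrintf a => HasPrintf ((param :: Type) :<< a)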

Or, as another example, in my mental model of GHC’s type checker, it would complain about a being ambiguous in foo’s type in

{-# LANGUAGE TypeFamilies #-}

type family AlwaysUnit a where
  AlwaysUnit a = ()

foo :: b -> AlwaysUnit a -> b
foo b _ = b

yet this works! A good example, model adjusted.

Or, as yet another example, I learned about the inspection-testing plugin/library from this book, and it’s a really cool thing!

So, I definitely recommend reading this book or, at the very least, skimming through it to iron out such wrinkles.

Moreover, a 5- or 10-years-younger me would have been grateful for this book since it lays out in an accessible (for the most part) form the knowledge I otherwise had to earn by trial and error.

The bad

The impression a book makes is mainly a function of the expectations it sets. For a fiction book, the expectations are about its genre. For a technical book, though, they are about the target audience and the overall depth.

The audience

What does the author say about his target audience? I’ll quote the introduction:

So whom is this book for? The target audience I’ve been trying to write for are intermediate-to-proficient with the language. They’re capable of solving real problems in Haskell, and doing it without too much hassle. They need not have strong opinions on ExceptT vs throwing exceptions in IO, nor do they need to know how to inspect generated Core to find performance bottlenecks.

But the target reader should have a healthy sense of unease about the programs they write. They should look at their comments saying “don’t call this function with n = 5 because it will crash,” and wonder if there’s some way to teach the compiler about that. The reader should nervously eyeball their calls to error that they’re convinced can’t possibly happen, but are required to make the type-checker happy.

In short, the reader should be looking for opportunities to make less code compile. This is not out of a sense of masochism, anarchy, or any such thing. Rather, this desire comes from a place of benevolence—a little frustration with the type-checker now is preferable to a hard-to-find bug making its way into production.

These are the words I can stand by!

But, suppose I’ve joined a Haskell shop, where my coworkers are excellent at your typical enterprise programming. Let’s say my coworker Jason is a pro at marshaling JSONs from one service to another, figuring out business requirements, negotiating key objectives with stakeholders, and all that jazz. We all want to write more type-safe code, but maybe Jason and his team lack experience with the riot of GHC extensions. Am I comfortable just referring Jason and friends to this book, leaving them for a while, and then coming back a couple of weeks later, expecting them to become InjectiveTypeFamilies masters, DataKinds gurus, RankNTypes ninjas, GADTs wizards? Unfortunately, no, and for several reasons.

Shifting gears

Different chapters (and sometimes different sections within a chapter) feel like they were written by different people, or at least in very different moods. Unfortunately, the chapters describing the more difficult ideas are written in the more hurried style.

As perhaps the best example, Maguire spends about the same number of pages on type variable scoping (which really amounts to “use forall and enable ScopedTypeVariables”) as on encoding dependent pairs with singletons. Dropping the example code (there’s just inherently more code about singletons), the textual description of the singletons machinery is even shorter than that of the scoping stuff. But the whole purpose of human-readable books, papers, blog posts, you name it, is to communicate the ideas, the intent, and perhaps the intuition behind things. There’s plenty of code in the singletons package itself already!

I get it; it’s the last chapter, and a proper text on singletons deserves its own book (or at least a few lengthy blog posts, like the ones Justin Le wrote half a decade ago). But I find singletons to be the most obscure part of the Haskell type-level culture, even though I have some experience with fully dependently typed languages. I don’t know if a chapter this short actually helps build those ideas and intuitions.

The ivory tower

There are a few places where that JSON-happy Jason would stumble, or, more likely, where the author just shows off. This is a problem with many tutorials and posts aimed at the average Jason, so it’s worth spelling out more thoroughly.

For example, the whole first chapter is dedicated to the algebra of types, and it contains a couple of statements that are at best incomplete and at worst misleading; the whole chapter made me feel uneasy.

Maguire claims that any two types with the same cardinality are always isomorphic. On the surface, it makes sense, but the devil’s in the details. What about types of infinite cardinality? [a] and Either [a] [a] are both countably infinite, but are they isomorphic?

What do you even mean by isomorphism here? Are Sum Int and Product Int isomorphic despite having deliberately different Semigroup instances?
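They certainly contain the same values, but the whole point of these newtypes is to behave differently:

import Data.Monoid (Product (..), Sum (..))

five, six :: Int
five = getSum     (Sum 2     <> Sum 3)     -- the Sum semigroup adds
six  = getProduct (Product 2 <> Product 3) -- the Product one multiplies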

Moreover, Maguire claims that for each type, there exists its canonical sum-of-products representation, presumably to prepare the reader for the machinery behind generics. For instance, he explicitly says that (a, Bool) is not in its canonical representation. But, if you actually do

data Foo a = Foo (a, Bool) deriving (Generic)

and check Rep (Foo a), you’ll see GHC doesn’t do any canonicalization.
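Here’s the gist of what GHCi shows (assuming DeriveGeneric is enabled and GHC.Generics is imported; the output is abridged, with the metadata arguments elided):

λ> :kind! Rep (Foo a)
Rep (Foo a) :: * -> *
= D1 ('MetaData …)
    (C1 ('MetaCons "Foo" …)
       (S1 ('MetaSel …)
          (Rec0 (a, Bool))))

The (a, Bool) field is represented wholesale by a single Rec0; GHC never rewrites it into a :*: of its components.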

Hence, this whole exercise (if not the whole chapter, and that is the first chapter) does little to help the reader understand generics better. I think it’d be more educational to say that different constructors are represented by an Either-like :+:, and the fields within a constructor are represented by a (,)-like :*:. That’s it: no need for isomorphisms, cardinality, canonical representations, tricky questions about the canonical representation of lists, and so on. No semantics, just pure syntax.
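A hypothetical two-constructor type shows all there is to it (again, abridged GHCi output):

data Bar a = MkBar1 a Bool | MkBar2 Char deriving (Generic)

λ> :kind! Rep (Bar a)
Rep (Bar a) :: * -> *
= D1 ('MetaData …)
    (    C1 ('MetaCons "MkBar1" …)
           (S1 … (Rec0 a) :*: S1 … (Rec0 Bool))
     :+: C1 ('MetaCons "MkBar2" …)
           (S1 … (Rec0 Char)))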

As another example, in the chapter on existential types, Maguire shows how to implement the ST monad with the ST trick using rank-N types, and he casually says

GHC calls s a rigid skolem type variable. […] A skolem is, for all intents and purposes, any existential type.¹

with a footnote

¹Mathematically, it’s an existentially quantified (∃) variable expressed in terms of a forall quantifier (∀). Since in Haskell we have only forall quantifiers, all existential variables are necessarily skolems.

Yeah, I’ve done my share of exercises on Church/Böhm-Berarducci/etc. encodings in λC, so I know what he’s talking about, but then, I already knew that. Do you expect our Jason, who is

capable of solving real problems in Haskell without too much hassle

but who actually needs this book, to say, “a-ha, that makes sense!”? Or is it rather going to be, “WTF is he talking about here? Screw these skolems-golems and those existentials-credentials; GHC errors suck.”
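For reference, the place where Jason is most likely to meet a skolem in the wild is the error GHC produces when the ST trick does its job (a minimal sketch):

import Control.Monad.ST (runST)
import Data.STRef (newSTRef)

-- Rejected: the STRef mentions the region variable s, which may not
-- leak out of the forall in runST :: (forall s. ST s a) -> a.
-- GHC complains that the type variable ‘s’ would escape its scope and
-- calls it a “rigid, skolem” type variable.
bad = runST (newSTRef True)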

Overengineering and correctness

Some other things are just incorrect. The chapter on dependent types gives a singletons example to control at runtime whether the logging happens. Why not a simple Bool and a branch? Well, quoting the author:

While one approach to this problem is to simply branch at runtime depending on whether logging is enabled, this carries with it a performance impact. Instead, we can conditionally choose our monad stack depending on a runtime value.

Oh, performance! That’s my shtick. Let’s take a closer look.

So, the end result then looks like

main :: Bool -> IO ()
main bool = do
  withSomeSBool (toSBool bool) $ \(sb :: SBool b) ->
    case dict @MonadLogging sb of
      Dict -> runLogging @b program

where MonadLogging is indexed by the singletonized Bool, and dict retrieves the MonadLogging dictionary associated with the required instance, which is then used by runLogging :: MonadLogging b => ....

The problem is that now the compiler cannot monomorphize runLogging and inline the calls to the MonadLogging methods since, well, it doesn’t know at compile time what methods to call. Instead, what the compiler does is literally take the dictionary returned by dict and pass it as a (hidden) parameter to runLogging, which, in turn, resorts to indirect calls through a pointer (for C++-inclined folks, that’s like being unable to devirtualize a virtual function call).

This means a jump for every logging call, and, what’s worse, no inlining and no subsequent optimizations. Even if we’ve chosen no logging for a particular run, we’ll still have to pay the price of passing around the dictionary, saving the registers that need to be saved before the call, jumping to the right function, executing it, restoring the registers, etc. If we instead had written a simple

main :: Bool -> IO ()
main bool = case bool of
  True -> runLogging @True program
  False -> runLogging @False program

all this could have been inlined and thus avoided.

Curiously, that’s precisely the problem I faced some time ago when playing around with wc, and I ended up doing roughly what I suggest here, generating all the possible branches so that the compiler could see (and inline!) the right dictionary.

So, Haskell-style “dependent types” here are both an overcomplication and a pessimization.

As an example of either another overcomplication or my stupidity, consider the proposed structured logging, where we might want to log messages of different types. Maguire says

While an existentialized GADT approach might suffice—where we stuff proofs of each typeclass into the datatype—it requires us to know all of our desired dictionaries in advance. Such an approach isn’t particularly extensible, since it requires us to anticipate everything we might like to do with our log.

and proposes a singletonized enumeration of possible log types along with an associated data family:

singletons [d|
  data LogType
    = JsonMsg
    | TextMsg
    deriving (Eq, Ord, Show)
  |]

data family LogMsg (msg :: LogType)

What do the instances look like? Generally, they have the shape of

data instance LogMsg 'JsonMsg = Json Value
  deriving (Eq, Show)

Then, we also need to sort-of-explicitly-forall-quantify any constraint over LogMsg a:

instance ( c (LogMsg 'JsonMsg)
         , c (LogMsg 'TextMsg)
         ) => Dict1 c LogMsg where
  dict1 SJsonMsg = Dict
  dict1 STextMsg = Dict

because otherwise GHC can’t see that the constraint is total over LogType, and we’ll have a bad time later.

Now, suppose we want to add a new log type (all three steps are sketched right after this list). We need to:

  1. add a new member to the LogType enum,
  2. and then don’t forget to add a new data instance,
  3. and also don’t forget to add a new constraint and case to Dict1.
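Here’s what that looks like for a hypothetical new 'FancyMsg type carrying, say, a Double:

singletons [d|
  data LogType
    = JsonMsg
    | TextMsg
    | FancyMsg          -- (1) the new enum member
    deriving (Eq, Ord, Show)
  |]

data instance LogMsg 'FancyMsg = Fancy Double  -- (2) the new data instance
  deriving (Eq, Show)

instance ( c (LogMsg 'JsonMsg)
         , c (LogMsg 'TextMsg)
         , c (LogMsg 'FancyMsg)                -- (3) the new constraint…
         ) => Dict1 c LogMsg where
  dict1 SJsonMsg  = Dict
  dict1 STextMsg  = Dict
  dict1 SFancyMsg = Dict                       -- …and the new case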

What if we want to add a new class (say, Read)?

  1. We either go through all the data instances and add the corresponding deriving clause,
  2. or, if we’re defining the new class of ours, we need to write the corresponding instances for all LogTypes,
  3. or, if we’re “connecting” a third-party class with this logging thing (pretending the logging thing is now also third-party), we resort to orphan instances.

The upside is that it’s supposedly easy to just filter out a particular type of messages:

catSigmas
  :: forall k (a :: k) f
   . (SingI a, SDecide k)
  => [Sigma f]
  -> [f a]
catSigmas = mapMaybe fromSigma

jsonLogs :: [LogMsg 'JsonMsg]
jsonLogs = catSigmas logs

Now, suppose we throw away all this singletons stuff, and instead do a plain old ADT:

data LogMessage
  = LMJson Value
  | LMText String
  deriving (Eq, Show)

Adding a new log type boils down to adding a new member to this ADT. No need to write data instances, no need to update Dict1s. Adding a new class instance is also the same as (or easier than) in the original solution: we have the same options, except we need to modify just one deriving clause instead of O(|LogMessage|) of them.

Extracting just one type of messages is also straightforward. Even without resorting to Template Haskell, it’s a matter of

toJsonLog :: LogMessage -> Maybe Value
toJsonLog (LMJson value) = Just value
toJsonLog _ = Nothing

jsonLogs = mapMaybe toJsonLog logs

but if you’re open to TH (and using singletons means you are), just go grab any lens library, which will generate the right prisms for you.
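For example, with lens (a sketch assuming the LogMessage ADT above and logs :: [LogMessage]):

{-# LANGUAGE TemplateHaskell #-}

import Control.Lens (makePrisms, preview)
import Data.Aeson (Value)
import Data.Maybe (mapMaybe)

makePrisms ''LogMessage
-- generates _LMJson :: Prism' LogMessage Value
--       and _LMText :: Prism' LogMessage String

jsonLogs :: [Value]
jsonLogs = mapMaybe (preview _LMJson) logs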

So, what do singletons give us here? What kind of extensibility do they enable or make easier? It’s not obvious to me, but it seems like they actually make things worse. Maybe it’s overengineering, or maybe I’m just stupid.

But, to be fair, I’ve also quite overcomplicated a part of one of my projects literally the previous weekend, so perhaps it’s a natural instinct of somebody who just loves to code when they get a powerful enough tool. And dependent types are hella powerful!

The minor stuff

I’m still unsure why the ST trick belongs in the chapter on existential types rather than the (previous) one on rank-N types.

But that’s minor stuff, indeed. Other minor stuff includes some typos, a couple of places where the text obviously describes some previous version of the code that differs from what ended up in the final version of the book, and so on.

Again, that’s all minor and editorial, but when I buy a technical book, I expect to pay, among other things, for the editor’s efforts to make the exposition and flow as smooth as possible, so that the reader can focus on the ideas.

The ugly

Some issues follow from the static nature of books (even e-books!). This book was released in 2018, and the world has moved on a little since then.

For instance, a couple of pages are spent on making indexed monads work with the do-notation. But it looks like the same effect could be achieved much more simply with QualifiedDo, introduced in GHC 9.0 (released in 2021). Or, both the ST trick and some applications of indexed monads are now accompanied by the recently added linear types, documentation on which is still scarce. So, having a chapter or two on those would be great.
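To illustrate the former, here’s a minimal sketch (my code, not the book’s) of an indexed state monad whose state type may change between binds, with the do-notation wired up via QualifiedDo:

-- Ix.hs
module Ix where

import Prelude hiding ((>>), (>>=), return)

-- an indexed state monad: the state type changes from i to j
newtype IxState i j a = IxState { runIxState :: i -> (a, j) }

(>>=) :: IxState i j a -> (a -> IxState j k b) -> IxState i k b
IxState m >>= f = IxState $ \i -> case m i of (a, j) -> runIxState (f a) j

(>>) :: IxState i j a -> IxState j k b -> IxState i k b
m >> n = m >>= \_ -> n

return :: a -> IxState i i a
return a = IxState $ \i -> (a, i)

get :: IxState i i i
get = IxState $ \i -> (i, i)

put :: j -> IxState i j ()
put j = IxState $ \_ -> ((), j)

-- Main.hs
{-# LANGUAGE QualifiedDo #-}

import qualified Ix

-- Ix.do desugars the statements with Ix.>>= and Ix.>>
demo :: Ix.IxState Int String Int
demo = Ix.do
  n <- Ix.get
  Ix.put (show n) -- the state type changes from Int to String here
  Ix.return n

main :: IO ()
main = print (Ix.runIxState demo 42) -- (42,"42")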

Maybe a wiki-like format is much better suited for something like this. But how can a wiki-based book be monetized? Dunno.

Anyway, all this applies to any book on a subject as fluid as programming, so it’s just something to be aware of.

tl;dr

Ugh, that came out harsh, I guess. Nevertheless, I recommend getting and reading this book, just by different folks than the ones the book’s own introduction aims at.

“A Few Case Studies in the Intricacies of GHC’s Type System” would have been a better title, being much closer to what the book really is about. Shaving off a couple of chapters and expanding the non-trivial ones some more would also be very beneficial.

The book that actually teaches one to think in types is, IMO, either “Type-Driven Development with Idris,” if one’s looking for the coding side of things, or a mix of Pierce’s TAPL and Nederpelt and Geuvers’s “Type Theory and Formal Proof,” if one’s more inclined to learn the theory. In addition, TDDwI is indeed a very good resource on, well, type-driven development: leveraging the types to arrive at the right solution effectively.