Recent posts

Book review: "Thinking with Types" by Sandy Maguire

September 24, 2023

There is a shortage of good books on intermediate Haskell and above, and I think it somewhat contributes to the ivory tower feel of the language. Back when I was getting into C++ a couple of decades ago, there were some pretty well-acclaimed books and even whole series like Effective C++ by Meyers, Exceptional C++ by Sutter, and so on. They don't necessarily teach writing production-level code (and I doubt that kind of coaching is at all possible), but these books are good at building the language philosophy, spirit, and style.

I can't quickly think of as many Haskell-oriented (or even general FP-oriented) resources, so when I heard about the book Thinking with Types: Type-Level Programming in Haskell by Sandy Maguire a few months ago, it really caught my attention: I even got to read it the same year!

Reading this book gave me mixed feelings, though. On the one hand, I learned a few interesting tricks and corner cases, so the hours I invested in this book were well-spent. On the other hand, I think this book has some shortcomings, and, while being interesting, it still fails to address the audience it aims at (and, given the first paragraph, the audience I'd like it to aim at).

To summarize my feedback, I think the best title would be A Few Case Studies in the Intricacies of GHC's Type System. Hence, the audience benefiting the most is the folks who already have a decent grasp of type-level computations in other, more expressive languages (like Idris), or who have read a book or two on type theory. Why would they want a book like this, then? Well, eventually, one would need to solve real-world problems with the production-ready compiler and mature ecosystem that GHC provides instead of recursing infinitely into yak-shaving (like in Idris). So, this book is quite good at showing by example how these advanced abstract type-level things can be leveraged in a production-level language.

This book would have been a reasonable 9/10 with a title like A Few Case Studies… But with the title it actually has and the target audience it claims, it's more like 6 or 7 out of 10.

For a more detailed review, follow the spoiler!


Nubbing lists in C++

July 23, 2023

It's been a while since I last used C++ for anything serious, but once a C++ guy, you're always a C++ guy, right? So, I decided to see how modern C++ fares in a seemingly simple task: eliminating duplicate list elements.

That sounds trivial, so why bother with a whole blog post? Well, the catch is we're gonna do this at compile-time. Moreover, the lists will be represented as tuples, and the elements might have different types.

Hopefully, during this little exercise, we'll also learn (or at least reinforce) a pattern or two of modern metaprogramming.


Haskell is quite OK for images: encoding QOI

January 29, 2022

Last time, we looked at writing a decoder for the QOI format. Today, we'll look at the inverse: encoding QOI images and all that it entails.

Like the last time, this post describes both the final result and the road there. So, there will be lots of code and lots of diffs, beware!


Haskell is quite OK for images: decoding QOI

December 18, 2021

I've recently come across the new Quite OK Image format: a fast, lossless image compression algorithm. It's a very straightforward algorithm that's a pleasure to work with, so, naturally, I got curious what the performance of a Haskell implementation would be if:

  1. I just write reasonably efficient code without getting too deep into low-level details to get the job done in a couple of hours.
  2. I try to push the envelope and see what could be done if one's actually willing to go into those details (within some limits, of course, so no GHC hacking!).

Turns out that yes, it's indeed possible to write something with C-level performance in a matter of a couple of hours. Moreover, Haskell's type system shines here: class-constrained parametric polymorphism enables using the same decoder implementation for pixels with very different representations, allowing us to squeeze as much performance as is reasonably possible without duplicating the code.

In this post, I'll describe the Haskell implementation of the decoder and the steps I took to get from (1) to (2).


(neo)vim and Haskell, 2021 edition

October 4, 2021

In this post, I'll describe my setup for doing Haskell (which I almost exclusively do with stack-based projects).

Spoiler: it's much, much more straightforward than a few years ago, almost to the point of "vim and Haskell" posts being no longer necessary.


Grokking recursion

September 25, 2020

If we want to use dependently typed languages as proof checkers, we better be sure they are consistent as a logic, so that we don't accidentally prove ⊥ and, as a consequence, any proposition.

One huge source of inconsistency is non-terminating computations; hence, languages like Idris or Agda go to great lengths to ensure that functions indeed do terminate. But, for deep reasons, a purely automated check having neither false positives nor false negatives just does not exist, so compromises must be made. Naturally, when talking about proofs, it's better to be safe than sorry, so these languages strive to never label a function that doesn't really terminate for all inputs as terminating. Consequently, this means that there are terminating functions that the termination checker does not accept. Luckily, these functions can be rewritten to make the checker happy if all the recursive calls are "smaller" in some sense.

This post emerged from me trying to persuade Agda that a bunch of mutually recursive functions are all terminating. I went through Agda's standard library to figure out how to do this, taking notes about what the different abstractions I encountered mean and expand to. Then I figured that, if I poured some more words into my notes, the result might be useful for somebody else, so, well, here it is.


Call stacks aren't really call stacks

August 29, 2020

Haskell is a very special language, and one of the peculiarities setting it apart is its evaluation model. In fact, the thing I, for one, find most complicated about Haskell is neither monads nor all the countless type system extensions, but rather reasoning about the space and time complexity of whatever I write. Thus, I'd better have a good mental model of how Haskell code gets to run, and one of the most fruitful mental models for me is treating a Haskell program as a set of equations that some graph reduction engine churns until… well, the termination criteria are not the point of this post. The point of this post is that it's ultimately a graph, without any good intrinsic notion of a call stack.

On the other hand, there is a GHC.Stack module (by the way, described as "Access to GHC's call-stack simulation", italics ours) as well as some mechanism for capturing something called CallStacks. How do those call stacks connect with the graph reduction model? Let's maybe carry out a few computational experiments, all while keeping track of the obstacles we hit, shall we?


The joys and perils of beating C with Haskell: productionizing wc

March 10, 2020

Last time, we looked at implementing a toy wc-like program, and we also compared its performance against the full-blown Unix wc. The results were quite interesting: our implementation managed to beat wc by a factor of 5. Of course, that's quite an unfair comparison: our implementation is hardcoded to count just the bytes, lines, and words. wc, on the other hand, has command-line options to select specific statistics, it supports some additional ones like maximum line length, it treats Unicode spaces properly (in a Unicode-aware locale, of course), and so on. In other words, it's better to consider what we've done last time as a proof of concept showing that it's possible to achieve (and overcome) C-like performance on this task, even with all those concessions.

Today, we'll look at ways of productionizing the toy program from the previous post. Our primary goal here is to allow the user to select various statistics, computing just what the user has selected. We'll try to do this in a modular and composable way, striving to isolate each statistic into its own unit of sorts.

Indeed, if we look at the C version, I personally wouldn't call it a prime example of readable and maintainable code, as the different statistics are all computed in a single big 370-line-long function. This is something we'll try to avoid here.

Moreover, we'll try to express that certain statistics like byte count or line count can be computed more efficiently if we don't have to look at each byte, while other statistics like word count or max line length do need to look at each byte one by one (unless one does some clever and non-trivial broadword programming or SIMD-enabled things, which is beyond the scope of this post). For instance, the byte count can be computed in O(1) if we know we're reading from a file: we can just take the file size and call it a day!

In addition to that, we will, among other things:

  • implement more statistics with ease, enjoying local reasoning;
  • throw in some tests, enjoying local reasoning once more;
  • try out some kinda-dependently-typed techniques, successfully obtaining working code but failing spectacularly on the performance side of things;
  • play around with Template Haskell;
  • marvel at the (un)predictability and (un)reproducibility of the resulting code performance.

Further beating C with 20 lines of Haskell: wc

February 2, 2020

tl;dr: today we'll look at implementing a toy wc command that is about 4-5 times faster than the corresponding GNU Coreutils implementation.

So, I've recently come across a post by Chris Penner describing a Haskell implementation of the Unix wc command. Chris did a great job optimizing the Haskell version as well as showing how some high-level primitives (monoids and streaming, for instance) turn out to be useful here, although the result was still a bit slower than C. There's also a parallel version that relies a lot on the monoidal structure of the problem, and that one actually beats C.

But that post left me wondering: is it possible to do better without resorting to parallel processing?

Turns out the answer is yes. With some quite minor tweaks, the Haskell version manages to beat the hell out of the C version that presumably has decades of man-hours put into it.


Writing a fast edit distance implementation

January 1, 2020

In this post, we will implement the Levenshtein edit distance algorithm, aiming at reasonably performant code. We will start with a more or less idiomatic pure implementation and see how various changes (including strictness annotations or compilation options) affect the performance.

We will also compare this to a baseline C++ implementation.

Spoiler alerts:

  • The C++ implementation turns out to be slower than the fastest Haskell implementation.
  • The LLVM backend really shines here.

Can i haz? Part 3: extending the Has pattern

November 11, 2019

Once we scrapped the boilerplate for the Has pattern, the next obvious question is whether we can generalize further. And, turns out, we can!

In this post, we'll see how some algebraic considerations help us discover one more pattern useful with MonadError (and a Generic implementation thereof), and we'll also update our Has class with one more method that brings it closer to something lens-like and makes it useful with writable environments like MonadState.


Can i haz? Part 2: scrapping the boilerplate, and fun with types

October 12, 2019

Last time, we briefly covered the Has pattern, which allows expressing more precisely in types what a function needs from the application environment. We also wrote a few instances of our Has-like classes for a toy environment type.

The problem is that we had to write those classes. As the number of fields in our environment type grows, the number of classes we have to write grows linearly even in the best case. This much manual labour is surely not the Haskell way!


Can i haz? Part 1: intro to the Has pattern

October 10, 2019

A few weeks ago, I was trying to remove the boilerplate of writing instances of a certain type class, and I learned a couple of cool tricks that are probably worth sharing. The class in question is a generalization of the type classes comprising what is known as the Has pattern. So, before describing those tricks in detail, let's briefly discuss what the Has pattern is.

Note: this is an introductory post. The Has pattern is definitely not something I've created or even coined the term for, and seasoned Haskellers are surely familiar with this approach. Yet I feel obliged to give a brief overview before delving into the more interesting stuff.

Global configuration

How do Haskell folks solve the problem of managing some environment Env (think some global application configuration) that's needed by a bunch of different functions?

One obvious way is to just pass the Env to the functions that need it. Unfortunately, this does not compose as nicely as some other primitives we're used to in Haskell. Indeed, this way every function that needs even a tiny-tiny piece of the environment gets the whole of it. Thus, we lose the ability to reason about what parts of the environment a given function needs merely by looking at its type. That's surely not the Haskell way!

Let's explore some other approaches, shall we?


The joys of C++17

September 10, 2019

This is gonna be a short one.

Some time ago, I wrote a tiny helper Curry for, well, currying functions and function-like objects: given some callable foo accepting arguments of types T_1, ..., T_n, Curry(foo) returns an object such that Curry(foo)(t_1)...(t_n) (where t_i is of type T_i) would, as you might expect, call foo, passing all those t_i's to it.

This was so long ago that C++11 compatibility was a thing for me back then, so Curry is written with that version of the standard in mind. And then, a couple of days ago, I stumbled upon that code again and couldn't help but realize how terribly verbose it is. Let's see how modern C++ allows reducing the verbosity.


Statically safe dynamic typing à la Python

June 28, 2019

One of my hobby projects includes a long-running service, so it'd be nice if the service provided some metrics (say, using the ekg library) to the outside world for monitoring and alerts. As a consequence, the service needs an internal metrics storage that encapsulates all things related to creating them as needed, updating them, and so on.

Writing a metrics storage (especially on top of ekg) is trivial, but one cannot just solve a problem when doing recreational programming. You've got to abstract things away, generalize, and then abstract further and generalize further. So, quite soon I found myself writing an extensible and customizable storage supporting unknown metrics of unknown types in such a way that new metrics could be added in different modules without touching any existing definitions. This deserves a post or two on its own, but today we'll consider just a tiny part of the solution: writing a type-safe wrapper over types that are only known at runtime. So, yeah, something like dynamic typing, but with static guarantees that we don't do any nonsense.

I don't think this short post will reveal anything new for seasoned Haskellers, but at least we'll get this little part done and out of our way in our next articles about the storage itself. Or I could be less shy and claim instead that I created a new programming pattern.

Problem statement

Anyway, first things first. Let's spell out what problem we are trying to solve. We want to be able to associate some objects (whose types aren't known before runtime) with some values of some other type (which we don't use, so we don't care about it). In other words, we want objects of more or less arbitrary (and different) types to be the keys in the same associative container.