Recent posts

Destroying C with 20 lines of Haskell: wc

February 2, 2020 // haskell, performance

tl;dr: today we’ll look at implementing a toy wc command that is about 4-5 times faster than the corresponding GNU Coreutils implementation.

So I’ve recently come across a post by Chris Penner describing a Haskell implementation of the Unix wc command. Chris did a great job optimizing the Haskell version as well as showing how some high-level primitives (monoids and streaming, for one) turn out to be useful here, although the result was still a bit slower than C. There’s also a parallel version that relies on the monoidal structure of the problem a lot, and that one actually beats C.

But that post left me wondering: is it possible to do better without resorting to parallel processing?

Turns out the answer is yes. With some quite minor tweaks, the Haskell version manages to beat the hell out of the C version that presumably has decades of man-hours put into it.


Writing a fast edit distance implementation

January 1, 2020 // haskell, performance

In this post we will implement the Levenshtein edit distance algorithm, aiming at a reasonably performant code. We will start with a more or less idiomatic pure implementation and see how various changes (including strictness annotations or compilation options) affect the performance.

We will also compare this to a baseline C++ implementation.

Spoiler alerts:

  • C++ implementation turns out to be slower than the fastest Haskell implementation.
  • LLVM backend really shines here.

Can i haz? Part 3: extending the Has pattern

November 11, 2019 // haskell

Once we scrapped the boilerplate for the Has pattern, the next obvious question is if we can generalize further. And, turns out, we can!

In this post we’ll see how some algebraic considerations help us to discover one more pattern useful with MonadError (and a Generic implementation thereof), and we’ll also update our Has class with one more method that brings it closer to something lens-like and makes it useful with writable environments like MonadState.


Can i haz? Part 2: scrapping the boilerplate, and fun with types

October 12, 2019 // haskell, types

Last time we briefly covered the Has pattern, the problems that it solves, and we also wrote a few instances for our Has-like classes:

data AppConfig = AppConfig
  { dbConfig :: DbConfig
  , webServerConfig :: WebServerConfig
  , cronConfig :: CronConfig

instance HasDbConfig AppConfig where
  getDbConfig = dbConfig
instance HasWebServerConfig AppConfig where
  getWebServerConfig = webServerCOnfig
instance HasCronConfig AppConfig where
  getCronConfig = cronConfig

Looks good so far. What could be the problems with this approach?

The problem with Has

Let’s think what other instances we might want to write.

The configs themselves are obviously good candidates for (trivially) satisfying the corresponding classes:

instance HasDbConfig DbConfig where
  getDbConfig = id
instance HasWebServerConfig WebServerConfig where
  getWebServerConfig = id
instance HasCronConfig CronConfig where
  getCronConfig = id

These instances allow us to, for example, write separate tests (or utilities like a service tool for our DB) that don’t require the whole of AppConfig.

This is already getting a bit boring, but hold on. Some integration tests might also involve a pair of modules, and we still don’t want to pull the whole application configuration into all of the modules, so we end up writing a few instances for tuples:

instance HasDbConfig (DbConfig, b) where
  getDbConfig = fst
instance HasDbConfig (a, DbConfig) where
  getDbConfig = snd

instance HasWebServerConfig (WebServerConfig, b) where
  getWebServerConfig = fst
instance HasWebServerConfig (a, WebServerConfig) where
  getWebServerConfig = snd

instance HasCronConfig (CronConfig, b) where
  getCronConfig = fst
instance HasCronConfig (a, CronConfig) where
  getCronConfig = snd

Ugh. Let’s just hope we will never need to test three modules at once so we won’t need to write nine dull instances for 3-tuples.

Anyway, if you’re anything like me, this amount of boilerplate will make you seriously uncomfortable and eager to spend a few hours looking for ways to delegate this to the compiler instead a couple of minutes of writing the necessary instances.


Can i haz? Part 1: intro to the Has pattern

October 10, 2019 // haskell

A few weeks ago I’ve been trying to remove the boilerplate of writing instances of a certain type class, and I learned a couple of cool tricks that are probably worth sharing. The class in question is a generalization of the type classes comprising what is known as the Has-pattern. So, before describing those tricks in detail, let’s briefly discuss what’s the Has-pattern.

Note: this is an introductory post. The Has pattern is definitely not something I’ve created or even coined a term for, and seasoned haskellers are surely familiar with this approach. Yet I feel obliged to give a brief overview before delving into the more interesting stuff.

Global configuration

How do Haskell folks solve the problem of managing some environment Env that’s accessed by several different functions, like some global configuration object?

One obvious way is to just pass the Env to the functions that need it:

iNeedEnv :: Env -> Foo
iNeedEnv env = ... -- here we have env of type Env

Unfortunately, this does not compose as nicely as some other primitives we’re used to in Haskell. Like monads.


The joys of C++17

September 10, 2019 // c++, c++17

This is gonna be a short one.

Some time ago I’ve written a tiny helper Curry for, well, currying functions and function-like objects: given some callable foo accepting arguments of types T_1, ..., T_n, Curry(foo) returns an object such that Curry(foo)(t_1)...(t_n) (where t_i is of type T_i) would, as you might expect, call foo passing all those t_is to it.

This was so long ago that C++11 compatibility was a thing for me back then, so Curry is written with that version of standard in mind. And then a couple of days ago I stumbled upon that code again, and couldn’t help but realize how terribly verbose it is. Let’s see how modern C++ allows reducing the verbosity.


Statically safe dynamic typing à la Python

June 28, 2019 // haskell, types

One of my hobby projects includes a long-running service, so it’d be nice if the service provided some metrics (say, using the ekg library) to the outside world for monitoring and alerts. As a consequence, the service needs an internal metrics storage that encapsulates all things related to creating them as needed, updating them, and so on.

Writing a metrics storage (especially on top of ekg) is trivial, but one cannot just solve a problem when doing recreational programming. You’ve got to abstract things away, generalize, and then abstract further and generalize further. So, quite soon I found myself writing an extensible and customizable storage supporting unknown metrics of unknown types in such a way that new metrics could be added in different modules without touching any existing definitions. This deserves a post or two on its own, but today we’ll consider just a tiny part of the solution: writing a type-safe wrapper over types that are only known at runtime. So, yeah, something like dynamic typing but with static guarantees that we don’t do any nonsense.

I don’t think this short post will reveal anything new for seasoned haskellers, but at least we’ll get this little part done and out of our way in our next articles about the storage itself. Or I could be less shy and claim instead that I created a new programming pattern.

Problem statement

Anyway, first things first. Let’s spell out what problem we are trying to solve. So, we want to be able to associate some objects (whose types aren’t known before the runtime) with some values of some other type (which we don’t use so we don’t care about). In other words, we want objects of more or less arbitrary (and different) types to be the keys in the same associative container.