Can i haz? Part 1: intro to the Has pattern
A few weeks ago I was trying to remove the boilerplate of writing instances of a certain type class,
and I learned a couple of cool tricks that are probably worth sharing.
The class in question is a generalization of the type classes comprising what is known as the Has pattern.
So, before describing those tricks in detail, let’s briefly discuss what the Has pattern is.
Note: this is an introductory post.
The Has pattern is definitely not something I’ve created or even coined a term for,
and seasoned Haskellers are surely familiar with this approach.
Yet I feel obliged to give a brief overview before delving into the more interesting stuff.
Global configuration
How do Haskell folks solve the problem of managing some environment Env
(think some global application configuration)
that’s needed by a bunch of different functions?
One obvious way is to just pass the Env to the functions that need it.
Unfortunately, this does not compose as nicely as some other primitives we’re used to in Haskell.
Indeed, this way every function that needs even a tiny piece of the environment gets the whole of it.
Thus, we lose the ability to reason about which parts of the environment a given function needs
merely from looking at its type.
That’s surely not the Haskell way!
Let’s explore some other approaches, shall we?
Firstly, a more generic approach is to wrap the functions
needing to access the environment Env in the Reader Env monad:
import Control.Monad.Reader

data Env = Env
  { someConfigVariable :: Int
  , otherConfigVariable :: [String]
  }

iNeedEnv :: Reader Env Foo
iNeedEnv = do
  -- to get the whole Env:
  env <- ask
  -- or alternatively, if we need just a portion of it:
  theInt <- asks someConfigVariable
  ...
Generalizing this a bit and changing just the type, we arrive at MonadReader:
iNeedEnv :: MonadReader Env m => m Foo
iNeedEnv = -- everything's the same as before
This is better since now we don’t really care about the whole monad stack we’re in.
We only (explicitly) care that we can get access to some surrounding environment Env.
We don’t really care (nor should we) if the caller of iNeedEnv uses any other capabilities:
it’s perfectly good to use iNeedEnv in a monad stack having IO,
error reporting via MonadError/ExceptT,
and whatnot in addition to MonadReader:
someCaller :: (MonadIO m, MonadReader Env m, MonadError Err m) => m Bar
someCaller = do
  theFoo <- iNeedEnv
  ...
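To make this concrete, here is a minimal runnable sketch of the same idea. The original leaves Foo, Bar and Err unspecified, so the Foo and Err newtypes below, the Int result standing in for Bar, the bodies of both functions, and the helper names sampleEnv and runCaller are all assumptions made purely for illustration.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.Except (ExceptT, MonadError, runExceptT, throwError)
import Control.Monad.IO.Class (MonadIO, liftIO)
import Control.Monad.Reader (MonadReader, ReaderT, asks, runReader, runReaderT)

data Env = Env
  { someConfigVariable  :: Int
  , otherConfigVariable :: [String]
  }

-- Hypothetical stand-ins for the types the post leaves unspecified.
newtype Foo = Foo { unFoo :: Int } deriving (Eq, Show)
newtype Err = Err String deriving (Eq, Show)

-- Only asks for read access to Env; says nothing about the rest of the stack.
iNeedEnv :: MonadReader Env m => m Foo
iNeedEnv = do
  theInt <- asks someConfigVariable
  names  <- asks otherConfigVariable
  pure (Foo (theInt + length names))

-- The caller needs more capabilities, yet can still call iNeedEnv unchanged.
someCaller :: (MonadIO m, MonadReader Env m, MonadError Err m) => m Int
someCaller = do
  theFoo <- iNeedEnv
  liftIO (putStrLn ("got " ++ show theFoo))
  if unFoo theFoo < 0
    then throwError (Err "negative Foo")
    else pure (unFoo theFoo)

sampleEnv :: Env
sampleEnv = Env { someConfigVariable = 40, otherConfigVariable = ["a", "b"] }

-- One possible concrete stack: ReaderT Env over ExceptT Err over IO.
runCaller :: Env -> IO (Either Err Int)
runCaller = runExceptT . runReaderT someCaller
```

The same iNeedEnv also runs in a plain Reader via runReader iNeedEnv sampleEnv, which is the point: the function’s type only mentions the capability it actually uses.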
And, while we’re at it,
I must say I lied to some extent: functions taking the environment argument compose
just as nicely as monads do, precisely because they are monads!
More precisely, a “partially applied” function type (r ->), spelled (->) r in actual Haskell,
is in fact a lawful MonadReader r.
Building the corresponding intuition is left as an exercise to the reader.
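As a small hint for that exercise, the sketch below instantiates a MonadReader-polymorphic action at the function type and compares it with do-notation written directly in the function monad. The names summarize, summarizeFn and plainFn, as well as the output format, are made up for this illustration.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.Reader (MonadReader, asks, runReader)

data Env = Env
  { someConfigVariable  :: Int
  , otherConfigVariable :: [String]
  }

-- A MonadReader-polymorphic action (hypothetical, for illustration only).
summarize :: MonadReader Env m => m String
summarize = do
  theInt <- asks someConfigVariable
  names  <- asks otherConfigVariable
  pure (show theInt ++ ": " ++ unwords names)

-- Picking m = (->) Env turns the action into an ordinary function.
summarizeFn :: Env -> String
summarizeFn = summarize

-- The same logic using only the Monad instance of (->) Env from base:
-- each field accessor is already an "action" that reads the environment.
plainFn :: Env -> String
plainFn = do
  theInt <- someConfigVariable
  names  <- otherConfigVariable
  pure (show theInt ++ ": " ++ unwords names)
```

Applying summarizeFn or plainFn to an Env should give the same result as runReader summarize on that Env, since Reader is essentially a newtype wrapper around such functions.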
Anyway, that’s a good step towards modularity. Let’s see where it takes us.
Why Has
Let’s say we’re working on a web service. It might have quite a few different components, including, for example:
- a database access layer,
- a web server,
- a timer-activated cron-like module.
Each of those might have its own configuration, like the database credentials, web server host and port it shall listen on, and the timer periodicity, respectively. We can think of the global application config as the union of all those (and likely something else).
Let’s, for the sake of simplicity, assume that each of the modules’ APIs consists of just one function:
- setupDatabase
- startServer
- runCronJobs
Each of those functions needs access to the corresponding part of the global application configuration.
We’ve learned it’s a good idea to use MonadReader, but what should be the environment type?
One approach might be to define
data AppConfig = AppConfig
  { dbCredentials :: DbCredentials
  , serverAddress :: (Host, Port)
  , cronPeriodicity :: Ratio Int
  }
and then have each of those functions accept AppConfig:
setupDatabase :: MonadReader AppConfig m => m Db
startServer :: MonadReader AppConfig m => m Server
runCronJobs :: MonadReader AppConfig m => m ()
Obviously, these functions will also likely require MonadIO and perhaps something else,
but that’s not as important.
Note we’ve done something really terrible. Why? Just a few reasons off the top of my head:
- We introduced unnecessary coupling. The database layer should ideally know nothing about the web server or even just the mere presence thereof. And, of course, we shouldn’t have to recompile the DB module should the set of server parameters change.
- This just does not work if we don’t have control over some of the modules. What if the cron module is provided by some third-party library that knows nothing about our specific application and our specific AppConfig?
- We introduced confusion. For example, what exactly is serverAddress? Is it the host/port that the web server should bind to, or is it the database server address? Throwing all of the config options into one big ball of mud increases the chances of such collisions.
- We’ve lost some ability to reason about which parts of the environment are required by which module. Everything has access to everything!
What’s the solution? As you might have guessed, it’s
The Has pattern
Each module doesn’t really care what the type of the environment is, as long as it has the configuration info the module requires. This is perhaps best shown by example.
Consider the DB module and assume that it now defines a type representing all the configuration the module needs:
data DbConfig = DbConfig
  { dbCredentials :: DbCredentials
  , ...
  }
Now the Has pattern appears in the form of the following class:
class HasDbConfig rec where
  getDbConfig :: rec -> DbConfig
Now the type of the setupDatabase function would be
setupDatabase :: (MonadReader r m, HasDbConfig r) => m Db
and we now need to do asks $ foo . getDbConfig where we previously did asks foo,
due to the extra level of abstraction we just introduced.
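Here is a hedged sketch of what the DB module could look like when put together. DbCredentials and Db are not defined in this post, so the newtypes below are placeholders, DbConfig is reduced to the single field shown above, and the body of setupDatabase is invented; a real one would presumably also need MonadIO. The identity instance for DbConfig itself is likewise an assumption, added so that the module can be exercised on its own.

```haskell
import Control.Monad.Reader (MonadReader, asks, runReader)

-- Placeholder types; the post does not define them.
newtype DbCredentials = DbCredentials String deriving (Eq, Show)
newtype Db = Db DbCredentials deriving (Eq, Show)

-- Reduced to the one field the post shows.
newtype DbConfig = DbConfig
  { dbCredentials :: DbCredentials
  }

class HasDbConfig rec where
  getDbConfig :: rec -> DbConfig

-- Assumption: the module's own config trivially "has" a DbConfig.
instance HasDbConfig DbConfig where
  getDbConfig = id

-- Works for any environment r that can produce a DbConfig.
setupDatabase :: (MonadReader r m, HasDbConfig r) => m Db
setupDatabase = do
  creds <- asks (dbCredentials . getDbConfig)  -- previously: asks dbCredentials
  pure (Db creds)                              -- pretend we "connected"

standaloneConfig :: DbConfig
standaloneConfig = DbConfig (DbCredentials "secret")
```

With this in place, runReader setupDatabase standaloneConfig runs the module against its own config, without any application-wide environment type in sight.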
Similarly, we’ll have HasWebServerConfig and HasCronConfig.
What if some function uses two different modules? Just combine the constraints!
doSmthWithDbAndCron :: (MonadReader r m, HasDbConfig r, HasCronConfig r) => ...
Now what about the implementations of these classes?
We can still have the top-level AppConfig,
it’s just that the different modules don’t know about it (or about each other):
data AppConfig = AppConfig
  { dbConfig :: DbConfig
  , webServerConfig :: WebServerConfig
  , cronConfig :: CronConfig
  }

instance HasDbConfig AppConfig where
  getDbConfig = dbConfig

instance HasWebServerConfig AppConfig where
  getWebServerConfig = webServerConfig

instance HasCronConfig AppConfig where
  getCronConfig = cronConfig
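For completeness, here is a self-contained sketch of the whole wiring, including a function that combines two Has constraints. The per-module config types are simplified to one made-up field each (dbName, serverPort, periodSeconds), and describeDbAndCron, appConfig and report are hypothetical names used only in this example.

```haskell
import Control.Monad.Reader (MonadReader, asks, runReader)

-- Simplified, hypothetical per-module configs (one field each).
newtype DbConfig        = DbConfig { dbName :: String }
newtype WebServerConfig = WebServerConfig { serverPort :: Int }
newtype CronConfig      = CronConfig { periodSeconds :: Int }

class HasDbConfig rec where
  getDbConfig :: rec -> DbConfig

class HasWebServerConfig rec where
  getWebServerConfig :: rec -> WebServerConfig

class HasCronConfig rec where
  getCronConfig :: rec -> CronConfig

-- Only the application itself knows about the combined record.
data AppConfig = AppConfig
  { dbConfig        :: DbConfig
  , webServerConfig :: WebServerConfig
  , cronConfig      :: CronConfig
  }

instance HasDbConfig AppConfig where
  getDbConfig = dbConfig

instance HasWebServerConfig AppConfig where
  getWebServerConfig = webServerConfig

instance HasCronConfig AppConfig where
  getCronConfig = cronConfig

-- Needs two modules' configs and says so in its type; the web server
-- config is out of reach here even though AppConfig contains it.
describeDbAndCron :: (MonadReader r m, HasDbConfig r, HasCronConfig r) => m String
describeDbAndCron = do
  name   <- asks (dbName . getDbConfig)
  period <- asks (periodSeconds . getCronConfig)
  pure (name ++ " every " ++ show period ++ "s")

appConfig :: AppConfig
appConfig = AppConfig (DbConfig "main") (WebServerConfig 8080) (CronConfig 60)

report :: String
report = runReader describeDbAndCron appConfig
```

Note how describeDbAndCron never mentions AppConfig: the concrete environment is chosen only at the call site, where runReader supplies appConfig.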
Looks good so far. But there is a tiny problem with this approach that we’ll consider in the next post.