Call stacks aren't really call stacks
Haskell is a very special language, and one of the peculiarities setting it aside is its evaluation model. In fact, the thing I, for one, find most complicated about Haskell is not monads nor all the countless type system extensions, but rather reasoning about space and time complexity of whatever I write. Thus I better have a good mental model about how Haskell code gets to run, and one of the most fruitful mental models for me is treating a Haskell program as a set of equations that some graph reduction engine churns until… well, the termination criteria are not the point of this post. The point of this post is that it’s ultimately a graph without any good intrinsic notion of a call stack.
On the other hand,
there is a GHC.Stack
module
(by the way, described as Access to GHC’s call-stack simulation
, italics ours)
as well as some mechanism for capturing something called CallStack
s.
How do those call stacks connect with the graph reduction model?
Let’s maybe carry out a few computational experiments all while keeping track of the obstacles we hit, shall we?
Our first call stacks
The gist of the GHC.Stack
module is that
the HasCallStack
constraint does all the heavy lifting regarding keeping track of the stack,
asking GHC to provide the context where the function is called.
Perhaps the most interesting part of the corresponding documentation is this:
GHC solves
HasCallStack
constraints in three steps:
- If there is a
CallStack
in scope – i.e. the enclosing function has a HasCallStack constraint – GHC will append the new call-site to the existingCallStack
.- If there is no
CallStack
in scope – e.g. in the GHCi session above – and the enclosing definition does not have an explicit type signature, GHC will infer aHasCallStack
constraint for the enclosing definition (subject to the monomorphism restriction).- If there is no
CallStack
in scope and the enclosing definition has an explicit type signature, GHC will solve theHasCallStack
constraint for the singletonCallStack
containing just the current call-site.
Having said (or, rather, quoted) that,
let’s now turn to actually playing around with those call stacks.
First, we’ll have a small function that records the call stack and aborts execution.
In fact, the standard error
will do:
λ> :info error
error :: HasCallStack => [Char] -> a -- Defined in ‘GHC.Err’
since it already has the HasCallStack
constraint.
Indeed, if we write a small main
calling this function
main :: IO ()
main = error "yay"
we’ll get something resembling a call stack, albeit a very shallow, single-entry one:
Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:2:8 in main:Main
We need to go deeper
Let’s try to get a more interesting, deeper stack.
One way of doing that is adding a function calling error
and calling that function instead:
foo = error "yay"
main :: IO ()
main = foo
What’d be the output?
The expectation is that the second bullet above is in effect,
so GHC infers the type of foo
to be something like HasCallStack => a
and thus provides the information where foo
itself is called.
In fact, we get:
Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:1:7 in main:Main
Nope, still a single entry.
Even the monomorphism restriction isn’t a problem, since the inferred type of foo
is indeed a
,
and the {-# LANGUAGE NoMonomorphismRestriction #-}
pragma makes no difference.
How come?
Well, I couldn’t figure it out as well, so I asked some folks,
and somebody asked me in return where did I get that second bullet from as they didn’t see anything similar
in the GHC user guide.
Interestingly, the user guide explicitly says that
Importantly, GHC will never infer a
So either the module docs or the GHC user guide is incorrect,
and our little experiment shows that it’s the module docs.HasCallStack
constraint, you must request it explicitly.
Alright, let’s then explicitly add the constraint and see what happens:
import GHC.Stack
foo :: HasCallStack => a
foo = error "yay"
main :: IO ()
main = foo
Now this works as expected:
Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:4:7 in main:Main
foo, called at Main.hs:7:8 in main:Main
Of course, adding more functions calling each other
import GHC.Stack
f1, f2, f3 :: HasCallStack => a
f1 = error "yay"
f2 = f1
f3 = f2
main :: IO ()
main = f3
also works as expected:
Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:4:6 in main:Main
f1, called at Main.hs:5:6 in main:Main
f2, called at Main.hs:6:6 in main:Main
f3, called at Main.hs:9:8 in main:Main
What about recursion?
import GHC.Stack
rec :: HasCallStack => Int -> a
rec 0 = error "yay"
rec n = rec (n - 1)
main :: IO ()
main = rec 5
Of course, that also works!
Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:4:9 in main:Main
rec, called at Main.hs:5:9 in main:Main
rec, called at Main.hs:5:9 in main:Main
rec, called at Main.hs:5:9 in main:Main
rec, called at Main.hs:5:9 in main:Main
rec, called at Main.hs:5:9 in main:Main
rec, called at Main.hs:8:8 in main:Main
Now that starts resembling the call stacks we are used to seeing in other languages!
Break these chains
What happens if a function in our chain doesn’t have the HasCallStack
constraint?
import GHC.Stack
combobreaker :: a
f1, f2, f3 :: HasCallStack => a
f1 = error "yay"
f2 = f1
combobreaker = f2
f3 = combobreaker
main :: IO ()
main = f3
This gives us:
Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:5:6 in main:Main
f1, called at Main.hs:6:6 in main:Main
f2, called at Main.hs:7:16 in main:Main
This time the recorded stack ends at f2
that’s called from combobreaker
and doesn’t go further,
since combobreaker
doesn’t know where it got called from.
It doesn’t matter that its callers (namely, f3
) do know their context.
This still makes sense from the stack-based point of view.
For instance, if I put my C++ hat on I can consider the combobreaker
function to be compiled without debug information or something.
So far, so good.
Stacks and inlining
Oh, speaking of debug information and optimization, what happens if we ask the compiler to inline our functions?
import GHC.Stack
f1, f2, f3 :: HasCallStack => a
f1 = error "yay"
f2 = f1
f3 = f2
{-# INLINE f1 #-}
{-# INLINE f2 #-}
{-# INLINE f3 #-}
main :: IO ()
main = f3
Then we still get the stacks:
Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:4:6 in main:Main
f1, called at Main.hs:5:6 in main:Main
f2, called at Main.hs:6:6 in main:Main
f3, called at Main.hs:12:8 in main:Main
So perhaps it is unwise to annotate the performance-critical functions with HasCallStack
.
A higher place
But let’s get to the real point of this post.
Up until now we’ve been writing code very similar to what we might write in more mainstream languages, and, perhaps not coincidentally, the call stacks we observed were matching our intuition. But what if we actually use some of the functional features of Haskell, like, say, higher-order functions?
Let’s write our own little map'
function to experiment with:
import GHC.Stack
map' :: (Int -> Int) -> [Int] -> [Int]
map' f [] = []
map' f (x:xs) = f x : map' f xs
f :: Int -> Int
f 3 = error "yay"
f n = n
main :: IO ()
main = print $ map' f [1, 2, 3]
This doesn’t propagate any call stack information due to the complete lack of the HasCallStack
constraint in our code,
so we might expect the stack to contain just the error
function.
The reality indeed meets our expectations:
[1,2,Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:8:7 in main:Main
Note the extra [1,2,
at the beginning.
We go through the result of the map'
, which is computed lazily,
so we are able to successfully print first two elements of the result.
Then, when we hit f 3
, which is error
, we get the exception and the call stack.
What if we add our favourite constraint to f
?
import GHC.Stack
map' :: (Int -> Int) -> [Int] -> [Int]
map' _ [] = []
map' f (x:xs) = f x : map' f xs
f :: HasCallStack => Int -> Int
f 3 = error "yay"
f n = n
main :: IO ()
main = print $ map' f [1, 2, 3]
That does help, but not much:
[1,2,Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:8:7 in main:Main
f, called at Main.hs:12:21 in main:Main
Now f
knows where it gets calle… Stop! Hold on!
We aren’t calling f
on line 12
, we’re merely passing it to map'
there!
Now this gets interesting. Anyway, let’s keep going.
What we’ve observed above surely isn’t the full call stack,
so let’s throw some more HasCallStack
annotations at our code.
For example, what about changing map'
to the following type?
map' :: HasCallStack => (Int -> Int) -> [Int] -> [Int]
Turns out this doesn’t change anything. What might we have missed?
Ah! That’s the type of the argument to map'
!
Let’s enable {-# LANGUAGE RankNTypes #-}
and add that too:
{-# LANGUAGE RankNTypes #-}
import GHC.Stack
map' :: HasCallStack => (HasCallStack => Int -> Int) -> [Int] -> [Int]
map' _ [] = []
map' f (x:xs) = f x : map' f xs
f :: HasCallStack => Int -> Int
f 3 = error "yay"
f n = n
main :: IO ()
main = print $ map' f [1, 2, 3]
Now we get the full stack:
[1,2,Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:10:7 in main:Main
f, called at Main.hs:14:21 in main:Main
f, called at Main.hs:7:28 in main:Main
f, called at Main.hs:7:28 in main:Main
f, called at Main.hs:7:17 in main:Main
map', called at Main.hs:7:23 in main:Main
map', called at Main.hs:7:23 in main:Main
map', called at Main.hs:14:16 in main:Main
Interestingly, if we now remove HasCallStack
on f
,
we’ll lose the stack even though the code would still type check
(oh the magic of inferring HasCallStack
constraints!).
This implies we need to have HasCallStack
annotations both on the function definitions
as well as functional arguments types.
But let’s take a closer look at that last stack above.
It says f
calls f
,
and then we also have f
calling f
at line 7
, which doesn’t even belong
to our original definition of f
!
But what if we stop following best Haskell practices and actually give our f
a more-than-one-symbol-long name?
{-# LANGUAGE RankNTypes #-}
import GHC.Stack
map' :: HasCallStack => (HasCallStack => Int -> Int) -> [Int] -> [Int]
map' _ [] = []
map' f (x:xs) = f x : map' f xs
foo :: HasCallStack => Int -> Int
foo 3 = error "yay"
foo n = n
main :: IO ()
main = print $ map' foo [1, 2, 3]
Then we’ll get
[1,2,Main: yay
CallStack (from HasCallStack):
error, called at Main.hs:10:9 in main:Main
foo, called at Main.hs:14:21 in main:Main
f, called at Main.hs:7:28 in main:Main
f, called at Main.hs:7:28 in main:Main
f, called at Main.hs:7:17 in main:Main
map', called at Main.hs:7:23 in main:Main
map', called at Main.hs:7:23 in main:Main
map', called at Main.hs:14:16 in main:Main
Now this looks more reasonable!
At least we got all the functions we expect to see in the stack.
And here, f
is the function we have “inside” of map
matching the corresponding argument of map
.
Although, there’s still something that looks wrong about that.
Why map'
is at the bottom of the stack,
as if map'
had first called itself a few times,
and only then f
started calling itself somehow?
Why does f
call itself at all?
And why is foo
called on line 14,
while it clearly should’ve been called on on line 7?
Let’s get to the bottom — to the bottom of the stack, that is.
The error
gets raised when the last element of the list is being mapped,
or, more precisely, when the last mapped element is evaluated
(remember, Haskell is a non-strict language, and Prelude
list is lazy).
When, or, rather, where does this happen?
Let’s keep in mind that the third element of the list returned by map'
is the result of the third map'
application (no surprises here),
and, given that, let’s try to follow the evaluation model of the language.
- We start with the (outer) call to
map'
insidemain
on line 14, passingfoo
as thef
formal parameter. - This
map'
handles the first list element and calls itself recursively (on line 7) to handle the second one, passing thef
it got as its own formal parameter. - The same thing happens again, calling
map'
for the third time. - Then,
f
is called bymap'
to process the third element. - But that
f
is the thirdmap'
’s formal parameter pointing to the formal parameter for the previous (second) invocation ofmap'
plus the corresponding call site info thanks to theHasCallStack
constraint. So, to evaluate that "third"f
, we proceed with the "second"f
passed to the previous (second) invocation ofmap'
. - The same thing happens again.
- Evaluating the "first"
f
(which corresponds to the formal parameter of the first invocation ofmap'
) finally causes evaluation offoo
, whicherror
s out.
Bottom line
So, Haskell call stacks aren’t really call stacks.
Perhaps a better intuition is something like "dictionary stacks":
every time a HasCallStack
-function is called,
its (static) call site information is appended to the current stack of call sites, if any.
This doesn’t really differ much from what other languages denote as "call stacks" at a first glance,
but that’s until higher-order functions come into play.
Firstly, the formal parameters need to be annotated with HasCallStack
as well in order to keep the full stack information.
And, after that, merely passing a function as an argument records the corresponding stack information.
What’s different from a "conventional" call stack is that it’s indeed the static call site that’s recorded.
Remember the foo
function from the last example:
even though it’s ultimately called inside map'
on line 7,
its call site is recorded as if it’s on line 14, where it’s been mentioned.
On the other hand, leaving higher-order stuff out, the call stacks simulation, as GHC docs call it, is quite close to what our imperative languages-trained intuition tells us.
Many thanks to
monochrom at the #haskell
Freenode channel
who helped me solving the original problem I had with call stacks,
as well as for providing valuable input and food for thought.