So, using the CakeML compiler one can generate
from a source program (left) machine code (right)
and be sure that the behaviours of both are compatible.
🤝
$\text{machine_sem}\,(\text{compile}\,prog)\,\subseteq\\
\text{extend_with_resource_limit}(\text{source_sem}\,prog)$
What we mean by compatible in this setting is that all the behaviours
present in the semantics of the machine code are also present (left)
in the semantics of the source code (right)
Or in other words, the behaviours of the source code are a super
set of those of the machine code
This allows us to translate safety properties from one side to the other
However, there is a catch!
$\text{extend_with_resource_limit}(\text{source_sem}\,prog)$
$\text{source_sem}\,prog\,\cup$ 💥
You might have noticed that the source semantics is enclosed in this
extend_with_resource_limit
function
What this does is that it takes all the behaviours of the source semantics
and extends them with running out of memory behaviours
We need this because the source semantics cannot run out of memory
🤝
$\text{machine_sem}\,(\text{compile}\,prog)\subseteq\\
\text{extend_with_resource_limit}(\text{source_sem}\,prog)$
So we would like to replace the subset relation with an equality!
👍
$\text{machine_sem}\,(\text{compile}\,prog)=\text{source_sem}\,prog$
This would allow us to translate not only safety but also liveness
properties from one side to the other
✅ 👉 👍
$\bbox[background-color: #f19a3e]{\text{is_safe}\,prog} \implies \\
\text{machine_sem}\,(\text{compile}\,prog)=
\text{source_sem}\,prog$
The approach we took was to devise a "safety" predicate over the source program
which ensures that, when compiled, the resulting machine code will not run out of memory
This eliminates the need for extend_with_resource_limit
Objectives
Make space reasoning possible in CakeML
Produce concrete and tight space bounds
Enable transportation of liveness properties from source to machine code
We set out to do this with the following objectives:
Contributions
A formal space cost semantics for CakeML
Proof of soundness w.r.t. a compiler that relies on (verified) garbage collection
Proofs of concrete and tight bounds for a number of examples (non-terminating, higher-order, and bignum)
This paper's contributions are:
Why do programs run out of memory?
Runs out of heap
Runs out of stack
Object exceeds representation limits
So why do programs run out of memory anyway?
It can run out of heap
This is when a single object, or a set of objects, grows so large that it
exhausts the available heap space
Alternatively, it can run out of stack
Here, a large number of yet-to-return nested functions fill the call-stack completely
And finally, and perhaps more exotically, a program can fail when an object exceeds its representation limit
For example, if an array representation uses, say, 3 bits in its header to keep track of its length,
one can exceed this limit simply by concatenating two arrays of length 6.
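To make the arithmetic concrete, here is a small Python sketch of that hypothetical 3-bit length field (the constant names and the check itself are ours, purely for illustration): 3 bits can represent lengths 0 through 7, so concatenating two arrays of length 6 produces a length of 12, which no longer fits.

```python
HEADER_LENGTH_BITS = 3  # hypothetical width of the length field in the header
MAX_LENGTH = 2 ** HEADER_LENGTH_BITS - 1  # lengths 0..7 are representable

def concat_length(a_len: int, b_len: int) -> int:
    """Length of a concatenation, checking the representation limit."""
    result = a_len + b_len
    if result > MAX_LENGTH:
        raise OverflowError(
            f"length {result} does not fit in {HEADER_LENGTH_BITS} bits")
    return result

# Two arrays of length 6: 6 + 6 = 12 > 7, so the concatenation
# exceeds the representation limit even though memory is plentiful.
```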
So where do out-of-memory errors appear in CakeML?
the structure of the CakeML compiler, unsurprisingly,
looks like a cake, in the sense that it has many layers
Each layer corresponds to a compilation phase or an optimization
This approach keeps things modular for both proofs and implementation
High level (Tractable)
??
Low level (Precise)
The higher a language sits in the stack, the more tractable and high-level it is,
while lower levels are more concrete and closer to machine code
CakeML
flatLang
closLang
BVL
BVI
dataLang
wordLang
stackLang
labLang
Machine code
When it comes to the representation of memory, it is worth noting that...
CakeML
flatLang
closLang
BVL
BVI
dataLang
wordLang
stackLang
labLang
Machine code
The first 6 languages in the stack have semantics that lack any form of out-of-memory behaviours
Or, in other words, their memory models have unlimited space
CakeML
=
flatLang
=
closLang
=
BVL
=
BVI
=
dataLang
$\subseteq$
wordLang
stackLang
labLang
Machine code
So the relationship between the behaviours of these languages is an equality
It is only at the point where memory becomes more concrete that the subset relation is introduced
CakeML
=
flatLang
=
closLang
=
BVL
=
BVI
=
dataLang
wordLang
=
stackLang
=
labLang
=
Machine code
It is worth noting that all the languages at the bottom also share an equality relation
So, when we considered where to do our space reasoning the first obvious candidate was...
Source language
High level
Complex cost-semantics
Loose approximation
CakeML
It is the source language after all, and is what you use to write your programs
It is high level, so one does not have to deal with allocation or pointers as much
However, the initially great upsides very quickly turn into downsides
Extending the source semantics to accommodate space reasoning at such a
high level in the language stack requires either:
A complex space semantics that contains basically all compiler phases and optimizations within it
Or completely ignoring all of that and providing a very loose upper bound
Neither of those alternatives met our objectives, so we went back to the drawing board
CakeML
=
flatLang
=
closLang
=
BVL
=
BVI
=
dataLang
$\subseteq$
wordLang
stackLang
labLang
Machine code
We noted that among all the "high level" languages, dataLang
had a number of interesting features
Imperative
Abstract values
Stateful storage
An explicit call-stack
(unlike languages above)
It is imperative, so it is easy to step through each instruction and see what effect it had on memory
It provides abstract values, meaning not everything is represented as raw machine words and there is a bit of structure to values
It has stateful storage, so actual variables instead of bindings
And finally, it has an explicit call-stack, unlike all languages above it
So pretty much everything we need to be able to reason about running out of memory is here
in a concrete enough way, a stack, the heap, variables and values.
Additionally, it gets bonus points for being at a place in the language stack where most optimizations
and tricky compiler phases have already happened, simplifying our task greatly
FOLDL f e [] = e
FOLDL f e (x::xs) = FOLDL f (f e x) xs
foldl [0; 1; 2] = # FOLDL (0=l) (1=e) (2=f)
# LENGTH l = 0?
do 4 :≡ (TagLenEq 0 0,[0],NONE);
if_var 4 (return 1) # Nil case, return e
# Cons case
do 6 :≡ (ElemAt 0,[0],NONE); # head (x)
7 :≡ (ElemAt 1,[0],NONE); # tail (xs)
...
# f x e
call (18,⦕ 2; 7 ⦖) NONE [6; 1; 2; 15] NONE;
tailcall_foldl [7; 18; 2]
However, dataLang
is NOT source CakeML, and there
is of course a clear downside to writing your programs in one language and
having to reason about them in another.
But we have improved our infrastructure to facilitate this task
as much as possible.
Lets consider as an example the FOLDL
function,
implemented in the source language at the top
and compiled into dataLang at the bottom
Variables in dataLang are numeric, so to better show the similarity
we replace them here with their corresponding source names.
The relation between them is still one-to-one.
FOLDL f e [] = e
FOLDL f e (x::xs) = FOLDL f (f e x) xs
foldl [l; e; f] =
# LENGTH l = 0?
do isNil :≡ (TagLenEq 0 0,[l],NONE);
if_var isNil (return e) # Nil case, return e
# Cons case
do x :≡ (ElemAt 0,[l],NONE); # head (x)
xs :≡ (ElemAt 1,[l],NONE); # tail (xs)
...
# f x e
call (e1,⦕ f; xs ⦖) NONE [x; e; f] NONE;
tailcall_foldl [xs; e1; f]
So, the dataLang version of FOLDL
is displayed using a monadic representation, which improves readability
Additionally, function names are preserved all the way from source into dataLang, so it is easy to know what is what
For the code itself, the arguments are the same; they are just passed in reverse order
The first few operations make a case distinction over our list argument l
If l
is nil, the base value is returned, just like in the first pattern of the source function
Otherwise, we are in the cons case
Here, head and tail are obtained
Followed by a call to our function argument f
which generates the new base value
Finally, a tail-recursive call is performed with the tail, the new base value, and the original function f
We can see then, that the original structure of the FOLDL
function is somewhat preserved
And in our experience this extends to other functions as well
However, some understanding of dataLang syntax and semantics is of course required
On that note...
v = Number int
| Word64 word64
| Block ts tag (v list) -- ts = tag = num
| CodePtr num
| RefPtr num
Let's talk about dataLang values
Values in dataLang are represented by this data type
We have unbounded integers
Words
Blocks of contiguous values with a constructor tag and a uniqueness timestamp
Code pointers
Value pointers
[1,2,3]
Block 8 cons_tag [Number 1;
Block 7 cons_tag [Number 2;
Block 6 cons_tag [Number 3;
Block 0 nil_tag []
]
]
]
As an example lets take the list [1,2,3]
which is represented in dataLang as follows
The first block contains the head of the list (Number 1
) as its first value
Followed by the tail of the list, which is again a list and is represented with a block as well
The tail follows the same structure as before, with its first element (Number 2
) being the head of the tail
Followed again by the tail of the tail
You can see how the original structure of the list value is represented using blocks
It is also worth pointing out that each block represents a constructor which is identified by the tag value
and that each non-empty block has a unique timestamp
evaluate (prog,s) = (res,s')
state = <| locals : v num_map
; stack : stack list
; refs : v ref num_map
; global : num option
...
|>
The semantics of dataLang is defined using a functional big-step semantics
it takes a program and a state, and returns a result with an updated state
The state contains everything needed to execute a dataLang program; here we
display only the fields relevant to our memory model
locals
contains all variables in the scope of the current function
stack
is the call-stack
refs
is a map from pointers to values
and global
contains global values that can be accessed from everywhere
We can see from the state then, that even though dataLang's memory model is unbounded,
one can still concretely "see" both heap and stack
evaluate (prog,s) = (res,s')
state = <| locals : v num_map
; stack : stack list
; refs : v ref num_map
; global : num option
; limits : limits
; safe_for_space : bool
; stack_max : num option
...
|>
So, to keep track of memory consumption, we extended the state with a number of extra fields
limits
contains concrete bounds for heap and stack
safe_for_space
is initially true, and signals when
the program has surpassed one of the limits
stack_max
records the maximum stack size seen during execution
With this we have everything we need to measure and keep track of memory. How do we do it, then?
size_of_heap s
At every allocation
On heap consuming operations
size_of_stack s.stack
At every function call
On stack consuming operations
The main idea is to measure the size of both heap and stack at relevant points during execution
We measure heap at every allocation
and before every heap-consuming operation, for example, the addition of two bignums, which might
require allocating some extra space for the result
The size of the stack is measured at every function call
and before each stack-consuming operation; these are operations
that are implemented in lower languages either recursively or as
multiple calls to other functions
By doing this we can make sure space consumption remains within the limits, or,
in case it doesn't, set safe_for_space
to false
size_of_stack s.stack
let new_stack = MAX s.stack_max
(size_of_stack s.stack)
in
s with <| safe_for_space :=
s.safe_for_space ∧
new_stack < s.limits.stack_limit;
stack_max := new_stack |>
This is the measurement of stack size that is performed after a function call
First, we measure the current size of the stack and take the maximum of this measurement and the largest previous one
We then check whether this new_stack
is still within the limits, and conjoin the result with the current value of safe_for_space
to update it accordingly
Finally we update stack_max
to reflect any changes
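As a rough Python analogue of the HOL record update above (a sketch only: the dict encoding of the state and the field names `stack_limit` and `stack_max` mirror the record shown, but the real definition lives in HOL4), the post-call stack check could look like:

```python
def measure_stack(s: dict, size_of_stack) -> dict:
    """Sketch of the post-call stack check: take the maximum of the
    current stack size and the largest previous measurement, then
    record whether it stays strictly under the stack limit."""
    new_stack = max(s["stack_max"], size_of_stack(s["stack"]))
    return {**s,
            "safe_for_space": s["safe_for_space"]
                              and new_stack < s["stack_limit"],
            "stack_max": new_stack}
```

Note that once `safe_for_space` goes false, the conjunction keeps it false for the rest of the run, just as in the semantics.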
size_of_heap s
let new_heap = size_of_heap s + space_consumed s op vs
in
s with <| safe_for_space :=
s.safe_for_space ∧
new_heap < s.limits.heap_limit |>
Conversely, this is the measurement of heap size before a heap consuming operation is performed
Again, we make a conjunction with the current value of safe_for_space
Then we measure the amount of heap performing the operation will require: that is, the size of
the current heap plus the space needed to perform the operation
space_consumed
is a characterization of the space needed to perform an operation in terms of its arguments
Finally, we check if our new_heap
is within the limits and update safe_for_space
accordingly
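The heap-side check can be sketched the same way (again an illustrative Python rendering of the HOL update above; `space_consumed` is passed in as a function because its real definition is a per-operation characterization):

```python
def measure_heap(s: dict, size_of_heap, space_consumed, op, vs) -> dict:
    """Sketch of the pre-operation heap check: the current heap size
    plus the space the operation needs must stay under the heap limit."""
    new_heap = size_of_heap(s) + space_consumed(s, op, vs)
    return {**s,
            "safe_for_space": s["safe_for_space"]
                              and new_heap < s["heap_limit"]}
```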
These extra measurements and checks are the only changes made to the semantics
No new behaviours were added to dataLang's semantics; in fact, if one ignores the new fields, the final result is the same as it
was before the changes
If safe_for_space
turns false at any point the semantics will continue evaluating the program.
However, the state of the final result will contain via safe_for_space
an indicator of whether or not our program has run out of memory
size_of vals refs seen
Where:
(vals
) is a list of v
values
(refs
) is a mapping from numbers to values
(seen
) timestamps we have already seen
We measure values using the function size_of
It takes as arguments
A list of values
A mapping from numbers to values; this is what pointers are pointing to
And a set of seen timestamps
size_of
traverses the list measuring the size of each value,
but it makes sure to avoid counting the same pointer multiple times, and
mitigates aliasing of blocks using timestamps
v = Number int
| Word64 word64
| Block ts tag (v list) -- ts = tag = num
| CodePtr num
| RefPtr num
Block 8 cons_tag [Number 1;
Block 7 cons_tag [Number 2;
Block 6 cons_tag [Number 3;
Block 0 nil_tag []
]
]
]
To see how this works, let's briefly recall how values are represented in dataLang
Specifically, how a block contains not only a list of
values and a constructor tag, but also a uniquely identifying timestamp
This timestamp is important since the semantics ensures that if two blocks have the same timestamp, their
contents must be the same, because they are the same value in memory
size_of [Block 3 some_tag [Number 1];
Block 3 some_tag [Number 1]]
refs seen
2 +
size_of [Block 3 some_tag [Number 1]]
refs ({3} ∪ seen)
2
In this example we are measuring the size of two blocks, we assume seen
is initially empty and refs
is the one currently in the state
The size of the first block is 2: one word for its tag, and one for its only element, the small number 1
We continue traversing the list, but the set of seen timestamps is updated with the timestamp we just saw
So when we reach the next block we check if the timestamp has already been seen, which is indeed the case
This means we don't need to measure this block since its size has already been accounted for
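The worked example above can be reproduced with a toy Python model of size_of (our own simplification: only Number and Block are modelled, so RefPtr, CodePtr, Word64, and the refs argument play no role here; a non-empty block costs one header word plus one word per element, and small numbers fit in their slot; the set of seen timestamps is threaded through and returned alongside the size):

```python
from dataclasses import dataclass

@dataclass
class Number:
    value: int

@dataclass
class Block:
    ts: int      # unique timestamp
    tag: int     # constructor tag
    vals: list   # element values

def size_of(vals, refs, seen):
    """Sum the sizes of vals, skipping blocks whose timestamp has
    already been seen so aliased blocks are counted only once.
    Returns (size, seen)."""
    size = 0
    for v in vals:
        if isinstance(v, Block) and v.vals and v.ts not in seen:
            seen = seen | {v.ts}
            inner, seen = size_of(v.vals, refs, seen)
            # one header word plus one word per element, plus the
            # space taken by any boxed elements
            size += 1 + len(v.vals) + inner
        # small Numbers (and empty blocks) add nothing extra
    return size, seen
```

Running it on two blocks that share timestamp 3 yields a total of 2, matching the example: the second block is skipped because its timestamp is already in the seen set.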
size_of_stack s.stack
To measure the size of the stack, we traverse it counting the size of all stack frames
size_of_heap s
size_of reachable_values s.refs {}
<|s.locals|> ++ <|s.stack|> ++ <|s.global|>
size_of_heap
is implemented as a call to size_of
with a list of all reachable values
which are obtained from traversing all values in the locals, the stack, and the global references
size_of_heap s =
size_of_heap (run_gc s)
A very nice side effect of this is that our measurement of heap size is idempotent w.r.t. garbage collection
is_safe s prog =
let (res,s') = evaluate (prog,s)
in s'.safe_for_space
Our safety predicate is then defined in terms of the evaluation of a dataLang program
we say that a source program is safe if the evaluation of its dataLang representation
is safe_for_space
$\text{is_safe}\,\text{init_s}\,(\text{to_data}\,prog) \implies \\
\text{machine_sem}\,(\text{compile}\,prog)=\text{source_sem}\,prog$
(See paper for actual statement.)
We have proven that if a program is_safe
no out-of-memory behaviours can occur in the machine code
Which implies that the relation between source and machine code can be proven to be an equality
All proofs were formalized using the HOL4 theorem prover in the context of the CakeML project
Conclusion
We use dataLang
as a space cost semantics for CakeML
Space cost is precisely measured by size_of
Timestamps are used to mitigate aliasing
By only counting reachable objects our measurement is idempotent over GC passes
?