Cost semantics

Functional language
Proven-correct compiler
Able to bootstrap itself

`<->`

🤝

.asm

🤝

$\text{machine_sem}\,(\text{compile}\,prog)\,\subseteq\\ \text{extend_with_resource_limit}(\text{source_sem}\,prog)$

$\text{extend_with_resource_limit}(\text{source_sem}\,prog) = \text{source_sem}\,prog\,\cup$ 💥

🤝

$\text{machine_sem}\,(\text{compile}\,prog)\subseteq\\ \text{extend_with_resource_limit}(\text{source_sem}\,prog)$

👍

$\text{machine_sem}\,(\text{compile}\,prog)=\text{source_sem}\,prog$

✅ 👉 👍

$\bbox[background-color: #f19a3e]{\text{is_safe}\,prog} \implies \\ \text{machine_sem}\,(\text{compile}\,prog)= \text{source_sem}\,prog$

Why do programs run out of memory?

Runs out of heap
Runs out of stack
Object exceeds representation limits

High level (Tractable)

??

Low level (Precise)

CakeML

flatLang

closLang

BVL

BVI

dataLang

wordLang

stackLang

labLang

Assembly

CakeML

=

flatLang

=

closLang

=

BVL

=

BVI

=

dataLang

$\subseteq$

wordLang

stackLang

labLang

Assembly

`dataLang`

Imperative
Abstract values
Stateful storage
An explicit call-stack (unlike languages above)


    FOLDL f e [] = e
    FOLDL f e (x::xs) = FOLDL f (f e x) xs


    foldl [0; 1; 2] = # FOLDL (0=l) (1=e) (2=f)
      # LENGTH l = 0?
      do 4 :≡ (TagLenEq 0 0,[0],NONE);
         if_var 4 (return 1) # Nil case, return e
           # Cons case
           do 6 :≡ (ElemAt 0,[0],NONE);  # head
              7 :≡ (ElemAt 1,[0],NONE);  # tail
              ...
              # f e x
              call (18,⦕ 2; 7 ⦖) NONE [6; 1; 2; 15] NONE;
              tailcall_foldl [7; 18; 2]


    FOLDL f e [] = e
    FOLDL f e (x::xs) = FOLDL f (f e x) xs


    foldl [l; e; f] =
      # LENGTH l = 0?
      do isNil :≡ (TagLenEq 0 0,[l],NONE);
         if_var isNil (return e) # Nil case, return e
           # Cons case
           do x  :≡ (ElemAt 0,[l],NONE);  # head
              xs :≡ (ElemAt 1,[l],NONE);  # tail
              ...
              # f e x
              call (e1,⦕ f; xs ⦖) NONE [x; e; f] NONE;
              tailcall_foldl [xs; e1; f]
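
The control flow of the `dataLang` version can be mirrored in a short Python sketch. This is only an illustration: the cons-list encoding as nested `(head, tail)` pairs with `None` for nil is an assumption made here, and Python's own call stack stands in for the explicit `dataLang` stack. The point it shows is that the call to `f` is an ordinary call, while the recursive step is a tail call that reuses the current frame.

```python
def foldl(l, e, f):
    """Fold f over a cons list encoded as nested (head, tail) pairs, None = nil."""
    while True:
        if l is None:          # Nil case: return the accumulator
            return e
        x, xs = l              # Cons case: split into head and tail
        e1 = f(e, x)           # ordinary call: needs a stack frame while f runs
        l, e = xs, e1          # tail call: same frame, new arguments


digits = (1, (2, (3, None)))   # the list [1, 2, 3]
print(foldl(digits, 0, lambda acc, x: acc * 10 + x))  # prints 123
```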


         v = Number  int
           | Word64  word64
           | Block   ts tag (v list)  -- ts and tag are of type num
           | CodePtr num
           | RefPtr  num

`[1,2,3]`


      Block 8 cons_tag [Number 1;
        Block 7 cons_tag [Number 2;
          Block 6 cons_tag [Number 3;
            Block 0 nil_tag []
          ]
        ]
      ]
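
To make the encoding concrete, here is a minimal Python sketch that builds the same nested structure. The class names mirror the `v` constructors, but the numeric values of `NIL_TAG` and `CONS_TAG`, the helper `encode_list`, and the way timestamps are handed out (one fresh timestamp per cons cell, innermost first) are assumptions chosen so that the result matches the example above.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Number:
    value: int


@dataclass(frozen=True)
class Block:
    ts: int                       # timestamp identifying this allocation
    tag: int                      # constructor tag
    fields: Tuple["Value", ...]   # payload values


Value = Union[Number, Block]

NIL_TAG = 0    # hypothetical tag numbers, for illustration only
CONS_TAG = 1


def encode_list(xs, first_ts):
    """Encode a Python list of ints as nested cons Blocks ending in a nil Block."""
    result = Block(0, NIL_TAG, ())
    ts = first_ts
    for x in reversed(xs):        # the innermost cell is allocated first
        result = Block(ts, CONS_TAG, (Number(x), result))
        ts += 1
    return result


print(encode_list([1, 2, 3], first_ts=6))
```

With `first_ts=6`, the cells holding 3, 2 and 1 receive timestamps 6, 7 and 8 respectively, as in the example.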

evaluate (prog,s) = (res,s')


      state = <| locals           : v num_map
               ; stack            : stack list
               ; refs             : v ref num_map
               ; global           : num option
               ...
               |>

evaluate (prog,s) = (res,s')


      state = <| locals           : v num_map
               ; stack            : stack list
               ; refs             : v ref num_map
               ; global           : num option
               ; limits           : limits
               ; safe_for_space   : bool
               ; stack_max        : num option
               ...
               |>

size_of_heap s

At every allocation
On heap consuming operations

size_of_stack s.stack

At every function call
On stack consuming operations

size_of_stack s.stack


        let new_stack = MAX s.stack_max
                            (size_of_stack s.stack)
        in
          s with <| safe_for_space :=
                      s.safe_for_space ∧
                      new_stack < s.limits.stack_limit |>

size_of_heap s


        let new_heap = size_of_heap s + space_consumed s op vs
        in
          s with <| safe_for_space :=
                      s.safe_for_space ∧
                      new_heap < s.limits.heap_limit |>
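
Both checks follow the same pattern: compute the new footprint, compare it against the limit, and conjoin the result with the existing flag so that `safe_for_space` can never flip back to true. A rough Python sketch of that pattern is shown below. The `State` and `Limits` classes and the function names are hypothetical, the sizes are passed in as plain integers rather than computed, and recording the new `stack_max` in the state is an assumption of this sketch.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Limits:
    heap_limit: int
    stack_limit: int


@dataclass(frozen=True)
class State:
    limits: Limits
    safe_for_space: bool = True
    stack_max: int = 0


def check_stack(s, stack_size):
    """Stack check, performed at every function call."""
    new_stack = max(s.stack_max, stack_size)
    return replace(
        s,
        stack_max=new_stack,  # assumption: remember the high-water mark
        safe_for_space=s.safe_for_space and new_stack < s.limits.stack_limit,
    )


def check_heap(s, heap_size, space_consumed):
    """Heap check, performed before every allocating operation."""
    new_heap = heap_size + space_consumed
    return replace(
        s,
        safe_for_space=s.safe_for_space and new_heap < s.limits.heap_limit,
    )


s0 = State(Limits(heap_limit=10, stack_limit=4))
s1 = check_stack(s0, 3)       # 3 < 4, still safe
s2 = check_heap(s1, 8, 2)     # 10 < 10 fails, flag becomes False
s3 = check_stack(s2, 1)       # the flag is sticky and stays False
print(s1.safe_for_space, s2.safe_for_space, s3.safe_for_space)
```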

How do we count?

size_of vs refs seen

Where:

`vs` is a list of `v` values
`refs` is a mapping from numbers to values
`seen` is a set of timestamps we have already seen


      v = Number  int
        | Word64  word64
        | Block   ts tag (v list) -- ts and tag are of type num
        | CodePtr num
        | RefPtr  num


    Block 8 cons_tag [Number 1;
      Block 7 cons_tag [Number 2;
        Block 6 cons_tag [Number 3;
          Block 0 nil_tag []
        ]
      ]
    ]


    size_of [Block 3 some_tag [Number 1];
             Block 3 some_tag [Number 1]]
            refs seen


    2 +
    size_of [Block 3 some_tag [Number 1]]
            refs ({3} ∪ seen)
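
The role of `seen` can be illustrated with a simplified Python sketch. It assumes that an unseen, non-empty block costs one header word plus one word per field, and that small numbers and empty blocks cost nothing, which is consistent with the `2 +` in the step above. Values are encoded as plain tuples, values are visited left to right, and `refs` is carried along without being followed; all of these are simplifications made for this sketch.

```python
def size_of(vs, refs, seen):
    """Return (words, seen) after counting each block in vs at most once.

    Values are tuples: ("Number", n) or ("Block", ts, tag, fields).
    refs is accepted for symmetry but not followed in this sketch.
    """
    total = 0
    for v in vs:
        if v[0] != "Block":
            continue                      # small numbers take no heap space
        _, ts, _tag, fields = v
        if not fields or ts in seen:
            continue                      # empty or already-counted block
        seen = seen | {ts}                # mark before descending into fields
        inner, seen = size_of(fields, refs, seen)
        total += 1 + len(fields) + inner  # header + one word per field
    return total, seen


shared = ("Block", 3, 5, [("Number", 1)])
print(size_of([shared, shared], {}, frozenset()))
```

The block with timestamp 3 appears twice in the list but contributes its 2 words only once.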

What do we count?

size_of_heap s

size_of reachable_values s.refs {}


       <|s.locals|> ++ <|s.stack|> ++ <|s.global|>

`size_of_heap s = size_of_heap (run_gc s)`

Putting it all together


    is_safe s prog =
       let (res,s') = evaluate (prog,s)
       in s'.safe_for_space
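
As a toy model of this definition, the Python sketch below runs a small instruction list and reads the flag off the final state. The instruction set (`alloc`, `drop`, `call`, `ret`), the dictionary-based state, and the use of integer counters in place of `size_of_heap` and `size_of_stack` are all invented for illustration and are far simpler than `dataLang`.

```python
def evaluate(prog, s):
    """Run a toy instruction list and return (result, final_state)."""
    s = dict(s)                 # leave the caller's state untouched
    for op, n in prog:
        if op == "alloc":       # heap-consuming operation: check, then grow
            s["safe_for_space"] = (s["safe_for_space"]
                                   and s["heap"] + n < s["heap_limit"])
            s["heap"] += n
        elif op == "drop":      # n words become unreachable
            s["heap"] -= n
        elif op == "call":      # push a frame of n words, then check
            s["stack"] += n
            s["safe_for_space"] = (s["safe_for_space"]
                                   and s["stack"] < s["stack_limit"])
        elif op == "ret":       # pop a frame of n words
            s["stack"] -= n
    return None, s


def is_safe(s, prog):
    _res, final = evaluate(prog, s)
    return final["safe_for_space"]


init_s = {"heap": 0, "stack": 0, "heap_limit": 10, "stack_limit": 5,
          "safe_for_space": True}

print(is_safe(init_s, [("alloc", 4), ("drop", 4), ("alloc", 4)]))
print(is_safe(init_s, [("alloc", 4), ("alloc", 6)]))
```

The first program stays within the limits because the dropped data no longer counts; the second reaches the heap limit, and the flag would remain false even if the data were dropped afterwards.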

$\text{machine_sem}\,(\text{compile}\,prog)\,\subseteq\\ \text{extend_with_resource_limit}(\text{source_sem}\,prog)$

$\text{is_safe}\,\text{init_s}\,(\text{to_data}\,prog) \implies \\ \text{machine_sem}\,(\text{compile}\,prog)=\text{source_sem}\,prog$

*Wildly oversimplified