Do you have space for dessert?


A Verified Space Cost Semantics for CakeML

🗣Alejandro Gómez-Londoño
Johannes Åman Pohjola
Hira Taqdees Syeda
Magnus O. Myreen
Yong Kiam Tan
  • Functional language
  • Proven-correct compiler
  • Able to bootstrap itself

<->

🤝

.asm

🤝

$\text{machine_sem}\,(\text{compile}\,prog)\,\subseteq\\ \text{extend_with_resource_limit}(\text{source_sem}\,prog)$
$\text{extend_with_resource_limit}(\text{source_sem}\,prog) = \text{source_sem}\,prog\,\cup$ 💥


👍

$\text{machine_sem}\,(\text{compile}\,prog)=\text{source_sem}\,prog$

✅ 👉 👍

$\bbox[background-color: #f19a3e]{\text{is_safe}\,prog} \implies \\ \text{machine_sem}\,(\text{compile}\,prog)= \text{source_sem}\,prog$

Objectives


Contributions


Why do programs run out of memory?


High level (Tractable)
??
Low level (Precise)
CakeML
flatLang
closLang
BVL
BVI
dataLang
wordLang
stackLang
labLang
Machine code
CakeML = flatLang = closLang = BVL = BVI = dataLang $\subseteq$ wordLang
stackLang
labLang
Machine code
CakeML = flatLang = closLang = BVL = BVI = dataLang $\subseteq$ wordLang = stackLang = labLang = Machine code
CakeML

  • Source language
  • High level
  • Complex cost-semantics
  • Loose approximation
dataLang


    FOLDL f e [] = e
    FOLDL f e (x::xs) = FOLDL f (f e x) xs
  


    foldl [0; 1; 2] = # FOLDL (0=l) (1=e) (2=f)
      # LENGTH l = 0?
      do 4 :≡ (TagLenEq 0 0,[0],NONE);
         if_var 4 (return 1) # Nil case, return e
           # Cons case
           do 6 :≡ (ElemAt 0,[0],NONE);  # head (x)
              7 :≡ (ElemAt 1,[0],NONE);  # tail (xs)
              ...
              # f e x
              call (18,⦕ 2; 7 ⦖) NONE [6; 1; 2; 15] NONE;
              tailcall_foldl [7; 18; 2]
  



    foldl [l; e; f] =
      # LENGTH l = 0?
      do isNil :≡ (TagLenEq 0 0,[l],NONE);
         if_var isNil (return e) # Nil case, return e
           # Cons case
           do x  :≡ (ElemAt 0,[l],NONE);  # head (x)
              xs :≡ (ElemAt 1,[l],NONE);  # tail (xs)
              ...
              # f e x
              call (e1,⦕ f; xs ⦖) NONE [x; e; f] NONE;
              tailcall_foldl [xs; e1; f]
  

         v = Number  int
           | Word64  word64
           | Block   ts tag (v list)  -- ts = tag = num
           | CodePtr num
           | RefPtr  num
    

[1,2,3]



      Block 8 cons_tag [Number 1;
        Block 7 cons_tag [Number 2;
          Block 6 cons_tag [Number 3;
            Block 0 nil_tag []
          ]
        ]
      ]
    
evaluate (prog,s) = (res,s')


      state = <| locals           : v num_map
               ; stack            : stack list
               ; refs             : v ref num_map
               ; global           : num option
               ...
               |>
    
evaluate (prog,s) = (res,s')


      state = <| locals           : v num_map
               ; stack            : stack list
               ; refs             : v ref num_map
               ; global           : num option
               ; limits           : limits
               ; safe_for_space   : bool
               ; stack_max        : num option
               ...
               |>
    
size_of_heap s
  • At every allocation
  • On heap consuming operations
size_of_stack s.stack
  • At every function call
  • On stack consuming operations
size_of_stack s.stack


        let new_stack = MAX s.stack_max
                            (size_of_stack s.stack)
        in
          s with <| safe_for_space :=
                      s.safe_for_space ∧
                      new_stack < s.limits.stack_limit;
                    stack_max := new_stack |>
    
size_of_heap s


        let new_heap = size_of_heap s + space_consumed s op vs
        in
          s with <| safe_for_space :=
                      s.safe_for_space ∧
                      new_heap < s.limits.heap_limit |>
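
The two updates above can be read as a sticky flag: once a limit is exceeded, `safe_for_space` stays false for the rest of the run. A minimal Python sketch of that reading follows. The names `check_stack` and `check_heap` are hypothetical, sizes are passed in as plain integers instead of being computed by `size_of_stack` and `size_of_heap`, and `stack_max` is a plain integer here although the slides declare it as `num option`.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Limits:
    heap_limit: int
    stack_limit: int


@dataclass(frozen=True)
class State:
    limits: Limits
    safe_for_space: bool = True
    stack_max: int = 0  # simplification: `num option` on the slides


def check_stack(s: State, stack_size: int) -> State:
    """Hypothetical helper: record the stack high-water mark and update the flag."""
    new_stack = max(s.stack_max, stack_size)
    return replace(
        s,
        safe_for_space=s.safe_for_space and new_stack < s.limits.stack_limit,
        stack_max=new_stack,
    )


def check_heap(s: State, heap_size: int, space_consumed: int) -> State:
    """Hypothetical helper: check the heap size an operation would reach."""
    new_heap = heap_size + space_consumed
    return replace(
        s,
        safe_for_space=s.safe_for_space and new_heap < s.limits.heap_limit,
    )


s0 = State(Limits(heap_limit=10, stack_limit=5))
s1 = check_stack(s0, 3)    # within the stack limit
s2 = check_heap(s1, 8, 4)  # 8 + 4 is not below the heap limit
s3 = check_stack(s2, 1)    # a later, harmless step cannot restore the flag
```

In this sketch `s1` is still safe, while `s2` and `s3` are not, which mirrors the conjunction with `s.safe_for_space` in both updates.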
    

How do we count?

size_of vals refs seen

Where:


      v = Number  int
        | Word64  word64
        | Block   ts tag (v list) -- ts = tag = num
        | CodePtr num
        | RefPtr  num
  


    Block 8 cons_tag [Number 1;
      Block 7 cons_tag [Number 2;
        Block 6 cons_tag [Number 3;
          Block 0 nil_tag []
        ]
      ]
    ]
  

    size_of [Block 3 some_tag [Number 1];
             Block 3 some_tag [Number 1]]
            refs seen
  

    2 +
    size_of [Block 3 some_tag [Number 1]]
            refs ({3} ∪ seen)
  

    2
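
The example above suggests how sharing is handled: a block is counted once per timestamp, and a second occurrence with an already seen timestamp costs nothing. The Python sketch below illustrates that idea under simplifying assumptions that are not taken from the slides: references are left out (so there is no `refs` argument), small numbers cost zero words, an empty block costs zero, and a non-empty block costs one header word plus one word per field.

```python
from dataclasses import dataclass
from typing import FrozenSet, Sequence, Tuple


@dataclass(frozen=True)
class Number:
    value: int


@dataclass(frozen=True)
class Block:
    ts: int          # timestamp identifying the allocation
    tag: int
    fields: tuple    # nested values (Number or Block)


def size_of(vals: Sequence, seen: FrozenSet[int]) -> Tuple[int, FrozenSet[int]]:
    """Toy size_of: count each non-empty block once per timestamp."""
    total = 0
    for v in vals:
        if isinstance(v, Block) and v.fields and v.ts not in seen:
            seen = seen | {v.ts}
            inner, seen = size_of(v.fields, seen)
            total += 1 + len(v.fields) + inner  # header + fields + contents
    return total, seen


SOME_TAG, CONS_TAG, NIL_TAG = 1, 2, 0  # arbitrary tag numbers for the sketch

shared = Block(3, SOME_TAG, (Number(1),))
shared_size, shared_seen = size_of([shared, shared], frozenset())

# The list [1,2,3] in the block representation shown earlier.
lst = Block(8, CONS_TAG, (Number(1),
      Block(7, CONS_TAG, (Number(2),
      Block(6, CONS_TAG, (Number(3),
      Block(0, NIL_TAG, ()))),)),))
list_size, _ = size_of([lst], frozenset())
```

Under these assumptions `shared_size` is 2, matching the slide, and the three cons cells of the list account for 3 words each.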
  
size_of_stack s.stack

What do we count?

size_of_heap s

size_of reachable_values s.refs {}


       <|s.locals|> ++ <|s.stack|> ++ <|s.global|>
      

size_of_heap s =
size_of_heap (run_gc s)
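
One way to see why this equation is plausible: the measure only counts what is reachable from the roots, so discarding unreachable data cannot change it. The Python sketch below illustrates this on a toy heap of value arrays only. The functions `reachable`, `size_of_heap` and `run_gc` here are hypothetical stand-ins for the definitions in the talk, and the cost of one header word plus one word per element is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class RefPtr:
    addr: int


def reachable(roots: List, refs: Dict[int, List]) -> Set[int]:
    """Addresses of all references reachable from the root values."""
    seen: Set[int] = set()
    todo = [v.addr for v in roots if isinstance(v, RefPtr)]
    while todo:
        addr = todo.pop()
        if addr in seen or addr not in refs:
            continue
        seen.add(addr)
        todo.extend(v.addr for v in refs[addr] if isinstance(v, RefPtr))
    return seen


def size_of_heap(roots: List, refs: Dict[int, List]) -> int:
    """Toy measure: header word plus one word per element, reachable data only."""
    return sum(1 + len(refs[a]) for a in reachable(roots, refs))


def run_gc(roots: List, refs: Dict[int, List]) -> Dict[int, List]:
    """Toy collector: keep exactly the reachable references."""
    live = reachable(roots, refs)
    return {a: vs for a, vs in refs.items() if a in live}


refs = {0: [RefPtr(1), 5], 1: [7], 2: [RefPtr(0)]}  # address 2 is garbage
roots = [RefPtr(0)]

before = size_of_heap(roots, refs)
after = size_of_heap(roots, run_gc(roots, refs))
```

Here `before` and `after` agree even though the collector drops address 2, because that address never contributed to the count.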

Putting it all together


    is_safe s prog =
       let (res,s') = evaluate (prog,s)
       in s'.safe_for_space
  
$\text{is_safe}\,\text{init_s}\,(\text{to_data}\,prog) \implies \\ \text{machine_sem}\,(\text{compile}\,prog)=\text{source_sem}\,prog$
(See paper for actual statement.)

Conclusion


?