First Class Functions

Functions as Values

Consider the following egg program

def f(it):
  it(5)

def incr(x):
  x + 1

f(incr)

What will be the result of compiling/running?













We have functions, but they are second-class entities in our languages: they don’t have the same abilities as other values.

So, we get multiple error messages:

Errors found!
tests/input/hof.diamond:(2:3)-(4:1): Function 'it' is not defined

         2|    it(5)
               ^^

tests/input/hof.diamond:7:3-7: Unbound variable 'incr'

         7|  f(incr)
               ^^^^

This is because the Env only holds

  • parameters, and
  • let-bound variables

and not function definitions.
















Functions as Values

But for the many reasons we saw in CSE 130 – we want to treat functions like values. For example, if you run the above in Python you get:

>>> def f(it): return it(5)

>>> def incr(x): return x + 1

>>> f(incr)
6













Flashback: How do we compile incr?

We compile each function down into a sequence of instructions corresponding to its body.

def incr(x):
  x + 1

incr(5)

becomes, for incr

label_def_incr_start:
  push rbp                 # setup stack frame
  mov rbp, rsp

  mov rax, rdi             # grab param
  add rax, 2               # incr by 1

  mov rsp, rbp             # undo stack frame
  pop rbp
  ret                      # buh-bye

for the main expression

our_code_starts_here:         
  push rbp
  mov rbp, rsp

  mov  rdi 10                 # push arg '5'
  call label_def_incr_start   # call function

  mov rsp, rbp
  pop  rbp
  ret

What is the value of a function?

So now, lets take a step back. Suppose we want to compile

def f(it):
  it(5)

def incr(x):
  x + 1

f(incr)












Attempt 1: What is the value of the parameter it ?

















IDEA: Use the label where incr lives!

label_def_f_start:
  push rbp
  mov  rbp, rsp

  mov rax, rdi       # grab function-address
  mov rdi, 10        # push arg '5'
  call rax           # call function!

  mov rsp, rbp
  pop rbp
  ret

















How to pass the value of the parameter ?

So now the main expression

f(incr)

can be compiled to:

our_code_starts_here:         
  push rbp
  mov rbp, rsp

  mov rdi, ?1     # push arg
  call ?2         # call function

  mov rsp, rbp
  pop rbp
  ret

QUIZ: What are suitable terms for ?1 and ?2 ?

?1 ?2
A label_def_incr_start label_def_f_start
B label_def_f_start label_def_incr_start
C label_def_f_start label_def_f_start
D label_def_incr_start label_def_incr_start














Strategy Progression

  1. Representation = Start-Label


















- **Problem:** How to do run-time checks of valid args?













Yay, that was easy! How should the following behave?

def f(it):
  it(5)

def add(x, y):
  x + y

f(incr)













Lets see what Python does:

>>> def f(it): return it(5)
>>> def add(x,y): return x + y
>>> f(add)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 1, in f
TypeError: add() takes exactly 2 arguments (1 given)













Problem: Ensure Valid Number of Arguments?

How to make sure

  • f(incr) succeeds, but
  • f(add) fails

With proper run-time error?

  1. Where does the run-time check happen?
  2. What information is needed for the check?













Key: Need to also store the function’s arity

  • The number of arguments required by the function













Strategy Progression

  1. Representation = Start-Label

    • Problem: How to do run-time checks of valid args?
  2. Representation = (Arity, Start-Label)













Attempt 2: What is the value of the parameter it ?

def f(it):
  it(5)

def incr(x):
  x + 1

f(incr)














IDEA: Represent a function with a tuple of

(arity, function_start_label)




We can now compile a call

e(x1,...,xn)

via the following strategy:















  1. Evaluate the tuple e
  2. Check that e[0] is equal to n (else arity mismatch error)
  3. Call the function at e[1]














Example

Lets see how we would compile this

def f(it):
  it(5)

def incr(x):
  x + 1

f(incr)

We need to map the variable

incr

to the tuple

(1, label_def_incr_start)

But where will we store this information?













Strategy Progression

  1. Representation = Start-Label

    • Problem: How to do run-time checks of valid args?
  2. Representation = (Arity, Start-Label)

    • Problem: How to map function names to tuples?
  3. Lambda Terms Make functions just another expression!













Attempt 3: Lambda Terms

So far, we could only define functions at top-level

  • First-class functions are like any other expression,

  • Can define a function, wherever you have any other expression.

Language Syntax
Haskell \(x1,...,xn) -> e
Ocaml fun (x1,...,xn) -> e
JS (x1,...,xn) => { return e }
C++ [&](x1,...,xn){ return e }













Example: Lambda Terms

We can now replace def as:

let f    = (lambda (it): it(5))
  , incr = (lambda  (x): x + 1)
in
  f(incr)













Implementation

As always, to the details! Lets figure out:

Representation

  1. How to store function-tuples

Types:

  1. Remove Def
  2. Add lambda to Expr

Transforms

  1. Update tag and ANF
  2. Update checker
  3. Update compile













Implementation: Representation

Represent lambda-tuples'' orfunction-tuples’’ via a special tag:

Type LSB
number xx0
boolean 111
pointer 001
function 101

In our code:

data Ty = ... | TClosure

typeTag :: Ty -> Arg
typeTag TTuple    = HexConst 0x00000001
typeTag TClosure  = HexConst 0x00000005

typeMask :: Ty -> Arg
typeMask TTuple   = HexConst 0x00000007
typeMask TClosure = HexConst 0x00000007

So, Function Values represented just like a tuples

  • padding, ESI etc.
  • but with tag 101.

Crucially, we can get 0-th, or 1-st elements from tuple.

Question: Why not use plain tuples?













Implementation: Types

First, lets look at the new Expr type

  • No more Def
data Expr a
  = ...
  | Lam               [Bind a]   !(Expr a) a      -- fun. definitions
  | App     !(Expr a) [Expr a]             a      -- fun. calls

So we represent a function-definition as:

Lam [x1,...,xn] e

and a function call as:

App e [e1,...,en]













Transforms: Tag

This is pretty straight forward (do it yourself)













Transforms: ANF

QUIZ:

  (App e es)

Does e need to be

  • A Immediate
  • B ANF
  • C None of the above













Transforms: ANF

QUIZ:

  (App e es)

Do es need to be

  • A Immediate
  • B ANF
  • C None of the above













Transforms: ANF

The App case, fun + args should be immediate

  • Need the values to push on stack and make the call happen!

Just like function calls (in diamondback), except

  • Must also handle the callee-expression (named e below)
anf i (App e es)   = (i', stitch bs (App v vs))
  where
    (i', bs, v:vs) = imms i (e:es)













Transforms: ANF

QUIZ:

  (Lam xs e)

Does e need to be

  • A Immediate
  • B ANF
  • C None of the above













Transforms: ANF

The Lam case, the body will be executed (when called)

  • So we just need to make sure its in ANF (like all the code!)
anf i (Lam xs e) = (i', Lam xs e')
  where
    (i', e')     = anf i e













Transforms: Checker

We just have Expr (no Def) so there is a single function:

wellFormed :: BareExpr -> [UserError]
wellFormed = go emptyEnv
  where
    gos                       = concatMap . go
    go _    (Boolean {})      = ...
    go _    (Number  n     l) = largeNumberErrors      n l
    go vEnv (Id      x     l) = unboundVarErrors  vEnv x l
    go vEnv (Prim1 _ e     _) = ...
    go vEnv (Prim2 _ e1 e2 _) = ...
    go vEnv (If   e1 e2 e3 _) = ...
    go vEnv (Let x e1 e2   _) = ... ++ go vEnv e1 ++ go (addEnv x vEnv) e2
    go vEnv (Tuple es      _) = ...
    go vEnv (GetItem e1 e2 _) = ...
    go vEnv (App e es      _) = ?1
    go vEnv (Lam xs e      _) = ?2 ++ go ?3 e
  • How shall we implement ?1 ?

  • How shall we implement ?2 ?

  • How shall we implement ?3 ?













Transforms: Compiler

Finally, lets see how to convert Expr into Asm, two separate cases:

  • Lam : definitions

  • App : calls

Transforms: Compiler: Lam

compileEnv :: Env -> AExp -> [Instruction]
compileEnv env (Lam xs e l) 
  = IJmp   end              -- Why?
  : ILabel start            -- Function start
  : compileDecl l xs e      -- Function code (like Decl)
 ++ ILabel end              -- Function end
  : lamTuple arity start    -- Compile fun-tuple into RAX
  where
    arity = length xs
    start = LamStart l
    end   = LamEnd   l

QUESTION: Why do we start with the IJmp?

lamTuple :: Int -> Label -> [Instruction]
lamTuple arity start
  =  tupleAlloc  2                           -- alloc tuple size = 2  
  ++ tupleWrites [ repr arity                -- fill arity
                 , CodePtr start ]           -- fill code-ptr
  ++ [ IOr  (Reg RAX) (typeTag TClosure) ]   -- set the tag bits












Transforms: Compiler: App

Recall! IDEA: Use a tuple of (arity, label)

We can now compile a call

  e(x1,...,xn)

via the following strategy:












  1. Evaluate the tuple e
  2. Check that e[0] is equal to n (else arity mismatch error)
  3. Call the function at e[1]












compileEnv env (App vE vXs)
  = assertType     env vE TClosure                    -- check vE is a function
 ++ assertArity    env vE (length vXs)                -- check vE arity
 ++ tupleReadRaw   (immArg env vE) (repr (1 :: Int))  -- load vE[1] into RAX
 ++ pushRegArgs   rArgs				      -- push reg args 
 ++ pushStackArgs sArgs				      -- push stack args
 ++ [ICall (Reg RAX)]                                 -- call RAX
 ++ popStackArgs  (length sArgs)
  where
    (rArgs, sArgs) = splitAt 6 args
















A Problem: Scope

Consider the following program:

let one = 1
  , f   = (lambda (it): it(5))
  , inc = (lambda (n): n + one)
in
  f(inc)















A Problem Magnified: Dynamically Created Functions

Will it work? How about this variant:

lambda(m): n + m        ---->   (1, label_lam_99, [n = ???])
let addition = (lambda (a, b): a + b) 
let add    = (lambda (n): (lambda (m): addition(n, m)))
  , f      = (lambda (it): it(5))
  , plus1  = add(1)                 -- < 1, label_lam_99, [n := 1]>
  , plus10 = add(10)                -- < 1, label_lam_99, [n := 10]>
in
  (f(plus1), f(plus10))
  • add(1) should evaluate to a function-that-adds-1
  • add(10) should evaluate to a function-that-adds-10

Yet, its the same code

  • same arity
  • same start-label

Problem: How can we represent different behaviors?















Free and Bound Variables

A variable x is bound inside an expression e if

  • x is a let-bound variable inside e.
  • x is a formal parameter in e, OR

A variable x is free inside an expression e if

  • x is not bound inside e

For example consider the expression e :

lambda (m):
  let t = m in
    n + t
  • m, t are bound inside e, but,
  • n is free inside e
















Computing Free Variables

Lets write a function to compute the set of free variables.

Question Why Set ?

lambda(x): x + y

let x = e1 in e2

- free e1 + (free e2 - x) 
Bin Plus (Id x) (Id y)


App (Id inc) [(Id x), (Id y), (Id z)]
freeVars :: Expr -> [Id]
freeVars e = S.toList (go e)
  where
    go :: Expr -> S.Set Id
    go (Id x)        = S.singleton x
    go (Number _)    = S.empty
    go (Boolean _)   = S.empty
    go (If e e1 e2)  = S.unions [go e, go e1, go e2]
    go (App e es)    = S.unions (go e : map go es) 
    go (Let x e1 e2) = S.union (go e1) (S.minus (go e2) x)
    go (Lam xs e)    = S.difference (go e) xs

TODO-IN-CLASS














Free Variables and Lambdas

Free Variables of a lambda

  • Those whose values come from outside

  • Should use the same values whenever we “call” the lambda.

For example:

let add    = (lambda (n): (lambda (m): n + m))
  , f      = (lambda (it): it(5))
  , plus1  = add(1)
  , plus10 = add(10)
in
  (f(plus1), f(plus10), plus10(20))

should evaluate to (6, 15, 30)

  • plus1 be like lambda (m): 1 + m
  • plus10 be like lambda (m): 10 + m













Achieving Closure

(Recall from CSE 130)

let add    = (lambda (n): (lambda (m): n + m))
  , f      = (lambda (it): it(5))
  , plus1  = add(1)
  , plus10 = add(10)
in
  (f(plus1), f(plus10), plus10(20))

should evaluate to (6, 15, 30)

  • plus1 be like lambda (m): 1 + m
  • plus10 be like lambda (m): 10 + m

Key Idea: Each function value must store its free variables

represent plus1 as:

(arity, code-label, [n := 1])

represent plus10 as:

(arity, code-label, [n := 10])

Same code, but different free variables.














Strategy Progression

  1. Representation = Start-Label

    • Problem: How to do run-time checks of valid args?
  2. Representation = (Arity, Start-Label)

    • Problem: How to map function names to tuples?
  3. Lambda Terms Make functions just another expression!

    • Problem: How to store local variables?
  4. Function Value (Arity, Start-Label, Free_1, ... , Free_N)

    • Ta Da!














Closures: Strategy

What if we have multiple free variables?

let foo    = (lambda (x, y):
                (lambda (z): x + y + z)
             )
  , plus10 = foo(4, 6)
  , plus20 = foo(7, 13)
in
  (plus10(0), plus20(100))

represent plus10 as:

(arity, code-label, [x := 4], [y := 6])

represent plus20 as:

(arity, code-label, [x := 7], [y := 13])















Example

Lets see how to evaluate

let foo    = (lambda (x, y):
                (lambda (z): x + y + z)
             )
  , plus10 = foo(4, 6)
in
  plus10(0)













Example

Lets see how to evaluate

let foo    = (lambda (x, y):
                (lambda (z): x + y + z)
             )
  , plus10 = foo(4, 6)
  , f      = (lambda (it): it(5))
in
  f(plus10)













Implementation

Representation

  1. How to store closures

Types:

  • Same as before

Transforms

  1. Update tag and ANF

    • as before
  2. Update checker

  3. Update compile













Representation

We can represent a closure as a tuple of

(arity, code-ptr, free-var-1, ... free-var-N)

which means, following the convention for tuples, as:

------------------------------------------------
| N + 2 | arity | code-ptr | var1 | ... | varN |
------------------------------------------------

Where each cell represents 64-bits / 8-bytes / 1-(double)word.

Note: (As with all tuples) the first word contains the #elements of the tuple.

  • In this case N + 2
















Transforms: Checker

What environment should we use to check a Lam body ?

wellFormed :: BareExpr -> [UserError]
wellFormed = go emptyEnv
  where
    ...  
    go vEnv (Lam xs e _) = errDupParams xs
                        ++ go ?vEnv e

addsEnv :: Env -> [BareBind] -> Env
addsEnv env xs = foldr addEnv env xs

QUIZ How shall we implement ?vEnv ?

A. addsEnv vEnv []

B. addsEnv vEnv xs

C. addsEnv emptyEnv xs













Transforms: Compile

Question How does the called function know the values of free vars?













  • Needs to restore them from closure tuple














  • Needs to access the closure tuple!













… But how shall we give the called function access to the tuple?








By passing the tuple as an extra parameter















Transforms: Compile

Calls App (as before)















  1. Push closure-pointer + parameters
  2. Call code-label
  3. Pop closure-pointer + params







Definitions Lam















  1. Compute free-vars x1,…,xn
  2. Generate code-block
  • Restore free vars from closure-pointer-parameter New
  • Execute function body (as before)
  1. Allocate tuple (arity, code-label, x1, ... , xn)












Transforms: Compile Definitions

  1. Compute free-vars y1,…,yn
  2. Generate code-block
  • Restore free vars from closure-pointer-parameter
  • Execute function body (as before)
  1. Allocate tuple (arity, code-label, y1, ... , yn)
compileEnv :: Env -> AExp -> [Instruction]
compileEnv env (Lam xs e l)
  = IJmp   end                       -- Why?
  : ILabel start                     -- Function start
  : lambdaBody ys xs e               -- Function code (like Decl)
 ++ ILabel end                       -- Function end
  : lamTuple arity start env ys      -- Compile closure-tuple into RAX
  where
    ys    = freeVars (Lam xs e l)
    arity = length xs
    start = LamStart l
    end   = LamEnd   l
















Creating Closure Tuples

To create the actual closure-tuple we need








  • the free-variables ys
  • the env from which to values of the free variables.








lamTuple :: Int -> Label -> Env -> [Id] -> [Instruction]
lamTuple arity start env ys
  =  tupleAlloc  (2 + length ys)                    -- alloc tuple 2 + |ys|  
  ++ tupleWrites ( repr arity                       -- fill arity
                 : CodePtr start                    -- fill code-ptr
                 : [immArg env (Id y) | y <- ys] )  -- fill free-vars
  ++ [ IOr  (Reg RAX) (typeTag TClosure) ]          -- set the tag bits

Generating Code Block








lambdaBody :: [Id] -> [Id] -> AExp -> [Instruction]
lambdaBody ys xs e = 
    funEntry n               -- 1. setup  stack frame RBP/RSP
 ++ copyArgs xs'             -- 2. copy parameters to stack slots
 ++ restore nXs ys           -- 3. copy (closure) free vars to stack slots
 ++ compileEnv env body      -- 4. execute 'body' with result in RAX
 ++ funExit n                -- 5. teardown stack frame & return 











To restore ys we use the closure-ptr passed in at [RDI] – the special first parameter – to copy the free-vars ys onto the stack.

restore :: Int -> [Id] -> [Instruction]
restore base ys = 
  concat [ copy i | (_, i) <- zip ys [1..]]
    where
      closV  = Reg RDI
      copy i = tupleReadRaw closV (repr (i+1))	       -- copy tuple-fld for y into RAX...
            ++ [ IMov (stackVar (base+i)) (Reg RAX) ]  -- ...write RAX into stackVar for y













A Problem: Recursion

Oops, how do we write:

def fac(n):
  if (n > 1):
    n * fac(n-1)
  else:
    1

fac(5)  

If we try

let fac = (lambda (n):
             if (n < 1):
               1
             else:
               n * fac(n-1))
in fac(5)  

We get a variable unbound error!

Errors found!
tests/input/fac-bad.fdl:5:20-23: Unbound variable 'fac'

         5|                 n * fac(n-1))

We need to teach our compiler that its ok to use the name fac inside the body!


















Solution: Named Functions

We have a new form of named functions which looks like this:

def fac(n):
  if (n < 1):
    1
  else:
    n * fac(n - 1)
in
  fac(5)

Representing Named Functions

We extend Expr to handle such functions as:

data Expr a
  = ...
  | Fun     (Bind a)      -- ^ name of function
            [Bind a]      -- ^ list of parameters  
            (Expr a) a    -- ^ body of function

Note that we parse the code

def foo(x1,...,xn):
  e
in
  e'

as the Expr

Let foo (Fun foo [x1,...,xn] e) e'















Compiling Named Functions

Mostly, this is left as an exercise to you






Non-Recursive functions






  • i.e. f does not appear inside e in Fun f xs e
  • Treat Fun f xs e as Lam xs e
  • … Everything should just work.

Recursive






  • i.e. f does appear inside e in Fun f xs e
  • Can you think of a simple tweak to the Lam strategy that works?









Recap: Functions as Values

We had functions, but they were second-class entities…

Now, they are first-class values

  • passed around as parameters
  • returned from functions
  • stored in tuples etc.

How?

  1. Representation = Start-Label

    • Problem: How to do run-time checks of valid args?
  2. Representation = (Arity, Start-Label)

    • Problem: How to map function names to tuples?
  3. Lambda Terms Make functions just another expression!

    • Problem: How to store local variables?
  4. Function Value (Arity, Start-Label, Free_1, ... , Free_N)

    • Ta Da!

Next: Adding garbage collection

  • Reclaim! Heap memory that is no longer in use

Next: Adding static type inference

  • Faster! Gets rid of those annoying (and slow!) run-time checks
  • Safer! Catches problems at compile-time, when easiest to fix!