Functions as Values
Consider the following egg
program
def f(it):
5)
it(
def incr(x):
+ 1
x
f(incr)
What will be the result of compiling/running?
We have functions, but they are second-class entities in our languages: they don’t have the same abilities as other values.
So, we get multiple error messages:
Errors found!
tests/input/hof.diamond:(2:3)-(4:1): Function 'it' is not defined
2| it(5)
^^
tests/input/hof.diamond:7:3-7: Unbound variable 'incr'
7| f(incr)
^^^^
This is because the Env
only holds
- parameters, and
- let-bound variables
and not function definitions.
Functions as Values
But for the many reasons we saw in CSE 130 – we want to treat functions like values. For example, if you run the above in Python you get:
>>> def f(it): return it(5)
>>> def incr(x): return x + 1
>>> f(incr)
6
Flashback: How do we compile incr
?
We compile each function down into a sequence of instructions corresponding to its body.
def incr(x):
+ 1
x
5) incr(
becomes, for incr
label_def_incr_start:
push rbp # setup stack frame
mov rbp, rsp
mov rax, rdi # grab param
add rax, 2 # incr by 1
mov rsp, rbp # undo stack frame
pop rbp
ret # buh-bye
for the main expression
our_code_starts_here:
push rbp
mov rbp, rsp
mov rdi 10 # push arg '5'
call label_def_incr_start # call function
mov rsp, rbp
pop rbp
ret
What is the value of a function?
So now, lets take a step back. Suppose we want to compile
def f(it):
5)
it(
def incr(x):
+ 1
x
f(incr)
Attempt 1: What is the value of the parameter it
?
IDEA: Use the label where incr
lives!
label_def_f_start:
push rbp
mov rbp, rsp
mov rax, rdi # grab function-address
mov rdi, 10 # push arg '5'
call rax # call function!
mov rsp, rbp
pop rbp
ret
How to pass the value of the parameter ?
So now the main
expression
f(incr)
can be compiled to:
our_code_starts_here:
push rbp
mov rbp, rsp
mov rdi, ?1 # push arg
call ?2 # call function
mov rsp, rbp
pop rbp
ret
QUIZ: What are suitable terms for ?1
and ?2
?
?1 |
?2 |
|
---|---|---|
A | label_def_incr_start |
label_def_f_start |
B | label_def_f_start |
label_def_incr_start |
C | label_def_f_start |
label_def_f_start |
D | label_def_incr_start |
label_def_incr_start |
Strategy Progression
- Representation =
Start-Label
- **Problem:** How to do run-time checks of valid args?
Yay, that was easy! How should the following behave?
def f(it):
5)
it(
def add(x, y):
+ y
x
f(incr)
Lets see what Python does:
>>> def f(it): return it(5)
>>> def add(x,y): return x + y
>>> f(add)
Traceback (most recent call last):"<stdin>", line 1, in <module>
File "<stdin>", line 1, in f
File TypeError: add() takes exactly 2 arguments (1 given)
Problem: Ensure Valid Number of Arguments?
How to make sure
f(incr)
succeeds, butf(add)
fails
With proper run-time error?
- Where does the run-time check happen?
- What information is needed for the check?
Key: Need to also store the function’s arity
- The number of arguments required by the function
Strategy Progression
Representation =
Start-Label
- Problem: How to do run-time checks of valid args?
Representation =
(Arity, Start-Label)
Attempt 2: What is the value of the parameter it
?
def f(it):
5)
it(
def incr(x):
+ 1
x
f(incr)
IDEA: Represent a function with a tuple of
(arity, function_start_label)
We can now compile a call
e(x1,...,xn)
via the following strategy:
- Evaluate the tuple
e
- Check that
e[0]
is equal ton
(else arity mismatch error) - Call the function at
e[1]
Example
Lets see how we would compile this
def f(it):
5)
it(
def incr(x):
+ 1
x
f(incr)
We need to map the variable
incr
to the tuple
(1, label_def_incr_start)
But where will we store this information?
Strategy Progression
Representation =
Start-Label
- Problem: How to do run-time checks of valid args?
Representation =
(Arity, Start-Label)
- Problem: How to map function names to tuples?
Lambda Terms Make functions just another expression!
Attempt 3: Lambda Terms
So far, we could only define functions at top-level
First-class functions are like any other expression,
Can define a function, wherever you have any other expression.
Language | Syntax |
---|---|
Haskell | \(x1,...,xn) -> e |
Ocaml | fun (x1,...,xn) -> e |
JS | (x1,...,xn) => { return e } |
C++ | [&](x1,...,xn){ return e } |
Example: Lambda Terms
We can now replace def
as:
= (lambda (it): it(5))
let f = (lambda (x): x + 1)
, incr in
f(incr)
Implementation
As always, to the details! Lets figure out:
Representation
- How to store function-tuples
Types:
- Remove
Def
- Add
lambda
toExpr
Transforms
- Update
tag
andANF
- Update
checker
- Update
compile
Implementation: Representation
Represent lambda-tuples'' or
function-tuples’’ via a special tag:
Type | LSB |
---|---|
number |
xx0 |
boolean |
111 |
pointer |
001 |
function |
101 |
In our code:
data Ty = ... | TClosure
typeTag :: Ty -> Arg
TTuple = HexConst 0x00000001
typeTag TClosure = HexConst 0x00000005
typeTag
typeMask :: Ty -> Arg
TTuple = HexConst 0x00000007
typeMask TClosure = HexConst 0x00000007 typeMask
So, Function Values represented just like a tuples
- padding,
ESI
etc. - but with tag
101
.
Crucially, we can get 0
-th, or 1
-st elements from tuple.
Question: Why not use plain tuples?
Implementation: Types
First, lets look at the new Expr
type
- No more
Def
data Expr a
= ...
| Lam [Bind a] !(Expr a) a -- fun. definitions
| App !(Expr a) [Expr a] a -- fun. calls
So we represent a function-definition as:
Lam [x1,...,xn] e
and a function call as:
App e [e1,...,en]
Transforms: Tag
This is pretty straight forward (do it yourself)
Transforms: ANF
QUIZ:
App e es) (
Does e
need to be
- A Immediate
- B ANF
- C None of the above
Transforms: ANF
QUIZ:
App e es) (
Do es
need to be
- A Immediate
- B ANF
- C None of the above
Transforms: ANF
The App
case, fun + args should be immediate
- Need the values to push on stack and make the call happen!
Just like function calls (in diamondback
), except
- Must also handle the callee-expression (named
e
below)
App e es) = (i', stitch bs (App v vs))
anf i (where
:vs) = imms i (e:es) (i', bs, v
Transforms: ANF
QUIZ:
Lam xs e) (
Does e
need to be
- A Immediate
- B ANF
- C None of the above
Transforms: ANF
The Lam
case, the body will be executed (when called)
- So we just need to make sure its in ANF (like all the code!)
Lam xs e) = (i', Lam xs e')
anf i (where
= anf i e (i', e')
Transforms: Checker
We just have Expr
(no Def
) so there is a single function:
wellFormed :: BareExpr -> [UserError]
= go emptyEnv
wellFormed where
= concatMap . go
gos Boolean {}) = ...
go _ (Number n l) = largeNumberErrors n l
go _ (Id x l) = unboundVarErrors vEnv x l
go vEnv (Prim1 _ e _) = ...
go vEnv (Prim2 _ e1 e2 _) = ...
go vEnv (If e1 e2 e3 _) = ...
go vEnv (Let x e1 e2 _) = ... ++ go vEnv e1 ++ go (addEnv x vEnv) e2
go vEnv (Tuple es _) = ...
go vEnv (GetItem e1 e2 _) = ...
go vEnv (App e es _) = ?1
go vEnv (Lam xs e _) = ?2 ++ go ?3 e go vEnv (
How shall we implement
?1
?How shall we implement
?2
?How shall we implement
?3
?
Transforms: Compiler
Finally, lets see how to convert Expr
into Asm
, two separate cases:
Lam
: definitionsApp
: calls
Transforms: Compiler: Lam
compileEnv :: Env -> AExp -> [Instruction]
Lam xs e l)
compileEnv env (= IJmp end -- Why?
: ILabel start -- Function start
: compileDecl l xs e -- Function code (like Decl)
++ ILabel end -- Function end
: lamTuple arity start -- Compile fun-tuple into RAX
where
= length xs
arity = LamStart l
start = LamEnd l end
QUESTION: Why do we start with the IJmp
?
lamTuple :: Int -> Label -> [Instruction]
lamTuple arity start= tupleAlloc 2 -- alloc tuple size = 2
++ tupleWrites [ repr arity -- fill arity
CodePtr start ] -- fill code-ptr
, ++ [ IOr (Reg RAX) (typeTag TClosure) ] -- set the tag bits
Transforms: Compiler: App
Recall! IDEA: Use a tuple of (arity, label)
We can now compile a call
e(x1,...,xn)
via the following strategy:
- Evaluate the tuple
e
- Check that
e[0]
is equal ton
(else arity mismatch error) - Call the function at
e[1]
App vE vXs)
compileEnv env (= assertType env vE TClosure -- check vE is a function
++ assertArity env vE (length vXs) -- check vE arity
++ tupleReadRaw (immArg env vE) (repr (1 :: Int)) -- load vE[1] into RAX
++ pushRegArgs rArgs -- push reg args
++ pushStackArgs sArgs -- push stack args
++ [ICall (Reg RAX)] -- call RAX
++ popStackArgs (length sArgs)
where
= splitAt 6 args (rArgs, sArgs)
A Problem: Scope
Consider the following program:
let one = 1
= (lambda (it): it(5))
, f = (lambda (n): n + one)
, inc in
f(inc)
A Problem Magnified: Dynamically Created Functions
Will it work? How about this variant:
lambda(m): n + m ----> (1, label_lam_99, [n = ???])
let addition = (lambda (a, b): a + b)
let add = (lambda (n): (lambda (m): addition(n, m)))
= (lambda (it): it(5))
, f = add(1) -- < 1, label_lam_99, [n := 1]>
, plus1 = add(10) -- < 1, label_lam_99, [n := 10]>
, plus10 in
(f(plus1), f(plus10))
add(1)
should evaluate to a function-that-adds-1add(10)
should evaluate to a function-that-adds-10
Yet, its the same code
- same arity
- same start-label
Problem: How can we represent different behaviors?
Free and Bound Variables
A variable x
is bound inside an expression e
if
x
is a let-bound variable insidee
.x
is a formal parameter ine
, OR
A variable x
is free inside an expression e
if
x
is not bound insidee
For example consider the expression e
:
lambda (m):
= m in
let t + t n
m
,t
are bound insidee
, but,n
is free insidee
Computing Free Variables
Lets write a function to compute the set of free variables.
Question Why Set ?
lambda(x): x + y
let x = e1 in e2
- free e1 + (free e2 - x)
Bin Plus (Id x) (Id y)
App (Id inc) [(Id x), (Id y), (Id z)]
freeVars :: Expr -> [Id]
= S.toList (go e)
freeVars e where
go :: Expr -> S.Set Id
Id x) = S.singleton x
go (Number _) = S.empty
go (Boolean _) = S.empty
go (If e e1 e2) = S.unions [go e, go e1, go e2]
go (App e es) = S.unions (go e : map go es)
go (Let x e1 e2) = S.union (go e1) (S.minus (go e2) x)
go (Lam xs e) = S.difference (go e) xs go (
TODO-IN-CLASS
Free Variables and Lambdas
Free Variables of a lambda
Those whose values come from outside
Should use the same values whenever we “call” the
lambda
.
For example:
let add = (lambda (n): (lambda (m): n + m))
= (lambda (it): it(5))
, f = add(1)
, plus1 = add(10)
, plus10 in
20)) (f(plus1), f(plus10), plus10(
should evaluate to (6, 15, 30)
plus1
be likelambda (m): 1 + m
plus10
be likelambda (m): 10 + m
Achieving Closure
(Recall from CSE 130)
let add = (lambda (n): (lambda (m): n + m))
= (lambda (it): it(5))
, f = add(1)
, plus1 = add(10)
, plus10 in
20)) (f(plus1), f(plus10), plus10(
should evaluate to (6, 15, 30)
plus1
be likelambda (m): 1 + m
plus10
be likelambda (m): 10 + m
Key Idea: Each function value must store its free variables
represent plus1
as:
(arity, code-label, [n := 1])
represent plus10
as:
(arity, code-label, [n := 10])
Same code, but different free variables.
Strategy Progression
Representation =
Start-Label
- Problem: How to do run-time checks of valid args?
Representation =
(Arity, Start-Label)
- Problem: How to map function names to tuples?
Lambda Terms Make functions just another expression!
- Problem: How to store local variables?
Function Value
(Arity, Start-Label, Free_1, ... , Free_N)
- Ta Da!
Closures: Strategy
What if we have multiple free variables?
= (lambda (x, y):
let foo lambda (z): x + y + z)
(
)= foo(4, 6)
, plus10 = foo(7, 13)
, plus20 in
0), plus20(100)) (plus10(
represent plus10
as:
(arity, code-label, [x := 4], [y := 6])
represent plus20
as:
(arity, code-label, [x := 7], [y := 13])
Example
Lets see how to evaluate
= (lambda (x, y):
let foo lambda (z): x + y + z)
(
)= foo(4, 6)
, plus10 in
0) plus10(
Example
Lets see how to evaluate
= (lambda (x, y):
let foo lambda (z): x + y + z)
(
)= foo(4, 6)
, plus10 = (lambda (it): it(5))
, f in
f(plus10)
Implementation
Representation
- How to store closures
Types:
- Same as before
Transforms
Update
tag
andANF
- as before
Update
checker
Update
compile
Representation
We can represent a closure as a tuple of
(arity, code-ptr, free-var-1, ... free-var-N)
which means, following the convention for tuples, as:
------------------------------------------------
| N + 2 | arity | code-ptr | var1 | ... | varN |
------------------------------------------------
Where each cell represents 64-bits / 8-bytes / 1-(double)word.
Note: (As with all tuples) the first word contains the #elements of the tuple.
- In this case
N + 2
Transforms: Checker
What environment should we use to check a Lam
body ?
wellFormed :: BareExpr -> [UserError]
= go emptyEnv
wellFormed where
...
Lam xs e _) = errDupParams xs
go vEnv (++ go ?vEnv e
addsEnv :: Env -> [BareBind] -> Env
= foldr addEnv env xs addsEnv env xs
QUIZ How shall we implement ?vEnv
?
A. addsEnv vEnv []
B. addsEnv vEnv xs
C. addsEnv emptyEnv xs
Transforms: Compile
Question How does the called function know the values of free vars?
- Needs to restore them from closure tuple
- Needs to access the closure tuple!
… But how shall we give the called function access to the tuple?
By passing the tuple as an extra parameter
Transforms: Compile
Calls App
(as before)
- Push closure-pointer + parameters
- Call code-label
- Pop closure-pointer + params
Definitions Lam
- Compute free-vars
x1
,…,xn
- Generate code-block
- Restore free vars from closure-pointer-parameter New
- Execute function body (as before)
- Allocate tuple
(arity, code-label, x1, ... , xn)
Transforms: Compile Definitions
- Compute free-vars
y1
,…,yn
- Generate code-block
- Restore free vars from closure-pointer-parameter
- Execute function body (as before)
- Allocate tuple
(arity, code-label, y1, ... , yn)
compileEnv :: Env -> AExp -> [Instruction]
Lam xs e l)
compileEnv env (= IJmp end -- Why?
: ILabel start -- Function start
: lambdaBody ys xs e -- Function code (like Decl)
++ ILabel end -- Function end
: lamTuple arity start env ys -- Compile closure-tuple into RAX
where
= freeVars (Lam xs e l)
ys = length xs
arity = LamStart l
start = LamEnd l end
Creating Closure Tuples
To create the actual closure-tuple we need
- the free-variables
ys
- the
env
from which to values of the free variables.
lamTuple :: Int -> Label -> Env -> [Id] -> [Instruction]
lamTuple arity start env ys= tupleAlloc (2 + length ys) -- alloc tuple 2 + |ys|
++ tupleWrites ( repr arity -- fill arity
: CodePtr start -- fill code-ptr
: [immArg env (Id y) | y <- ys] ) -- fill free-vars
++ [ IOr (Reg RAX) (typeTag TClosure) ] -- set the tag bits
Generating Code Block
lambdaBody :: [Id] -> [Id] -> AExp -> [Instruction]
=
lambdaBody ys xs e -- 1. setup stack frame RBP/RSP
funEntry n ++ copyArgs xs' -- 2. copy parameters to stack slots
++ restore nXs ys -- 3. copy (closure) free vars to stack slots
++ compileEnv env body -- 4. execute 'body' with result in RAX
++ funExit n -- 5. teardown stack frame & return
To restore ys
we use the closure-ptr passed in at [RDI]
– the special first parameter – to copy the free-vars ys
onto the stack.
restore :: Int -> [Id] -> [Instruction]
=
restore base ys concat [ copy i | (_, i) <- zip ys [1..]]
where
= Reg RDI
closV = tupleReadRaw closV (repr (i+1)) -- copy tuple-fld for y into RAX...
copy i ++ [ IMov (stackVar (base+i)) (Reg RAX) ] -- ...write RAX into stackVar for y
A Problem: Recursion
Oops, how do we write:
def fac(n):
if (n > 1):
* fac(n-1)
n else:
1
5) fac(
If we try
= (lambda (n):
let fac if (n < 1):
1
else:
* fac(n-1))
n in fac(5)
We get a variable unbound error!
Errors found!
tests/input/fac-bad.fdl:5:20-23: Unbound variable 'fac'
5| n * fac(n-1))
We need to teach our compiler that its ok to use the name fac
inside the body!
Solution: Named Functions
We have a new form of named functions which looks like this:
def fac(n):
if (n < 1):
1
else:
* fac(n - 1)
n in
5) fac(
Representing Named Functions
We extend Expr
to handle such functions as:
data Expr a
= ...
| Fun (Bind a) -- ^ name of function
Bind a] -- ^ list of parameters
[Expr a) a -- ^ body of function (
Note that we parse the code
def foo(x1,...,xn):
ein
' e
as the Expr
Let foo (Fun foo [x1,...,xn] e) e'
Compiling Named Functions
Mostly, this is left as an exercise to you
Non-Recursive functions
- i.e.
f
does not appear insidee
inFun f xs e
- Treat
Fun f xs e
asLam xs e
… - … Everything should just work.
Recursive
- i.e.
f
does appear insidee
inFun f xs e
- Can you think of a simple tweak to the
Lam
strategy that works?
Recap: Functions as Values
We had functions, but they were second-class entities…
Now, they are first-class values
- passed around as parameters
- returned from functions
- stored in tuples etc.
How?
Representation =
Start-Label
- Problem: How to do run-time checks of valid args?
Representation =
(Arity, Start-Label)
- Problem: How to map function names to tuples?
Lambda Terms Make functions just another expression!
- Problem: How to store local variables?
Function Value
(Arity, Start-Label, Free_1, ... , Free_N)
- Ta Da!
Next: Adding garbage collection
- Reclaim! Heap memory that is no longer in use
Next: Adding static type inference
- Faster! Gets rid of those annoying (and slow!) run-time checks
- Safer! Catches problems at compile-time, when easiest to fix!