Lets extend the interpreter with the capability to call into a function. Unlike the builtin function
range in the last post, I would like also the called function to be interpreted in my interpreter and not just being executed in the host Python.
def test(): return square(4) + square(3) def square(n): return n * n
In order to simplify the testing I do not want to adapt the globals directory every time I change the test function. Fortunately every function keeps a reference to the global environment it is defined in, so I will just tell the execute function to use the appropriate globals of the test function.
print 'own execute function:', execute(test.func_code, test.func_globals)
The called function will need its own space to run in, e.g. to store the local variables. Also the program counter for the called function must not overwrite the program counter in the calling function. Now it comes handy that all the execution state is stored in a frame object: we can simply create a clean new frame for the called function.
Then we simply point our variable f, holding the executed frame to the newly created frame for transferring control to the called function.
elif bc==CALL_FUNCTION: [...] elif type(callee) is types.FunctionType: subframe = Frame(callee.func_code, callee.func_globals) subframe.locals[:arg] = call_args f = subframe
Note that the functions arguments are just copied into the locals of the called function. In fact (ignoring any * or ** magic) the functions arguments are always the first local variables.
Lets try to execute the test:
normal Python call: 25 own execute function: 16
What happened? Apparently we see only the result of the first function call square(4). The interpreter has just quit too early – upon seeing the first return statement.
Of course when we are calling into functions we also need to transfer control back to the calling function when the called function returns. At the moment we just lose the calling frame when we transfer control to the subframe. For being able to return control to the caller we first need to keep the calling frame. We will give the frame object a reference to the caller.
class Frame(object): def __init__(self, code, globals, caller): self.locals = [None] * code.co_nlocals self.PC = 0 self.stack =  self.globals = globals self.code = code self.caller = caller
The first function executed by our interpreter is not called from inside the interpreter, so for this first frame we set the caller to None.
f = Frame(code, globals, None)
In the CALL_FUNCTION implementation we keep the reference to the calling frame and we also update the calling frames program counter to point to the next instruction before transferring control, so everything will be ready when we decide to return.
elif bc==CALL_FUNCTION: [...] elif type(callee) is types.FunctionType: subframe = Frame(callee.func_code, callee.func_globals, f) subframe.locals[:arg] = call_args f.PC += 3 f = subframe
Now everything is ready for the return statement to be implemented in the RETURN_VALUE bytecode. For this we check if we have a calling frame and if so we push the returned value on the calling frame and transfer control back.
elif bc==RETURN_VALUE: if f.caller is None: return f.stack.pop() else: ret = f.stack.pop() f = f.caller f.stack.append(ret)
And finally we get the desired result.
normal Python call: 25 own execute function: 25
I have created a GitHub repository for this project. See the revision for this post on GitHub.
A note on the value stack:
In my implementation each frame has got its own stack, so a called function has no chance of corrupting the callers stack. If the called function guarantees not to alter the bottom of the stack and also to pop off all values it pushed, then the same stack could be shared between the caller and the callee.
In this hobby / educational implementation I prefer the seperate approach as it is probably more robust and easier to debug.