In this post I am going to write a byte-code interpreter capable of executing the bytecode of this simple function.
I am using Python 2.7.1, the bytecode can be different for different versions of Python. The aim is not to exactly reproduce the way CPython works in every detail, but to understand the concepts.
def test(): a = 2 b = a + 4 return (a + 1) * (b - 2)
CPython uses a stack-based bytecode. This means that inputs and intermediary results for evaluating an expression are pushed on a stack. Most operations in the bytecode will operate on the top one or two items on the stack. In my last post i demonstrated how you can look at the bytecode of test().
The bytecode for the first line looks like this:
0x64 0x0100 0x7D 0x0000
Which is interpreted as
LOAD_CONST(1) # push(code.co_consts) STORE_FAST(0) # locals = pop()
The interpreter will need to process each bytecode (line in the disassembly) sequentially. It will need to keep track of the current instruction to be executed. This is done using a variable which keeps the index of the current instruction in the code. I call this index PC for program counter.
Then it will also need to keep track of the values of all local variables. The code object has the field co_nlocals which tells us how many local variables we need to reserve space for. So far the names of the local variables do not matter, the locals are accessed by a number in the bytecode.
And of course the interpreter needs the stack where all the calculations and shoveling around of values happen. For implementing the stack I will use an ordinary list, .append() for pushing an item, [-1] for accessing the top of stack and .pop() for popping the top of stack.
Even though this is not needed now, I define a frame object to hold all these objects defining the state of the interpreter, as well as a reference to the code object being executed. This will prove useful when we will be implementing function calls in the future. Right now all it does is initialize the PC, reserve space for the locals, and create an empty stack.
class Frame(object): def __init__(self, code): self.locals = [None] * code.co_nlocals self.PC = 0 self.stack =  self.code = code
The core part of the interpreter is a loop executing one bytecode after another. At the beginning of each loop the variable bc is assigned the next opcode to be performed, i.e. the code at the position PC. Note that I use the frame object to hold all the state of the interpreter. In the complete source you can see that I obtain the numeric values for the various opcodes from the dis module.
def execute(code): f = Frame(code) while True: bc = ord(f.code.co_code[f.PC]) if bc==LOAD_FAST: # [...] elif bc==LOAD_CONST: # [...] else: raise Exception('Unknown Opcode %d (%s)' % (bc, dis.opname[bc]))
An exception is raised in case an opcode is encountered, that I did not implement. That way it is easy to see what is still missing when applying the interpreter to more and more complex test functions.
Let’s now look into the implementation of each opcode:
LOAD_FAST and LOAD_CONST are pretty similar. Both take a two-byte argument stored after the opcode at positions PC+1 and PC+2. The arguments is an index into either the locals (stored in the frame) or the constants (stored in the code object). The value is pushed onto the stack and then the program counter is advanced to the next opcode. Both instructions are three bytes wide.
if bc==LOAD_FAST: arg = ord(f.code.co_code[f.PC+1]) + 256*ord(f.code.co_code[f.PC+2]) f.stack.append(f.locals[arg]) f.PC += 3 elif bc==LOAD_CONST: arg = ord(f.code.co_code[f.PC+1]) + 256*ord(f.code.co_code[f.PC+2]) f.stack.append(f.code.co_consts[arg]) f.PC += 3
Writing to a local variable is also pretty similar. The value written is popped off the stack.
elif bc==STORE_FAST: arg = ord(f.code.co_code[f.PC+1]) + 256*ord(f.code.co_code[f.PC+2]) f.locals[arg] = f.stack.pop() f.PC += 3
The binary operations pop two items off the stack and push the result. Be careful about the order in which the operands are expected on the stack. Because these instructions take no argument, the PC needs only to be incremented by one. (I only show subtraction, addition and multiplication work analogously.)
elif bc==BINARY_SUBTRACT: b = f.stack.pop() a = f.stack.pop() f.stack.append(a-b) f.PC += 1
Finally the result of the function needs to be returned to the caller. The RETURN_VALUE instruction expects the value to be returned on the top of the stack. So far there is only one function, the top-level test(). Therefore we can halt the interpreter when reaching the return instruction. The value returned from the test function is returned to the caller of the interpreter in the “normal” python world.
elif bc==RETURN_VALUE: return f.stack.pop()
In the future the interpreter is going to implement function calls and we will have to revise the implementation of RETURN_VALUE as it will no longer be the end of execution.
That’s it! Lets test our implementation, both the CPython interpreter on the first line and the newly developed execute() need to give the same result: 12
print 'normal Python call:', test() print 'own execute function:', execute(test.func_code)
I have created a GitHub repository for this project. See the revision for this post on GitHub.