Virtual Machine experiment
This is an experiment in building, comparing, and optimizing a miniature assembler and VM with the following (sort of, but less and less, minimalistic) instructions:
Immediate operand instructions:
LoadI, AddI, SubI, MulI, DivI, ModI, ShiftI, AndI (though they can also load the relative address of a label as value)
Relative address based instructions:
LoadR, AddR, SubR, MulR, DivR, StoreR, JNZ (jump if not equal to 0), JNEG (jump if negative), JPOS (jump if positive or 0), JumpR (unconditional jump), IncrR i addr (increments, or decrements if i is negative, the value at addr by i and loads the result in the accumulator)
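As an illustration of the IncrR description above, here is a minimal Go sketch. The vm struct, its field names, and the choice to resolve the offset relative to the current program counter are assumptions made for this example, not the project's actual implementation.

```go
package main

import "fmt"

// vm is a hypothetical, stripped-down machine state: a flat word memory,
// an accumulator, and a program counter.
type vm struct {
	mem []int64
	acc int64
	pc  int
}

// incrR sketches "IncrR i addr": add i to the word at the relative address
// and load the result into the accumulator.
func (m *vm) incrR(i int64, rel int) {
	target := m.pc + rel // assumption: offsets are relative to the current instruction
	m.mem[target] += i   // a negative i decrements
	m.acc = m.mem[target]
}

func main() {
	m := &vm{mem: []int64{0, 0, 10}, pc: 0}
	m.incrR(-3, 2)
	fmt.Println(m.acc, m.mem[2])
}
```

Combined with JNZ, an instruction of this shape can drive a countdown loop: decrement a counter in memory, then jump back while the accumulator is non-zero.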
Stack-oriented instructions let the VM manage simple call frames:
- Call pushes the return address, and Ret unwinds the stack (optionally dropping extra entries).
- Push/Pop move the accumulator to and from the stack while reserving or discarding extra slots.
- LoadS, StoreS, AddS, SubS, MulS, DivS, and IncrS read and write relative to the current stack pointer, so stack-resident variables can be manipulated without touching memory directly.
- SysS mirrors Sys but uses a stack index operand for its first argument.
- IdivS divides the stack location by the accumulator and keeps the remainder in A.
- StoreSB stores a single byte from the accumulator into a stack-resident buffer: the first operand specifies the base stack offset of the target word span, while the second operand indicates a stack slot containing the byte offset (which can be more than 8). The handler computes the word/bit position and patches the selected byte in place. It is handy for building packed str8 buffers on the stack (see programs/itoa.asm).
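The word/bit position computation that StoreSB performs can be sketched in Go as follows. The function name storeByte, the use of a plain slice in place of the VM stack, and the byte order (byte 0 in the least significant bits of word 0) are assumptions for illustration; the real handler may lay out bytes or stack slots differently.

```go
package main

import "fmt"

// storeByte patches one byte inside a span of 64-bit words.
// byteOffset may be 8 or more; it then selects a later word in the span.
// Assumption: byte 0 is the least significant byte of word 0.
func storeByte(span []uint64, byteOffset int, b byte) {
	word := byteOffset / 8           // which 64-bit word holds the byte
	shift := uint(byteOffset%8) * 8  // bit position of the byte within that word
	span[word] &^= uint64(0xFF) << shift // clear the old byte
	span[word] |= uint64(b) << shift     // write the new byte
}

func main() {
	buf := make([]uint64, 2)
	storeByte(buf, 0, 2)    // a length byte, as str8 would use
	storeByte(buf, 1, 'h')  // first data byte
	storeByte(buf, 9, 0xAB) // offset past 8 lands in the second word
	fmt.Printf("%#x %#x\n", buf[0], buf[1])
}
```

Clearing the target byte before OR-ing in the new value is what makes the patch safe to repeat on a buffer that already holds data.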
- String quoting uses the Go rules (i.e. "double-quotes" with \ sequences, single 'x' for 1 character, or backticks for verbatim)
- str8: 1 byte of size followed by the data (so strings of 7 bytes or less fit in 1 word, and longer ones are chunked into 8-byte words)
- Data can also be just bytes packed into 64-bit words (see ReadN/WriteN below for instance)
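To make the str8 layout concrete, here is a small Go sketch that packs a string into 64-bit words with a leading size byte. The helper name packStr8 and the little-endian placement of bytes within each word are assumptions for this example; only the "1 size byte, then data, chunked by 8 bytes" shape comes from the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// packStr8 encodes s as a size byte followed by its data, packed into
// 64-bit words. Assumption: byte 0 sits in the lowest 8 bits of word 0.
func packStr8(s string) ([]uint64, error) {
	if len(s) > 255 {
		return nil, errors.New("str8 holds at most 255 bytes")
	}
	raw := append([]byte{byte(len(s))}, s...) // size byte, then the data
	words := make([]uint64, (len(raw)+7)/8)   // round up to whole words
	for i, b := range raw {
		words[i/8] |= uint64(b) << (uint(i%8) * 8)
	}
	return words, nil
}

func main() {
	short, _ := packStr8("1234567") // 1 + 7 bytes: exactly one word
	long, _ := packStr8("12345678") // 1 + 8 bytes: spills into a second word
	fmt.Println(len(short), len(long))
}
```

The single size byte is also why Read8 and Write8 are limited to 255 bytes of payload.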
Sys: 8-bit call id (lowest byte), 48 remaining bits as (first) argument to the syscall:
- Exit (1) exits with the value from the argument.
- Read8 (2) reads from fd (0 == stdin) up to A bytes into the param address/stack buffer in str8 format (so max 255 bytes).
- Write8 (3) writes a str8 to fd (1 == stdout). In the SysS variant the accumulator is a byte offset from the passed stack offset. In the Sys one, A is ignored unless the parameter is 0, in which case A is the address to use for the location of the str8 (see an example in echo.asm).
- ReadN (4) reads from fd up to A bytes into the param address/stack buffer (no limit outside of the underlying read syscall and memory, as this returns the length and does not write a str8 length byte first).
- WriteN (5) writes A bytes to fd from memory pointed to by the operand.
- Sleep (6) takes its argument in milliseconds.
- Open (7) opens a file and returns the fd in A. Takes flags (low 8 bits of param) and path address (high 48 bits). In the Sys variant with param=0, uses A as the path address. Flags are platform-specific (e.g., O_RDONLY=0).
- Close (8) closes a file descriptor. Fd in A. Returns 0 on success, -1 on error.
- ReadF (9) reads A bytes from fd (low 8 bits of param) to address (high 48 bits). Similar to ReadN but the fd comes from param instead of being fixed.
- WriteF (10) writes A bytes to fd (low 8 bits of param) from address (high 48 bits). Similar to WriteN but the fd comes from param instead of being fixed.
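The Sys operand layout (call id in the lowest byte, 48 bits of argument above it) can be sketched in Go as below. The function names encodeSys and decodeSys, the exact bit positions of the argument, and the sign extension on decode are assumptions for illustration; only the call id values (Exit = 1, Write8 = 3) are taken from the list above.

```go
package main

import "fmt"

// Call ids taken from the syscall list.
const (
	sysExit   = 1
	sysWrite8 = 3
)

// argMask keeps the low 48 bits of the argument.
const argMask = (uint64(1) << 48) - 1

// encodeSys packs the call id into the lowest byte and the argument into
// the next 48 bits. Assumption: the argument sits directly above the call id.
func encodeSys(callID uint8, arg int64) uint64 {
	return (uint64(arg)&argMask)<<8 | uint64(callID)
}

// decodeSys splits an operand back into call id and sign-extended argument.
func decodeSys(operand uint64) (uint8, int64) {
	callID := uint8(operand)       // lowest byte
	arg := int64(operand<<8) >> 16 // move bit 55 to bit 63, then shift back with sign
	return callID, arg
}

func main() {
	op := encodeSys(sysWrite8, 1) // Write8 to fd 1 (stdout)
	id, arg := decodeSys(op)
	fmt.Printf("%#x %d %d\n", op, id, arg)
}
```

The arithmetic right shift in decodeSys is one way to let a negative argument, such as an exit value of -1, survive the round trip.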
The assembler uses space-separated arguments and allows basic expressions (e.g. foo<<4+3; see the Operator constants for the list). Note that there are limitations: for instance -2*3 works fine but 2*-3 does not, so for negative operands move the sign to the front of the expression.
- label: (e.g. foo: on its own line) defines a label for relative addressing instructions.
- .const defines a constant that can be used in instructions later (must be defined before use, unlike labels)
- data for a 64-bit word
- str8 for a string (with double or backtick quotes)
- on a line preceding an instruction: label + : defines a label for the *R instructions (relative address calculation). A label starts with a letter.
- .space for multiple 0-initialized 64-bit words
- Var v1 v2 ... virtual instruction that generates a Push instruction with the number of identifiers provided and defines labels for said variables starting at 0 (the first starts with the value of the accumulator while the rest start 0-initialized).
- Param p1 p2 ... virtual instruction that generates stack labels for p1, p2 as offsets from before the return PC (i.e. parameters pushed (via Var or Push) by the caller before calling Call)
- Return virtual instruction that generates a Ret n where n is chosen so that the Var push is undone.
When a VM program starts, the host initializes the stack with command-line arguments:
- Argument addresses are pushed onto the stack in reverse order (so argv[0] is deepest, argv[N-1] is highest)
- The argument count (argc) is pushed on top
- The stack pointer points at argc
To access arguments, a program typically:
POP 0 ; Pop argc into accumulator
StoreR argc ; Store it for later use
; Now pop each argument address and process it

See echo.asm for a complete example that iterates through all arguments and prints them one per line (and make echo-test as an example to run).
Compares Go, TinyGo, and C-based VMs (with a plain C loop for reference).
See Makefile / run make
Binary release of the Go version also available in releases/ or via
go install grol.io/vm@latest (and homebrew and docker as well)