Currently, each part of the opcode is decoded on demand. We should try decoding everything upfront, when the opcode type is constructed, to see whether the reduced work during actual instruction processing helps performance.
If it does, we can build a better/smarter CPU cache.
Relies on #26
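
A rough sketch of the two approaches, for comparison. Everything here is an assumption made for illustration: the 16-bit layout (4-bit `kind`, 4-bit `x`, 4-bit `y`, 8-bit `imm`), the names `LazyOpcode`, `EagerOpcode`, and `decode_cached`, and the use of Python. None of it is taken from the actual opcode type in this repo.

```python
from dataclasses import dataclass
from functools import lru_cache


class LazyOpcode:
    """On-demand decoding: every field access redoes the shift and mask."""

    def __init__(self, raw: int) -> None:
        self.raw = raw & 0xFFFF

    @property
    def kind(self) -> int:
        return (self.raw >> 12) & 0xF

    @property
    def x(self) -> int:
        return (self.raw >> 8) & 0xF

    @property
    def y(self) -> int:
        return (self.raw >> 4) & 0xF

    @property
    def imm(self) -> int:
        return self.raw & 0xFF


@dataclass(frozen=True)
class EagerOpcode:
    """Upfront decoding: fields are computed once, at construction."""

    raw: int
    kind: int
    x: int
    y: int
    imm: int

    @classmethod
    def decode(cls, raw: int) -> "EagerOpcode":
        raw &= 0xFFFF
        return cls(
            raw=raw,
            kind=(raw >> 12) & 0xF,
            x=(raw >> 8) & 0xF,
            y=(raw >> 4) & 0xF,
            imm=raw & 0xFF,
        )


# Hypothetical follow-up: because an eagerly decoded opcode is immutable,
# it can be cached by its raw value and reused across executions.
@lru_cache(maxsize=None)
def decode_cached(raw: int) -> EagerOpcode:
    return EagerOpcode.decode(raw)
```

Whether the eager version actually wins depends on how many fields each instruction handler reads, so it should be benchmarked against the on-demand version on a real workload before committing to it.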