Pre:

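Every time a transaction is processed, a new EVM object is created to handle it.
Internally, the EVM object relies on three main objects:

the interpreter, Interpreter
the VM configuration object, vm.Config
the Ethereum state database, StateDB

This time we start with the source code of the interpreter object.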

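The EVM interpreter object:

The interpreter object EVMInterpreter interprets and executes the instructions of a given contract.
Note, however, that the instructions are not actually executed by the interpreter object itself but by the operation objects in vm.Config.JumpTable.
The interpreter object only decodes the bytecode one instruction at a time, fetches the corresponding operation object, and checks the stack and related state before calling operation.execute, the function that really executes the instruction.

In other words, the interpreter object is only responsible for scheduling the interpretation.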
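The split between dispatching and executing can be sketched as follows. This is a minimal illustration, not the geth definitions: the operation struct here has only three fields, the stack is a plain slice of uint64, and the dispatch function and newTable helper are hypothetical names introduced for this example.

```go
package main

import "fmt"

// OpCode is a single byte, as in the EVM.
type OpCode byte

const (
	STOP OpCode = 0x00
	ADD  OpCode = 0x01
)

// operation bundles what the dispatcher needs to know about one opcode.
// Simplified for illustration; the real struct has more fields.
type operation struct {
	execute     func(stack *[]uint64) // does the actual work
	constantGas uint64
	minStack    int // stack items required before execution
}

// JumpTable maps every possible byte value to an operation (nil = undefined).
type JumpTable [256]*operation

// newTable builds a hypothetical two-entry table.
func newTable() *JumpTable {
	var t JumpTable
	t[STOP] = &operation{execute: func(*[]uint64) {}}
	t[ADD] = &operation{
		execute: func(s *[]uint64) {
			n := len(*s)
			(*s)[n-2] += (*s)[n-1]
			*s = (*s)[:n-1]
		},
		constantGas: 3,
		minStack:    2,
	}
	return &t
}

// dispatch is the interpreter's share of the work: look up, check, delegate.
func dispatch(t *JumpTable, op OpCode, stack *[]uint64) error {
	oper := t[op]
	if oper == nil {
		return fmt.Errorf("undefined opcode 0x%02x", byte(op))
	}
	if len(*stack) < oper.minStack {
		return fmt.Errorf("stack underflow: have %d, need %d", len(*stack), oper.minStack)
	}
	oper.execute(stack)
	return nil
}

func main() {
	stack := []uint64{3, 4}
	err := dispatch(newTable(), ADD, &stack)
	fmt.Println(stack, err) // [7] <nil>
}
```

The dispatcher never adds anything itself; it only decides whether the operation is allowed to run.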

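Creating the EVM interpreter object:

The main steps of the NewEVMInterpreter() function:

Populate the cfg.JumpTable field with the instruction set that matches the active Ethereum version
Apply the EIPs listed in cfg.ExtraEips to the jump table, removing from the list any that fail to activate
Build an EVMInterpreter object and return it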

core/vm/interpreter.go

// NewEVMInterpreter returns a new instance of the Interpreter.
// Creates the interpreter.
func NewEVMInterpreter(evm *EVM, cfg Config) *EVMInterpreter {
	// If jump table was not initialised we set the default one.
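	// This is where each opcode is bound to its implementing function.
	// The switch exists because the opcode set changed as Ethereum went through successive versions.
	// When we port this code, only the latest opcode table will be kept here.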
	if cfg.JumpTable == nil {
		switch {
		// Populate cfg.JumpTable with the instruction set for the active Ethereum version.
		case evm.chainRules.IsMerge:
			cfg.JumpTable = &mergeInstructionSet
		case evm.chainRules.IsLondon:
			cfg.JumpTable = &londonInstructionSet
		case evm.chainRules.IsBerlin:
			cfg.JumpTable = &berlinInstructionSet
		case evm.chainRules.IsIstanbul:
			cfg.JumpTable = &istanbulInstructionSet
		case evm.chainRules.IsConstantinople:
			cfg.JumpTable = &constantinopleInstructionSet
		case evm.chainRules.IsByzantium:
			cfg.JumpTable = &byzantiumInstructionSet
		case evm.chainRules.IsEIP158:
			cfg.JumpTable = &spuriousDragonInstructionSet
		case evm.chainRules.IsEIP150:
			cfg.JumpTable = &tangerineWhistleInstructionSet
		case evm.chainRules.IsHomestead:
			cfg.JumpTable = &homesteadInstructionSet
		default:
			cfg.JumpTable = &frontierInstructionSet
		}
		for i, eip := range cfg.ExtraEips {
			copy := *cfg.JumpTable
			if err := EnableEIP(eip, &copy); err != nil {
				// Disable it, so caller can check if it's activated or not
				// Remove the failed EIP from cfg.ExtraEips.
				cfg.ExtraEips = append(cfg.ExtraEips[:i], cfg.ExtraEips[i+1:]...)
				log.Error("EIP activation failed", "eip", eip, "error", err)
			}
			cfg.JumpTable = &copy
		}
	}

	return &EVMInterpreter{
		evm: evm,
		cfg: cfg,
	}
}
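The order of the cases in the switch above matters: the newest fork is tested first because a chain on which a later fork is active also has every earlier fork active. A minimal sketch of that selection logic, using a hypothetical rules struct with only three flags and returning names instead of instruction sets:

```go
package main

import "fmt"

// rules is a hypothetical stand-in for the chain rules consulted above.
// An active fork implies that all earlier forks are active as well.
type rules struct {
	IsHomestead, IsByzantium, IsLondon bool
}

// pickInstructionSet returns the name of the newest active fork.
// Testing the flags oldest-first would always stop at the oldest one.
func pickInstructionSet(r rules) string {
	switch {
	case r.IsLondon:
		return "london"
	case r.IsByzantium:
		return "byzantium"
	case r.IsHomestead:
		return "homestead"
	default:
		return "frontier"
	}
}

func main() {
	london := rules{IsHomestead: true, IsByzantium: true, IsLondon: true}
	fmt.Println(pickInstructionSet(london))                     // london
	fmt.Println(pickInstructionSet(rules{IsHomestead: true})) // homestead
	fmt.Println(pickInstructionSet(rules{}))                    // frontier
}
```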

The key method of EVMInterpreter is Run.

Interpreter.Run():

Initializing the intermediate variables used by the execution loop:

core/vm/interpreter.go

// Increment the call depth which is restricted to 1024
// Increment the call depth; the EVM call depth may not exceed 1024.
in.evm.depth++
defer func() { in.evm.depth-- }()

// Make sure the readOnly is only set if we aren't in readOnly yet.
// This also makes sure that the readOnly flag isn't removed for child calls.
// Set readOnly to true, and restore it when this call returns.
if readOnly && !in.readOnly {
	in.readOnly = true
	defer func() { in.readOnly = false }()
}

// Reset the previous call's return data. It's unimportant to preserve the old buffer
// as every returning call will return new data anyway.
// Reset the return data of the previous call. Keeping the old buffer is unnecessary, because every call that returns produces new data.
in.returnData = nil

// Don't bother with the execution if there's no code.
// Return immediately if the contract code is empty.
if len(contract.Code) == 0 {
	return nil, nil
}

var (
	op          OpCode        // current opcode
	mem         = NewMemory() // bound memory; a fresh memory for this call
	stack       = newstack()  // local stack; a fresh stack for this call
	callContext = &ScopeContext{
		Memory:   mem,
		Stack:    stack,
		Contract: contract,
	}
	// For optimisation reason we're using uint64 as the program counter.
	// It's theoretically possible to go above 2^64. The YP defines the PC
	// to be uint256. Practically much less so feasible.
	pc   = uint64(0) // program counter, starting at zero
	cost uint64
	// copies used by tracer
	pcCopy  uint64 // needed for the deferred EVMLogger
	gasCopy uint64 // for EVMLogger to log gas remaining before execution
	logged  bool   // deferred EVMLogger should ignore already logged steps
	res     []byte // result of the opcode execution function
)
// Don't move this deferred function, it's placed before the capturestate-deferred method,
// so that it get's executed _after_: the capturestate needs the stacks before
// they are returned to the pools
defer func() {
	returnStack(stack)
}()
contract.Input = input

if in.cfg.Debug {
	defer func() {
		if err != nil {
			if !logged {
				in.cfg.Tracer.CaptureState(pcCopy, op, gasCopy, cost, callContext, in.returnData, in.evm.depth, err)
			} else {
				in.cfg.Tracer.CaptureFault(pcCopy, op, gasCopy, cost, callContext, in.evm.depth, err)
			}
		}
	}()
}
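The depth counter and the readOnly flag in the setup code above both rely on defer to undo their effect when the call returns. The sketch below isolates that bookkeeping; the interp type and its run method are hypothetical, and body stands in for the opcode loop. It shows that a nested call cannot clear a readOnly flag set by an outer call, because only the frame that set the flag registers the deferred reset.

```go
package main

import "fmt"

// interp keeps only the two fields relevant to this sketch.
type interp struct {
	depth    int
	readOnly bool
}

// run mimics the bookkeeping at the top of Run.
// body is a placeholder for the opcode loop and may start nested calls.
func (in *interp) run(readOnly bool, body func()) {
	in.depth++
	defer func() { in.depth-- }()

	// Only the frame that switches readOnly on is responsible for switching it off.
	if readOnly && !in.readOnly {
		in.readOnly = true
		defer func() { in.readOnly = false }()
	}
	body()
}

func main() {
	in := &interp{}
	in.run(true, func() { // outer read-only call
		in.run(false, func() { // nested ordinary call
			fmt.Println(in.depth, in.readOnly) // 2 true
		})
		fmt.Println(in.depth, in.readOnly) // 1 true
	})
	fmt.Println(in.depth, in.readOnly) // 0 false
}
```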

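Entering the main loop:

Fetch one opcode at position pc.
Look up the operation object for that opcode in the JumpTable.
Check that the items on the stack satisfy the operation's requirements.
Deduct the instruction's constant gas cost from the contract's remaining gas; if there is not enough, return ErrOutOfGas immediately.
Compute the memory size the instruction needs, rounded up to whole 32-byte words.
Every opcode that uses memory dynamically also has a dynamic gas cost; deduct it, and return ErrOutOfGas if there is not enough gas.
Expand memory to the new size if necessary.
Execute the operation.
Handle the operation's return value.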

core/vm/interpreter.go

// The Interpreter main run loop (contextual). This loop runs until either an
// explicit STOP, RETURN or SELFDESTRUCT is executed, an error occurred during
// the execution of one of the operations or until the done flag is set by the
// parent context.
// The interpreter's main loop: it runs until an explicit STOP, RETURN, or SELFDESTRUCT is executed, any error occurs, or the done flag is set by the parent context.
for {
	if in.cfg.Debug {
		// Capture pre-execution values for tracing.
		// Record the values as they are before execution, for use by the tracer.
		logged, pcCopy, gasCopy = false, pc, contract.Gas
	}
	// Get the operation from the jump table and validate the stack to ensure there are
	// enough stack items available to perform the operation.
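	// Fetch the opcode at position pc in the contract's bytecode.
	// An opcode is one EVM instruction; there are at most 256 of them, so each fits in exactly one byte.
	op = contract.GetOp(pc)
	// Look up the operation object for this opcode in the JumpTable.
	operation := in.cfg.JumpTable[op]
	// Record the constant gas cost of this instruction.
	cost = operation.constantGas // For tracing
	// Validate stack: check that the number of items on the stack satisfies the operation's requirements.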
	if sLen := stack.len(); sLen < operation.minStack {
		return nil, &ErrStackUnderflow{stackLen: sLen, required: operation.minStack}
	} else if sLen > operation.maxStack {
		return nil, &ErrStackOverflow{stackLen: sLen, limit: operation.maxStack}
	}
	// Deduct the constant gas cost of the current instruction.
	if !contract.UseGas(cost) {
		return nil, ErrOutOfGas
	}
	if operation.dynamicGas != nil {
		// All ops with a dynamic memory usage also has a dynamic gas cost.
		// (Every opcode that uses memory dynamically also has a dynamic gas cost.)
		var memorySize uint64
		// calculate the new memory size and expand the memory to fit
		// the operation
		// Compute the new memory size needed by the operation; memory is expanded later if necessary.
		// Memory check needs to be done prior to evaluating the dynamic gas portion,
		// to detect calculation overflows
		if operation.memorySize != nil { // compute the memory required, checking for overflow first
			memSize, overflow := operation.memorySize(stack)
			if overflow {
				return nil, ErrGasUintOverflow
			}
			// memory is expanded in words of 32 bytes. Gas
			// is also calculated in words.
			// Expansion happens in whole 32-byte words.
			if memorySize, overflow = math.SafeMul(toWordSize(memSize), 32); overflow {
				return nil, ErrGasUintOverflow
			}
		}
		// Consume the gas and return an error if not enough gas is available.
		// cost is explicitly set so that the capture state defer method can get the proper cost
		// Compute the dynamic gas cost of the operation and charge it; if there is not enough gas, return ErrOutOfGas.
		var dynamicCost uint64
		dynamicCost, err = operation.dynamicGas(in.evm, contract, stack, mem, memorySize)
		cost += dynamicCost // for tracing
		// Deduct the dynamic gas.
		if err != nil || !contract.UseGas(dynamicCost) {
			return nil, ErrOutOfGas
		}
		// Do tracing before memory expansion
		if in.cfg.Debug {
			in.cfg.Tracer.CaptureState(pc, op, gasCopy, cost, callContext, in.returnData, in.evm.depth, err)
			logged = true
		}
		if memorySize > 0 { // expand memory to the new size
			mem.Resize(memorySize)
		}
	} else if in.cfg.Debug {
		in.cfg.Tracer.CaptureState(pc, op, gasCopy, cost, callContext, in.returnData, in.evm.depth, err)
		logged = true
	}
	// execute the operation
	// Run the operation's execute function.
	res, err = operation.execute(&pc, in, callContext)
	if err != nil {
		break
	}
	pc++
}

if err == errStopToken {
	err = nil // clear stop token error
}
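The memory-size step in the loop above rounds the requested size up to a multiple of 32 bytes and treats any overflow as an error. The sketch below reproduces that arithmetic in isolation. The toWordSize helper follows the name used in the loop; safeMul is a local stand-in for math.SafeMul built on the standard library's bits.Mul64, and expandedSize is a hypothetical wrapper for this example.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// toWordSize returns the number of 32-byte words needed to hold size bytes,
// rounding up and guarding against overflow in size + 31.
func toWordSize(size uint64) uint64 {
	if size > math.MaxUint64-31 {
		return math.MaxUint64/32 + 1
	}
	return (size + 31) / 32
}

// safeMul returns x*y and reports whether the product overflowed uint64.
func safeMul(x, y uint64) (uint64, bool) {
	hi, lo := bits.Mul64(x, y)
	return lo, hi != 0
}

// expandedSize rounds a requested byte size up to a multiple of 32.
func expandedSize(memSize uint64) (uint64, bool) {
	return safeMul(toWordSize(memSize), 32)
}

func main() {
	for _, n := range []uint64{0, 1, 32, 33} {
		size, overflow := expandedSize(n)
		fmt.Println(n, size, overflow)
	}
	// Prints:
	// 0 0 false
	// 1 32 false
	// 32 32 false
	// 33 64 false
	_, overflow := expandedSize(math.MaxUint64)
	fmt.Println(overflow) // true
}
```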

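Overall, the interpreter's execution loop proceeds as shown in the figure below:

EVM instructions and operations:

Let us first look at the code structure of the EVM module: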

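 evm.go                  // defines the EVM runtime environment and implements higher-level, transaction-related functionality such as transfers
└── vm
    ├── analysis.go      // jump destination analysis
    ├── common.go        // shared helper functions
    ├── contract.go      // the smart contract object
    ├── contracts.go
    ├── doc.go
    ├── eips.go
    ├── errors.go        // error definitions
    ├── evm.go           // the EVM's external interface: defines the EVM struct and provides the Create and Call methods, the entry points for creating a contract and executing contract code
    ├── gas.go           // gas cost calculation
    ├── gas_table.go     // gas calculation functions for most opcodes
    ├── instructions.go  // implementations of most opcodes
    ├── interface.go
    ├── interpreter.go   // the VM's scheduler, where contract code is actually decoded and executed
    ├── jump_table.go    // defines operation, which ties an opcode to its gas calculation function, implementing function, and so on
    ├── logger.go        // EVM logging
    ├── memory.go        // the EVM memory structure
    ├── memory_table.go
    ├── opcodes.go       // the opcode set
    ├── operations_acl.go
    ├── runtime
    │   ├── doc.go
    │   ├── env.go       // execution environment
    │   ├── runtime.go   // runtime
    ├── stack.go         // the stack used by the EVM
    ├── stack_table.go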

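From the listing above:

opcodes stores all the opcodes; for example, the opcode of ADD is 0x01.
jump_table defines, for each opcode, the corresponding operation, including its execution function and gas cost.
instructions contains the implementations of the instruction execution functions, which operate on the stack through calls such as pop() and push().

When a contract object is passed to the interpreter, the interpreter first calls the contract's GetOp(n) method to fetch the opcode at position n in the Contract object's Code.
The argument n is the pc defined in Run() above, i.e. the program counter.
After each instruction executes, pc is incremented so that the next instruction is fetched, unless the instruction terminates execution, as RETURN, STOP, or SELFDESTRUCT do.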

core/vm/contract.go

// GetOp returns the n'th element in the contract's byte array
// Used to fetch the opcode at position n; returns STOP when n is past the end of the code.
func (c *Contract) GetOp(n uint64) OpCode {
	if n < uint64(len(c.Code)) {
		return OpCode(c.Code[n])
	}

	return STOP
}
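One consequence of the bounds check in GetOp is that bytecode does not need a trailing STOP byte: once pc runs past the end of the code, the fetched opcode is STOP. The sketch below illustrates this with a standalone getOp function and a hypothetical stepsUntilStop helper that, as a simplification, treats every non-STOP byte as a one-byte instruction.

```go
package main

import "fmt"

type OpCode byte

const STOP OpCode = 0x00

// getOp mirrors the bounds check in the GetOp method shown above.
func getOp(code []byte, n uint64) OpCode {
	if n < uint64(len(code)) {
		return OpCode(code[n])
	}
	return STOP
}

// stepsUntilStop counts how many opcodes are fetched before STOP appears.
// Every non-STOP byte is treated as a one-byte instruction (a simplification).
func stepsUntilStop(code []byte) int {
	steps := 0
	for pc := uint64(0); getOp(code, pc) != STOP; pc++ {
		steps++
	}
	return steps
}

func main() {
	code := []byte{0x01, 0x01, 0x01} // no trailing STOP byte
	fmt.Println(stepsUntilStop(code))       // 3
	fmt.Println(getOp(code, 1000) == STOP) // true
}
```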

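A stack-based virtual machine:

A virtual machine is a software-level simulation of a physical machine. The Ethereum virtual machine, however, differs from the virtual machines in the narrow sense that we meet day to day, such as VMware or VirtualBox:
it only simulates fetching, decoding, and executing bytecode instructions and storing and returning the results, steps that closely mirror the corresponding concepts on a real physical machine.
Of course, however a virtual machine is implemented, it ultimately relies on physical resources.

Virtual machines today are implemented in one of two ways: stack-based or register-based.
Stack-based virtual machines include the JVM and CPython; register-based ones include Dalvik and Lua 5.0.
Although the two approaches work differently, both ultimately have to:

fetch instructions from memory;
decode them, translating each instruction into a specific operation;
execute them, i.e. perform the computation on the stack or in registers;
return the result.

Here we use a figure to briefly revisit how the ADD instruction mentioned above executes, to see how stack-based computation works and so gain a deeper understanding of how the EVM works.

Suppose 4 and then 3 have been pushed, so that 3 is on top of the stack. When the ADD instruction arrives, opAdd() is called.
It first executes x = stack.pop(), which removes the 3 from the top of the stack and assigns it to x.
It then executes y = stack.peek(), which reads the 4 that is now on top without removing it.
Finally it executes y.Add(x, y), giving y == 7; because y refers to the top stack slot, the 7 ends up on top of the stack.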
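The pop/peek pattern described above can be sketched with a small stack type. This is an illustration only: the Stack type and the single-argument opAdd signature are simplified for this example, and items are uint64 rather than the 256-bit words the EVM uses. The point it shows is that peek hands back a pointer to the top slot, so the sum is written in place instead of being pushed.

```go
package main

import "fmt"

// Stack is a minimal LIFO stack of uint64 values.
type Stack struct{ data []uint64 }

func (s *Stack) push(v uint64) { s.data = append(s.data, v) }

// pop removes and returns the top item.
func (s *Stack) pop() uint64 {
	v := s.data[len(s.data)-1]
	s.data = s.data[:len(s.data)-1]
	return v
}

// peek returns a pointer to the top slot without removing it.
func (s *Stack) peek() *uint64 { return &s.data[len(s.data)-1] }

// opAdd pops one operand and overwrites the new top with the sum.
func opAdd(s *Stack) {
	x, y := s.pop(), s.peek()
	*y = x + *y
}

func main() {
	s := &Stack{}
	s.push(4)
	s.push(3) // 3 is now on top
	opAdd(s)
	fmt.Println(s.data) // [7]
}
```

Two operands go in and one result stays, so the stack shrinks by exactly one item per ADD.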
