Creating a tiny Ethereum EVM in Go

Understanding the EVM's architecture and how to build a toy version from scratch

The Ethereum Virtual Machine (EVM)

The Ethereum Virtual Machine (EVM) is a vital component of the Ethereum blockchain ecosystem. It is responsible for executing smart contracts on the Ethereum network. Smart contracts are programs stored on the blockchain that run when a transaction calls them, so the terms of an agreement between parties are written directly into code.

The EVM can be thought of as a decentralized virtual computer: every participating node runs the same code and must arrive at the same result. It is often described as quasi-Turing-complete, meaning that any computable algorithm can be expressed in it, but each execution is bounded by the amount of gas supplied with the transaction.

Architecture

The EVM is a stack machine: instructions take their operands from a stack and push their results back onto it. A stack is a last-in, first-out (LIFO) data structure. The EVM executes bytecode, which compilers generate from high-level programming languages such as Solidity.

Each opcode is a single byte, so there is room for up to 256 opcodes, although only part of that range is actually assigned. The opcodes cover operations such as arithmetic, logic, and memory manipulation. Each opcode also has a gas cost, which makes the sender pay for the computation they use and incentivizes efficient use of resources on the network.
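
To make the idea of per-opcode gas concrete, here is a minimal sketch of gas accounting. The `chargeGas` function, the `gasCost` table, and `errOutOfGas` are hypothetical names for this illustration. The costs follow the commonly documented static fee schedule for these opcodes, and dynamic costs (memory expansion, storage access) are deliberately left out.

```go
package main

import (
	"errors"
	"fmt"
)

// errOutOfGas is a hypothetical error value for this sketch.
var errOutOfGas = errors.New("out of gas")

// gasCost lists static costs for a handful of opcodes, keyed by mnemonic.
// Real clients also charge dynamic costs that are not modelled here.
var gasCost = map[string]uint64{
	"JUMPDEST": 1,
	"POP":      2,
	"ADD":      3,
	"PUSH1":    3,
	"MUL":      5,
}

// chargeGas deducts the cost of each instruction from the gas limit and
// stops as soon as the remaining gas cannot cover the next instruction.
func chargeGas(program []string, gasLimit uint64) (uint64, error) {
	remaining := gasLimit
	for _, op := range program {
		cost, ok := gasCost[op]
		if !ok {
			return remaining, fmt.Errorf("no cost known for %s", op)
		}
		if cost > remaining {
			return remaining, errOutOfGas
		}
		remaining -= cost
	}
	return remaining, nil
}

func main() {
	program := []string{"PUSH1", "PUSH1", "ADD"}

	left, err := chargeGas(program, 10)
	fmt.Println(left, err) // 1 <nil>

	_, err = chargeGas(program, 8)
	fmt.Println(err) // out of gas
}
```

The program costs 3 + 3 + 3 = 9 gas, so a limit of 10 leaves 1 unit, while a limit of 8 runs out before ADD can execute.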

Internal Configuration

The EVM is configured with a number of system parameters that govern its behavior. These parameters are part of the chain configuration that starts at the genesis block of the Ethereum blockchain, and include items such as the initial block gas limit, the initial difficulty target, and the EVM version in effect.

One of the most critical parameters is the block gas limit, which is the maximum amount of gas that can be used in a single block. Gas is a unit of measurement used to calculate the computational cost of running a smart contract on the Ethereum network.

Memory Layout

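The EVM has a 256-bit (32-byte) word size. Memory is a byte-addressable array that starts empty and grows on demand. Because offsets are 256-bit values, the theoretical limit is 2^256 bytes, but the gas cost of expanding memory keeps real usage far below that. Memory is read and written with memory instructions, which take their offsets and values from the stack.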
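
The gap between the theoretical and the practical memory size comes from the price of growing memory. The sketch below assumes the commonly documented expansion formula, 3 gas per 32-byte word plus a quadratic term of words squared divided by 512. The function names `wordsFor` and `memoryCost` are made up for this example.

```go
package main

import "fmt"

// wordsFor returns how many 32-byte words are needed to cover size bytes.
func wordsFor(size uint64) uint64 {
	return (size + 31) / 32
}

// memoryCost returns the total gas charged for a memory of the given size
// in bytes, assuming the formula 3*words + words*words/512.
func memoryCost(size uint64) uint64 {
	w := wordsFor(size)
	return 3*w + w*w/512
}

func main() {
	// The cost is almost linear for small sizes and quadratic for large ones.
	for _, size := range []uint64{32, 1024, 1 << 20} {
		fmt.Printf("%8d bytes -> %d gas\n", size, memoryCost(size))
	}
}
```

One word costs 3 gas, 1 KiB costs 98 gas, and 1 MiB already costs over two million gas, which is why contracts never come close to the addressable limit.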

Unlike most physical CPUs, the EVM has no general-purpose registers. The only register-like piece of state is the program counter, which keeps track of the position of the opcode being executed. The top of the stack is managed implicitly by the instructions themselves.

To summarize, the EVM is a stack-based machine with 256-bit words, single-byte opcodes that each carry a gas cost, a set of system parameters that configure it, and a byte-addressable memory that is accessed through dedicated memory instructions.

EVM Opcodes

Opcodes are the basic instructions that the EVM executes. They are grouped into several categories based on their functionality.

Stack Operations

Stack operations include pushing data onto the stack and popping data from the stack. Here are a few examples of stack operations:

  • PUSH: PUSH1 through PUSH32 push the next 1 to 32 bytes of the bytecode onto the stack. For example, PUSH1 0x60 pushes the value 0x60 onto the stack.
  • POP: Removes the top item from the stack and discards it.
  • DUP: DUP1 through DUP16 copy the n-th item from the top of the stack and push the copy onto the stack. For example, DUP1 duplicates the top item.
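
The stack operations above can be sketched as a small Go type. This is only an illustration: the `stack` type and its methods are hypothetical, `uint64` stands in for the EVM's 256-bit words, and the depth limit of 1024 items reflects the EVM's stack limit.

```go
package main

import (
	"errors"
	"fmt"
)

// maxStackDepth is the maximum number of items the EVM stack may hold.
const maxStackDepth = 1024

// stack is a simplified operand stack; uint64 stands in for 256-bit words.
type stack struct{ items []uint64 }

// push adds a value on top of the stack (PUSHn).
func (s *stack) push(v uint64) error {
	if len(s.items) >= maxStackDepth {
		return errors.New("stack overflow")
	}
	s.items = append(s.items, v)
	return nil
}

// pop removes and returns the top value (POP).
func (s *stack) pop() (uint64, error) {
	if len(s.items) == 0 {
		return 0, errors.New("stack underflow")
	}
	v := s.items[len(s.items)-1]
	s.items = s.items[:len(s.items)-1]
	return v, nil
}

// dup copies the n-th item from the top (n = 1 is the top) onto the
// top of the stack (DUPn).
func (s *stack) dup(n int) error {
	if n < 1 || n > len(s.items) {
		return errors.New("stack underflow")
	}
	return s.push(s.items[len(s.items)-n])
}

func main() {
	s := &stack{}
	s.push(0x60) // PUSH1 0x60
	s.push(0x40) // PUSH1 0x40
	s.dup(2)     // DUP2 copies 0x60 onto the top
	fmt.Printf("%#x\n", s.items) // [0x60 0x40 0x60]
	s.pop() // POP discards the copy
	fmt.Printf("%#x\n", s.items) // [0x60 0x40]
}
```

The error returns are ignored in `main` to keep the example short; a real interpreter would abort execution on overflow or underflow.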

Arithmetic Operations

Arithmetic operations include addition, subtraction, multiplication, division, and bitwise operations. Here are a few examples of arithmetic operations:

  • ADD: Pops the top two items from the stack and pushes their sum. The result wraps around modulo 2^256.
  • SUB: Pops the top two items from the stack and pushes the top item minus the second item, again modulo 2^256.
  • MUL: Pops the top two items from the stack and pushes their product, modulo 2^256.

Memory Operations

Memory operations allow for accessing and manipulating the memory in the EVM. Here are a few examples of memory operations:

  • MLOAD: Pops a memory offset from the stack and pushes the 32-byte word stored at that offset.
  • MSTORE: Pops an offset and a value from the stack and writes the value to memory as a 32-byte word at that offset.
  • MSTORE8: Pops an offset and a value from the stack and writes the lowest byte of the value to memory at that offset.
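
As a rough illustration of these three instructions, the sketch below models memory as a byte slice that grows in 32-byte steps. The `memory` type and its methods are hypothetical, offsets are plain `int` values rather than 256-bit words, and gas is ignored.

```go
package main

import "fmt"

// memory is a simplified byte-addressable EVM memory.
type memory struct{ data []byte }

// grow extends memory with zero bytes, in 32-byte steps, so that the
// range [0, end) is addressable.
func (m *memory) grow(end int) {
	if end > len(m.data) {
		size := (end + 31) / 32 * 32
		m.data = append(m.data, make([]byte, size-len(m.data))...)
	}
}

// mstore writes a 32-byte word at offset (MSTORE).
func (m *memory) mstore(offset int, word [32]byte) {
	m.grow(offset + 32)
	copy(m.data[offset:], word[:])
}

// mstore8 writes a single byte at offset (MSTORE8).
func (m *memory) mstore8(offset int, b byte) {
	m.grow(offset + 1)
	m.data[offset] = b
}

// mload reads the 32-byte word at offset (MLOAD). Reading past the
// current end also expands memory.
func (m *memory) mload(offset int) [32]byte {
	m.grow(offset + 32)
	var word [32]byte
	copy(word[:], m.data[offset:offset+32])
	return word
}

func main() {
	m := &memory{}

	var w [32]byte
	w[31] = 0x2a // words are big-endian, so 42 sits in the last byte
	m.mstore(0, w)
	m.mstore8(32, 0xff)

	fmt.Println(m.mload(0)[31], m.mload(32)[0], len(m.data)) // 42 255 64
}
```

Writing one byte at offset 32 grows memory to 64 bytes, because expansion always happens in whole 32-byte words.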

Control Flow Operations

Control flow operations allow for controlling the flow of the program execution. Here are a few examples of control flow operations:

  • JUMP: Pops a destination from the stack and continues execution at that position in the code.
  • JUMPI: Pops a destination and a condition from the stack. If the condition (the second item from the top) is non-zero, execution jumps to the destination; otherwise it continues with the next instruction.
  • JUMPDEST: Marks a valid destination for a jump instruction. JUMP and JUMPI fail if the destination is not a JUMPDEST.
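
One subtlety is that a byte with the JUMPDEST value only counts as a destination when it is an instruction, not when it is data belonging to a PUSH. A minimal sketch of that check follows; `validJumpDests` is a hypothetical helper, and the opcode values used are the real ones (JUMPDEST = 0x5b, PUSH1 to PUSH32 = 0x60 to 0x7f).

```go
package main

import "fmt"

const (
	opJumpdest = 0x5b
	opPush1    = 0x60
	opPush32   = 0x7f
)

// validJumpDests scans the bytecode once and records every JUMPDEST that
// is a real instruction, skipping the data bytes that follow PUSH opcodes.
func validJumpDests(code []byte) map[int]bool {
	dests := make(map[int]bool)
	for pc := 0; pc < len(code); pc++ {
		op := code[pc]
		switch {
		case op == opJumpdest:
			dests[pc] = true
		case op >= opPush1 && op <= opPush32:
			pc += int(op-opPush1) + 1 // skip the pushed bytes
		}
	}
	return dests
}

func main() {
	// PUSH1 0x5b, JUMPDEST, STOP
	// The 0x5b at offset 1 is push data; only offset 2 is a valid target.
	code := []byte{0x60, 0x5b, 0x5b, 0x00}
	fmt.Println(validJumpDests(code)) // map[2:true]
}
```

An interpreter could build this set once per contract and consult it on every JUMP and JUMPI.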

Assembly Example

Let’s look at an example of how these opcodes can be used in assembly code:

PUSH1 0x02
PUSH1 0x03
ADD

In this example, we push the values 0x02 and 0x03 onto the stack using the PUSH1 opcode. We then use the ADD opcode to add the top two items on the stack and push the result onto the stack. The final result on the stack will be 0x05.
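
On the real EVM, PUSH1 is encoded as the byte 0x60 and ADD as 0x01, so this program corresponds to the five bytes 60 02 60 03 01. The sketch below turns those bytes back into mnemonics. The `disassemble` function is a hypothetical helper that only knows ADD and the PUSH family and reports everything else as unknown.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// disassemble converts bytecode into one mnemonic per instruction.
// Only ADD (0x01) and PUSH1..PUSH32 (0x60..0x7f) are recognised.
func disassemble(code []byte) []string {
	var out []string
	for pc := 0; pc < len(code); pc++ {
		op := code[pc]
		switch {
		case op == 0x01:
			out = append(out, "ADD")
		case op >= 0x60 && op <= 0x7f:
			n := int(op-0x60) + 1 // number of data bytes that follow
			end := pc + 1 + n
			if end > len(code) {
				end = len(code) // tolerate truncated push data
			}
			out = append(out, fmt.Sprintf("PUSH%d 0x%x", n, code[pc+1:end]))
			pc = end - 1 // the loop increment moves past the data
		default:
			out = append(out, fmt.Sprintf("UNKNOWN 0x%02x", op))
		}
	}
	return out
}

func main() {
	code, err := hex.DecodeString("6002600301")
	if err != nil {
		panic(err)
	}
	for _, line := range disassemble(code) {
		fmt.Println(line)
	}
}
```

Running it prints the three instructions of the assembly example: PUSH1 0x02, PUSH1 0x03, and ADD.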

In conclusion, the Ethereum Virtual Machine (EVM) provides single-byte opcodes for operations such as arithmetic, logic, and memory manipulation. These opcodes can be grouped into several categories based on their functionality: stack, arithmetic, memory, and control flow operations.

Writing a toy EVM-like virtual machine

package main

import (
    "fmt"
    "math/big"
)

type opcode byte

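// Toy opcode numbering for this example only. The real EVM uses
// PUSH1 = 0x60, PUSH2 = 0x61, and ADD = 0x01.
const (
    opPush1 opcode = iota
    opPush2
    opAdd
)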

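// vm holds the state of the toy virtual machine.
type vm struct {
    code  []opcode   // bytecode: opcodes mixed with their argument bytes
    pc    int        // program counter: index of the next byte to read
    stack []*big.Int // operand stack; the top is the last element
}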

func (v *vm) push(i *big.Int) {
    v.stack = append(v.stack, i)
}

func (v *vm) pop() *big.Int {
    if len(v.stack) == 0 {
        panic("stack underflow")
    }
    i := v.stack[len(v.stack)-1]
    v.stack = v.stack[:len(v.stack)-1]
    return i
}

func (v *vm) run() {
    for v.pc < len(v.code) {
        op := v.code[v.pc]
        v.pc++
        switch op {
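        case opPush1:
            // Read next byte as argument
            if v.pc >= len(v.code) {
                panic("truncated PUSH1 argument")
            }
            arg := new(big.Int).SetInt64(int64(v.code[v.pc]))
            v.pc++
            v.push(arg)
        case opPush2:
            // Read next 2 bytes as a big-endian argument
            if v.pc+1 >= len(v.code) {
                panic("truncated PUSH2 argument")
            }
            arg := new(big.Int).SetUint64(uint64(v.code[v.pc])<<8 | uint64(v.code[v.pc+1]))
            v.pc += 2
            v.push(arg)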
        case opAdd:
            a := v.pop()
            b := v.pop()
            c := new(big.Int).Add(a, b)
            v.push(c)
        default:
            panic(fmt.Sprintf("unknown opcode: %d", op))
        }
    }
}

func main() {
    // Push 1 and 2 onto the stack, then add them together
    code := []opcode{opPush1, 0x01, opPush2, 0x00, 0x02, opAdd}
    v := &vm{code: code}
    v.run()
    result := v.pop()
    fmt.Println(result.String()) // Output: 3
}

Conclusion

In this implementation, we define an opcode type to represent the different opcodes that our virtual machine can execute. We also define a vm struct to represent the state of the virtual machine, which includes the bytecode to be executed, the program counter (pc), and the stack.

The push and pop methods are used to manipulate the stack, while the run method executes the bytecode by iterating over the code and performing the appropriate operation for each opcode. In this example, we have implemented three opcodes: opPush1 and opPush2 for pushing one-byte and two-byte arguments onto the stack, and opAdd for adding the top two items on the stack.

Finally, we create a vm instance with some sample bytecode that pushes the values 1 and 2 onto the stack, and then adds them together. We then execute the bytecode using the run method and print the result, which should be 3.
