Part 1: Building Ethereum EVM decompiler from scratch. Getting OPCODEs

A series of posts about how to design a Solidity EVM decompiler from scratch. Part 1.

What is the decompilation process?

Decompilation is the process of recovering source-like code from compiled code, so that security engineers can understand a program without working directly with machine code. In the context of the EVM, the goal is to convert EVM bytecode into Solidity-like code.

The challenge

Decompiling back to the original source code is impossible because all variable names, type names and even function names are removed during compilation. It might be technically possible to arrive at source code that is similar to the original, but that is very complicated, especially when the optimizer was used during compilation. I don't know of any tools that do more than convert bytecode to opcodes.
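To make the loss of function names concrete: what survives in the bytecode of a public function is its 4-byte selector (the first four bytes of the Keccak-256 hash of the signature), not its name. The following is a minimal sketch, not a complete dispatcher analysis. It assumes the simple pattern where a PUSH4 is immediately followed by EQ; compilers can emit other shapes. The function name find_selectors and the hand-written fragment are illustrative only.

```python
# Opcode values from the EVM instruction set.
PUSH1, PUSH4, PUSH32, EQ = 0x60, 0x63, 0x7F, 0x14


def find_selectors(code: bytes) -> list[str]:
    """Return 4-byte values pushed by PUSH4 and immediately compared with EQ.

    Assumption: the dispatcher uses the plain `PUSH4 <selector> EQ` pattern.
    """
    selectors = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        # PUSH1..PUSH32 carry 1..32 bytes of immediate data that must be skipped
        size = op - PUSH1 + 1 if PUSH1 <= op <= PUSH32 else 0
        if op == PUSH4 and pc + 5 < len(code) and code[pc + 5] == EQ:
            selectors.append(code[pc + 1 : pc + 5].hex())
        pc += 1 + size
    return selectors


# Hand-written fragment: DUP1, PUSH4 0xa9059cbb, EQ, PUSH2 0x0010, JUMPI
fragment = bytes.fromhex("80" "63a9059cbb" "14" "610010" "57")
print(find_selectors(fragment))  # ['a9059cbb']
```

The value a9059cbb is the well-known selector of transfer(address,uint256). Mapping a selector back to a name like this only works when the signature is already known, which is why public method names can be resolved only "when possible".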

Decompilers in Ethereum

Ethereum has gained significant popularity in the blockchain community, mainly because it is designed to let developers write decentralized applications (Dapps) and smart contracts using blockchain technology.

The Ethereum blockchain is a consensus-based, globally executed virtual machine, referred to as the Ethereum Virtual Machine (EVM). It implements its own micro-kernel supporting a small number of instructions, with its own stack, memory and storage. This enables the radical new concept of distributed applications.
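As a rough illustration of what "its own stack" means in practice, here is a minimal sketch of a stack machine that executes a tiny subset of EVM opcodes. It is not a real EVM: memory, storage, gas and error handling are left out, and the function name run is just an illustrative choice.

```python
# Opcode values from the EVM instruction set.
STOP, ADD, MUL, PUSH1 = 0x00, 0x01, 0x02, 0x60
WORD = 2**256  # EVM stack items are 256-bit words


def run(code: bytes) -> list[int]:
    """Execute a tiny subset of opcodes and return the final stack."""
    stack: list[int] = []
    pc = 0  # program counter: index of the next byte to decode
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == STOP:
            break
        elif op == PUSH1:
            stack.append(code[pc])  # the 1-byte immediate follows the opcode
            pc += 1
        elif op == ADD:
            a, b = stack.pop(), stack.pop()
            stack.append((a + b) % WORD)  # arithmetic wraps at 256 bits
        elif op == MUL:
            a, b = stack.pop(), stack.pop()
            stack.append((a * b) % WORD)
        else:
            raise ValueError(f"unsupported opcode 0x{op:02x}")
    return stack


# PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL  computes (2 + 3) * 4
program = bytes.fromhex("6002600301600402")
print(run(program))  # [20]
```

Note that operands are never named: every instruction takes its inputs from the stack and pushes its result back. That is what a decompiler later has to undo when it converts stack operations into register-like variables.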

Contracts live on the blockchain in an Ethereum-specific binary format (EVM bytecode). However, contracts are typically written in a high-level language such as Solidity and then compiled into bytecode to be uploaded to the blockchain. Solidity is a contract-oriented, high-level language whose syntax is similar to that of JavaScript.

This new paradigm of applications opens the door to many possibilities and opportunities. Blockchain is often referred to as secure by design, but now that blockchains can embed applications, this raises multiple questions regarding architecture, design, attack vectors and patch deployments.

As we reverse engineers know, having access to source code is often a luxury. Hence, decompilers that turn EVM bytecode into readable code are needed.

Ethereum Virtual Machine (EVM) decompilers today

Currently, we have the following alternatives when dealing with bytecode decompilation:

Design Approach

As you have seen, there are many decompilers out there, but none of them produces code of as good a quality as possible. This is mostly because reversing bytecode back to Solidity code is a hard task.

To make our own EVM bytecode decompiler, we will follow these steps:

  • Extract the .runtime section code from the bytecode.
  • From the opcode sequence, extract the EVM instruction information and arguments when available. See the previous post Converting EVM bytecode to OPCODES in microseconds to learn more about it.
  • Find the entrypoint at 0x0.
  • Convert the sequence of .runtime opcodes to an EVM CFG (control flow graph).
  • From the EVM CFG, remove stack-related operations and convert the code to register-based instructions.
  • Undo compiler optimizations when possible, for example masks such as $var & 0xFFFFF.
  • Reconstruct the dispatcher section.
  • Resolve public method names when possible.
  • Resolve internal functions when possible.
  • Look up and map storage and memory variable usage by applying SSA (static single assignment) methodology.
  • Iterate over the CFG, transpiling the code to a higher-level Solidity and Yul representation.
  • Detect similarities with known Solidity code and known libraries.
  • Generate the final output.
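The first few steps of this plan can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in this series: the opcode table is deliberately partial, the names Instruction, disassemble and basic_blocks are made up for illustration, and the input is assumed to be runtime code already. It decodes instructions with their arguments and then splits them into basic blocks, which are the nodes of a CFG.

```python
from dataclasses import dataclass

# Partial opcode table (values from the EVM instruction set); a real table
# covers every instruction.
NAMES = {0x00: "STOP", 0x01: "ADD", 0x14: "EQ", 0x56: "JUMP", 0x57: "JUMPI",
         0x5B: "JUMPDEST", 0xF3: "RETURN", 0xFD: "REVERT", 0xFE: "INVALID",
         0xFF: "SELFDESTRUCT"}
# Instructions after which control does not simply fall into the same block.
TERMINATORS = {"STOP", "JUMP", "JUMPI", "RETURN", "REVERT", "INVALID",
               "SELFDESTRUCT"}


@dataclass
class Instruction:
    offset: int          # byte offset inside the code
    name: str            # mnemonic, e.g. "PUSH1"
    operand: bytes = b""  # immediate data (only PUSH1..PUSH32 have one)


def disassemble(code: bytes) -> list[Instruction]:
    """Decode bytecode into instructions, skipping over PUSH data."""
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32
            size = op - 0x5F
            out.append(Instruction(pc, f"PUSH{size}", code[pc + 1 : pc + 1 + size]))
            pc += 1 + size
        else:
            out.append(Instruction(pc, NAMES.get(op, f"UNKNOWN_0x{op:02x}")))
            pc += 1
    return out


def basic_blocks(instructions: list[Instruction]) -> list[list[Instruction]]:
    """Split at JUMPDEST (block start) and after terminators (block end)."""
    blocks, current = [], []
    for ins in instructions:
        if ins.name == "JUMPDEST" and current:
            blocks.append(current)
            current = []
        current.append(ins)
        if ins.name in TERMINATORS:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


# PUSH1 0x04, JUMP, STOP, JUMPDEST, PUSH1 0x01, STOP
code = bytes.fromhex("600456005b600100")
blocks = basic_blocks(disassemble(code))
print([block[0].offset for block in blocks])  # [0, 3, 4]
```

Connecting the blocks with edges (resolving where each JUMP and JUMPI goes) is the part that turns this list into an actual CFG, and it requires tracking the values that reach the top of the stack.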

In the next post, I will show you how to generate an EVM CFG diagram to get a better idea of the bytecode design and instruction execution flow. Bye :)

Next reading

If you like this content, continue reading and find out how to process EVM bytecode in the next steps in Part 2: Building Ethereum EVM decompiler from scratch. Getting Code Blocks

Thanks for checking this out, and I hope you found the info useful! If you have any questions, don't hesitate to write me a comment below. And remember that if you would like to see more content, just let me know and share this post with your colleagues, co-workers, FFF, etc.

You are free to use the content, but mention the author (@MrSergioAnguita) and link the post!
Last updated on Jun 20, 2022 11:28 CEST