Compiler Handler
The EVM executes instructions written in a special language called "EVM bytecode". This is like a universal instruction set for Ethereum contracts. However, computers don't directly understand EVM bytecode; they understand their own "native machine code". Usually, the EVM acts as an interpreter, reading each bytecode instruction one by one and translating it on the spot. This works, but interpretation can be slower than running native code directly.
What Problem Are We Solving?
The Bottleneck: Executing smart contracts involves interpreting EVM bytecode, which can be computationally intensive and slow down transaction processing. Each time a contract function is called, the interpreter might have to go through the same translation steps.
The Goal: We want to speed up contract execution by converting the general EVM bytecode into the specific, highly optimized native machine code that the computer's processor understands directly. Ideally, we do this translation once and reuse the result.
The Analogy: The Translator
Imagine you have a book written in a foreign language (EVM bytecode) that you need to understand.
- Interpreter: You have a friend who reads the book sentence by sentence and translates it for you live. This works, but it's slow, and they have to re-translate every time you re-read a page.
- Compiler (AOT - Ahead-of-Time): You hire a professional translator to translate the entire book into your native language beforehand and give you the translated copy (native machine code). Now, you can read the translated book directly, which is much faster.
- Compiler (JIT - Just-in-Time): Your friend is clever. The first time you ask them to read a chapter, they quickly translate it and write down the translation. The next time you ask for that chapter, they just read you their written translation, which is faster than translating live again.
The Solution: ExtCompileWorker and CompilerHandler
ExtCompileWorker and CompilerHandler work together to act like this translator, converting EVM bytecode into native machine code to potentially speed up execution.
ExtCompileWorker: This is the background worker, the actual translator. It takes EVM bytecode and uses powerful tools (like LLVM via the revmc library) to compile it into native code. It also manages a cache (like a library of translated books or chapters) so it doesn't have to re-translate the same code over and over. It can operate in two main modes:
- AOT (Ahead-of-Time): Compiles code and saves the result (often to disk) before it's needed.
- JIT (Just-in-Time): Compiles code the first time it's executed and keeps the result in memory for subsequent calls.
CompilerHandler: This component integrates with revm (the EVM implementation). When the EVM is about to execute a contract, the CompilerHandler intercepts this. It checks with the ExtCompileWorker whether a compiled version of the contract already exists in the cache. If yes, it tells the EVM to run the native code. If not, it might ask the ExtCompileWorker to compile it (either now for JIT, or in the background for AOT) and might fall back to the slower interpreter for the current execution.
How to Use Them (Integration View)
As a user of metis-sdk, you typically don't interact with CompilerHandler directly during transaction execution. It's an optional feature that gets enabled when setting up the execution environment.
For example, the core Parallel Executor (metis_pe::ParallelExecutor) can be configured to use a compiler.
Explanation:
- The ParallelExecutor struct has an optional field worker of type Arc<ExtCompileWorker>. An Arc allows multiple parts of the system (like different EVM instances running in parallel) to safely share the same compiler worker and its cache.
- The compiler() constructor creates a ParallelExecutor instance initialized with an active ExtCompileWorker (using AOT mode in this example).
- When this ParallelExecutor runs transactions (as seen in Parallel Executor), the internal Vm helper (VmDB) can pass this shared worker to the revm execution context, enabling the CompilerHandler.
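As a rough illustration of the points above, the following sketch models an executor that optionally holds a shared worker. It is not the metis-sdk source: the worker is reduced to a mutex-guarded map, and the names aot(), insert(), contains(), interpreter_only(), and worker_handle() are hypothetical helpers introduced only for this example.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Stand-in for the real worker: only the shared cache is modelled here.
#[derive(Default)]
pub struct ExtCompileWorker {
    // code_hash -> "compiled" artifact (a label instead of real machine code)
    cache: Mutex<HashMap<[u8; 32], String>>,
}

impl ExtCompileWorker {
    /// Hypothetical constructor for an AOT-mode worker.
    pub fn aot() -> Self {
        Self::default()
    }

    /// Record an artifact for a code hash (hypothetical helper).
    pub fn insert(&self, code_hash: [u8; 32], artifact: &str) {
        self.cache
            .lock()
            .unwrap()
            .insert(code_hash, artifact.to_string());
    }

    /// Report whether an artifact is cached (hypothetical helper).
    pub fn contains(&self, code_hash: &[u8; 32]) -> bool {
        self.cache.lock().unwrap().contains_key(code_hash)
    }
}

/// Sketch of an executor with an optional, shareable compiler worker.
pub struct ParallelExecutor {
    pub worker: Option<Arc<ExtCompileWorker>>,
}

impl ParallelExecutor {
    /// Executor that only interprets bytecode (hypothetical name).
    pub fn interpreter_only() -> Self {
        Self { worker: None }
    }

    /// Executor with an active compiler worker (AOT mode in this sketch).
    pub fn compiler() -> Self {
        Self {
            worker: Some(Arc::new(ExtCompileWorker::aot())),
        }
    }

    /// Hand a clone of the shared handle to one EVM instance.
    pub fn worker_handle(&self) -> Option<Arc<ExtCompileWorker>> {
        self.worker.clone()
    }
}
```

Cloning the Arc only copies a pointer and bumps a reference count, so every EVM instance sees the same cache: an artifact inserted through one handle is visible through all the others.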
The actual usage often happens deep inside the EVM execution loop, facilitated by the CompilerHandler. Let's look at a simplified example inspired by how one might use it directly with revm (which metis-sdk does internally).
Explanation:
- We set up the revm EVM context as usual.
- We create an instance of CompilerHandler, choosing either aot() or jit(). This handler holds a reference to an ExtCompileWorker internally.
- Instead of calling the standard evm.transact(), we use compile_handler.run(&mut evm). This lets the handler intercept the execution.
- The first time run is called for a specific contract code, the handler might trigger compilation (if the feature is enabled and the code isn't cached).
- Subsequent calls to run for the same contract code should hit the cache managed by the ExtCompileWorker and potentially execute faster using the compiled native code.
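The calling pattern described above can be mimicked without revm. In the sketch below, Evm, Mode, and ExecPath are stand-ins invented for illustration, compilation is simulated by remembering the code hash, and the first run is always interpreted (a simplification: the text notes that a JIT handler might use freshly compiled code right away). The real run signature and return type are not shown in the text, so treat these as assumptions.

```rust
use std::collections::HashSet;

/// Compilation strategy chosen when the handler is built.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Aot,
    Jit,
}

/// How a single run was executed (illustrative type, not from the SDK).
#[derive(Debug, PartialEq)]
pub enum ExecPath {
    Interpreted,
    Native,
}

/// Stand-in for an EVM context: it only carries the code hash to execute.
pub struct Evm {
    pub code_hash: [u8; 32],
}

pub struct CompilerHandler {
    pub mode: Mode,
    // Code hashes that already have a "compiled" artifact.
    compiled: HashSet<[u8; 32]>,
}

impl CompilerHandler {
    pub fn aot() -> Self {
        Self { mode: Mode::Aot, compiled: HashSet::new() }
    }

    pub fn jit() -> Self {
        Self { mode: Mode::Jit, compiled: HashSet::new() }
    }

    /// Run one call through the handler instead of calling the EVM directly.
    pub fn run(&mut self, evm: &mut Evm) -> ExecPath {
        if self.compiled.contains(&evm.code_hash) {
            // Cache hit: the compiled artifact would be executed.
            return ExecPath::Native;
        }
        // Cache miss: request compilation (instant here) and interpret this run.
        self.compiled.insert(evm.code_hash);
        ExecPath::Interpreted
    }
}
```

With this model, calling run twice on the same Evm yields ExecPath::Interpreted for the first call and ExecPath::Native for the second, which is the miss-then-hit behaviour the explanation describes.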
Under the Hood: How Compilation Happens
Let's trace the simplified flow when the CompilerHandler encounters contract code for the first time:
1. EVM Call: The revm EVM, configured with the CompilerHandler, is about to execute a contract identified by its code_hash.
2. Handler Intercepts: The CompilerHandler's frame_call method is invoked.
3. Cache Check: The handler asks its internal ExtCompileWorker: "Do you have a compiled function for code_hash in your cache?" (worker.get_function(&code_hash)).
4. Worker Checks Cache:
   - JIT: Checks an in-memory map (HashMap).
   - AOT: Checks an in-memory LRU cache first. If not found, it checks if a compiled shared object file (.so) exists on disk in the configured storage path (e.g., ~/.cache/metis/vm/aot/<code_hash>/a.so). If it finds the file, it loads it into memory and into the cache.
5. Cache Miss: The cache doesn't contain the compiled function.
6. Request Compilation: The handler tells the ExtCompileWorker: "Please compile the bytecode for code_hash for the current SpecId." (worker.spawn(...)).
7. Worker Compiles: The ExtCompileWorker uses its CompilePool to manage compilation tasks (potentially using background threads).
   - A Compiler instance uses revmc with an LLVM backend to translate the EVM bytecode into LLVM Intermediate Representation (IR) and then into native machine code.
   - JIT: The resulting native function is stored directly in the JIT cache (in-memory HashMap).
   - AOT: The compiled native code is written to an object file (.o) and then linked into a shared object file (.so) saved to disk under the code_hash.
8. Execution:
   - JIT: Since compilation might happen quickly in the foreground, the handler might immediately use the newly compiled function for the current execution.
   - AOT: Compilation might happen in the background. For this first execution, the CompilerHandler tells the revm EVM to proceed using the standard bytecode interpreter.
9. Subsequent Calls: The next time the EVM needs to execute the same code_hash, the CompilerHandler checks the cache again (Step 3). This time, it should be a Cache Hit. The worker returns the native function pointer, and the handler instructs the EVM to execute the native code directly, skipping the interpreter.
Simplified Sequence Diagram (Cache Miss followed by Cache Hit):
Diving Deeper into the Code (Simplified)
Let's look at the key structs and methods involved.
1. ExtCompileWorker (crates/vm/src/compiler.rs)
Manages the compilation pool and cache access logic.
Explanation:
- Holds an optional CompilePool which does the heavy lifting of compilation using background threads.
- get_function implements the cache lookup logic: check memory cache -> check disk (if AOT) -> return found function or NotFound.
- spawn delegates the compilation request to the CompilePool.
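A minimal model of this shape, assuming a single background thread fed by a channel, might look like the following. The real CompilePool, the compiled-function signature, and the constructor are not shown in the text, so NativeFn, fake_native, new(enabled), and finish() are hypothetical, and the disk tier of the lookup is omitted.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Sender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};

pub type CodeHash = [u8; 32];
/// Stand-in for a compiled entry point (the real signature is an assumption).
pub type NativeFn = fn(u64) -> u64;

pub enum FetchedFnResult {
    Found(NativeFn),
    NotFound,
}

type Cache = Arc<Mutex<HashMap<CodeHash, NativeFn>>>;

/// One background thread that drains a queue of compile requests.
pub struct CompilePool {
    tx: Sender<(CodeHash, Vec<u8>)>,
    handle: JoinHandle<()>,
}

// Placeholder for generated machine code.
fn fake_native(x: u64) -> u64 {
    x + 1
}

impl CompilePool {
    fn new(cache: Cache) -> Self {
        let (tx, rx) = channel::<(CodeHash, Vec<u8>)>();
        let handle = thread::spawn(move || {
            for (hash, _bytecode) in rx {
                // A real pool would invoke the compiler on the bytecode here.
                cache.lock().unwrap().insert(hash, fake_native as NativeFn);
            }
        });
        Self { tx, handle }
    }
}

pub struct ExtCompileWorker {
    pool: Option<CompilePool>,
    cache: Cache,
}

impl ExtCompileWorker {
    /// `enabled = false` models a worker without a compile pool.
    pub fn new(enabled: bool) -> Self {
        let cache: Cache = Arc::default();
        let pool = enabled.then(|| CompilePool::new(Arc::clone(&cache)));
        Self { pool, cache }
    }

    /// Memory-only lookup in this sketch.
    pub fn get_function(&self, hash: &CodeHash) -> FetchedFnResult {
        let found = self.cache.lock().unwrap().get(hash).copied();
        match found {
            Some(f) => FetchedFnResult::Found(f),
            None => FetchedFnResult::NotFound,
        }
    }

    /// Queue a compilation request; a no-op when there is no pool.
    pub fn spawn(&self, hash: CodeHash, bytecode: Vec<u8>) {
        if let Some(pool) = &self.pool {
            let _ = pool.tx.send((hash, bytecode));
        }
    }

    /// Close the queue and wait for pending work (keeps the sketch deterministic).
    pub fn finish(&mut self) {
        if let Some(CompilePool { tx, handle }) = self.pool.take() {
            drop(tx); // closing the channel ends the worker loop
            handle.join().unwrap();
        }
    }
}
```

Because spawn only enqueues work, the caller never blocks on compilation; a lookup made before the background thread finishes simply reports NotFound, which is what lets the handler fall back to the interpreter.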
2. CompilerHandler (crates/vm/src/compiler.rs)
Integrates with revm's execution flow using the Handler trait.
Explanation:
- Implements the revm::Handler trait.
- The key method is frame_call, which gets invoked before revm runs contract code.
- Inside frame_call, it gets the code_hash of the contract.
- It calls self.worker.get_function() to check the cache.
- If found (FetchedFnResult::Found), it should ideally execute the returned native function pointer. (Note: Current integration might still use the interpreter as a placeholder).
- If not found (FetchedFnResult::NotFound), it calls self.worker.spawn() to request compilation and then proceeds with the standard interpreter for this run.
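The branch on the lookup result can be sketched in isolation. The code below does not implement revm::Handler: the Worker trait, Outcome, InstantWorker, and the interpret closure are hypothetical stand-ins, and the Found branch calls the function pointer directly, which the note above says the current integration may not do yet.

```rust
use std::cell::RefCell;
use std::collections::HashMap;

pub type CodeHash = [u8; 32];
/// Stand-in for a compiled entry point (assumed signature).
pub type NativeFn = fn(u64) -> u64;

pub enum FetchedFnResult {
    Found(NativeFn),
    NotFound,
}

/// The two operations the handler needs from a worker (illustrative trait).
pub trait Worker {
    fn get_function(&self, hash: &CodeHash) -> FetchedFnResult;
    fn spawn(&self, hash: CodeHash, bytecode: &[u8]);
}

/// Which path produced the result of a call.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Native(u64),
    Interpreted(u64),
}

pub struct CompilerHandler<W: Worker> {
    pub worker: W,
}

impl<W: Worker> CompilerHandler<W> {
    /// Decide between native code and the interpreter for one call.
    pub fn frame_call(
        &self,
        hash: CodeHash,
        bytecode: &[u8],
        input: u64,
        interpret: impl Fn(&[u8], u64) -> u64,
    ) -> Outcome {
        match self.worker.get_function(&hash) {
            FetchedFnResult::Found(f) => Outcome::Native(f(input)),
            FetchedFnResult::NotFound => {
                // Request compilation, then interpret this run.
                self.worker.spawn(hash, bytecode);
                Outcome::Interpreted(interpret(bytecode, input))
            }
        }
    }
}

/// Toy worker that "compiles" synchronously inside spawn.
#[derive(Default)]
pub struct InstantWorker {
    compiled: RefCell<HashMap<CodeHash, NativeFn>>,
}

// Placeholder for generated machine code.
fn double(x: u64) -> u64 {
    x * 2
}

impl Worker for InstantWorker {
    fn get_function(&self, hash: &CodeHash) -> FetchedFnResult {
        let found = self.compiled.borrow().get(hash).copied();
        match found {
            Some(f) => FetchedFnResult::Found(f),
            None => FetchedFnResult::NotFound,
        }
    }

    fn spawn(&self, hash: CodeHash, _bytecode: &[u8]) {
        self.compiled.borrow_mut().insert(hash, double as NativeFn);
    }
}
```

Keeping the worker behind a small trait makes the dispatch logic testable with a toy worker like InstantWorker; both paths must return the same value for the same input, since compiled code is only an optimization of what the interpreter computes.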
Conclusion
The ExtCompileWorker and CompilerHandler are optional but powerful components in metis-sdk designed to accelerate EVM execution. By translating frequently used EVM bytecode into native machine code (using JIT or AOT strategies) and caching the results, they reduce the overhead of interpretation.
- ExtCompileWorker is the engine performing the compilation and managing the cache.
- CompilerHandler integrates this process into the revm execution flow, checking the cache and coordinating compilation requests.
While the exact performance gains depend on the workload and contracts, this compilation step is a common technique for optimizing virtual machine performance.
We've now explored scheduling, memory management, database interaction, and even execution optimization through compilation. What other kinds of advanced features can extend the execution capabilities? Let's look at a specific example related to AI integration next.