EIP-4573: Entry Points and Procedures for EVM Code Sections
This proposal introduces EVM procedures as entry points to EOF code sections.
Author | Greg Colvin |
---|---|
Discussions-To | https://ethereum-magicians.org/t/eip-4573-entry-points-and-procedures-for-evm-code-sections/7776 |
Status | Draft |
Type | Standards Track |
Category | Core |
Created | 2021-12-16 |
Requires | 2315, 3540, 3670, 3779, 4200 |
Abstract
EOF sections and EVM instructions are introduced to provide procedural entry points to code sections.
Motivation
Currently the Ethereum Object Format supports a single code section, with no further means of structuring the code.
This proposal adds a layer of structure by supporting additional code sections. To make these sections accessible, it specifies a table of entry points for each additional section. Each entry point corresponds to a procedure within a code section.
We define procedures as blocks of code that can be entered only at their entry point, and at other points can call other procedures and subroutines and return to the procedure that called them.
Specification
EOF Sections
Code sections after the first must be immediately preceded by an entry point section. These sections can only be entered at one of the specified entry points in the code section, given as an offset relative to the beginning of the code section.
Entry Point Section
description | length | encoding |
---|---|---|
section_kind | 1-byte | Encoded as an 8-bit unsigned number. |
section_size | 2-bytes | Encoded as a 16-bit unsigned big-endian number. |
entry_array | [] | |
entry_point | 2-bytes | Encoded as a 16-bit unsigned big-endian number. |
… | … | … |
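As a minimal sketch of how a client might read the layout in the table above, the following Python function decodes one entry point section. It assumes that section_size gives the byte length of entry_array, which this proposal does not state explicitly. The function name and the section kind value 0x7F in the example are hypothetical placeholders, not values taken from this proposal.

```python
import struct


def parse_entry_point_section(data: bytes, offset: int = 0):
    """Decode one entry point section starting at `offset`.

    Returns (section_kind, entry_points, next_offset).
    ASSUMPTION: section_size is the byte length of entry_array.
    """
    # section_kind: 8-bit unsigned; section_size: 16-bit unsigned big-endian.
    section_kind, section_size = struct.unpack_from(">BH", data, offset)
    if section_size % 2 != 0:
        raise ValueError("entry_array must hold whole 2-byte entry points")
    body_start = offset + 3
    body_end = body_start + section_size
    if body_end > len(data):
        raise ValueError("section extends past the end of the input")
    count = section_size // 2
    # Each entry_point: 16-bit unsigned big-endian offset into the code section.
    entry_points = list(struct.unpack_from(f">{count}H", data, body_start))
    return section_kind, entry_points, body_end


# Hypothetical section: kind 0x7F, size 4, entry points 0 and 42.
example = bytes([0x7F, 0x00, 0x04, 0x00, 0x00, 0x00, 0x2A])
kind, entries, next_offset = parse_entry_point_section(example)
print(kind, entries, next_offset)  # 127 [0, 42] 7
```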
Call Frame Stack
These instructions make use of a frame stack to allocate and free frames of data for procedures in memory. Frame memory begins at address 0 in memory and grows downwards, towards more negative addresses. A frame is allocated for each procedure when it is called, and freed when it returns.
Memory can be addressed relative to the frame pointer FP or by absolute address. FP starts at 0, and moves downward to point just after the end of the frame for each CALLPROC, and to the start of that frame (which is just past the end of the previous frame) for a corresponding RETURNPROC.
For example, on entry to a procedure needing two words of data the frame stack might look like this
0-> . . . .
. . . .
FP->
Then, on entry to a called procedure needing three words of data the frame stack would look like this
0-> . . . .
. . . .
-64-> . . . .
. . . .
. . . .
FP->
On return from that procedure the frame stack would look like this
0-> . . . .
. . . .
FP-> . . . .
. . . .
. . . .
And after another return
FP-> . . . .
. . . .
. . . .
. . . .
. . . .
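The movement of FP in the diagrams above can be traced with a short Python sketch. It follows the push and pop behaviour given for CALLPROC and RETURNPROC, and assumes a word size of 32 bytes. The class and method names are illustrative only and are not part of this proposal.

```python
WORD_BYTES = 32  # size of one EVM word in bytes


class FrameStack:
    """Tracks FP as frames are allocated on call and freed on return."""

    def __init__(self):
        self.fp = 0       # FP starts at address 0
        self.saved = []   # saved FP values, one per active call

    def call(self, frame_words: int) -> int:
        # Save the caller's FP, then move FP down past the new frame.
        self.saved.append(self.fp)
        self.fp -= frame_words * WORD_BYTES
        return self.fp

    def ret(self) -> int:
        # Restore the FP that was saved by the matching call.
        self.fp = self.saved.pop()
        return self.fp


fs = FrameStack()
# Enter a two-word procedure, call a three-word procedure, then return twice.
trace = [fs.call(2), fs.call(3), fs.ret(), fs.ret()]
print(trace)  # [-64, -160, -64, 0]
```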
Instructions
ENTERPROC (0x??) n_inputs: uint2, n_outputs: uint2, n_locals: uint2
Marks the entry point to a procedure taking n_inputs arguments from the data stack, returning n_outputs values on the data stack, and reserving n_locals words of data in memory on the frame stack. Procedures can only be entered via a CALLPROC to their entry point.
Execution of ENTERPROC is invalid.
LEAVEPROC (0x??)
Marks the end of a procedure. Each ENTERPROC requires a closing LEAVEPROC.
Execution of LEAVEPROC from within the procedure it ends is equivalent to RETURNPROC. Otherwise it is invalid.
CALLPROC (0x??) code_id: byte, entry_point: byte
frame_stack.push(FP)
FP -= frame_size * 32
dest = (address of code_id section) + (entry_point offset) + 1
asm JUMPSUB dest
Push FP on the frame stack, set FP to point just past the new frame, then call the indicated procedure using JUMPSUB.
The code_id indexes each entry_array in order, from 1, except that code_id=0 means the current section. The entry_point indexes procedures in order of entry_point in the entry_array in the section, from 0.
Note: The required n_inputs words are known at validation time to be available on the data stack.
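One way to read the destination calculation for CALLPROC is sketched below in Python. The dictionaries keyed by code_id, the current_section argument, and the function name are assumptions made for illustration. The proposal itself only fixes the arithmetic: section address plus entry point offset plus 1.

```python
def resolve_callproc_dest(section_addrs, entry_arrays, current_section,
                          code_id, entry_point):
    """Return the JUMPSUB destination for CALLPROC code_id, entry_point.

    ASSUMPTIONS: section_addrs maps a code_id (counted from 1) to the start
    address of that code section, and entry_arrays maps the same code_id to
    its list of entry point offsets.
    """
    # code_id 0 means the section that is currently executing.
    section = current_section if code_id == 0 else code_id
    # entry_point indexes the section's entry_array, counting from 0.
    offset = entry_arrays[section][entry_point]
    # dest = (address of code_id section) + (entry_point offset) + 1
    return section_addrs[section] + offset + 1


# Hypothetical layout with two code sections that have entry arrays.
section_addrs = {1: 100, 2: 300}
entry_arrays = {1: [0, 40], 2: [0, 16, 90]}

far_dest = resolve_callproc_dest(section_addrs, entry_arrays, 1, 2, 1)
near_dest = resolve_callproc_dest(section_addrs, entry_arrays, 1, 0, 1)
print(far_dest, near_dest)  # 317 141
```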
RETURNPROC (0x??)
FP = frame_stack.pop()
asm RETURNSUB
Pop the frame stack and return to the calling procedure using RETURNSUB.
The promised n_outputs words are known (at validation time) to be available on the data stack.
MSTOREFP (0x??) offset: int2
*(u256*)(FP + offset) = *SP--
Pop a word from the top of the data stack to memory at FP + offset, decreasing stack size by one word.
MLOADFP (0x??) offset: int2, n_words: uint2
*SP++ = *(u256*)(FP + offset)
Push a word to the top of the data stack from memory at FP + offset, increasing stack size by one word.
GETFP (0x??)
*SP++ = FP
Push FP to the top of the data stack, increasing stack size by one word.
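The three frame-pointer instructions can be modelled with a small Python sketch. It assumes big-endian 32-byte words, a sparse byte-addressed memory that accepts negative addresses, and a two's-complement encoding of FP when it is pushed on the data stack. None of these details are fixed by this proposal, and the class and method names are illustrative.

```python
WORD_BYTES = 32


class FrameMemory:
    """Toy model of MSTOREFP, MLOADFP and GETFP over signed addresses."""

    def __init__(self):
        self.fp = 0        # frame pointer, a signed byte address
        self.stack = []    # data stack, top of stack at the end of the list
        self.memory = {}   # sparse memory: signed byte address -> byte value

    def mstorefp(self, offset: int) -> None:
        # Pop one word and write it to memory at FP + offset.
        value = self.stack.pop()
        base = self.fp + offset
        for i, byte in enumerate(value.to_bytes(WORD_BYTES, "big")):
            self.memory[base + i] = byte

    def mloadfp(self, offset: int) -> None:
        # Read one word from memory at FP + offset and push it.
        base = self.fp + offset
        raw = bytes(self.memory.get(base + i, 0) for i in range(WORD_BYTES))
        self.stack.append(int.from_bytes(raw, "big"))

    def getfp(self) -> None:
        # ASSUMPTION: a negative FP is pushed as a 256-bit two's-complement word.
        self.stack.append(self.fp % 2**256)


m = FrameMemory()
m.fp = -64           # as if a procedure with a two-word frame had been entered
m.stack.append(7)
m.mstorefp(0)        # store 7 in the word at FP + 0
m.mloadfp(0)         # load it back onto the data stack
m.getfp()
print(m.stack == [7, 2**256 - 64])  # True
```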
Memory Costs
Presently, MSTORE is defined as
memory[stack[0]...(stack[0]+31)] = stack[1]
memory_size' = max(memory_size, floor((stack[0]+32)÷32))
where memory_size is the number of active words of memory above 0.
We propose to treat memory addresses as signed, so the formula needs to be
memory'[stack[0]...(stack[0]+31)] = stack[1]
if ((stack[0]+32)÷32) < 0
  negative_memory_size' = max(negative_memory_size, floor(-(stack[0]+32)÷32))
else
  positive_memory_size' = max(positive_memory_size, floor((stack[0]+32)÷32))
where negative_memory_size is the number of active words of memory below 0 and positive_memory_size is the number of active words of memory above 0.
Rationale
There is actually not much new here. It amounts to EIP-615, broken into bite-sized pieces.
This proposal uses the EIP-2315 return stack to manage calls and returns, and steals existing ideas from EIP-615, EIP-3336 and EIP-3337. ENTERPROC is loosely modeled on BEGINSUB. Per EIP-3336 and EIP-3337 it moves call frames from the data stack to memory. Per EIP-615 it uses a frame stack to track call-frame addresses with FP as procedures are entered and left.
Note: The aliasing of the frame stack with ordinary memory supports the addressing of call frame data with ordinary stores and loads. This is generally useful, especially for languages like C that provide pointers to variables on the stack.
Backwards Compatibility
This proposal adds new EOF sections and EVM opcodes. It doesn’t remove or change the semantics of any existing opcodes, so there should be no backwards compatibility issues.
Security
Safe use of these constructs can be checked completely at validation time – per EIP-3779 – so there should be no security issues.
Copyright
Copyright and related rights waived via CC0.
Citation
Please cite this document as:
Greg Colvin, "EIP-4573: Entry Points and Procedures for EVM Code Sections [DRAFT]," Ethereum Improvement Proposals, no. 4573, December 2021. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-4573.