EIP-4573: Entry Points and Procedures for EVM Code Sections
This proposal introduces EVM procedures as entry points to EOF code sections.
| Author | Greg Colvin, Greg Colvin |
|---|---|
| Discussions-To | https://ethereum-magicians.org/t/eip-4573-entry-points-and-procedures-for-evm-code-sections/7776 |
| Status | Draft |
| Type | Standards Track |
| Category | Core |
| Created | 2021-12-16 |
| Requires | 2315, 3540, 3670, 3779, 4200 |
Table of Contents
Abstract
EOF sections and EVM instructions are introduced to provide procedural entry points to code sections.
Motivation
Currently the Ethereum Object Format supports a single code section, with no further means of structuring the code.
This proposal adds a layer of structure by supporting additional code sections. How to access these sections? This proposal specifies a table of entry points for each additional section. Each entry point corresponds to a procedure within a code section.
We define procedures as blocks of code that can be entered only at their entry point, and at other points can call other procedures and subroutines and return to the procedure that called them.
Specification
EOF Sections
Code sections after the first must be immediately preceded by an entry point section. These sections can only be entered at one of the specified entry points in the code section, given as an offset relative to the beginning of the code section.
Entry Point Section
| description | length | ||
|---|---|---|---|
| section_kind | 1-byte | Encoded as a 8-bit unsigned number. | |
| section_size | 2-bytes | Encoded as a 16-bit unsigned big-endian number. | |
| entry_array | [] | ||
| entry_point | 2-bytes | Encoded as a 16-bit unsigned big-endian number. | |
| … | … | … |
Call Frame Stack
These instructions make use of a frame stack to allocate and free frames of data for procedures in memory. Frame memory begins at address 0 in memory and grows downwards, towards more negative addresses. A frame is allocated for each procedure when it is called, and freed when it returns.
Memory can be addressed relative to the frame pointer FP or by absolute address. FP starts at 0, and moves downward to point just after end of the frame for each CALLPROC, and to the start of that frame – which is just past the end of the previous frame – for a corresponding RETURNPROC.
For example, on entry to a procedure needing two words of data the frame stack might look like this
0-> . . . .
. . . .
FP->
Then, on entry to a called procedure needing three words of data the frame stack would like this
0-> . . . .
. . . .
64-> . . . .
. . . .
. . . .
FP->
On return from that procedure the frame stack would like this
0-> . . . .
. . . .
FP-> . . . .
. . . .
. . . .
And after another return
FP-> . . . .
. . . .
. . . .
. . . .
. . . .
Instructions
ENTERPROC (0x??) n_inputs: uint2, n_outputs: uint2, n_locals: uint2
Marks the entry point to a procedure taking
n_inputsarguments from the data stack, returningn_outputsvalues on thedata stack, and reservingn_localswords of data in memory on theframe stack. Procedures can only be entered via aCALLPROCto their entry point.Execution of
ENTERPROCis invalid.
LEAVEPROC (0x??)
Marks the end of a procedure. Each
ENTERPROCrequires a closingLEAVEPROC.Execution of
LEAVEPROCfrom within the procedure it ends is equivalent toRETURNPROC. Otherwise it is invalid.
CALLPROC (0x??) code_id: byte, entry_point: byte
frame_stack.push(FP)
FP -= frame_size * 32
dest = (address of code_id section) + (entry_point offset) + 1
asm JUMPSUB dest
Push
FPon theframe stack, setFPto point just past the new frame, then call the indicated procedure usingJUMPSUB.The
code_idindexes eachentry_arrayin order, from 1, except thatcode_id=0means the current section. Theentry_pointindexes procedures in order ofentery_pointin theentry_arrayin the section, from 0.*Note: The required
n_inputswords are known at validation time to be available on thedata stack.
RETURNPROC (0x??)
FP = frame_stack.pop()
asm RETURNSUB
Pop the
frame stackand return to the calling procedure usingRETURNSUB.
The promised n_outputs words are known (at validation time) to be available on the frame stack.
MSTOREFP (0x??) offset: int2
*(u256*)(FP + offset) = *SP--
Pop a word from the top of the data stack to memory at
FP + offset, decreasing stack size by one word.
MLOADFP (0x??) offset: int2, n_words: uint2
*SP++ = *(u256*)(FP + offset)
Push a word to the top of the data stack from memory at
FP + offset, increasing stack size by one word.
GETFP (0x??)
*SP++ = FP
Push
FPto the top of the data stack, increasing stack size by one word.
Memory Costs
Presently,MSTORE is defined as
memory[stack[0]...(stack[0]+31) = stack[1]
memory_size' = max(memory_size,floor(stack[0]+32)÷32))
where memory_size is the number of active words of memory above 0.
We propose to treat memory addresses as signed, so the formula needs to be
memory'[stack[0]...(stack[0]+31) = stack[1]
if (stack[0])+32)÷32) < 0
negative_memory_size' = max(negative_memory_size,floor(-(stack[0])+32)÷32))
else
positive_memory_size' = max(positive_memory_size,floor((stack[0])+32)÷32))
where negative_memory_size is the number of active words of memory below 0 and positive_memory_size is the number of active words of memory above 0.
Rationale
There is actually not much new here. It amounts to EIP-615, broken into bite-sized pieces.
This proposal uses the EIP-2315 return stack to manage calls and returns, and steals existing ideas from EIP-615, EIP-3336 and EIP-3337. ENTERPROC is loosely modeled on BEGINSUB. Per EIP-3336 and EIP-3337 it moves call frames from the data stack to memory. Per EIP-615 it uses a frame stack to track call-frame addresses with FP as procedures are entered and left.
Note: The aliasing of the frame stack with ordinary memory supports the addressing of call frame data with ordinary stores and loads. This is generally useful, especially for languages like C that provide pointers to variables on the stack.
Backwards Compatibility
This proposal adds new EOF sections and EVM opcodes. It doesn’t remove or change the semantics of any existing opcodes, so there should be no backwards compatibility issues.
Security
Safe use of these constructs can be checked completely at validation time – per EIP-3779 – so there should be no security issues.
Copyright
Copyright and related rights waived via CC0.
Citation
Please cite this document as:
Greg Colvin, Greg Colvin, "EIP-4573: Entry Points and Procedures for EVM Code Sections [DRAFT]," Ethereum Improvement Proposals, no. 4573, December 2021. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-4573.