⚠️ This EIP is not recommended for general use or implementation as it is likely to change.

EIP-4573: Entry Points and Procedures for EVM Code Sections

This proposal introduces EVM procedures as entry points to EOF code sections.

Author: Greg Colvin, Greg Colvin
Discussions-To: https://ethereum-magicians.org/t/eip-4573-entry-points-and-procedures-for-evm-code-sections/7776
Status: Draft
Type: Standards Track
Category: Core
Created: 2021-12-16
Requires: 2315, 3540, 3670, 3779, 4200

Abstract

EOF sections and EVM instructions are introduced to provide procedural entry points to code sections.

Motivation

Currently the Ethereum Object Format supports a single code section, with no further means of structuring the code.

This proposal adds a layer of structure by supporting additional code sections. To make these sections accessible, it specifies a table of entry points for each additional section. Each entry point corresponds to a procedure within a code section.

We define a procedure as a block of code that can be entered only at its entry point. From within its body it can call other procedures and subroutines, and it returns to the procedure that called it.

Specification

EOF Sections

Code sections after the first must be immediately preceded by an entry point section. Each such code section can be entered only at one of the entry points listed in its entry point section, each given as an offset relative to the beginning of the code section.

Entry Point Section

description      length   encoding
section_kind     1-byte   Encoded as an 8-bit unsigned number.
section_size     2-bytes  Encoded as a 16-bit unsigned big-endian number.
entry_array []
  entry_point    2-bytes  Encoded as a 16-bit unsigned big-endian number.
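
The layout in the table can be illustrated with a small decoder. This is only a sketch under stated assumptions: the EIP does not assign a section_kind value, so ENTRY_POINT_SECTION_KIND is a hypothetical placeholder, and the code assumes that section_size counts the bytes of entry_array and that the three fields are stored contiguously.

```python
import struct
from typing import List, Tuple

# Hypothetical value: the EIP does not assign a section_kind to entry point sections.
ENTRY_POINT_SECTION_KIND = 0x04


def parse_entry_point_section(data: bytes, offset: int = 0) -> Tuple[List[int], int]:
    """Decode one entry point section starting at `offset`.

    Returns the list of entry points and the offset just past the section.
    Assumes section_size counts the bytes of entry_array only.
    """
    # 1-byte kind followed by a 2-byte big-endian size.
    kind, size = struct.unpack_from(">BH", data, offset)
    if kind != ENTRY_POINT_SECTION_KIND:
        raise ValueError(f"unexpected section_kind {kind:#x}")
    if size % 2 != 0:
        raise ValueError("entry_array must hold whole 2-byte entries")
    body_start = offset + 3
    body_end = body_start + size
    if body_end > len(data):
        raise ValueError("section is truncated")
    # Each entry point is a 16-bit unsigned big-endian offset.
    count = size // 2
    entries = list(struct.unpack_from(f">{count}H", data, body_start))
    return entries, body_end


section = bytes([0x04, 0x00, 0x04, 0x00, 0x10, 0x01, 0x2A])
entries, end = parse_entry_point_section(section)
print(entries, end)  # [16, 298] 7
```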
 

Call Frame Stack

These instructions make use of a frame stack to allocate and free frames of data for procedures in memory. Frame memory begins at address 0 in memory and grows downwards, towards more negative addresses. A frame is allocated for each procedure when it is called, and freed when it returns.

Memory can be addressed relative to the frame pointer FP or by absolute address. FP starts at 0. Each CALLPROC moves it downward to point just past the end of the new frame, and the corresponding RETURNPROC moves it back to the start of that frame, which is just past the end of the previous frame.

For example, on entry to a procedure needing two words of data the frame stack might look like this:

     0-> . . . .
         . . . .
    FP->

Then, on entry to a called procedure needing three words of data the frame stack would look like this:

     0-> . . . .
         . . . .
   -64-> . . . .
         . . . .
         . . . .
    FP->

On return from that procedure the frame stack would look like this:

     0-> . . . .
         . . . .
    FP-> . . . .
         . . . .
         . . . .

And after another return:

    FP-> . . . .
         . . . .
         . . . .
         . . . .
         . . . .
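
The same sequence of calls and returns can be traced numerically. The sketch below is illustrative only: the FrameStack class and its method names are hypothetical, and it tracks nothing but the value of FP, assuming 32-byte words and the push/pop behaviour described for CALLPROC and RETURNPROC.

```python
WORD = 32  # bytes per EVM word


class FrameStack:
    """Tracks FP as procedures are called and return."""

    def __init__(self) -> None:
        self.fp = 0      # FP starts at address 0
        self.saved = []  # saved frame pointers, one per active call

    def call(self, frame_words: int) -> None:
        # Save the caller's FP, then move FP down past the new frame.
        self.saved.append(self.fp)
        self.fp -= frame_words * WORD

    def ret(self) -> None:
        # Restore the FP that was in effect before the matching call.
        self.fp = self.saved.pop()


frames = FrameStack()
trace = []
frames.call(2)  # procedure needing two words
trace.append(frames.fp)
frames.call(3)  # called procedure needing three words
trace.append(frames.fp)
frames.ret()
trace.append(frames.fp)
frames.ret()
trace.append(frames.fp)
print(trace)  # [-64, -160, -64, 0]
```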

Instructions

ENTERPROC (0x??) n_inputs: uint2, n_outputs: uint2, n_locals: uint2

Marks the entry point to a procedure taking n_inputs arguments from the data stack, returning n_outputs values on the data stack, and reserving n_locals words of data in memory on the frame stack. Procedures can only be entered via a CALLPROC to their entry point.

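Execution of ENTERPROC itself is invalid: CALLPROC transfers control to the position just after the entry point, not to the entry point.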

LEAVEPROC (0x??)

Marks the end of a procedure. Each ENTERPROC requires a closing LEAVEPROC.

Execution of LEAVEPROC from within the procedure it ends is equivalent to RETURNPROC. Otherwise it is invalid.

CALLPROC (0x??) code_id: byte, entry_point: byte

   frame_stack.push(FP)
   FP -= frame_size * 32
   dest = (address of code_id section) + (entry_point offset) + 1
   asm JUMPSUB dest

Push FP on the frame stack, set FP to point just past the new frame, then call the indicated procedure using JUMPSUB.

The code_id indexes each entry_array in order, from 1, except that code_id=0 means the current section. The entry_point indexes procedures in order of entry_point in that section's entry_array, from 0.

Note: The required n_inputs words are known at validation time to be available on the data stack.
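
The destination computation for CALLPROC might be sketched as follows. The function and its parameters (current_section, section_starts, entry_arrays) are hypothetical names introduced for illustration. The sketch assumes sections are keyed by the 1-based index of their entry_array and that the current section has an entry_array of its own.

```python
from typing import Dict, List


def resolve_callproc(
    code_id: int,
    entry_point: int,
    current_section: int,
    section_starts: Dict[int, int],
    entry_arrays: Dict[int, List[int]],
) -> int:
    """Return the JUMPSUB destination for CALLPROC code_id, entry_point."""
    # code_id 0 means the section that is currently executing.
    section = current_section if code_id == 0 else code_id
    entries = entry_arrays[section]
    if not 0 <= entry_point < len(entries):
        raise IndexError("entry_point is out of range for this section")
    # Entry points are offsets from the start of their code section;
    # the +1 follows the dest formula given for CALLPROC.
    return section_starts[section] + entries[entry_point] + 1


starts = {1: 100, 2: 300}
tables = {1: [0, 40], 2: [8]}
print(resolve_callproc(2, 0, 1, starts, tables))  # 309
print(resolve_callproc(0, 1, 1, starts, tables))  # 141
```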

RETURNPROC (0x??)

   FP = frame_stack.pop()
   asm RETURNSUB

Pop the frame stack and return to the calling procedure using RETURNSUB.

Note: The promised n_outputs words are known at validation time to be available on the data stack.

MSTOREFP (0x??) offset: int2

   *(u256*)(FP + offset) = *SP--

Pop a word from the top of the data stack to memory at FP + offset, decreasing stack size by one word.

MLOADFP (0x??) offset: int2, n_words: uint2

   *SP++ = *(u256*)(FP + offset)

Push a word to the top of the data stack from memory at FP + offset, increasing stack size by one word.

GETFP (0x??)

   *SP++ = FP

Push FP to the top of the data stack, increasing stack size by one word.
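
A toy model can make the three frame-relative instructions concrete. It is a sketch, not a reference implementation: the Machine class is hypothetical, memory is modeled as a dictionary from signed byte address to word (so unaligned, overlapping accesses are not modeled), untouched memory is assumed to read as zero, and FP is kept as a signed Python integer rather than in any particular 256-bit encoding.

```python
from typing import Dict, List


class Machine:
    """Toy model: a data stack plus word-granular memory with signed addresses."""

    def __init__(self) -> None:
        self.fp = 0
        self.stack: List[int] = []
        self.memory: Dict[int, int] = {}  # byte address of a word -> value

    def mstorefp(self, offset: int) -> None:
        # Pop the top of the data stack into memory at FP + offset.
        self.memory[self.fp + offset] = self.stack.pop()

    def mloadfp(self, offset: int) -> None:
        # Push the word at FP + offset; untouched memory is assumed to read as zero.
        self.stack.append(self.memory.get(self.fp + offset, 0))

    def getfp(self) -> None:
        # Push FP, kept here as a signed Python int.
        self.stack.append(self.fp)


m = Machine()
m.fp = -64           # as if a two-word frame had just been allocated
m.stack.append(7)
m.mstorefp(0)        # first local, stored at address -64
m.mloadfp(0)         # read it back onto the data stack
m.getfp()
print(m.stack)       # [7, -64]
```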

Memory Costs

Presently, MSTORE is defined as

   memory'[stack[0]...(stack[0]+31)] = stack[1]
   memory_size' = max(memory_size, floor((stack[0]+32)÷32))

where memory_size is the number of active words of memory above 0.

We propose to treat memory addresses as signed, so the formula needs to be

   memory'[stack[0]...(stack[0]+31)] = stack[1]
   if (stack[0]+32)÷32 < 0
      negative_memory_size' = max(negative_memory_size, floor((-stack[0]+32)÷32))
   else
      positive_memory_size' = max(positive_memory_size, floor((stack[0]+32)÷32))

where negative_memory_size is the number of active words of memory below 0 and positive_memory_size is the number of active words of memory above 0.
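
For readers who want to experiment with the proposed accounting, the two-branch update can be transcribed directly. This sketch follows the formulas as written and makes no claim beyond that. The function name is hypothetical, sizes are counted in words, and the grouping of the negative branch is read as floor((-stack[0]+32)÷32).

```python
from typing import Tuple


def update_memory_sizes(
    address: int, negative_size: int, positive_size: int
) -> Tuple[int, int]:
    """Apply the two-branch size update for a 32-byte store at `address`.

    Python's // is floor division, matching the floor(...) in the formulas.
    """
    if (address + 32) // 32 < 0:
        # Store below address 0: only the negative size can grow.
        negative_size = max(negative_size, (-address + 32) // 32)
    else:
        # Store at or above address 0: only the positive size can grow.
        positive_size = max(positive_size, (address + 32) // 32)
    return negative_size, positive_size


neg, pos = update_memory_sizes(64, negative_size=0, positive_size=0)
neg, pos = update_memory_sizes(-64, neg, pos)
```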

Rationale

There is actually not much new here. It amounts to EIP-615, broken into bite-sized pieces.

This proposal uses the EIP-2315 return stack to manage calls and returns, and steals existing ideas from EIP-615, EIP-3336 and EIP-3337. ENTERPROC is loosely modeled on BEGINSUB. Per EIP-3336 and EIP-3337 it moves call frames from the data stack to memory. Per EIP-615 it uses a frame stack to track call-frame addresses with FP as procedures are entered and left.

Note: The aliasing of the frame stack with ordinary memory supports the addressing of call frame data with ordinary stores and loads. This is generally useful, especially for languages like C that provide pointers to variables on the stack.

Backwards Compatibility

This proposal adds new EOF sections and EVM opcodes. It doesn’t remove or change the semantics of any existing opcodes, so there should be no backwards compatibility issues.

Security

Safe use of these constructs can be checked completely at validation time – per EIP-3779 – so there should be no security issues.

Copyright and related rights waived via CC0.

Citation

Please cite this document as:

Greg Colvin, Greg Colvin, "EIP-4573: Entry Points and Procedures for EVM Code Sections [DRAFT]," Ethereum Improvement Proposals, no. 4573, December 2021. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-4573.