⚠️ This EIP is not recommended for general use or implementation as it is likely to change.

EIP-2315: Simple Subroutines for the EVM

This proposal introduces two opcodes to support simple subroutines.

Author	Greg Colvin, Greg Colvin, Martin Holst Swende, Brooklyn Zelenka
Discussions-To	https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941
Status	Draft
Type	Standards Track
Category	Core
Created	2019-10-17
Requires	3540, 3670, 3779, 4200

Abstract
Motivation
Specification
- Instructions
- Safety
Rationale
Backwards Compatibility
Security Considerations
Test Cases
Copyright

Abstract

This proposal introduces two opcodes to support simple subroutines: RJUMPSUB and RETURNSUB.

Taken together with other recent propoposals it provides an efficient, static, and safe control-flow facility.

Motivation

The EVM does not provide subroutines as primitives.

Instead, calls can be synthesized by fetching and pushing the return address and subroutine address on the data stack and executing a JUMP to the subroutine; returns can be synthesized by getting the return address to the top of the stack and jumping back to it. These conventions cost gas and use stack slots unnecessarily. And they create unnecessary complexity that is borne by the humans and programs writing, reading, and analyzing EVM code.

Subroutines

As an alternative, we propose to provide for subroutines with two simple operations:

RJUMPSUB to call a subroutine and
RETURNSUB to return from it.

Taken together with other recent propoposals they provide an efficient, static, and safe control-flow facility.

Efficient. Substantial reductions in the complexity and the gas costs of calling and optimizing simple subroutines – as much as 56% savings in gas in the analysis below. Substantial performance advantages for static jumps are reported elsewhere.

Static. All possible jumps are known at contract creation time.

Safe. Valid contracts will not halt with an exception unless they run out of gas or recursively overflow stack.

History

Facilities to directly support subroutines are provided by all but one of the real and virtual machines we have programmed, including the Burroughs 5000, CDC 7600, IBM 360, DEC PDP 11 and VAX, Motorola 68000, Sun SPARC, a few generations of Intel silicon, ARM, UCSD p-machine, Sun JVM, Wasm, and the sole exception – the EVM. In whatever form, these operations provide for

capturing the current context of execution,
transferring control to a new context, and
returning to the original context
- after possible further transfers of control
- to some arbitrary depth.

The concept goes back to Turing’s Automatic Computing Engine of 1946:

… We also wish to be able to arrange for the splitting up of operations into subsidiary operations. This should be done in such a way that once we have written down how an operation is done we can use it as a subsidiary to any other operation. … When we wish to start on a subsidiary operation we need only make a note of where we left off the major operation and then apply the first instruction of the subsidiary. When the subsidiary is over we look up the note [on] a list of these notes in one or more standard size delay lines, (1024) with the most recent last…

Turing’s machine held data in delay lines made of mercury-filled crystals. We have better hardware now, but the concept is simple and by now familiar: Turing’s “subsidiary operations” are subroutines, and his “list of notes” is a stack of return addresses.

We propose to follow Turing’s lead in the design of our subroutine facility, as specified below.

Efficiency Analysis

We show here how these instructions can be used to reduce the complexity and gas costs of both ordinary subroutine calls and low-level optimizations compared to using JUMP.

We assume that RJUMPSUB costs 5 gas, and RETURNSUB costs 8 gas. We justify these costs below, but they do not make a large difference in this analysis, as much of the improvement is due to PUSH and SWAP operations that are no longer needed.

Simple Subroutine Call

Consider this example of calling a fairly minimal subroutine.

Subroutine call, using JUMP:

TEST_SQUARE:
    jumpdest        ; 1 gas
    RTN_SQUARE      ; 3 gas
    0x02            ; 3 gas
    SQUARE          ; 3 gas
    jump            ; 8 gas
RTN_SQUARE:
    jumpdest        ; 1 gas
    swap1           ; 3 gas
    jump            ; 8 gas

SQUARE:
    jumpdest        ; 1 gas
    dup1            ; 3 gas
    mul             ; 5 gas
    swap1           ; 3 gas
    jump            ; 8 gas

Total: 50 gas

Subroutine call, using RJUMPSUB:

TEST_SQUARE:
    0x02            ; 3 gas
    jumpsub SQUARE  ; 5 gas
    returnsub       ; 3 gas

SQUARE:
    dup1            ; 3 gas
    mul             ; 5 gas
    returnsub       ; 3 gas

Total 22 gas.

Using RJUMPSUB versus JUMP saves 50-22=28 gas – a 56% improvement.

Tail Call Optimization

Of course in cases like this one we can optimize the tail call, so that the return from SQUARE actually returns from TEST_SQUARE.

Tail call optimization, using JUMP:

TEST_SQUARE:
    jumpdest        ; 1 gas
    0x02            ; 3 gas
    SQUARE          ; 3 gas
    jump            ; 8 gas

SQUARE:
    jumpdest        ; 1 gas
    dup1            ; 3 gas
    mul             ; 5 gas
    swap1           ; 3 gas
    jump            ; 8 gas

Total: 33 gas

Tail call optimization, using RJUMP and RETURNSUB:

TEST_SQUARE:
    0x02            ; 3 gas
    rjump SQUARE    ; 3 gas

SQUARE:
    dup1            ; 3 gas
    mul             ; 5 gas
    returnsub       ; 3 gas

Total: 17 gas

Using RJUMPSUB versus JUMP saves 33-17=16 gas – a 48% improvement.

Tail Call Elimination

We can even take advantage of SQUARE just happening to directly follow TEST_SQUARE and just fall through rather than jump at all.

Tail call elimination, using JUMP:

TEST_SQUARE:
    jumpdest        ; 1 gas
    0x02            ; 3 gas

SQUARE:
    jumpdest        ; 1 gas
    dup1            ; 3 gas
    mul             ; 5 gas
    swap1           ; 3 gas
    jump            ; 8 gas

Total: 24 gas

Tail call elimination, using JUMPSUB:

TEST_SQUARE:
    0x02            ; 3 gas

SQUARE:
    dup1            ; 3 gas
    mul             ; 5 gas
    returnsub       ; 3 gas

Total 14 gas.

Using RETURNSUB versus JUMP saves 24-14=10 gas – a 42% improvement.

Call Using Data Stack

We can also consider an alternative call mechanism – call it JALTSUB – that pushes its return address on the data_stack:

TEST_SQUARE:
    jumpdest        ; 1 gas
    0x02            ; 3 gas
    jaltsub SQUARE  ; 6 gas
    returnsub       ; 3 gas

SQUARE:
    jumpdest        ; 1 gas
    dup1            ; 3 gas
    mul             ; 5 gas
    swap1           ; 3 gas
    returnsub       ; 3 gas

Total 29 gas, compared to 22 gas for the RJUMPSUB version. (Setting the gas cost to 6 because using the wide-integer data_stack is less efficient than using a stack of native integers.)

Conlusion

We can see that these instructions provide a simpler and more gas-efficient subroutine mechanism than using JUMP.

Clearly, the benefits of these efficiencies are greater for programs that have been factored into smaller subroutines, but a routine could use 200 more gas than our first example and RJUMPSUB would still use better than 10% less gas than JUMP.

Note: A stack-rolling operator to move the return address to the top of the stack and implicitly shift the intervening items could simplify code using JUMP. It would be an expensive operation with a dynamic gas cost.

Specification

To support calls and returns we specify an EVM return stack in addition to the existing data stack. The return stack, like the data stack, is limited to 1024 items. This stack supports two new instructions for subroutines.

Instructions

`RJUMPSUB (0x5e) jmpdest`

Transfers control to a subroutine. The destination address is relative to current PC.

Decode the jmpdest from the immediate data. The data is encoded as two bytes, MSB-first.

Set PC to PC + jmpdest.

Push PC + 1 on the return stack.

The cost is low.

`RETURNSUB (0x5f)`

Returns control to the caller of a subroutine.

Pop PC off the return stack.

The cost is verylow.

Notes:

If a resulting PC to be executed is beyond the last instruction then the opcode is implicitly a STOP, which is not an error.
Values popped off the return stack do not need to be validated, since they are alterable only by RJUMPSUB and RETURNSUB.
The description above lays out the semantics of these instructions in terms of a return stack. But the actual state of the return stack is not observable by EVM code or consensus-critical to the protocol. (For example, a node implementer may code RJUMPSUB to unobservably push PC on the return stack rather than PC + 1, which is allowed so long as RETURNSUB observably returns control to the PC + 1 location.)
The return stack is the functional equivalent of Turing’s “delay line”.

The low cost of RJUMPSUB versus JUMP is justified by needing only to add the immediate two byte destination to the PC and push the return address on the return stack, all using native arithmetric, versus using the data stack with emulated 256-bit instructions.

The verylow cost of RETURNSUB is justified by needing only to pop the return stack into the PC. Benchmarking will be needed to tell if the costs are well-balanced.

Safety

We define safety here as avoiding exceptional halting states, as defined in the Yellow Paper. A validator can always detect three of these states at validation time:

Insufficient stack items
Invalid jump destination
Invalid instruction

A validator can detect stack overflow only for non-recursive programs.

So two states will still require tests at runtime:

Stack overflow
Insufficient gas

EIP-3779 specifies creation-time validation to detect invalid contracts.

Rationale

There are at least two designs for a subroutine facility.

Turing’s design is to keep return addresses on a dedicated return stack.

The instruction to call a subroutine will

push the return address onto the return stack and
jump to the first instruction of the subroutine.

The instruction to return from a subroutine will

jump to the address popped off of thereturn stack.

We have chosen Turing’s design, as have Forth, the JVM, Wasm, and others.

An alternative design is to keep return addresses on the data stack.

The instruction to call a subroutine will

push the return address onto the data stack and
jump to the first instruction of the subroutine.

The instruction to return from a subroutine will

jump to the address popped off of the data stack. Note that the program must first ensure that the return address is the first element on the stack.

Similar designs are used by the register machines we have programmed, where subroutine calls use the stack for return addresses and arguments, whereas registers are used for computation. We give an example of the alternative design above.

We prefer Turing’s design for a few reasons.

It maintains a clear separation between calculation and flow of control. It’s impossible to overwrite the return stack, and the data stack is uncluttered with vulnerable return addresses.
It improves performance by
- using native arithmetic rather than 256-bit EVM instructions for the return address,
- not needing a data stack slot for the return address,
- and not needing to rearrange arguments, return values, and the return address.
It has a 75-year history of working well, including for the stack machines we have programmed.

Backwards Compatibility

These changes affect the semantics of existing EVM code.

These changes are compatible with EIP-3779. Contracts satisfying the conditions given there will be valid.

Security Considerations

These changes introduce new flow control instructions, so any software which does static/dynamic analysis of EVM code needs to be modified accordingly. The RJUMPSUB semantics are similar to JUMP whereas the RETURNSUB instruction is different, since it can ‘land’ on any opcode (but the possible destinations can be statically inferred).

Test Cases

Simple routine

This should jump into a subroutine, back out and stop.

Bytecode: 0x60045e005b5d (PUSH1 0x04, JUMPSUB, STOP, JUMPDEST, RETURNSUB)

Pc	Op	Cost	Stack	RStack
0	JUMPSUB	5	[]	[]
3	RETURNSUB	5	[]	[0]
4	STOP	0	[]	[]

Output: 0x Consumed gas: 10

Two levels of subroutines

This should execute fine, going into one two depths of subroutines

Bytecode: 0x6800000000000000000c5e005b60115e5d5b5d (PUSH9 0x00000000000000000c, JUMPSUB, STOP, JUMPDEST, PUSH1 0x11, JUMPSUB, RETURNSUB, JUMPDEST, RETURNSUB)

Pc	Op	Cost	Stack	RStack
0	JUMPSUB	5	[]	[]
3	JUMPSUB	5	[]	[0]
4	RETURNSUB	5	[]	[0,3]
5	RETURNSUB	5	[]	[3]
6	STOP	0	[]	[]

Consumed gas: 20

Failure 1: invalid jump

This should fail, since the given location is outside of the code-range. The code is the same as previous example, except that the pushed location is 0x01000000000000000c instead of 0x0c.

Bytecode: (PUSH9 0x01000000000000000c, JUMPSUB, 0x6801000000000000000c5e005b60115e5d5b5d, STOP, JUMPDEST, PUSH1 0x11, JUMPSUB, RETURNSUB, JUMPDEST, RETURNSUB)

Pc	Op	Cost	Stack	RStack
0	JUMPSUB	10	[18446744073709551628]	[]

Error: at pc=10, op=JUMPSUB: invalid jump destination

Failure 2: shallow `return stack`

This should fail at first opcode, due to shallow return_stack

Bytecode: 0x5d5858 (RETURNSUB, PC, PC)

Pc	Op	Cost	Stack	RStack
0	RETURNSUB	5	[]	[]

Error: at pc=0, op=RETURNSUB: invalid retsub

Subroutine at end of code

In this example. the JUMPSUB is on the last byte of code. When the subroutine returns, it should hit the ‘virtual stop’ after the bytecode, and not exit with error

Bytecode: 0x6005565b5d5b60035e (PUSH1 0x05, JUMP, JUMPDEST, RETURNSUB, JUMPDEST, PUSH1 0x03, JUMPSUB)

Pc	Op	Cost	Stack	RStack
0	PUSH1	3	[]	[]
2	JUMP	8	[5]	[]
5	JUMPDEST	1	[]	[]
6	JUMPSUB	5	[]	[]
2	RETURNSUB	5	[]	[2]
7	STOP	0	[]	[]

Consumed gas: 30

Copyright

Citation

Please cite this document as:

Greg Colvin, Greg Colvin, Martin Holst Swende, Brooklyn Zelenka, "EIP-2315: Simple Subroutines for the EVM [DRAFT]," Ethereum Improvement Proposals, no. 2315, October 2019. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-2315.

EIP-2315: Simple Subroutines for the EVM Source