EIP-2315: Simple Subroutines for the EVM
This proposal introduces two opcodes to support simple subroutines.
Author | Greg Colvin, Greg Colvin, Martin Holst Swende, Brooklyn Zelenka |
---|---|
Discussions-To | https://ethereum-magicians.org/t/eip-2315-simple-subroutines-for-the-evm/3941 |
Status | Draft |
Type | Standards Track |
Category | Core |
Created | 2019-10-17 |
Requires | 3540, 3670, 3779, 4200 |
Table of Contents
Abstract
This proposal introduces two opcodes to support simple subroutines: RJUMPSUB
and RETURNSUB
.
Taken together with other recent propoposals it provides an efficient, static, and safe control-flow facility.
Motivation
The EVM does not provide subroutines as primitives.
Instead, calls can be synthesized by fetching and pushing the return address and subroutine address on the data stack and executing a JUMP
to the subroutine; returns can be synthesized by getting the return address to the top of the stack and jumping back to it. These conventions cost gas and use stack slots unnecessarily. And they create unnecessary complexity that is borne by the humans and programs writing, reading, and analyzing EVM code.
Subroutines
As an alternative, we propose to provide for subroutines with two simple operations:
RJUMPSUB
to call a subroutine andRETURNSUB
to return from it.
Taken together with other recent propoposals they provide an efficient, static, and safe control-flow facility.
Efficient. Substantial reductions in the complexity and the gas costs of calling and optimizing simple subroutines – as much as 56% savings in gas in the analysis below. Substantial performance advantages for static jumps are reported elsewhere.
Static. All possible jumps are known at contract creation time.
Safe. Valid contracts will not halt with an exception unless they run out of gas or recursively overflow stack.
History
Facilities to directly support subroutines are provided by all but one of the real and virtual machines we have programmed, including the Burroughs 5000, CDC 7600, IBM 360, DEC PDP 11 and VAX, Motorola 68000, Sun SPARC, a few generations of Intel silicon, ARM, UCSD p-machine, Sun JVM, Wasm, and the sole exception – the EVM. In whatever form, these operations provide for
- capturing the current context of execution,
- transferring control to a new context, and
- returning to the original context
- after possible further transfers of control
- to some arbitrary depth.
The concept goes back to Turing’s Automatic Computing Engine of 1946:
… We also wish to be able to arrange for the splitting up of operations into subsidiary operations. This should be done in such a way that once we have written down how an operation is done we can use it as a subsidiary to any other operation. … When we wish to start on a subsidiary operation we need only make a note of where we left off the major operation and then apply the first instruction of the subsidiary. When the subsidiary is over we look up the note [on] a list of these notes in one or more standard size delay lines, (1024) with the most recent last…
Turing’s machine held data in delay lines made of mercury-filled crystals. We have better hardware now, but the concept is simple and by now familiar: Turing’s “subsidiary operations” are subroutines, and his “list of notes” is a stack of return addresses.
We propose to follow Turing’s lead in the design of our subroutine facility, as specified below.
Efficiency Analysis
We show here how these instructions can be used to reduce the complexity and gas costs of both ordinary subroutine calls and low-level optimizations compared to using JUMP
.
We assume that RJUMPSUB
costs 5 gas, and RETURNSUB
costs 8 gas. We justify these costs below, but they do not make a large difference in this analysis, as much of the improvement is due to PUSH
and SWAP
operations that are no longer needed.
Simple Subroutine Call
Consider this example of calling a fairly minimal subroutine.
Subroutine call, using JUMP
:
TEST_SQUARE:
jumpdest ; 1 gas
RTN_SQUARE ; 3 gas
0x02 ; 3 gas
SQUARE ; 3 gas
jump ; 8 gas
RTN_SQUARE:
jumpdest ; 1 gas
swap1 ; 3 gas
jump ; 8 gas
SQUARE:
jumpdest ; 1 gas
dup1 ; 3 gas
mul ; 5 gas
swap1 ; 3 gas
jump ; 8 gas
Total: 50 gas
Subroutine call, using RJUMPSUB
:
TEST_SQUARE:
0x02 ; 3 gas
jumpsub SQUARE ; 5 gas
returnsub ; 3 gas
SQUARE:
dup1 ; 3 gas
mul ; 5 gas
returnsub ; 3 gas
Total 22 gas.
Using RJUMPSUB
versus JUMP
saves 50-22=28 gas – a 56% improvement.
Tail Call Optimization
Of course in cases like this one we can optimize the tail call, so that the return from SQUARE
actually returns from TEST_SQUARE
.
Tail call optimization, using JUMP
:
TEST_SQUARE:
jumpdest ; 1 gas
0x02 ; 3 gas
SQUARE ; 3 gas
jump ; 8 gas
SQUARE:
jumpdest ; 1 gas
dup1 ; 3 gas
mul ; 5 gas
swap1 ; 3 gas
jump ; 8 gas
Total: 33 gas
Tail call optimization, using RJUMP
and RETURNSUB
:
TEST_SQUARE:
0x02 ; 3 gas
rjump SQUARE ; 3 gas
SQUARE:
dup1 ; 3 gas
mul ; 5 gas
returnsub ; 3 gas
Total: 17 gas
Using RJUMPSUB
versus JUMP
saves 33-17=16 gas – a 48% improvement.
Tail Call Elimination
We can even take advantage of SQUARE
just happening to directly follow TEST_SQUARE
and just fall through rather than jump at all.
Tail call elimination, using JUMP:
TEST_SQUARE:
jumpdest ; 1 gas
0x02 ; 3 gas
SQUARE:
jumpdest ; 1 gas
dup1 ; 3 gas
mul ; 5 gas
swap1 ; 3 gas
jump ; 8 gas
Total: 24 gas
Tail call elimination
, using JUMPSUB:
TEST_SQUARE:
0x02 ; 3 gas
SQUARE:
dup1 ; 3 gas
mul ; 5 gas
returnsub ; 3 gas
Total 14 gas.
Using RETURNSUB
versus JUMP
saves 24-14=10 gas – a 42% improvement.
Call Using Data Stack
We can also consider an alternative call mechanism – call it JALTSUB
– that pushes its return address on the data_stack
:
TEST_SQUARE:
jumpdest ; 1 gas
0x02 ; 3 gas
jaltsub SQUARE ; 6 gas
returnsub ; 3 gas
SQUARE:
jumpdest ; 1 gas
dup1 ; 3 gas
mul ; 5 gas
swap1 ; 3 gas
returnsub ; 3 gas
Total 29 gas, compared to 22 gas for the RJUMPSUB
version. (Setting the gas cost to 6 because using the wide-integer data_stack
is less efficient than using a stack of native integers.)
Conlusion
We can see that these instructions provide a simpler and more gas-efficient subroutine mechanism than using JUMP
.
Clearly, the benefits of these efficiencies are greater for programs that have been factored into smaller subroutines, but a routine could use 200 more gas than our first example and RJUMPSUB
would still use better than 10% less gas than JUMP
.
Note: A stack-rolling operator to move the return address to the top of the stack and implicitly shift the intervening items could simplify code using JUMP
. It would be an expensive operation with a dynamic gas cost.
Specification
To support calls and returns we specify an EVM return stack
in addition to the existing data stack
. The return stack
, like the data stack
, is limited to 1024
items. This stack supports two new instructions for subroutines.
Instructions
RJUMPSUB (0x5e) jmpdest
Transfers control to a subroutine. The destination address is relative to current PC.
- Decode the
jmpdest
from the immediate data. The data is encoded as two bytes, MSB-first.- Set
PC
toPC + jmpdest
.- Push
PC + 1
on thereturn stack
.The cost is low.
RETURNSUB (0x5f)
Returns control to the caller of a subroutine.
- Pop
PC
off thereturn stack
.The cost is verylow.
Notes:
- If a resulting
PC
to be executed is beyond the last instruction then the opcode is implicitly aSTOP
, which is not an error. - Values popped off the
return stack
do not need to be validated, since they are alterable only byRJUMPSUB
andRETURNSUB
. - The description above lays out the semantics of these instructions in terms of a
return stack
. But the actual state of thereturn stack
is not observable by EVM code or consensus-critical to the protocol. (For example, a node implementer may codeRJUMPSUB
to unobservably pushPC
on thereturn stack
rather thanPC + 1
, which is allowed so long asRETURNSUB
observably returns control to thePC + 1
location.) - The
return stack
is the functional equivalent of Turing’s “delay line”.
The low cost of RJUMPSUB
versus JUMP
is justified by needing only to add the immediate two byte destination to the PC
and push the return address on the return stack
, all using native arithmetric, versus using the data stack with emulated 256-bit instructions.
The verylow cost of RETURNSUB
is justified by needing only to pop the return stack
into the PC
. Benchmarking will be needed to tell if the costs are well-balanced.
Safety
We define safety here as avoiding exceptional halting states, as defined in the Yellow Paper. A validator can always detect three of these states at validation time:
- Insufficient stack items
- Invalid jump destination
- Invalid instruction
A validator can detect stack overflow only for non-recursive programs.
So two states will still require tests at runtime:
- Stack overflow
- Insufficient gas
EIP-3779 specifies creation-time validation to detect invalid contracts.
Rationale
There are at least two designs for a subroutine facility.
Turing’s design is to keep return addresses on a dedicated return stack
.
The instruction to call a subroutine will
- push the return address onto the
return stack
and - jump to the first instruction of the subroutine.
The instruction to return from a subroutine will
- jump to the address popped off of the
return stack
.
We have chosen Turing’s design, as have Forth, the JVM, Wasm, and others.
An alternative design is to keep return addresses on the data stack
.
The instruction to call a subroutine will
- push the return address onto the
data stack
and - jump to the first instruction of the subroutine.
The instruction to return from a subroutine will
- jump to the address popped off of the
data stack
. Note that the program must first ensure that the return address is the first element on the stack.
Similar designs are used by the register machines we have programmed, where subroutine calls use the stack for return addresses and arguments, whereas registers are used for computation. We give an example of the alternative design above.
We prefer Turing’s design for a few reasons.
- It maintains a clear separation between calculation and flow of control. It’s impossible to overwrite the return stack, and the data stack is uncluttered with vulnerable return addresses.
- It improves performance by
- using native arithmetic rather than 256-bit EVM instructions for the return address,
- not needing a
data stack
slot for the return address, - and not needing to rearrange arguments, return values, and the return address.
- It has a 75-year history of working well, including for the stack machines we have programmed.
Backwards Compatibility
These changes affect the semantics of existing EVM code.
These changes are compatible with EIP-3779. Contracts satisfying the conditions given there will be valid.
Security Considerations
These changes introduce new flow control instructions, so any software which does static/dynamic analysis of EVM code needs to be modified accordingly. The RJUMPSUB
semantics are similar to JUMP
whereas the RETURNSUB
instruction is different, since it can ‘land’ on any opcode (but the possible destinations can be statically inferred).
Test Cases
Simple routine
This should jump into a subroutine, back out and stop.
Bytecode: 0x60045e005b5d
(PUSH1 0x04, JUMPSUB, STOP, JUMPDEST, RETURNSUB
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | JUMPSUB | 5 | [] | [] |
3 | RETURNSUB | 5 | [] | [0] |
4 | STOP | 0 | [] | [] |
Output: 0x
Consumed gas: 10
Two levels of subroutines
This should execute fine, going into one two depths of subroutines
Bytecode: 0x6800000000000000000c5e005b60115e5d5b5d
(PUSH9 0x00000000000000000c, JUMPSUB, STOP, JUMPDEST, PUSH1 0x11, JUMPSUB, RETURNSUB, JUMPDEST, RETURNSUB
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | JUMPSUB | 5 | [] | [] |
3 | JUMPSUB | 5 | [] | [0] |
4 | RETURNSUB | 5 | [] | [0,3] |
5 | RETURNSUB | 5 | [] | [3] |
6 | STOP | 0 | [] | [] |
Consumed gas: 20
Failure 1: invalid jump
This should fail, since the given location is outside of the code-range. The code is the same as previous example,
except that the pushed location is 0x01000000000000000c
instead of 0x0c
.
Bytecode: (PUSH9 0x01000000000000000c, JUMPSUB,
0x6801000000000000000c5e005b60115e5d5b5d, STOP, JUMPDEST, PUSH1 0x11, JUMPSUB, RETURNSUB, JUMPDEST, RETURNSUB
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | JUMPSUB | 10 | [18446744073709551628] | [] |
Error: at pc=10, op=JUMPSUB: invalid jump destination
Failure 2: shallow return stack
This should fail at first opcode, due to shallow return_stack
Bytecode: 0x5d5858
(RETURNSUB, PC, PC
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | RETURNSUB | 5 | [] | [] |
Error: at pc=0, op=RETURNSUB: invalid retsub
Subroutine at end of code
In this example. the JUMPSUB is on the last byte of code. When the subroutine returns, it should hit the ‘virtual stop’ after the bytecode, and not exit with error
Bytecode: 0x6005565b5d5b60035e
(PUSH1 0x05, JUMP, JUMPDEST, RETURNSUB, JUMPDEST, PUSH1 0x03, JUMPSUB
)
Pc | Op | Cost | Stack | RStack |
---|---|---|---|---|
0 | PUSH1 | 3 | [] | [] |
2 | JUMP | 8 | [5] | [] |
5 | JUMPDEST | 1 | [] | [] |
6 | JUMPSUB | 5 | [] | [] |
2 | RETURNSUB | 5 | [] | [2] |
7 | STOP | 0 | [] | [] |
Consumed gas: 30
Copyright
Copyright and related rights waived via CC0.
Citation
Please cite this document as:
Greg Colvin, Greg Colvin, Martin Holst Swende, Brooklyn Zelenka, "EIP-2315: Simple Subroutines for the EVM [DRAFT]," Ethereum Improvement Proposals, no. 2315, October 2019. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-2315.