EIP-2477: Token Metadata Integrity
Author | Kristijan Sedlak, William Entriken, Witek Radomski |
---|---|
Discussions-To | https://github.com/ethereum/EIPs/issues/2483 |
Status | Stagnant |
Type | Standards Track |
Category | ERC |
Created | 2020-01-02 |
Requires | 165, 721, 1155 |
Table of Contents
Simple Summary
This specification defines a mechanism by which clients may verify that a fetched token metadata document has been delivered without unexpected manipulation.
This is the Web3 counterpart of the W3C Subresource Integrity (SRI) specification.
Abstract
An interface ERC2477
with two functions tokenURIIntegrity
and tokenURISchemaIntegrity
are specified for smart contracts and a narrative is provided to explain how this improves the integrity of the token metadata documents.
Motivation
Tokens are being used in many applications to represent, trace and provide access to assets off-chain. These assets include in-game digital items in mobile apps, luxury watches and products in our global supply chain, among many other creative uses.
Several token standards allow attaching metadata to specific tokens using a URI (RFC 3986) and these are supported by the applications mentioned above. These metadata standards are:
- ERC-721 metadata extension (
ERC721Metadata
) - ERC-1155 metadata extension (
ERC1155Metadata_URI
) - ERC-1046 (DRAFT) ERC-20 Metadata Extension
Although all these standards allow storing the metadata entirely on-chain (using the βdataβ URI, RFC 2397), or using a content-addressable system (e.g. IPFSβs Content IDentifiers [sic]), nearly every implementation we have found is using Uniform Resource Locators (the exception is The Sandbox which uses IPFS URIs). These URLs provide no guarantees of content correctness or immutability. This standard adds such guarantees.
Design
Approach A: A token contract may reference metadata by using its URL. This provides no integrity protection because the referenced metadata and/or schema could change at any time if the hosted content is mutable. This is the world before EIP-2477:
βββββββββββββββββββββββββ ββββββββββ ββββββββββ
β TokenID ββββββββΆβMetadataβββββββΆβ Schema β
βββββββββββββββββββββββββ ββββββββββ ββββββββββ
Note: according to the JSON Schema project, a metadata document referencing a schema using a URI in the $schema
key is a known approach, but it is not standardized.
Approach B: EIP-2477 provides mechanisms to establish integrity for these references. In one approach, there is integrity for the metadata document. Here, the on-chain data includes a hash of the metadata document. The metadata may or may not reference a schema. In this approach, changing the metadata document will require updating on-chain tokenURIIntegrity
:
βββββββββββββββββββββββββ ββββββββββ β β β β β
β TokenID ββββββββΆβMetadataββ β ββΆ Schema β
βββββββββββββββββββββββββ ββββββββββ β β β β β
βββββββββββββββββββββββββ β²
β tokenURIIntegrity ββββββββββββββ
βββββββββββββββββββββββββ
Approach C: In a stronger approach, the schema is referenced by the metadata using an extension to JSON Schema, providing integrity. In this approach, changing the metadata document or the schema will require updating on-chain tokenURIIntegrity
and the metadata document, additionally changing the schema requires updating the on-chain tokenURISchemaIntegrity
:
βββββββββββββββββββββββββ ββββββββββ ββββββββββ
β TokenID ββββββββΆβMetadataβββββββΆβ Schema β
βββββββββββββββββββββββββ ββββββββββ ββββββββββ
βββββββββββββββββββββββββ β²
β tokenURIIntegrity ββββββββββββββ
βββββββββββββββββββββββββ
Approach D: Equally strong, the metadata can make a normal reference (no integrity protection) to the schema and on-chain data also includes a hash of the schema document. In this approach, changing the metadata document will require updating on-chain tokenURIIntegrity
and updating the schema document will require updating the tokenURISchemaIntegrity
:
βββββββββββββββββββββββββ ββββββββββ ββββββββββ
β TokenID ββββββββΆβMetadataβββββββΆβ Schema β
βββββββββββββββββββββββββ ββββββββββ ββββββββββ
βββββββββββββββββββββββββ β² β²
β tokenURIIntegrity ββββββββββββββ β
βββββββββββββββββββββββββ β
βββββββββββββββββββββββββ β
βtokenURISchemaIntegrityββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββ
Approach E: Lastly, the schema can be referenced with integrity from the metadata and also using on-chain data. In this approach, changing the metadata document or the schema will require updating on-chain tokenURIIntegrity
and the metadata document, additionally changing the schema requires updating the on-chain tokenURISchemaIntegrity
:
βββββββββββββββββββββββββ ββββββββββ ββββββββββ
β TokenID ββββββββΆβMetadataβββββββΆβ Schema β
βββββββββββββββββββββββββ ββββββββββ ββββββββββ
βββββββββββββββββββββββββ β² β²
β tokenURIIntegrity ββββββββββββββ β
βββββββββββββββββββββββββ β
βββββββββββββββββββββββββ β
βtokenURISchemaIntegrityββββββββββββββββββββββββββββββ
βββββββββββββββββββββββββ
Specification
The key words βMUSTβ, βMUST NOTβ, βREQUIREDβ, βSHALLβ, βSHALL NOTβ, βSHOULDβ, βSHOULD NOTβ, βRECOMMENDEDβ, βMAYβ, and βOPTIONALβ in this document are to be interpreted as described in RFC 2119.
Smart contracts
Smart contracts implementing the ERC-2477 standard MUST implement the ERC2477
interface.
// SPDX-License-Identifier: CC0-1.0
pragma solidity ^0.8.7;
/// @title ERC-2477 Token Metadata Integrity
/// @dev See https://eips.ethereum.org/EIPS/eip-2477
/// @dev The ERC-165 identifier for this interface is 0x832a7e0e
interface ERC2477 /* is ERC165 */ {
/// @notice Get the cryptographic hash of the specified tokenID's metadata
/// @param tokenId Identifier for a specific token
/// @return digest Bytes returned from the hash algorithm, or "" if not available
/// @return hashAlgorithm The name of the cryptographic hash algorithm, or "" if not available
function tokenURIIntegrity(uint256 tokenId) external view returns(bytes memory digest, string memory hashAlgorithm);
/// @notice Get the cryptographic hash for the specified tokenID's metadata schema
/// @param tokenId Identifier for a specific token
/// @return digest Bytes returned from the hash algorithm, or "" if not available
/// @return hashAlgorithm The name of the cryptographic hash algorithm, or "" if not available
function tokenURISchemaIntegrity(uint256 tokenId) external view returns(bytes memory digest, string memory hashAlgorithm);
}
The returned cryptographic hashes correspond to the tokenβs metadata document and that metadata documentβs schema, respectively.
For example, with ERC-721 tokenURIIntegrity(21)
would correspond to tokenURI(21)
. With ERC-1155, tokenURIIntegrity(16)
would correspond to uri(16)
. In both cases, tokenURISchemaIntegrity(32)
would correspond to the schema of the document matched by tokenURIIntegrity(32)
.
Smart contracts implementing the ERC-2477 standard MUST implement the ERC-165 standard, including the interface identifiers above.
Smart contracts implementing the ERC-2477 standard MAY use any hashing or content integrity scheme.
Smart contracts implementing the ERC-2477 standard MAY use or omit a mechanism to notify when the integrity is updated (e.g. an Ethereum logging operation).
Smart contracts implementing the ERC-2477 standard MAY use any mechanism to provide schemas for metadata documents and SHOULD use JSON-LD on the metadata document for this purpose (i.e. "@schema":...
).
Metadata
A metadata document MAY conform to this schema to provide referential integrity to its schema document.
{
"title": "EIP-2477 JSON Object With Refererential Integrity to Schema",
"type": "object",
"properties": {
"$schema": {
"type": "string",
"format": "uri"
},
"$schemaIntegrity": {
"type": "object",
"properties": {
"digest": {
"type": "string"
},
"hashAlgorithm": {
"type": "string"
}
},
"required": ["digest", "hashAlgorithm"]
}
},
"required": ["$schema", "$schemaIntegrity"]
}
Clients
A client implementing the ERC-2477 standard MUST support at least the sha256
hash algorithm and MAY support other algorithms.
Caveats
- This EIP metadata lists ERC-721 and ERC-1155 as βrequiredβ for implementation, due to a technical limitation of EIP metadata. In actuality, this standard is usable with any token implementation that has a
tokenURI(uint id)
or similar function.
Rationale
Function and parameter naming
The W3C Subresource Integrity (SRI) specification uses the attribute βintegrityβ to perform integrity verification. This ERC-2477 standard provides a similar mechanism and reuses the integrity name so as to be familiar to people that have seen SRI before.
Function return tuple
The SRI integrity attribute encodes elements of the tuple \((cryptographic\ hash\ function, digest, options)\). This ERC-2477 standard returns a digest and hash function name and omits forward-compatibility options.
Currently, the SRI specification does not make use of options. So we cannot know what format they might be when implemented. This is the motivation to exclude this parameter.
The digest return value is first, this is an optimization because we expect on-chain implementations will be more likely to use this return value if they will only be using one of the two.
Function return types
The digest is a byte array and supports various hash lengths. This is consistent with SRI. Whereas SRI uses base64 encoding to target an HTML document, we use a byte array because Ethereum already allows this encoding.
The hash function name is a string. Currently there is no universal taxonomy of hash function names. SRI recognizes the names sha256
, sha384
and sha512
with case-insensitive matching. We are aware of two authorities which provide taxonomies and canonical names for hash functions: ETSI Object Identifiers and NIST Computer Security Objects Register. However, SRIβs approach is easier to follow and we have adopted this here.
Function return type β hash length
Clients must support the SHA-256 algorithm and may optionally support others. This is a departure from the SRI specification where SHA-256, SHA-384 and SHA-512 are all required. The rationale for this less-secure requirement is because we expect some clients to be on-chain. Currently SHA-256 is simple and cheap to do on Ethereum whereas SHA-384 and SHA-512 are more expensive and cumbersome.
The most popular hash function size below 256 bits in current use is SHA-1 at 160 bits. Multiple collisions (the βShatteredβ PDF file, the 320 byte file, the chosen prefix) have been published and a recipe is given to generate infinitely more collisions. SHA-1 is broken. The United States National Institute of Standards and Technology (NIST) has first deprecated SHA-1 for certain use cases in November 2015 and has later further expanded this deprecation.
The most popular hash function size above 256 bits in current use is SHA-384 as specified by NIST.
The United States National Security Agency requires a hash length of 384 or more bits for the SHA-2 (CNSA Suite Factsheet) algorithm suite for use on TOP SECRET networks. (No unclassified documents are currently available to specify use cases at higher classification networks.)
We suspect that SHA-256 and the 0xcert Asset Certification will be popular choices to secure token metadata for the foreseeable future.
In-band signaling
One possible way to achieve strong content integrity with the existing token standards would be to include, for example, a ?integrity=XXXXX
at the end of all URLs. This approach is not used by any existing implementations we know about. There are a few reasons we have not chosen this approach. The strongest reason is that the World Wide Web has the same problem and they chose to use the Sub-Resource Integrity approach, which is a separate data field than the URL.
Other supplementary reasons are:
-
For on-chain consumers of data, it is easier to parse a direct hash field than to perform string operations.
-
Maybe there are some URIs which are not amenable to being modified in that way, therefore limiting the generalizability of that approach.
This design justification also applies to tokenURISchemaIntegrity
. The current JSON-LD specification allows a JSON document to link to a schema document. But it does not provide integrity. Rather than changing how JSON-LD works, or changing JSON Schemas, we have the tokenURISchemaIntegrity
property to just provide the integrity.
Backwards Compatibility
Both ERC-721 and ERC-1155 provide compatible token metadata specifications that use URIs and JSON schemas. The ERC-2477 standard is compatible with both, and all specifications are additive. Therefore, there are no backward compatibility regressions.
ERC-1523 Standard for Insurance Policies as ERC-721 Non Fungible Tokens (DRAFT) proposes an extension to ERC-721 which also tightens the requirements on metadata. Because it is wholly an extension of ERC-721, ERC-1523 is automatically supported by ERC-2477 (since this standard already supports ERC-721).
ERC-1046 (DRAFT) ERC-20 Metadata Extension proposes a comparate extension for ERC-20. Such a concept is outside the scope of this ERC-2477 standard. Should ERC-1046 (DRAFT) be finalized, we will welcome a new ERC which copies ERC-2477 and removes the tokenId
parameter.
Similarly, ERC-918 (DRAFT) Mineable Token Standard proposes an extension for ERC-20 and also includes metadata. The same comment applies here as ERC-1046.
Test Cases
Following is a token metadata document which is simultaneously compatible with ERC-721, ERC-1155 and ERC-2477 standards.
{
"$schema": "https://URL_TO_SCHEMA_DOCUMENT",
"name": "Asset Name",
"description": "Lorem ipsum...",
"image": "https://s3.amazonaws.com/your-bucket/images/{id}.png"
}
This above example shows how JSON-LD is employed to reference the schema document ($schema
).
Following is a corresponding schema document which is accessible using the URI "https://URL_TO_SCHEMA_DOCUMENT"
above.
{
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Identifies the asset to which this NFT represents"
},
"description": {
"type": "string",
"description": "Describes the asset to which this NFT represents"
},
"image": {
"type": "string",
"description": "A URI pointing to a resource with mime type image/* representing the asset to which this NFT represents. Consider making any images at a width between 320 and 1080 pixels and aspect ratio between 1.91:1 and 4:5 inclusive."
}
}
}
Assume that the metadata and schema above apply to a token with identifier 1234. (In ERC-721 this would be a specific token, in ERC-1155 this would be a token type.) Then these two function calls MAY have the following output:
function tokenURIIntegrity(1234)
bytes digest
:3fc58b72faff20684f1925fd379907e22e96b660
string hashAlgorithm
:sha256
function tokenURISchemaIntegrity(1234)
bytes digest
:ddb61583d82e87502d5ee94e3f2237f864eeff72
string hashAlgorithm
:sha256
To avoid doubt: the previous paragraph specifies βMAYβ have that output because other hash functions are also acceptable.
Implementation
0xcert Framework supports ERC-2477.
Reference
Normative standard references
- RFC 2119 Key words for use in RFCs to Indicate Requirement Levels. https://www.ietf.org/rfc/rfc2119.txt
- ERC-165 Standard Interface Detection. ./eip-165.md
- ERC-721 Non-Fungible Token Standard. ./eip-721.md
- ERC-1155 Multi Token Standard. ./eip-1155.md
- JSON-LD. https://www.w3.org/TR/json-ld/
- Secure Hash Standard (SHS). https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf
Other standards
- ERC-1046 ERC-20 Metadata Extension (DRAFT). ./eip-1046.md
- ERC-918 Mineable Token Standard (DRAFT). ./eip-918.md
- ERC-1523 Standard for Insurance Policies as ERC-721 Non Fungible Tokens (DRAFT). ./eip-1523.md
- W3C Subresource Integrity (SRI). https://www.w3.org/TR/SRI/
- The βdataβ URL scheme. https://tools.ietf.org/html/rfc2397
- Uniform Resource Identifier (URI): Generic Syntax. https://tools.ietf.org/html/rfc3986
- CID [Specification] (DRAFT). https://github.com/multiformats/cid
Discussion
- JSON-LD discussion of referential integrity. https://lists.w3.org/Archives/Public/public-json-ld-wg/2020Feb/0003.html
- JSON Schema use of
$schema
key for documents. https://github.com/json-schema-org/json-schema-spec/issues/647#issuecomment-417362877
Other
- [0xcert Framework supports ERC-2477]. https://github.com/0xcert/framework/pull/717
- [Shattered] The first collision for full SHA-1. https://shattered.io/static/shattered.pdf
- [320 byte file] The second SHA Collision. https://privacylog.blogspot.com/2019/12/the-second-sha-collision.html
- [Chosen prefix] https://sha-mbles.github.io
- Transitions: Recommendation for Transitioning the Use of Cryptographic Algorithms and Key Lengths. (Rev. 1. Superseded.) https://csrc.nist.gov/publications/detail/sp/800-131a/rev-1/archive/2015-11-06
- Commercial National Security Algorithm (CNSA) Suite Factsheet. https://apps.nsa.gov/iaarchive/library/ia-guidance/ia-solutions-for-classified/algorithm-guidance/commercial-national-security-algorithm-suite-factsheet.cfm
- ETSI Assigned ASN.1 Object Identifiers. https://portal.etsi.org/pnns/oidlist
- Computer Security Objects Register. https://csrc.nist.gov/projects/computer-security-objects-register/algorithm-registration
- The Sandbox implementation. https://github.com/pixowl/sandbox-smart-contracts/blob/7022ce38f81363b8b75a64e6457f6923d91960d6/src/Asset/ERC1155ERC721.sol
Copyright
Copyright and related rights waived via CC0.
Citation
Please cite this document as:
Kristijan Sedlak, William Entriken, Witek Radomski, "EIP-2477: Token Metadata Integrity," Ethereum Improvement Proposals, no. 2477, January 2020. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-2477.