ERC-1900: Decentralized Type System for EVM #1882
Kinda related: https://github.com/ewasm/design. I don't think it's desirable to add overhead to L1 and L0. This would work better as a "TypeScript" for Solidity. Remix already includes some static analysis, which could be extended to support an extended-type Solidity. |
What is the overhead that you see? (so, I can properly answer) |
Adding extra lines of code of structs and memory allocation. IMHO Solidity code should be as short as possible, and higher level abstractions, such as types and pseudo-HOF should be used in a different dialect that finally compiles to native Solidity. |
@OFRBG , In order to achieve functional programming, you need higher order functions (HOFs) in Solidity. You can have dType itself without the overhead of adding HOFs, but if multiple projects need the same libraries, it is an overhead to not standardize. |
I really really like the general idea of this. Having a type system decoupled from the contract language would make language interoperability (and hence language experimentation) much easier, and generally just make all the data stored on-chain much easier to compute with. However, I would personally want to detach this from anything C-like ( I previously worked on a project called Typedefs and one of the long-term ideas there is to put it on e.g. IPFS as a kind of "global type system" that any data in any computing system can reference, to tell people/programs how it can be deconstructed and used. The whole project is built on an extremely simple core, it doesn't give you any primitive types except for
Of course, this is horribly inefficient in itself, but when writing a backend for a specific language/platform, you can define specializations to utilize the primitives and/or standard library types that you have available, as long as you can provide encode/decode functions between these and this minimal/universal representation. This makes the type system itself as language agnostic as possible, while still maintaining the full power of any modern type system. Building a backend for Solidity itself should be fairly trivial, the interesting thing would be how to make it "blockchain-aware", so to speak. I don't have time at the moment but will try to get back with a few thoughts on how to go about that in the coming week. |
Great comment by @kjekac, really well explained! I see the comments regarding overhead at the lower levels, but let me try to twist that kind of thinking a bit. Something as fundamental as the types of the inputs and outputs to functions can have a huge impact on the complexity at the higher levels. A well-chosen type theory can constrain behaviour and keep complexity in check. One unfortunate flaw in the design and you feel it all over the place (https://developers.slashdot.org/story/09/03/03/1459209/Null-References-the-Billion-Dollar-Mistake). Let me know if you have any questions. I second that it is a good idea and that typedefs can be applied to this. |
@kjekac , For dType, our focus was:
There are some type theory features that dType does not have, due to Solidity's limitations:
I understand why this would be great. We also wanted the type to be contained in the ABI description. Meaning, for example, that anyone should be able to determine the type (& metadata, implementation libraries etc.) of a return value given that value, ABI definition & Type Registry.
Yes, with
We thought of the same thing - extending the system to any language - our current implementation has an additional And we have
While we have
I would be interested in this, thank you. |
A draft has been submitted as a PR: #1900 |
A few thoughts:
|
Updating ERC-1900 after suggestions from ethereum#1882 (comment): - clearer explanation of what a type is and what it's composed of - defining the purpose of the struct fields when they are first encountered - mentioning type immutability and the possibility of removing the dType `remove` function - adding more information about the `source` field - mentioning human readable names as a primary identifier for types Additionally: - removed the TypeStorage contract description, postponing it for a future EIP, due to multiple storage patterns being researched
@Arachnid, thank you for the feedback! I updated the draft and restructured it, to solve the following points from #1882 (comment):
Replying to your other points:
We would like to make Solidity more functional and that means that the data should be harmonized (well formatted) and kept in the same place.
Are you referring to the dType registry interface and implementation or to the type library?
We expect most of the data to be used on-chain as well.
That is a good idea.
There is a lot of overhead involved, but it also makes things very well defined in case the community votes to include the new types in precompiles. Afterwards, the overhead shrinks considerably. Nevertheless, we are defaulting the
For naming, we should probably adopt the capitalized camel case standard. The id is calculated as a
TypeName varOfType = <instance data>;
A search on the dType registry would be:
// language 0 is Solidity
find({name= “TypeName”, language= 0})
This search will return the address of the library and the id of the type, and set the correct data structure. Potentially a precompile could be run by a new opcode to do the same thing (not covered in this ERC). |
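A minimal sketch of what such a lookup could look like on-chain, assuming a registry keyed by language and name. The contract name `MiniTypeRegistry`, the `TypeEntry` fields and, in particular, the `idOf` derivation are hypothetical and chosen only for illustration; the actual id formula is not given in this comment.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical, simplified registry; not the dType implementation.
contract MiniTypeRegistry {
    struct TypeEntry {
        address contractAddress; // address of the type library
        bytes32 source;          // opaque reference to the source code
        bool exists;
    }

    mapping(bytes32 => TypeEntry) private entries;

    // ASSUMPTION: the id is a hash over (language, name).
    function idOf(uint8 lang, string memory name) public pure returns (bytes32) {
        return keccak256(abi.encode(lang, name));
    }

    function insert(
        uint8 lang,
        string calldata name,
        address contractAddress,
        bytes32 source
    ) external returns (bytes32 id) {
        id = idOf(lang, name);
        require(!entries[id].exists, "type already registered");
        entries[id] = TypeEntry(contractAddress, source, true);
    }

    // Mirrors find({name, language}): returns the type id and library address.
    function find(uint8 lang, string calldata name)
        external
        view
        returns (bytes32 id, address contractAddress)
    {
        id = idOf(lang, name);
        require(entries[id].exists, "unknown type");
        contractAddress = entries[id].contractAddress;
    }
}
```

With this shape, a call such as `find(0, "TypeName")` would return the id and the library address in one read, which is roughly the behaviour described above.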
What does this mean? Can you give an example use-case?
No, I'm talking about the distinction between a type definition and an implementation. For example, the ERC20 standard defines some types, which are implemented by a large number of different contracts. Capturing the ability for a type to have many implementations seems like a pretty basic feature to support.
Can you give an example use-case for using type data onchain?
But why support mutable types at all? Once it changes, it stops being the same type; the type's fields etc are fundamental to what it is. Any change also likely breaks compatibility with anything using it.
If you don't define how these are used in the ERC where they're declared, it seems likely that you'll never be able to use them coherently, because there will be no universal expectation over the content stored in them.
It's not clear to me why you need human names as primary identifiers at all. Why not use the typehash, just like Solidity does for function signatures? This addresses both mutability and name collisions. If you must have a human readable identifier, you could use ENS, for instance, to point a human readable name to a typehash. |
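As a rough illustration of the typehash idea in the comment above, the sketch below derives an identifier from a canonical signature string, so the id changes whenever the fields change and registration order does not matter. The signature format (similar in spirit to EIP-712 type strings), the contract name `TypeHashRegistry` and its functions are assumptions made for this example; nothing here comes from the ERC-1900 draft.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of content-addressed type identifiers.
contract TypeHashRegistry {
    // typehash => canonical signature; an entry is written once and never changed
    mapping(bytes32 => string) private signatures;

    // ASSUMPTION: a type is described by a canonical string such as
    // "Geopoint(Longitude longitude,Latitude latitude,bytes32 identifier)".
    function typeHash(string memory signature) public pure returns (bytes32) {
        return keccak256(bytes(signature));
    }

    // Registering the same signature twice is a no-op, so there is nothing
    // to update or remove and no permission model is needed.
    function register(string calldata signature) external returns (bytes32 id) {
        require(bytes(signature).length > 0, "empty signature");
        id = typeHash(signature);
        if (bytes(signatures[id]).length == 0) {
            signatures[id] = signature;
        }
    }

    // Returns the empty string for an unknown id.
    function signatureOf(bytes32 id) external view returns (string memory) {
        return signatures[id];
    }
}
```

A human-readable name, if needed, could then point at the returned id from a separate naming layer such as ENS, as suggested above.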
Devs have the freedom to implement type helper functions, as long as the required ones are implemented (to be discussed). As for the definition, I am open to other proposals that are not necessarily based on
I have some in-work examples in the dType repo with on-chain permissions control based on the dType registry identifiers. E.g. fine-grained function permissions, that can also be used in the storage contracts. And in-work patterns for functional programming, which use function dType identifiers to mimic more complex HOFs.
I removed the
I started a draft at https://github.com/loredanacirstea/EIPs/blob/d6fbbff5f1a1ecfa1eee6f8efa4ca3d896303e38/EIPS/eip-dtype_language.md with details. I will make a PR soon. We wanted to separate the ERCs because some devs may agree with dType core but not with the language extension. The
Our initial version was using
|
I still think you're not understanding the difference between an interface and an implementation. An interface describes an API for other code to interact with, but not how it's implemented. What you're proposing here seems to be more along the lines of a directory of library code.
If a type's ID is based on a serialization of its fields, then types are immutable and this doesn't matter; types can be inserted into the registry in any order.
Can you qualify 'more secure'? What's your threat model?
Removing a type seems like it would cause chaos if it's in use somewhere, and still requires you to maintain a permission model.
EIP 1900 currently says:
I see several problems with this:
I'm also not sure what the motivation here is; why would anyone need to fetch the source in this context? It seems to me that this spec is a long way away from being a simple distributed type registry. It has a lot of unnecessary complexity, and doesn't make a clear distinction between interfaces and implementations. I wish you luck, but I don't plan to offer further technical feedback. |
I asked you before: "Are you referring to the dType registry interface and implementation or to the type library?", to which you answered with "No, I'm talking about the distinction between a type definition and an implementation". I suggest you pay attention to how clearly you phrase your questions before being unsatisfied with the answer.
So aside from your destructive (as opposed to constructive) and your inexact criticism, what can I do? Is ERC not the correct category? EIP-1 definition is "ERC - application-level standards and conventions", so it seems right.
On-chain calculation is the standard. Any off-chain calculation may have a faulty implementation. You need a standard to compare off-chain implementations - why is this unclear? The threat model is the same as for blockchain vs. off-chain data & behavior.
Correct. I am ok with removing the
It's a
But tell me a way for the EVM to check something that is off-chain. You know very well it cannot and this is not constructive criticism. We can, however, replace the
Non-centralized source code verification (see EthPM).
To summarize, I understand the "unnecessary complexity" as making the type ABI computable on-chain, as opposed to blackboxing it into a packed encoding that the EVM cannot decode. In this case, the complexity is beneficial and worth the overhead (if used by many, the overhead decreases).
This is fine, I thank you for the effort and time. However, you do not present a way forward.
You did not say that this ERC does not make technical sense. Your opposition is currently "unnecessary complexity", which you did not define properly. I am (and have been) open to improving anything that is not technically sound. Therefore, I do not see a reason to deny merging of this ERC Draft to If you do not want to approve and merge due to technical reasons, please clearly list what they are and what editor to ping when I solve them. However, if you are not in your editorial capacity (and you need to specify that, as you are on the editor's list), we welcome debate and new ideas from a technical person, like yourself and others. This ERC is important for Ethereum. Saying that you refuse to give further feedback without explaining why, goes against the ethos of the community: collaboration, effort decentralization and evolution of computing. |
@loredanacirstea I was offering technical feedback as an individual contributor, not as an editor. As you've decided to ignore nearly all of my feedback, wasting time on giving more of it seems pointless. I never suggested I was acting as an editor, or gating merging the draft based on my technical critique. |
@Arachnid , you are on the EIP editor list, you are by default in an editor capacity, you should specify when you are not, when commenting. To recap:
I think my effort was more than enough to demonstrate that I did not ignore your feedback, but appreciated it. You say that you wasted your time because I did not necessarily agree with all of your suggestions, while I did provide arguments. These are facts. |
As I've said before in other PRs, editors are not primarily technical reviewers of EIPs, because that's not scalable. When I'm making requests for changes in a PR in order to merge it, I'm acting as an editor. When I'm discussing technical proposals in general, I'm clearly not - because that's not part of an editor's job.
I only saw this issue because you drew my attention to it via Twitter. I didn't notice the associated PR, or I would have merged it as draft as soon as it met the typographical requirements. I'll make one more attempt at clarifying a couple of misunderstandings:
This is a type:
This is an implementation:
It's possible for one type to have many implementations - for example, ERC20. A type library would specify common interfaces that implementations can conform to, and allow consumers to know what implementations support those interfaces. This EIP claims to be a 'decentralised type system', but it lacks any distinction between a type and implementations of that type. Based on our back-and-forth, it seems like what you're really trying to build is a repository of library code. Which is fine - but then you should rename the EIP and reword accordingly.
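To illustrate the distinction drawn here, a minimal sketch follows: one interface (the "type") and two contracts that implement it differently. All names are hypothetical and are not taken from the EIP or from this comment.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical type: an interface that says what can be called, not how it works.
interface IDistance {
    function distance(int32 x1, int32 y1, int32 x2, int32 y2)
        external
        pure
        returns (uint256);
}

// First implementation: Manhattan distance.
contract ManhattanDistance is IDistance {
    function distance(int32 x1, int32 y1, int32 x2, int32 y2)
        external
        pure
        override
        returns (uint256)
    {
        return absDiff(x1, x2) + absDiff(y1, y2);
    }

    // Widen to int256 first so the subtraction cannot overflow int32.
    function absDiff(int32 a, int32 b) private pure returns (uint256) {
        int256 d = int256(a) - int256(b);
        return uint256(d < 0 ? -d : d);
    }
}

// Second implementation of the same type: squared Euclidean distance.
contract SquaredEuclideanDistance is IDistance {
    function distance(int32 x1, int32 y1, int32 x2, int32 y2)
        external
        pure
        override
        returns (uint256)
    {
        int256 dx = int256(x1) - int256(x2);
        int256 dy = int256(y1) - int256(y2);
        return uint256(dx * dx + dy * dy);
    }
}
```

A consumer written against `IDistance` works with either contract, which is the property a registry would need to capture if it recorded which implementations conform to which type.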
I don't see how this is a security issue. If you write a standard for how to generate a hash from a type specification, you can provide standard test vectors, and anyone can implement that. Changing how you store data onchain so that you can do that inside Solidity buys you some convenience, but at the cost of a lot of additional storage overhead and gas costs.
Assuming the goal is for a consumer to be able to fetch the content associated with this field, the data you have here is insufficient for that, unless you specify that it must, for example, refer to a Swarm manifest content hash. If you leave this unspecified, people will use it inconsistently, and nobody will be able to write code to fetch it with confidence that it will actually work. If you do not want to specify it here, you should leave it out - anything else will result in useless overhead, because it is underspecified. |
I think Ethereum is in need of more coupling between the on-chain and the off-chain world, especially, but not limited to, the connection between the high- or source-level concepts of a smart contract and the on-chain bytecode part. Being able to directly refer to the type of a contract (or a struct being defined in that contract) residing at a certain address would be really nice. Due to the metadata hash, this should in principle be possible, but we still do not have an easy to use storage solution. I have the feeling that this proposal could be slimmed down a little, or at least could be specified in different "feature stages" and I'm not sure whether such data should be stored in storage, but it is certainly going in the right direction. Also, I see the point that a decentralized type system is only half the fun without support from compilers. The problem is that we drew the boundary for the Solidity compiler at bytecode generation, i.e. excluding deployment and excluding any kind of networking access. The benefit here is that it makes the compiler more stable and the builds reproducible, but it makes features like on-chain type discovery or type-auto-registration impossible as a central language feature. Maybe we could discuss how type registries could be used in Solidity through "compiler drivers" by supplying the necessary data to the still offline and deterministic compiler. |
@Arachnid, in response to your comment:
Note: these proposals are not optimized in terms of storage or gas costs.
1) EIP-1900: Define/implement the types in a library:
library GeopointLib {
    struct Longitude {
        int32 longitude;
    }
    struct Latitude {
        int32 latitude;
    }
    struct Geopoint {
        Longitude longitude;
        Latitude latitude;
        bytes32 identifier;
        // other fields
    }
    // type rules for each type, if needed
    // other helper functions e.g. insert(self, Geopoint memory geopoint) ...
}
Register the type with dType by sending this info to the type registry:
{
    "name": "Longitude",
    "types": [
        {"name": "int32", "label": "longitude", "relation": 0, "dimensions":[]}
    ],
    "lang": 0,
    "typeChoice": 0,
    "contractAddress": "0x0000000000000000000000000000000000000001",
    "source": "0x0000000000000000000000000000000000000000000000000000000000000001"
}
{
    "name": "Latitude",
    "types": [
        {"name": "int32", "label": "latitude", "relation": 0, "dimensions":[]}
    ],
    "lang": 0,
    "typeChoice": 0,
    "contractAddress": "0x0000000000000000000000000000000000000001",
    "source": "0x0000000000000000000000000000000000000000000000000000000000000001"
}
{
    "name": "Geopoint",
    "types": [
        {"name": "Longitude", "label": "longitude", "relation": 0, "dimensions":[]},
        {"name": "Latitude", "label": "latitude", "relation": 0, "dimensions":[]},
        {"name": "bytes32", "label": "identifier", "relation": 0, "dimensions":[]}
    ],
    "lang": 0,
    "typeChoice": 0,
    "contractAddress": "0x0000000000000000000000000000000000000001",
    "source": "0x0000000000000000000000000000000000000000000000000000000000000001"
}
Any dev can now use the type by importing the Geopoint type library:
2) EIP-1921 - Functions Extension
Implement functions that can handle the type, e.g.:
library GeopointUtils {
    function calculateDistance(
        Geopoint memory geopoint1,
        Geopoint memory geopoint2
    )
        pure
        public
        returns(Distance memory distance)
    {
        // ...
    }
}
, where
The
{
    "name": "calculateDistance",
    "types": [
        {"name": "Geopoint", "label": "geopoint1", "relation": 0, "dimensions":[]},
        {"name": "Geopoint", "label": "geopoint2", "relation": 0, "dimensions":[]}
    ],
    "outputs": [
        {"name": "Distance", "label": "distance", "relation": 0, "dimensions":[]}
    ],
    "lang": 0,
    "typeChoice": 0,
    "contractAddress": "0x0000000000000000000000000000000000000002",
    "source": "0x0000000000000000000000000000000000000000000000000000000000000002"
}
Devs can use it:
import 'GeopointUtils.sol';
contract DevContract {
    using GeopointLib for GeopointLib.Geopoint;
    using GeoMathLib for GeopointLib.Distance;
    using GeopointUtils for GeopointUtils.Geopoint;
    // in a function
    geo1.calculateDistance(geo2);
}
3) EIP-2157 - Storage Extension
Optionally, devs can define a storage contract for
contract GeopointStorage is StorageBase {
    using GeopointLib for GeopointLib.Geopoint;
    mapping(bytes32 => Type) public typeStruct;
    struct Type {
        GeopointLib.Geopoint data;
        uint256 index;
    }
    function insert(GeopointLib.Geopoint memory data) public returns (bytes32 hash) {}
    function update(bytes32 hashi, GeopointLib.Geopoint memory data) public returns(bytes32 hash) {}
    function remove(bytes32 hash) public returns(uint256 index) {}
    function isStored(bytes32 hash) public view returns(bool isIndeed) {}
    function getByHash(bytes32 hash) public view returns(GeopointLib.Geopoint memory data) {}
}
The ABI for this contract is deterministic - we know the
Any dev can reuse this contract for storing data.
4) EIP-2193 - Alias
With EIP-1900 & EIP-2157, you can have human readable identifiers for each data item. E.g.
Regarding the comment:
We need a way to retrieve and cache the types for use in Solidity editors. Using ENS for resolution & reverse resolution for each type adds complexity. This makes ENS obsolete, at least for our purposes and for the purpose of having addressability on fine-grained data items.
Conclusion:
A type library is produced by developers and consumed by developers. The only thing that is consumed by a non-dev is storage data. Storage data can be consumed by other projects and even end-users (see https://youtu.be/zcq2di8QIUE?t=143). |
To expand what the dType registry could mean for a developer:
The ABI of the type itself & the functions can be recomposed entirely from the chain, from the dType registry. The transpiler could compose the For
And this Even without importing other function libraries. E.g. Devs would only see nice code & human-readable names. As to why devs should walk the extra mile and use dType:
|
@loredanacirstea Your examples involve writing types and pushing them to the registry, but not actually querying the registry. What purpose does the registry actually serve here? Can you give an example of how a developer would consume data from the registry?
What are the |
@Arachnid, these short videos show how smart contract developers could consume the dType registry data by writing dTyped Solidity with the transpiler mentioned above: https://youtu.be/pBsual6FogE (type definitions), https://youtu.be/dpIVOYlAWrY (type definitions + function types). |
There has been no activity on this issue for two months. It will be closed in a week if no further activity occurs. If you would like to move this EIP forward, please respond to any outstanding feedback or add a comment indicating that you have addressed all required feedback and are ready for a review. |
This issue was closed due to inactivity. If you are still pursuing it, feel free to reopen it and respond to any feedback or request a review in a comment. |
The current draft can be found at https://eips.ethereum.org/EIPS/eip-1900
In-work implementation: https://github.com/pipeos-one/dType, along with a list of all related EIPs, articles and demo videos.