-
Notifications
You must be signed in to change notification settings - Fork 116
Description
This issue describes in some detail how the initial implementation of smart-contract content directory (described in issues like #1520) could potentially look like and how it would affect our TypeScript codebase (query node, CLI, network tests). The draft PR containing the initial contracts code can be found here: #1752 (though this plan introduces a few changes, like the format of metadata which I actually reconsiderend while working on this issue).
Since it's not entirely clear to me yet what is feasible to implemenent on the runtime/node side, I tried to explore a few different possibilities, based on my experience with some reference implementations like this and some documentation that I was able to find (more links are provided under Runtime section below).
Runtime
Some related resources:
- https://substrate.dev/docs/en/knowledgebase/smart-contracts/evm-pallet
- https://github.com/paritytech/frontier
- https://github.com/PureStake/moonbeam/tree/moonbeam-tutorials - this is the chain I was using locally so far as reference implementation.
- https://www.youtube.com/watch?v=b9R_-JPtKlM - this video that demonstrates how some basic interactions with the evm module look like (a link to a repository containing the example CLI code can be found in the description).
- https://www.youtube.com/watch?v=lXw6GTNh73Y - a more in-depth dive into evm pallet (though it's probably quite outdated now)
We would need:
-
EVM pallet / module enabled, exposing extrinsics like
evm.call,evm.deposit,evm.withdrawandevm.create.For the purpose of extrinsics like
evm.callwe want given sender substrate address to be deterministically converted to evm-compatible address that would be accessible throughmsg.senderinside the Solidity contracts.In the implementation that I used for the initial tests this was the standard behavior and the conversion function looked like this (this implementation can also be inspected here):
impl<H: Hasher, A: AsRef<[u8]>> ConvertAccountId<A> for HashTruncateConvertAccountId<H> { fn convert_account_id(account_id: &A) -> H160 { let account_id = H::hash(account_id.as_ref()); let account_id_len = account_id.as_ref().len(); let mut value = [0u8; 20]; let value_len = value.len(); if value_len > account_id_len { value[(value_len - account_id_len)..].copy_from_slice(account_id.as_ref()); } else { value.copy_from_slice(&account_id.as_ref()[(account_id_len - value_len)..]); } H160::from(value) } }On the frontend the same conversion could be done with:
crypto.blake2AsHex(crypto.decodeAddress(substrateKeyPair.address), 256).substring(26) -
Web3 rpc support (optionally) - this seems to be
rpc-ethereumcomponent of https://github.com/paritytech/frontier.It isn't strictly required and I'm not sure how feasible it is for our node to support it, but on the frontend-side of things this would be very helpful, because otherwise accessing contract storage variables is going to be very hard to implement (I think it would require using a very low-level
evm.accountStoragesquery which returns a raw storage slot data as hex).Having access to this rpc would also:
- make it easier to depoly/initialize contracts with tools like truffle without having to build custom deployment logic in order to make it compatible with substrate extrinsics (like
evm.create) - allow the usage of standard ethereum keypairs for
evmcalls, which may add some interesting possibilities for smart contracts - we would have full support of
Web3.jsand other existing js libraries, which can generally make some interactions with smart contracts way simpler (ie. subscribing to events) - we could run Truffle tests compatible with Ganache node against our substrate node (though we would probably want to use the substrate extrinsics for final tests, since they'd rely on memberships etc.)
It's not entirely clear to me how the eth transactions are handled vs substrate transactions/extrinsics in this setup by the runtime, but it seems pretty obvious that the sender of eth transaction will still need to pay a fee (which would be in "ethereum").
Assuming the users have separate balances in
ETHandJOY(which seems to be the default case - they can then convert between them usingevm.depositandevm.withdrawextrinsics), that would mean that if we want to allow them to use Web3 RPC (ie. throughWeb3.js) to deploy/call contracts using ethereum keypairs, we probably need to also allowevm.depositto anyH160(ethereum) account.By default the evm pallet (or at least the implementation I was using) seems to only allow deposits from substrate account to an eth account that is associated with the substrate addrres (a result of conversion), so there seems to be no way for the user to transfer value to an eth address that the user actually has the private key to.
- make it easier to depoly/initialize contracts with tools like truffle without having to build custom deployment logic in order to make it compatible with substrate extrinsics (like
-
Runtime storage for 2 contract addresses that can be changed through
sudo(MembershipBridgeandContentWorkingGroupBridge) - the runtime should be able to call into those contracts when executing some of the extrinsics in membership and working-group moudle (as expalined in the next point). Ultimately we will probably want them to be set via proposal(s), but initially they can be set throughsudo, ie. we can have extrinsics likesetMembershipBridgeContractAddressandsetContentWorkingGroupBridgeContractAddress -
"Hooks" in the
membershipandworking-groupsmoduleWe'd want to runtime to execute evm calls on some specific events. For this it should use a hardcoded
msg.senderaddress, referred to as the runtime address (ie.0x2222222222222222222222222222222222222222).We'd need following hooks:
- on
memberships::set_controller_account- callMembershipBridge.setMemberAddress(memberId, address)(the address must always be converted to ethereum address, the same way it is converted for the purpose ofevm.callextrinsic) - on
memberships::buy_membership- callMembershipBridge.setMemberAddress(memberId, address)(same as above) - on
memberships::add_screened_member- same as above - on
ContentDirectoryWorkingGroup::fulfill_successful_applications(orfill_opening) - callContentWorkingGroupBridge.setCuratorAddress(curatorId, address)and potentiallyContentWorkingGroupBridge.setLeadAddress(address)(if the lead was hired) - on
ContentDirectoryWorkingGroup::deactivate_worker- callContentWorkingGroupBridge.setCuratorAddress(curatorId, address)and potentiallyContentWorkingGroupBridge.setLeadAddress(address)(in this case we use an empty address:0x0000000000000000000000000000000000000000) - on
ContentDirectoryWorkingGroup::update_role_account- callContentWorkingGroupBridge.setCuratorAddress(curatorId, address)and potentiallyContentWorkingGroupBridge.setLeadAddress(address)
The runtime can either have the signature of those methods hardcoded (ie.
SET_MEMBER_ADDRESS_SIGNATURE = keccak('setMemberAddress(uint64,address)')) or those can also be set via sudo (the same way contract addresses will be). - on
Deploying the contracts
We can do this either using evm.create extrinsic or the web3 rpc (if it's supported), as described in the first section.
Bridge contracts
When deploying the bridge contracts (MembershipBridge, ContentWorkingGroupBridge), assuming we want to preserve existing memberships and contentWorkingGroup state, we need to take into account the initialization process (which was also described here: #1677):
- Make runtime aware of bride contract addresses (set them via
sudo) - Export members / content-working-group state at the same block
1.was executed - Initialize the data in the bridge contracts (using standard initalization process with
batchInsert/batchSet)
The key here is that the bridge contracts need to handle beeing initialized and potentially updated by the runtime at the same, because initialization may take multiple blocks, during which extrinsics like membership.buy_membership may also be executed.
The solution to this which I explored in #1677 seems to be ignoring any member/curator/lead address changes that are part of batchSet (or some other method used by the initialization script) if they reffer to member/curator/lead whose address was already set by the runtime (ie. via setCuratorAddress contract method)
Storage contracts and logic contract
Those don't seem to require any special initialization logic, sice the content working group lead should be able to initialize content directory to any state using just the standard contract methods (ie. addCuratorGroup, createChannel, addVideo, updateChannelOwnership etc.).
It still may be worth to include a way to inititialize a storage contract (ie. VideoStorage) to given state and make it possible to migrate just a single storage contract (the way it was described in the initial draft: #1520 (comment)), to simplify introducing single storage contract layout changes.
Query node (and general event/metadata handling)
Here my proposition was to rely on the evm module Log event.
The data that this event returns is a raw hex that can be very easily decoded with existing tools like https://github.com/ConsenSys/abi-decoder or perhaps https://www.trufflesuite.com/docs/truffle/codec/modules/_truffle_codec.html
Using the contract abi (we can get it from json file created by truffle compile) we can convert the raw topics and data hex (part of the Log event) into a friedly event representation like:
{
name: 'ChannelCreated',
args: {
_id: BN
_ownership: { ownershipType: BN; ownerId: BN }
_metadata: string
}
}
There is a library called typechain which is capable of, among many other things, generating TypeScript interfaces for events like this based on the actual solidity contracts (the code above is pretty much just copied from one of the d.ts files it generated)
The query node could handle the Log event similarly to how it now handles events like TransactionCompleted, ie. execute the approperiate handlers like createChannel, updateChannelProperties etc. depending on the decoded event type. This should actually be much easier than the current approach, as there wouldn't be a need to deal with very abstract concepts like CreateEntity/AddSchemaSupportToEntity operations, ParametrizedInputPropertyValue types etc. The _metadata part of the event (that holds all values previously provided in form of InputProperyValues) will be just a json-encoded object that can be easily parsed and validated by the query node (as further described below)
Video/Channel metadata
Consider current Video schema (part of https://github.com/Joystream/joystream/issues/824#issuecomment-653150085):
type Video {
id: ID!
entityID: BigInteger!
channel: Channel!
category: Category!
title: String!
description: String!
duration: Int!
skippableIntroDuration: Int
thumbnailURL: String!
Language: Language
media: VideoMedia!
hasMarketing: Boolean
publishedOnJoystreamAtblockHeight: BigInteger!
publishedOnJoystreamAtTimeStamp: BigInteger!
publishedBeforeJoystream: DateTime
isPublic: Bool!
isCurated: Boolean!
isExplicit: Boolean!
license: License!
}
It consists of properties of different types like String, Int, Boolean, some one-to-one relationships with other entities like VideoMedia and many-to-one relationships with entities like Channel, Category, Language etc. Some of those (like VideoMedia) also have their own "nested" entities (ie. MediaLocation).
Currently all of this is reflected by the runtime schemas (as closely as possible), but once we move to smart contracts the entire Video metadata can represented just as a json string (which the contracts actually don't care much about, they'll just emit it in an event like VideoCreated or VideoMetadataUpdated).
We can use the json schema standard (https://json-schema.org/) to describe the expected metadata json object.
We already have json schemas kind-of like that auto-generated for entities by @joysteam/cd-schemas library (ie.: /content-directory-schemas/schemas/entities), they'd of course need to be slightly adjusted and their role would be different. Instead of beeing auto-generated from runtime schemas, they will be the one and only source of truth about how we expect the Channel / Video metadata to look like.
With json schemas it would be no problem at all to hadle nested objects, enum types, "either-or" values (oneOf) etc. since json schemas have support for all those cases.
There is also a great library called json-schema-to-typescript that allow us to generate fully-compatible TypeSciprt interfaces from those json schemas (already used by @joystram/cd-schemas) and a json-schema validation library called Ajv, that we can use to validate json object against a json schema.
With all that, we can have a very customisable, fully typesafe Channel/Video metadata, that should be easy to handle on the mappings side (as described below).
Handling metadata on channel/video creation
I imagine the flow to be like this:
- Query node recieves metadata in form of json string as part of, for example,
ChannelCreatedevent - Query node converts the json string to json object (if this fails - go to
invalid metadatacase) - Query node validates the json object against
ChannelMetadata.schema.jsonusingAjv - If the schema is valid - we can TypeScript-assert that the metadata object is of given interface type, ie.
const metadata = metadataJsonObj as ChannelMetadata. Since theChannelMetadatainterface is auto-generated from the json schema this approach should be 100% typesafe and further handling should be very easy. - If the metadata is invalid (either not a valid json at all or the validation against json schema fails) - the query node could potentially store some error log, so that the user who tried to add the content can query it and see why it was rejected. Smart contracts do not differentiate between a valid-metadata and invalid-metadata
Video/Channel, so it's still possible that smart contract methods likeupdateChannelMetadata,addVideoorupdateChannelOwnershipwill be executed in context of an invalid-metadta channel. This means query node should probably store some representation of invalid-metadata channel regardless (it will then just have anidandowner) and allow it to become "valid" onceupdateChannelMetadatais executed (in context of this channel) containing a full (not partial!), valid metadata object.
Handling channel/video metadata updates
The updates are a bit more tricky to handle, since in this case the metadata json object will not be "complete".
This means we would need to "merge" the update with existing metadata before doing validation.
It's not always obvious how to perform this merge, let's take a look at an example:
Imagine a Video update object like this:
{
media: {
location: {
httpMediaLocation: {
url: "http://example.com/video.mp4"
}
}
}
}
And let's assume the current Video metadata object looks like this
{
title: "Some title",
description: "Some video description"
media: {
width: 1080,
height: 768,
...
location: {
joystreamMediaLocation: {
contentId: "5C61WpAUiceHS8yTWxkncHYhxbAuoqspesiZJooQWSompu98"
}
}
}
}
The json schema can expect the location to have either joystreamMediaLocation or httpMediaLocation property (oneOf), but if we merge those objects recursively we would end up having both of them specified at once.
On the other hand if we just do a "shallow" merge (assign a new value for media object), we'd lose other properties of media like width and height.
There are of course ways to solve this, worst case scenario we can probably just do away with shallow merge, forcing the updater to specify the entire media object even if only wanting to update a single value of a deeply nested property.
We could also enforce that, in a scenario like this, the update should explicitly set httpMediaLocation to null and joystreamMediaLocation to a valid object. Then on the json-schema level we would require that either httpMediaLocation or joystreamMediaLocation is always set to null (this should also be possible to do with oneOf).
Another important thing about the updates is that metadata updates have no effects on smart contracts themselves, which means the query node can safely ignore any invalid update (and at best - just store an error log).
Contract address
Since migrations and upgradability is not the focus of the initial implementation, the query node (processor) can probably just import a static ContentDirectory contract abi and use a contract address provided in ENV (there is also one other possible approach further described in the CLI section below).
Ultimately (in the future) depending on the upgarade strategy we choose it may either need to be able to switch this address to a different one on some event (ie. Upgraded/Migrated), but it may also be the case that the contract address remains unchanged even when the contract is upgraded (the proxy-pattern).
@joystream/cd-schemas library
In the scenario described above (in the Query node section), the role of this library would be reduced to just storing ChannelMetadata and VideoMetadata json schemas and the TypeScript interfaces auto-generated from them.
That would mean removing a lot of functionality related to conversion between one schema representation and the other (ie. runtime schemas to json schemas), scripts related to content directory initialization, InputParser that handles parsing json objects to Transaction operations etc.
I think in that case it may make sense to just replace it with one consumable library called @joystream/content-directory that stores everything related to the new, smart contract content directory implementation, namely:
- the actual smart contracts
- smart contract deployment and migration scripts
- smart contract build artifacts (
json) generated by truffle (they containabis, addresses of the deployed contracts and everything else the other projects may need) - types generated via typechain (based on smart contracts code)
- Channel/video metadata json schemas
- Channel/video metadata TypeScript interfaces (generated from json schemas)
- perhaps some utilities to simplify contract method calls from the frontend
CLI
The CLI is currently completely independent from the query node and initially I thought that it would be best to keep it this way, but due to the way channel/video metadata is handled by smart contracts, such integration may actually make sense (this is further explored at the end of this section).
There is a set of commands like createChannel, uploadVideo, curateContent, createCuratorGroup, addCuratorToGroup etc. already implemented for Babylon release and it will probably not be very hard to update them to work with smart contracts instead of the contentWorkingGroup runtime module.
Some content-directory specific commands like classes, removeEntity, createClass, addClassSchema can be removed completely.
The integration with metadata json schemas should be quite straight-forward, since the CLI already uses similar approach.
There is a class that allows us to prompt for data based on json schema, called JsonSchemaPrompter.
It can be used to provide a convinient (as far as CLI's are considered) UI for inputing the metadata and validating it in realtime using Ajv. That means there also shouldn't be a need to change the code inside the CLI itself when the underlying metadata json schema changes.
Extrinsics / calls
Calling a contract method "through the runtime" is a rather simple process, here is a sample script that does this:
import ContractJson from '../build/contracts/ContentDir.json'
const CONTRACT_ADDRESS = ContractJson.networks[43].address
const web3 = new Web3(); // We can use a web3 without a provider in this case
const provider = new WsProvider('ws://127.0.0.1:9944')
const api = await ApiPromise.create({ provider })
const contract = new web3.eth.Contract(ContractJson.abi);
const call = contract.methods.removeVideo(1).encodeABI()
const substrateTx = api.tx.evm.call(
CONTRACT_ADDRESS,
callData,
0, // msg.value
4294967295, // gas limit
1, // gas price
null // nonce (optional)
)
We can leverage TypeChain to decorate contract.methods, making them fully typesafe.
The only issue here is that executing such call has a gas cost, which the user has to pay from his "Ethereum balance" (since each account has two separate balances as described in the EVM pallet documentation: https://substrate.dev/docs/en/knowledgebase/smart-contracts/evm-pallet), so we would either need to provide the user with commads to deposit and withdraw to/from his ETH balance and also possibly check this balance at the beginning of the execution logic of commands like createChannel (at least to warn if it's empty?) OR perform this conversion "automatically" (by either estimating the gas cost or temporarly using some some predefined value to convert before call - for example: 1 JOY)
Reading data directly from the contract (ie. public variables)
As described in the first section, I think for this we would actually need rpc-ethereum on the runtime side
and Web3.js on the frontend, otherwise we would have to read and decode raw hex data from contract storage slots,
which is quite complex, very error-prone (highly dependent on the solidity code itself) and hard to implement, ie.:
const slot = "0";
const mapStorageSlot = slot.padStart(64, '0');
const mapKey = bobEvmAccount.toString().substring(2).padStart(64, '0');
const storageKey = web3Utils.sha3('0x'.concat(mapKey.concat(mapStorageSlot)));
const accountStorage = (await api.query.evm.accountStorages(contractAddress, storageKey)).toString();
/* ... */
An example of recieving a value from a mapping in an example ERC20 contract (https://github.com/paritytech/frontier/blob/master/template/examples/contract-erc20/create-erc20.ts)
With web3 we can achive the same with just a single line of code like:
const data = await contract.methods.accountTokenBalance(account).call();
Displaying video/channel metadata within CLI
For this relying on query node actually seems unaviodable, since smart contracts would not actually store this data (instead it is just emitted in an event) so the CLI would need to do the same work as query node does in order to establish current values (ie. find related CreateChannel event, then all ChannelMetadataUpdated events and apply those subsequent updates).
This means the CLI can either:
- only display information it can fetch from smart contracts. For
Channelthat would be:id,owner,status(active/censored), for videos:id,channelIdandstatus. - use query node to query
ChannelandVideometadata (this may have some value, although it's already very easy to just query the query node directly for this via the exposed web UI)
Handling changing contract address(es) and abi(s)
When Truffle contract migrations (deployemnts) are run, the framework creates/updates "build artifacts", which are json files containing a lot of useful information about the contracts, ie.: the current address of each of the contracts depending on network (ie. development, test, live) and contract abi that both CLI and query-node need in order to be able to decode the events, encode calls etc.
We probably don't want to commit build artifacts themselves in our repository, but if we assume we always use truffle for deployemnts and the CLI imports current build artifacs like this:
import ContractJson from '@joystream/content-directory/build/contracts/ContentDirectory.json';
const CONTENT_DIRECTORY_CONTRACT_ADDRESS = ContractJson.networks[networkID].address
We wouldn't need to set anything manually after (re)deploying the contracts to a development node locally / during tests.
If the published version of @joystream/content-directory then contains build artifacs with addresses of deployed production contracts (which means we would of course have to deploy contracts first and then publish the library), the users of the CLI also wouldn't need to manually set them after they install the CLI.
In that case the CLI can of course still allow the users to customize those addresses via commands like content-directory:setLogicContractAddress, content-directory:setChannelStorageAddress etc.
Integration tests
During the integration tests we would need to ensure a few additional steps are executed before any interaction with smart contracts will be possible:
- contracts are deployed (ie. using
truffle migrateor a custom deployment script if no web3 support is available) - bridge contract addresses are set in the runtime via sudo (this should be executed before any membership module extrinsics, unless we want to always do the export&initialization part described in
Deploying the contractssection) - addresses that send
evm.callextrinsics need to have enough evm-balance to pay gas fees (this balance can either be initialized withevm.depositor via genesis config)
Besides that I suspect the effects on current integration tests would be similar to those on the CLI, mostly we would need to replace all api.tx.contentDirectory.transaction(...) extrinsics with corresponding api.tx.evm.call(...) extrinsics.
The tests are already importing Channel/Video entity typescript interfaces from @joystream/cd-schemas and using them for Channel / Video creation and updates (the library exposes a converter that parses such inputs into Transaction extrinsc input). This means that the new format of input objects used to create / update Channels and Videos (the metadata object) will not be very different from the one that is currently used (for smart contract methods we would also provide objects like:{ title: 'Video', description: 'An example video', media: { width: 800, height: 600 } }).
Current contracts addresses can be established within network tests similarly to how it was described in Handling changing contract address(es) and abi(s) section of CLI, assuming the CI would also use the standard truffle deploy flow, which would update the json build artifacs inside @joystream/content-directory in the monorepo.
In the early phase (before all runtime changes are included) there will be a separate set of unit-test tests designed specifically for smart contracts (for example, see: #1681) that can be run against a Ganache node and don't require any interactions with substrate runtime. It may be possible to merge some of those tests into our integration tests (that would depend on whether we would support web3 rpc), but keeping them separate also seems quite valueable.