Meridian Design Doc 5: IE Smart Contract
TODO
- : How to handle disputes?
- → manual resolution
- future: let people pay for claims
Introduction
Meridian stands for Measure, Evaluate, Reward, the three steps of the impact evaluator framework. The Meridian project aims to create an impact evaluator for “off-chain” networks, i.e. networks of nodes that do not maintain a shared ledger or blockchain of transactions.
This doc proposes a framework and model for Meridian that will initially cater for both the Saturn payouts system and the SPARK Station module. As a bonus, it should be able to cater for any Station module. We believe that trying to generalise beyond these few use cases at this point may be counterproductive.
We will structure this design doc based on the three steps of an Impact Evaluator (IE), measure, evaluate and reward.
Challenges for this Meridian design document
There needs to be a smart contract that acts as the central entity of the impact evaluator. It runs the epoch loop and orchestrates the work, which is then performed either by other smart contracts or by centralized services. Its mode of operation should be modelled closely on the block reward mechanism.
The orchestrator smart contract should:
- receive funds
- control epochs
- trigger collecting measurements
- trigger the evaluation
- trigger the payout of rewards
Tasking and measurement
Depending on whether tasking is seen as part of measure, it could stay in the centralized orchestrator (for now) and at a later point be moved to the IE smart contract. This document charts both paths:
- IE smart contract without tasking
- IE smart contract with tasking
Quoting:
imo if tasking is independent of rewards, then tasking should not be part of the IE
However, for SPARK it seems like there should be a mechanism for station operators to verify that the tasking was indeed done equally amongst all nodes.
Overview
TODO:
- I've removed membership from this contract, as it can be deduced from the measures submitted. Is that ok?
- Pros
- Simpler
- Less gas
- Tasking doesn't need memberships here
- Cons
- No explicit membership management
- Both the measure service and random peers need to be able to submit measures. What is a way of addressing measures that works for both?
- Idea: Submit CID with optional peer address (or set thereof). Then use frisbii+lassie for data transfer. It's the peer's responsibility to ensure data is retrievable when measurements are being aggregated. If measures aren't included, we can't prove it's the peer's fault 🤔
- Another idea: Only let the measurement service submit commitments; if peers want to submit directly, they need to submit the raw log. However, that certainly doesn't scale.
- Refactor using events
contract {
    roundPeriod numBlocks
    currentRound int
    reserve amountFIL
    rounds []struct{
        measures []measure
        scores map[address]score
    }
    evaluators []address
}

maybeAdvanceRound () {
    // TODO: Define round logic. Base on tipset?
    if condition {
        this.currentRound++
    }
}

// Don't want to monopolize the measure service, therefore:
// - can be sent by anyone
// - can happen multiple times
addMeasurement (measures CID) {
    this.rounds[this.currentRound].measures.push(measures)
    maybeAdvanceRound()
}

// TODO: must this only happen once, or do we require the same result from all
// evaluators? What to do on conflict?
setScores (round int, scores map[address]score) {
    assert(sender in this.evaluators)
    assert(round == this.currentRound - 1)
    assert(this.rounds[round].scores == nil)
    this.rounds[round].scores = scores
    reward(round, scores)
}

private reward (round int, scores map[address]score) {
    // TODO
}

public getCurrentRound () int {
    return this.currentRound
}

public getCurrentRoundMeasures () []measure {
    return this.rounds[this.currentRound].measures
}
runIE()
sequenceDiagram
autonumber
participant C as Client
participant I as Contract: IE
participant P as Peer
participant T as Service: Tasker
participant M as Service: Measure
participant E as Service: Evaluate
participant R as Contracts: Reward
loop Funding
C->>I: Fund work
end
loop Work
T->>P: Assign task
P->>P: Perform work
P->>M: Upload logs
M->>M: Store logs
end
loop Impact Evaluator
par Measure
M->>M: Wait 1 minute
M->>M: Fetch uncommitted logs
M->>M: Create merkle proof
M->>I: Call `addMeasurement(commitment)`
I->>I: Maybe advance round
and Evaluate I: Data preprocessing
E->>I: Call `getCurrentRoundMeasures()`
E->>M: Fetch logs for unknown commitments
E->>E: Detect fraud
E->>E: Aggregate
E->>E: Store aggregates
and Evaluate II
E->>I: Call `getCurrentRound()`
E->>E: Skip if known round
E->>E: Fetch aggregates from prev. round
E->>E: Calculate reward shares
E->>I: call `setScores(round, scores)`
and Reward
I->>R: Call payments factory
R->>R: Create payment splitters
end
P->>R: Claim FIL
end
Use of centralized services
We have chosen to start the Meridian design supported by a minimal set of centralized and trusted services (previously called orchestrator). See for an exploration of alternatives.
A fully trustless model is to be developed as the very next step, because it brings important benefits for a rewards system:
- Trustless: There are no hidden parts of the system. All computation can be inspected by all peers.
- Safe: No party is responsible / liable for running the system
However, only using smart contracts poses unsolved design challenges:
- How to avoid high gas cost due to frequent contract invocation (O(n) vs O(1) when using a service that aggregates and then commits)
- How to store and work on large amounts of data
Store & commit
A pattern in the system is for centralized components to both store their results in a centralized database, and commit a proof of their results to the decentralized blockchain (through smart contracts).
This enables other services to consume the raw results produced, to implement further operations. This also adds verifiability, which is a crucial aspect of reward systems. Peers shouldn’t have to trust the system’s operators.
For example, here is how the measure service passes data on to the evaluate service, while publishing commitments on chain:
flowchart
M[Measure Service] --raw results--> MDB[Measure DB]
M --commitment--> MC[Measure Contract]
MDB --raw results--> E[Evaluate Service]
E --commitment--> EC[Evaluate Contract]
E --raw results--> EDB[Evaluate DB]
Measure
In the measure step, we refer to each atomic item that gets measured as a job. For example, each retrieval served by a Saturn node is a job. For Spark, each retrieval made from an SP is a job.
This step is implemented using 3 components:
- A centralized measure service
- A centralized measure database
- A measure smart contract
Peers periodically submit measurements (job logs) to the measure service, which stores them raw in its database, for future consumption by the evaluate pipeline.
The measure service will also periodically create Merkle trees of all measurements received since the last proof. The root hash is published to the chain, while the full tree is stored in the database. This allows peers to verify that their job logs have been included, a step towards a fully trustless model.
sequenceDiagram
autonumber
participant P as Peer
participant S as Service: Measure
participant D as DB: Measure
participant C as Contract: Measure
loop Store
P->>S: Upload measurements
S->>D: Store measurements
end
loop Commit
D->>S: Fetch measurements
S->>S: Create Merkle tree
S->>D: Store Merkle tree
S->>C: Call with Merkle root hash
C->>C: Emit hash
end
Proof of Inclusion
Meridian implementers can create a verification service, which lets peers verify inclusion by exposing the relevant nodes of the matching Merkle tree.
sequenceDiagram
participant P as Peer
participant S as Service: Verify
participant D as Database: Measure
P->>S: submit log hash, merkle root hash
S->>D: fetch merkle tree
S->>P: return relevant nodes
See also:
Data Model
Job logs
Job logs are periodically submitted by the peer to the measure service. This raw log of work done is the basis for the whole Meridian pipeline.
[
// Generalized record
{
"job_id": "<UUID or CID>", // Unique job id
"peer_id": "<Libp2p Peer ID>", // Who completed the job
"started_at": "Timestamp", // When did the job begin
"ended_at": "Timestamp", // When did the job end
// Any other fields that are useful measurements of work done
}
// Example Saturn record
{
"job_id": "abcdef",
"peer_id": "<Libp2p Peer ID>",
"started_at": "2023-05-01 00:52:57.62+00",
"ended_at": "2023-05-01 00:52:58.62+00",
"num_bytes_sent": 240,
"request_duration_sec": 10,
"ttfb_ms": 35,
"status_code": 200,
"cache_hit": true
}
// Example SPARK record
{
"job_id": "abcdef",
"peer_id": "<Libp2p Peer ID>",
"started_at": "2023-05-01 00:52:57.62+00",
"ended_at": "2023-05-01 00:52:58.62+00",
"status_code": 200,
"signature_chain": "<signature chain>",
"num_bytes": 200,
"ttfb_ms": 45
}
]
On-chain commitment
Periodically, the measure service submits a commitment of the work received to the chain, via the measure smart contract. After creating and storing a Merkle tree over the logs, the root hash is published together with the measurement epoch.
{
"root": "<merkle root hash>",
"started_at": "2023-05-01 00:52:57.62+00",
"ended_at": "2023-05-01 00:52:57.62+00"
}
Implementation
Evaluate
At this point we have a table of logs in the above data model, stored in an off-chain data store. The next step is to evaluate over the logs from the current measurement epoch, using the evaluation function.
Evaluation Function
In general, for $n$ nodes, for node $i$ where $1 \le i \le n$, with logs $L_i$ and an evaluation function $f$ over those logs, we can calculate the evaluation output as

$E = (e_1, \ldots, e_n)$

where $e_i = f(L_i)$ is the evaluation of node $i$.
In the case of Saturn, the evaluation function is a function of number of bytes sent, TTFB and the request duration. This is calculated by the Saturn payouts system.
In the case of Spark, the evaluation function is simply a count of the number of successful requests with valid signature chains a Station has performed. Specifically, for node $i$,

$e_i = \sum_{j=1}^{|L_i|} v_{i,j}$

where $v_{i,j} = 1$ if the log with index $j$ of node $i$ is valid and $v_{i,j} = 0$ otherwise.
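As a minimal sketch, the Spark count-of-valid-logs evaluation can be expressed as follows; `isValid` is a placeholder for the real UCAN signature-chain verification, and the field names follow the SPARK job log example above:

```javascript
// Sketch of the Spark evaluation function: the score of a node is the
// number of its logs that pass the validity check.
function evaluateSpark (logsByNode, isValid) {
  const scores = {}
  for (const [peerId, logs] of Object.entries(logsByNode)) {
    scores[peerId] = logs.filter(isValid).length
  }
  return scores
}

// Example: treat status 200 with a present signature chain as valid.
const scores = evaluateSpark({
  f1aaa: [{ status_code: 200, signature_chain: '...' }, { status_code: 500 }],
  f1bbb: [{ status_code: 200, signature_chain: '...' }]
}, (log) => log.status_code === 200 && Boolean(log.signature_chain))
// scores → { f1aaa: 1, f1bbb: 1 }
```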
The Saturn evaluation function is more complicated. See https://hackmd.io/@cryptoecon/saturn-aliens/%2FMqxcRhVdSi2txAKW7pCh5Q for more details.
Multi stage evaluation
In Meridian, evaluation is a two-stage process. Evaluation stage I is a data preprocessing pipeline that periodically pre-filters and aggregates measurement results. This smaller dataset is then consumed by evaluation stage II, which is executed once before the rewards phase.
The two-stage design is one of the lessons from the Saturn project: data needs to be aggregated and pre-filtered, as otherwise e.g. a once-a-month evaluation run would operate over too large a dataset and pose serious scaling issues.
Conveniently, fraud detection is also a 2 stage process, and each evaluation stage comes with one fraud detection stage.
Evaluation Stage I: Data preprocessing
The data preprocessing pipeline is executed whenever the measure service has committed an inclusion proof on chain. It takes the raw logs from the associated measurement epoch, performs its preprocessing steps, and finally stores and commits the results.
sequenceDiagram
autonumber
participant E as Service: Evaluate Stage I
participant DM as Database: Measure
participant DE as Database: Evaluate
participant C as Chain
DM->>E: Fetch logs
E->>E: Detect fraud
E->>E: Aggregate
E->>DE: Store aggregates in buckets Honest/Fraudulent
E->>C: Publish proof
Fraud Detection
Based on the network-specific fraud detection function, raw job logs are aggregated into two buckets:
- Honest logs: Logs used for later processing in evaluation stage II and reward
- Fraudulent logs: Logs kept for reference
The fraud detection function maps an individual log line to boolean fraudulent status:
flowchart TD
subgraph Logs
L1[Log]
L2[Log]
L3[Log]
end
subgraph Buckets
BH[Honest logs]
BF[Fraudulent logs]
end
Logs--detect fraud-->Buckets
L1[Log] --> BH
L2[Log] --> BH
L3[Log] --> BF
For SPARK, for example, the fraud detection function verifies a log’s UCAN signature chain. If the signature chain holds, the log line is aggregated into Honest logs; otherwise, into Fraudulent logs.
Aggregation
Logs from both buckets will be aggregated, and those aggregations stored in a database. A merged aggregate will be committed on chain.
However, only logs from the Honest logs bucket will count when evaluation stage II determines the peer’s impact on the system.
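The bucketing and aggregation steps together can be sketched as follows; `isFraudulent` stands in for the network-specific fraud detection function, and the aggregate fields follow the commitment shape shown below (`preprocess` is an illustrative name):

```javascript
// Sketch of evaluation stage I: partition raw logs into Honest/Fraudulent
// buckets using a fraud predicate, then aggregate each bucket.
function preprocess (logs, isFraudulent) {
  const buckets = { honest: [], fraudulent: [] }
  for (const log of logs) {
    buckets[isFraudulent(log) ? 'fraudulent' : 'honest'].push(log)
  }
  // Aggregate a bucket into the fields the evaluation function needs.
  const aggregate = (bucket) => ({
    log_count: bucket.length,
    num_bytes: bucket.reduce((sum, log) => sum + (log.num_bytes ?? 0), 0)
  })
  return {
    honest: aggregate(buckets.honest),
    fraudulent: aggregate(buckets.fraudulent)
  }
}

const agg = preprocess([
  { num_bytes: 100, valid: true },
  { num_bytes: 50, valid: false }
], (log) => !log.valid)
// agg.honest → { log_count: 1, num_bytes: 100 }
```

Both aggregates are stored; only the merged sum is committed on chain, and only the honest aggregate feeds into stage II.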
flowchart TD
subgraph Buckets
BH[Honest logs]
BF[Fraudulent logs]
end
subgraph Aggregates
AH[Honest logs]
AF[Fraudulent logs]
end
subgraph Database
TH[Honest logs]
TF[Fraudulent logs]
end
BH --> AH
BF --> AF
AH --> TH
AF --> TF
AH --> Merge
AF --> Merge
Merge --> Commitment
On-chain commitment
The on-chain commitment contains aggregated measurements from both buckets Honest logs and Fraudulent logs, in order to commit to a sum of work done. No further details about fraud detection results are leaked, to prevent gaming the system.
// General commitment shape
{
// The .root hash of the measurement commitment that was aggregated over
"measurement_root": "<merkle root hash>",
"started_at": "2023-05-01 00:52:57.62+00",
"ended_at": "2023-05-01 00:52:57.62+00",
"measurements": {
"honest": {
"log_count": 1000,
// Any properties that need to be fed to the evaluation function,
// aggregated
},
"fraudulent": {
// Same shape as above
}
}
}
// Example Saturn commitment
{
"measurement_root": "<merkle root hash>",
"started_at": "2023-05-01 00:52:57.62+00",
"ended_at": "2023-05-01 00:52:57.62+00",
"measurements": {
"honest": {
"log_count": 13,
"num_bytes": 1000
},
"fraudulent": {
"log_count": 2000,
"num_bytes": 100
}
}
}
// Example SPARK commitment
{
"measurement_root": "<merkle root hash>",
"started_at": "2023-05-01 00:52:57.62+00",
"ended_at": "2023-05-01 00:52:57.62+00",
"measurements": {
"honest": {
"log_count": 13,
"num_bytes": 1000
},
"fraudulent": {
"log_count": 2000,
"num_bytes": 100
}
}
}Proof
The attached measurement root lets peers verify that their log line was included in the aggregation. There is no further proof of which log line counted towards the aggregation (or was ignored due to fraud), in order to prevent gaming of the fraud detection function.
Fraud detection is a private process, and therefore no proof will be created for this function either.
Evaluation Stage II
At the end of each payment epoch, the evaluate stage II service converts preprocessed, aggregated logs (from the Honest logs bucket) into evaluation results.
It also executes a second round of fraud filtering.
It asks for human review before triggering the reward phase by calling the rewards factory contract with batches of evaluations.
sequenceDiagram
autonumber
participant E as Service: Evaluate Stage II
participant DB as Database: Evaluate
participant C as Contract: Rewards Factory
DB->>E: Fetch Honest logs aggregates
E->>E: Detect and discard fraud
E->>E: Evaluate
E->>E: Ask for human review
E->>E: Create batches of evaluations
loop For each batch
E->>C: Call with batch of evaluations
end
The evaluation process runs off chain, because the dataset (all aggregated measurements produced by evaluation stage i) is too large to be handled by smart contracts.
In order for the rewards factory contract to be callable even with a large number of peers, the evaluate stage II service needs to batch its calls into dynamically sized buckets, given known smart contract size limitations.
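A minimal sketch of the batching step (the batch size of 500 and the name `batchEvaluations` are illustrative assumptions; in practice the size would be derived from the contract's call-data limits):

```javascript
// Sketch: split the full list of per-peer evaluations into fixed-size
// batches so each call to the rewards factory contract stays within limits.
function batchEvaluations (evaluations, batchSize = 500) {
  const batches = []
  for (let i = 0; i < evaluations.length; i += batchSize) {
    batches.push(evaluations.slice(i, i + batchSize))
  }
  return batches
}

// Each batch becomes one contract call in the "For each batch" loop above.
const batches = batchEvaluations([1, 2, 3, 4, 5], 2)
// batches → [[1, 2], [3, 4], [5]]
```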
Fraud detection
Multiple processes can mark peers or logs as fraudulent, in between measure and evaluate. For example, the Saturn Orchestrator can mark a peer as fraudulent when it fakes its speed test results.
Therefore, all logs that are part of the Honest logs bucket but have later been flagged as fraudulent (or are associated with a peer that has been flagged) will not be fed into the evaluation function.
Proof
Proving that the correct evaluation computation was run over the correct dataset, and that the right result was shared, is still a matter of research.
TODO: Add more details
Human Review
Should there be a problem in the previous steps (measure, evaluation stage I), note that their results have already been committed and can no longer be adjusted (for this payment epoch).
At the end of evaluation stage II, data of the following shape is committed on chain by calling the rewards factory contract.
{
"started_at": "2023-05-01 00:52:57.62+00",
"ended_at": "2023-05-01 00:52:57.62+00",
"payees": [{
"address": "f1...",
"proportion": "0.4", // Fraction of the reward pool
"fraud": "0.1" // Fraction of fraudulent logs amont all of the
// payee's logs (which have been discarged)
},{
"address": "f1...",
"proportion": "0.6",
"fraud": "0"
}],
}
I.e. for each epoch we know what proportion of the overall tokens will go to each node.
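Deriving these proportions from the per-peer evaluation scores can be sketched as a simple normalization (`toProportions` is an illustrative name; a real implementation would use fixed-point arithmetic rather than floats on chain):

```javascript
// Sketch: turn per-peer evaluation scores into reward proportions, i.e.
// fractions of the reward pool that sum to 1 (up to floating point error).
function toProportions (scores) {
  const total = Object.values(scores).reduce((a, b) => a + b, 0)
  return Object.fromEntries(
    Object.entries(scores).map(([address, score]) => [address, score / total])
  )
}

const proportions = toProportions({ f1aaa: 2, f1bbb: 3 })
// proportions → { f1aaa: 0.4, f1bbb: 0.6 }
```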
Reward
flowchart TD
subgraph Service: Evaluate
b1[Batch of evaluations]
b2[Batch of evaluations]
end
b1--call-->F[Contract: Rewards Factory]
F--deploy-->S[Contract: Payment Splitter<br />- f1aaa: 1FIL<br />- f1bbb: 2FIL<br />...]
subgraph Peers
p1[Peer: f1aaa]
p2[Peer: f1bbb]
end
p1--claim-->S
p2--claim-->S
Pull vs push payments
It is important for peers to claim (pull) their rewards, rather than for the system to send out (push) rewards, for multiple reasons:
- This gives peers the freedom to claim whenever they want (e.g. tax benefits), or not to claim at all
- This further decouples the system from the system’s operator, which helps with liabilities
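The pull pattern above can be sketched conceptually in plain JavaScript (a real implementation would live in the payment splitter smart contract; `PaymentSplitter` here is a hypothetical stand-in, not the actual contract interface):

```javascript
// Conceptual sketch of the pull-payment pattern: the splitter records what
// each payee is owed, and payees withdraw (claim) when they choose to.
class PaymentSplitter {
  constructor (shares) {
    // address -> amount of FIL owed, as set at deployment time
    this.owed = new Map(Object.entries(shares))
  }

  // Pull: the payee initiates the transfer; the system never pushes funds.
  // Returns the claimed amount and zeroes out the balance.
  claim (address) {
    const amount = this.owed.get(address) ?? 0
    this.owed.set(address, 0)
    return amount
  }
}

const splitter = new PaymentSplitter({ f1aaa: 1, f1bbb: 2 })
splitter.claim('f1aaa') // → 1; a second claim would return 0
```

Because the contract only ever responds to claims, unclaimed funds simply stay in the splitter, which supports the "claim whenever they want, or not at all" property.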
https://github.com/filecoin-station/meridian-measure-service/tree/main