Server-side Filecoin address compliance checks

Motivation

For legal reasons, in addition to checking participant Filecoin addresses against https://www.chainalysis.com/ on the client side, we must also perform these checks on the server side: clients cannot be trusted, so the checks need to be replicated on the server.

This document is aimed at deciding where and how to perform these checks.

We must not send FIL to sanctioned addresses.

Services overview

spark-api

spark-api is the first server-side receiver of measurements. Checker nodes submit measurements to it one by one. After initial validation, it persists those measurements in spark-db.

spark-publish

The service periodically publishes batches of measurements from spark-db to the smart contract.

spark-evaluate

For each batch of measurements received via the smart contract, the service preprocesses the measurements and, at the end of each round, sends reward scores to spark-rewards.

spark-rewards

Pays out rewards to participants once per week.

smart-contract

Lookup strategy

Looking up compliance information from Chainalysis for each measurement is undesirable:

Each participant creates many measurements over a short time frame, so we would issue many redundant requests that will almost certainly produce the same result. We neither want to be blocked by Chainalysis, nor create unnecessary load on the network stack, nor waste energy.

Lookup frequency

Looking up compliance information once per participant is better; the question is at which frequency:

  1. Strategy a.: If we hook into spark-publish, or spark-evaluate's preprocess step, we can look up compliance information for each participant of a 1 minute batch of measurements. This will still lead to redundant work if a participant is active for more than 1 minute. ❌
  1. Strategy b.: If we hook into spark-evaluate's reward step, we can look up compliance information for each participant of a 20-minute round. This will still lead to some redundant work if a participant is active for more than 20 minutes. ✅
    1. This way we can visualize the sanctioned measurement count
  1. Strategy c.: If we put a cache in front of Chainalysis, we don’t need to worry about rate limiting or causing them unnecessary stress. It also gives us more freedom in choosing where we read the cache from. On the other hand, one can argue that Chainalysis already uses a cache and we’re simply replicating that part of their infrastructure. However, with regard to lookup frequency, a cache puts us in full control of how often we refresh compliance information. ✅
  1. Strategy d.: Spark-rewards
  1. Strategy e.: Smart contract
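Strategy c.'s cache could look roughly like the following sketch. The helper name `isSanctionedCached`, the one-day TTL, and the stale-fallback behavior are illustrative assumptions, not decisions made in this document:

```javascript
// Sketch of a TTL cache in front of the Chainalysis lookup.
// `lookup(address)` stands in for the actual Chainalysis request.
const TTL_MS = 24 * 60 * 60 * 1000 // assumed refresh interval: once per day

const cache = new Map() // address -> { sanctioned, fetchedAt }

async function isSanctionedCached (address, lookup) {
  const hit = cache.get(address)
  if (hit && Date.now() - hit.fetchedAt < TTL_MS) return hit.sanctioned
  try {
    const sanctioned = await lookup(address)
    cache.set(address, { sanctioned, fetchedAt: Date.now() })
    return sanctioned
  } catch (err) {
    // On lookup failure, fall back to potentially stale information (strategy c.)
    if (hit) return hit.sanctioned
    throw err
  }
}
```

With this in place, the refresh frequency becomes a single tunable constant rather than a property of whichever service happens to perform the lookup.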

Error handling

Unavailability of Chainalysis needs to be dealt with, both for failing and slow responses.

Failure

In case of a request failure, the impact depends on the lookup strategy (see above) used:

  1. For strategy a., a participant’s work over 1 minute will be unrewarded. ✅
  1. For strategy b., a participant’s work over 20 minutes will be unrewarded. ❌
  1. For strategy c., we assume that our cache will be highly available, and we assume that a Chainalysis cache update failure will simply lead to potentially stale information being used. ✅✅
💡 This could be mitigated with retries, at the cost of a negative impact on our services and processing times (we must not fall behind).
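A minimal retry helper for this mitigation could look like the sketch below; the name `withRetries` and the backoff parameters are assumptions for illustration:

```javascript
// Hypothetical retry helper: retry a flaky lookup a few times with
// exponential backoff, then give up so the pipeline does not fall behind.
async function withRetries (fn, { retries = 3, delayMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (err) {
      if (attempt >= retries) throw err
      await new Promise(resolve => setTimeout(resolve, delayMs * 2 ** attempt))
    }
  }
}
```

Bounding both the retry count and the backoff keeps the worst-case added latency predictable, which matters for the timing constraints discussed below.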

Slowness

In case of a slow request, impact also depends on the lookup strategy (see above) used:

  1. For strategy a., delays are bad since they hold up the “tight” 1 minute publish loop, and cause a delay for all other participants. ❌
  1. For strategy b., the impact is minimal, as long as the request can be resolved in less than 15 minutes (20-minute round length minus a 5-minute buffer). ✅
  1. For strategy c., we assume that our cache will be highly available, and we assume that Chainalysis cache update slowness will simply lead to potentially stale information being used. ✅ ✅
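To enforce such a deadline on slow requests, a small wrapper can be used; `withTimeout` and its error message are hypothetical:

```javascript
// Hypothetical timeout wrapper: reject a slow Chainalysis request after a
// deadline, so a single lookup cannot hold up an entire round.
async function withTimeout (promise, ms) {
  let timer
  const timeout = new Promise((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error('timed out')), ms)
  })
  try {
    return await Promise.race([promise, timeout])
  } finally {
    clearTimeout(timer) // avoid keeping the event loop alive after the race
  }
}
```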

Proposal

Spark-rewards

  • Runs only once a week
  • Manual process
  • No time pressure on completion time

  1. Receive scheduled rewards update, every 20 minutes
    1. Nothing to be done: the ToS state that it is the participant’s responsibility to continuously monitor the sanctions list. If they are added to the list, they are required to stop participating.
  1. Commit scheduled rewards to the smart contract, every 7 days
    1. Here we perform the compliance checks, with adequate concurrency, caching, error handling / retrying.

We have decided it’s not crucial at the moment to reject measurements from sanctioned participants as soon as possible. We have also decided it’s ok to use their results in spark data.

Implementation plan:

  • Add another pMap in commit-rewards.js, with retries and timeouts
    • This will check only participants who have passed the previous ≥0.1 FIL check
    • Add console.log() to create a bit of visibility (optionally count total number of addresses discarded for this reason)
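The extra pass could be sketched as follows. To keep the example self-contained, a hand-rolled concurrency limiter stands in for pMap, and `checkCompliance` is a placeholder for the (retried, timeout-guarded) Chainalysis lookup:

```javascript
// Sketch of the extra compliance pass in commit-rewards.js.
// `checkCompliance(address)` resolves to true for sanctioned addresses.
// In the real implementation this loop would be a pMap with retries/timeouts.
async function filterSanctioned (participants, checkCompliance, concurrency = 10) {
  const results = new Array(participants.length)
  let index = 0
  async function worker () {
    while (index < participants.length) {
      const i = index++ // claim the next participant synchronously
      results[i] = await checkCompliance(participants[i])
    }
  }
  const workers = Math.min(concurrency, participants.length)
  await Promise.all(Array.from({ length: workers }, worker))
  const eligible = participants.filter((_, i) => !results[i])
  // Visibility, as proposed above: count discarded addresses
  console.log('Discarded %d sanctioned address(es)', participants.length - eligible.length)
  return eligible
}
```

This pass would run only on participants who already passed the ≥0.1 FIL check, so the number of lookups per weekly commit stays small.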

Action items:

  • @Julian Gruber ask Patrick whether he wants to check this with lawyers:
    > We have also decided it’s ok to use their results in spark data.