FilCDN Architecture

Author: @Nikolas Haimerl

Overview

This page describes the design of FilCDN: the services it comprises and how it interacts with other components in the FWS stack.

See FilCDN Overview for an introduction to FilCDN.

Description

FilCDN is a software component that forwards file retrieval requests to the SPs that serve the requested PDP data. It leverages Cloudflare's caching mechanism and offers a single point of entry for file retrievals.

Workflow

The following diagram describes the FilCDN retrieval workflow. It shows which FilCDN services interact with the other parts of the FWS stack at each point of the retrieval process.

⚠️

The diagram below is somewhat outdated. It omits the steps needed to map Client+CommP to the SP to retrieve from, to verify that the ProofSet and Root are still live (were not deleted), etc.

sequenceDiagram
participant u as User
participant cl as filcdn.io<br/>(CloudFlare entry point)
participant clc as CloudFlare Caching Layer
participant fcdn as FilCDN Backend<br/>(CloudFlare Worker)
participant sp as SP
participant db as FilCDN Stats DB<br/>(on CloudFlare)
participant cv as FilCDN Verifier Contract

u->>cl: Make file retrieval request
cl->>clc: Check for cached request
alt Cache Hit
        clc->>cl: Return file
else Cache Miss
        cl->>fcdn: Forwards Request
        fcdn->>sp: Make file retrieval request
        fcdn->>db: Record telemetry in background
        sp->>fcdn: Respond with File or Error
end
cl->>u: Return request response

loop On schedule
	fcdn->>cv: Submit aggregated telemetry to the Verifier contract
end

Thoughts from Miroslav

  • For June 6, the workflow above is all we need.
    • However: depending on how we want to slice the stats, we will likely need to include ProofSetID both in the request and in the stats submitted on the chain.
      • The Verifier contract should (eventually, post June 6?) accept only stats for existing & live pairs of (ProofSet, Root).
  • For November, I think we will need to add the following steps to the worker before it calls SP:
    • Require both ProofSetID (e.g. 1) and RootCID (CommP) as input.
    • Check that the requested ProofSet and Root are active according to PDPVerifier state.
      • The PDPVerifier does not provide a mapping from RootCID to Root Metadata, only from Root Index to Root Metadata. Depending on the architecture, we may need to include the Root Index in the request.
      • Obtain the owner of the ProofSet (a 0x address).
    • Map the owner to the retrieval URL (or hostname).

Storage Flow

sequenceDiagram
participant u as User
participant pc as Pandora Contract
participant sp as SP
participant sgh as SubGraph
participant fcw as FilCDN Worker<br/>(on CloudFlare)
participant fcl as FilCDN Lookup<br/>(on CloudFlare)

u->>pc: Make PDP Storage Deal with funds from wallet (W)
u->>sp: Send bytes to SP
pc->>u: return CommP
pc->>sgh: consume on chain event with deal details
alt CDN not in deal
        sgh->>sgh: Do nothing
else CDN in deal
        sgh->>fcw: Forwards Request
        fcw->>fcl: Write lookup ((W, CommP) x SP)
end
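The lookup write at the end of the storage flow above could look roughly like the following. This is a sketch, not the actual FilCDN Lookup implementation: the in-memory `Lookup` record, `DealEvent`, `lookupKey`, and `recordDeal` are hypothetical names, and W is assumed to be a 0x address.

```typescript
// Hypothetical in-memory stand-in for the FilCDN Lookup store.
// Each (W, CommP) key maps to the SPs that hold the data.
type Lookup = Record<string, string[]>;

// Hypothetical shape of the deal details taken from the on-chain event.
interface DealEvent {
  wallet: string;
  commP: string;
  sp: string;
  withCDN: boolean;
}

// Hex addresses are case-insensitive, so the wallet is normalised in the key.
function lookupKey(wallet: string, commP: string): string {
  return `${wallet.toLowerCase()}:${commP}`;
}

function recordDeal(lookup: Lookup, deal: DealEvent): void {
  // "CDN not in deal": do nothing.
  if (!deal.withCDN) return;
  const key = lookupKey(deal.wallet, deal.commP);
  const sps = lookup[key] || (lookup[key] = []);
  // Avoid duplicate entries if the same deal is seen twice.
  if (sps.indexOf(deal.sp) === -1) sps.push(deal.sp);
}
```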

Retrieval Flow with the Lookup Added

sequenceDiagram
participant u as User
participant cl as filcdn.io<br/>(CloudFlare entry point)
participant clc as CloudFlare Caching Layer
participant fcdn as FilCDN Backend<br/>(CloudFlare Worker)
participant sp as SP
participant db as FilCDN Stats DB<br/>(on CloudFlare)
participant cv as FilCDN Verifier Contract

u->>cl: Make file retrieval request with wallet (W) and CommP
cl->>clc: Check for cached request for W + CommP
alt Cache Hit
        clc->>cl: Return file
else Cache Miss
        cl->>fcdn: Forwards Request
        fcdn->>fcdn: Lookup SP from W + CommP
        fcdn->>sp: Make file retrieval request to SP
        fcdn->>db: Record telemetry in background
        sp->>fcdn: Respond with File or Error
end
cl->>u: Return request response

loop On schedule
	fcdn->>cv: Submit aggregated telemetry to the Verifier contract
end
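The scheduled step at the end of the diagram above submits aggregated telemetry. A minimal sketch of the aggregation itself follows. The record fields, `Totals`, and `aggregate` are assumptions made for illustration, and the grouping key is left as a parameter because how the stats should be sliced is still open.

```typescript
// Hypothetical shape of one telemetry record in the Stats DB.
interface RetrievalRecord {
  clientAddress: string;
  commP: string;
  cacheMiss: boolean;
  egressBytes: number;
}

interface Totals {
  requests: number;
  cacheMisses: number;
  egressBytes: number;
}

// Groups records by an arbitrary key and sums the counters per group.
function aggregate(
  records: RetrievalRecord[],
  keyOf: (record: RetrievalRecord) => string,
): Record<string, Totals> {
  const totals: Record<string, Totals> = {};
  for (const record of records) {
    const key = keyOf(record);
    const group =
      totals[key] || (totals[key] = { requests: 0, cacheMisses: 0, egressBytes: 0 });
    group.requests += 1;
    if (record.cacheMiss) group.cacheMisses += 1;
    group.egressBytes += record.egressBytes;
  }
  return totals;
}
```

Passing `(r) => r.clientAddress` slices by client, while a key built from both `clientAddress` and `commP` would slice per file.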

USDFC Value Flow

flowchart TD
  C[Client]
  F[FilCDN]
  SP
  PRP[Payment Rail: PDP]
  PRCM[Payment Rail: Cache Miss]
  PRCD[Payment Rail: CDN]
  C --For any PDP proof set--> PRP
  C --If withCDN=true--> PRCM
  C --If withCDN=true--> PRCD
  PRP --Reward proving data possession--> SP
  PRCM --Reward serving cache-misses to CDN--> SP
  PRCD --Reward making data globally fast and available--> F

  

When Client wants to store data on an SP using PDP, they pay into the Payment Rail: PDP, out of which SP will eventually be rewarded.

When Client also wants to enable FilCDN delivery (supported by Pandora SDK), two additional payment rails will be created:

  • Payment Rail: Cache Miss rewards SPs for serving PDP data to FilCDN, which is required for the CDN to function. In effect, these are paid retrievals for the SP.
  • Payment Rail: CDN rewards FilCDN for delivering cached SP PDP data globally and making it fast.

Payment rails are managed by the Pandora smart contract, created by FilOz and extended by Space Meridian.
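The relationship between the `withCDN` flag, the rails, and their payees can be written down as a small table. This is only an illustration of the value flow described above, not code from the Pandora contract. `Rail`, `Payee`, `RAIL_PAYEE`, and `railsFor` are hypothetical names.

```typescript
type Rail = "PDP" | "Cache Miss" | "CDN";
type Payee = "SP" | "FilCDN";

// Who is eventually rewarded out of each payment rail.
const RAIL_PAYEE: Record<Rail, Payee> = {
  PDP: "SP", // reward for proving data possession
  "Cache Miss": "SP", // reward for serving cache misses to the CDN
  CDN: "FilCDN", // reward for making data globally fast and available
};

// Every PDP proof set has the PDP rail. The two CDN-related rails
// are only created when the client enables FilCDN delivery.
function railsFor(withCDN: boolean): Rail[] {
  return withCDN ? ["PDP", "Cache Miss", "CDN"] : ["PDP"];
}
```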

Retrieval flow

sequenceDiagram
  participant U as User
  participant CR as Cloudflare Worker: Retrieval
  participant DI as Cloudflare D1: Index
  participant SP
  participant DM as Cloudflare D1: Metrics
  U->>CR: Request [clientAddress,commP]
  CR->>DI: Query index
  CR->>CR: Decide how to serve request
  alt is not in cache
    CR->>SP: Retrieve file
    CR->>CR: Populate cache
  end
  CR->>U: Serve file
  CR->>DM: Store metrics
  

When User requests a file from FilCDN, the URL contains clientAddress and the root commP. The Cloudflare Worker: Retrieval looks up these fields in Cloudflare D1: Index to decide whether and how to serve the request. For example, it checks whether CDN service has been booked for the PDP proof set that contains commP, and which SPs could have the data. If all checks pass, the data is either served directly from cache or forwarded from an SP, which also populates the cache. After the data has been completely delivered to User, metrics about the request are written to Cloudflare D1: Metrics, which are used for both dashboards and billing.
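The "decide how to serve request" step can be sketched as a pure function over the index lookup result. The assumptions here are significant: the URL path shape `/<clientAddress>/<commP>`, the `IndexEntry` fields, and the names `parseRequestPath` and `decide` are all hypothetical, and the order of the checks is one plausible reading of the description above.

```typescript
// Hypothetical result of querying Cloudflare D1: Index for one request.
interface IndexEntry {
  withCDN: boolean; // has CDN service been booked for the proof set?
  sps: string[]; // SPs that could have the data
}

type Decision =
  | { action: "reject"; reason: string }
  | { action: "serve-from-cache" }
  | { action: "fetch-from-sp"; sp: string };

// Assumed path shape: /<clientAddress>/<commP>
function parseRequestPath(
  path: string,
): { clientAddress: string; commP: string } | null {
  const parts = path.split("/").filter((part) => part.length > 0);
  if (parts.length !== 2) return null;
  return { clientAddress: parts[0], commP: parts[1] };
}

function decide(entry: IndexEntry | undefined, isCached: boolean): Decision {
  if (!entry) return { action: "reject", reason: "unknown clientAddress or commP" };
  if (!entry.withCDN) return { action: "reject", reason: "CDN service not booked" };
  if (isCached) return { action: "serve-from-cache" };
  if (entry.sps.length === 0) return { action: "reject", reason: "no SP has the data" };
  // Picking the first SP is a placeholder; a real worker might choose differently.
  return { action: "fetch-from-sp", sp: entry.sps[0] };
}
```

Keeping the decision separate from the fetch and cache calls means it can be unit-tested without a Workers runtime.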

Indexing flow

flowchart LR
  D1[Cloudflare D1: Index]
  CI[Cloudflare Worker: Indexing]
  SP[Goldsky Subgraphs: Pandora]
  SV[Goldsky Subgraphs: PDP Verifier]
  PV[PDP Verifier contract]
  P[Pandora contract]
  CI -- Populate --> D1
  PV -- Emit events --> SV
  P -- Emit events --> SP
  SP -- Send events --> CI
  SV -- Send events --> CI
  

The Cloudflare D1: Index is populated based on events emitted by the Pandora and PDP Verifier smart contracts (created by FilOz). Goldsky Subgraphs and pipelines act as the glue layer between the blockchain and the Cloudflare Worker: Indexing, calling it for every new event that is seen.
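The per-event call into the indexing worker could be sketched as a reducer that applies one event to the index. The event kinds and fields below are simplified, hypothetical shapes rather than the actual contract events or Goldsky payloads, and the index is an in-memory record standing in for Cloudflare D1.

```typescript
// Hypothetical, simplified event shapes delivered to the indexing worker.
type IndexEvent =
  | { kind: "rootAdded"; proofSetId: number; commP: string }
  | { kind: "rootRemoved"; proofSetId: number; commP: string }
  | { kind: "proofSetDeleted"; proofSetId: number };

// In-memory stand-in for the index: proof set ID -> live roots.
type RootIndex = Record<number, string[]>;

function applyEvent(index: RootIndex, event: IndexEvent): void {
  switch (event.kind) {
    case "rootAdded": {
      const roots = index[event.proofSetId] || (index[event.proofSetId] = []);
      // Skip duplicates so applying the same event twice is harmless.
      if (roots.indexOf(event.commP) === -1) roots.push(event.commP);
      break;
    }
    case "rootRemoved": {
      const removed = event.commP;
      const roots = index[event.proofSetId] || [];
      index[event.proofSetId] = roots.filter((root) => root !== removed);
      break;
    }
    case "proofSetDeleted":
      delete index[event.proofSetId];
      break;
  }
}
```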