Delegated Routing for FWSS
Authors:
- @Miroslav Bajtoš
<miroslav@meridian.space>
- @Julian Gruber
<julian@meridian.space>
Last Updated:
Introduction
In this document, we propose a design for an IPFS Delegated Routing service that will allow IPFS nodes like Kubo and Helia to discover and retrieve IPFS (UnixFS) content stored in Filecoin Warm Storage Service (FWSS) deals.
Please read to better understand the differences between Filecoin and IPFS formats. TL;DR:
To store IPFS content on Filecoin, it is serialised in CAR format. The CAR output is treated as the Piece to store, and gets assigned a CommP CID. It is not possible to map an IPFS CID to the Filecoin CID or vice versa without access to the entire content.
Minimal Scope
- We want to index Root CIDs only (as opposed to indexing all CAR blocks found in a piece).
- All deals that declare a Root CID will be indexed (with or without CDN).
- The index will point to direct retrievals from SPs, no CDN yet.
- An HTTP server implements the Delegated Routing V1 HTTP API
- The uploaders will be required to attach additional Piece metadata to indicate IPFS root CIDs.
Out of scope
- Support for retrievals via Filecoin Beam.
- On-chain payments to cover the costs of running this index and serving Delegated Routing queries.
- IPFS CIDs contained within the root IPFS CID content.
Architecture
We propose a two-tiered architecture:
- Extend the “official” WarmStorage subgraph (source) to index ipfsRootCid Piece metadata, similarly to how it already indexes withCDN DataSet metadata. We would like Filecoin Foundation and/or FilOz to run this subgraph as a public goods service for the entire ecosystem.
Having said that, the subgraph code will be open source, and anybody can deploy and pay for their own instance of the subgraph.
- Implement a thin caching reverse proxy layer that will translate Delegated Routing queries into GraphQL queries, and translate the responses back. This layer can be implemented as a Cloudflare Worker.
query {
  pieces(where: { ipfsRootCid: "0xabc" }) {
    dataSet {
      serviceProvider {
        products(where: { productType: "0" }) {
          serviceUrl
        }
      }
    }
  }
}
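The translation layer described above can be sketched as a handful of pure functions that a Cloudflare Worker (or any HTTP server) could call. This is a minimal sketch under several assumptions, not a finished implementation: the GraphQL variable type (`String!`), the CID being passed to the subgraph unchanged, the `transport-ipfs-gateway-http` protocol label, and the caller-supplied `peerIdFor` lookup are all hypothetical, since this document does not say how the subgraph encodes `ipfsRootCid` or where an SP's peer ID would come from. Only hostname-based service URLs are handled, and any URL path is ignored.

```typescript
// Field names below mirror the GraphQL query shown above.
interface SubgraphResponse {
  data: {
    pieces: Array<{
      dataSet: { serviceProvider: { products: Array<{ serviceUrl: string }> } };
    }>;
  };
}

// Provider record in the Delegated Routing V1 "peer" schema.
interface PeerRecord {
  Schema: "peer";
  ID: string;
  Addrs: string[];
  Protocols: string[];
}

// Assumption: the subgraph accepts the root CID as a String variable.
const PROVIDERS_QUERY = `
  query ($cid: String!) {
    pieces(where: { ipfsRootCid: $cid }) {
      dataSet { serviceProvider { products(where: { productType: "0" }) { serviceUrl } } }
    }
  }`;

// Extract the CID from a path like /routing/v1/providers/{cid}, or return null.
function parseProvidersPath(pathname: string): string | null {
  const match = /^\/routing\/v1\/providers\/([A-Za-z0-9]+)\/?$/.exec(pathname);
  return match ? match[1] : null;
}

// Build the JSON body to POST to the subgraph's GraphQL endpoint.
function buildGraphQLRequest(cid: string): { query: string; variables: { cid: string } } {
  return { query: PROVIDERS_QUERY, variables: { cid } };
}

// Convert an SP service URL into an HTTP(S) multiaddr (hostnames only, path dropped).
function serviceUrlToMultiaddr(serviceUrl: string): string | null {
  const match = /^(https?):\/\/([A-Za-z0-9.-]+)(?::(\d+))?(?:\/|$)/.exec(serviceUrl);
  if (!match) return null;
  const scheme = match[1];
  const host = match[2];
  const port = match[3] ?? (scheme === "https" ? "443" : "80");
  return `/dns/${host}/tcp/${port}/${scheme}`;
}

// Translate the subgraph response into a Delegated Routing response body,
// emitting one record per distinct provider address.
function toRoutingResponse(
  res: SubgraphResponse,
  peerIdFor: (serviceUrl: string) => string, // hypothetical peer ID lookup
): { Providers: PeerRecord[] } {
  const seen = new Set<string>();
  const providers: PeerRecord[] = [];
  for (const piece of res.data.pieces) {
    for (const product of piece.dataSet.serviceProvider.products) {
      const addr = serviceUrlToMultiaddr(product.serviceUrl);
      if (addr === null || seen.has(addr)) continue;
      seen.add(addr);
      providers.push({
        Schema: "peer",
        ID: peerIdFor(product.serviceUrl),
        Addrs: [addr],
        Protocols: ["transport-ipfs-gateway-http"], // assumed protocol label
      });
    }
  }
  return { Providers: providers };
}
```

Keeping these steps free of network calls means the Worker itself only needs to fetch the subgraph and cache the result, and the translation logic can be tested without a deployed subgraph.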
The last missing piece is enhancing Synapse SDK to allow clients to provide ipfsRootCid in Piece metadata when uploading content to FWSS (https://github.com/FilOzone/synapse-sdk/issues/174).
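As an illustration of what the client-side change could look like, the sketch below attaches a root CID to Piece metadata before upload. It does not use the real Synapse SDK API: the `PieceMetadata` type and the `withIpfsRootCid` helper are hypothetical names, and the assumption is that Piece metadata is a set of string key/value pairs. The check only accepts base32 CIDv1 strings and is a loose sanity check, not full CID validation.

```typescript
// Hypothetical shape: Piece metadata as string key/value pairs.
type PieceMetadata = Record<string, string>;

// Return a copy of the metadata with the ipfsRootCid key set.
function withIpfsRootCid(metadata: PieceMetadata, rootCid: string): PieceMetadata {
  // Base32 CIDv1 strings start with the multibase prefix "b" followed by
  // lowercase RFC 4648 base32 characters. CIDv0 ("Qm...") is rejected here.
  if (!/^b[a-z2-7]{20,}$/.test(rootCid)) {
    throw new Error(`Not a base32 CIDv1 string: ${rootCid}`);
  }
  return { ...metadata, ipfsRootCid: rootCid };
}
```

A caller would pass the resulting object wherever the SDK ends up accepting Piece metadata, for example `withIpfsRootCid({}, rootCid)` with the root CID produced when the content was packed into a CAR file.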
Initial development cost
3 days
- Change subgraph and deploy our instance (1/2d)
- Update Synapse SDK (1/2d)
- Create HTTP proxy (bridging subgraph ↔ IPNI delegated routing spec) (1/2d)
- One-time E2E test
- Who can help us test this? Shipyard team?
- Work with FF/FO to take over subgraph as public good
- Get cid.contact to update to query our instance in parallel to other indexes (e.g., Storacha's)
Ongoing operational cost
$50/m + maybe subgraph cost
- We hope FF can host the subgraph
- HTTP proxy
- Domain: $2/m
- CF Pro plan: $50/m
- Dev time (unknown)
=== WHAT WE WOULD WANT TO BUILD IF WE HAVE A LOT OF TIME ===
Requirements
- We want to avoid repeating the history of IPNI, which became a centralised service that everyone in the ecosystem relies on. It must be easy for any team in the Filecoin or IPFS ecosystems to stand up their own instance of a delegated router, with no cooperation needed from the teams running the existing instances.
- The router must index all Warm Storage deals, regardless of whether they include CDN service or not.
- The router must be open to extensions in the future to allow CDN services other than Filecoin Beam to be included in the index.
- Monetization. There must be a way to pay the Delegated Router service provider.
Proposal - WIP
Let’s call the project FDR - Filecoin Delegated Router.
Stream of thoughts:
- In order to start an FDR replica, we need to ingest deals created in the past. Glif RPC API provides access only to the last ~18 hours. If we wanted to query all events, we would have to pay for an archive node, which is expensive. A more efficient solution is to let new instances sync the state from existing instances, but then we need to build this sync mechanism and solve verifiability. We already have a tool we can use for this - Subgraph.
- Then we need FDR to listen for deal and provider-registry changes.
- One option is to use Glif RPC API to listen for events.
- Another option is to periodically query a singleton FDR Subgraph operated by FF to get recently added deals. It’s not clear how to discover deleted deals, though.
- We can also create a Goldsky pipeline invoking webhooks when a piece is added or deleted. Each FDR instance needs to set up its own pipeline. It’s not clear whether it’s possible to share the same underlying subgraph or whether each FDR instance would need to clone the subgraph as well - that would be inefficient and expensive.
- An easy option is to let the subgraph provide the routing information, and implement FDR only as a thin bridge between Delegated Routing protocol and GraphQL provided by Subgraphs. Downside: this will not work once we move from indexing root blocks to indexing all blocks in a piece.
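The periodic-polling option from the list above could look roughly like the following sketch. It is illustrative only: `IndexedPiece`, its `createdAtBlock` field, and the `FetchPiecesSince` callback are hypothetical, standing in for whatever query the shared subgraph would expose. As noted above, this approach does not discover deleted deals, and the sketch does not attempt to.

```typescript
// Hypothetical record returned by the subgraph for each newly added piece.
interface IndexedPiece {
  ipfsRootCid: string;
  serviceUrl: string;
  createdAtBlock: number;
}

// Hypothetical fetcher: returns pieces created after the given block.
type FetchPiecesSince = (afterBlock: number) => Promise<IndexedPiece[]>;

class RootCidIndex {
  private readonly urlsByCid = new Map<string, Set<string>>();
  private readonly fetchSince: FetchPiecesSince;
  private lastBlock = 0;

  constructor(fetchSince: FetchPiecesSince) {
    this.fetchSince = fetchSince;
  }

  // Merge a batch of pieces into the index and advance the block cursor.
  applyBatch(pieces: IndexedPiece[]): number {
    for (const piece of pieces) {
      let urls = this.urlsByCid.get(piece.ipfsRootCid);
      if (!urls) {
        urls = new Set<string>();
        this.urlsByCid.set(piece.ipfsRootCid, urls);
      }
      urls.add(piece.serviceUrl);
      this.lastBlock = Math.max(this.lastBlock, piece.createdAtBlock);
    }
    return pieces.length;
  }

  // One polling round: fetch everything newer than the cursor and apply it.
  async syncOnce(): Promise<number> {
    return this.applyBatch(await this.fetchSince(this.lastBlock));
  }

  // Service URLs of all providers known to store the given root CID.
  lookup(cid: string): string[] {
    return Array.from(this.urlsByCid.get(cid) ?? []);
  }

  // Highest block number seen so far.
  get cursor(): number {
    return this.lastBlock;
  }
}
```

An FDR instance would call `syncOnce()` on a timer. Because the cursor starts at zero, a fresh instance backfills past deals through the same code path it later uses for incremental updates.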
One CAR file can contain more than one tree (root CID). Ideally, the Piece metadata should support this by allowing an array of root CIDs, not just one CID.
We need to update Synapse SDK:
- Minimal: allow clients to provide additional Piece metadata
- Nice DX: add uploadDirectory to Synapse SDK, e.g. by integrating Helia into Synapse SDK.
Alternatives Considered
- Use the Filecoin Beam architecture based on Cloudflare Workers, D1 & KV storage, and Goldsky webhook pipelines. There are too many moving parts in this architecture, which makes the task of setting up a new instance too complex.
Future work
- How to extend FDR to support any content stored on PDP, e.g. FWSS alternatives. Currently, PDP is missing a link to Provider Registry & product definitions that contain SP’s service URL.
- Payment mechanism to cover the costs of providing FDR. Ideally, this mechanism should not be tied to a single FDR instance.