Abstract
We present BioRouter v2.0, a decentralized biodata routing protocol that provides a unified, authenticated gateway for patient-owned genomic and clinical datasets. Building on the BioCID addressing and four-tier authorization model introduced in v1.0, this release adds four major capabilities: (1) BioRoutes, an on-chain DNS system that maps stable biocid identities to mutable storage URIs with cryptographic fingerprint verification, event-driven migration propagation, and self-healing route refresh; (2) privacy-preserving bilinear group accumulators on BLS 12-381, implementing the three-stage Matcher/Privacy-Ticket/Resolver protocol from Buchanan, Grierson & Uribe (ICISSP 2024) for SNP membership queries that leak only a yes/no answer; (3) BioWallet, an ERC-4337 smart wallet with P-256 WebAuthn passkey signing and 96-SNP DNA social recovery via BioRecovery; and (4) BioAssetVault, an ERC-1155 parent/child hierarchy anchoring 84,200 biodata files from five origin laboratories (GenoBank, Somos, Neochromosome, TecBase, Augenomics) with 215,692 inventoried objects totaling 18.6 TB. All 14 smart contracts are deployed on the Sequentias Network (Chain ID 15132025). The live protocol is deployed at biorouter.genobank.app.
Executive Summary (Non-Technical)
Genomic data is the most personal information a human being produces, yet patients have no control over where it lives, who accesses it, or whether it moves between servers without their knowledge. BioRouter solves this by acting as the DNS of biodata: every file has a stable address (BioCID) that works regardless of which cloud bucket the bytes currently reside in, just as a domain name works regardless of which server hosts the website.
In v2.0, every storage location change is recorded on a blockchain, creating an auditable migration history. A cryptographic fingerprint proves the file has not been tampered with. If a storage URI goes dead, the system self-heals by promoting a backup route automatically. Patients can now authenticate with a fingerprint or face scan on their phone (BioWallet passkeys)—no browser extensions, no seed phrases. And if they lose access entirely, their DNA itself serves as the recovery key through a 96-SNP match.
Researchers can query whether a patient's genome contains a specific variant without ever seeing the genome. The system answers yes or no—nothing more—using a mathematical structure (bilinear group accumulator) that is provably impossible to reverse-engineer. If the answer is yes and the researcher wants access, they must obtain a time-limited Privacy Ticket that the patient can revoke at any time.
As of April 2026, BioRouter manages 84,200 biodata files (BAM, VCF, FASTQ, SQLite, CRAM, BED, gVCF) totaling 18.6 TB across five laboratories, all anchored on-chain with cryptographic fingerprints.
1. Introduction
Version 1.0 of BioRouter, published March 2026, established the BioCID addressing scheme and four-tier authorization cascade for patient-owned genomic data. That release solved the access control problem: who can read a biodata file, and under what conditions. It did not solve the location problem: where is the file right now, has it been tampered with, and what happens when storage infrastructure changes.
The gap became operational during GenoBank's AWS-to-GCS migration in
early 2026. Storage URIs lived in MongoDB fields that were never mirrored
on-chain. When files moved between buckets, clients with cached paths
received 404 errors with zero cryptographic signal that the location had
changed. The existing BioRouter.sol contract emitted no event
on URI mutation. This silent-breakage class affected every integration
partner simultaneously.
Version 2.0 addresses this with four new subsystems. BioRoutes (Section 6) is an on-chain DNS that maps stable identities to mutable storage locations with event-driven propagation and self-healing refresh. Privacy-preserving bilinear group accumulators (Section 7) implement the three-stage Matcher/Privacy-Ticket/Resolver protocol from the ICISSP 2024 paper [4a], enabling SNP membership queries that reveal only yes/no without leaking variant data. BioWallet (Section 9) replaces browser-extension wallets with ERC-4337 smart accounts authenticated by P-256 WebAuthn passkeys, with 96-SNP DNA social recovery. BioAssetVault (Section 10) provides the ERC-1155 parent/child hierarchy that anchors 84,200 biodata files from five origin laboratories with cryptographic fingerprints.
This whitepaper describes the complete BioRouter protocol stack as deployed in production. All 14 smart contracts are live on the Sequentias Network (Chain ID 15132025). The protocol manages 18.6 TB of genomic data across 215,692 inventoried objects, with 197,300 routes registered on-chain as of this writing.
2. Background and Related Work
2.1 Content-Addressable Storage
Content-addressable naming was formalized by Mazières and Kaashoek [1] and popularized by IPFS's CID scheme [2]. BioCIDs extend this pattern to biodata by prepending semantic metadata (agent, owner wallet, data type) to the filename, producing addresses that encode provenance without encoding storage location. Unlike IPFS CIDs, BioCIDs do not commit to a specific storage backend, enabling migration between storage systems while preserving persistent identifiers.
2.2 Privacy-Preserving SNP Accumulators
Buchanan, Grierson & Uribe [4a] introduced a batch-mode bilinear group
accumulator for privacy-aware SNP matching at ICISSP 2024. The accumulator
acc_X = g^(Π(s+H(x))) commits to a polynomial over the
SNP set without revealing any individual variant. Membership witnesses
verify via the pairing equality
e(wit_xi, h^{H(xi)}·h^s) = e(acc_X, h) on BLS 12-381.
Measured overhead: 0.87 ms per single SNP add, 5.65 s for a full 640k-SNP
genome, 10.90 ms per witness verification. Bloom filters serve as a cheap
negative pre-check: "definitely not stored" in <5 ms before the
accumulator is consulted. This paper is the technical spine of Section 7.
2.3 ERC-4337 Account Abstraction
ERC-4337 [5] decouples transaction validation from the EOA model by introducing UserOperations, an EntryPoint contract, and paymaster-sponsored gas. P-256 (secp256r1) signature verification enables WebAuthn passkeys as wallet signers, eliminating seed phrases entirely. BioWallet (Section 9) is a production implementation of this standard on the Sequentias Network with a domain-specific extension: 96-SNP DNA social recovery via the BioRecovery contract, implementing US Patent 11,915,808 [6].
2.4 x402 Micropayments and ERC-8004 AI Agents
The x402 protocol [7] operationalizes HTTP 402 as a machine-readable micropayment negotiation mechanism. ERC-8004 [8] defines on-chain AI agent identity registration. BioRouter uses x402 for Tier 4 payment authorization and ERC-8004 for agent attribution in BioCIDs (unchanged from v1.0).
2.5 Story Protocol and Programmable IP Licensing
Story Protocol [9] provides on-chain IP registration and the Programmable IP License (PIL) framework. GenoBank extends Story's four standard PIL templates with five genomic-specific BioPIL licenses: GDPR-research #5, AI-training #6, clinical-use #7, pharma-research #8, and family-inheritance #9. Story Protocol licenses are permanent; BioPIL licenses are revocable—the requirement for GDPR Article 17 compliance.
3. BioCID Addressing Scheme
3.1 Format Specification
A BioCID is a structured string identifier with the following grammar:
BioCID := "Biocid:" bioagent "/" owner_biowallet "/" biodata_type "/" dataset
bioagent := "user" | "gc-" HEX{8}
owner_biowallet := CHECKSUMMED_ETH_ADDRESS ; EIP-55 checksum
biodata_type := dtc-type | sequence-type | clinical-type | derived-type
dtc-type := "dtc-genotype"
sequence-type := "vcf" | "bam" | "fastq" | "cram" | "gvcf" | "bed"
clinical-type := "fhir" | "clinical-report"
derived-type := "ancestry-result" | "sqlite"
dataset := FILENAME ; original filename, no path components
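The grammar above can be checked mechanically. The following is a minimal sketch, not the production parser: the names `BioCID` and `parse_biocid` are hypothetical, and the wallet check only validates the `0x` + 40-hex shape. Verifying the EIP-55 checksum needs Keccak-256, which the Python standard library does not provide, so it is left out here.

```python
import re
from dataclasses import dataclass

# Allowed biodata_type values, copied from the grammar above.
BIODATA_TYPES = {
    "dtc-genotype",
    "vcf", "bam", "fastq", "cram", "gvcf", "bed",
    "fhir", "clinical-report",
    "ancestry-result", "sqlite",
}

_AGENT_RE = re.compile(r"user|gc-[0-9a-fA-F]{8}")
# Shape check only; the EIP-55 checksum is NOT verified in this sketch.
_WALLET_RE = re.compile(r"0x[0-9a-fA-F]{40}")


@dataclass(frozen=True)
class BioCID:
    bioagent: str
    owner_biowallet: str
    biodata_type: str
    dataset: str


def parse_biocid(text: str) -> BioCID:
    """Split a BioCID string into its four fields; raise ValueError on bad input."""
    prefix = "Biocid:"
    if not text.startswith(prefix):
        raise ValueError("missing 'Biocid:' prefix")
    parts = text[len(prefix):].split("/")
    if len(parts) != 4:
        # More than four parts means the dataset contained a path component.
        raise ValueError("expected bioagent/owner_biowallet/biodata_type/dataset")
    agent, wallet, dtype, dataset = parts
    if not _AGENT_RE.fullmatch(agent):
        raise ValueError(f"invalid bioagent: {agent!r}")
    if not _WALLET_RE.fullmatch(wallet):
        raise ValueError(f"invalid owner wallet: {wallet!r}")
    if dtype not in BIODATA_TYPES:
        raise ValueError(f"unknown biodata_type: {dtype!r}")
    if not dataset:
        raise ValueError("empty dataset name")
    return BioCID(agent, wallet, dtype, dataset)
```

Because the owner wallet sits in a fixed position, a caller can read `parse_biocid(s).owner_biowallet` without consulting any database, which is the Owner Attribution property described in Section 3.2.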
3.2 Design Properties
- Storage Independence. The BioCID does not encode a GCS bucket, object path, or storage backend. The mapping from BioCID to physical storage is maintained by the BioRoutes on-chain DNS (Section 6) and cached in MongoDB.
- Owner Attribution. The checksummed wallet address in position 2 is the canonical data owner, auditable by parsing the BioCID string alone.
- Agent Attribution. AI-generated derived datasets are attributed to the specific ERC-8004 registered agent via the gc-{first8hex} prefix.
- Collision Resistance. The 50-SNP SHA-256 fingerprint stored in BioAssetVault.contentHash differentiates files with identical names.
- Migration Resilience. When storage moves from AWS S3 to GCS to any future backend, the BioCID remains unchanged. Only the route record updates.
4. System Architecture
[Figure: BioRouter system architecture. Clients (BioWallet with passkeys, researcher web app, GenoClaw AI agent registered under ERC-8004, biofs CLI) connect through the Cloudflare edge (DDoS protection, TLS) to the BioRouter service (CherryPy, Python 3.12). The service comprises the auth layer (recover_wallet() plus the four-tier cascade), the Matcher service (Bloom + accumulator), Privacy-Record Match (Privacy Ticket mint), the Resolver service (ticket to presigned URL), the BioRoutes API (resolve / verify / dispute), and upload / download / stream / grant / revoke. The auth layer consults BioAssetVault, BioRouter.sol, BioNFTCredentials, BioDataRouter, and BioAgentRegistry on the Sequentias Network (Chain ID 15132025); the BioRoutes API talks to BioRoutes.sol; BioWallet submits through SequentiaEntryPoint with gas sponsored by BioPaymaster. The chain layer also hosts BioRecovery, BIOIPToken, and BioAssetPricingOracle. Storage is Google Cloud Storage (AES-256, 60 buckets) plus MongoDB (registry and routes cache). The five origin laboratories deposit into GCS: GenoBank (167,951 files), Somos (46,792), Neochromosome (423), TecBase (288), Augenomics (238).]
4.1 Signature-Based Authentication
Every API request is authenticated by a user_signature
parameter—the ECDSA result of signing "I want to proceed"
with the client's private key. The server calls recover_wallet()
to derive the checksummed wallet address. No session tokens, API keys, or
passwords are required. BioWallet users authenticate via WebAuthn passkeys
(Section 9); the P-256 signature is validated on-chain by the
SequentiaEntryPoint contract.
4.2 Storage Isolation
All biofiles are stored across 60 GCS buckets using a wallet-indexed
hierarchy that is never exposed to clients. The internal
path structure is maintained in the bioroutes.inventory
MongoDB collection and the BioRoutes.sol on-chain event log.
Clients interact exclusively with BioCIDs.
5. Four-Tier Authorization Model
Every download and stream request passes through the following authorization cascade. Tiers are evaluated in order; the first passing tier grants access.
[Figure: four-tier authorization flow. The request signature is recovered to wallet W, then each tier is tested in turn. Tier 1: owner_wallet == W? Tier 2: active BioNFT consent (status=active, not expired)? Tier 3: on-chain NFT holder (ERC-721 ownerOf() or Story Protocol License Token)? Tier 4: x402 payment (BioDataRouter.hasAccess())? The first YES grants access and streams from GCS. If all four answer NO, the service returns HTTP 402 Payment Required with price_wei and the contract address.]
- Tier 1: Data Owner. Cryptographic proof via signature recovery. The wallet that uploaded the file always retains full access with no expiration.
- Tier 2: BioNFT Consent. Owner-granted permittee access with configurable duration and license type (BioPIL #5-9), revocable at any time. This is Metamorphic Consent.
- Tier 3: NFT Token Holder. On-chain ERC-721 ownership or Story Protocol License Token. Permissions are transferable; burning the NFT revokes access.
- Tier 4: x402 Payment. BioDataRouter on-chain settlement. 95% to patient, 5% to protocol. Immutable smart contract enforcement.
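The cascade's first-pass-wins behavior can be sketched in a few lines. This is an illustrative sketch only: `AccessChecks` and `authorize` are hypothetical names, and each callable stands in for a lookup the service would perform (signature recovery, consent record, on-chain ownerOf(), BioDataRouter.hasAccess()).

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AccessChecks:
    """Hypothetical per-request lookups; each returns True when its tier passes."""
    is_owner: Callable[[], bool]            # Tier 1: owner_wallet == W
    has_active_consent: Callable[[], bool]  # Tier 2: BioNFT consent, active and not expired
    holds_nft: Callable[[], bool]           # Tier 3: ERC-721 ownerOf() or license token
    has_paid: Callable[[], bool]            # Tier 4: BioDataRouter.hasAccess()


def authorize(checks: AccessChecks) -> Optional[int]:
    """Return the number of the first passing tier, or None.

    Tiers are evaluated lazily and in order, so a later (more expensive)
    lookup is never made once an earlier tier has granted access.
    A None result is where the caller would answer HTTP 402.
    """
    tiers = [
        checks.is_owner,
        checks.has_active_consent,
        checks.holds_nft,
        checks.has_paid,
    ]
    for number, check in enumerate(tiers, start=1):
        if check():
            return number
    return None
```

Passing callables rather than precomputed booleans keeps the on-chain lookups for Tiers 3 and 4 off the hot path when the requester is the owner or holds an active consent.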
Traditional consent is binary and static. Metamorphic Consent recognizes that consent over high-value data is an ongoing relationship with economic, temporal, and scope dimensions. A patient may consent to research use for 365 days, revoke after 180 days upon learning the research was commercialized, or expand consent to include AI training in exchange for a Biodata Dividend. The consent record is a living contract, not a checkbox.
6. BioRoutes: The DNS of Biodata
BioRoutes is the storage resolution layer introduced in v2.0. It solves the fundamental problem that a biocid is a stable identity but its storage URI is a mutable record. Just as DNS maps domain names to IP addresses that change when servers migrate, BioRoutes maps biocids to storage URIs that change when files move between buckets, regions, or cloud providers.
6.1 Inventory Oracle (Phase G.0)
Before any on-chain anchoring, a comprehensive inventory of every byte
under management was constructed. The Inventory Oracle walked 60 GCS
buckets, cross-referenced every object against seven MongoDB collections
(bioip_registry, biosamples, genotypes,
dnaandme_kits, parabricks_jobs,
opencravat_jobs, biovault), and produced a reconciled inventory of 215,692 objects totaling 18.6 TB.
Every biodata file received a 50-SNP SHA-256 fingerprint (the canonical
contentHash stored on-chain in BioAssetVault), a 10k-bit
Bloom filter for membership pre-check, and an owner wallet assignment.
Files without an existing owner received a deterministic custodial wallet
derived via HKDF from the fingerprint, implementing the paper-wallet branch
of US Patent 11,915,808 step 230. Zero orphans remained after reconciliation.
6.2 On-Chain Anchoring (Phase G.1)
The BioRoutes.sol contract, deployed at
0xF758e2b3c4774F0f7e7D95eAa4c265b258d14bAD, is a sibling to
the existing BioRouter.sol—isolated so route schema
changes never threaten live ownership data.
// SPDX-License-Identifier: MIT
// Sequentias Network — Chain ID 15132025
// Address: 0xF758e2b3c4774F0f7e7D95eAa4c265b258d14bAD
contract BioRoutes is Ownable {
    enum RouteStatus { ACTIVE, STALE, DISPUTED, UNREACHABLE, DECOMMISSIONED }
    enum Tier { PRIMARY, SECONDARY, ARCHIVE, MIRROR }

    struct Route {
        bytes32 contentHash;    // mirrors BioAssetVault.children[id].contentHash
        string storageURI;      // gs://... or https://...
        Tier tier;
        address registeredBy;
        uint64 registeredAt;
        uint64 lastVerifiedAt;
        RouteStatus status;
    }

    mapping(bytes32 => Route[]) public routesFor; // key = keccak256(biocid)

    event RouteRegistered(bytes32 indexed biocidKey, string storageURI, Tier tier, bytes32 contentHash);
    event RouteMigrated(bytes32 indexed biocidKey, string fromURI, string toURI, bytes32 contentHash);
    event RouteVerified(bytes32 indexed biocidKey, string storageURI, bytes32 observedHash);
    event RouteDisputed(bytes32 indexed biocidKey, string storageURI, bytes32 claimedHash, bytes32 observedHash);
    event RouteDecommissioned(bytes32 indexed biocidKey, string storageURI, string reason);
}
The on-chain anchoring minted 57,582 child BioAssets on BioAssetVault and registered 197,300 routes on BioRoutes.sol, each linking a biocid to its current GCS URI and contentHash. The minting completed at 4.0 tx/s with zero errors; route registration proceeded at 3.8 tx/s.
Every route carries a contentHash fingerprint, the same 50-SNP SHA-256 stored in BioAssetVault, so any future retrieval can be verified against the on-chain record. A RouteDisputed event is emitted automatically if the retrieved bytes do not match.
6.3 Route Resolution and Fallback Chain
On resolve, the system evaluates routes in priority order:
- PRIMARY: latest RouteRegistered URI. If 200 OK, serve.
- SECONDARY: latest RouteMigrated target URI.
- MIRROR: any RouteMirror event from a lab custodian.
- PEER DISCOVERY: broadcast fingerprint to registered lab custodians on BioNFTCredentials.
- If all fail and fingerprint of retrieved bytes ≠ on-chain fingerprint → emit RouteDisputed automatically and abort.
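The fallback chain can be sketched as a priority-ordered loop. This is one possible reading of the steps above, under stated assumptions: `Route`, `resolve`, and `RouteUnavailable` are hypothetical names, plain SHA-256 over the fetched bytes stands in for the 50-SNP fingerprint, the `disputes` list stands in for emitting RouteDisputed, and peer discovery is omitted.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Priority order from the resolution steps; peer discovery is not modeled.
TIER_PRIORITY = {"PRIMARY": 0, "SECONDARY": 1, "MIRROR": 2}


@dataclass
class Route:
    storage_uri: str
    tier: str
    content_hash: str  # expected hex digest (plain SHA-256 here, an assumption)


class RouteUnavailable(Exception):
    """Raised when no route returned bytes matching the expected fingerprint."""


def resolve(
    routes: List[Route],
    fetch: Callable[[str], Optional[bytes]],
    disputes: List[str],
) -> Tuple[Route, bytes]:
    """Try routes in tier priority order and return the first verified one.

    `fetch` returns the object bytes, or None when the URI is unreachable.
    A reachable route whose bytes do not match is recorded in `disputes`
    and skipped, so a tampered copy is never served.
    """
    candidates = [r for r in routes if r.tier in TIER_PRIORITY]
    for route in sorted(candidates, key=lambda r: TIER_PRIORITY[r.tier]):
        data = fetch(route.storage_uri)
        if data is None:
            continue  # dead URI: fall through to the next tier
        if hashlib.sha256(data).hexdigest() == route.content_hash:
            return route, data
        disputes.append(route.storage_uri)
    raise RouteUnavailable("no route served bytes matching the on-chain fingerprint")
```

Verifying before returning means a client never has to trust the storage backend: the on-chain hash is the only authority on what the bytes should be.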
6.4 Self-Healing Propagation
A systemd refresher runs every 6 hours. For routes with lastVerifiedAt
older than 90 days, it fetches the first 4 MiB, computes the fingerprint,
and compares to contentHash. Match → RouteVerified
event (refreshes timestamp). Mismatch → RouteDisputed + ops alert.
Fetch 404/timeout after 3 consecutive failures → RouteDecommissioned.
Routes older than 90 days without verification are marked STALE
and require re-verification before serve.
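A single refresher pass over one route might look like the sketch below. The thresholds (90 days, 4 MiB, 3 consecutive failures) come from the text; everything else is an assumption for illustration: `RouteState` and `refresh` are hypothetical names, `fetch_prefix` is a caller-supplied function, and plain SHA-256 of the fetched prefix stands in for the fingerprint computation.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Optional

STALE_AFTER_S = 90 * 24 * 3600   # re-verify routes not checked for 90 days
SAMPLE_BYTES = 4 * 1024 * 1024   # fetch only the first 4 MiB
MAX_FAILURES = 3                 # consecutive fetch failures before decommission


@dataclass
class RouteState:
    storage_uri: str
    content_hash: str            # expected hex digest (plain SHA-256, an assumption)
    last_verified_at: int        # Unix seconds
    consecutive_failures: int = 0


def refresh(
    route: RouteState,
    now: int,
    fetch_prefix: Callable[[str, int], Optional[bytes]],
) -> Optional[str]:
    """Run one refresher pass; return the event name to emit, or None."""
    if now - route.last_verified_at <= STALE_AFTER_S:
        return None  # verified recently enough, nothing to do
    data = fetch_prefix(route.storage_uri, SAMPLE_BYTES)
    if data is None:  # 404 or timeout
        route.consecutive_failures += 1
        if route.consecutive_failures >= MAX_FAILURES:
            return "RouteDecommissioned"
        return None
    route.consecutive_failures = 0
    if hashlib.sha256(data).hexdigest() == route.content_hash:
        route.last_verified_at = now  # refresh the timestamp
        return "RouteVerified"
    return "RouteDisputed"  # mismatch: caller also raises an ops alert
```

Counting failures on the route record, rather than retrying inside one pass, spreads the three attempts across separate 6-hour runs so a short outage does not decommission a healthy route.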
7. Privacy Primitives: Bloom Filters, Bilinear Accumulators, and Privacy Tickets
BioRouter's privacy model implements the three-stage protocol from Buchanan, Grierson & Uribe, ICISSP 2024 [4a]. The primary primitive is a bilinear group accumulator—not a Bloom filter. The Bloom filter is a cheap pre-filter for the "definitely not stored" fast path; the accumulator is the cryptographically-binding source of truth.
[Figure: three-stage privacy protocol. Stage 1 (Matcher): a Bloom pre-check runs first. Any negative returns MATCH: NO (confidence: bloom-certain-negative). If all are positive, accumulator witness verification (the pairing check) runs: MATCH: YES (confidence: accumulator-certain) when all witnesses pass, MATCH: NO (confidence: accumulator-rejected) when any fails. Stage 2 (Privacy-Record Match): a YES mints an EIP-712 signed Privacy Ticket carrying scope, TTL, and revocation. Stage 3 (Resolver): the ticket is consumed for a presigned URL plus masking rules.]
7.1 Bloom Filter Pre-Check
Each SNP-bearing file produces a 10,000-bit Bloom filter with 4 hash
functions (false positive rate ~3% at typical variant counts). The filter
answers "definitely NOT stored" in <5 ms. A Bloom negative is final
and terminates the query immediately without touching the accumulator.
Bloom filters are stored in MongoDB bioroutes.bloom_index
and never expose raw variant data.
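To make the parameters concrete, here is a minimal Bloom filter sketch with m = 10,000 bits and k = 4 hash functions. The hashing scheme (two 64-bit values taken from one SHA-256 digest, combined by double hashing) is an assumption chosen for brevity, not the production scheme. The helper `false_positive_rate` uses the standard approximation (1 - e^(-kn/m))^k, under which a rate near 3% corresponds to roughly 1,350 inserted elements.

```python
import hashlib
import math

M_BITS = 10_000   # filter size from the text
K_HASHES = 4      # number of hash functions from the text


class BloomFilter:
    def __init__(self, m: int = M_BITS, k: int = K_HASHES) -> None:
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, item: str) -> list:
        # Double hashing: derive k bit positions from two 64-bit halves
        # of a single SHA-256 digest (an illustrative choice).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely not stored"; True means "possibly stored".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def false_positive_rate(n: int, m: int = M_BITS, k: int = K_HASHES) -> float:
    """Standard approximation of the false positive rate after n insertions."""
    return (1.0 - math.exp(-k * n / m)) ** k
```

A Bloom filter has no false negatives, which is why a negative answer can terminate the query, while a positive answer still has to be confirmed by the accumulator.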
7.2 Bilinear Group Accumulator
The accumulator construction follows Nguyen 2005 + Vitto-Biryukov 2022 batch mode on BLS 12-381:
acc_X = g^(product of (s + H(x)) for all x in X)
where:
g in G_1, h in G_2 -- BLS 12-381 generators
s in Z_p* -- trapdoor (sharded by epoch x tier x lab)
H: {0,1}* -> F_p* -- collision-resistant hash
Membership witness:
wit_xi = g^(product of (s + H(x)) for all x in X\{xi})
Verification (pairing check):
e(wit_xi, h^{H(xi)} * h^s) == e(acc_X, h)
SNP input encoding follows the paper's canonical form:
"<rsid> <chrom> <pos> <genotype>"
(e.g., "rs12564807 1 734462 AA"). The encoding grammar
extends the paper to cover WGS novel variants, INDELs, and multi-allelic
sites via explicit kind discriminators (SNP, NOVEL, INDEL, MULTI).
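The accumulator algebra can be exercised without a pairing library. The sketch below is a toy, under loudly stated assumptions: it works in the multiplicative group modulo the Mersenne prime 2^127 - 1 rather than on BLS 12-381, and it verifies with the trapdoor s directly (wit^(s + H(x)) == acc) instead of with a pairing. The pairing check in this section establishes the same relation for a verifier who only holds h^s. All function names are hypothetical.

```python
import hashlib

P = (1 << 127) - 1   # Mersenne prime; toy modulus, NOT BLS 12-381
G = 3                # toy base element
ORDER = P - 1        # exponents may be reduced mod P-1 (Fermat's little theorem)


def encode_snp(rsid: str, chrom: str, pos: int, genotype: str) -> str:
    """Canonical form from the text: '<rsid> <chrom> <pos> <genotype>'."""
    return f"{rsid} {chrom} {pos} {genotype}"


def H(x: str) -> int:
    """Hash an encoded SNP to an exponent."""
    return int.from_bytes(hashlib.sha256(x.encode("utf-8")).digest(), "big") % ORDER


def accumulate(s: int, items: list) -> int:
    """acc = G^(product of (s + H(x)) for x in items) mod P."""
    exponent = 1
    for x in items:
        exponent = exponent * (s + H(x)) % ORDER
    return pow(G, exponent, P)


def witness(s: int, items: list, xi: str) -> int:
    """Witness for xi: the accumulator over every item except xi."""
    return accumulate(s, [x for x in items if x != xi])


def verify(s: int, acc: int, wit: int, xi: str) -> bool:
    """Trapdoor-side check: raising the witness to (s + H(xi)) restores acc."""
    return pow(wit, (s + H(xi)) % ORDER, P) == acc
```

The witness is the accumulator with one factor missing, so supplying the missing factor (s + H(xi)) is exactly what proves membership; the accumulator value itself is a single group element regardless of set size.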
The trapdoor s is sharded by (epoch_id, tier, lab_serial)
via HKDF-SHA256, so a compromise of one shard's trapdoor forces a rebuild
of only that quarter/tier/lab slice. The accumulator polynomial cannot be
inverted to recover the SNP set—this is the core privacy property
proven in the paper.
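Per-shard trapdoor derivation with HKDF-SHA256 can be sketched using only the standard library. `hkdf_sha256` follows the extract-then-expand construction of RFC 5869; `shard_trapdoor`, the `|`-separated info string, the 48-byte output length, and the idea of a single master secret are assumptions made for illustration, not details given in the text.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    if not salt:
        # RFC 5869: an absent salt is treated as HashLen zero bytes.
        salt = bytes(hashlib.sha256().digest_size)
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def shard_trapdoor(
    master_secret: bytes,
    epoch_id: str,
    tier: str,
    lab_serial: str,
    group_order: int,
) -> int:
    """Derive a nonzero scalar for one (epoch_id, tier, lab_serial) shard.

    The info-string layout is a hypothetical choice. Drawing 48 bytes before
    reducing keeps the modular bias small for a ~255-bit group order.
    """
    info = f"{epoch_id}|{tier}|{lab_serial}".encode("utf-8")
    raw = hkdf_sha256(master_secret, b"", info, 48)
    return int.from_bytes(raw, "big") % (group_order - 1) + 1
```

Because each shard's scalar depends on its own info string, learning one shard's trapdoor gives no handle on the others, which is what limits a rebuild to a single slice.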
| Operation | Paper Benchmark | BioRouter Budget (2x) |
|---|---|---|
| Single SNP add | 0.87 ms | 1.74 ms |
| 100k SNP batch (batch-100) | 0.87 s | 1.74 s |
| 640k SNP full genome | 5.65 s | 11.30 s |
| Witness creation | 0.86 ms | 1.72 ms |
| Witness verification | 10.90 ms | 21.80 ms |
7.3 Privacy Tickets
"Privacy Ticket" is Daniel Uribe's term from the ICISSP 2024 paper: "Alice will be delivered a privacy ticket which will be used to resolve the actual details of SNPs gathering within the resolver service." A Privacy Ticket is an EIP-712 signed JWT containing:
{
"ticketId": "uuid-v4",
"paperCitation": "ICISSP2024-10.5220/0012455300003648",
"ownerWallet": "0x...",
"requesterWallet": "0x...",
"scope": ["vcf", "biocid://0x.../vcf/x"],
"matcherEvidence": "hash-of-matcher-Y-response",
"accumulatorRef": "bioroutes.accumulators._id",
"issuedAt": 1745000000,
"expiresAt": 1745000900,
"maxResolutions": 1,
"revocationContract":"0xfbEf8e795e6306a23F16d5e0Dc480b89F1D316Bf",
"revocationTokenId": "uint256"
}
The owner's EIP-712 signature binds the ticket to a specific BioNFTCredentials token. Revocation = owner burns the credentials token → Resolver rejects the ticket on its next chain-check. Tickets have a maximum TTL of 900 seconds (15 minutes) and are single-use by default.
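The checks a Resolver would apply before honoring a ticket can be sketched as follows. This is a hedged illustration: `ticket_is_usable` is a hypothetical name, the field names come from the JSON above, `token_burned` is a caller-supplied stand-in for the on-chain BioNFTCredentials lookup, and EIP-712 signature verification is deliberately left out.

```python
from typing import Any, Callable, Mapping

MAX_TTL_S = 900  # maximum ticket lifetime from the text (15 minutes)


def ticket_is_usable(
    ticket: Mapping[str, Any],
    now: int,
    resolutions_used: int,
    token_burned: Callable[[str, Any], bool],
) -> bool:
    """Return True when a Privacy Ticket may still be resolved.

    Signature verification is NOT performed here; this sketch covers only
    the TTL, expiry, use-count, and revocation checks.
    """
    ttl = ticket["expiresAt"] - ticket["issuedAt"]
    if ttl <= 0 or ttl > MAX_TTL_S:
        return False  # malformed or over-long lifetime
    if not (ticket["issuedAt"] <= now < ticket["expiresAt"]):
        return False  # not yet valid, or expired
    if resolutions_used >= ticket["maxResolutions"]:
        return False  # single-use by default
    if token_burned(ticket["revocationContract"], ticket["revocationTokenId"]):
        return False  # owner revoked by burning the credentials token
    return True
```

Placing the burn check last keeps the chain lookup off the path for tickets that are already expired or spent.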
Match thresholds from the paper: ~25 matches for a paternity test, ~44 SNPs
for a person-level match, ~650k SNPs for a full individual. These are
pre-populated as min_match_ratio tier presets in the Matcher API.
8. Smart Contract Registry (Sequentias Network)
Fourteen smart contracts on the Sequentias Network underpin BioRouter's
on-chain data layer. All contracts use Solidity 0.8.28 with
viaIR enabled and OpenZeppelin v5.6.
| Contract | Address | Standard | Purpose |
|---|---|---|---|
| BIOIPToken | 0xEE53dAAf7AF86E47bc3155b0642c41a30F1A5d06 | ERC-20 | Governance and valuation token (1B supply) |
| BioAssetVault | 0x2fd98bFF77571F1338bf1F44E68b80Be77205850 | ERC-1155 | Parent/child hierarchy. 84,200 biodata assets. |
| BioRouter | 0xc92f9f1D68A445189Ad3ad28524186A11Be30DcA | Custom | Privacy-preserving opaque bioipId routing |
| BioRoutes | 0xF758e2b3c4774F0f7e7D95eAa4c265b258d14bAD | Custom | DNS of biodata. Route registration, migration, verification, dispute. |
| BioRecovery | 0x1555eC4e6397645147595918657b713Ebb5c8fBD | Custom | 96-SNP DNA attestation. US Patent 11,915,808. |
| BioNFTCredentials | 0xfbEf8e795e6306a23F16d5e0Dc480b89F1D316Bf | ERC-721 | Credential/consent tokens. Privacy Ticket revocation anchor. |
| BioAgentRegistry | 0x24e634E570Ca8aE366aF4ae8861492a1e9B06B6B | ERC-721 + ERC-8004 | AI agent identity registration (gc-{first8hex}) |
| BioAssetPricingOracle | 0x25d419259d7336a4d883808adc02a5F92520b2C0 | Custom | Base × Rarity × Somatic × Annotation × Coverage |
| BioWallet | CREATE2 per-user | ERC-4337 | Smart wallet with P-256 passkey signing |
| BioWalletFactory | CREATE2 factory | ERC-4337 | Deterministic wallet deployment |
| SequentiaEntryPoint | Deployed | ERC-4337 | Minimal EntryPoint for private chain |
| BioPaymaster | Deployed | ERC-4337 | Gas sponsor for BioWallet users |
| BioDataRouter | 0x678d668ECAB612390bF60F6eB04d9e9f5398f2F3 | Custom + x402 | x402 payment processing, 95/5 revenue split |
| ClaraJobNFT | Deployed | ERC-721 | Proof-of-computation for GPU bioinformatics jobs |
9. BioWallet: ERC-4337 Smart Wallet with DNA Recovery
BioWallet replaces the MetaMask/browser-extension dependency with a smart contract wallet authenticated by P-256 WebAuthn passkeys. Patients authenticate with a fingerprint scan or face recognition on their phone or laptop—no seed phrases, no browser extensions.
9.1 Architecture
- BioWalletFactory deploys wallets via CREATE2 for deterministic addresses.
- SequentiaEntryPoint is a minimal ERC-4337 EntryPoint optimized for the Sequentias private chain (no mempool congestion).
- BioPaymaster sponsors gas for all BioWallet transactions, so patients never need to acquire SEQ tokens.
- P-256 signature verification happens on-chain in the BioWallet contract, using the RIP-7212 precompile where available.
9.2 96-SNP DNA Social Recovery
If a patient loses their passkey device, BioRecovery at
0x1555eC4e6397645147595918657b713Ebb5c8fBD enables recovery
via a 96-SNP DNA match (US Patent 11,915,808). The origin laboratory
attests that a fresh biological sample matches the on-chain keccak256
commitment with ≥94/96 SNPs matching. After a 48-hour timelock for
dispute, the wallet's owner key rotates to the new passkey. For inherited
wallets, the parent-child DNA pathway requires 40-56 shared SNPs under
BioPIL #9 (family inheritance).
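The two recovery gates, the 94-of-96 match threshold and the 48-hour timelock, can be sketched as below. This is illustrative only: `count_matches` and `recovery_allowed` are hypothetical names, and comparing plaintext genotype lists is a simplification, since the text describes the laboratory attesting against an on-chain keccak256 commitment rather than against stored genotypes.

```python
from typing import Sequence

PANEL_SIZE = 96            # SNPs in the recovery panel
REQUIRED_MATCHES = 94      # minimum matching SNPs from the text
TIMELOCK_S = 48 * 3600     # dispute window before the key rotates


def count_matches(committed: Sequence[str], fresh: Sequence[str]) -> int:
    """Count positions where the fresh sample agrees with the reference panel."""
    if len(committed) != PANEL_SIZE or len(fresh) != PANEL_SIZE:
        raise ValueError("both panels must contain exactly 96 genotypes")
    return sum(1 for a, b in zip(committed, fresh) if a == b)


def recovery_allowed(matches: int, attested_at: int, now: int) -> bool:
    """True once the match threshold is met and the timelock has elapsed."""
    return matches >= REQUIRED_MATCHES and now - attested_at >= TIMELOCK_S
```

Allowing two mismatches tolerates genotyping error between the original and fresh samples, while the timelock gives the current key holder time to dispute a fraudulent attestation.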
10. BioAssetVault: ERC-1155 Parent/Child Hierarchy
BioAssetVault at 0x2fd98bFF77571F1338bf1F44E68b80Be77205850
is the ERC-1155 token contract that anchors every biodata file on-chain.
The hierarchy mirrors the biological relationship between a biosample
(parent) and its derived data files (children):
[Figure: BioAssetVault hierarchy. One parent biosample token (DNA extraction from patient, tokenId: 1) has six children, each with its own contentHash: BAM (aligned reads, 0xa3f2...), VCF (variant calls, 0x8b1c...), FASTQ (raw sequences, 0xd7e4...), SQLite (OpenCRAVAT annotation, 0x1f9a...), gVCF (genomic VCF, 0x5c3d...), and BED (coverage regions, 0xe2b7...).]
Each child stores a contentHash (bytes32)—the 50-SNP
SHA-256 fingerprint that BioRoutes uses as the ground truth for integrity
verification. As of April 2026, 84,200 children are minted across the
five origin labs, with 57,582 minted in the latest G.1 anchoring batch.
11. Origin Laboratories
BioRouter manages biodata from five origin laboratories, each registered on-chain with a custodial wallet and authorized on both BioRouter and BioRecovery contracts:
| Laboratory | Files | Size | Primary Types | Specialty |
|---|---|---|---|---|
| GenoBank | 167,951 | 15.4 TB | BAM, VCF, FASTQ, SQLite | Full WGS pipeline, clinical annotation |
| Somos | 46,792 | 59 GB | DTC genotype, ancestry | K=24 ancestry with 10 indigenous Mexican populations |
| Neochromosome | 423 | 2.5 TB | BAM, VCF, FASTQ | First Sequentia-native WGS pipeline (10-sample batch) |
| TecBase | 288 | 230 GB | VCF, BAM | Clinical sequencing |
| Augenomics | 238 | 856 GB | BAM, FASTQ | Research sequencing |
12. Genomic Pipelines
12.1 Clara Parabricks (GPU-Accelerated WGS)
BioRouter integrates with NVIDIA Clara Parabricks 4.6.0 running on
dual A100 GPUs for whole-genome variant calling. The pipeline is
orchestrated by the biofs-node daemon, which is registered
as an ERC-8004 agent on the BioAgentRegistry. Each completed job mints
a ClaraJobNFT as proof-of-computation. A 10-sample WGS
batch for Neochromosome produced 591,904 variants per sample in ~45
minutes per genome at ~$2.45/hr spot pricing.
12.2 Somos Ancestry Pipeline (K=24)
When a DTC genotype file is uploaded with pipeline_trigger=ancestry,
BioRouter enqueues a Somos K=24 ADMIXTURE analysis: PLINK extraction of
91,645 ancestry-informative SNPs, merge with 781 reference individuals
spanning 24 global populations (including 10 indigenous Mexican: Maya,
Pima, Zapoteca, Huichol, Mixteca, Nahua-Otomi, Tarahumara, Triqui,
Andes, Amazonas), supervised ADMIXTURE K=24 (~12 minutes), and results
stored as a new BioCID attributed to the GenoClaw agent.
12.3 GA4GH htsget Streaming
BioRouter implements the GA4GH htsget v1.2 specification at
htsget.genobank.app. Researchers stream BioNFT-gated VCFs
via biofs stream <id> | bcftools stats - (no FUSE mount,
no kernel extensions). Range headers enable targeted access to
specific genomic regions without downloading entire multi-gigabyte files.
13. GenoClaw AI Agent and biofs CLI
13.1 GenoClaw
GenoClaw is GenoBank's patient-owned AI health agent, registered as an
ERC-8004 principal. It serves as the intelligence layer above BioRouter:
when a patient asks "What is my ancestry?" or "Do I carry the APOE-e4
variant?", GenoClaw queries BioRouter's three-stage privacy protocol,
retrieves the relevant data via Privacy Ticket, and returns a natural
language interpretation. GenoClaw's identity is embedded in every BioCID
it generates (gc-{first8hex}), creating permanent attribution.
13.2 biofs CLI
The biofs command-line interface is the developer's entry
point to BioRouter. Key commands:
# Resolve a biocid to its current storage location
biofs resolve <biocid>
# Resolve and verify fingerprint integrity
biofs resolve <biocid> --verify
# Reverse lookup by fingerprint
biofs resolve --by-fingerprint <sha256-hex>
# Three-stage privacy match
biofs match --owner 0x... --snps rs367789441:1:68082:TT,rs12564807:1:734462:AA
# Stream a VCF through htsget
biofs stream <biocid> | bcftools view -
# List biofiles for a wallet
biofs files
# Dispute a route with integrity mismatch
biofs resolve <biocid> --dispute
14. API Reference
All endpoints are served under
https://biorouter.genobank.app/api_biorouter/.
BioRoutes endpoints are at /api_bioroutes/.
14.1 Core Endpoints (unchanged from v1.0)
14.2 BioRoutes Endpoints (new in v2.0)
# Response 200
{
"routes": [
{
"storageURI": "gs://genobank-parabricks-output/jobs/.../sample.vcf",
"tier": "PRIMARY",
"contentHash": "0xa3f2e1d4b5c6...",
"registeredAt": "2026-04-26T04:17:48Z",
"lastVerifiedAt":"2026-04-26T12:34:29Z",
"status": "ACTIVE"
}
],
"on_chain_hash": "0xa3f2e1d4b5c6...",
"fallback_count": 1
}
14.3 Privacy Protocol Endpoints (new in v2.0)
# Request
{
"owner_wallet": "0x...",
"snps": [{"rsid":"rs367789441","chrom":"1","pos":"68082","genotype":"TT"}],
"requester_wallet": "0x..."
}
# Response 200
{
"match": true,
"confidence": "accumulator-certain",
"snp_count": 44
}
15. Privacy, Compliance, and Regulatory Alignment
15.1 Why Patient Ownership Supersedes GDPR/CCPA
GDPR and CCPA were designed to protect individuals from corporations that hold data as custodians. BioRouter inverts this: the patient is the data controller, not the data subject. The patient's wallet key is the only mechanism by which data can be accessed. The concept of an "erasure request" to a third party becomes meaningless when the patient never transferred control. Nevertheless, BioRouter satisfies GDPR Article 17 as an operational guarantee: GCS objects are deletable on demand, consent revocation propagates within one request cycle, and Privacy Tickets are revocable on-chain.
15.2 Privacy-Preserving Bloom Filters vs. Zero-Knowledge Proofs
Zero-knowledge proofs require deterministic computation. Genomics is fundamentally probabilistic: genotype calling introduces PHRED-scaled likelihoods, haplotype phasing depends on population priors, and structural variant detection involves alignment scores across repetitive regions. BioRouter uses bilinear group accumulators (provably hiding over the committed polynomial) and Bloom filters (probabilistic membership testing) instead. The accumulator cannot be inverted to recover individual SNPs—this is the core security property proven in [4a].
15.3 Why Federated Learning is Rejected
BioRouter does not implement federated learning. Federated learning degrades data quality, erases patient attribution, enables extraction of value without compensation, and does not prevent model inversion [10], membership inference [11], or property inference attacks [12]. Researchers who require data must acquire access via the four-tier authorization model, access the complete authenticated dataset, and accept full attribution and audit logging.
16. Comparative Analysis
| Property | BioRouter v2.0 | 23andMe (legacy) | IPFS-based | Federated Learning | Nebula Genomics |
|---|---|---|---|---|---|
| Patient is data controller | Yes — cryptographic | No | Partial | No | Partial |
| Consent revocable | Yes — instant + on-chain | No | No (IPFS) | No | Partial |
| Storage location auditable on-chain | Yes — BioRoutes events | No | Partial (CID = location) | N/A | No |
| Migration self-healing | Yes — 6h refresher | No | No | No | No |
| Privacy-preserving SNP query | BLS accumulator + Bloom | No | No | Gradient-based (leaky) | No |
| Passkey authentication (no extension) | Yes — BioWallet P-256 | Email/password | MetaMask required | N/A | MetaMask required |
| DNA social recovery | 96-SNP BioRecovery | No | No | No | No |
| Patient revenue share | 95% on-chain | 0% | Varies | 0% | Token-based |
| Files on-chain | 84,200 (ERC-1155) | 0 | 0 | 0 | 0 |
| AI agent attribution | ERC-8004 in BioCID | None | None | None | None |
17. Conclusion
BioRouter v2.0 advances from an access-control protocol to a complete biodata infrastructure stack. The addition of BioRoutes transforms BioCIDs from identifiers that depend on a database for resolution into identifiers backed by an on-chain event log: the DNS of biodata. Storage migrations that previously caused silent breakage now emit cryptographically verifiable events that clients, indexers, and auditors can independently trace.
Together, the bilinear group accumulator and the Privacy Ticket system, which implement the three-stage protocol from ICISSP 2024 [4a], establish a privacy primitive that is stronger than Bloom filters alone and architecturally honest about the limitations of zero-knowledge approaches for probabilistic genomic data. Researchers can determine whether a genome contains a specific variant panel without seeing any variant data, and the patient can revoke the resulting access credential at any time by burning a single on-chain token.
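As a rough sketch of how a client or auditor might trace migrations independently, the example below replays a list of migration events to find the current storage URI and then checks the downloaded bytes against the recorded fingerprint. The `RouteMigrated` event shape, the use of SHA-256 as the fingerprint, and the bucket URIs are assumptions made for illustration; the actual BioRoutes event fields and hash function may differ.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteMigrated:
    """Hypothetical event shape; real on-chain fields may differ."""
    block: int
    biocid: str
    new_uri: str
    fingerprint: str  # hex SHA-256 of the file bytes (assumed)


def resolve(events, biocid: str):
    """Replay the log in block order and return the latest (uri, fingerprint)."""
    current = None
    for ev in sorted(events, key=lambda e: e.block):
        if ev.biocid == biocid:
            current = ev
    if current is None:
        raise KeyError(biocid)
    return current.new_uri, current.fingerprint


def verify(data: bytes, fingerprint: str) -> bool:
    """Check fetched bytes against the fingerprint recorded in the log."""
    return hashlib.sha256(data).hexdigest() == fingerprint


payload = b"example genome bytes"
fp = hashlib.sha256(payload).hexdigest()
log = [
    RouteMigrated(10, "example-biocid", "gs://bucket-a/file.vcf", fp),
    RouteMigrated(25, "example-biocid", "gs://bucket-b/file.vcf", fp),
]

uri, expected = resolve(log, "example-biocid")
print(uri)                              # gs://bucket-b/file.vcf
print(verify(payload, expected))        # True
print(verify(b"tampered", expected))    # False
```

The identifier stays fixed while the URI changes, and the fingerprint lets the reader confirm that a migration moved the bytes without altering them.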
BioWallet eliminates the last major UX barrier to patient-controlled genomic data: the requirement to install a browser extension and manage a seed phrase. Patients authenticate with biometrics they already use daily. And if they lose everything, their DNA is the recovery key—a capability that exists nowhere else in production and that is protected by US Patent 11,915,808.
The numbers are no longer theoretical. 84,200 biodata files from five laboratories, totaling 18.6 TB, are anchored on-chain with cryptographic fingerprints. 197,300 routes map those files to their current storage locations. 14 smart contracts enforce the rules. The protocol is live at biorouter.genobank.app.
Privacy is not about hiding data or making it fuzzy. Privacy is about giving patients complete control over their authentic, high-quality data, with full transparency about its use and fair compensation for its value. BioRouter makes this principle operational—at scale, on-chain, and in production.
References
- [1] Mazières, D. & Kaashoek, M.F. Escaping the Evils of Centralized Control with Self-certifying Pathnames. In: Proc. 8th ACM SIGOPS European Workshop, pp. 118–125 (1998).
- [2] Protocol Labs. Content Identifier (CID) Specification. IPFS Documentation, v1.0 (2017).
- [3] 23andMe, Inc. Chapter 11 Bankruptcy Filing, U.S. Bankruptcy Court, Eastern District of Missouri, March 2025. TTAM Research Institute acquisition for $305 million, July 2025.
- [4] Boneh, D., Boyen, X. & Shacham, H. Short Group Signatures. CRYPTO 2004, LNCS vol. 3152, pp. 41–55. Springer (2004).
- [4a] Buchanan, W.J., Grierson, S. & Uribe, D. Privacy-Aware Single-Nucleotide Polymorphisms (SNPs) Using Bilinear Group Accumulators in Batch Mode. In: Proc. 10th International Conference on Information Systems Security and Privacy (ICISSP 2024), pp. 226–233. DOI: 10.5220/0012455300003648.
- [5] ERC-4337: Account Abstraction Using Alt Mempool. Ethereum Improvement Proposal (2023). Authors: Buterin, V., et al.
- [6] Uribe, D. US Patent 11,915,808 B2. "Privacy-Preserving DNA/RNA/Microbiome/COVID-19 Test Kit Kiosk and Locker That Pairs To and Stores Results Data in Private Digital Wallet." USPTO (2024).
- [7] Coinbase, Inc. x402: A Protocol for Machine-Readable HTTP Payments. Technical Specification v0.2 (2025).
- [8] ERC-8004: Trustless Agents. Ethereum Improvement Proposal Draft (2025).
- [9] Story Protocol. Programmable IP License (PIL) Framework. Documentation (2025). Available: https://docs.story.foundation
- [10] Fredrikson, M., Jha, S. & Ristenpart, T. Model Inversion Attacks that Exploit Confidence Information. ACM CCS 2015, pp. 1322–1333.
- [11] Shokri, R. et al. Membership Inference Attacks Against Machine Learning Models. IEEE S&P 2017, pp. 3–18.
- [12] Melis, L. et al. Exploiting Unintended Feature Leakage in Collaborative Learning. IEEE S&P 2019, pp. 691–706.
- [13] Alexander, D.H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19(9), 1655–1664 (2009).
- [14] Nguyen, L. Accumulators from Bilinear Pairings and Applications. CT-RSA 2005, LNCS vol. 3376, pp. 275–292. Springer (2005).
- [15] Vitto, G. & Biryukov, A. Dynamic Universal Accumulator with Batch Update over Bilinear Groups. CT-RSA 2022, LNCS vol. 13161, pp. 393–426. Springer (2022).
- [16] Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7), 422–426 (1970).
- [17] European Parliament. Regulation (EU) 2016/679 (GDPR). Official Journal of the European Union, L 119 (2016).
- [18] California Consumer Privacy Act, Cal. Civ. Code §§ 1798.100–1798.199 (2018), as amended by CPRA (2020).
- [19] Shapley, L.S. A Value for n-Person Games. Contributions to the Theory of Games, vol. 2, pp. 307–317. Princeton University Press (1953). Application: Ghorbani, A. & Zou, J. Data Shapley. ICML 2019, pp. 2242–2251.
- [20] Azaria, A. et al. MedRec: Using Blockchain for Medical Data Access and Permission Management. 2016 2nd Int. Conf. Open and Big Data (OBD), pp. 25–30. IEEE (2016).
© 2026 GenoBank.io. All rights reserved.