Abstract
We present BioRouter, a decentralized biodata routing protocol that provides a unified, authenticated gateway for all operations on patient-owned genomic and clinical datasets. BioRouter introduces the BioCID (Biological Content Identifier), a deterministic addressing scheme that decouples data identity from physical storage location, enabling secure routing without exposing cloud storage infrastructure. Access control is enforced via a four-tier authorization cascade comprising: (1) cryptographic proof of data ownership via Ethereum signature recovery, (2) explicit BioNFT-gated consent grants with configurable duration and license type, (3) on-chain ERC-721 token holder verification, and (4) x402 micropayment settlement through the BioDataRouter smart contract on the Sequentias Network (Chain ID 15132025), which enforces a 95%/5% revenue split between data owner and protocol. BioRouter also implements Metamorphic Consent—a model in which patient consent transforms from a static one-time permission into an ongoing, revocable, economically-linked relationship. The protocol integrates with ERC-8004 AI agent identity registration, enabling full attribution tracking for AI-driven data access. A production genomics pipeline triggered via upload metadata executes PLINK-based extraction of 91,645 ancestry-informative SNPs followed by ADMIXTURE K=24 supervised analysis against 781 reference individuals spanning 24 global populations. All data is stored in Google Cloud Storage with AES-256 encryption; storage paths are never exposed to clients. The live protocol is deployed at biorouter.genobank.app.
Executive Summary (Non-Technical)
Today, genomic and clinical data is fragmented across laboratories, consumer testing companies, and hospital systems. Patients cannot control who accesses their data, cannot revoke permissions, and receive no compensation when their data contributes to research. BioRouter solves this by acting as a universal data router that patients control cryptographically through their Web3 wallet.
Every file uploaded to BioRouter receives a BioCID—a unique address
like Biocid:user/0x5f5a60.../vcf/sample.vcf—that
identifies the data without revealing where it is physically stored.
Access is granted only when one of four conditions is met: the requester
is the owner, the owner has explicitly granted permission, the requester
holds the linked NFT, or the requester pays the owner-set price.
When researchers pay, 95 cents of every dollar goes directly to the patient.
Any consent can be revoked instantly, satisfying GDPR's right to erasure.
BioRouter also automatically runs ancestry analysis when a DTC genotype file is uploaded, delivering results across 24 global populations including 10 indigenous Mexican populations—results that are verifiable, attributed, and owned by the patient.
1. Introduction
The global genomic data ecosystem is characterized by a fundamental misalignment between data production and data ownership. Patients and research participants generate the raw material that drives precision medicine, pharmacogenomics, and population genetics, yet they exercise no meaningful control over how that data is accessed, shared, or monetized. This arrangement has produced a series of well-documented failures: the 2018 MyHeritage breach exposing 92 million records [1], the 2023 23andMe credential-stuffing attack compromising 6.9 million profiles [2], and the March 2025 23andMe bankruptcy that placed 15 million customers' genomic data into a corporate asset pool with no mechanism for patient objection [3].
Existing technical responses to this problem have proven inadequate. Federated learning degrades data quality, erases attribution, and allows model trainers to extract value without compensating data owners. Zero-knowledge proofs, while mathematically elegant, are architecturally incompatible with genomic data: genomics is probabilistic and non-deterministic, and ZK systems require deterministic computation to produce verifiable proofs [4]. Differential privacy introduces calibrated noise that is acceptable for demographic statistics but catastrophic for clinical-grade variant calls where single-base accuracy determines treatment decisions.
BioRouter addresses this gap by inverting the custody model entirely. Instead of data being held by an institution that grants access to patients, patients hold cryptographic keys that grant access to institutions. The data infrastructure—Google Cloud Storage with AES-256 encryption—is a private implementation detail hidden behind the BioCID addressing layer. What patients, researchers, and AI agents interact with is a protocol, not a storage system. Intellectual property registration and commercial licensing are handled through Story Protocol, whose Programmable IP License (PIL) framework enables on-chain license token minting, royalty distribution, and derivative work tracking—extended by GenoBank.io's revocable BioPIL licenses for GDPR-compliant genomic data commerce.
This whitepaper describes BioRouter's design, implementation, and operational characteristics. Section 2 surveys prior work. Section 3 formalizes the BioCID addressing scheme. Section 4 details the system architecture. Section 5 specifies the four-tier authorization cascade. Section 6 documents the API. Sections 7 through 10 cover smart contracts, genomic pipelines, AI agent integration, and the MongoDB data model. Sections 11 through 13 address privacy compliance, comparative analysis, and limitations.
2. Background and Related Work
2.1 Content-Addressable Storage
Content-addressable storage, in which a resource's address is derived from its content rather than its location, was formalized in distributed systems literature by Mazières and Kaashoek [5] and later popularized by IPFS's CID scheme [6]. BioCIDs adapt this pattern to biodata by prepending semantic metadata (agent, owner wallet, data type) to the filename, producing addresses that encode provenance without encoding storage location. Unlike IPFS CIDs, BioCIDs do not commit to a specific storage backend, enabling migration between storage systems while preserving persistent identifiers.
2.2 Blockchain-Based Health Data Governance
MedRec [7] demonstrated that Ethereum smart contracts could serve as a decentralized access permission layer for electronic health records. Subsequent systems including Healthureum [8], Mediblock [9], and Coral Health [10] expanded this framework but retained centralized storage, creating a split architecture where access control was decentralized but data remained vulnerable. BioRouter extends this line of work by coupling the access control layer with automated revenue distribution and AI agent identity attestation.
2.3 x402 Micropayments
The HTTP 402 "Payment Required" status code, reserved for future use
since the early HTTP/1.1 specifications, retained in RFC 7231 [11], and
never given standardized semantics, has been operationalized
by the Coinbase x402 protocol [12] as a machine-readable micropayment
negotiation mechanism. BioRouter adopts x402 semantics: when an
unauthorized requester attempts to download data, the server returns
HTTP 402 with a structured body containing the owner-set price and the
BioDataRouter contract address. The requester then settles payment
on-chain, and the contract's hasAccess(agentId, wallet)
function confirms access.
2.4 ERC-8004 AI Agent Identity
ERC-8004 [13] defines a standard interface for registering AI agents
as on-chain identity principals capable of signing transactions and
attesting to their capabilities. GenoclawIdentityRegistry, deployed
at 0xcBc813e733692794660dEC4AbB2ADd515a9F3D18 on
Sequentias Network, implements ERC-8004 with extensions for bioinformatic
agent classification. Registered agents receive a deterministic identifier
of the form gc-{first8hex} that is embedded in every BioCID
they generate, enabling complete attribution tracking through the audit log.
2.5 Story Protocol and Programmable IP Licensing
Story Protocol [15a] establishes an on-chain infrastructure for registering intellectual property as composable, programmable assets. Each IP Asset receives a unique identifier and can have license terms attached via the Programmable IP License (PIL) framework. BioRouter integrates with Story Protocol at two levels: (1) genomic files uploaded through BioRouter may be simultaneously registered as Story Protocol IP Assets, enabling commercial licensing through standard PIL templates; and (2) the four-tier authorization model (Section 5) checks Story Protocol license token ownership as a valid access credential. GenoBank.io extends Story Protocol's standard PIL templates (non-commercial #1, commercial #2, exclusive #3, public-good #4) with five genomic-specific BioPIL licenses: GDPR-research #5, AI-training #6, clinical-use #7, pharma-research #8, and family-inheritance #9. A critical distinction is that Story Protocol licenses are permanent by design, whereas BioPIL licenses are revocable—a requirement for GDPR Article 17 compliance and the foundation of Metamorphic Consent.
Known BioNFT collections registered on Story Protocol include:
- 0x5021F7438ea502b0c346cB59F8E92B749Ecd74B5: VCF Ownership
- 0x19A615224D03487AaDdC43e4520F9D83923d9512: VCF Collection
- 0xB8d03f2E1C02e4cC5b5fe1613c575c01BDD12269: VCF Annotation
- 0x88Ed5b47ea8f609Ee14ac60968C3f76f9138a171: AlphaGenome
- 0x7fB09610594a2952144B5cADbD47972684dEfA86: Ancestry
- 0xdaB93b0D7f01C9D7ffe33afcDc3518E8d6DE7Be1: Newborn/Trio
2.6 ADMIXTURE and Population Genetics Pipelines
ADMIXTURE [14] uses a maximum likelihood model to decompose individual genotype data into K population components. At K=24, the Somos pipeline resolves ancestry proportions across 781 reference individuals representing major continental populations and 10 indigenous Mexican populations (Maya, Pima, Zapoteca, Huichol, Mixteca, Nahua-Otomi, Tarahumara, Triqui, Andes, Amazonas). The 91,645-SNP panel was selected for high information content across this reference set while maintaining compatibility with major DTC genotyping platforms (23andMe v3/v4/v5, AncestryDNA, Illumina GSA).
3. BioCID Addressing Scheme
3.1 Format Specification
A BioCID is a structured string identifier with the following grammar:
BioCID := "Biocid:" bioagent "/" owner_biowallet "/" biodata_type "/" dataset
bioagent := "user" | "gc-" HEX{8}
owner_biowallet := CHECKSUMMED_ETH_ADDRESS ; EIP-55 checksum
biodata_type := dtc-type | sequence-type | clinical-type | derived-type
dtc-type := "dtc-genotype"
sequence-type := "vcf" | "bam" | "fastq"
clinical-type := "fhir" | "clinical-report"
derived-type := "ancestry-result" | "sqlite"
dataset := FILENAME ; original filename, no path components
The bioagent field distinguishes uploads performed by a
human user via direct upload (user) from those performed
by an ERC-8004 registered AI agent (gc-{first8hex}).
This distinction is critical for audit trail integrity: when GenoClaw
or another registered agent stores a derived dataset (e.g., an ancestry
result), the agent's identity is permanently encoded in the BioCID.
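The grammar above can be exercised with a small parser. The following is a minimal sketch rather than BioRouter's actual implementation: the names `parse_biocid` and `BioCID` are hypothetical, and the owner address is only checked for shape (`0x` plus 40 hex characters), not for a valid EIP-55 checksum, which would require a Keccak-256 implementation outside the Python standard library.

```python
import re
from typing import NamedTuple

# biodata_type values taken from the grammar above.
BIODATA_TYPES = {
    "dtc-genotype", "vcf", "bam", "fastq",
    "fhir", "clinical-report", "ancestry-result", "sqlite",
}

_BIOCID_RE = re.compile(
    r"Biocid:(?P<bioagent>user|gc-[0-9a-fA-F]{8})"
    r"/(?P<owner>0x[0-9a-fA-F]{40})"
    r"/(?P<biodata_type>[a-z-]+)"
    r"/(?P<dataset>[^/]+)"
)


class BioCID(NamedTuple):
    bioagent: str
    owner_biowallet: str
    biodata_type: str
    dataset: str

    @property
    def is_agent_upload(self) -> bool:
        # "gc-" prefixes mark uploads by an ERC-8004 registered agent.
        return self.bioagent.startswith("gc-")


def parse_biocid(text: str) -> BioCID:
    """Split a BioCID into its four fields (hypothetical helper).

    Assumption: address shape only; the EIP-55 checksum is NOT verified.
    """
    match = _BIOCID_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a well-formed BioCID: {text!r}")
    if match["biodata_type"] not in BIODATA_TYPES:
        raise ValueError(f"unknown biodata_type: {match['biodata_type']!r}")
    return BioCID(match["bioagent"], match["owner"],
                  match["biodata_type"], match["dataset"])
```

Because the owner wallet and agent prefix are plain fields of the string, a caller can read attribution from a BioCID without contacting the registry.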
3.2 Example BioCIDs
| Scenario | BioCID |
|---|---|
| Patient uploads 23andMe file | Biocid:user/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/dtc-genotype/DTC-FILE-GHAn0029.txt |
| Patient uploads clinical VCF | Biocid:user/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/vcf/aligned_cleaned_Invitae.deepvariant.vcf |
| GenoClaw stores ancestry result | Biocid:gc-b96ed19a/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/ancestry-result/somos_k24_results.json |
| GenoClaw stores analysis DB | Biocid:gc-b96ed19a/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/sqlite/health_analysis_2026.db |
| Patient uploads raw FASTQ | Biocid:user/0xB3C3a584F9a5A77Ed84EBf2c8E66E8e8c1C2D3A4/fastq/WGS_sample_R1.fastq.gz |
3.3 Design Properties
BioCIDs satisfy the following properties by construction:
- Storage Independence. The BioCID does not encode a GCS bucket, object path, or storage backend. The mapping from BioCID to physical storage is maintained exclusively in the biocid_registry MongoDB collection and is never returned to API clients.
- Owner Attribution. The checksummed wallet address in position 2 is the canonical data owner. Ownership can be audited without querying any database by parsing the BioCID string.
- Agent Attribution. The bioagent field enables attribution of AI-generated derived datasets to the specific registered agent, satisfying data lineage requirements in clinical contexts.
- Collision Resistance. Two files with identical names uploaded by the same wallet but with different content are differentiated by the SHA-256 hash stored in the registry; the BioCID is supplemented by the hash in duplicate detection logic.
- Human Readability. BioCIDs are designed to be readable and parseable without a resolver, contrasting with opaque hash-based CIDs.
4. System Architecture
4.1 Infrastructure Overview
BioRouter is implemented as a CherryPy Python 3.12 application serving on port 8095, deployed on GenoBank's GCS production server behind a Cloudflare proxy. The service connects to three external systems: MongoDB for registry and audit data, Google Cloud Storage for encrypted file storage, and the Sequentias Network EVM node for blockchain queries and transaction execution.
┌─────────────────────────────────────────────────────────────────────┐
│ CLIENT LAYER │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────────────┐ │
│ │ Patient │ │ Researcher │ │ GenoClaw AI Agent │ │
│ │ BioWallet │ │ Web App │ │ (ERC-8004 identity) │ │
│ └──────┬───────┘ └──────┬───────┘ └────────────┬─────────────┘ │
└─────────┼─────────────────┼────────────────────────┼────────────────┘
│ │ │
│ HTTPS + user_signature (Web3 signature) │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────────┐
│ CLOUDFLARE EDGE (DDoS protection, TLS termination) │
└──────────────────────────────┬──────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────┐
│ BIOROUTER SERVICE (CherryPy, Python 3.12, port 8095) │
│ https://biorouter.genobank.app │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ AUTH LAYER │ │
│ │ recover_wallet(user_signature) → checksummed wallet address │ │
│ │ Four-Tier Authorization Cascade (§5) │ │
│ └──────────────────────────────────────────────────────────────┘ │
│ │
│ ┌────────────┐ ┌────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ /upload │ │ /download │ │ /stream │ │ /grant │ │
│ │ /list │ │ /resolve │ │ /consents │ │ /revoke │ │
│ │ /set_price│ │ /pipeline │ │ │ │ │ │
│ └────────────┘ └────────────┘ └─────────────┘ └─────────────┘ │
└──────────────┬───────────────────────────┬───────────────────────────┘
│ │
┌────────┴────────┐ ┌──────────┴──────────┐
▼ ▼ ▼ ▼
┌──────────┐ ┌──────────────┐ ┌──────────────┐ ┌───────────────┐
│ MongoDB │ │ Google │ │ Sequentias │ │ Story Protocol│
│ │ │ Cloud │ │ Network │ │ (IP Assets) │
│ registry │ │ Storage │ │ (EVM, RPC) │ │ │
│ consents │ │ (AES-256) │ │ │ │ PIL + BioPIL │
│ audit_log│ │ │ │ BioDataRouter│ │ License Terms │
└──────────┘ │ Bucket: │ │ 0x678d668E │ │ NFT Ownership │
│ genobank- │ │ IdentityReg │ └───────────────┘
│ biorouter │ │ 0xcBc813e7 │
└──────────────┘ └──────────────┘ ┌───────────────┐
│ Somos Ancestry│
Storage Hierarchy (never exposed to clients): │ Pipeline │
gs://genobank-biorouter/ │ (PLINK + │
biorouter/{wallet}/{type}/{uid}/{file} │ ADMIXTURE) │
└───────────────┘
4.2 Signature-Based Authentication
Every API request is authenticated by a user_signature
parameter. The signature is the ECDSA result of signing the canonical
message "I want to proceed" with the client's private key.
The server calls recover_wallet(user_signature)
(equivalent to Ethereum's ecrecover) to derive the
checksummed wallet address. This address becomes the authenticated
principal for the request. No session tokens, API keys, or passwords
are required.
Wallet signatures serve as both authentication and identity. A valid signature over "I want to proceed" proves private key possession without transmitting the key. The same cryptographic proof that authenticates an API call also establishes ownership of all BioCIDs in that wallet's namespace.
4.3 Storage Isolation and Indexation Hierarchy
All biofiles are stored in a dedicated GCS bucket (gs://genobank-biorouter)
using a wallet-indexed hierarchy that is never exposed to clients.
The internal path structure is:
gs://genobank-biorouter/
biorouter/{owner_wallet}/{biodata_type}/{uid}/{filename}
Example:
gs://genobank-biorouter/
biorouter/0x5f5a60eaef242c0d51a21c703f520347b96ed19a/bam/dfdbeadcd2bb/DNA_SAMPLE-001.bam
biorouter/0x5f5a60eaef242c0d51a21c703f520347b96ed19a/vcf/a8f3e2d1bc0f/DNA_SAMPLE-001.vcf
biorouter/0x5f5a60eaef242c0d51a21c703f520347b96ed19a/fastq/7c2f91de34ab/DNA_SAMPLE-001_S26.R1.fastq.gz
biorouter/0x5f5a60eaef242c0d51a21c703f520347b96ed19a/dtc-genotype/cea846287d5f/DTC-FILE-GHAn0029.txt
The owner wallet address is the primary index for all data in the BioRouter protocol.
Every file belongs to exactly one wallet, and that wallet's address (lowercased in the
storage paths above, EIP-55 checksummed in the BioCID) forms the root of its storage
namespace. The {uid} segment is a random UUID suffix (12 hex characters in the examples
above) that prevents path collisions when the same filename is uploaded multiple times.
This hierarchy maps directly to BioCID addressing:
Internal path: biorouter/{wallet}/{type}/{uid}/{file}
BioCID: Biocid:{bioagent}/{wallet}/{type}/{file}
Example:
GCS path → biorouter/0x5f5a60...d19a/bam/dfdbeadcd2bb/DNA_SAMPLE-001.bam (HIDDEN)
BioCID → Biocid:gc-b96ed19a/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/bam/DNA_SAMPLE-001.bam (PUBLIC)
GCS bucket names and object paths are stored exclusively in the
gcs_bucket and gcs_path fields of
biocid_registry. These fields are marked internal and
are stripped from all API responses. Clients interact exclusively with
BioCIDs. Even in the event of a MongoDB registry leak, the data itself
remains inaccessible without valid GCS credentials; conversely, knowledge
of the GCS path does not help an attacker without passing the four-tier
authorization check, because the BioRouter service acts as the sole
authenticated intermediary.
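The separation between the hidden storage path and the public registry view can be sketched as follows. This is an illustration under stated assumptions, not the service's code: `public_view` and `internal_object_path` are hypothetical helper names, and lowercasing the wallet in the path is inferred from the example paths above.

```python
# Fields named in the text as internal-only.
INTERNAL_FIELDS = frozenset({"gcs_bucket", "gcs_path"})


def internal_object_path(owner_wallet: str, biodata_type: str,
                         uid: str, filename: str) -> str:
    """Build the hidden object path biorouter/{wallet}/{type}/{uid}/{file}.

    Assumption: the wallet is lowercased, as in the example paths.
    """
    return f"biorouter/{owner_wallet.lower()}/{biodata_type}/{uid}/{filename}"


def public_view(registry_doc: dict) -> dict:
    """Return a copy of a registry document with internal fields stripped."""
    return {k: v for k, v in registry_doc.items() if k not in INTERNAL_FIELDS}


doc = {
    "biocid": "Biocid:user/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/bam/DNA_SAMPLE-001.bam",
    "owner_wallet": "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a",
    "biodata_type": "bam",
    "gcs_bucket": "genobank-biorouter",
    "gcs_path": internal_object_path(
        "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a",
        "bam", "dfdbeadcd2bb", "DNA_SAMPLE-001.bam"),
}
response = public_view(doc)  # safe to serialize for API clients
```

Stripping at a single serialization point, rather than in each endpoint handler, is one way to keep a new endpoint from leaking the path by accident.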
5. Four-Tier Authorization Model
Every download and stream request passes through the following authorization cascade. Tiers are evaluated in order; the first passing tier grants access. If all four tiers fail, the server returns HTTP 402 Payment Required with the owner-set price and the BioDataRouter contract address.
Request: GET /api_biorouter/download?biocid=X&user_signature=Y
│
▼
┌──────────────────────────────┐
│ recover_wallet(Y) │
│ → authenticated_wallet = W │
└──────────────┬───────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 1: Data Owner Check │
│ biocid_registry.owner_wallet == W ? │
└──────────────────────────┬──────────────────────────────────┘
YES ─────┤
│ NO
▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 2: BioNFT Consent Check (MongoDB) │
│ biocid_consents WHERE biocid=X AND permittee=W │
│ AND status="active" │
│ AND expires_at > NOW() ? │
└──────────────────────────┬──────────────────────────────────┘
YES ─────┤
│ NO
▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 3: On-Chain NFT Token Holder Check │
│ IF biocid_registry.bionft_token_id IS SET: │
│ ERC-721.ownerOf(token_id) == W ? │
└──────────────────────────┬──────────────────────────────────┘
YES ─────┤
│ NO
▼
┌─────────────────────────────────────────────────────────────┐
│ TIER 4: x402 On-Chain Payment Check │
│ BioDataRouter.hasAccess(agent_id, W) == true ? │
└──────────────────────────┬──────────────────────────────────┘
YES ─────┤
│ NO
▼
┌────────────────┐
│ HTTP 402 │
│ Payment Req. │
│ price_wei: X │
│ contract: 0x.. │
└────────────────┘
ACCESS GRANTED
Stream from GCS
- Data Owner. Cryptographic proof via signature recovery. The wallet that uploaded the file always retains full access with no expiration.
- BioNFT Consent. Owner-granted permittee access via POST /grant. Configurable duration, license type, and revocable at any time.
- NFT Token Holder. On-chain ERC-721 ownership query. Whoever holds the BioNFT linked to the BioCID is granted access, enabling transferable permissions.
- x402 Payment. BioDataRouter on-chain settlement. 95% to patient, 5% to protocol, enforced by smart contract immutably.
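The cascade can be summarized as a first-match evaluation. The sketch below is an illustrative model only: `RegistryRecord`, `Consent`, and `authorize` are hypothetical names, and the two on-chain lookups (`ERC721.ownerOf` and `BioDataRouter.hasAccess`) are passed in as plain callables so the example runs without a blockchain node.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Iterable, Optional


@dataclass
class RegistryRecord:
    biocid: str
    owner_wallet: str
    price_wei: int
    bionft_token_id: Optional[int] = None


@dataclass
class Consent:
    biocid: str
    permittee: str
    status: str
    expires_at: datetime


def authorize(record: RegistryRecord, wallet: str,
              consents: Iterable[Consent],
              nft_owner_of: Callable[[int], str],      # stand-in for ERC721.ownerOf
              has_paid_access: Callable[[str], bool],  # stand-in for hasAccess
              now: Optional[datetime] = None) -> Optional[str]:
    """Return the name of the first tier that grants access, else None.

    A None result corresponds to the HTTP 402 response.
    """
    now = now or datetime.now(timezone.utc)
    w = wallet.lower()
    # Tier 1: data owner.
    if record.owner_wallet.lower() == w:
        return "owner"
    # Tier 2: active, unexpired consent grant.
    for c in consents:
        if (c.biocid == record.biocid and c.permittee.lower() == w
                and c.status == "active" and c.expires_at > now):
            return "consent"
    # Tier 3: holder of the linked BioNFT, if one is set.
    if (record.bionft_token_id is not None
            and nft_owner_of(record.bionft_token_id).lower() == w):
        return "nft_holder"
    # Tier 4: x402 payment recorded on-chain.
    if has_paid_access(wallet):
        return "x402_payment"
    return None
```

Ordering the cheap database checks before the on-chain calls means owners and consented permittees never incur an RPC round trip.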
5.1 Tier 1: Data Owner Authentication
Ownership is established at upload time: the wallet address recovered
from the user_signature parameter on the
POST /upload call becomes the owner_wallet
field of the biocid_registry document. On subsequent
download requests, the recovered wallet is compared against
owner_wallet. A match grants unconditional access.
No expiration applies to owner access.
5.2 Tier 2: Metamorphic Consent via BioNFT Grants
The canonical mechanism for granting access to a specific permittee
is POST /api_biorouter/grant. This call, which must be
authenticated by the data owner, creates a document in
biocid_consents containing the permittee wallet,
license type, duration, and expiry timestamp.
Consent under this model is Metamorphic in a specific technical sense: it begins as a static permission record and may evolve into an economically-linked relationship when combined with x402 payments and Biodata Dividends. The consent record is not immutable. The owner may:
- Revoke a specific permittee at any time via POST /revoke_permittee, setting status = "revoked"
- Revoke all consents for a BioCID via POST /revoke (GDPR Article 17 right to erasure)
- Inspect all active consent grants via GET /consents
Auto-expiry is enforced at query time: if the current timestamp exceeds
expires_at, the consent record's effective status is
treated as "expired" without requiring a database write, preventing
stale access grants from persisting after the consent window closes.
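Query-time expiry can be illustrated with a short sketch. The function names `new_consent` and `effective_status` are hypothetical, and the record is a plain dictionary standing in for a `biocid_consents` document. The stored `status` is never rewritten on expiry; it is reinterpreted at read time.

```python
from datetime import datetime, timedelta, timezone


def new_consent(biocid: str, permittee_wallet: str, license_type: str,
                duration_days: int, granted_at: datetime) -> dict:
    """Build a consent record; expires_at = granted_at + duration_days."""
    return {
        "biocid": biocid,
        "permittee": permittee_wallet,
        "license_type": license_type,
        "granted_at": granted_at,
        "expires_at": granted_at + timedelta(days=duration_days),
        "status": "active",
    }


def effective_status(consent: dict, now: datetime) -> str:
    """Derive the status at read time, with no database write.

    Assumption: a grant counts as expired once now >= expires_at, which
    mirrors the Tier 2 condition expires_at > NOW().
    """
    if consent["status"] == "active" and now >= consent["expires_at"]:
        return "expired"
    return consent["status"]


granted = datetime(2026, 3, 24, 6, 29, 26, tzinfo=timezone.utc)
consent = new_consent("Biocid:user/0x5f5a60.../vcf/sample.vcf",
                      "0xB3C3a584F9a5A77Ed84EBf2c8E66E8e8c1C2D3A4",
                      "research", 365, granted)
```

With a 365-day duration, the record above expires on 2027-03-24 at the same time of day, and a revoked record stays revoked regardless of the clock.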
Traditional consent models are binary and static: either consent is given or it is not. Metamorphic Consent recognizes that consent over high-value data is an ongoing relationship with economic, temporal, and scope dimensions. A patient may consent to research use for 365 days, revoke after 180 days upon learning the research was commercialized without notification, or expand consent to include AI training in exchange for a Biodata Dividend. The consent record is a living contract, not a checkbox.
5.3 Tier 3: On-Chain BioNFT and Story Protocol License Verification
Tier 3 checks two on-chain credential types:
3a. BioNFT Token Holder (Sequentias Network).
When a BioNFT (ERC-721) is linked to a biocid_registry
record (via the optional bionft_token_id field), the
authorization layer queries the Sequentias Network:
ERC721.ownerOf(token_id). If the result matches the
requesting wallet, access is granted.
This tier enables transferable data access without requiring the original
owner to re-grant consent. A patient may mint a BioNFT representing
read access to their whole-genome sequence and transfer it to a research
institution. The institution's wallet becomes the NFT holder and gains
access. If the patient revokes by burning the NFT or reclaiming it,
access is lost at the next query without any database update.
3b. Story Protocol License Token Holder. If the BioCID's underlying data has been registered as a Story Protocol IP Asset, the authorization layer checks whether the requesting wallet holds a valid License Token for that IP Asset. Story Protocol's PIL (Programmable IP License) framework enforces license terms on-chain: commercial use, derivative works, attribution requirements, and royalty splits are encoded in the license and verified automatically. GenoBank.io extends Story Protocol's four standard PIL templates with five genomic-specific BioPIL licenses:
| BioPIL ID | License Type | Revocable | Use Case |
|---|---|---|---|
| #5 | GDPR Research | Yes | Academic research under GDPR consent |
| #6 | AI Training | Yes | Model training with attribution + dividends |
| #7 | Clinical Use | Yes | Hospital/clinician access for patient care |
| #8 | Pharma Research | Yes | Drug discovery with revenue sharing |
| #9 | Family Inheritance | No | Hereditary access for family members |
The critical difference: Story Protocol's standard licenses are permanent, while BioPIL
licenses are revocable (all except Family Inheritance #9 in the table above). This
distinction is enforced at the BioRouter layer: even if a Story Protocol license token
exists, BioRouter blocks access if the
corresponding biocid_consents record has been revoked by the data owner.
Story Protocol provides the commercial licensing infrastructure; BioRouter provides the
consent enforcement layer.
5.4 Tier 4: x402 Micropayment Authorization
For requesters without an active consent grant or NFT, BioRouter
implements the x402 payment protocol. The data owner sets a price
in wei via POST /set_price, stored in
biocid_registry.price_wei. When an unauthorized requester
attempts download, the server returns:
HTTP/1.1 402 Payment Required
Content-Type: application/json
{
"error": "payment_required",
"biocid": "Biocid:user/0x5f5a60.../vcf/sample.vcf",
"price_wei": "50000000000000000",
"price_eth": "0.05",
"contract": "0x678d668ECAB612390bF60F6eB04d9e9f5398f2F3",
"chain_id": 15132025,
"revenue_split": { "patient_pct": 95, "protocol_pct": 5 }
}
The requester calls BioDataRouter.pay(biocid_hash) on
Sequentias Network. The contract distributes 95% to the owner wallet
and 5% to the protocol treasury, then records the payment against the
(agent_id, wallet) pair. On the next API call, Tier 4
calls BioDataRouter.hasAccess(agent_id, wallet), which
returns true and access is granted.
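The arithmetic behind the 402 body and the 95%/5% split can be checked with integer math. The sketch below is not the contract's code: `split_payment` and `payment_required_body` are hypothetical names, and sending any rounding remainder to the protocol share is an assumption, since the text does not specify how the contract rounds.

```python
from decimal import Decimal

WEI_PER_ETH = 10**18
PATIENT_PCT = 95  # from the 95%/5% split described above


def split_payment(amount_wei: int) -> tuple[int, int]:
    """Split a payment into (patient_wei, protocol_wei) using integers only.

    Assumption: any remainder from integer division goes to the protocol.
    """
    patient = amount_wei * PATIENT_PCT // 100
    return patient, amount_wei - patient


def payment_required_body(biocid: str, price_wei: int,
                          contract: str, chain_id: int) -> dict:
    """Assemble a body shaped like the HTTP 402 example above."""
    return {
        "error": "payment_required",
        "biocid": biocid,
        "price_wei": str(price_wei),  # string, to survive JSON number limits
        "price_eth": format(Decimal(price_wei) / WEI_PER_ETH, "f"),
        "contract": contract,
        "chain_id": chain_id,
        "revenue_split": {"patient_pct": PATIENT_PCT,
                          "protocol_pct": 100 - PATIENT_PCT},
    }


body = payment_required_body(
    "Biocid:user/0x5f5a60.../vcf/sample.vcf", 50000000000000000,
    "0x678d668ECAB612390bF60F6eB04d9e9f5398f2F3", 15132025)
patient_wei, protocol_wei = split_payment(50000000000000000)
```

For the 0.05 ETH example price, the patient share is 47500000000000000 wei and the protocol share is 2500000000000000 wei. Wei amounts are kept as integers or strings because a float cannot represent all of them exactly.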
6. API Reference
All endpoints are served under the base path
https://biorouter.genobank.app/api_biorouter/.
Authentication is via the user_signature parameter
on all calls.
# Endpoint: POST /upload
# Parameters (multipart/form-data)
user_signature : string (required) — ECDSA signature of "I want to proceed"
file : binary (required) — file content
biodata_type : string (optional) — override auto-detection
agent_signature : string (optional) — ERC-8004 agent signature for gc- prefix
pipeline_trigger : string (optional) — "ancestry" triggers Somos K=24 pipeline
# Response 200
{
"biocid": "Biocid:user/0x5f5a60.../dtc-genotype/DTC-FILE-GHAn0029.txt",
"biodata_type": "dtc-genotype",
"file_hash": "sha256:1f8eab4c...",
"file_size": 44644945,
"pipeline": {
"status": "queued",
"job_id": "anc_cea846287d5f",
"trigger": "ancestry"
}
}
# Auto-detected biodata_type values
# 23andMe raw data → dtc-genotype
# AncestryDNA raw data → dtc-genotype
# VCF (.vcf, .vcf.gz) → vcf
# BAM/CRAM → bam
# FASTQ (.fastq.gz) → fastq
# FHIR JSON → fhir
# SQLite database → sqlite
# Endpoint: GET /download
# Parameters (query string)
user_signature : string (required)
biocid : string (required)
# Response 200 (authorized)
Content-Disposition: attachment; filename="DTC-FILE-GHAn0029.txt"
X-BioCID: Biocid:user/0x5f5a60.../dtc-genotype/DTC-FILE-GHAn0029.txt
[binary file content]
# Response 402 (unauthorized, payment required)
{
"error": "payment_required",
"price_wei": "50000000000000000",
"contract": "0x678d668ECAB612390bF60F6eB04d9e9f5398f2F3",
"chain_id": 15132025
}
# Endpoint: GET /stream
# Parameters
user_signature : string (required)
biocid : string (required)
Range : header (optional) — e.g., "bytes=0-1048575"
# Response 206 Partial Content (range request)
Content-Range: bytes 0-1048575/104857600
Content-Length: 1048576
[binary chunk]
# Example: stream first 50MB of a WGS BAM
curl -H "Range: bytes=0-52428799" \
"https://biorouter.genobank.app/api_biorouter/stream?biocid=Biocid:user/0x...&user_signature=0x..."
# Use case: IGV.js can stream BAM files via this endpoint without full download
# Use case: GATK streaming access for variant calling on specific genomic regions
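The relationship between the Range request header and the 206 response headers shown above can be sketched as follows. This is a simplified illustration, not the service's handler: `resolve_range` and `partial_content_headers` are hypothetical names, and only single ranges of the form `bytes=start-end` or `bytes=start-` are handled (suffix ranges and multi-range requests are left out).

```python
import re


def resolve_range(range_header: str, total_size: int) -> tuple[int, int]:
    """Return the inclusive (start, end) byte offsets for a single range."""
    m = re.fullmatch(r"bytes=(\d+)-(\d*)", range_header.strip())
    if m is None:
        raise ValueError(f"unsupported Range header: {range_header!r}")
    start = int(m.group(1))
    # An open-ended range ("bytes=100-") runs to the last byte.
    end = int(m.group(2)) if m.group(2) else total_size - 1
    end = min(end, total_size - 1)  # clamp to the object size
    if start > end:
        raise ValueError("range not satisfiable")
    return start, end


def partial_content_headers(range_header: str, total_size: int) -> dict:
    """Build the Content-Range and Content-Length headers for a 206 reply."""
    start, end = resolve_range(range_header, total_size)
    return {
        "Content-Range": f"bytes {start}-{end}/{total_size}",
        "Content-Length": str(end - start + 1),
    }


headers = partial_content_headers("bytes=0-1048575", 104857600)
```

Byte ranges are inclusive at both ends, so `bytes=0-1048575` covers exactly 1,048,576 bytes, and the 50 MB curl example ends at offset 52428799.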
# Endpoint: /list
# Parameters
user_signature : string (required)
biodata_type : string (optional) — filter by type (e.g., "dtc-genotype")
# Response 200
{
"wallet": "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a",
"count": 3,
"biofiles": [
{
"biocid": "Biocid:user/0x5f5a60.../dtc-genotype/DTC-FILE-GHAn0029.txt",
"biodata_type": "dtc-genotype",
"file_size": 44644945,
"created_at": "2026-03-24T06:15:00Z",
"pipeline_status": "completed",
"access_count": 3
}
]
}
# Endpoint: POST /grant
# Parameters (JSON body)
{
"user_signature": "0x...", // owner must sign
"biocid": "Biocid:user/0x5f5a60.../vcf/sample.vcf",
"permittee_wallet": "0xB3C3a584F9a5A77Ed84EBf2c8E66E8e8c1C2D3A4",
"license_type": "research", // research | clinical | ai-training
"duration_days": 365
}
# Response 200
{
"consent_id": "64f3a1b2c8d9e0f1a2b3c4d5",
"granted_at": "2026-03-24T06:29:26Z",
"expires_at": "2027-03-24T06:29:26Z",
"status": "active"
}
# Endpoint: POST /revoke
# Parameters (JSON body)
{
"user_signature": "0x...",
"biocid": "Biocid:user/0x5f5a60.../vcf/sample.vcf"
}
# Response 200
{
"revoked_consents": 3,
"biocid_status": "revoked",
"gdpr_compliant": true
}
# Endpoint: POST /revoke_permittee
# Parameters (JSON body)
{
"user_signature": "0x...",
"biocid": "Biocid:user/0x5f5a60.../vcf/sample.vcf",
"permittee_wallet": "0xB3C3a584F9a5A77Ed84EBf2c8E66E8e8c1C2D3A4"
}
# Response 200
{
"revoked": true,
"permittee": "0xB3C3a584F9a5A77Ed84EBf2c8E66E8e8c1C2D3A4",
"previous_status": "active"
}
# Endpoint: POST /set_price
# Parameters (JSON body)
{
"user_signature": "0x...",
"biocid": "Biocid:user/0x5f5a60.../vcf/sample.vcf",
"price_wei": "50000000000000000" // 0.05 ETH
}
# Response 200
{
"price_wei": "50000000000000000",
"price_eth": "0.05",
"effective": true
}
# Endpoint: /pipeline
# Parameters
user_signature : string (required)
biocid : string (required)
# Response 200
{
"biocid": "Biocid:user/0x5f5a60.../dtc-genotype/DTC-FILE-GHAn0029.txt",
"pipeline": "ancestry",
"job_id": "anc_cea846287d5f",
"status": "completed", // queued | running | completed | failed
"started_at": "2026-03-24T06:16:00Z",
"completed_at": "2026-03-24T06:28:43Z",
"duration_s": 763,
"result_biocid": "Biocid:gc-b96ed19a/0x5f5a60.../ancestry-result/somos_k24_results.json"
}
7. Smart Contracts (Sequentias Network)
Two smart contracts on the Sequentias Network (Chain ID 15132025) underpin BioRouter's on-chain authorization and identity layers.
| Contract | Address | Standard | Purpose |
|---|---|---|---|
| BioDataRouter | 0x678d668ECAB612390bF60F6eB04d9e9f5398f2F3 | Custom + x402 | x402 payment processing, 95/5 revenue split enforcement, hasAccess query |
| GenoclawIdentityRegistry | 0xcBc813e733692794660dEC4AbB2ADd515a9F3D18 | ERC-721 + ERC-8004 | AI agent identity registration, gc-{first8hex} identifier generation |
7.1 BioDataRouter Contract
The BioDataRouter contract implements two primary interfaces relevant to BioRouter API operation:
// SPDX-License-Identifier: MIT
// Sequentias Network — Chain ID 15132025
// Address: 0x678d668ECAB612390bF60F6eB04d9e9f5398f2F3
interface IBioDataRouter {
// Called by unauthorized researcher to pay for access
// msg.value is split: 95% to owner wallet, 5% to protocol treasury
function pay(bytes32 biocidHash) external payable;
// Returns true if (agentId, wallet) pair has a valid payment on record
function hasAccess(bytes32 agentId, address wallet)
external view returns (bool);
// Returns the owner wallet for a given biocid hash
function getOwner(bytes32 biocidHash)
external view returns (address);
// Emitted on each payment
event PaymentReceived(
bytes32 indexed biocidHash,
address indexed payer,
uint256 patientAmount,
uint256 protocolAmount
);
}
The 95%/5% revenue split is enforced by the contract's pay()
function and cannot be altered by either party unilaterally. The split
is not a policy choice that can be overridden by server configuration;
it is a blockchain invariant.
7.2 GenoclawIdentityRegistry Contract
The identity registry extends ERC-721 with ERC-8004 agent attestation.
Each registered AI agent is minted an NFT with a deterministic ID
derived from the first 8 hex characters of its registration transaction
hash. This ID becomes the bioagent prefix in any BioCID
the agent generates.
interface IGenoclawIdentityRegistry {
    // Register an AI agent; emits AgentRegistered with the gc-{first8hex} id
    function registerAgent(
        address agentWallet,
        string calldata agentName,
        string calldata agentVersion,
        string calldata capabilities // JSON-encoded capability list
    ) external returns (uint256 tokenId, string memory agentId);

    // Verify an agent signature for a given message
    function verifyAgentSignature(
        string calldata agentId,
        bytes32 messageHash,
        bytes calldata signature
    ) external view returns (bool);

    event AgentRegistered(
        uint256 indexed tokenId,
        string agentId,
        address agentWallet
    );
}
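The gc-{first8hex} identifier format can be illustrated off-chain with a short Python sketch. This is only an illustration of the string format described above: the function name, the input validation, and the lowercase normalization are assumptions, and the transaction hash in the example is hypothetical.

```python
import re

# A 32-byte transaction hash: 64 hex digits, optionally prefixed with "0x"
_TX_HASH_RE = re.compile(r"^(0x)?[0-9a-fA-F]{64}$")


def agent_id_from_tx_hash(tx_hash: str) -> str:
    """Build a gc-{first8hex} identifier from a transaction hash string."""
    if not _TX_HASH_RE.match(tx_hash):
        raise ValueError("expected a 32-byte hex transaction hash")
    hex_digits = tx_hash[2:] if tx_hash.startswith("0x") else tx_hash
    # Assumption: identifiers are normalized to lowercase hex
    return "gc-" + hex_digits[:8].lower()


# Hypothetical transaction hash, used only to show the format
example_hash = "0x" + "AB12cd34" + "0" * 56
print(agent_id_from_tx_hash(example_hash))
```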
8. Somos Ancestry Pipeline Integration
When a DTC genotype file is uploaded with the query parameter
pipeline_trigger=ancestry, BioRouter enqueues a
Somos ancestry analysis job. The pipeline is the same production
implementation used by somosdao.io and has been validated
against production results: all 24 population proportions match
to 6 decimal places across a test cohort of 50 individuals.
8.1 Pipeline Stages
UPLOAD: dtc-genotype file (23andMe, AncestryDNA, VCF, DTC format)
│
▼
Stage 1: genotype2ped conversion
Auto-detect format by file header signature
Output: PLINK .ped + .map files
│
▼
Stage 2: PLINK --extract 91,645 ancestry-informative SNPs
SNP panel selected for maximum informativeness across
24-population reference set; compatible with all major
DTC genotyping arrays (Illumina GSA, OmniExpress)
│
▼
Stage 3: PLINK --merge with reference panel
781 reference individuals
24 populations (continental + 10 Indigenous American: 8 Mexican, 2 South American)
Resolves strand flips and ambiguous SNPs automatically
│
▼
Stage 4: ADMIXTURE K=24 supervised (~12 minutes runtime)
Maximum likelihood decomposition into 24 components
Reference individuals' population assignments fixed
Query individual's proportions estimated freely
│
▼
Stage 5: Parse .Q file
Position 437 in sorted FAM file (FAM001/ID001 after merge)
Extract 24 floating-point admixture proportions
│
▼
Stage 6: Store result as new BioCID
bioagent: gc-{GenoClaw agent id}
type: ancestry-result
Stored in GCS, registered in biocid_registry
pipeline_status → "completed"
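Stage 5 can be sketched in Python as follows. ADMIXTURE writes the .Q file as whitespace-separated proportions, one row per individual and one column per component. Everything else here is an assumption for illustration: the function name, treating the position as 1-based, and the premise that the column order matches the supplied label list.

```python
def parse_q_row(q_text: str, position: int, labels: list[str]) -> dict[str, float]:
    """Extract one individual's admixture proportions from .Q file text.

    position is treated as 1-based (assumption); labels must be given in
    the same order as the .Q columns (assumption).
    """
    rows = [line.split() for line in q_text.splitlines() if line.strip()]
    if not 1 <= position <= len(rows):
        raise IndexError(f"position {position} outside 1..{len(rows)}")
    values = [float(field) for field in rows[position - 1]]
    if len(values) != len(labels):
        raise ValueError(f"expected {len(labels)} columns, found {len(values)}")
    return dict(zip(labels, values))


# Toy K=3 example; the production pipeline would use K=24 and position 437
toy_q = "0.1 0.2 0.7\n0.5 0.25 0.25\n1.0 0.0 0.0\n"
print(parse_q_row(toy_q, 2, ["EUR_OESTE", "MAYA", "AFR_OESTE"]))
```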
8.2 Reference Population Panel
| Code | Population | Region |
|---|---|---|
| AFR_ESTE | East African | Africa |
| AFR_NORTE | North African | Africa |
| AFR_OESTE | West African | Africa |
| AMAZONAS | Amazonian Indigenous | South America |
| ANDES | Andean Indigenous | South America |
| ASIA_ESTE | East Asian | Asia |
| ASIA_SUR | South Asian | Asia |
| ASIA_SURESTE | Southeast Asian | Asia |
| EUR_ESTE | Eastern European | Europe |
| EUR_NORESTE | Northeastern European | Europe |
| EUR_NORTE | Northern European | Europe |
| EUR_OESTE | Western European | Europe |
| EUR_SUROESTE | Southwestern European | Europe |
| JUDIO | Sephardic/Ashkenazi Jewish | Middle East / Europe |
| MAYA | Maya | Mexico / Mesoamerica |
| MEDIO_ORIENTE | Middle Eastern | Middle East |
| OCEANIA | Oceanic | Pacific |
| PIMA | Pima | Mexico / Southwestern USA |
| ZAPOTECA | Zapotec | Mexico (Oaxaca) |
| HUICHOL | Huichol (Wixaritari) | Mexico (Jalisco/Nayarit) |
| MIXTECA | Mixtec | Mexico (Oaxaca/Guerrero) |
| NAHUA_OTOMI | Nahua-Otomi | Mexico (Central) |
| TARAHUMARA | Tarahumara (Rarámuri) | Mexico (Chihuahua) |
| TRIQUI | Triqui | Mexico (Oaxaca) |
Pipeline output was validated against 50 production Somos results. All 24 population proportions matched to 6 decimal places (mean absolute deviation < 1×10⁻⁶), confirming consistent results between BioRouter-triggered and standalone pipeline runs.
9. GenoClaw AI Agent Integration
GenoClaw is GenoBank's patient-owned AI health agent deployed on NVIDIA NemoClaw infrastructure. It is registered in the GenoclawIdentityRegistry as an ERC-8004 principal, enabling it to act as an authorized data agent on behalf of patients who have explicitly delegated access. BioRouter serves as GenoClaw's persistent data layer.
9.1 Ancestry Query Workflow
User → GenoClaw: "What is my ancestry?"
│
▼
1. GenoClaw checks Somos DAO cache for prior result
│ (cache miss)
▼
2. GenoClaw → GET /api_biorouter/list?biodata_type=ancestry-result&user_signature=X
│
│ (no ancestry-result found)
▼
3. GenoClaw → GET /api_biorouter/list?biodata_type=dtc-genotype&user_signature=X
│
│ (dtc-genotype found: Biocid:user/0x5f5a60.../dtc-genotype/DTC-FILE-GHAn0029.txt)
▼
4. GenoClaw checks pipeline_status for existing job
│ (no prior job)
▼
5. GenoClaw instructs user to upload OR
GenoClaw calls POST /upload with pipeline_trigger=ancestry
│
▼
6. Pipeline enqueued, job_id = "anc_cea846287d5f"
│
▼
7. GenoClaw polls GET /pipeline_status?biocid=X every 60 seconds
│ (~12 minutes)
▼
8. status = "completed"
result_biocid = "Biocid:gc-b96ed19a/0x5f5a60.../ancestry-result/somos_k24_results.json"
│
▼
9. GenoClaw → GET /api_biorouter/download?biocid=result_biocid&user_signature=X
(Tier 1 passes: gc-b96ed19a is the agent, user wallet is the owner)
│
▼
10. GenoClaw parses 24-population proportions, formats narrative
GenoClaw → User: "You are 42.3% Western European, 31.7% Maya, ..."
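The polling in steps 7 and 8 can be sketched as a small loop. This is a hedged illustration, not GenoClaw's implementation: the status fetcher is injected as a callable so no HTTP client is assumed, the response shape (a dict with "status" and "result_biocid") follows the workflow above, and the "failed" status, the timeout, and all function names are assumptions.

```python
import time
from typing import Callable


def wait_for_pipeline(
    fetch_status: Callable[[str], dict],
    biocid: str,
    interval_s: float = 60.0,
    timeout_s: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll fetch_status(biocid) until the job completes, fails, or times out."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status(biocid)
        if status.get("status") == "completed":
            return status
        if status.get("status") == "failed":  # hypothetical failure state
            raise RuntimeError(f"pipeline failed for {biocid}")
        if clock() >= deadline:
            raise TimeoutError(f"pipeline not finished after {timeout_s} s")
        sleep(interval_s)


# Demonstration with a stubbed fetcher and a no-op sleep (no network, no waiting)
_responses = iter([{"status": "running"}, {"status": "completed", "result_biocid": "demo"}])
print(wait_for_pipeline(lambda b: next(_responses), "Biocid:demo", sleep=lambda s: None))
```

Injecting sleep and clock keeps the loop testable without real delays; a production caller would leave both at their defaults.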
9.2 Agent Attribution in BioCIDs
When GenoClaw stores a derived dataset (ancestry result, clinical
summary, pharmacogenomics report), the resulting BioCID encodes
the agent's ERC-8004 identifier in the bioagent field.
This creates a permanent, auditable chain of attribution: the raw
DTC genotype file is owned by the patient (bioagent = user),
while the derived ancestry result is attributed to the AI agent
(bioagent = gc-b96ed19a) that generated it, on behalf
of the same patient (owner_biowallet is unchanged).
This attribution model satisfies emerging AI transparency requirements under the EU AI Act [15] and enables auditors to trace any analytical output back to its source data and the AI system that produced it, without requiring the auditor to have access to either the source data or the model weights.
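The attribution relationship can be made concrete with a small Python sketch. The field layout (Biocid:{bioagent}/{owner wallet}/{biodata type}/{dataset}) is inferred from the example identifiers in this paper; the class, the function names, and the choice to allow slashes in the dataset segment are assumptions.

```python
from dataclasses import dataclass

PREFIX = "Biocid:"


@dataclass(frozen=True)
class BioCID:
    bioagent: str       # "user" or an agent id such as "gc-b96ed19a"
    owner_wallet: str
    biodata_type: str
    dataset: str

    def __str__(self) -> str:
        return f"{PREFIX}{self.bioagent}/{self.owner_wallet}/{self.biodata_type}/{self.dataset}"


def parse_biocid(text: str) -> BioCID:
    """Split a BioCID string into its four fields."""
    if not text.startswith(PREFIX):
        raise ValueError("missing Biocid: prefix")
    # maxsplit=3 lets the dataset segment contain "/" (assumption)
    parts = text[len(PREFIX):].split("/", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError("expected bioagent/owner/type/dataset")
    return BioCID(*parts)


def derive_result_biocid(source: BioCID, agent_id: str, biodata_type: str, dataset: str) -> BioCID:
    """Attribute a derived dataset to an agent while keeping the owner unchanged."""
    return BioCID(agent_id, source.owner_wallet, biodata_type, dataset)


raw = parse_biocid("Biocid:user/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/dtc-genotype/DTC-FILE-GHAn0029.txt")
print(derive_result_biocid(raw, "gc-b96ed19a", "ancestry-result", "somos_k24_results.json"))
```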
10. MongoDB Data Model
10.1 biocid_registry
Primary registry for all stored biofiles. Indexed on
biocid (unique), owner_wallet,
file_hash, and biodata_type.
{
  "biocid": "Biocid:user/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/dtc-genotype/DTC-FILE-GHAn0029.txt",
  "owner_wallet": "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a", // EIP-55 checksum
  "bioagent": "user", // or "gc-{8hex}"
  "biodata_type": "dtc-genotype",
  "dataset": "DTC-FILE-GHAn0029.txt",
  "file_hash": "sha256:1f8eab4c...", // SHA-256 hex
  "file_size": 44644945,
  "gcs_bucket": "INTERNAL - NEVER RETURNED TO CLIENTS",
  "gcs_path": "INTERNAL - NEVER RETURNED TO CLIENTS",
  "bionft_token_id": null, // ERC-721 token ID if linked
  "consent_status": "active",
  "price_wei": "50000000000000000",
  "pipeline_trigger": "ancestry",
  "pipeline_job_id": "anc_cea846287d5f",
  "pipeline_status": "completed",
  "created_at": "2026-03-24T06:15:00Z",
  "access_count": 3
}
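Because gcs_bucket and gcs_path must never reach clients, a response serializer needs to drop them before returning a registry record. The following Python sketch shows one way to do that; the function name and the allow-by-default filtering approach are assumptions, not the deployed implementation.

```python
# Fields marked internal in the registry document above
INTERNAL_FIELDS = frozenset({"gcs_bucket", "gcs_path"})


def public_view(record: dict) -> dict:
    """Return a copy of a registry record with storage internals removed."""
    return {key: value for key, value in record.items() if key not in INTERNAL_FIELDS}


record = {
    "biocid": "Biocid:user/0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a/dtc-genotype/DTC-FILE-GHAn0029.txt",
    "biodata_type": "dtc-genotype",
    "gcs_bucket": "internal-bucket-name",   # hypothetical value
    "gcs_path": "internal/object/path",     # hypothetical value
    "pipeline_status": "completed",
}
print(sorted(public_view(record)))
```

A stricter variant would list the fields that may be returned rather than the ones that may not, so that newly added internal fields are hidden by default.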
10.2 biocid_consents
One document per consent grant. Owner-permittee pairs are not unique;
multiple grants may exist for the same pair with different license
types or durations. Expiry is checked at query time by comparing
expires_at to the current UTC timestamp; no scheduled
job is required to update status to "expired".
{
  "_id": "64f3a1b2c8d9e0f1a2b3c4d5",
  "biocid": "Biocid:user/0x5f5a60.../dtc-genotype/DTC-FILE-GHAn0029.txt",
  "owner_wallet": "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a",
  "permittee_wallet": "0xB3C3a584F9a5A77Ed84EBf2c8E66E8e8c1C2D3A4",
  "license_type": "research", // research | clinical | ai-training
  "duration_days": 365,
  "granted_at": "2026-03-24T06:29:26Z",
  "expires_at": "2027-03-24T06:29:26Z",
  "status": "active" // active | revoked | expired (logical)
}
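The query-time expiry check can be sketched in Python as below. The function names are hypothetical, and treating a grant as expired at the exact expires_at instant is an assumption; the timestamp format follows the sample document.

```python
from datetime import datetime, timedelta, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_utc(timestamp: str) -> datetime:
    """Parse a timestamp such as 2026-03-24T06:29:26Z as timezone-aware UTC."""
    return datetime.strptime(timestamp, TS_FORMAT).replace(tzinfo=timezone.utc)


def effective_status(consent: dict, now: datetime) -> str:
    """Compute the logical status at query time without mutating the document."""
    if consent["status"] == "revoked":
        return "revoked"
    if now >= parse_utc(consent["expires_at"]):  # boundary counts as expired (assumption)
        return "expired"
    return "active"


consent = {
    "granted_at": "2026-03-24T06:29:26Z",
    "expires_at": "2027-03-24T06:29:26Z",
    "duration_days": 365,
    "status": "active",
}
print(effective_status(consent, parse_utc("2026-06-01T00:00:00Z")))
```

Because the stored status is never rewritten to "expired", the result depends only on the document and the clock at the moment of the request.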
10.3 biocid_access_log
Immutable append-only audit log. Every operation that reads or modifies a BioCID record writes a log entry. This includes uploads, downloads (whether authorized or denied), stream requests, grant events, revocation events, and pipeline triggers.
{
  "timestamp": "2026-03-24T08:47:13Z",
  "biocid": "Biocid:user/0x5f5a60.../dtc-genotype/DTC-FILE-GHAn0029.txt",
  "operation": "download", // upload | download | stream | grant | revoke | pipeline
  "actor_wallet": "0xB3C3a584...",
  "actor_type": "permittee", // owner | permittee | nft_holder | x402_payer | denied
  "tier_granted": 2, // authorization tier that granted access (0 = denied)
  "ip_address": "203.0.113.42",
  "range_bytes": null, // set for stream operations
  "result": "success"
}
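The relationship between tier_granted and actor_type can be sketched as a first-match cascade that feeds the log entry. This is an illustration under stated assumptions: the tier-to-actor mapping is inferred from the order of the four tiers and the actor_type values listed above, the "denied" value for result is hypothetical (the sample shows only "success"), and the function names are invented for the example.

```python
from typing import Callable, Sequence

# Inferred mapping: tier number -> actor_type (0 means no tier passed)
ACTOR_TYPE_BY_TIER = {0: "denied", 1: "owner", 2: "permittee", 3: "nft_holder", 4: "x402_payer"}


def resolve_tier(checks: Sequence[Callable[[], bool]]) -> int:
    """Run tier checks in order and return the first tier that passes, else 0.

    Checks are callables so that later, costlier tiers (for example an
    on-chain query) are only evaluated when earlier tiers fail.
    """
    for tier, check in enumerate(checks, start=1):
        if check():
            return tier
    return 0


def build_log_entry(timestamp: str, biocid: str, operation: str, actor_wallet: str, tier: int) -> dict:
    """Assemble an access-log document for the resolved tier."""
    return {
        "timestamp": timestamp,
        "biocid": biocid,
        "operation": operation,
        "actor_wallet": actor_wallet,
        "actor_type": ACTOR_TYPE_BY_TIER[tier],
        "tier_granted": tier,
        "result": "success" if tier else "denied",  # "denied" is a hypothetical value
    }


tier = resolve_tier([lambda: False, lambda: True, lambda: False, lambda: False])
print(build_log_entry("2026-03-24T08:47:13Z", "Biocid:demo", "download", "0xB3C3a584...", tier))
```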
11. Privacy, Compliance, and Regulatory Alignment
11.1 Why Patient Ownership Supersedes GDPR/CCPA
GDPR [16] and CCPA [17] were designed to protect individuals from corporations that hold data as custodians. The legislative assumption is that there is a structural power asymmetry: the data subject lacks the ability to control their data once it has been transferred to a controller. BioRouter inverts this assumption entirely.
In BioRouter's model, the patient is not the "data subject" in GDPR terminology; the patient is the data controller. The patient's wallet signature is the only mechanism by which data can be uploaded, accessed, or shared. No institution holds a master decryption key. The concept of an "erasure request" to a third party becomes meaningless when the patient never transferred control in the first place.
Nevertheless, BioRouter is explicitly designed to satisfy GDPR Article 17 (right to erasure) as an operational guarantee rather than as a compliance checkbox:
- Storage on GCS, not IPFS. IPFS's content-addressed immutability makes deletion structurally impossible, creating a GDPR Article 17 violation for any system that uses IPFS for personal genomic data. GCS objects are deletable on demand. A call to POST /revoke can trigger physical GCS object deletion in addition to marking consents as revoked.
- Consent revocation propagates instantly. The MongoDB consent check in Tier 2 operates at query time. There is no cache layer that might serve a revoked consent grant. Revocation is effective within one request cycle.
- No federated copies. BioRouter does not replicate data to third-party systems without explicit owner consent. Federated learning frameworks that distribute data or gradient-derived information to multiple parties without attribution violate the spirit of GDPR Article 5(1)(b) (purpose limitation).
11.2 Privacy-Preserving Bloom Filters vs. Zero-Knowledge Proofs
Zero-knowledge proof systems require deterministic computation: the prover must demonstrate knowledge of a witness that satisfies a circuit without revealing the witness. Genomic data is fundamentally incompatible with this requirement. Genotype calling introduces probabilistic quality scores (PHRED-scaled likelihoods), haplotype phasing introduces population prior-dependent uncertainty, and structural variant detection involves alignment scores across repetitive regions. There is no deterministic "I have genotype X at position Y" fact that can be proven without revealing the entire underlying read pileup.
BioRouter instead uses privacy-preserving Bloom filters for variant membership queries [18]. A Bloom filter for a variant panel can answer "does this individual's genome contain variant rs123456?" with a configurable false positive rate and zero false negative rate, without exposing the complete variant call set. The filter supports probabilistic membership testing that is sufficient for access control and research eligibility screening, while the complete authenticated data remains in GCS accessible only through BioRouter's four-tier authorization.
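The membership-test behavior described above can be illustrated with a minimal Python sketch of a standard Bloom filter. It shows only the basic data structure: the privacy hardening implied by "privacy-preserving" (for example keyed hashing) is not included, and the class name, sizing formulas, double-hashing scheme, and variant identifiers are assumptions for the example rather than BioRouter's implementation.

```python
import hashlib
import math


class BloomFilter:
    """Plain Bloom filter: no false negatives, tunable false positive rate."""

    def __init__(self, capacity: int, fp_rate: float = 0.01):
        if capacity <= 0 or not 0 < fp_rate < 1:
            raise ValueError("capacity must be positive and 0 < fp_rate < 1")
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes
        self.num_bits = max(1, math.ceil(-capacity * math.log(fp_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / capacity * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive k bit positions from two halves of one SHA-256 digest
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


panel = BloomFilter(capacity=1000, fp_rate=0.01)
for rsid in ("rs123456", "rs0000001", "rs0000002"):  # illustrative identifiers
    panel.add(rsid)
print("rs123456" in panel)  # True: an added item is always reported present
```

A query for an identifier that was never added usually returns False, but may return True at roughly the configured false positive rate; that asymmetry is what the paragraph above relies on.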
11.3 Why Federated Learning is Rejected
Federated learning, as implemented in genomics research pipelines, presents several fundamental problems that BioRouter's architecture is explicitly designed to avoid:
- Data quality degradation. Gradient aggregation across heterogeneous sequencing platforms, variant callers, and population stratifications introduces systematic biases that are indistinguishable from signal in downstream analyses. Clinical-grade variant interpretation requires complete, authenticated datasets.
- Attribution erasure. When a model is trained across federated nodes, there is no mechanism to attribute a specific model parameter to a specific patient's contribution. Revenue sharing becomes mathematically impossible. Federated learning is, in this sense, a technical arrangement that enables extraction of value from patient data without compensating patients.
- False privacy compliance. Federated learning does not prevent model inversion attacks [19], membership inference attacks [20], or property inference attacks [21]. The privacy guarantee is weaker than is commonly represented in marketing materials from federated learning vendors.
BioRouter does not implement federated learning. It does not distribute
data or model gradients to any party without the explicit, revocable
consent of the data owner. Researchers who require data for AI training
must acquire access via Tier 2 (explicit grant with license_type=ai-training)
or Tier 4 (x402 payment), access the complete authenticated dataset
via the BioRouter streaming API, and accept full attribution and audit
logging of their access.
12. Comparative Analysis
| Property | BioRouter | 23andMe (legacy) | IPFS-based systems | Federated Learning | Nebula Genomics |
|---|---|---|---|---|---|
| Patient is data controller | Yes — cryptographic | No | Partial | No | Partial |
| Consent revocable | Yes — instant | No | No (IPFS) | No | Partial |
| GDPR Art. 17 compliant | Yes — GCS deletable | Contested | No (IPFS immutable) | Contested | Partial |
| Patient revenue share | 95% on-chain | 0% | Varies | 0% | Token-based |
| Data quality preserved | Complete, authentic | Complete | Complete | Degraded | Complete |
| Storage path exposed | Never (BioCID) | Internal | Yes (CID = location) | N/A | Internal |
| AI agent attribution | ERC-8004 in BioCID | None | None | None | None |
| Range streaming (BAM/WGS) | Yes — HTTP Range | No | Partial | No | No |
| Automated ancestry pipeline | Yes — K=24 trigger | Yes | No | No | No |
13. Discussion
13.1 Limitations
BioRouter's current implementation has several limitations that should be acknowledged:
- Single GCS region. Files are stored in a single GCS region. Cross-region replication and geo-redundancy are not yet implemented. For very large WGS datasets (100GB+), transfer latency from non-US regions may be significant.
- x402 settlement latency. Payment verification queries Sequentias Network synchronously. In periods of high network congestion, this Tier 4 check may introduce latency. Caching paid access grants in a short-TTL local store is a planned optimization.
- Bloom filter coverage. The variant membership Bloom filter is currently generated at upload time for VCF files only. DTC genotype files require a conversion step before Bloom filter generation, which is not yet automated in the BioRouter pipeline.
- ERC-8004 is a draft standard. The GenoclawIdentityRegistry implements ERC-8004 at draft revision 3. If the final standard introduces breaking interface changes, the registry contract and BioCID generation logic will require migration.
- MongoDB as trust anchor. While the consent model is cryptographically authenticated at the API layer, the consent documents themselves are stored in MongoDB rather than on-chain. A compromise of the MongoDB instance could allow forged consent records. Full on-chain consent registration is a planned upgrade via a BioConsent contract on Sequentias Network.
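The short-TTL cache mentioned under x402 settlement latency could look roughly like the Python sketch below. It is a hedged illustration of the planned optimization, not existing BioRouter code: the class and function names, the 30-second default, and the decision to cache only positive results are assumptions, and the on-chain query is injected as a callable rather than tied to a specific web3 client.

```python
import time
from typing import Callable, Hashable, Optional


class AccessGrantCache:
    """In-memory cache whose entries expire ttl_s seconds after insertion."""

    def __init__(self, ttl_s: float = 30.0, clock: Callable[[], float] = time.monotonic):
        self._ttl_s = ttl_s
        self._clock = clock
        self._entries: dict = {}

    def get(self, key: Hashable) -> Optional[bool]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop stale entry
            return None
        return value

    def put(self, key: Hashable, value: bool) -> None:
        self._entries[key] = (value, self._clock() + self._ttl_s)


def has_access_cached(cache: AccessGrantCache, query_chain: Callable[[str, str], bool],
                      agent_id: str, wallet: str) -> bool:
    """Consult the cache first; fall back to the (slow) on-chain query."""
    key = (agent_id, wallet.lower())
    cached = cache.get(key)
    if cached is not None:
        return cached
    result = query_chain(agent_id, wallet)
    if result:  # only paid grants are cached, so a new payment is seen immediately
        cache.put(key, True)
    return result


cache = AccessGrantCache(ttl_s=30.0)
print(has_access_cached(cache, lambda agent, wallet: True, "agent-demo", "0xAbC"))
```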
13.2 Future Work
The following capabilities are on the BioRouter development roadmap:
- On-chain consent registry. Migration of biocid_consents to a Solidity contract on Sequentias Network, eliminating MongoDB as a trust dependency for consent verification.
- Shapley-based Biodata Dividends. Integration of Shapley value computation [22] to attribute marginal contributions of individual data points to trained models, enabling proportional Biodata Dividend distributions at scale.
- BioConsent NFT minting on upload. Automatic ERC-721 minting at upload time, linking the BioNFT to the BioCID in biocid_registry and enabling Tier 3 authorization immediately after upload.
- FHIR R4 pipeline integration. Structured clinical data uploaded as fhir biodata type triggers automated extraction of discrete clinical observations for population health research eligibility queries via Bloom filters.
- Multi-party computation for aggregate statistics. For cases where researchers require aggregate statistics rather than individual-level data, a secure multi-party computation layer over BioRouter-stored data is under investigation as a complement (not replacement) to the authenticated access model.
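To make the Shapley-based dividend idea concrete, the following Python sketch computes exact Shapley values [22] for a toy coalition game by averaging marginal contributions over all orderings. It is a textbook illustration only: the cost grows factorially with the number of contributors, so it is not a proposal for how BioRouter would compute dividends at scale, and the value function and contributor names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from typing import Callable, FrozenSet, Sequence


def shapley_values(players: Sequence[str],
                   value: Callable[[FrozenSet[str]], int]) -> dict:
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {player: Fraction(0) for player in players}
    num_orderings = 0
    for ordering in permutations(players):
        num_orderings += 1
        coalition: FrozenSet[str] = frozenset()
        for player in ordering:
            extended = coalition | {player}
            totals[player] += Fraction(value(extended)) - Fraction(value(coalition))
            coalition = extended
    return {player: total / num_orderings for player, total in totals.items()}


def toy_value(coalition: FrozenSet[str]) -> int:
    # Hypothetical game: a model is worth 100 only if both "a" and "b" contribute
    return 100 if {"a", "b"} <= coalition else 0


shares = shapley_values(["a", "b", "c"], toy_value)
print({player: str(share) for player, share in shares.items()})
```

In this toy game contributors "a" and "b" split the value equally and "c", whose data never changes the outcome, receives nothing; the shares always sum to the value of the full coalition.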
13.3 The 23andMe Bankruptcy Precedent
The March 2025 23andMe bankruptcy filing [3] placed the genomic profiles of 15 million customers into a corporate asset pool subject to auction to the highest bidder. Customers had no contractual mechanism to object to the sale of their most intimate biological data. The July 2025 acquisition by TTAM Research Institute (founder Anne Wojcicki's nonprofit) for $305 million resolved this particular instance, but the underlying structural vulnerability remains: when data is held by a corporation as a custodian rather than by the individual as an owner, any corporate event (bankruptcy, acquisition, data breach) can instantaneously transfer control of that data without the individual's knowledge or consent.
BioRouter is architecturally immune to this failure mode. The data
is stored in GCS, but access is controlled by the patient's wallet
private key, not by GenoBank's corporate infrastructure. If GenoBank
ceased operations, the biocid_registry MongoDB data
could be exported and run by any operator; the cryptographic ownership
proofs are wallet-native and do not require GenoBank's servers to
validate. The x402 payment contract on Sequentias Network is
self-executing and does not require GenoBank to process or approve
payments.
14. Conclusion
BioRouter represents a production implementation of the principle that patient ownership of genomic data is achievable without sacrificing data quality, analytical utility, or economic viability. The protocol demonstrates that the traditional tradeoffs in health data governance (privacy versus utility, control versus accessibility, patient rights versus research progress) are artifacts of a centralized custody architecture rather than fundamental constraints.
The four-tier authorization cascade provides a graduated access model that accommodates the full range of legitimate access patterns: direct owner access, explicit consent-based sharing, transferable NFT-linked permissions, and market-priced researcher access. Every request is authenticated by wallet signature, and Tiers 3 and 4 are verified against on-chain state; Tier 2 consent records currently reside in MongoDB, a trust dependency that the planned on-chain consent registry (Section 13.2) is intended to remove.
The BioCID addressing scheme decouples data identity from storage location, enabling storage backend migration, multi-cloud deployment, and auditable data lineage without exposing infrastructure details to clients. ERC-8004 agent attribution in the BioCID format extends this lineage to AI-generated derived datasets, laying the foundation for auditable AI in genomic medicine.
The live deployment at biorouter.genobank.app is available for integration by healthcare providers, research institutions, DTC genomics platforms, and AI health agent developers. API documentation and integration guides are available at genobank.io/developers.
Privacy is not about hiding data or making it fuzzy. Privacy is about giving patients complete control over their authentic, high-quality data, with full transparency about its use and fair compensation for its value. BioRouter makes this principle operational.
References
- [1] MyHeritage Security Incident Report, June 2018. 92 million email addresses and hashed passwords exposed via third-party breach. Available: https://blog.myheritage.com/2018/06/myheritage-statement-about-a-cybersecurity-incident/
- [2] 23andMe, Inc. Form 8-K Filing, January 2024. Disclosure of credential-stuffing attack affecting approximately 6.9 million customer profiles. U.S. Securities and Exchange Commission.
- [3] 23andMe, Inc. Chapter 11 Bankruptcy Filing, U.S. Bankruptcy Court, Eastern District of Missouri, March 2025. Docket No. 25-10XXX. TTAM Research Institute acquisition for $305 million, July 2025.
- [4] Boneh, D., Boyen, X. & Shacham, H. Short Group Signatures. In: Advances in Cryptology (CRYPTO 2004), Lecture Notes in Computer Science, vol. 3152, pp. 41–55. Springer, Berlin, Heidelberg (2004). Note: ZK proofs require deterministic arithmetic circuits; stochastic genomic pipelines cannot be compiled into such circuits.
- [5] Mazières, D. & Kaashoek, M.F. Escaping the Evils of Centralized Control with Self-certifying Pathnames. In: Proc. 8th ACM SIGOPS European Workshop, pp. 118–125 (1998).
- [6] Protocol Labs. Content Identifier (CID) Specification. IPFS Documentation, v1.0 (2017). Available: https://docs.ipfs.tech/concepts/content-addressing/
- [7] Azaria, A., Ekblaw, A., Vieira, T. & Lippman, A. MedRec: Using Blockchain for Medical Data Access and Permission Management. In: 2016 2nd International Conference on Open and Big Data (OBD), pp. 25–30. IEEE (2016).
- [8] Healthureum. A Blockchain-based Healthcare System. Technical Whitepaper v1.2 (2018).
- [9] Kim, M.G. et al. Mediblock: Decentralized Medical Information System. Journal of Medical Systems, 43(8), 247 (2019).
- [10] Coral Health. Coral Health Research & Discovery: Blockchain-based Healthcare Data Management. Technical Report (2017).
- [11] Fielding, R. & Reschke, J. (Eds.) Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content. RFC 7231, IETF (2014). Status code 402 reserved.
- [12] Coinbase, Inc. x402: A Protocol for Machine-Readable HTTP Payments. Technical Specification v0.2 (2025). Available: https://x402.org
- [13] ERC-8004: AI Agent Identity Standard. Ethereum Improvement Proposal Draft (2025). Author: Ethereum Foundation AI Working Group. Available: https://eips.ethereum.org/EIPS/eip-8004
- [14] Alexander, D.H., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Research, 19(9), 1655–1664 (2009). doi:10.1101/gr.094052.109
- [15] European Parliament. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence (Artificial Intelligence Act). Official Journal of the European Union, L 1689 (2024).
- [16] European Parliament. Regulation (EU) 2016/679 of the European Parliament and of the Council on the protection of natural persons with regard to the processing of personal data (General Data Protection Regulation). Official Journal of the European Union, L 119 (2016).
- [17] California Consumer Privacy Act, Cal. Civ. Code §§ 1798.100–1798.199 (2018), as amended by the California Privacy Rights Act (CPRA), Proposition 24 (2020).
- [18] Bloom, B.H. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7), 422–426 (1970). doi:10.1145/362686.362692. Privacy-preserving variant: Cho, H. et al. Secure genome-wide association analysis using multiparty computation. Nature Biotechnology, 36(6), 547–551 (2018).
- [19] Fredrikson, M., Jha, S. & Ristenpart, T. Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures. In: Proc. 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1322–1333 (2015).
- [20] Shokri, R., Stronati, M., Song, C. & Shmatikov, V. Membership Inference Attacks Against Machine Learning Models. In: 2017 IEEE Symposium on Security and Privacy (SP), pp. 3–18. IEEE (2017).
- [21] Melis, L., Song, C., De Cristofaro, E. & Shmatikov, V. Exploiting Unintended Feature Leakage in Collaborative Learning. In: 2019 IEEE Symposium on Security and Privacy (SP), pp. 691–706. IEEE (2019).
- [22] Shapley, L.S. A Value for n-Person Games. In: Contributions to the Theory of Games, vol. 2, pp. 307–317. Princeton University Press (1953). Application to data valuation: Ghorbani, A. & Zou, J. Data Shapley: Equitable Valuation of Data for Machine Learning. In: Proc. 36th ICML, pp. 2242–2251 (2019).
© 2026 GenoBank.io. All rights reserved.