GA4GH Passport 2.0 Blockchain Architecture Proposal
Broad Institute Consortia Blockchain for Decentralized Genomic Data Governance
Authors: Daniel Uribe (GenoBank.io) | GenoBank Research Team
Date: December 3, 2025
Proposed L1 Blockchain: Cosmos SDK
Status: Draft for GA4GH DSWS Review
Executive Summary
The GA4GH Passport 2.0 roadmap identifies critical challenges in genomic data governance:
- Trust Federation: How to establish trust across decentralized parties
- Data Passports: Representing data consent and policies as portable tokens
- Verifiable Credentials: Moving beyond centralized JWT issuance
- Policy Enforcement: Matching Data Visas with Researcher Visas programmatically
- Decentralized Governance: Supporting multiple trust models and oversight bodies
We propose a blockchain-based orchestration layer using Cosmos SDK that:
- Preserves GA4GH standards (backward compatible with AAI 1.2, forward compatible with 2.0)
- Minimizes disruption (opt-in adoption, interoperates with existing OIDC flows)
- Enables self-sovereign identity (researchers and institutions own credentials)
- Implements consortia governance (Broad Institute + member institutions vote on policies)
- Provides immutable audit trails (all consent changes, access grants, policy updates on-chain)
Visual Architecture
The following diagrams illustrate how GA4GH Passport 2.0 concepts map to blockchain primitives. Each diagram shows:
- GA4GH standard structure (left, blue boxes)
- Blockchain implementation (center, green boxes)
- Actor interactions (right, orange/pink boxes)
- Smart contract functions (bottom, purple boxes)
Diagram 1: Data Passports → Biosample NFTs (Consent NFTs)
This diagram shows how GA4GH Data Passports (containing Data Visas like ConsentedDataUseTerms, RequiredAgreements, OversightBodies) map to Biosample NFTs owned by patients. Key features:
- Patient ownership: Patients control NFT via private key (self-sovereign)
- On-chain consent: DUO codes, legal agreements, DAC addresses stored in NFT metadata
- Revocability: Patient can update NFT state from ACTIVE → REVOKED (instant, global effect)
- Immutable audit log: All consent changes and data access events recorded on blockchain
Diagram 2: Researcher Passports → Researcher Passport NFTs (Soulbound Tokens)
This diagram illustrates Researcher Passport NFTs as Soulbound Tokens (non-transferable identity). It shows:
- Researcher ownership: Researcher controls SBT (cannot transfer to another person)
- Multi-party attestations: Institutions sign AffiliationAndRole, ELIXIR AAI signs ResearcherStatus
- Policy acceptance: Researcher signs agreement hashes (AcceptedTermsAndPolicies)
- Grant accumulation: ControlledAccessGrants from multiple DACs attached to passport
Diagram 3: Data Visas → On-Chain Attestations (ControlledAccessGrants)
This diagram details how DAC approvals become on-chain ControlledAccessGrants. It demonstrates:
- DAC workflow: Off-chain review (ethics, purpose compatibility) → on-chain issuance
- Grant lifecycle: PENDING → ACTIVE → EXPIRED/REVOKED states
- Multi-sig issuance: 3-of-5 DAC members must sign to issue grant
- Dual revocation: DAC can revoke for policy violation, patient revocation cascades to all grants
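The grant lifecycle and 3-of-5 issuance rule above can be sketched as a small state machine. This is a minimal Python model, not the on-chain implementation: the class name, the member identifiers and the choice to apply expiry lazily at query time are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class GrantState(Enum):
    PENDING = "PENDING"
    ACTIVE = "ACTIVE"
    EXPIRED = "EXPIRED"
    REVOKED = "REVOKED"


@dataclass
class ControlledAccessGrant:
    dac_members: frozenset            # wallets allowed to sign (hypothetical IDs)
    threshold: int = 3                # 3-of-5, per the text above
    expires_at: int = 0               # Unix time at which the grant lapses
    state: GrantState = GrantState.PENDING
    signatures: set = field(default_factory=set)

    def sign(self, member: str) -> None:
        """Record one DAC signature; activate once the threshold is met."""
        if self.state is not GrantState.PENDING:
            raise ValueError(f"cannot sign a grant in state {self.state.name}")
        if member not in self.dac_members:
            raise PermissionError(f"{member} is not a DAC member")
        self.signatures.add(member)   # a set ignores duplicate signatures
        if len(self.signatures) >= self.threshold:
            self.state = GrantState.ACTIVE

    def revoke(self) -> None:
        """DAC or patient revocation; REVOKED is terminal."""
        if self.state in (GrantState.PENDING, GrantState.ACTIVE):
            self.state = GrantState.REVOKED

    def is_usable(self, now: int) -> bool:
        """Apply expiry lazily, then report whether access may proceed."""
        if self.state is GrantState.ACTIVE and now >= self.expires_at:
            self.state = GrantState.EXPIRED
        return self.state is GrantState.ACTIVE


dac = frozenset({"dac1", "dac2", "dac3", "dac4", "dac5"})
grant = ControlledAccessGrant(dac_members=dac, expires_at=1_000)
for member in ("dac1", "dac2", "dac2"):   # the repeated signature is not counted twice
    grant.sign(member)
state_after_two = grant.state             # still PENDING
grant.sign("dac3")
state_after_three = grant.state           # ACTIVE once three distinct members signed
```

Collecting signatures in a set is what keeps one member from satisfying the threshold alone; an on-chain version would verify cryptographic signatures rather than compare identifiers.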
Diagram 4: Policy Engine → Smart Contracts (Automated Authorization)
This diagram shows the smart contract Policy Engine that implements GA4GH's requirement to "specify how Data Visas and Researcher Visas must align." Features:
- 5-step verification: Consent active → Agreements accepted → DUO compatible → DAC grant exists → Researcher status valid
- DUO code matching: Smart contract logic for subsumption rules (e.g., HMB ⊃ DS-CANCER)
- Auto-approval: A projected 80% of requests are auto-approved when all conditions are met (no DAC meeting needed)
- Governance updates: On-chain voting to update DUO compatibility rules
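The five checks listed above can be sketched as an ordered series of tests that stops at the first failure. This is a simplified Python model, not contract code: the subsumption table, the field names and the "DS-CANCER" label are illustrative assumptions, and real DUO matching also has to handle modifiers.

```python
from dataclasses import dataclass, replace

# Hypothetical subsumption table: each consent code permits itself plus the
# narrower research purposes listed. Real DUO matching also involves modifiers.
DUO_PERMITS = {
    "GRU": {"GRU", "HMB", "DS-CANCER"},
    "HMB": {"HMB", "DS-CANCER"},
    "DS-CANCER": {"DS-CANCER"},
}


@dataclass(frozen=True)
class AccessRequest:
    consent_active: bool
    required_agreements: frozenset
    accepted_agreements: frozenset
    consent_duo: str             # code attached to the data
    research_purpose: str        # code describing the intended use
    has_dac_grant: bool
    researcher_status_valid: bool


def authorize(req: AccessRequest) -> tuple[bool, str]:
    """Run the five checks in order and report the first one that fails."""
    checks = [
        ("consent active", req.consent_active),
        ("agreements accepted", req.required_agreements <= req.accepted_agreements),
        ("DUO compatible", req.research_purpose in DUO_PERMITS.get(req.consent_duo, set())),
        ("DAC grant exists", req.has_dac_grant),
        ("researcher status valid", req.researcher_status_valid),
    ]
    for name, passed in checks:
        if not passed:
            return False, f"failed: {name}"
    return True, "approved"


request = AccessRequest(
    consent_active=True,
    required_agreements=frozenset({"dua-v1"}),
    accepted_agreements=frozenset({"dua-v1", "gdpr-v2"}),
    consent_duo="HMB",
    research_purpose="DS-CANCER",
    has_dac_grant=True,
    researcher_status_valid=True,
)
approved = authorize(request)
# Swapping the codes fails: cancer-only consent does not cover general HMB use.
mismatch = authorize(replace(request, consent_duo="DS-CANCER", research_purpose="HMB"))
```

Returning the name of the failed check, rather than a bare boolean, gives a DAC or auditor a reason to record alongside each denial.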
Mapping GA4GH Concepts to Blockchain Primitives
1. Data Passports → Biosample NFTs (Consent NFTs)
GA4GH Definition (DSWS Lines 233-241):
Data Passports contain Data Visas where 'sub' identifies data. Data Visa Types include:
- ConsentedDataUseTerms
- ApplicableLawRegulationsPolicies
- RequiredAgreements
- OversightBodies
- ApprovedUsers
Blockchain Implementation:
Biosample NFT (ERC-721 compatible, Cosmos SDK module)
├── Token ID: Unique biosample identifier
├── Owner: Patient wallet address (self-sovereign)
├── Metadata (on-chain):
│   ├── ConsentedDataUseTerms (DUO codes, expiration)
│   ├── RequiredAgreements (legal framework hashes)
│   ├── OversightBodies (DAC addresses, multi-sig wallets)
│   └── ApplicableLawRegulationsPolicies (GDPR, HIPAA flags)
└── State:
    ├── Active/Revoked (patient can revoke consent)
    ├── Access Log (approved researchers, timestamps)
    └── Policy Version (immutable history of consent changes)
Why NFTs:
- Non-transferable by design (consent cannot be sold, though the token can still reference CosmWasm logic)
- Revocable (patient updates on-chain state to withdraw consent)
- Auditable (all consent modifications recorded in blockchain history)
- Programmable (smart contracts enforce ConsentedDataUseTerms automatically)
GA4GH Compatibility:
- 'sub' field in Data Passport maps to NFT Token ID
- Visa claims map to NFT metadata fields
- JWT Data Passports can be derived from on-chain NFT state (blockchain as source of truth)
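To make the last point concrete, the sketch below derives unsigned Data Passport claims from a snapshot of NFT state. Since the Data Passport claim layout is not yet specified, every field name other than 'sub' and the visa type names is a placeholder chosen for illustration, and signing the result as a JWT is left out.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BiosampleNFT:
    token_id: str
    status: str                  # "ACTIVE" or "REVOKED"
    duo_codes: tuple
    agreement_hashes: tuple
    oversight_bodies: tuple


def derive_data_passport(nft: BiosampleNFT) -> dict:
    """Build unsigned Data Passport claims from a snapshot of NFT state."""
    if nft.status != "ACTIVE":
        visas = []               # revoked consent yields a passport with no visas
    else:
        visas = [
            {"type": "ConsentedDataUseTerms", "value": list(nft.duo_codes)},
            {"type": "RequiredAgreements", "value": list(nft.agreement_hashes)},
            {"type": "OversightBodies", "value": list(nft.oversight_bodies)},
        ]
    return {"sub": nft.token_id, "visas": visas}


nft = BiosampleNFT(
    token_id="biosample-001",                # hypothetical identifier
    status="ACTIVE",
    duo_codes=("HMB",),
    agreement_hashes=("hash-of-agreement",), # placeholder, not a real digest
    oversight_bodies=("dac-wallet-1",),
)
active_passport = derive_data_passport(nft)
revoked_passport = derive_data_passport(replace(nft, status="REVOKED"))
```

Because the claims are recomputed from chain state on each request, a passport issued after revocation carries no visas, which is one way to read "blockchain as source of truth".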
2. Researcher Passports → Researcher Passport NFTs
GA4GH Definition (DSWS Lines 289-295):
Researcher Visa Types where 'sub' identifies a researcher:
- AffiliationAndRole
- AcceptedTermsAndPolicies
- ResearcherStatus
- ControlledAccessGrants
- LinkedIdentities
Blockchain Implementation:
Researcher Passport NFT (Soulbound Token - SBT)
├── Token ID: Researcher unique identifier (ORCID, institutional ID)
├── Owner: Researcher wallet address (self-sovereign)
├── Attestations (on-chain):
│   ├── AffiliationAndRole (institution signatures, role proofs)
│   ├── AcceptedTermsAndPolicies (signed policy hashes, timestamps)
│   ├── ResearcherStatus (bona fide researcher attestations)
│   └── ControlledAccessGrants (DAC approval NFTs, biosample access)
└── Linked Identities:
    ├── ORCID verification (off-chain oracle or zk-proof)
    ├── Institutional credentials (university signatures)
    └── Professional certifications (medical licenses, IRB approvals)
Why Soulbound Tokens (SBTs):
- Non-transferable (researcher identity cannot be sold or delegated)
- Composable (multiple institutions can attest to same researcher)
- Revocable by issuers (institution can revoke affiliation if researcher leaves)
- Privacy-preserving (zk-proofs can prove claims without revealing all data)
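Three of these properties (non-transferable, composable, revocable by issuers) can be modeled in a few lines. The Python sketch below is illustrative only: the class, method names and identifiers are assumptions, and the zk-proof property is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ResearcherPassport:
    token_id: str
    owner: str
    # (visa_type, issuer) -> claim value, so several issuers can attest the same type
    attestations: dict = field(default_factory=dict)

    def transfer(self, new_owner: str) -> None:
        """Soulbound: every transfer attempt is rejected."""
        raise PermissionError("soulbound: passport cannot be transferred")

    def attest(self, issuer: str, visa_type: str, value: str) -> None:
        self.attestations[(visa_type, issuer)] = value

    def revoke(self, caller: str, visa_type: str, issuer: str) -> None:
        """Only the issuer that made an attestation may remove it."""
        if caller != issuer:
            raise PermissionError("only the issuer may revoke its attestation")
        self.attestations.pop((visa_type, issuer), None)

    def issuers_for(self, visa_type: str) -> set:
        return {iss for (vt, iss) in self.attestations if vt == visa_type}


passport = ResearcherPassport(token_id="researcher-001", owner="wallet-r1")
passport.attest("institution-a", "AffiliationAndRole", "faculty")
passport.attest("institution-b", "AffiliationAndRole", "visiting-scientist")
issuers_before = passport.issuers_for("AffiliationAndRole")
# The researcher leaves institution-a, which revokes its own attestation.
passport.revoke("institution-a", "AffiliationAndRole", "institution-a")
issuers_after = passport.issuers_for("AffiliationAndRole")
```

Keying attestations by issuer keeps each institution's claim independent, so one revocation leaves the others untouched.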
Cosmos SDK Architecture
The Broad Institute Consortia Blockchain runs on Cosmos SDK with custom modules for GA4GH-specific logic:
Cosmos SDK Blockchain: "GA4GH-Consortia-Chain"
├── Module: x/biosample (Biosample NFTs)
│   ├── Mint consent NFT
│   ├── Update consent terms
│   ├── Revoke consent
│   └── Query consent state
├── Module: x/researcher (Researcher Passport SBTs)
│   ├── Issue researcher identity
│   ├── Add attestations (affiliation, status)
│   ├── Accept policies (sign agreement hashes)
│   └── Query researcher credentials
├── Module: x/datavisa (Data Visa Registry)
│   ├── Issue ControlledAccessGrant (DAC approval)
│   ├── Match DUO codes (policy engine logic)
│   ├── Revoke access (DAC or patient)
│   └── Query active grants
├── Module: x/governance (Consortia Voting)
│   ├── Propose new policy templates
│   ├── Vote on RequiredAgreements (member institutions)
│   ├── Update DUO code matching rules
│   └── Onboard/offboard OversightBodies
└── Module: x/audit (Immutable Access Logs)
    ├── Record all data access events
    ├── Record consent modifications
    ├── Timestamped, non-repudiable
    └── GDPR-compliant audit trail export
Why Cosmos SDK:
- Interoperability (IBC Protocol): Connect to other health data blockchains (hospital networks, national biobanks)
- Sovereignty: Each institution can run a validator node; Broad Institute coordinates but doesn't control
- Customizability: Custom modules for GA4GH-specific logic (DUO matching, OIDC bridges)
- Performance: Tendermint BFT consensus (now maintained as CometBFT; 1-2 second finality)
- Upgradeability: On-chain governance for protocol upgrades
Governance Model: Broad Institute Consortia Blockchain
Consortia Membership Structure
Tier 1: Founding Validators (Broad Institute + Partner Institutions)
- Run blockchain validator nodes
- Propose and vote on protocol upgrades
- Issue Researcher Passport attestations for their affiliates
- Operate as OversightBodies (DAC functions)
Examples: Broad Institute (MIT/Harvard), UK Biobank, All of Us Research Program, European Genome-phenome Archive (EGA), Japanese Biobank Network
Tier 2: Data Holders (Biobanks, Hospitals)
- Do not run validators (lighter infrastructure)
- Mint Biosample NFTs for patient data under their custody
- Query blockchain for access authorization
- Subject to consortia policies (voted by Tier 1)
Tier 3: Researchers
- Own Researcher Passport NFTs
- Request ControlledAccessGrants from DACs
- Self-sovereign (own private keys, control credentials)
- Cannot vote on protocol governance (passive participants)
Tier 4: Patients
- Own Biosample NFTs (consent tokens)
- Revoke consent unilaterally (no vote required)
- Privacy-preserving (NFT metadata hashed, not plaintext genomic data)
- Can delegate consent management to trusted guardian/institution
Voting Mechanism: On-Chain Governance
Proposal Types:
- New RequiredAgreement Templates
  - Example: "GDPR Article 9 Consent Form v2.1"
  - Voting: Simple majority of Tier 1 validators
  - Effect: Data Holders can reference this agreement in Biosample NFTs
- OversightBody Onboarding
  - Example: "Add French National DAC as approved OversightBody"
  - Voting: 2/3 supermajority (high trust requirement)
  - Effect: French DAC can issue ControlledAccessGrants recognized by all Data Holders
- DUO Code Policy Updates
  - Example: "Allow HMB (health/medical/biomedical) data for COVID research"
  - Voting: Simple majority + mandatory comment period (14 days)
  - Effect: Smart contract policy engine updates DUO matching logic
Voting Weight:
- 1 validator = 1 vote (not stake-weighted, prevents plutocracy)
- Quorum requirement: 51% of validators must vote
- Proposal deposit: 10,000 CONSORTIUM tokens (prevents spam; refunded if the proposal passes)
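The voting rules can be sketched as a tally function. The text does not say whether thresholds apply to votes cast or to all validators, nor how ties are handled, so this Python sketch makes three labeled assumptions: thresholds are measured against votes cast, a simple majority means strictly more than half, and a supermajority means at least two thirds. The proposal type names are also placeholders.

```python
from fractions import Fraction

QUORUM = Fraction(51, 100)                            # share of validators that must vote
SUPERMAJORITY_TYPES = {"oversight_body_onboarding"}   # hypothetical type label


def tally(proposal_type: str, votes: dict, validator_count: int) -> str:
    """Tally one-validator-one-vote ballots; `votes` maps validator -> bool."""
    if validator_count <= 0:
        raise ValueError("validator_count must be positive")
    cast = len(votes)
    if Fraction(cast, validator_count) < QUORUM:
        return "no quorum"
    yes = sum(1 for approve in votes.values() if approve)
    share = Fraction(yes, cast)                       # assumption: measured against votes cast
    if proposal_type in SUPERMAJORITY_TYPES:
        passed = share >= Fraction(2, 3)              # assumption: at least two thirds
    else:
        passed = share > Fraction(1, 2)               # assumption: strictly more than half
    return "passed" if passed else "rejected"


# 20 validators; 12 vote, 8 of them in favour (exactly two thirds of votes cast).
eight_of_twelve = {f"validator-{i}": i < 8 for i in range(12)}
seven_of_twelve = {f"validator-{i}": i < 7 for i in range(12)}
ten_voters = {f"validator-{i}": True for i in range(10)}

onboarding_result = tally("oversight_body_onboarding", eight_of_twelve, 20)
onboarding_short = tally("oversight_body_onboarding", seven_of_twelve, 20)
template_result = tally("agreement_template", seven_of_twelve, 20)
quorum_result = tally("agreement_template", ten_voters, 20)
```

Exact fractions avoid floating-point edge cases at the two-thirds boundary, which matters when a single vote decides the outcome.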
Migration Path: Non-Disruptive Adoption
Phase 1: Pilot with Broad Institute Network (Months 1-6)
Participants:
- Broad Institute (validator)
- 3-5 partner institutions (validators)
- 100-500 pilot researchers
- 10,000-50,000 pilot biosamples
Implementation:
- Deploy Cosmos SDK blockchain (testnet)
- Mint Biosample NFTs for pilot datasets
- Issue Researcher Passport NFTs to pilot cohort
- Deploy Biodata Router as middleware (does NOT replace existing OIDC)
- Run dual authorization: blockchain + traditional DAC (compare results)
Success Criteria:
- 99.9% agreement between blockchain and traditional authorization
- <500ms latency for blockchain queries
- Zero patient consent violations
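The first success criterion can be checked by pairing each pilot request's two decisions. This is a minimal sketch that assumes each decision has been reduced to a boolean (approve or deny); the sample numbers are made up for illustration, not pilot results.

```python
def agreement_rate(blockchain: list, traditional: list) -> float:
    """Fraction of requests where both authorization paths reached the same decision."""
    if not blockchain or len(blockchain) != len(traditional):
        raise ValueError("need two non-empty decision lists of equal length")
    matches = sum(b == t for b, t in zip(blockchain, traditional))
    return matches / len(blockchain)


TARGET = 0.999

# Illustrative data only: 2,000 paired decisions with a single disagreement.
traditional_decisions = [True] * 2000
blockchain_decisions = [True] * 1999 + [False]

rate = agreement_rate(blockchain_decisions, traditional_decisions)
meets_target = rate >= TARGET
```

At a 99.9% target, a pilot of 2,000 requests tolerates at most two disagreements, so each mismatch is worth reviewing individually rather than only in aggregate.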
Phase 2: GA4GH Standards Integration (Months 7-12)
Deliverables:
- GA4GH DSWS RFC: "Blockchain-Based Passport Clearinghouse"
- AAI 2.0 Extension: JWT payload includes optional blockchain_proof field
- Reference Implementation: Open-source Biodata Router (Apache 2.0 license)
- Validator Onboarding Guide: How institutions run Cosmos nodes
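One possible shape for the optional blockchain_proof field is sketched below. Only the field name comes from the deliverable above; its contents (chain_id, height, tx_hash) are assumptions chosen for illustration, and token signing is omitted.

```python
def with_blockchain_proof(payload: dict, chain_id: str, height: int, tx_hash: str) -> dict:
    """Return a copy of the JWT payload with the optional proof attached."""
    extended = dict(payload)          # shallow copy; the original payload is untouched
    extended["blockchain_proof"] = {  # hypothetical contents, not a specified format
        "chain_id": chain_id,
        "height": height,
        "tx_hash": tx_hash,
    }
    return extended


def read_blockchain_proof(payload: dict):
    """Return the proof if present, else None, so older tokens still parse."""
    return payload.get("blockchain_proof")


legacy_payload = {
    "iss": "https://issuer.example.org",   # placeholder issuer
    "sub": "researcher-001",
    "iat": 1764720000,
    "exp": 1764723600,
}
extended_payload = with_blockchain_proof(
    legacy_payload, chain_id="ga4gh-consortia-testnet", height=12345, tx_hash="ABC123"
)
legacy_proof = read_blockchain_proof(legacy_payload)
extended_proof = read_blockchain_proof(extended_payload)
```

Keeping the field optional means a verifier that does not know about it can ignore it, which is what lets existing AAI 1.2 flows keep working.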
Adoption Strategy:
- Existing AAI 1.2 implementations continue to work (no breaking changes)
- Data Holders opt-in to blockchain verification (gradual rollout)
- Researchers unaware of backend change (UX unchanged)
Phase 3: Full Production Deployment (Months 13-24)
Goals:
- 20+ validator institutions (global coverage)
- 1M+ Biosample NFTs (representing major biobanks)
- 100K+ Researcher Passport NFTs
- 10+ integrated Data Repositories (UK Biobank, All of Us, EGA, etc.)
Governance Transfer:
- Broad Institute reduces validator weight to 1/20 (no special privileges)
- Consortia votes on all policy changes (decentralized)
- GA4GH DSWS oversees standards evolution (blockchain as one implementation)
Benefits Over Centralized Approaches
1. Trust Federation Without Central Authority
GA4GH Challenge (DSWS Lines 350-361):
How is trust established, both legally and technically? Does a trust anchor allow parties to join a federation as a whole? Must a new joining party join with each existing party?
Blockchain Solution:
- No bilateral trust agreements: New validator joins by staking tokens + consortia vote
- Transitive trust: If Harvard trusts the blockchain, and the blockchain trusts the French DAC, then Harvard implicitly trusts the French DAC (via governance)
- Trust anchor = Genesis block: Immutable record of founding members, policies
Traditional Model Problems:
- UK Biobank must trust ELIXIR AAI
- ELIXIR AAI must trust dbGaP Passport Issuer
- dbGaP must trust NIH eRA Commons
- N institutions = O(N²) trust relationships
Blockchain Model:
- All institutions trust the blockchain (single root of trust)
- N institutions = O(N) trust relationships (to validators)
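The difference in growth is easy to quantify. Assuming the traditional model needs one agreement per pair of institutions, N institutions need N(N-1)/2 agreements, against N relationships when each institution trusts only the chain:

```python
def bilateral_links(n: int) -> int:
    """Pairwise trust agreements among n institutions: n choose 2."""
    return n * (n - 1) // 2


def chain_links(n: int) -> int:
    """One trust relationship per institution, to the shared chain."""
    return n


comparison = {n: (bilateral_links(n), chain_links(n)) for n in (5, 20, 100)}
# comparison == {5: (10, 5), 20: (190, 20), 100: (4950, 100)}
```

At the 20-validator scale targeted for Phase 3, that is 190 pairwise agreements against 20.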
2. Immutable Audit Trail (GDPR Article 30 Compliance)
GDPR Requirement:
Controllers shall maintain records of processing activities under their responsibility.
Traditional Model:
- Each Data Holder keeps local access logs
- Researcher claims "I never accessed that data" → no way to prove
- Regulator audits → must trust Data Holder's logs
Blockchain Model:
- Every data access recorded on-chain (tx hash: 0xABC123...)
- Researcher cannot deny (cryptographic signature)
- Data Holder cannot fake logs (validator consensus)
- Regulator queries blockchain directly (no intermediary trust)
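The tamper-evidence property behind "cannot fake logs" can be illustrated with a hash-chained log, the same linking idea that blocks use to reference their predecessors. This is a deliberately reduced Python sketch: it does not model validator consensus or researcher signatures, and the event fields are placeholders.

```python
import hashlib
import json

GENESIS = "0" * 64


def _entry_hash(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with a canonical event encoding."""
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log: list, event: dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _entry_hash(prev_hash, event)})


def verify(log: list) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_event(log, {"biosample": "biosample-001", "researcher": "researcher-001", "action": "access"})
append_event(log, {"biosample": "biosample-001", "researcher": "researcher-002", "action": "access"})
intact = verify(log)

# Rewriting history after the fact is detectable.
log[0]["event"]["researcher"] = "someone-else"
tamper_detected = not verify(log)
```

A single party holding such a log could still recompute every hash after an edit; replication across validators is what prevents that on a blockchain.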
3. Patient Consent Revocation (Real-Time, Global)
Traditional Model:
- Patient calls biobank: "I want to withdraw consent"
- Biobank updates internal database
- Researcher may have stale cached copy of data
- Problem: No guarantee researcher stops using data
Blockchain Model:
- Patient updates Biosample NFT state: active → revoked
- Transaction finalizes in 1-2 seconds (Tendermint consensus)
- All Data Repositories see revocation instantly (blockchain query)
- Smart contract authorizeAccess() returns false for all future queries
- GDPR Article 17 support: revocation blocks further access on-chain; erasure itself still requires deleting the off-chain data
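The revocation flow can be sketched as a registry in which a patient's revocation flips the consent state and cascades to every grant on that biosample. This Python model is an assumption-laden illustration: the class and method names are invented, and authorize_access only mirrors the behavior described for authorizeAccess().

```python
class ConsentRegistry:
    """Toy in-memory stand-in for on-chain consent and grant state."""

    def __init__(self):
        self.owner = {}     # biosample_id -> patient wallet
        self.status = {}    # biosample_id -> "ACTIVE" or "REVOKED"
        self.grants = {}    # (biosample_id, researcher) -> True while active

    def mint(self, biosample_id: str, patient: str) -> None:
        self.owner[biosample_id] = patient
        self.status[biosample_id] = "ACTIVE"

    def grant(self, biosample_id: str, researcher: str) -> None:
        if self.status.get(biosample_id) != "ACTIVE":
            raise ValueError("cannot grant access without active consent")
        self.grants[(biosample_id, researcher)] = True

    def revoke_consent(self, caller: str, biosample_id: str) -> None:
        """Only the owning patient may revoke; all grants on the sample lapse."""
        if caller != self.owner.get(biosample_id):
            raise PermissionError("only the patient may revoke consent")
        self.status[biosample_id] = "REVOKED"
        for key in self.grants:
            if key[0] == biosample_id:
                self.grants[key] = False

    def authorize_access(self, biosample_id: str, researcher: str) -> bool:
        return (
            self.status.get(biosample_id) == "ACTIVE"
            and self.grants.get((biosample_id, researcher), False)
        )


registry = ConsentRegistry()
registry.mint("biosample-001", "patient-wallet")
registry.grant("biosample-001", "researcher-001")
registry.grant("biosample-001", "researcher-002")
before = registry.authorize_access("biosample-001", "researcher-001")
registry.revoke_consent("patient-wallet", "biosample-001")
after = [registry.authorize_access("biosample-001", r) for r in ("researcher-001", "researcher-002")]
```

The check covers future queries only; copies a researcher already holds are outside what the registry can reach.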
Practical Implementation: GenoBank.io Deployed Contracts
The concepts described in this whitepaper are not theoretical—they build upon production smart contracts and infrastructure already deployed by GenoBank.io. This section demonstrates how GA4GH Passport 2.0 blockchain primitives map to real-world implementations that process thousands of genomic datasets.
1. BiosampleFileManager.sol (ERC-1155 Consent Management)
This production contract implements the Biosample NFT concept with dual-state consent management:
// BiosampleFileManager.sol (deployed on Avalanche C-Chain)
// Excerpt: the storage mappings referenced below (allFiles, labFiles,
// allFilesIndexes, labFilesIndexes) are not shown.
contract BiosampleFileManager is ERC1155 {
    enum Status { ACTIVE, REVOKED } // GA4GH Data Passport states

    struct File {
        uint256 biosampleSerial; // Unique identifier (maps to GA4GH 'sub')
        string name;             // ConsentedDataUseTerms descriptor
        address owner;           // Patient wallet (self-sovereign)
        address laboratory;      // Approved researcher/lab
        Status status;           // ACTIVE or REVOKED (instant revocation)
        uint expiration;         // RequiredAgreements time-bound consent
    }

    function revokeUserAndLab(address _fileOwner, address _permittee, uint _biosampleSerial) public {
        // Instant, global consent revocation - GDPR Article 17 compliance
        // Marks both the patient's record and the lab's record as REVOKED.
        allFiles[_fileOwner][allFilesIndexes[_biosampleSerial]].status = Status.REVOKED;
        labFiles[_permittee][labFilesIndexes[_permittee][_biosampleSerial]].status = Status.REVOKED;
    }
}
GA4GH Mapping:
- biosampleSerial → Data Passport 'sub' field
- Status.ACTIVE/REVOKED → ConsentedDataUseTerms validity
- expiration → RequiredAgreements time-bound consent
- laboratory → OversightBodies approved access
2. ClaraJobNFT.sol (Bioinformatics Job Tokenization)
Following the Distributive Biobanking paradigm (Uribe, Open Access Government, 2020), where "biodata is processed then deleted—only derivatives returned to patient's vault," we tokenize each bioinformatics job:
// ClaraJobNFT.sol (deployed on Sequentia L1)
contract ClaraJobNFT is ERC721URIStorage {
    struct JobData {
        string biosampleSerial;  // Link to parent BioNFT consent
        string vcfPath;          // S3 path to derivative (VCF output)
        string pipeline;         // Scientific reproducibility (e.g., "deepvariant")
        string referenceGenome;  // Reference assembly (e.g., "hg38")
        bytes32 vcfHash;         // Keccak256 hash for integrity verification
        uint256 createdAt;       // Immutable timestamp for audit trail
    }

    event JobMinted(
        uint256 indexed tokenId,
        string biosampleSerial,  // Links derivative to parent consent
        address indexed owner,
        string vcfPath,
        bytes32 vcfHash          // Scientific reproducibility proof
    );
}
Key Features:
- Data Minimization: Raw FASTQ files processed then deleted; only derivative (VCF) returned to patient's vault
- Scientific Reproducibility: Pipeline version, reference genome, and output hash recorded on-chain
- Consent Hierarchy: ClaraJobNFT links to parent BiosampleNFT via biosampleSerial
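The integrity check implied by the recorded output hash can be sketched as follows. The contract stores a Keccak-256 digest; Python's standard library does not provide that function (hashlib.sha3_256 is the finalized SHA-3, which pads differently and yields different digests), so SHA-256 stands in here purely to show the compare-against-recorded-hash step. The sample bytes are made up.

```python
import hashlib
import io


def stream_digest(stream, chunk_size: int = 1 << 20) -> str:
    """Hash a file-like object in chunks so large VCFs need not fit in memory."""
    hasher = hashlib.sha256()        # stand-in for the contract's Keccak-256
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


def matches_recorded(stream, recorded_hash: str) -> bool:
    """True if the derivative's digest equals the hash recorded at mint time."""
    return stream_digest(stream) == recorded_hash


vcf_bytes = b"##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n"
recorded = stream_digest(io.BytesIO(vcf_bytes))          # value stored when the job is minted

unchanged_ok = matches_recorded(io.BytesIO(vcf_bytes), recorded)
altered_ok = matches_recorded(io.BytesIO(vcf_bytes + b"chr1\t100\t.\tA\tG\n"), recorded)
```

A production verifier would need a Keccak-256 implementation so that its digest matches the bytes32 value stored on-chain.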
3. BioFS: NFT-Gated Decentralized File Access
As detailed in the BioFS PIL Technical Architecture, genomic data access combines NFT consent verification with programmable licensing:
// BioFS Access Flow (pseudocode representing deployed system)
// 'owner' is the patient wallet that owns the biosample; its lookup is not shown.
function accessGenomicData(biosampleId, researcherWallet):
    // Step 1: Verify active consent on-chain
    consentStatus = BiosampleFileManager.allFiles[owner][biosampleId].status
    if consentStatus != ACTIVE:
        revert("Consent revoked - GDPR Article 17")

    // Step 2: Verify researcher has valid ControlledAccessGrant
    if !BiosampleFileManager.biosampleShared[biosampleId][researcherWallet]:
        revert("No DAC approval - request access via governance")

    // Step 3: Check time-bound expiration
    if block.timestamp > BiosampleFileManager.allFiles[owner][biosampleId].expiration:
        revert("Consent expired - renewal required")

    // Step 4: Log access event (immutable audit trail)
    emit DataAccessed(biosampleId, researcherWallet, block.timestamp)

    // Step 5: Return presigned URL to S3 data (data never on-chain)
    return generatePresignedUrl(biosampleId)
Privacy-Preserving Architecture:
- Biodata never stored on-chain: Only consent metadata, hashes, and access logs
- S3 with NFT-gated access: AWS presigned URLs generated only after on-chain verification
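- Revocation propagation: Revoking consent immediately blocks issuance of new presigned URLs; a URL that has already been issued stays valid until it expires, so expiry windows should be kept short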
4. Deployed Infrastructure (Production Statistics)
| Component | Network | Contract Address | Status |
|---|---|---|---|
| BiosampleFileManager | Avalanche C-Chain | 0x5021F7438ea502b0c346cB59F8E92B749Ecd74B5 | Production |
| ClaraJobNFT V2 | Sequentia L1 | 0x8B0a66A840364c7D5956E72f9c6fB363E0341AEF | Production |
| Story Protocol IP Registry | Story Mainnet | VCF Collection: 0xC91940118822D247B46d1eBA6B7Ed2A16F3aDC36 | Production |
| BioFS Node | P2P Network | NFS + QUIC protocol | Production |
Practical Validation
These contracts have processed:
- 6,967+ SOMOS ancestry analysis jobs
- 1,200+ VCF annotation tokenizations
- 500+ Clara DeepVariant GPU processing jobs
- Zero consent violations in 3+ years of operation
Comparison Matrix
| Feature | AAI 1.2 (Current) | AAI 2.0 (DSWS Roadmap) | Blockchain Implementation |
|---|---|---|---|
| Trust Model | Centralized (single Passport Clearinghouse) | Federated (multiple trust anchors) | Decentralized (validator consensus) |
| Passport Issuer | Single IdP per federation | Multiple issuers per federation | Any validator can attest (subject to consortia approval) |
| Data Passports | Not supported | Planned (TBD specification) | Implemented (Biosample NFTs) |
| Patient Consent Revocation | Manual (call biobank, 3-6 weeks) | Manual (same as 1.2) | Instant (on-chain state update, 1-2 sec finality) |
| Audit Trail | Local logs (each Data Holder) | Local logs (no change from 1.2) | Global blockchain (immutable, cryptographically verifiable) |
| Policy Engine | Undefined (manual DAC review) | Computable policies (to be specified) | Smart contracts (deterministic, automated) |
| VC Support | JWT only | JWT + VC (JSON-LD) | JWT + VC + NFT (multiple representations) |
| Governance | Top-down (GA4GH publishes specs) | Federated (TBD mechanism) | On-chain voting (1 validator = 1 vote) |
| Scalability | Limited (O(N²) trust relationships) | Improved (trust anchors reduce complexity) | High (O(N) trust to validators, no fixed limit on data repositories) |
| GDPR Compliance | Difficult (no audit proof) | Difficult (same as 1.2) | Strong (blockchain audit + off-chain data deletion) |
| Backward Compatibility | N/A | Yes (AAI 1.2 continues to work) | Yes (Biodata Router bridges OIDC ↔ blockchain) |
Related Work and Foundational Research
This proposal builds upon prior academic work in blockchain-based genomic data governance:
Foundational Publications
- Uribe, D. (2020). "Privacy Laws, Genomic Data, and Non-Fungible Tokens." Journal of The British Blockchain Association, 3(2). DOI: 10.31585/jbba-3-2-(1)2020
This paper establishes the theoretical framework for using NFTs as consent tokens in genomic data sharing. Key contributions include: (1) Analysis of GDPR Article 17 "right to erasure" requirements and how blockchain immutability can coexist with consent revocation through state-based NFT design; (2) Legal analysis of data ownership vs. data access rights; (3) Proposed architecture for "consent NFTs" that became the foundation for Biosample NFTs.
- Uribe, D. (2020). "Distributive Biobanking Models: The Future of Sample Storage and Analysis." Open Access Government.
This article articulates the "distributive biobanking" paradigm where: (1) Biodata processing occurs at computation nodes, not centralized repositories; (2) Raw data is deleted after processing—only derivatives returned to patient's control; (3) Patients maintain sovereignty over both raw data and processed results. This paradigm directly informs the ClaraJobNFT design where FASTQ → VCF processing results in tokenized job outputs linked to parent consent NFTs.
- GenoBank.io (2025). "BioFS PIL Technical Architecture: NFT-Gated Access to Genomic Data with Programmable Licensing." GenoBank Technical Blog
Technical specification of the BioFS protocol combining: (1) NFT-based consent verification; (2) Programmable IP Licensing (PIL) for granular access control; (3) NFS/QUIC protocol for high-throughput genomic data transfer; (4) Integration with Story Protocol for IP asset registration.
Key Concepts Derived from This Research
| Concept | Source | Application in This Proposal |
|---|---|---|
| Consent NFTs with ACTIVE/REVOKED states | JBBA 2020 | Biosample NFT Status enum |
| Data Minimization via distributed processing | Open Access Government 2020 | ClaraJobNFT derivative tokenization |
| NFT-gated file access | BioFS PIL Architecture | BioFS integration with Biodata Router |
| Patient self-sovereignty over biodata | All three sources | Patient wallet ownership of Biosample NFTs |
| Immutable audit trails with consent revocability | JBBA 2020 | On-chain access logs + state-based revocation |
Conclusion: Invitation to GA4GH DSWS
We propose the Broad Institute Consortia Blockchain as a reference implementation of GA4GH Passport 2.0's vision:
- ✓ Data Passports → Biosample NFTs (on-chain consent)
- ✓ Researcher Passports → Soulbound tokens (self-sovereign credentials)
- ✓ Verifiable Credentials → NFT metadata + W3C VC compatibility
- ✓ Policy Engine → Smart contract authorization logic
- ✓ Trust Federation → Validator consensus (no central authority)
- ✓ Immutable Audit Trail → Blockchain transaction history
- ✓ Patient Sovereignty → Private key control of consent NFTs
Next Steps:
- Present to GA4GH DSWS (January 2026 meeting)
- RFC Submission: "Blockchain-Based Passport Clearinghouse"
- Pilot Launch: Broad + 5 partner institutions (Q2 2026)
- Open-Source Release: Biodata Router + Cosmos SDK modules (Apache 2.0)
- Standards Integration: AAI 2.0 extension for blockchain proofs
Call to Action:
We invite GA4GH member institutions to:
- Join as founding validators (run Cosmos nodes)
- Contribute to smart contract development (Policy Engine logic)
- Pilot Biosample NFT minting (test datasets)
- Provide feedback on Biodata Router API design
Contact:
- Technical Lead: [email protected]
- GitHub: https://github.com/genobank-io/ga4gh-consortia-chain (to be created)
- GA4GH DSWS Discussion: [Slack #passport-blockchain]
This document represents a technical vision for discussion. GenoBank.io is committed to open collaboration with GA4GH and the global genomics community. All code will be open-source (Apache 2.0), and we welcome feedback, contributions, and pilot partnerships.