GA4GH Passport 2.0 Blockchain Architecture Proposal

Broad Institute Consortia Blockchain for Decentralized Genomic Data Governance

Authors: Daniel Uribe (GenoBank.io) | GenoBank Research Team

Date: December 3, 2025

Proposed L1 Blockchain: Cosmos SDK

Status: Draft for GA4GH DSWS Review


Abstract—The GA4GH Passport 2.0 roadmap identifies critical challenges in genomic data governance: trust federation across decentralized parties, portable data consent tokens, verifiable credentials beyond JWT, programmable policy enforcement, and support for multiple trust models. We propose a blockchain-based orchestration layer using Cosmos SDK that preserves GA4GH standards (backward compatible with AAI 1.2, forward compatible with 2.0), minimizes disruption (opt-in adoption, interoperates with existing OIDC flows), enables self-sovereign identity (researchers and institutions own credentials), implements consortia governance (Broad Institute + member institutions vote on policies), and provides immutable audit trails (all consent changes, access grants, policy updates on-chain). The system maps GA4GH concepts to blockchain primitives: Data Passports → Biosample NFTs (consent tokens with DUO codes), Researcher Passports → Researcher Passport NFTs (soulbound identity tokens), Data Visas → On-chain ControlledAccessGrants (DAC approvals), and Policy Engine → Smart Contracts (automated DUO code matching). Rather than replacing GA4GH infrastructure, blockchain acts as a trust anchor and policy registry that existing AAI implementations query. We present technical architecture, Cosmos SDK module design, governance model, migration path, and comparison with centralized approaches.

Executive Summary

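The GA4GH Passport 2.0 roadmap identifies critical challenges in genomic data governance: trust federation across decentralized parties, portable data consent tokens, verifiable credentials beyond JWT, programmable policy enforcement, and support for multiple trust models.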

We propose a blockchain-based orchestration layer using Cosmos SDK that:

  1. Preserves GA4GH standards (backward compatible with AAI 1.2, forward compatible with 2.0)
  2. Minimizes disruption (opt-in adoption, interoperates with existing OIDC flows)
  3. Enables self-sovereign identity (researchers and institutions own credentials)
  4. Implements consortia governance (Broad Institute + member institutions vote on policies)
  5. Provides immutable audit trails (all consent changes, access grants, policy updates on-chain)

Key Innovation: Rather than replacing GA4GH infrastructure, blockchain acts as a trust anchor and policy registry that existing AAI implementations can query.

Visual Architecture

The following diagrams illustrate how GA4GH Passport 2.0 concepts map to blockchain primitives. Each diagram shows:

  1. GA4GH standard structure (left, blue boxes)
  2. Blockchain implementation (center, green boxes)
  3. Actor interactions (right, orange/pink boxes)
  4. Smart contract functions (bottom, purple boxes)

Diagram 1: Data Passports → Biosample NFTs (Consent NFTs)

This diagram shows how GA4GH Data Passports (containing Data Visas such as ConsentedDataUseTerms, RequiredAgreements, and OversightBodies) map to Biosample NFTs owned by patients:

graph TB
    %% Simplified version for document
    subgraph "GA4GH Data Passport"
        DP[Data Passport JWT]
        DP_Visa1[ConsentedDataUseTerms: HMB]
        DP_Visa2[RequiredAgreements: GDPR]
    end
    subgraph "Blockchain: Biosample NFT"
        NFT[Biosample NFT #56789]
        NFT_Owner[Owner: Patient Wallet]
        NFT_Consent[DUO Codes: HMB, GRU]
        NFT_Agreements[Agreements: hash_GDPR]
        NFT_State[State: ACTIVE]
    end
    subgraph "Patient Actions"
        Patient[Patient]
        Revoke[Revoke Consent]
    end
    DP -.maps to.-> NFT
    DP_Visa1 -.maps to.-> NFT_Consent
    DP_Visa2 -.maps to.-> NFT_Agreements
    Patient -.owns.-> NFT_Owner
    Revoke -.updates.-> NFT_State
    classDef ga4gh fill:#E3F2FD,stroke:#2196F3,stroke-width:2px
    classDef blockchain fill:#E8F5E9,stroke:#4CAF50,stroke-width:2px
    classDef patient fill:#FFF3E0,stroke:#FF9800,stroke-width:2px
    class DP,DP_Visa1,DP_Visa2 ga4gh
    class NFT,NFT_Owner,NFT_Consent,NFT_Agreements,NFT_State blockchain
    class Patient,Revoke patient

Diagram 2: Researcher Passports → Researcher Passport NFTs (Soulbound Tokens)

This diagram illustrates Researcher Passport NFTs as Soulbound Tokens (non-transferable identity tokens):

graph TB
    %% Simplified version for document
    subgraph "GA4GH Researcher Passport"
        RP[Researcher Passport JWT]
        RP_Visa1[AffiliationAndRole: Harvard PI]
        RP_Visa2[ResearcherStatus: bona_fide]
    end
    subgraph "Blockchain: Researcher Passport SBT"
        SBT[Researcher Passport SBT #42]
        SBT_Owner[Owner: Researcher Wallet NON-TRANSFERABLE]
        SBT_Aff[Attestation: Harvard signature]
        SBT_Status[Attestation: ELIXIR signature]
    end
    subgraph "Institution Actions"
        Harvard[Harvard Medical School]
        Attest[Attest Affiliation]
    end
    RP -.maps to.-> SBT
    RP_Visa1 -.maps to.-> SBT_Aff
    RP_Visa2 -.maps to.-> SBT_Status
    Harvard -.issues.-> SBT_Aff
    classDef ga4gh fill:#E3F2FD,stroke:#2196F3,stroke-width:2px
    classDef blockchain fill:#E8F5E9,stroke:#4CAF50,stroke-width:2px
    classDef institution fill:#FCE4EC,stroke:#E91E63,stroke-width:2px
    class RP,RP_Visa1,RP_Visa2 ga4gh
    class SBT,SBT_Owner,SBT_Aff,SBT_Status blockchain
    class Harvard,Attest institution

Diagram 3: Data Visas → On-Chain Attestations (ControlledAccessGrants)

This diagram details how DAC approvals become on-chain ControlledAccessGrants:

graph TB
    %% Simplified version for document
    subgraph "GA4GH Data Visa"
        DV[ControlledAccessGrants Visa]
        DV_Source[source: UK_Biobank_DAC]
        DV_Expires[expires: 2026-06-30]
    end
    subgraph "Blockchain: ControlledAccessGrant"
        Grant[Grant: biosample/56789 + researcher/42]
        Grant_Issuer[Issued By: DAC multi-sig 3-of-5]
        Grant_State[State: ACTIVE]
    end
    subgraph "DAC Workflow"
        DAC[Data Access Committee]
        Review[Review DAR offline]
        Approve[3-of-5 members sign approval]
    end
    DV -.maps to.-> Grant
    DV_Source -.maps to.-> Grant_Issuer
    DAC --> Review
    Review --> Approve
    Approve -.issues.-> Grant
    classDef ga4gh fill:#E3F2FD,stroke:#2196F3,stroke-width:2px
    classDef blockchain fill:#E8F5E9,stroke:#4CAF50,stroke-width:2px
    classDef dac fill:#FFF9C4,stroke:#FBC02D,stroke-width:2px
    class DV,DV_Source,DV_Expires ga4gh
    class Grant,Grant_Issuer,Grant_State blockchain
    class DAC,Review,Approve dac
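The 3-of-5 approval step in Diagram 3 can be sketched in Python as follows. This is an illustrative in-memory model, not the proposed x/datavisa module: the class name `MultiSigGrant`, the member identifiers, and the `PENDING` state are assumptions introduced here (the diagram only names the `ACTIVE` state).

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Set


@dataclass
class MultiSigGrant:
    """Hypothetical model of a DAC-issued ControlledAccessGrant."""
    biosample_id: str
    researcher_id: str
    dac_members: FrozenSet[str]          # the five committee signers
    threshold: int = 3                   # 3-of-5, as in the diagram
    approvals: Set[str] = field(default_factory=set)

    def approve(self, member: str) -> None:
        # Only registered DAC members may sign; repeat signatures count once.
        if member not in self.dac_members:
            raise PermissionError(f"{member} is not a DAC member")
        self.approvals.add(member)

    @property
    def state(self) -> str:
        # The grant becomes ACTIVE once the signature threshold is met.
        return "ACTIVE" if len(self.approvals) >= self.threshold else "PENDING"


dac = frozenset({"dac1", "dac2", "dac3", "dac4", "dac5"})
grant = MultiSigGrant("biosample/56789", "researcher/42", dac)
grant.approve("dac1")
grant.approve("dac2")
grant.approve("dac2")          # duplicate signature, counted once
pending_state = grant.state    # two distinct signers, below threshold
grant.approve("dac4")
final_state = grant.state      # three distinct signers
```

Storing approvals as a set keeps a single member from reaching the threshold by signing repeatedly; an on-chain version would verify signatures rather than trust a string identifier.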

Diagram 4: Policy Engine → Smart Contracts (Automated Authorization)

This diagram shows the smart contract Policy Engine that implements GA4GH's requirement to "specify how Data Visas and Researcher Visas must align":

graph TB
    %% Simplified version for document
    subgraph "GA4GH Policy Engine"
        PE[Policy Requirement: Match Data + Researcher Visas]
    end
    subgraph "Blockchain: Smart Contract"
        SC[authorizeAccess biosampleId, researcherId]
        Step1[1. Check consent active]
        Step2[2. Verify agreements]
        Step3[3. Match DUO codes]
        Step4[4. Verify DAC grant]
        Step5[5. Check researcher status]
        Result[Return: authorized true/false]
    end
    subgraph "DUO Matcher"
        DUO[DUO Code Compatibility Rules]
        Rule1[HMB subsumes DS-disease]
        Rule2[GRU subsumes all uses]
    end
    PE -.implemented by.-> SC
    SC --> Step1
    Step1 --> Step2
    Step2 --> Step3
    Step3 -.queries.-> DUO
    Step3 --> Step4
    Step4 --> Step5
    Step5 --> Result
    classDef ga4gh fill:#E3F2FD,stroke:#2196F3,stroke-width:2px
    classDef blockchain fill:#E8F5E9,stroke:#4CAF50,stroke-width:2px
    classDef duo fill:#F3E5F5,stroke:#9C27B0,stroke-width:2px
    class PE ga4gh
    class SC,Step1,Step2,Step3,Step4,Step5,Result blockchain
    class DUO,Rule1,Rule2 duo
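The five checks in Diagram 4 can be sketched as a single Python function. This is a minimal illustration under stated assumptions: the `DUO_PERMITS` table encodes only the two rules named in the diagram (GRU subsumes all uses, HMB subsumes disease-specific use), and the `Biosample` and `Researcher` types and their field names are hypothetical. Real DUO matching covers more codes and modifiers.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

# Assumed table: consent code -> research purposes that consent permits.
DUO_PERMITS = {
    "GRU": {"GRU", "HMB", "DS"},   # GRU subsumes all uses
    "HMB": {"HMB", "DS"},          # HMB subsumes disease-specific use
    "DS": {"DS"},
}


@dataclass(frozen=True)
class Biosample:
    consent_active: bool
    duo_codes: FrozenSet[str]
    required_agreements: FrozenSet[str]


@dataclass(frozen=True)
class Researcher:
    bona_fide: bool
    accepted_agreements: FrozenSet[str]
    purpose: str                   # DUO code describing the intended use


def duo_match(consent_codes: Iterable[str], purpose: str) -> bool:
    # Step 3 helper: does any consent code permit the stated purpose?
    return any(purpose in DUO_PERMITS.get(code, set()) for code in consent_codes)


def authorize_access(sample: Biosample, researcher: Researcher,
                     has_dac_grant: bool) -> Tuple[bool, str]:
    if not sample.consent_active:                                  # 1
        return False, "consent not active"
    if not sample.required_agreements <= researcher.accepted_agreements:  # 2
        return False, "agreements missing"
    if not duo_match(sample.duo_codes, researcher.purpose):        # 3
        return False, "DUO mismatch"
    if not has_dac_grant:                                          # 4
        return False, "no DAC grant"
    if not researcher.bona_fide:                                   # 5
        return False, "researcher status not verified"
    return True, "authorized"


sample = Biosample(True, frozenset({"HMB"}), frozenset({"hash_GDPR"}))
pi = Researcher(True, frozenset({"hash_GDPR"}), purpose="DS")
broad_use = Researcher(True, frozenset({"hash_GDPR"}), purpose="GRU")
decision_pi = authorize_access(sample, pi, has_dac_grant=True)
decision_broad = authorize_access(sample, broad_use, has_dac_grant=True)
```

Returning a reason alongside the boolean is a design choice made here so that a denial can be recorded in an audit log; the diagram itself only specifies a true/false result.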

Mapping GA4GH Concepts to Blockchain Primitives

1. Data Passports → Biosample NFTs

GA4GH Definition (DSWS Lines 233-241):

Data Passports contain Data Visas where sub identifies data. Data Visa Types include ConsentedDataUseTerms, RequiredAgreements, OversightBodies, and ApplicableLawRegulationsPolicies.

Blockchain Implementation:

Biosample NFT (ERC-721 compatible, Cosmos SDK module)
├── Token ID: Unique biosample identifier
├── Owner: Patient wallet address (self-sovereign)
├── Metadata (on-chain):
│   ├── ConsentedDataUseTerms (DUO codes, expiration)
│   ├── RequiredAgreements (legal framework hashes)
│   ├── OversightBodies (DAC addresses, multi-sig wallets)
│   └── ApplicableLawRegulationsPolicies (GDPR, HIPAA flags)
└── State:
    ├── Active/Revoked (patient can revoke consent)
    ├── Access Log (approved researchers, timestamps)
    └── Policy Version (immutable history of consent changes)
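The State branch of the structure above (patient-controlled revocation plus an append-only policy history) can be modelled in a few lines of Python. This is a sketch under assumptions: the class `BiosampleNFT`, its method names, and the tuple layout of `history` are illustrative and are not taken from the proposed x/biosample module.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Tuple


@dataclass
class BiosampleNFT:
    token_id: str
    owner: str                                   # patient wallet address
    duo_codes: Tuple[str, ...]
    state: str = "ACTIVE"
    # (version, state, duo_codes) entries; never edited, only appended.
    history: List[Tuple[int, str, Tuple[str, ...]]] = field(default_factory=list)

    def __post_init__(self) -> None:
        self._record()                           # version 1: initial consent

    def _record(self) -> None:
        self.history.append((len(self.history) + 1, self.state, self.duo_codes))

    def _require_owner(self, caller: str) -> None:
        if caller != self.owner:
            raise PermissionError("only the patient wallet may change consent")

    def update_consent(self, caller: str, duo_codes: Iterable[str]) -> None:
        self._require_owner(caller)
        if self.state == "REVOKED":
            raise ValueError("consent already revoked")
        self.duo_codes = tuple(duo_codes)
        self._record()

    def revoke(self, caller: str) -> None:
        self._require_owner(caller)
        self.state = "REVOKED"
        self._record()


nft = BiosampleNFT("56789", "patient_wallet", ("HMB", "GRU"))
nft.update_consent("patient_wallet", ["HMB"])    # narrow the consent
nft.revoke("patient_wallet")
versions = [version for version, _, _ in nft.history]
```

Revocation changes the current state without deleting earlier versions, which is how a revocable token can coexist with an immutable history.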

Why NFTs:

GA4GH Compatibility:


2. Researcher Passports → Researcher Passport NFTs

GA4GH Definition (DSWS Lines 289-295):

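Researcher Visa Types, where sub identifies a researcher, include AffiliationAndRole, AcceptedTermsAndPolicies, ResearcherStatus, and ControlledAccessGrants.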

Blockchain Implementation:

Researcher Passport NFT (Soulbound Token - SBT)
├── Token ID: Researcher unique identifier (ORCID, institutional ID)
├── Owner: Researcher wallet address (self-sovereign)
├── Attestations (on-chain):
│   ├── AffiliationAndRole (institution signatures, role proofs)
│   ├── AcceptedTermsAndPolicies (signed policy hashes, timestamps)
│   ├── ResearcherStatus (bona fide researcher attestations)
│   └── ControlledAccessGrants (DAC approval NFTs, biosample access)
└── Linked Identities:
    ├── ORCID verification (off-chain oracle or zk-proof)
    ├── Institutional credentials (university signatures)
    └── Professional certifications (medical licenses, IRB approvals)
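The defining property of the structure above is that attestations accumulate on a token that cannot change hands. A minimal Python sketch, assuming a hypothetical `ResearcherPassportSBT` class whose `transfer` method always fails:

```python
from dataclasses import dataclass, field
from typing import Dict


class SoulboundError(Exception):
    """Raised on any attempt to move a soulbound token."""


@dataclass
class ResearcherPassportSBT:
    token_id: str
    owner: str                                        # researcher wallet
    # visa type -> attesting issuer (illustrative representation)
    attestations: Dict[str, str] = field(default_factory=dict)

    def attest(self, visa_type: str, issuer: str) -> None:
        self.attestations[visa_type] = issuer

    def transfer(self, new_owner: str) -> None:
        # Non-transferability is enforced by rejecting every transfer.
        raise SoulboundError("Researcher Passport SBTs are non-transferable")


sbt = ResearcherPassportSBT("42", "researcher_wallet")
sbt.attest("AffiliationAndRole", "Harvard Medical School")
sbt.attest("ResearcherStatus", "ELIXIR")
try:
    sbt.transfer("other_wallet")
    transferred = True
except SoulboundError:
    transferred = False
```

In a token contract the same effect is usually achieved by making the transfer entry points revert; the sketch does not verify issuer signatures, which an on-chain implementation would need to do.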

Why Soulbound Tokens (SBTs):


Cosmos SDK Architecture

The Broad Institute Consortia Blockchain runs on Cosmos SDK with custom modules for GA4GH-specific logic:

Cosmos SDK Blockchain: "GA4GH-Consortia-Chain"
├── Module: x/biosample (Biosample NFTs)
│   ├── Mint consent NFT
│   ├── Update consent terms
│   ├── Revoke consent
│   └── Query consent state
├── Module: x/researcher (Researcher Passport SBTs)
│   ├── Issue researcher identity
│   ├── Add attestations (affiliation, status)
│   ├── Accept policies (sign agreement hashes)
│   └── Query researcher credentials
├── Module: x/datavisa (Data Visa Registry)
│   ├── Issue ControlledAccessGrant (DAC approval)
│   ├── Match DUO codes (policy engine logic)
│   ├── Revoke access (DAC or patient)
│   └── Query active grants
├── Module: x/governance (Consortia Voting)
│   ├── Propose new policy templates
│   ├── Vote on RequiredAgreements (member institutions)
│   ├── Update DUO code matching rules
│   └── Onboard/offboard OversightBodies
└── Module: x/audit (Immutable Access Logs)
    ├── Record all data access events
    ├── Record consent modifications
    ├── Timestamped, non-repudiable
    └── GDPR-compliant audit trail export
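The tamper-evidence that x/audit relies on can be illustrated with a hash-chained log. This is a sketch of the general technique only: on a Cosmos SDK chain the linking comes from block hashes and validator consensus, not from application code like this. The function names and event fields below are assumptions.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64   # placeholder hash that precedes the first entry


def _digest(prev_hash: str, event: Dict) -> str:
    # Canonical JSON (sorted keys) so equal events always hash equally.
    payload = json.dumps(event, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()


def append_event(log: List[Dict], event: Dict) -> None:
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify(log: List[Dict]) -> bool:
    # Recompute every link; any edited entry breaks the chain from there on.
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


log: List[Dict] = []
append_event(log, {"type": "access", "biosample": "56789", "actor": "researcher/42"})
append_event(log, {"type": "consent_revoked", "biosample": "56789", "actor": "patient"})

tampered = json.loads(json.dumps(log))            # deep copy via JSON
tampered[0]["event"]["actor"] = "researcher/99"   # rewrite history
intact_ok = verify(log)
tampered_ok = verify(tampered)
```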

Why Cosmos SDK:

  1. Interoperability (IBC Protocol): Connect to other health data blockchains (hospital networks, national biobanks)
  2. Sovereignty: Each institution can run a validator node; Broad Institute coordinates but doesn't control
  3. Customizability: Custom modules for GA4GH-specific logic (DUO matching, OIDC bridges)
  4. Performance: Tendermint BFT consensus (now maintained as CometBFT) with deterministic finality; this proposal targets 1-2 second finality
  5. Upgradeability: On-chain governance for protocol upgrades

Governance Model: Broad Institute Consortia Blockchain

Consortia Membership Structure

Tier 1: Founding Validators (Broad Institute + Partner Institutions)

Examples: Broad Institute (MIT/Harvard), UK Biobank, All of Us Research Program, European Genome-phenome Archive (EGA), Japanese Biobank Network

Tier 2: Data Holders (Biobanks, Hospitals)

Tier 3: Researchers

Tier 4: Patients


Voting Mechanism: On-Chain Governance

Proposal Types:

  1. New RequiredAgreement Templates
    • Example: "GDPR Article 9 Consent Form v2.1"
    • Voting: Simple majority of Tier 1 validators
    • Effect: Data Holders can reference this agreement in Biosample NFTs
  2. OversightBody Onboarding
    • Example: "Add French National DAC as approved OversightBody"
    • Voting: 2/3 supermajority (high trust requirement)
    • Effect: French DAC can issue ControlledAccessGrants recognized by all Data Holders
  3. DUO Code Policy Updates
    • Example: "Allow HMB (health/medical/biomedical) data for COVID research"
    • Voting: Simple majority + mandatory comment period (14 days)
    • Effect: Smart contract policy engine updates DUO matching logic
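The three voting rules above can be expressed as a small tally function. This sketch makes several assumptions that the text does not settle: shares are measured against all Tier 1 validators (so abstentions count against a proposal), "simple majority" means strictly more than half, and "2/3 supermajority" means at least two thirds. The proposal-type keys are illustrative names.

```python
from fractions import Fraction

THRESHOLDS = {
    "required_agreement": Fraction(1, 2),   # simple majority
    "oversight_body": Fraction(2, 3),       # 2/3 supermajority
    "duo_policy": Fraction(1, 2),           # simple majority
}
COMMENT_PERIOD_DAYS = {"duo_policy": 14}    # mandatory comment period


def passes(proposal_type: str, yes_votes: int, total_validators: int,
           days_open: int = 0) -> bool:
    # A proposal cannot pass before its comment period has elapsed.
    if days_open < COMMENT_PERIOD_DAYS.get(proposal_type, 0):
        return False
    share = Fraction(yes_votes, total_validators)   # exact, no float rounding
    threshold = THRESHOLDS[proposal_type]
    if threshold == Fraction(1, 2):
        return share > threshold                    # strictly more than half
    return share >= threshold                       # at least two thirds


agreement_result = passes("required_agreement", 3, 5)
onboarding_result = passes("oversight_body", 3, 5)
early_duo_result = passes("duo_policy", 4, 5, days_open=10)
late_duo_result = passes("duo_policy", 4, 5, days_open=14)
```

Using `Fraction` avoids floating-point edge cases at exactly two thirds, for example 4 of 6 validators.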

Voting Weight:


Migration Path: Non-Disruptive Adoption

Phase 1: Pilot with Broad Institute Network (Months 1-6)

Participants:

Implementation:

  1. Deploy Cosmos SDK blockchain (testnet)
  2. Mint Biosample NFTs for pilot datasets
  3. Issue Researcher Passport NFTs to pilot cohort
  4. Deploy Biodata Router as middleware (does NOT replace existing OIDC)
  5. Run dual authorization: blockchain + traditional DAC (compare results)
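Step 5 amounts to running both authorization paths on the same requests and measuring where they disagree. A minimal sketch, assuming decisions are collected as dictionaries keyed by a hypothetical request ID (the proposal does not specify the comparison format):

```python
from typing import Dict, List, Tuple


def compare_decisions(chain: Dict[str, bool],
                      dac: Dict[str, bool]) -> Tuple[float, List[str]]:
    """Return (agreement rate, request IDs where the two paths disagree)."""
    shared = sorted(chain.keys() & dac.keys())   # only requests seen by both
    mismatches = [rid for rid in shared if chain[rid] != dac[rid]]
    if not shared:
        return 0.0, []
    return (len(shared) - len(mismatches)) / len(shared), mismatches


chain_decisions = {"r1": True, "r2": False, "r3": True, "r4": True}
dac_decisions = {"r1": True, "r2": False, "r3": False, "r4": True}
rate, disagreements = compare_decisions(chain_decisions, dac_decisions)
```

The list of disagreements matters more than the rate during a pilot, since each mismatch points to either a policy encoding error or an inconsistent manual decision.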

Success Criteria:


Phase 2: GA4GH Standards Integration (Months 7-12)

Deliverables:

  1. GA4GH DSWS RFC: "Blockchain-Based Passport Clearinghouse"
  2. AAI 2.0 Extension: JWT payload includes optional blockchain_proof field
  3. Reference Implementation: Open-source Biodata Router (Apache 2.0 license)
  4. Validator Onboarding Guide: How institutions run Cosmos nodes
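Deliverable 2 can be illustrated with a visa token that carries an extra `blockchain_proof` claim next to the standard `ga4gh_visa_v1` claim. The sketch below is hedged in three ways: the sub-fields of `blockchain_proof` (`chain_id`, `tx_hash`, `block_height`) are hypothetical, since the proposal does not define them; all URLs and timestamps are placeholders; and the token is signed with HS256 only because that needs nothing beyond the Python standard library, whereas production visas would be signed with an asymmetric key published by the issuer.

```python
import base64
import hashlib
import hmac
import json
from typing import Dict


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_visa(claims: Dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = (b64url(json.dumps(header).encode()) + "."
                     + b64url(json.dumps(claims).encode()))
    sig = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)


def read_visa(token: str, secret: bytes) -> Dict:
    head, body, sig = token.split(".")
    expected = hmac.new(secret, f"{head}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(sig)):
        raise ValueError("bad signature")
    return json.loads(b64url_decode(body))


claims = {
    "iss": "https://dac.example.org",
    "sub": "researcher/42",
    "iat": 1700000000,
    "exp": 1800000000,
    "ga4gh_visa_v1": {
        "type": "ControlledAccessGrants",
        "asserted": 1700000000,
        "value": "https://data.example.org/biosample/56789",
        "source": "https://dac.example.org/UK_Biobank_DAC",
        "by": "dac",
    },
    # Optional extension; every field name below is an assumption.
    "blockchain_proof": {
        "chain_id": "ga4gh-consortia-chain",
        "tx_hash": "ab" * 32,
        "block_height": 123456,
    },
}

secret = b"demo-secret-not-for-production"
token = sign_visa(claims, secret)
decoded = read_visa(token, secret)
# A verifier unaware of the extension can simply ignore the extra claim.
legacy_view = {k: v for k, v in decoded.items() if k != "blockchain_proof"}
```

Because the field is an additional top-level claim, existing OIDC consumers that do not recognise it can continue to process the visa unchanged, which is the opt-in behaviour the migration path describes.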

Adoption Strategy:


Phase 3: Full Production Deployment (Months 13-24)

Goals:

Governance Transfer:


Benefits Over Centralized Approaches

1. Trust Federation Without Central Authority

GA4GH Challenge (DSWS Lines 350-361):

How is trust established, both legally and technically? Does a trust anchor allow parties to join a federation as a whole? Must a new joining party join with each existing party?

Blockchain Solution:

Traditional Model Problems:

Blockchain Model:


2. Immutable Audit Trail (GDPR Article 30 Compliance)

GDPR Requirement:

Controllers shall maintain records of processing activities under their responsibility.

Traditional Model:

Blockchain Model:


3. Patient Consent Revocation (Real-Time, Global)

Traditional Model:

Blockchain Model:


Practical Implementation: GenoBank.io Deployed Contracts

The concepts described in this whitepaper are not theoretical—they build upon production smart contracts and infrastructure already deployed by GenoBank.io. This section demonstrates how GA4GH Passport 2.0 blockchain primitives map to real-world implementations that process thousands of genomic datasets.

Key Insight: As Uribe argues in the JBBA paper "Privacy Laws, Genomic Data, and Non-Fungible Tokens" (2020), the intersection of privacy regulations (GDPR Article 17, CCPA) with genomic data ownership creates a unique requirement for revocable consent tokens—something traditional database systems cannot enforce cryptographically.

1. BiosampleFileManager.sol (ERC-1155 Consent Management)

This production contract implements the Biosample NFT concept with dual-state consent management:

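// BiosampleFileManager.sol (deployed on Avalanche C-Chain), abridged excerpt.
// Lines marked "inferred" are not in the original excerpt; they are
// reconstructed from how revokeUserAndLab uses them and may differ on-chain.
pragma solidity ^0.8.0;                                          // inferred

import "@openzeppelin/contracts/token/ERC1155/ERC1155.sol";      // inferred

contract BiosampleFileManager is ERC1155 {
    enum Status { ACTIVE, REVOKED }  // GA4GH Data Passport states

    struct File {
        uint256 biosampleSerial;     // Unique identifier (maps to GA4GH 'sub')
        string name;                  // ConsentedDataUseTerms descriptor
        address owner;                // Patient wallet (self-sovereign)
        address laboratory;           // Approved researcher/lab
        Status status;                // ACTIVE or REVOKED (instant revocation)
        uint expiration;              // RequiredAgreements time-bound consent
    }

    mapping(address => File[]) public allFiles;          // inferred: files per owner
    mapping(uint256 => uint256) public allFilesIndexes;  // inferred: serial => index in allFiles[owner]
    mapping(address => File[]) public labFiles;          // inferred: files per lab
    mapping(address => mapping(uint256 => uint256)) public labFilesIndexes;  // inferred: lab => serial => index

    constructor() ERC1155("") {}     // inferred: metadata URI left empty

    // Caller authorization is not shown in this excerpt; a deployment must
    // restrict who may call this function (for example, the file owner).
    function revokeUserAndLab(address _fileOwner, address _permittee, uint _biosampleSerial) public {
        // Marks consent REVOKED on both the owner's and the lab's record,
        // supporting GDPR Article 17 requests.
        allFiles[_fileOwner][allFilesIndexes[_biosampleSerial]].status = Status.REVOKED;
        labFiles[_permittee][labFilesIndexes[_permittee][_biosampleSerial]].status = Status.REVOKED;
    }
}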

GA4GH Mapping:

2. ClaraJobNFT.sol (Bioinformatics Job Tokenization)

Following the Distributive Biobanking paradigm (Uribe, Open Access Government, 2020), where "biodata is processed then deleted—only derivatives returned to patient's vault," we tokenize each bioinformatics job:

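// ClaraJobNFT.sol (deployed on Sequentia L1), abridged excerpt.
// The pragma, import, and constructor are inferred so the excerpt compiles;
// the token name and symbol are placeholders.
pragma solidity ^0.8.0;

import "@openzeppelin/contracts/token/ERC721/extensions/ERC721URIStorage.sol";

contract ClaraJobNFT is ERC721URIStorage {
    struct JobData {
        string biosampleSerial;    // Link to parent BioNFT consent
        string vcfPath;            // S3 path to derivative (VCF output)
        string pipeline;           // Scientific reproducibility (e.g., "deepvariant")
        string referenceGenome;    // Reference assembly (e.g., "hg38")
        bytes32 vcfHash;           // Keccak256 hash for integrity verification
        uint256 createdAt;         // Immutable timestamp for audit trail
    }

    event JobMinted(
        uint256 indexed tokenId,
        string biosampleSerial,     // Links derivative to parent consent
        address indexed owner,
        string vcfPath,
        bytes32 vcfHash             // Scientific reproducibility proof
    );

    constructor() ERC721("ClaraJobNFT", "CLARA") {}  // inferred; placeholder name and symbol
}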
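The integrity check that `vcfHash` enables can be sketched off-chain: recompute the digest of the derivative file and compare it with the value recorded at mint time. One substitution is made loudly here: the contract stores a Keccak-256 hash, which the Python standard library does not provide (`hashlib.sha3_256` is the NIST SHA-3 variant and yields different digests), so this sketch uses SHA-256 as a stand-in. The `JobRecord` type and the sample VCF bytes are illustrative.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    biosample_serial: str     # link to the parent consent token
    pipeline: str
    reference_genome: str
    vcf_hash: str             # hex digest recorded when the job was minted


def digest(data: bytes) -> str:
    # Stand-in for the on-chain Keccak-256; see the note above.
    return hashlib.sha256(data).hexdigest()


def verify_derivative(record: JobRecord, vcf_bytes: bytes) -> bool:
    # True only if the file is byte-for-byte what the job produced.
    return digest(vcf_bytes) == record.vcf_hash


vcf = b"##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\n1\t12345\t.\tA\tG\n"
record = JobRecord("56789", "deepvariant", "hg38", digest(vcf))
original_ok = verify_derivative(record, vcf)
modified_ok = verify_derivative(record, vcf.replace(b"\tG\n", b"\tT\n"))
```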

Key Features:

3. BioFS: NFT-Gated Decentralized File Access

As detailed in the BioFS PIL Technical Architecture, genomic data access combines NFT consent verification with programmable licensing:

// BioFS Access Flow (pseudocode representing deployed system)
// `owner` is passed in explicitly because the excerpt does not show how the
// patient wallet is resolved from a biosample ID.
function accessGenomicData(owner, biosampleId, researcherWallet):
    // Step 0: Locate the consent record (same indexing as revokeUserAndLab)
    file = BiosampleFileManager.allFiles[owner][BiosampleFileManager.allFilesIndexes[biosampleId]]

    // Step 1: Verify active consent on-chain
    if file.status != ACTIVE:
        revert("Consent revoked - GDPR Article 17")

    // Step 2: Verify researcher has valid ControlledAccessGrant
    if !BiosampleFileManager.biosampleShared[biosampleId][researcherWallet]:
        revert("No DAC approval - request access via governance")

    // Step 3: Check time-bound expiration
    if block.timestamp > file.expiration:
        revert("Consent expired - renewal required")

    // Step 4: Log access event (immutable audit trail)
    emit DataAccessed(biosampleId, researcherWallet, block.timestamp)

    // Step 5: Return presigned URL to S3 data (data never on-chain)
    return generatePresignedUrl(biosampleId)

Privacy-Preserving Architecture:

4. Deployed Infrastructure (Production Statistics)

| Component | Network | Contract Address / Protocol | Status |
|---|---|---|---|
| BiosampleFileManager | Avalanche C-Chain | 0x5021F7438ea502b0c346cB59F8E92B749Ecd74B5 | Production |
| ClaraJobNFT V2 | Sequentia L1 | 0x8B0a66A840364c7D5956E72f9c6fB363E0341AEF | Production |
| Story Protocol IP Registry | Story Mainnet | VCF Collection: 0xC91940118822D247B46d1eBA6B7Ed2A16F3aDC36 | Production |
| BioFS Node | P2P Network | NFS + QUIC protocol | Production |

Practical Validation

These contracts have processed:


Comparison Matrix

| Feature | AAI 1.2 (Current) | AAI 2.0 (DSWS Roadmap) | Blockchain Implementation |
|---|---|---|---|
| Trust Model | Centralized (single Passport Clearinghouse) | Federated (multiple trust anchors) | Decentralized (validator consensus) |
| Passport Issuer | Single IdP per federation | Multiple issuers per federation | Any validator can attest (subject to consortia approval) |
| Data Passports | Not supported | Planned (TBD specification) | Implemented (Biosample NFTs) |
| Patient Consent Revocation | Manual (call biobank, 3-6 weeks) | Manual (same as 1.2) | Instant (on-chain state update, 1-2 sec finality) |
| Audit Trail | Local logs (each Data Holder) | Local logs (no change from 1.2) | Global blockchain (immutable, cryptographically verifiable) |
| Policy Engine | Undefined (manual DAC review) | Computable policies (to be specified) | Smart contracts (deterministic, automated) |
| VC Support | JWT only | JWT + VC (JSON-LD) | JWT + VC + NFT (multiple representations) |
| Governance | Top-down (GA4GH publishes specs) | Federated (TBD mechanism) | On-chain voting (1 validator = 1 vote) |
| Scalability | Limited (O(N²) trust relationships) | Improved (trust anchors reduce complexity) | High (O(N) trust to validators, any number of data repositories) |
| GDPR Compliance | Difficult (no audit proof) | Difficult (same as 1.2) | Strong (blockchain audit + off-chain data deletion) |
| Backward Compatibility | N/A | Yes (AAI 1.2 continues to work) | Yes (Biodata Router bridges OIDC ↔ blockchain) |
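The Scalability row contrasts O(N²) with O(N) trust relationships. The arithmetic behind that contrast is simple: if every pair of N parties needs its own bilateral agreement, there are N(N-1)/2 agreements, whereas if each party only establishes trust with the shared validator set, there are N. A short sketch (function names are illustrative):

```python
def bilateral_agreements(n: int) -> int:
    # One agreement per unordered pair of parties: n choose 2.
    return n * (n - 1) // 2


def anchored_relationships(n: int) -> int:
    # One relationship per party, to the shared trust anchor.
    return n


for parties in (5, 50, 500):
    print(parties, bilateral_agreements(parties), anchored_relationships(parties))
```

For 50 parties this is 1,225 pairwise agreements against 50 anchored relationships, which is the gap the table summarises.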

This proposal builds upon prior academic work in blockchain-based genomic data governance:

Foundational Publications

  1. Uribe, D. (2020). "Privacy Laws, Genomic Data, and Non-Fungible Tokens." Journal of The British Blockchain Association, 3(2). DOI: 10.31585/jbba-3-2-(1)2020
    This paper establishes the theoretical framework for using NFTs as consent tokens in genomic data sharing. Key contributions include: (1) Analysis of GDPR Article 17 "right to erasure" requirements and how blockchain immutability can coexist with consent revocation through state-based NFT design; (2) Legal analysis of data ownership vs. data access rights; (3) Proposed architecture for "consent NFTs" that became the foundation for Biosample NFTs.
  2. Uribe, D. (2020). "Distributive Biobanking Models: The Future of Sample Storage and Analysis." Open Access Government.
    This article articulates the "distributive biobanking" paradigm where: (1) Biodata processing occurs at computation nodes, not centralized repositories; (2) Raw data is deleted after processing—only derivatives returned to patient's control; (3) Patients maintain sovereignty over both raw data and processed results. This paradigm directly informs the ClaraJobNFT design where FASTQ → VCF processing results in tokenized job outputs linked to parent consent NFTs.
  3. GenoBank.io (2025). "BioFS PIL Technical Architecture: NFT-Gated Access to Genomic Data with Programmable Licensing." GenoBank Technical Blog
    Technical specification of the BioFS protocol combining: (1) NFT-based consent verification; (2) Programmable IP Licensing (PIL) for granular access control; (3) NFS/QUIC protocol for high-throughput genomic data transfer; (4) Integration with Story Protocol for IP asset registration.

Key Concepts Derived from This Research

| Concept | Source | Application in This Proposal |
|---|---|---|
| Consent NFTs with ACTIVE/REVOKED states | JBBA 2020 | Biosample NFT Status enum |
| Data Minimization via distributed processing | Open Access Government 2020 | ClaraJobNFT derivative tokenization |
| NFT-gated file access | BioFS PIL Architecture | BioFS integration with Biodata Router |
| Patient self-sovereignty over biodata | All three sources | Patient wallet ownership of Biosample NFTs |
| Immutable audit trails with consent revocability | JBBA 2020 | On-chain access logs + state-based revocation |

Conclusion: Invitation to GA4GH DSWS

We propose Broad Institute Consortia Blockchain as a reference implementation of GA4GH Passport 2.0's vision:

Next Steps:

  1. Present to GA4GH DSWS (January 2026 meeting)
  2. RFC Submission: "Blockchain-Based Passport Clearinghouse"
  3. Pilot Launch: Broad + 5 partner institutions (Q2 2026)
  4. Open-Source Release: Biodata Router + Cosmos SDK modules (Apache 2.0)
  5. Standards Integration: AAI 2.0 extension for blockchain proofs

Call to Action:

We invite GA4GH member institutions to:

Contact:


This document represents a technical vision for discussion. GenoBank.io is committed to open collaboration with GA4GH and the global genomics community. All code will be open-source (Apache 2.0), and we welcome feedback, contributions, and pilot partnerships.