🧬 CORE INNOVATION

The NBDR Protocol transforms genomic data access from traditional cloud storage into a blockchain-native ecosystem. Using BioFS CLI v1.6.2 (user interface), BioFS Node (decentralized infrastructure), biofs-sdk (Python library), web3fuse (C FUSE filesystem), BioNFS (QUIC protocol), and BioNFS MCP Server (AI agents), every data operation verifies NFT ownership on-chain before granting access. Story Protocol IP hierarchy ensures complete provenance from physical biosample to ACMG analysis, with a programmable % royalty flow through the entire chain.

🧭 Which Tool Should I Use? (Quick Navigation)


The NBDR Protocol provides six specialized tools, each solving a unique problem. Pick the right tool for your use case:

👤

BioFS CLI v1.6.2

For: Researchers, Patients, Lab Techs

One tool for everything: Upload, download, tokenize, share, mount files. Your "Swiss Army knife" for genomic data.

npm install -g @genobank/biofs@latest

Commands: login, files, download, upload, tokenize, mount, access, job, node

💻

biofs-sdk (Python)

For: Data Scientists, Developers

Library for code: Embed BioCID access in Python scripts, Jupyter notebooks, reproducible pipelines.

pip install biofs-sdk

API: BioCIDClient.download(), .stream(), .verify_access(), .info()

🔌

web3fuse (C FUSE)

For: Legacy Tools, Existing Pipelines

POSIX filesystem: Mount BioCID as local directory. Tools like IGV, samtools work without modification.

web3fuse --mount /mnt/genomics

Use case: Existing bioinformatics pipelines expecting local files

⚡

BioNFS (Rust QUIC)

For: Real-Time License Control

Protocol infrastructure: Real-time license verification. Can revoke mid-stream. Chromosome-aware queries.

biofs download --region chr17

Superpower: Stop downloads immediately if consent revoked

🤖

BioNFS MCP Server

For: AI Agents (Claude Code, etc.)

AI-native access: Claude Code and AI agents analyze genomic data via Model Context Protocol with NFT-based permissions.

https://mcp.genobank.app

Use case: "Claude, analyze my VCF for BRCA1 variants"

🏗️

BioFS Node

For: Labs, Research Institutions, Infrastructure Providers

Run your own node: Decentralized infrastructure node with AI/blockchain cost sponsorship for your lab's users. Self-host data management, S3 replication, IPFS pinning.

biofs node start --config lab.yaml

Superpower: Pay once, sponsor AI/Story Protocol costs for all your lab's users

💡 Pro Tip: These tools work together, not in competition. BioFS CLI for daily work, BioFS Node for lab infrastructure, biofs-sdk for scripts, web3fuse for legacy tools, BioNFS for protocol infrastructure, MCP Server for AI. Each solves a unique problem.

📖 Detailed comparison: See complete toolkit guide for in-depth explanations, use cases, and real-world scenarios.

🧬 Complete Genomic Workflow (Biosample → ACMG Analysis)

End-to-End Pipeline with Story Protocol IP Hierarchy

graph TB
    subgraph Prerequisites["📋 Prerequisites"]
        Tube[("🧪 Physical Biosample<br/>Collection Tube")]
        Serial["📝 Biosample Serial:<br/>55052008714000"]
        BioNFT["🎫 Mint BioNFT<br/>ERC1155 Token<br/>Owner: 0x5f5a60..."]
        Tube --> Serial
        Serial --> BioNFT
    end
    subgraph Consent["✍️ Consent & Sequencing"]
        ConsentNFT["🔐 Consent to Lab<br/>PIL License Token<br/>Programmable % Royalty"]
        Sequencing["🧬 WGS Sequencing<br/>Illumina NovaSeq"]
        FASTQ["📄 FASTQ Files<br/>R1.fastq.gz + R2.fastq.gz<br/>~100GB"]
        BioNFT --> ConsentNFT
        ConsentNFT --> Sequencing
        Sequencing --> FASTQ
    end
    subgraph FASTQToken["🎨 Tokenize FASTQ (Root IP)"]
        FASTQMint["Mint FASTQ NFT<br/>Collection: 0xFASTQ<br/>Token: 55052008714000"]
        FASTQS3["Store in S3:<br/>test.vault.genoverse.io"]
        BioCIDFastq["BioCID Generated:<br/>biocid://v1/story/IPA/0xFASTQ/..."]
        FASTQ --> FASTQMint
        FASTQMint --> FASTQS3
        FASTQS3 --> BioCIDFastq
    end
    subgraph BioFSMount["💾 BioFS Mount (web3fuse)"]
        BioFSCmd["biofs mount<br/>--uri biocid://v1/story/IPA/0xFASTQ/55052008714000<br/>--mount /mnt/biosample"]
        NFTVerify["NFT Ownership Verified (ERC1155)<br/>Web3 Signature Authenticated"]
        MountReady["📁 FASTQ accessible via POSIX<br/>at /mnt/biosample/R1.fastq.gz"]
        BioCIDFastq --> BioFSCmd
        BioFSCmd --> NFTVerify
        NFTVerify --> MountReady
    end
    subgraph Clara["🖥️ Clara Parabricks GPU"]
        ClaraRun["clara.genobank.app<br/>DeepVariant Germline<br/>pbrun deepvariant"]
        VCF["📊 Output VCF:<br/>55052008714000.deepvariant.vcf<br/>~5GB"]
        MountReady --> ClaraRun
        ClaraRun --> VCF
    end
    subgraph VCFToken["🎨 Tokenize VCF (Child of FASTQ)"]
        VCFMint["Mint VCF NFT as CHILD<br/>Parent: FASTQ NFT<br/>Collection: 0xVCF"]
        VCFDerivative["Register as Derivative IP<br/>Inherits programmable % royalty terms"]
        BioCIDVCF["BioCID Generated:<br/>biocid://v1/story/IPA/0xVCF/..."]
        VCF --> VCFMint
        VCFMint --> VCFDerivative
        VCFDerivative --> BioCIDVCF
    end
    subgraph OpenCRAVAT["🔬 OpenCRAVAT Annotator"]
        OCRun["opencravat.genobank.app<br/>Annotate with 200+ databases<br/>ClinVar, dbSNP, COSMIC"]
        SQLite["📊 Output SQLite:<br/>55052008714000.sqlite<br/>~2GB annotated variants"]
        BioCIDVCF --> OCRun
        OCRun --> SQLite
    end
    subgraph SQLiteToken["🎨 Tokenize SQLite (Grandchild)"]
        SQLiteMint["Mint SQLite NFT as GRANDCHILD<br/>Parent: VCF NFT"]
        SQLiteDerivative["Register as Derivative IP<br/>Inherits full PIL chain"]
        BioCIDSQLite["BioCID Generated:<br/>biocid://v1/story/IPA/0xSQLite/..."]
        SQLite --> SQLiteMint
        SQLiteMint --> SQLiteDerivative
        SQLiteDerivative --> BioCIDSQLite
    end
    subgraph ClaudeAI["🤖 Claude AI ACMG Analysis"]
        ClaudeRun["claude.genobank.app<br/>Expert Curator<br/>ACMG/AMP Guidelines"]
        ACMGReport["📋 ACMG Variant Report<br/>Pathogenic, VUS, Benign"]
        BioCIDSQLite --> ClaudeRun
        ClaudeRun --> ACMGReport
    end
    subgraph ACMGToken["🎨 Tokenize ACMG (Great-Grandchild)"]
        ACMGMint["Mint ACMG NFT as GREAT-GRANDCHILD<br/>Complete provenance chain"]
        ACMGDerivative["Register as Derivative IP"]
        BioCIDACMG["BioCID Generated:<br/>biocid://v1/story/IPA/0xACMG/..."]
        FinalAccess["🌐 All files accessible via BioFS<br/>Cryptographically verified"]
        ACMGReport --> ACMGMint
        ACMGMint --> ACMGDerivative
        ACMGDerivative --> BioCIDACMG
        BioCIDACMG --> FinalAccess
    end
    style Prerequisites fill:#0d1117,stroke:#00ffcc,stroke-width:2px
    style Consent fill:#0d1117,stroke:#00ff41,stroke-width:2px
    style FASTQToken fill:#0d1117,stroke:#b084ff,stroke-width:2px
    style BioFSMount fill:#0d1117,stroke:#00a8ff,stroke-width:2px
    style Clara fill:#0d1117,stroke:#ff6b6b,stroke-width:2px
    style VCFToken fill:#0d1117,stroke:#b084ff,stroke-width:2px
    style OpenCRAVAT fill:#0d1117,stroke:#ffd93d,stroke-width:2px
    style SQLiteToken fill:#0d1117,stroke:#b084ff,stroke-width:2px
    style ClaudeAI fill:#0d1117,stroke:#6bcf7f,stroke-width:2px
    style ACMGToken fill:#0d1117,stroke:#b084ff,stroke-width:2px

💡 KEY INSIGHT: Complete Provenance Chain

Every analysis step creates a new IP asset as a derivative of its parent. This ensures:

  • Provenance Tracking: Trace ACMG report back to physical biosample tube
  • Royalty Flow: Programmable % of any commercial use flows up the chain to original owner
  • PIL Inheritance: License terms propagate through entire analysis tree
  • BioFS Access: All files mountable via POSIX interface using web3fuse
  • GDPR Compliance: Revoking consent deletes entire tree cryptographically
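
The provenance property above can be illustrated with a minimal sketch. It assumes a simple in-memory mapping from each IP asset to its parent; the asset names and the `PARENTS` table are hypothetical placeholders, not identifiers from Story Protocol or the BioFS tooling.

```python
from typing import Dict, List, Optional

# Hypothetical registry: each IP asset maps to its parent (None marks the root IP).
PARENTS: Dict[str, Optional[str]] = {
    "acmg_report": "sqlite_annotations",
    "sqlite_annotations": "vcf",
    "vcf": "fastq",
    "fastq": None,
}


def provenance_chain(asset: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the lineage from `asset` up to the root IP asset."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = asset
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        if current not in parents:
            raise KeyError(f"unknown asset {current!r}")
        seen.add(current)
        chain.append(current)
        current = parents[current]
    return chain


lineage = provenance_chain("acmg_report", PARENTS)
```

Walking the chain in this direction is also the order in which royalties and license terms would be looked up: child first, then each ancestor in turn.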

🤖 BioNFS MCP Server - AI Agent Access Layer

The first BioNFT-gated human biodata MCP server for compliant AI agents is now operational.

Complete Biosample Metamorphosis: From Physical Tube → AI Intelligence

Made with 🧬 by GenoBank.io - Decentralizing Human Biodata Ownership

🎯 The Final Stage: Biosample Metamorphosis Complete

The BioNFS MCP Server represents the culmination of the biosample journey: the moment when physical biological material transforms into AI-accessible intelligence while preserving patient ownership, consent, and programmable royalties.

🧪 Physical Biosample (Tube with DNA)
      ↓ Sequencing
🧬 FASTQ Files (Raw genomic data)
      ↓ Variant Calling (Clara Parabricks)
📊 VCF Files (Variants identified) → Tokenized as IP Asset
      ↓ Annotation (OpenCRAVAT)
🔬 SQLite Results (Clinical annotations) → Tokenized as Derivative IP
      ↓ Curation (Expert Curator)
📝 CSV Reports (ACMG pathogenic variants) → Tokenized as Derivative IP
      ↓ ✨ BioNFS MCP Server
🤖 AI INTELLIGENCE (Claude Code, AI agents analyze via NFT-gated access)
      ↓ Usage & Royalties
💰 Programmable Royalties flow back to patient through Story Protocol PIL

🏗️ Architecture Overview

The BioNFS MCP Server implements the Model Context Protocol (MCP), an open-source standard created by Anthropic for connecting AI agents to external data sources. This enables AI systems like Claude Code to access tokenized genomic data while respecting NFT ownership and PIL license terms.

┌──────────────────────────────────────────────────┐
│  🤖 AI Agent Layer (Claude Code, etc.)           │
└──────────────────────────────────────────────────┘
              ↓ MCP Protocol (JSON-RPC)
┌──────────────────────────────────────────────────┐
│  🔐 BioNFS MCP Server                            │
│  • Web3 EIP-191 Signature Authentication         │
│  • 4 MCP Tools (list, query, metadata, stream)   │
│  • 3 Resources (@bionfs:vcf, gene, ip)           │
│  • 3 Prompts (analyze_trio, find_pathogenic,     │
│    gene_report)                                  │
└──────────────────────────────────────────────────┘
              ↓ Verify NFT Ownership
┌──────────────────────────────────────────────────┐
│  ⛓️ Story Protocol Blockchain                    │
│  • IP Asset Registry (who owns what)             │
│  • License Terms (PIL commercial/derivatives)    │
│  • Royalty Tracking (% back to patient)          │
└──────────────────────────────────────────────────┘
              ↓ Fetch Data
┌──────────────────────────────────────────────────┐
│  ☁️ BioNFT-Gated Storage                         │
│  • AWS S3: test.vault.genoverse.io               │
│  • MongoDB: IP metadata, job results             │
│  • SQLite: OpenCRAVAT annotation databases       │
└──────────────────────────────────────────────────┘

🛠️ MCP Capabilities

1️⃣ MCP Tools (AI-Callable Functions)

  • list_accessible_files
    Returns all genomic files (VCF, SQLite, CSV) that the authenticated wallet has NFT access to
  • query_variants
    Query variants from VCF or SQLite by gene symbol, chromosome region, or clinical significance
  • get_file_metadata
    Get detailed metadata about IP assets: owner, license terms, derivative relationships, royalty settings
  • stream_file
    Generate presigned S3 URLs for secure file downloads (bypasses CloudFlare 100MB limits)
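
As a rough illustration of how an agent might invoke one of these tools, the sketch below builds an MCP `tools/call` request, which MCP carries as a JSON-RPC 2.0 message. The tool name comes from the list above; the argument names (`gene`, `significance`) are assumptions made for illustration and may not match the server's actual input schema.

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def build_tool_call(tool_name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Argument names are illustrative assumptions, not the server's documented schema.
payload = build_tool_call("query_variants", {"gene": "BRCA1", "significance": "pathogenic"})
```

In practice an MCP client library would build and send this message; the sketch only shows the shape of what travels over the wire.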

2️⃣ MCP Resources (@ Mention Support)

  • @bionfs:vcf://<job_id>/<region>
    Access specific VCF regions, e.g., @bionfs:vcf://job_123/chr17:41196312-41277500 (BRCA1)
  • @bionfs:gene://<gene_symbol>
    Search all accessible files for a gene, e.g., @bionfs:gene://BRCA2
  • @bionfs:ip://<ip_id>
    Get IP asset metadata from Story Protocol, e.g., @bionfs:ip://0x123...
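
To make the resource syntax concrete, here is a minimal sketch that parses the `@bionfs:vcf://<job_id>/<region>` form into its parts. The `VcfResource` type and `parse_vcf_resource` function are hypothetical helpers, and treating the coordinate range as optional is an assumption rather than documented server behaviour.

```python
import re
from typing import NamedTuple, Optional


class VcfResource(NamedTuple):
    job_id: str
    chrom: str
    start: Optional[int]
    end: Optional[int]


# <job_id>/<chrom> with an optional :<start>-<end> coordinate range
_VCF_URI = re.compile(
    r"^@bionfs:vcf://(?P<job>[^/]+)/(?P<chrom>[^:/]+)(?::(?P<start>\d+)-(?P<end>\d+))?$"
)


def parse_vcf_resource(uri: str) -> VcfResource:
    """Split a @bionfs:vcf resource mention into job id and genomic region."""
    match = _VCF_URI.match(uri)
    if match is None:
        raise ValueError(f"not a @bionfs:vcf resource: {uri!r}")
    start = int(match["start"]) if match["start"] else None
    end = int(match["end"]) if match["end"] else None
    if start is not None and end is not None and start > end:
        raise ValueError("region start must not exceed end")
    return VcfResource(match["job"], match["chrom"], start, end)


brca1 = parse_vcf_resource("@bionfs:vcf://job_123/chr17:41196312-41277500")
```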

3️⃣ MCP Prompts (Slash Commands)

  • /bionfs_analyze_trio
    Automated trio analysis workflow: Compare father/mother/child VCFs to identify de novo, inherited, and compound heterozygous variants
  • /bionfs_find_pathogenic
    Search for pathogenic and likely pathogenic variants across all accessible VCFs, filtered by ACMG criteria
  • /bionfs_gene_report
    Generate comprehensive gene report: All variants in a gene, clinical annotations, population frequencies, predictions

🔐 Authentication Flow

The MCP Server uses Web3 cryptographic signatures for passwordless, wallet-based authentication:

# 1. User signs a standard message with their wallet (client side, JavaScript-style pseudocode)
message = "I want to proceed"
signature = await wallet.signMessage(message)

# 2. AI agent sends the signature with MCP requests (shell)
curl -H "Authorization: Bearer ${signature}" \
  https://mcp.genobank.app/mcp

# 3. Server recovers the wallet address from the signature (Python, eth_account)
from eth_account import Account
from eth_account.messages import encode_defunct

wallet_address = Account.recover_message(
    encode_defunct(text="I want to proceed"),
    signature=signature
)

# 4. Server checks Story Protocol for NFT ownership
owned_nfts = story_protocol.get_ip_assets(wallet_address)

# 5. Only return data for NFTs the wallet owns
accessible_files = filter_by_ownership(all_files, owned_nfts)
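
Steps 4 and 5 can be sketched in plain Python. This is a minimal illustration that assumes each file record carries an `ip_id` field and that the set of owned IP asset IDs has already been fetched; the real `filter_by_ownership` may use different record shapes.

```python
from typing import Dict, Iterable, List, Set


def filter_by_ownership(
    all_files: Iterable[Dict[str, str]], owned_ip_ids: Set[str]
) -> List[Dict[str, str]]:
    """Keep only files whose IP asset ID is owned by the wallet.

    Hex IDs are compared case-insensitively, since checksummed and
    lowercase forms of the same address are otherwise unequal strings.
    """
    owned = {ip_id.lower() for ip_id in owned_ip_ids}
    return [f for f in all_files if f["ip_id"].lower() in owned]


# Illustrative records only; the IDs are placeholders.
files = [
    {"name": "genome.vcf", "ip_id": "0xABC"},
    {"name": "other.vcf", "ip_id": "0xdef"},
]
accessible = filter_by_ownership(files, {"0xabc"})
```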

💡 How Claude Code Connects

Developers and researchers can connect Claude Code (or any MCP-compatible AI agent) to their genomic data:

~/.config/claude/claude_desktop_config.json
{
  "mcpServers": {
    "bionfs": {
      "transport": "http",
      "url": "https://mcp.genobank.app/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_WEB3_SIGNATURE"
      }
    }
  }
}

Once connected, Claude Code can naturally interact with your genomic data:

User: "Analyze my VCF for BRCA1 variants"

Claude: *Uses query_variants tool*
      Found 3 variants in BRCA1:
      • chr17:41246481 C>T (p.Arg1751Ter) - PATHOGENIC
      • chr17:41256877 G>A (p.Ala1708Thr) - VUS
      • chr17:41258504 T>C (p.Val1687Ala) - BENIGN

User: "Run a trio analysis for @bionfs:vcf://trio_123"

Claude: *Uses /bionfs_analyze_trio prompt*
      Analyzing family trio (Father + Mother + Child)...
      • 12 de novo variants detected
      • 3 compound heterozygous in CFTR gene
      • Inheritance pattern suggests autosomal recessive

🎭 The Complete Metamorphosis: Patient Empowerment

This is where the biosample journey reaches its profound conclusion:

🧪 Patient donates biosample
      ↓ Lab sequences and tokenizes (VCF as IP Asset)
⛓️ Patient owns NFT = Ownership + License Terms (PIL)
      ↓ Researcher's AI agent requests access
🔐 BioNFS MCP Server verifies NFT ownership
      ↓ If researcher has license token:
🤖 AI analyzes data via MCP tools
      ↓ AI discovers novel insight
💰 Royalties flow back to patient (programmable %)
      ↓ Patient shares in value created
✨ Patient empowered - True data ownership realized

This completes the biosample metamorphosis: A physical tube of DNA transforms into AI-accessible intelligence, while the patient retains ownership, consent control, and receives royalties from every use. The BioNFS MCP Server is the bridge that makes this possible, connecting AI agents to human biodata while preserving the ethical, legal, and economic rights of the individual.

🌟 Why This Matters:
AI agents can now analyze human genomic data while respecting consent, ownership, and royalty obligations. The MCP Server enforces these rights at the protocol level, not as an afterthought. This is the future of ethical AI × genomics.

📊 Technical Specifications

🌐 Production URL: https://mcp.genobank.app
🔐 Authentication: Web3 EIP-191 Signatures
⚡ Protocol: MCP (Model Context Protocol)
🛠️ MCP Tools: 4 (list, query, metadata, stream)
📦 MCP Resources: 3 (@bionfs:vcf, gene, ip)
💬 MCP Prompts: 3 (trio, pathogenic, gene)
⛓️ Blockchain: Story Protocol (Aeneid Testnet)
☁️ Storage: AWS S3 (NFT-gated buckets)
🗄️ Database: MongoDB Atlas + SQLite
🚀 Runtime: Node.js v24.2.0 (TypeScript)
🔒 Service: systemd (bionfs-mcp.service)
🌐 Proxy: Nginx + Cloudflare SSL

🌐 BioCID Specification (Blockchain Content Identifier)

BioCID Format

biocid://v1/<chain>/<collection>/<tokenId>/<path>

Components:

  • scheme: biocid:// (protocol identifier)
  • version: v1 (BioCID format version)
  • chain: Blockchain network (avalanche, story, ethereum, sepolia)
  • collection: NFT collection contract address (0x...)
  • tokenId: NFT token ID (decimal or hex)
  • path: Optional file path within NFT storage

Examples

BioCID Examples
# FASTQ files for biosample 55052008714000
biocid://v1/story/IPA/0x19A615224D03487AaDdC43e4520F9D83923d9512/55052008714000/R1.fastq.gz

# VCF file on Story Protocol
biocid://v1/story/0x5021F7438ea502b0c346cB59F8E92B749Ecd74B5/41221040804049/genome.vcf

# SQLite annotation database
biocid://v1/story/IPA/0xB8d03f2E1C02e4cC5b5fe1613c575c01BDD12269/55052008714000/annotation.sqlite
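
The component breakdown can be exercised with a small parser. This is a sketch under stated assumptions, not the reference implementation: several examples above carry an `IPA` segment between the chain and the collection that the format line does not list, so the sketch treats that segment as optional. The `BioCID` type and `parse_biocid` function are hypothetical names.

```python
from typing import NamedTuple, Optional


class BioCID(NamedTuple):
    version: str
    chain: str
    collection: str
    token_id: str
    path: Optional[str]


def parse_biocid(uri: str) -> BioCID:
    """Split a biocid:// URI into version, chain, collection, tokenId and path."""
    scheme = "biocid://"
    if not uri.startswith(scheme):
        raise ValueError("BioCID must start with biocid://")
    parts = uri[len(scheme):].split("/")
    if len(parts) < 4:
        raise ValueError("BioCID needs version, chain, collection and tokenId")
    version, chain, rest = parts[0], parts[1], parts[2:]
    if version != "v1":
        raise ValueError(f"unsupported BioCID version: {version!r}")
    # Assumption: an "IPA" marker may precede the collection, as in the examples.
    if rest[0] == "IPA":
        rest = rest[1:]
    if len(rest) < 2 or not rest[0].startswith("0x") or not rest[1]:
        raise ValueError("BioCID needs a 0x collection address and a tokenId")
    collection, token_id = rest[0], rest[1]
    path = "/".join(rest[2:]) or None  # the path component is optional
    return BioCID(version, chain, collection, token_id, path)


fastq = parse_biocid(
    "biocid://v1/story/IPA/0x19A615224D03487AaDdC43e4520F9D83923d9512/55052008714000/R1.fastq.gz"
)
```

The token ID is kept as a string because the specification allows both decimal and hex forms.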

Supported Chains

Chain | Chain ID | RPC Endpoint | Usage
avalanche | 43114 | https://api.avax.network/ext/bc/C/rpc | Production biosamples
story | 1513 | https://rpc.odyssey.storyrpc.io | PIL licensing, IP registry
ethereum | 1 | https://eth.llamarpc.com | High-value IP assets
sequentias | 15132025 | http://52.90.163.112:8545 | Consent management

Resolution Process

sequenceDiagram
    participant App as 📱 Application
    participant BioFS as 🖥️ BioFS Client
    participant Parser as 🔍 BioCID Parser
    participant Chain as ⛓️ Blockchain RPC
    participant NFT as 🎫 NFT Contract
    participant Storage as 💾 S3/IPFS
    App->>BioFS: biocid://v1/story/IPA/0x19A6.../55052008714000
    BioFS->>Parser: Parse BioCID URI
    Parser-->>BioFS: {chain: story, collection: 0x19A6..., token: 55052008714000}
    BioFS->>Chain: Connect to Story RPC
    BioFS->>NFT: Query tokenURI(55052008714000)
    NFT-->>BioFS: ipfs://QmXYZ.../metadata.json
    BioFS->>Storage: Fetch metadata
    Storage-->>BioFS: {storage_type: "s3", bucket: "...", path: "..."}
    BioFS-->>App: File accessible at /mnt/biosample/...

⚙️ web3fuse (C Implementation)

web3fuse is a blockchain-native FUSE filesystem written in pure C (46KB binary) that enables NFT-gated access to genomic data. Unlike traditional cloud storage wrappers, web3fuse uses blockchain smart contracts as the source of truth for data ownership, permissions, and storage locations.

Architecture

📋 BioCID Parser

Parses blockchain content identifiers (biocid://v1/...) and extracts chain, collection, and token information.

Implementation: biocid_parser.c
Function: biocid_parse_uri(const char *uri, biocid_t *out)

🔐 Web3 Signature Verifier

Verifies Ethereum signatures (EIP-191) using libsecp256k1 for ECDSA recovery and XKCP for Keccak-256 hashing.

Implementation: web3_signature.c
Libraries: libsecp256k1, XKCP (Keccak-256)
Function: web3_verify_signature(msg, sig, expected_address)
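
For readers unfamiliar with EIP-191, the sketch below shows the bytes that a wallet's `personal_sign` flow hashes before signing. It stops short of hashing and recovery on purpose: Keccak-256 is not in the Python standard library (`hashlib.sha3_256` uses different padding and gives a different digest), so those steps need a dedicated library. The function name `eip191_preimage` is a hypothetical helper.

```python
def eip191_preimage(message: str) -> bytes:
    """Build the EIP-191 "personal message" bytes that get Keccak-256 hashed.

    Layout: 0x19 || "Ethereum Signed Message:\n" || len(message) || message,
    where the length is the decimal byte count written as ASCII digits.
    """
    body = message.encode("utf-8")
    prefix = b"\x19Ethereum Signed Message:\n" + str(len(body)).encode("ascii")
    return prefix + body


preimage = eip191_preimage("I want to proceed")
```

A verifier hashes this preimage, recovers the signer's public key from the signature, derives the address, and compares it with the expected one.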

🎫 NFT Ownership Verifier

Checks NFT ownership on-chain via JSON-RPC. Auto-detects ERC721/ERC1155 using supportsInterface queries.

Implementation: nft_verifier.c
Standards: ERC721 (0x80ac58cd), ERC1155 (0xd9b67a26)
Function: nft_verify_ownership(chain, collection, token, owner)
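
The auto-detection step relies on ERC-165 `supportsInterface(bytes4)`. The following sketch shows, in Python rather than C, what such an `eth_call` payload could look like. It uses the interface IDs listed above and the standard ERC-165 selector `0x01ffc9a7`; the helper names are hypothetical and this is not the code in nft_verifier.c.

```python
SUPPORTS_INTERFACE_SELECTOR = "01ffc9a7"  # selector of supportsInterface(bytes4)
ERC721_INTERFACE_ID = "80ac58cd"
ERC1155_INTERFACE_ID = "d9b67a26"


def supports_interface_call(contract: str, interface_id: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC eth_call that asks a contract whether it implements an interface."""
    hex_id = interface_id[2:] if interface_id.startswith("0x") else interface_id
    if len(hex_id) != 8:
        raise ValueError("interface id must be 4 bytes (8 hex characters)")
    # ABI encoding left-aligns bytes4 and zero-pads it on the right to 32 bytes.
    data = "0x" + SUPPORTS_INTERFACE_SELECTOR + hex_id.lower().ljust(64, "0")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        "params": [{"to": contract, "data": data}, "latest"],
    }


def decode_bool(result_hex: str) -> bool:
    """Decode the 32-byte boolean returned by eth_call; empty results count as False."""
    if result_hex in ("", "0x"):
        return False
    return int(result_hex, 16) == 1


call = supports_interface_call(
    "0x19A615224D03487AaDdC43e4520F9D83923d9512", ERC721_INTERFACE_ID
)
```

A client would typically send this request once per interface ID and fall back from ERC721 to ERC1155 when the first answer is false.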

🌐 RPC Client

Communicates with blockchain nodes using JSON-RPC 2.0. Supports multi-chain queries with connection pooling.

Implementation: rpc_client.c
Transport: libcurl (HTTP/HTTPS)
Parsing: jansson (JSON)

📍 BioCID Resolver

Resolves BioCID URIs to physical storage locations (S3, IPFS, Arweave) by querying NFT metadata.

Implementation: biocid_resolver.c
Function: biocid_resolve(biocid_uri, storage_info_t *out)
Caching: 1 hour TTL for metadata
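
The one-hour metadata cache can be pictured with a small time-to-live cache. This is a Python sketch of the general technique, not a port of biocid_resolver.c; the `TTLCache` class and its injectable clock are assumptions made to keep the example testable.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache values per key and reload them once they are older than the TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the blockchain and HTTP round trip
        value = loader(key)
        self._entries[key] = (now, value)
        return value


# One hour TTL, matching the metadata caching noted above.
metadata_cache = TTLCache(ttl_seconds=3600)
```

Using a monotonic clock avoids surprises when the system wall clock is adjusted while the filesystem is mounted.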

Installation

Building web3fuse from Source
# Install dependencies
sudo apt-get install -y build-essential meson ninja-build \
                        libfuse3-dev libcurl4-openssl-dev \
                        libjansson-dev autoconf automake libtool

# Build libsecp256k1 (ECDSA signature verification)
git clone https://github.com/bitcoin-core/secp256k1.git
cd secp256k1
./autogen.sh && ./configure --enable-module-recovery
make && sudo make install && sudo ldconfig

# Build web3fuse
git clone https://github.com/Genobank/web3fuse.git
cd web3fuse
make

# Run tests
make test

Usage

Mounting Genomic Data via BioCID
# Set user signature for authentication
export USER_SIGNATURE="0xa5141ae955bba91ad46a940aefc3b05120489b8b..."

# Mount FASTQ files via BioCID
biofs --mount /mnt/genomics \
      --uri "biocid://v1/story/IPA/0x19A615224D03487AaDdC43e4520F9D83923d9512/55052008714000"

# Access files (NFT ownership verified on-chain)
ls /mnt/genomics/
head /mnt/genomics/55052008714000_R1.fastq.gz

# Unmount
fusermount -u /mnt/genomics

Performance

Operation | Latency | Notes
NFT Verification | ~200ms | Blockchain RPC call, cached 5 min
Signature Verification | <1ms | Local ECDSA recovery
BioCID Resolution | ~300ms | Blockchain + HTTP metadata fetch
File Open | ~250ms | After initial verification
File Read | ~5ms | S3 streaming (cached)

⚠️ SECURITY: Zero Hardcoded Credentials

web3fuse NEVER stores AWS access keys or blockchain private keys in configuration files. All credentials are resolved from NFT metadata or user signatures at runtime. This eliminates credential leakage vulnerabilities common in traditional cloud storage systems.

🐍 biofs-sdk (Python)

β–Ά

biofs-sdk is a Python library that provides programmatic access to BioCID-referenced genomic data. It enables data scientists to embed blockchain-gated genomic data access directly in their Python scripts, Jupyter notebooks, and analysis pipelines.

Installation

Installing biofs-sdk
# Install via pip
pip install biofs-sdk

# Or install from source
git clone https://github.com/Genobank/biofs-sdk.git
cd biofs-sdk
pip install -e .

Quick Start

Example: Accessing VCF Files via BioCID
from biofs import BioCIDClient

# Initialize client with Web3 signature
client = BioCIDClient(wallet_signature="0xa5141ae955bba91ad46a940aefc3b05120489b8b...")

# Verify access to BioCID (NFT ownership check)
biocid_url = "biocid://v1/story/IPA/0x19A615224D03487AaDdC43e4520F9D83923d9512/55052008714000/genome.vcf"
if client.verify_access(biocid_url):
    print("✅ Access granted")

# Download file to local path
client.download(biocid_url, output_path="./genome.vcf")

# Stream large files (memory-efficient)
for chunk in client.stream(biocid_url, chunk_size=8192):
    process_chunk(chunk)

# Get metadata without downloading
info = client.info(biocid_url)
print(f"File size: {info['size']} bytes")

API Reference

BioCIDClient

from typing import Dict, Iterator

class BioCIDClient:
    def __init__(self, wallet_signature: str):
        """Initialize client with Web3 signature"""

    def verify_access(self, biocid_url: str) -> bool:
        """Verify NFT ownership and access permissions"""

    def download(self, biocid_url: str, output_path: str) -> None:
        """Download file from BioCID to local path"""

    def stream(self, biocid_url: str, chunk_size: int = 8192) -> Iterator[bytes]:
        """Stream file in chunks (memory-efficient for large files)"""

    def info(self, biocid_url: str) -> Dict:
        """Get metadata without downloading (size, owner, PIL terms)"""

Response Format

# info() returns:
{
    "size": 1048576,  # bytes
    "owner": "0x5f5a...",
    "ipId": "0x19A6...",
    "license_terms": {...},
    "metadata_uri": "ipfs://Qm..."
}
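
Since `stream()` is documented as yielding byte chunks, a consumer has to cope with lines that are split across chunk boundaries. The sketch below counts variant records in an uncompressed VCF from any iterable of byte chunks; it uses a hard-coded list in place of a real `client.stream(...)` call, and `count_vcf_records` is a hypothetical helper, not part of biofs-sdk.

```python
from typing import Iterable


def count_vcf_records(chunks: Iterable[bytes]) -> int:
    """Count non-header lines in an uncompressed VCF delivered as byte chunks."""
    records = 0
    tail = b""  # carries an incomplete line over to the next chunk
    for chunk in chunks:
        lines = (tail + chunk).split(b"\n")
        tail = lines.pop()  # last piece may be a partial line
        records += sum(1 for line in lines if line and not line.startswith(b"#"))
    if tail and not tail.startswith(b"#"):
        records += 1  # final line without a trailing newline
    return records


# Stand-in for client.stream(biocid_url): the second record is split across chunks.
sample_chunks = [
    b"##fileformat=VCFv4.2\n#CHROM\tPOS\nchr17\t4124",
    b"6481\nchr17\t41256877\n",
]
total = count_vcf_records(sample_chunks)
```

The same pattern keeps memory use bounded by the chunk size rather than the file size, which matters for multi-gigabyte VCFs.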

Advanced Usage

Example: Genomic Analysis Pipeline with BioCID
from biofs import BioCIDClient
import pysam
import pandas as pd
import os

# Initialize client
client = BioCIDClient(wallet_signature=os.environ['USER_SIGNATURE'])

# BioCID URLs for biosample files
biosample_id = "55052008714000"
vcf_biocid = f"biocid://v1/story/IPA/0x19A6.../{biosample_id}/genome.vcf"

# Verify access before processing
if not client.verify_access(vcf_biocid):
    raise PermissionError("You do not have access to this genomic data")

# Download VCF file
vcf_path = "./genome.vcf"
client.download(vcf_biocid, output_path=vcf_path)

# Analyze with pysam
vcf = pysam.VariantFile(vcf_path)

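pathogenic_variants = []
for variant in vcf:
    # pysam returns multi-valued INFO fields such as CLNSIG as tuples
    clnsig = variant.info.get('CLNSIG') or ()
    if isinstance(clnsig, str):
        clnsig = (clnsig,)
    # Skip records without an ALT allele so alts[0] cannot fail
    if 'Pathogenic' in clnsig and variant.alts:
        pathogenic_variants.append({
            'chrom': variant.contig,
            'pos': variant.pos,
            'ref': variant.ref,
            'alt': variant.alts[0]
        })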

# Generate report
df = pd.DataFrame(pathogenic_variants)
print(f"Found {len(df)} pathogenic variants")

# Get file metadata and provenance
info = client.info(vcf_biocid)
print(f"Owner: {info['owner']}")
print(f"IP Asset: {info['ipId']}")

💡 Integration with Existing Tools

biofs-sdk is designed to work seamlessly with existing bioinformatics tools like pysam, bcftools, and samtools. Once you download a BioCID to a local file path, standard tools can analyze the data. The blockchain NFT verification happens before download to ensure proper access control.

🛠️ BioFS CLI v1.6.2 - Node Management & Enhanced Features

BioFS CLI v1.6.2 adds BioFS Node management capabilities, allowing labs to run decentralized infrastructure nodes directly from the CLI. Enhanced with improved file handling, node discovery, and integration with sponsored AI/blockchain services.

🆕 What's New in v1.6.2

🏗️ BioFS Node Integration

New biofs node command enables labs to run their own infrastructure nodes, sponsor AI costs for users, and manage decentralized data replication. Includes node start, status, and configuration management.

Installation

Install BioFS CLI v1.6.2
# Run via npx without a global install (recommended)
npx @genobank/biofs@latest --version

# Or install globally
npm install -g @genobank/biofs@latest

# Verify installation
biofs --version
# Output: 1.6.2

Debug Mode Usage

Standard Usage (Clean Output)
# Normal tokenization - no debug messages
biofs tokenize genome.vcf --network mainnet --title "Patient Genome"

# Output:
✅ Step 1/6: ✓ File validated: genome.vcf (5.2 GB)
✅ Step 2/6: ✓ AI Classification: VCF Germline Variants
✅ Step 3/6: ✓ Uploaded to encrypted S3 vault
✅ Step 4/6: ✓ Registered in MongoDB
✅ Step 5/6: ✓ Minting IP Asset NFT on Story Protocol mainnet...
✅ Step 6/6: ✓ BioIP Tokenization Complete!

📦 IP Asset ID:     0x1234567890abcdef...
🔗 BioCID:          biocid://v1/story/IPA/0x5021F7.../55052008714000
🏛️  PIL License:     Non-Commercial Research

Debug Mode (Verbose Logging)
# Same command with --debug flag
biofs tokenize genome.vcf --network mainnet --title "Patient Genome" --debug

# Output: (includes internal operations)
🔍 Calling get_my_granted_bioips...
🔍 Signature preview: 0xa5141ae955bba91ad46a940aef...
🔍 API Response status: Success
🔍 Full response: {
  "status": "Success",
  "status_details": {
    "granted_bioips": [...],
    "total_count": 3
  }
}
🔍 Parsed granted BioIPs count: 3
🔍 First granted file: {
  "ip_id": "0x789...",
  "filename": "reference_genome.vcf",
  "license_type": "non-commercial"
}

✅ Step 1/6: ✓ File validated: genome.vcf (5.2 GB)

🔍 Detected IP Asset ID: 0x1234567890abcdef...
🔍 Calling getBioIPDownloadURL API...
🔍 API Response: {
  "access_granted": true,
  "presigned_url": "https://s3.amazonaws.com/...",
  "license_type": "non-commercial"
}
🔍 Access granted! Filename: genome.vcf

✅ Step 2/6: ✓ AI Classification: VCF Germline Variants

🔍 Response status: Success
🔍 Full response: {
  "status": "Success",
  "bioip_id": "12345",
  "s3_path": "/biowallet/0x5f5a60.../genome.vcf",
  "mongodb_id": "67890"
}

✅ Step 3/6: ✓ Uploaded to encrypted S3 vault
✅ Step 4/6: ✓ Registered in MongoDB
✅ Step 5/6: ✓ Minting IP Asset NFT on Story Protocol mainnet...
✅ Step 6/6: ✓ BioIP Tokenization Complete!

📦 IP Asset ID:     0x1234567890abcdef...
🔗 BioCID:          biocid://v1/story/IPA/0x5021F7.../55052008714000
🏛️  PIL License:     Non-Commercial Research

Debug Mode for All Commands

Command | Without --debug | With --debug
biofs files | Shows table of files only | + API calls, BioCID resolution, NFT queries
biofs download | Progress bar only | + Signature verification, ownership checks, S3 pre-signed URL generation
biofs tokenize | 6 step progress indicators | + AI classification details, blockchain responses, MongoDB confirmations
biofs mount | Mount confirmation only | + FUSE operations, NFT verification, credential decryption
biofs access list | Shows permittees table | + Blockchain RPC calls, license term queries, PIL verification

Environment Variable Override

Enable Debug via Environment Variable
# Set DEBUG environment variable (persists for session)
export DEBUG=1

# All subsequent commands will show debug output
biofs tokenize genome.vcf --network mainnet
biofs files
biofs download biocid://v1/story/IPA/0x123.../genome.vcf

# Disable debug mode
unset DEBUG

Use Cases

🧪 Development & Testing

Use --debug during development to inspect API responses, blockchain transactions, and internal state:

biofs tokenize test.vcf --network testnet --debug

🏭 Production Pipelines

Omit --debug in automated scripts for clean logs suitable for monitoring dashboards:

#!/bin/bash
# Cron job: Daily genome upload (the shebang must be the first line of the script)
for vcf in /data/genomes/*.vcf; do
    biofs tokenize "$vcf" --network mainnet --quiet
done

🐛 Debugging Access Issues

When troubleshooting permission denials, use --debug to see NFT ownership verification steps:

biofs download biocid://v1/story/IPA/0x789.../genome.vcf --debug

# Debug output shows:
🔍 Checking access for IP Asset 0x789...
🔍 User signature: 0xa5141ae955bba9...
🔍 Response status: Failure
🔍 Full response: {
  "status": "Failure",
  "status_details": {
    "error": "NFT ownership verification failed",
    "has_access": false,
    "reason": "User does not own NFT token 55052008714000"
  }
}
❌ Access denied to this BioIP asset
⚠️ SECURITY: Debug Output Contains Sensitive Information

Debug logs may reveal:

  • Partial Web3 signatures (first 30 characters)
  • API response structures
  • S3 bucket paths and file metadata
  • Blockchain transaction details

Do not share debug logs publicly or commit them to version control. Use --debug only for local troubleshooting.

Implementation Details

Logger Architecture (TypeScript)
// src/lib/utils/logger.ts
import chalk from 'chalk';
import { program } from 'commander'; // assumes the CLI is built on commander

export class Logger {
  // Debug lines are printed only when DEBUG is set (by --debug or the environment)
  static debug(message: string): void {
    if (process.env.DEBUG) {
      console.log(chalk.gray(`🔍 ${message}`));
    }
  }

  static info(message: string): void {
    console.log(chalk.blue(`ℹ️  ${message}`));
  }

  static success(message: string): void {
    console.log(chalk.green(`✅ ${message}`));
  }

  static error(message: string): void {
    console.log(chalk.red(`❌ ${message}`));
  }
}

// Global debug flag set by CLI: --debug switches on DEBUG before any command runs
program
  .option('--debug', 'Enable debug output')
  .hook('preAction', (thisCommand) => {
    const opts = thisCommand.opts();
    if (opts.debug) {
      process.env.DEBUG = '1';
    }
  });

💡 Migration to v1.6.2

Upgrading from v1.4.x or v1.5.x is seamless. All existing commands work as before, with new biofs node commands available:

# Upgrade globally
npm update -g @genobank/biofs

# Verify version
biofs --version
# Should output: 1.6.2

# No breaking changes - all existing scripts work as before

🏗️ BioFS Node - Decentralized Infrastructure

BioFS Node enables labs and research institutions to run their own decentralized infrastructure nodes, sponsor AI and blockchain costs for their users, and participate in the GenoBank network without relying on centralized infrastructure.

🔐 Why Run Your Own Node?

Data Sovereignty: Host your own S3 buckets, control replication, and maintain complete ownership of your genomic data infrastructure.

Cost Sponsorship: Pay once for AI (Anthropic Claude) and blockchain (Story Protocol) services, then sponsor unlimited usage for all your lab's users.

Network Participation: Become part of the decentralized GenoBank network instead of depending on a single provider.

Node Architecture

BioFS supports two node types, each serving a specific role in the decentralized network:

⚡

Master Node

GenoBank.io's Core Infrastructure

  • Manages lab registrations and discovery
  • Coordinates cross-region data replication
  • Provides fallback AI/blockchain proxies
  • Monitors network health and lab status

🧪

Lab Node

Your Research Institution's Infrastructure

  • Self-host S3 data storage with full control
  • Sponsor AI analysis costs for your users
  • Sponsor Story Protocol gas for your lab
  • IPFS pinning for metadata and images
  • Automatic S3 disaster recovery replication

Core Capabilities

🤖 Anthropic Claude Proxy

Sponsor Claude AI analysis for your lab's users. Set quota limits, track usage, and pay once instead of per-user billing. Users get AI-powered variant interpretation without individual API keys.

⛓️ Story Protocol Proxy

Sponsor blockchain gas costs for IP asset registration, license minting, and derivative creation. Your users tokenize genomic data without needing ETH or managing wallets.

💾 S3 File Management

Manage your own S3 buckets for genomic data storage. Generate presigned URLs, track file uploads, and maintain complete control over data location and access policies.

📌 IPFS Pinning Service

Pin images and anonymized metadata to IPFS with automatic GDPR-compliance checks. The checks prevent genomic data from being pinned to immutable storage, which preserves data erasure rights.

🔄 S3 Cross-Region Replication

Automatic disaster recovery with differential sync. Only changed files are replicated (ETag-based detection), which saves bandwidth. Primary and replica buckets are kept in different regions for geographic redundancy.
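
The ETag comparison behind differential sync can be sketched as follows. This is only an illustration, not the BioFS Node implementation: the function name `plan_replication` and the plain dictionaries are assumptions, and a real node would build these key-to-ETag maps from S3 listing calls.

```python
from typing import Dict, List


def plan_replication(primary: Dict[str, str], replica: Dict[str, str]) -> List[str]:
    """Return the object keys that must be copied from primary to replica.

    Both arguments map object key -> ETag (hypothetical in-memory listings).
    A key is selected when it is missing from the replica or when its ETag
    differs, which signals that the content changed.
    """
    return sorted(
        key for key, etag in primary.items()
        if replica.get(key) != etag
    )


if __name__ == "__main__":
    primary = {"a.vcf": "etag-1", "b.fastq": "etag-2", "c.bam": "etag-3"}
    replica = {"a.vcf": "etag-1", "b.fastq": "etag-old"}
    print(plan_replication(primary, replica))  # ['b.fastq', 'c.bam']
```

One caveat worth keeping in mind: S3 ETags for multipart uploads depend on the part layout, so identical content copied with different part sizes can report different ETags and be replicated again.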

Installation & Setup

Install BioFS CLI (includes node command)
# Install BioFS CLI v1.6.2+
npm install -g @genobank/biofs@latest

# Verify node command is available
biofs node --help
Create Lab Node Configuration
# Generate configuration template
biofs node init --type lab --name "AUGenomics Lab"

# Edit lab-config.yaml with your settings:
# - node.wallet: Your lab's Ethereum wallet
# - storage.bucket: Your S3 bucket name
# - database.url: PostgreSQL connection string
# - services.anthropic_api_key: Claude API key (optional)
# - services.story_executor_key: Story Protocol executor (optional)
Start Lab Node
# Start node with configuration
biofs node start --config lab-config.yaml

# Output:
✅ Lab registry started
✅ S3 File Manager started (bucket: s3://augenomics-genomics)
✅ IPFS Pinning Service started
✅ Anthropic Proxy enabled (quota: 1M tokens/month)
✅ Story Protocol Proxy enabled (quota: 100 transactions/month)
✅ API server started on port 8080
🏗️  Lab Node running at https://biofs.augenomics.org

Cost Sponsorship Model

💰 How Sponsorship Works

Instead of each researcher paying for AI and blockchain services individually, your lab node sponsors costs for all users:

Service | Without Lab Node | With Lab Node
Claude AI Analysis | Each researcher needs an API key ($20/month per user) | Lab pays once ($200/month) for unlimited researchers
Story Protocol Gas | Each researcher needs ETH (~$5 per tokenization) | Lab sponsors gas costs; users tokenize for free
S3 Storage | GenoBank's S3 bucket ($0.023/GB) | Your own S3 bucket (full control + negotiated rates)
Data Sovereignty | Trust centralized provider | Own your infrastructure
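
As a quick worked example of the Claude AI row, the sketch below computes the team size at which the flat lab fee matches per-user billing. It uses only the two figures from the comparison ($20/month per user and $200/month per lab); the function names are made up for illustration, and real pricing will vary.

```python
import math

PER_USER_MONTHLY = 20.0   # $/month per researcher without a lab node (from the comparison)
LAB_FLAT_MONTHLY = 200.0  # $/month paid once by the lab (from the comparison)


def monthly_savings(researchers: int) -> float:
    """Dollars saved per month by sponsoring access through the lab node."""
    return researchers * PER_USER_MONTHLY - LAB_FLAT_MONTHLY


def break_even_researchers() -> int:
    """Smallest team size at which the flat lab fee is no more expensive."""
    return math.ceil(LAB_FLAT_MONTHLY / PER_USER_MONTHLY)


if __name__ == "__main__":
    print(break_even_researchers())  # 10
    print(monthly_savings(25))       # 300.0
```

With these figures, sponsorship pays off from ten researchers upward; smaller teams pay more under the flat fee.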

Quota Management

Check Lab Quotas
# View current usage and limits
biofs node quota --service anthropic

# Output:
📊 Anthropic Quota Status
   Monthly Limit:    1,000,000 tokens
   Used This Month:    450,230 tokens (45%)
   Remaining:          549,770 tokens
   Reset Date:         2025-11-01

# View Story Protocol gas usage
biofs node quota --service story

# Output:
⛽ Story Protocol Gas Quota
   Monthly Limit:    100 transactions
   Used This Month:  37 transactions (37%)
   Remaining:        63 transactions
   Reset Date:       2025-11-01
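
The numbers in these reports follow from a limit and a usage counter. The sketch below reproduces that arithmetic and adds a pre-flight check a proxy could run before forwarding a request. The `Quota` class and its method names are hypothetical and are not part of the BioFS Node API.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    monthly_limit: int  # tokens or transactions allowed per month
    used: int           # amount consumed so far this month

    @property
    def remaining(self) -> int:
        # Never report a negative balance if usage overshoots the limit.
        return max(self.monthly_limit - self.used, 0)

    @property
    def percent_used(self) -> int:
        if self.monthly_limit == 0:
            return 100
        return round(100 * self.used / self.monthly_limit)

    def allows(self, cost: int) -> bool:
        """True when a request of the given cost still fits in the quota."""
        return cost <= self.remaining


if __name__ == "__main__":
    anthropic = Quota(monthly_limit=1_000_000, used=450_230)
    print(anthropic.remaining, anthropic.percent_used)  # 549770 45
    story = Quota(monthly_limit=100, used=37)
    print(story.remaining, story.percent_used)          # 63 37
```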

Disaster Recovery

Lab nodes automatically replicate data to backup regions for disaster recovery:

Configure S3 Replication
# In lab-config.yaml
replication:
  enabled: true
  interval_hours: 24
  replica_bucket: backup.augenomics.org
  replica_region: us-east-1  # Different from primary region
Monitor Replication
# Check replication status
curl http://localhost:8080/api/v1/replication/health

# Output:
{
  "success": true,
  "health": {
    "replication_enabled": true,
    "primary_bucket": "augenomics-genomics",
    "replica_bucket": "backup.augenomics.org",
    "interval_hours": 24,
    "last_run": "2025-10-24T17:00:00Z"
  }
}
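
A monitoring script could use this response to flag overdue replication. The sketch below parses the JSON shown above and reports a problem when replication is disabled or when `last_run` is older than `interval_hours`. The function name and the staleness rule are assumptions for illustration; the field names come from the sample output.

```python
import json
from datetime import datetime, timedelta, timezone

SAMPLE = """{
  "success": true,
  "health": {
    "replication_enabled": true,
    "primary_bucket": "augenomics-genomics",
    "replica_bucket": "backup.augenomics.org",
    "interval_hours": 24,
    "last_run": "2025-10-24T17:00:00Z"
  }
}"""


def replication_needs_attention(health_json: str, now: datetime) -> bool:
    """True when replication is off or the last run is older than the interval."""
    health = json.loads(health_json)["health"]
    if not health["replication_enabled"]:
        return True
    # Replace the trailing "Z" so fromisoformat() also works before Python 3.11.
    last_run = datetime.fromisoformat(health["last_run"].replace("Z", "+00:00"))
    return now - last_run > timedelta(hours=health["interval_hours"])


if __name__ == "__main__":
    print(replication_needs_attention(SAMPLE, datetime.now(timezone.utc)))
```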
🌐 Decentralization Benefits

Running your own node means no single point of failure. If GenoBank.io goes offline, your lab continues operating with full access to your data, AI capabilities, and blockchain services. The network is resilient because it's distributed across multiple independent nodes.

API Endpoints

Lab nodes expose RESTful APIs for integration with existing systems:

  • POST /api/v1/anthropic/messages - Claude AI analysis with quota management
  • POST /api/v1/story/register-ip - Register IP assets with sponsored gas
  • POST /api/v1/files/upload - Upload files to your S3 bucket
  • POST /api/v1/ipfs/pin-image - Pin images to IPFS with GDPR checks
  • GET /api/v1/replication/stats - View replication statistics

💡 For Research Institutions: BioFS Node transforms genomic infrastructure from vendor lock-in to true data sovereignty. Run it on-premises, in your cloud environment, or hybrid. Full control, full transparency, full ownership.

NFT Verification & Access Control

3-Factor Verification Model

Every file access requires three simultaneous verifications:

Factor 1: NFT Ownership
Query blockchain smart contract to verify user owns the NFT for this BioCID
Factor 2: Web3 Signature
Verify user's Ethereum signature using ECDSA recovery (libsecp256k1)
Factor 3: PIL License Terms
Check Story Protocol license terms for commercial use, derivatives, attribution
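
The all-three-must-pass rule can be sketched as below. The real checks (on-chain ownership query, ECDSA recovery, Story Protocol lookup) are replaced by injected callables, so every name here (`AccessRequest`, `failed_factors`, `access_granted`) is hypothetical and only illustrates how the factors combine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AccessRequest:
    wallet: str
    biocid: str
    signature: str
    operation: str


# A check receives the request and reports whether its factor passed.
Check = Callable[[AccessRequest], bool]


def failed_factors(request: AccessRequest, checks: Dict[str, Check]) -> List[str]:
    """Run every factor and return the names of those that did not pass."""
    return [name for name, check in checks.items() if not check(request)]


def access_granted(request: AccessRequest, checks: Dict[str, Check]) -> bool:
    """Access requires that no factor failed."""
    return not failed_factors(request, checks)


if __name__ == "__main__":
    request = AccessRequest("0xabc", "biocid://example", "0xsig", "read")
    checks = {
        "nft_ownership": lambda r: True,   # stand-in for the on-chain query
        "web3_signature": lambda r: True,  # stand-in for ECDSA recovery
        "pil_license": lambda r: r.operation != "derive",
    }
    print(access_granted(request, checks))  # True
```

Returning the names of failed factors, rather than a bare boolean, makes it easier to tell a user whether ownership, signature, or license terms blocked the request.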

Verification Flow

sequenceDiagram
    participant User as 👤 User
    participant BioFS as 🖥️ BioFS
    participant Chain as ⛓️ Blockchain
    participant NFT as 🎫 NFT Contract
    participant Story as 📜 Story Protocol
    participant S3 as 💾 S3 Storage
    User->>BioFS: open("/mnt/genomics/genome.vcf")
    BioFS->>BioFS: Extract BioCID from mount
    BioFS->>Chain: Connect to blockchain RPC
    Note over BioFS,NFT: Factor 1: NFT Ownership
    BioFS->>NFT: balanceOf(user_wallet, token_id)
    NFT-->>BioFS: balance: 1 ✅
    Note over BioFS: Factor 2: Web3 Signature
    BioFS->>BioFS: Verify signature with libsecp256k1
    BioFS-->>BioFS: Address matches ✅
    Note over BioFS,Story: Factor 3: PIL License
    BioFS->>Story: Get IP asset license terms
    Story-->>BioFS: {commercial: true, derivatives: true}
    BioFS-->>BioFS: Access granted ✅
    BioFS->>S3: Fetch s3://vault.../genome.vcf
    S3-->>BioFS: File data
    BioFS-->>User: File handle

Credential Management (Zero Hardcoded Keys)

web3fuse and biofs-sdk eliminate the need for hardcoded AWS credentials. All storage credentials are resolved from NFT metadata at runtime:

NFT Metadata with Encrypted Credentials
{
  "name": "Biosample 55052008714000 - FASTQ Files",
  "description": "Whole Genome Sequencing FASTQ files",
  "image": "ipfs://QmXYZ.../thumbnail.png",

  // Storage location (public)
  "storage": {
    "type": "s3",
    "region": "us-east-1",
    "bucket": "test.vault.genoverse.io",
    "path": "/biowallet/0x5f5a60.../55052008714000/"
  },

  // Encrypted credentials (only decryptable by NFT owner)
  "credentials": {
    "encrypted": "0xABC123...",  // Encrypted with owner's public key
    "algorithm": "ECIES",
    "expires_at": 1735689600  // Unix timestamp
  },

  // Story Protocol IP asset reference
  "ipAsset": {
    "ipId": "0x789...",
    "parentIpId": "0x456...",  // Biosample tube NFT
    "licenseTerms": "0xDEF..."
  }
}
⚠️ CRITICAL: Credential Decryption Flow

When BioFS needs to access S3:

  1. Query NFT metadata to get encrypted credentials
  2. Decrypt using the user's wallet private key (ECIES over secp256k1)
  3. Use temporary credentials (expire in 1 hour)
  4. Cache decrypted credentials in kernel keyring
  5. Auto-refresh before expiry

This ensures that even if someone steals the metadata JSON, they cannot access the data without the NFT owner's private key.
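
The caching and auto-refresh steps can be sketched as follows. This is a simplified illustration under stated assumptions: it keeps credentials in process memory rather than the kernel keyring, the `CredentialCache` class is hypothetical, and the `fetch` callable stands in for the metadata query and ECIES decryption.

```python
import time
from typing import Callable, Optional, Tuple

# fetch() returns (credentials, expires_at) where expires_at is a Unix timestamp.
Fetch = Callable[[], Tuple[dict, float]]


class CredentialCache:
    def __init__(self, fetch: Fetch, refresh_margin: float = 300.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch
        self._margin = refresh_margin  # refresh this many seconds before expiry
        self._clock = clock            # injectable so the logic can be tested
        self._creds: Optional[dict] = None
        self._expires_at = 0.0

    def get(self) -> dict:
        """Return cached credentials, refetching when missing or close to expiry."""
        if self._creds is None or self._clock() >= self._expires_at - self._margin:
            self._creds, self._expires_at = self._fetch()
        return self._creds


if __name__ == "__main__":
    def fake_fetch() -> Tuple[dict, float]:
        # Placeholder for "query NFT metadata, then decrypt"; valid for 1 hour.
        return {"access_key": "example"}, time.time() + 3600

    cache = CredentialCache(fake_fetch)
    print(cache.get() is cache.get())  # True: the second call is served from cache
```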

📡 OSI Layer 4.5 Architecture

Modified OSI Stack with NBDR

The NBDR Protocol operates at OSI Layer 4.5, intercepting data flows after transport reliability (TCP) is established but before session management begins. This allows routing decisions based on data ownership rather than network paths.

Layer 7: Application
BioWallet, dApps, User Interfaces (React, Web3.js)
Layer 6: Presentation
DICOM, HL7, FHIR, VCF encoding/decoding
Layer 5: Session
Authentication (OAuth, Web3 signatures via MetaMask)
Layer 4.5: NBDR (NFT-Based Data Routing)
Innovation Layer: BioCID resolution, NFT ownership verification, PIL enforcement, blockchain-native credential management
Layer 4: Transport
TCP/UDP (TCP provides reliability, flow control, and error recovery)
Layer 3: Network
IP addressing, routing protocols (BGP, OSPF)
Layer 2: Data Link
MAC addressing, Ethernet, WiFi
Layer 1: Physical
Cables, radio waves, fiber optics

Why Layer 4.5?

💡 Strategic Positioning

By inserting NBDR between Transport (Layer 4) and Session (Layer 5), we gain:

  • Reliable Transport: TCP ensures data integrity before we make ownership checks
  • Pre-Session Interception: Block unauthorized access before establishing expensive sessions
  • Blockchain Query Efficiency: Can make RPC calls with reliable transport guarantees
  • Transparent to Applications: Apps at Layer 7 see standard POSIX filesystem, unaware of blockchain layer

Comparison: Traditional vs NBDR Routing

Aspect | Traditional (IP-Based) | NBDR (NFT-Based)
Routing Basis | IP address, network topology | NFT ownership, blockchain state
Access Control | Firewall rules, ACLs (company-controlled) | Smart contracts, PIL terms (user-controlled)
Credential Storage | Config files, environment variables | NFT metadata, encrypted on-chain
Audit Trail | CloudTrail (mutable, company-controlled) | Blockchain (immutable, public/verifiable)
Revocation | Manual (IT ticket, days) | Instant (burn NFT, real-time)
Royalty Tracking | Not supported | Automatic (Programmable % via Story Protocol)

📜 Story Protocol PIL Integration

PIL (Programmable IP License) Terms

Every genomic data file is registered as an IP asset on Story Protocol with specific license terms. web3fuse enforces these terms at the filesystem level before granting access.

Example PIL Terms for Biosample FASTQ Files
{
  "license_type": "PIL",
  "commercial_use": true,           // Can be used commercially
  "derivatives_allowed": true,       // Can create derivative works (VCF, etc.)
  "attribution_required": true,      // Must credit original owner
  "commercial_revenue_share": 15,    // Programmable % royalty to original owner
  "allowed_territories": ["*"],      // Global access
  "expiration": null,                // No expiration
  "revocable": true                  // Owner can revoke consent
}

IP Asset Hierarchy Example

graph TB
    Root["🧪 ROOT: Biosample Collection Tube<br/>Serial: 55052008714000<br/>Collection: 0xBiosample<br/>Owner: 0x5f5a60..."]
    FASTQ["📄 LEVEL 1: FASTQ Files (Parent)<br/>Collection: 0xFASTQ<br/>Token: 55052008714000<br/>biocid://v1/story/IPA/0xFASTQ/55052008714000/...<br/>PIL: Programmable % royalty to 0x5f5a60..."]
    VCF["📊 LEVEL 2: VCF File (Child of FASTQ)<br/>Collection: 0xVCF<br/>Token: 55052008714000<br/>biocid://v1/story/IPA/0xVCF/55052008714000/...<br/>Derivative IP: Inherits PIL from FASTQ<br/>Royalty Chain: Programmable % to FASTQ owner"]
    SQLite["📊 LEVEL 3: SQLite DB (Grandchild of FASTQ)<br/>Collection: 0xSQLite<br/>Token: 55052008714000<br/>biocid://v1/story/IPA/0xSQLite/55052008714000/...<br/>Derivative IP: Inherits PIL from VCF<br/>Royalty Chain: Programmable % flows through VCF to FASTQ"]
    ACMG["📋 LEVEL 4: ACMG Report (Great-Grandchild)<br/>Collection: 0xACMG<br/>Token: 55052008714000<br/>biocid://v1/story/IPA/0xACMG/55052008714000/...<br/>Derivative IP: Inherits PIL from SQLite<br/>Royalty Chain: Programmable % flows through SQLite → VCF → FASTQ<br/>Complete provenance to physical tube"]
    Root --> FASTQ
    FASTQ --> VCF
    VCF --> SQLite
    SQLite --> ACMG
    style Root fill:#1a1a2e,stroke:#00ffcc,stroke-width:3px,color:#00ffcc
    style FASTQ fill:#16213e,stroke:#b084ff,stroke-width:2px,color:#e6f1e6
    style VCF fill:#0f3460,stroke:#b084ff,stroke-width:2px,color:#e6f1e6
    style SQLite fill:#533483,stroke:#b084ff,stroke-width:2px,color:#e6f1e6
    style ACMG fill:#6d435a,stroke:#b084ff,stroke-width:2px,color:#e6f1e6
💡 Automatic Royalty Distribution

When a pharmaceutical company accesses the ACMG report for drug development:

  1. Company pays license fee for commercial use
  2. Programmable % automatically flows to ACMG report owner (GenoBank/Lab)
  3. Programmable % of that flows to SQLite owner (OpenCRAVAT operator)
  4. Programmable % of that flows to VCF owner (Clara GPU operator)
  5. Programmable % of that flows to FASTQ owner (Sequencing lab)
  6. Final programmable % reaches original biosample owner (Patient)

This complete royalty chain is enforced by Story Protocol smart contracts, ensuring patients benefit financially from their genomic data contribution to research.
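
To make the cascade concrete, the sketch below models the steps above as a chain where each owner receives a percentage of what the previous hop received and keeps whatever is not passed further up. This is a simplified model under stated assumptions, not Story Protocol's on-chain royalty policy: the function name, the owner labels, and the uniform 15% share (borrowed from the example PIL terms) are illustrative only.

```python
from typing import Dict, List, Tuple


def distribute_royalties(fee: float, chain: List[Tuple[str, float]]) -> Dict[str, float]:
    """Split a license fee along a derivative chain.

    `chain` lists (owner, percent) from the licensed asset up to the root.
    Each owner receives `percent` of the amount the previous hop received
    (the license fee for the first hop) and keeps what is not passed on.
    """
    received: List[float] = []
    amount = fee
    for _owner, percent in chain:
        amount = amount * percent / 100.0
        received.append(amount)

    payouts: Dict[str, float] = {}
    for i, (owner, _percent) in enumerate(chain):
        passed_on = received[i + 1] if i + 1 < len(chain) else 0.0
        payouts[owner] = received[i] - passed_on
    return payouts


if __name__ == "__main__":
    # Hypothetical chain with a uniform 15% share at every hop.
    chain = [("acmg", 15), ("sqlite", 15), ("vcf", 15), ("fastq", 15), ("patient", 15)]
    for owner, amount in distribute_royalties(10_000.0, chain).items():
        print(f"{owner}: {amount:.4f}")
```

Under this model the amounts shrink geometrically with each level, so the share configured at every hop strongly affects what reaches the biosample owner.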

PIL Enforcement in web3fuse

C Implementation: check_pil_license()
// web3fuse/src/pil_verifier.c
#include <errno.h>    // EACCES
#include <stdbool.h>  // bool
#include <string.h>   // strcmp

// pil_terms_t, query_story_protocol(), is_commercial_entity(), and
// log_access_for_royalty() are assumed to be declared in a project header.

// Returns 0 when the operation is permitted, -EACCES otherwise.
int check_pil_license(const char *biocid,
                      const char *user_wallet,
                      const char *operation) {

    // 1. Query Story Protocol for IP asset license
    pil_terms_t terms = query_story_protocol(biocid);

    // 2. Check operation against terms
    if (strcmp(operation, "derive") == 0 && !terms.derivatives_allowed) {
        return -EACCES;  // Permission denied
    }

    // 3. Check if commercial context
    bool is_commercial = is_commercial_entity(user_wallet);
    if (is_commercial && !terms.commercial_use) {
        return -EACCES;  // Commercial use not allowed
    }

    // 4. Log access for royalty tracking
    log_access_for_royalty(biocid, user_wallet, operation);

    return 0;  // Access granted
}

🏗️ Complete NBDR Architecture

System Components

graph TB
    subgraph User["👤 User Layer"]
        Wallet["🦊 MetaMask Wallet<br/>Signs authentication message"]
        App["📱 Application<br/>Python script with biofs-sdk<br/>or direct BioFS mount"]
    end
    subgraph BioFS["🖥️ BioFS Layer (web3fuse)"]
        FUSE["FUSE Kernel Module<br/>libfuse3"]
        BioCIDParser["BioCID Parser<br/>Extract chain/collection/token"]
        Web3Sig["Web3 Signature Verifier<br/>libsecp256k1 + XKCP"]
        NFTVerifier["NFT Ownership Verifier<br/>ERC721/ERC1155 queries"]
        PILCheck["PIL License Verifier<br/>Story Protocol integration"]
    end
    subgraph Blockchain["⛓️ Blockchain Layer"]
        Avalanche["Avalanche C-Chain<br/>BioNFT Collections"]
        Story["Story Protocol<br/>IP Asset Registry + PIL"]
        Sequentias["Sequentias Network<br/>Consent Management"]
    end
    subgraph Storage["💾 Storage Layer"]
        S3["AWS S3<br/>test.vault.genoverse.io<br/>Encrypted genomic data"]
        IPFS["IPFS<br/>ipfs.genobank.app<br/>Metadata & thumbnails"]
    end
    Wallet --> App
    App --> FUSE
    FUSE --> BioCIDParser
    BioCIDParser --> Web3Sig
    Web3Sig --> NFTVerifier
    NFTVerifier --> PILCheck
    NFTVerifier --> Avalanche
    NFTVerifier --> Story
    PILCheck --> Story
    NFTVerifier --> Sequentias
    PILCheck --> S3
    BioCIDParser --> IPFS
    style User fill:#1a1a2e,stroke:#00ffcc,stroke-width:2px
    style BioFS fill:#16213e,stroke:#00ff41,stroke-width:2px
    style Blockchain fill:#0f3460,stroke:#b084ff,stroke-width:2px
    style Storage fill:#533483,stroke:#ffd93d,stroke-width:2px

Data Flow: File Access Request

sequenceDiagram
    autonumber
    participant User as 👤 User
    participant BioFS as 🖥️ BioFS
    participant Parser as BioCID Parser
    participant Web3 as Web3 Verifier
    participant Chain as ⛓️ Blockchain
    participant Story as 📜 Story Protocol
    participant S3 as 💾 S3 Storage
    User->>BioFS: open("/mnt/genomics/genome.vcf")
    BioFS->>Parser: Extract BioCID from mount
    Parser-->>BioFS: {chain, collection, token, path}
    BioFS->>Web3: Verify user signature
    Web3-->>BioFS: ✅ Signature valid
    BioFS->>Chain: Query NFT ownership
    Chain-->>BioFS: ✅ User owns NFT
    BioFS->>Story: Check PIL license terms
    Story-->>BioFS: {commercial: true, derivatives: true}
    BioFS->>Chain: Get NFT metadata
    Chain-->>BioFS: {storage: "s3", bucket: "...", credentials: "..."}
    BioFS->>BioFS: Decrypt credentials with user's private key
    BioFS->>S3: GET s3://vault.../genome.vcf (with temp credentials)
    S3-->>BioFS: File data
    BioFS-->>User: File handle (standard POSIX)
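
The parser step that yields `{chain, collection, token, path}` might look like the sketch below. The segment layout (`biocid://<version>/<chain>/<asset type>/<collection>/<token>/<path>`) is inferred from the example identifiers on this page, not taken from a specification. The field name `asset_type` for the `IPA` segment and the sample path `reads/sample.fastq` are assumptions.

```python
from typing import NamedTuple


class BioCID(NamedTuple):
    version: str
    chain: str
    asset_type: str   # assumed meaning of the "IPA" segment
    collection: str
    token: str
    path: str         # may be empty when the URI names the whole asset


def parse_biocid(uri: str) -> BioCID:
    """Split a biocid:// URI into its segments, validating the required ones."""
    prefix = "biocid://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a BioCID URI: {uri!r}")
    # Split at most five times so the remaining path keeps its own slashes.
    parts = uri[len(prefix):].split("/", 5)
    if len(parts) < 5 or not all(parts[:5]):
        raise ValueError(f"BioCID is missing required segments: {uri!r}")
    path = parts[5] if len(parts) == 6 else ""
    return BioCID(parts[0], parts[1], parts[2], parts[3], parts[4], path)


if __name__ == "__main__":
    cid = parse_biocid("biocid://v1/story/IPA/0xFASTQ/55052008714000/reads/sample.fastq")
    print(cid.chain, cid.collection, cid.token, cid.path)
    # story 0xFASTQ 55052008714000 reads/sample.fastq
```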

Key Innovations

Innovation | Traditional Approach | NBDR Approach | Impact
Data Ownership | Company owns, user borrows | User owns NFT, companies request access | True patient sovereignty
Access Control | IAM policies (IT department) | NFT ownership (blockchain verification) | Instant, cryptographic, revocable
Credentials | AWS keys in config files | Encrypted in NFT metadata | Zero credential leakage
Audit Trail | CloudTrail (mutable logs) | Blockchain transactions (immutable) | HIPAA compliance, transparency
Royalties | Manual contracts, litigation | Automatic programmable % via Story Protocol | Passive income for patients
Provenance | Lab notebooks (lost/forged) | Complete IP asset lineage on-chain | Regulatory compliance, trust