NBDR Protocol
BioNFT-Based Data Routing for Genomic Sovereignty
v2.1 - BioFS CLI v1.6.2
The NBDR Protocol transforms genomic data access from traditional cloud storage into a blockchain-native ecosystem. Using BioFS CLI v1.6.2 (user interface), BioFS Node (decentralized infrastructure), biofs-sdk (Python library), web3fuse (C FUSE filesystem), BioNFS (QUIC protocol), and BioNFS MCP Server (AI agents), every data operation verifies NFT ownership on-chain before granting access. Story Protocol IP hierarchy ensures complete provenance from physical biosample to ACMG analysis, with a programmable % royalty flow through the entire chain.
Complete Genomic Workflow (Biosample → ACMG Analysis)
End-to-End Pipeline with Story Protocol IP Hierarchy
[Pipeline diagram, stages in order]
- Prerequisites: Collection Tube → Biosample Serial 55052008714000 → Mint BioNFT (ERC1155 Token, Owner: 0x5f5a60...)
- Consent & Sequencing: Consent to Lab (PIL License Token, Programmable % Royalty) → WGS Sequencing (Illumina NovaSeq) → FASTQ Files (R1.fastq.gz + R2.fastq.gz, ~100GB)
- Tokenize FASTQ (Root IP): Mint FASTQ NFT (Collection: 0xFASTQ, Token: 55052008714000) → Store in S3 (test.vault.genoverse.io) → BioCID Generated (biocid://v1/story/IPA/0xFASTQ/...)
- BioFS Mount (web3fuse): biofs mount --uri biocid://v1/story/IPA/0xFASTQ/55052008714000 --mount /mnt/biosample → NFT Ownership Verified (ERC1155), Web3 Signature Authenticated → FASTQ accessible via POSIX at /mnt/biosample/R1.fastq.gz
- Clara Parabricks GPU: clara.genobank.app, DeepVariant Germline (pbrun deepvariant) → Output VCF: 55052008714000.deepvariant.vcf (~5GB)
- Tokenize VCF (Child of FASTQ): Mint VCF NFT as CHILD (Parent: FASTQ NFT, Collection: 0xVCF) → Register as Derivative IP (inherits programmable % royalty terms) → BioCID Generated (biocid://v1/story/IPA/0xVCF/...)
- OpenCRAVAT Annotator: opencravat.genobank.app, annotate with 200+ databases (ClinVar, dbSNP, COSMIC) → Output SQLite: 55052008714000.sqlite (~2GB annotated variants)
- Tokenize SQLite (Grandchild): Mint SQLite NFT as GRANDCHILD (Parent: VCF NFT) → Register as Derivative IP (inherits full PIL chain) → BioCID Generated (biocid://v1/story/IPA/0xSQLite/...)
- Claude AI ACMG Analysis: claude.genobank.app, Expert Curator, ACMG/AMP Guidelines → ACMG Variant Report (Pathogenic, VUS, Benign)
- Tokenize ACMG (Great-Grandchild): Mint ACMG NFT as GREAT-GRANDCHILD (complete provenance chain) → Register as Derivative IP → BioCID Generated (biocid://v1/story/IPA/0xACMG/...) → All files accessible via BioFS, cryptographically verified
Every analysis step creates a new IP asset as a derivative of its parent. This ensures:
- Provenance Tracking: Trace ACMG report back to physical biosample tube
- Royalty Flow: Programmable % of any commercial use flows up the chain to original owner
- PIL Inheritance: License terms propagate through entire analysis tree
- BioFS Access: All files mountable via POSIX interface using web3fuse
- GDPR Compliance: Revoking consent deletes entire tree cryptographically
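The provenance claim above amounts to following parent links from any derivative back to the root. A minimal sketch of that walk, assuming the hierarchy is available as a simple child-to-parent mapping; the asset names are placeholders for illustration, not real Story Protocol IP IDs, and the real registry lookup is not shown:

```python
from typing import Dict, List, Optional

# Hypothetical parent pointers mirroring the hierarchy described above.
# Keys and values are illustrative labels, not on-chain identifiers.
PARENT_OF: Dict[str, Optional[str]] = {
    "acmg_report": "sqlite_annotations",
    "sqlite_annotations": "vcf",
    "vcf": "fastq",
    "fastq": "biosample",
    "biosample": None,  # root: the BioNFT minted for the physical tube
}


def trace_provenance(asset: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Return the lineage from `asset` up to the root, child first."""
    lineage: List[str] = []
    seen = set()
    current: Optional[str] = asset
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        if current not in parent_of:
            raise KeyError(f"unknown asset {current!r}")
        seen.add(current)
        lineage.append(current)
        current = parent_of[current]
    return lineage


lineage = trace_provenance("acmg_report", PARENT_OF)
print(" -> ".join(lineage))
```

The cycle check is defensive: a derivative tree should never loop, but a malformed mapping would otherwise hang the walk.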
BioNFS MCP Server - AI Agent Access Layer
The World's First BioNFT-Gated Human Biodata MCP Server for Compliant AI Agents is NOW OPERATIONAL!
Complete Biosample Metamorphosis: From Physical Tube → AI Intelligence
Made with 🧬 by GenoBank.io - Decentralizing Human Biodata Ownership
The Final Stage: Biosample Metamorphosis Complete
The BioNFS MCP Server represents the culmination of the biosample journey: the moment when physical biological material transforms into AI-accessible intelligence while preserving patient ownership, consent, and programmable royalties.
Physical Biosample (Tube with DNA)
  ↓ Sequencing
FASTQ Files (Raw genomic data)
  ↓ Variant Calling (Clara Parabricks)
VCF Files (Variants identified) → Tokenized as IP Asset
  ↓ Annotation (OpenCRAVAT)
SQLite Results (Clinical annotations) → Tokenized as Derivative IP
  ↓ Curation (Expert Curator)
CSV Reports (ACMG pathogenic variants) → Tokenized as Derivative IP
  ↓ BioNFS MCP Server
AI INTELLIGENCE (Claude Code, AI agents analyze via NFT-gated access)
  ↓ Usage & Royalties
Programmable Royalties flow back to patient through Story Protocol PIL
Architecture Overview
The BioNFS MCP Server implements the Model Context Protocol (MCP), an open-source standard created by Anthropic for connecting AI agents to external data sources. This enables AI systems like Claude Code to access tokenized genomic data while respecting NFT ownership and PIL license terms.
AI Agent Layer (Claude Code, etc.)
  ↓ MCP Protocol (JSON-RPC)
BioNFS MCP Server
- Web3 EIP-191 Signature Authentication
- 4 MCP Tools (list, query, metadata, stream)
- 3 Resources (@bionfs:vcf, gene, ip)
- 3 Prompts (analyze_trio, find_pathogenic, gene_report)
  ↓ Verify NFT Ownership
Story Protocol Blockchain
- IP Asset Registry (who owns what)
- License Terms (PIL commercial/derivatives)
- Royalty Tracking (% back to patient)
  ↓ Fetch Data
BioNFT-Gated Storage
- AWS S3: test.vault.genoverse.io
- MongoDB: IP metadata, job results
- SQLite: OpenCRAVAT annotation databases
MCP Capabilities
1. MCP Tools (AI-Callable Functions)
- list_accessible_files: Returns all genomic files (VCF, SQLite, CSV) that the authenticated wallet has NFT access to
- query_variants: Query variants from VCF or SQLite by gene symbol, chromosome region, or clinical significance
- get_file_metadata: Get detailed metadata about IP assets: owner, license terms, derivative relationships, royalty settings
- stream_file: Generate presigned S3 URLs for secure file downloads (bypasses Cloudflare 100MB limits)
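Since MCP runs over JSON-RPC 2.0, calling one of the tools above comes down to sending a `tools/call` request. The sketch below only builds the request body. The tool name comes from the list above, but the argument names (`gene`, `significance`) are assumptions for illustration, since the tool's input schema is not given here:

```python
import json
from itertools import count
from typing import Any, Dict

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def build_tool_call(tool_name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names are hypothetical; consult the server's tool schema for the real ones.
request = build_tool_call(
    "query_variants",
    {"gene": "BRCA1", "significance": "pathogenic"},
)
print(json.dumps(request, indent=2))
```

An MCP client library would normally construct this for you; the raw payload is shown only to make the wire format concrete.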
2. MCP Resources (@ Mention Support)
- @bionfs:vcf://<job_id>/<region>: Access specific VCF regions, e.g., @bionfs:vcf://job_123/chr17:41196312-41277500 (BRCA1)
- @bionfs:gene://<gene_symbol>: Search all accessible files for a gene, e.g., @bionfs:gene://BRCA2
- @bionfs:ip://<ip_id>: Get IP asset metadata from Story Protocol, e.g., @bionfs:ip://0x123...
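To make the VCF resource format concrete, here is a small parser sketch for the `@bionfs:vcf://<job_id>/<region>` form shown above. It assumes the region is always written as `chrom:start-end`, as in the BRCA1 example; other region spellings, or URIs without a region, are not handled. The class and function names are illustrative only:

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VcfResource:
    job_id: str
    chrom: str
    start: int
    end: int


# Optional leading "@" so both the mention form and the bare URI are accepted.
_VCF_URI = re.compile(
    r"^@?bionfs:vcf://(?P<job>[^/]+)/(?P<chrom>[^:/]+):(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_vcf_resource(uri: str) -> VcfResource:
    """Parse a VCF resource URI into job id and genomic interval."""
    match = _VCF_URI.match(uri.strip())
    if match is None:
        raise ValueError(f"not a bionfs VCF resource URI: {uri!r}")
    start, end = int(match["start"]), int(match["end"])
    if start > end:
        raise ValueError(f"region start {start} is after end {end}")
    return VcfResource(match["job"], match["chrom"], start, end)


resource = parse_vcf_resource("@bionfs:vcf://job_123/chr17:41196312-41277500")
print(resource)
```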
3. MCP Prompts (Slash Commands)
- /bionfs_analyze_trio: Automated trio analysis workflow. Compares father/mother/child VCFs to identify de novo, inherited, and compound heterozygous variants
- /bionfs_find_pathogenic: Search for pathogenic and likely pathogenic variants across all accessible VCFs, filtered by ACMG criteria
- /bionfs_gene_report: Generate a comprehensive gene report: all variants in a gene, clinical annotations, population frequencies, predictions
Authentication Flow
The MCP Server uses Web3 cryptographic signatures for passwordless, wallet-based authentication:
# 1. User signs a standard message with their wallet (client side, JavaScript)
message = "I want to proceed"
signature = await wallet.signMessage(message)

# 2. AI agent sends the signature with MCP requests (shell)
curl -H "Authorization: Bearer ${signature}" \
  https://mcp.genobank.app/mcp

# 3. Server recovers the wallet address from the signature (Python, eth_account)
wallet_address = Account.recover_message(
    encode_defunct(text="I want to proceed"),
    signature=signature
)

# 4. Server checks Story Protocol for NFT ownership
owned_nfts = story_protocol.get_ip_assets(wallet_address)

# 5. Only return data for NFTs the wallet owns
accessible_files = filter_by_ownership(all_files, owned_nfts)
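Steps 2 and 5 of the flow above can be sketched without any blockchain dependency. The example assumes the bearer token is a hex-encoded 65-byte ECDSA signature (r, s, v) and that each file record carries an `ip_id` field; both the field name and the helper names are hypothetical, and signature recovery itself (step 3) is deliberately left out:

```python
import re
from typing import Dict, Iterable, List, Set

# 65 bytes (r: 32, s: 32, v: 1) encoded as 130 hex characters after "0x".
_SIGNATURE = re.compile(r"^0x[0-9a-fA-F]{130}$")


def extract_signature(authorization_header: str) -> str:
    """Pull the signature out of an 'Authorization: Bearer <sig>' header value."""
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not _SIGNATURE.match(token):
        raise ValueError("expected 'Bearer 0x<130 hex chars>'")
    return token


def filter_by_ownership(
    all_files: Iterable[Dict[str, str]], owned_ip_ids: Set[str]
) -> List[Dict[str, str]]:
    """Keep only files whose IP asset id is owned by the wallet.

    Ids are compared case-insensitively because hex addresses may or may not
    be checksum-cased.
    """
    owned = {ip_id.lower() for ip_id in owned_ip_ids}
    return [f for f in all_files if f.get("ip_id", "").lower() in owned]


files = [
    {"ip_id": "0xAAA1", "filename": "genome.vcf"},
    {"ip_id": "0xBBB2", "filename": "other.vcf"},
]
accessible = filter_by_ownership(files, {"0xaaa1"})
print([f["filename"] for f in accessible])
```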
How Claude Code Connects
Developers and researchers can connect Claude Code (or any MCP-compatible AI agent) to their genomic data:
{
  "mcpServers": {
    "bionfs": {
      "transport": "http",
      "url": "https://mcp.genobank.app/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_WEB3_SIGNATURE"
      }
    }
  }
}
Once connected, Claude Code can naturally interact with your genomic data:
User: "Analyze my VCF for BRCA1 variants"
Claude: *Uses query_variants tool*
Found 3 variants in BRCA1:
• chr17:41246481 C>T (p.Arg1751Ter) - PATHOGENIC
• chr17:41256877 G>A (p.Ala1708Thr) - VUS
• chr17:41258504 T>C (p.Val1687Ala) - BENIGN

User: "Run a trio analysis for @bionfs:vcf://trio_123"
Claude: *Uses /bionfs_analyze_trio prompt*
Analyzing family trio (Father + Mother + Child)...
• 12 de novo variants detected
• 3 compound heterozygous in CFTR gene
• Inheritance pattern suggests autosomal recessive
The Complete Metamorphosis: Patient Empowerment
This is where the biosample journey reaches its conclusion:
Patient donates biosample
  ↓ Lab sequences and tokenizes (VCF as IP Asset)
Patient owns NFT = Ownership + License Terms (PIL)
  ↓ Researcher's AI agent requests access
BioNFS MCP Server verifies NFT ownership
  ↓ If researcher has license token:
AI analyzes data via MCP tools
  ↓ AI discovers novel insight
Royalties flow back to patient (programmable %)
  ↓ Patient shares in value created
Patient empowered - True data ownership realized
This completes the biosample metamorphosis: A physical tube of DNA transforms into AI-accessible intelligence, while the patient retains ownership, consent control, and receives royalties from every use. The BioNFS MCP Server is the bridge that makes this possible, connecting AI agents to human biodata while preserving the ethical, legal, and economic rights of the individual.
Why This Matters:
For the first time in history, AI agents can analyze human genomic data with full compliance, respecting consent, ownership, and royalty obligations. The MCP Server enforces these rights at the protocol level, not as an afterthought.
This is the future of ethical AI × genomics.
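Technical Specifications
| Specification | Value |
|---|---|
| Endpoint | https://mcp.genobank.app |
| Authentication | Web3 EIP-191 Signatures |
| Protocol | MCP (Model Context Protocol) |
| Tools | 4 (list, query, metadata, stream) |
| Resources | 3 (@bionfs:vcf, gene, ip) |
| Prompts | 3 (trio, pathogenic, gene) |
| Blockchain | Story Protocol (Aeneid Testnet) |
| Storage | AWS S3 (NFT-gated buckets) |
| Database | MongoDB Atlas + SQLite |
| Runtime | Node.js v24.2.0 (TypeScript) |
| Process manager | systemd (bionfs-mcp.service) |
| Reverse proxy | Nginx + Cloudflare SSL |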
Additional Resources
BioCID Specification (Blockchain Content Identifier)
BioCID Format
Components:
- scheme: biocid:// (protocol identifier)
- version: v1 (BioCID format version)
- chain: Blockchain network (avalanche, story, ethereum, sepolia)
- collection: NFT collection contract address (0x...)
- tokenId: NFT token ID (decimal or hex)
- path: Optional file path within NFT storage
Examples
# FASTQ files for biosample 55052008714000
biocid://v1/story/IPA/0x19A615224D03487AaDdC43e4520F9D83923d9512/55052008714000/R1.fastq.gz

# VCF file on Story Protocol
biocid://v1/story/0x5021F7438ea502b0c346cB59F8E92B749Ecd74B5/41221040804049/genome.vcf

# SQLite annotation database
biocid://v1/story/IPA/0xB8d03f2E1C02e4cC5b5fe1613c575c01BDD12269/55052008714000/annotation.sqlite
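The examples above can be split into the listed components with a few lines of string handling. This is an illustrative Python sketch, not the biocid_parser.c implementation. Note that some examples carry an extra `IPA` segment before the collection address and one does not; the sketch treats any non-`0x` segment in that position as an optional asset-type marker, which is an assumption rather than documented behaviour:

```python
from dataclasses import dataclass
from typing import Optional

SCHEME = "biocid://"


@dataclass(frozen=True)
class BioCID:
    version: str
    chain: str
    collection: str
    token_id: str
    path: Optional[str] = None
    asset_type: Optional[str] = None  # e.g. "IPA" when present (assumed meaning)


def parse_biocid(uri: str) -> BioCID:
    """Split a biocid:// URI into its components."""
    if not uri.startswith(SCHEME):
        raise ValueError(f"missing {SCHEME} scheme: {uri!r}")
    parts = uri[len(SCHEME):].split("/")
    if len(parts) < 4 or any(not p for p in parts):
        raise ValueError(f"malformed BioCID: {uri!r}")
    version, chain, rest = parts[0], parts[1], parts[2:]
    asset_type = None
    if not rest[0].startswith("0x"):
        # Optional segment such as "IPA" placed before the collection address.
        asset_type, rest = rest[0], rest[1:]
    if len(rest) < 2 or not rest[0].startswith("0x"):
        raise ValueError(f"missing collection or token id: {uri!r}")
    path = "/".join(rest[2:]) or None
    return BioCID(version, chain, rest[0], rest[1], path, asset_type)


fastq = parse_biocid(
    "biocid://v1/story/IPA/0x19A615224D03487AaDdC43e4520F9D83923d9512/55052008714000/R1.fastq.gz"
)
print(fastq.chain, fastq.token_id, fastq.path)
```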
Supported Chains
| Chain | Chain ID | RPC Endpoint | Usage |
|---|---|---|---|
| avalanche | 43114 | https://api.avax.network/ext/bc/C/rpc | Production biosamples |
| story | 1513 | https://rpc.odyssey.storyrpc.io | PIL licensing, IP registry |
| ethereum | 1 | https://eth.llamarpc.com | High-value IP assets |
| sequentias | 15132025 | http://52.90.163.112:8545 | Consent management |
Resolution Process
web3fuse (C Implementation)
web3fuse is a blockchain-native FUSE filesystem written in pure C (46KB binary) that enables NFT-gated access to genomic data. Unlike traditional cloud storage wrappers, web3fuse uses blockchain smart contracts as the source of truth for data ownership, permissions, and storage locations.
Architecture
BioCID Parser
Parses blockchain content identifiers (biocid://v1/...) and extracts chain, collection, and token information.
Implementation: biocid_parser.c
Function: biocid_parse_uri(const char *uri, biocid_t *out)
Web3 Signature Verifier
Verifies Ethereum signatures (EIP-191) using libsecp256k1 for ECDSA recovery and XKCP for Keccak-256 hashing.
Implementation: web3_signature.c
Libraries: libsecp256k1, XKCP (Keccak-256)
Function: web3_verify_signature(msg, sig, expected_address)
NFT Ownership Verifier
Checks NFT ownership on-chain via JSON-RPC. Auto-detects ERC721/ERC1155 using supportsInterface queries.
Implementation: nft_verifier.c
Standards: ERC721 (0x80ac58cd), ERC1155 (0xd9b67a26)
Function: nft_verify_ownership(chain, collection, token, owner)
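The ERC721/ERC1155 auto-detection described above relies on the ERC-165 `supportsInterface(bytes4)` call. As an illustration in Python (the actual verifier is written in C, and its internals are not shown here), this sketch builds the `eth_call` JSON-RPC payload and decodes the boolean result. The helper names are hypothetical; the selector `0x01ffc9a7` and the two interface ids are the standard values:

```python
from typing import Any, Dict

SUPPORTS_INTERFACE_SELECTOR = "0x01ffc9a7"  # supportsInterface(bytes4), per ERC-165
ERC721_INTERFACE_ID = "0x80ac58cd"
ERC1155_INTERFACE_ID = "0xd9b67a26"


def encode_supports_interface(interface_id: str) -> str:
    """ABI-encode supportsInterface(interface_id).

    A bytes4 argument is left-aligned and zero-padded on the right to 32 bytes.
    """
    raw = interface_id[2:] if interface_id.startswith("0x") else interface_id
    if len(raw) != 8:
        raise ValueError("interface id must be exactly 4 bytes")
    int(raw, 16)  # raises ValueError if not valid hex
    return SUPPORTS_INTERFACE_SELECTOR + raw.lower().ljust(64, "0")


def build_eth_call(contract: str, data: str, request_id: int = 1) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 eth_call request against the latest block."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "eth_call",
        "params": [{"to": contract, "data": data}, "latest"],
    }


def decode_bool(result_hex: str) -> bool:
    """Decode an ABI-encoded bool; an empty result is treated as False."""
    if result_hex in ("", "0x"):
        return False
    return int(result_hex, 16) != 0


payload = build_eth_call(
    "0x19A615224D03487AaDdC43e4520F9D83923d9512",
    encode_supports_interface(ERC1155_INTERFACE_ID),
)
print(payload["params"][0]["data"])
```

A detector would send this payload once per interface id and pick the standard whose call decodes to true.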
RPC Client
Communicates with blockchain nodes using JSON-RPC 2.0. Supports multi-chain queries with connection pooling.
Implementation: rpc_client.c
Transport: libcurl (HTTP/HTTPS)
Parsing: jansson (JSON)
BioCID Resolver
Resolves BioCID URIs to physical storage locations (S3, IPFS, Arweave) by querying NFT metadata.
Implementation: biocid_resolver.c
Function: biocid_resolve(biocid_uri, storage_info_t *out)
Caching: 1 hour TTL for metadata
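The 1 hour metadata TTL mentioned above can be pictured as a small time-based cache in front of the resolver. The following is a Python illustration of the idea, not the C implementation; the class name and the injectable clock are assumptions made so the behaviour is easy to test:

```python
import time
from typing import Callable, Dict, Hashable, Tuple

METADATA_TTL_SECONDS = 3600  # "1 hour TTL for metadata" from the resolver notes


class TTLCache:
    """Cache values for a fixed time; expired entries are reloaded on access."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        if ttl_seconds <= 0:
            raise ValueError("ttl_seconds must be positive")
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, object]] = {}

    def get_or_load(self, key: Hashable, loader: Callable[[], object]) -> object:
        """Return the cached value, calling `loader` only on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]
        value = loader()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Hashable) -> None:
        """Drop an entry, e.g. after an ownership change is observed."""
        self._entries.pop(key, None)


metadata_cache = TTLCache(METADATA_TTL_SECONDS)
```

A usage example would wrap the expensive lookup: `metadata_cache.get_or_load(biocid_uri, lambda: fetch_metadata(biocid_uri))`, where `fetch_metadata` stands in for whatever performs the RPC and HTTP fetch.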
Installation
# Install dependencies
sudo apt-get install -y build-essential meson ninja-build \
  libfuse3-dev libcurl4-openssl-dev \
  libjansson-dev autoconf automake libtool

# Build libsecp256k1 (ECDSA signature verification)
git clone https://github.com/bitcoin-core/secp256k1.git
cd secp256k1
./autogen.sh && ./configure --enable-module-recovery
make && sudo make install && sudo ldconfig
cd ..

# Build web3fuse
git clone https://github.com/Genobank/web3fuse.git
cd web3fuse
make

# Run tests
make test
Usage
# Set user signature for authentication
export USER_SIGNATURE="0xa5141ae955bba91ad46a940aefc3b05120489b8b..."

# Mount FASTQ files via BioCID
biofs --mount /mnt/genomics \
  --uri "biocid://v1/story/IPA/0x19A615224D03487AaDdC43e4520F9D83923d9512/55052008714000"

# Access files (NFT ownership verified on-chain)
ls /mnt/genomics/
head /mnt/genomics/55052008714000_R1.fastq.gz

# Unmount
fusermount -u /mnt/genomics
Performance
| Operation | Latency | Notes |
|---|---|---|
| NFT Verification | ~200ms | Blockchain RPC call, cached 5 min |
| Signature Verification | <1ms | Local ECDSA recovery |
| BioCID Resolution | ~300ms | Blockchain + HTTP metadata fetch |
| File Open | ~250ms | After initial verification |
| File Read | ~5ms | S3 streaming (cached) |
web3fuse NEVER stores AWS access keys or blockchain private keys in configuration files. All credentials are resolved from NFT metadata or user signatures at runtime. This eliminates credential leakage vulnerabilities common in traditional cloud storage systems.
biofs-sdk (Python)
biofs-sdk is a Python library that provides programmatic access to BioCID-referenced genomic data. It enables data scientists to embed blockchain-gated genomic data access directly in their Python scripts, Jupyter notebooks, and analysis pipelines.
Installation
# Install via pip
pip install biofs-sdk

# Or install from source
git clone https://github.com/Genobank/biofs-sdk.git
cd biofs-sdk
pip install -e .
Quick Start
from biofs import BioCIDClient

# Initialize client with Web3 signature
client = BioCIDClient(wallet_signature="0xa5141ae955bba91ad46a940aefc3b05120489b8b...")

# Verify access to BioCID (NFT ownership check)
biocid_url = "biocid://v1/story/IPA/0x19A615224D03487AaDdC43e4520F9D83923d9512/55052008714000/genome.vcf"
if client.verify_access(biocid_url):
    print("Access granted")

# Download file to local path
client.download(biocid_url, output_path="./genome.vcf")

# Stream large files (memory-efficient); process_chunk is your own handler
for chunk in client.stream(biocid_url, chunk_size=8192):
    process_chunk(chunk)

# Get metadata without downloading
info = client.info(biocid_url)
print(f"File size: {info['size']} bytes")
API Reference
BioCIDClient
from typing import Dict, Iterator

class BioCIDClient:
    def __init__(self, wallet_signature: str):
        """Initialize client with Web3 signature"""

    def verify_access(self, biocid_url: str) -> bool:
        """Verify NFT ownership and access permissions"""

    def download(self, biocid_url: str, output_path: str) -> None:
        """Download file from BioCID to local path"""

    def stream(self, biocid_url: str, chunk_size: int = 8192) -> Iterator[bytes]:
        """Stream file in chunks (memory-efficient for large files)"""

    def info(self, biocid_url: str) -> Dict:
        """Get metadata without downloading (size, owner, PIL terms)"""
Response Format
# info() returns:
{
    "size": 1048576,  # bytes
    "owner": "0x5f5a...",
    "ipId": "0x19A6...",
    "license_terms": {...},
    "metadata_uri": "ipfs://Qm..."
}
Advanced Usage
from biofs import BioCIDClient
import pysam
import pandas as pd
import os

# Initialize client
client = BioCIDClient(wallet_signature=os.environ['USER_SIGNATURE'])

# BioCID URLs for biosample files
biosample_id = "55052008714000"
vcf_biocid = f"biocid://v1/story/IPA/0x19A6.../{biosample_id}/genome.vcf"

# Verify access before processing
if not client.verify_access(vcf_biocid):
    raise PermissionError("You do not have access to this genomic data")

# Download VCF file
vcf_path = "./genome.vcf"
client.download(vcf_biocid, output_path=vcf_path)

# Analyze with pysam
vcf = pysam.VariantFile(vcf_path)
pathogenic_variants = []
for variant in vcf:
    # CLNSIG may be a single string or a tuple of strings, depending on the VCF header
    clnsig = variant.info.get('CLNSIG') or ()
    if isinstance(clnsig, str):
        clnsig = (clnsig,)
    if 'Pathogenic' in clnsig:
        pathogenic_variants.append({
            'chrom': variant.contig,
            'pos': variant.pos,
            'ref': variant.ref,
            'alt': variant.alts[0]
        })

# Generate report
df = pd.DataFrame(pathogenic_variants)
print(f"Found {len(df)} pathogenic variants")

# Get file metadata and provenance
info = client.info(vcf_biocid)
print(f"Owner: {info['owner']}")
print(f"IP Asset: {info['ipId']}")
biofs-sdk is designed to work seamlessly with existing bioinformatics tools like pysam, bcftools, and samtools. Once you download a BioCID to a local file path, standard tools can analyze the data. The blockchain NFT verification happens before download to ensure proper access control.
BioFS CLI v1.6.2 - Node Management & Enhanced Features
BioFS CLI v1.6.2 adds BioFS Node management capabilities, allowing labs to run decentralized infrastructure nodes directly from the CLI. It also improves file handling, node discovery, and integration with sponsored AI/blockchain services.
What's New in v1.6.2
New biofs node command enables labs to run their own infrastructure nodes, sponsor AI costs for users, and manage decentralized data replication. Includes node start, status, and configuration management.
Installation
# Run via npx without a global install (recommended)
npx @genobank/biofs@latest --version

# Or install globally
npm install -g @genobank/biofs@latest

# Verify installation
biofs --version
# Output: 1.6.2
Debug Mode Usage
# Normal tokenization - no debug messages
biofs tokenize genome.vcf --network mainnet --title "Patient Genome"
# Output:
Step 1/6: File validated: genome.vcf (5.2 GB)
Step 2/6: AI Classification: VCF Germline Variants
Step 3/6: Uploaded to encrypted S3 vault
Step 4/6: Registered in MongoDB
Step 5/6: Minting IP Asset NFT on Story Protocol mainnet...
Step 6/6: BioIP Tokenization Complete!
IP Asset ID: 0x1234567890abcdef...
BioCID: biocid://v1/story/IPA/0x5021F7.../55052008714000
PIL License: Non-Commercial Research

# Same command with --debug flag
biofs tokenize genome.vcf --network mainnet --title "Patient Genome" --debug
# Output: (includes internal operations)
Calling get_my_granted_bioips...
Signature preview: 0xa5141ae955bba91ad46a940aef...
API Response status: Success
Full response: {
  "status": "Success",
  "status_details": {
    "granted_bioips": [...],
    "total_count": 3
  }
}
Parsed granted BioIPs count: 3
First granted file: {
  "ip_id": "0x789...",
  "filename": "reference_genome.vcf",
  "license_type": "non-commercial"
}
Step 1/6: File validated: genome.vcf (5.2 GB)
Detected IP Asset ID: 0x1234567890abcdef...
Calling getBioIPDownloadURL API...
API Response: {
  "access_granted": true,
  "presigned_url": "https://s3.amazonaws.com/...",
  "license_type": "non-commercial"
}
Access granted! Filename: genome.vcf
Step 2/6: AI Classification: VCF Germline Variants
Response status: Success
Full response: {
  "status": "Success",
  "bioip_id": "12345",
  "s3_path": "/biowallet/0x5f5a60.../genome.vcf",
  "mongodb_id": "67890"
}
Step 3/6: Uploaded to encrypted S3 vault
Step 4/6: Registered in MongoDB
Step 5/6: Minting IP Asset NFT on Story Protocol mainnet...
Step 6/6: BioIP Tokenization Complete!
IP Asset ID: 0x1234567890abcdef...
BioCID: biocid://v1/story/IPA/0x5021F7.../55052008714000
PIL License: Non-Commercial Research
Debug Mode for All Commands
| Command | Without --debug | With --debug |
|---|---|---|
| biofs files | Shows table of files only | + API calls, BioCID resolution, NFT queries |
| biofs download | Progress bar only | + Signature verification, ownership checks, S3 pre-signed URL generation |
| biofs tokenize | 6 step progress indicators | + AI classification details, blockchain responses, MongoDB confirmations |
| biofs mount | Mount confirmation only | + FUSE operations, NFT verification, credential decryption |
| biofs access list | Shows permittees table | + Blockchain RPC calls, license term queries, PIL verification |
Environment Variable Override
# Set DEBUG environment variable (persists for session)
export DEBUG=1

# All subsequent commands will show debug output
biofs tokenize genome.vcf --network mainnet
biofs files
biofs download biocid://v1/story/IPA/0x123.../genome.vcf

# Disable debug mode
unset DEBUG
Use Cases
Development & Testing
Use --debug during development to inspect API responses, blockchain transactions, and internal state:
biofs tokenize test.vcf --network testnet --debug
Production Pipelines
Omit --debug in automated scripts for clean logs suitable for monitoring dashboards:
#!/bin/bash
# Cron job: Daily genome upload
for vcf in /data/genomes/*.vcf; do
  biofs tokenize "$vcf" --network mainnet --quiet
done
Debugging Access Issues
When troubleshooting permission denials, use --debug to see NFT ownership verification steps:
biofs download biocid://v1/story/IPA/0x789.../genome.vcf --debug
# Debug output shows:
Checking access for IP Asset 0x789...
User signature: 0xa5141ae955bba9...
Response status: Failure
Full response: {
  "status": "Failure",
  "status_details": {
    "error": "NFT ownership verification failed",
    "has_access": false,
    "reason": "User does not own NFT token 55052008714000"
  }
}
Access denied to this BioIP asset
Debug logs may reveal:
- Partial Web3 signatures (first 30 characters)
- API response structures
- S3 bucket paths and file metadata
- Blockchain transaction details
Do not share debug logs publicly or commit them to version control. Use --debug only for local troubleshooting.
Implementation Details
// src/lib/utils/logger.ts
import chalk from 'chalk';

export class Logger {
  // Printed only when DEBUG is set (via --debug or the environment)
  static debug(message: string): void {
    if (process.env.DEBUG) {
      console.log(chalk.gray(message));
    }
  }

  static info(message: string): void {
    console.log(chalk.blue(`ℹ️ ${message}`));
  }

  static success(message: string): void {
    console.log(chalk.green(`✅ ${message}`));
  }

  static error(message: string): void {
    console.log(chalk.red(message));
  }
}

// Global debug flag set by CLI (assumes `program` is the commander instance)
program
  .option('--debug', 'Enable debug output')
  .hook('preAction', (thisCommand) => {
    const opts = thisCommand.opts();
    if (opts.debug) {
      process.env.DEBUG = '1';
    }
  });
Upgrading from v1.4.x or v1.5.x is seamless. All existing commands work as before, with new biofs node commands available:
# Upgrade globally
npm update -g @genobank/biofs

# Verify version
biofs --version
# Should output: 1.6.2

# No breaking changes - all existing scripts work as before
BioFS Node - Decentralized Infrastructure
BioFS Node enables labs and research institutions to run their own decentralized infrastructure nodes, sponsor AI and blockchain costs for their users, and participate in the GenoBank network without relying on centralized infrastructure.
- Data Sovereignty: Host your own S3 buckets, control replication, and maintain complete ownership of your genomic data infrastructure.
- Cost Sponsorship: Pay once for AI (Anthropic Claude) and blockchain (Story Protocol) services, then sponsor unlimited usage for all your lab's users.
- Network Participation: Become part of the decentralized GenoBank network instead of depending on a single provider.
Node Architecture
BioFS supports two node types, each serving a specific role in the decentralized network:
Master Node
GenoBank.io's Core Infrastructure
- Manages lab registrations and discovery
- Coordinates cross-region data replication
- Provides fallback AI/blockchain proxies
- Monitors network health and lab status
Lab Node
Your Research Institution's Infrastructure
- Self-host S3 data storage with full control
- Sponsor AI analysis costs for your users
- Sponsor Story Protocol gas for your lab
- IPFS pinning for metadata and images
- Automatic S3 disaster recovery replication
Core Capabilities
Sponsor Claude AI analysis for your lab's users. Set quota limits, track usage, and pay once instead of per-user billing. Users get AI-powered variant interpretation without individual API keys.
Sponsor blockchain gas costs for IP asset registration, license minting, and derivative creation. Your users tokenize genomic data without needing ETH or managing wallets.
Manage your own S3 buckets for genomic data storage. Generate presigned URLs, track file uploads, and maintain complete control over data location and access policies.
Pin images and anonymized metadata to IPFS with automatic GDPR-compliance checks. Prevents genomic data from being pinned to immutable storage (enforces data erasure rights).
Automatic disaster recovery with differential sync. Only changed files are replicated (ETag-based detection), saving bandwidth. Primary and replica buckets in different regions for geographic redundancy.
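The ETag-based differential sync described above can be sketched as a comparison of two listings. This is a minimal illustration assuming each bucket listing is available as a mapping from object key to ETag; how the listings are obtained, and whether keys missing from the primary are deleted on the replica, are not specified here, so the sketch only reports them:

```python
from typing import Dict, List


def plan_replication(primary: Dict[str, str], replica: Dict[str, str]) -> Dict[str, List[str]]:
    """Decide which objects need copying, based on ETag differences.

    `primary` and `replica` map object keys to ETags. An object is copied when
    it is missing from the replica or its ETag differs. Keys present only on
    the replica are reported as orphaned rather than acted upon.
    """
    to_copy = sorted(key for key, etag in primary.items() if replica.get(key) != etag)
    orphaned = sorted(key for key in replica if key not in primary)
    return {"copy": to_copy, "orphaned": orphaned}


# Illustrative listings; keys and ETags are made up.
primary_listing = {
    "55052008714000/R1.fastq.gz": "etag-a",
    "55052008714000/R2.fastq.gz": "etag-b",
    "55052008714000/genome.vcf": "etag-c2",
}
replica_listing = {
    "55052008714000/R1.fastq.gz": "etag-a",
    "55052008714000/genome.vcf": "etag-c1",
    "old/removed.vcf": "etag-z",
}
plan = plan_replication(primary_listing, replica_listing)
print(plan)
```

Only the changed VCF and the missing R2 file are scheduled for copying; the unchanged R1 file is skipped, which is where the bandwidth saving comes from.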
Installation & Setup
# Install BioFS CLI v1.6.2+
npm install -g @genobank/biofs@latest

# Verify node command is available
biofs node --help

# Generate configuration template
biofs node init --type lab --name "AUGenomics Lab"

# Edit lab-config.yaml with your settings:
# - node.wallet: Your lab's Ethereum wallet
# - storage.bucket: Your S3 bucket name
# - database.url: PostgreSQL connection string
# - services.anthropic_api_key: Claude API key (optional)
# - services.story_executor_key: Story Protocol executor (optional)

# Start node with configuration
biofs node start --config lab-config.yaml
# Output:
Lab registry started
S3 File Manager started (bucket: s3://augenomics-genomics)
IPFS Pinning Service started
Anthropic Proxy enabled (quota: 1M tokens/month)
Story Protocol Proxy enabled (quota: 100 transactions/month)
API server started on port 8080
Lab Node running at https://biofs.augenomics.org
Cost Sponsorship Model
Instead of each researcher paying for AI and blockchain services individually, your lab node sponsors costs for all users:
| Service | Without Lab Node | With Lab Node |
|---|---|---|
| Claude AI Analysis | Each researcher needs API key; $20/month per user | Lab pays once ($200/month); unlimited researchers |
| Story Protocol Gas | Each researcher needs ETH; ~$5 per tokenization | Lab sponsors gas costs; users tokenize for free |
| S3 Storage | GenoBank's S3 bucket; $0.023/GB | Your own S3 bucket; full control + negotiated rates |
| Data Sovereignty | Trust centralized provider | Own your infrastructure |
Quota Management
# View current usage and limits
biofs node quota --service anthropic
# Output:
Anthropic Quota Status
Monthly Limit: 1,000,000 tokens
Used This Month: 450,230 tokens (45%)
Remaining: 549,770 tokens
Reset Date: 2025-11-01

# View Story Protocol gas usage
biofs node quota --service story
# Output:
Story Protocol Gas Quota
Monthly Limit: 100 transactions
Used This Month: 37 transactions (37%)
Remaining: 63 transactions
Reset Date: 2025-11-01
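The figures in the quota output above follow from simple arithmetic on the limit and the amount used. A small sketch of that bookkeeping, with a hypothetical `QuotaStatus` type that is not part of the BioFS CLI; the `allows` check is an assumed policy (reject a request that would exceed the monthly limit):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotaStatus:
    limit: int  # monthly limit (tokens or transactions)
    used: int   # amount consumed so far this month

    def __post_init__(self) -> None:
        if self.limit <= 0 or self.used < 0:
            raise ValueError("limit must be positive and used non-negative")

    @property
    def remaining(self) -> int:
        return max(self.limit - self.used, 0)

    @property
    def percent_used(self) -> int:
        return round(100 * self.used / self.limit)

    def allows(self, requested: int) -> bool:
        """True if `requested` more units still fit within the monthly limit."""
        return self.used + requested <= self.limit


anthropic = QuotaStatus(limit=1_000_000, used=450_230)
story = QuotaStatus(limit=100, used=37)
print(anthropic.remaining, anthropic.percent_used)
print(story.remaining, story.percent_used)
```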
Disaster Recovery
Lab nodes automatically replicate data to backup regions for disaster recovery:
# In lab-config.yaml
replication:
  enabled: true
  interval_hours: 24
  replica_bucket: backup.augenomics.org
  replica_region: us-east-1  # Different from primary region

# Check replication status
curl http://localhost:8080/api/v1/replication/health
# Output:
{
  "success": true,
  "health": {
    "replication_enabled": true,
    "primary_bucket": "augenomics-genomics",
    "replica_bucket": "backup.augenomics.org",
    "interval_hours": 24,
    "last_run": "2025-10-24T17:00:00Z"
  }
}
Running your own node means no single point of failure. If GenoBank.io goes offline, your lab continues operating with full access to your data, AI capabilities, and blockchain services. The network is resilient because it's distributed across multiple independent nodes.
API Endpoints
Lab nodes expose RESTful APIs for integration with existing systems:
- POST /api/v1/anthropic/messages - Claude AI analysis with quota management
- POST /api/v1/story/register-ip - Register IP assets with sponsored gas
- POST /api/v1/files/upload - Upload files to your S3 bucket
- POST /api/v1/ipfs/pin-image - Pin images to IPFS with GDPR checks
- GET /api/v1/replication/stats - View replication statistics
For Research Institutions: BioFS Node transforms genomic infrastructure from vendor lock-in to true data sovereignty. Run it on-premises, in your cloud environment, or hybrid. Full control, full transparency, full ownership.
NFT Verification & Access Control
3-Factor Verification Model
Every file access requires three simultaneous verifications:
Verification Flow
Credential Management (Zero Hardcoded Keys)
web3fuse and biofs-sdk eliminate the need for hardcoded AWS credentials. All storage credentials are resolved from NFT metadata at runtime:
{
  "name": "Biosample 55052008714000 - FASTQ Files",
  "description": "Whole Genome Sequencing FASTQ files",
  "image": "ipfs://QmXYZ.../thumbnail.png",
  // Storage location (public)
  "storage": {
    "type": "s3",
    "region": "us-east-1",
    "bucket": "test.vault.genoverse.io",
    "path": "/biowallet/0x5f5a60.../55052008714000/"
  },
  // Encrypted credentials (only decryptable by NFT owner)
  "credentials": {
    "encrypted": "0xABC123...", // Encrypted with owner's public key
    "algorithm": "ECIES",
    "expires_at": 1735689600 // Unix timestamp
  },
  // Story Protocol IP asset reference
  "ipAsset": {
    "ipId": "0x789...",
    "parentIpId": "0x456...", // Biosample tube NFT
    "licenseTerms": "0xDEF..."
  }
}
When BioFS needs to access S3:
- Query NFT metadata to get encrypted credentials
- Decrypt using user's wallet signature (ECIES with secp256k1)
- Use temporary credentials (expires in 1 hour)
- Cache decrypted credentials in kernel keyring
- Auto-refresh before expiry
This ensures that even if someone steals the metadata JSON, they cannot access the data without the NFT owner's private key.
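The caching and expiry behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the web3fuse or biofs-sdk implementation: `decrypt` is a caller-supplied placeholder for ECIES decryption, a plain dict stands in for the kernel keyring, and the 300-second refresh margin is an invented value.

```python
import time

REFRESH_MARGIN = 300  # seconds; an assumed safety margin, not taken from the source


def is_fresh(expires_at, now, margin=REFRESH_MARGIN):
    """True while credentials are more than `margin` seconds away from expiry."""
    return now < expires_at - margin


def resolve_credentials(metadata, decrypt, cache, now=None):
    """Return decrypted storage credentials, reusing the cache while still fresh.

    `decrypt` stands in for ECIES decryption with the owner's key and
    `cache` (a dict) stands in for the kernel keyring; both are assumptions.
    """
    now = time.time() if now is None else now
    storage = metadata["storage"]
    cache_key = (storage["bucket"], storage["path"])

    cached = cache.get(cache_key)
    if cached and is_fresh(cached["expires_at"], now):
        return cached["secret"]

    creds = metadata["credentials"]
    if not is_fresh(creds["expires_at"], now):
        # The refresh mechanism itself is not specified in the source.
        raise PermissionError("credentials expired or about to expire")

    secret = decrypt(creds["encrypted"])
    cache[cache_key] = {"secret": secret, "expires_at": creds["expires_at"]}
    return secret
```

Passing `now` explicitly keeps the function deterministic in tests; in normal use it defaults to the current time.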
OSI Layer 4.5 Architecture
Modified OSI Stack with NBDR
The NBDR Protocol operates at OSI Layer 4.5, intercepting data flows after transport reliability (TCP) is established but before session management begins. This allows routing decisions based on data ownership rather than network paths.
Why Layer 4.5?
By inserting NBDR between Transport (Layer 4) and Session (Layer 5), we gain:
- Reliable Transport: TCP guarantees data integrity before ownership checks run
- Pre-Session Interception: Unauthorized access is blocked before an expensive session is established
- Blockchain Query Efficiency: RPC calls to the chain run over a transport that is already reliable
- Transparent to Applications: Apps at Layer 7 see a standard POSIX filesystem and are unaware of the blockchain layer
Comparison: Traditional vs NBDR Routing
| Aspect | Traditional (IP-Based) | NBDR (NFT-Based) |
|---|---|---|
| Routing Basis | IP address, network topology | NFT ownership, blockchain state |
| Access Control | Firewall rules, ACLs (company-controlled) | Smart contracts, PIL terms (user-controlled) |
| Credential Storage | Config files, environment variables | NFT metadata, encrypted on-chain |
| Audit Trail | CloudTrail (mutable, company-controlled) | Blockchain (immutable, public/verifiable) |
| Revocation | Manual (IT ticket, days) | Instant (burn NFT, real-time) |
| Royalty Tracking | Not supported | Automatic (Programmable % via Story Protocol) |
Story Protocol PIL Integration
PIL (Programmable IP License) Terms
Every genomic data file is registered as an IP asset on Story Protocol with specific license terms. web3fuse enforces these terms at the filesystem level before granting access.
{
"license_type": "PIL",
"commercial_use": true, // Can be used commercially
"derivatives_allowed": true, // Can create derivative works (VCF, etc.)
"attribution_required": true, // Must credit original owner
"commercial_revenue_share": 15, // Programmable % royalty to original owner
"allowed_territories": ["*"], // Global access
"expiration": null, // No expiration
"revocable": true // Owner can revoke consent
}
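To make the terms concrete, here is a small sketch of how a client might evaluate a request against them before touching any data. The function name `is_permitted`, the operation name `"read"`, and the treatment of `expiration` as a Unix timestamp are assumptions for illustration. Only `commercial_use`, `derivatives_allowed`, and `expiration` are checked.

```python
import json
import time

# Same values as the PIL example above, with comments removed so it parses as JSON.
PIL_TERMS = json.loads("""
{
  "license_type": "PIL",
  "commercial_use": true,
  "derivatives_allowed": true,
  "attribution_required": true,
  "commercial_revenue_share": 15,
  "allowed_territories": ["*"],
  "expiration": null,
  "revocable": true
}
""")


def is_permitted(terms, operation, commercial, now=None):
    """Return True if `operation` is allowed under `terms`.

    Assumes `expiration` is either null or a Unix timestamp.
    """
    now = time.time() if now is None else now
    expiration = terms.get("expiration")
    if expiration is not None and now >= expiration:
        return False  # license has lapsed
    if operation == "derive" and not terms["derivatives_allowed"]:
        return False  # derivative works (e.g. a VCF) are not licensed
    if commercial and not terms["commercial_use"]:
        return False  # commercial callers are not licensed
    return True
```

With the example terms, `is_permitted(PIL_TERMS, "derive", commercial=True)` is allowed, since both derivatives and commercial use are enabled and there is no expiration.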
IP Asset Hierarchy Example
Root (biosample)
- Serial: 55052008714000
- Collection: 0xBiosample
- Owner: 0x5f5a60...

LEVEL 1: FASTQ Files (Parent)
- Collection: 0xFASTQ
- Token: 55052008714000
- biocid://v1/story/IPA/0xFASTQ/55052008714000/...
- PIL: Programmable % royalty to 0x5f5a60...

LEVEL 2: VCF File (Child of FASTQ)
- Collection: 0xVCF
- Token: 55052008714000
- biocid://v1/story/IPA/0xVCF/55052008714000/...
- Derivative IP: Inherits PIL from FASTQ
- Royalty Chain: Programmable % to FASTQ owner

LEVEL 3: SQLite DB (Grandchild of FASTQ)
- Collection: 0xSQLite
- Token: 55052008714000
- biocid://v1/story/IPA/0xSQLite/55052008714000/...
- Derivative IP: Inherits PIL from VCF
- Royalty Chain: Programmable % flows through VCF to FASTQ

LEVEL 4: ACMG Report (Great-Grandchild)
- Collection: 0xACMG
- Token: 55052008714000
- biocid://v1/story/IPA/0xACMG/55052008714000/...
- Derivative IP: Inherits PIL from SQLite
- Royalty Chain: Programmable % flows through SQLite → VCF → FASTQ
- Complete provenance to physical tube

Hierarchy: Root → FASTQ → VCF → SQLite → ACMG
When a pharmaceutical company accesses the ACMG report for drug development:
- Company pays license fee for commercial use
- Programmable % automatically flows to ACMG report owner (GenoBank/Lab)
- Programmable % of that flows to SQLite owner (OpenCRAVAT operator)
- Programmable % of that flows to VCF owner (Clara GPU operator)
- Programmable % of that flows to FASTQ owner (Sequencing lab)
- Final programmable % reaches original biosample owner (Patient)
This complete royalty chain is enforced by Story Protocol smart contracts, ensuring patients benefit financially from their genomic data contribution to research.
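The arithmetic of such a cascade can be illustrated with a short sketch. It assumes a simplified model that the source does not spell out: the ACMG report owner receives the full license fee, every level forwards the same fixed percentage of what it receives to its parent, and the 15% figure is borrowed from the `commercial_revenue_share` example. Actual Story Protocol royalty policies may distribute funds differently.

```python
from decimal import Decimal

# Child-to-root order, following the hierarchy described above.
CHAIN = ["ACMG report", "SQLite DB", "VCF", "FASTQ", "Biosample"]


def distribute(fee, chain, share_percent):
    """Split `fee` along `chain`, each level forwarding `share_percent` upward.

    Returns a dict mapping each level to the amount it keeps.
    """
    share = Decimal(share_percent) / Decimal(100)
    payouts = {}
    incoming = Decimal(fee)
    for index, level in enumerate(chain):
        is_root = index == len(chain) - 1
        # The root has no parent, so it keeps everything it receives.
        forwarded = Decimal(0) if is_root else incoming * share
        payouts[level] = incoming - forwarded
        incoming = forwarded
    return payouts


payouts = distribute(1000, CHAIN, 15)
```

Under these assumptions, a fee of 1000 leaves 850 with the ACMG report owner, 127.5 with the SQLite owner, and 0.50625 with the biosample owner, and the payouts always sum to the original fee.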
PIL Enforcement in web3fuse
// web3fuse/src/pil_verifier.c
#include <errno.h>    // EACCES
#include <stdbool.h>  // bool
#include <string.h>   // strcmp

// pil_terms_t, query_story_protocol(), is_commercial_entity() and
// log_access_for_royalty() are assumed to be declared elsewhere in web3fuse.

int check_pil_license(const char *biocid, const char *user_wallet,
                      const char *operation) {
    // 1. Query Story Protocol for the IP asset's license terms
    pil_terms_t terms = query_story_protocol(biocid);

    // 2. Check the requested operation against the terms
    if (strcmp(operation, "derive") == 0 && !terms.derivatives_allowed) {
        return -EACCES;  // Permission denied
    }

    // 3. Check whether the caller is a commercial entity
    bool is_commercial = is_commercial_entity(user_wallet);
    if (is_commercial && !terms.commercial_use) {
        return -EACCES;  // Commercial use not allowed
    }

    // 4. Log access for royalty tracking
    log_access_for_royalty(biocid, user_wallet, operation);
    return 0;  // Access granted
}
Complete NBDR Architecture
System Components
User
- Wallet: signs authentication message
- Application: Python script with biofs-sdk, or direct BioFS mount

BioFS Layer (web3fuse)
- FUSE Kernel Module: libfuse3
- BioCID Parser: extracts chain/collection/token
- Web3 Signature Verifier: libsecp256k1 + XKCP
- NFT Ownership Verifier: ERC721/ERC1155 queries
- PIL License Verifier: Story Protocol integration

Blockchain Layer
- Avalanche C-Chain: BioNFT Collections
- Story Protocol: IP Asset Registry + PIL
- Sequentias Network: Consent Management

Storage Layer
- AWS S3: test.vault.genoverse.io, encrypted genomic data
- IPFS: ipfs.genobank.app, metadata & thumbnails

Request path: Wallet → Application → FUSE Kernel Module → BioCID Parser → Web3 Signature Verifier → NFT Ownership Verifier → PIL License Verifier. The NFT Ownership Verifier connects to Avalanche C-Chain, Story Protocol, and Sequentias Network; the PIL License Verifier connects to Story Protocol and AWS S3; the BioCID Parser connects to IPFS.
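As an illustration of the BioCID Parser step, the sketch below splits a BioCID URI of the form used in this document (`biocid://v1/story/IPA/0xFASTQ/55052008714000/...`) into its parts. The field names (`version`, `chain`, `asset_type`, `collection`, `token_id`, `path`) are assumptions inferred from the example URIs, not a published grammar, and the real parser in web3fuse is written in C.

```python
from typing import NamedTuple

PREFIX = "biocid://"


class BioCID(NamedTuple):
    version: str     # e.g. "v1"
    chain: str       # e.g. "story"
    asset_type: str  # e.g. "IPA"
    collection: str  # e.g. "0xFASTQ"
    token_id: str    # e.g. "55052008714000"
    path: str        # anything after the token; meaning assumed, may be empty


def parse_biocid(uri):
    """Split a BioCID URI into its components, raising ValueError if malformed."""
    if not uri.startswith(PREFIX):
        raise ValueError("not a biocid:// URI: %r" % uri)
    segments = uri[len(PREFIX):].split("/")
    if len(segments) < 5 or not all(segments[:5]):
        raise ValueError("expected version/chain/type/collection/token: %r" % uri)
    version, chain, asset_type, collection, token_id = segments[:5]
    return BioCID(version, chain, asset_type, collection, token_id,
                  "/".join(segments[5:]))


fastq = parse_biocid("biocid://v1/story/IPA/0xFASTQ/55052008714000")
```

Keeping the token as a string avoids assumptions about its numeric range; the collection and token can then be handed to the ownership check.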
Data Flow: File Access Request
Key Innovations
| Innovation | Traditional Approach | NBDR Approach | Impact |
|---|---|---|---|
| Data Ownership | Company owns, user borrows | User owns NFT, companies request access | True patient sovereignty |
| Access Control | IAM policies (IT department) | NFT ownership (blockchain verification) | Instant, cryptographic, revocable |
| Credentials | AWS keys in config files | Encrypted in NFT metadata | Zero credential leakage |
| Audit Trail | CloudTrail (mutable logs) | Blockchain transactions (immutable) | HIPAA compliance, transparency |
| Royalties | Manual contracts, litigation | Automatic programmable % via Story Protocol | Passive income for patients |
| Provenance | Lab notebooks (lost/forged) | Complete IP asset lineage on-chain | Regulatory compliance, trust |