Table of Contents
- Introduction: The Metamorphosis Metaphor
- Stage 1: Genesis - Activation & Identity
- Stage 2: Embryonic - Data Birth & Storage
- Stage 3: Larval - NFT Minting & Consent
- Stage 4: Pupal - IP Registration & Licensing
- Stage 5: Adult - Intelligence Ecosystem
- BioNFT V2: Native Alphanumeric Architecture
- Technical Deep-Dive: Complete Ecosystem Impact
- The Complete Ecosystem: 227+ API Endpoints
- Conclusion & Future Vision
Introduction: The Metamorphosis Metaphor
Just as a butterfly undergoes complete metamorphosis (from egg to larva to pupa to adult), a biosample in the GenoBank ecosystem transforms through distinct stages, each adding new capabilities and value. This isn't merely data processing; it's a fundamental transformation from physical DNA to tokenized, licensable, AI-analyzed genomic intelligence.
This whitepaper documents the complete technical journey, providing architects, developers, and genomics researchers with a comprehensive understanding of how GenoBank transforms biosamples into intelligent, programmable bioassets on blockchain infrastructure.
Why "Metamorphosis"?
Traditional genomics workflows treat data as static files. GenoBank reimagines this as dynamic transformation:
- Egg (DNA Kit) → Static potential
- Larva (Activation) → Initial identity
- Pupa (Tokenization) → Structural transformation
- Adult (Intelligence) → Full capability realization
Each stage is irreversible yet upgradeable, mirroring biological metamorphosis where each form serves a unique purpose.
The Complete Metamorphosis Journey
graph TB
    subgraph "STAGE 1: GENESIS - Activation"
        A1[DNA Kit Arrives] --> A2[User Scans QR Code]
        A2 --> A3[genobank.io/activate]
        A3 --> A4{Biosample ID:<br/>FR756568541491}
        A4 --> A5[Web3 Wallet Creation]
        A5 --> A6[Sign Consent]
        A6 --> A7[Token ID Encoding<br/>256 bits]
        A7 --> A8[POST /claim/token_id]
    end
    subgraph "STAGE 2: EMBRYONIC - Data Birth"
        B1[MongoDB Storage] --> B2[biosamples Collection]
        B2 --> B3{serial: FR756568541491<br/>owner: 0x5f5a...<br/>status: active}
        B3 --> B4[biosample-activations]
        B4 --> B5[Linkage Metadata]
    end
    subgraph "STAGE 3: LARVAL - NFT Minting"
        C1[Avalanche C-Chain] --> C2[BiosampleDataNFT.sol]
        C2 --> C3[ERC-1155 Dual-Mint]
        C3 --> C4[Owner NFT]
        C3 --> C5[Permittee NFT]
        C4 --> C6{Consent Token<br/>Data Access Rights}
        C5 --> C6
    end
    subgraph "STAGE 4: PUPAL - IP Registration"
        D1[Story Protocol] --> D2[IP Asset Registry]
        D2 --> D3[IPFS Metadata Upload]
        D3 --> D4{IP Account<br/>ERC-6551 TBA}
        D4 --> D5[PIL License Attachment]
        D5 --> D6[Programmable Rights]
    end
    subgraph "STAGE 5: ADULT - Intelligence Ecosystem"
        E1[VCF Upload] --> E2[OpenCRAVAT Annotation]
        E2 --> E3[Child IP Asset]
        E3 --> E4[Claude AI Analysis]
        E4 --> E5[Grandchild IP Asset]
        E5 --> E6{Genomic Intelligence<br/>Insights & Reports}
        E7[AlphaGenome Scoring] --> E6
        E8[Ancestry Analysis] --> E6
        E9[Trio Family Analysis] --> E6
        E10[BioIP Tokenization] --> E6
    end
    A8 ==> B1
    B5 ==> C1
    C6 ==> D1
    D6 ==> E1
    style A4 fill:#667eea,stroke:#333,stroke-width:2px,color:#fff
    style B3 fill:#48bb78,stroke:#333,stroke-width:2px,color:#fff
    style C6 fill:#ed8936,stroke:#333,stroke-width:2px,color:#fff
    style D4 fill:#9f7aea,stroke:#333,stroke-width:2px,color:#fff
    style E6 fill:#f56565,stroke:#333,stroke-width:2px,color:#fff
Stage 1: GENESIS - Activation & Identity
Genesis
Birth of Identity
Overview
Genesis is where a physical DNA kit transforms into a digital biosample identity. This stage establishes the foundational link between:
- Physical biosample barcode (e.g., FR756568541491)
- Web3 wallet address (e.g., 0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a)
- Cryptographic consent signature
Technical Flow
1. User Scans QR Code
https://genobank.io/activate?biosampleId=FR756568541491&laboratoryId=12345#SECRET_HASH
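The activation link carries the biosample ID and laboratory ID in the query string and a secret in the URL fragment. As a minimal sketch of how a client might split such a link (the function name `parse_activation_url` and the returned keys are illustrative assumptions, not part of GenoBank's code):

```python
from urllib.parse import urlsplit, parse_qs


def parse_activation_url(url: str) -> dict:
    """Split an activation link into biosample ID, laboratory ID, and secret."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # parse_qs returns a list per key; each ID is expected exactly once.
    biosample_id = query["biosampleId"][0]
    laboratory_id = int(query["laboratoryId"][0])
    # Browsers do not send the fragment (the part after '#') in HTTP requests,
    # so the secret stays on the client until it is explicitly submitted.
    secret = parts.fragment
    if not secret:
        raise ValueError("activation link is missing its secret fragment")
    return {
        "biosample_id": biosample_id,
        "laboratory_id": laboratory_id,
        "secret": secret,
    }


info = parse_activation_url(
    "https://genobank.io/activate?biosampleId=FR756568541491&laboratoryId=12345#SECRET_HASH"
)
print(info)
```

Note that the biosample ID is kept as a string, which is what allows alphanumeric barcodes to pass through unchanged.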
2. Frontend Token ID Encoding (JavaScript)
// activate/js/scripts.js:630
async function prepareData() {
  const account = generateAddress; // User's Web3 wallet address (0x-prefixed)
  // V2: Alphanumeric biosample ID, right-padded to bytes32 ("0x" + 64 hex chars)
  const biosampleIdBytes32 = web3.utils.asciiToHex("FR756568541491").padEnd(66, '0');
  // Token ID structure: [biosample:56][permittee:40][wallet:160]
  // keccak256 is assumed to return a 0x-prefixed hex string (as in ethers/web3):
  // skip the prefix and keep 14 hex chars = 56 bits
  const biosampleHash = keccak256(biosampleIdBytes32).slice(2, 16);
  // Permittee ID as 10 hex chars = 40 bits
  const permitteeIdHex = parseInt(permiteeId, 10).toString(16).padStart(10, '0');
  const tokenID = `0x${biosampleHash}${permitteeIdHex}${account.slice(2)}`;
  // Sign claim data
  const claimData = `0x${stringToHex('genobank.create')}${tokenID.substring(2)}${seed}`;
  const signature = await wallet.signMessage(ethers.utils.arrayify(keccak256(claimData)));
  return { tokenID, signature };
}
3. Backend Claim Endpoint (Python)
# runweb.py:3153
@cherrypy.expose
@cherrypy.tools.json_in()  # parses the JSON body into cherrypy.request.json
def claim(self, token_id):
    data = cherrypy.request.json
    # Decode token ID ("0x" + 64 hex chars)
    biosample_bytes32 = extract_bytes32_from_token(token_id[0:66])
    biosample_id = bytes32_to_string(biosample_bytes32)  # "FR756568541491"
    # The wallet address is the last 160 bits (40 hex chars) of the token ID
    wallet_address = f"0x{token_id[-40:]}"
    # Validate signature
    if not verify_signature(data['signature'], wallet_address):
        return error_response("Invalid signature")
    # Validate biosample not already activated
    if biosample_dao.is_activated(biosample_id):
        return error_response("Biosample already activated")
    # Store activation
    biosample_dao.activate(biosample_id, wallet_address, data)
    return success_response({"biosample_id": biosample_id})
Key Innovations: BioNFT V2
Native Alphanumeric Support: Unlike traditional blockchain systems limited to numeric IDs, BioNFT V2 natively supports alphanumeric biosample barcodes (e.g., FR756568541491) using Solidity's bytes32 type.
Token ID Structure (256 bits)
Each token ID packs three fields into 64 hex characters: 14 (56 bits) for the biosample, 10 (40 bits) for the permittee, and 40 (160 bits) for the wallet address.
V1 (Numeric):
0x [027B5A5D9E3451] [00000004D2] [5f5a60EaEf242c0D51A21c703f520347b96Ed19a]
    ^^^^^^^^^^^^^^   ^^^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    Biosample        Permittee    Wallet Address
    (56b)            (40b)        (160b)
V2 (Alphanumeric):
0x [A7F3E8B2C91D45] [00000004D2] [5f5a60EaEf242c0D51A21c703f520347b96Ed19a]
    ^^^^^^^^^^^^^^   ^^^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    Hash("FR7565...") Permittee   Wallet Address
    (56b)            (40b)        (160b)
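The 56/40/160-bit layout can be checked with plain integer arithmetic. The sketch below packs and unpacks a token ID under that layout. The function names are illustrative, and `hashlib.sha3_256` is only a stand-in: Ethereum's keccak256 uses different padding, so real token IDs would carry a different biosample prefix.

```python
import hashlib

BIOSAMPLE_BITS, PERMITTEE_BITS, WALLET_BITS = 56, 40, 160


def biosample_prefix(serial: str) -> int:
    """56-bit prefix of a hash over the zero-padded bytes32 serial.

    ASSUMPTION: sha3_256 stands in for keccak256, which is not in the
    standard library; the two produce different digests.
    """
    padded = serial.encode("ascii").ljust(32, b"\x00")
    digest = hashlib.sha3_256(padded).digest()
    return int.from_bytes(digest[:7], "big")  # 7 bytes = 56 bits


def pack_token_id(prefix: int, permittee_id: int, wallet: str) -> int:
    """Pack [biosample:56][permittee:40][wallet:160] into one 256-bit integer."""
    wallet_int = int(wallet, 16)
    if prefix >> BIOSAMPLE_BITS or permittee_id >> PERMITTEE_BITS or wallet_int >> WALLET_BITS:
        raise ValueError("field does not fit its bit width")
    return (prefix << (PERMITTEE_BITS + WALLET_BITS)) | (permittee_id << WALLET_BITS) | wallet_int


def unpack_token_id(token_id: int) -> tuple:
    """Return (biosample prefix, permittee ID, lowercase wallet address)."""
    wallet_int = token_id & ((1 << WALLET_BITS) - 1)
    permittee_id = (token_id >> WALLET_BITS) & ((1 << PERMITTEE_BITS) - 1)
    prefix = token_id >> (PERMITTEE_BITS + WALLET_BITS)
    return prefix, permittee_id, f"0x{wallet_int:040x}"


wallet = "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a"
token_id = pack_token_id(biosample_prefix("FR756568541491"), 1234, wallet)
print(f"0x{token_id:064x}")
print(unpack_token_id(token_id))
```

Because the biosample field is a truncated hash, unpacking recovers only the prefix, not the original barcode; mapping the prefix back to a serial needs a lookup outside the token ID.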
Stage 2: EMBRYONIC - Data Birth & Storage
Embryonic
Data Foundation
Overview
The Embryonic stage establishes persistent data storage in MongoDB, creating the foundational database records that will support all future metamorphosis stages.
MongoDB Schema
Collection: biosamples
{
  "_id": ObjectId("..."),
  "serial": "FR756568541491",            // V2: String (alphanumeric or numeric)
  "biosampleIdBytes32": "0x4652373536353638353431343931000000000000000000000000000000000000",
  "isAlphanumeric": true,                // V2 flag
  "numericSerial": null,                 // null for alphanumeric, number for legacy
  "owner": "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a",
  "status": "active",                    // active | revoked | deleted
  "chainID": 1516,                       // Story Protocol Mainnet
  "createdAt": ISODate("2025-10-21T12:00:00Z"),
  "updatedAt": ISODate("2025-10-21T12:00:00Z")
}
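As a rough illustration of how a backend might populate the fields above, the sketch below builds a `biosamples` document from a serial. The function name is hypothetical, and the bytes32 encoding for legacy numeric serials (a 32-byte big-endian integer) is an assumption rather than something the schema specifies.

```python
from datetime import datetime, timezone


def build_biosample_document(serial: str, owner: str, chain_id: int) -> dict:
    """Build a V2-style biosamples record for an alphanumeric or numeric serial."""
    if not serial.isascii() or not 0 < len(serial) <= 32:
        raise ValueError("serial must be 1-32 ASCII characters")
    is_alphanumeric = not serial.isdigit()
    if is_alphanumeric:
        # ASCII bytes, right-padded with zeros to 32 bytes
        raw = serial.encode("ascii").ljust(32, b"\x00")
    else:
        # ASSUMPTION: legacy numeric IDs map to bytes32 as a big-endian integer
        raw = int(serial).to_bytes(32, "big")
    now = datetime.now(timezone.utc)
    return {
        "serial": serial,
        "biosampleIdBytes32": "0x" + raw.hex(),
        "isAlphanumeric": is_alphanumeric,
        "numericSerial": None if is_alphanumeric else int(serial),
        "owner": owner,
        "status": "active",
        "chainID": chain_id,
        "createdAt": now,
        "updatedAt": now,
    }


doc = build_biosample_document(
    "FR756568541491", "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a", 1516
)
legacy = build_biosample_document(
    "12345", "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a", 1516
)
print(doc["biosampleIdBytes32"], legacy["numericSerial"])
```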
Collection: biosample-activations
{
  "_id": ObjectId("..."),
  "serial": "FR756568541491",
  "permitteeSerial": 12345,
  "physicalId": "ABC123",
  "wallet_address": "0x5f5a60EaEf242c0D51A21c703f520347b96Ed19a",
  "consent_signature": "0xa5141ae955bba91ad...",
  "consent_timestamp": ISODate("2025-10-21T12:00:00Z"),
  "createdAt": ISODate("2025-10-21T12:00:00Z")
}
Affected Collections
The embryonic stage touches 11 MongoDB collections across GenoBank's ecosystem:
- biosamples - Core biosample records
- biosample-activations - Activation metadata
- deliveries - File deliveries
- biosample-transfer-history - Ownership transfers
- permissions - Access control
- vcf_annotation_jobs - OpenCRAVAT jobs
- opencravat_results - Annotation results
- alphagenome_analyses - DeepMind scoring
- bioip_registry - BioIP assets
- claude_ai_sessions - AI chat sessions
- newborn_trios - Family analysis
Stage 3: LARVAL - NFT Minting & Consent
Larval
On-Chain Identity
Overview
The Larval stage mints the biosample as an ERC-1155 multi-token on Avalanche C-Chain, establishing immutable on-chain ownership and consent.
Smart Contract Architecture
BiosampleDataNFT_V2.sol (Proposed)
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.9;

// Assumes OpenZeppelin Contracts 4.x (Counters, no-argument Ownable constructor)
import "@openzeppelin/contracts/token/ERC1155/ERC1155.sol";
import "@openzeppelin/contracts/access/Ownable.sol";
import "@openzeppelin/contracts/utils/Counters.sol";

contract BiosampleDataNFT_V2 is ERC1155, Ownable {
    using Counters for Counters.Counter;

    address public contractAdministrator;
    Counters.Counter private internalFileSerial;

    // Native alphanumeric support
    mapping(bytes32 => bool) public biosampleSerialExists;
    mapping(bytes32 => mapping(address => bool)) public biosampleSharedWithPermittee;
    // Backward compatibility
    mapping(uint256 => bytes32) public legacyNumericToBiosampleId;

    enum Status { ACTIVE, REVOKED, DELETED }

    struct File {
        bytes32 biosampleId;
        string name;
        address owner;
        address laboratory;
        Status status;
        uint expiration;
        uint createdAt;
    }

    event BiosampleActivated(bytes32 indexed biosampleId, address indexed owner);
    event ConsentGranted(bytes32 indexed biosampleId, address indexed owner, address indexed permittee);
    event ConsentRevoked(bytes32 indexed biosampleId, address indexed owner, address indexed permittee);

    constructor() ERC1155("") {
        contractAdministrator = msg.sender;
    }

    // V2 primary function
    function uploadFile(
        bytes32 _biosampleId,
        string memory _name,
        address _fileOwner,
        uint _expiration
    ) public returns (uint) {
        require(msg.sender == contractAdministrator, "Only admin");
        require(!isStringEmpty(_name), "Empty filename");
        biosampleSerialExists[_biosampleId] = true;
        internalFileSerial.increment(); // assumed: each upload gets a new serial
        emit BiosampleActivated(_biosampleId, _fileOwner);
        return internalFileSerial.current();
    }

    // Dual-mint pattern for consent
    function shareFile(
        bytes32 _biosampleId,
        address _fileOwner,
        address _permittee,
        uint _expiration
    ) public {
        require(biosampleSerialExists[_biosampleId], "Biosample not exists");
        require(!biosampleSharedWithPermittee[_biosampleId][_permittee], "Already shared");
        // Mint NFT to owner
        bytes memory bytesPermittee = toBytes(_permittee);
        _mint(_fileOwner, uint256(_biosampleId), 1, bytesPermittee);
        // Mint NFT to permittee (lab)
        bytes memory bytesFileOwner = toBytes(_fileOwner);
        _mint(_permittee, uint256(_biosampleId), 1, bytesFileOwner);
        biosampleSharedWithPermittee[_biosampleId][_permittee] = true;
        emit ConsentGranted(_biosampleId, _fileOwner, _permittee);
    }

    // V1 backward compatibility wrapper
    function uploadFileLegacy(
        uint256 _numericBiosampleId,
        string memory _name,
        address _fileOwner,
        uint _expiration
    ) public returns (uint) {
        bytes32 biosampleId = numericToBytes32(_numericBiosampleId);
        legacyNumericToBiosampleId[_numericBiosampleId] = biosampleId;
        return uploadFile(biosampleId, _name, _fileOwner, _expiration);
    }

    // Helper implementations below are assumptions; the proposal does not define them.
    function isStringEmpty(string memory _value) internal pure returns (bool) {
        return bytes(_value).length == 0;
    }

    function toBytes(address _account) internal pure returns (bytes memory) {
        return abi.encodePacked(_account);
    }

    function numericToBytes32(uint256 _value) internal pure returns (bytes32) {
        return bytes32(_value);
    }
}
Dual-Mint Pattern
Key Innovation: Both the biosample owner and the permittee (lab) receive NFTs, creating a cryptographic record of consent and data access rights.
- Owner NFT → Proves ownership, can revoke consent
- Permittee NFT → Proves access rights, can query data
This dual-ownership model is designed to support the explicit-consent requirements of GDPR Article 9, since consent can be revoked on-chain by burning both NFTs.
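To make the grant-and-revoke lifecycle concrete, here is a toy in-memory model of the dual-mint pattern. It is not the on-chain contract: the class `ConsentLedger` and its `revoke` method are hypothetical, and the proposed contract above does not show a burn function.

```python
from collections import defaultdict


class ConsentLedger:
    """Toy in-memory model of dual-mint consent; not the on-chain contract."""

    def __init__(self) -> None:
        # balances[token_id][holder] mirrors ERC-1155 balanceOf(holder, token_id)
        self.balances = defaultdict(lambda: defaultdict(int))
        # (token_id, permittee) -> owner who granted consent
        self.grants = {}

    def share(self, token_id: int, owner: str, permittee: str) -> None:
        """Mint one token to the owner and one to the permittee."""
        if (token_id, permittee) in self.grants:
            raise ValueError("Already shared")
        self.balances[token_id][owner] += 1
        self.balances[token_id][permittee] += 1
        self.grants[(token_id, permittee)] = owner

    def revoke(self, token_id: int, owner: str, permittee: str) -> None:
        """Burn both tokens; only the owner who granted consent may do so."""
        if self.grants.get((token_id, permittee)) != owner:
            raise PermissionError("Only the granting owner can revoke")
        self.balances[token_id][owner] -= 1
        self.balances[token_id][permittee] -= 1
        del self.grants[(token_id, permittee)]

    def has_access(self, token_id: int, permittee: str) -> bool:
        """Access is modeled as holding at least one token for the biosample."""
        return self.balances[token_id][permittee] > 0


ledger = ConsentLedger()
ledger.share(1, "0xOwner", "0xLab")
granted = ledger.has_access(1, "0xLab")
ledger.revoke(1, "0xOwner", "0xLab")
revoked = ledger.has_access(1, "0xLab")
print(granted, revoked)
```

Keeping the grant record separate from the balances is a design choice of this sketch: it lets the model check who is entitled to revoke without inspecting token metadata.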
Stage 4: PUPAL - IP Registration & Licensing
Pupal
IP Transformation
Overview
The Pupal stage registers the biosample NFT as an IP Asset on Story Protocol, enabling programmable licensing and royalty distribution.
Story Protocol Integration
IP Asset Registration (Python)
# story_protocol_manager_dao.py
def register_biosample_as_ip(biosample_id_alphanumeric, permittee_id, wallet, metadata=None):
    """
    Register biosample as Story Protocol IP Asset (V2)
    """
    # Step 1: Convert alphanumeric to bytes32 (ASCII hex, right-padded to 32 bytes)
    biosample_bytes32 = "0x" + biosample_id_alphanumeric.encode("ascii").hex().ljust(64, "0")
    # Step 2: Generate deterministic uint256 token ID
    token_id = generate_token_id_v2(biosample_id_alphanumeric, permittee_id, wallet)
    # Step 3: Prepare metadata (IPFS)
    ip_metadata = {
        "name": f"Biosample {biosample_id_alphanumeric}",
        "description": "Genomic biosample registered as IP Asset",
        "attributes": [
            {"trait_type": "Biosample Serial", "value": biosample_id_alphanumeric},
            {"trait_type": "Biosample ID (bytes32)", "value": biosample_bytes32},
            {"trait_type": "Token ID", "value": hex(token_id)},
            {"trait_type": "Chain", "value": "Avalanche C-Chain"},
            {"trait_type": "IP Type", "value": "Genomic Data"}
        ]
    }
    # Caller-supplied fields, if any, override the defaults above
    ip_metadata.update(metadata or {})
    # Step 4: Upload to IPFS
    ipfs_hash = ipfs_service.upload_json(ip_metadata)
    # Step 5: Register on Story Protocol
    tx_hash = ip_manager_contract.functions.register(
        chainId=43114,  # Avalanche
        tokenContract=BIOSAMPLE_NFT_V2_ADDRESS,
        tokenId=token_id,
        metadataURI=f"ipfs://{ipfs_hash}"
    ).transact()
    # Get IP Asset ID (ERC-6551 Token Bound Account)
    receipt = w3.eth.wait_for_transaction_receipt(tx_hash)
    story_ip_id = parse_ip_registered_event(receipt)
    return story_ip_id
ERC-6551 Token Bound Accounts
Every registered IP Asset automatically creates an ERC-6551 Token Bound Account (TBA), a smart contract wallet owned by the NFT itself. This enables:
- NFT can own other assets (derivative IP, royalties)
- NFT can execute transactions (license minting)
- NFT can hold metadata on-chain
Programmable IP Licenses (PIL)
Story Protocol's PIL framework allows defining license terms on-chain:
PIL License Terms Example
{
  "licenseTemplate": "PIL_COMMERCIAL_USE",
  "terms": {
    "transferable": true,
    "commercialUse": true,
    "derivativesAllowed": true,
    "attribution": true,
    "commercialRevShare": 10,   // 10% royalty
    "currency": "0x..."         // ERC-20 token for royalties
  }
}
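A simplified way to reason about these terms off-chain is to treat them as a small record with two checks: whether a requested use is allowed, and how a payment splits under the revenue share. The sketch below is an illustrative model only; the names `LicenseTerms`, `is_use_permitted`, and `split_revenue` are assumptions, and Story Protocol's actual royalty policies are more involved than a flat percentage.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class LicenseTerms:
    transferable: bool
    commercial_use: bool
    derivatives_allowed: bool
    attribution: bool
    commercial_rev_share: int  # percent owed to the licensor


def is_use_permitted(terms: LicenseTerms, *, commercial: bool, derivative: bool) -> bool:
    """Check a requested use against the license flags."""
    if commercial and not terms.commercial_use:
        return False
    if derivative and not terms.derivatives_allowed:
        return False
    return True


def split_revenue(terms: LicenseTerms, amount: Decimal) -> tuple:
    """Return (royalty to licensor, remainder kept by licensee)."""
    royalty = amount * terms.commercial_rev_share / 100
    return royalty, amount - royalty


terms = LicenseTerms(
    transferable=True,
    commercial_use=True,
    derivatives_allowed=True,
    attribution=True,
    commercial_rev_share=10,
)
royalty, remainder = split_revenue(terms, Decimal("250"))
print(is_use_permitted(terms, commercial=True, derivative=True), royalty, remainder)
```

`Decimal` is used instead of floats so that currency amounts do not pick up binary rounding error.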
Stage 5: ADULT - Intelligence Ecosystem
Adult
Full Intelligence
Overview
The Adult stage represents the fully metamorphosed bioasset, capable of generating derivative intellectual property through AI analysis, variant annotation, and clinical insights.
Derivative IP Asset Chain
Story Protocol enables creating parent-child IP relationships, where each analysis becomes a derivative IP Asset:
Hierarchical IP Structure
Biosample (Root IP Asset)
├── biosample_id: "FR756568541491"
├── story_ip_id: 0xABCD...
├── PIL License: COMMERCIAL_USE
│
├─ VCF File (Child IP)
│  ├── parent_ip_id: 0xABCD...
│  ├── story_ip_id: 0xEF01...
│  ├── Pays 10% royalty to parent
│  │
│  ├─ SQLite Annotation (Grandchild IP)
│  │  ├── parent_ip_id: 0xEF01...
│  │  ├── story_ip_id: 0x2345...
│  │  └── OpenCRAVAT: 146 annotators
│  │
│  └─ CSV Report (Grandchild IP)
│     ├── parent_ip_id: 0xEF01...
│     ├── story_ip_id: 0x6789...
│     └── Claude AI: Variant explanations
│
├─ AlphaGenome Analysis (Child IP)
│  ├── parent_ip_id: 0xABCD...
│  ├── story_ip_id: 0xABCE...
│  └── DeepMind: Variant pathogenicity
│
└─ Ancestry Results (Child IP)
   ├── parent_ip_id: 0xABCD...
   ├── story_ip_id: 0xDEF0...
   └── SOMOS DAO: Mexican Biobank analysis
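The hierarchy above is a tree keyed by `parent_ip_id`, so an asset's lineage can be recovered by walking parent links up to the root. The sketch below models that walk with a plain dictionary; the truncated IDs are used verbatim as placeholder labels, and the helper names are illustrative.

```python
# Parent links copied from the hierarchy above; keys are placeholder labels.
PARENT = {
    "0xABCD...": None,         # Biosample (root)
    "0xEF01...": "0xABCD...",  # VCF file
    "0x2345...": "0xEF01...",  # SQLite annotation
    "0x6789...": "0xEF01...",  # CSV report
    "0xABCE...": "0xABCD...",  # AlphaGenome analysis
    "0xDEF0...": "0xABCD...",  # Ancestry results
}


def lineage(ip_id: str) -> list:
    """Return the ancestors of an IP asset, nearest parent first."""
    chain = []
    seen = {ip_id}
    current = PARENT[ip_id]
    while current is not None:
        if current in seen:
            raise ValueError("cycle detected in parent links")
        seen.add(current)
        chain.append(current)
        current = PARENT[current]
    return chain


def children(ip_id: str) -> list:
    """Return the direct derivatives of an IP asset."""
    return [child for child, parent in PARENT.items() if parent == ip_id]


print(lineage("0x6789..."))
print(children("0xABCD..."))
```

The length of the lineage gives the generation: 0 for the biosample, 1 for child IP, 2 for grandchild IP.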
The Six Intelligence Services
1. VCF Annotator: OpenCRAVAT with 146 annotators
2. Claude AI: Genomic variant explanations
3. AlphaGenome: DeepMind variant scoring
4. SOMOS DAO: Ancestry analysis
5. Newborn/Trio: Family genomics
6. BioIP: IP asset tokenization
Example: VCF Annotation Flow
Complete Analysis Pipeline
# 1. User uploads VCF file
POST /api_vcf_annotator/post_register_user
{
  "biosample_serial": "FR756568541491",
  "vcf_file_path": "s3://bucket/user/sample.vcf",
  "user_signature": "0xa5141ae..."
}
# 2. OpenCRAVAT annotation job
→ Submit to OpenCRAVAT API
→ 146 annotators run (ClinVar, dbSNP, COSMIC, gnomAD, etc.)
→ Generate SQLite database
# 3. Claude AI analysis
→ Extract variants from SQLite
→ Send to Claude API with genomic context
→ Generate human-readable explanations
# 4. Tokenize results as derivative IP (IDs as in the hierarchy: biosample root is 0xABCD...)
→ VCF → Child IP (story_ip_id: 0xEF01..., parent: 0xABCD...)
→ SQLite → Grandchild IP (parent: 0xEF01...)
→ CSV Report → Grandchild IP (parent: 0xEF01...)
# 5. Mint license tokens for access
→ Researcher requests access
→ Mints license token (costs $X)
→ 10% royalty to biosample owner
→ License grants download rights
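Step 3 of the pipeline (extracting variants from the annotation database before sending them to an LLM) can be sketched with the standard `sqlite3` module. The table name, columns, and rows below are made up for illustration; a real OpenCRAVAT result database has its own schema, and the call to the Claude API is deliberately left out.

```python
import sqlite3

COLUMNS = ("chrom", "pos", "ref", "alt", "gene", "significance")


def extract_variants(conn: sqlite3.Connection, limit: int = 50) -> list:
    """Fetch annotated variants that carry a clinical significance label."""
    rows = conn.execute(
        "SELECT chrom, pos, ref, alt, gene, significance FROM variant "
        "WHERE significance IS NOT NULL ORDER BY chrom, pos LIMIT ?",
        (limit,),
    ).fetchall()
    return [dict(zip(COLUMNS, row)) for row in rows]


def format_for_prompt(variants: list) -> str:
    """Render variants as compact lines suitable for an LLM prompt."""
    return "\n".join(
        f"{v['gene']} {v['chrom']}:{v['pos']} {v['ref']}>{v['alt']} ({v['significance']})"
        for v in variants
    )


# Hypothetical schema and placeholder rows, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variant (chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, "
    "gene TEXT, significance TEXT)"
)
conn.executemany(
    "INSERT INTO variant VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("chr1", 12345, "C", "T", "GENE_A", "pathogenic"),
        ("chr2", 67890, "G", "A", "GENE_B", None),
    ],
)
variants = extract_variants(conn)
prompt_context = format_for_prompt(variants)
print(prompt_context)
```

Filtering and capping the rows before prompting keeps the request small; the limit of 50 is an arbitrary choice for this sketch.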
BioNFT V2: Native Alphanumeric Architecture
The Problem: Numeric-Only Limitation
V1 Architecture (current) only supports numeric biosample IDs due to uint256 storage:
- Cannot support European biosample barcodes (e.g., FR756568541491)
- Requires encoding/hashing (loses original barcode)
- Not intuitive for labs with alphanumeric systems
The Solution: bytes32 Native Support
V2 Architecture (proposed) uses Solidity's bytes32 type for native alphanumeric IDs:
Technical Comparison
Storage Cost Analysis (Avalanche C-Chain)
// V1: uint256 storage
mapping(uint256 => bool) public biosampleExists; // 20,000 gas (SSTORE)
// V2: bytes32 storage
mapping(bytes32 => bool) public biosampleExists; // 20,000 gas (SSTORE)
// Result: IDENTICAL gas cost
JavaScript Conversion
// Convert alphanumeric to bytes32
const biosampleId = "FR756568541491";
const biosampleIdBytes32 = web3.utils.asciiToHex(biosampleId).padEnd(66, '0');
// Result: 0x4652373536353638353431343931000000000000000000000000000000000000
// Convert back to string
const originalId = web3.utils.hexToAscii(biosampleIdBytes32).replace(/\0/g, '');
// Result: "FR756568541491"
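The same round trip can be done on the backend without web3. The following is a minimal Python sketch of the conversion, using hypothetical helper names and adding the length and character checks that a bytes32 slot implies (at most 32 ASCII bytes, no embedded NUL).

```python
def serial_to_bytes32(serial: str) -> str:
    """Encode an ASCII serial as a 0x-prefixed, zero-padded bytes32 hex string."""
    raw = serial.encode("ascii")  # raises UnicodeEncodeError on non-ASCII input
    if not 0 < len(raw) <= 32:
        raise ValueError("serial must be 1-32 ASCII characters")
    if b"\x00" in raw:
        raise ValueError("serial must not contain NUL bytes")
    return "0x" + raw.ljust(32, b"\x00").hex()


def bytes32_to_serial(value: str) -> str:
    """Decode a bytes32 hex string back to the original serial."""
    raw = bytes.fromhex(value[2:] if value.startswith("0x") else value)
    if len(raw) != 32:
        raise ValueError("expected exactly 32 bytes")
    return raw.rstrip(b"\x00").decode("ascii")


encoded = serial_to_bytes32("FR756568541491")
decoded = bytes32_to_serial(encoded)
print(encoded)
print(decoded)
```

Rejecting embedded NUL bytes matters because trailing zeros are stripped on decode; without the check, two different inputs could decode to the same serial.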
Migration Strategy: Dual-Mode
V2 maintains 100% backward compatibility via dual-mode architecture:
- V1 biosamples continue using numeric IDs
- V2 biosamples use native alphanumeric IDs
- Both types coexist in same ecosystem
- Zero forced migration
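One way to picture dual-mode operation is a small dispatcher that reads the `isAlphanumeric` flag from a biosample record and chooses between the two contract entry points named in the proposed contract. The function `route_upload` is a hypothetical illustration of that routing, not GenoBank's implementation.

```python
def route_upload(record: dict) -> tuple:
    """Pick the contract function and ID argument for a biosample record.

    Routing is driven by the stored isAlphanumeric flag rather than by
    inspecting the serial, so a V2 barcode made only of digits is not
    mistaken for a legacy numeric ID.
    """
    if record["isAlphanumeric"]:
        raw = record["serial"].encode("ascii")
        if not 0 < len(raw) <= 32:
            raise ValueError("serial must be 1-32 ASCII characters")
        # V2 path: bytes32 argument, ASCII right-padded with zeros
        return "uploadFile", raw.ljust(32, b"\x00")
    # V1 path: the legacy wrapper takes the numeric ID as uint256
    return "uploadFileLegacy", int(record["numericSerial"])


v2_call = route_upload(
    {"serial": "FR756568541491", "isAlphanumeric": True, "numericSerial": None}
)
v1_call = route_upload(
    {"serial": "12345", "isAlphanumeric": False, "numericSerial": 12345}
)
print(v2_call[0], v1_call)
```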
Technical Deep-Dive: Complete Ecosystem Impact
Components Affected by V2 Upgrade
| Component | Impact Level | Changes Required | Effort (hours) |
|---|---|---|---|
| Smart Contracts | CRITICAL | New contract deployment (bytes32 mappings) | 40h |
| Backend API | HIGH | Remove int() casts (40-50 functions) | 60h |
| MongoDB | HIGH | Schema migration (11 collections) | 20h |
| Frontend | MEDIUM | Token encoding + validation (5-10 files) | 30h |
| Story Protocol | LOW | Metadata enhancement only | 20h |
| Avalanche | NONE | EVM-compatible (no changes) | 0h |
| TOTAL DEVELOPMENT | | | 170h (~6 weeks) |
API Endpoints Affected
Out of 227+ API endpoints across GenoBank's ecosystem, approximately 40-50 endpoints require code changes:
- POST /claim/{token_id} - Token decoding logic (CRITICAL)
- GET /biosample_details - Query logic
- GET /my_active_biosamples - Response serialization
- POST /api_vcf_annotator/post_register_user - VCF registration
- GET /api_claude_ia/get_variants_explanation - AI analysis
- POST /api_alphagenome/register_analysis - Variant scoring
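Most of the backend changes come down to replacing `int()` casts on the serial with validation that accepts both formats. As a hedged sketch of what such a shared helper could look like (the name `normalize_serial` and the allowed character set are assumptions; the 32-character cap follows from the bytes32 limit):

```python
import re

# ASSUMPTION: serials are ASCII letters and digits, at most 32 characters.
_SERIAL_PATTERN = re.compile(r"[A-Za-z0-9]{1,32}")


def normalize_serial(raw) -> str:
    """Validate a biosample serial and return it as a string.

    Accepts legacy integer serials as well as alphanumeric strings, so
    endpoints no longer need to cast the value with int().
    """
    if isinstance(raw, bool) or not isinstance(raw, (str, int)):
        raise TypeError("serial must be a string or integer")
    text = str(raw).strip()
    if not _SERIAL_PATTERN.fullmatch(text):
        raise ValueError(f"invalid biosample serial: {text!r}")
    return text


print(normalize_serial("FR756568541491"))
print(normalize_serial(12345))
```

Returning a string in both cases means downstream queries can match on the `serial` field regardless of whether the record is V1 or V2.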
The Complete Ecosystem: 227+ API Endpoints
GenoBank's metamorphosis journey is powered by a comprehensive REST API with 227+ endpoints across 8 microservices:
| Service | Endpoints | Base URL |
|---|---|---|
| Main API | 161 | genobank.app/ |
| Claude AI | 23 | /api_claude_ia/ |
| SOMOS DAO | 38 | /api_somos_dao/ |
| VCF Annotator | 32 | /api_vcf_annotator/ |
| AlphaGenome | 10 | /api_alphagenome/ |
| Newborn | 43 | /api_newborn/ |
| BioIP | 18 | /api_bioip/ |
| Clara | 28 | /api_clara/ |
Conclusion & Future Vision
The Complete Journey: A Recap
The BioNFT metamorphosis transforms a physical DNA kit through five distinct stages:
- Genesis → Web3 identity creation
- Embryonic → Persistent data storage
- Larval → On-chain NFT minting
- Pupal → IP Asset registration
- Adult → Intelligence ecosystem
Each stage is irreversible, upgradeable, and value-additive, mirroring biological metamorphosis.
BioNFT V2: The Next Evolution
The proposed V2 architecture brings native alphanumeric biosample ID support, unlocking:
- EU Market Expansion - European labs require alphanumeric barcodes
- Data Integrity - Original barcode preserved on-chain
- Backward Compatibility - 100% maintained via dual-mode
- Zero Performance Impact - bytes32 = uint256 gas cost
Future Vision: Programmable Genomics
The BioNFT metamorphosis is the foundation for programmable genomics, where genomic data becomes composable, licensable, and AI-analyzable infrastructure.
Imagine:
- Researcher mints license token → Gains access to 10,000 biosamples for GWAS study
- Biosample owners earn royalties → 10% of every license sale
- AI analyzes derivative IP → Grandchild IP assets created automatically
- Data DAOs govern access → Community votes on licensing terms
This is the future GenoBank is building: Web3 genomics infrastructure for the world.
This whitepaper is based on comprehensive technical analysis of GenoBank's BioNFT ecosystem. For detailed technical documentation, see BioNFT V2 Analysis Repository.