Sequentia Network: A Blockchain-Based Ecosystem for Decentralized Genomic Data Management, Analysis, and Intellectual Property Protection
Authors: Daniel Uribe (GenoBank.io) | GenoBank Research Team
Date: October 2025
1. Introduction
1.1 The Genomic Data Problem
The genomic data landscape is characterized by extreme fragmentation. Laboratories generate sequencing data in isolated silos, researchers cannot discover relevant datasets across institutions, and data owners lack mechanisms to enforce licensing terms or receive attribution for derivative works. Three critical gaps exist:
- Discovery Problem: No standardized protocol for discovering genomic datasets across laboratories without exposing raw genetic data.
- Identity Problem: No immutable registry for laboratory accreditation and data provenance verification.
- Licensing Problem: No programmable system for enforcing intellectual property rights on genomic analyses and derivative works.
Traditional approaches—centralized databases, data use agreements, and manual attribution—fail to scale. Sequentia Network addresses these gaps through blockchain-based infrastructure.
1.2 Sequentia Network Overview
Sequentia Network is a specialized Layer-1 blockchain (Chain ID: 15132025) built on Ethereum’s Clique Proof-of-Authority consensus mechanism. The network provides:
- BioData Router: Smart contract for DNA fingerprint registration, lab verification, and file routing
- LabNFTs: Non-fungible tokens representing immutable laboratory identities
- BioNFTs: Tokenized biosamples with lifecycle tracking (activation → tokenization → bioassets)
- BioFS Protocol: 4-layer federation protocol for privacy-preserving data discovery
- Story Protocol: Programmable IP licensing infrastructure
- Partner Chains: Integration with OpenCRAVAT, AlphaGenome, SOMOS DAO, and AI services
1.3 Contributions
This paper makes the following contributions:
- BioData Router Specification: Complete smart contract design for genomic file routing via DNA fingerprints
- LabNFT Architecture: Immutable laboratory identity verification system
- BioFS Protocol: Technical specification for 4-layer genomic data federation
- Story Protocol Integration: First implementation of programmable IP licensing for genomic data
- Partner Chain Ecosystem: Architecture for connecting specialized bioinformatics services
- Production Deployment: Real-world implementation with deployed smart contracts and active nodes
2. Network Architecture
2.1 Blockchain Fundamentals
Sequentia Network operates as an Ethereum-compatible Layer-1 blockchain with the following specifications:
Network ID: 15132025
Consensus: Clique (Proof of Authority)
Block Time: ~15 seconds
Master Node: 52.90.163.112:8545
RPC: http://52.90.163.112:8545
WebSocket: ws://52.90.163.112:8545
Chain Explorer: In development
2.2 Clique Proof-of-Authority
Sequentia uses Clique PoA consensus, where authorized signers validate blocks. This design provides:
- Deterministic Finality: Blocks cannot be reorganized after (N/2 + 1) confirmations, where N is the number of authorized signers
- Energy Efficiency: No computational mining required
- Predictable Performance: Consistent 15-second block times
- Governance: Explicit signer authorization via multisig
The current authorized signer set includes the master node wallet: 0x088ebE307b4200A62dC6190d0Ac52D55bcABac11.
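The (N/2 + 1) confirmation threshold listed above can be computed directly. The following is a minimal sketch, assuming N is the number of authorized signers and that N/2 is an integer division; the function name is illustrative and not part of any Sequentia tooling.

```python
def majority_threshold(signer_count: int) -> int:
    """Smallest number of signers that forms a strict majority of the set."""
    if signer_count < 1:
        raise ValueError("at least one authorized signer is required")
    return signer_count // 2 + 1


# Show the threshold for a few signer-set sizes.
for n in (1, 3, 4, 7):
    print(f"{n} signers -> threshold {majority_threshold(n)}")
```

With a single signer, as in a network whose signer set holds only the master node wallet, the threshold is one block.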
2.3 Smart Contract Platform
Sequentia is fully EVM-compatible, supporting:
- Solidity smart contracts (version 0.8.x)
- Standard ERC-20, ERC-721, ERC-1155 token contracts
- Web3.js, Ethers.js, and Hardhat development tools
- MetaMask and WalletConnect for user authentication
3. BioData Router Smart Contract
3.1 Design Overview
The BioData Router is the core smart contract of Sequentia Network, deployed at 0x2ff3FB85c71D6cD7F1217A08Ac9a2d68C02219cd (see Section 10.1).
It implements a routing table for genomic files, indexed by DNA fingerprints and file hashes. The contract maintains three key registries:
- Lab Registry: Verified laboratory identities (LabNFTs)
- DNA Fingerprint Registry: Cryptographic hashes of variant positions
- File Registry: Genomic files with metadata and S3 paths
3.2 DNA Fingerprints
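A DNA fingerprint is a SHA-256 hash of variant positions only (not genotypes):
$$\text{fingerprint} = H(p_1 \parallel p_2 \parallel \cdots \parallel p_n)$$
where $p_1, \ldots, p_n$ are the sample's variant positions, $H$ is SHA-256, and ∥ denotes concatenation. This preserves privacy because genotypes are never an input to the hash; only the positions at which variants occur are hashed.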
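A fingerprint of this kind could be computed on the client as sketched below. The paper does not specify how positions are encoded before hashing, so the "chrom:pos" string form, the sort order, and the newline separator are assumptions made for illustration, as is the function name.

```python
import hashlib


def dna_fingerprint(positions):
    """SHA-256 over variant positions only; genotypes are never an input.

    `positions` is an iterable of (chromosome, position) pairs. The
    "chrom:pos" encoding, sorting, and separator are assumptions of this
    sketch, not the encoding used by the deployed system.
    """
    # Deduplicate and sort so the result does not depend on input order.
    canonical = sorted({f"{chrom}:{pos}" for chrom, pos in positions})
    payload = "\n".join(canonical).encode("ascii")
    return hashlib.sha256(payload).digest()  # 32 bytes, fits a Solidity bytes32


fp = dna_fingerprint([("chr1", 12345), ("chr2", 67890)])
print("0x" + fp.hex())
```

A separator is used because plain concatenation of variable-length fields can be ambiguous (for example, "1" followed by "23" versus "12" followed by "3").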
3.3 Smart Contract Structure
The BioData Router implements the following data structures:
struct LabInfo {
    string name;
    string location;
    string s3Bucket;
    uint256 registeredAt;
    bool active;
}
struct FileRecord {
    address labWallet;
    address userWallet;
    string s3Path;
    bytes32 fileHash;
    bytes32 dnaFingerprint;
    uint256 timestamp;
    bool isLicensed;
    string ipAssetId;   // Story Protocol
    string fileFormat;  // VCF, BAM, FASTQ
}
mapping(address => LabInfo) registeredLabs;
mapping(bytes32 => address) dnaFingerprintToUser;
mapping(bytes32 => FileRecord) files;
mapping(bytes32 => FileRecord[]) fingerprintFiles;
mapping(address => FileRecord[]) userFiles;
3.4 Core Functions
3.4.1 Laboratory Registration
Only the master node can register laboratories:
function registerLab(
    address labWallet,
    string memory name,
    string memory location,
    string memory s3Bucket
) public {
    // Only the master node may register laboratories
    require(msg.sender == masterNode, "caller is not the master node");
    registeredLabs[labWallet] = LabInfo({
        name: name,
        location: location,
        s3Bucket: s3Bucket,
        registeredAt: block.timestamp,
        active: true
    });
    emit LabRegistered(labWallet, name, ...);
}
3.4.2 File Registration
Laboratories register genomic files with DNA fingerprints:
function registerFile(
    address labWallet,
    address userWallet,
    string memory s3Path,
    bytes32 fileHash,
    bytes32 dnaFingerprint,
    string memory fileFormat,
    bool isLicensed,
    string memory ipAssetId
) public {
    // The labWallet argument must be an active, registered lab.
    // Note: this checks the argument, not msg.sender.
    require(registeredLabs[labWallet].active, "lab is not active");
    // Check for duplicate genomic sample
    if (dnaFingerprintToUser[dnaFingerprint] != address(0)) {
        emit DuplicateGenomicSample(...);
    } else {
        // First file for this sample: bind the fingerprint to the user
        dnaFingerprintToUser[dnaFingerprint] = userWallet;
        totalGenomicSamples++;
    }
    FileRecord memory record = FileRecord({
        labWallet: labWallet,
        userWallet: userWallet,
        s3Path: s3Path,
        fileHash: fileHash,
        dnaFingerprint: dnaFingerprint,
        timestamp: block.timestamp,
        isLicensed: isLicensed,
        ipAssetId: ipAssetId,
        fileFormat: fileFormat
    });
    // The record is stored in all three indexes, duplicate or not
    files[fileHash] = record;
    fingerprintFiles[dnaFingerprint].push(record);
    userFiles[userWallet].push(record);
    totalFiles++;
    emit FileRegistered(...);
}
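The duplicate-sample branch is easier to follow off-chain. The following is a minimal in-memory Python model of the registration logic above, not the deployed contract; the class, field, and method names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class RouterModel:
    """In-memory stand-in for the BioData Router's file registries."""
    active_labs: set = field(default_factory=set)
    fingerprint_to_user: dict = field(default_factory=dict)
    files: dict = field(default_factory=dict)
    total_samples: int = 0

    def register_file(self, lab, user, file_hash, fingerprint):
        """Record a file; return True if the fingerprint was new."""
        if lab not in self.active_labs:
            raise PermissionError("lab is not registered or not active")
        is_new = fingerprint not in self.fingerprint_to_user
        if is_new:
            # First file for this sample: bind the fingerprint to the user.
            self.fingerprint_to_user[fingerprint] = user
            self.total_samples += 1
        # The file is recorded either way, mirroring the Solidity code.
        self.files[file_hash] = (lab, user, fingerprint)
        return is_new


router = RouterModel(active_labs={"0xLab"})
print(router.register_file("0xLab", "0xUserA", b"hash-1", b"fp"))
print(router.register_file("0xLab", "0xUserB", b"hash-2", b"fp"))
```

In this model, as in the Solidity code, a second registration with the same fingerprint leaves the fingerprint bound to the first user while still recording the new file.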
3.4.3 Discovery Functions
Users can query files by DNA fingerprint or user wallet:
function getFilesByDNAFingerprint(bytes32 dnaFingerprint)
    public view returns (FileRecord[] memory) {
    return fingerprintFiles[dnaFingerprint];
}
function getUserFiles(address userWallet)
    public view returns (FileRecord[] memory) {
    return userFiles[userWallet];
}
3.4.4 Identity Verification
The contract implements genomic identity verification:
function verifyGenomicIdentity(
    bytes32 fileHash,
    bytes32 expectedDnaFingerprint
) public view returns (bool) {
    return files[fileHash].dnaFingerprint == expectedDnaFingerprint;
}
This allows third parties to verify that a registered file hash is associated with the claimed DNA fingerprint without accessing the raw data.
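One detail of this check is worth illustrating: in Solidity, reading a mapping key that was never written returns zero values, so an unregistered file hash compares equal to an all-zero fingerprint. The sketch below models that behavior in Python and shows a stricter caller-side variant; both function names and the dictionary representation are assumptions for illustration.

```python
ZERO32 = bytes(32)  # Solidity's default value for an unset bytes32


def verify_genomic_identity(files, file_hash, expected_fingerprint):
    """Mirror of the on-chain comparison, with mapping-default semantics."""
    stored = files.get(file_hash, ZERO32)  # unset keys read as zero on-chain
    return stored == expected_fingerprint


def verify_registered(files, file_hash, expected_fingerprint):
    """Stricter variant: an unregistered file hash never verifies."""
    return file_hash in files and files[file_hash] == expected_fingerprint


registry = {b"file-1": b"\x01" * 32}  # file hash -> DNA fingerprint
print(verify_genomic_identity(registry, b"unknown", ZERO32))
print(verify_registered(registry, b"unknown", ZERO32))
```

A client that cares about the distinction can reject an all-zero expected fingerprint, or confirm that the file is registered, before trusting a positive result.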
4. BioFS Protocol
4.1 Protocol Architecture
BioFS (Biological File System) is a 4-layer protocol for genomic data federation:
- Discovery Layer: Query BioData Router by DNA fingerprint
- Identity Layer: Verify LabNFT and data provenance
- Storage Layer: Retrieve files from S3 using presigned URLs
- Network Layer: Handle cross-chain communication
4.2 Discovery Layer
The Discovery Layer enables privacy-preserving file discovery:
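import hashlib
from web3 import Web3

web3 = Web3(Web3.HTTPProvider("http://52.90.163.112:8545"))

# Client generates the DNA fingerprint locally; the VCF never leaves the client.
# extract_variant_positions() is a client-side helper (not shown), assumed
# here to return the variant positions as bytes.
positions = extract_variant_positions(vcf_file)
fingerprint = hashlib.sha256(positions).digest()  # 32 bytes, matches bytes32

# Query BioData Router (read-only call, no transaction is sent)
contract = web3.eth.contract(
    address=BIODATA_ROUTER,
    abi=BIODATA_ROUTER_ABI
)
files = contract.functions.getFilesByDNAFingerprint(
    fingerprint
).call()
# Returns a list of FileRecord tuples:
# (labWallet, userWallet, s3Path, fileHash, dnaFingerprint,
#  timestamp, isLicensed, ipAssetId, fileFormat)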
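Since getFilesByDNAFingerprint returns FileRecord structs, a client will usually want named access to the fields. The sketch below assumes the client library hands each struct back as a plain tuple in the field order of the FileRecord definition in Section 3.3; the decode_records helper is hypothetical.

```python
from typing import NamedTuple


class FileRecord(NamedTuple):
    """Field order follows the FileRecord struct in Section 3.3."""
    labWallet: str
    userWallet: str
    s3Path: str
    fileHash: bytes
    dnaFingerprint: bytes
    timestamp: int
    isLicensed: bool
    ipAssetId: str
    fileFormat: str


def decode_records(raw_rows):
    """Convert raw struct tuples into named records."""
    return [FileRecord(*row) for row in raw_rows]


# Example row with placeholder values, shaped like one returned struct.
sample = [("0xLab", "0xUser", "s3://vault/user/vcf/sample.vcf",
           b"\x11" * 32, b"\x22" * 32, 1698451200, True, "0x1234", "VCF")]
for rec in decode_records(sample):
    print(rec.labWallet, rec.fileFormat, rec.s3Path)
```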
4.3 Identity Layer
The Identity Layer verifies laboratory credentials via LabNFTs. Each LabNFT contains:
- Laboratory name and location
- CLIA/CAP accreditation status
- S3 bucket configuration
- Registration timestamp
4.4 Storage Layer
The Storage Layer maintains GDPR compliance through:
- Control Plane: Blockchain (immutable metadata)
- Data Plane: S3 buckets (deletable genomic files)
When a user exercises their “right to erasure” (GDPR Article 17):
- Genomic files are deleted from S3
- Blockchain records remain (showing file existed)
- S3 path becomes invalid (404 Not Found)
This ensures legal compliance while maintaining data provenance.
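The split between an append-only control plane and a deletable data plane can be modeled in a few lines. This is an in-memory sketch only: the class and method names are illustrative, a Python list stands in for the blockchain, and a dictionary stands in for the S3 bucket.

```python
class DualPlaneStore:
    """Toy model: append-only metadata plus deletable file storage."""

    def __init__(self):
        self.chain = []    # control plane: records are only ever appended
        self.bucket = {}   # data plane: s3_path -> file bytes, deletable

    def register(self, s3_path, file_hash, data):
        self.bucket[s3_path] = data
        self.chain.append({"s3Path": s3_path, "fileHash": file_hash})

    def erase(self, s3_path):
        """Right to erasure: remove the file, leave the metadata in place."""
        self.bucket.pop(s3_path, None)

    def fetch(self, s3_path):
        """Return (status, data) using HTTP-style status codes."""
        if s3_path in self.bucket:
            return 200, self.bucket[s3_path]
        return 404, None


store = DualPlaneStore()
store.register("s3://vault/user/sample.vcf", "0xabc", b"##fileformat=VCFv4.2")
store.erase("s3://vault/user/sample.vcf")
print(store.fetch("s3://vault/user/sample.vcf"), len(store.chain))
```

After erasure the metadata record is still present, but resolving its path yields a 404, which matches the three outcomes listed above.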
4.5 Network Layer
The Network Layer handles cross-chain communication using:
- Sequentia RPC: Direct blockchain queries
- Story Protocol Bridge: IP asset synchronization
- OpenCRAVAT Nodes: Variant annotation jobs
5. LabNFTs and BioNFTs
5.1 LabNFT Architecture
LabNFTs (ERC-721 tokens) represent immutable laboratory identities. Each LabNFT includes:
{
    "name": "Johns Hopkins Genomics Center",
    "location": "Baltimore, MD, USA",
    "accreditation": ["CLIA", "CAP"],
    "s3Bucket": "jhgc-genomics-vault",
    "registeredAt": 1698451200,
    "masterNode": "0x088ebE...",
    "image": "ipfs://QmXrT..."
}
LabNFTs cannot be transferred (soulbound tokens), ensuring permanent association between laboratory wallet and identity.
5.2 BioNFT Metamorphosis
BioNFTs represent physical biosamples as they transform through the analysis pipeline:
5.2.1 Activation Phase
Physical biosample (blood/saliva) is linked to blockchain:
POST /create_biosample_activation
{
    "serial": "GB-001-XYZ",
    "owner_wallet": "0xUser...",
    "lab_wallet": "0xLab...",
    "collection_date": "2025-10-28"
}
5.2.2 Tokenization Phase
Sequencing generates genomic files, registered on Sequentia:
contract.functions.registerFile(
    labWallet="0xLab...",
    userWallet="0xUser...",
    s3Path="s3://vault/user/vcf/sample.vcf",
    fileHash=sha256(file_contents),
    dnaFingerprint=compute_fingerprint(vcf),
    fileFormat="VCF",
    isLicensed=True,
    ipAssetId="0x1234..."  # Story Protocol
).transact()
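Genomic files are often too large to hash in a single read, so the fileHash argument is more practical to compute in chunks. A minimal sketch, assuming fileHash is the SHA-256 digest of the raw file bytes; the function name and chunk size are illustrative.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 without loading it all into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest()  # 32 bytes, suitable for a bytes32 argument
```

The digest is identical to hashing the whole file at once; only peak memory use differs.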
5.2.3 Bioasset Phase
Analysis results (annotations, predictions) become derivative IP assets:
- VCF file → Parent IP asset
- SQLite annotation → Child IP asset (OpenCRAVAT)
- AlphaGenome predictions → Grandchild IP asset
- Ancestry composition → Independent IP asset (SOMOS)
All derivatives inherit licensing terms from the parent via Story Protocol.
6. Story Protocol Integration
6.1 Programmable IP Licensing
Sequentia integrates Story Protocol for programmable intellectual property licensing. Every genomic file registered on Sequentia can be:
- Registered as IP Asset: Immutable on-chain registration
- Attached to License Terms: PIL (Programmable IP License)
- Minted as License Tokens: Access granted via NFTs
6.2 License Term Structure
Story Protocol supports flexible licensing:
{
    "commercial_use": false,
    "derivatives_allowed": true,
    "attribution_required": true,
    "revenue_share": 0,
    "currency": "IP_TOKEN",
    "chainId": 15132025
}
6.3 Derivative IP Assets
When an analysis creates derivative data:
# Register child IP asset
child_ip = story_protocol.mint_derivative(
    parent_ip_id="0xParentIP...",
    child_nft_address="0xAnalysisNFT...",
    license_id="0xLicense...",
    metadata_uri="ipfs://QmAnalysis..."
)
# Child automatically inherits license terms
# Revenue sharing flows to parent IP owner
This creates an IP lineage tree where all derivative works are traceable and licensed.
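The lineage tree can be pictured as parent pointers with inherited terms. The sketch below is a plain Python model, not Story Protocol's API; the IPAsset class and its attributes are assumptions made for illustration.

```python
class IPAsset:
    """Node in an IP lineage tree; derivatives share the parent's terms."""

    def __init__(self, ip_id, license_terms=None, parent=None):
        self.ip_id = ip_id
        self.parent = parent
        # A derivative takes its parent's terms; only a root sets its own.
        self.license_terms = parent.license_terms if parent else license_terms

    def lineage(self):
        """IDs from this asset up to the root, starting with itself."""
        node, chain = self, []
        while node is not None:
            chain.append(node.ip_id)
            node = node.parent
        return chain


vcf = IPAsset("vcf", {"commercial_use": False, "attribution_required": True})
annotation = IPAsset("annotation", parent=vcf)
prediction = IPAsset("prediction", parent=annotation)
print(prediction.lineage(), prediction.license_terms)
```

Walking the parent pointers from any derivative recovers every ancestor, which is what makes attribution along the chain traceable.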
6.4 Revenue Distribution
Story Protocol automatically distributes revenue:
- User mints license token for analysis result (pays fee)
- Fee splits between:
  - Original data owner (VCF creator)
  - Analysis service provider (OpenCRAVAT)
  - Story Protocol treasury
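A three-way fee split of this kind is typically done in integer arithmetic to avoid rounding drift. The sketch below is illustrative only: the paper does not state the actual percentages, so the basis-point shares in the example are hypothetical, as are the function and party names.

```python
def split_fee(fee_wei, shares_bps):
    """Split an integer fee by basis points (10,000 bps = 100%)."""
    if sum(shares_bps.values()) != 10_000:
        raise ValueError("shares must sum to 10,000 basis points")
    payouts = {name: fee_wei * bps // 10_000 for name, bps in shares_bps.items()}
    # Integer division can leave a few units over; give them to the first payee.
    remainder = fee_wei - sum(payouts.values())
    first_payee = next(iter(shares_bps))
    payouts[first_payee] += remainder
    return payouts


# Hypothetical shares, chosen only to make the example concrete.
shares = {"data_owner": 7_000, "analysis_provider": 2_500, "treasury": 500}
print(split_fee(1_000, shares))
```

Assigning the rounding remainder to one named party keeps the payouts summing exactly to the fee.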
7. Partner Chain Ecosystem
Sequentia Network connects to specialized bioinformatics chains and services:
7.1 OpenCRAVAT Chain
OpenCRAVAT is a decentralized variant annotation network. Instead of a centralized server at cravat.genobank.app, annotation jobs are distributed across nodes:
- User uploads VCF to Sequentia
- Smart contract triggers OpenCRAVAT job
- Annotation runs on distributed nodes
- Results registered as derivative IP
Key Features:
- 220+ annotation sources (ClinVar, gnomAD, COSMIC)
- Modular annotator system
- SQLite output format
- Expert curator AI integration
7.2 AlphaGenome
AlphaGenome leverages DeepMind’s AlphaMissense model for variant pathogenicity prediction:
POST /api_alphagenome/submit_variant_scoring
{
    "vcf_file": "s3://vault/user.vcf",
    "model": "alphamissense",
    "user_signature": "0x..."
}
# Returns pathogenicity scores (0-1)
# 0.0 = benign, 1.0 = pathogenic
Results are tokenized and linked to parent VCF as derivative IP.
7.3 SOMOS DAO
SOMOS DAO provides ancestry composition analysis:
- Input: 23andMe, Ancestry.com, VCF files
- Pipeline: ECS Fargate container (10-15 minutes)
- Output: 24-population ancestry breakdown + haplogroups
- Tokenization: Ancestry NFT with Story Protocol licensing
7.4 Biomni Multi-Omics
Biomni integrates genomics with other -omics data:
- Transcriptomics (RNA-seq)
- Proteomics (mass spectrometry)
- Metabolomics (LC-MS/MS)
- Epigenomics (ChIP-seq, ATAC-seq)
7.5 AlphaFold Protein Prediction
AlphaFold integration enables:
- Variant → Protein sequence change
- AlphaFold → 3D structure prediction
- Impact analysis → Structural disruption assessment
7.6 Claude AI Genomic Assistant
Claude AI (Anthropic) provides:
- Natural language variant interpretation
- Report generation (PDF summaries)
- Family history analysis
- Research paper synthesis
All AI interactions are logged on-chain for auditability.
8. Security and Privacy
8.1 Threat Model
We consider three adversary types:
- Curious Server: Honest-but-curious cloud provider (AWS)
- Network Adversary: Passive eavesdropper on network traffic
- Malicious User: Attempts to claim others’ genomic data
8.2 Security Mechanisms
8.2.1 DNA Fingerprint Privacy
As proven in Theorem 1, DNA fingerprints reveal no genotype information. Even if an adversary intercepts a fingerprint such as
dnaFingerprint: 0x7a3f2c1b...
they cannot reverse-engineer the actual genetic variants from it.
8.2.2 S3 Encryption
All genomic files are encrypted at rest (AES-256) and in transit (TLS 1.3). S3 presigned URLs expire after 15 minutes.
8.2.3 Access Control
File access requires:
- Valid user signature (Web3 wallet)
- Ownership verified via BioData Router
- Active license token (if commercial use)
8.3 GDPR Compliance
Sequentia implements “right to erasure” through dual-plane architecture:
| Data Type | Storage | Erasable? |
|---|---|---|
| Genomic files (VCF) | S3 | Yes |
| DNA fingerprints | Blockchain | No |
| File metadata | Blockchain | No |
| S3 paths | Blockchain | No (invalidated) |
When a user deletes their data:
- S3 files are permanently deleted
- Blockchain shows file existed but is no longer accessible
- This satisfies GDPR Article 17 (data deleted, history preserved)
9. Performance Analysis
9.1 Transaction Throughput
Sequentia Network achieves:
- Block Time: 15 seconds (deterministic)
- Gas Limit: 8,000,000 per block
- File Registration: ~150,000 gas (≈53 txs/block)
- Theoretical TPS: (53 txs)/(15 s) ≈ 3.5 TPS
For genomic data (infrequent registrations), this is sufficient.
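The throughput bound follows from three inputs: the block gas limit, the gas used per registration, and the block time. The short calculation below takes the gas limit, the approximate per-registration gas, and the block time stated in this section as given and derives the per-block and per-second figures from them.

```python
GAS_LIMIT_PER_BLOCK = 8_000_000
GAS_PER_REGISTRATION = 150_000   # approximate figure from the text
BLOCK_TIME_SECONDS = 15

# Whole registrations that fit in one block, then averaged over the block time.
txs_per_block = GAS_LIMIT_PER_BLOCK // GAS_PER_REGISTRATION
tps = txs_per_block / BLOCK_TIME_SECONDS

print(f"{txs_per_block} registrations per block, {tps:.2f} TPS")
```

With these inputs the bound is 53 registrations per block, or about 3.5 per second; a higher real-world gas cost per registration would lower both numbers proportionally.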
9.2 Storage Scalability
BioData Router uses efficient storage:
- Each FileRecord: ~512 bytes on-chain
- 1 million files: ~512 MB blockchain state
- Actual genomic data: Off-chain in S3 (unlimited)
9.3 Query Performance
Smart contract queries are read-only calls that require no transaction:
getFilesByDNAFingerprint(): O(1) mapping lookup, plus O(k) to return the k matching records
getUserFiles(): O(n) where n = user's files
getLabInfo(): O(1) lookup
10. Deployment and Production Status
10.1 Live Infrastructure
Sequentia Network is currently deployed in production:
- Master Node: 52.90.163.112:8545
- BioData Router: 0x2ff3FB85c71D6cD7F1217A08Ac9a2d68C02219cd
- Network ID: 15132025
- Uptime: 99.9% (3-month average)
10.2 Usage Statistics
As of October 2025:
- Total registered labs: 12
- Total genomic files: 847
- Total DNA fingerprints: 412 (unique individuals)
- Average file size: 1.2 GB (VCF), 450 MB (BAM)
10.3 Integration Status
| Service | Status |
|---|---|
| BioData Router | Production |
| LabNFTs | Production |
| BioNFTs | Production |
| Story Protocol | Production |
| OpenCRAVAT | Production |
| AlphaGenome | Production |
| SOMOS DAO | Production |
| Claude AI | Production |
| AlphaFold | Beta |
| Biomni | Development |
11. Future Work
11.1 Decentralized Compute Network
We are developing a Sequentia Compute Layer for distributed bioinformatics:
- Laboratories contribute idle compute to network
- Jobs (alignment, variant calling) run on distributed nodes
- Rewards paid in SEQT token (native gas token)
11.2 Cross-Chain Bridges
Future work includes bridges to:
- Ethereum Mainnet: For high-value IP asset registration
- Polygon: For low-cost microtransactions
- Avalanche: For subnet-based private genomics
11.3 Advanced Privacy
We are exploring:
- Zero-Knowledge Proofs: Prove variant carrier status without revealing genotype
- Homomorphic Encryption: Compute on encrypted genomic data
- Secure Multi-Party Computation: Collaborative analysis without data sharing
11.4 AI-Driven Discovery
Integration of:
- Claude AI: Automated cohort discovery via natural language
- AlphaGenome: Predictive variant scoring at scale
- Biomni: Multi-omics integration for disease modeling
12. Related Work
12.1 Genomic Data Sharing
Existing platforms include:
- dbGaP (NCBI): Centralized, requires institutional approval
- EGA (EMBL-EBI): European equivalent, similar limitations
- AnVIL (NHGRI): Cloud-based, but centralized
Sequentia differs through decentralization and programmable licensing.
12.2 Blockchain Genomics
Prior blockchain genomics projects:
- Nebula Genomics: Consumer genomics marketplace
- EncrypGen: Gene-Chain DNA marketplace
- Shivom: Genomics data hub (defunct)
Sequentia advances the state of the art through:
- A laboratory-focused design (not consumer-only)
- Production bioinformatics integration
- Story Protocol IP management
- A real deployment with active users
13. Conclusion
Sequentia Network provides production-grade infrastructure for decentralized genomic data management. Through the BioData Router smart contract, BioFS Protocol, LabNFTs, and Story Protocol integration, we enable privacy-preserving data discovery, immutable laboratory verification, and programmable IP licensing.
The network’s integration with specialized services—OpenCRAVAT, AlphaGenome, SOMOS DAO, AlphaFold, and Claude AI—demonstrates the viability of a blockchain-based bioinformatics ecosystem. With 847 registered genomic files across 12 laboratories, Sequentia Network is actively used in production.
Future work will expand the compute network, implement advanced privacy mechanisms, and bridge to additional blockchains. We invite the genomics community to deploy nodes, register laboratories, and build applications on Sequentia Network.
Acknowledgments
We thank the OpenCRAVAT team (Johns Hopkins), Story Protocol developers, and the Ethereum Clique consensus team. This work was supported by GenoBank.io research grants and the SOMOS DAO community.
References
- Ethereum Foundation, “Clique Proof-of-Authority Consensus Protocol,” EIP-225, 2017.
- Pagel, K. et al., “OpenCRAVAT: Open Custom Ranked Analysis of Variants Toolkit,” Bioinformatics, 2020.
- Cheng, J. et al., “Accurate proteome-wide missense variant effect prediction with AlphaMissense,” Science, 2023.
- Story Protocol Team, “Programmable IP License (PIL) Framework,” Story Protocol Whitepaper, 2024.
- European Union, “General Data Protection Regulation (GDPR),” Regulation (EU) 2016/679, 2016.
- Karczewski, K.J. et al., “The mutational constraint spectrum quantified from variation in 141,456 humans,” Nature, 2020.
- Landrum, M.J. et al., “ClinVar: improvements to accessing data,” Nucleic Acids Research, 2020.
- Tate, J.G. et al., “COSMIC: the Catalogue Of Somatic Mutations In Cancer,” Nucleic Acids Research, 2019.
- Jumper, J. et al., “Highly accurate protein structure prediction with AlphaFold,” Nature, 2021.
- Wood, G., “Ethereum: A Secure Decentralised Generalised Transaction Ledger,” Ethereum Yellow Paper, 2014.
For more information:
- Network Explorer: (in development)
- GitHub: https://github.com/Genobank
- Contact: [email protected]