Bitcoin Block Data Parser — Design
Sources
Offline Node Data
The operator has an offline Bitcoin node that was last synced about a month ago. This is the test data source.
What we need from the node:
- Raw block data (blk*.dat files), or
- RPC access (if the node can be started offline, e.g. with -listen=0), or
- Exported block data (JSON via the getblock RPC in a loop)
Best approach for offline node:
# Export block data without network access (node started with networking disabled, e.g. -listen=0 -noconnect)
bitcoin-cli -rpcconnect=127.0.0.1 -rpcport=8332 getblockhash <height> >> hashes.txt
bitcoin-cli getblock <hash> 2 >> blocks.json
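The two commands above handle a single height. As a minimal sketch of running them in a loop, the Python below calls bitcoin-cli once per height and writes one JSON object per line. The function names (`run_cli`, `export_blocks`, `write_jsonl`) and the JSON Lines output are assumptions for illustration and not part of the design. The RPC host and port are copied from the command above.

```python
import json
import subprocess
from typing import Callable, Iterable, Iterator


def run_cli(*args: str) -> str:
    """Run bitcoin-cli with the given arguments and return its stdout."""
    result = subprocess.run(
        ["bitcoin-cli", "-rpcconnect=127.0.0.1", "-rpcport=8332", *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def export_blocks(
    heights: Iterable[int], cli: Callable[..., str] = run_cli
) -> Iterator[dict]:
    """Yield one decoded block (getblock verbosity 2) per requested height."""
    for height in heights:
        block_hash = cli("getblockhash", str(height))
        yield json.loads(cli("getblock", block_hash, "2"))


def write_jsonl(blocks: Iterable[dict], path: str) -> None:
    """Write each block as a single JSON line (hypothetical output layout)."""
    with open(path, "w", encoding="utf-8") as out:
        for block in blocks:
            out.write(json.dumps(block) + "\n")
```

One object per line is easier to stream back in than appending multi-line JSON documents to a single file. The `cli` parameter exists so the loop can be tested without a running node.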
Alternative: Copy blk*.dat files
# Copy from node machine to SSD
rsync -av /home/user/.bitcoin/blocks/blk*.dat /media/user/shared-rw/bitcoin/blocks/
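If the raw files are copied, the parser has to split them into blocks itself. Each record in a blk*.dat file is a 4-byte network magic, a 4-byte little-endian length, and then the serialized block. The sketch below walks those records. It assumes mainnet and plain, non-obfuscated files (newer Bitcoin Core releases may XOR-obfuscate block files with a key stored in the blocks directory). `iter_raw_blocks` is a hypothetical name.

```python
import struct
from typing import BinaryIO, Iterator

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")  # mainnet network magic


def iter_raw_blocks(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each serialized block found in an open blk*.dat stream."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of file
        magic = header[:4]
        (size,) = struct.unpack("<I", header[4:])
        if magic != MAINNET_MAGIC:
            return  # zero padding, other network, or obfuscated data
        block = stream.read(size)
        if len(block) < size:
            return  # truncated final record
        yield block
```

Blocks are stored in the order the node received them, not in height order. Heights therefore have to be reconstructed from the header chain before any age calculation.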
Parser 1: Sat Hodl Wave Parser (Hedblitz Claims)
What is Hodl Wave?
A hodl wave analysis measures how long coins have been held, i.e. the age distribution of UTXOs. “Hedblitz claims” likely refers to claims based on coin age: proving that you held coins for a certain period.
What we parse:
For each UTXO in a block:
- Find the transaction that created it (input’s previous output)
- Determine the block height of the creating transaction
- Calculate age: current_height - creation_height
- Categorize by age brackets:
- < 1 day
- 1-7 days
- 7-30 days
- 30-90 days
- 90-365 days
- 1-2 years
- 2-5 years
- 5+ years
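The age computed above is in blocks, while the brackets are in days and years, so a conversion is needed. The sketch below assumes 144 blocks per day (the 10-minute target spacing) and 365-day years. Both are approximations, and a timestamp-based age would be an alternative. The bracket labels follow the keys used in the output format in this section.

```python
import bisect

BLOCKS_PER_DAY = 144  # assumption: one block every 10 minutes

# (exclusive upper bound in days, label); the final bracket is open-ended
BRACKETS = [
    (1, "0_1d"),
    (7, "1_7d"),
    (30, "7_30d"),
    (90, "30_90d"),
    (365, "90_365d"),
    (2 * 365, "1_2y"),
    (5 * 365, "2_5y"),
]
_UPPER_BLOCKS = [days * BLOCKS_PER_DAY for days, _ in BRACKETS]
_LABELS = [label for _, label in BRACKETS] + ["5y_plus"]


def age_bracket(current_height: int, creation_height: int) -> str:
    """Return the bracket label for a UTXO created at creation_height."""
    age_blocks = current_height - creation_height
    if age_blocks < 0:
        raise ValueError("creation height is above the current height")
    # bisect_right sends an age exactly on a boundary to the older bracket
    return _LABELS[bisect.bisect_right(_UPPER_BLOCKS, age_blocks)]
```

With these bounds, a coin that is exactly 144 blocks old counts as "1_7d", not "0_1d".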
For Hedblitz Claims:
A “claim” would be a cryptographic proof that:
- You control a UTXO of age N
- The UTXO has value V
- The UTXO has not been spent since creation
This could be the basis for a “proof of loyalty” TXXM type.
Output Format:
{
"block_height": 850000,
"block_hash": "...",
"timestamp": 1720000000,
"utxo_count": 5000,
"total_sats": 140000000,
"age_distribution": {
"0_1d": {"count": 100, "sats": 1000000},
"1_7d": {"count": 200, "sats": 2000000},
"7_30d": {"count": 300, "sats": 3000000},
"30_90d": {"count": 400, "sats": 4000000},
"90_365d": {"count": 500, "sats": 5000000},
"1_2y": {"count": 1000, "sats": 10000000},
"2_5y": {"count": 1500, "sats": 15000000},
"5y_plus": {"count": 1000, "sats": 100000000}
}
}
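As a rough sketch of how a report in this shape could be assembled, the function below folds a list of (creation_height, value_sats) pairs into the per-bracket counters. The function name, the input shape, and the `bracket_for` callback are assumptions. Any function that maps (current_height, creation_height) to one of the labels will do.

```python
from typing import Callable, Dict, Iterable, Tuple

LABELS = ["0_1d", "1_7d", "7_30d", "30_90d", "90_365d", "1_2y", "2_5y", "5y_plus"]


def hodl_wave_report(
    block_height: int,
    block_hash: str,
    timestamp: int,
    utxos: Iterable[Tuple[int, int]],
    bracket_for: Callable[[int, int], str],
) -> Dict:
    """Aggregate (creation_height, value_sats) pairs into the report dict."""
    distribution = {label: {"count": 0, "sats": 0} for label in LABELS}
    utxo_count = 0
    total_sats = 0
    for creation_height, sats in utxos:
        bucket = distribution[bracket_for(block_height, creation_height)]
        bucket["count"] += 1
        bucket["sats"] += sats
        utxo_count += 1
        total_sats += sats
    return {
        "block_height": block_height,
        "block_hash": block_hash,
        "timestamp": timestamp,
        "utxo_count": utxo_count,
        "total_sats": total_sats,
        "age_distribution": distribution,
    }
```

Because the totals come from the same loop as the brackets, utxo_count and total_sats always equal the sums over age_distribution. This gives a cheap consistency check on any emitted report.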
Parser 2: OP_RETURN Metaprotocol Filter
What we parse:
We scan every transaction’s outputs. For each OP_RETURN output:
- Extract the data push (up to 80 bytes)
- Check if data starts with a known metaprotocol prefix
- If yes → EXCLUDE (not Kapnet)
- If no → check for Kapnet prefix
- If Kapnet → INCLUDE and decode TXXM
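The first step above, extracting the data push, can be sketched directly on the output script bytes. An OP_RETURN script starts with opcode 0x6a, followed by a push: either a direct push of 1 to 75 bytes, or OP_PUSHDATA1/OP_PUSHDATA2 with an explicit length. The sketch below returns only the first push and ignores any further pushes, which is a simplification. `op_return_payload` is a hypothetical name.

```python
from typing import Optional

OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D


def op_return_payload(script_hex: str) -> Optional[bytes]:
    """Return the first data push of an OP_RETURN script, or None."""
    script = bytes.fromhex(script_hex)
    if len(script) < 2 or script[0] != OP_RETURN:
        return None
    opcode = script[1]
    if 0x01 <= opcode <= 0x4B:
        start, length = 2, opcode  # direct push of `opcode` bytes
    elif opcode == OP_PUSHDATA1 and len(script) >= 3:
        start, length = 3, script[2]  # 1-byte length follows
    elif opcode == OP_PUSHDATA2 and len(script) >= 4:
        start, length = 4, int.from_bytes(script[2:4], "little")
    else:
        return None  # empty push, numeric opcode, or unsupported form
    payload = script[start : start + length]
    return payload if len(payload) == length else None  # reject truncation
```

For example, `op_return_payload("6a06" + b"kapnet".hex())` yields the six payload bytes, which can then be checked against the prefix lists.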
Metaprotocol Prefixes to EXCLUDE:
| Protocol | Prefix (ASCII) | Notes |
|---|---|---|
| Ordinals | ord | Ordinal inscriptions |
| BRC-20 | brc-20 | BRC-20 token operations |
| SRC-20 | src-20 | Stamps protocol |
| Runes | runes | Runes protocol |
| BitStore | b | BitStore data |
| Snow | snow | Snow protocol |
| Counterparty | CNTRPRTY | Counterparty tokens |
| OMNI | omni | Omni layer (USDT etc) |
| Colored Coins | CLCT | Early colored coins |
| Open Assets | OA | Open Assets protocol |
| RGB | RGB | RGB protocol |
| Taproot Assets | tap | Taproot Assets |
| Atomicals | atom | Atomicals protocol |
| MRI | mri | MRI protocol |
Kapnet Whitelist Prefixes:
| Protocol | Prefix (ASCII) | Notes |
|---|---|---|
| Kapnet TXXM | kapnet | Coordination data |
| Kapnet Anchor | kanchor | Chain anchor |
| Kapnet Governance | kgov | Governance TXXM |
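A minimal sketch of the filter described in this section: check the payload against the exclusion prefixes first, then against the Kapnet whitelist. The prefixes are copied from the two tables. Case-sensitive matching on the raw bytes and trying longer prefixes first are assumptions of this sketch.

```python
from typing import Optional, Tuple

EXCLUDED = [
    b"ord", b"brc-20", b"src-20", b"runes", b"b", b"snow", b"CNTRPRTY",
    b"omni", b"CLCT", b"OA", b"RGB", b"tap", b"atom", b"mri",
]
WHITELIST = [b"kapnet", b"kanchor", b"kgov"]

# Longest prefixes first, so "brc-20" is not reported as the shorter "b".
_EXCLUDED_BY_LENGTH = sorted(EXCLUDED, key=len, reverse=True)
_WHITELIST_BY_LENGTH = sorted(WHITELIST, key=len, reverse=True)


def classify(payload: bytes) -> Tuple[str, Optional[str]]:
    """Return (category, matched prefix) for one OP_RETURN payload."""
    for prefix in _EXCLUDED_BY_LENGTH:
        if payload.startswith(prefix):
            return "excluded", prefix.decode("ascii")
    for prefix in _WHITELIST_BY_LENGTH:
        if payload.startswith(prefix):
            return "whitelisted", prefix.decode("ascii")
    return "unknown", None
```

Note that the single-byte BitStore prefix "b" matches every payload that begins with that byte, so it will also exclude unrelated data. A stricter marker may be worth considering.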
Output Format:
{
"block_height": 850000,
"total_transactions": 2500,
"op_return_count": 500,
"excluded": {
"ord": 200,
"brc-20": 150,
"src-20": 50,
"runes": 30,
"other": 20
},
"whitelisted": {
"kapnet": 5,
"kanchor": 2,
"kgov": 0
},
"unknown": 43,
"kapnet_txxms": [
{
"txid": "...",
"vout": 0,
"data": "kapnet:...",
"decoded": { ... }
}
]
}
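The counters in this report can be derived from the per-output classifications. The sketch below assumes each OP_RETURN output has already been labeled as a (category, prefix) pair, with category being one of "excluded", "whitelisted", or "unknown". Excluded protocols without their own key are folded into "other", matching the example. The kapnet_txxms list is left out because TXXM decoding is not specified here.

```python
from collections import Counter
from typing import Dict, Iterable, Optional, Tuple

NAMED_EXCLUDED = ("ord", "brc-20", "src-20", "runes")  # keys shown in the example
WHITELISTED = ("kapnet", "kanchor", "kgov")


def summarize(
    block_height: int,
    total_transactions: int,
    classified: Iterable[Tuple[str, Optional[str]]],
) -> Dict:
    """Tally (category, prefix) pairs into the report counters."""
    counts = Counter(classified)
    excluded = {name: counts[("excluded", name)] for name in NAMED_EXCLUDED}
    excluded["other"] = sum(
        n
        for (category, prefix), n in counts.items()
        if category == "excluded" and prefix not in NAMED_EXCLUDED
    )
    whitelisted = {name: counts[("whitelisted", name)] for name in WHITELISTED}
    return {
        "block_height": block_height,
        "total_transactions": total_transactions,
        "op_return_count": sum(counts.values()),
        "excluded": excluded,
        "whitelisted": whitelisted,
        "unknown": counts[("unknown", None)],
    }
```

As in the example report, the excluded, whitelisted, and unknown counts add up to op_return_count.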
Implementation Strategy
Option A: Rust Binary (Fast, Use Existing Rust Toolchain)
kapnet-block-parser/
├── Cargo.toml (depends on rust-bitcoin 0.32, which is already in the workspace)
├── src/
│ ├── main.rs — CLI entry point
│ ├── hodlwave.rs — Hodl wave parser
│ ├── op_return.rs — OP_RETURN parser + metaprotocol filter
│ ├── types.rs — Shared types
│ └── output.rs — JSON output
Pros: Fast, reuses existing rust-bitcoin, runs on SSD toolchain
Cons: Needs cc (C compiler) to build, which is not available in the AppVM
Option B: Node.js Script (Quick, Available Now)
kapnet-block-parser/
├── package.json
├── src/
│ ├── hodlwave.js — Hodl wave logic
│ ├── op_return.js — OP_RETURN filter
│ └── index.js — CLI
Pros: Runs now, no build needed, nostr-tools already installed
Cons: Slower for large block data, no native rust-bitcoin
Option C: Hybrid (Recommended)
- Rust binary for heavy parsing (build on SSD toolchain, run anywhere)
- Node.js wrapper for Nostr integration (publish results as TXXM envelopes)
Chain Analysis: What Blocks to Parse
Approach 1: Full Chain Walk
Parse every block from genesis to tip. Comprehensive but slow.
- ~850,000 blocks
- ~700GB of raw block data
- Days to weeks to process
Approach 2: Sample Analysis
Parse specific block ranges:
- Every 1000th block (850 samples) — quick overview
- Last 1000 blocks (most recent) — fresh data
- Specific halving epochs — historical comparison
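The three sampling rules can be combined into one list of heights. The sketch below is one possible reading: it takes "specific halving epochs" to mean the first block of each epoch (every 210,000 blocks), which is an assumption. A window around each halving would be an equally valid interpretation. `sample_heights` and its parameters are hypothetical.

```python
from typing import List

HALVING_INTERVAL = 210_000  # blocks between subsidy halvings


def sample_heights(tip: int, step: int = 1000, recent: int = 1000) -> List[int]:
    """Return sorted, de-duplicated heights to parse up to and including tip."""
    heights = set(range(0, tip + 1, step))  # every step-th block
    heights.update(range(max(0, tip - recent + 1), tip + 1))  # last `recent` blocks
    heights.update(range(0, tip + 1, HALVING_INTERVAL))  # first block of each epoch
    return sorted(heights)
```

For a tip of 850,000 this gives 1,850 heights: 851 from the regular stride, plus 999 additional recent blocks. The halving heights are already multiples of 1000.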
Approach 3: OP_RETURN Focus (Fastest)
Parse only blocks containing OP_RETURN transactions.
- Skip blocks with no OP_RETURN (most early blocks)
- Index OP_RETURN-containing blocks first
- Parse only those
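Building that index still requires one light pass over each block. With blocks exported via getblock at verbosity 2, OP_RETURN outputs appear with a scriptPubKey type of "nulldata", so the check is a simple scan. The sketch below assumes that JSON layout as input. The function names are hypothetical.

```python
from typing import Dict, Iterator, Tuple


def op_return_outputs(block: Dict) -> Iterator[Tuple[str, int, str]]:
    """Yield (txid, vout index, script hex) for each nulldata output."""
    for tx in block.get("tx", []):
        for out in tx.get("vout", []):
            script = out.get("scriptPubKey", {})
            if script.get("type") == "nulldata":
                yield tx["txid"], out["n"], script.get("hex", "")


def has_op_return(block: Dict) -> bool:
    """True if at least one output in the block is an OP_RETURN output."""
    return next(op_return_outputs(block), None) is not None
```

The heights of blocks for which `has_op_return` is true form the index. The full OP_RETURN parser then only needs to revisit those blocks.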
Recommendation for test: Approach 3 for OP_RETURN, Approach 2 for hodl wave.
Deliverables
- Block data acquisition — get node data to SSD
- Hodl wave parser — age distribution per block range
- OP_RETURN filter — metaprotocol exclusion + Kapnet whitelist
- TXXM decoder — decode Kapnet TXXMs from OP_RETURN data
- Reporter soul — automated analysis + Nostr publishing