Last Updated: March 2026
Custom on-chain analysis tools are purpose-built software systems designed to query, process, and analyse blockchain transaction data for investigative, compliance, or forensic purposes, beyond the capabilities offered by commercial platforms such as Chainalysis or Elliptic. The technical requirements for such tools are substantial: they must handle the UTXO (Unspent Transaction Output) model used by Bitcoin and its derivatives, the account model used by Ethereum and EVM-compatible chains, and increasingly the cross-chain data structures that emerge from bridge protocols and wrapped token mechanisms. According to TRM Labs’s 2024 blockchain analytics report, more than 40% of institutional compliance teams now supplement commercial tools with custom-built analysis components for specific investigative or reporting requirements.
Crypto Trace Labs builds and deploys custom on-chain analysis tools for regulated financial institutions, law enforcement agencies, and enterprise compliance programmes. The team – ACAMS (Association of Certified Anti-Money Laundering Specialists) accredited, MLRO (Money Laundering Reporting Officer) qualified across UK, US, and EU, and Chartered Fellow Grade at the CMI – brings founding-level experience from Blockchain.com, Kraken, and Coinbase, where custom analytics tooling was built at scale.
Key Takeaways
- Node access is the foundation: Custom tools require direct RPC (Remote Procedure Call) access to archive nodes for historical data – public APIs cannot provide the data completeness required for forensic-grade analysis.
- UTXO and account models require separate parsers: Bitcoin’s UTXO model and Ethereum’s account model are structurally different and require different parsing logic, clustering algorithms, and graph representations.
- Graph database backends outperform relational databases: According to Chainalysis’s 2024 developer documentation, address-graph analysis at scale requires a graph database such as Neo4j or AWS Neptune to achieve query performance under 100ms for complex multi-hop traversals.
- Output formats must be audit-ready: Any tool used in investigations or compliance must produce outputs with full data lineage – raw source data, processing steps, and analytical conclusions must be separable and independently verifiable.
- Cross-chain support requires bridge protocol parsing: Tracing assets through Wormhole, LayerZero, or Across Protocol bridges requires custom event log parsing and cross-reference logic that commercial tools do not fully support.
Why This Matters
Commercial blockchain analytics platforms cover the most common tracing requirements for Bitcoin and Ethereum, but they have documented limitations for emerging chains, privacy coins, DeFi protocols, and high-volume institutional analysis. Organisations that rely exclusively on commercial tools encounter gaps: Monero tracing is not supported by most commercial platforms, DeFi protocol-specific heuristics for identifying wallet owners are rarely built into commercial tools, and real-time monitoring at the transaction level requires infrastructure that commercial SaaS platforms cannot provide at acceptable latency. Custom tools address these gaps, but only if they are built to the right technical standards – tools that produce outputs without data lineage, or that cannot be validated independently, create compliance and litigation risk rather than reducing it.
Node Infrastructure Requirements
Node infrastructure is the data access layer that custom on-chain analysis tools depend on, and the choice of node type directly affects the completeness and reliability of the data available for analysis.
Full archive nodes are the minimum requirement for forensic-grade custom tools. An archive node retains the complete historical state of the blockchain, including all intermediate states at every block height. This is essential for tracing transactions that occurred years ago, reconstructing wallet balances at specific block heights, and querying smart contract states at historical points. Pruned nodes, which discard historical state data, are not suitable for forensic analysis.
For Bitcoin, the tool should connect over JSON-RPC to an unpruned Bitcoin Core full node with the transaction index enabled (txindex=1); Bitcoin Core has no separate archive mode, so an unpruned, fully indexed node fills that role. For Ethereum and EVM chains, connection to a Geth or Erigon archive node provides full state access. Running dedicated nodes ensures data independence from third-party APIs, which may have rate limits, outages, or data processing that differs from raw blockchain state. For operational resilience, production forensic tools should connect to at least two independent nodes and cross-validate responses.
| Node Type | Data Completeness | Suitable For | Infrastructure Cost |
|---|---|---|---|
| Full archive node | Complete historical state | Forensic analysis, compliance | High |
| Full node (pruned) | Current state + recent history | Transaction monitoring | Medium |
| Third-party API | API-dependent | Prototyping, low-volume | Low |
| Light client | Headers only | Verification only | Very low |
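The cross-validation step recommended above can be sketched as follows. This is a minimal illustration, not a production client: `make_http_rpc`, `cross_validated_block_hash`, and the node names are hypothetical, authentication is omitted, and the only real RPC method assumed is Bitcoin Core's `getblockhash`.

```python
import json
import urllib.request
from typing import Callable

# A function that sends one JSON-RPC call and returns its "result" field.
RpcCall = Callable[[str, list], object]


def make_http_rpc(url: str) -> RpcCall:
    """Build a JSON-RPC caller for one node (URL is a placeholder; no auth shown)."""
    def call(method: str, params: list) -> object:
        payload = json.dumps(
            {"jsonrpc": "2.0", "id": 1, "method": method, "params": params}
        ).encode("utf-8")
        request = urllib.request.Request(
            url, data=payload, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request, timeout=30) as response:
            body = json.load(response)
        if body.get("error"):
            raise RuntimeError(f"RPC error from {url}: {body['error']}")
        return body["result"]
    return call


def cross_validated_block_hash(nodes: dict[str, RpcCall], height: int) -> str:
    """Ask every node for the block hash at `height` and fail loudly on disagreement."""
    if len(nodes) < 2:
        raise ValueError("cross-validation needs at least two independent nodes")
    answers = {name: call("getblockhash", [height]) for name, call in nodes.items()}
    if len(set(answers.values())) != 1:
        raise RuntimeError(f"nodes disagree at height {height}: {answers}")
    return next(iter(answers.values()))
```

Passing the callers in as plain functions keeps the comparison logic testable without a live node, and a disagreement raises rather than silently picking one answer, which is usually the safer default for forensic work.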
UTXO Model Parser Design
UTXO model parsing is the process of reading Bitcoin transaction inputs and outputs to reconstruct the flow of funds between addresses, and it forms the core of Bitcoin forensic tools.
Each Bitcoin transaction consumes one or more UTXOs as inputs and creates one or more UTXOs as outputs. The parser must correctly identify input-output relationships, compute address balances from the UTXO set, and apply clustering heuristics to group addresses into wallet entities. The two most widely validated clustering heuristics for Bitcoin are the common input ownership heuristic (addresses whose outputs are spent together as inputs to the same transaction are likely controlled by the same entity) and the change address heuristic (the output returning change to the sender follows identifiable patterns).
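As an illustration of the common input ownership heuristic, the sketch below merges addresses with a union-find structure. The transaction shape (a dict with an `input_addresses` list) and the function names are assumptions made for the example, and the heuristic is applied naively: a real tool would first exclude transactions such as CoinJoins, where inputs belong to different parties.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure used to merge addresses into clusters."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            # Path halving: point each visited node at its grandparent.
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_by_common_inputs(transactions):
    """Group addresses that appear together as inputs of any transaction."""
    uf = UnionFind()
    for tx in transactions:
        inputs = tx["input_addresses"]
        for address in inputs:
            uf.find(address)  # register every address, even single-input ones
        for address in inputs[1:]:
            uf.union(inputs[0], address)

    clusters = defaultdict(set)
    for address in list(uf.parent):
        clusters[uf.find(address)].add(address)
    # Sorted output keeps results deterministic for regression testing.
    return sorted(sorted(members) for members in clusters.values())
```

Two transactions spending from `A, B` and `B, C` would place all three addresses in one cluster, because `B` links the two input sets.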
The parser must store both the raw transaction data and the processed clustering results as separate data objects, maintaining data lineage. For forensic applications, any cluster that is later updated – for example, when a deposit address is added to an exchange cluster – must log the update with a timestamp and the evidence basis, so that historical analyses can be reviewed in light of new attribution information.
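One way to meet this logging requirement is an append-only history from which cluster membership can be reconstructed at any point in time. The class and field names below are hypothetical, and an in-memory list stands in for whatever durable store a real tool would use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ClusterUpdate:
    """One immutable attribution decision."""
    cluster_id: str
    address: str
    evidence_basis: str
    recorded_at: datetime


@dataclass
class ClusterHistory:
    """Append-only log: updates are added, never edited or deleted."""
    updates: list = field(default_factory=list)

    def add_address(self, cluster_id: str, address: str, evidence_basis: str,
                    recorded_at: Optional[datetime] = None) -> ClusterUpdate:
        update = ClusterUpdate(
            cluster_id, address, evidence_basis,
            recorded_at or datetime.now(timezone.utc),
        )
        self.updates.append(update)
        return update

    def members_as_of(self, cluster_id: str, when: datetime) -> set:
        """Reconstruct the cluster as it was known at time `when`."""
        return {
            u.address for u in self.updates
            if u.cluster_id == cluster_id and u.recorded_at <= when
        }
```

Because earlier entries are never overwritten, an analysis produced on a given date can be re-run against the attribution known on that date and compared with the result under current attribution.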

Ethereum and EVM Account Model Requirements
Ethereum’s account model differs fundamentally from Bitcoin’s UTXO model: instead of tracking unspent outputs, it maintains account balances in a global state that updates with each transaction.
For custom Ethereum analysis tools, the parser must handle three types of transactions: native ETH transfers between externally owned accounts; token transfers (ERC-20, ERC-721, ERC-1155) emitted as events in smart contract execution; and internal transactions generated by smart contract interactions, which do not appear directly in the transaction list but only in the execution trace. Omitting internal transactions produces an incomplete picture of fund flows. Recovering them requires tracing RPC calls such as trace_block (exposed by Erigon and other clients that implement the trace namespace) or debug_traceTransaction (Geth’s debug namespace), and tracing historical transactions in practice depends on the historical state that archive nodes retain.
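To show what extracting internal transactions from an execution trace can look like, the sketch below walks a nested call tree of the shape returned by Geth's `debug_traceTransaction` with the built-in `callTracer` (fields `type`, `from`, `to`, `value`, `error`, `calls`). The function name and the choice of which call types count as value-moving are assumptions made for the example.

```python
# Call types treated as able to move ETH; DELEGATECALL and STATICCALL are skipped
# because they do not transfer value themselves (an assumption of this sketch).
VALUE_MOVING_TYPES = {"CALL", "CREATE", "CREATE2", "SELFDESTRUCT"}


def flatten_internal_transfers(frame: dict, depth: int = 0) -> list:
    """Return value-bearing internal calls from a callTracer-style frame."""
    # A failed frame reverts everything beneath it, so skip the whole subtree.
    if "error" in frame:
        return []

    transfers = []
    value_wei = int(frame.get("value", "0x0"), 16)
    # depth 0 is the external transaction itself, not an internal transfer.
    if depth > 0 and value_wei > 0 and frame.get("type") in VALUE_MOVING_TYPES:
        transfers.append({
            "from": frame["from"],
            "to": frame.get("to"),
            "value_wei": value_wei,
            "depth": depth,
        })

    for child in frame.get("calls", []):
        transfers.extend(flatten_internal_transfers(child, depth + 1))
    return transfers
```

Keeping the call depth in each record preserves enough structure to trace a given internal transfer back to its position in the original trace.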
DeFi protocol interactions add additional complexity. A single Uniswap swap may involve multiple token transfers across multiple contracts in a single transaction. The tool must parse event logs against the ABI (Application Binary Interface) of the relevant contract to extract meaningful transfer data from raw hexadecimal event data. According to Elliptic’s 2024 DeFi forensics report, the failure to correctly decode DeFi event logs is the most common cause of significant errors in Ethereum forensic analyses.
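The simplest case of decoding a raw event log is the standard ERC-20 `Transfer(address,address,uint256)` event, sketched below without any third-party library. The log layout follows the `eth_getLogs` response format (`address`, `topics`, `data`); the function name is hypothetical, and a production tool would decode against full contract ABIs rather than hard-coding one event.

```python
# keccak256("Transfer(address,address,uint256)"), the topic0 of ERC-20 transfers.
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def decode_erc20_transfer(log: dict):
    """Decode one raw log into a transfer record, or return None if it is not one."""
    topics = log["topics"]
    # ERC-20 indexes `from` and `to`, giving exactly three topics. ERC-721 uses
    # the same signature but also indexes tokenId, giving four topics.
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None

    data = log["data"]
    return {
        "token": log["address"],
        # Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes.
        "from": "0x" + topics[1][-40:],
        "to": "0x" + topics[2][-40:],
        # The unindexed amount is a 32-byte big-endian integer in `data`.
        "amount": int(data, 16) if data not in ("0x", "") else 0,
    }
```

The amount is returned in the token's smallest unit; converting it to a human-readable figure additionally needs the token's `decimals` value, which is not part of the log.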

Audit-Ready Output Design
Audit-ready outputs are the forensic-grade data products that custom on-chain analysis tools must produce to support investigations, compliance submissions, and court proceedings.
Every output must include: the raw source data with its provenance (node address, RPC method, timestamp, response hash); the processing steps applied to derive the output from the raw data; the analytical conclusions and the data supporting them; and a machine-readable audit trail that allows independent reconstruction of the analysis. Outputs should be exportable in standardised formats (JSON-LD, CSV, PDF) with embedded metadata.
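A provenance record covering the fields listed above (node address, RPC method, timestamp, response hash) might be built along these lines. The function and field names are illustrative assumptions, and SHA-256 over canonical JSON is one reasonable choice of response hash rather than a mandated standard.

```python
import hashlib
import json
from datetime import datetime, timezone


def _canonical_sha256(raw_response) -> str:
    """Hash a JSON-serialisable response in a key-order-independent form."""
    canonical = json.dumps(raw_response, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_provenance_record(node_address: str, rpc_method: str, rpc_params: list,
                            raw_response, retrieved_at: str = None) -> dict:
    """Wrap a raw node response with the metadata needed to re-verify it later."""
    return {
        "node_address": node_address,
        "rpc_method": rpc_method,
        "rpc_params": rpc_params,
        "retrieved_at": retrieved_at or datetime.now(timezone.utc).isoformat(),
        "response_sha256": _canonical_sha256(raw_response),
        "raw_response": raw_response,
    }


def verify_provenance_record(record: dict) -> bool:
    """Recompute the hash and confirm the stored raw data has not been altered."""
    return _canonical_sha256(record["raw_response"]) == record["response_sha256"]
```

Storing the raw response next to its hash lets a reviewer confirm that the data used in an analysis has not changed since retrieval, and re-issuing the recorded RPC call against an independent node checks it against the chain itself.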
For compliance use cases, outputs must be designed to satisfy the reporting requirements of the AML (Anti-Money Laundering) framework applicable to the institution, including suspicious activity report (SAR) export formats compatible with the UK’s National Crime Agency or US FinCEN. For litigation use cases, outputs must be structured to meet the chain of custody and integrity verification requirements described in the Forensic Science Regulator’s 2024 Codes of Practice.
Frequently Asked Questions
What is the minimum infrastructure needed for a custom on-chain analysis tool?
The minimum infrastructure for a forensic-grade custom tool is a full archive node for each blockchain being analysed, a graph database for address relationship storage, a relational database for transaction metadata, and an output layer producing audit-ready exports with data lineage. For Bitcoin, this means an unpruned Bitcoin Core node with the transaction index enabled; for Ethereum, a Geth or Erigon archive node. Running on cloud infrastructure, an archive node for each chain typically requires 2-8 TB of storage.
Can custom tools replace commercial platforms like Chainalysis?
Custom tools can replace commercial platforms for specific use cases, particularly where commercial tools have documented gaps – Monero tracing, specific DeFi protocol analysis, or high-volume real-time monitoring. However, commercial platforms offer attribution databases (linking addresses to known entities such as exchanges) that take years to build. Most organisations use custom tools alongside commercial platforms rather than as a complete replacement.
How are address clusters maintained and updated over time?
Address clusters should be stored in a graph database with a versioned schema that records the evidence basis for each clustering decision and timestamps for any updates. When new evidence is received (for example, a new exchange deposit address is confirmed), the cluster update must be logged without overwriting historical states. This allows retrospective analysis to be reviewed in light of updated attribution without compromising the integrity of the original analysis.
What programming languages are best suited for on-chain analysis tools?
Python is the most widely used language for on-chain analysis tools due to its extensive library ecosystem for data processing, graph analysis (NetworkX), and API interaction. For high-performance transaction processing, Rust and Go are preferred. For graph database integration, tools that support the Cypher query language (Neo4j) or Gremlin (AWS Neptune, JanusGraph) are most effective. Production forensic tools typically use a combination of languages for different components.
How should DeFi protocol interactions be parsed?
DeFi protocol interactions require parsing event logs using the ABI of the relevant contract. The tool must maintain an ABI registry for major DeFi protocols (Uniswap V2/V3, Aave, Compound, Curve), decode hexadecimal event data into human-readable transfer events, and reconstruct the fund flow across multiple contracts within a single transaction. For unknown contracts, bytecode analysis or community-sourced ABI databases can be used to identify event signatures.
What privacy coins can custom tools analyse?
Bitcoin-based privacy features (CoinJoin, PayJoin, Lightning Network) can be partially analysed using custom tools with appropriate heuristics. Monero is the most challenging because its ring signatures, stealth addresses, and confidential transactions (RingCT) together obscure sender, receiver, and amount. Partial deanonymisation of Monero is possible through timing analysis, chain-of-custody reconstruction at on-ramp and off-ramp points, and exchange KYC correlation, but on-chain tracing alone is insufficient. Zcash shielded transactions present similar challenges.
How are cross-chain bridge transactions handled?
Cross-chain bridge transactions require the tool to monitor bridge contract events on the source chain (a lock or burn event) and correlate them with corresponding events on the destination chain (a mint or release event) using bridge protocol-specific identifiers such as Wormhole’s sequence numbers or LayerZero’s nonce values. This requires simultaneous indexing of multiple chains and a cross-chain correlation database. Bridge forensics is a rapidly evolving field with significant tooling gaps in commercial platforms.
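The correlation step can be sketched as a keyed join between source-chain and destination-chain events. The event shape here is an assumption: `bridge` and `transfer_id` are hypothetical normalised fields that a real indexer would derive from each protocol's own identifiers, and those identifiers may need extra components (such as the emitting chain and contract) to be unique.

```python
def correlate_bridge_events(source_events: list, destination_events: list):
    """Pair lock/burn events with mint/release events sharing the same identifier.

    Returns (matched, unmatched), where `matched` is a list of (source, destination)
    pairs and `unmatched` holds source events with no destination counterpart yet.
    """
    destination_index = {
        (event["bridge"], event["transfer_id"]): event
        for event in destination_events
    }

    matched, unmatched = [], []
    for source in source_events:
        key = (source["bridge"], source["transfer_id"])
        destination = destination_index.get(key)
        if destination is None:
            # Not necessarily suspicious: the destination chain may lag behind.
            unmatched.append(source)
        else:
            matched.append((source, destination))
    return matched, unmatched
```

Unmatched source events are returned rather than discarded so they can be retried once the destination chain has been indexed further.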
What testing standards apply to forensic-grade custom tools?
Forensic-grade custom tools require validation testing against known transaction datasets with verified outcomes, regression testing to ensure updates do not alter results for historical data, and independent audit of the analysis methodology. For tools used in civil proceedings in England and Wales, the testing methodology should be documented and disclosed alongside the tool as part of the expert’s CPR Part 35 declaration. Tools that cannot demonstrate validated accuracy should not be used in legal or regulatory contexts.
Executive Summary
Building custom on-chain analysis tools requires archive node infrastructure, separate parsers for UTXO and account-model blockchains, a graph database backend capable of handling complex multi-hop address traversals, and output systems that produce fully auditable data exports with complete data lineage. Commercial platforms have significant gaps – particularly for DeFi forensics, privacy coins, and cross-chain bridge tracing – that well-designed custom tools can address. For investigations and compliance applications, audit-ready outputs are non-negotiable: tools that cannot demonstrate how their conclusions were derived from raw data create legal and regulatory risk rather than reducing it.
What Should You Do Next?
If your organisation needs on-chain analysis capabilities beyond what commercial platforms provide, Crypto Trace Labs designs and builds custom blockchain analytics tools for regulated institutions, law enforcement agencies, and compliance-critical applications.
The team at Crypto Trace Labs – ACAMS-accredited, MLRO-qualified across the UK, US, and EU, Chartered Fellow Grade at the CMI, with founding members from Blockchain.com, Kraken, and Coinbase – has built analytics infrastructure at scale for major exchanges and developed crypto asset recovery tooling used in live UK High Court proceedings. We offer no upfront charge for non-custodial wallet recoveries. Contact us to discuss your technical requirements.
About the Author
Crypto Trace Labs is a specialist crypto asset recovery and blockchain forensics firm founded by VP and Director-level executives formerly of Blockchain.com, Kraken, and Coinbase. Our team holds ACAMS accreditations, MLRO qualifications across the UK, US, and EU, and Chartered Fellow Grade status at the CMI. With over 10 years of experience in financial crime investigation and court-recognized blockchain forensics expertise, we have recovered 101 Bitcoin for clients in the last 12 months and delivered record fraud reduction for a $14bn crypto exchange. We work with law enforcement agencies, regulated financial institutions, and private clients on crypto asset recovery, blockchain forensics, AML compliance, and expert witness testimony – globally.
This content is for informational purposes only and does not constitute legal, financial, or compliance advice. Crypto asset recovery outcomes depend on specific circumstances, regulatory cooperation, and technical factors. Consult qualified professionals regarding your specific situation.


