
Concepts

This page offers an overview of Amp, covering its core concepts, system architecture, and infrastructure design. It acts as a reference for developers and data teams seeking to understand how Amp processes, organizes, and delivers blockchain data for indexing, analytics, and enterprise use.

Amp is a high-performance solution built for data-heavy teams that need full control of their blockchain data.

Traditional indexing follows Extract, Transform, Load (ETL) principles; Amp's architecture instead supports extraction and transformation directly in SQL, creating a consistent, simple workflow.

Its ELT (Extract, Load, Transform) architecture enables faster data access and flexible analytics, while verifying data correctness at the time of extraction. As a complete data warehouse, Amp manages the entire blockchain data lifecycle to meet modern enterprise requirements.

Stage       Description
Extract     Pulls data directly from Ethereum JSON-RPC nodes
Transform   Processes data using SQL queries and blockchain-specific functions
Store       Saves columnar data in optimized Apache Parquet format
Serve       Provides query-ready data through production-ready interfaces, including SQL, REST, GraphQL, Snowflake, and BI tools

Note: The familiar SQL syntax removes the need for custom parsers or blockchain-specific code. Object storage can be local or provided by cloud services such as AWS or GCS.
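
As a rough illustration of this pipeline, the sketch below registers a Parquet extract and expresses the transform step as plain SQL through Apache DataFusion's Python bindings. The file path, table name, column names, and the Transfer-event filter are illustrative assumptions rather than Amp's actual schema or built-in functions.

```python
# Minimal ELT sketch: extracted logs already sit in columnar Parquet;
# the transform step is ordinary SQL executed by Apache DataFusion.
# Paths, table and column names are assumptions, not Amp's real schema.
from datafusion import SessionContext

ctx = SessionContext()

# Load: register the raw extract as a queryable table.
ctx.register_parquet("logs", "data/eth_logs.parquet")

# Transform: count ERC-20 Transfer events per block
# (topic0 is the keccak-256 hash of Transfer(address,address,uint256)).
df = ctx.sql("""
    SELECT block_number,
           COUNT(*) AS transfer_count
    FROM logs
    WHERE topic0 = '0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef'
    GROUP BY block_number
    ORDER BY block_number
""")

# Serve: results arrive as Apache Arrow record batches, ready for pandas,
# notebooks, or downstream APIs.
batches = df.collect()
```

In Amp itself, the Serve stage exposes equivalent results through SQL, REST, GraphQL, Snowflake, or BI interfaces rather than a local DataFusion session.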

  • Parallel data extraction with configurable worker pools
  • Columnar storage optimized for large-scale analytics
  • Streamed query execution for massive datasets (a streamed Arrow read is sketched below)
  • Apache Arrow integration for efficient data transfer
  • Full SQL support using Apache DataFusion
  • Custom blockchain-specific SQL functions (e.g., decode logs, call contracts)
  • Handles complex analytical queries efficiently
  • Supports both real-time and historical data access
  • EVM RPC: Direct connection to Ethereum-compatible nodes
  • Batch requests: Optimized batching for extraction efficiency (the batching pattern is sketched below)
  • Multiple endpoints: Support for fallback RPC endpoints
  • Custom chains: Compatible with any EVM-based blockchain
  • Client libraries for Python and TypeScript
  • Interactive notebooks (Marimo, Jupyter)
  • REST and gRPC APIs for flexible integration (a hedged REST query sketch appears below)
  • Built-in monitoring and observability via Grafana
  • Lineage & auditability: Tracks every step of data extraction and transformation for verification and compliance.
  • Purpose-built: Designed for co-located or enterprise environments, including regulated industries.
  • Extensible multichain access: Unified APIs and a shared data lake for querying across multiple blockchains.
  • Local-first development: Ability to index and query events in seconds on any laptop.
  • Auto-modeling: Automatically generates schemas and datasets from smart contracts.
  • Dataset publishing & composability: Datasets that are discoverable, shareable, and capable of being remixed.
  • Production-grade APIs: Support for SQL (streaming and batch), REST, and GraphQL (batch).
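
The extraction bullets above refer to batched JSON-RPC requests against EVM nodes. A minimal sketch of that pattern, many eth_getBlockByNumber calls in a single HTTP round trip, is shown here; the endpoint URL and block range are placeholders, and Amp's extractor layers configurable worker pools and fallback endpoints on top of this basic idea.

```python
# Minimal JSON-RPC batch request against an EVM node. The endpoint URL and
# block range are placeholders; this only illustrates the batching pattern
# that a production extractor builds on.
import requests

RPC_URL = "https://eth.example.com"  # placeholder EVM RPC endpoint

# One HTTP round trip carrying many eth_getBlockByNumber calls.
batch = [
    {
        "jsonrpc": "2.0",
        "id": n,
        "method": "eth_getBlockByNumber",
        "params": [hex(n), False],  # False = block header only, no tx bodies
    }
    for n in range(19_000_000, 19_000_100)
]

responses = requests.post(RPC_URL, json=batch, timeout=30).json()
blocks = {int(r["result"]["number"], 16): r["result"]
          for r in responses if r.get("result")}
print(f"fetched {len(blocks)} block headers in one batched request")
```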
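
The storage bullets above mention columnar Parquet files, streamed execution, and Apache Arrow. The sketch below shows the general idea of scanning a large Parquet extract as bounded Arrow record batches instead of loading it whole; the file name, batch size, and column names are assumptions, and Amp's query engine handles this internally through DataFusion.

```python
# Streamed, columnar read over a large Parquet extract using Apache Arrow.
# File name, batch size, and column names are illustrative assumptions.
import pyarrow.parquet as pq

parquet_file = pq.ParquetFile("data/eth_logs.parquet")

total_rows = 0
# iter_batches yields bounded Arrow RecordBatches, so memory stays flat
# even when the dataset is far larger than RAM.
for batch in parquet_file.iter_batches(batch_size=64_000,
                                       columns=["block_number", "address"]):
    total_rows += batch.num_rows

print(f"scanned {total_rows} log rows in bounded-memory batches")
```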
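
For the client-facing interfaces, a hedged sketch of what a REST-style SQL query could look like from Python follows. The endpoint path, payload shape, and header names are assumptions made purely for illustration, not Amp's documented API.

```python
# Hypothetical REST query sketch: endpoint path, payload shape, and header
# names are illustrative assumptions, not Amp's documented API.
import requests

AMP_URL = "https://amp.example.com/api/sql"  # placeholder endpoint
API_KEY = "replace-with-your-key"            # placeholder credential

response = requests.post(
    AMP_URL,
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"query": "SELECT block_number, transfer_count "
                   "FROM transfers_per_block LIMIT 10"},
    timeout=30,
)
response.raise_for_status()
for row in response.json().get("rows", []):
    print(row)
```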