Streaming SQL
Amp processes blockchain data as a continuous stream. When you define a derived table, Amp incrementally updates it as new blocks arrive. SQL used in derived table definitions must be incrementally updatable so Amp can append new rows without recomputing the entire dataset.
How Streaming Works
Section titled “How Streaming Works”Amp applies your SQL definition every time new chain data arrives:
- Reads new blocks/logs
- Runs your SQL against the new rows
- Appends results to the derived table
- Repeats for each new block
Important: Your SQL must work correctly when run incrementally. Operations that require seeing all data at once (like
GROUP BYwith aggregates) don’t work in this model.
Common Patterns for Streaming SQL
Section titled “Common Patterns for Streaming SQL”1. Pre-Filter High-Volume Tables
Section titled “1. Pre-Filter High-Volume Tables”Reduce a large dataset into something query-efficient.
usdc_transfers: { sql: ` SELECT * FROM anvil.logs WHERE address = '0xUSDC...' AND topic0 = '0xddf252ad...' `,}2. Enrich Events with Block Metadata
Section titled “2. Enrich Events with Block Metadata”Attach timestamps and context to events.
transfers_with_time: { sql: ` SELECT l.address, l.data, b.timestamp FROM anvil.logs l JOIN anvil.blocks b ON l.block_num = b.block_num `,}3. Combine Event Types with UNION ALL
Section titled “3. Combine Event Types with UNION ALL”all_value_transfers: { sql: ` SELECT block_num, from_addr, to_addr, value, 'eth' AS type FROM anvil.transactions WHERE value > 0
UNION ALL
SELECT block_num, from_addr, to_addr, value, 'token' AS type FROM "_/erc20@dev".transfers `,}4. Decode Logs Using UDFs
Section titled “4. Decode Logs Using UDFs”decoded_swaps: { sql: ` SELECT block_num, evm_decode_log(data, topics, 'event Swap(...)') AS decoded FROM anvil.logs WHERE topic0 = evm_topic('Swap(...)') `,}Testing Your SQL
Section titled “Testing Your SQL”1. Prototype in Amp Studio
Section titled “1. Prototype in Amp Studio”just studio2. Validate Build
Section titled “2. Validate Build”pnpm amp build -o /tmp/test-manifest.json3. Deploy and Query
Section titled “3. Deploy and Query”just downjust uppnpm amp query 'SELECT * FROM "_/dataset@dev".table LIMIT 5'Performance Tips
Section titled “Performance Tips”- Filter early
- Select only needed columns
- Use indexed columns (
address,block_num) - Move expensive operations (sorting, grouping, limiting) to query time
When to Use Derived Tables
Section titled “When to Use Derived Tables”Use derived tables when:
- You repeatedly query the same filtered/joined data
- You need sub-second query latency
- The transformation is streaming-compatible
When to Use Query-Time SQL
Section titled “When to Use Query-Time SQL”Use query-time SQL when:
- You need aggregates (COUNT, SUM, etc.)
- You need sorting or deduplication
- You’re exploring data ad-hoc
Examples From the Template
Section titled “Examples From the Template”The (Quickstart)[] includes additional working examples showing:
- Filtering dependency tables
- Joining blocks and transactions
- Using UDFs (e.g.
evm_decode_log) - Correct streaming-compatible SQL structure
You can view them here:
cat amp.config.extended-example.tsThese examples illustrate complete, real-world derived table definitions.
Need Help?
Section titled “Need Help?”- SQL syntax reference: DuckDB SQL
- Performance tuning: See Streaming SQL Reference