Distributed Queries
How queries are distributed and executed across nodes.
Query Execution Flow
graph TB
Q[Query] --> P[Query Planner]
P --> S[Shard Selection]
S --> N1[Node 1: Partition A]
S --> N2[Node 2: Partition B]
S --> N3[Node 3: Partition C]
N1 --> A[Aggregator]
N2 --> A
N3 --> A
A --> R[Final Result]
Query Types
| Type |
Distribution |
Example |
| Point |
Single node |
get_account(pubkey) |
| Range |
Multiple nodes |
get_transactions(start, end) |
| Scatter-Gather |
All nodes |
search(pattern) |
Query Planning
pub struct QueryPlan {
pub sub_queries: Vec<SubQuery>,
pub aggregation: AggregationType,
pub estimated_cost: u64,
}
impl QueryPlanner {
pub fn plan(&self, query: &Query) -> QueryPlan {
match query.query_type {
QueryType::PointLookup { key } => {
let node = self.ring.get_node(&key);
QueryPlan::single(node, query)
}
QueryType::RangeScan { start, end } => {
let nodes = self.ring.get_range_nodes(start, end);
QueryPlan::parallel(nodes, query)
}
QueryType::FullScan => {
QueryPlan::scatter_gather(self.all_nodes(), query)
}
}
}
}
Parallel Execution
pub async fn execute_parallel(&self, plan: QueryPlan) -> Result<QueryResult> {
let futures: Vec<_> = plan.sub_queries
.iter()
.map(|sq| self.execute_sub_query(sq))
.collect();
let results = futures::future::join_all(futures).await;
self.aggregate(results, plan.aggregation)
}
Result Aggregation
| Operation |
Method |
COUNT |
Sum partial counts |
SUM |
Sum partial sums |
AVG |
Weighted average |
MIN/MAX |
Compare partials |
LIMIT |
Merge and truncate |
- Use point queries when possible
- Add filters to reduce data transfer
- Limit result size with pagination
- Cache frequent queries client-side