title: Query execution and overlay merge
This document describes the single query execution pipeline in Fluree DB and how it combines:
- Indexed data (binary columnar indexes)
- Overlay data (novelty + staged flakes)
It also calls out where graph scoping (g_id) is applied so named graphs remain isolated.
Pipeline overview
flowchart TD
LedgerState -->|produces| LedgerSnapshot
LedgerSnapshot -->|shared substrate| GraphDb
GraphDb -->|single-ledger| QueryRunner
GraphDb -->|member_of| DataSetDb
DataSetDb -->|federated| QueryRunner
QueryRunner -->|scan index + merge overlay| DatasetOperator
DatasetOperator -->|per-graph| BinaryScanOperator
BinaryScanOperator -->|fast path| BinaryCursor
BinaryScanOperator -->|fallback| range_with_overlay
BinaryCursor -->|graph-scoped decode| BinaryGraphView
range_with_overlay -->|delegates| RangeProvider
Where this exists in code
-
API entrypoints
fluree-db-api/src/view/query.rs: single-ledgerGraphDbqueries (query)fluree-db-api/src/view/dataset_query.rs: dataset queries (DataSetDb)
-
Unified query runner
fluree-db-query/src/execute/runner.rsprepare_execution(db: GraphDbRef<'_>, query: &ExecutableQuery)builds derived facts/ontology (if enabled), rewrites patterns, and builds the operator tree.execute_prepared(...)runs the operator tree using anExecutionContext.
-
Dataset operator
fluree-db-query/src/dataset_operator.rsDatasetOperatorwraps every triple-pattern scan. In single-graph mode (the common case) it passes through to one innerBinaryScanOperatorwith negligible overhead. In multi-graph mode (FROM/FROM NAMED datasets) it fans out one inner operator per active graph, drives their lifecycles, and stamps ledger provenance (Binding::IriMatch) on results that span multiple ledgers.DatasetBuildertrait (factory pattern): the planner constructs aScanDatasetBuilderat plan time;DatasetOperatorcallsbuild()at execution time duringopen()to produce per-graphBinaryScanOperators.- Nested composition: inner operators can themselves be
DatasetOperators — provenance stamping passesIriMatchthrough unchanged.
-
Scan operators
fluree-db-query/src/binary_scan.rsBinaryScanOperatorhandles single-graph scanning only. Selects between binary cursor (streaming, integer-ID pipeline) and range fallback atopen()time based on theExecutionContext.
-
Range fallback
fluree-db-core/src/range.rs:range_with_overlay(snapshot, g_id, overlay, ...)fluree-db-core/src/range_provider.rs:RangeProvidertrait implemented by the binary range provider
Graph scoping (g_id)
Graph scoping is applied at two key boundaries:
- Binary streaming path:
BinaryCursoroperates on aBinaryGraphView(graph-scoped decode handle), ensuring leaf/leaflet decoding, predicate dictionaries, and specialty arenas are graph-isolated. - Range path:
range_with_overlay(snapshot, g_id, overlay, ...)passesg_idinto theRangeProvider, which routes the range query to the correct per-graph index segments.
Overlay providers are graph-scoped at the trait boundary: the overlay hook receives g_id and must only return flakes for that graph. This keeps multi-tenant named graphs isolated even when overlay data is sourced externally.
Overlay merge semantics (high level)
Both scan paths implement the same logical behavior:
- Read matching flakes from the indexed base (binary files)
- Read matching flakes from the overlay (novelty/staged)
- Merge them using
(t, op)semantics so retractions cancel assertions as-of the query time bound
The details differ:
BinaryScanOperatortranslates overlay flakes into integer-ID space and merges them into the decoded columnar stream.RangeScanOperatordelegates torange_with_overlay, which combinesRangeProvideroutput with overlay output.
Planner fast paths
Before building the generic operator tree, build_operator_tree_inner
(fluree-db-query/src/execute/operator_tree.rs) runs a chain of detect_*
shape recognizers. When one matches, it builds a specialized FastPathOperator
that captures the slow generic tree as a fallback; the operator returns
Ok(None) from its open()-time closure to defer to that fallback whenever its
runtime preconditions do not hold. The whole chain is disabled in History mode
(!planning.is_history()).
Fast paths choose one of two overlay strategies (fast_path_common.rs):
- strategy (a) — bail:
fast_path_store/allow_fast_pathdecline when there is uncommitted overlay (epoch != 0),to_t < max_t, multi-ledger, afrom_t, or non-root policy. Used by base-only micro-optimizations. - strategy (b) — merge:
allow_cursor_fast_pathadmits overlay and time-travel because the operator reads through an overlay-mergingBinaryCursor(build_psot_cursor_for_predicate/build_post_cursor_for_predicate) that folds novelty and honorsto_t.
Reverse-POST ORDER BY DESC(?o) LIMIT k
detect_post_order_desc_limit → fast_post_order_limit::post_order_desc_limit_operator.
Recognizes SELECT ?s ?o WHERE { ?s <p> ?o [ ; ?s a <Class> ] } ORDER BY DESC(?o) LIMIT k
(optional OFFSET/DISTINCT; DISTINCT requires ?o projected). The POST
index is sorted (p_id, o_type, o_key, o_i, s_id), so for a predicate whose
objects share one order-preserving o_type (numeric / temporal / boolean —
fast_path_common::is_post_desc_orderable), the physical tail of the predicate’s
POST range is exactly the DESC top-k. The operator walks POST leaf entries from
the tail, decodes only the rows it keeps, and stops after OFFSET + LIMIT
survivors — avoiding a full-predicate drain into the top-k SortOperator.
Correctness is enforced at runtime (the detector is shape-only):
- A directory prepass proves the predicate’s objects are a single
order-preserving
o_typebefore collecting. If they are not — dict-backed strings/refs (which sort above numerics/temporals), a mixed leaflet, or more than oneo_type— the operator bails to the generic top-k rather than emit a wrong order. (A full prepass, not a streaming check, is required because the tail walk stops afterOFFSET + LIMITrows.) - Profitability: for the
?s a <Class>shape,detect_post_order_desc_limitconsultsStatsViewand declines (defers to the generic plan) when the class is selective — i.e. the estimated tail rows to collectOFFSET + LIMITclass members (need / (class_count / ndv_subjects(p))) exceedsclass_count, so anchoring on the class directly is cheaper. The bare shape (no class) always wins; missing stats fall through to a runtime scan budget that bails after a bounded number of inspected rows. (Note the detector already rejects any other pattern, so an arbitrary selective filter like?s :employeeId 1234never reaches this path — it goes to the generic planner, which anchors on it.)
The gate is allow_cursor_fast_path + to_t >= index_t (deep time-travel, which
needs the history sidecar, defers). Two lanes split on whether novelty is present:
- Base lane (
epoch == 0,to_t == index_t): a plain reverse leaf-walk over the persisted index. Class membership usesbatched_lookup_predicate_refs(persisted is exact here). - Overlay lane (
epoch != 0): the same reverse leaf-walk is merged, in descending value order, with the predicate’s resolved novelty ops — a row-set merge (skip base rows retracted by overlay; add overlay asserts; dedup by the full V3 fact identity(s_id, o_key, o_i)within the single proveno_type—o_ikeeps repeated/list values distinct), which sidesteps the “base + asserts − retracts” arithmetic pitfall that only afflicts overlay-aware counting.rdf:typeclass membership is likewise evaluated overlay-correctly (persisted base ± novelty type asserts/retracts), and novelty-only subjects are materialized through a novelty-aware graph view (the persisted-dictencoded_sidform would not resolve them).