Commit 1f3130a

docs: Adding details from life of a query training video
1 parent 1460768 commit 1f3130a

5 files changed: +67 -5 lines changed

packages/firestore/devdocs/architecture.md

Lines changed: 8 additions & 4 deletions
Original file line number | Diff line number | Diff line change
@@ -12,13 +12,17 @@ The SDK is composed of several key components that work together to provide the
1212
* **Core**:
1313
* **Event Manager**: Acts as a central hub for all eventing in the SDK. It is responsible for routing events between the API Layer and Sync Engine. It manages query listeners and is responsible for raising snapshot events, as well as handling connectivity changes and some query failures.
1414
* **Sync Engine**: The central controller of the SDK. It acts as the glue between the Event Manager, Local Store, and Remote Store. Its responsibilities include:
15-
* Coordinating client requests and remote events.
16-
* Managing a view for each query, which represents the unified view between the local and remote data stores.
15+
* Coordinating and translating client requests and remote events from the backend.
16+
* Initiating responses to user code from both remote events (backend updates) and local events (e.g. garbage collection).
17+
* Managing a "view" for each query, which represents the unified view between the local and remote data stores. The Sync Engine builds the user-facing "View" using the formula: `View = Remote Document + Overlay`. A **Remote Document** is the authoritative state from the backend. An **Overlay** is a computed "delta" representing pending local mutations. Overlays are calculated immediately when a mutation is applied and are persisted separately, which allows for zero-latency "Optimistic Updates."
18+
* Deciding whether a document is in a "limbo" state (e.g. its state is unknown) and needs to be fetched from the backend.
1719
* Notifying the Remote Store when the Local Store has new mutations that need to be sent to the backend.
1820
* **Local Store**: A container for the components that manage persisted and in-memory data.
19-
* **Remote Table**: A cache of the most recent version of documents as known by the Firestore backend.
21+
* **Remote Table**: A cache of the most recent version of documents as known by the Firestore backend (A.K.A. Remote Documents).
2022
* **Mutation Queue**: A queue of all the user-initiated writes (set, update, delete) that have not yet been acknowledged by the Firestore backend.
2123
* **Local View**: A cache that represents the user's current view of the data, combining the Remote Table with the Mutation Queue.
24+
* **Query Engine**: Determines the most efficient strategy (Index vs. Scan) to identify documents matching a query in the local cache.
25+
* **Overlays**: A performance-optimizing cache that stores the calculated effect of pending mutations from the Mutation Queue on documents. Instead of re-applying mutations every time a document is read, the SDK computes this "overlay" once and caches it, allowing the Local View to be constructed more efficiently.
2226
* **Remote Store**: The component responsible for all network communication with the Firestore backend. It manages the gRPC streams for reading and writing data, and it abstracts away the complexities of the network protocol from the rest of the SDK.
2327
* **Persistence Layer**: The underlying storage mechanism used by the Local Store to persist data on the client. In the browser, this is implemented using IndexedDB.
2428

@@ -77,7 +81,7 @@ Here's a step-by-step walkthrough of how data flows through the SDK for a write
7781
1. **API Layer**: A user attaches a listener to a query (e.g., `onSnapshot`).
7882
2. **Event Manager**: The Event Manager creates a listener and passes it to the Sync Engine.
7983
3. **Sync Engine**: The Sync Engine creates a "view" for the query.
80-
4. **Local View (in Local Store)**: The Sync Engine asks the Local Store for the current documents matching the query. This includes any optimistic local changes from the **Mutation Queue**.
84+
4. **Local View (in Local Store)**: The Sync Engine asks the Query Engine in the Local Store to execute the query. The Query Engine selects a strategy (e.g., Index Scan or Timestamp Optimization) to find matching keys. The Local Store then constructs the documents by applying cached Overlays on top of Remote Documents.
8185
5. **API Layer**: The initial data from the Local View is sent back to the user's `onSnapshot` callback. This provides a fast, initial result.
8286
6. **Remote Store**: Simultaneously, the Sync Engine instructs the Remote Store to listen to the query on the Firestore backend.
8387
7. **Backend**: The backend returns the initial matching documents for the query.
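
Step 4 above builds documents by applying cached Overlays on top of Remote Documents, following the formula `View = Remote Document + Overlay`. The following is a minimal TypeScript sketch of that idea only. The names `RemoteDoc`, `Overlay`, and `buildLocalView` are hypothetical and chosen for illustration; they are not the SDK's actual classes, and the real models carry versions, preconditions, and field transforms that are omitted here.

```typescript
// Hypothetical, simplified types; the real SDK models are richer.
type Fields = Record<string, unknown>;

interface RemoteDoc {
  key: string;    // e.g. "posts/p1"
  fields: Fields; // last state known from the backend
}

// An overlay is the precomputed effect of pending local writes on one document.
type Overlay =
  | { kind: "set"; fields: Fields }   // replace the whole document
  | { kind: "patch"; fields: Fields } // merge into the existing fields
  | { kind: "delete" };               // document is deleted locally

// Applies cached overlays on top of remote documents to build the local view.
function buildLocalView(
  remoteDocs: Map<string, RemoteDoc>,
  overlays: Map<string, Overlay>
): Map<string, Fields> {
  // Collect every key that has a remote document, an overlay, or both.
  const keys = Array.from(remoteDocs.keys());
  overlays.forEach((_overlay, key) => {
    if (!remoteDocs.has(key)) keys.push(key);
  });

  const view = new Map<string, Fields>();
  for (const key of keys) {
    const base = remoteDocs.get(key)?.fields;
    const overlay = overlays.get(key);
    if (!overlay) {
      // No pending writes: the remote document is shown unchanged.
      if (base) view.set(key, { ...base });
    } else if (overlay.kind === "set") {
      view.set(key, { ...overlay.fields });
    } else if (overlay.kind === "patch") {
      // Assumption: a patch on a missing document has nothing to merge into.
      if (base) view.set(key, { ...base, ...overlay.fields });
    }
    // kind === "delete": the document is left out of the view.
  }
  return view;
}
```

Because the remote documents are never modified, dropping an overlay once the backend acknowledges the write is enough to fall back to the authoritative state.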

packages/firestore/devdocs/code-layout.md

Lines changed: 6 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -6,7 +6,12 @@ This document explains the code layout in this repository. It is closely related
66
* `api/`: Implements the **API Layer** for the main SDK.
77
* `lite-api/`: Contains the entry point for the lite SDK.
88
* `core/`: Contains logic for the **Sync Engine** and **Event Manager**.
9-
* `local/`: Contains the logic for the **Local Store**, which includes the **Mutation Queue**, **Remote Table**, **Local View**, and the **Persistence Layer**.
9+
* `local/`: Contains the logic for the **Local Store**, which includes the **Mutation Queue**, **Remote Table**, **Local View**, **Overlays**, and the **Persistence Layer**.
10+
* `local_store.ts`: The main entry point for persistence operations.
11+
* `query_engine.ts`: Implements the strategy selection logic (Scan vs. Index).
12+
* `index_backfiller.ts`: The background task that updates Client-Side Indexes.
13+
* `remote_document_cache.ts`: Manages the `remote_documents` table (base truth).
14+
* `overlay_cache.ts`: Manages the cached overlays computed from pending mutations in the Mutation Queue.
1015
* `remote/`: Contains the logic for the **Remote Store**, handling all network communication.
1116
* `model/`: Defines the internal data models used throughout the SDK, such as `Document`, `DocumentKey`, and `Mutation`. These models are used to represent Firestore data and operations in a structured way.
1217
* `platform/`: Contains platform-specific code to abstract away the differences between the Node.js and browser environments. This includes things like networking, storage, and timers. This allows the core logic of the SDK to be platform-agnostic.

packages/firestore/devdocs/overview.md

Lines changed: 10 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -16,6 +16,16 @@ The primary goals of this SDK are:
1616
* Offer a lightweight version for applications that do not require advanced features.
1717
* Maintain API and architectural symmetry with the [Firestore Android SDK](https://github.com/firebase/firebase-android-sdk) and [Firestore iOS SDK](https://github.com/firebase/firebase-ios-sdk). This consistency simplifies maintenance and makes it easier to port features between platforms. The public API is intentionally consistent across platforms, even if it means being less idiomatic, to allow developers to more easily port their application code.
1818

19+
20+
## Key Concepts & Vocabulary
21+
22+
* **Query**: The client-side representation of a data request (filters, order bys).
23+
* **Target**: The backend's representation of a Query. The SDK allocates a unique integer `TargetID` for every unique query to manage the real-time stream.
24+
* **Mutation**: A user-initiated change (Set, Update, Delete). Mutations are queued locally and sent to the backend.
25+
* **Overlay**: The computed result of applying a Mutation to a Document. We store these to show "Optimistic Updates" instantly without modifying the underlying "Remote Document" until the server confirms the write.
26+
* **Limbo**: A state where a document exists locally and matches a query, but the server hasn't explicitly confirmed it belongs to the current snapshot version. The SDK must perform "Limbo Resolution" to ensure these documents are valid.
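
The relationship between a **Query** and a **Target** described above can be sketched as a small allocator that hands out one integer ID per unique query. This is an illustrative sketch under stated assumptions: `QuerySpec`, `canonicalId`, and `TargetAllocator` are hypothetical names, the canonical string format is invented for the example, and the starting ID and increment are arbitrary rather than the SDK's actual scheme.

```typescript
// Hypothetical, simplified query description.
interface QuerySpec {
  collection: string;
  filters: string[]; // e.g. ["status==published"]
  orderBy: string[]; // e.g. ["createdAt desc"]
}

// Builds a stable string so that equivalent queries map to the same target.
// Assumption: filter order is irrelevant (filters are ANDed), sort order is not.
function canonicalId(query: QuerySpec): string {
  const filters = query.filters.slice().sort().join(",");
  return `${query.collection}|f:${filters}|ob:${query.orderBy.join(",")}`;
}

class TargetAllocator {
  private nextId = 1; // arbitrary starting value for this sketch
  private readonly ids = new Map<string, number>();

  // Returns the existing TargetID for an equivalent query, or allocates a new one.
  allocate(query: QuerySpec): number {
    const id = canonicalId(query);
    const existing = this.ids.get(id);
    if (existing !== undefined) return existing;
    const allocated = this.nextId++;
    this.ids.set(id, allocated);
    return allocated;
  }
}
```

Keying on a canonical form means two listeners attached to equivalent queries can share a single backend stream target.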
27+
28+
1929
## Artifacts
2030

2131
The Firestore JavaScript SDK is divided into two main packages:
Lines changed: 38 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -0,0 +1,38 @@
1+
# Query Execution & Indexing
2+
3+
This document details how the Firestore SDK executes queries against the local cache. Understanding this is crucial for debugging performance issues and understanding offline behavior.
4+
5+
## The Query Engine
6+
7+
The **Query Engine** is the component within the **Local Store** responsible for finding the set of document keys that match a given query. It does not load the full document data; it only identifies the keys. It employs a hierarchy of strategies, ordered by efficiency:
8+
9+
1. **Full Index Scan (O(log N))**:
10+
* Used when a Client-Side Index (CSI) exists that covers all filters and sort orders of the query.
11+
* This is the most performant strategy.
12+
2. **Partial Index Scan**:
13+
* Used when an index covers some filters (typically equality filters like `where('status', '==', 'published')`).
14+
* The engine uses the index to narrow down the potential keys and then performs a scan on that smaller subset to verify the remaining conditions.
15+
3. **Index-Free (Timestamp) Optimization**:
16+
* **Concept**: If the client has been online and syncing, it knows the state of the collection up to a specific point in time (the `lastLimboFreeSnapshot`).
17+
* **Mechanism**: The SDK assumes the "base state" (documents matching at the last snapshot) is correct. It then only scans the `remote_documents` table for documents modified *after* that snapshot version.
18+
* This drastically reduces the work required for active listeners, changing the cost from *Collection Size* to *Recent Change Volume*.
19+
4. **Full Collection Scan (O(N))**:
20+
* The fallback strategy. The engine iterates through every document in the collection locally to check for matches.
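
The strategy hierarchy above can be sketched as a selection function plus a small model of the index-free optimization. This is a simplified illustration, not the logic in `query_engine.ts`: the names `chooseStrategy`, `indexFreeKeys`, `QueryShape`, `PlannerState`, and `VersionedDoc` are hypothetical, index coverage is reduced to a set-of-fields check, and the precomputed `matches` flag stands in for evaluating the query against a document.

```typescript
type Strategy = "full-index" | "partial-index" | "index-free" | "full-scan";

interface QueryShape {
  fields: string[]; // every field used by the query's filters and sort orders
}

interface PlannerState {
  indexes: string[][];                  // each index lists the fields it covers
  lastLimboFreeSnapshot: number | null; // null if no limbo-free snapshot exists yet
}

// Walks the hierarchy from most to least efficient strategy.
function chooseStrategy(query: QueryShape, state: PlannerState): Strategy {
  let bestCoverage = 0;
  for (const index of state.indexes) {
    const covered = query.fields.filter(f => index.indexOf(f) >= 0).length;
    bestCoverage = Math.max(bestCoverage, covered);
  }
  if (query.fields.length > 0 && bestCoverage === query.fields.length) return "full-index";
  if (bestCoverage > 0) return "partial-index";
  if (state.lastLimboFreeSnapshot !== null) return "index-free";
  return "full-scan";
}

interface VersionedDoc {
  key: string;
  version: number;  // version at which the document was last modified
  matches: boolean; // whether the current document state matches the query
}

// Index-free optimization: trust the keys that matched at the last limbo-free
// snapshot and re-check only documents modified after that snapshot version.
// A real store would look these up by version instead of filtering an array.
function indexFreeKeys(
  previousKeys: string[],
  docs: VersionedDoc[],
  snapshotVersion: number
): string[] {
  const result = new Set(previousKeys);
  for (const doc of docs) {
    if (doc.version <= snapshotVersion) continue; // unchanged since the snapshot
    if (doc.matches) {
      result.add(doc.key);
    } else {
      result.delete(doc.key);
    }
  }
  return Array.from(result).sort();
}
```

The work in `indexFreeKeys` grows with the number of documents changed since the snapshot, which is the shift from collection size to recent change volume described in strategy 3.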
21+
22+
## Client-Side Indexing (CSI)
23+
24+
To support efficient querying without blocking the main thread, the SDK utilizes a decoupled indexing architecture.
25+
26+
* **Structure**: Index entries are stored in a dedicated `index_entries` table. An entry maps field values (e.g., `(coll/doc1, fieldA=1)`) to a document key.
27+
* **The Index Backfiller**: Indexes are **not** updated synchronously when you write a document. Instead, a background task called the **Backfiller** runs periodically (when the SDK is idle). It reads new/modified documents and updates the index entries.
28+
* **Hybrid Lookup**: Because the Backfiller is asynchronous, the index might be "stale" (behind the document cache). To guarantee consistency, the Query Engine performs a **Hybrid Lookup**:
29+
1. Query the **Index** for results up to the `IndexOffset` (the point where the Backfiller stopped).
30+
2. Query the **Remote Document Cache** for any documents modified *after* the `IndexOffset`.
31+
3. Merge the results.
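
The three Hybrid Lookup steps above can be sketched for a single equality filter. This is a hedged illustration: `hybridLookup`, `SketchIndexEntry`, and `SketchCachedDoc` are hypothetical names, the offset is modeled as a plain number, and in-memory arrays stand in for the `index_entries` table and the Remote Document Cache.

```typescript
// Hypothetical in-memory stand-ins for the persisted tables.
interface SketchIndexEntry {
  key: string;   // document key, e.g. "coll/doc1"
  value: number; // indexed field value as of the last backfill
}

interface SketchCachedDoc {
  key: string;
  readTime: number;     // when this version entered the document cache
  value: number | null; // current field value; null if the field or document is gone
}

function hybridLookup(
  indexEntries: SketchIndexEntry[],
  docs: SketchCachedDoc[],
  indexOffset: number,
  target: number
): string[] {
  const keys = new Set<string>();

  // 1. Query the index; it reflects documents only up to indexOffset.
  for (const entry of indexEntries) {
    if (entry.value === target) keys.add(entry.key);
  }

  // 2. Check documents modified after the offset directly, since the index
  //    may be stale for them.
  for (const doc of docs) {
    if (doc.readTime <= indexOffset) continue;
    if (doc.value === target) {
      keys.add(doc.key);    // new match the Backfiller has not indexed yet
    } else {
      keys.delete(doc.key); // stale index entry for a document that changed
    }
  }

  // 3. The set now holds the merged result.
  return Array.from(keys).sort();
}
```

Step 2 both adds keys the index has not seen and removes keys whose index entries are out of date, which is what keeps the merged result consistent with the document cache.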
32+
33+
## Composite Queries (OR / IN)
34+
35+
Queries using `OR` or `IN` are not executed as a single monolithic scan. The SDK transforms these into **Disjunctive Normal Form (DNF)**—essentially breaking them into multiple sub-queries.
36+
37+
* **Execution**: Each sub-query is executed independently using the strategies above (Index vs. Scan).
38+
* **Union**: The resulting sets of document keys are unioned together in memory to produce the final result.
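
The DNF transformation and the union step above can be sketched as follows. This is a simplified model restricted to equality, `IN`, `AND`, and `OR`; the names `Filter`, `toDnf`, `runDisjunctive`, and `ScanDoc` are hypothetical, and a plain in-memory scan stands in for whichever strategy the Query Engine would pick for each sub-query.

```typescript
type Filter =
  | { op: "eq"; field: string; value: unknown }
  | { op: "in"; field: string; values: unknown[] }
  | { op: "and"; filters: Filter[] }
  | { op: "or"; filters: Filter[] };

type EqFilter = { field: string; value: unknown };
type Conjunction = EqFilter[]; // one sub-query: every filter must hold

// Rewrites a filter tree into a list of conjunctions (an OR of ANDs).
function toDnf(filter: Filter): Conjunction[] {
  switch (filter.op) {
    case "eq":
      return [[{ field: filter.field, value: filter.value }]];
    case "in": {
      // IN is treated as an OR of equality filters on the same field.
      const { field, values } = filter;
      return values.map(value => [{ field, value }]);
    }
    case "or":
      return filter.filters.reduce<Conjunction[]>(
        (acc, f) => acc.concat(toDnf(f)),
        []
      );
    case "and":
      // Distribute AND over OR by taking the cross product of the children.
      return filter.filters.reduce<Conjunction[]>((acc, f) => {
        const terms = toDnf(f);
        const out: Conjunction[] = [];
        for (const left of acc) {
          for (const right of terms) out.push(left.concat(right));
        }
        return out;
      }, [[]]);
  }
}

interface ScanDoc {
  key: string;
  data: Record<string, unknown>;
}

// Runs each sub-query independently and unions the matching keys in memory.
function runDisjunctive(filter: Filter, docs: ScanDoc[]): string[] {
  const keys = new Set<string>();
  for (const term of toDnf(filter)) {
    for (const doc of docs) {
      if (term.every(f => doc.data[f.field] === f.value)) keys.add(doc.key);
    }
  }
  return Array.from(keys).sort();
}
```

Using a set for the union means a document that satisfies several sub-queries still appears only once in the final result.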

packages/firestore/devdocs/testing.md

Lines changed: 5 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -9,6 +9,11 @@ This document provides a detailed explanation of the Firestore JavaScript SDK te
99
- Firebase emulator for local development
1010
- Integration testing with the backend
1111

12+
## Component Testing
13+
* **Query Engine Tests**: We rely on specific spec tests to ensure the Query Engine picks the correct strategy (e.g., verifying that it uses an Index when available rather than scanning).
14+
* **Integration Tests**: Use the Firebase Emulator to verify that "Limbo Resolution" correctly fetches documents when the local cache drifts from the server state.
15+
16+
1217
# Patterns and Practices
1318

1419
