Local Caching #37
Conversation
Mesa Description

This PR introduces a local caching layer for the filesystem to improve performance and reduce latency for cloud operations. The caching layer acts as a "write-through" cache, meaning that write operations update both the local cache and the backing cloud storage. This ensures data consistency while allowing subsequent read operations to be served directly from the faster local cache. Key changes include:

- LocalFs::read_cached_file now opens the file and seeks to offset instead of reading the entire file into memory then slicing.
- WriteThroughFs::read only populates the cache when offset == 0 to avoid caching a partial slice under the file's path.
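A minimal sketch of the seek-based read described in the first bullet, assuming a plain `std::fs` implementation; the actual `LocalFs::read_cached_file` signature and error handling in this PR will differ:

```rust
use std::io::{Read, Seek, SeekFrom};
use std::path::Path;

// Sketch only: read up to `size` bytes starting at `offset` without
// loading the whole file. `Read::take` bounds the read and stops
// cleanly at EOF, so a read past the end returns a short buffer.
fn read_cached_file(path: &Path, offset: u64, size: u32) -> std::io::Result<Vec<u8>> {
    let mut file = std::fs::File::open(path)?;
    file.seek(SeekFrom::Start(offset))?;
    let mut buf = Vec::with_capacity(size as usize);
    file.take(size as u64).read_to_end(&mut buf)?;
    Ok(buf)
}
```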
Description generated by Mesa.
Performed full review of f6fed08...e0ff702
Analysis
- Cache correctness issue: The background cache population mechanism must ensure only complete files are stored, which may require careful synchronization or verification.
- Eviction coordination risk: There is a potential for misalignment between CacheTracker and AsyncICache during cache eviction, which could lead to orphaned entries or reference inconsistencies; a hedged sketch of one coordination approach follows this list.
- Reproducibility concern: The reusable workflow action needs properly templated inputs to maintain release reproducibility across different environments.
- Architecture complexity: The shift to a pluggable caching architecture introduces additional abstraction layers and composition points that may increase debugging difficulty.
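A hypothetical illustration of the eviction-coordination point above. CacheTracker and AsyncICache are names from this PR, but their real APIs are not shown here; the sketch uses stand-in structures and simply demonstrates evicting from both under one lock so neither can hold an orphaned entry for a key the other has already dropped:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

// Stand-ins for the PR's CacheTracker and AsyncICache contents.
struct CacheState {
    tracker: HashSet<String>,
    cache: HashMap<String, Vec<u8>>,
}

fn evict(state: &Mutex<CacheState>, key: &str) {
    let mut s = state.lock().unwrap();
    // Removing from both structures inside one critical section keeps
    // them aligned; evicting them separately risks a reader observing
    // a tracked key with no cached bytes, or vice versa.
    s.tracker.remove(key);
    s.cache.remove(key);
}
```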
Review comment on src/fs/write_through.rs:

```rust
        .read(ino, fh, offset, size, flags, lock_owner)
        .await?;

    // Background: populate cache only for full-file reads (offset == 0) to avoid
```
Populating the cache whenever offset == 0 still caches whatever slice the backend returned for that first read. The kernel almost always issues the first read at offset 0 with a fixed chunk size (e.g. 128 KB), so we end up writing just that chunk to the cache and then advertising it as the full file. Subsequent reads at higher offsets (https://github.com/mesa-dot-dev/git-fs/blob/HEAD/src/fs/write_through.rs#L172-L177) will hit the truncated cache entry, get an empty slice, and stop reading early, so large files appear to be cut off at the first chunk once cached.
We should only populate the cache when we know we have the full file (e.g. compare the returned length against the file’s actual size, or buffer until EOF) instead of caching every offset‑0 read.
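A minimal sketch of the suggested length check, assuming the file size is available from the inode's attributes; the `cache.put` call in the usage note is a stand-in, since the real AsyncICache API in this PR may differ:

```rust
// Sketch only: decide whether an offset-0 read may be cached as the
// complete file. `file_size` would come from the inode's attributes.
fn covers_whole_file(offset: i64, data: &[u8], file_size: u64) -> bool {
    // Caching requires both conditions: the read started at the beginning
    // AND the backend returned every byte of the file. A 128 KB first
    // chunk of a larger file fails the second check and is never cached.
    offset == 0 && data.len() as u64 == file_size
}
```

Usage at the population site (pseudo-context, names assumed):

```rust
// if covers_whole_file(offset, &data, attr.size) {
//     cache.put(path, data.clone()).await;
// }
```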
Prompt for Agent
Task: Address review feedback left on GitHub.
Repository: mesa-dot-dev/git-fs#37
File: src/fs/write_through.rs#L185
Action: Open this file location in your editor, inspect the highlighted code, and resolve the issue described below.
Feedback:
Populating the cache whenever `offset == 0` still caches whatever *slice* the backend returned for that first read. The kernel almost always issues the first read at offset 0 with a fixed chunk size (e.g. 128 KB), so we end up writing just that chunk to the cache and then advertising it as the full file. Subsequent reads at higher offsets (<https://github.com/mesa-dot-dev/git-fs/blob/HEAD/src/fs/write_through.rs#L172-L177>) will hit the truncated cache entry, get an empty slice, and stop reading early, so large files appear to be cut off at the first chunk once cached.
We should only populate the cache when we know we have the full file (e.g. compare the returned length against the file’s actual size, or buffer until EOF) instead of caching every offset‑0 read.