MES-696: Enable prefetching #33

Open
markovejnovic wants to merge 11 commits into main from marko/mes-696-lookup-pre-fetch-post-readdir

@markovejnovic
Collaborator

No description provided.

@mesa-dot-dev
mesa-dot-dev bot commented Feb 11, 2026

Mesa Description

This PR introduces a prefetching mechanism to improve the performance of directory listing operations. By proactively fetching metadata and populating the inode cache, these changes aim to reduce latency when browsing organizations and repositories.

The core changes include:

  • Asynchronous Filesystem Initialization: A new init() method has been added to the Fs trait, allowing for setup tasks to run after the filesystem is mounted but before serving requests. This is used to pre-warm the cache by fetching repository listings for all configured organizations at startup.
  • readdir-triggered Prefetching: The readdir implementations for organizations and repositories have been updated. When a directory is listed, a background task is now spawned to prefetch the metadata for all the child entries in that directory. For example, listing an organization will trigger prefetching for all of its repositories.
  • Refactored Async Cache: The internal AsyncICache has been refactored to be cheaply clonable via Arc, enabling it to be shared across asynchronous prefetching tasks. It now exposes prefetch and spawn_prefetch methods to handle concurrent cache population.
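The cheaply cloneable cache described in the last bullet can be sketched as follows. This is a minimal illustration, not the PR's code: `SharedCache`, `Meta`, and the `resolve` callback are hypothetical names, and a plain `std::thread` stands in for the async task the real `spawn_prefetch` presumably uses.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for cached inode metadata.
#[derive(Clone, Debug, PartialEq)]
pub struct Meta {
    pub name: String,
}

/// Cloning copies only the `Arc`, so every clone shares the same map.
#[derive(Clone, Default)]
pub struct SharedCache {
    inner: Arc<RwLock<HashMap<u64, Meta>>>,
}

impl SharedCache {
    /// Resolve each inode that is not cached yet and store the result.
    /// Inodes whose resolution fails (`None`) are simply skipped.
    pub fn prefetch<F>(&self, inos: &[u64], resolve: F)
    where
        F: Fn(u64) -> Option<Meta>,
    {
        for &ino in inos {
            if self.inner.read().unwrap().contains_key(&ino) {
                continue;
            }
            if let Some(meta) = resolve(ino) {
                self.inner.write().unwrap().entry(ino).or_insert(meta);
            }
        }
    }

    /// Run `prefetch` in the background against a clone of the cache.
    /// Assumption: a thread replaces the async task used in the PR.
    pub fn spawn_prefetch<F>(&self, inos: Vec<u64>, resolve: F) -> JoinHandle<()>
    where
        F: Fn(u64) -> Option<Meta> + Send + 'static,
    {
        let cache = self.clone();
        thread::spawn(move || cache.prefetch(&inos, resolve))
    }

    /// Return a copy of the cached metadata, if present.
    pub fn get(&self, ino: u64) -> Option<Meta> {
        self.inner.read().unwrap().get(&ino).cloned()
    }
}
```

Because the handle returned by `spawn_prefetch` can be dropped, a caller such as readdir could return immediately while the cache fills in the background.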

Description generated by Mesa.

@mesa-dot-dev mesa-dot-dev bot left a comment

Performed full review of b6f03ac...d893741

Analysis

  1. The new fire-and-forget prefetch design adds an uncoordinated background workload that could overwhelm the remote metadata service when processing large directory listings.

  2. The implementation lacks concurrency controls, rate limiting, or backpressure to cap the number of RPC calls issued to the resolver.

  3. The PR changes core caching behavior (making AsyncICache cloneable via Arc) but does not appear to account for resource usage when many prefetch tasks are spawned simultaneously.
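One way to address the concurrency concern is to cap the number of in-flight resolver calls. The sketch below is an assumption-laden illustration rather than a proposal for this codebase: `prefetch_bounded` and its `resolve` callback are hypothetical, and a fixed pool of `std::thread` workers stands in for whatever async limiter (for example a semaphore) the project would actually use.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::thread;

/// Resolve every item with at most `max_in_flight` calls running at once.
/// `resolve` returns `true` on success; the function returns the number
/// of items that resolved successfully.
pub fn prefetch_bounded<F>(items: Vec<u64>, max_in_flight: usize, resolve: F) -> usize
where
    F: Fn(u64) -> bool + Send + Sync + 'static,
{
    let queue = Arc::new(Mutex::new(items));
    let resolve = Arc::new(resolve);
    let ok = Arc::new(AtomicUsize::new(0));

    // The worker count is the concurrency limit (at least one worker).
    let workers: Vec<_> = (0..max_in_flight.max(1))
        .map(|_| {
            let (queue, resolve, ok) = (queue.clone(), resolve.clone(), ok.clone());
            thread::spawn(move || loop {
                // Take one item; the lock is released before resolving.
                let next = queue.lock().unwrap().pop();
                match next {
                    Some(ino) => {
                        if (*resolve)(ino) {
                            ok.fetch_add(1, Ordering::Relaxed);
                        }
                    }
                    None => break,
                }
            })
        })
        .collect();

    for worker in workers {
        worker.join().unwrap();
    }
    ok.load(Ordering::Relaxed)
}
```

With a limit in place, a directory with thousands of entries produces a steady trickle of RPCs instead of one burst, at the cost of a longer total prefetch time.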

0 files reviewed | 1 comment

Replace evict_zero_rc_children calls with evict_stale_children in readdir
paths. The new method only evicts rc=0 children whose names are NOT in
the current directory listing, preserving prefetched grandchild inodes.

All production callers now use evict_stale_children, which preserves
prefetched children. Update evict_cleans_child_index test to use
evict_stale_children with an empty set instead.
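The eviction rule described above (evict only rc=0 children whose names are missing from the current listing) might look roughly like this. The `ChildIndex` type, its fields, and the `insert`/`get` helpers are hypothetical; only the method name `evict_stale_children` comes from the commit message, and its real signature may differ.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical child index: (parent inode, child name) -> child inode.
#[derive(Default)]
pub struct ChildIndex {
    children: HashMap<(u64, String), u64>,
    /// Lookup reference count per inode; a missing entry counts as rc = 0.
    rc: HashMap<u64, u64>,
}

impl ChildIndex {
    /// Register a child with a given reference count.
    pub fn insert(&mut self, parent: u64, name: &str, ino: u64, rc: u64) {
        self.children.insert((parent, name.to_string()), ino);
        self.rc.insert(ino, rc);
    }

    /// Look up a child inode by parent and name.
    pub fn get(&self, parent: u64, name: &str) -> Option<u64> {
        self.children.get(&(parent, name.to_string())).copied()
    }

    /// Evict children of `parent` that have rc = 0 AND whose names are
    /// absent from `current`. Returns the number of evicted entries.
    pub fn evict_stale_children(&mut self, parent: u64, current: &HashSet<String>) -> usize {
        let before = self.children.len();
        let rc = &self.rc;
        self.children.retain(|(p, name), ino| {
            let stale = *p == parent
                && rc.get(&*ino).copied().unwrap_or(0) == 0
                && !current.contains(name);
            !stale
        });
        before - self.children.len()
    }
}
```

Passing an empty set makes every rc=0 child of the parent stale, which matches how the updated evict_cleans_child_index test is described.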

Add trace logging to ensure_child_ino (cache hit vs allocation),
evict_stale_children (eviction counts), forget (child_index cleanup),
and readdir eviction results in both RepoFs and CompositeFs.

Two fixes for prefetch reliability:

1. ensure_child_ino now validates that the cached inode still exists in
   the inode_table before returning it. If a failed prefetch (rc=0)
   caused get_or_resolve to evict the entry from inode_table without
   cleaning up child_index, the stale mapping is removed and a fresh
   inode is allocated. This prevents cascading lookup failures.

2. Prefetch failures are now logged at warn! level (was trace!) so
   they're visible with normal log settings. Includes the error message
   for easier debugging.
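The first fix can be illustrated with a small sketch. The `Inodes` struct, its fields, and `evict_from_table_only` are hypothetical simplifications; the real `ensure_child_ino` works against the project's own cache types and may allocate inodes differently.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified inode bookkeeping.
#[derive(Default)]
pub struct Inodes {
    /// Live inodes: inode -> name.
    inode_table: HashMap<u64, String>,
    /// (parent inode, child name) -> child inode.
    child_index: HashMap<(u64, String), u64>,
    next_ino: u64,
}

impl Inodes {
    /// Return the child's inode, allocating a fresh one when there is no
    /// mapping or when the mapping points at an inode that is gone.
    pub fn ensure_child_ino(&mut self, parent: u64, name: &str) -> u64 {
        let key = (parent, name.to_string());
        if let Some(&ino) = self.child_index.get(&key) {
            if self.inode_table.contains_key(&ino) {
                return ino; // cache hit backed by a live inode
            }
            // Stale mapping: the inode was evicted from inode_table but
            // child_index was never cleaned up, so drop the mapping.
            self.child_index.remove(&key);
        }
        // Allocation path (assumption: a simple counter).
        self.next_ino += 1;
        let ino = self.next_ino;
        self.inode_table.insert(ino, name.to_string());
        self.child_index.insert(key, ino);
        ino
    }

    /// Simulate a failed prefetch that evicts from inode_table only.
    pub fn evict_from_table_only(&mut self, ino: u64) {
        self.inode_table.remove(&ino);
    }
}
```

Without the `contains_key` check, the stale inode would be returned forever and every later lookup on it would fail, which is the cascading failure the commit message describes.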

Switch the entire Fs stack from &mut self to &self so FuserAdapter can
spawn a tokio task per FUSE request instead of blocking the session loop
with block_on. This unblocks concurrent request processing: a slow API
call no longer stalls all other FUSE operations.

Key changes:
- Fs trait: &self, Send+Sync+'static bounds, readdir returns Vec<DirEntry>
- HashMapBridge: wrap BiMaps in std::sync::RwLock for interior mutability
- CompositeFs: slots in tokio::sync::RwLock, inode maps in scc::HashMap
- RepoFs/OrgFs/MesaFS: all mutable state behind concurrent containers
- FuserAdapter: fs wrapped in Arc<F>, each method spawns instead of block_on
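The shape of this change can be sketched in miniature. Everything here is illustrative: the one-method `Fs` trait, `MemFs`, and `Adapter` are hypothetical stand-ins for the real Fs trait, filesystem types, and FuserAdapter, and `std::thread::spawn` replaces the tokio task mentioned in the commit message.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Simplified trait: methods take `&self`, and the bounds let an
/// implementation be shared across tasks.
pub trait Fs: Send + Sync + 'static {
    fn lookup(&self, name: &str) -> u64;
}

/// Mutable state lives behind a lock, so `&self` is enough to mutate it.
#[derive(Default)]
pub struct MemFs {
    inodes: RwLock<HashMap<String, u64>>,
}

impl Fs for MemFs {
    fn lookup(&self, name: &str) -> u64 {
        let mut map = self.inodes.write().unwrap();
        let next = map.len() as u64 + 1;
        *map.entry(name.to_string()).or_insert(next)
    }
}

/// Adapter that owns the filesystem in an `Arc` and handles each
/// request on its own task instead of blocking the caller.
pub struct Adapter<F: Fs> {
    fs: Arc<F>,
}

impl<F: Fs> Adapter<F> {
    pub fn new(fs: F) -> Self {
        Self { fs: Arc::new(fs) }
    }

    /// Spawn one task per request; a slow lookup does not stall others.
    pub fn handle_lookup(&self, name: String) -> JoinHandle<u64> {
        let fs = Arc::clone(&self.fs);
        thread::spawn(move || fs.lookup(&name))
    }
}
```

The trade-off is that every piece of mutable state now needs a concurrent container or lock, which is why the commit touches HashMapBridge, CompositeFs, and the individual filesystems together.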