Implement stale while revalidate (SWR) cache strategy #35
Conversation
@NoahFisher I still have to deal with some rubocop offenses and do one final review of the test coverage, but I could use some feedback on this draft. I also have one open question to double-check in the PR description.
Add three new configuration options to support SWR caching:

- cache_stale_while_revalidate: Enable/disable SWR (default: false)
- cache_stale_ttl: Grace period for serving stale data (default: 300s)
- cache_refresh_threads: Background thread pool size (default: 5)

SWR caching requires a Rails cache backend and will be validated during configuration. This enables serving stale data instantly while refreshing in the background for better performance.
Restructure CacheEntry to support the stale-while-revalidate pattern:

- Replace single expires_at with fresh_until and stale_until
- Add fresh? method: entry is fresh and can be served immediately
- Add stale? method: entry is stale but usable (revalidate in background)
- Update expired? method: entry must be revalidated synchronously

This three-state model enables SWR caching where stale entries can be served instantly while a background refresh occurs.
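For illustration, the three-state entry described in this commit can be sketched as follows. This is a minimal sketch assuming the names from the commit message (fresh_until, stale_until); the gem's actual implementation may differ.

```ruby
# Three-state cache entry: FRESH -> STALE -> EXPIRED as time passes.
CacheEntry = Struct.new(:value, :fresh_until, :stale_until) do
  # FRESH: can be served immediately, no refresh needed
  def fresh?(now = Time.now)
    now < fresh_until
  end

  # STALE: still usable, but should trigger a background revalidation
  def stale?(now = Time.now)
    now >= fresh_until && now < stale_until
  end

  # EXPIRED: must be revalidated synchronously before serving
  def expired?(now = Time.now)
    now >= stale_until
  end
end
```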
Add full SWR caching support with concurrent background refresh:

- fetch_with_stale_while_revalidate: Main SWR fetch method
- Three-state handling: FRESH (instant), STALE (instant + refresh), MISS (sync fetch)
- Concurrent thread pool for background refresh operations
- Stampede protection during background refresh
- Logger integration for debugging cache states
- Graceful shutdown of thread pool

The adapter now accepts stale_ttl and refresh_threads parameters. When stale_ttl is set, SWR is enabled. When nil, falls back to standard fetch_with_lock behavior. This provides instant responses for 99% of requests by serving stale data while refreshing in the background.
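The three-state flow can be sketched with a plain in-memory store. Note this is a dependency-free illustration: the PR uses a concurrent-ruby thread pool and a Rails cache backend, while this sketch uses a bare Thread and a Hash, and all method names are assumptions, not the gem's API.

```ruby
# Illustrative SWR fetch: FRESH hits return instantly, STALE hits return
# instantly while a background thread refreshes, MISSes fetch synchronously.
class SwrSketch
  def initialize(ttl: 60, stale_ttl: 300)
    @ttl = ttl
    @stale_ttl = stale_ttl
    @store = {}
    @mutex = Mutex.new
    @refreshers = []
  end

  def fetch(key, &block)
    entry = @mutex.synchronize { @store[key] }
    now = Time.now
    if entry && now < entry[:fresh_until]
      entry[:value]                                       # FRESH: instant hit
    elsif entry && now < entry[:stale_until]
      @refreshers << Thread.new { write(key, block.call) } # STALE: refresh...
      entry[:value]                                        # ...but serve instantly
    else
      write(key, block.call)                              # MISS: synchronous fetch
    end
  end

  # Graceful shutdown: wait for in-flight background refreshes to finish.
  def shutdown
    @refreshers.each(&:join)
  end

  private

  def write(key, value)
    now = Time.now
    @mutex.synchronize do
      @store[key] = { value: value,
                      fresh_until: now + @ttl,
                      stale_until: now + @ttl + @stale_ttl }
    end
    value
  end
end
```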
Update Client to pass SWR configuration options to RailsCacheAdapter:

- Pass stale_ttl (only when cache_stale_while_revalidate is enabled)
- Pass refresh_threads for background thread pool size
- Pass logger for debugging and error reporting

Extract create_rails_cache_adapter method for clarity. When SWR is disabled, stale_ttl is nil, keeping standard behavior. Tests verify correct configuration propagation and thread pool creation based on SWR settings.
Refactor get_prompt to choose the optimal caching strategy based on available cache capabilities:

1. SWR cache (fetch_with_stale_while_revalidate) - best performance
2. Distributed cache (fetch_with_lock) - stampede protection
3. Simple cache (get/set) - basic in-memory caching
4. No cache - direct API fetch

Extract separate methods for each strategy for better maintainability and testability. The strategy is selected at runtime based on which methods the cache adapter responds to. This enables automatic use of SWR when available without breaking existing cache implementations.
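The four-way selection described in this commit can be sketched with respond_to? duck typing. The helper shape below is an assumption; the gem extracts separate methods per strategy rather than one conditional.

```ruby
# Pick the best available strategy based on the cache's capabilities.
def fetch_prompt(cache, key, &fetch)
  if cache.nil?
    fetch.call                                            # 4. no cache: direct API fetch
  elsif cache.respond_to?(:fetch_with_stale_while_revalidate)
    cache.fetch_with_stale_while_revalidate(key, &fetch)  # 1. SWR: best performance
  elsif cache.respond_to?(:fetch_with_lock)
    cache.fetch_with_lock(key, &fetch)                    # 2. stampede protection
  else
    cache.get(key) || cache.set(key, fetch.call)          # 3. simple get/set
  end
end
```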
This was the pattern already in place. This commit simply expands the scope to cover some of the new spec examples; however, it would be interesting to discuss whether we need to allow a larger number of memoized helpers or whether some refactoring is needed for these specs.
Add spec examples to verify that SWR (Stale-While-Revalidate) is correctly enabled only when stale_ttl is positive, and disabled for zero or negative values. Adds thread pool initialization coverage for both cache adapters. All 785 tests pass with 97.36% coverage maintained.
@NoahFisher, @kxzk I think this is in a good spot for some review. Let me know if there are any concerns or change suggestions you'd like me to address. Thanks.
My last commit is disabling Rubocop's
Will try and review today. |
kxzk
left a comment
A few comments, but overall it looks solid.
@kxzk thanks for all the feedback. I'll address it next.
Consolidates all cache locking mechanisms to use a single, simplified API:

- Unified lock keys: Both fetch and refresh operations use the same lock
- Single timeout: All locks use lock_timeout (10s) instead of separate values
- Simplified method names: build_lock_key, acquire_lock, release_lock

Changes:
- StaleWhileRevalidate module now defines a single set of abstract lock methods
- RailsCacheAdapter implements distributed locking with Rails.cache
- PromptCache implements in-memory thread-safe locking
- All tests updated to reflect the unified lock key format (:lock suffix)

Benefits:
- Simpler, more maintainable code
- Single lock prevents duplicate work for both fetch and refresh scenarios
- Clearer intent and easier to reason about
- Reduced code duplication

All tests passing (785 examples, 97.58% coverage)
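A minimal in-memory sketch of the unified lock API named in this commit (build_lock_key / acquire_lock / release_lock). The distributed Rails.cache variant is not shown, and the class name is illustrative.

```ruby
# Thread-safe in-memory locks with the unified :lock key suffix.
class InMemoryLocks
  def initialize
    @mutex = Mutex.new
    @locks = {}
  end

  # Unified key format: the same :lock suffix for fetch and refresh.
  def build_lock_key(key)
    "#{key}:lock"
  end

  # Returns true if the lock was acquired, false if it is already held.
  def acquire_lock(key)
    lock_key = build_lock_key(key)
    @mutex.synchronize do
      return false if @locks[lock_key]
      @locks[lock_key] = true
    end
  end

  def release_lock(key)
    @mutex.synchronize { @locks.delete(build_lock_key(key)) }
  end
end
```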
Resolves TODO by moving fetch_with_lock out of StaleWhileRevalidate module and into RailsCacheAdapter where it belongs.

Rationale:
- StaleWhileRevalidate module is specifically for SWR functionality
- fetch_with_lock is for non-SWR scenarios (when stale_ttl is 0)
- Only RailsCacheAdapter needs distributed stampede protection
- PromptCache uses in-memory locking, not suitable for fetch_with_lock

Changes:
- Moved fetch_with_lock from module to RailsCacheAdapter (public method)
- Moved wait_for_cache helper to RailsCacheAdapter
- Removed cache_set_simple (no longer needed, uses set() directly)
- Kept build_lock_key in module (used by schedule_refresh)
- Updated tests to reflect PromptCache doesn't support fetch_with_lock

Benefits:
- Cleaner separation of concerns
- StaleWhileRevalidate module reduced from 370 to 283 lines
- No rubocop disable needed for module length
- Clearer API: only RailsCacheAdapter provides distributed locking

All tests passing (786 examples, 97.01% coverage)
Simplifies cache API by deriving expiration values from ttl and stale_ttl instead of requiring explicit expires_in parameter.

Rationale:
- Cache implementations already know their ttl and stale_ttl values
- No need for callers to calculate and pass expires_in
- Expiration logic is encapsulated within each cache implementation
- Cleaner, simpler API with fewer parameters

Changes:
- Removed expires_in parameter from PromptCache#set
- Removed expires_in parameter from RailsCacheAdapter#set
- Removed expires_in parameter from cache_set abstract method
- PromptCache now calculates: fresh_until = ttl, stale_until = ttl + stale_ttl
- RailsCacheAdapter calculates: expires_in = ttl + stale_ttl (if SWR) else ttl
- Updated all call sites and tests

Benefits:
- Simpler API - fewer parameters to think about
- Encapsulation - expiration logic lives in cache implementation
- Less error-prone - no risk of passing wrong expires_in value
- More maintainable - changing TTL logic only affects the cache class

All tests passing (786 examples, 96.93% coverage)
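The derived-expiration rules in this commit reduce to a couple of one-liners. This is a sketch with assumed helper names, mirroring the two calculations described above.

```ruby
# RailsCacheAdapter rule: the backend entry must outlive the whole
# fresh + stale window when SWR is enabled, otherwise just the fresh TTL.
def rails_expires_in(ttl:, stale_ttl:)
  swr_enabled = stale_ttl && stale_ttl.positive?
  swr_enabled ? ttl + stale_ttl : ttl
end

# PromptCache rule: explicit fresh/stale boundaries on the entry itself.
def memory_boundaries(ttl:, stale_ttl:, now: Time.now)
  { fresh_until: now + ttl, stale_until: now + ttl + stale_ttl }
end
```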
Ensures cache thread pools are properly shut down when the client is disposed.

Rationale:
- Client#shutdown already calls score_client.shutdown for cleanup
- SWR-enabled caches use thread pools that need proper shutdown
- Thread pools should be gracefully terminated to prevent leaks
- Consistent cleanup pattern across all client components

Changes:
- Added ApiClient#shutdown method to shut down the cache if it supports it
- Client#shutdown now calls api_client.shutdown in addition to score_client
- Uses respond_to?(:shutdown) check for backward compatibility
- StaleWhileRevalidate#shutdown safely handles nil thread pools

Test Coverage:
- Added 5 new tests for ApiClient#shutdown (all scenarios)
- Added 3 new tests for Client#shutdown with cache scenarios
- Tests verify shutdown is called on SWR-enabled caches
- Tests verify shutdown is safe when the cache is nil or doesn't support SWR
- All 794 examples passing, 96.86% coverage maintained

Benefits:
- Proper resource cleanup for long-running applications
- No thread pool leaks when the client is shut down
- Safe for all cache configurations (SWR, TTL-only, or no cache)
- Graceful degradation for caches without shutdown support
…or cache_stale_ttl
Improves API ergonomics by replacing Float::INFINITY with a more Ruby-idiomatic
:indefinite symbol for never-expiring cache configuration.
Rationale:
- :indefinite is more readable and Ruby-idiomatic than Float::INFINITY
- Symbols are the conventional way to represent special values in Ruby config
- Makes intent clearer in configuration code
- Aligns with Ruby community conventions (e.g., Rails cache :expires_in)
Changes:
- Renamed THOUSAND_YEARS_IN_SECONDS constant to INDEFINITE_SECONDS
- Updated constant comment to clarify intent ("indefinite cache duration")
- Changed normalize_stale_ttl from class method to instance method (normalized_stale_ttl)
- Instance method reads from cache_stale_ttl and returns normalized value on-demand
- Removed normalization from Config#initialize (now done lazily via method)
- Updated create_memory_cache to use config.normalized_stale_ttl
- Updated create_rails_cache_adapter to use config.normalized_stale_ttl
- Updated validate_swr_config! to handle :indefinite symbol properly
- Updated validation error message: "must be non-negative or :indefinite"
Documentation Updates:
- docs/CACHING.md: Changed Float::INFINITY to :indefinite in examples
- docs/CONFIGURATION.md: Changed Float::INFINITY to :indefinite in examples
- lib/langfuse/config.rb: Updated attr_accessor doc to mention :indefinite
- lib/langfuse/prompt_cache.rb: Updated comment to reflect :indefinite normalization
Test Coverage:
- Updated all tests to use :indefinite instead of Float::INFINITY
- Updated all tests to use INDEFINITE_SECONDS instead of THOUSAND_YEARS_IN_SECONDS
- Updated test expectations for new validation error message
- All 797 examples passing, 96.86% coverage maintained
Benefits:
- More intuitive API for developers (config.cache_stale_ttl = :indefinite)
- Clearer intent in configuration code
- Normalization happens on-demand via instance method
- Original config value preserved (not mutated during initialization)
- Better alignment with Ruby conventions and best practices
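The lazy normalization described above reads roughly like this. The constant's magnitude follows the old THOUSAND_YEARS_IN_SECONDS name; the exact value and method shape are assumptions.

```ruby
# Roughly 1000 years in seconds, a practical stand-in for "never expires".
INDEFINITE_SECONDS = 1000 * 365 * 24 * 60 * 60

# Normalize on demand; the raw config value (:indefinite or a number)
# is preserved rather than mutated during initialization.
def normalized_stale_ttl(cache_stale_ttl)
  cache_stale_ttl == :indefinite ? INDEFINITE_SECONDS : cache_stale_ttl
end
```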
Adds specific validation to catch when cache_stale_while_revalidate is enabled but cache_stale_ttl is nil, providing a helpful error message with recommended solutions.

Rationale:
- When SWR is enabled, cache_stale_ttl must have a value - nil is invalid regardless of SWR status
- Users need clear guidance on how to fix the configuration
- Prevents confusing runtime errors from occurring later

Changes:
- Added validate_swr_stale_ttl! private method to handle stale_ttl validation
- Added validate_refresh_threads! private method to validate refresh threads
- Split validate_swr_config! into smaller, focused methods
- New error message guides users to set cache_ttl or :indefinite
- Maintains existing validation that cache_stale_ttl cannot be nil
- Removed apply_swr_defaults! method from Config class
- Removed apply_swr_defaults! call from Config#initialize
- Removed test for auto-defaulting behavior
- Validation now catches missing cache_stale_ttl with helpful message

Error Message: "cache_stale_ttl cannot be nil when cache_stale_while_revalidate is enabled. Set it to cache_ttl for a logical default, or use :indefinite for never-expiring cache."

Test Coverage:
- Added test for SWR enabled with nil cache_stale_ttl scenario
- All 802 examples passing with 96.87% coverage maintained
- RuboCop: No offenses detected

Benefits:
- Clear, actionable error messages for misconfiguration
- Helps users understand the relationship between SWR and stale_ttl
- Reduces cyclomatic complexity through method extraction
- Easier to maintain and test individual validation rules
- More explicit, less magical configuration
- Easier to understand what values are actually set
- No hidden side effects during initialization
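A sketch of the extracted validation; the error text is taken from the commit message, but the method signature is an assumption.

```ruby
# Raise a helpful ArgumentError when SWR is enabled without a stale TTL.
def validate_swr_stale_ttl!(swr_enabled:, stale_ttl:)
  return unless swr_enabled && stale_ttl.nil?

  raise ArgumentError,
        "cache_stale_ttl cannot be nil when cache_stale_while_revalidate " \
        "is enabled. Set it to cache_ttl for a logical default, or use " \
        ":indefinite for never-expiring cache."
end
```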
@kxzk I think I was able to address all your feedback. This is ready for another round of reviews. Thanks again.
@NoahFisher, @kxzk let me know if this looks good and what the next steps would be for getting this merged to main.
NoahFisher
left a comment
Additional Considerations:
- Memory Leak Potential with In-Memory Locks
In PromptCache, locks are stored in a hash but only deleted on release_lock.
If a refresh thread crashes between acquire_lock and release_lock (despite the ensure block), the lock is never released. The ensure should handle this, but if the thread is killed externally (e.g., Thread#kill), the lock persists forever. Unlike Redis locks which have TTL expiration, these in-memory locks have no timeout.
Suggestion: Accept this as a known limitation of the in-memory backend, but document it.
- No Backpressure Signal to Callers
When the thread pool queue fills up and tasks are discarded, callers have no idea. This is fine for SWR (they get stale data), but it means:
- Monitoring is blind to refresh failures
- If the API is down for extended periods, stale data keeps being served with no signal
Question: Should there be a hook or callback for observability? e.g., on_refresh_dropped, on_refresh_failed?
- Cold Start Behavior
On application boot with an empty cache, the first request for each prompt will be a synchronous API call (cache miss). With SWR enabled, you've documented cache warming, which is good. But worth calling out:
SWR doesn't help cold starts - it only helps after the cache has been populated at least once.
- Ruby's GVL (Global VM Lock)
Worth noting that Ruby's GVL means these threads aren't truly parallel for CPU work. However, for I/O-bound work (HTTP calls to Langfuse API), threads do release the GVL during I/O waits, so this is fine. Just worth being aware that max_threads: 5 doesn't mean 5x throughput for CPU-bound work.
- Should SWR be Default-On for Rails Backend?
The PR makes SWR opt-in (cache_stale_while_revalidate = false by default). Given the latency benefits and that the distributed lock prevents thundering herd, is there an argument for making
it default-on for the :rails backend specifically? Users with Redis already have the
infrastructure for distributed locks.
Counter-argument: Surprising behavior changes are bad. Opt-in is safer.
Good work so far, I wouldn't consider the above blocking comments (I feel more strongly about the inline ones), but curious if we should document our thinking for next steps.
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| `cache_stale_while_revalidate` | Boolean | No | `false` | Enable stale-while-revalidate |
| `cache_stale_ttl` | Integer | No | `60` when SWR is enabled | Stale TTL (seconds) |
| `cache_refresh_threads` | Integer | No | `5` | Background refresh threads |
On Configuration Complexity
Concern: This PR adds 3 new configuration options (cache_stale_while_revalidate, cache_stale_ttl, cache_refresh_threads). I'd push back on exposing all of these to users.
Principle: Every configuration option is a decision the user has to make. Good defaults mean fewer decisions.
Suggestion: Consider collapsing to a single option:
Instead of:
config.cache_stale_while_revalidate = true
config.cache_stale_ttl = 300
config.cache_refresh_threads = 5
Just:
config.cache_stale_while_revalidate = true # Uses sensible defaults internally
Reasoning:
- `cache_stale_ttl` defaulting to `cache_ttl` is already the right choice 99% of the time - why expose it?
- `cache_refresh_threads` at 5 is fine for virtually all deployments - this is an implementation detail, not a user concern
- Users who truly need to tune these can always subclass, or we can add them later
The bar for adding config options should be: "Will a significant number of users need to change this from the default?" If not, hardcode it.
We absolutely need to set the TTL to infinite. Not exposing the TTL setting won't work for us.
Can you explain what the use case for a non-infinite TTL is?
Every configuration option is a decision the user has to make. Good defaults mean fewer decisions.
I agree, it certainly makes sense to hide cache_refresh_threads. Exposing cache_stale_ttl would offer more flexibility, in my opinion, for clients with very specific needs.
With that said, if I choose to enable SWR, I would more often than not want an infinite grace period, e.g. cache_stale_ttl = :indefinite for high availability reasons, then use cache_ttl to control how often cache is refreshed.
Based on that, only exposing config.cache_stale_while_revalidate would make sense to me if that defaults cache_stale_ttl to :indefinite as per the proposed implementation.
Does that make sense?
Thanks for all of the great feedback so far @NoahFisher! I've addressed some of the points you raised and wanted to follow up on a few of the points you brought up above:

Point 2

This is a great point and I think it would be interesting to provide such hooks. Is the following somewhat aligned with what you had in mind?

```ruby
Langfuse.configure do |config|
  # Hook called whenever a refresh attempt fails for whatever reason
  config.on_cache_refresh_failed do |error, cache_key|
    # handle refresh error accordingly... e.g. publish to Sentry, log something out, etc...
  end

  # Hook called whenever a refresh attempt is dropped due to not being able to acquire the lock
  config.on_cache_refresh_dropped do |cache_key|
    # handle refresh drops accordingly...
  end
end
```

Can you think of any other scenario we'd like to cover? Also, would it make sense to ship that as another enhancement PR?

Point 3

Could you point me to that doc entry so I can double check? Cache warming should not be impacted by SWR as you pointed out, so that would likely be a mishap on my part.

Point 5

My initial thinking was exactly that "Opt-in seemed safer" and intentional. Now, you do raise a good point: if 99% of the time Rails clients end up turning SWR on, then perhaps it should be the default? I could go either way on this one, to be honest.
@NoahFisher, @kxzk do you think we may be able to cut a new release including the work in this PR at some point this week?

Also, one small snag I just ran into is that the project we plan on using Now the simplest solution to our problem seems to be having
Moving the faraday convo to a new issue: #38 |
@NoahFisher, @kxzk any thoughts on the last update? Do you think we could move on with the solution as is and address anything that is non-blocking in a follow-up PR, perhaps?
@kxzk @NoahFisher Happy new year. Could we please get this merged? It's been going quite a long time. |
Summary
Add support for the Stale-While-Revalidate (SWR) prompt cache strategy, based on the Design Document: docs/future-enhancements/STALE_WHILE_REVALIDATE_DESIGN.md
The GIF below shows the new feature in action against a self-hosted production instance of Langfuse using the following config:
📊 Impact

- Requires the Rails cache backend (`:rails` backend)
- Uses the `concurrent-ruby` gem for thread pool management

Usage
Configure the client:
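(The original code block here was lost in extraction; the following is a plausible configuration assembled from the options this PR adds - the exact key names besides the three new cache options are assumptions.)

```ruby
Langfuse.configure do |config|
  config.cache_stale_while_revalidate = true  # opt in to SWR (default: false)
  config.cache_stale_ttl = :indefinite        # or a grace period in seconds, e.g. 300
  config.cache_refresh_threads = 5            # background refresh pool size
end
```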
Once configured, SWR works transparently:
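(Also lost in extraction; a sketch of the transparent usage, assuming the client and get_prompt method named elsewhere in this PR, with a hypothetical prompt name.)

```ruby
client = Langfuse::Client.new

# FRESH or STALE entries return instantly; STALE also triggers a
# background refresh. Only a true cache MISS fetches synchronously.
prompt = client.get_prompt("greeting-prompt")
```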
More details on the proposed documentation.