Feature/aggregate rework v2 #540

FBumann · 2025-12-25T22:09:33Z

Description

Brief description of the changes in this PR.

Type of Change

Bug fix
New feature
Documentation update
Code refactoring

Related Issues

Closes #(issue number)

Testing

I have tested my changes
Existing tests still pass

Checklist

My code follows the project style
I have updated documentation if needed
I have added tests for new functionality (if applicable)

Summary by CodeRabbit

Release Notes

New Features
- Added comprehensive time-series clustering functionality for reducing problem size while maintaining solution quality.
- Introduced clustering expansion capability to restore full-resolution solutions from clustered models.
- Added configurable storage behavior modes during clustering operations.
- New example notebooks demonstrating clustering workflows, multi-period clustering, and storage mode configurations.
Documentation
- Added cluster architecture design documentation with detailed implementation guidance.
- Expanded example system generators for clustering demonstrations.


…egration # Conflicts: # flixopt/flow_system.py

…o feature/aggregate-rework-v2

#549)

* Enable selecting a single period/scenario
* No selected_* tracking
* Clustering IO
* Improve IO
* Improve validation
* Fix cluster weight IO
* Fix cluster weights stuff
* Refactor weights API: always normalize scenario weights (#547)
* Add weights class
* The Weights API is now used in the modeling equations. Changes made:
  1. elements.py - Flow tracking:
       # Before:
       flow_hours = self.flow_rate * self._model.timestep_duration
       weighted_flow_hours = flow_hours * self._model.cluster_weight
       tracked_expression=weighted_flow_hours.sum(self._model.temporal_dims)
       # After:
       tracked_expression=self._model.weights.sum_temporal(self.flow_rate)
  2. elements.py - Load factor total hours:
       # Before:
       total_hours = (self._model.timestep_duration * self._model.cluster_weight).sum(self._model.temporal_dims)
       # After:
       total_hours = self._model.weights.temporal.sum(self._model.weights.temporal_dims)
  3. features.py - Status tracking:
       # Before:
       active_hours = self.status * self._model.timestep_duration
       weighted_active_hours = active_hours * self._model.cluster_weight
       tracked_expression=weighted_active_hours.sum(self._model.temporal_dims)
       # After:
       tracked_expression=self._model.weights.sum_temporal(self.status)
  4. features.py - Temporal effects summing (only needs cluster weight since already per-timestep):
       # Before:
       weighted_per_timestep = self.total_per_timestep * self._model.cluster_weight
       temporal_dims = [d for d in self.total_per_timestep.dims if d not in ('period', 'scenario')]
       # After:
       weighted_per_timestep = self.total_per_timestep * self._model.weights.cluster
       self._eq_total.lhs -= weighted_per_timestep.sum(dim=self._model.weights.temporal_dims)
* Minor fixes in test
* Improve weighting system and normalization of scenario weights
* Update CHANGELOG.md
* 1. ClusterStructure.n_clusters naming - Added explicit rename (matching n_representatives pattern) to avoid "None" variable names in serialized datasets
  2. original_timesteps validation - Added explicit KeyError with actionable message when original_time coordinate is missing
  3. active_hours bounds simplified - Passing total_hours DataArray directly instead of .max().item() fallback, allowing proper per-(period, scenario) bounds
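
To illustrate why the "Before" and "After" forms of the flow tracking change give the same total, here is a minimal pure-Python sketch. The `Weights` class below is a hypothetical stand-in written for this example (plain lists instead of the DataArrays the real model uses); only the names `sum_temporal`, `temporal`, `timestep_duration` and `cluster_weight` are taken from the commit message.

```python
from dataclasses import dataclass


@dataclass
class Weights:
    """Hypothetical stand-in for the model's weights object (not flixopt's class)."""

    timestep_duration: list[float]  # hours covered by each timestep
    cluster_weight: list[float]  # how often each timestep counts after clustering

    @property
    def temporal(self) -> list[float]:
        # Combined temporal weight: duration times cluster weight, per timestep
        return [d * w for d, w in zip(self.timestep_duration, self.cluster_weight)]

    def sum_temporal(self, values: list[float]) -> float:
        # Weighted sum over the time axis
        return sum(v * t for v, t in zip(values, self.temporal))


weights = Weights(timestep_duration=[1.0, 1.0, 2.0], cluster_weight=[3.0, 3.0, 1.0])
flow_rate = [10.0, 20.0, 5.0]

# "Before" style: multiply step by step, then sum
flow_hours = [f * d for f, d in zip(flow_rate, weights.timestep_duration)]
old_total = sum(fh * w for fh, w in zip(flow_hours, weights.cluster_weight))

# "After" style: one call that applies both weights
new_total = weights.sum_temporal(flow_rate)

# Total hours, as used for the load factor bound
total_hours = sum(weights.temporal)
```

Keeping both multiplications inside one helper means each call site no longer has to remember to apply the cluster weight itself.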

* Add dataset plot accessor
* Add fxplot accessor showcase
* The internal plot accessors now leverage the shared .fxplot implementation, reducing code duplication while maintaining the same functionality (data preparation, color resolution from components, PlotResult wrapping).
* Fix notebook
* 1. xlabel/ylabel parameters - Added to bar(), stacked_bar(), line(), area(), and duration_curve() methods in both DatasetPlotAccessor and DataArrayPlotAccessor
  2. scatter() method - Plots two variables against each other with x and y parameters
  3. pie() method - Creates pie charts from aggregated (scalar) dataset values, e.g. ds.sum('time').fxplot.pie()
  4. duration_curve() method - Sorts values along the time dimension in descending order, with optional normalize parameter for percentage x-axis
  5. CONFIG.Plotting.default_line_shape - New config option (default 'hv') that controls the default line shape for line(), area(), and duration_curve() methods
* Fix faceting of pie
* Improve auto dim handling
* Improve notebook
* Fix pie plot
* Logic order changed:
  1. X-axis is now determined first using CONFIG.Plotting.x_dim_priority
  2. Facets are resolved from remaining dimensions (x-axis excluded)

  x_dim_priority expanded: x_dim_priority = ('time', 'duration', 'duration_pct', 'period', 'scenario', 'cluster')
  - Time-like dims first, then common grouping dims as fallback
  - variable stays excluded (it's used for color, not x-axis)

  _get_x_dim() refactored:
  - Now takes dims: list[str] instead of a DataFrame
  - More versatile - works with any list of dimension names
* Add x parameter and x_dim_priority config to fxplot
  - Add `x` parameter to bar/stacked_bar/line/area for explicit x-axis control
  - Add CONFIG.Plotting.x_dim_priority for auto x-axis selection order
  - X-axis determined first, facets from remaining dimensions
  - Refactor _get_x_column -> _get_x_dim (takes dim list, not DataFrame)
  - Support scalar data (no dims) by using 'variable' as x-axis
* Add x parameter and smart dimension handling to fxplot
  - Add `x` parameter to bar/stacked_bar/line/area for explicit x-axis control
  - Add CONFIG.Plotting.x_dim_priority for auto x-axis selection. Default: ('time', 'duration', 'duration_pct', 'period', 'scenario', 'cluster')
  - X-axis determined first, facets resolved from remaining dimensions
  - Refactor _get_x_column -> _get_x_dim (takes dim list, more versatile)
  - Support scalar data (no dims) by using 'variable' as x-axis
  - Skip color='variable' when x='variable' to avoid double encoding
  - Fix _dataset_to_long_df to use dims (not just coords) as id_vars
  - Ensure px_kwargs properly overrides all defaults (color, facets, etc.)
* Improve documentation
* Fix notebook in docs
* 1. heatmap kwarg merge order - Now uses `**{**imshow_args, **imshow_kwargs}` so user can override
  2. scatter unused colors - Removed the unused parameter
  3. to_duration_curve sorting - Changed [::-1] to np.flip(..., axis=time_axis) for correct multi-dimensional handling
  4. DataArrayPlotAccessor.heatmap - Same kwarg merge fix
* Improve docstrings
* Update notebooks to not do file operations
* Fix notebook
* Fix CI
* mkdocs-jupyter was treating this .py file as a notebook and executing it, causing the NetCDF write failure in CI
* Add missing type annotation
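
The heatmap kwarg merge fix relies on the order of dict unpacking: when the same key appears twice, the later mapping wins. The sketch below shows that behaviour in isolation. The helper name `merge_kwargs` and the example keys are illustrative assumptions, not code from the PR.

```python
def merge_kwargs(defaults: dict, user_kwargs: dict) -> dict:
    # Later unpacking wins, so user-supplied keys override the defaults
    return {**defaults, **user_kwargs}


# Illustrative values only; the real defaults live in the accessor
imshow_args = {'color_continuous_scale': 'Viridis', 'aspect': 'auto'}
imshow_kwargs = {'aspect': 'equal', 'title': 'Heat demand'}

merged = merge_kwargs(imshow_args, imshow_kwargs)

# Reversing the order would let the defaults silently win instead
reversed_merge = {**imshow_kwargs, **imshow_args}
```

Passing the merged dict once also avoids the `TypeError` that Python raises when the same keyword is supplied twice in a single call.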

* Summary of Changes

  .github/workflows/docs.yaml
  1. Notebook caching - Caches executed notebooks using a hash of notebooks + source code
  2. Parallel execution - Runs jupyter execute with -P 4 (4 notebooks in parallel)
  3. Skip mkdocs-jupyter execution - Sets MKDOCS_JUPYTER_EXECUTE=false since notebooks are pre-executed
* cache key computation now sorts files before hashing to ensure stable keys across runs
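
Why sorting matters for the cache key can be shown with a small sketch. This is an assumption-laden illustration in Python (the actual workflow step is not shown in this PR text and may well be a shell one-liner); `cache_key` and the file contents are hypothetical.

```python
import hashlib


def cache_key(files: dict[str, bytes]) -> str:
    """Hash file paths and contents in sorted path order (hypothetical helper)."""
    digest = hashlib.sha256()
    for path in sorted(files):  # sorting makes the key independent of listing order
        digest.update(path.encode('utf-8'))
        digest.update(b'\0')  # separator so path and content cannot run together
        digest.update(files[path])
        digest.update(b'\0')
    return digest.hexdigest()


# Same files, discovered in two different orders
listing_a = {
    'docs/notebooks/01-quickstart.ipynb': b'{"cells": []}',
    'flixopt/config.py': b'X = 1\n',
}
listing_b = dict(reversed(list(listing_a.items())))

key_a = cache_key(listing_a)
key_b = cache_key(listing_b)
```

Without the `sorted(...)` call, a different directory listing order on another runner could produce a different key for identical inputs and cause needless cache misses.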

- get_quickstart_data() - 01-quickstart
- get_heat_system_data() - 02-heat-system
- get_investment_data() - 03-investment
- get_constraints_data() - 04-constraints
- get_multicarrier_data() - 05-multi-carrier
- get_time_varying_data() - 06a-time-varying
- get_scenarios_data() - 07-scenarios

Notebook changes:

| Notebook         | Changes                                              |
|------------------|------------------------------------------------------|
| 01-quickstart    | Uses fxplot for heat demand visualization            |
| 02-heat-system   | Uses helper + fxplot for demand/price plots          |
| 03-investment    | Uses helper + fxplot; removed cost definition cell   |
| 04-constraints   | Uses helper + fxplot for demand visualization        |
| 05-multi-carrier | Uses helper + fxplot for profiles                    |
| 06a-time-varying | Uses helper + fxplot; removed inline COP calculation |
| 07-scenarios     | Uses helper + fxplot; removed inline data generation |

Benefits:
- Centralized data generation reduces code duplication
- Consistent visualization with .fxplot accessor
- Easier maintenance - change data in one place
- Notebooks focus on teaching flixopt concepts, not data setup

…dated the internal call to pass Non

# Conflicts: # CHANGELOG.md

* Improve documentation and improve CHANGELOG.md
* Fix Changelog and change to v6.0.0
* Enhanced Clustering Control

  New Parameters Added to cluster() Method

  | Parameter               | Type                              | Default              | Purpose |
  |-------------------------|-----------------------------------|----------------------|---------|
  | cluster_method          | Literal[...]                      | 'k_means'            | Clustering algorithm ('k_means', 'hierarchical', 'k_medoids', 'k_maxoids', 'averaging') |
  | representation_method   | Literal[...]                      | 'meanRepresentation' | How clusters are represented ('meanRepresentation', 'medoidRepresentation', 'distributionAndMinMaxRepresentation') |
  | extreme_period_method   | Literal[...]                      | 'new_cluster_center' | How peaks are integrated ('None', 'append', 'new_cluster_center', 'replace_cluster_center') |
  | rescale_cluster_periods | bool                              | True                 | Rescale clusters to match original means |
  | random_state            | int \| None                       | None                 | Random seed for reproducibility |
  | predef_cluster_order    | np.ndarray \| list[int] \| None   | None                 | Manual clustering assignments |
  | `**tsam_kwargs`         | Any                               | -                    | Pass-through for any tsam parameter |

  Clustering Quality Metrics

  Access via fs.clustering.metrics after clustering - returns a DataFrame with RMSE, MAE, and other accuracy indicators per time series.

  Files Modified
  1. flixopt/transform_accessor.py - Updated cluster() signature and tsam call
  2. flixopt/clustering/base.py - Added metrics field to Clustering class
  3. tests/test_clustering/test_integration.py - Added tests for new parameters
  4. docs/user-guide/optimization/clustering.md - Updated documentation
* Dimension renamed: original_period → original_cluster
  Property renamed: n_original_periods → n_original_clusters
* Problem: Expanded FlowSystem from clustering didn't have the extra timestep that regular FlowSystems have.

  Root Cause: In expand_solution(), the solution was only indexed by original_timesteps (n elements) instead of original_timesteps_extra (n+1 elements).

  Fix in flixopt/transform_accessor.py:
  1. Reindex solution to timesteps_extra (line 1296-1298):
     - Added expanded_fs._solution.reindex(time=original_timesteps_extra) for consistency with non-expanded FlowSystems
  2. Fill extra timestep for charge_state (lines 1300-1333):
     - Added special handling to properly fill the extra timestep for storage charge_state variables using the last cluster's extra timestep value
  3. Updated intercluster storage handling (lines 1340-1388):
     - Modified to work with original_timesteps_extra instead of just original_timesteps
     - The extra timestep now correctly gets the final SOC boundary value with proper decay applied

  Tests updated in tests/test_cluster_reduce_expand.py:
  - Updated 4 assertions that check solution time coordinates to expect 193 (192 + 1 extra) instead of 192
* - 'variable' is treated as a special valid facet value (since it exists in the melted DataFrame from data_var names, not as a dimension)
  - When facet_row='variable' or facet_col='variable' is passed, it's passed through directly
  - In line(), when faceting by variable, it's not also used for color (avoids double encoding)
* Add variable and color to auto resolving in fxplot
* Added 'variable' to both priority lists and updated the logic to treat it consistently:

  flixopt/config.py:
    'extra_dim_priority': ('variable', 'cluster', 'period', 'scenario'),
    'x_dim_priority': ('time', 'duration', 'duration_pct', 'variable', 'period', 'scenario', 'cluster'),

  flixopt/dataset_plot_accessor.py:
  - _get_x_dim: Now takes n_data_vars parameter; 'variable' is available when > 1
  - _resolve_auto_facets: 'variable' is available when len(data_vars) > 1 and respects exclude_dims

  Behavior:
  - 'variable' is treated like any other dimension in the priority system
  - Only available when there are multiple data_vars
  - Properly excluded when already used (e.g., for x-axis)
* Improve plotting, especially for clustering
* Drop cluster index when expanding
* Fix storage expansion
* Improve clustering
* fix scatter plot faceting
* Fixed the documentation in the notebook:
  1. Cell 32 (API Reference table): Updated defaults to 'hierarchical', 'medoidRepresentation', and None
  2. Cell 16: Swapped the example to show k_means as the alternative (since hierarchical is now default)
  3. Cell 17: Updated variable names to match
  4. Cell 33 (Key Takeaways): Clarified that random_state is only needed for non-deterministic methods like 'k_means'

  The code review
* 1. Error handling for accuracyIndicators() - Added try/except with warning log and empty DataFrame fallback, plus handling empty DataFrames when building the metrics Dataset
  2. Random state to tsam - Replaced global np.random.seed() with passing seed parameter directly to tsam's TimeSeriesAggregation
  3. tsam_kwargs conflict validation - Added validation that raises ValueError if user tries to override explicit parameters via `**tsam_kwargs` (including seed)
  4. predef_cluster_order validation - Added dimension validation for DataArray inputs, checking they match the FlowSystem's period/scenario structure
  5. Out-of-bounds fix - Clamped last_original_cluster_idx to n_original_clusters - 1 to handle partial clusters at the end
* 1. DataFrame truth ambiguity - Changed non_empty_metrics.get(first_key) or next(...) to explicit if metrics_df is None: check
  2. removed random state
* Fix pie plot animation frame and add warnings for unassigned dims
* Change logger warning to regular warning
* The centralized slot assignment system is now complete. Here's a summary of the changes made:

  Changes Made
  1. flixopt/config.py
     - Replaced three separate config attributes (extra_dim_priority, dim_slot_priority, x_dim_priority) with a single unified dim_priority tuple
     - Updated CONFIG.Plotting class docstring and attribute definitions
     - Updated to_dict() method to use the new attribute
     - The new priority order: ('time', 'duration', 'duration_pct', 'variable', 'cluster', 'period', 'scenario')
  2. flixopt/dataset_plot_accessor.py
     - Created new assign_slots() function that centralizes all dimension-to-slot assignment logic
     - Fixed slot fill order: x → color → facet_col → facet_row → animation_frame
     - Updated all plot methods (bar, stacked_bar, line, area, heatmap, scatter, pie) to use assign_slots()
     - Removed old _get_x_dim() and _resolve_auto_facets() functions
     - Updated docstrings to reference dim_priority instead of x_dim_priority
  3. flixopt/statistics_accessor.py
     - Updated _resolve_auto_facets() to use the new assign_slots() function internally
     - Added import for assign_slots from dataset_plot_accessor

  Key Design Decisions
  - Single priority list controls all auto-assignment
  - Slots are filled in fixed order based on availability
  - None means a slot is not available for that plot type
  - 'auto' triggers auto-assignment from priority list
  - Explicit string values override auto-assignment
* Add slot_order to config
* Add new assign_slots() method
* Fix heatmap and convert all to use fxplot
* Fix heatmap
* Squeeze singleton dims in heatmap()
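
The slot assignment rules listed under "Key Design Decisions" can be sketched roughly as follows. This is a hypothetical re-implementation for illustration only: the function name `assign_slots_sketch`, its signature, and the choice to claim explicit values before filling `'auto'` slots are assumptions, and the real `assign_slots()` in flixopt may differ. The priority tuple and slot order are the ones quoted in the commit message.

```python
DIM_PRIORITY = ('time', 'duration', 'duration_pct', 'variable', 'cluster', 'period', 'scenario')
SLOT_ORDER = ('x', 'color', 'facet_col', 'facet_row', 'animation_frame')


def assign_slots_sketch(dims, **slots):
    """Map dimensions to plot slots; each slot value is None, 'auto', or a dim name."""
    # Explicitly named dims are claimed first so 'auto' never reuses them
    used = {value for value in slots.values() if value not in (None, 'auto')}
    candidates = [d for d in DIM_PRIORITY if d in dims and d not in used]

    result = {}
    for slot in SLOT_ORDER:
        value = slots.get(slot)  # a slot that is not passed counts as unavailable
        if value == 'auto':
            # Take the highest-priority dimension that is still free, if any
            value = candidates.pop(0) if candidates else None
        result[slot] = value
    return result


assignment = assign_slots_sketch(
    dims=['scenario', 'time', 'variable', 'period'],
    x='auto',
    color='auto',
    facet_col='period',  # explicit value overrides auto-assignment
    facet_row='auto',
    animation_frame=None,  # slot not available for this plot type
)
```

With a single priority tuple, changing the preferred axis for every plot type is a one-line config change rather than three separate lists to keep in sync.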

* Add dataset plot accessor * Add fxplot acessor showcase * The internal plot accessors now leverage the shared .fxplot implementation, reducing code duplication while maintaining the same functionality (data preparation, color resolution from components, PlotResult wrapping). * Fix notebook * 1. xlabel/ylabel parameters - Added to bar(), stacked_bar(), line(), area(), and duration_curve() methods in both DatasetPlotAccessor and DataArrayPlotAccessor 2. scatter() method - Plots two variables against each other with x and y parameters 3. pie() method - Creates pie charts from aggregated (scalar) dataset values, e.g. ds.sum('time').fxplot.pie() 4. duration_curve() method - Sorts values along the time dimension in descending order, with optional normalize parameter for percentage x-axis 5. CONFIG.Plotting.default_line_shape - New config option (default 'hv') that controls the default line shape for line(), area(), and duration_curve() methods * Fix faceting of pie * Improve auto dim handling * Improve notebook * Fix pie plot * Logic order changed: 1. X-axis is now determined first using CONFIG.Plotting.x_dim_priority 2. 
Facets are resolved from remaining dimensions (x-axis excluded) x_dim_priority expanded: x_dim_priority = ('time', 'duration', 'duration_pct', 'period', 'scenario', 'cluster') - Time-like dims first, then common grouping dims as fallback - variable stays excluded (it's used for color, not x-axis) _get_x_dim() refactored: - Now takes dims: list[str] instead of a DataFrame - More versatile - works with any list of dimension names * Add x parameter and x_dim_priority config to fxplot - Add `x` parameter to bar/stacked_bar/line/area for explicit x-axis control - Add CONFIG.Plotting.x_dim_priority for auto x-axis selection order - X-axis determined first, facets from remaining dimensions - Refactor _get_x_column -> _get_x_dim (takes dim list, not DataFrame) - Support scalar data (no dims) by using 'variable' as x-axis * Add x parameter and smart dimension handling to fxplot - Add `x` parameter to bar/stacked_bar/line/area for explicit x-axis control - Add CONFIG.Plotting.x_dim_priority for auto x-axis selection Default: ('time', 'duration', 'duration_pct', 'period', 'scenario', 'cluster') - X-axis determined first, facets resolved from remaining dimensions - Refactor _get_x_column -> _get_x_dim (takes dim list, more versatile) - Support scalar data (no dims) by using 'variable' as x-axis - Skip color='variable' when x='variable' to avoid double encoding - Fix _dataset_to_long_df to use dims (not just coords) as id_vars * Add x parameter and smart dimension handling to fxplot - Add `x` parameter to bar/stacked_bar/line/area for explicit x-axis control - Add CONFIG.Plotting.x_dim_priority for auto x-axis selection Default: ('time', 'duration', 'duration_pct', 'period', 'scenario', 'cluster') - X-axis determined first, facets resolved from remaining dimensions - Refactor _get_x_column -> _get_x_dim (takes dim list, more versatile) - Support scalar data (no dims) by using 'variable' as x-axis - Skip color='variable' when x='variable' to avoid double encoding - Fix 
_dataset_to_long_df to use dims (not just coords) as id_vars - Ensure px_kwargs properly overrides all defaults (color, facets, etc.) * Improve documentation * Fix notebook in docs * 1. heatmap kwarg merge order - Now uses **{**imshow_args, **imshow_kwargs} so user can override 2. scatter unused colors - Removed the unused parameter 3. to_duration_curve sorting - Changed [::-1] to np.flip(..., axis=time_axis) for correct multi-dimensional handling 4. DataArrayPlotAccessor.heatmap - Same kwarg merge fix * Improve docstrings * Update notebooks to not do file operations * Add Comparison class * Add Release notes * Add Comparison class to all Notebooks * Update comparison.py and add documentation * ⏺ The class went from ~560 lines to ~115 lines. Key simplifications: 1. __getattr__ - dynamically delegates any method to the underlying accessor 2. _wrap_plot_method - single method that handles all the data collection and concatenation 3. _recreate_figure - infers plot type from the original figure and recreates with combined data Tradeoffs: - Less explicit type hints on method signatures (but still works the same) - Infers plot type from original figure rather than hardcoding per method - Automatically supports any new methods added to StatisticsPlotAccessor in the future * ⏺ The class went from ~560 lines to ~115 lines. Key simplifications: 1. __getattr__ - dynamically delegates any method to the underlying accessor 2. _wrap_plot_method - single method that handles all the data collection and concatenation 3. _recreate_figure - infers plot type from the original figure and recreates with combined data Tradeoffs: - Less explicit type hints on method signatures (but still works the same) - Infers plot type from original figure rather than hardcoding per method - Automatically supports any new methods added to StatisticsPlotAccessor in the future * Minor bugfix * Now all methods properly split kwargs and pass plotly_kwargs to the figure creation. 
The _DATA_KWARGS mapping defines which kwargs affect data processing - everything else passes through to plotly. * Now all methods properly split kwargs and pass plotly_kwargs to the figure creation. The _DATA_KWARGS mapping defines which kwargs affect data processing - everything else passes through to plotly. * Improve documentation and improve CHANGELOG.md * Fix core dims * FIx CHangelog and change to v6.0.0 * FIx CHangelog and change to v6.0.0 * FIx CHangelog and change to v6.0.0 * Enhanced Clustering Control New Parameters Added to cluster() Method | Parameter | Type | Default | Purpose | |-------------------------|-------------------------------|----------------------|--------------------------------------------------------------------------------------------------------------------| | cluster_method | Literal[...] | 'k_means' | Clustering algorithm ('k_means', 'hierarchical', 'k_medoids', 'k_maxoids', 'averaging') | | representation_method | Literal[...] | 'meanRepresentation' | How clusters are represented ('meanRepresentation', 'medoidRepresentation', 'distributionAndMinMaxRepresentation') | | extreme_period_method | Literal[...] | 'new_cluster_center' | How peaks are integrated ('None', 'append', 'new_cluster_center', 'replace_cluster_center') | | rescale_cluster_periods | bool | True | Rescale clusters to match original means | | random_state | int | None | None | Random seed for reproducibility | | predef_cluster_order | np.ndarray | list[int] | None | None | Manual clustering assignments | | **tsam_kwargs | Any | - | Pass-through for any tsam parameter | Clustering Quality Metrics Access via fs.clustering.metrics after clustering - returns a DataFrame with RMSE, MAE, and other accuracy indicators per time series. Files Modified 1. flixopt/transform_accessor.py - Updated cluster() signature and tsam call 2. flixopt/clustering/base.py - Added metrics field to Clustering class 3. 
tests/test_clustering/test_integration.py - Added tests for new parameters
  4. docs/user-guide/optimization/clustering.md - Updated documentation
* Dimension renamed: original_period → original_cluster
  Property renamed: n_original_periods → n_original_clusters
* Problem: Expanded FlowSystem from clustering didn't have the extra timestep that regular FlowSystems have.
  Root Cause: In expand_solution(), the solution was only indexed by original_timesteps (n elements) instead of original_timesteps_extra (n+1 elements).
  Fix in flixopt/transform_accessor.py:
  1. Reindex solution to timesteps_extra (lines 1296-1298):
     - Added expanded_fs._solution.reindex(time=original_timesteps_extra) for consistency with non-expanded FlowSystems
  2. Fill extra timestep for charge_state (lines 1300-1333):
     - Added special handling to properly fill the extra timestep for storage charge_state variables using the last cluster's extra timestep value
  3. Updated intercluster storage handling (lines 1340-1388):
     - Modified to work with original_timesteps_extra instead of just original_timesteps
     - The extra timestep now correctly gets the final SOC boundary value with proper decay applied
  Tests updated in tests/test_cluster_reduce_expand.py:
  - Updated 4 assertions that check solution time coordinates to expect 193 (192 + 1 extra) instead of 192
* - 'variable' is treated as a special valid facet value (since it exists in the melted DataFrame from data_var names, not as a dimension)
  - When facet_row='variable' or facet_col='variable' is passed, it's passed through directly
  - In line(), when faceting by variable, it's not also used for color (avoids double encoding)
* Add variable and color to auto resolving in fxplot
* Added 'variable' to both priority lists and updated the logic to treat it consistently:
  flixopt/config.py:
  'extra_dim_priority': ('variable', 'cluster', 'period', 'scenario'),
  'x_dim_priority': ('time', 'duration', 'duration_pct', 'variable', 'period', 'scenario', 'cluster'),
  flixopt/dataset_plot_accessor.py:
  - _get_x_dim: Now takes n_data_vars parameter; 'variable' is available when > 1
  - _resolve_auto_facets: 'variable' is available when len(data_vars) > 1 and respects exclude_dims
  Behavior:
  - 'variable' is treated like any other dimension in the priority system
  - Only available when there are multiple data_vars
  - Properly excluded when already used (e.g., for x-axis)
* Improve plotting, especially for clustering
* Drop cluster index when expanding
* Fix storage expansion
* Improve clustering
* fix scatter plot faceting
* Fixed the documentation in the notebook:
  1. Cell 32 (API Reference table): Updated defaults to 'hierarchical', 'medoidRepresentation', and None
  2. Cell 16: Swapped the example to show k_means as the alternative (since hierarchical is now default)
  3. Cell 17: Updated variable names to match
  4. Cell 33 (Key Takeaways): Clarified that random_state is only needed for non-deterministic methods like 'k_means'
  The code review
* 1. Error handling for accuracyIndicators() - Added try/except with warning log and empty DataFrame fallback, plus handling empty DataFrames when building the metrics Dataset
  2. Random state to tsam - Replaced global np.random.seed() with passing seed parameter directly to tsam's TimeSeriesAggregation
  3. tsam_kwargs conflict validation - Added validation that raises ValueError if user tries to override explicit parameters via **tsam_kwargs (including seed)
  4. predef_cluster_order validation - Added dimension validation for DataArray inputs, checking they match the FlowSystem's period/scenario structure
  5. Out-of-bounds fix - Clamped last_original_cluster_idx to n_original_clusters - 1 to handle partial clusters at the end
* 1. DataFrame truth ambiguity - Changed non_empty_metrics.get(first_key) or next(...) to explicit if metrics_df is None: check
  2. removed random state
* Fix pie plot animation frame and add warnings for unassigned dims
* Change logger warning to regular warning
* The centralized slot assignment system is now complete. Here's a summary of the changes made:
  Changes Made
  1. flixopt/config.py
     - Replaced three separate config attributes (extra_dim_priority, dim_slot_priority, x_dim_priority) with a single unified dim_priority tuple
     - Updated CONFIG.Plotting class docstring and attribute definitions
     - Updated to_dict() method to use the new attribute
     - The new priority order: ('time', 'duration', 'duration_pct', 'variable', 'cluster', 'period', 'scenario')
  2. flixopt/dataset_plot_accessor.py
     - Created new assign_slots() function that centralizes all dimension-to-slot assignment logic
     - Fixed slot fill order: x → color → facet_col → facet_row → animation_frame
     - Updated all plot methods (bar, stacked_bar, line, area, heatmap, scatter, pie) to use assign_slots()
     - Removed old _get_x_dim() and _resolve_auto_facets() functions
     - Updated docstrings to reference dim_priority instead of x_dim_priority
  3. flixopt/statistics_accessor.py
     - Updated _resolve_auto_facets() to use the new assign_slots() function internally
     - Added import for assign_slots from dataset_plot_accessor
  Key Design Decisions
  - Single priority list controls all auto-assignment
  - Slots are filled in fixed order based on availability
  - None means a slot is not available for that plot type
  - 'auto' triggers auto-assignment from priority list
  - Explicit string values override auto-assignment
* Add slot_order to config
* Add new assign_slots() method
* Add new assign_slots() method
* Fix heatmap and convert all to use fxplot
* Fix heatmap
* Fix heatmap
* Fix heatmap
* Fix heatmap
* Merge remote-tracking branch 'origin/feature/tsam-params' into feature/comparison
  # Conflicts:
  # docs/notebooks/08c-clustering.ipynb
  # flixopt/config.py
* comparison.py:
  1. Removed _resolve_facets method - fxplot handles 'auto' resolution internally
  2. Updated all methods to pass facet params directly to fxplot
  3. sizes now uses ds.fxplot.bar() instead of px.bar
  4. effects now uses ds.fxplot.bar() with proper column naming
  statistics_accessor.py:
  1. Simplified effects method significantly:
     - Works directly with Dataset (no DataArray concat/conversion)
     - Uses dict.get for aspect selection
     - Cleaner aggregation logic
     - Returns Dataset with effects as data variables
     - Uses fxplot.bar instead of px.bar
  The code is now consistent - all plotting methods in both StatisticsPlotAccessor and ComparisonStatisticsPlot use fxplot for centralized dimension/slot handling.
* Squeeze singleton dims in heatmap()
* Replaced print statements with class repr
* 1. 08a-aggregation.ipynb cell 16: Removed corrupted <cell_type>markdown</cell_type> tag from markdown source
  2. flixopt/comparison.py line 75: Added fallback for None names:
     # Before
     self._names = names or [fs.name for fs in flow_systems]
     # After
     self._names = names or [fs.name or f'System {i}' for i, fs in enumerate(flow_systems)]
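The "Key Design Decisions" of the centralized slot assignment commit can be illustrated with a small sketch. This is not the code in flixopt/dataset_plot_accessor.py: the function name `assign_slots_sketch`, its signature, the `SLOT_ORDER` and `DIM_PRIORITY` constants, and the returned leftover list are all assumptions made for illustration. It only mirrors the stated rules: one priority list, a fixed slot fill order, `None` for an unavailable slot, `'auto'` for auto-assignment, and explicit strings taking precedence.

```python
# Fill order quoted from the commit message; the constant name is hypothetical.
SLOT_ORDER = ('x', 'color', 'facet_col', 'facet_row', 'animation_frame')

# Priority order quoted from the commit message; the constant name is hypothetical.
DIM_PRIORITY = ('time', 'duration', 'duration_pct', 'variable', 'cluster', 'period', 'scenario')


def assign_slots_sketch(available_dims, dim_priority=DIM_PRIORITY, **slots):
    """Assign dimensions to plot slots (illustrative sketch only).

    Each slot keyword is 'auto' (default), None (slot not available for this
    plot type), or an explicit dimension name that overrides auto-assignment.
    Returns (assignment, leftover) where leftover lists unassigned dimensions.
    """
    unknown = set(slots) - set(SLOT_ORDER)
    if unknown:
        raise ValueError(f'Unknown slots: {sorted(unknown)}')

    requested = {slot: slots.get(slot, 'auto') for slot in SLOT_ORDER}
    assignment = {slot: None for slot in SLOT_ORDER}

    # Explicit string values are honoured first and removed from the pool.
    used = set()
    for slot, value in requested.items():
        if value is not None and value != 'auto':
            assignment[slot] = value
            used.add(value)

    # Remaining dimensions, ordered by the single priority list; dimensions
    # missing from the priority list are appended in their given order.
    pool = [d for d in dim_priority if d in available_dims and d not in used]
    pool += [d for d in available_dims if d not in dim_priority and d not in used]

    # 'auto' slots are filled in the fixed slot order.
    for slot in SLOT_ORDER:
        if requested[slot] == 'auto' and pool:
            assignment[slot] = pool.pop(0)

    return assignment, pool


if __name__ == '__main__':
    assignment, leftover = assign_slots_sketch(['scenario', 'time', 'cluster'], animation_frame=None)
    print(assignment)
    print('unassigned:', leftover)
```

Returning the leftover dimensions separately leaves it to the caller to decide how to report them, which matches the "add warnings for unassigned dims" commit only in spirit.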

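The tsam_kwargs conflict validation mentioned in the commit message (a ValueError when a user tries to override explicit parameters via **tsam_kwargs) could look roughly like the following. This is a minimal sketch under assumptions: the helper name `check_tsam_kwargs` and the contents of `RESERVED_TSAM_KEYS` are hypothetical and are not taken from the flixopt source; only `seed` is named in the commit message.

```python
# Hypothetical set of keys that the caller sets explicitly and that must not
# be overridden through **tsam_kwargs. Only 'seed' is named in the commit
# message; the other entries are placeholders for illustration.
RESERVED_TSAM_KEYS = frozenset({'seed', 'noTypicalPeriods', 'clusterMethod'})


def check_tsam_kwargs(tsam_kwargs, reserved=RESERVED_TSAM_KEYS):
    """Return a copy of tsam_kwargs, or raise if it collides with reserved keys."""
    conflicts = sorted(set(tsam_kwargs) & set(reserved))
    if conflicts:
        raise ValueError(
            f'tsam_kwargs must not override explicitly set parameters: {conflicts}'
        )
    return dict(tsam_kwargs)


if __name__ == '__main__':
    print(check_tsam_kwargs({'extremePeriodMethod': 'append'}))
    try:
        check_tsam_kwargs({'seed': 42})
    except ValueError as err:
        print('rejected:', err)
```

Failing early with the list of offending keys keeps the error close to the user's call instead of surfacing later as a duplicate-keyword TypeError.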
FBumann added 30 commits December 13, 2025 23:11

Temp

8ac58f5

Add n_segments

e579a11

Update CHANGELOG.md

16cffe1

Use deep copy

139dc89

Add notebook for clustering

60dd670

Update notebook

8c03f64

Fix multi period and multi scenario clustering

9fdc53d

Improve

a9a442d

Exclude solution when clustering

b29460b

Use pre-built flow_system

44aa5db

Improve notebook

fe87396

Added new system to notebook defaults

584e907

Use realistic flow system in notebooks

f47ef38

Merge branch 'feature/flow-system-first' into feature/better-tsam-int…

b0c9166

…egration # Conflicts: # flixopt/flow_system.py

add segmentation to notebooks

1bfdc56

fix cluster_multi_dimensional_data

6aebd18

fix notebooks to only create flow_system if needed

c0c7c45

Fix inter-cluster segmentation

db3e37e

Improve notebook to use more segments

bcf7136

Improve notebook to use more segments

6b5a638

Fix Data_DIR in notebooks

e20775f

Group constraints and variables from clustering together

6fdd684

Only equalize SOME variables

76af019

Always fix binaries for better pre-solve

b7c5d60

Group constraints and variables from clustering together

737f354

Improve readability of clustering equations

79db532

Add IO for clustering

2f38e7c

Add IO for clustering

96e0826

Add IO for clustering

e8ab4b8

Improve clustering

c74c5e7

FBumann added 27 commits December 28, 2025 12:56

add catchwarning and fix example system

14abf85

improve notebook

dcea667

Improve docs, changelog and add tests

7014689

Merge remote-tracking branch 'origin/feature/aggregate-rework-v2' int…

974df60

…o feature/aggregate-rework-v2

Fix plotting animation handling

702803a

fix: cluster handling in temporal shares

ab9a519

Change logging in clustering

300ea33

Fix imports and warnings

9a8cd27

Merge remote-tracking branch 'origin/feature/aggregate-rework-v2' int…

8e80478

…o feature/aggregate-rework-v2

Fix netcdf warnings

3ba9730

Fix notebooks to not save to netcdf

325f4eb

Warning Handling Refactored

282594c

Simplify notebooks

d4dd58b

Removed animation_frame from the method signature, docstring, and up…

b1fa086

…dated the internal call to pass Non

Add missing notebooks to docs

5f37003

Fix ci

184513e

Fix ci

0439d10

Add tutorial_data.py

cfe8b9c

Retrigger ci

c9bd406

Merge branch 'main' into feature/aggregate-rework-v2

7dcacfd

# Conflicts: # CHANGELOG.md

Add tutorial data back to notebooks

4f1d827

This was referenced Jan 5, 2026

Feature/aggregate rework #537

Closed

Improve clustering #533

Closed

Feature/better tsam integration #530

Closed
