Adjusted readme, unified fm and fs calculation. Added stability mode. #16

Open
Conscht wants to merge 2 commits into Kainmueller-Lab:master from Conscht:dev_lisa

Conscht commented Jan 29, 2026

Summary

This PR introduces three major improvements:

  1. Improved README and documentation
  2. Unified false split (FS) and false merge (FM) calculations
  3. Added stability mode to the evaluation loop

Motivation

  • The README previously did not fully explain how to run the package as a module or how to use the Python API.
  • FM/FS calculations were implemented via two separate code paths, leading to duplicated logic.
  • Following the paper’s recommendations, models should be evaluated across three independent runs to assess robustness.

Documentation (README)

  • Adds a clearer installation section, including a uv-based workflow.
  • Expands CLI usage examples.
  • Adds Python API examples using evaluate_file and evaluate_volume.
  • Introduces a dedicated Stability & Robustness Mode section, documenting:
    • CLI flags: --stability_mode, --run_dirs
    • Expected output directory structure
    • The requirement of exactly three runs

Code Changes (evalinstseg/compute.py)

  • Imports get_m2m_matches and introduces a shared helper (compute_m2m_stats) to compute FM/FS in a single place.
  • Adds get_m2m_metrics(...) to compute (FM, FS, matches) in a unified way for many-to-many matching.
  • Preserves compatibility with existing FM/FS entry points while reducing duplicated code.
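
As a rough illustration of why a single helper can serve both metrics, the sketch below derives FM and FS from one list of many-to-many matches. It is not the code in evalinstseg/compute.py. The function name `count_fm_fs`, the `(gt_label, pred_label)` pair format, and the counting convention (each extra partner beyond the first counts as one error) are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Set, Tuple


def count_fm_fs(matches: Iterable[Tuple[Hashable, Hashable]]) -> Tuple[int, int]:
    """Return (false_merges, false_splits) from (gt_label, pred_label) pairs.

    Assumed convention, not confirmed by the PR:
    - a prediction matched to k ground-truth instances adds k - 1 false merges
    - a ground-truth instance matched to k predictions adds k - 1 false splits
    """
    gt_to_pred: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    pred_to_gt: Dict[Hashable, Set[Hashable]] = defaultdict(set)
    # One pass over the matches fills both lookup tables.
    for gt_label, pred_label in matches:
        gt_to_pred[gt_label].add(pred_label)
        pred_to_gt[pred_label].add(gt_label)
    false_splits = sum(len(preds) - 1 for preds in gt_to_pred.values())
    false_merges = sum(len(gts) - 1 for gts in pred_to_gt.values())
    return false_merges, false_splits


# Ground truth 1 is split across predictions 10 and 11;
# prediction 12 merges ground truths 2, 3 and 4.
example = [(1, 10), (1, 11), (2, 12), (3, 12), (4, 12)]
fm, fs = count_fm_fs(example)
```

Both counts come from the same pass over the matches, which is the kind of duplication a shared helper removes.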

Notes

  • Stability mode is currently fixed to three runs, matching the evaluation protocol described in the paper.
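
A minimal sketch of what aggregating over the three runs might look like, assuming each run yields a flat dictionary of scalar metrics. The PR description does not say which statistics the package reports, so the mean and sample standard deviation used here, the function name `aggregate_runs`, and the metric key in the usage lines are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List

EXPECTED_RUNS = 3  # matches the fixed three-run protocol noted above


def aggregate_runs(run_metrics: List[Dict[str, float]]) -> Dict[str, Dict[str, float]]:
    """Return the mean and sample standard deviation per metric over three runs."""
    if len(run_metrics) != EXPECTED_RUNS:
        raise ValueError(f"expected {EXPECTED_RUNS} runs, got {len(run_metrics)}")
    keys = set(run_metrics[0])
    # Every run must report the same metrics, otherwise the statistics mislead.
    if any(set(metrics) != keys for metrics in run_metrics):
        raise ValueError("all runs must report the same metric names")
    return {
        key: {
            "mean": mean(metrics[key] for metrics in run_metrics),
            "std": stdev(metrics[key] for metrics in run_metrics),
        }
        for key in sorted(keys)
    }


# Hypothetical metric name and values, for illustration only.
summary = aggregate_runs([{"f1": 0.5}, {"f1": 0.6}, {"f1": 0.7}])
```

The sample standard deviation (n - 1 denominator) is one reasonable choice for a small number of runs; the population form would be an equally valid convention if stated.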

Conscht marked this pull request as ready for review February 5, 2026 13:56