feat: add GO enrichment analysis page for ProteomicsLFQ results #7

hjn0415a · 2026-02-03T07:18:50Z

This PR adds a new GO Enrichment Analysis page for ProteomicsLFQ results.
The page allows users to perform GO term enrichment (BP, CC, MF) based on protein-level differential abundance results.

- Added a new Streamlit results page: results_proteomicslfq.py
- Integrated GO enrichment analysis using MyGene.info for GO annotation
- Foreground proteins are selected based on configurable p-value and |log2FC| thresholds
- Enrichment is computed using Fisher’s exact test
- Results are visualized as bar plots and tables, separated by GO category (BP / CC / MF)
- Added mygene as a new dependency
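
The foreground selection and the enrichment test listed above can be sketched in plain Python. This is only an illustration under stated assumptions, not the PR's code: the row layout, the inclusive thresholds, the default cutoffs, and the helper names `select_foreground` and `fisher_greater` are hypothetical. The p-value is the one-sided Fisher's exact test written as a hypergeometric upper tail; `scipy.stats.fisher_exact` with `alternative="greater"` computes the same quantity from a 2x2 table.

```python
from math import comb


def select_foreground(rows, p_cutoff=0.05, fc_cutoff=1.0):
    """Return protein IDs passing both the p-value and |log2FC| thresholds."""
    return {
        r["protein"]
        for r in rows
        if r["p-value"] <= p_cutoff and abs(r["log2FC"]) >= fc_cutoff
    }


def fisher_greater(k, n, K, N):
    """One-sided Fisher's exact p-value, P(X >= k), under the hypergeometric null.

    k: foreground proteins annotated with the term
    n: foreground size
    K: background proteins annotated with the term
    N: background size
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy differential-abundance table (made-up values, for illustration only).
rows = [
    {"protein": "P1", "p-value": 0.001, "log2FC": 2.0},
    {"protein": "P2", "p-value": 0.010, "log2FC": -1.5},
    {"protein": "P3", "p-value": 0.030, "log2FC": 1.2},
    {"protein": "P4", "p-value": 0.200, "log2FC": 3.0},   # fails the p-value cutoff
    {"protein": "P5", "p-value": 0.001, "log2FC": 0.2},   # fails the |log2FC| cutoff
    {"protein": "P6", "p-value": 0.500, "log2FC": -0.1},
]

foreground = select_foreground(rows)
background = {r["protein"] for r in rows}
term_members = {"P1", "P2", "P3"}  # proteins annotated with one hypothetical GO term

p_value = fisher_greater(
    k=len(foreground & term_members),
    n=len(foreground),
    K=len(background & term_members),
    N=len(background),
)
print(sorted(foreground), p_value)
```

With all three foreground proteins carrying the term, only one of the 20 possible draws of 3 from 6 is as extreme, so the tail probability is 1/20.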

Summary by CodeRabbit

New Features
- Added GO Terms page to the results interface.
- Introduced GO Enrichment Analysis workflow with adjustable p-value and log2FC cutoffs.
- Displays a protein-level abundance table and GO enrichment results across Biological Process, Cellular Component, and Molecular Function tabs, with visualizations.
Chores
- Added mygene dependency.

coderabbitai · 2026-02-03T07:22:16Z

📝 Walkthrough

This PR adds a new Proteomics LFQ results interface with GO enrichment analysis capabilities. Users can view protein abundance data and perform enrichment analysis on significant proteins using Fisher's exact test, with customizable p-value and log2FC thresholds and results visualization across biological process, cellular component, and molecular function categories.

Changes

Cohort / File(s)	Summary
Application Navigation `app.py`	Added new "GO Terms" page entry in the Results section pointing to the Proteomics LFQ results module.
Proteomics Results Interface `content/results_proteomicslfq.py`	New module implementing interactive Proteomics LFQ results with protein abundance table, GO enrichment analysis workflow using Fisher's exact test, GO term fetching for UniProt IDs via the MyGene.info API, and visualizations of enriched GO terms across three categories (BP, CC, MF).
Dependencies `requirements.txt`	Added `mygene` package to support GO annotation retrieval during GO enrichment analysis.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Module as Proteomics LFQ Module
    participant MyGene as MyGene.info API
    participant Fisher as Fisher's Exact Test
    participant Viz as Visualization Engine

    User->>Module: Load page & set p-value/log2FC cutoffs
    Module->>Module: Filter significant proteins
    Module->>MyGene: Fetch GO terms for UniProt protein IDs
    MyGene-->>Module: Return GO annotations
    Module->>Module: Build background & foreground GO sets
    Module->>Fisher: Compute enrichment per GO term
    Fisher-->>Module: Return p-values & statistics
    Module->>Module: Aggregate results by category (BP/CC/MF)
    Module->>Viz: Render top 15 GO terms (bar plots)
    Viz-->>User: Display enrichment results & tables
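
The "Build background & foreground GO sets" step in the diagram can be sketched as follows. This is a hedged illustration, not the PR's implementation: `extract_go_terms` is named in the review, but its body here is a guess, `build_term_index` is a hypothetical helper, and the annotation layout (a per-protein dict keyed by "BP", "CC", "MF", where each value is either a list of entries or a single entry) is an assumption about what the annotation service returns.

```python
from collections import defaultdict


def extract_go_terms(go_field, go_type):
    """Return the term names of one category ("BP", "CC" or "MF") for one protein."""
    if not isinstance(go_field, dict):
        return []  # missing annotation, e.g. None or NaN
    entries = go_field.get(go_type, [])
    if isinstance(entries, dict):
        entries = [entries]  # a single annotation may arrive as a bare dict
    return [e["term"] for e in entries if isinstance(e, dict) and "term" in e]


def build_term_index(annotations, go_type):
    """Map each GO term to the set of protein IDs annotated with it."""
    index = defaultdict(set)
    for protein, go_field in annotations.items():
        for term in extract_go_terms(go_field, go_type):
            index[term].add(protein)
    return index


# Toy annotations in the assumed layout (protein IDs are placeholders).
annotations = {
    "P1": {
        "BP": [{"id": "GO:0006412", "term": "translation"}],
        "CC": {"id": "GO:0005737", "term": "cytoplasm"},
    },
    "P2": {"BP": [{"id": "GO:0006412", "term": "translation"}]},
    "P3": None,  # no annotation found
}

bp_index = build_term_index(annotations, "BP")
cc_index = build_term_index(annotations, "CC")
```

Intersecting each term's protein set with the foreground and the background then gives the four counts needed for one enrichment test per term.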

Poem

🐰 Hops of joy through GO's grand garden,
Enrichment blooms where proteins dance,
Fisher's wisdom guides each test,
MyGene whispers UniProt's tale,
Now proteomics takes its bow ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The pull request title clearly and concisely describes the main change: adding a GO enrichment analysis page for ProteomicsLFQ results, which aligns with the substantial additions in the changeset.

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@content/results_proteomicslfq.py`:
- Line 8: The import list includes an unused symbol ttest_ind; remove ttest_ind
from the import statement that currently reads "from scipy.stats import
ttest_ind, fisher_exact" and only import the used function (fisher_exact) so the
file no longer imports unused symbols.
- Around line 59-64: Replace the broad "except Exception" in get_clean_uniprot with specific
exception types that reflect possible failures (e.g., catch IndexError and
TypeError) so you don't mask unrelated errors; update the handler to "except
(IndexError, TypeError)" (or a similar specific tuple) and keep returning
None on those cases while letting other exceptions propagate.

🧹 Nitpick comments (6)

requirements.txt (1)

152-152: Consider pinning the mygene version for reproducibility.

The dependency is added without a version constraint, which could lead to unexpected behavior if the upstream API changes. This is consistent with other manually-added dependencies in this file, but pinning to a known working version (e.g., mygene==3.2.2), or at least setting a lower bound (mygene>=3.2.2), would improve build reproducibility.

content/results_proteomicslfq.py (5)

50-51: Consider using @st.fragment for the GO enrichment workflow.

Per coding guidelines, interactive UI updates should use @st.fragment decorator to avoid full page reloads. The enrichment analysis triggered by the button could benefit from being wrapped in a fragment function.
Example refactor approach

@st.fragment
def run_enrichment_analysis(analysis_df, p_cutoff, fc_cutoff):
    # Move the enrichment logic (lines 52-141) into this function
    ...

# Then call it conditionally
if st.button("Run GO Enrichment"):
    run_enrichment_analysis(pivot_df.dropna(subset=["p-value", "log2FC"]).copy(), p_cutoff, fc_cutoff)

As per coding guidelines: "Use @st.fragment decorator for interactive UI updates without full page reloads".

79-80: Use idiomatic pandas filtering for boolean column.

The comparison != True works but is non-idiomatic. For pandas boolean columns, prefer the bitwise negation operator.
Proposed fix
 if "notfound" in res.columns:
-    res = res[res["notfound"] != True]
+    res = res[~res["notfound"].fillna(False).astype(bool)]
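
A small sketch of this filter on a toy frame, assuming the "notfound" column holds True for unmatched queries and NaN otherwise (the column contents and the query values are illustrative assumptions). The mask is cast to a real boolean dtype before `~` is applied, because on an object-dtype column `~` is not guaranteed to act as logical negation.

```python
import pandas as pd

# Toy lookup result: the second query was not matched.
res = pd.DataFrame(
    {
        "query": ["P02768", "Q00000", "P68871"],
        "notfound": [float("nan"), True, float("nan")],
    }
)

if "notfound" in res.columns:
    # NaN means "found"; fill it, then force a bool dtype before negating.
    missing = res["notfound"].fillna(False).astype(bool)
    res = res[~missing]

kept_queries = res["query"].tolist()
```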
90-91: Lambda captures loop variable - potential late binding issue.

While safe here because .apply() executes immediately, capturing go_type in a lambda within a loop is a code smell that could cause bugs if refactored.
Proposed fix using default argument binding
 for go_type in ["BP", "CC", "MF"]:
-    res[f"{go_type}_terms"] = res["go"].apply(lambda x: extract_go_terms(x, go_type))
+    res[f"{go_type}_terms"] = res["go"].apply(lambda x, gt=go_type: extract_go_terms(x, gt))
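
The late-binding behavior is easiest to see when the lambdas are stored and called after the loop has finished. The snippet below is a standalone illustration (the tuple-returning lambdas are stand-ins, not the PR's code): without the default argument, every stored lambda reads the final value of `go_type`.

```python
late, bound = [], []
for go_type in ["BP", "CC", "MF"]:
    # Looks up go_type when called, i.e. after the loop has ended.
    late.append(lambda x: (go_type, x))
    # Default argument freezes the current value at definition time.
    bound.append(lambda x, gt=go_type: (gt, x))

late_results = [f("term") for f in late]
bound_results = [f("term") for f in bound]

print(late_results)   # every entry reports "MF"
print(bound_results)  # one entry per category
```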
128-128: Add strict=True to zip() for safer iteration.

Using strict=True ensures the iterables have the same length, catching potential bugs early.
Proposed fix
-for tab, go_type in zip([bp_tab, cc_tab, mf_tab], ["BP", "CC", "MF"]):
+for tab, go_type in zip([bp_tab, cc_tab, mf_tab], ["BP", "CC", "MF"], strict=True):
140-141: Consider logging the full exception for debugging.

While catching exceptions broadly for UI robustness is acceptable, logging the traceback would aid debugging production issues.
Proposed fix
+import logging
+
+logger = logging.getLogger(__name__)
+
 # ... at the exception handler:
 except Exception as e:
+    logger.exception("GO enrichment failed")
     st.error(f"GO enrichment failed: {e}")
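
To show what `logger.exception` adds over the user-facing message, here is a self-contained sketch that captures the log output in memory. The logger name, the `run_enrichment` stand-in, and the `user_message` variable are hypothetical; in the page itself the message would go to `st.error`.

```python
import io
import logging

# Route this demo logger to an in-memory stream so the output can be inspected.
stream = io.StringIO()
logger = logging.getLogger("go_enrichment_demo")
logger.addHandler(logging.StreamHandler(stream))
logger.setLevel(logging.ERROR)
logger.propagate = False


def run_enrichment():
    """Stand-in for the enrichment workflow; always fails."""
    raise ValueError("no annotated proteins")


try:
    run_enrichment()
except Exception as e:
    # logger.exception logs at ERROR level and appends the current traceback.
    logger.exception("GO enrichment failed")
    user_message = f"GO enrichment failed: {e}"

log_text = stream.getvalue()
```

The user sees only the short message, while the log keeps the full traceback for debugging.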

coderabbitai · 2026-02-03T07:22:19Z

content/results_proteomicslfq.py

+import plotly.express as px
+import mygene
+from collections import defaultdict
+from scipy.stats import ttest_ind, fisher_exact


⚠️ Potential issue | 🟡 Minor

Remove unused import ttest_ind.

The ttest_ind function is imported but never used in this file.

Proposed fix

-from scipy.stats import ttest_ind, fisher_exact
+from scipy.stats import fisher_exact

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

from scipy.stats import ttest_ind, fisher_exact

from scipy.stats import fisher_exact

🤖 Prompt for AI Agents

In `@content/results_proteomicslfq.py` at line 8, The import list includes an unused symbol ttest_ind; remove ttest_ind from the import statement that currently reads "from scipy.stats import ttest_ind, fisher_exact" and only import the used function (fisher_exact) so the file no longer imports unused symbols.

coderabbitai · 2026-02-03T07:22:19Z

content/results_proteomicslfq.py

+                    def get_clean_uniprot(name):
+                        try:
+                            parts = str(name).split("|")
+                            return parts[1] if len(parts) >= 2 else parts[0]
+                        except Exception:
+                            return None


⚠️ Potential issue | 🟡 Minor

Use specific exception types instead of a broad Exception handler.

The broad exception handler could mask unexpected errors. Since this parsing logic can only reasonably fail with IndexError or similar, catch specific exceptions.

Proposed fix

 def get_clean_uniprot(name):
     try:
         parts = str(name).split("|")
         return parts[1] if len(parts) >= 2 else parts[0]
-    except Exception:
+    except (IndexError, TypeError, AttributeError):
         return None

🧰 Tools

🪛 Ruff (0.14.14)

[warning] 63-63: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents

In `@content/results_proteomicslfq.py` around lines 59 - 64, Replace the broad "except Exception" in get_clean_uniprot with specific exception types that reflect possible failures (e.g., catch IndexError and TypeError) so you don't mask unrelated errors; update the handler to "except (IndexError, TypeError)" (or a similar specific tuple) and keep returning None on those cases while letting other exceptions propagate.
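
For reference, the parsing itself behaves as follows on a few inputs. This sketch reproduces only the two parsing lines from the snippet, without the try/except, and assumes protein names follow the `db|accession|entry` layout. Because `str()` accepts any value, missing inputs do not raise here; they come back as the literal strings "None" or "nan", which a caller may want to filter out separately.

```python
def get_clean_uniprot(name):
    """Extract the accession from a pipe-delimited protein name."""
    parts = str(name).split("|")
    return parts[1] if len(parts) >= 2 else parts[0]


full_header = get_clean_uniprot("sp|P02768|ALBU_HUMAN")  # accession is the 2nd field
bare_accession = get_clean_uniprot("P02768")             # no pipes: returned as-is
from_none = get_clean_uniprot(None)                      # str(None) == "None"
from_nan = get_clean_uniprot(float("nan"))               # str(nan) == "nan"
```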

feat: add GO enrichment analysis page for ProteomicsLFQ results

cc70184

coderabbitai bot reviewed Feb 3, 2026
