@hjn0415a hjn0415a commented Feb 3, 2026

This PR adds a new GO Enrichment Analysis page for ProteomicsLFQ results.
The page allows users to perform GO term enrichment (BP, CC, MF) based on protein-level differential abundance results.

  • Added a new Streamlit results page: results_proteomicslfq.py
  • Integrated GO enrichment analysis using MyGene.info for GO annotation
  • Foreground proteins are selected based on configurable p-value and |log2FC| thresholds
  • Enrichment is computed using Fisher’s exact test
  • Results are visualized as bar plots and tables, separated by GO category (BP / CC / MF)
  • Added mygene as a new dependency
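
The foreground selection and enrichment test listed above can be sketched as follows. This is a minimal standard-library illustration, not the code from results_proteomicslfq.py: the row layout, the function names select_foreground and enrichment_p_value, the default cutoffs, the comparison operators, and the choice of a one-sided (over-representation) test are all assumptions. The PR itself calls scipy.stats.fisher_exact, whose default alternative is two-sided.

```python
from math import comb


def select_foreground(rows, p_cutoff=0.05, fc_cutoff=1.0):
    """Return IDs of proteins passing both thresholds (hypothetical row layout)."""
    return {
        row["protein"]
        for row in rows
        if row["p_value"] <= p_cutoff and abs(row["log2fc"]) >= fc_cutoff
    }


def enrichment_p_value(k, n, K, N):
    """One-sided Fisher's exact p-value (hypergeometric upper tail).

    k: foreground proteins annotated with the GO term
    n: foreground size
    K: background proteins annotated with the GO term
    N: background size (the foreground is assumed to be part of it)
    """
    total = comb(N, n)
    upper = min(n, K)
    # Probability of seeing k or more annotated proteins in a random draw of n.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

For a term carried by 2 of 2 foreground proteins and by 2 of 4 background proteins, enrichment_p_value(2, 2, 2, 4) evaluates to 1/6.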

Summary by CodeRabbit

  • New Features

    • Added GO Terms page to the results interface.
    • Introduced GO Enrichment Analysis workflow with adjustable p-value and log2FC cutoffs.
    • Displays a protein-level abundance table alongside GO enrichment results, with a separate tab and visualization for each of Biological Process, Cellular Component, and Molecular Function.
  • Chores

    • Added mygene dependency.

coderabbitai bot commented Feb 3, 2026

📝 Walkthrough

This PR adds a new Proteomics LFQ results interface with GO enrichment analysis capabilities. Users can view protein abundance data and perform enrichment analysis on significant proteins using Fisher's exact test, with customizable p-value and log2FC thresholds and results visualization across biological process, cellular component, and molecular function categories.

Changes

Cohort / File(s) | Summary
Application Navigation (app.py) | Added a new "GO Terms" page entry in the Results section pointing to the Proteomics LFQ results module.
Proteomics Results Interface (content/results_proteomicslfq.py) | New module implementing an interactive Proteomics LFQ results page: a protein abundance table, a GO enrichment workflow using Fisher's exact test, GO term retrieval for UniProt accessions via the MyGene.info API, and visualizations of enriched GO terms across three categories (BP, CC, MF).
Dependencies (requirements.txt) | Added the mygene package to support GO term retrieval during GO enrichment analysis.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Module as Proteomics LFQ Module
    participant MyGene as MyGene.info API
    participant Fisher as Fisher's Exact Test
    participant Viz as Visualization Engine

    User->>Module: Load page & set p-value/log2FC cutoffs
    Module->>Module: Filter significant proteins
    Module->>MyGene: Fetch GO terms for UniProt protein IDs
    MyGene-->>Module: Return GO annotations
    Module->>Module: Build background & foreground GO sets
    Module->>Fisher: Compute enrichment per GO term
    Fisher-->>Module: Return p-values & statistics
    Module->>Module: Aggregate results by category (BP/CC/MF)
    Module->>Viz: Render top 15 GO terms (bar plots)
    Viz-->>User: Display enrichment results & tables
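
The "build background & foreground GO sets" step in the diagram amounts to counting, per GO term, how many foreground and background proteins carry it. A hedged standard-library sketch follows; the annotation mapping, the function names, and the table orientation are assumptions made for illustration, and the foreground is assumed to be a subset of the background.

```python
from collections import defaultdict


def term_counts(annotations, proteins):
    """Count how many of `proteins` are annotated with each GO term."""
    counts = defaultdict(int)
    for protein in proteins:
        for term in annotations.get(protein, ()):
            counts[term] += 1
    return dict(counts)


def contingency_tables(annotations, foreground, background):
    """Build one 2x2 table per GO term seen in the foreground.

    Rows: foreground / rest of background.
    Columns: annotated with the term / not annotated.
    """
    fg = term_counts(annotations, foreground)
    bg = term_counts(annotations, background)
    n_fg = len(foreground)
    n_rest = len(background) - n_fg
    tables = {}
    for term, k in fg.items():
        rest_with_term = bg[term] - k
        tables[term] = [
            [k, n_fg - k],
            [rest_with_term, n_rest - rest_with_term],
        ]
    return tables
```

Each table has the 2x2 shape that scipy.stats.fisher_exact accepts, so the per-term test can be run directly on the values of the returned dictionary.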

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The pull request title clearly and concisely describes the main change: adding a GO enrichment analysis page for ProteomicsLFQ results, which aligns with the substantial additions in the changeset.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@content/results_proteomicslfq.py`:
- Line 8: The import list includes an unused symbol ttest_ind; remove ttest_ind
from the import statement that currently reads "from scipy.stats import
ttest_ind, fisher_exact" and only import the used function (fisher_exact) so the
file no longer imports unused symbols.
- Around lines 59-64: Replace the broad "except Exception" in get_clean_uniprot
with the specific exception types this parsing can plausibly raise, e.g. "except
(IndexError, TypeError, AttributeError)", so that unrelated errors are not
masked; keep returning None in those cases and let all other exceptions
propagate.
🧹 Nitpick comments (6)
requirements.txt (1)

152-152: Consider pinning the mygene version for reproducibility.

The dependency is added without a version constraint, which could lead to unexpected behavior if the upstream API changes. This is consistent with other manually-added dependencies in this file, but constraining the version would improve build reproducibility: an exact pin (e.g., mygene==3.2.2) fixes one known working release, while a lower bound (e.g., mygene>=3.2.2) only excludes older releases.

content/results_proteomicslfq.py (5)

50-51: Consider using @st.fragment for the GO enrichment workflow.

Per coding guidelines, interactive UI updates should use @st.fragment decorator to avoid full page reloads. The enrichment analysis triggered by the button could benefit from being wrapped in a fragment function.

Example refactor approach
@st.fragment
def run_enrichment_analysis(analysis_df, p_cutoff, fc_cutoff):
    # Move the enrichment logic (lines 52-141) into this function
    ...

# Then call it conditionally
if st.button("Run GO Enrichment"):
    run_enrichment_analysis(pivot_df.dropna(subset=["p-value", "log2FC"]).copy(), p_cutoff, fc_cutoff)

As per coding guidelines: "Use @st.fragment decorator for interactive UI updates without full page reloads".


79-80: Use idiomatic pandas filtering for boolean column.

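The comparison != True works but is non-idiomatic. For pandas boolean columns, prefer the bitwise negation operator. If the column holds missing values alongside True, fill them and cast to bool first, so that ~ operates on a genuine boolean mask rather than on an object column.

Proposed fix
 if "notfound" in res.columns:
-    res = res[res["notfound"] != True]
+    res = res[~res["notfound"].fillna(False).astype(bool)]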

90-91: Lambda captures loop variable - potential late binding issue.

While safe here because .apply() executes immediately, capturing go_type in a lambda within a loop is a code smell that could cause bugs if refactored.

Proposed fix using default argument binding
 for go_type in ["BP", "CC", "MF"]:
-    res[f"{go_type}_terms"] = res["go"].apply(lambda x: extract_go_terms(x, go_type))
+    res[f"{go_type}_terms"] = res["go"].apply(lambda x, gt=go_type: extract_go_terms(x, gt))
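
The late-binding behavior mentioned here can be reproduced in isolation. The snippet below is a generic illustration and is not taken from the PR; it only reuses the go_type names for familiarity.

```python
go_types = ["BP", "CC", "MF"]

# Late binding: each lambda looks up go_type when it is called,
# which happens only after the loop has finished.
late = []
for go_type in go_types:
    late.append(lambda: go_type)

# Default-argument binding: gt is evaluated once, when the lambda is created.
bound = []
for go_type in go_types:
    bound.append(lambda gt=go_type: gt)

late_results = [f() for f in late]    # ['MF', 'MF', 'MF']
bound_results = [f() for f in bound]  # ['BP', 'CC', 'MF']
```

With .apply() the lambda runs inside the same loop iteration, which is why the original code behaves correctly; the default-argument form simply stays correct if the call is ever deferred.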

128-128: Add strict=True to zip() for safer iteration.

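Using strict=True (available from Python 3.10) makes zip() raise ValueError when the iterables differ in length instead of silently truncating to the shortest one, so a mismatch surfaces early.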

Proposed fix
-for tab, go_type in zip([bp_tab, cc_tab, mf_tab], ["BP", "CC", "MF"]):
+for tab, go_type in zip([bp_tab, cc_tab, mf_tab], ["BP", "CC", "MF"], strict=True):

140-141: Consider logging the full exception for debugging.

While catching exceptions broadly for UI robustness is acceptable, logging the traceback would aid debugging production issues.

Proposed fix
+import logging
+
+logger = logging.getLogger(__name__)
+
 # ... at the exception handler:
 except Exception as e:
+    logger.exception("GO enrichment failed")
     st.error(f"GO enrichment failed: {e}")
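
What logger.exception adds over the bare error message can be seen in a Streamlit-free sketch. The logger name, the in-memory stream, and the run_safely helper are assumptions for demonstration; in the app the message would go to st.error instead of being returned.

```python
import io
import logging

# Capture log output in memory so it can be inspected (demo setup only).
stream = io.StringIO()
logger = logging.getLogger("go_enrichment_demo")
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False


def run_safely(func):
    """Run func, logging the full traceback and returning a short message on failure."""
    try:
        return func()
    except Exception as exc:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("GO enrichment failed")
        return f"GO enrichment failed: {exc}"


message = run_safely(lambda: 1 / 0)
log_text = stream.getvalue()
```

The user-facing message stays short, while log_text contains the traceback and the exception type needed for debugging.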

import plotly.express as px
import mygene
from collections import defaultdict
from scipy.stats import ttest_ind, fisher_exact

⚠️ Potential issue | 🟡 Minor

Remove unused import ttest_ind.

The ttest_ind function is imported but never used in this file.

Proposed fix
-from scipy.stats import ttest_ind, fisher_exact
+from scipy.stats import fisher_exact

Comment on lines +59 to +64
def get_clean_uniprot(name):
    try:
        parts = str(name).split("|")
        return parts[1] if len(parts) >= 2 else parts[0]
    except Exception:
        return None

⚠️ Potential issue | 🟡 Minor

Use specific exception types instead of a broad except Exception.

The broad exception handler could mask unexpected errors. Since this parsing logic can only reasonably fail with IndexError or similar, catch specific exceptions.

Proposed fix
 def get_clean_uniprot(name):
     try:
         parts = str(name).split("|")
         return parts[1] if len(parts) >= 2 else parts[0]
-    except Exception:
+    except (IndexError, TypeError, AttributeError):
         return None
🧰 Tools
🪛 Ruff (0.14.14)

[warning] 63-63: Do not catch blind exception: Exception

(BLE001)
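
To make the discussion concrete, this is what the parsing returns for a few inputs. The function body is the one from the PR, shown without the try/except wrapper; the sample identifiers are made up, and the db|ACCESSION|NAME layout is an assumption inferred from the split on "|".

```python
def get_clean_uniprot(name):
    # Same parsing as in the PR, shown without the try/except wrapper.
    parts = str(name).split("|")
    return parts[1] if len(parts) >= 2 else parts[0]


accession = get_clean_uniprot("sp|P12345|EXAMPLE_HUMAN")  # middle field
plain = get_clean_uniprot("P12345")                       # no separator: returned unchanged
missing = get_clean_uniprot(None)                         # str(None) is "None"
```

A missing name does not reach the except branch: str(None) is the string "None", which is returned as if it were an accession. Filtering missing values before parsing may therefore matter as much as the choice of exception types.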
