Skip to content

Cleanup patch code#999

Merged
spoorcc merged 2 commits intomainfrom
patch-class
Feb 21, 2026
Merged

Cleanup patch code#999
spoorcc merged 2 commits intomainfrom
patch-class

Conversation

@spoorcc
Copy link
Contributor

@spoorcc spoorcc commented Feb 15, 2026

cleanup patch code and make a more coherent oo-interface

Summary by CodeRabbit

  • Refactor

    • Introduced a unified, typed patch API for all VCS operations (creation, combination, filtering, reversing, serialization), replacing legacy patch utilities and standardizing patch handling.
  • Tests

    • Test suite updated to use the new patch API and validate patch generation, prefixing, reversal, and application via the unified interface.
  • Chores

    • Development container updated to a newer pip version.

@coderabbitai
Copy link

coderabbitai bot commented Feb 15, 2026

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. By default, CodeRabbit skips reviewing PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • ✅ Review completed - (🔄 Check again to review again)

Walkthrough

Replaced numerous procedural patch utilities with a typed Patch API (Patch, PatchType) and updated callers across commands, VCS integrations, project code, and tests to create, transform, combine, and apply Patch objects instead of operating on raw strings/bytes or legacy helpers.

Changes

Cohort / File(s) Summary
Core Patch API
dfetch/vcs/patch.py
Added PatchType enum and Patch class with static constructors (from_file, from_bytes, from_string, for_new_files, empty) and instance ops (apply, dump, reverse, filter, add_prefix, convert_type, combine). Removed many legacy top-level helpers (parse/dump/filter/apply/create_*_for_new_file/add_prefix_to_patch/convert_patch_to).
Command: format_patch
dfetch/commands/format_patch.py
Iterate patch file paths as patch_file, construct Patch.from_file(...), determine PatchType, call .convert_type(...), .add_prefix(...), and .dump(); output filenames derive from original patch_file. _determine_target_patch_type now returns PatchType.
SVN/Git VCS integrations
dfetch/vcs/svn.py, dfetch/vcs/git.py
VCS diff helpers now return Patch objects (use Patch.from_bytes(...).filter(ignore) or Patch.empty().convert_type(PatchType.SVN) for SVN; use Patch.for_new_files(..., PatchType.GIT) for Git untracked files).
Superprojects
dfetch/project/svnsuperproject.py, dfetch/project/gitsuperproject.py
Switched to Patch-based diff composition: use Patch.for_new_files(...), Patch.combine(...), .reverse() and .dump(); updated truthiness checks to .is_empty().
Subproject / apply flow
dfetch/project/subproject.py
Replaced apply_patch(...) with Patch.from_file(...).apply(...); removed abstract create_formatted_patch_header from SubProject; imports updated to Patch.
Subproject implementations
dfetch/project/gitsubproject.py, dfetch/project/svnsubproject.py
Removed PatchInfo dependency and deleted create_formatted_patch_header implementations.
Tests
tests/test_patch.py, tests/test_subproject.py
Updated tests to use Patch/PatchType APIs (for_new_files/from_file(...).add_prefix(...).dump(), .apply()); removed tests relying on removed helper functions and removed create_formatted_patch_header override.
Devcontainer
.devcontainer/Dockerfile
Bumped pip version from 25.3 to 26.0.1.
Feature files
features/*
Minor patch-hunk header formatting adjustments (@@ -0,0 +1,1 @@) in scenarios.

Sequence Diagram(s)

sequenceDiagram
  participant Cmd as format_patch command
  participant FS as Filesystem
  participant Patch as dfetch.vcs.patch.Patch
  participant SubP as SubProject / SuperProject
  participant Output as Patch file writer

  rect rgba(100,149,237,0.5)
  Cmd->>FS: iterate patch_file paths
  FS-->>Cmd: return patch_file
  end

  rect rgba(60,179,113,0.5)
  Cmd->>Patch: Patch.from_file(patch_file)
  Patch-->>Cmd: Patch instance
  Cmd->>Patch: .convert_type(target: PatchType)
  Cmd->>Patch: .add_prefix(prefix)
  Cmd->>Patch: .dump()
  Patch-->>Output: write patch text
  end

  rect rgba(255,165,0,0.5)
  SubP->>Patch: Patch.for_new_files(paths, PatchType)
  Patch-->>SubP: Patch instance
  SubP->>Patch: .combine() / .reverse()
  Patch-->>Output: final dump()
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

enhancement, testing

Suggested reviewers

  • ben-edna
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Cleanup patch code' is vague and generic; while it relates to the changeset, it does not clearly convey the primary change—introducing a new Patch class API to replace legacy patch utilities. Consider a more specific title like 'Refactor patch utilities into Patch class API' or 'Extract patch handling into object-oriented Patch interface' to better communicate the main architectural change.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 95.45% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
dfetch/vcs/patch.py (1)

402-424: ⚠️ Potential issue | 🔴 Critical

convert_patch_to is broken: PatchType enum members never equal raw patch_ng constants.

required_type is a PatchType enum, but lines 406, 409, and 419 compare it against patch.type (a raw patch_ng constant) and patch_ng.GIT/patch_ng.SVN directly. Python Enum.__eq__ won't match these, so:

  • Line 406: early-return is never taken (conversion always runs even when unnecessary).
  • Lines 409/419: neither GIT nor SVN branches execute — the function falls through and sets patch.type = required_type (a PatchType enum), which patch_ng won't understand downstream.

Use .value consistently, or compare against PatchType members.

Proposed fix
 def convert_patch_to(
     patch: patch_ng.PatchSet, required_type: PatchType
 ) -> patch_ng.PatchSet:
     """Convert the patch to the required type."""
-    if required_type == patch.type:
+    if required_type.value == patch.type:
         return patch
 
-    if required_type == patch_ng.GIT:
+    if required_type == PatchType.GIT:
         for file in patch.items:
             file.header = [
                 b"diff --git "
                 + _rewrite_path(b"a/", file.source)
                 + b" "
                 + _rewrite_path(b"b/", file.target)
                 + b"\n"
             ]
-            file.type = required_type
-    elif required_type == patch_ng.SVN:
+            file.type = required_type.value
+    elif required_type == PatchType.SVN:
         for file in patch.items:
             file.header = [b"Index: " + file.target + b"\n", b"=" * 67 + b"\n"]
-            file.type = required_type
-    patch.type = required_type
+            file.type = required_type.value
+    patch.type = required_type.value
 
     return patch
🤖 Fix all issues with AI agents
In `@dfetch/project/svnsuperproject.py`:
- Around line 125-132: The code mixes Patch objects and raw strings causing
type/runtime errors: replace uses of Patch.from_file(filtered) (where filtered
is a diff string from create_diff) with Patch.from_bytes(filtered.encode() or
Patch.from_bytes(filtered)) to construct a Patch, change the handling of
Patch.for_new_file (which currently returns str) to return a Patch object (or
wrap its result into Patch.for_new_file_from_str/ Patch.from_bytes) so all
entries in patches are Patch instances, and finally use Patch.combine(patches)
(or call the class/instance method that combines Patch objects) instead of
combine_patches which expects Sequence[bytes]; update or call the corrected
Patch.combine API so combine is invoked on Patch objects rather than passing
list[Patch] to combine_patches.

In `@dfetch/vcs/patch.py`:
- Around line 145-156: combine() currently only appends items from other and
omits self; update Patch.combine so it includes both self._patchset.items and
other items (preserving intended order), e.g. initialize combined_patchset.items
with self._patchset.items (accessing self._patchset.items) and then extend/+=
with each patch._patchset.items from other (still handling the case where other
is a single Patch by wrapping it), then return Patch(combined_patchset); keep
existing use of patch_ng.PatchSet and the Patch constructor and respect the
protected-access pattern used elsewhere.
- Around line 127-133: The filter method creates a new patch_ng.PatchSet but
doesn't preserve the original patchset's type, so is_git/is_svn and dump_patch
can be wrong; update the filter method (function name: filter, class: Patch,
variable: filtered and source: self._patchset) to initialize or assign the new
PatchSet's type from self._patchset (e.g., construct filtered with the same
.type or set filtered.type = self._patchset.type) before appending items so the
filtered Patch preserves the original type.

In `@dfetch/vcs/svn.py`:
- Line 336: Guard against empty patch_text before calling Patch.from_bytes to
avoid RuntimeError: if patch_text is empty (falsy) return the empty result
(e.g., "") rather than calling Patch.from_bytes; change the code around
Patch.from_bytes(patch_text).filter(ignore).dump() to first check patch_text and
short-circuit, otherwise proceed to construct Patch and call filter and dump.
Ensure you reference the patch_text variable and the Patch.from_bytes / filter /
dump chain when making the change so behavior matches previous empty-output
behavior.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
dfetch/vcs/git.py (1)

381-413: Note: GitLocalRepo.create_diff still returns str while SvnRepo.create_diff now returns Patch.

This asymmetry may be intentional (git's diff is consumed differently), but if further unification is planned, it's worth aligning these return types in a follow-up.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/git.py` around lines 381 - 413, The GitLocalRepo.create_diff
currently returns a raw str while SvnRepo.create_diff returns a Patch object, so
make them consistent: update GitLocalRepo.create_diff (the create_diff method in
dfetch/vcs/git.py) to return a Patch (or the same type used by
SvnRepo.create_diff) by converting the git command output into a Patch instance
before returning (e.g., parse or construct a Patch from result.stdout.decode()),
and update the method signature/return annotation accordingly so callers expect
the unified type.
dfetch/vcs/patch.py (3)

185-229: add_prefix mutates self despite docstring claiming it returns a new Patch.

The docstring says "Return new patch with all file paths rewritten" but this method mutates self._patchset.items in-place (modifying file.source, file.target, file.header) and returns self. This is inconsistent with filter(), combine(), and reverse(), which all return genuinely new Patch instances. A caller doing:

original = Patch.from_file(...)
prefixed = original.add_prefix("src")
# original is also mutated here

Current callers aren't bitten because they reassign (patch = patch.add_prefix(...)) or chain on a fresh object, but it's a latent footgun. At minimum, fix the docstring to say "Rewrite file paths in-place and return self for chaining." The same applies to convert_type.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 185 - 229, The add_prefix method currently
mutates self._patchset.items in-place (modifying file.source, file.target and
file.header) but its docstring claims it "Return new patch"; update the
add_prefix docstring to accurately state it rewrites file paths in-place and
returns self for chaining (and do the same for convert_type), referencing the
add_prefix method and the in-place mutations of file.source, file.target and
file.header on self._patchset.items so callers know it does not produce a new
Patch instance.

253-265: combine shares item object references — later mutation of source/combined patch will affect both.

combined_patchset.items += self._patchset.items copies list references, not the items themselves. Since add_prefix and convert_type mutate items in-place, calling .add_prefix() on either the original or combined Patch would corrupt the other. This is safe only if callers never reuse a Patch after combining. Worth documenting or shallow-copying items.

Proposed defensive copy
         combined_patchset = patch_ng.PatchSet()
-        combined_patchset.items += self._patchset.items
+        combined_patchset.items += list(self._patchset.items)
         for patch in other:
-            combined_patchset.items += (
-                patch._patchset.items  # pylint: disable=protected-access
+            combined_patchset.items += list(
+                patch._patchset.items  # pylint: disable=protected-access  
             )

Note: this only gives a new list — the individual patch item objects are still shared. A true deep copy would require copying each item, which may not be worth the complexity if you document the aliasing contract.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 253 - 265, The combine method is currently
concatenating lists by reference (combined_patchset.items +=
self._patchset.items and similar), causing shared list objects between the
original and returned Patch; change combine in class Patch to create new lists
of items (e.g., replace in-place concatenation with list(self._patchset.items)
and extend with list(patch._patchset.items) for each incoming patch) so the
combined Patch has its own items list and later calls like add_prefix or
convert_type on one Patch won't mutate the other's list structure; alternatively
explicitly document the aliasing contract if you prefer to keep references, but
prefer the defensive shallow-copy approach for combined_patchset before
returning.

119-122: Patch.empty() creates a Patch that will crash on dump() but silently succeed, while reverse(), from_bytes() etc. will raise.

Patch.empty() creates a Patch(patch_ng.PatchSet()) with no items. Calling .dump() returns "" (fine), .files returns [] (fine), and .apply() would call self._patchset.apply(...) which may return False and raise RuntimeError. Meanwhile .reverse() would call dump()""reverse_patch(b"")""from_bytes(b"")RuntimeError.

Consider documenting that Patch.empty() is only suitable for sentinel/no-op uses (e.g., as returned by svn.create_diff for empty diffs) and should not be reversed or applied.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 119 - 122, Patch.empty() currently
constructs Patch(patch_ng.PatchSet()) which behaves as a sentinel/no-op for
dump() and files but will raise or fail when used with reverse(), from_bytes(),
or apply() (via _patchset.apply); update the empty() implementation docstring to
clearly state it is only a sentinel/no-op (e.g., suitable for svn.create_diff
empty diffs) and explicitly warn that reverse(), from_bytes(), and apply() are
unsupported and may raise RuntimeError, so callers must not attempt to reverse
or apply Patch.empty() instances.
tests/test_patch.py (1)

304-310: Consider testing Patch.from_file on an invalid/empty file to confirm RuntimeError is raised.

The test suite covers the happy path well. A small addition could verify that Patch.from_file raises RuntimeError for invalid input, which is the expected behavior relied on by subproject.py (immediate failure on bad patches).

Do you want me to generate a test case for this?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_patch.py` around lines 304 - 310, Add a new unit test that
verifies Patch.from_file raises RuntimeError for invalid or empty patch files:
create an empty/invalid temporary file and call Patch.from_file(path) inside
pytest.raises(RuntimeError) (or catch and assert the exception) to ensure the
function errors deterministically; reference Patch.from_file and ensure the test
is colocated in tests/test_patch.py alongside the existing reverse-patch test so
it fails fast if invalid input is provided.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dfetch/project/svnsuperproject.py`:
- Line 120: The code currently uses truthiness of the Patch instance
("filtered") which is always true for dataclass Patch, causing empty patches to
be included and later crash when combined.reverse() calls Patch.from_bytes(b"");
change the check when building patches to detect and exclude empty patches (use
Patch.empty() or equivalent) — e.g. only append filtered if filtered is not None
and not filtered.empty() so that create_diff-produced empty Patch objects are
omitted before any reverse() call; update the logic around the variable filtered
and the patches list to prevent passing an empty Patch into combined.reverse().

---

Nitpick comments:
In `@dfetch/vcs/git.py`:
- Around line 381-413: The GitLocalRepo.create_diff currently returns a raw str
while SvnRepo.create_diff returns a Patch object, so make them consistent:
update GitLocalRepo.create_diff (the create_diff method in dfetch/vcs/git.py) to
return a Patch (or the same type used by SvnRepo.create_diff) by converting the
git command output into a Patch instance before returning (e.g., parse or
construct a Patch from result.stdout.decode()), and update the method
signature/return annotation accordingly so callers expect the unified type.

In `@dfetch/vcs/patch.py`:
- Around line 185-229: The add_prefix method currently mutates
self._patchset.items in-place (modifying file.source, file.target and
file.header) but its docstring claims it "Return new patch"; update the
add_prefix docstring to accurately state it rewrites file paths in-place and
returns self for chaining (and do the same for convert_type), referencing the
add_prefix method and the in-place mutations of file.source, file.target and
file.header on self._patchset.items so callers know it does not produce a new
Patch instance.
- Around line 253-265: The combine method is currently concatenating lists by
reference (combined_patchset.items += self._patchset.items and similar), causing
shared list objects between the original and returned Patch; change combine in
class Patch to create new lists of items (e.g., replace in-place concatenation
with list(self._patchset.items) and extend with list(patch._patchset.items) for
each incoming patch) so the combined Patch has its own items list and later
calls like add_prefix or convert_type on one Patch won't mutate the other's list
structure; alternatively explicitly document the aliasing contract if you prefer
to keep references, but prefer the defensive shallow-copy approach for
combined_patchset before returning.
- Around line 119-122: Patch.empty() currently constructs
Patch(patch_ng.PatchSet()) which behaves as a sentinel/no-op for dump() and
files but will raise or fail when used with reverse(), from_bytes(), or apply()
(via _patchset.apply); update the empty() implementation docstring to clearly
state it is only a sentinel/no-op (e.g., suitable for svn.create_diff empty
diffs) and explicitly warn that reverse(), from_bytes(), and apply() are
unsupported and may raise RuntimeError, so callers must not attempt to reverse
or apply Patch.empty() instances.

In `@tests/test_patch.py`:
- Around line 304-310: Add a new unit test that verifies Patch.from_file raises
RuntimeError for invalid or empty patch files: create an empty/invalid temporary
file and call Patch.from_file(path) inside pytest.raises(RuntimeError) (or catch
and assert the exception) to ensure the function errors deterministically;
reference Patch.from_file and ensure the test is colocated in
tests/test_patch.py alongside the existing reverse-patch test so it fails fast
if invalid input is provided.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
dfetch/vcs/patch.py (1)

189-233: Inconsistent mutation semantics across Patch methods.

add_prefix() and convert_type() mutate self in-place and return self, while filter(), reverse(), and combine() return new Patch instances. This mixed API can surprise callers who expect uniform copy-on-write or uniform in-place behavior.

Current callers work correctly because they reassign (patch = patch.add_prefix(...)), but this inconsistency is a foot-gun for future use.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 189 - 233, The add_prefix method currently
mutates self (via self._patchset and file.source/file.target and file.header)
then returns self, creating an API mismatch with filter()/reverse()/combine()
which return new Patch instances; change add_prefix to be non-mutating by
creating a copy of the Patch (e.g., clone the Patch and/or deep-copy
self._patchset and its File entries), apply the path rewrites to the copy’s
file.source/file.target/file.header using the existing _rewrite_path and regex
logic, and return the new Patch instance; also update convert_type to follow the
same non-mutating pattern so add_prefix and convert_type have the same
copy-on-write semantics as filter/reverse/combine.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dfetch/vcs/patch.py`:
- Around line 115-117: The plain-patch branch is joining diff lines with
"\n".join(diff) which inserts extra blank lines because difflib.unified_diff
already includes trailing "\n"; change the plain fallback to use "".join(diff)
(same approach as the GIT/SVN branches) so the output isn't double-spaced—update
the return in the function handling patch_type (compare with the existing
PatchType.SVN branch) to use "".join(diff) or otherwise avoid adding an extra
delimiter.
- Around line 257-269: The combine() method creates a new patch_ng.PatchSet but
does not copy the original patchset type, causing is_git/is_svn and dump() to be
incorrect; update combine (in the Patch.combine method) to set
combined_patchset.type = self._patchset.type (same approach used in filter())
after creating combined_patchset and before returning the new Patch, ensuring
that when you merge other Patch or Sequence[Patch] instances the resulting Patch
retains the original PatchSet.type.
- Around line 124-126: The is_empty method is inverted: it returns True when
items exist, causing non-empty patches to be treated as empty; update
dfetch.vcs.patch.Patch.is_empty to return the negation of the current expression
(i.e., return not bool(self._patchset.items) or equivalently check
len(self._patchset.items) == 0) so callers like the usage in svnsuperproject.py
correctly treat only truly empty Patch instances as empty.

---

Duplicate comments:
In `@dfetch/project/svnsuperproject.py`:
- Around line 118-125: The condition that initializes patches uses
filtered.is_empty() inverted, causing non-empty diffs to be dropped; change the
logic so that when filtered is NOT empty the list starts with filtered (i.e.,
include filtered when filtered.is_empty() is false) and then append new-file
patches from Patch.for_new_file and Patch.from_string as before; update the
initialization around the symbols filtered, Patch, Patch.for_new_file,
Patch.from_string, and repo.untracked_files (inside in_directory(path)) so the
code includes non-empty filtered patches rather than excluding them.

---

Nitpick comments:
In `@dfetch/vcs/patch.py`:
- Around line 189-233: The add_prefix method currently mutates self (via
self._patchset and file.source/file.target and file.header) then returns self,
creating an API mismatch with filter()/reverse()/combine() which return new
Patch instances; change add_prefix to be non-mutating by creating a copy of the
Patch (e.g., clone the Patch and/or deep-copy self._patchset and its File
entries), apply the path rewrites to the copy’s
file.source/file.target/file.header using the existing _rewrite_path and regex
logic, and return the new Patch instance; also update convert_type to follow the
same non-mutating pattern so add_prefix and convert_type have the same
copy-on-write semantics as filter/reverse/combine.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
dfetch/vcs/patch.py (3)

100-121: for_new_file returns str while every other factory method returns Patch.

Every static constructor (from_file, from_bytes, from_string, empty) returns a Patch. for_new_file returns a raw str, forcing callers to do Patch.from_string(Patch.for_new_file(...)) (as in svnsuperproject.py) while git.py consumes the string directly. Returning a Patch directly would unify the API and remove the intermediate wrapping call.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 100 - 121, The factory method for_new_file
currently returns a raw string while other constructors return a Patch; change
for_new_file to return a Patch by building the same diff string currently
returned and wrapping it with Patch.from_string(...) before returning. Locate
Patch.for_new_file and use the same logic that constructs diff (using
_unified_diff_new_file(path), PatchType checks, _git_mode(path), and
_git_blob_sha1(path)) but return Patch.from_string(the_final_string) instead of
the raw string so callers can receive a Patch directly.

179-185: dump_header does not reference self — should be @staticmethod.

-    def dump_header(self, patch_info: PatchInfo, patch_type: PatchType) -> str:
+    `@staticmethod`
+    def dump_header(patch_info: PatchInfo, patch_type: PatchType) -> str:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 179 - 185, The method dump_header is
defined as an instance method but does not use self; mark it as a static method
by adding the `@staticmethod` decorator and remove the unused self parameter so
the signature becomes dump_header(patch_info: PatchInfo, patch_type: PatchType)
-> str; keep the existing logic that calls patch_info.to_git_header() and
patch_info.to_svn_header() and update any internal calls if they currently
invoke it as an instance method to call it as a static method when necessary.

187-193: reverse() mutates self; combine() returns a new object — pick one convention.

reverse, filter, add_prefix, and convert_type all mutate self in-place and return self. combine constructs and returns a brand-new Patch. This asymmetry is a latent footgun:

a = patch
b = a.filter(ignore)    # b is a; both are now mutated
c = a.combine([other])  # c is a NEW Patch; a is unchanged

All current call sites reassign the result, so there are no bugs today, but the inconsistency will surprise future callers. Consider making all transform methods either all return-new (functional style) or all mutate-and-return-self (fluent style).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 187 - 193, The patch transformation methods
are inconsistent: combine() returns a new Patch while reverse(), filter(),
add_prefix(), and convert_type() currently mutate self; standardize on a
non-mutating functional style by having reverse(), filter(), add_prefix(), and
convert_type() return a new Patch instead of mutating self. For reverse(), stop
assigning to self._patchset — create a new Patch from the transformed bytes (use
Patch.from_bytes or the existing from_bytes factory and the dump()/reverse_patch
call) and return that new instance; apply the same pattern in filter(),
add_prefix(), and convert_type() so they construct and return new Patch objects
rather than modifying self, preserving combine()’s behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dfetch/vcs/patch.py`:
- Around line 251-271: convert_type currently updates self._patchset.type even
when required is not GIT or SVN, leaving per-file file.header and file.type
inconsistent (e.g., when required == PatchType.PLAIN); update convert_type to
either reject/raise on unsupported required types or treat PLAIN as a no-op: if
required.value is neither patch_ng.GIT nor patch_ng.SVN, do not change
self._patchset.type and return self (or raise a ValueError), otherwise proceed
with the existing per-file header/type updates; reference function convert_type,
enum PatchType (PLAIN/GIT/SVN), the patchset self._patchset and per-file
attributes file.header and file.type so the change keeps dump() and
is_git/is_svn consistent.

---

Nitpick comments:
In `@dfetch/vcs/patch.py`:
- Around line 100-121: The factory method for_new_file currently returns a raw
string while other constructors return a Patch; change for_new_file to return a
Patch by building the same diff string currently returned and wrapping it with
Patch.from_string(...) before returning. Locate Patch.for_new_file and use the
same logic that constructs diff (using _unified_diff_new_file(path), PatchType
checks, _git_mode(path), and _git_blob_sha1(path)) but return
Patch.from_string(the_final_string) instead of the raw string so callers can
receive a Patch directly.
- Around line 179-185: The method dump_header is defined as an instance method
but does not use self; mark it as a static method by adding the `@staticmethod`
decorator and remove the unused self parameter so the signature becomes
dump_header(patch_info: PatchInfo, patch_type: PatchType) -> str; keep the
existing logic that calls patch_info.to_git_header() and
patch_info.to_svn_header() and update any internal calls if they currently
invoke it as an instance method to call it as a static method when necessary.
- Around line 187-193: The patch transformation methods are inconsistent:
combine() returns a new Patch while reverse(), filter(), add_prefix(), and
convert_type() currently mutate self; standardize on a non-mutating functional
style by having reverse(), filter(), add_prefix(), and convert_type() return a
new Patch instead of mutating self. For reverse(), stop assigning to
self._patchset — create a new Patch from the transformed bytes (use
Patch.from_bytes or the existing from_bytes factory and the dump()/reverse_patch
call) and return that new instance; apply the same pattern in filter(),
add_prefix(), and convert_type() so they construct and return new Patch objects
rather than modifying self, preserving combine()’s behavior.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
dfetch/commands/format_patch.py (1)

129-131: Specify encoding="utf-8" in write_text().

Without an explicit encoding, the default is platform-dependent (e.g., cp1252 on Windows), which may corrupt or error on patches containing non-ASCII content.

♻️ Proposed fix
-        output_patch_file.write_text(
-            patch.dump_header(patch_info) + patch.dump()
-        )
+        output_patch_file.write_text(
+            patch.dump_header(patch_info) + patch.dump(), encoding="utf-8"
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/commands/format_patch.py` around lines 129 - 131, The call to
output_patch_file.write_text(...) in format_patch.py writes patch content
without specifying encoding, which is platform-dependent; update the write to
pass encoding="utf-8" so the output of patch.dump_header(patch_info) +
patch.dump() is written using UTF-8 encoding (locate the write_text call near
dump_header/dump usage and add the encoding argument).
dfetch/vcs/patch.py (3)

228-231: Regexes recompiled on every add_prefix call — promote to module-level constants.

diff_git and svn_index are invariant; compiling them inside the method on each call is wasteful.

♻️ Proposed fix
+_RE_DIFF_GIT = re.compile(
+    r"^diff --git (?:(?P<a>a/))?(?P<old>.+) (?:(?P<b>b/))?(?P<new>.+?)[\r\n]*$"
+)
+_RE_SVN_INDEX = re.compile(rb"^Index: (.+)$")
+

 class PatchType(Enum):
     ...

 # Inside add_prefix, replace the two re.compile calls:
-        diff_git = re.compile(
-            r"^diff --git (?:(?P<a>a/))?(?P<old>.+) (?:(?P<b>b/))?(?P<new>.+?)[\r\n]*$"
-        )
-        svn_index = re.compile(rb"^Index: (.+)$")
+        diff_git = _RE_DIFF_GIT
+        svn_index = _RE_SVN_INDEX
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 228 - 231, Move the invariant regex
compilations out of the add_prefix function and promote them to module-level
constants so they are compiled once; specifically, take the current local
variables diff_git and svn_index, create module-level constants (e.g.,
DIFF_GIT_RE and SVN_INDEX_RE) initialized with re.compile(...) and
re.compile(rb"...") respectively, and update the add_prefix implementation to
reference these constants instead of recompiling on each call.

293-313: combine() silently mutates the other patches as a side effect.

The convert_type() call on Line 307 modifies the input patches in other in-place (their file.header, file.type, and _patchset.type). A caller that holds a reference to a patch it passed to combine() would see its type unexpectedly changed. Additionally, combined_patchset.items += uses shallow copies, so later item-level mutations on the returned combined Patch propagate back to the original patches.

♻️ Proposed fix — convert a copy, deep-copy items
 def combine(self, other: Patch | Sequence[Patch]) -> Patch:
     """Combine two patches into a new patch."""
     if isinstance(other, Patch):
         other = [other]

+    import copy
     combined_patchset = patch_ng.PatchSet()
     combined_patchset.type = self._patchset.type
-    combined_patchset.items += self._patchset.items
+    combined_patchset.items += copy.copy(self._patchset.items)
     for patch in other:
         if (
             patch._patchset.type  # pylint: disable=protected-access
             != self._patchset.type
         ):
-            patch.convert_type(PatchType(combined_patchset.type))
-
-        combined_patchset.items += (
-            patch._patchset.items  # pylint: disable=protected-access
-        )
+            patch = copy.deepcopy(patch)
+            patch.convert_type(PatchType(combined_patchset.type))
+        combined_patchset.items += copy.copy(
+            patch._patchset.items  # pylint: disable=protected-access
+        )

     return Patch(combined_patchset)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 293 - 313, combine currently mutates input
Patch objects and shares item objects; fix by operating on copies: create a deep
copy of self._patchset (or at least deep-copy its items) into combined_patchset
rather than using combined_patchset.items += self._patchset.items, and for each
patch in other make a deep copy of patch._patchset (so convert_type is invoked
on the copy, not the original), call convert_type on that copy (using
PatchType(combined_patchset.type)), then extend combined_patchset.items with the
copied items; finally return Patch(combined_patchset). Use the unique symbols
combine, Patch, PatchType, convert_type, _patchset, and combined_patchset.items
when locating and changing the code.

41-55: @dataclass exposes _patchset as a public constructor param and generates a meaningless __eq__.

Two concerns:

  1. Because _patchset (intended private) is a plain dataclass field, the generated __init__ exposes it as a positional parameter, making Patch(some_patchset) part of the public API rather than an internal implementation detail.
  2. patch_ng.PatchSet almost certainly doesn't implement __eq__, so the auto-generated __eq__ degrades to identity comparison — two Patch instances wrapping identical content will never compare equal, which can surprise callers.

Consider suppressing the generated equality logic and restricting the direct constructor:

♻️ Suggested approach
-@dataclass
-class Patch:
+@dataclass(eq=False)
+class Patch:

Or, if equality by content is meaningful, implement it explicitly. Alternatively, move to a plain class with an explicit __init__ and @classmethod factories — given all construction goes through static factories anyway, the @dataclass __init__ gives little benefit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dfetch/vcs/patch.py` around lines 41 - 55, Patch currently exposes the
internal _patchset as a public dataclass constructor arg and gets a poor default
__eq__; change the class to stop exposing _patchset and to provide meaningful
equality: mark the dataclass with eq=False and make _patchset a
non-init/non-compare field (e.g., field(init=False, compare=False, repr=False)),
keep/implement factory methods (e.g., your existing from_string/from_file
constructors) to set self._patchset, and add an explicit __eq__(self, other)
that compares normalized patch content (for example by comparing a canonical
text/serialized form of self._patchset and other._patchset) so equality reflects
patch content rather than identity.
tests/test_patch.py (1)

306-310: Prefer pytest.fail() over assert False to preserve traceback context.

assert False swallows the original exception's traceback, making it harder to diagnose failures on flaky runs.

♻️ Suggested change
         try:
             Patch.from_file(patch_file).apply(root=str(tmp_path))
         except Exception as e:
-            assert False, f"Reverse patch failed: {e}"
+            pytest.fail(f"Reverse patch failed: {e}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_patch.py` around lines 306 - 310, Replace the "assert False" idiom
in the except block with pytest.fail to preserve traceback context: inside the
try/except around Patch.from_file(...).apply(root=str(tmp_path)) catch, call
pytest.fail(f"Reverse patch failed: {e}") (and add "import pytest" at the top if
missing) instead of assert False so failures surface with proper test reporting
and context.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dfetch/vcs/patch.py`:
- Around line 157-162: The error message in Patch.apply is unhelpful when
self.path is empty (inline patches created by from_string/from_bytes); update
the RuntimeError raised in apply to include a fallback identifier (e.g. "<inline
patch>" or use a property that returns self.path or a descriptive fallback) so
the message becomes something like 'Applying patch "<identifier>" failed';
modify the apply method to compute an identifier from self.path (or a new helper
on the Patch class) and use that identifier in the RuntimeError to make failures
for inline patches clear.
- Around line 200-210: The error message in reverse() is unreachable because
from_bytes raises on empty input; change reverse() (method reverse) to check the
result of _reverse_patch(self.dump()) in the reversed_text variable before
calling from_bytes: if reversed_text is empty, raise RuntimeError("Failed to
reverse patch; patch parsing returned empty."); otherwise proceed to call
self.from_bytes(reversed_text.encode("utf-8"))._patchset and keep the existing
is_empty() guard to catch any other parsing issues; reference functions:
reverse(), _reverse_patch, from_bytes(), is_empty(), dump(), and attribute
_patchset.

---

Nitpick comments:
In `@dfetch/commands/format_patch.py`:
- Around line 129-131: The call to output_patch_file.write_text(...) in
format_patch.py writes patch content without specifying encoding, which is
platform-dependent; update the write to pass encoding="utf-8" so the output of
patch.dump_header(patch_info) + patch.dump() is written using UTF-8 encoding
(locate the write_text call near dump_header/dump usage and add the encoding
argument).

In `@dfetch/vcs/patch.py`:
- Around line 228-231: Move the invariant regex compilations out of the
add_prefix function and promote them to module-level constants so they are
compiled once; specifically, take the current local variables diff_git and
svn_index, create module-level constants (e.g., DIFF_GIT_RE and SVN_INDEX_RE)
initialized with re.compile(...) and re.compile(rb"...") respectively, and
update the add_prefix implementation to reference these constants instead of
recompiling on each call.
- Around line 293-313: combine currently mutates input Patch objects and shares
item objects; fix by operating on copies: create a deep copy of self._patchset
(or at least deep-copy its items) into combined_patchset rather than using
combined_patchset.items += self._patchset.items, and for each patch in other
make a deep copy of patch._patchset (so convert_type is invoked on the copy, not
the original), call convert_type on that copy (using
PatchType(combined_patchset.type)), then extend combined_patchset.items with the
copied items; finally return Patch(combined_patchset). Use the unique symbols
combine, Patch, PatchType, convert_type, _patchset, and combined_patchset.items
when locating and changing the code.
- Around line 41-55: Patch currently exposes the internal _patchset as a public
dataclass constructor arg and gets a poor default __eq__; change the class to
stop exposing _patchset and to provide meaningful equality: mark the dataclass
with eq=False and make _patchset a non-init/non-compare field (e.g.,
field(init=False, compare=False, repr=False)), keep/implement factory methods
(e.g., your existing from_string/from_file constructors) to set self._patchset,
and add an explicit __eq__(self, other) that compares normalized patch content
(for example by comparing a canonical text/serialized form of self._patchset and
other._patchset) so equality reflects patch content rather than identity.

In `@tests/test_patch.py`:
- Around line 306-310: Replace the "assert False" idiom in the except block with
pytest.fail to preserve traceback context: inside the try/except around
Patch.from_file(...).apply(root=str(tmp_path)) catch, call pytest.fail(f"Reverse
patch failed: {e}") (and add "import pytest" at the top if missing) instead of
assert False so failures surface with proper test reporting and context.

@spoorcc spoorcc merged commit 58a6ef8 into main Feb 21, 2026
38 checks passed
@spoorcc spoorcc deleted the patch-class branch February 21, 2026 13:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants