Add crashdump example and include snapshot/scratch in core dumps #1264

Open
jsturtevant wants to merge 6 commits into hyperlight-dev:main from jsturtevant:crashdump

Conversation

@jsturtevant
Contributor

Core dumps generated by Hyperlight were missing the snapshot and scratch memory regions, making post-mortem debugging with GDB incomplete — register state was present but the guest's code, stack, heap, and page tables were absent. This adds the snapshot and scratch regions to the ELF core dump alongside any dynamically mapped regions so that GDB can show full backtraces, disassemble at the crash site, and inspect guest memory. A new runnable crashdump example demonstrates automatic dumps (VM-level faults), on-demand dumps (guest-caught exceptions), and per-sandbox opt-out, with GDB-based integration tests that validate register and memory content in the generated ELF files. The debugging docs are also updated with practical GDB commands for inspecting crash dumps.
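At a high level, the fix amounts to emitting the snapshot and scratch regions as additional loadable segments alongside the dynamically mapped ones. A minimal sketch of that gathering step follows; all names (`CrashDumpRegion`, `gather_regions`) and layout values are illustrative, not Hyperlight's actual API:

```rust
/// Hypothetical region descriptor: each entry becomes one PT_LOAD
/// segment in the ELF core file.
#[derive(Debug, Clone, PartialEq)]
struct CrashDumpRegion {
    gva: u64,          // guest address the segment is loaded at
    size: u64,         // length in bytes
    name: &'static str,
}

/// Gather every region that should appear in the core dump: the
/// snapshot (code, heap, page tables), the scratch region (stacks,
/// input/output buffers), and any dynamically mapped regions.
fn gather_regions(
    snapshot: (u64, u64),
    scratch: (u64, u64),
    mmap: &[(u64, u64)],
) -> Vec<CrashDumpRegion> {
    let mut regions = vec![
        CrashDumpRegion { gva: snapshot.0, size: snapshot.1, name: "snapshot" },
        CrashDumpRegion { gva: scratch.0, size: scratch.1, name: "scratch" },
    ];
    for &(gva, size) in mmap {
        regions.push(CrashDumpRegion { gva, size, name: "mmap" });
    }
    regions
}

fn main() {
    // Before the fix only the mmap entries were present, so GDB had
    // registers but no code, stack, or page tables to inspect.
    let regions = gather_regions((0x1000, 0x10_0000), (0xFFE0_0000, 0x2_0000), &[(0x4000_0000, 0x1000)]);
    assert_eq!(regions.len(), 3);
    println!("{} regions become loadable segments", regions.len());
}
```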

@syntactically
Member

@jsturtevant I believe you were working on another version of this that explicitly walks through the whole guest virtual address space?

@jsturtevant
Contributor Author

@jsturtevant I believe you were working on another version of this that explicitly walks through the whole guest virtual address space?

Yes, been working with @dblnz on getting it working. Should have something today

@jsturtevant jsturtevant marked this pull request as ready for review March 3, 2026 23:14
@jsturtevant jsturtevant added the kind/bugfix For PRs that fix bugs label Mar 4, 2026
@jsturtevant jsturtevant force-pushed the crashdump branch 3 times, most recently from ba9bd81 to 4181d57 on March 4, 2026 04:00
Co-authored-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
@jsturtevant jsturtevant force-pushed the crashdump branch 2 times, most recently from 6ab40fe to 21b7183 on March 4, 2026 05:48
Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
The runtime config was passed by reference into set_up_hypervisor_partition and then
immediately cloned; since no caller needs it afterward, it is now passed by
value. The entry point field uses Option<u64> instead of a bare zero default, so
a missing value is detectable rather than silently producing a bogus AT_ENTRY.
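A minimal sketch of the Option<u64> change described above; struct and field names are illustrative, not the PR's actual types:

```rust
// With a bare u64, a missing entry point silently defaults to 0 and
// would be written into the core file as a bogus AT_ENTRY value.
#[allow(dead_code)]
struct RuntimeConfigOld {
    entry_point: u64,
}

// With Option<u64>, a missing value is detectable and the dump code
// can fail loudly (or skip AT_ENTRY) instead of emitting garbage.
struct RuntimeConfigNew {
    entry_point: Option<u64>,
}

fn at_entry(cfg: &RuntimeConfigNew) -> Result<u64, &'static str> {
    cfg.entry_point
        .ok_or("entry point was never recorded; refusing to emit AT_ENTRY")
}

fn main() {
    let unset = RuntimeConfigNew { entry_point: None };
    assert!(at_entry(&unset).is_err());

    let set = RuntimeConfigNew { entry_point: Some(0x20_1000) };
    assert_eq!(at_entry(&set).unwrap(), 0x20_1000);
    println!("AT_ENTRY = {:#x}", at_entry(&set).unwrap());
}
```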

Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
generate_crashdump_to_dir accepts the output directory
as a parameter instead of requiring callers to set the
HYPERLIGHT_CORE_DUMP_DIR environment variable. This
removes the need for unsafe std::env::set_var in tests
while preserving the existing env-var fallback path
for the automatic dump and the no-argument generate_crashdump method.
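The resolution order described above (explicit directory first, env-var fallback second) can be sketched as follows; the function and variable names are assumptions, not the PR's actual code:

```rust
use std::env;
use std::path::PathBuf;

/// Hypothetical sketch: the `_to_dir` variant passes the directory
/// explicitly, while the no-argument path falls back to the
/// HYPERLIGHT_CORE_DUMP_DIR environment variable, so tests no longer
/// need unsafe `std::env::set_var`.
fn resolve_dump_dir(explicit: Option<PathBuf>) -> Option<PathBuf> {
    explicit.or_else(|| {
        env::var("HYPERLIGHT_CORE_DUMP_DIR")
            .ok()
            .map(PathBuf::from)
    })
}

fn main() {
    // An explicit directory wins; no environment mutation required.
    let dir = resolve_dump_dir(Some(PathBuf::from("/tmp/dumps")));
    assert_eq!(dir, Some(PathBuf::from("/tmp/dumps")));
    println!("dump dir: {:?}", dir);
}
```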

Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
-} else if gpa >= SandboxMemoryLayout::BASE_ADDRESS as u64 {
+} else if gpa >= SandboxMemoryLayout::BASE_ADDRESS as u64
+    && gpa < SandboxMemoryLayout::BASE_ADDRESS as u64 + snap.mem_size() as u64
+{
Contributor

I remember this was a bug fix for some regions that are neither in the snapshot nor in the scratch region but in between them; the previous logic returned the snapshot memory for those.

Contributor Author

@jsturtevant jsturtevant Mar 4, 2026

This makes it explicit that it will look for a GPA in both ranges, whereas previously it also matched values that were past the end of either. Here snap is

//! A snapshot freshly created from an empty VM will result in roughly
//! the following physical layout:
//!
//! +-------------------------------------------+
//! | Guest Page Tables |
//! +-------------------------------------------+
//! | Init Data | (GuestBlob size)
//! +-------------------------------------------+
//! | Guest Heap |
//! +-------------------------------------------+
//! | PEB Struct | (HyperlightPEB size)
//! +-------------------------------------------+
//! | Guest Code |
//! +-------------------------------------------+ 0x1_000
//! | NULL guard page |
//! +-------------------------------------------+ 0x0_000
and the scratch is
//! There is also a scratch region at the top of physical memory,
//! which is mostly laid out as a large undifferentiated blob of
//! memory, although at present the snapshot process specially
//! privileges the statically allocated input and output data regions:
//!
//! +-------------------------------------------+ (top of physical memory)
//! | Exception Stack, Metadata |
//! +-------------------------------------------+ (1 page below)
//! | Scratch Memory |
//! +-------------------------------------------+
//! | Output Data |
//! +-------------------------------------------+
//! | Input Data |
//! +-------------------------------------------+ (scratch size)
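The half-open range checks under discussion can be sketched like this; the constants and sizes are made up for illustration, and only the comparison shape mirrors the diff:

```rust
/// Illustrative stand-in for SandboxMemoryLayout::BASE_ADDRESS.
const BASE_ADDRESS: u64 = 0x1000;

#[derive(Debug, PartialEq)]
enum GpaRegion {
    Snapshot,
    Scratch,
    Other,
}

/// Classify a guest physical address against the snapshot and scratch
/// ranges using half-open intervals [base, base + size).
fn classify(gpa: u64, snap_size: u64, scratch_base: u64, scratch_size: u64) -> GpaRegion {
    if gpa >= scratch_base && gpa < scratch_base + scratch_size {
        GpaRegion::Scratch
    } else if gpa >= BASE_ADDRESS && gpa < BASE_ADDRESS + snap_size {
        GpaRegion::Snapshot
    } else {
        // Previously `gpa >= BASE_ADDRESS` alone matched the snapshot
        // branch, so a GPA past the end of the snapshot (for example a
        // dynamically mapped region) wrongly resolved to snapshot memory.
        GpaRegion::Other
    }
}

fn main() {
    assert_eq!(classify(0x2000, 0x10000, 0xFFF0_0000, 0x1000), GpaRegion::Snapshot);
    // One byte past the snapshot end is no longer mis-resolved.
    assert_eq!(classify(BASE_ADDRESS + 0x10000, 0x10000, 0xFFF0_0000, 0x1000), GpaRegion::Other);
    assert_eq!(classify(0xFFF0_0800, 0x10000, 0xFFF0_0000, 0x1000), GpaRegion::Scratch);
    println!("all classifications behaved as expected");
}
```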

Member

Yes, in the case of the snapshot code, which was formerly the only user of this code, this was being handled in the function guest_page, and the entire call to access_gpa was avoided. Now that there are more users of this VA enumeration, and crashdump/gdb are likely to continue to need it, we should see if the interface can be factored out a bit better (made to not rely on exclusive memory access, but to take advantage of it where available; and made to unify scratch/snapshot handling and mapped region handling).

});
} else {
// GPA not in snapshot/scratch — check dynamic mmap regions
resolve_from_mmap_regions(
Contributor

I am not 100% sure, but from what I see, this function creates a region that exactly matches the mapping in size.
I am wondering: wouldn't the inclusion of the dynamic mmap regions below take care of this by including the whole mmap region?

Note: I remember that when I was working on this, the dumps wouldn't correctly load without this function call, but I'm just saying, maybe I'm missing something.

Contributor Author

This makes sure the dump preserves the GVA range (virt_base, virt_end) instead of the physical address, which is what the other loop records. The mmap regions below include the GVA locations just for a full dump, but those will likely not be used in gdb.

Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
dblnz
dblnz previously approved these changes Mar 4, 2026
Contributor

@dblnz dblnz left a comment

LGTM! Having participated in these changes, I'd feel more comfortable if other people had a look/approved this 😄

Signed-off-by: James Sturtevant <jsturtevant@gmail.com>
Contributor

@ludfjig ludfjig left a comment

this is great, i like the added tests!

/// `usize` host addresses. The crash dump path only reads host memory
/// through raw pointers, so it never needs the file-mapping metadata
/// stored in [`HostRegionBase`] on Windows.
#[cfg(feature = "crashdump")]
Contributor

The new types in memory_region.rs and the helpers in mgr.rs use #[cfg(feature = "crashdump")], while most of the rest of the crate uses the #[cfg(crashdump)] alias. Maybe we can remove the alias since it doesn't serve a purpose (I seem to remember it was only usable in the debug profile before, but that no longer seems to be the case), OR switch to using the alias everywhere.

/// With `init-paging` enabled, walks the guest page tables to discover
/// GVA→GPA mappings and translates them to host-backed regions. Also
/// includes any dynamic mmap regions that may not be covered by the
/// page table walk, and a sentinel page at the stack top for clean
Member

Outdated doc: the sentinel page was removed, I believe.

Comment on lines +527 to +530
if !gdb_is_available() {
eprintln!("Skipping test: {GDB_COMMAND} not found on PATH");
return;
}
Contributor

Was this needed in CI? Do we not want the test to fail here?
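For reference, a PATH probe like the gdb_is_available() guard quoted above could look like this; the helper name comes from the diff, but the implementation here is a generic sketch, not the PR's code:

```rust
use std::env;

/// Return true if `cmd` exists as a file in any directory on $PATH.
fn command_on_path(cmd: &str) -> bool {
    env::var_os("PATH")
        .map(|paths| env::split_paths(&paths).any(|dir| dir.join(cmd).is_file()))
        .unwrap_or(false)
}

fn main() {
    // Mirrors the skip pattern in the test: bail out early rather than
    // fail when the debugger is not installed.
    if !command_on_path("gdb") {
        eprintln!("Skipping test: gdb not found on PATH");
        return;
    }
    println!("gdb available, running GDB-based assertions");
}
```

Whether to skip or fail is a policy choice: skipping keeps local runs green on machines without gdb, at the cost of silently reducing coverage if CI images ever drop the package.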

Member

@syntactically syntactically left a comment

The direction of this looks good! I have a few comments, most of which are minor nits or just signposting things I thought were interesting points to discuss; the only thing that I'm actually particularly concerned about with the present state of the code is the mmap regions being present in the dump at VAs where they are not actually present in the sandbox.

At a higher level, it occurs to me that an alternative for crashdumps might be to just take a snapshot, and then write a separate utility that prepares a crashdump from a snapshot. That's probably worse, since in the future we might not be able to take a snapshot at arbitrary times, but only when I/O to the guest is quiet? However, I wanted to mention the idea to see what anyone else thought.

/// back to the host as `GuestAborted` errors.
///
/// However, some faults are intercepted by the **hypervisor** before the
/// CPU delivers them to the guest. For example, when the guest writes to
Member

Minor nit, but this example is not the hypervisor "intercepting a fault that was originally destined for the guest": the fault was always targeted at the hypervisor, because it was the hypervisor's own page tables that caused it. There are intercepts that allow us to redirect faults to the hv (on amd64), but we don't use a tonne of them.


// This call triggers a ud2 instruction in the guest. The guest's IDT
// catches the #UD exception and reports it back to the host as a
// GuestAborted error via I/O. This does NOT trigger an automatic crash
Member

Why do we have this distinction? We should be clear about exactly which exceptions result in dumps and which do not, and whether there is any relationship between that and which faults poison the sandbox.

/// so GDB can compute the correct load offset for PIE binaries.
#[cfg(crashdump)]
pub(crate) entry_point: u64,
}
Member

This is not new to this PR, so feel free to file an issue and defer, but I wanted to see if you had thoughts in this context first. This addition brings to my attention the question: what is the expected lifecycle of SandboxRuntimeConfig? Some parts, like debug_info and guest_core_dump, seem like they are config of a specific sandbox instance (regardless of what guest/snapshot is being run inside of that instance), while things like entry_point seem like they are related fundamentally to the snapshot that's underlying the sandbox, and would need to be reset on restore to have sensible behaviour for a single sandbox that runs multiple different guests at different times.

 ) -> Option<(&'a ExclusiveSharedMemory, usize)> {
     let scratch_base = scratch_base_gpa(scratch_size);
-    if gpa >= scratch_base {
+    if gpa >= scratch_base && gpa < scratch_base + scratch_size as u64 {
Member

I think this change makes sense, since it's good for the contract here to be that the returned offset exists in the memory (and an error would happen fast if not).

However, I'm curious if this change was motivated by actually finding a case where it would make a difference or not---it feels like a bug if there is a pointer to above the scratch region (only 1 unmapped page exists there), so if that was the cause of the change I'd be curious where the access was coming from.


if *executable {
flags |= MemoryRegionFlags::EXECUTE;
}
(flags, MemoryRegionType::Code)
Member

This MemoryRegionType doesn't seem right, but then again I'm not sure if we're actually using the MemoryRegionTypes for anything, and they were already a bit inconsistent/nonsensical...

})
})?;

// Include dynamic mmap regions at their GVA addresses as a fallback
Member

This doesn't seem quite right. The mmap region guest_regions are GPAs not GVAs, and the rest of the crashdump is (much like a snapshot) handled entirely in GVA space. Code could well be doing something important at some GVA that is equal to a GPA where something entirely different is present, and this would overwrite the former with the contents of the latter as far as I understand it. Regions that are mapped into some guest physical address space, but not mapped from there to anywhere in VA space, are basically inaccessible to any code (unless it bothers to map them in first), since we don't have a complete physmap anyway...

let host_base = HostGuestMemoryRegion::to_addr(rgn.host_region.start) + offset;
let host_end = host_base + mapping.len as usize;

regions.push(CrashDumpRegion {
Member

Should the coalesce logic above run here as well?
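For context, the coalesce logic being referred to presumably merges regions that are contiguous in both guest address space and host memory into a single segment, so the core file stays compact. A hypothetical sketch (struct layout and names invented, not the PR's code):

```rust
/// Illustrative region: a guest-visible range backed by host memory.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Region {
    gva: u64,    // guest virtual address
    host: usize, // host backing address
    len: u64,    // length in bytes
}

/// Merge regions that are contiguous in both GVA and host space;
/// anything with a gap on either side stays a separate segment.
fn coalesce(mut regions: Vec<Region>) -> Vec<Region> {
    regions.sort_by_key(|r| r.gva);
    let mut out: Vec<Region> = Vec::new();
    for r in regions {
        if let Some(last) = out.last_mut() {
            let gva_contig = last.gva + last.len == r.gva;
            let host_contig = last.host + last.len as usize == r.host;
            if gva_contig && host_contig {
                last.len += r.len; // extend instead of emitting a new segment
                continue;
            }
        }
        out.push(r);
    }
    out
}

fn main() {
    let merged = coalesce(vec![
        Region { gva: 0x1000, host: 0x7000_0000, len: 0x1000 },
        Region { gva: 0x2000, host: 0x7000_1000, len: 0x1000 }, // contiguous
        Region { gva: 0x5000, host: 0x7000_5000, len: 0x1000 }, // gap
    ]);
    assert_eq!(merged.len(), 2);
    assert_eq!(merged[0].len, 0x2000);
    println!("coalesced into {} segments", merged.len());
}
```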
