Skip to content

Conversation

@mkroening
Copy link
Member

@mkroening mkroening commented Jan 7, 2026

In Hermit, we found it useful to be able to print our page tables for debugging. We've had several iterations of such debugging functions but are now happy with the current compact range-based table formatting introduced in hermit-os/kernel#1818.

This PR upstreams this page table printing functionality. It can be used as follows:

let mapped_page_table: MappedPageTable<'_, _>;

println!("{}", mapped_page_table.display());

This is how a formatted table looks like:

100000-101000 100000-101000 WRITABLE | ACCESSED | DIRTY
101000-103000 101000-103000 WRITABLE | ACCESSED
103000-105000 103000-105000 WRITABLE
105000-106000 105000-106000 WRITABLE | ACCESSED
106000-107000 106000-107000 WRITABLE
107000-10d000 107000-10d000 WRITABLE | ACCESSED
10d000-111000 10d000-111000 WRITABLE
111000-112000 111000-112000 WRITABLE | ACCESSED
112000-114000 112000-114000 WRITABLE
114000-118000 114000-118000 WRITABLE | ACCESSED
118000-119000 118000-119000 WRITABLE
119000-11a000 119000-11a000 WRITABLE | ACCESSED
11a000-11b000 11a000-11b000 WRITABLE
11b000-11c000 11b000-11c000 WRITABLE | ACCESSED | DIRTY
11c000-120000 11c000-120000 WRITABLE | ACCESSED
120000-121000 120000-121000 WRITABLE
121000-122000 121000-122000 WRITABLE | ACCESSED | DIRTY
122000-123000 122000-123000 WRITABLE
123000-124000 123000-124000 WRITABLE | ACCESSED | DIRTY
124000-125000 124000-125000 WRITABLE
ffffff8000000000-ffffff8000001000 11f000-120000 WRITABLE | ACCESSED
ffffff8000001000-ffffff8000002000 120000-121000 WRITABLE
ffffffffc0000000-ffffffffc0001000 11e000-11f000 WRITABLE | ACCESSED
ffffffffffe00000-ffffffffffe01000 11d000-11e000 WRITABLE | ACCESSED
fffffffffffff000-                 11c000-11d000 WRITABLE

This is how a table formatted with the alternate (#) flag looks like:

size   len                   virtual address                  physical address flags
4KiB     1           100000-          101000                   identity-mapped WRITABLE | ACCESSED | DIRTY
4KiB     2           101000-          103000                   identity-mapped WRITABLE | ACCESSED
4KiB     2           103000-          105000                   identity-mapped WRITABLE
4KiB     1           105000-          106000                   identity-mapped WRITABLE | ACCESSED
4KiB     1           106000-          107000                   identity-mapped WRITABLE
4KiB     7           107000-          10e000                   identity-mapped WRITABLE | ACCESSED
4KiB     3           10e000-          111000                   identity-mapped WRITABLE
4KiB     1           111000-          112000                   identity-mapped WRITABLE | ACCESSED
4KiB     2           112000-          114000                   identity-mapped WRITABLE
4KiB     4           114000-          118000                   identity-mapped WRITABLE | ACCESSED
4KiB     1           118000-          119000                   identity-mapped WRITABLE
4KiB     1           119000-          11a000                   identity-mapped WRITABLE | ACCESSED
4KiB     1           11a000-          11b000                   identity-mapped WRITABLE
4KiB     1           11b000-          11c000                   identity-mapped WRITABLE | ACCESSED | DIRTY
4KiB     5           11c000-          121000                   identity-mapped WRITABLE | ACCESSED
4KiB     1           121000-          122000                   identity-mapped WRITABLE | ACCESSED | DIRTY
4KiB     1           122000-          123000                   identity-mapped WRITABLE
4KiB     1           123000-          124000                   identity-mapped WRITABLE | ACCESSED | DIRTY
4KiB     1           124000-          125000                   identity-mapped WRITABLE
4KiB     1 ffffff8000000000-ffffff8000001000           11f000-          120000 WRITABLE | ACCESSED
4KiB     1 ffffff8000001000-ffffff8000002000           120000-          121000 WRITABLE
4KiB     1 ffffffffc0000000-ffffffffc0001000           11e000-          11f000 WRITABLE | ACCESSED
4KiB     1 ffffffffffe00000-ffffffffffe01000           11d000-          11e000 WRITABLE | ACCESSED
4KiB     1 fffffffffffff000-                           11c000-          11d000 WRITABLE

While this PR is technically non-breaking and could target master, it becomes much easier to use with #576.

The first two commits prepare a directory structure for the mapped_page_table module. The third commit adds three new modules to mapped_page_table: iter, range_iter, and display. The former two are private for now. They allow iterating over page mappings and contiguous ranges of page mappings, respectively. The third module adds the display adapter for MappedPageTable.

Thanks as always for your great work! :)

Copy link
Member

@Freax13 Freax13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!

I like the idea of adding a display impl for the page table types to print mapped ranges.

That said, I'm a bit wary of exposing too many implementation details as public APIs. I think there are probably good use-cases for MappedPageTableRangeInclusiveIter, but I'm not too sure about some of the trait impls for the new types.


#[derive(Debug)]
struct PageTableWalker<P: PageTableFrameMapping> {
pub struct PageTableWalker<P: PageTableFrameMapping> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that exposing this could be useful, but I'm not sure we want to expose this API in its current form.

For example, PageTableWalker::next_table should be unsafe (we can't just blindly trust that the PageTableEntry indeed points to a valid page table).

If it's okay with you, I'd like to delay that to another PR, so that we can properly think about the APIs we want to expose.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, we can keep it private. 👍

You say that next_tables has safety requirements regarding entry that are not captured by new's undocumented safety requirements, right? That sounds like we should make this unsafe even when kept private to make this sound. Should we do that before this PR, since this PR uses that API, or ignore the issue for now?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You say that next_tables has safety requirements regarding entry that are not captured by new's undocumented safety requirements, right?

Yes. This function can be used to create a shared reference to arbitrary memory and that's not sound.

Should we do that before this PR, since this PR uses that API, or ignore the issue for now?

I'm fine with keeping this as is for now, but if you wanted to clean this up a bit, I'd appreciate that.

type Item = MappedPageItem;

fn next(&mut self) -> Option<Self::Item> {
self.next_forward().or_else(|| self.next_forward())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think calling next_forward twice is sufficient.

Consider the following: A page directory which has its first two entries point to two empty completely page tables, followed by a third entry pointing to a page table with present mappings.
PD[0] -> PT[511:0] -> All not present
PD[1] -> PT[511:0] -> All not present
PD[2] -> PT[0] -> 0x1234000.

If I'm understanding next_forward correctly, the first call to next_forward would traverse the page directory to the first page table, fail to find any entries and increment p1 until increment_p1_index returns None. <MappedPageTableIter as Iterator>::next would then call next_forward a second time. This time, next_forward would traverse the page directory to the second page table, again fail to find any entries and return None yet again. At this point, <MappedPageTableIter as Iterator>::next returns None even though the third page directory entry with the page table containing actual mappings hasn't been traversed yet.

I think this can be fixed by calling self.next_forward() until p4_index >= 512.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, thanks! This will be resolved in my next revision. :)

Ok(page_table) => break page_table,
Err(PageTableWalkError::NotMapped) => self.increment_p4_index()?,
Err(PageTableWalkError::MappedToHugePage) => {
panic!("level 4 entry has huge page bit set")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the code below, we silently ignore bad entries:

// Invalid frame address
self.increment_p1_index()?;

Do we want to do the same here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done in my next revision. 👍


#[derive(Debug)]
pub struct MappedPageTableDisplay<'a, P: PageTableFrameMapping + Clone> {
inner: MappedPageTableRangeInclusiveIter<'a, P>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we change inner's type from MappedPageTableRangeInclusiveIter<'a, P> to MappedPageTable<'_, P>, we could replace the for mapped_page_range in self.inner.clone() { with for mapped_page_range in self.inner.range_iter() {.

If we do that, we don't need P: Clone anymore.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Done in my next revision. 👍

let frame_end = self.inner.frame_range.end.start_address();
let flags = self.inner.flags;
let format_phys = if page_start.as_u64() == frame_start.as_u64() {
assert_eq!(page_end.as_u64(), frame_end.as_u64());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_eq!(page_end.as_u64(), frame_end.as_u64());
debug_assert_eq!(page_end.as_u64(), frame_end.as_u64());

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reworked MappedPageRangeInclusive to guarantee the page range and the frame range have the same length instead. :)

Comment on lines 86 to 89
write!(
f,
"size: {size}, len: {len:5}, virt: {page_start:18p}..={page_end:18p}, phys: {format_phys}, flags: {flags:?}"
)
Copy link
Member

@Freax13 Freax13 Jan 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could be a bit confusing when 4K and 2M mappings are mixed in a page table.

A mapping covering an address range with 4K mappings could look like this:

size: 4KiB, len:  1024, virt:           0x200000..=          0x5FF000, phys:                         identity mapped, flags: PageTableFlags(PRESENT | WRITABLE | ACCESSED | DIRTY)

and an otherwise identical mapping of the same range with 2M pages could then look like this:

size: 2MiB, len:     2, virt:           0x200000..=          0x400000, phys:                         identity mapped, flags: PageTableFlags(PRESENT | WRITABLE | ACCESSED | DIRTY)

The end addresses look completely different even though they're really the same.

Have you considered printing the last covered address (inclusive, i.e. 0x5FFFFF in this example) or one past the last covered address (exclusive, i.e. 0x600000) instead of the start address of the end page?

Copy link
Member Author

@mkroening mkroening Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

I think using exclusive ranges would look better. That's also how Linux's proc_pid_maps(5) does it.

This is addressed in my next revision.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in this file needs to be commented to explain what's going on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in my next revision. 👍

Comment on lines 140 to 162
impl<S: PageSize> Add<u64> for MappedPage<S> {
type Output = Self;

fn add(self, rhs: u64) -> Self::Output {
Self {
page: self.page + rhs,
frame: self.frame + rhs,
flags: self.flags,
}
}
}

impl<S: PageSize> Sub<u64> for MappedPage<S> {
type Output = Self;

fn sub(self, rhs: u64) -> Self::Output {
Self {
page: self.page - rhs,
frame: self.frame - rhs,
flags: self.flags,
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the use cases for these Add and Sub impls?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I use page == end + 1 to determine whether two mapped pages are contiguous.

While Add would suffice, I also added AddAssign, Sub, and SubAssign for completeness.

@Freax13
Copy link
Member

Freax13 commented Jan 10, 2026

I appreciate the separate commits and the short explanation for each of them!

@mkroening
Copy link
Member Author

Thanks for the quick review! :)

I'll split off the OffsetPageTable removal, retarget this PR against next and take care of the individual comments.
Since this huge new API surface is not strongly motivated and rightly doubted, I will remove all public API changes except for the display method.
Although I had to use some internal APIs to make this work outside this crate, I have no further uses for these myself.
In case somebody has actual use cases for these, we can reevaluate making some of them public.

I remembered one more thing to discuss:
When implementing this in Hermit, it was natural to create a page table extension trait and implement a method returning a display adapter.
Since we are in the proper crate now, we could think about implementing Display for MappedPageTable directly, removing the need to call a method, making this more usable with named parameters: println!("{page_table}").
To quote the Display docs:

Because a type can only have one Display implementation, it is often preferable
to only implement Display when there is a single most "obvious" way that
values can be formatted as text. This could mean formatting according to the
"invariant" culture and "undefined" locale, or it could mean that the type
display is designed for a specific culture/locale, such as developer logs.

If not all values have a justifiably canonical textual format or if you want
to support alternative formats not covered by the standard set of possible
formatting traits, the most flexible approach is display adapters: methods
like str::escape_default or Path::display which create a wrapper
implementing Display to output the specific display format.

What's your opinion on this? Is this table the single most obvious way of formatting the whole page table as text, or should we stick with the display adapter? If the latter, is display the best name, or would display_table be better?

@Freax13
Copy link
Member

Freax13 commented Jan 10, 2026

I remembered one more thing to discuss: When implementing this in Hermit, it was natural to create a page table extension trait and implement a method returning a display adapter. Since we are in the proper crate now, we could think about implementing Display for MappedPageTable directly, removing the need to call a method, making this more usable with named parameters: println!("{page_table}"). To quote the Display docs:

Because a type can only have one Display implementation, it is often preferable
to only implement Display when there is a single most "obvious" way that
values can be formatted as text. This could mean formatting according to the
"invariant" culture and "undefined" locale, or it could mean that the type
display is designed for a specific culture/locale, such as developer logs.
If not all values have a justifiably canonical textual format or if you want
to support alternative formats not covered by the standard set of possible
formatting traits, the most flexible approach is display adapters: methods
like str::escape_default or Path::display which create a wrapper
implementing Display to output the specific display format.

What's your opinion on this? Is this table the single most obvious way of formatting the whole page table as text, or should we stick with the display adapter? If the latter, is display the best name, or would display_table be better?

I think the whole concept of formatting a set of page tables is a bit unusual, so it's probably warranted to use a inherent display method. Looking at the implementors of Display in the standard librariy, it's mostly either basic types (e.g. i32) or errors (e.g. TryRecvError). Page tables don't fit into either of these categories.
To me, this way of formatting page tables feels more like something I'd expect from a Debug implementation except that it's maybe a bit too verbose. Of course, we do already have more concise Debug implementations for these types and I think we want to keep that.
With all that in mind, I think your original approach of using an inherent display method is probably the way to go. If we wanted to be a bit more explicit, we could call it display_mappings, but I'd also be fine with just display.

@Freax13
Copy link
Member

Freax13 commented Jan 10, 2026

Another idea would be to pretty print the table when alternate ({:#?}) formatting is requested. Still, this might break users that already use alternate formatting.

@mkroening
Copy link
Member Author

Alternate formatting also affects the derived Debug implementation and the formatting of its members, so it would be a bit inconsistent if we disallowed that.
While the derived Debug implementation is too verbose to be useful for me, it might make sense to keep it anyway.
In that case, I would prefer sticking with a display method.

@mkroening mkroening force-pushed the display-page-tables branch from 3f25b46 to 091b45d Compare January 15, 2026 20:04
@mkroening
Copy link
Member Author

After you confirmed this is generally suitable for upstreaming, I have reworked the code to make it upstreamable. All comments from your last review should be addressed. You are welcome to mark any comments as resolved as you see fit.

I split the code into different modules to make it easier to understand. I also added doc comments to all items and have any not-straightforward code. Furthermore, I reworked the formatting of the table and have updated the PR description accordingly.

Thanks again for your reviews! They are very valuable. :)

@mkroening mkroening changed the base branch from master to next January 15, 2026 20:04
@mkroening mkroening requested a review from Freax13 January 15, 2026 20:05
@mkroening mkroening force-pushed the display-page-tables branch 4 times, most recently from 927c9b7 to 37f815a Compare January 15, 2026 20:23
@mkroening mkroening force-pushed the display-page-tables branch from 37f815a to fb3610b Compare January 15, 2026 20:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants