PTE proposal with load-side barrier bits #234

Merged
merged 16 commits into from
Sep 16, 2024

Conversation

andresag01
Collaborator

This is a draft proposal that reworks the Zcheripte extension to include load-side revocation support in the PTE. The goal of the extension is to support the following:

  • Load-side revocation
  • Optimize sweep phase of capability revocation.
  • Limit capability propagation at page granularity

In summary, CHERI would add 3 bits to the PTE:

  • Capability Access (CA): Clear read/written tags if clear
  • Capability Dirty (CD): Trap when capability with tag=1 is written to page and CD=0
  • Capability Read Generation (CRG): generation of the page for revocation

Note that this is only a draft to stimulate discussion, not a finished proposal. For example, @jrtc27 previously mentioned that conflating read/write into CA is not desirable. My intuition is that some combinations are not useful, so more efficient encodings are possible. Specifically:

  • The CRG bit of a page is not very useful if the page does not allow capability reads
  • The CD bit of a page is not very useful if the page does not allow capability writes

However, is it always the case that the capability read/write bit of a page is set on allocation and remains unchanged throughout the lifetime of a page? If so, then how about something like this:

| Cap Read | Cap Write | CD | CRG | Notes |
|----------|-----------|----|-----|-------|
| 1 | 1 | 0 | 0 | Cap read/write set means CD/CRG needed |
| 1 | 1 | 1 | 1 | |
| 1 | 1 | 0 | 1 | |
| 1 | 1 | 1 | 0 | |
| 1 | 0 | ? | 1 | Cap write clear means CD is not useful |
| 1 | 0 | ? | 0 | |
| 0 | ? | ? | ? | Cap read clear means CRG is not useful. Cap read clear also implies that Cap write is probably not useful at the page granularity |

If this works, then we must be able to encode 7 values which is possible in 3 bits (and we get one spare encoding!). What do you think @jrtc27 @tariqkurd-repo @arichardson?
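
As a quick sanity check of the count above, the table can be enumerated mechanically. This is only an illustrative sketch: the `canonical` helper and the tuple representation are invented here, and the don't-care collapsing simply follows the table rows rather than any agreed encoding.

```python
from itertools import product
from math import ceil, log2


def canonical(cap_read, cap_write, cd, crg):
    """Collapse the "?" (don't care) fields of the table into None."""
    if not cap_read:
        # Cap read clear: Cap write, CD and CRG are all treated as don't-care.
        return (0, None, None, None)
    if not cap_write:
        # Cap write clear: CD is a don't-care, CRG still matters.
        return (1, 0, None, crg)
    return (1, 1, cd, crg)


# Enumerate all 16 raw combinations and keep only the distinguishable ones.
states = {canonical(*bits) for bits in product((0, 1), repeat=4)}
bits_needed = ceil(log2(len(states)))
spare_encodings = 2 ** bits_needed - len(states)

print(len(states), bits_needed, spare_encodings)
```

Under these assumptions the four raw fields collapse to seven distinguishable states, which fit in three bits with one encoding left over.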

Fixes #17

@arichardson
Collaborator

CC: @nwf

@jrtc27
Collaborator

jrtc27 commented May 9, 2024

If you mprotect a page it can lose its capability write permission but still be capability dirty

@nwf
Collaborator

nwf commented May 10, 2024

OK, let me give this a shot. I'm sorry this is long, but I hope it's all accurate and provides, if nothing else, a useful software-driven perspective.

It's also worth remembering that we have done... very little experimentation, relative to the body of work on MMUs in general, when it comes to "what software wants from CHERI PTE bits". Some of this, then, is necessarily a best guess extrapolating from very few data points.

I've said it before, but to repeat: it's a bit sad we have to resort to using PTE bits for this, and I hope that in the future there are alternative extensions that build on physically-indexed metadata tables, like AMD's SEV-SNP's RMP or Arm's RME's GPTs. In any case, until such time, we are stuck using PTEs to reflect properties of a logical page (or even of the underlying physical frame), but hopefully future Zcheripte editions can use fewer PTE bits in composition with those other mechanisms.

Some of the design space is dependent on how willing microarchitecture is to take on data dependence in operations. It's most useful to software if we can trigger events only on tagged loads or stores, but these are new, and possibly expensive, dependencies, keeping state speculative for longer.

Some of the design space is also dependent on how willing software is to perform expensive operations on the set of page mappings for a given page. Historically, such need was limited to things like paging, and so such operations are, generally, not particularly speedy. The use of PTE mapping bits to approximate page state sometimes requires traversing this set.


OK, with all that out of the way... it's probably most useful to think about what logical states we need a logical page or an individual page mapping to be in. As relevant here, pages can be given four states:

  1. "Always cap-clean": no mapping of this page has or will ever accept a capability store. This includes mmap-ed file data.

  2. "Cap-dirty": this page has been the target of a capability store and has not yet been found to have no capabilities within. There may exist mappings of this page that accept tagged stores without other side-effects.

  3. "Ephemerally cap-clean": this page is potentially permissive of capability stores, but there do not exist mappings of it that accept tagged stores without trapping.

  4. "Cap-dirtyable": this page is potentially permissive of capability stores, and there may exist mappings that accept tagged stores, possibly without trapping (but with logging; that is, where the PTW will perform an AMO on the PTE).

Correctness absent temporal safety concerns really only differentiates between 1 and 2, smashing 3 and 4 up into an amalgamated 2. This is how the TLBE bits of CHERI-MIPS were used.

Under Cornucopia (Reloaded) revocation, the revoker must visit (at least) all cap-dirty and cap-dirtyable pages. Transition from ephemerally cap-clean is induced by traps or new mappings, and may require broadcast updates of PTEs, depending on exactly which things PTEs can represent (see below). Re-entry to ephemerally cap-clean requires broadcast down-grading all of a page's mappings, and may require retries across multiple revocation epochs.
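
To make the four logical page states concrete, here is a minimal sketch of them as an enumeration, together with the two observations above: which states the revoker must visit, and how the states collapse when temporal safety is not a concern. The names `PageState`, `revoker_must_visit` and `collapse_without_temporal_safety` are hypothetical and do not come from CheriBSD.

```python
from enum import Enum, auto


class PageState(Enum):
    ALWAYS_CAP_CLEAN = auto()       # state 1
    CAP_DIRTY = auto()              # state 2
    EPHEMERALLY_CAP_CLEAN = auto()  # state 3
    CAP_DIRTYABLE = auto()          # state 4


def revoker_must_visit(state):
    """The revoker must visit (at least) cap-dirty and cap-dirtyable pages."""
    return state in (PageState.CAP_DIRTY, PageState.CAP_DIRTYABLE)


def collapse_without_temporal_safety(state):
    """Absent temporal safety, states 3 and 4 are folded into state 2."""
    if state is PageState.ALWAYS_CAP_CLEAN:
        return PageState.ALWAYS_CAP_CLEAN
    return PageState.CAP_DIRTY


must_visit = [s.name for s in PageState if revoker_must_visit(s)]
print(must_visit)
```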

Mappings also come in multiple flavors. The easy cases:

  1. A mapping, read-only or read-write, of an "always cap-clean" page. If mapped read-write, untagged stores are permitted, but tagged stores either trap or clear the tag, and loads either trap if tagged or clear the tag. No need for CD/CRG tracking.

  2. A mapping of a cap-dirty page: if read-write, stores of all flavors permitted; tagged (or cap-width) loads must be mediated by CRG.

  3. A mapping of a cap-dirtyable page: if read-write, tagged (or cap-width) stores either AMO the PTE or trap; tagged (or cap-width) loads must be mediated by CRG. The AMO effectively upgrades the page from cap-dirtyable to cap-dirty, but this transition will be discovered late, possibly as late as attempted re-entry to cap-clean.

And then there is the hard case involving ephemerally cap-clean (or "recently" ephemerally cap-clean) pages. We would, ideally, like for the revoker to be able to pay no attention to cap-clean pages, be they ephemerally or always so. This requires that software either be able to...

a. broadcast updates to PTEs for pages transitioning from cap-clean. That's potentially expensive.

b. configure mappings of cap-clean pages so that they become as if they were the wrong CRG whenever the page -- not this mapping, but the page via another mapping -- transitions away from cap-clean.

The latter seems preferable to me, but it seems to necessitate a PTE state of "all tagged loads trap" and we almost surely want that to be a true dependence on the loaded value, so that cap-width reads of data don't cause software to have to start paying attention to the page again.

In any case, assuming it were microarchitecturally agreeable, I think the following set of six states (four kinds, two of which carry a CRG bit and so count twice) suffices for all software we've written so far:

  1. (TN) No tagged loads or stores permitted (clearing tags OK); the mapping may be read and/or written with untagged data.
  2. (TPg) Tagged stores permitted if page writable, with 1 CRG bit
  3. (TAg) Tagged stores AMO (to state 2 above) or trap if page writable, with 1 CRG bit
  4. (TT) Tagged stores and tagged loads always trap; the mapping may be read and/or written with untagged data.

If we had to remove the last one, we could (by using state 1 instead, and broadcasting to PTEs). That's still 5 states, tho', so still 3 bits.

The distinction between TP and TA states is only meaningful if the W bit is asserted, but that's probably not useful for any kind of representation compression.
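
The list above can be read as a small state machine. The sketch below is one possible reading, not a specification: the `Mapping` names with an explicit generation suffix, the outcome strings, the `hw_amo` switch, and the assumption that a CRG mismatch on a tagged load traps are all illustrative choices.

```python
from enum import Enum


class Mapping(Enum):
    TN = "TN"    # no tagged loads or stores
    TP0 = "TP0"  # tagged stores permitted, CRG = 0
    TP1 = "TP1"  # tagged stores permitted, CRG = 1
    TA0 = "TA0"  # tagged stores AMO or trap, CRG = 0
    TA1 = "TA1"  # tagged stores AMO or trap, CRG = 1
    TT = "TT"    # tagged loads and stores always trap


def crg(mapping):
    """CRG bit of a TP/TA mapping, or None for TN and TT."""
    return int(mapping.value[2]) if mapping.value[1] in "PA" else None


def tagged_store(mapping, writable=True, hw_amo=True):
    """Return (outcome, mapping afterwards) for a store of a tagged capability."""
    if not writable:
        return "trap", mapping                  # ordinary write fault
    if mapping is Mapping.TN:
        return "trap-or-clear-tag", mapping
    if mapping is Mapping.TT:
        return "trap", mapping
    if mapping.value[1] == "P":
        return "ok", mapping
    if hw_amo:
        # TAg: the PTW upgrades the PTE to TPg with the same generation.
        return "ok", Mapping["TP" + mapping.value[2]]
    return "trap", mapping


def tagged_load(mapping, current_gen):
    """Outcome of loading a tagged capability (assumes a CRG mismatch traps)."""
    if mapping is Mapping.TN:
        return "trap-or-clear-tag"
    if mapping is Mapping.TT:
        return "trap"
    return "ok" if crg(mapping) == current_gen else "trap"


def bits_for(n_states):
    """Number of bits needed to encode n_states distinct values."""
    return (n_states - 1).bit_length()


print(len(Mapping), bits_for(len(Mapping)), bits_for(len(Mapping) - 1))
```

With six states, or five if TT is dropped, the encoding still needs three bits, which matches the remark above.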


Specifically, to the point of

If you mprotect a page it can lose its capability write permission but still be capability dirty

Yes, but that cap-dirty-ness doesn't need to be reflected in the updated mapping, so long as it's tracked in the logical page state in software.

Whew.

@andresag01
Collaborator Author

andresag01 commented May 12, 2024

@nwf: Thanks for the explanation and your proposal. It's very informative!

First of all, I am new to temporal safety using CHERI! Could you elaborate a bit on how a page could get into the "ephemerally cap-clean" state? Is this purely due to the revocation algorithm?

My guess is that the revoker can ignore pages that are ephemerally cap-clean altogether. However, these pages are distinct from cap-dirtyable because in the latter you may have other mappings that are cap-dirty, for example -- would the revoker at any point scan all mappings of a cap-dirtyable page and, for example, conclude that they are all "cap-clean", so that the page should really be ephemerally cap-clean and is safe to ignore in the next revocation cycle?

P.S. I can see why it would be helpful to have physically-indexed metadata tables!

@nwf
Collaborator

nwf commented May 16, 2024

@nwf: Thanks for the explanation and your proposal. It's very informative!

Great to hear that that brain-dump communicated something. :)

First of all, I am new to temporal safety using CHERI! Could you elaborate a bit on how a page could get into the "ephemerally cap-clean" state? Is this purely due to the revocation algorithm?

My guess is that the revoker can ignore pages that are ephemerally cap-clean altogether. However, these pages are distinct from cap-dirtyable because in the latter you may have other mappings that are cap-dirty, for example -- would the revoker at any point scan all mappings of a cap-dirtyable page and, for example, conclude that they are all "cap-clean", so that the page should really be ephemerally cap-clean and is safe to ignore in the next revocation cycle?

That's exactly right: it's easy for the revoker to keep track of whether or not a given page was found to have a capability on it during the sweep, and, if not, that's a good reason to try downgrading it to ephemerally cap-clean (even though the application could write a capability later, we might hope that it'll be a few revocation epochs in the future).

That downgrade is a frightful wad of complexity that eventually involves looking at all aliases of a page and which we stage across multiple revocation epochs (so that we can avoid per-page TLB shoot-downs and IPIs on platforms that require those). I have a... long-abandoned, long-winded draft of a technical report that has an exhaustively tested Murφ model of multiple TLBs and multiple page aliases and all the permitted flows through the system. While there's a huge semantic gap between that and the implementation in CheriBSD, it was, nevertheless, quite useful for understanding. I'll try to dig that up, stamp "DRAFT" on it in big letters, and put it up as a PDF somewhere for people to look at.
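
The bookkeeping mentioned at the start of this comment (remembering whether a sweep found any capability on a page) can be sketched as follows. This is a toy model only: the list-of-slots page representation, `sweep_page` and `downgrade_candidates` are hypothetical, and it says nothing about the multi-epoch, multi-alias downgrade procedure itself.

```python
def sweep_page(slots, is_revoked):
    """Clear revoked capabilities in place; return True if any tagged capability survives.

    `slots` holds one entry per capability-sized slot: None for untagged data,
    or the (hypothetical) base address that a tagged capability points to.
    """
    survivors = False
    for i, target in enumerate(slots):
        if target is None:
            continue
        if is_revoked(target):
            slots[i] = None  # revoke: clear the tag
        else:
            survivors = True
    return survivors


def downgrade_candidates(pages, is_revoked):
    """Names of pages on which the sweep left no tagged capabilities."""
    return [name for name, slots in pages.items() if not sweep_page(slots, is_revoked)]


pages = {
    "p0": [None, 0x1000],   # only holds a capability to revoked memory
    "p1": [0x2000, None],   # holds a capability that stays valid
    "p2": [None, None],     # data only
}
revoked = {0x1000}
candidates = downgrade_candidates(pages, revoked.__contains__)
print(candidates)
```

Pages in `candidates` are the ones worth trying to downgrade to ephemerally cap-clean; whether the downgrade succeeds still depends on the alias handling described above.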

@andresag01
Collaborator Author

andresag01 commented May 16, 2024

@nwf: Thanks for the info! My understanding so far is that there are 3 key goals for the PTEs:

  • Limit capability propagation -- TN state (1)
  • Revocation load barrier -- CRG states (2) & (3)
  • Revocation optimizations -- dirty and ephemeral states (2), (3) & (4)

Is that correct? For the optimizations, are there any studies/figures showing what improvements (e.g. performance, pause times, etc.) the dirty/ephemeral ideas provide? Is it also feasible to consider these optimizations as separate features?

@jonwoodruff
Contributor

@nwf and @andresag01
Further to this, can "Revocation Barrier", if gating both loads and stores, be used to "Limit Capability Propagation" in the case of extreme PTE poverty?

(I didn't quite parse how the 6 states behaved with respect to capability loads, but I assume the "with 1 CRG bit" meant that it traps on capability read if CRG doesn't match.)

@jonwoodruff
Contributor

jonwoodruff commented May 24, 2024

To try to summarise a conversation from yesterday, a seemingly sensible 2-bit proposal is as follows:

Bit 1: CW (Trap if a tagged capability is stored to this page)
Bit 2: CRG (Trap if a tagged capability might be loaded from this page and the generation bit doesn't match the current one in a CSR. Potentially either tag-sensitive trapping, or capability-width trapping allowed.)

Desired behaviours:
Capability propagation -
For any pages that must not hold capabilities, assuming they are clean at the beginning of time, CW will be cleared and the trap handler will throw an error on an attempt to write a capability. Or conceivably clear the tag and store the data. If, for some reason, a read barrier is preferred, keeping CRG inverted could work, but would lead to too many traps if cap-width semantics are implemented and memcpy uses caps.

Capability revocation sweep -
Flip the CSR generation bit. All subsequent capability loads will trap, allowing every page to be swept clean before it is accessed.

Capability presence/dirty tracking -
To accelerate the revocation sweep, we may want to track pages that allow capability writes, but nevertheless do not have capabilities present on them at the moment. This can be done by setting CW to False when a page is discovered to have no capabilities during a sweep such that the OS will be notified if any capabilities are written to this page. This information can be used to preemptively flip the CRG of non-capability-bearing pages on the next revocation without sweeping. Alternatively, the presence of capabilities might be conservatively tracked in a CPU that implements hardware dirty tracking by leaving CW True for clean, capability-allowed pages and observing the standard dirty bit to detect any write at all to these pages, conservatively considering that this might have been a capability.

Relation to Wes' states: This is more-or-less 2 & 3 of Wes' states; TPg and TAg.
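
A minimal sketch of how the two bits would gate accesses and drive a lazy sweep, under some assumptions: CW clear means a tagged store traps (the polarity suggested by "CW will be cleared" above), trapping is tag-sensitive, and the trap handler sweeps the page and then sets its CRG to the current generation. The `AddressSpace` class and all names in it are hypothetical.

```python
def tagged_store_traps(cw):
    """A store of a tagged capability traps when CW is clear."""
    return not cw


def tagged_load_traps(pte_crg, csr_gen):
    """A load of a tagged capability traps when the page generation is stale."""
    return pte_crg != csr_gen


class AddressSpace:
    def __init__(self, pages):
        self.csr_gen = 0
        self.pte_crg = {page: 0 for page in pages}

    def start_revocation(self):
        # Flipping the CSR bit makes every page's CRG stale at once.
        self.csr_gen ^= 1

    def load_capability(self, page, sweep):
        if tagged_load_traps(self.pte_crg[page], self.csr_gen):
            sweep(page)                            # trap handler sweeps the page...
            self.pte_crg[page] = self.csr_gen      # ...and brings it up to date
            return "trapped-then-swept"
        return "ok"


swept = []
aspace = AddressSpace(["a", "b"])
before = aspace.load_capability("a", swept.append)
aspace.start_revocation()
first = aspace.load_capability("a", swept.append)
second = aspace.load_capability("a", swept.append)
print(before, first, second, swept)
```

In this model page "b" stays stale until it is either touched or visited by a background sweep, which is the lazy behaviour the load barrier is meant to provide.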

===========================

What is missing?/What is our wishlist for an extra bit?
(Numbering these, as others will likely add more)

  1. Hardware capability dirty tracking not expressible
    As we are conflating dirty tracking and capability write propagation in CW by trapping in either case and distinguishing in software, it is not possible to allow hardware to transparently indicate that it has observed a tagged write to a page. At least one more state would be needed for this. (CW allowed + CW observed/not-observed)

  2. "Always allow cap read" not expressible
    CRG always flips polarity when the CSR generation bit flips. This means that capability-free mappings (e.g. files) will likely trap on subsequent capability-wide reads. (We cannot use non-trapping semantics for CRG mismatch, as the primary use case is to use trapping to interpose on legal program capability reads for lazy sweeping.) To avoid this, you must traverse the page table, flipping CRG bits for all non-capability bearing pages. For some data-heavy applications, this could easily be more expensive than leaving the CSR generation bit the same and flipping all CRG bits for capability-bearing pages. This probably requires at least one more state (CRG ignored, CW disallowed (only?)), unless CRG can always be ignored when CW is disallowed. I think there is some discussion above related to this.
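
The trade-off in item 2 can be put as a rough count of up-front PTE writes per revocation epoch. The sketch below is a back-of-the-envelope model with invented page counts; `upfront_pte_writes` and `cheaper_strategy` are hypothetical helpers, and the model ignores TLB maintenance and the lazy per-page updates done while sweeping.

```python
def upfront_pte_writes(cap_pages, data_pages):
    """Rough count of PTE writes needed at the start of an epoch, per strategy."""
    return {
        # Flip the CSR bit: capability-bearing pages now trap as intended, but
        # every capability-free mapping needs its CRG flipped to stay quiet.
        "flip_csr_generation": data_pages,
        # Keep the CSR bit: flip CRG on every capability-bearing page instead,
        # so that loads from those pages trap.
        "keep_csr_generation": cap_pages,
    }


def cheaper_strategy(cap_pages, data_pages):
    costs = upfront_pte_writes(cap_pages, data_pages)
    return min(costs, key=costs.get)


# Made-up numbers for a data-heavy and a pointer-heavy process.
data_heavy = cheaper_strategy(cap_pages=1_000, data_pages=100_000)
pointer_heavy = cheaper_strategy(cap_pages=100_000, data_pages=1_000)
print(data_heavy, pointer_heavy)
```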

============================

Tests required to make decision:

  1. Cost of revocation when emulating dirty tracking with the capability write permission. That is, trapping on every transition from capability clean to capability dirty. Hopefully this is possible to emulate on Morello, as this is our most mature microarchitecture? CHERI-Toooba would not be suitable for comparison, as it is not capable of hardware dirty tracking.

2.1 Is it possible to ignore CRG when CW is disallowed? If so, we can actually solve the "Missing 2" above, and reduce our states to 3. Can this be mocked up in hardware or software to detect if it is ever necessary?

2.2 If Test#2.1 is not possible, we need to measure the cost of manually flipping CRG bits in the STW phase for pages that need not be swept. Presumably this can be emulated either in Morello or Toooba.

@jonwoodruff
Contributor

jonwoodruff commented May 28, 2024

If CRG, the Capability Read Generation bit, is ignored when CW, Capability Write, is not set, then I think we might get the important states we want?

State 1 & 2: CW True, CRG 1/0
Allow capability writes, and allow capability reads only if CRG matches the CSR generation bit.

State 3: CW False, CRG 0 (for example)
Trap on valid capability writes (tag-sensitive), strip tags on all capability reads. This should work fine for non-capability-bearing mappings such that capability-wide memcpy can be used without flipping CRG on each generation swap.

State 4: CW False, CRG 1 (for example)
Allow silent upgrade to CW True, CRG = CSR generation bit. This allows hardware dirty tracking; software must detect the transition of CW at a later point, as must be usual for dirty tracking.

Only need to answer 2.1 above.
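
For illustration, the four states above can be written as a decode of the two PTE bits. This is a sketch under the "for example" polarity used in the comment (CW clear with CRG 0 is state 3, CW clear with CRG 1 is state 4); the function names and outcome strings are invented, and the load behaviour of state 4 is left as `None` because the comment does not spell it out.

```python
def describe(cw, crg, csr_gen):
    """Name the state selected by the two PTE bits."""
    if cw:
        return "caps, current generation" if crg == csr_gen else "caps, stale generation"
    return "dirtyable" if crg else "no caps"


def tagged_store(cw, crg, csr_gen):
    """Return (outcome, (cw, crg) afterwards) for a store of a tagged capability."""
    if cw:
        return "ok", (cw, crg)
    if crg:
        # State 4: silent upgrade to CW set, CRG = CSR generation bit.
        return "ok", (1, csr_gen)
    # State 3: tag-sensitive trap, PTE unchanged.
    return "trap", (cw, crg)


def tagged_load(cw, crg, csr_gen):
    """Outcome of a capability load."""
    if cw:
        return "ok" if crg == csr_gen else "trap"
    if not crg:
        return "strip-tag"  # state 3
    return None             # state 4: not specified in the comment above


all_states = {describe(cw, crg, 0) for cw in (0, 1) for crg in (0, 1)}
print(sorted(all_states))
```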

@nwf
Collaborator

nwf commented Jun 10, 2024

After a little bit of discussion elsewhere, here's a different 2-bit proposal. It's not great -- I think the 3-bit proposals are nicer -- but I think this is at least workable, if a little trap-heavy in use.

Revocation with four-state PTEs 2024/06/07

The four states proposed here are

| State       | Cap Read         | Cap Write    |
|-------------|------------------|--------------|
| CRW, LCRG=0 | GCRG=0?          | OK           |
| CRW, LCRG=1 | GCRG=1?          | OK           |
| NoCaps      | Assumed untagged | Tag dep trap |
| CapsTrap    | Tag dep trap     | Tag dep trap |

That is, this approach jettisons CAP-DIRTYABLE / AMO-cap-dirtying PTEs in favor of the "CapsTrap" state.
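
The table can be read as a pair of lookup functions. The sketch below assumes that "GCRG=0?" means "OK if the global CRG equals this PTE's LCRG, otherwise trap" and that "Tag dep trap" means a trap only when the value is tagged; the function names and outcome strings are illustrative.

```python
def cap_load(state, gcrg, lcrg=None):
    """Outcome of a capability load through a PTE in the given state."""
    if state == "CRW":
        return "ok" if lcrg == gcrg else "trap"
    if state == "NoCaps":
        return "assumed-untagged"
    if state == "CapsTrap":
        return "trap-if-tagged"
    raise ValueError(f"unknown state: {state}")


def cap_store(state):
    """Outcome of a capability store through a PTE in the given state."""
    if state == "CRW":
        return "ok"
    if state in ("NoCaps", "CapsTrap"):
        return "trap-if-tagged"
    raise ValueError(f"unknown state: {state}")


for state, lcrg in (("CRW", 0), ("CRW", 1), ("NoCaps", None), ("CapsTrap", None)):
    print(state, lcrg, cap_load(state, gcrg=0, lcrg=lcrg), cap_store(state))
```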

Steady states

  • An always data-only page (e.g., mmap'd file) has all of its aliases parked in NoCaps, and the kernel will deliver SIGSEGV on traps.

  • A cap-dirty page with all of its aliases aware of its dirtiness will have all of its aliases bouncing between the two CRW states as revocation epochs elapse.

  • A (logically) cap-permissive but ephemerally clean page (safely being ignored by the revoker) can have all its aliases in the CapsTrap state.

State transitions

  • The first trap on any alias of a cap-permissive but ephemerally clean page will upgrade the page's logical state to cap-dirty and the PTE to the correct CRW LCRG state.

    Subsequent traps through a CapsTrap alias of a cap-dirty page will trigger revocation (if a sweep of the containing address space is active) and then move the PTE to the appropriate CRW state.

    No broadcast update is necessary.

    We may presume that the capability store triggering the first fault has been inspected by any ongoing revocation (even if it just started; there is a somewhat amusing case in which a tagged cap is stored and triggers the trap, then revocation starts and clears the tag in the spilled register file, and so when the store retries, it will store a clear tag, but that's fine).

    Subsequent traps through CapsTrap aliases of known-dirty pages must engage the revocation machinery during sweeps because other capabilities on that page may not have been yet inspected by the revoker.

  • Becoming clean remains somewhat tricky, given all the moving pieces.

    The existing revoker uses cap-dirtyable PTE states to stage its transition to cleanliness, relying on the existence of such a state -- and insistence that the PTW perform an AMO CAS to exit that state -- to ensure that at least one epoch has begun, and so at least one broadcast TLB invalidation has happened, within the address space containing an alias between that alias's PTE being in a cap-dirty (equivalently, CRW) state and a cap-clean but cap-load-permissive state (no direct analogue exists in the proposal here).

    Absent such a state, we probably cannot get away from needing a TLB shootdown on the transition from cap-dirty to ephemerally cap-clean. It may suffice to have this slightly more expensive procedure:

    1. Downgrade aliases from CRW to CapsTrap in the background revoker if no caps are observed during a visit.

    2. If the background revoker finds a logically cap-dirty page with a CapsTrap alias, visit the page, and...

      • if capabilities were found and not revoked, upgrade the alias to CRW

      • if there are no other aliases, the beginning of this epoch has already ensured that this alias's TLB entry is also CapsTrap, like its PTE, and so we may safely mark the page ephemerally cap-clean.

      • otherwise, gather the page's PTEs and, if they are all CapsTrap,

        • issue a TLB shootdown for each alias (which, recall, might be in multiple address spaces)

        • visit the page again, now that stores through stale TLB entries are not possible. If no caps are found, mark the page ephemerally cap-clean.

    The exact expense of this approach depends on how often aliased pages become ephemerally cap-clean, but that is probably fairly rare.
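
The upgrade half of the "State transitions" above (traps taken through CapsTrap aliases) can be sketched as a trap handler. This is a toy model, not CheriBSD code: the `Page` record, the state strings and the `revoke_page` callback are hypothetical, and "the appropriate CRW state" is assumed to be the one whose LCRG equals the current global CRG.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    logical: str = "ephemerally-cap-clean"       # or "cap-dirty"
    aliases: dict = field(default_factory=dict)  # alias name -> PTE state


def on_capstrap_trap(page, alias, gcrg, sweep_active, revoke_page):
    """Handle a tag-dependent trap taken through a CapsTrap alias of `page`."""
    if page.aliases[alias] != "CapsTrap":
        raise ValueError("trap did not come through a CapsTrap alias")
    if page.logical == "ephemerally-cap-clean":
        # First trap on any alias: only the logical state and this PTE change;
        # no broadcast to the other aliases is needed.
        page.logical = "cap-dirty"
    elif sweep_active:
        # Known-dirty page: other capabilities on it may not have been
        # inspected yet, so engage the revoker before allowing access.
        revoke_page(page)
    page.aliases[alias] = ("CRW", gcrg)


visited = []
page = Page(aliases={"a": "CapsTrap", "b": "CapsTrap"})
on_capstrap_trap(page, "a", gcrg=1, sweep_active=True, revoke_page=visited.append)
after_first = (page.logical, dict(page.aliases), len(visited))
on_capstrap_trap(page, "b", gcrg=1, sweep_active=True, revoke_page=visited.append)
after_second = (page.logical, dict(page.aliases), len(visited))
print(after_first, after_second)
```

The first trap leaves alias "b" in CapsTrap, so a later access through it traps again and, during a sweep, triggers revocation of the page before that alias is upgraded.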

Other Caveats

  • It continues to have a store-path tag dependency. Relaxing this is somewhat difficult due to the conflicting desire to have hybrid executables (DDC) and limit capability propagation through file mappings.

    The tag dependency could be relaxed in the CapsTrap state more easily, at the cost of considering more pages during revocation, but since it probably has to exist for NoCaps, that hardly seems worthwhile.

  • Its load-path dependencies are also possibly tricky. Relaxing to cap-capable loads rather than loads of tagged capabilities again just means considering more pages cap-dirty. (And requires that software be prepared for marking a page cap-dirty in response to a load through a CapsTrap alias.)

  • The "assumed untagged" behavior of NoCaps places the onus on kernel software to ensure that there are no caps on these pages (but permits hardware enforcement with clearing or trapping).

@jonwoodruff
Contributor

@nwf Thanks for the detailed thoughts!

Is there a justification for jettisoning the pre-CRW state for the CapsTrap state?
Maybe you could make a distinction in the OS for mapping pages with no alias into the pre-CRW state, and map aliased dirtyable pages as NoCaps, trapping on the first write and implementing the CRW upgrade in software.

This would maybe also eliminate the tag sensitivity for loads?

@andresag01
Collaborator Author

@jonwoodruff has updated the proposal as discussed last week! Thanks for taking the time to do this @jonwoodruff! andresag01#2

@jonwoodruff
Contributor

jonwoodruff commented Aug 28, 2024

With some help, I've now performed a large-scale experiment on the feasibility of a restricted CHERI PTE scheme using Morello.

I created a branch of CheriBSD that does not flip individual CHERI protection bits (of which there are four), but simply assigns the bits to one of our four states. Quite surprisingly, I got this to work correctly such that it runs all attempted software and benchmarks with revocation enabled without crashing. Sadly, tracing PTE bits on QEMU indicates that I never get the pre-CRW state, and thus am only using 3 states. That is, NoCaps, Caps0, and Caps1.

We then booted Morello hardware with this kernel and ran spec benchmarks. The changes were confirmed by tracing PTE cap-dirtying events, which were zero with the new kernel (and non-zero with the old one). Surprisingly, we see no overhead from running with these reduced states, with performance being consistently slightly faster than previously. This is thought to be due to lack of optimisation of the baseline and some unknown effects that we will now be tracking down. Nevertheless, the average overhead for temporal safety/revocation on the ref workload for Spec2006 benchmarks that would run (all int workloads but gcc and perlbench) is about 5%. It's likely that this can be reduced significantly with optimisation such that the three-state solution would be slower than the four-state solution with hardware dirtying, which this proposal would also support. It seems more of a stretch to say that any other missing states would have a significant impact on performance.

@nwf has made a cursory review of my modifications, and believes that they should be safe.

In summary, I believe we have demonstrated that the 3-state subset of the proposal works and does not have surprising overheads with Spec2006 benchmarks. The full 4-state solution should be very close indeed to the best we can do.

========

Example modification note:
Morello previously had an "always allow caps" state which was used for the kernel. This was replaced with Caps0, with the assumption that the kernel would never flip its generation bit. This proved to be true.

@andresag01 andresag01 changed the title Draft: PTE proposal with load-side barrier bits PTE proposal with load-side barrier bits Aug 28, 2024
@tariqkurd-repo
Collaborator

I think that menvcfg should also get the CW, CRG bits, otherwise this scheme is different to how the other bit-fields work where menvcfg is a superset.

@andresag01
Collaborator Author

I think that menvcfg should also get the CW, CRG bits, otherwise this scheme is different to how the other bit-fields work where menvcfg is a superset.

The key is that virtual memory is associated to S-mode, but not M-Mode. Perhaps the right place for the CRG bit is satp (or something like it) which is actually related to virtual memory instead of senvcfg.

@tariqkurd-repo
Collaborator

I think that menvcfg should also get the CW, CRG bits, otherwise this scheme is different to how the other bit-fields work where menvcfg is a superset.

The key is that virtual memory is associated to S-mode, but not M-Mode. Perhaps the right place for the CRG bit is satp (or something like it) which is actually related to virtual memory instead of senvcfg.

fair point - but let's stick with regularity for now and add it into menvcfg, which I'm working on. All these extra bits will change when we get to ARC review anyway.

Collaborator

@tariqkurd-repo tariqkurd-repo left a comment

it's probably worth adding something about the benefits/consequences of each of the 4 options, but I think the actual spec is ok now

@tariqkurd-repo
Collaborator

it's probably worth adding something about the benefits/consequences of each of the 4 options, but I think the actual spec is ok now

Jon has pointed out that this is already covered in the doc

@PeterRugg
Contributor

I think that menvcfg should also get the CW, CRG bits, otherwise this scheme is different to how the other bit-fields work where menvcfg is a superset.

Looking through the PR, I don't think there is a CW bit in either senvcfg or menvcfg, and I don't think there should be. CRG only needs to be there as it's what you compare the PTE CRG bits against. It's not an enable as such, so senvcfg likely isn't the best fit. satp could make sense, but I don't think you want to insist on running an sfence.vma for it to take effect...

@jrtc27
Collaborator

jrtc27 commented Sep 9, 2024

(v)sstatus would make the most sense for it. There's precedent from SUM and MXR (and TVM in the full mstatus).

@andresag01
Collaborator Author

andresag01 commented Sep 10, 2024

I think that menvcfg should also get the CW, CRG bits, otherwise this scheme is different to how the other bit-fields work where menvcfg is a superset.

Looking through the PR, I don't think there is a CW bit in either senvcfg or menvcfg, and I don't think there should be. CRG only needs to be there as it's what you compare the PTE CRG bits against. It's not an enable as such, so senvcfg likely isn't the best fit. satp could make sense, but I don't think you want to insist on running an sfence.vma for it to take effect...

I agree, satp would be great, but there isn't much space in that CSR. I also like the idea of putting it in (v)sstatus

@andresag01
Collaborator Author

(v)sstatus would make the most sense for it. There's precedent from SUM and MXR (and TVM in the full mstatus).

@jrtc27 I agree that (v)sstatus would be a good place to put this flag. Does hstatus also need its own CRG bit? The RISC-V privileged spec says this:

An OS or hypervisor running in HS-mode uses the supervisor CSRs to interact with the exception, interrupt, and address-translation subsystems

Wouldn't we expect the OS or hypervisor running in HS-mode to also take advantage of revocation?

@tariqkurd-repo
Collaborator

(v)sstatus would make the most sense for it. There's precedent from SUM and MXR (and TVM in the full mstatus).

@jrtc27 I agree that (v)sstatus would be a good place to put this flag. Does hstatus also need its own CRG bit? The RISC-V privileged spec says this:

An OS or hypervisor running in HS-mode uses the supervisor CSRs to interact with the exception, interrupt, and address-translation subsystems

Wouldn't we expect the OS or hypervisor running in HS-mode to also take advantage of revocation?

Probably. I think we can safely rule out mstatus. If in doubt maybe add the bit in hstatus?

@andresag01
Collaborator Author

It needs to be in mstatus because it is a superset of sstatus. Here is the relevant fragment from the privileged spec

Screenshot from 2024-09-10 11-15-49

@tariqkurd-repo
Collaborator

It needs to be in mstatus because it is a superset of sstatus. Here is the relevant fragment from the privileged spec

yes ok - but I noticed that hstatus doesn't have all the mstatus/sstatus fields.

@andresag01
Collaborator Author

I migrated the CRG bit from senvcfg to xstatus excluding hstatus. We can add it to hstatus if it's necessary.

Signed-off-by: Andres Amaya Garcia <[email protected]>
@tariqkurd-repo
Collaborator

any objections to merging this?

@arichardson
Collaborator

ping @jonwoodruff @nwf @nwf-msr: Is this ready to be merged?


@nwf-msr nwf-msr left a comment

LGTM as a minimal baseline. I'm a little hesitant about the requirements on software, which are not fully correctly realized (that is, there are known bugs, sorry) in the existing CheriBSD prototype, but I think it is possible to be correct atop this specification, which is a very good first step. :)

Collaborator

@arichardson arichardson left a comment

I think we can merge this now since Wes is also ok with it.

@jonwoodruff
Contributor

Just found one run-on sentence. Otherwise, it looked good to me!

--- a/src/cheri-pte-ext.adoc
+++ b/src/cheri-pte-ext.adoc
@@ -20,8 +20,8 @@ but capability pointers must not be shared across these channels
into a foreign address space.
An operating system might defend against this by only issuing a capability
to the shared
-region that does not grant the load/store capability permission,
-there are circumstances where portions of general-purpose, mmapped^^ memory become shared,
+region that does not grant the load/store capability permission.
+However, there are circumstances where portions of general-purpose, mmapped^^ memory become shared,
and the operating system must prevent future capability communication
through those pages.
This is not possible without restructuring software, as the capability for

@andresag01
Collaborator Author

Just found one run-on sentence. Otherwise, it looked good to me!

Thanks for spotting this! I applied your change in this commit: 48d5761

@andresag01 andresag01 merged commit 9572e0c into riscv:main Sep 16, 2024
3 checks passed
tariqkurd-repo added a commit to tariqkurd-repo/riscv-cheri that referenced this pull request Oct 9, 2024
Refactor the Zcheripte extension to use load-side barriers and only allocate 2 PTE bits
for CHERI as follows:

* Bit 1: CW (Trap if a tagged capability is stored to this page)
* Bit 2: CRG (Trap if a tagged capability might be loaded from this page and the generation bit doesn't match the current one in a CSR. Potentially either tag-sensitive trapping, or capability-width trapping allowed.)

Fixes riscv#17

---------

Signed-off-by: Andres Amaya Garcia <[email protected]>
Signed-off-by: Tariq Kurd <[email protected]>
Co-authored-by: Jonathan Woodruff <[email protected]>
Co-authored-by: Tariq Kurd <[email protected]>
Development

Successfully merging this pull request may close these issues.

Zcheri_pte lacks support for load-side revocation
8 participants