Strict Provenance MVP #95241
Hmm I probably should remove my changes to ptr examples, they should stick to "official" APIs.
e172082 to 8c1bdc6
Note that a lot of the meat of this PR is just an enormous treatise being appended to core::ptr's top-level docs, discussing the high level "concept".
cc @RalfJung for if you feel this new enormous treatise is acceptable to include given that it is prefixed with the caveat that it is extremely overly strict, experimental, and non-normative.
library/core/src/fmt/mod.rs
Outdated
// formatter takes an &Opaque. Rust understandably doesn't think we should compare
// the function pointers if they don't have the same signature, so we cast to
// pointers to convince it that we know what we're doing.
if self.formatter as *mut u8 == USIZE_MARKER as *mut u8 {
i should probably revert this to not make AVR sad
Is this caused by an address space mismatch in the cast since avr uses AS1 for function pointers? I.e. an invalid bitcast error?
I played around with the datalayout code a few weeks ago to improve handling of address spaces. I can try to upload that as a PR later this week and see what else needs to be done to use an addrspacecast instead of a bitcast so you don't need the inttoptr/ptrtoint workaround. Should hopefully also help for CHERI support (which was my original motivation for looking at that code).
See: avr-rust#143
And: rust/library/core/src/ptr/mod.rs, line 1390 in 9280445:
// HACK: The intermediate cast as usize is required for AVR
(fwiw validating that these changes help for AVR is probably blocked by https://reviews.llvm.org/D114611)
@@ -57,6 +57,9 @@ pub struct DirEntry {
     data: c::WIN32_FIND_DATAW,
 }

+unsafe impl Send for OpenOptions {}
+unsafe impl Sync for OpenOptions {}
There was some argument as to whether this is a serious Robustness To Refactoring regression. If so, I can replace this with a wrapper type for just c::LPSECURITY_ATTRIBUTES -- I don't know if such a wrapper already exists.
Robustness To Refactoring
I searched for this term on Google and there are a total of two results, none of which have any relation to the term itself. What does this mean?
It's not a specific term, just Capitalization for Emphasis. (i.e. this is a potential regression to correctness in the face of future refactoring, such as the addition of another !Send/!Sync field.)
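The wrapper-type alternative mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: `SendSyncPtr` and `Options` are hypothetical names, `*mut c_void` stands in for `c::LPSECURITY_ATTRIBUTES`, and the safety argument in the comment is assumed rather than taken from the PR. The idea is to put the `unsafe impl`s on the one raw-pointer field so the outer struct gets `Send`/`Sync` through the usual auto-trait rules.

```rust
use std::ffi::c_void;

/// Hypothetical stand-in for a raw-pointer field such as
/// `c::LPSECURITY_ATTRIBUTES`.
struct SendSyncPtr(*mut c_void);

// SAFETY (assumed for this sketch): the pointer is only stored and handed
// to the OS; this type never dereferences it.
unsafe impl Send for SendSyncPtr {}
unsafe impl Sync for SendSyncPtr {}

/// Illustrative options struct, not the real `OpenOptions`. It has no
/// `unsafe impl` of its own: it is `Send + Sync` only because every
/// field is.
struct Options {
    access_mode: u32,
    security_attributes: SendSyncPtr,
}

/// Compiles only if `T` is both `Send` and `Sync`.
fn assert_send_sync<T: Send + Sync>() {}
```

With this shape, adding a field such as `Rc<u8>` to `Options` later would make `assert_send_sync::<Options>()` fail to compile, whereas a blanket `unsafe impl Send for Options {}` would silently keep promising thread safety.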
lmk what y'all wanna do about the "AVR is weird" situation which might also be a "wasm is weird" situation
 // Try to slide in the node at the head of the linked list, making sure
 // that another thread didn't just replace the head of the linked list.
 let exchange_result = state_and_queue.compare_exchange(
     current_state,
-    me | RUNNING,
+    me.with_addr(me.addr() | RUNNING),
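For context, the pattern in this hunk (a queue head pointer that also carries state bits) might look like the sketch below when written against an `AtomicPtr`. The `Waiter` type, the `STATE_MASK` constant, the value of `RUNNING`, and the memory orderings are assumptions made for illustration, not the code from the PR. It uses `addr`, `with_addr`, and `map_addr`, which are stable since Rust 1.84.

```rust
use std::sync::atomic::{AtomicPtr, Ordering};

/// Assumed layout: the two low bits of the head pointer hold the state.
const STATE_MASK: usize = 0b11;
const RUNNING: usize = 0b10;

/// Hypothetical queue node; the alignment keeps the two low address bits free.
#[repr(align(4))]
struct Waiter {
    next: *mut Waiter,
}

/// Tries once to push `node` at the head of the list while keeping the
/// RUNNING state bits. Returns false if another thread changed the head.
fn try_enqueue(state_and_queue: &AtomicPtr<Waiter>, node: &mut Waiter) -> bool {
    let current = state_and_queue.load(Ordering::Acquire);
    // Link to the previous head with the state bits masked off.
    node.next = current.map_addr(|a| a & !STATE_MASK);
    let me: *mut Waiter = node;
    state_and_queue
        .compare_exchange(
            current,
            // Same provenance as `me`, address tagged with the state.
            me.with_addr(me.addr() | RUNNING),
            Ordering::Release,
            Ordering::Relaxed,
        )
        .is_ok()
}
```

A caller would typically retry in a loop when `try_enqueue` returns false.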
This comment was marked as resolved.
Yeah map_addr was added to the set of APIs we want to go with after I wrote all this so it's mostly only using with_addr, which isn't necessarily a problem.
If we get really serious with this approach, I also like to imagine further conveniences, like abbreviating p.with_addr(p.addr() | TAG) as p.tag_addr(TAG) and similar for masking, which saves writing a map_addr closure.
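To make the suggestion concrete, such conveniences could be prototyped outside the standard library as an extension trait. The sketch below is hypothetical: `TagAddr`, `tag_addr`, and `mask_addr` are illustrative names, not std APIs. It builds on `map_addr`, which is stable since Rust 1.84.

```rust
/// Hypothetical convenience methods; these names are illustrative and are
/// not part of the standard library.
trait TagAddr: Sized {
    /// ORs `tag` into the address, keeping the pointer's provenance.
    fn tag_addr(self, tag: usize) -> Self;
    /// ANDs the address with `mask`, keeping the pointer's provenance.
    fn mask_addr(self, mask: usize) -> Self;
}

impl<T> TagAddr for *mut T {
    fn tag_addr(self, tag: usize) -> Self {
        self.map_addr(|a| a | tag)
    }

    fn mask_addr(self, mask: usize) -> Self {
        self.map_addr(|a| a & mask)
    }
}
```

With this in scope, `p.tag_addr(TAG)` reads the same as `p.with_addr(p.addr() | TAG)`, and `p.mask_addr(!TAG)` strips the tag again.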
Agreed, let's be exuberantly maximalist about encapsulating these patterns. :)
(My really maximalist thought is wondering how much raw pointer usage that's just for tagging could be replaced with completely safe wrappers like Tagged<'a, T> that consider how much alignment you get from T and handle masking on access automatically. But this can happen outside the standard library, I hope.)
Edit: And I feel like I would have felt like such an interface was vaguely ~dangerous~ in the past, but the strict provenance mental model and with_addr make it way more obvious to me when something is correct.
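A minimal sketch of what such a wrapper could look like is given below. The comment above only names `Tagged<'a, T>`; the fields, method names, and the choice to wrap a shared reference are assumptions for illustration. It derives the number of spare bits from `align_of::<T>()` and relies on `addr`/`map_addr` (stable since Rust 1.84).

```rust
use std::marker::PhantomData;
use std::mem::align_of;

/// Hypothetical safe wrapper: a shared reference plus a small tag stored in
/// the low address bits that `T`'s alignment guarantees to be zero.
struct Tagged<'a, T> {
    ptr: *const T,
    _borrow: PhantomData<&'a T>,
}

impl<'a, T> Tagged<'a, T> {
    /// An address aligned to `align_of::<T>()` has every bit below the
    /// alignment clear, so those bits are free for a tag.
    const TAG_MASK: usize = align_of::<T>() - 1;

    /// Returns `None` if `tag` does not fit in the spare bits.
    fn new(r: &'a T, tag: usize) -> Option<Self> {
        if tag & !Self::TAG_MASK != 0 {
            return None;
        }
        let ptr = (r as *const T).map_addr(|a| a | tag);
        Some(Tagged { ptr, _borrow: PhantomData })
    }

    /// Reads the tag back out of the low bits.
    fn tag(&self) -> usize {
        self.ptr.addr() & Self::TAG_MASK
    }

    /// Masks the tag off and hands back the original reference.
    fn get(&self) -> &'a T {
        let clean = self.ptr.map_addr(|a| a & !Self::TAG_MASK);
        // SAFETY: `clean` has the address and provenance of the `&'a T`
        // passed to `new`, so it is valid and aligned for `'a`.
        unsafe { &*clean }
    }
}
```

For a `u32` this leaves two tag bits; for a type with alignment 1 the mask is zero and only the tag `0` is accepted.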
(Is this considered bad practice to link crates which happen to be yours on rust-lang discussion? Anyway, I have a crate that provides tagged pointer support, ptr-union, though it doesn't automatically detect guaranteed alignment yet, due to stable const generic limitations. I'm currently working on making it work under strict provenance / not use inttoptr.)
We'll want to do a bunch of nonsense with try builds, so:
A couple more trivial nitpicks. (Well, and a typo that looks like it should fail to compile?)
(rust-highfive has picked a reviewer for you, use r? to override)
Hmm, apparently I don't understand bors syntax as well as I thought.
...Weird. Is the bot dead?
Yeah this PR seems... cursed? the CI isn't even running.
Github is having trouble it seems: https://www.githubstatus.com/
8e1015b to b4e12ad
     }
 }

 pub(crate) fn is_dangling<T: ?Sized>(ptr: *mut T) -> bool {
-    let address = ptr as *mut () as usize;
-    address == usize::MAX
+    (ptr as *mut ()).addr() == usize::MAX
This isn't a vital question for this PR, but it makes me wonder if .addr() should work on ?Sized directly, making this just ptr.addr() == usize::MAX despite ptr being potentially wide.
I assume historically as usize isn't allowed on DST pointers because it's a footgun in a world where we claim this roundtrips. With a clear addr method it might be fine to do this, but I'd like to be conservative against people just abusing these APIs and roundtripping anyways? Not sure I'm convinced by that argument.
I could definitely live without it, wasn't really sure what to think.
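As a self-contained illustration of the check discussed in this thread, the sketch below keeps the explicit cast to a thin pointer before calling `addr`, as in the diff. The helper `dangling_sentinel` is a hypothetical addition for demonstration; it uses `std::ptr::without_provenance_mut`, which, like `addr`, is stable since Rust 1.84.

```rust
/// Sketch of the check from the diff: a pointer counts as "dangling" here
/// when its address is the sentinel value `usize::MAX`.
fn is_dangling<T: ?Sized>(ptr: *mut T) -> bool {
    // Casting to a thin `*mut ()` drops any slice length or vtable
    // metadata, leaving only the data pointer whose address we inspect.
    (ptr as *mut ()).addr() == usize::MAX
}

/// Hypothetical helper: builds a sentinel pointer that carries an address
/// but no provenance, so it must never be dereferenced.
fn dangling_sentinel() -> *mut u8 {
    std::ptr::without_provenance_mut(usize::MAX)
}
```

The function accepts wide pointers such as `*mut [u8]` as well as thin ones, because the cast normalizes both to `*mut ()` before the address is read.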
✌️ @Gankra can now approve this pull request
☀️ Test successful - checks-actions
YESSSSS ok i will properly post a public announcement for this when it hits nightly and I can link the docs
Finished benchmarking commit (e50ff9b): comparison url. Summary: This benchmark run did not return any relevant results. If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. @rustbot label: -perf-regression
Fix a minor typo from rust-lang#95241 which prevented compilation on x86_64-unknown-openbsd.
…, r=dtolnay Fix library/std compilation on openbsd. Fix a minor typo from rust-lang#95241 which prevented compilation on x86_64-unknown-openbsd.
I have published an article "properly" announcing this and further discussing the ideas behind the project and "The Tower Of Weakening": https://gankra.github.io/blah/tower-of-weakenings/ (Just wrapping up loose threads on my work)
This patch series examines the question: how bad would it be if we adopted
an extremely strict pointer provenance model that completely banished all
int<->ptr casts?
The key insight to making this approach even vaguely palatable is the
ptr.with_addr(addr) -> ptr
function, which takes a pointer and an address and creates a new pointer
with that address and the provenance of the input pointer. In this way
the "chain of custody" is completely and dynamically restored, making the
model suitable even for dynamic checkers like CHERI and Miri.
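A small sketch of how `with_addr` is meant to be used, assuming the methods as they later stabilized in Rust 1.84 (`addr` and `with_addr` on raw pointers). The `TAG` constant and the helper names are illustrative and do not come from the patch series.

```rust
/// Low bit used as a tag. A `u32` is 4-byte aligned, so bit 0 of its
/// address is always zero and free to borrow.
const TAG: usize = 0b1;

/// Returns a copy of `p` with the tag bit set in its address. The result
/// keeps the provenance of `p` but must not be dereferenced while tagged.
fn set_tag(p: *mut u32) -> *mut u32 {
    p.with_addr(p.addr() | TAG)
}

/// Clears the tag bit, recovering a pointer that is valid to dereference.
fn clear_tag(p: *mut u32) -> *mut u32 {
    p.with_addr(p.addr() & !TAG)
}

/// Reports whether the tag bit is set.
fn is_tagged(p: *mut u32) -> bool {
    p.addr() & TAG != 0
}

/// Round-trips a pointer through tagging and reads through the result.
fn roundtrip(value: u32) -> (bool, u32) {
    let mut slot = value;
    let p: *mut u32 = &mut slot;
    let tagged = set_tag(p);
    let untagged = clear_tag(tagged);
    // SAFETY: `untagged` has the same address and provenance as `p`,
    // which points to the live local `slot`.
    let read = unsafe { *untagged };
    (is_tagged(tagged), read)
}
```

At no point is an integer cast back into a pointer: every new pointer is derived from an existing one, which is what keeps the "chain of custody" intact.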
This is not a formal model, but lots of the docs discussing the model
have been updated to try to convey the concept of this design in the hopes
that it can be iterated on.
See #95228