Look into ways get_function_address could safely return a function #5

TheDan64 · 2017-07-03T19:05:56Z

Currently it returns an address value pointer, which is in itself safe - but to actually use it, the user needs to unsafely transmute it into a rust extern "C" function. I'm wondering if this could be accomplished under the hood by "deserializing" the function's FunctionType (which has all of the needed type info) into rust types. Maybe writing a Serde plugin could help when it comes to complex types like StructType & PointerType.

Will probably need to account for edge cases - is VoidType deserializable? If so, would that just be () and is that ever useful? Is *() / &() valid in rust? Etc
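For context, here is a minimal sketch of the manual step described above: turning a raw address into a callable `extern "C"` function. The `sum` function and the `address_of_sum` helper are hypothetical stand-ins for a JIT-compiled function and for the engine call that returns its address.

```rust
use std::mem;

// Stand-in for a JIT-compiled function. In real use the address would come
// from the execution engine rather than from a function in this binary.
extern "C" fn sum(a: i32, b: i32) -> i32 {
    a + b
}

// Hypothetical helper that mimics an API returning only a raw address.
fn address_of_sum() -> usize {
    let f: extern "C" fn(i32, i32) -> i32 = sum;
    f as *const () as usize
}

fn call_sum_by_address(a: i32, b: i32) -> i32 {
    let addr = address_of_sum();
    // Nothing checks this signature: the caller must get it right by hand.
    let f: extern "C" fn(i32, i32) -> i32 = unsafe { mem::transmute(addr) };
    f(a, b)
}
```

If the signature written in the `transmute` is wrong, the compiler still accepts the code, which is the hazard this issue is about.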


TheDan64 · 2017-07-16T22:17:23Z

I'm starting to think that this might not be possible, at least not as a safe wrapper:

- While FunctionType may have all the required info (return type, argument types, etc.), it is only accessible at runtime, so I believe a function signature can't be generated from it (we can't return a function whose type is only known at runtime).
- Even if the above weren't a problem, and we could provide a safe wrapper for casting the address to an actual function, that function is very probably unsafe. Rust cannot provide any guarantees about a function generated by LLVM, and indeed it's very easy to shoot yourself in the foot.

So, this may just be an area where unsafety is required on the user's part 😢.

One slightly safer approach (that doesn't solve this problem but is a related idea) might be to return a non-copy address wrapper reference &MemoryAddress so that the underlying copy-able usize address isn't as easily abused and passed out of the scope/lifetime of the execution engine, but can still be transmuted to a function signature.
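A rough sketch of that wrapper idea, under stated assumptions: `MemoryAddress`, `Engine`, and `cast` are hypothetical names, and the engine here just stores the address of an ordinary function instead of JIT output.

```rust
use std::mem;

/// Hypothetical wrapper around a raw address. It intentionally derives
/// neither `Copy` nor `Clone`, so callers only ever see it behind a borrow.
pub struct MemoryAddress(usize);

impl MemoryAddress {
    /// # Safety
    /// `F` must be a function pointer type matching the real signature.
    pub unsafe fn cast<F>(&self) -> F {
        assert_eq!(mem::size_of::<F>(), mem::size_of::<usize>());
        unsafe { mem::transmute_copy(&self.0) }
    }
}

/// Hypothetical stand-in for the execution engine; it owns the address.
pub struct Engine {
    sum_addr: MemoryAddress,
}

extern "C" fn sum(a: i32, b: i32) -> i32 {
    a + b
}

impl Engine {
    pub fn new() -> Self {
        let f: extern "C" fn(i32, i32) -> i32 = sum;
        Engine { sum_addr: MemoryAddress(f as *const () as usize) }
    }

    /// The returned borrow cannot outlive `self`.
    pub fn get_function_address(&self) -> &MemoryAddress {
        &self.sum_addr
    }
}

fn demo_wrapper() -> i32 {
    let engine = Engine::new();
    let addr = engine.get_function_address();
    let f = unsafe { addr.cast::<extern "C" fn(i32, i32) -> i32>() };
    f(20, 22)
}
```

As the comment says, this does not solve the problem: once `cast` has produced a plain function pointer, that pointer is `Copy` and can still escape the engine's lifetime.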

remexre · 2017-08-07T06:51:07Z

I would just have a separate get_function_handle::<Args, Ret>() method that returns an Option<&Fn<Args, Output=Ret>>, doing the check at runtime. This definitely requires nightly though, and would probably be frowned upon (the Fn<Args, Output=Ret> bit is unstable).

TheDan64 · 2017-08-07T11:45:58Z

Oh, neat, thanks! I didn't know something like this existed. I'll have to look into it more. What's the name for this feature? I could probably just have an experimental crate feature for stuff like this, so that the whole crate doesn't have to move to nightly.

TheDan64 · 2017-08-07T11:49:09Z

Also, does this let you specify that the generated Fn is unsafe somehow?

remexre · 2017-08-07T13:45:39Z

> What's the name for this feature?

I think unboxed_closures; at least, that's the feature name for it.

> Also, does this let you specify that the generated Fn is unsafe somehow?

I don't think so, because the Fn trait doesn't have the function marked as unsafe. Maybe make the get_function_handle method unsafe? (Although I agree, that's not 100% satisfying.)

TheDan64 · 2017-08-07T19:59:50Z

Yeah, ideally the function generator would be safe, and the actual function would be unsafe. Nonetheless, thanks for the idea! Worth exploring for sure.


Michael-F-Bryan · 2018-03-11T08:16:36Z

In the tutorial I'm working on I noticed how awkward it is to use the execution engine's get_function_address() method. Besides the fact that it should return a usize because we're using pointers, I'd expect the lifetime of the returned symbol to be tied to the execution engine.

The libloading crate does this really well when you are loading function pointers out of a dynamic library. This will always be an innately unsafe operation because there's no way to statically prove a function pointer generated at runtime has the correct signature, but by using Rust lifetimes and returning something like Symbol<'ee, T> where we use the T to infer what signature to transmute the function pointer to, we should be able to make this easier to use and reduce the foot guns.
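To make the `Symbol<'ee, T>` shape concrete, here is a small sketch assuming a hypothetical `Engine` that is nothing more than a name-to-address table. It is modelled loosely on the description above, not on libloading's or inkwell's actual code.

```rust
use std::marker::PhantomData;
use std::mem;
use std::ops::Deref;

/// Hypothetical stand-in for the execution engine: a name -> address table.
pub struct Engine {
    symbols: Vec<(&'static str, usize)>,
}

/// A function pointer of type `T` that cannot outlive the engine borrow `'ee`.
pub struct Symbol<'ee, T> {
    func: T,
    _engine: PhantomData<&'ee Engine>,
}

impl<'ee, T> Deref for Symbol<'ee, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.func
    }
}

impl Engine {
    /// # Safety
    /// `T` must be a function pointer type matching the symbol's signature.
    pub unsafe fn get<'ee, T>(&'ee self, name: &str) -> Option<Symbol<'ee, T>> {
        let addr = self.symbols.iter().find(|(n, _)| *n == name)?.1;
        assert_eq!(mem::size_of::<T>(), mem::size_of::<usize>());
        let func = unsafe { mem::transmute_copy::<usize, T>(&addr) };
        Some(Symbol { func, _engine: PhantomData })
    }
}

extern "C" fn sum(a: i32, b: i32) -> i32 {
    a + b
}

type SumFn = extern "C" fn(i32, i32) -> i32;

fn demo_symbol() -> (i32, bool) {
    let f: SumFn = sum;
    let engine = Engine { symbols: vec![("sum", f as *const () as usize)] };
    let sym = unsafe { engine.get::<SumFn>("sum") }.expect("symbol exists");
    let missing = unsafe { engine.get::<SumFn>("nope") }.is_none();
    ((*sym)(1, 2), missing)
}
```

The lookup stays `unsafe` because `T` is taken on trust, but the `'ee` parameter lets the borrow checker reject a `Symbol` that is still in use after the engine is gone.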

In general, I feel like this crate can benefit a lot from using lifetimes more. I've already encountered a couple segfaults because I didn't fully understand inkwell/LLVM's ownership model and because it's not encoded in the typesystem there's no way for the compiler to help prevent these issues.

TheDan64 · 2018-03-12T01:27:17Z

Good point, maybe it should be a usize.

The method itself should not be unsafe to call, but the result should be unsafe to manipulate, I think (but could be wrong). I've also thought about using the typesystem and lifetimes here to tie it to the EE somehow and my most recent idea was to return a &FunctionAddress which is just a thin wrapper tied to the EE similarly to your example. But yeah, thanks for the information, it's certainly a good resource.

I do want to get more issues solved via the type system, but I'm holding that off until the second iteration/version (v0.2.0) of inkwell (I've jotted down ideas here: #8) because it's a lot of additional work and I just want to focus on core functionality being complete first across multiple LLVM versions so that we can get an initial version out the door. And yeah, LLVM's ownership model is very confusing, and I've just been piecing it together myself as I go.

CAD97 · 2018-04-01T03:04:26Z

(I saw this a while back when looking around at LLVM in Rust, so this has been sitting in my mind for a while. Then I had an idea, so) Just to spitball the idea:

Maybe you could use Frunk's HList to allow a nice safe interface to the (unsafe) function.

My first idea was that you could expose an interface that gave back a LLVMJitFn<...> where the interface for unsafe fn sum(_: i32, _: i32) -> i32 would be something like llvm_sum.arg(5).arg(5).call(), where each step effectively built the call, typechecked with the information used to build the fn in the first place.

Then I remembered seeing the blog posts about Frunk's HList. Above I've linked the actual blog post, but an HList is basically a type-level cons list, so you can have types like (i32, (i32, ())) and talk about them in a somewhat sane manner.

So, using unsafe fn sum(_: i32, _: i32) -> i32 as an example, the idea is that you'd provide a get_function call on the fully constructed JIT fn that returns some LLVMJitFn<HList![i32, i32], i32>, which would then provide the unsafe fn LLVMJitFn::invoke(arguments: HList![i32, i32]) -> i32.

Maybe if in the future the Fn trait stabilizes in some fashion and supports being tagged as unsafe to call, you could use that. But I think using HList you can "reinvent" the Fn trait in a not-too-cumbersome manner.
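A sketch of how the cons-list idea could look, assuming nested tuples such as `(i32, (i32, ()))` in place of Frunk's own list types to avoid a dependency. `ArgList` and `JitFn` are hypothetical names, and only arities 0 to 2 are covered.

```rust
use std::marker::PhantomData;
use std::mem;

/// Argument lists encoded as nested tuples, e.g. `(i32, (i32, ()))`.
pub trait ArgList<Ret> {
    /// # Safety
    /// `addr` must point at an `extern "C"` function with exactly this
    /// argument list and return type.
    unsafe fn call_at(self, addr: usize) -> Ret;
}

impl<Ret> ArgList<Ret> for () {
    unsafe fn call_at(self, addr: usize) -> Ret {
        unsafe {
            let f: unsafe extern "C" fn() -> Ret = mem::transmute_copy(&addr);
            f()
        }
    }
}

impl<A, Ret> ArgList<Ret> for (A, ()) {
    unsafe fn call_at(self, addr: usize) -> Ret {
        let (a, ()) = self;
        unsafe {
            let f: unsafe extern "C" fn(A) -> Ret = mem::transmute_copy(&addr);
            f(a)
        }
    }
}

impl<A, B, Ret> ArgList<Ret> for (A, (B, ())) {
    unsafe fn call_at(self, addr: usize) -> Ret {
        let (a, (b, ())) = self;
        unsafe {
            let f: unsafe extern "C" fn(A, B) -> Ret = mem::transmute_copy(&addr);
            f(a, b)
        }
    }
}

/// Hypothetical handle that remembers the signature at the type level.
pub struct JitFn<Args, Ret> {
    addr: usize,
    _signature: PhantomData<fn(Args) -> Ret>,
}

impl<Args: ArgList<Ret>, Ret> JitFn<Args, Ret> {
    /// # Safety
    /// `addr` must point at a live function matching `Args` and `Ret`.
    pub unsafe fn new(addr: usize) -> Self {
        JitFn { addr, _signature: PhantomData }
    }

    /// # Safety
    /// Same contract as `new`; the callee may also do arbitrary unsafe work.
    pub unsafe fn invoke(&self, args: Args) -> Ret {
        unsafe { args.call_at(self.addr) }
    }
}

extern "C" fn sum(a: i32, b: i32) -> i32 {
    a + b
}

fn demo_invoke() -> i32 {
    let f: extern "C" fn(i32, i32) -> i32 = sum;
    let jit_sum: JitFn<(i32, (i32, ())), i32> =
        unsafe { JitFn::new(f as *const () as usize) };
    unsafe { jit_sum.invoke((5, (5, ()))) }
}
```

The type parameters are still asserted by the caller in `new`, so this only moves the unchecked assumption to one place; it does not remove it.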

Michael-F-Bryan · 2018-04-01T06:16:54Z

@CAD97 I like the idea of using something like a cons list for passing in arguments, but there may be issues trying to shoehorn a LLVM function signature (an instance of FunctionType) into something at the type level.

The big problem I see is you can't guarantee a function signature at compile time, because LLVM doesn't even generate the function until runtime. So about the best you can do is either provide some sort of builder type which will do runtime checks to validate the signature (llvm_sum.arg(5).arg(5).call()) or use libloading's get() which is essentially a smart transmute which uses a wrapper type to ensure the function symbol doesn't outlive where it came from.

Either way, I don't think it's sound to try and remove the unsafe. We're essentially plucking a random function pointer out of the JIT executor and then running it, there's no guarantee the pointer is valid or that it doesn't do unsafe stuff internally. We require people to use unsafe when calling into C for exactly the same reason.

71 · 2018-04-01T07:30:24Z

I also think it should stay unsafe. Using a wrapper type or fn(...) -> ... always requires an assumption to be made at compile-time. The only way I see of making this safe is by building the HList when creating the function (i.e. fn_type(return_type).arg::<i32>(i32type).arg::<i64>(i64type)), but it would get messy quickly. Plus, it would require Any and a lot of runtime reflection-fu to ensure the argument given to arg corresponds to the Rust type.

Michael-F-Bryan · 2018-04-01T08:24:21Z

@6a I've made a PR which replaces the old get_function_address() with a get_function() which returns a Symbol<F>, allowing people to call the function (via a Deref impl) and it also holds a Rc<LLVMExecutionEngineRef> so the symbol doesn't accidentally outlive its execution engine.

Under the hood it's essentially just the old get_function_address() with an additional pointer cast at the end.

I don't know if you can constrain F to be just a function pointer, so if anyone has any ideas, feel free to let me know.
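The shape described in this comment might look roughly like the following. This is a sketch under assumptions, not the code from the PR: `RawEngine` is a hypothetical stand-in for the raw engine handle, and `get_function` takes an address directly where a real engine would look one up by name.

```rust
use std::mem;
use std::ops::Deref;
use std::rc::Rc;

/// Hypothetical stand-in for the raw engine handle. A real wrapper would
/// dispose of the engine when the last `Rc` is dropped.
struct RawEngine;

pub struct ExecutionEngine {
    raw: Rc<RawEngine>,
}

/// Holds a shared reference to the engine so the code behind `func` stays alive.
pub struct Symbol<F> {
    _engine: Rc<RawEngine>,
    func: F,
}

impl<F> Deref for Symbol<F> {
    type Target = F;
    fn deref(&self) -> &F {
        &self.func
    }
}

impl ExecutionEngine {
    /// # Safety
    /// `addr` must be the address of a function whose signature is `F`.
    pub unsafe fn get_function<F>(&self, addr: usize) -> Symbol<F> {
        assert_eq!(mem::size_of::<F>(), mem::size_of::<usize>());
        Symbol {
            _engine: Rc::clone(&self.raw),
            func: unsafe { mem::transmute_copy(&addr) },
        }
    }
}

extern "C" fn answer() -> i32 {
    42
}

fn demo_outlives_engine() -> (usize, i32) {
    let f: extern "C" fn() -> i32 = answer;
    let engine = ExecutionEngine { raw: Rc::new(RawEngine) };
    let sym = unsafe {
        engine.get_function::<extern "C" fn() -> i32>(f as *const () as usize)
    };
    let owners = Rc::strong_count(&sym._engine); // engine + symbol
    drop(engine);
    (owners, (*sym)()) // the symbol keeps the raw engine alive
}
```

Compared with a borrowed symbol, the `Rc` trades a lifetime parameter for a reference count, so the symbol can be stored or returned freely.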

71 · 2018-04-01T09:46:34Z

Looks like a good compromise to me, but it essentially depends on @TheDan64's opinion. To my knowledge, there aren't any fn constraints, no; however I do think it is statically possible to check that size_of::<F>() == size_of::<usize>().

Michael-F-Bryan · 2018-04-01T11:57:45Z

I tried using the static_assertions crate/hack and unfortunately it doesn't look like we'll be able to use it here without proper static asserts and all the related const fn machinery 😞

It seems like F is almost too generic!

error[E0512]: transmute called with types of different sizes
   --> src/execution_engine.rs:230:9
    |
230 |         assert_eq_size!(F, usize); // `F` must be the size of a pointer
    |         ^^^^^^^^^^^^^^^^^^^^^^^^^^
    |
    = note: source type: F (this type's size can vary)
    = note: target type: usize (64 bits)
    = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
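On compilers newer than this thread (panicking in constants was stabilized in Rust 1.57), a size check on a generic `F` can be written with an associated constant instead of a `transmute`-based macro. A sketch, with `AssertPointerSized` and `cast_address` as hypothetical names:

```rust
use std::marker::PhantomData;
use std::mem;

struct AssertPointerSized<F>(PhantomData<F>);

impl<F> AssertPointerSized<F> {
    // Evaluated at compile time for each concrete `F` that is actually used.
    const CHECK: () = assert!(
        mem::size_of::<F>() == mem::size_of::<usize>(),
        "`F` must be the size of a pointer"
    );
}

/// # Safety
/// `addr` must be the address of a function whose signature is `F`.
pub unsafe fn cast_address<F>(addr: usize) -> F {
    // Referencing the constant forces the size check for this `F`.
    let () = AssertPointerSized::<F>::CHECK;
    unsafe { mem::transmute_copy(&addr) }
}

extern "C" fn seven() -> i32 {
    7
}

fn demo_cast() -> i32 {
    let f: extern "C" fn() -> i32 = seven;
    let g = unsafe {
        cast_address::<extern "C" fn() -> i32>(f as *const () as usize)
    };
    // `cast_address::<u8>(0)` would be rejected when the crate is built.
    g()
}
```

The check only rules out wrongly sized types; a pointer-sized non-function type such as `usize` itself would still pass.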

TheDan64 · 2018-04-01T22:58:46Z

~~This doesn't seem unreasonable but one problem I have is that deref is a safe call, when in reality the call is unsafe... Or do you still get an unsafe function from deref?~~ Meant to write this on the PR 😂

I wonder if we could use the nightly call from https://doc.rust-lang.org/nightly/std/ops/trait.Fn.html#required-methods ? Seems to be runtime function-ish

Michael-F-Bryan · 2018-04-23T11:04:46Z

@TheDan64 given #36 is merged, would we say this issue is resolved? You still need to use a bit of unsafe, but grabbing a function pointer out of a JIT compiler's memory is always going to be an unsafe operation. At least #36 makes sure you can only get unsafe extern "C" functions.
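One way such a restriction could be expressed is a sealed marker trait implemented only for `unsafe extern "C"` function pointer types. This is a sketch of the general technique, not necessarily how #36 does it; the trait and function names are illustrative and only arities 0 to 3 are covered.

```rust
use std::mem;

mod private {
    pub trait Sealed {}
}

/// Marker for the only types `get_function` accepts.
pub trait UnsafeFunctionPointer: Copy + private::Sealed {}

// Implements the marker for `unsafe extern "C" fn(..) -> Ret` of one arity.
macro_rules! impl_unsafe_fn {
    ($($arg:ident),*) => {
        impl<Ret, $($arg),*> private::Sealed
            for unsafe extern "C" fn($($arg),*) -> Ret {}
        impl<Ret, $($arg),*> UnsafeFunctionPointer
            for unsafe extern "C" fn($($arg),*) -> Ret {}
    };
}

impl_unsafe_fn!();
impl_unsafe_fn!(A);
impl_unsafe_fn!(A, B);
impl_unsafe_fn!(A, B, C);

/// # Safety
/// `addr` must be the address of a function whose signature is `F`.
pub unsafe fn get_function<F: UnsafeFunctionPointer>(addr: usize) -> F {
    unsafe { mem::transmute_copy(&addr) }
}

unsafe extern "C" fn triple(x: i32) -> i32 {
    x * 3
}

fn demo_get_function() -> i32 {
    let f: unsafe extern "C" fn(i32) -> i32 = triple;
    let g = unsafe {
        get_function::<unsafe extern "C" fn(i32) -> i32>(f as *const () as usize)
    };
    // `get_function::<usize>(..)` or a safe `fn(i32) -> i32` would not compile.
    unsafe { g(7) }
}
```

Because every accepted `F` is itself an `unsafe` function pointer, each call site needs its own `unsafe` block, which matches the earlier wish that the returned function, not just the lookup, be unsafe to use.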


TheDan64 self-assigned this Jul 3, 2017

TheDan64 added this to the 0.2.0 milestone Jul 3, 2017

TheDan64 removed their assignment Jul 3, 2017

TheDan64 added enhancement help wanted labels Jul 16, 2017

Michael-F-Bryan mentioned this issue Apr 1, 2018

Replace Get function address #36

Merged

Michael-F-Bryan mentioned this issue Apr 2, 2018

Tracking issue for Fn traits (unboxed_closures & fn_traits feature) rust-lang/rust#29625

Open

2 tasks

TheDan64 modified the milestones: 0.2.0, 0.1.0 Apr 24, 2018

TheDan64 closed this as completed Apr 24, 2018

This was referenced Oct 22, 2018

Safe function wrapper #56

Closed

Provide safe function wrappers #57

Merged

seanyoung mentioned this issue Dec 4, 2020

test_module::test_garbage_ir_fails_create_module_from_ir triggers assertion in llvm 10 #198

Open

ayazhafiz pushed a commit to ayazhafiz/inkwell that referenced this issue Mar 10, 2022

Merge pull request TheDan64#5 from rtfeldman/update-inkwell-llvm12

237cbc7

Update inkwell llvm12
